Mottos/Mantras/Maxims/Take-Home Lessons From Chapter 5

The two most common misconceptions people have about operational definitions of psychological constructs are that (1) no measure is valid and that (2) any measure is valid.
To measure a construct accurately, you must have a behavior that is an accurate indicator of that construct.
Unfortunately, measures of that behavior may be contaminated by random error and bias.
Both random error and bias can come from the observer/scorer, the researcher/administrator, or the participant. That is, the people scoring, administering, and filling out the measure may introduce random error by acting inconsistently and may introduce bias by acting in a way that pushes scores in a given direction.
Bias poisons a measure's validity; random error dilutes it.
Because random error tends to balance out, using multiple raters rather than a single rater can reduce the effects of random error due to the scorer, and asking multiple questions to measure a construct rather than a single question can reduce the effects of random error due to the participant.
Some procedures can reduce both bias and random error (e.g., standardization), some procedures (e.g., blind procedures) reduce only bias, and some procedures reduce random error but not bias (e.g., having multiple questions or having multiple raters).
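The claim that multiple raters reduce random error but not bias can be illustrated with a small simulation. This is only a sketch under stated assumptions: every rater shares the same bias, each rater's random error is independent and normally distributed, and the numbers (a true score of 50, a bias of 2, an error standard deviation of 3) and the function names are invented for the illustration.

```python
import random
import statistics

def mean_rating(true_score, n_raters, bias, error_sd, rng):
    """Average of n_raters ratings; each rating = true score + shared bias + independent random error."""
    ratings = [true_score + bias + rng.gauss(0, error_sd) for _ in range(n_raters)]
    return statistics.fmean(ratings)

def summarize(n_raters, bias=2.0, error_sd=3.0, true_score=50.0, trials=5000, seed=0):
    """Repeat the measurement many times and describe how far off the averaged ratings are."""
    rng = random.Random(seed)
    means = [mean_rating(true_score, n_raters, bias, error_sd, rng) for _ in range(trials)]
    average_offset = statistics.fmean(means) - true_score  # systematic error (bias)
    spread = statistics.pstdev(means)                      # leftover random error
    return average_offset, spread

for k in (1, 4, 16):
    offset, spread = summarize(k)
    print(f"{k:>2} raters: average offset = {offset:.2f}, spread = {spread:.2f}")
```

Under these assumptions, the spread shrinks as raters are added, but the average offset stays near the bias. Averaging more raters makes the scores more consistent, not less biased.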
Participants may bias results by acting in a way that they think (1) will make themselves look good (social desirability bias) or (2) will support the researcher's hypothesis (by obeying demand characteristics). Note that making responses anonymous should eliminate the social desirability bias (participants can't impress the researcher if the researcher won't know who they are) but may have no effect on bias due to participants obeying demand characteristics (anonymous participants can still try to help the researcher get the "right" results).
Reliability and validity are two different things. Reliability refers to consistency; validity refers to accuracy.
A valid measure is a reliable measure. That is, reliability is valuable because if you are measuring a stable trait, your measure should produce stable scores. For example, your measurements of someone's height should not vary from hour to hour.
A reliable measure is not necessarily a valid measure. For example, if your measure is tapping a construct other than the one you want to measure, your reliable measure would be giving you scores that are reliably wrong.
There are many different indexes of reliability--and they are not all the same. For example, measures of inter-observer reliability, like Cohen's kappa, are affected only by random error caused by inconsistencies between observers; measures of internal consistency, like Cronbach's alpha, are affected only by random error caused by inconsistencies between how participants answer supposedly related questions; and test-retest reliability coefficients are affected by any random measurement error.
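To make the difference between these indexes concrete, here is a minimal sketch of how Cohen's kappa and Cronbach's alpha are commonly computed. It uses the standard textbook formulas, kappa = (observed agreement - chance agreement) / (1 - chance agreement) and alpha = k/(k - 1) x (1 - sum of item variances / variance of total scores). The function names and the tiny data sets are made up for the illustration.

```python
from collections import Counter
import statistics

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters' category codes, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category independently.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)  # undefined if expected == 1

def cronbach_alpha(rows):
    """rows: one list of item responses per participant (all rows the same length)."""
    k = len(rows[0])
    item_variance_sum = sum(statistics.variance(item) for item in zip(*rows))
    total_variance = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical codes from two observers watching the same eight participants.
observer_1 = ["shy", "shy", "outgoing", "outgoing", "shy", "outgoing", "shy", "outgoing"]
observer_2 = ["shy", "shy", "outgoing", "outgoing", "shy", "outgoing", "outgoing", "outgoing"]
print("kappa:", cohens_kappa(observer_1, observer_2))

# Hypothetical responses of five participants to a three-item scale.
responses = [[4, 5, 4], [2, 2, 3], [5, 4, 5], [1, 2, 1], [3, 3, 3]]
print("alpha:", cronbach_alpha(responses))
```

Note that kappa looks only at the two observers' codes and alpha looks only at participants' answers to the items, which is why each index can detect only one source of random error.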
A test should have internal consistency: It should agree with itself. If half of the test is saying that the person is quiet and shy but the other half is saying that the person is loud and outgoing, there's a problem. Internal consistency problems can often be fixed by eliminating or editing questions that do not correlate with the rest of the test.
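One common way to spot questions that do not correlate with the rest of the test is to correlate each item with the sum of the remaining items (an item-rest correlation). The sketch below assumes a hypothetical four-item outgoingness scale answered by five participants; the data are invented so that the fourth item does not fit, and the function names are illustrative only.

```python
import statistics

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / (ss_x * ss_y) ** 0.5

def item_rest_correlations(rows):
    """For each item, correlate it with the sum of all the other items."""
    correlations = []
    for i in range(len(rows[0])):
        item = [row[i] for row in rows]
        rest = [sum(row) - row[i] for row in rows]
        correlations.append(pearson(item, rest))
    return correlations

# Hypothetical data: one row per participant, one column per item.
responses = [
    [5, 4, 5, 2],
    [4, 4, 4, 5],
    [2, 1, 2, 3],
    [1, 2, 1, 4],
    [3, 3, 3, 1],
]

for number, r in enumerate(item_rest_correlations(responses), start=1):
    print(f"item {number}: item-rest correlation = {r:.2f}")
```

An item whose correlation with the rest of the test is near zero or negative is a candidate for editing or removal.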
One reason so many tests are multiple-choice is that such tests eliminate concerns about observer error.
The case for construct validity relies on circumstantial evidence: The more, the better.
To make the case for construct validity, you should argue both that you are measuring the right thing and that you are not measuring the wrong thing.
You could argue that your measure is measuring the right thing by showing that
1. You have content (sampling) validity because your measure seems to have questions that adequately cover the key characteristics of what you are measuring.
2. Your measure correlates with other measures and indicators of your construct.
You could argue that you are not measuring the wrong thing by
1. Having an objective measure, to show that your measure is not affected by scorer bias.
2. Having a reliable measure, to show that your measure is not unduly affected by random error.
3. Having an internally consistent measure, to show that your measure is measuring one thing rather than many different things.
4. Having discriminant validity, to show that you are not measuring some other construct.
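Both halves of this argument often come down to correlations: a high correlation with another measure of the same construct, and a low correlation with a measure of a different construct. The sketch below assumes a hypothetical new shyness scale, an established shyness scale, and a vocabulary test; all scores are invented for the illustration.

```python
import statistics

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / (ss_x * ss_y) ** 0.5

# Hypothetical scores for the same eight participants on three measures.
new_shyness_scale = [10, 14, 8, 20, 16, 12, 18, 6]
established_shyness_scale = [11, 15, 9, 19, 17, 11, 17, 7]
vocabulary_test = [30, 25, 28, 27, 31, 24, 29, 26]

same_construct_r = pearson(new_shyness_scale, established_shyness_scale)
other_construct_r = pearson(new_shyness_scale, vocabulary_test)

print(f"correlation with established shyness scale: {same_construct_r:.2f}")
print(f"correlation with vocabulary test:           {other_construct_r:.2f}")
```

A strong first correlation supports the claim that you are measuring the right thing; a weak second correlation supports the claim that you are not measuring the wrong thing.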
Although more care is typically taken in validating a measure than in validating a manipulation, this may be a mistake. At any rate, principles used to increase a measure's validity (e.g., standardization, blind techniques) can be used to increase a manipulation's validity.
Just as you can make a case for your measure's validity by showing that participants' scores on your measure correlate with scores on other measures of the construct, you can make the case for your manipulation's validity by showing that, on average, different levels of your manipulation correspond to different scores on the manipulation check.
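A manipulation check of this kind can be summarized by comparing average check scores across the levels of the manipulation. The sketch below assumes a hypothetical stress manipulation with two levels and a 1-to-7 self-rated stress question as the manipulation check; the data and function names are invented, and Cohen's d is used here only as one convenient way to express the size of the difference.

```python
import statistics

def check_means(scores_by_level):
    """Mean manipulation-check score for each level of the manipulation."""
    return {level: statistics.fmean(scores) for level, scores in scores_by_level.items()}

def cohens_d(lower_group, higher_group):
    """Standardized mean difference using the pooled standard deviation."""
    n1, n2 = len(lower_group), len(higher_group)
    pooled_variance = (
        (n1 - 1) * statistics.variance(lower_group)
        + (n2 - 1) * statistics.variance(higher_group)
    ) / (n1 + n2 - 2)
    difference = statistics.fmean(higher_group) - statistics.fmean(lower_group)
    return difference / pooled_variance ** 0.5

# Hypothetical self-rated stress (1 = not at all, 7 = extremely) in each condition.
check_scores = {
    "low_stress": [2, 3, 2, 4, 3, 2],
    "high_stress": [6, 5, 7, 6, 5, 7],
}

means = check_means(check_scores)
effect = cohens_d(check_scores["low_stress"], check_scores["high_stress"])
print(means)
print(f"standardized difference: {effect:.2f}")
```

If the high-stress group does not report more stress than the low-stress group, the manipulation probably did not change what it was supposed to change.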
Placebo treatments are a good way to neutralize the effects of participant bias.
In choosing a manipulation, consider whether you can easily standardize how you administer the manipulation and how vulnerable the manipulation is to participant bias.
