Review Questions for Chapter 9: Threats to Internal Validity

  1. What is internal validity? Why do we value it?
  2. To establish that changes in one variable cause changes in another variable, what 3 things must you establish?
  3. Name and define the 8 threats to internal validity.
  4. What is the difference between history and maturation?
  5. What is the difference between testing and instrumentation?
  6. When comparing two groups--even without using random assignment--which threats to internal validity are avoided?
  7. If random assignment is not used, when comparing two groups, what is the most serious threat to internal validity?
  8. What are two explanations--other than the treatment having an effect--for participants who match on pretest scores not matching on posttest scores?
  9. Why does arbitrary assignment to group fail to eliminate the selection threat?
  10. Which threats to internal validity does the two-group design usually avoid?
  11. Which threats to internal validity does the before-after design avoid?
  12. Which of the 8 threats to internal validity may harm the internal validity of a before-after design?
  13. When should you be most concerned about regression?
  14. When would you be most concerned about selection by maturation?
  15. When would you not worry about mortality?
  16. How might a researcher, in trying to boost internal validity, harm external validity?
  17. What technique can be used to create an internally valid study?

Answers to Review Questions for Chapter 9

  1. What is internal validity? Why do we value it?

    Internal validity is the degree to which a study allows you to correctly make cause-effect statements.

    If you want to influence, help, change, make, affect, increase, decrease, prevent, produce, or trigger some action, you are interested in causing an effect. For example, if you want to help people, you need to use treatments that cause good effects and that prevent bad effects. So, if you were giving someone a treatment that had been researched, you would want the study suggesting that the treatment was effective to be an internally valid study.

  2. To establish that changes in one variable cause changes in another variable, what 3 things must you establish?

    1. The alleged cause and the alleged effect are correlated: Changes in the causal variable are associated with changes in the effect variable. (If the variables are not related, they cannot be causally related.)

    2. Changes in the alleged cause come before changes in the alleged effect. (If changes in what you are calling the cause come after changes in what you call the effect, what you think is the cause may really be the effect--and what you think is the effect may really be the cause. Put another way, what you think is a cause may only be a symptom.)

    3. No other factor could account for the relationship between the alleged cause and the alleged effect. (If you cannot rule out the many possible third variables--also known as lurking variables--both of your variables may be symptoms or side effects of some other variable. For example, ice cream consumption is correlated with shark attacks, but both may be effects of warmer weather.)
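
    To see how a lurking variable can produce a correlation, you could run a small simulation like the sketch below. Everything in it is made up for illustration: the temperature range, the slopes, the noise levels, and the names pearson and residuals are assumptions, not data from any study. In the simulated world, temperature drives both ice cream sales and shark attacks, and neither one affects the other.

```python
import random

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def residuals(ys, xs):
    """What is left of ys after removing its straight-line relation to xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical daily temperatures: the lurking (third) variable.
temperature = [rng.uniform(10, 35) for _ in range(2000)]

# Both outcomes depend on temperature plus random noise.
# Neither outcome appears in the other's formula.
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
shark_attacks = [0.1 * t + rng.gauss(0, 1) for t in temperature]

raw_r = pearson(ice_cream, shark_attacks)
partial_r = pearson(residuals(ice_cream, temperature),
                    residuals(shark_attacks, temperature))

print(f"Correlation, ignoring temperature:        {raw_r:.2f}")
print(f"Correlation, holding temperature constant: {partial_r:.2f}")
```

    The first correlation should be clearly positive, and the second should be close to zero. In a real study, you usually cannot measure and remove every possible third variable the way this simulation removes temperature.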

  3. Name and define the 8 threats to internal validity.

    The 8 threats to internal validity (8 categories of events other than the treatment that could account for the differences between treatment conditions) can be divided into 3 categories:

    Category #1--Problems due to comparing one group of participants against a different group or subgroup of participants:

    1. Selection: comparing groups that were different before the treatment was introduced (think of the saying about comparing apples to oranges).
    2. Mortality (also called attrition): comparing conditions that became different due to participants dropping out of the study. For example, if most of the participants in the treatment condition drop out but no participants drop out of the no-treatment condition, comparing the treatment condition to the no-treatment condition would be like comparing a selected subset of the treatment participants to all the no-treatment participants. (It would be like comparing a bunch of grapes to a bunch from which the overly ripe and rotten ones had been thrown out.)

    Category #2--Factors other than the treatment that may change participants:

    1. Testing: changes resulting from participants learning from a test or measurement they were given earlier in the study.
    2. History: changes in the environment that were not controlled by the researcher.
    3. Maturation: growth and other changes within the participant.
    4. Selection by maturation: groups growing apart because they are changing at different rates.

    Category #3--Factors that may cause scores to change even though participants have not changed:

    1. Instrumentation: the way participants are measured or scored changes.
    2. Regression (toward the mean): Individuals with extreme scores will tend to score closer to average on retesting.
      • For example, if you select only those people who got 100% when guessing the outcome of 4 coin flips, those people will tend to score closer to 50% if you have them guess the outcome of another 4 coin flips.
      • If you are having trouble understanding regression, try this: Imagine a pile of 1000 leaves put in the center of the yard. In this yard, the wind blows randomly, sometimes to the right and sometimes to the left, and shifts frequently. A day later, you find that the wind has blown the leaves all over the yard. Suppose you picked up only the 10 leaves that were blown 50 feet or more to the right and put them back in the middle of the yard. The next day, you will not find that the wind has blown all those leaves 50 feet or more to the right. Instead, you are likely to find that about half have been blown to the left and about half have been blown to the right, and that most are close to the center of the yard. The influence of random error on scores is like the influence of a randomly shifting wind on the location of those leaves. If you understand that, you understand regression.
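
    If you would like to check the coin-flip example for yourself, a simulation along the following lines is one way to do it. This is only a sketch: the number of simulated people (10,000), the seed, and the function names guess_accuracy and simulate_regression are arbitrary choices made for illustration. Every "person" guesses purely by chance, so nobody has any real skill at calling coin flips.

```python
import random

def guess_accuracy(n_flips, rng):
    """Proportion of n_flips coin flips guessed correctly by pure chance."""
    correct = sum(rng.random() < 0.5 for _ in range(n_flips))
    return correct / n_flips

def simulate_regression(n_people=10_000, n_flips=4, seed=0):
    """Select perfect scorers on round 1, then retest only those people."""
    rng = random.Random(seed)
    first_round = [guess_accuracy(n_flips, rng) for _ in range(n_people)]

    # Keep only the people whose first score was extreme (100% correct).
    selected = [i for i, score in enumerate(first_round) if score == 1.0]

    # Retest the selected people; their guesses are still pure chance.
    retest = [guess_accuracy(n_flips, rng) for _ in selected]
    return len(selected), sum(retest) / len(retest)

n_selected, retest_mean = simulate_regression()
print(f"{n_selected} people scored 100% on the first 4 flips")
print(f"Their average score on the next 4 flips: {retest_mean:.0%}")
```

    The selected group's average on the retest should land near 50%, even though every one of them scored 100% the first time. Nothing about these people changed between rounds. Their first scores were extreme only because of random error, and random error does not repeat itself.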
  4. What is the difference between history and maturation?

    History refers to changes that occur outside the participant (in the participant's environment); maturation refers to changes inside the participant.

  5. What is the difference between testing and instrumentation?

    Testing and instrumentation both refer to participants' scores changing from one measurement to the next for reasons unrelated to the treatment.

    Instrumentation refers to changes in scores being due to changes in the measuring instrument or in how participants are scored. For example, if after the pretest, the questionnaire is revised, the scoring system is refined, or raters are trained, changes in posttest scores may reflect changes in the instrument rather than changes in the participant.

    Testing, on the other hand, refers to changes in the participant due to the experience of being measured. A simple example of testing would be to take a trivia test one day and then take the same test the next day. If you researched the questions after taking the test the first time, you would do better the second time. In this case, the act of taking the first test changed you. In this class, you try to take advantage of the testing effect by taking the online practice quizzes--and by studying these review questions.

    In short, the difference is that in testing, participants really have changed--taking the first test caused them to learn something, which caused them to behave and score differently on the retest. In instrumentation, on the other hand, the participants haven't changed--their scores change only because the measuring instrument has changed.

  6. When comparing two groups--even without using random assignment--which threats to internal validity are avoided?

    Having different groups get different treatments avoids threats like history (other than the treatment, the different groups should experience the same outside events), maturation (both groups have the same time to mature), instrumentation (participants are often only measured once by the same instrument), and testing (both groups are usually measured the same number of times--once).

  7. If random assignment is not used, when comparing two groups, what is the most serious threat to internal validity?

    Although mortality can be a problem, the biggest problem is usually selection: Because people are different, comparing your two groups may be like comparing apples with oranges. That is, they may have been different before the treatment was introduced.

  8. What are two explanations--other than the treatment having an effect--for participants who match on pretest scores not matching on posttest scores?
    1. Regression toward the mean: Random error can cause scores--especially extreme scores--to be a poor reflection of true scores. For example, suppose you were trying to match 6th grade students in a school with 4th grade students in that school on a multiple-choice math test. To get the groups to have the same scores, you might have to select the highest scoring 4th graders and the lowest scoring 6th graders. On the retest, you might find that the 6th graders are scoring higher (more like the average 6th grader in that school) and that the 4th graders are scoring lower (more like the average 4th grader in that school). You may have picked 6th graders who were having a bad day and 4th graders who made some lucky guesses. That is, the similarity you found between the groups was an illusion--an illusion that disappeared when the students were retested. The key is to remember that if you select participants based on extreme scores, those scores will tend to be less extreme on retesting.
    2. Selection by maturation interactions: Participants who are the same on one variable at the time of the pretest may differ on unmatched variables that will make them differ on the posttest. For example, suppose you matched 6th grade students in a school with 4th grade students in that school on a multiple-choice math test. If the posttest occurred several months later, the 4th graders might now score higher than the 6th graders because their mathematical skills are progressing at a faster pace.
  9. Why does arbitrary assignment to group fail to eliminate the selection threat?

    You are not making groups equal. Instead, you are making sure the groups differ in at least one way. They probably differ in other ways as well.

  10. Which threats to internal validity does the two-group design usually avoid?

    The groups will probably not differ in terms of History, Maturation, Instrumentation, and Testing.

  11. Which threats to internal validity does the before-after design avoid?

    By comparing each participant with herself, you are comparing apples with apples, so you do not have to worry about either Selection or Selection by Maturation.

  12. Which of the 8 threats to internal validity may harm the internal validity of a before-after design?

    History, Maturation, Instrumentation, Testing, Regression, and Mortality

  13. When should you be most concerned about regression?

    When you are selecting participants based on their extreme scores and your measure is unreliable.

  14. When would you be most concerned about selection by maturation?

    When you had matched on pretest scores, but (a) there was time for participants to change from pretest to posttest and (b) the groups differed in ways that might affect maturation (e.g., they differed in terms of age or gender).

  15. When would you not worry about mortality?

    When no participants withdraw or are withdrawn from the study.

  16. How might a researcher, in trying to boost internal validity, harm external validity?

    In trying to keep nontreatment factors constant, the researcher may lose the ability to generalize to situations or participants that differ from the narrow range of situations and participants studied. For example, to help internal validity, a researcher might study participants who do not vary much from each other (white, male mice who are 180 days old or pairs of identical twins) under tightly controlled laboratory conditions. Would the results apply to other, more diverse populations? To less tightly controlled real-world settings?

  17. What technique can be used to create an internally valid study?

    By using random assignment, you can do an internally valid study.
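
    Here is a rough sketch of what random assignment looks like in practice, assuming a simple two-group study with equal group sizes. The function name randomly_assign, the seeds, and the simulated pretest scores are all hypothetical and are there only to show the idea. Because chance alone decides who goes into which group, the groups should not differ systematically before the treatment is introduced.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two groups by chance."""
    rng = random.Random(seed)
    shuffled = list(participants)   # copy, so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def mean(scores):
    return sum(scores) / len(scores)

# Hypothetical pretest scores for 200 participants (made-up data).
score_rng = random.Random(42)
pretest = [score_rng.gauss(100, 15) for _ in range(200)]

treatment, control = randomly_assign(pretest, seed=7)

print(f"Treatment group: n = {len(treatment)}, pretest mean = {mean(treatment):.1f}")
print(f"Control group:   n = {len(control)}, pretest mean = {mean(control):.1f}")
```

    The two pretest means will rarely be identical, but any difference between them is due to chance rather than to how people were picked, and such chance differences tend to shrink as the groups get larger.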

 
