You can assign the following article when covering Chapter 6 (because it describes an archival study) or when covering Chapter 7 (because it uses survey data):
Judge, T. A., & Cable, D. M. (2004). The effect of physical height on workplace success and income: Preliminary test of a theoretical model. Journal of Applied Psychology, 89, 428-441.
This article addresses an interesting topic (the hypothesis that being tall leads to being paid more [a hypothesis that has been unofficially proposed many times in the Dilbert cartoon strip]), discusses regression in a straightforward way, and is relatively easy for students to read. To make the article easier for students to digest, (a) have students skip the first few pages and begin reading on page 433 (starting with the heading "Estimating the effect of height on earnings") and (b) give students Table 1.
Helping Students Understand the Article
Tips, Comments, and Problem Areas
Estimating the effect of height on earnings
Note that, in scientific writing, authors are expected to use the metric system. Thus, the authors state heights in centimeters (abbreviated cm).
Implicitly norm height by age: consider how tall someone is for their age. Thus, a 6-foot-tall teenager will not be considered as tall as a 6-foot-tall eighty-year-old.
Averaging scores (in this case, earnings) increases reliability because random error tends to balance out (see Chapter 4).
α: Cronbach’s alpha, a measure of internal consistency (see page
Measures (last sentence)
“this variable was standardized…”: the authors adjusted the scores so that the mean income would be zero and the standard deviation of the scores would be 1. (To standardize the scores, the authors [a] subtracted the mean of the original scores from each score and then [b] divided the result of each subtraction by the standard deviation of the original scores.)
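The standardization steps above can be sketched in a few lines. The earnings values here are made up for illustration; they are not the article's data.

```python
# Hypothetical earnings data (illustrative only, not from the article)
earnings = [30_000, 45_000, 60_000, 75_000, 90_000]

mean = sum(earnings) / len(earnings)
# Population standard deviation (dividing by N)
sd = (sum((x - mean) ** 2 for x in earnings) / len(earnings)) ** 0.5

# (a) subtract the mean from each score, (b) divide by the standard deviation
z_scores = [(x - mean) / sd for x in earnings]
```

After this transformation the scores have a mean of 0 and a standard deviation of 1, which is exactly what "standardized" means here.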
Results: First section
β: Like Pearson r (the typical correlation coefficient), β gives you an index of the relationship of the predictor variable with the outcome variable. Indeed, if the regression equation has only one predictor, β will be the same as r. One way of interpreting β is to say that for every increase of one standard deviation in the predictor variable, the outcome variable will change by β standard deviations. In this set of studies, a β of .18 for height means that an increase of one standard deviation in height would be accompanied by a .18 standard deviation increase in earnings.
Two warnings about interpreting β.
1. Although it is tempting to say that β tells you how important a predictor is, realize that the size of β for one predictor is affected by what other predictors are in the equation and by how those predictors correlate with each other (to see why, see pages 533-535 of Research design explained).
2. Remember that, in a correlational study, β cannot tell you the effect of a variable because correlational studies do not allow you to make cause-effect conclusions.
multiple correlation: Like a Pearson r, the multiple correlation (abbreviated R) is a measure of the association between one variable (earnings) and a second variable. The main difference is that whereas the second variable in a Pearson r would be a single observed variable (height), the second variable in R is the predicted value of the first variable (estimated earnings)—and this second variable is a combination of several predictor variables. Specifically, it is calculated by weighting a combination of several variables (height, weight, age, and gender) in a way that would make the predicted values (estimated earnings) as close to the actual values (actual earnings) as possible. In short, one difference between r and R is that R is looking at the correlation between a set of predictors and a to-be-predicted variable whereas r is looking at the correlation between one predictor and a to-be-predicted variable. Another difference is that whereas r can vary between –1 and +1, R can only vary between 0 and 1: R cannot be negative.
R2: Just as we can square r to calculate how well scores on one variable predict scores on another variable (specifically, r2 gives us the coefficient of determination: the percentage of variability in scores on one variable that can be predicted by knowing scores on another variable), we can square R to calculate how well scores on one set of variables predict scores on another variable (specifically, R2 tells us the percentage of variability in scores on one variable that can be predicted by knowing the scores on another set of variables).
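The relationship between R and R2 can be made concrete with a small regression sketch. The data below are fabricated for illustration (the coefficients, sample size, and noise level are arbitrary assumptions, not values from the article); the point is only that R is the correlation between predicted and actual outcome values, and that R2 equals the proportion of outcome variance the predictor set accounts for.

```python
import numpy as np

# Fabricated toy data: two predictors and an outcome (not the article's data)
rng = np.random.default_rng(0)
height = rng.normal(170, 10, 200)   # cm
age = rng.normal(40, 8, 200)
earnings = 100 * height + 50 * age + rng.normal(0, 500, 200)

# Design matrix with an intercept column; least squares picks the weights
# that make predicted earnings as close to actual earnings as possible
X = np.column_stack([np.ones_like(height), height, age])
coefs, *_ = np.linalg.lstsq(X, earnings, rcond=None)
predicted = X @ coefs

# R is the correlation between the predicted and the actual outcome values
R = np.corrcoef(predicted, earnings)[0, 1]
R2 = R ** 2  # proportion of variance in earnings predicted by the set
```

With an intercept in the model, this R2 is identical to 1 minus (residual sum of squares / total sum of squares), the usual definition of variance explained.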
Unstandardized regression coefficients: Regression coefficients are the numbers you multiply your predictors by to get your predicted scores (e.g., estimated earnings). Unstandardized regression coefficients (B) are numbers that, when you use them in your regression equation, will give you the predicted result in terms of the original units in which your outcome variable was measured (in this case, dollars earned). The problem with unstandardized regression coefficients is that they depend on the measurement units you use and, consequently, they make it hard to compare the relative importance of different variables. For example, if we measure height in inches, the B for height would be one number, but if we measured height in centimeters, B would be a different number. Similarly, B would vary considerably depending on whether we measured weight in pounds, ounces, or kilograms. If we then try to compare the B for height with the B for weight, the B for weight will be bigger than the B for height when we use certain measurement units (ounces, feet), but the B for weight will be smaller than the B for height if we use other measurement units (kilograms, centimeters). In short, comparing the B weights of different variables is like comparing apples with oranges. Standardized regression coefficients (β) do not deal with a variable’s original (raw) units (pounds, inches, dollars, or whatever), but instead deal with standardized (standard deviation) units. Specifically, standardized regression coefficients tell you how many standard deviations the to-be-predicted variable could be expected to change when the predictor variable changes by one standard deviation. Thus, a β of .40 for a predictor suggests that a one standard deviation change in that predictor will be accompanied by a change of .40 standard deviations on the to-be-predicted variable.
Often, standardized regression coefficients are reported instead of unstandardized coefficients because standardized coefficients allow readers to more easily compare the relative importance of the different predictors (e.g., a predictor with a β of .40 would seem to be twice as useful as a predictor with a β of .20). Note that we say seem because the size of a predictor’s β will depend on what other predictors are in the equation and how the predictors correlate with one another.
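The unit-dependence of B (and the unit-independence of the standardized coefficient) can be demonstrated directly. The data below are fabricated for illustration; the slope, noise level, and sample size are arbitrary assumptions, not values from the article.

```python
import numpy as np

# Fabricated data: height predicts earnings (illustrative only)
rng = np.random.default_rng(1)
height_in = rng.normal(68, 3, 500)                      # inches
earnings = 2000 * height_in + rng.normal(0, 5000, 500)  # dollars
height_cm = height_in * 2.54                            # same heights, new units

def simple_B(x, y):
    # Unstandardized slope from a one-predictor regression: cov(x, y) / var(x)
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

def simple_beta(x, y):
    # Standardized slope: the same regression run on z-scores;
    # with one predictor it equals Pearson r
    zx = (x - x.mean()) / x.std(ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)
    return np.cov(zx, zy)[0, 1] / np.var(zx, ddof=1)

B_in, B_cm = simple_B(height_in, earnings), simple_B(height_cm, earnings)
beta_in, beta_cm = simple_beta(height_in, earnings), simple_beta(height_cm, earnings)
# B changes when the units change (B_cm = B_in / 2.54);
# the standardized coefficient is identical either way
```

Switching from inches to centimeters shrinks B by a factor of 2.54 even though nothing about the height-earnings relationship changed, which is exactly why B values for different variables are hard to compare.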
Results: Differential effects by Gender
Basically, the authors are (a) comparing earnings of men who are taller than 2 out of 3 men with earnings of men who are shorter than 2 out of 3 men and (b) comparing earnings of women who score in the upper third of height with earnings of women who score in the lower third of height.
The authors tested to see whether the correlation between height and earnings for men was higher than the correlation between height and earnings for women. Often, people fail to do this direct test. Instead of comparing the correlations with each other, people often incorrectly assert that (a) the correlations are not different because both correlations are (or both are not) significantly different from zero, or that (b) the correlations are different because one is significantly different from zero, but the other is not. The only way to know that the correlations are different from each other is to compare them with each other.
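A standard way to directly compare two independent correlations is Fisher's r-to-z transformation (the article's exact procedure may differ; this is one common approach, and the r and n values below are made up for illustration).

```python
import math

def fisher_z_diff(r1, n1, r2, n2):
    """Two-tailed test of whether two independent correlations differ,
    using Fisher's r-to-z transformation."""
    z1 = math.atanh(r1)                         # transform each r to z
    z2 = math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # SE of the difference
    z = (z1 - z2) / se
    # Two-tailed p from the standard normal distribution
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# Hypothetical example: men's r = .50 (n = 500) vs. women's r = .10 (n = 500)
z_stat, p_value = fisher_z_diff(0.50, 500, 0.10, 500)
```

Note that this tests the difference between the correlations directly, rather than checking whether each correlation is separately significant, which is exactly the error described above.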
The authors added a variable (the gender by height interaction) to the regression equation to determine whether that variable (representing height having a different effect for one gender than for another) would be an improvement over an equation that did not consider the possibility of the height and income relationship being different for men than women. In other words, they looked to see if gender was a moderator variable (for more about moderator variables, see pages 51 and 52 of Research design explained). Including the gender by height interaction variable did not improve prediction.
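This moderator test can be sketched as a hierarchical regression: fit the model with the main effects only, then add the gender × height product term and see how much R2 improves. The data below are fabricated (and deliberately contain no interaction), so the improvement should be near zero; none of the numbers come from the article.

```python
import numpy as np

# Fabricated data with main effects but no gender-by-height interaction
rng = np.random.default_rng(2)
n = 300
gender = rng.integers(0, 2, n).astype(float)   # coded 0/1
height = rng.normal(0, 1, n)                   # standardized height
earnings = 0.3 * height + 0.2 * gender + rng.normal(0, 1, n)

def r_squared(X, y):
    # R2 from an ordinary least-squares fit (intercept included in X)
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coefs
    return 1 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

ones = np.ones(n)
X_main = np.column_stack([ones, gender, height])
X_inter = np.column_stack([ones, gender, height, gender * height])

delta_R2 = r_squared(X_inter, earnings) - r_squared(X_main, earnings)
# A near-zero delta_R2 means the interaction adds little predictive power,
# i.e., gender does not moderate the height-earnings relationship here
```

This delta_R2 is the "incremental R2" (ΔR2) defined in the glossary entries below.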
Incremental R2: increase in R2
Δ: change
ΔR2: change in R2; improvement in ability to predict the outcome variable (income); incremental R2
The “diagonal” is the series of 1.00s, which indicate the correlation of a variable with itself. Thus, the “1.00” in the first entry of the column labeled “1” is the correlation of participants’ gender with participants’ gender. The correlation of each variable with itself is obviously a meaningless correlation and is often starred or blanked out in correlational tables. The numbers below the 1.00s, such as the -.03 in the second row of the column labeled “1,” indicate Study 1 correlations. Thus, that -.03 stands for the correlation between age and gender in Study 1. The numbers above the 1.00s, such as the .04 in the first row of the column labeled “2,” are correlations for Study 2. Thus, that .04 in the first row of the column labeled “2” is the correlation between gender and age in Study 2.
Results: Does the height effect decline over time?
The first sentence is fairly difficult to understand. However, the rest of the paragraph clarifies that sentence.
r = .26: r is an estimate of what the true correlation between two variables would be if the predictor variable had been perfectly reliable. This estimate takes into account that the observed correlation understates the true relationship between the variables because (a) random error is affecting scores on the measure of the predictor variable and (b) this random error moves the observed correlation toward zero (because random error does not correlate with anything, including the to-be-predicted variable).
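The adjustment described above is the classic correction for attenuation: divide the observed correlation by the square root of the predictor measure's reliability. The numbers below are illustrative assumptions, not the article's actual values.

```python
# Correction for attenuation due to unreliability in the predictor measure.
# Both numbers are hypothetical, chosen only to illustrate the arithmetic.
r_observed = 0.22   # observed height-earnings correlation
alpha = 0.72        # reliability (Cronbach's alpha) of the height measure

# Disattenuated estimate of the true correlation
r_corrected = r_observed / alpha ** 0.5
```

Because the reliability is less than 1, the corrected correlation is always larger in magnitude than the observed one, which is why the estimate reported in the article exceeds the raw correlation.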
Dissipates: disappears, fades away
Garner: get, earn
Explicitly considered: stated as a requirement
Implicitly: not stated, but still used
Bona fide occupational qualification: legally legitimate requirement that one should possess in order to hold a certain job.
Discussion: Limitations and strengths
Open the black box: find out what causes a factor to have an effect; locate the mediating variable (for more about mediating variables, see pages 50-51 of Research design explained).
Demand cues: demand characteristics (see page 93 of Research design explained)
Methodological artifacts: misleading results due to problems in the design of the study
Robustness: strong, generalizable
Almost simultaneously: almost at the same time
Proxies: substitutes for, indexes of