You can assign the following article when covering Chapter 6 (because it describes an archival study) or when covering Chapter 7 (because it uses survey data):

Judge, T. A., & Cable, D. M. (2004). The effect of physical height on workplace success and income: Preliminary test of a theoretical model. Journal of Applied Psychology, 89, 428-441.

 

This article addresses an interesting topic (the hypothesis that being tall leads to being paid more [a hypothesis that has been unofficially proposed many times in the Dilbert cartoon strip]), discusses regression in a straightforward way, and is relatively easy for students to read. To make the article easier for students to digest, (a) have students skip the first few pages and begin reading on page 433 (starting with the heading “Estimating the effect of height on earnings”) and (b) give students Table 1.

 

 

Table 1

Helping Students Understand the Article

Section

 Tips, Comments, and Problem Areas

Estimating the effect of height on earnings

Gender

Note that, in scientific writing, authors are expected to use the metric system. Thus, the authors state heights in centimeters (abbreviated cm).

Age

Implicitly norm height by age: consider how tall someone is for their age. Thus, a 6-foot-tall teenager will not be considered as tall as a 6-foot-tall eighty-year-old.

Method

Primary: main

Study 2

Measures

Averaging scores (in this case, earnings) increases reliability because random error tends to balance out (see Chapter 4).

α: Cronbach’s alpha, a measure of internal consistency (see page 104).

Study 3

Measures (last sentence)

“this variable was standardized…”: the authors adjusted the scores so that the mean income would be zero and the standard deviation of the scores would be 1. (To standardize the scores, the authors [a] took each score and subtracted the mean of the original scores from it and then [b] divided the result of each subtraction by the standard deviation of the original scores.)
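If students want to see the arithmetic, the following sketch standardizes a set of scores; the earnings values are invented for illustration and are not from the article.

```python
# Minimal sketch of standardizing scores (computing z-scores).
import numpy as np

earnings = np.array([32000, 41000, 55000, 47000, 60000], dtype=float)  # invented values

# Subtract the mean from each score, then divide by the standard deviation.
z = (earnings - earnings.mean()) / earnings.std()

print(round(z.mean(), 10))  # approximately 0
print(round(z.std(), 10))   # 1.0
```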

Results: First section

2nd paragraph

β: Like Pearson r (the typical correlation coefficient), β gives you an index of the relationship of the predictor variable with the outcome variable. Indeed, if the regression equation has only one predictor, β will be the same as r. One way of interpreting β is to say that for every increase of one standard deviation in the predictor variable, the outcome variable will change by β standard deviations. In this set of studies, a β of .18 for height means that an increase of one standard deviation in height would be accompanied by a .18 standard deviation increase in earnings.
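To illustrate the point that, with a single predictor, β equals Pearson r, here is a small sketch using invented height and earnings numbers (not the article's data):

```python
# Sketch: with one predictor, the standardized regression coefficient equals Pearson r.
import numpy as np

height = np.array([160., 165., 170., 175., 180., 185.])   # invented, in cm
earnings = np.array([30., 34., 33., 40., 42., 45.])        # invented, in thousands of dollars

r = np.corrcoef(height, earnings)[0, 1]

# Standardize both variables, then fit a regression through the origin.
zx = (height - height.mean()) / height.std()
zy = (earnings - earnings.mean()) / earnings.std()
beta = np.linalg.lstsq(zx.reshape(-1, 1), zy, rcond=None)[0][0]

print(round(r, 4), round(beta, 4))  # the two numbers match
```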

Two warnings about interpreting β:

1. Although it is tempting to say that β tells you how important a predictor is, realize that the size of β for one predictor is affected by what other predictors are in the equation and by how those predictors correlate with each other (to see why, see pages 533-535 of Research design explained).

2. Remember that, in a correlational study, β cannot tell you the effect of a variable because correlational studies do not allow you to make cause-effect conclusions.

multiple correlation: Like a Pearson r, the multiple correlation (abbreviated R) is a measure of the association between one variable (earnings) and a second variable. The main difference is that whereas the second variable in a Pearson r would be a single observed variable (height), the second variable in R is the predicted value of the first variable (estimated earnings), a value that is a combination of several predictor variables. Specifically, it is calculated by weighting a combination of several variables (height, weight, age, and gender) in a way that makes the predicted values (estimated earnings) as close to the actual values (actual earnings) as possible. In short, one difference between r and R is that R looks at the correlation between a set of predictors and a to-be-predicted variable, whereas r looks at the correlation between one predictor and a to-be-predicted variable. Another difference is that whereas r can vary between –1 and +1, R can vary only between 0 and 1: R cannot be negative.

 

R2: Just as we can square r to calculate how well scores on one variable predict scores on another variable (specifically, r2 gives us the coefficient of determination: the percentage of variability in scores on one variable that can be predicted by knowing scores on another variable), we can square R to calculate how well scores on one set of variables predict scores on another variable (specifically, R2 tells us the percentage of variability in scores on one variable that can be predicted by knowing the scores on another set of variables).
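One way to make R and R2 concrete is to fit a regression, save the predicted values, and correlate them with the actual values. The sketch below does that with simulated data (the variable names echo the article, but the numbers are invented):

```python
# Sketch: the multiple correlation R is the correlation between predicted and actual scores.
import numpy as np

rng = np.random.default_rng(0)
n = 200
height = rng.normal(178, 8, n)       # invented data
weight = rng.normal(80, 12, n)
age = rng.normal(45, 10, n)
gender = rng.integers(0, 2, n).astype(float)
earnings = 300 * height + 150 * age + 2000 * gender + rng.normal(0, 5000, n)

X = np.column_stack([np.ones(n), height, weight, age, gender])  # intercept plus predictors
coefs, *_ = np.linalg.lstsq(X, earnings, rcond=None)            # least-squares weights
predicted = X @ coefs

R = np.corrcoef(predicted, earnings)[0, 1]
print(round(R, 3), round(R ** 2, 3))  # R (0 to 1) and the proportion of variance predicted
```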

3rd paragraph

Unstandardized regression coefficients: Regression coefficients are the numbers you multiply your predictors by to get your predicted scores (e.g., estimated earnings). Unstandardized regression coefficients (B) are numbers that, when you use them in your regression equation, will give you the predicted result in terms of the original way your outcome variable was measured (in this case, dollars earned). The problem with unstandardized regression coefficients is that they depend on the measurement unit you use and, consequently, they make it hard to compare the relative importance of different variables. For example, if we measure height in inches, the B for height would be one number, but if we measure height in centimeters, B would be a different number. Similarly, B would vary considerably depending on whether we measured weight in pounds, ounces, or kilograms. If we then try to compare the B for height with the B for weight, the B for weight may be bigger than the B for height with one choice of measurement units but smaller than the B for height with another choice. In short, comparing the B weights of different variables is like comparing apples with oranges.

Standardized regression coefficients (β) do not deal with variables’ original (raw) units (pounds, inches, dollars, or whatever), but instead deal with standardized (standard deviation) units. Specifically, standardized regression coefficients tell you how many standard deviations the to-be-predicted variable could be expected to change when the predictor variable changes by one standard deviation. Thus, a β of .40 for a predictor suggests that a one standard deviation change in that predictor will be accompanied by a change of .40 standard deviations on the to-be-predicted variable. Often, standardized regression coefficients are reported instead of unstandardized coefficients because standardized coefficients allow readers to more easily compare the relative importance of the different predictors (e.g., a predictor with a β of .40 would seem to be twice as useful as a predictor with a β of .20). Note that we say “seem” because the size of a predictor’s β will depend on what other predictors are in the equation and how the predictors correlate with one another.
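The unit-dependence of B is easy to demonstrate: refit the same regression after converting the predictor to different units and B changes, whereas the standardized coefficient does not. The sketch below uses invented data:

```python
# Sketch: unstandardized B depends on the predictor's units; the standardized coefficient does not.
import numpy as np

rng = np.random.default_rng(1)
n = 500
height_cm = rng.normal(175, 7, n)                     # invented data
earnings = 400 * height_cm + rng.normal(0, 6000, n)

def slope(x, y):
    """Coefficient for x from a one-predictor regression with an intercept."""
    X = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

B_per_cm = slope(height_cm, earnings)            # dollars per centimeter
B_per_inch = slope(height_cm / 2.54, earnings)   # dollars per inch (about 2.54 times larger)

def standardized_slope(x, y):
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    return slope(zx, zy)

print(round(B_per_cm, 1), round(B_per_inch, 1))   # different numbers for the same relationship
print(round(standardized_slope(height_cm, earnings), 3),
      round(standardized_slope(height_cm / 2.54, earnings), 3))  # identical
```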

Results: Differential effects by Gender

1st paragraph

Basically, the authors are (a) comparing earnings of men who are taller than 2 out of 3 men with earnings of men who are shorter than 2 out of 3 men and (b) comparing earnings of women who score in the upper third of height with earnings of women who score in the lower third of height.

2nd paragraph

The authors tested to see whether the correlation between height and earnings for men was higher than the correlation between height and earnings for women. Often, people fail to do this direct test. Instead of comparing the correlations with each other, people often incorrectly assert that (a) the correlations are not different because both correlations are (or both are not) significantly different from zero, or that (b) the correlations are different because one is significantly different from zero, but the other is not. The only way to know that the correlations are different from each other is to compare them with each other.
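For instructors who want to show students what a direct test looks like, one standard approach (not necessarily the authors' exact procedure) is Fisher's r-to-z test for the difference between two independent correlations. A sketch with invented correlations and sample sizes:

```python
# Sketch: directly comparing two independent correlations with Fisher's r-to-z test.
import math

def fisher_z(r):
    return 0.5 * math.log((1 + r) / (1 - r))

def compare_correlations(r1, n1, r2, n2):
    z = (fisher_z(r1) - fisher_z(r2)) / math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed p from the standard normal distribution
    return z, p

# Invented values: a height-earnings correlation for men (n = 400) and for women (n = 350).
z, p = compare_correlations(r1=0.30, n1=400, r2=0.15, n2=350)
print(round(z, 2), round(p, 3))
```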

3rd paragraph

The authors added a variable (the gender by height interaction) to the regression equation to determine whether that variable (representing height having a different effect for one gender than for the other) would improve on an equation that did not consider the possibility that the relationship between height and income differs for men and women. In other words, they looked to see whether gender was a moderator variable (for more about moderator variables, see pages 51 and 52 of Research design explained). Including the gender by height interaction variable did not improve prediction.

Incremental R2: increase in R2

Δ: change

ΔR2: change in R2; the improvement in ability to predict the outcome variable (income); incremental R2
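The logic of ΔR2 can be shown in a few lines: fit the model without the interaction, fit it again with the gender × height product term added, and subtract the two R2 values. The sketch below uses simulated data in which there is no true interaction, so ΔR2 comes out near zero (as the authors found):

```python
# Sketch: incremental (delta) R-squared from adding a gender x height interaction term.
import numpy as np

rng = np.random.default_rng(2)
n = 300
height = rng.normal(172, 9, n)                       # invented data
gender = rng.integers(0, 2, n).astype(float)
earnings = 350 * height + 3000 * gender + rng.normal(0, 5000, n)  # no true interaction built in

def r_squared(X, y):
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.corrcoef(X @ coefs, y)[0, 1] ** 2

base = np.column_stack([np.ones(n), height, gender])
with_interaction = np.column_stack([base, height * gender])

delta_r2 = r_squared(with_interaction, earnings) - r_squared(base, earnings)
print(round(delta_r2, 4))  # close to zero: the interaction adds little predictive power
```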

Table 2

The “diagonal” is the series of 1.00s, which indicate the correlation of a variable with itself. Thus, the “1.00” in the first entry of the column labeled “1” is the correlation of participants’ gender with participants’ gender. The correlation of each variable with itself is obviously a meaningless correlation and is often starred or blanked out in correlation tables. The numbers below the 1.00s, such as the -.03 in the second row of the column labeled “1,” are Study 1 correlations. Thus, that -.03 stands for the correlation between age and gender in Study 1. The numbers above the 1.00s, such as the .04 in the first row of the column labeled “2,” are correlations for Study 2. Thus, that .04 in the first row of the column labeled “2” is the correlation between gender and age in Study 2.
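If it helps, the layout of such a table can be reproduced in a few lines: put one study's correlations below the diagonal, the other study's above it, and 1.00 on the diagonal. The sketch below uses random data for three unnamed variables:

```python
# Sketch: building a correlation table with Study 1 below the diagonal and Study 2 above it.
import numpy as np

rng = np.random.default_rng(3)
study1 = rng.normal(size=(100, 3))   # 100 invented participants, three variables
study2 = rng.normal(size=(120, 3))   # the same three variables in a second invented sample

r1 = np.corrcoef(study1, rowvar=False)   # Study 1 correlation matrix
r2 = np.corrcoef(study2, rowvar=False)   # Study 2 correlation matrix

table = np.tril(r1, k=-1) + np.triu(r2, k=1) + np.eye(3)
print(np.round(table, 2))
```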

Results: Does the height effect decline over time?

The first sentence is fairly difficult to understand. However, the rest of the paragraph clarifies that sentence.

Discussion: Beginning

5th paragraph

ρ = .26: ρ is an estimate of what the true correlation between two variables would be if the predictor variable had been perfectly reliable. This estimate takes into account that the observed correlation understates the true relationship between the variables because (a) random error is affecting scores on the measure of the predictor variable and (b) this random error moves the observed correlation toward zero (because random error does not correlate with anything, including the to-be-predicted variable).
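The correction being described is the standard correction for attenuation due to unreliability in the predictor; whether the authors computed their estimate exactly this way is an assumption, and the numbers below are invented:

```python
# Sketch: disattenuating a correlation for unreliability in the predictor.
import math

r_observed = 0.22    # invented observed correlation
reliability = 0.70   # invented reliability (e.g., Cronbach's alpha) of the predictor's measure

rho_estimate = r_observed / math.sqrt(reliability)
print(round(rho_estimate, 2))  # estimated correlation if the predictor were perfectly reliable
```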

Ephemeral: short-lasting

Dissipates: disappears, fades away

Garner: get, earn

Explicitly considered: stated as a requirement

Implicitly: not stated, but still used

Bona fide occupational qualification: legally legitimate requirement that one should possess in order to hold a certain job.

Proxy: substitute

Discussion: Limitations and strengths

Open the black box: find out what causes a factor to have an effect; locate the mediating variable (for more about mediating variables, see pages 50-51 of Research design explained).

Demand cues: demand characteristics (see page 93 of Research design explained)

Perceptual: subjective

Methodological artifacts: misleading results due to problems in the design of the study

Robustness: strength, generalizability

Diverse: different

Almost simultaneously: almost at the same time

Proxies: substitutes for, serves as an index of

