The Bell Curve: Corrected for Skew

This commentary documents serious pitfalls in the statistical analyses and the interpretation of empirical evidence presented in The Bell Curve. Most importantly, the role of education is re-evaluated and it is shown how, by neglecting it, The Bell Curve grossly overstates the case for IQ as a dominant determinant of social success. The commentary calls attention to important features of logistic regression coefficients, discusses sampling and measurement uncertainties of estimates based on observational sample data, and points to substantial limitations in interpreting regression coefficients of correlated variables.


Introduction
The Bell Curve by Richard Herrnstein and Charles Murray (henceforth H&M) puts forward a strong thesis about the centrality of intelligence in determining contemporary American social structure. Following its publication in October 1994, The Bell Curve sparked an intense public debate over its assertions, methodology and conclusions. Most of the book's critics, in a flood of newspaper articles, TV talk shows, academic journal articles and a few books, focused on The Bell Curve's treatment of ethnic and racial group differences in intelligence, the role of heredity in determining these differences, and the social and political agenda advocated by H&M. The heated debate was clearly another wave of the controversies about genes, IQ and public policy (see, e.g., Cronbach, 1975).
The Bell Curve is distinguished by its extensive use of statistical analyses to support a strong social theory. Other authors have provided critical examination of some statistical and measurement aspects of The Bell Curve, raising concerns about the appropriateness of causal inferences, model specification (most notably the absence of measures of education from the models), model fit and the validity of IQ and SES measures, among other issues. Some of these concerns will be echoed here in detail. The current commentary will go beyond delineation of these issues in principle or theory, to reexamine the statistical evidence and to analyze further the data presented in The Bell Curve.
H&M explore the relationship between social stratification and the distribution of cognitive abilities which, according to their thesis, will inevitably lead to a "world in which cognitive ability is the decisive dividing force" (p. 25). Part I of the book is devoted to an elaborate exposition of the emergence and the increasing isolation of a "cognitive elite", driven by radical transformations in educational, occupational and economic forces in American society throughout the twentieth century. What are the consequences of this current American landscape that has been stratified so forcefully according to cognitive ability?
In Part II of the book, H&M launch a series of statistical analyses to examine the role of intelligence, as measured by an IQ test, in determining a myriad of social ailments such as poverty, school dropout, unemployment and labor force dropout, welfare dependency and criminal behavior. The analyses of Part II use a sub-sample of non-Latino white respondents from the National Longitudinal Survey of Youth (NLSY)--a nationally representative sample of 12,686 young men and young women who were 14 to 22 years of age when they were first surveyed in 1979. By focusing on the white sub-sample, H&M argue that "cognitive ability affects social behavior without regard to race and ethnicity" (p. 125). Only later, in Part III, when the importance of intelligence as a powerful determinant of social behavior has been allegedly demonstrated, do H&M turn to examine ethnic and racial group differences. An evaluation of the scientific merit of the book will best be served by focusing on how H&M handle and present the less controversial evidence about the role of intelligence in the lives of young white Americans. As Charles Murray notes, "perhaps the most important section of The Bell Curve is Part II" (1995, p. 27). Indeed, many of the arguments and conclusions to appear later in the book rely heavily on the success of the case made in Part II, which constitutes (together with Appendices 2, 3, and 4) a dense collection of statistics, tables, graphs, and technical details. H&M use the case of poverty, presented in Chapter 6, to "set the stage for the social behaviors to follow" (p. 125). This chapter provides a basic template for their formulation of research questions, analysis strategies and use and interpretation of statistical methods. As such, it will be appropriate to focus here in some detail on this chapter. Chapter 6 asks, "What causes poverty?", or more specifically, "If you have to choose, is it better to be born smart or rich?" (p. 127). Let us examine how H&M arrive at what they claim is an "unequivocal" answer: "smart".

Logistic Regression Coefficients
The basic analytical tool H&M employ is a set of multiple regression equations. The independent variables are IQ, SES, and age. (Age is included in the models because of the nature of the NLSY sample. It is inconsequential to the arguments presented here and will not be further discussed.) The IQ test used throughout The Bell Curve is the Armed Forces Qualification Test (AFQT), a subset of the Armed Services Vocational Aptitude Battery (ASVAB). The SES measure is an average of standardized parental education, parental occupation, and family income. The dependent variable is whether a respondent in the NLSY was below the poverty line in 1989. H&M examine the regression results: they observe that the IQ regression coefficient (-.84) is much larger than the SES coefficient (-.33); they then plot a graph showing how the probability of being in poverty is predicted by the model as a function of IQ or SES, holding the other variable constant at its average value. (The regression equation is given on p. 596, and the graph on p. 134.) H&M conclude: "Cognitive ability is more important than parental SES in determining poverty" (p. 135), independent of any role SES might play in determining the likelihood of poverty. How warranted is this conclusion?
For those not versed in the details of regression analysis, H&M provide a primer in Appendix 1 (pp. 553-577) entitled: "Statistics for People Who Are Sure They Can't Learn Statistics." After explaining basic statistical concepts, multiple linear regression is introduced. Logistic regression, the technique employed throughout Part II, is presented as a simple adaptation of linear regression to handle binary outcomes: "It tells us how much change there is in the probability of being unemployed, married, and so forth, given a unit change in any given variable, holding all the other variables in the analysis constant" (p. 567). The unsuspecting reader misses one important point: The value chosen at which to "hold a variable in the analysis constant" has a direct impact on the magnitude of anticipated change in the probability of the outcome, given a unit change in any other variable. H&M identify the mathematical function responsible for this behavior of the logistic regression, the log odds, or logistic function, later in the introduction to the results in Appendix 4, but they are silent about its consequences. As we shall see, this seemingly insignificant technical point has crucial implications for the interpretation of logistic regression results on a probability scale.
Let us examine what happens when we use the same regression coefficients, the same model, but decide to hold SES at other values than its average. Should we expect to see any noticeable difference in the relations between IQ and the probability of being in poverty? After all, we are still holding SES constant, and, as H&M assure us, "here is the relationship of IQ to social behavior X after the effects of socioeconomic background have been extracted" (p. 123).
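The point can be made explicit with a short derivation. This is a sketch in notation that does not appear in the source: p stands for the predicted probability of poverty, the betas for the model coefficients (the intercept is a placeholder symbol), and age is dropped for brevity.

```latex
\log\frac{p}{1-p} = \beta_0 + \beta_{IQ}\,\mathrm{IQ} + \beta_{SES}\,\mathrm{SES}
\quad\Longrightarrow\quad
\frac{\partial p}{\partial\,\mathrm{IQ}} = \beta_{IQ}\,p\,(1-p)
```

The coefficient is constant on the log-odds scale, but the change in probability is scaled by p(1-p), which itself depends on the values at which IQ and SES are held.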
Figure 1 depicts the predicted probabilities of being in poverty as a function of IQ at three values of SES: the SES average (the one shown in The Bell Curve), and 2 standard deviations above and below the SES average. Contrary to what we might have expected after being told that the effects of SES have been extracted, the effect of IQ on the probability of being in poverty is much stronger when SES level is lower; it is much weaker when SES level is higher! This is a necessary consequence of the nature of the logistic regression model. For persons with lower socioeconomic status, the anticipated change in the likelihood of being poor associated with a unit change in IQ is much larger than for those with higher socioeconomic status. This means that the risk of poverty induced by having lower intelligence is far more pronounced under conditions of adverse family environment. On the other hand, the privileges of a sound family background seem to mitigate the harsh consequences of lacking in cognitive abilities.
Take for example two persons, a "smart" person with an IQ of 115 (one standard deviation above the average), and a "dull" person with an IQ of 85 (one standard deviation below the average). How do they compare in their respective risks of being poor? If they both come from an extremely poor background, the "dull" person's probability of being in poverty is 18 percentage points higher than the "smart" person's; on the other hand, if they both come from a family of extremely high socioeconomic status, the difference shrinks to only 6 percentage points. If we return to H&M's original assertion about the logistic regression coefficient as indicating how much change will occur in the probability of poverty, given a unit change in IQ, we find that a two-unit change in IQ (moving from -1 to 1 in standard deviations) means three times more change in the probability of being poor for those with low SES compared with those with high SES. So much for "holding all the other variables in the analysis constant".
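The comparison above can be reproduced approximately from the two coefficients quoted earlier. What follows is a minimal sketch, not H&M's own computation: the intercept is not quoted in this commentary, so the value used here is an assumption chosen to bring the predictions near the figures in the text; age is held at its mean and dropped; and the function names are illustrative.

```python
import math

# Coefficients quoted in the text for standardized IQ and SES.
B_IQ = -0.84
B_SES = -0.33
# ASSUMPTION: the intercept is not quoted in this commentary. The value
# below is illustrative, picked so predictions fall near the quoted figures.
INTERCEPT = -2.65


def poverty_probability(iq_z, ses_z):
    """Predicted probability of poverty (age held at its mean and omitted)."""
    log_odds = INTERCEPT + B_IQ * iq_z + B_SES * ses_z
    return 1.0 / (1.0 + math.exp(-log_odds))


def iq_gap(ses_z):
    """Probability difference between IQ 85 (-1 SD) and IQ 115 (+1 SD)."""
    return poverty_probability(-1.0, ses_z) - poverty_probability(1.0, ses_z)


# Same coefficients, same model, three different values of "constant" SES.
for ses_z in (-2.0, 0.0, 2.0):
    print(f"SES {ses_z:+.0f} SD: IQ gap in probability = {iq_gap(ses_z):.3f}")
```

Under these assumptions the gap at two standard deviations below the SES average is about three times the gap at two standard deviations above it, in line with the figures quoted above.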
Clearly, Figure 1 tells a more complicated story than the one H&M would have the student of their statistics primer believe on the basis of interpreting the logistic regression coefficients as if they were linear or additive. Even more experienced researchers, who routinely run linear regression analyses, need more than what H&M are willing to provide as a guide to the proper interpretation of their logistic regression results. In the authoritative source on Generalized Linear Models, of which logistic regression is a special case, McCullagh and Nelder (1989) provide such guidance, as well as call attention to the fact that "...statements given on the probability scale are more complicated because the effect on [the probability of an outcome] of a unit change in X2 depends on the values of X1 and X2" (p. 110; italics added). In discussing the "special case of education" (we shall have more to say on this later), H&M quite rightly assert that "...to take education's regression coefficient seriously tacitly assumes that intelligence and education could vary independently and produce similar results. No one can believe this to be true in general: indisputably, giving nineteen years of education to a person with IQ of 75 is not going to have the same impact on life as it would for a person with an IQ of 125" (p. 125). Why should we, then, take the IQ regression coefficient seriously when, as we just saw, having a high (or low) IQ for a person coming from a poor background is not going to have the same impact on life as for a person coming from a wealthy background?
Let us now review the substantive conclusion H&M draw from the regression results: "If a white child of the next generation is given a choice between being disadvantaged in socioeconomic status or disadvantaged in intelligence, there is no question about the right choice" (p. 135). Indeed, there is no question: If your parents are rich enough, you can afford to be very dull and still can expect to escape poverty. If, on the other hand, you made the poor (literally) choice of being born to a low SES family, chances are that intellectual weakness will carry grave consequences for you. This, of course, is a caricature of serious hypothesizing about the dynamics of cognitive abilities and social conditions, but it brings us to the next issue--the independence (or the lack thereof) of independent variables.

Independence of Independent Variables
H&M point out that "variables that are closely related can in some circumstances produce a technical problem known as multicollinearity, whereby the solutions produced by regression equations are unstable and often misleading" (pp. 124-125; italics in original). Attention to potential effects of multicollinearity (meaning simply that the independent variables are correlated with each other) is indeed warranted when dealing with an attempt to disentangle via statistical analysis the effects of variables that are highly correlated in nature. Observing correlations of .50 and .64 between education and SES and IQ, respectively, causes H&M to raise a concern about the interpretation of a regression model that includes all three of them as independent variables. But what about the association between SES and IQ? Are they free to vary independently? Are they sufficiently uncorrelated as not to sound a similar alarm?
The correlation between the AFQT scores and parental SES in the NLSY data is .55. After reporting this correlation, H&M summarize: "Being brought up in a conspicuously high-status or low-status family from birth probably has a significant effect on IQ, independent of the genetic endowment of the parent" (p. 589). Although the magnitude of these effects or their explanation are debatable, the IQ scores used in The Bell Curve to demonstrate the independent role of a cognitive endowment are caused to an important degree by parents' SES. This means, to rephrase H&M's argument about ignoring years of education in their regressions, that when IQ is used as an independent variable, it is to some extent expressing the effects of SES in another form. Can this be solved by the machinery of multiple regression? It is too often believed that regression analysis provides the proper statistical control ("accounting for" is the usual term) that mathematically remedies the confounding of effects imposed by the realities of the investigated phenomenon or by the study design. The answer is an unequivocal "No." Neter, Wasserman, and Kutner (1990) explain: "Sometimes the standardized regression coefficients, b1 and b2, are interpreted as showing that X1 has a greater impact on the [outcome variable] than X2 because b1 is much larger than b2. However, ...one must be cautious about interpreting regression coefficients, whether standardized or not. The reason is that when the independent variables are correlated among themselves, as here, the regression coefficients are affected by the other independent variables in the model." (By a happy circumstance, the correlation alluded to in this section is .569, almost exactly the correlation between IQ and SES!) "Hence, it is ordinarily not wise to interpret the magnitudes of standardized regression coefficients as reflecting the comparative importance of the independent variables" (p. 294).
For a detailed discussion of these issues, the reader is invited to consult Chapter 13 of Mosteller & Tukey's Data Analysis and Regression (1977). They masterfully demonstrate the problems of interpreting regression coefficients, and sound very clear warnings concerning the comparison of regression coefficients even for fully deterministic systems under tight experimental control.

A Scale is a Scale is a Scale?
The correlation between independent variables is not the only factor affecting the magnitude, and consequently the interpretation, of linear or logistic regression coefficients. It is important to recognize the effects on estimated regression parameters due to errors of measurement. H&M go into great detail to document the superior measurement qualities of their IQ test, the AFQT. That the AFQT provides good measurement of g, general cognitive ability, is demonstrated by high correlations among its four constituent tests, by high correlations with other measures of general ability, and by high loadings on the general factor of the ASVAB battery. (The latter is purported to represent g in common psychometric practice. It is interesting to note, however, that Gustafsson and Muthen (1994) show that the ASVAB lacks measures of Fluid Intelligence and its general factor is closer to Crystallized Intelligence, which they interpret as a broad verbal factor, closely associated with academic achievement.) The conclusion is that the AFQT is an exceptionally high quality instrument.
What, then, are the measurement qualities of the measure of socioeconomic status? Compared with the treatment of the AFQT scale, only meager information is presented to allow evaluating the quality of the SES scale. However, from the two pieces of information that are presented, a reliability coefficient of .76 and correlations among the four indicators comprising the scale ranging from .36 to .63, we can safely conclude that the SES measure is substantially inferior as a measurement device and is subject to considerable error. Moreover, for more than a quarter of the subjects only three of the indicators were available, further compromising the reliability of the scale. Therefore, "one must conclude that as a proxy for 15 years of environment, this is a variable measured with substantial error" (Delvin et al., 1995, p. 1468). The effect of the SES scale's low reliability on the regression results is quite clear: an underestimation of the SES effect when it is run in a "horse race" against IQ. It is likely that the real differences between the effects of SES and IQ on poverty in the population are smaller than what is reflected in H&M's estimates. In addition to errors of measurement, statistical uncertainties related to sampling are another major source of caution.
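The direction of this bias can be illustrated with a linear analogue. The sketch below is not a reanalysis of the NLSY data: it uses the textbook formulas for standardized coefficients in a two-predictor linear regression, assumes classical measurement error in the second predictor only, and treats the "true" effects (.30 each) and the true correlation (.55) as hypothetical inputs. Only the reliability of .76 is taken from the text. A logistic model would behave similarly in direction but not in exact magnitude.

```python
import math


def standardized_betas(r_y1, r_y2, r_12):
    """Standardized coefficients of a two-predictor linear regression."""
    denom = 1.0 - r_12 ** 2
    return (r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom


def observed_betas(true_b1, true_b2, true_r12, reliability_x2):
    """Coefficients obtained when X2 is measured with classical error."""
    # Correlations implied by the true model (all variables standardized).
    r_y1 = true_b1 + true_b2 * true_r12
    r_y2 = true_b2 + true_b1 * true_r12
    # Classical measurement error shrinks every correlation involving X2
    # by the square root of its reliability.
    shrink = math.sqrt(reliability_x2)
    return standardized_betas(r_y1, r_y2 * shrink, true_r12 * shrink)


# HYPOTHETICAL inputs: equal true effects of .30 and a true correlation of .55.
b_iq, b_ses = observed_betas(0.30, 0.30, 0.55, reliability_x2=0.76)
print(f"observed IQ coefficient:  {b_iq:.3f}")
print(f"observed SES coefficient: {b_ses:.3f}")
```

Even though the two true effects are set equal, the error-laden predictor's coefficient is attenuated and the well-measured predictor's coefficient is inflated, because the latter absorbs part of the effect that the noisy measure fails to capture.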

Uncertainty in Statistical Estimates
Based on the logistic regression results, as depicted by the plots they draw, H&M make two strong predictions to demonstrate the different roles IQ and SES play in determining poverty. Paying attention to the far left-hand side of the plots on p. 134, we can observe that a white person from an unusually deprived socioeconomic background, with an average IQ, has a probability of about 11% of being in poverty. On the other hand, an extremely dull person with an average SES has a probability of about 26% of being in poverty, more than double. Notice that these predictions use extreme values of IQ and SES to produce dramatic differences.
How accurate are these statements? How much confidence should we have that the real proportions in the population are close to the ones suggested by the statistical model estimated for this particular sample? An appropriate indicator of statistical uncertainty is the confidence interval of prediction. It informs us about the range of likely values we expect to encounter if we were to sample again from the same population. Confidence intervals for prediction in logistic regression models are easily obtained by using conventional methods (see Agresti, 1990, Chapter 12) or alternatively, by utilizing a computer-intensive resampling technique known as bootstrapping (see Efron & Tibshirani, 1993).
Using both methods, we may compute confidence intervals for the two predictions above (at the 95% confidence level). The range of plausible values for a person from a deprived socioeconomic background with an average IQ goes from 8% to 16%. The range of plausible values for a dull person with average SES goes from 20% to 35%. (Both methods gave similar results.) The confidence interval for the difference between the two predictions indicates that this difference can be as small as 6% or as big as 26%.
Evidently, The Bell Curve ascribes unwarranted precision to estimates that are subject to considerable sampling error. The dramatic difference between the two estimates becomes much less so when one takes into account the statistical uncertainty associated with them. Thus when H&M declare categorically that the odds of poverty for a person with low IQ and average SES are "more than twice as great as the odds facing the person from deprived home but with average intelligence" (p. 135), one needs to exercise great caution before accepting it at face value. But then, H&M themselves acknowledge (though only in a footnote) the complexities involved in comparing the magnitude of effects in multiple regression and promise: "We refrain from precise numerical estimates of how much more important IQ is than socioeconomic background..." (note 13, p. 691).
We may also ponder: How valid is a comparison between a person with an IQ score of about 70 (two standard deviations below the average) and a person from a very poor family? That people with very low cognitive capacity face severe limitations in life is hardly a surprising or a fresh finding. For example, Jensen states that "most persons with any experience in the matter would agree that those with IQs below 70 or 74 have unusual difficulty in school and in the world of work. Few jobs in a modern industrial society can be entrusted to persons below IQ 70 without making special allowances for their mental disability" (1981, p. 12). We should also remember that the youth falling into what H&M call Cognitive Class V, the very dull, are also routinely afflicted by severe socioeconomic conditions--they are on average almost an entire standard deviation below the mean in SES. The very dull are also the very poor. Attempts to disentangle the independent effects of cognitive ability and harsh environment are doomed, not because of technical complications, but because American social reality is less than generous towards its weakest citizens. It seems that The Bell Curve has no new story to tell here, but presenting such an extreme situation as an example of the general effect of IQ on social consequences is neither informative nor especially valid.
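One conventional route to such an interval is to build it on the log-odds scale and transform the endpoints back to probabilities. The sketch below illustrates only that mechanism and is not the computation behind the intervals reported here: the standard error of the linear predictor is a hypothetical value, since it depends on the fitted model's covariance matrix, which is not reproduced in this commentary.

```python
import math


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Probability corresponding to a log-odds value."""
    return 1.0 / (1.0 + math.exp(-x))


def wald_interval(p_hat, se_log_odds, z=1.96):
    """Approximate 95% interval: symmetric in log odds, mapped to probability."""
    center = logit(p_hat)
    return (inv_logit(center - z * se_log_odds),
            inv_logit(center + z * se_log_odds))


# HYPOTHETICAL standard error of 0.20 on the log-odds scale.
low, high = wald_interval(0.11, se_log_odds=0.20)
print(f"interval around 0.11: {low:.3f} to {high:.3f}")
```

Because the interval is symmetric only on the log-odds scale, it is asymmetric around the point estimate on the probability scale, as are the intervals reported above.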

The Special Case of Education
The impact of omitting important variables from a regression equation is widely recognized. Not only can the effects of the omitted variables not be estimated, but other effects in the models might be biased and misinterpreted when an included independent variable is meaningfully correlated with an omitted one. Therefore, the absence of a measure of educational attainment from regression models set out to explain the likelihood of poverty, unemployment, welfare dependency and the like seems immediately curious. After all, education is the primary social institution responsible for providing the basic skills needed for productive civil participation. The NLSY contains data on years of education respondents completed by 1990, which seems to be a natural scale to capture the effects of education. The omission of education from the regression models requires either a compelling argument for why it should not be included, or strong empirical evidence that education does not explain the social behaviors of interest to any meaningful extent.
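The bias from omission has a simple closed form in the linear case, which serves as a rough guide here. The following is a sketch under stated assumptions rather than a result for H&M's logistic models: all variables are standardized, the error term is uncorrelated with both predictors, and the symbols are introduced only for this illustration.

```latex
y = \beta_1 x_1 + \beta_2 x_2 + \varepsilon,
\qquad \operatorname{corr}(x_1, x_2) = r
\quad\Longrightarrow\quad
\tilde{\beta}_1 = \operatorname{cov}(y, x_1) = \beta_1 + r\,\beta_2
```

Here the coefficient with a tilde is what one obtains by regressing y on x1 alone. Whenever the omitted variable matters and is correlated with the included one, the included variable's coefficient absorbs part of the omitted effect.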
H&M supply four reasons for why "the role of education versus IQ as calculated by a regression equation is tricky to interpret" (p. 124). They assert that
1. education is at least partly caused by intelligence,
2. effects of education are likely to be discontinuous, that is, high school or college graduation might be meaningful but not years of education,
3. multicollinearity (that is, the degree to which independent variables are correlated) might lead to unstable and misleading regression estimates, and
4. the effects of education and intelligence are likely to be complex and require more complicated modeling.
Assertions 3 and 4 were treated in some detail earlier in the sections on the independence of independent variables and logistic regression coefficients. We saw that the same arguments hold when we consider the correlation and complex effects of IQ and SES--either the role of SES versus IQ is also "tricky to interpret," which is probably the case, or these two arguments against the inclusion of education should not hold. H&M simply cannot have it both ways. Assertion 2 is nothing more than a technicality easily handled by including education in the regressions as a categorical variable with three levels: less than high school, high school, college or higher education. Moreover, by comparing results from using years of education against results from using this trichotomy, one could directly test assertion 2. H&M use this technique successfully to estimate the effects of Cognitive Classes, rather than a continuous IQ score (see p. 587).
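The trichotomy suggested for assertion 2 amounts to a small recoding step before the regression is run. The sketch below is illustrative only: the cutoffs of 12 and 16 years are assumptions based on the high school and college samples described in this commentary, and the function and category names are hypothetical.

```python
def education_level(years):
    """Map years of education to one of three levels (assumed cutoffs)."""
    if years < 12:
        return "less_than_high_school"
    if years < 16:
        return "high_school"
    return "college_or_higher"


def dummy_code(years):
    """Two indicator variables; 'less_than_high_school' is the reference."""
    level = education_level(years)
    return {
        "high_school": int(level == "high_school"),
        "college_or_higher": int(level == "college_or_higher"),
    }


for years in (10, 12, 14, 16, 18):
    print(years, education_level(years), dummy_code(years))
```

Entering the two indicators in place of years of education lets the effect of education jump at graduation thresholds instead of being forced onto a straight line.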
Assertion 1 hypothesizes a causal link, whereby IQ determines the number of years of education completed. In Appendix 3, H&M present an alternative: they entertain the hypothesis that IQ gains are caused by years of education, and note that "it might be reasonable to think about IQ gains for six additional years of education when comparing subjects who had no schooling versus those who reached sixth grade, or even comparing those who dropped out in sixth grade and those who remained through high school" (p. 591). The cause and effect relationship between IQ and education is admittedly complex and open to competing interpretations, but we are not given a compelling argument or empirical evidence to support the dismissal of education and the inclusion of IQ in the regressions because of these complex relationships. We can just as validly argue for the inclusion of education and the dismissal of IQ from the regressions. One last point: if years of education, as an independent variable competing with IQ for explanatory power, causes H&M so much concern, shouldn't they also worry about the fact that years of education constitute half (and sometimes more) of the parental SES index? Surely, assertions 2-4, if valid, pose similar problems for the interpretation of the role of IQ versus SES.
What about empirical evidence? H&M's solution to the problems they raise is to run the IQ versus SES regressions separately for those who completed 12 years of education--the high school sample--and those who completed 16 years of education--the college sample. For college graduates, no matter what their IQ is, the risk of poverty is practically zero. (H&M do not show regression results for the college sample in Appendix 4--these are meaningless when only six of these subjects were in poverty, but they still plot the regression lines on p. 136.) For the high school sample, H&M notice similar patterns for IQ and SES as were previously observed for the entire sub-sample. IQ has a strong effect regardless of SES; SES has a much weaker effect. They conclude: "Cognitive ability still has a major effect on poverty even within groups with identical education" (p. 137). These analyses, however, do not answer the important question about education: What happens to the effect of IQ after "accounting for" years of education? Restricting the analysis to a homogeneous sub-group in terms of educational attainment provides partial and highly misleading information about this question. When "years of education" is entered into the regression, one finds that it is a highly significant predictor of the likelihood of poverty (a regression coefficient of -.40), independent of IQ, and, even more importantly, the coefficient for IQ drops from -.84 to -.63. However, an even better solution exists.
Responding to criticisms about the SES scale, Murray poses a challenge: "Create some other scales and use some other method of combining them.... As scholars are supposed to do, Herrnstein and I checked out these and many other possibilities--the results reported in The Bell Curve were triangulated in numbing detail over the years we worked on the book--and we knew that the critics who bothered to retrace our steps would discover: that there is no way to construct a measure of socioeconomic background using the accepted constituent variables that makes much difference in the independent role of IQ" (1995, p. 29).
The following exercise does the obvious. Given the strong correlation between subjects' years of education and parents' SES, and considering that doubtless the most direct way in which parental socioeconomic status can be translated into meaningful advantages for their children is to enable them to get more (and better) education, why not combine these two variables to achieve a better measure of SES? The gains are clear: we increase the SES index reliability, we avoid having three highly correlated variables in the same regression, and we update the scale to capture directly at least part of the subjects' realized potential in socioeconomic status. At the same time we resolve some problems of the special case of education. This is achieved simply by averaging the original SES scale with a standardized variable of the subjects' years of education. Table 1 presents the results of the regression of poverty on IQ and the revised SES index. We can now examine how these new results translate to the plots of IQ versus SES in the roles they play in determining whether young white adults are below the poverty line.
This simple and straightforward improvement of the SES scale--adding the subject's own years of education--brings the relative weights of IQ and SES in predicting poverty to a perfect tie. Dominance of IQ? Hardly. A crucial role for SES? Definitely. Especially if we recall, as H&M themselves acknowledge, that "[SES] has a significant effect on IQ, independent of the genetic endowment of the parent" (p. 589). Moreover, this finding has devastating consequences for any argument about the dominance of the inherited portion of intelligence (60 percent is the estimate favored by H&M; see p. 105) over environmental factors in determining the odds of being poor. Remember the question we started with? "If you have to choose, is it better to be born smart or rich?" (p. 127; italics added). The answer is left to the reader.
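The construction of the revised index can be sketched in a few lines. The data below are made-up toy values, not NLSY records, and the function names are illustrative. Standardizing with the population standard deviation and leaving the averaged index unrescaled are both assumptions, since the commentary specifies only that the original scale is averaged with standardized years of education.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Convert raw values to z-scores (population standard deviation)."""
    mean = fmean(values)
    sd = pstdev(values)
    return [(v - mean) / sd for v in values]


def revised_ses(parental_ses_z, years_of_education):
    """Average the original SES z-score with standardized own education."""
    education_z = standardize(years_of_education)
    return [(ses + edu) / 2.0 for ses, edu in zip(parental_ses_z, education_z)]


# HYPOTHETICAL toy data for five respondents.
parental_ses_z = [-1.2, -0.3, 0.0, 0.4, 1.1]
years_of_education = [10, 12, 12, 14, 16]

index = revised_ses(parental_ses_z, years_of_education)
print([round(value, 2) for value in index])
```

The revised index would then take the place of the original SES variable in the poverty regression alongside IQ.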
Should the revised SES and IQ model be considered adequate for making sound inferences about the relationships among socioeconomic background, education, intelligence, and social behavior? Certainly not. In reality, the social scientist faces an almost insurmountable task when trying to disentangle and bound causes and effects that present themselves only indirectly as a complex pattern of things that go together. Rich families provide better home environment and better education for their children, children with better home environment and better education do better on IQ tests, students who do better on IQ tests are more likely to complete more years of education, they are also more likely to come from families who are better off and less likely to end up poor, and so on and so on. The biggest fallacy behind The Bell Curve's statistical analyses in Part II of the book is summarized by H&M in a single statement: "Regression analysis tells you how much each cause actually affects the result, taking the role of all the other hypothesized causes into account" (p. 122; italics in original). If nothing more, this commentary should provide a demonstration of the dangers of blindly replacing hard thinking about a problem with an analytical formality, sophisticated as it may be.

Conclusion
In a response to The Bell Curve's critics, Charles Murray repairs to the scientific middle of the road and claims: "The statistical method we use throughout is the basic technique for discussing causation in nonexperimental situations: regression analyses, usually with only three independent variables. We interpret the results according to accepted practice" (1995, p. 27). Still, it appears that the analyses of relationships among IQ, SES, education, and poverty suffer in The Bell Curve from H&M's quest for simple answers. H&M prefer to ignore important details of their analyses, treat their models and estimated parameters as if they were accurate and complete descriptions of social reality, and pretend that statistical methods can miraculously unravel or unequivocally differentiate among causes that are inherently confounded.
The inconsistencies and selectiveness in arguments and analysis choices documented in the current commentary lead one to wonder whether H&M were not investing too much of their own IQs to make the case for the dominance of intelligence stronger than it really is. Otherwise, many of their conclusions, especially the ones they push about the proper policy response to ethnic and racial differences, lose critically in weight and can hardly be sustained by less extravagant demonstrations of the over-arching importance of IQ in the allocation of opportunities in current American society.
It is only appropriate to end by rephrasing Murray's words: "The unfounded criticisms of the statistics in The Bell Curve ... will merely cause embarrassment among a few who both understand the issues and have the decency to be embarrassed" (1995, p. 28). It is my hope that the founded criticisms of the statistics in The Bell Curve will not merely cause embarrassment to its authors, but will encourage those "who both understand the issues and have the decency" to set the record straight.