Why Production Function Analysis is Irrelevant in Policy Deliberations Concerning Educational Funding Equity

Hanushek and Walberg use production function methodology to contend that there is no relationship between school expenditures and student achievement. Production function methodology uses correlational methods to demonstrate relationships between input and output in an economic system. These correlational methods may serve to hide rather than reveal such relationships. In this paper, threats to the validity of these correlational methods for the analysis of expenditure-achievement data are discussed and an alternative method of investigation is proposed. The proposed method is illustrated using data from two states (Ohio and Missouri). The method demonstrates relationships between expenditures and achievement that were overlooked by the production function method.


Introduction
"On 26 February 1988 Bennett remarked, `Money doesn't cure school problems.'On 29 February 1988 he was more explicit: `We've done 147 studies at the Department of Education.We cannot show a strong, positive correlation between spending more and getting a better result.'In an earlier reference to those studies, he had said on 13 April 1987 that `in only two or three do we find even a weak correlation between spending and achievement.'"(Baker, 1991) The 147 studies referred to by Bennett are those summarized by Hanushek (1986) using the production function technique.Hanushek (1989) contended that "Variations in school expenditures are not systematically related to variations in student performance" and that "... schools are operated in an economically inefficient manner ."He suggested that "increased school expenditures by themselves offer no overall promise for improving education" and that "school decision making must move away from the traditional "input directed" policies to ones providing performance incentives."To support his contentions, Dr. Hanushek relied on the 26 year old, much maligned study by Coleman et al, Equality of Educational Opportunity, and his summary of 187 studies using educational production functions.
Walberg appears to base his case for no relationship between expenditures and achievement on his theory of causal influences on student learning and the resulting nine productivity factors (1982), on the triad relationship of socio-economic status, productivity, and expenditures (1989), and on Hanushek's model and the early literature related to production function analysis (1984). Monk (1992) described production function analysis as the relating of an input measure to an output measure using correlation or multivariate analysis (regression analysis). He reported that production research began in education some 30 years ago. The process involves the study of relationships between purchased schooling inputs and educational outcomes. The research, according to Monk, is deductively driven, although the deductive arguments tend to be abbreviated. He suggested that the approach has limited utility in policy research because of methodological and conceptual limitations. Monk pointed out that recent research includes more complex multivariate models which have greater potential for illuminating policy.

POLICY RELEVANCE OF THE PRODUCTION FUNCTION METHODOLOGY
Both traditional production function analyses and the modern multivariate version to which Monk alluded are based on correlational methods which are inadequate to deal with causation. In the simple linear correlation model, a single input variable (often expenditures, but sometimes other school-related inputs such as teacher experience or teacher preparation) is correlated with a single output variable (usually achievement, but sometimes percent passing minimum competency tests or rate of graduation). The multidimensional nature of schooling suggests that such simple representations of either input or output are inadequate to describe the production relationships.
The second major production function analysis model is based on regression procedures, where a single output variable is predicted by one or more input variables (chosen from expenditure data, teacher experience or teacher preparation) and by intervening variables (such as socio-economic variables, school size, and the like). The purpose of using the intervening variables is to control factors which may confound the actual input-output relationship. In some applications the researcher permits the intervening variables to enter the regression prior to the entry of the input variables; this creates a serious problem with shared variance among the three sets of variables that will be discussed later. A more appropriate application to control for confounding variables is residualization: the variance explained by the intervening variables (themselves adjusted for their relationship to the input variables) is first removed from the output variable, and the resulting output residual is then predicted from the input variables.
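The residualizing procedure just described can be sketched in a few lines. The following simulation uses entirely made-up district data (the variable names, sample size, and coefficients are mine, chosen only for illustration): the output variable is first regressed on the intervening (socio-economic) variable, and the input variable is then related to the resulting output residual.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # hypothetical districts (illustrative)

ses = rng.normal(size=n)                               # intervening variable
spend = 0.6 * ses + rng.normal(size=n)                 # input variable, confounded with SES
achieve = 0.5 * ses + 0.3 * spend + rng.normal(size=n) # output variable

# Step 1: regress the output on the intervening variable and keep the residual.
X_ses = np.column_stack([np.ones(n), ses])
beta_ses, *_ = np.linalg.lstsq(X_ses, achieve, rcond=None)
achieve_resid = achieve - X_ses @ beta_ses

# Step 2: relate the input variable to the output residual.
X_spend = np.column_stack([np.ones(n), spend])
beta_spend, *_ = np.linalg.lstsq(X_spend, achieve_resid, rcond=None)
print(f"spending coefficient on residualized achievement: {beta_spend[1]:.3f}")
```

Note that recovering the exact multiple-regression coefficient for the input variable would require residualizing the input on the intervening variable as well (the Frisch-Waugh-Lovell result); residualizing only the output, as sketched here, yields a somewhat attenuated estimate.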

Problems with the Simple Linear Correlation Approach
The Assumptions of the Linear Correlation Approach. Application of the simple correlation model must meet the data assumptions required by correlation, the limitations to inference assumed in the use of the model, and the implicit assumptions about the relationship between correlates inherent in the production function methodology. For the application of the Pearson Product Moment Correlation it is required that one have near or better than interval data for paired cases and that the full range of each variable be present. The coefficient attained measures the linear relationship between the two variables and indicates association, but not necessarily causation.
What Constitutes Differences in Expenditures? "Throwing a bucket of water on a raging fire will not keep a building from burning to the ground, but no one would argue on the basis of this experience that water has no value in fire-fighting. The value of water is apparent only when enough is applied to overcome the fire by reducing the heat below a critical point, degrading the fuel, or temporarily removing the air needed for combustion. An analogous situation often occurs in education. Frequently, we judge an intervention strategy to be ineffective before we have really implemented a program that is intense enough to achieve the desired effects. `Compensatory education' is a case in point." (Bridge, Judd, and Moock, 1979) The above phenomenon has been labeled a threshold effect. One reason why the correlation method of production function analysis does not show effects of small differences of funding on achievement is the threshold effect. One dollar of difference in funding will not purchase a commensurate or observable difference in achievement. Instead, some larger, aggregate difference in funding, perhaps $600 or $700, is needed to purchase observable differences in achievement.
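The threshold effect can be made concrete with a small simulation. All numbers here are hypothetical (the $600 step size is borrowed from the text's example; the funding range, step gain, and noise level are my own assumptions): achievement is assumed to respond to funding only in $600 increments, so districts whose funding differs by less than one increment show essentially no correlation, while comparisons across the full range show a strong one.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400  # hypothetical districts

# Hypothetical per-pupil funding, $4000-$7000 (made-up range for illustration).
funding = rng.uniform(4000, 7000, size=n)

# Assume achievement responds to funding only in $600 steps (a threshold effect),
# plus noise from everything else that affects test scores.
achievement = 10 * np.floor(funding / 600) + rng.normal(0, 5, size=n)

# Correlation among districts whose funding differs by less than one $600 step:
within = (funding >= 4800) & (funding < 5400)
r_within = np.corrcoef(funding[within], achievement[within])[0, 1]

# Correlation across the full range, where differences span many steps:
r_full = np.corrcoef(funding, achievement)[0, 1]
print(f"r within one $600 band: {r_within:.2f}; r across full range: {r_full:.2f}")
```

Under these assumptions the within-band correlation hovers near zero even though funding genuinely drives achievement across the full range.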
Perhaps the greatest problem in the use of the simple, linear correlation method, beyond variable specification, is the absence of the cost disparities that are essential to demonstrate differences in educational purchasing power. An ordering of districts by amount of instructional expenditures does not necessarily order the same districts by their educational purchasing power. One district may have five dollars less in per pupil expenditure than a second district, but may have to pay on the average ten dollars more per teacher than does the second district. Ordering of districts by dollar differences which are less than the measurement error associated with expenditures results in gross underestimation of the true relationship between costs and achievement.
The Truncated Variable (Attenuation). Percent passing a test as a measure of achievement represents a somewhat unusual truncation of a variable in that the variance on the achievement measure is limited to variation of dichotomies rather than variation across the full set of test scores. Variable truncation also occurs when the tests have either floor or ceiling effects, when only one specific segment of the enrollment is used (such as at-risk students or college students), or when data are not available for the entire sample being analyzed.
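The attenuation from dichotomizing can be demonstrated directly. In this illustrative simulation (standardized, made-up variables; the 0.4 coefficient is arbitrary), the same underlying relationship is measured once against full test scores and once against a pass/fail dichotomy of those scores:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000  # hypothetical students

funding = rng.normal(size=n)                 # standardized input (illustrative)
score = 0.4 * funding + rng.normal(size=n)   # continuous test score
passed = (score > 0).astype(float)           # percent-passing style dichotomy

r_continuous = np.corrcoef(funding, score)[0, 1]
r_dichotomous = np.corrcoef(funding, passed)[0, 1]
print(f"r with full scores: {r_continuous:.2f}; r with pass/fail: {r_dichotomous:.2f}")
```

Even at the most favorable (median) cut point, the pass/fail correlation comes out noticeably smaller than the full-score correlation, which is the attenuation the text describes.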
Potential Non-Linear Relationships. The simple, linear correlation method will not identify non-linear relationships between the input and output variables. In one of the two states discussed later in this paper, I found a quadratic relationship in exploring the data. A state department report in the second state also alluded to a potential quadratic relationship between input-output variables.

Problems with the Multiple Regression Approach
The Assumptions of the Multiple Regression Approach. The application of the regression approach is characterized by a single output variable (some form of achievement measurement or percent reaching an educational standard) being predicted by one or more input variables (expenditures, teacher characteristics, and the like) while controlling for one or more background variables (such as socio-economic variables or school size). Two ways are used to control for the background variables. The first is to permit the background variables to enter first in the prediction equation. The second is a residualizing technique: the residual of the output variable is created by removing the variance explained by the controlling variables (adjusted for their relationship to the vector of predictors), and the linear combination of the predictor variables is then regressed on the output residual. The regression approach requires that the researcher meet all of the assumptions that have to be met in the simple, linear correlation analysis. In addition, the researcher is required to have a theory or rationale for establishing the order of variable entry and an understanding of the shared variance problem.
The Order of Variable Entry Problem. The order of variable entry in the calculation of the correlation is important in handling shared variance or commonality of explanation. If two correlated independent variables (predictors) are related to a dependent variable (outcome or criterion), the first variable to enter into the regression calculation gets credit for all of its correlation with the dependent variable. When the second variable is entered into the regression calculation, it gets credit only for the correlation that it has with the dependent variable that has not been explained by the first variable entered. Hence, the first variable gets credit for the correlation with the dependent variable that is shared by the second variable. Critics of Coleman showed that his ordering of effects does not hold up across applications of different regression models (Pedhazur, 1982).
The Shared Variance Problem. In dealing with the triad relationship created by the output variable, the input variables, and the controlling variables, Walberg (1989) simply failed to discuss how he handled the shared variance problem inherent in the triad. His regression model enters socio-economic status as the first predictor of students' test performances, size as the second prediction variable, and finally expenditures as the third predictor variable. The amount of explanation shared by socio-economic status and size and the amount of explanation shared by socio-economic status and expenditures are credited to socio-economic status solely; the amount of explanation shared by size and expenditures is then attributed to size alone. Certainly not much variance remains to be explained by expenditures. A different order of entry would produce markedly different results. Pedhazur (1982) credits Mayeske with the development of commonality analysis to address this problem, but this methodology has been subjected to some criticism. There is in fact no effective statistical method that will unconfound shared predictive relationships. The only appropriate treatment of the shared relationship problem is, perhaps, a straight-forward admission that it is the cause of the unresolvable ambiguity.
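The order-of-entry problem can be seen in a short simulation. The data below are fabricated for illustration (my own coefficients, with expenditures deliberately correlated with socio-economic status): the incremental variance credited to expenditures is large when it enters first and small when it enters after socio-economic status, even though the true effect of expenditures is identical in both cases.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000  # hypothetical districts

ses = rng.normal(size=n)                               # socio-economic status
spend = 0.7 * ses + rng.normal(size=n)                 # expenditures, correlated with SES
achieve = 0.5 * ses + 0.3 * spend + rng.normal(size=n) # outcome

def r2(predictors, y):
    """R-squared of an OLS fit with an intercept."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return 1 - np.var(y - yhat) / np.var(y)

# Incremental R-squared credited to expenditures depends on entry order:
inc_spend_first = r2([spend], achieve)
inc_spend_last = r2([ses, spend], achieve) - r2([ses], achieve)
print(f"expenditures entered first: {inc_spend_first:.3f}; entered last: {inc_spend_last:.3f}")
```

The shared SES-expenditure variance goes to whichever variable enters first, which is exactly why the entry order Walberg chose leaves so little for expenditures to explain.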

Other Design Problems for Both Correlational Models
Inadequate Variable Specification. In addition to the difficulty created by trying to represent multiple inputs and outputs by single variables, there is the additional difficulty of including confounding data elements in the input and output variable measurement. Selected single variables may provide inadequate description of key inputs or outputs, may be unlikely to have the relationship assumed by the production function paradigm, and may not be accurately measured.
Inclusion of Confounded Data Elements. Federal dollars are included in school expenditures as unrestrained expenditures. Some federal dollars are likely ear-marked for efforts that do not contribute to student performance on achievement tests, and some federal funds are not involved in instruction. The inclusion of federal funds, however, is a smaller problem than the practice in some districts of testing special education students and including their scores in the test results. When there is random confounding of the performance measure or the selection of a weak input variable, each serves to reduce the size of observed relationships. The choice of percent passing a basic competency test is an unfortunate choice of measure for an output variable. Percent passing immediately sets up a ceiling effect for those passing the test. The use of a dichotomized scoring process reduces the amount of variance to be explained and attenuates the observed relationship.
Inadequate Determination of the Input Variables. Variable specification problems occur in three ways in the determination of input variables. First, problems occur when input measures are chosen that are not related to instruction. Perhaps the most frequent example of this problem occurs in the use of teacher salary as an input measure. Teacher salary is based on seniority and is likely not related to quality of instruction. The second way that problems occur in the selection of input measures is the selection of an input which cannot be measured adequately across all districts. An example of this can be seen where school district size varies enough that economy of scale enters into the accuracy of the measure. Very small districts require more dollars per pupil to provide educational services equivalent to those of larger districts. The third way that selection of input variables can create problems is when the input variable in some districts represents a larger investment in special students than it does in other districts. Such cases are generated when districts have a large number of "At Risk" students or where a district invests heavily in advanced placement instruction.
Inadequate Determination of the Output Variables. Variable specification problems occur in at least four ways in the determination of output variables. The first way is when the output variable that was chosen was a minor emphasis of many schools. Such may be the case when school districts focus more on emotional, attitudinal, behavioral, or vocational outcomes. The second way that dependent variable specification problems can occur is when there are floor and ceiling effects to the measures. If the achievement measure has a ceiling or a floor effect, then many of the students making a perfect or a zero score have accomplishments that are not being measured. The third way that variable specification problems can occur is when the output variables have no logical linkage to either the selected input variables or to school quality. An example of this problem is the "efficiencies" notion used by Walberg (1989). "Efficiencies" are expected school outputs developed by the use of prediction based on socio-economic status. The variable can be argued to better represent an error of measurement of the socio-economic construct than an actual measure of school output. The fourth way that variable specification problems can occur is the selection of an output measure that does not pertain to the whole student body. An example of this is the selection of freshman grade point averages for the first year of college. Differential proportions of students across districts go to college, college curricula differ in difficulty, and colleges themselves differ in difficulty.
Crossing Economic Eras. Production function studies are often grouped for interpretation and for the making of policy recommendations. The 38 publications from which Hanushek extracted his review range from the late 1950s to the early 1980s. This means that several of the studies were conducted in different economic eras. In the 1950s, there was a dearth of federal funding, but there was a wave of post-war resources and the early beginning of inflation. The 1960s brought the Elementary and Secondary Education Act, increased federal funding, escalation of inflation, baby boom growth beginning to enter schools, and the emergence of civil rights as major issues in education. The 1970s brought a slowing of federal funding, abatement of inflation, and more focus on growing enrollment. The 1980s marked a reduction in federal funds, the beginning of a recession, the start of program retrenchment, and the end of growing enrollment. It is quite likely that input-output relationships differ across these four decades.
Inconsistent Determination of What is to be Considered a Production Function Study. Several of the studies included in Hanushek's (1989) reviews do not have one or more of the elements required to be classified as production function analyses. One such study occurred in a large school district where teacher experience and differential teacher salaries were used as input variables (Murnane, 1975). In another study, college freshman grade-point-averages were used as the output variable (Raymond, 1968). It seems necessary that every study called a production function analysis must have, at a minimum, an input variable, an output variable, an assumption of a logical linkage between the two, total-group and unbiased estimates for both variables across the units of comparison, and the computation of a correlational analysis.
Inadequate Sampling Representation. Problems with sampling representation occur in two ways: through lack of disclosure and through inadequate sample size. Sampling becomes very important in making an inference to a given population. In most production function analyses, the intent appears to be that the researcher wishes to generalize to all of the school districts in the United States. Not a single study or collection of studies appears to meet sampling requirements for this inference.

Criticism of the Work of Hanushek
As Spencer and Wiley (1981, p. 44) suggested, "Hanushek offers a provocative interpretation of the last two decades of research on educational productivity." Unfortunately, "Hanushek misinterprets the data on which he bases his conclusion and draws inappropriate policy implications from them" (Spencer and Wiley, 1981, p. 41). After reading a sampling of Hanushek's articles, I concur with Hughes (1992) that one could quote from 20 years of Hanushek and destroy his current argument with his own words. However, I choose here to look at his current thesis and see if it stands on its own foundation or falls.
Hanushek contended that "There is no systematic relationship between school expenditures and student performance" (Hanushek, 1991, p. 425) and that "... schools are economically inefficient" (Hanushek, 1986, p. 1166). He suggests that "increased school expenditures by themselves offer no overall promise for improving education" (Hanushek, 1986, p. 1167) and that "school decision making must move away from the traditional `input directed' policies to ones providing performance incentives" (Hanushek, 1989, p. 49). To support his contentions, Dr. Hanushek relies on the 26-year-old, greatly criticized study by Coleman et al., Equality of Educational Opportunity, Washington, D.C., Government Printing Office, 1966, and his own summary of 187 studies of educational production functions (147 of these studies are those referred to by Bennett) (Hanushek, 1989, p. 46).

The Coleman Study as Support
The Coleman Study did indeed highlight input-output relationships across a large number of districts, using a regression model. Coleman et al. concluded that family characteristics and peer group characteristics were more instrumental in promoting student achievement than were school system characteristics. Critics of the study suggested that this ordering of effects may be due to the analytic model used. Because the nature of regression analysis requires theory to specify models and order of variable entry into the computations, Coleman received considerable criticism, some of which resulted in George Mayeske's contributions to a new analytic technique, commonality analysis (Pedhazur, 1982).
The order of variable entry in the calculation of the correlation is important in handling shared variance or commonality of explanation. If two independent variables (predictors) are related to a dependent variable (outcome or criterion) and are related to each other, the first variable to enter into the computation gets credit for the correlation to the dependent variable that it shares with the second variable. Hence, if a family variable enters first in the computation of the correlation being used in predicting reading performance, and then a peer variable enters into the calculation, the regression results will show for the family variable its unique correlation with reading performance plus the correlation to reading performance that it shares with the peer variable. For the peer variable, only its unique correlation to reading performance is shown. Critics of Coleman showed that his ranking of effects does not hold up across applications of different regression models. A second criticism of using Coleman as a primary research foundation lies in the age of the Coleman data. Any economist should be able to see that time has likely made relationships in the Coleman data obsolete with regard to today's economy.
Hanushek's Summary of Production Functions
Hanushek's (1989) summary of 187 studies of educational production functions is a continuing theme throughout his publications. This summary began in 1981 with 29 articles and 130 studies, was continued in 1986 with 33 articles and 147 studies, and was completed in 1989 with 38 articles and 187 studies. The summary is the research foundation for Hanushek's assertion of no relationship between school districts' expenditures and student performance on standardized achievement tests.
There are several serious omissions and research flaws in the description and logic of Hanushek's summary. These include the lack of disclosure of sample sizes in the studies that were reviewed, inadequacy in size and representativeness of the 187 case studies, misinterpretation of the results of the hypothesis testing, potential misinterpretation of the summary, failure to use selected research that is not consistent with the ideas being promoted (Glass and Smith (1979), Spencer and Wiley (1981), Burstein (1980)), and inadequate specification of the key variables.

Lack of Information on Sample Sizes in the Studies that Were Reviewed.
The studies that were reviewed by Hanushek were qualified in some unspecified manner. It appears that the primary criterion for qualification was publication. Hanushek stated that at least one study deals with a district or districts in all regions of the United States, with different grade levels, and across different performance measures. He provided two tables that are purported to describe the sample. In Table 1 of his 1989 article, "The Impact of Differential Expenditures on School Performance," Hanushek showed the number of studies dealing with single districts (60) and the number dealing with multiple districts (127), but he failed to provide any information on the number of districts involved in the multiple-district studies. In his Table 2, Hanushek showed that 90 studies deal with at least one grade level in the range of grades from 1 to 6 and that 97 studies deal with at least one grade level in the range of grades from 7 to 12. No attempt is made to show replication across grade levels, number of students involved at each grade level, or for each district. With so few cases, the reader must wonder where the holes are in the sample.

Inadequate Size and Lack of Representativeness of the 187 Case Studies.
There are approximately 15,000 public school districts in the United States. These districts are characterized by a large variance in total enrollment. Samples that include a majority of the students and provide a 95 percent confidence level are usually selected randomly using a stratified sampling frame that involves the selection of approximately 800 districts (see the Condition of Education annual reports by the National Center for Educational Statistics). A simple random sample without control for the number of students covered requires approximately 400 districts for a 95 percent confidence level and for representation (Scheaffer, Mendenhall and Ott, 1986). The sample used by Hanushek was not random and was likely smaller than either required sample size. The size is less bothersome than the scant likelihood of randomness. The 187 studies were likely to have been conducted in reaction to some problem or inquiry. Hence, are the relationships found in these unusual districts representative of those that exist in the other 15,000? No evidence is presented to allow the reader to judge the generalizability of the results.
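The roughly 400-district figure can be reproduced with the standard finite-population formula for estimating a proportion, of the sort given in survey sampling texts such as Scheaffer, Mendenhall and Ott. The error bound B = 0.05 and worst-case p = 0.5 here are my illustrative choices for a 95 percent confidence level:

```python
# Required simple-random-sample size for estimating a proportion among
# N school districts with error bound B at roughly 95 percent confidence:
#   n = N*p*q / ((N-1)*D + p*q), where D = B^2 / 4.
# (B = 0.05 and p = 0.5 are illustrative, worst-case choices.)

def required_sample_size(N, B=0.05, p=0.5):
    q = 1 - p
    D = B ** 2 / 4
    return N * p * q / ((N - 1) * D + p * q)

n = required_sample_size(15000)
print(f"approximately {n:.0f} districts")  # → approximately 390 districts
```

This is the representativeness bar, about 400 districts sampled at random, that a collection of 187 self-selected studies does not approach.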
Misinterpretation of the Results of the Hypothesis Testing. In hypothesis testing, the researcher assumes the null hypothesis and seeks reason to reject it. Failure to find such evidence does not permit one to accept the null hypothesis; it only permits one to fail to reject it. Failure to gather evidence that will lead to the acceptance of the alternative hypothesis and the subsequent rejection of the null hypothesis may be due to inadequate sample size, measurement errors, or inaccurate model specification. Spencer and Wiley (1981) used the 109 studies analyzed in 1981 by Hanushek, who sought to argue for the conclusion of no relationship between teacher-pupil ratio and the performance of students, as an example that illustrates another of Hanushek's difficulties with the interpretation of significance tests on regression coefficients. Their argument showed that the null hypothesis can be rejected for positive results and then can be rejected for negative results, pointing out difficulty with the model used and the data set.
Potential Misinterpretation of the Summary. Baker (1991) discussed Hanushek's absence of a decision rule in his summary of the literature for the 147 studies (Hanushek, 1986). He stated that a synthesis of literature as reported by Hanushek can be conducted in one of two ways: either by the vote counting method with a stated expectancy or decision rule, or by the meta-analysis method. Hanushek did not compute effect sizes, so his review must have relied on the vote counting method. Given the absence of the statement of a decision rule by Hanushek, Baker assumed a decision rule that 5% of the studies would be significant by chance. He then showed that 20% of the studies were significant, thus ruling out a chance relationship (Baker, 1991).
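Baker's argument can be checked with a direct binomial calculation. Assuming, as a simplification of my own (the studies are not truly independent, and the exact count of significant studies is inferred from the 20% figure), that each of the 147 studies had an independent 5 percent chance of reaching significance, the probability of 20 percent or more of them being significant is vanishingly small:

```python
from math import comb

n, p = 147, 0.05     # number of studies, chance rate of significance
k = round(0.20 * n)  # roughly 20 percent of studies observed significant

# P(X >= k) for X ~ Binomial(n, p): the chance of seeing this many
# significant results if expenditures truly had no effect in any study.
p_value = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))
print(f"P(at least {k} significant by chance) = {p_value:.2e}")
```

Under the 5% decision rule one would expect only about 7 of 147 studies to be significant by chance, so an observed rate of 20 percent is wildly inconsistent with the no-relationship hypothesis.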
In Table 3 of his 1989 article, "The Impact of Differential Expenditures on School Performance," Hanushek showed the expenditure parameters for the 187 studies for seven educational inputs as they relate to student achievement test performance. Although he reported number of studies, he did not report number of districts, number of students, or grade levels to which the studies pertain. For the various components he reports the number of non-significant studies found. Hence, 82% of the 152 studies relating teacher/pupil ratio to student performance were found not significant (p<0.05); 88% of the 113 studies relating teacher education to student performance were found not significant (p<0.05); 64% of the 140 studies relating teacher experience to student performance were found not significant (p<0.05); 78% of the 69 studies relating teacher salary to student performance were found not significant (p<0.05); 75% of the 65 studies relating expenditures per pupil to student performance were found not significant (p<0.05); 87% of the 61 studies relating administrative inputs to student performance were found not significant (p<0.05); and 84% of the 74 studies relating facilities to student performance were found not significant (p<0.05). For four of these seven inputs (teacher experience, teacher salary, expenditures per pupil, and administrative inputs), the ratios of significant positive to significant negative studies equal or exceed odds of 11 to 4 in favor of positive relationships.

Failure to Cite Research that is not Consistent with the Ideas Being Promoted and Inadequate Specification of the Key Study Variables.
Given Hanushek's liberal qualification of studies and his reliance on the Coleman study, his rejection of the Glass study as being subject to too much criticism for attempting to calculate effect sizes for different class size intervals is surprising and unaccountable. Hanushek's failure to address the criticisms of Spencer and Wiley was also surprising. In his discussion of aggregation effects, the work of Burstein was overlooked. This work demonstrates the potential danger of aggregated data and correlation.
Inclusion of Confounded Data Elements. Federal dollars are included in school expenditures as unrestrained expenditures. Some federal dollars are ear-marked for efforts that do not contribute to student performance scores. An even more serious potential problem is that some districts test special education students while other districts do not. Hence, there is random confounding of the performance measure, reducing the sizes of the correlations possible.
Choice of Performance Measure. The choice of percent passing the basic competency test (BCT) is an unfortunate choice of measure for a performance indicator. Percent passing immediately sets up a ceiling effect for those passing the test. Even if they benefit from additional or redistributed expenditures, their gains can never be shown in the scattergrams. Gains shown by those who pass and by those who continue to fail are not reflected in the measure. Baker (1991) noted that another major problem is Hanushek's failure to correct correlations for attenuation arising from the fact that per pupil expenditures are truncated. Baker stated that the correlation between achievement and expenditures is greatly reduced because "no schools spend a great deal more or less than others.... It is quite easy for a significant finding to be overlooked, if the observed data come from the center of a scattergram, where the attenuated data often appear to be random." (Baker, 1991, p. 4)
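Baker's attenuation point can be illustrated with a simulation using entirely made-up numbers (the 0.6 coefficient and the restriction band are my own choices): when per-pupil expenditures cluster in a narrow band, the observed correlation with achievement shrinks well below the correlation that would appear across a fuller range of spending.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000

spend = rng.normal(size=n)                  # standardized per-pupil expenditure
achieve = 0.6 * spend + rng.normal(size=n)  # achievement, with a real spending effect

r_full = np.corrcoef(spend, achieve)[0, 1]

# Restrict to the middle of the spending distribution, mimicking the fact
# that "no schools spend a great deal more or less than others":
middle = np.abs(spend) < 0.5
r_restricted = np.corrcoef(spend[middle], achieve[middle])[0, 1]
print(f"full-range r: {r_full:.2f}; restricted-range r: {r_restricted:.2f}")
```

The data from the center of the scattergram look nearly random even though the underlying spending effect is substantial, which is precisely the overlooked-finding scenario Baker describes.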

Criticism of the Work of Walberg
Walberg appears to base the case for no relationship between achievement and expenditures on his theory of causal inferences on student learning and the nine productivity factors (1982); on the triad relationship of socio-economic status, productivity, and expenditures (1989); and on reliance on Hanushek's model and on the early literature related to production function analysis (1984).

Theory of causal inferences on student learning and the nine productivity factors
Walberg's review of productivity research and his development of the "theory" of school learning has received much professional praise. I am in agreement with this praise in that the model appears to synthesize a large body of research clearly and usefully. Walberg's model includes a paradigm connecting Aptitude (ability, development and motivation), Instruction (amount and quality), and Environment (home, classroom, peers and television) as inputs to Learning (affective, behavioral and cognitive). I believe that this model is an accurate picture of a subset of variables that are precursors of productivity. My experience suggests that curriculum probably should not be ignored and left out of the model. Also, note that no variable entitled "expenditure" is included directly in the model. Yet expenditures are represented indirectly in both Instruction and Environment. Walberg recognized this role in the following statement: "... and expenditure levels of schools and districts, and their political and sociological organization - are less alterable in a democratic, pluralistic society; are less consistently and powerfully linked to learning; and appear to operate mainly through the nine factors in the determination of achievement." (Walberg, 1982, p. 120) What is puzzling about this statement is that Walberg appears to be trying to stretch logic to agree with Hanushek's weak and inconsistent position, reasoning that higher expenditures follow quality instruction rather than that higher expenditures serve as mediating factors in the purchase of quality instruction.

The triadic relationship of socio-economic status, productivity, and expenditures
Walberg appears to be interested in the triadic relationship of socio-economic status, productivity (or at least efficiency of student test performance), and expenditures. This interest is expressed in several studies and reviews authored by Walberg. In several of the studies, Walberg appears to have problems in the specification of at least two, or perhaps all three, of the variables of the triad. Perhaps one of the major problems with how Walberg has set out to study these variables is his lack of control of certain key school variables. In the discussion of studies of the relationship of class size to achievement test performance, nothing is said about how many of the small classes were made up of special education students or were composed for remediation. Overlooking these two common school practices certainly confounds the study of class size, and the inclusion of special education students confounds the measure of student performance in reading, mathematics, science, and the other standard curricular criteria used to define school productivity. In his studies of district size, he permits urbanism to confound his variable. Walberg is also frequently unclear as to what is being measured as the variable representing productivity. Sometimes his productivity variable is measured as percent passing, a method of measurement that clearly restricts the range of the achievement construct and serves to reduce the observed correlation. At other times, Walberg uses what he refers to as an efficiency measure, the achievement score predicted from socio-economic status divided by the observed achievement score; this configuration called "efficiency" appears to more closely represent a measure of prediction error for socio-economic status. Finally, his expenditure data include funds for transportation, lunch, special education, and similar programs which do not bear directly on instruction.
In dealing with this triadic relationship, Walberg simply fails to discuss how he has handled the shared variance problem inherent in the relationship. His regression model enters socio-economic status as the first predictor of students' test performance, size as the second predictor, and finally expenditures as the third. The explanation shared by socio-economic status and size, and that shared by socio-economic status and expenditures, is credited solely to socio-economic status; the explanation shared by size and expenditures is then attributed to size alone. Certainly, not much explanation remains to be credited to expenditures, and a different order of entry would produce markedly different results. Mayeske developed commonality analysis to address this problem, but the methodology has been subjected to some criticism. In actuality, there is no effective statistical method that will unconfound shared predictive relationships; the appropriate treatment is perhaps a straightforward discussion of the irresolvability of the problem.
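The order-of-entry effect can be made concrete with a small simulation. The sketch below uses synthetic data (all variable names and coefficients are hypothetical, not drawn from any of the studies discussed) to show that when a predictor shares variance with socio-economic status, the R-squared increment credited to it is much larger when it enters the regression first than when it enters last.

```python
import numpy as np

# Synthetic illustration (hypothetical names and coefficients) of the
# order-of-entry problem: the variance-explained increment credited to
# "expenditures" depends on when it enters the regression.
rng = np.random.default_rng(0)
n = 500
ses = rng.normal(size=n)                             # socio-economic status
spend = 0.7 * ses + rng.normal(scale=0.7, size=n)    # expenditures, correlated with SES
achieve = 0.5 * ses + 0.5 * spend + rng.normal(size=n)

def r_squared(predictors, y):
    # Ordinary least squares via lstsq; returns the usual R^2.
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

# Entered last: expenditures are credited only the increment beyond SES.
increment_last = r_squared([ses, spend], achieve) - r_squared([ses], achieve)
# Entered first: expenditures are credited their full simple R^2,
# shared variance included.
increment_first = r_squared([spend], achieve)
```

The gap between `increment_first` and `increment_last` is exactly the shared variance that Walberg's fixed entry order assigns to socio-economic status.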

Reliance on Hanushek's model
Walberg depends in several literature reviews on the productivity analyses reported by Hanushek. He appears to rely on them without critical scrutiny and uses Hanushek's work as the rationale for demoting the role of expenditures in his model and in further analyses. Walberg's unquestioning acceptance of Hanushek's work raises some concern about the other studies that he uses in his argument.

Regression Analyses of New Jersey Data
The analyses performed for the New Jersey hearings (Walberg, 1989) appear to duplicate many of the faults discussed in Walberg's triad studies, and potentially contain a few new departures from standard research practice. On page 43, lines 4 and 5, of the 1989 document, Walberg's description of the regression analyses is misleading. Regression analysis does not provide a method of simultaneous analysis of the predictive contribution of three variables; order of entry attributes the variance shared by two variables to the first one entered into the prediction process. The observed relations are most likely not independent; only the last variable to enter the equation is likely to be independent.
Variable specification is again a problem, as confounding school factors such as special education, remediation, transportation costs, and lunchroom expenditures have not been removed from the studies. It appears that the variable "expenditures," rather than "expenditures per student," was used in the correlations. The "efficiency" prediction is still used as a dependent variable, and the truncated measurement of productivity (such as percent passing) is used in several of the achievement measures.
Order of entry and the shared variance it entails remain problems in these analyses. One wonders what kind of discussion would ensue if an appropriate expenditure variable were entered first in the prediction of test performances that had not been truncated or obscured by the use of ratios.

Demonstration of the lack of validity of the production function methodology: a suggested alternative approach
The production function method must be altered in three ways to make it policy relevant. First, to identify the effects of large versus small expenditures, the research task demands a comparison rather than an association. Second, rather than asking whether there is a consistent relationship across the whole population, it is better to ask for which kinds of districts within a state such effects exist. Third, a discrepancy in expenditures must be created that is large enough to reveal differences in the purchasing power of educational services.
Finding Homogeneous Sets of Districts. Districts within a state differ on many dimensions, and the dimensions that are most discriminating in one state may not be so in another. By grouping districts in a particular state into classes (e.g., rich vs. poor) according to the key dimension for that state (e.g., wealth), homogeneous subgroups can be obtained for further analysis. District size, rural versus urban location, and number of exceptional children (either gifted or at risk) are variables whose subdivisions are likely to establish homogeneous subgroups. In states like Montana and Missouri, size is the dimension that creates homogeneous subgroups; in Alabama it is rural versus urban location; in Ohio it is income level or socio-economic status. In some cases, one or two large, poor, urban districts have to be treated as outliers in order to establish homogeneous subgroups.
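As a minimal sketch of this grouping step (the cutoff values and field names below are hypothetical, not taken from any of the cited analyses), partitioning a state's districts on one key dimension might look like:

```python
# Hypothetical sketch of forming homogeneous subgroups on one key
# dimension -- district size here, the dimension suggested for states
# like Montana and Missouri. Cutoffs are illustrative only.
def partition_by_size(districts, cutoffs=(1000, 5000)):
    groups = {"small": [], "medium": [], "large": []}
    for district in districts:
        if district["enrollment"] < cutoffs[0]:
            groups["small"].append(district)
        elif district["enrollment"] < cutoffs[1]:
            groups["medium"].append(district)
        else:
            groups["large"].append(district)
    return groups
```

For another state, the partitioning variable would simply be swapped (rural/urban in Alabama, socio-economic status in Ohio), with the subsequent analysis unchanged.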

Creating the Disparity in Funding.
In 1970 a study conducted for the Office of Planning and Program Evaluation, Bureau of Elementary and Secondary Education, United States Office of Education found that approximately 300 dollars was needed to improve elementary school children's reading scores one month over the course of a year. A proration of this finding suggests that a disparity of 600 to 700 dollars is needed between the districts compared. Within each homogeneous subgroup, the districts are ordered by instructional expenditures and then divided into two groups, one formed by the upper 30% and the other by the lower 30%. The two groups are thus equal in sample size, and the difference between the groups in expenditures should exceed 600 dollars. Given the satisfaction of these conditions, differences in achievement scores should be apparent, if they exist.
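The construction of the two contrast groups can be sketched as follows (a hypothetical illustration; the function and field layout are not from the original studies, though the 30% split and the 600 dollar threshold are):

```python
# Hypothetical sketch of forming the two contrast groups within one
# homogeneous subgroup: order districts by instructional expenditure
# per student, take the upper and lower 30%, and verify the $600
# disparity before any achievement comparison is made.
def contrast_groups(districts, min_gap=600.0):
    # districts: list of (expenditure_per_student, achievement_score)
    ordered = sorted(districts, key=lambda d: d[0])
    k = int(0.30 * len(ordered))                 # equal-sized 30% tails
    low, high = ordered[:k], ordered[-k:]
    gap = (sum(d[0] for d in high) / k) - (sum(d[0] for d in low) / k)
    if gap < min_gap:
        raise ValueError("disparity of %.0f dollars is below the %.0f threshold"
                         % (gap, min_gap))
    return [d[1] for d in low], [d[1] for d in high], gap
```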

Using t-Tests to Investigate the Results of the Disparity.
Given the creation of the two groups (upper and lower 30%) from a single homogeneous subgroup and the verification of a 600 dollar disparity, the independent t-test with pooled variance can be used to discover achievement test differences. If more than three homogeneous subgroups are to be analyzed, methods to deal with the inflation of the Type I error rate should be considered. Such methods include the recalculation of significance levels to compensate for the use of several t-tests (the Bonferroni procedure) or the use of the family-of-t-tests notion (e.g., the Tukey procedure).
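The pooled-variance t statistic named above can be written out directly. The sketch below is a minimal implementation (function names are hypothetical); with k subgroups analyzed, each test would be evaluated against the critical value at alpha/k under the Bonferroni procedure.

```python
import math

# Independent-samples t statistic with pooled variance, as used to
# contrast the high- and low-funded groups. With k such tests, the
# Bonferroni procedure evaluates each at significance level alpha / k.
def pooled_t(high, low):
    n1, n2 = len(high), len(low)
    m1, m2 = sum(high) / n1, sum(low) / n2
    v1 = sum((x - m1) ** 2 for x in high) / (n1 - 1)   # sample variances
    v2 = sum((x - m2) ** 2 for x in low) / (n2 - 1)
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)  # pooled variance
    return (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
```

Under the null hypothesis the statistic follows a t distribution with n1 + n2 - 2 degrees of freedom.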
The proposed model can be used to investigate a family of dependent variables, a family of independent variables, or both. The use of several t-tests provides the method for including a number of dependent or output variables, and ordering the districts to determine the upper and lower 30% on the input variables permits the consideration of any number of independent variables. For each state, production function analyses were performed on all school districts and contrasted with the results of t-tests performed after a funding threshold had been created; this sequence was then repeated after outliers had been removed.
In Table 1 are shown the production function correlations for the achievement data for the school districts in Missouri. Note that only one correlation, that for tenth grade mathematics, is large enough to be judged statistically significantly different from zero. Since there are twenty production functions, one would conclude from such an analysis that the production function shows no relationship between instructional costs and achievement in Missouri.
Application of the full alternative model involves not only the creation of the threshold but also the elimination of outliers, that is, extreme scores which may have an unusual relationship between instructional expenditures and achievement. Such scores arise from economies of scale in small districts, the concentration of at-risk students, or the amassing of more than essential wealth; the outliers removed here were school districts with enrollments of fewer than 300 or greater than 25,000 students. To complete the comparison, production function analyses were performed on the twenty distributions after the outliers had been eliminated; Table 3 reports the results. Significant non-zero correlations are found for four of the twenty coefficients: fourth grade reading, eighth grade reading and social studies, and ninth grade mathematics. The significant correlation for tenth grade mathematics was lost with the elimination of the outliers. Only three of the correlations in Table 3 are negative, while nine are negative in Table 1. Still, these four non-zero correlations make concluding a relationship between instructional expenditures and achievement too risky.
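For reference, each "production function" entry compared here reduces to a Pearson correlation between per-pupil expenditure and an achievement distribution, tested against zero. A minimal sketch (hypothetical function names, standard formulas):

```python
import math

# Pearson correlation between an expenditure series and an achievement
# series, with the usual test of r against zero:
#   t = r * sqrt((n - 2) / (1 - r^2)),  df = n - 2.
def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    # t statistic for testing H0: rho = 0, with n - 2 degrees of freedom.
    return r * math.sqrt((n - 2) / (1 - r * r))
```

Each of the twenty Missouri distributions yields one such r, which is why a multiple-comparison adjustment matters when judging how many of them differ from zero.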

Application of the alternative approach to two states
Data for the states of Missouri and Ohio were obtained through Education Policy Research, Incorporated, which participated in the suits involving the equity of each state's system for funding the public schools. These data include per pupil expenditures, proxy data for the socio-economic status of the districts' attendance areas, district enrollment, and the achievement data used in the preparation of the cases by both sides in each lawsuit. The achievement data for Missouri are from the Missouri Mastery Achievement Test (MMAT), prepared by the state to measure state objectives, for the year 1990-91. The achievement data for Ohio are NCEs from standardized achievement tests selected by the districts for the year 1989-90. Both sets of achievement data are judged to have adequate reliability.
In Table 5 are reported the nine production function analyses for Ohio. None of the nine achievement areas shows a significantly non-zero correlation. In Table 6 are reported the t-test contrasts for the same nine Ohio distributions; none of the nine contrasts reaches the Bonferroni significance level.

Table 1: Correlations Between Expenditures per Student and Student Performance on the MMAT

In Table 2 are shown the t-tests resulting from a partial application of the alternative approach, which creates the funding threshold not included in the production function analyses, for the twenty distributions of achievement data. The creation of the threshold results in two of the distributions showing significant positive relationships under the Bonferroni procedure. Ten of the twenty t-tests reach significance for single applications of the t-test. Given the family-wise results, it remains risky to conclude a positive relationship between achievement and per pupil expenditures at this time.

Table 2: Contrasts of High and Low Funded Districts on the Missouri MMAT for 1990-1991

Table 3: Correlations Between Expenditures per Student and Student Performance on the MMAT with Outliers Removed

Table 4: Contrasts of High and Low Funded Districts with Outliers Removed on the Missouri MMAT for 1990-1991

Table 5: Correlations Between Instructional Expenditures and Selected Variables in the Ohio Database