Re-Examining Exit Exams: New Findings from the Education Longitudinal Study of 2002

Using the nationally representative, cohort-based data of the Education Longitudinal Study of 2002 (ELS:02), this study employs multiple regression to examine the effects of exit exams on student achievement and school completion. This study finds that exit exams as a whole do not have substantial effects on student achievement in mathematics, twelfth grade GPA, or school completion. Standards-based exams are a positive predictor of dropping out of school but lose their predictive power once GED recipients are coded as completing school. Exit exams do not affect GED seeking and acquisition. When exit exams are disaggregated by type and students are sorted by ninth grade GPA quartiles, end-of-course exams have some negative effects on mathematics test score gains. Students in the bottom two quartiles see reduced test score gains of 28% and 29% of a grade level equivalency (GLE). These effects disappear when students in North Carolina are coded as taking a different type of exam. Standards-based exams had a small positive effect, about 37% of a GLE, on the top quartile of students. Overall, the findings showed no results for school completion and mixed results for test score gains. The article concludes that policymakers looking to boost high school achievement would be better served by working to boost student accomplishments before high school.


Introduction
States administering high school exit exams enrolled 74% of all students and 83% of students of color in the 2009-2010 school year (Center on Education Policy, 2010). Although the degree of difficulty and material consequences of exit exams vary from state to state, these tests have a common policy rationale: the state's desire for accountability.
Any accountability that exit exams provide does not come cheaply. California's 2003-2004 budget included $21 million for the administration of the California High School Exit Examination (CAHSEE). This number does not represent the exam's total fiscal impact on the state. School districts assume additional costs as they prepare for the exam, administer it, and deal with the consequences for students who fail to pass. Even if all these costs were added together, CAHSEE's total cost would still be relatively small in the scope of the state's $55 billion budget for education. That does not mean it is insignificant: $21 million represented half the cost of California's Intensive Algebra Instruction Academies and Elementary School Intensive Reading Program, discontinued in 2003-2004 as part of budget cuts, and twice the 2003-2004 funding cuts for school and classroom library materials (O'Connell, 2003).
States like California continue to implement and refine their exit examination policies while education researchers struggle to provide decisive answers to the question of what, exactly, such exams do. The current literature on exit exams has not kept pace with these tests' evolving nature. Studies have relied on cohort data that do not account for the "second wave" of exit exams or state aggregate data that fail to apply important controls for student and state covariates. The present study uses newly available data from the Education Longitudinal Study of 2002 (ELS:02) and addresses key methodological issues. The opportunity costs and risks of exams like the CAHSEE must be subject to our most rigorous evaluations. Only then can policymakers make decisions that balance the need for meaningful high school diplomas with the need for access to education.
This study addresses two major research questions: First, how do high school exit exams affect school completion? Second, how do high school exit exams affect student achievement? These two questions cut to the heart of the debate about high school exit exams. States have a profound social and economic interest in ensuring that as many of their students as possible receive high school diplomas. States also have an interest in securing and signaling the value of these diplomas. This leads to a delicate balancing act for policymakers setting standards, a kind of Goldilocks effect, where some standards are too high (forcing too many students out of school or setting unachievable benchmarks), some are too low (reducing the value of a diploma or lowering aggregate achievement), and the elusive "just right" standards require sophisticated research that is aligned with policy evaluation for decisions that try to maximize outcomes for students and societies.
These policy questions do not occur in a vacuum. In an age of increasingly tight state budgets, policymakers must be conscious of the opportunity costs of their accountability decisions. Exit exams are expensive and time-consuming. If they do not provide substantial benefits, policymakers must rethink their accountability strategies and devise different mechanisms to monitor and incentivize school and student performance.

Exit Examinations and School Completion
American exit exams have changed dramatically in the last 20 years (Warren & Jenkins, 2005) as they have moved from minimum competency exams to more difficult standards-based assessments. This shift has led to some incommensurability in otherwise similar evaluations. There is some support for the claim that exit exams suppress graduation rates while increasing the number of students seeking a General Educational Development (GED) credential or diploma (Bishop, 2005; Dee & Jacob, 2006; Jacob, 2001; Papay, Murnane, & Willett, 2010; Reardon, 1996; Reardon, Atteberry, Arshan, & Kurlaender, 2009; Warren, Jenkins, & Kulick, 2006). Other studies (Catterall, 1987; Greene & Winters, 2004a; Griffin & Heidorn, 1996; Muller, 1998; Warren & Edwards, 2005; Warren & Jenkins, 2005) have found no relationship between exit exams and school completion. The most recent and definitive review of the literature to date (Holme, Richards, Jimerson, & Cohen, 2010) shows that easier exams do not affect school completion, while more difficult exams are associated with higher drop-out rates. Policy changes and methodological differences may explain some of the literature's divergent findings.
Until recently, longitudinal analyses were limited to use of the National Education Longitudinal Study 1988 (NELS:88) data, a set that does not account for the "second wave" of accountability measures (Dee & Jacob, 2006). Longitudinal studies are important in this area because the effects of graduation requirements vary over time as participating parties adjust their behavior (Lillard & DeCicca, 2001). A 2004 report by the Center on Education Policy (CEP) predicted that it might take "half a generation" (p. 26) before students show the full effects of a high school exit exam. Data tracking a cohort over time allow researchers to control for student and school-level variables that are known to have substantial effects on outcomes like school completion and test scores.
Even when using appropriate data, researchers must navigate a number of methodological pitfalls. These include omitted variable bias (in particular, failure to control for prior academic achievement) and what Jacob (2001) has called the "endogeneity of the MCT [minimum competency test] policy variable." High school exit exams are correlated with other characteristics of schools or states that may influence dropout rates in either direction. Even local or school-specific requirements may bias statewide samples (Lillard & DeCicca, 2001). We know that states with the highest dropout rates, lowest overall student achievement, higher unemployment rates, and highest proportion of minority students are the states most likely to have high school exit exams (Reardon, 1996; Warren & Kulick, 2007). Because exit exams are often introduced as part of larger standards-based reform and accountability measures, exit exam policies may seem to cause effects that are actually more closely related to other school or statewide variables (Reardon, 1996; Bishop, Mane, Bishop, & Moriarty, 2001).
For example, Lillard and DeCicca (2001) found that higher state-mandated minimum course requirements were positively related to dropout rates. They estimated that if state course graduation requirements (CGRs) were increased by one standard deviation (about 2.5 CGRs), attrition rates would change by about one percent. This small change in attrition was large when considered in absolute terms. As the authors note, "from a base population of roughly 13 million 14-17 year old youth in 1990, these results suggest that between 104,000 and 208,000 more students will leave high school before graduating when CGRs increase by one standard deviation" (p. 465). This negative effect was strongest for minority students in the poorest quintile. The authors found no independent effects for high school exit exams.
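The quoted range can be checked with simple arithmetic. The sketch below assumes the two endpoints correspond to attrition increases of 0.8 and 1.6 percentage points; those rates are inferred here from the quoted totals and are not stated in the passage above.

```python
# Base population quoted from Lillard and DeCicca (2001): roughly 13 million youth.
BASE_POPULATION = 13_000_000

# ASSUMPTION: rate increases inferred from the quoted totals, not given in the text.
ASSUMED_RATE_INCREASES = (0.008, 0.016)


def additional_leavers(base_population, rate_increase):
    """Return the number of additional early leavers implied by a rate increase."""
    return round(base_population * rate_increase)


low, high = (additional_leavers(BASE_POPULATION, r) for r in ASSUMED_RATE_INCREASES)
print(low, high)
```

Under these assumed rates, an increase of roughly one percent applied to a base of 13 million reproduces the quoted range.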
By contrast, Dee and Jacob's (2006) analysis of data from the 2000 Census and the National Center for Education Statistics' (NCES) Common Core of Data (CCD) found that Minnesota's exit exam increased the dropout rate in poor and urban schools while nationwide exit exams significantly increased the probability of dropping out of high school for all students, and black students in particular. Reardon, et al. (2009) found that graduation rates in several large California school districts declined by 3.6 to 4.5 percentage points as a result of exit exam requirements. These effects were concentrated on low-achieving, female, and minority students. New research on the effects of Massachusetts' exit exam (Papay, Murnane, & Willett, 2010) showed that low-income urban students who barely failed the mathematics exit exam had an eight percentage point lower graduation rate than similar students who barely passed. Earlier multi-state studies, such as that by Greene and Winters (2004a), found no such effects; however, the measures that study used to calculate state-level completion rates have been criticized for inaccuracy and a failure to control for observed and unobserved differences between states (Dee & Jacob, 2006; Warren, 2005; Warren, Jenkins, & Kulick, 2006).
Interaction between school completion and academic achievement outcomes may bias achievement measures in studies not using cohort data. If exit exams induce low-achieving students to drop out, then the achievement effect of the test may be inflated (Jacob, 2001, p. 104). On the other hand, if exit exams have the opposite effect on dropout decisions, achievement effects may be suppressed or diminished.
The most current research using a national sample (Current Population Survey, CCD, and GED exam data) to evaluate the relationship between exit exams and school completion was a 2006 study by Warren, Jenkins, and Kulick. The authors comprehensively revised previous models estimating school completion rates, finding that exit exams are associated with lower rates of school completion, especially in poor states with high percentages of racial and ethnic minorities. They argue that these findings are consistent with seemingly contradictory findings from analyses using the NELS:88 data because that survey did not include the era of more difficult exit exams and therefore was unable to distinguish between the effects of more and less difficult exams.

Exit Exams and Academic Achievement
The major claim in favor of exit exams is that they increase student achievement. In theory, exit exams provide a signal for distribution of rewards and consequences to succeeding and failing schools, teachers, and students (Bishop, Moriarty, & Mane, 2000). This signal should increase incentives for students to achieve by raising the value of a diploma and clearly articulating the conditions for its receipt. There is research that supports a relationship between exit exams and improved achievement (Bishop, 1996; Bishop, 1997; Bishop, 2005; Bishop, Moriarty, & Mane, 2000; Bishop, et al., 2001); again, there is evidence to the contrary (Reardon, et al., 2009; Grodsky, Warren, & Kalogrides, 2009; Jacob, 2001). When Holme, et al. (2010) reviewed the literature to date, they found that the available evidence did not show a link between easier or more difficult tests and improved student achievement; in fact, as they note, there is some evidence (e.g., Reardon, et al., 2009) that exit examinations reduce achievement among minority and low-achieving students.
The reliability of the evidence turns on methodological issues like the nature of the data studied (statewide or cohort-based), the age of the sample (much of the research uses the NELS:88 data) and the array of controls applied (for example, whether controls for state education policy or prior achievement were used).
Some research concludes that end-of-course exam systems have a greater effect than minimum competency exams on student achievement (Bishop, 2005; Bishop, Moriarty, & Mane, 2000). Comparative international studies support an especially strong relationship between end-of-course exams and student achievement (Bishop, 1996; Bishop, 1997; Bishop, 2005). Using NELS:88 data, Bishop, et al. (2001) found that end-of-course exams in New York State were significantly associated with score gains of 38% of a grade level equivalent (GLE) for B/B+ students and with roughly 50% of a GLE for A students. This number is derived from gains on a test score composite, averaging student gains on the four tests (science, math, social studies, and English). Math score gains in New York were significant only at the ten percent level on a one-tail test, a very weak threshold for significance in a sample greater than 11,000.
Research from Grodsky, Warren, and Kalogrides (2009) challenges these findings. This study analyzed the relationship between exit exams and achievement using the long-term trend data from the National Assessment of Educational Progress (NAEP). Controlling for prior achievement, socioeconomic status, ethnicity, and a variety of state factors, they found no achievement effects in reading and math at the mean or for students in the 10th, 20th, 80th, or 90th percentiles of the achievement distribution. These results were constant when exit exams were disaggregated by relative difficulty.

Methods
Using the second follow-up to the Education Longitudinal Study of 2002 (ELS:02) data, this study employed multivariate stepwise regressions to predict school completion and academic achievement while controlling for a variety of background factors including student characteristics, family characteristics, family processes, state characteristics, average state achievement, prior student achievement, school characteristics, and school processes. The change in p-value of the F-statistic required to include a variable was .05, while a change of .10 in p-value was grounds for removal.

Data
The ELS:02 is an ongoing longitudinal survey of a nationally representative sample of students, tracking a cohort of students from their sophomore year through their postsecondary experiences. The base-year survey collected data from a variety of sources, including students, parents, and school administrators. Subsequent rounds of the ELS:02 followed up with students and administrators in 2004 and 2006. This study used the secure version of the ELS:02 to extract state and school information.
Supplemental data were gathered to control for policy and economic conditions in students' states of residency. States' 2004 Education Week Quality Counts (Skinner & Staresina, 2004) ratings were used to control for state level education reform packages. School demographic data used to control for divergent school characteristics came from the CCD and Private School Survey (PSS) data linked to school codes and embedded in the secure ELS:02. State economic indicators represent select characteristics of each state and the District of Columbia in 2004. These variables (Table 1) control for state-specific economic conditions (Bishop, et al., 2001).

Sample
The ELS:02 base-year study sampled 750 public and private schools. Of 17,590 eligible selected sophomores, 15,360 completed a base-year questionnaire, as did 13,490 parents, 7,140 teachers, 740 principals, and 720 librarians (Ingels, Pratt, Wilson, Burns, Currivan, Rogers, & Hubbard-Bednasz, 2007). Cases were removed from the data set if they lacked base-year math test scores, follow-up math test scores, ninth grade GPA, or if their high school completion status was unknown at the time of the second follow-up survey. The final sample was composed of 12,520 students from 720 schools in 49 states (no students from North Dakota were included once the sample was cleaned for the purposes of this study) and the District of Columbia.
As the number of dropouts in the overall sample fell below the 10% threshold that would be necessary for prediction as a dependent variable with the full sample, a special subsample of the larger data set was created to allow prediction of school completion outcomes. First, the 850 status dropouts were extracted from the main data set. Then a random sample of 3,380 diploma recipients was extracted from the remaining cases. This resulted in a subsample N of 4,230, where 20% were classified as status dropouts. This subsample was used to predict school completion.
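The subsample construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual procedure: the student records are hypothetical placeholders, the pool of 11,670 non-dropout cases is simply 12,520 minus 850, and the seed and function name are invented for the example.

```python
import random


def build_completion_subsample(dropouts, diploma_pool, n_diploma, seed=0):
    """Keep every dropout and add a simple random sample of diploma recipients."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sampled = rng.sample(diploma_pool, n_diploma)  # sampling without replacement
    return dropouts + sampled


# Hypothetical records: (completion status, student index).
dropouts = [("dropout", i) for i in range(850)]
# ASSUMPTION: pool size is 12,520 - 850; the text does not itemize the pool.
diploma_pool = [("diploma", i) for i in range(11_670)]

subsample = build_completion_subsample(dropouts, diploma_pool, n_diploma=3_380)
dropout_share = sum(1 for status, _ in subsample if status == "dropout") / len(subsample)
print(len(subsample), round(dropout_share, 3))
```

Retaining all dropouts while sampling completers raises the dropout share from under 10% to about 20%, which is the balance the text reports.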

Outcome measures.
Two models evaluated the relationship between exit exams and school completion. The exit exam literature differs on whether GED recipients should be counted as dropouts. NCES defines GED recipients as completers but not as graduates, a classification that this study employs. This study was somewhat more interested in graduation than completion, but defined students as dropouts in two ways to test relationships between exams and these two outcomes. This approach permitted a test of the possibility that standards-based exams increased incentives for students to acquire their GEDs (Bishop, et al., 2001; Bishop, 2005). For the first analysis, a dummy variable differentiated between students with a high school diploma and those without. For the second analysis, an additional dummy variable was created where students with their GED were coded as high school completers.
Two additional models examined the relationship between exit exams and academic achievement. The first academic achievement model predicted students' standardized twelfth grade point average (GPA). The second model predicted students' gain in math test scores from tenth to twelfth grade. Like previous analyses using similar (NELS:88) data (Bishop, et al., 2001), this study predicted score gains using item response theory (IRT) estimated number right scores. IRT estimated number right scores are overall criterion-referenced measures of status (Ingels, et al., 2007) that estimate the number of questions students would have answered correctly if they had responded to all 72 questions in the mathematics pool of questions. Participants' math gain score was calculated by subtracting the base year math IRT estimated number right from the F1 math IRT estimated number right. Gain scores were subsequently standardized.
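The two dropout codings can be illustrated with a short sketch. The credential labels and the function name are hypothetical; the coding direction (1 = dropout) is an assumption made for the example.

```python
VALID_CREDENTIALS = {"diploma", "ged", "none"}  # hypothetical labels


def dropout_dummies(credential):
    """Return (status_dropout, dropout_without_ged) for one student.

    status_dropout:      1 for anyone without a diploma (GED counts as dropout).
    dropout_without_ged: 1 only for students with neither a diploma nor a GED.
    """
    if credential not in VALID_CREDENTIALS:
        raise ValueError(f"unknown credential: {credential!r}")
    status_dropout = 0 if credential == "diploma" else 1
    dropout_without_ged = 1 if credential == "none" else 0
    return status_dropout, dropout_without_ged


for label in ("diploma", "ged", "none"):
    print(label, dropout_dummies(label))
```

Only GED recipients change value between the two variables, which is what lets the pair of models isolate any effect of exit exams on GED acquisition.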
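A minimal sketch of the gain score calculation follows. The scores are made-up values, the function name is hypothetical, and standardizing with the population standard deviation (and without survey weights) is an assumption; the text does not say which variant was used.

```python
from statistics import mean, pstdev


def standardized_gains(base_year, follow_up):
    """Subtract base-year scores from first follow-up (F1) scores, then z-score."""
    gains = [f1 - by for by, f1 in zip(base_year, follow_up)]
    mu = mean(gains)
    sigma = pstdev(gains)  # ASSUMPTION: population SD, unweighted
    if sigma == 0:
        raise ValueError("gains have zero variance and cannot be standardized")
    return [(g - mu) / sigma for g in gains]


# Hypothetical IRT estimated number right scores (possible range 0-72).
base_year_scores = [30.0, 42.5, 51.0, 38.0]
f1_scores = [36.0, 45.5, 60.0, 41.0]

z_gains = standardized_gains(base_year_scores, f1_scores)
print([round(z, 2) for z in z_gains])
```

After standardization the gains have mean zero and unit standard deviation, so regression coefficients can be read in standard deviation units.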

State Characteristics.
This study used a variety of state economic indicators to replicate the controls used by Bishop, Mane, Bishop, and Moriarty in their 2001 analysis of the NELS:88 data. This approach allowed consideration of the hypothesis that "new" (above minimum competency) exit exams change achievement and school completion outcomes for students in exit exam states. Table 1 describes these variables and their data sources.
Additional controls were used to account for changes in state education policy. Five Education Week Quality Counts ratings for 2004 (Skinner & Staresina, 2004) were matched with corresponding state codes for inclusion in the study's data set: standards and accountability, efforts to improve teacher quality, school climate, adequacy, and equity. The ratings were converted into numerical scores using a standard four point GPA scale. As a final set of controls for background characteristics that vary between states, this study used state 2003 NAEP scores in fourth and eighth grade reading and math.
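The conversion of ratings to a four point scale might look like the sketch below. The exact mapping is an assumption: the text says only that a standard four point GPA scale was used, so the plus/minus values and the function name here are illustrative.

```python
# ASSUMPTION: a common plus/minus four-point mapping; the study's exact
# values for plus and minus grades are not given in the text.
GRADE_POINTS = {
    "A": 4.0, "A-": 3.7,
    "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7,
    "D+": 1.3, "D": 1.0, "D-": 0.7,
    "F": 0.0,
}


def grade_to_points(letter_grade):
    """Convert a letter rating such as 'B+' into a four-point numeric score."""
    key = letter_grade.strip().upper()
    if key not in GRADE_POINTS:
        raise ValueError(f"unrecognized grade: {letter_grade!r}")
    return GRADE_POINTS[key]


# Hypothetical ratings for one state across the five rated areas.
state_ratings = {"standards": "B+", "teacher_quality": "C", "climate": "C+",
                 "adequacy": "B-", "equity": "D"}
numeric = {area: grade_to_points(g) for area, g in state_ratings.items()}
print(numeric)
```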

Student and school characteristics.
This study controlled for sex (a dichotomous variable with female coded 2 and male coded 1), ethnicity (a dichotomous variable identifying students as White was used as a proxy) and ESL participation as well as student self-reports of time spent on homework and time spent watching television. While student self-reports of homework time are somewhat unreliable (Trautwein & Köller, 2003), they are useful as a measure of student perceptions of their own commitment to schooling (Cool & Keith, 1991). One additional student self-reported variable was drawn from the second follow-up interview. It represents the number of stressful life events the student experienced in the past two years. The survey and codebook do not define "stressful life event," leaving the interpretation up to the students.
Status variables like socio-economic status (SES), family characteristics (Battin-Pearson, Newcomb, Abbott, Hill, Catalano, & Hawkins, 2000; Rumberger, 1983) and family educational expectations (Ensminger & Slusarcick, 1992) are strong predictors of school completion. This study used two family characteristics variables as controls: an ELS:02-generated SES composite variable and a parent-reported count of the number of student siblings that dropped out of high school. ELS:02 offered two versions of the SES composite variable: one that used occupation prestige values based on the 1961 Duncan index (SES1), and one that used the 1989 General Social Survey occupational prestige scores (SES2). This study uses SES2 based on its superior fit with contemporary occupational data (Nakao & Treas, 1994). SES2 uses five equally weighted, standardized components: father's/guardian's education, mother's/guardian's education, family income, father's/guardian's occupation, and mother's/guardian's occupation.
There is a well-documented connection between school success and parental involvement in students' lives and schooling (Fan & Chen, 2001; Jimerson, Egeland, Sroufe, & Carlson, 2000; Rumberger & Arellano, 2007; Steinberg, Lamborn, Dornbusch, & Darling, 1992). Sixteen variables in the ELS:02 base year parental survey measured parental involvement. Exploratory factor analysis using a principal components method was used to examine the relationships between these variables. Four factors emerged (variables and factor loading tables are in the appendix).
The first factor, with an initial eigenvalue of 3.82 and a cumulative explained variance of 20.1%, was used in subsequent regression analysis. The variables with the highest loading on this factor included contacting the school about the program for the year (loading of .63), contacting the school about course selection (loading of .60), contacting the school about helping with homework (loading of .60), contacting the school about plans after high school (loading of .59), and contacting the school about fundraising/volunteer work (loading of .59). Given the very high eigenvalue of this first factor and its high explained variance, the remaining three factors were not used in regression analyses.
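An equally weighted composite of standardized components can be sketched as below. This illustrates the general idea only: the data are invented, the function names are hypothetical, and the handling of missing components in the actual ELS:02 composite is not reproduced.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of values to mean 0 and (population) SD 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def equal_weight_composite(components):
    """Average the z-scores of each component, student by student.

    components maps a component name to a list with one value per student.
    """
    standardized = [zscores(vals) for vals in components.values()]
    return [mean(per_student) for per_student in zip(*standardized)]


# Hypothetical values for three students on the five SES2 components.
components = {
    "father_education": [12, 14, 16],
    "mother_education": [12, 16, 18],
    "family_income": [30, 60, 90],
    "father_occupation": [30, 45, 60],
    "mother_occupation": [35, 40, 65],
}
ses_composite = equal_weight_composite(components)
print([round(s, 2) for s in ses_composite])
```

Standardizing before averaging keeps a component measured on a large scale, such as income, from dominating the composite.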
Any study of the impact of exit exams must control for prior student achievement (Jacob, 2001). One major shortcoming of the ELS:02 data is the lack of eighth grade achievement outcomes to set pre-high school baselines. This does not mean that cohort data should not be used to study the impact of exit exams; instead, caution should be used when interpreting the results. This study used transcript-derived ninth grade GPA, the earliest ELS:02 metric, to control for prior achievement.
School characteristics play an important role in student academic achievement and school completion (Goldschmidt & Wang, 1999; Rumberger & Arellano, 2007). Including these variables allows for control of known characteristics of high and low performing schools, such as high rates of free and reduced price lunch eligibility and average class size. Researchers using the ELS:02 have little choice but to take administrator reports at face value even though these reports may be unreliable (Warren, 2005). Even unreliable reports are likely to have significant heuristic value.
Administrator-reported variables used here included the percentage of tenth graders receiving ESL, the percentage of tenth graders receiving remedial math, and the percentage of tenth graders receiving remedial English. To control for the effects of internal dropout prevention programs, this study used an administrator-reported dichotomous variable indicating the presence of a dropout program at the school. A final administrator-reported control was a continuous variable representing years of mathematics coursework required to graduate.
School demographic variables include percentage of students eligible for free or reduced price lunch, the number of full time employee teachers, grade ten enrollment, percentage minority students, total school enrollment, and ratio of students to teachers. The CCD school type variable was recoded as a public/private dichotomy.
School safety contributes to student outcomes (Boyd, 2004; Flannery & Singer, 1999; Noakes & Noakes, 2000). Sixteen school characteristics variables in the ELS:02 administrator survey measure school safety. Exploratory factor analysis using a principal components method was used to examine the relationships between these variables. Four factors emerged. Their loading tables and variable descriptions appear in the appendix.
The first factor was extracted to use in regression analysis. The variables with the highest loading on this factor included how often the use of illegal drugs was a problem at school (loading of .70), how often students on drugs/alcohol at school was a problem (loading of .73), and how often the sale of drugs near school was a problem (loading of .71). Given the very high eigenvalue (5.78) of this first factor and its high explained variance (36.10%), the remaining three factors were not used in regression analyses.
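The link between the reported eigenvalue and explained variance can be checked directly. The sketch assumes the principal components extraction was run on the correlation matrix of the sixteen safety variables, in which case each variable contributes one unit of variance and a factor's share is its eigenvalue divided by the number of variables. The function name is hypothetical.

```python
def explained_variance_share(eigenvalue, n_variables):
    """Share of total variance for one component of a correlation-matrix PCA.

    ASSUMPTION: variables are standardized, so total variance equals n_variables.
    """
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return eigenvalue / n_variables


# Values reported in the text for the first school safety factor.
share = explained_variance_share(eigenvalue=5.78, n_variables=16)
print(f"{share:.1%}")
```

Under that assumption the result agrees with the roughly 36% explained variance reported for this factor.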
Additional literature (Cash, 1993; Earthman, 2002; Lackney, 1994; Phillips, 1997) supports a connection between clean and well-maintained school facilities and student outcomes. Fourteen school characteristics variables evaluate school facilities. Exploratory factor analysis using a principal components method was used to examine the relationships between these variables. Six factors emerged. Their loading tables along with variable descriptions appear in the appendix.
The first factor was extracted for use in subsequent regression analysis. The variables with the highest loading on this factor included trash on front hallway floors (loading of .63), graffiti on hallway walls/doors/ceiling (loading of .63), graffiti on bathroom walls and ceilings (loading of .57), and graffiti on bathroom staff doors/walls (loading of .57). Given the high eigenvalue of this first factor (2.94) and its high explained variance (17.32%), the remaining five factors were not used in regression analyses.

State high school exit examinations.
Information on state high school exit examinations in 2004 (Table 2) was drawn from three major sources: the CEP's 2004 report on high school exit examinations (Center on Education Policy, 2004), Dee and Jacob's 2006 paper assigning degrees of difficulty to high school exit examinations, and the 2004 Quality Counts ratings (Skinner & Staresina, 2004). Exams were coded by type and difficulty to account for their heterogeneous effects (Wößmann, 2005).
Discrepancies in six states (Alaska, Arkansas, Maryland, New Mexico, North Carolina and Ohio) required further investigation to reconcile conflicting accounts in the major data sources this study used to characterize the content of 2004 state graduation exams.
Alaska's High School Graduation Qualifying Exam (HSGQE) was implemented for the graduating class of 2004 (Center on Education Policy, 2004) and tested at tenth grade levels (Skinner & Staresina, 2004), so it was included in this study, contrary to Dee and Jacob's (2006) classification.
Closer examination revealed that neither Arkansas (Greene & Winters, 2004b; Howell, 2008) nor Maryland (Center on Education Policy, 2004; Center on Education Policy, 2005) had a high school exit examination with diploma consequences in 2004; this coding decision was at odds with Dee and Jacob (2006).
There was disagreement about how to code the New Mexico exam's degree of difficulty. The Quality Counts rankings said this exam tested at the tenth grade level or above. This classification was contrary to accounts by Dee and Jacob (2006), the NCES Overview and Inventory of State Education Reforms website (National Center for Education Statistics, 2008), and the New Mexico Public Education Department.
North Carolina has had an end-of-course exam system in place since 1988 (Bishop, et al., 2001). While these exams are mandatory for students taking the tested subjects, and count for the student's grade in class, they did not affect students' ability to receive a diploma in 2004 (Hagen, 2004). Because these exams did not determine whether students received a diploma (unlike the mandatory Regents curriculum in place in New York), North Carolina was coded as a standards-based examination state for one model and as an end-of-course exam state in another model designed to replicate Bishop, et al.'s (2001) coding.
The 2004 CEP report says that the Ohio Graduation Tests (OGT) were to be phased in for consequences in 2007, a status confirmed by the NCES' SER website. But the Quality Counts rankings listed Ohio as having a high-stakes graduation test for the class of 2004, and Dee and Jacob (2006) assigned this test a high degree of difficulty. Upon further examination, it seemed reasonable to assume that the non-CEP sources were referring to Ohio's Ninth Grade Proficiency Tests, currently being phased out in favor of the OGT. This qualifies as a high stakes high school exit examination, as students who fail to pass are not eligible to receive a diploma. As for degree of difficulty, the Ohio Department of Education confirmed that these assessments did not include ninth grade content. Students were eligible to take this test beginning in the summer of their eighth grade year. The exam's difficulty was recoded and the type changed to minimum competency exam (MCE).

Method of Analysis
The main method of analysis was stepwise regression, using mean substitution to account for missing data. Models were checked for multicollinearity; unless noted, none was present. All models used the independent variables noted in Table A-1.
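Mean substitution, as named above, replaces each missing value with the mean of the observed values for that variable. The sketch below is a generic illustration with invented data and a hypothetical function name; it does not reproduce the study's software or weighting.

```python
from statistics import mean


def mean_substitute(values):
    """Replace None entries with the mean of the non-missing entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a variable with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical weekly homework hours with two missing responses.
homework_hours = [4.0, None, 10.0, 7.0, None]
print(mean_substitute(homework_hours))
```

The imputed variable keeps its original mean, but its variance shrinks because the filled-in cases sit exactly at the center of the distribution.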
Two models used the school completion subsample to estimate the relationship between exit exams and the two dependent measures of school completion.For academic achievement outcomes, stepwise regression in two models predicted math score gains and twelfth grade GPA.Because exit exams are hypothesized to affect students differently depending on their academic trajectory, students were separated into quartiles based on ninth grade GPA and the test score gain model was run for each quartile.This study used linear models rather than logistic regression to predict the dichotomous school completion outcome variables.The decision to use OLS with all dependent variables in this study has the benefit of ensuring commensurability among the dependent variables and their various coefficients.Comparative analysis has shown little practical difference in the models produced by logistic regression and OLS in this area (e.g., Dey and Astin's 1993 article comparing models predicting college student retention).Pohlmann and Leitner's 2003 comparison showed that either model yields the same substantive predictions when predicting school completion outcomes, even though logistic regression may be structurally superior.These findings are consistent with comparisons reaching back at least to Cleary and Angel's 1984 study showing that OLS and logistic regression models yielded essentially the same results even with very skewed (but large) samples when predicting dichotomous dependent variables.
This study used stepwise rather than forward or other forced entry regression methods. Although for some stepwise regression is considered only an exploratory method, this study used it as a key empirical tool to discover the effects of exit examinations. Stepwise regression ensures that only significant variables are included in the models predicting each dependent variable, and although for some this means using an "automatic algorithm," a decision to force in all variables also involves using an "automatic" algorithm. The decision to force the exit exam variables into equations where they are not significant does not tell us what the effects of exit examinations are. The empirical approach used here, on the other hand, allows us to see where and under what circumstances exit examination variables rise to significant levels while controlling for key background variables.
The most common concern about stepwise regression is that it will identify a variable as significant that might not otherwise be significant (i.e., a Type I error). If anything, the opposite has happened in the present study. The most interesting finding reported here is a lack of effect of the major (exit exam) variables of interest. In other words, the main conclusion of this study is that the variables predicted to enter the equations did not. It is highly unlikely that forcing all variables into an equation would result in the exit examination variables becoming significant. That would require the presence of suppressor variables, but there is no reason to believe that suppressor variables are operating in the present analysis.
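The logic of a variable "entering" or "not entering" a model can be sketched with a simplified entry-only selection procedure. This is a stand-in for stepwise regression, not the procedure used in the study: it has no removal step, the F-to-enter threshold of 4.0 is an assumption, and the data and variable names are hypothetical toy values constructed so that the outcome depends on prior GPA but not on the exam indicator.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def center(v):
    m = sum(v) / len(v)
    return [a - m for a in v]

def residualize(v, basis):
    """Subtract from v its projection onto each mutually orthogonal basis vector."""
    out = list(v)
    for b in basis:
        coef = dot(out, b) / dot(b, b)
        out = [o - coef * bi for o, bi in zip(out, b)]
    return out

def forward_select(y, predictors, f_to_enter=4.0):
    """Enter predictors one at a time while the best partial F exceeds f_to_enter."""
    n = len(y)
    resid_y = center(y)  # centering absorbs the intercept
    remaining = {name: center(col) for name, col in predictors.items()}
    basis, selected = [], []
    while remaining:
        rss = dot(resid_y, resid_y)
        df_resid = n - len(selected) - 2  # df left after adding one more predictor
        if df_resid <= 0:
            break
        best = None  # (partial F, name, residualized column)
        for name, col in remaining.items():
            r = residualize(col, basis)
            ss_x = dot(r, r)
            if ss_x < 1e-12:  # collinear with variables already entered
                continue
            ss_gain = dot(resid_y, r) ** 2 / ss_x
            rss_after = rss - ss_gain
            f = float("inf") if rss_after < 1e-12 else ss_gain / (rss_after / df_resid)
            if best is None or f > best[0]:
                best = (f, name, r)
        if best is None or best[0] < f_to_enter:
            break
        _, name, r = best
        selected.append(name)
        basis.append(r)
        resid_y = residualize(resid_y, [r])
        del remaining[name]
    return selected

# Hypothetical balanced toy data, coded -1/+1 so the columns are exactly orthogonal
prior_gpa = [-1, -1, 1, 1, -1, -1, 1, 1]
exit_exam = [-1, 1, -1, 1, -1, 1, -1, 1]
noise = [1, -1, -1, 1, -1, 1, 1, -1]
gain = [10 + 3 * g + 0.5 * e for g, e in zip(prior_gpa, noise)]
selected = forward_select(gain, {"prior_gpa": prior_gpa, "exit_exam": exit_exam})
print(selected)
```

In this toy case the exam indicator has no partial relationship with the outcome once prior GPA is entered, so it never meets the entry criterion. A suppressor variable would be one whose inclusion raises another variable's partial F above the threshold, which an entry-only procedure like this one can miss.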

School Completion
Two variables measured school completion: a dummy variable representing status dropout where GED recipients were counted as dropouts and a dummy variable counting GED recipients as high school graduates. Using the school completion subsample, multiple regressions estimated the specific effects of exit exams. Results are in Tables 3-4.
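The two outcome codings differ only in how GED recipients are treated. A minimal sketch, assuming each student record carries two hypothetical boolean flags (`has_diploma`, `has_ged`) that are not ELS:02 variable names:

```python
def completion_dummies(has_diploma, has_ged):
    """Return (status_dropout, dropout_without_ged) as 0/1 indicators."""
    # Status dropout: GED recipients are counted as dropouts
    status_dropout = 0 if has_diploma else 1
    # Dropout without GED: GED recipients are counted as graduates
    dropout_without_ged = 0 if (has_diploma or has_ged) else 1
    return status_dropout, dropout_without_ged

# Hypothetical student profiles: (has_diploma, has_ged)
students = {
    "diploma": (True, False),
    "ged_only": (False, True),
    "neither": (False, False),
}
coded = {label: completion_dummies(*flags) for label, flags in students.items()}
print(coded)
```

Only the GED-only student changes status between the two codings, which is why a predictor can be significant for one outcome and not the other.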
Both models are moderate fits. The status dropout model predicts 25.6% of the variance in the dependent variable and the other model predicts 18.0% of the variance in dropping out without a GED. Exit exams did not significantly predict school completion for either dependent variable. The standards-based exam variable did make an appearance in the status dropout model as a positive predictor of dropping out (ß = 0.038), but it was small relative to other predictors in the model and had a significance level of p < .05, which is moderate for a sample of this size. When GED recipients were coded as graduates, standards-based exams were no longer a significant predictor of graduation. It is possible that standards-based exams increased incentives for students to acquire their GEDs (Bishop, et al., 2001; Bishop, 2005).
To evaluate this possibility, the 850 status dropouts were selected from the sample. A new dummy variable was created. The 490 students who were seeking their GED or had received a GED were coded as 1. Stepwise regression used the main study variables (excepting NAEP fourth grade reading and math scores and NAEP eighth grade reading scores, removed for multicollinearity). Table 5 reports the results.
The model is a weak fit for predicting the dependent variable. Students with higher socioeconomic status ratings were more likely to seek a GED, as were students who reported having been in an ESL program. Exit exams did not enter the model.

Academic Achievement
Exit exam variables were not significant predictors of test score gains or twelfth grade GPA. The model (Table 6) predicting standardized test score gains in math was a moderate fit (R² = .142). The model (Table 7) predicting twelfth grade GPA was a moderately strong fit (R² = .371). The strongest predictor in both models was ninth grade GPA. Since the literature suggests that exit exams may have disparate effects on students who are differently situated on the achievement spectrum (Bishop, et al., 2001; Bishop, Moriarty, & Mane, 2000), students were separated into quartiles based on ninth grade GPA. The cut points for quartiles were 2.2, 2.83, and 3.43. Students in each quartile were coded using a dummy quartile variable.
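The quartile coding can be sketched as follows, using the cut points reported above. Treating a GPA exactly equal to a cut point as belonging to the higher quartile is an assumption (the text does not state the boundary rule), and the function names are illustrative.

```python
from bisect import bisect_right

CUT_POINTS = [2.2, 2.83, 3.43]  # ninth grade GPA cut points from the text

def gpa_quartile(gpa):
    """Return the quartile (1 = bottom, 4 = top) for a ninth grade GPA."""
    # Assumption: a GPA equal to a cut point falls in the higher quartile
    return bisect_right(CUT_POINTS, gpa) + 1

def quartile_dummies(gpa):
    """Return four 0/1 indicators, one per quartile."""
    q = gpa_quartile(gpa)
    return [1 if q == k else 0 for k in (1, 2, 3, 4)]

print(gpa_quartile(2.5), quartile_dummies(3.0))
```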
Stepwise regressions for students in each quartile were conducted to predict standardized test score gain while controlling for state, student, family, and school characteristics as above. Table 9 presents the results for each quartile.
For students in the bottom two quartiles, end-of-course exams were a significant negative predictor of score gain. This effect can be estimated at 4.4% of a standard deviation for students in the bottom quartile and 4.2% of a standard deviation in the second quartile. For the ELS measures, a test score differential of one grade level equivalency (GLE) is about 11.5 (Bishop, et al., 2001, p. 317). The standard deviation of test score gain is 6.6. This means that end-of-course exams were associated with students in the bottom two quartiles posting scores that were lower by 3.19 and 3.34 points. These losses amount to scores that are 28%-29% of a GLE lower for exit exam-taking students if all other factors are held constant. For students in the top quartile, standards-based exams were a significant positive predictor (ß = .056) of test score gains. This is a gain of about 4.25 points, or 37% of a GLE. Exit examinations were not significant predictors of test score gains for the third quartile.
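The conversion from score points to shares of a GLE is a simple division by 11.5. The sketch below takes the point differences reported in the text as given and reproduces only that final step; the function name is illustrative.

```python
GLE_POINTS = 11.5  # score points per grade level equivalency, as cited in the text

def share_of_gle(points):
    """Express a score difference as a fraction of one GLE."""
    return points / GLE_POINTS

# Point differences reported in the text (two end-of-course, one standards-based)
reported_points = [3.19, 3.34, 4.25]
percent_of_gle = [round(100 * share_of_gle(p)) for p in reported_points]
print(percent_of_gle)
```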
The strongest findings for the achievement effects of end-of-course exams have been for the exam system in New York State (Bishop, et al., 2001). A dummy variable was created to represent New York residency for the 650 students attending school in New York. When the model predicting test score gains was run with the full set of variables in addition to the New York dummy variable, New York residency was not a significant predictor of test score gains. Bishop, et al. (2001) found that end-of-course exams had different effects for different students relative to their achievement level. To test these findings, students were separated by ninth grade GPA quartile, as above. No effects from New York residency were seen for any of the four quartiles, resulting in models identical to those in Table 9.
There is some confusion in the literature about whether North Carolina's unique mix of exams should mean that the state is counted as an end-of-course exam state. The earlier analysis was conducted with North Carolina's 410 students coded as taking standards-based exams. Bishop, et al. (2001) considered North Carolina to be an exit exam state, even though the class of 2006-2007 was the first class to have diploma consequences for failure to pass end-of-course exams. That study argued that voluntary end-of-course exams should have substantial effects on student achievement because of their linkages to course content and ability to transform classroom culture.
Since end-of-course exams are on the rise in many parts of the country (Center on Education Policy, 2008) and the strongest findings for the achievement effects of exit exams are found for end-of-course exams (Bishop, 1996; Bishop, 1997; Bishop, Moriarty, and Mane, 2000; Bishop, et al., 2001; Bishop, 2003; Bishop, 2005), achievement outcomes were predicted where North Carolina students were coded as end-of-course exam students. Table 8 reports the results. No exit exam variables entered this model. Ninth grade GPA was the strongest predictor, with gender and SES the next strongest predictors. Collectively, the variables predicted only 14.2% of the variance in the dependent variable. Bishop, et al. (2001) argue that struggling students might see the greatest gains from end-of-course exams. They found support for this hypothesis in the NELS:88 data. To test this finding with the ELS:02 data, the recoded data set was used to predict score gains for each quartile of ninth grade GPA. None of the exit exam variables entered the quartile models.

School Completion Results
Students in exit exam states are more likely to drop out of school than their peers not subject to exit exams. Although it may seem that exit exams stand in the way of school completion for a few students, this relationship is correlative rather than causal. Exit exams predominate in poor states where students are already at a higher risk of dropping out. Looking at a national sample and considering exit exams regardless of type, this study found no substantial effects on school completion outcomes.
When this study considered exit exams by type, it found only one small effect. Standards-based exams were a small but statistically significant positive predictor of the status dropout outcome (ß = .038, p < .05). This effect disappeared when GED recipients were coded as graduates. A separate model disaggregating GED seekers from other dropouts found no effects for any of the exit exam variables.
This study did not account for the factors that contribute the most to school completion; longitudinal analyses have shown that the biggest predictors are factors in early childhood and early education (Ensminger & Slusarcick, 1992; Jimerson, 2000). With all of the complexities of a pupil's educational trajectory, the presumption is against finding an effect for exit exams. It is unlikely that a single test, even if quite difficult, could make the difference between graduation and dropping out.
The results here are contrary to some of the latest research in this area. Table 10 summarizes the basic differences between this study and other recent studies.
Papay, Murnane and Willett's (2010) study drew from a sample that allowed them to estimate the effects of barely passing or barely failing the Massachusetts exit exam while controlling for background characteristics. This approach is beyond the scope of the ELS data set, which only permits a control for the dichotomous exit exam variable, type of exit exam, and degree of difficulty. As the authors point out, they do not examine the effects of exit exam performance across the state's student population, so their results may not generalize to a larger effect for exit exams.
Warren, Jenkins, and Kulick (2006) used a longitudinal approach. Using CPS and CCD data, they found that exit exams were associated with significantly lower school completion rates. This was at odds with previous studies that had used the NELS:88 data, a divergence the authors explained in two ways. First, they criticized other studies (e.g., Warren & Jenkins, 2005) for using a weak dependent variable to measure the dropout outcome. Second, they said that previous research did not account for the onset of newer, more difficult high school exit exams.
These criticisms do not apply to the present study. First, this study corrects for dependent variable weakness by explicitly testing two measures of dropout to compare disparate findings of the impact of exit exams. Results were the same (i.e., no results were found) whether GED recipients were included as dropouts or not. Second, the present study accounts for the new wave of exit exams by using the ELS:02 data. A cohort-based study such as this one may have particular strengths over statewide studies, even longitudinal statewide studies such as the Warren, Jenkins, and Kulick (2006) study. A cohort-based data set allows for controls at the student and school level in a way that examinations of aggregate graduation rates and state characteristics cannot. Ninth grade GPA is by far the largest predictor of school completion in the models used here. Eliminating this control is likely to produce substantial omitted variable bias in non-cohort studies. Cohort-based data also offer more specificity in the dependent variable. This study was able to include actual GED receipt in the dropout outcome, while the Warren, Jenkins, and Kulick (2006) study was only able to include GED test-taking rates in their models estimating the school completion rate variable. A multi-cohort study in a single state, however, may be more sensitive to test effects.
Finally, results here are not consistent with Reardon, et al.'s (2009) findings that California's graduation rate has declined by 3.6 to 4.5 percentage points as a result of the CAHSEE policy. Their data set has two major advantages over the ELS:02 set. First, it allows longitudinal tracking of several cohorts within one state and may therefore be more sensitive to long-term effects of the CAHSEE. Second, the ELS:02 data cannot tell us anything about the effects of exit exams in California because CAHSEE passage was not mandatory for graduation until 2006, two years after the ELS:02 cohort's scheduled graduation.
There are some important limitations of the present study for estimating school completion effects. The models used here do not account for exclusion and retention rates that have been shown to bias dropout measures (Haney, 2000; Warren and Jenkins, 2005). If students in this sample were excluded from having to take the high school exit exam, this could artificially reduce the effect of the main exit exam variable. However, Bishop (2005) has shown that exclusion rates have little effect in models linking student achievement to exit exams. Absent new research, there is no reason to believe this omission affects the results reported here.
It is possible that this study's relatively large proportion of private school students explains different outcomes. Nationally, about 8% of high school students are enrolled in private schools, a share that has remained relatively constant since the late 1970s (Warren, Jenkins, & Kulick, 2006, p. 146). The Warren, Jenkins, and Kulick study is limited to public high school students, while 19.6% of the school completion subsample (and 22.2% of the full study sample) attended private school. But relative oversampling of the private school population should not affect the results in an analysis predicting outcomes at a student level while controlling for school type. At the very least, the presence of a Type II error should reveal itself in an unexpectedly large coefficient for the private school variable.

Academic Achievement Results
Standards-Based Exams (SBEs): When students were separated by ninth grade GPA quartile, standards-based exams were positive predictors of math test score gain in the top quartile (ß = .056). This effect size was about 37% of a grade level equivalent (GLE).
End-of-Course Exams (EOCEs): When students were separated by ninth grade GPA quartile, end-of-course exams were negative predictors of math test score gain in the bottom two quartiles, with effect sizes of 28% and 29% of a GLE. When North Carolina students were coded as end-of-course exam students, none of the exit exam variables were significant predictors of test score gain.

End-of-course exams
Previous research found substantial effects for end-of-course exams (Bishop, 1996; Bishop, 1997; Bishop, Moriarty, and Mane, 2000; Bishop, et al., 2001; Bishop, 2003; Bishop, 2005). With the exception of the Bishop, et al. study in 2001 (using older NELS:88 data), these were not cohort-based studies that included school and student-level controls. The present study reflects a national sample, unlike studies that focus on a single situation, such as New York's Regents Exams (Bishop, Moriarty, and Mane, 2000; Bishop, et al., 2000).
Initially, North Carolina students were coded as SBE students (consistent with other coding in this study, including the decision to code Maryland as a non-exit exam state). End-of-course exams were not a significant predictor of either academic outcome variable, GPA or test score gain. When students were separated by prior achievement (ninth grade GPA) quartiles, end-of-course exams had negative coefficients for the bottom two quartiles (students with ninth grade GPAs of less than 2.83, or roughly less than a B- grade point average). There was no statistically significant relationship between end-of-course exams and achievement for the top two quartiles. The effect sizes for the bottom and next to bottom quartile represented a loss of 28% and 29% of a GLE, respectively.
These results are contrary to findings in the 2001 study by Bishop, et al., who found that end-of-course exams in New York State were significantly associated with gains of 38% of a GLE for B/B+ students, and with roughly 50% of a GLE for A students. Students with lower grades saw no significant effects. Those findings are not confirmed here in that New York residency was not a statistically significant predictor of test score gains for the full sample or by prior achievement quartile. Table 12 summarizes the major differences between the two studies.
To further evaluate claims for the achievement benefits of end-of-course exams, another model was constructed where students in North Carolina were coded as end-of-course exam-taking students. The end-of-course exam variable was not a significant predictor of test score gains either in the aggregate or by quartile. It seems likely that this state's unique mix of curricular and policy reform was enough to neutralize the effect of end-of-course exams seen with the previous coding. On the other hand, effects for end-of-course exams were not especially large in the previous model (less than 5% of a standard deviation in the dependent variable for each quartile), so it is difficult to draw any substantive conclusions from the differences between these models.
Even if the virtues of North Carolina's exam system were enough to neutralize the negative effects of end-of-course exams, this would not be an argument for mandatory high-stakes end-of-course exams of the kind now being implemented in many states (Center on Education Policy, 2008). In North Carolina in 2004, students' scores on end-of-course exams counted for only 25% of their grade in the relevant class (Hagen, 2004). Students could fail their end-of-course exams and still pass their classes to receive a diploma.

Standards-based exams
Standards-based exams had some effect on student achievement outcomes. These exams had no effect on GPA or test score gains for the sample overall, although students in the top quartile did see test score gains of 5.6% of a standard deviation (37% of a GLE). This finding is commensurable with the findings of Grodsky, Warren, and Kalogrides (2009), who found no relationship between high school exit exams and long-term NAEP scores on statewide levels regardless of student achievement levels or degree of exam difficulty. Like their study, the present investigation found no effects for degree of exam difficulty (their control variable of interest).
It is worth noting that the Grodsky, Warren, and Kalogrides (2009) study did not apply controls for state educational policy. That does not explain the discordance between their findings and those of the present study, because a lack of those controls should make exit exam effects seem larger rather than smaller due to the endogeneity of the exit exam variable.

Minimum competency exams
No effects on achievement outcomes were found from minimum competency exams whether students were considered in the aggregate or by quartile. This is consistent with research suggesting that these kinds of exams set the bar too low to increase achievement effects (Bishop, et al., 2001; Jacob, 2001).

Directions for Future Research
Future research should try to address a number of specific issues raised by this study. There are fruitful possibilities for additional research in at least four areas: intervening school effects, the nature of testing systems, closer consideration of student level variables, and divergent methodological approaches.
Intervening school effects should be considered more closely. Exit exams have direct effects on schools, teachers, and students. The present study cannot approximate the ways that exit examinations might change school, teacher, and student processes. The "Goldilocks Effect" may mean that these exams are too easy to effectively discriminate between students who have mastered high school content and those who have not. It is also possible that schools and teachers have effectively targeted those students at risk of not graduating because of failure to pass the exit exam. The ELS:02 does not provide the necessary information to answer these empirical questions about intervening school effects, but future research might build a data set linking student, teacher and administrator reports on exit exam-related behavior.
While this study addressed the question of the effects of exit exams on a national sample, it did not look closely at the effects that might be generated in individual states, particularly when the characteristics of particular examination systems might be taken into account. The sorting techniques used here to disaggregate between types of exams and their degrees of difficulty are coarse at best. One of the major weaknesses of current research in this area is a failure to map standards onto exams and evaluate specific student performance in the content areas addressed by those standards.
Student-level variables are employed in this study but deserve more careful attention. The socio-economic status composite variable used here is effective in the complex models seeking aggregate effects of exit exam systems. But reports from California and other states working with very diverse populations suggest that researchers should pay more careful attention to testing systems' effects on at-risk students. Factors included in socio-economic status, including such well-known indicators as parent education levels, should be considered separately in future research. The same care should be shown for students receiving special education services.
Finally, additional statistical methods should be employed to shed more light on the interaction between the types of variables considered here. Hierarchical linear modeling might provide some new insights into the interactions between family, student, school, and state level variables, especially as these variables relate to exit examinations.
The rush to adopt exit exams has not been unthinking. It is clear that states like California have a genuine interest in assuring that students achieve while in high school and graduate from high school. But high school exit exams are only one small piece of the vast ecological system that powers our schools. It is unreasonable to believe that any particular test will be enough to change the developmental trajectory of students. Policymakers interested in substantial results should consider prior achievement's substantial effect on test score gains. In this study, ninth grade GPA had an effect of 26% of a standard deviation in the model predicting test score gains. State budgets might be better spent trying to improve achievement before students enter high school than levying tests once students are in high school.
The history of high school exit exams is a case study in the difficulty of setting appropriate incentives. Minimum competency exams were "too cold," failing to improve student achievement while costing time and money for implementation. The present study tried to take the temperature of the new wave of more difficult exit exams, finding them neither "too hot" nor especially "just right." The best policy mix for increasing achievement must extend beyond a single test.

Table 1
State Economic Characteristics Variables
Numbers reflect median earnings in 2004 inflation-adjusted dollars for the population 25 years and over. Data are limited to the household population and exclude the population living in institutions, college dormitories, and other group quarters.
2004 ratio of tuition at 4-year public colleges to weekly earnings in retail in home state: 2004 mean weekly retailing wage rates were compiled from the Bureau of Labor Statistics' Occupational Employment and Wage Estimates databases. Annual wages in retail sales (BLS occupation code 41-2031) were extracted from http://www.bls.gov/oes/oes_dl.htm, divided by 52, and rounded to two decimal places. 2004 tuition at 4-year public colleges was compiled using 2004 data from the National Postsecondary Student Aid Study (NPSAS) (http://nces.ed.gov/surveys/npsas). Data were filtered to reflect only public 4-year nondoctorate and public 4-year doctorate institutions. Mean tuition and fees are measured in 2004 dollars, reflecting the sampling procedure of the NPSAS.

Table 2
Selected Characteristics of State High School Graduation Exams, 2004

State Graduation contingent on performance on statewide exit or end-of-course exams Type Difficulty Exit or end-of-course exams based on state tenth
Source: Center on Education Policy, based on information collected from state departments of education, July 2004. Difficulty ratings collected from Dee and Jacob (2006). Degree of difficulty was coded as 1: 2004 exit exam tested material below the ninth grade level; and 2: 2004 exit exam tested material above the ninth grade level.
Note: SBE = Standards-Based Examination; MCE = Minimum Competency Examination; EOCE = End of Course Examination

Table 3
Summary of Stepwise Regression Analysis for Variables Predicting Status Dropout

Table 4
Summary of Stepwise Regression Analysis for Variables Predicting Dropout Without GED

Table 5
Summary of Stepwise Regression Analysis for Variables Predicting GED for Status Dropout Subsample

Table 6
Summary of Stepwise Regression Analysis for Variables Predicting Standardized Achievement Score Gains in Math

Table 8
Summary of Stepwise Regression Analysis for Variables Predicting Standardized Achievement Score Gains in Math, With North Carolina Coded as End-of-Course Exam State

Table 9
Summary of Stepwise Regression Analysis for Variables Predicting Standardized Test Score Gain, by Ninth Grade GPA Quartiles

Table 11
Types of Exit Exam and Academic Achievement: Major Findings

Table 12
Comparison Between the Present Study and Bishop, et al., 2001