Education Policy Analysis Archives

Volume 8 Number 41

The Texas Miracle in Education

Walt Haney

8. Summary and Lessons from the Myth Deflated

          Before recapping the territory covered in this article and suggesting some of the broader lessons that might be gleaned from the myth of the Texas miracle in education, I pause for one more digression (readers who have made it this far likely will not be too surprised by yet another detour). The detour is to recount a small survey of scholars undertaken in the summer of 1999. After this side excursion, I summarize "the myth of the Texas miracle." Finally, in closing, I suggest some of the broader lessons that might be gleaned from this examination of the illusory Texas miracle.

8.1 The "Two Questions Survey" on School Reform

          In August 1999, as I was preparing for the start of the TAAS trial in September, I re-read a number of key documents regarding the development of the TAAS testing program in Texas. One was the Minutes of the Texas State Board of Education in July 1990 (a full copy of these minutes is reproduced in appendix 8 of this article for ease of reference). It may be recalled that it was at this meeting that the Board set the passing scores on TAAS. When reviewing minutes of this meeting, I was struck by the following passage:
Commissioner [of Education in Texas] Kirby reiterated some of the information presented to Committee of the Whole during the Thursday, July 12, 1990, work session on the TAAS, noting the recommendations of the staff regarding this item.
          Mr. Davis asked for the rationale for the two-year phase in rather than going immediately to the 70% [passing score on TAAS] or a one-year phase in. The commissioner stated that this would give the board an opportunity to clearly set that 70% is the standard--to state the expectation and expect the schools to present the skills to the students and help the students develop those skills so that this is not an unreasonable expectation. Dr. Kirby said that since this is a different, more difficult test, the needed phase-in time is suggested at least until the results of the fall administration are known. Mr. Davis expressed concern that the test does not appear to be indicative of what is being presented in the classroom. Commissioner Kirby replied that the test is an accurate measurement of what students should be learning, but the test is moving much further in the areas of problem solving, higher order thinking skills, making inferences, and drawing conclusions. He said that it is not believed that at this point in time every student has been adequately prepared in those skills, because with the Texas Educational Assessment of Minimum Skills (TEAMS) tests, emphasis has been placed on the basic skills. The commissioner noted that the test drives the curriculum and that it will require a year or two to make that kind of adjustment in the focus of the curriculum. (TEA, 1997, Appendix 9 of the Texas Student Assessment Program Technical Digest for the Academic Year 1996-1997, pp. 337 – 354)
          My reaction to this record was that it is, shall we say, slightly implausible to suppose that simply changing from the basic skills TEAMS test to the more challenging TAAS test would lead to statewide changes in teaching in Texas such that within "a year or two" teachers would be focusing not simply on "basic skills" but on "problem solving, higher order thinking skills, making inferences, and drawing conclusions." To test my own reaction against the views of a broader sample of school reform observers, I undertook a "two questions survey of school reform."
          So, on Monday, August 16, 1999, I sent a survey via electronic mail to sixteen people whom I respected as knowledgeable students of school reform initiatives around the country. On August 21, I re-sent the query to an additional 11 people whose names had been suggested by respondents to my first query. As of September 6, 1999, I had received 10 responses to my questions. Though I do not know what typical response rates are to email surveys of this sort (odd questions posed to busy people in late summer, with no explanation as to their possible import), my own view is that a response rate of 37% (10/27=0.3704) is probably not too bad.
          Here is the full text of the email survey, including the two questions posed:
Colleagues: I would like to ask the favor of asking you to answer two questions. Given your professional expertise, I trust the questions will be of some interest. Also, your answers may be of some import. For now, I will not explain the exact reason for my questions, as I would not want it to influence your answers. Imagine a very large school system that has been focusing on basic skills instruction for some years. The focus has been spurred in part by a high stakes test of basic skills. It is assumed that 80-90% of teachers have been covering the basic skills in their instruction.
          In light of current educational reform ideas, the system decides that it needs to move beyond basic skills teaching to focus in the future on problem solving, higher order thinking skills, making inferences and drawing conclusions.
          In light of this situation, and your expertise in studying school reform, my two questions to you are these:
  1. How long would it likely take for this large school system to shift from having 80-90% of teachers teaching basic skills, to having 80-90% of teachers teaching the more advanced skills?
  2. What would be the key ingredients required to make such a shift in instruction possible in the time you envision in your answer to the first question?
          Please keep your answers brief and email them to me by August 30. In exchange for your kindness in responding to my request, I will compile answers, distribute them to whomever responds, and explain the specific reason that motivates the questions.
          The ten scholars who responded to the survey were (in alphabetical order): David K. Cohen, Jane David, Daniel Koretz, Henry Levin, Hayes Mizell, Fred Newmann, Stan Pogrow, Ted Sizer, Adam Stoll, and Anne Wheelock.
          Before summarizing what they said in response to the survey, two prefatory points should be added. First, all of these correspondents have generously allowed me to reproduce the full text of their survey responses (see Appendix 9). Second, despite the generosity of these people in responding so quickly (all within three weeks at the end of summer 1999), we did not even attempt to use the survey results in the TAAS trial in September. Inasmuch as lawyers for the State of Texas were already trying to exclude from the trial evidence they had known about for months, Mr. Kauffman advised me that they might not entirely welcome new evidence from a survey they had not even heard about before the trial began.
          As mentioned, all ten responses are reproduced in their entirety in Appendix 9. Here I simply summarize three overall patterns in the ten responses.
          Gentle Chiding. Half of the respondents (Cohen, Koretz, Pogrow, Stoll and Wheelock) chided me gently for advancing something of a false dichotomy between "basic skills" and advanced or "higher order thinking" skills. I can only plead mea culpa, but given the background to the survey explained above, I trust that my oversimplification may be forgiven.
          Shifting the course of large educational systems takes years. The first question asked "How long would it likely take for this large school system to shift from having 80-90% of teachers teaching basic skills, to having 80-90% of teachers teaching the more advanced skills?" Though all respondents qualified their answers in one way or another, all did provide some sort of time estimate. In brief these were: Cohen, 10 years; David, 10 to 20 years; Koretz, 3 to 4 years; Levin, 2 to 5 years; Mizell, 7 to 8 years; Newmann, at least 6 years; Pogrow, 2 to 4 years; Sizer, at least 5 years; Stoll, at least 20 years; Wheelock, 10 to 15 years.
          Two things strike me about these responses. First is the remarkable variance in responses, from "2 to 4 years" to "at least 20 years" (and even if we throw out these outliers, variance remains nearly as great). This suggests that even among scholars who have studied such matters, we really do not know very much about how long it takes to shift the course of large educational enterprises. Second is that the median value seems to fall somewhere in the range of 5 to 10 years. This is of course far longer than the 1 to 2 years presumed by Commissioner Kirby in Texas in 1990.
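          The median claim can be checked with a short calculation. The sketch below is illustrative only: it encodes each respondent's estimate as a (low, high) pair in years, and it treats an open-ended answer such as "at least 20 years" as the single value 20, which is an assumption that understates those answers. Under any of the three readings (lower bounds, upper bounds, or midpoints) the median lands between 5 and 10 years.

```python
from statistics import median

# (low, high) bounds in years, taken from the estimates listed in the text.
# Assumption: "at least N years" is recorded as (N, N), a conservative reading.
estimates = {
    "Cohen": (10, 10),
    "David": (10, 20),
    "Koretz": (3, 4),
    "Levin": (2, 5),
    "Mizell": (7, 8),
    "Newmann": (6, 6),
    "Pogrow": (2, 4),
    "Sizer": (5, 5),
    "Stoll": (20, 20),
    "Wheelock": (10, 15),
}

lows = [low for low, _ in estimates.values()]
highs = [high for _, high in estimates.values()]
mids = [(low + high) / 2 for low, high in estimates.values()]

median_low = median(lows)    # median of the lower bounds
median_high = median(highs)  # median of the upper bounds
median_mid = median(mids)    # median of the range midpoints

print(median_low, median_high, median_mid)
```

          With ten respondents the median is the average of the fifth and sixth ordered values, so it is not sensitive to the extreme answers at either end.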
          Huge resources required. The second survey question was:
"What would be the key ingredients required to make such a shift in instruction possible in the time you envision in your answer to the first question?" Answers to this question were generally far longer than answers to the first question, but in general indicated that a large quantity and range of resources would be needed to change the course of a large educational enterprise, including professional development opportunities for teachers, leadership, community outreach, lower pupil/teacher ratios, more instructional resources, better social services for students, and reform of teacher education institutions. Jane David's summary answer was "massive teacher re-education and powerful recruitment strategies." Henry Levin's answer suggested that significant change in instruction could come about in two to five years, given the following ingredients:
continuous staff development, continuous support and technical assistance, administrative encouragement, intrinsic and extrinsic incentives, public information on results, and a culture of commitment. Add to this transformation of local teacher training programs, careful selection of new teachers, and a strong public relations campaign, and things will move. Every administrator will have to become a cheerleader.
He then added "The problem is that no district has ever been able to achieve these conditions. Further, this will be competing with basic skills testing that is often high stakes and high visibility promoted by the states."
          Adam Stoll wrote, in part:
It's immensely hard to get a critical mass of teachers within a school, let alone a district, to significantly change their practice. I would think getting a majority to exhibit practice that is highly supportive of advanced skill acquisition would be very optimistic, but possibly attainable under optimal circumstances.
          I can only imagine having 80-90 % of teachers place a lot of emphasis on "teaching the more advanced skills" if some pretty sweeping changes occurred. I think it would take at least 20 years for these changes to begin affecting practice on this scale.
          These extracts are really an inadequate summary of the observations offered by survey respondents, so I encourage readers to review their observations, reproduced in full in Appendix 9. Nonetheless, it is clear that very few of the ingredients suggested as needed for large-scale educational reform were provided in Texas in the early 1990s. This suggests why the purported "miracle" of educational reform in Texas is not only largely illusory, but indeed has had widespread negative consequences for both students and educators in the Lone Star State. After recapping the myth of the Texas miracle, I will suggest that this is a lesson from which we should learn. Myopic accountability schemes based on high stakes testing likely will have similarly perverse consequences elsewhere if we do not learn from the unfortunate story of Texas education in the last decade of the 20th century.

8.2 Recapping the Myth

          Since the territory covered in this article is extensive, let me try to sum up the journey so far. After an introduction (pointing out among other things that this writer may not be viewed by all as a totally unbiased observer of education in Texas), I summarized the recent history of education and statewide testing in Texas, which led to introduction of the Texas Assessment of Academic Skills (TAAS) in 1990-91. Since then TAAS testing has been the linchpin of educational accountability in Texas, not just for students, but also for educators and schools.
          Part 3 recounted how a variety of evidence in the late 1990s led a number of observers to conclude that the state of Texas had made near miraculous educational progress on a number of fronts. Between 1994 and 1998, the percentage of students passing the three grade 10 TAAS tests had grown from 52% to more than 70%. Also, the racial gap in TAAS results seemed to have narrowed. Statistics from the Texas Education Agency showed that over the same interval dropout rates had declined steadily. Finally, in 1997, release of results from the National Assessment of Educational Progress (NAEP) showed Texas 4th graders to have made more progress on NAEP math tests between 1992 and 1996 than those in any other state participating in state NAEP testing. These developments led to a flurry of editorial praise for the apparent educational progress of the Lone Star State. Some went so far as to suggest even that the Texas experience should serve as a model for federal education legislation.
          Part 4 began a closer examination of both TAAS and what has been happening in Texas schools over the last several decades. Section 4.1 showed that by any of the prevailing standards for ascertaining adverse impact, grade 10 TAAS results continue to show discriminatory adverse impact on Black and Hispanic students in Texas. It was also shown that use of TAAS results in isolation to control award of high school diplomas is a clear violation of professional standards concerning appropriate test use. Previously I explained how expert witnesses for the state of Texas had challenged my interpretation of the Standards for Educational and Psychological Testing, sponsored by AERA, APA and NCME. In July 2000, AERA issued a statement that, at least in my view, confirms my interpretation of the Standards. (See www.aera.net/about/policy/stakes.htm)
          Section 4.2 demonstrated that the passing scores set on TAAS tests were arbitrary, discriminatory and failed to take measurement error into account. Furthermore, analyses comparing TAAS reading, writing and math scores with one another and with relevant high school grades raise doubts about the reliability and validity of TAAS scores. Finally, it was demonstrated how a sliding scale approach (taking into account both test scores and grades) could be applied in a more professionally sound and less discriminatory manner.
          Stepping back from the arcane technology of standardized testing, Part 5 discussed problems of missing students and other mirages in Texas. First, patterns of student enrollment in Texas between 1975 and 1999 were examined by studying rates of progress from grade 9 to high school graduation, grade to grade progression ratios, and grade 6 to high school graduation rates. Without trying to summarize results of all of those analyses here, let me mention just some of the substantive findings from these analyses. In 1990-91, the ratio of Black and Hispanic high school graduates to the number of Black and Hispanic students enrolled in grade 9 three years earlier fell to less than 0.50, and this ratio remained at about or below this level from 1992 to 1999 (the corresponding ratio had been about 0.60 in the late 1970s and early 1980s). This finding indicated that only 50% of minority students in Texas have been progressing from grade 9 to high school graduation since the initiation of the TAAS testing program.
          Subsequent analyses of progression ratios for all the grades indicated that the rates of Texas students being denied promotion from grade 9 to 10 have changed sharply over the last two decades. From 1977 until about 1981 rates of grade 9 retention were similar for Black, Hispanic and White students, but since about 1982, the rates at which Black and Hispanic students are denied promotion and required to repeat grade 9 have climbed steadily, such that by the late 1990s, nearly 30% of Black and Hispanic students were "failing" grade 9 and required to repeat that grade.
          This finding led to a third series of analyses examining rates of progress from grade 6 and grade 8 to high school graduation. It was found that the rate of progress from grade 6 to high school graduation fell from about 0.75 in 1990 to less than 0.70 for White students and from about 0.65 to 0.55 for minority students. (The rate for minority students started to climb above 0.60 only in 1997, the year in which Texas was forced to raise the passing score on the GED high school equivalency tests).
          Since all this discussion of rates and ratios may well obscure what is happening – or not happening – to large numbers of children in Texas, let us take one last look at the grade enrollment data for Texas. This time I show simply numbers of students, not ratios or percentages. Figure 8.1 shows progress from grade 6 to high school graduation 6.5 years later for the Texas high school classes of 1982 to 1999 simply in terms of numbers of students (that is, total numbers of Black, Hispanic and White students).


          Also shown in this figure is the difference, that is, the numbers of students who do not make it from grade 6 to high school graduation 6.5 years later. As can be seen, the numbers of children lost between grade 6 and high school graduation in Texas were in the range of 50 to 60 thousand for the classes of 1982 to 1986. The numbers of lost children started to increase for the class of 1987 and jumped to almost 90 thousand for the class of 1991. For the classes of 1992 through 1999, in the range of 75 to 80 thousand children are being lost in each cohort. (For readers who may have not waded through all of the previous parts of this very long article and simply skipped to this conclusion, it is worth noting that as discussed in Part 7, these estimates are probably conservative, since there has been a net in-migration of people into Texas in the last two decades.)
          Cumulatively for the classes of 1992 through 1999, there were about 2.2 million enrolled in grade 6 (in the academic years 1984-85 through 1992-93). The total number graduating from these classes was about 1.5 million. In other words, for the graduating classes of 1992 through 1999, around 700,000 children in Texas were lost or left behind before graduation from high school.
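          The arithmetic behind these cumulative figures is simple enough to restate. The following is a minimal sketch using only the rounded totals quoted above (about 2.2 million grade 6 enrollees and about 1.5 million graduates); the variable names are illustrative, and the results inherit the rounding of the inputs.

```python
# Rounded cumulative totals for the classes of 1992 through 1999, as quoted in the text.
enrolled_grade6 = 2_200_000   # students enrolled in grade 6
graduates = 1_500_000         # students graduating 6.5 years later

# Students who did not progress from grade 6 to graduation.
lost = enrolled_grade6 - graduates

# Share of the grade 6 cohorts that graduated, and the share that did not.
graduation_ratio = graduates / enrolled_grade6
non_graduation_ratio = 1 - graduation_ratio

print(lost)
print(round(graduation_ratio, 2), round(non_graduation_ratio, 2))
```

          The graduation ratio of roughly 0.68 corresponds to about one student in three not graduating.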
          Section 5.4 of the article examined cumulative rates of grade retention in Texas. These are almost twice as high for Black and Hispanic students as for White students. The next section (Section 5.5) reports on estimates of dropouts by grade. It was found that most dropouts occur between grade 9 and 10 (about 16% of Black and Hispanic students and 8% of White students) but that another 6 to 10 percent drop out after grade 10 and also after grade 11. This portion of the article also shows the way in which apparent increases in grade 10 TAAS pass rates tend to disappear, if they are based not on numbers of students taking TAAS in the spring of grade 10, but instead on fall grade 9 or even fall grade 10 enrollments.
          Having been alerted to the fact that some portion of the gains in grade 10 TAAS pass rates were illusory, in Section 5.6 I next sought to estimate the numbers of students taking the grade 10 tests who were classified as "in special education" and hence not counted in schools' accountability ratings. As reported in Section 5.6, the numbers of such students nearly doubled between 1994 and 1998.
          In the closing portion of Part 5, I sought to estimate what portion of apparent gains in TAAS pass rates might be due to such forms of exclusion. It was estimated that a substantial portion, but probably less than half, of the apparent increases in TAAS pass rates in the 1990s are due to such exclusions.
          In Part 6 of this article, I sought to summarize the views of educators in Texas about TAAS, based on three statewide surveys of educators. These surveys were undertaken entirely independently, and surveyed somewhat different populations of educators. General findings from this review were as follows:
  1. Texas schools are devoting a huge amount of time and energy preparing students specifically for TAAS.
  2. Emphasis on TAAS is hurting more than helping teaching and learning in Texas schools.
  3. Emphasis on TAAS is particularly harmful to at-risk students.
  4. Emphasis on TAAS contributes to retention in grade and dropping out of school.
          Survey results indicated that the emphasis on TAAS is contributing to dropouts from Texas schools not just of students, but also teachers. In one survey, reading specialists were asked whether they agreed with the following statement:
It has also been suggested that the emphasis on TAAS is forcing some of the best teachers to leave teaching because of the restraints the tests place on decision making and the pressures placed on them and their students.
          A total of 85% of respondents agreed with this statement. In another survey, teachers volunteered comments such as the following: "Mandated state TAAS Testing is driving out the best teachers who refuse to resort to teaching to a low level test!"
          The penultimate portion of this article, Part 7, reviews a variety of additional evidence about education in Texas. Five different sources of evidence about rates of high school completion are compared and contrasted. In an effort to reconcile sharp differences apparent in these sources, a review of statistics on numbers of students, in Texas and nationally, taking the Tests of General Educational Development (GED) was undertaken. People take the GED tests in order, by achieving passing scores, to be awarded high school equivalency degrees. The review of GED statistics indicated that there was a sharp upturn in numbers of young people taking the GED tests in Texas in the mid-1990s.
          This finding helps to explain why the TEA statistics on dropouts are misleading. According to TEA accounting procedures, if students leave regular high school programs to go into state-approved GED preparation programs, they are not counted as dropouts. As Greene (1998) observed:
[A]n important misleading feature of the [TEA] reported drop-out rates is that they exclude students who were transferred to approved alternate programs, including drop-out recovery programs. If the students in these drop-out or other alternative programs subsequently drop out, it is not counted against the district. This is like reporting death rates at hospitals where you exclude patients transferred to intensive care units.
          If we put aside the TEA-reported dropout rates as misleading, differences in other sources of evidence on rates of high school completion in Texas appear reconcilable. NCES reports (based on CPS surveys) indicate that the rate of high school completion among young people in Texas in the 1990s was about 80%. This would imply a non-completion (or dropout) rate of 20%. Initially this would appear markedly lower than the non-graduation rate of at least 30% derived from my analyses of TEA data on enrollments and graduates. But the CPS surveys count as high school completers, those who receive a regular high school diploma and those who receive a GED high school equivalency degree. So it seems clear that a convergence of evidence indicates that during the 1990s, slightly less than 70% of students in Texas actually graduated from high school (e.g. 1.5 million/2.2 million = 0.68). This implies that about 1 in 3 students in Texas in the 1990s dropped out of school and did not graduate from high school. (Some of these dropouts may have received GED equivalency degrees, but as discussed in Part 7, GED certification is by no means equivalent to regular high school graduation).
          Section 7.2 examined patterns of retention in grade 9 and high school completion among states for which such data are available. Results indicated that there is a strong association between high rates of grade 9 retention and low rates of high school completion (specifically, results suggested that for every 10 students retained to repeat grade 9, about seven will not complete high school).
          Section 7.3 examined SAT scores for Texas students as compared with national results. Evidence indicates that at least as measured by performance on the SAT, the academic learning of secondary school students in Texas has not improved since the early 1990s, at least as compared with SAT-takers nationally. Indeed results from 1993 to 1999 on the SAT-M indicate that the learning of Texas students has deteriorated relative to students nationally (and this result holds even after controlling for percentage of high school graduates taking the SAT).
          Section 7.4 revisited NAEP results for Texas. Results for eight state NAEP assessments conducted between 1990 and 1998 were reviewed. Because of the doubtful meaningfulness of the NAEP achievement levels, NAEP results for Texas and the nation were compared in terms of NAEP test scores. In order to compare NAEP results with those from TAAS, the "effect size" metric (from the meta-analysis literature) was employed. This review of NAEP results from the 1990s showed that grade 4 and grade 8 students in Texas performed much like students nationally. On some NAEP assessments Texas students scored above the national average and on some below. In the two subject areas in which state NAEP assessments were conducted more than once during the 1990s, there is evidence of modest progress by students in Texas, but it is much like the progress evident for students nationally. Reviewing NAEP results for Texas by ethnic group, we see a more mixed picture. In many comparisons, Black and Hispanic students show about the same gain in NAEP scores as White students, but the 1998 NAEP reading results indicate that while White grade 4 reading scores in Texas have improved since 1992, those of Black and Hispanic students have not. More generally, however, the magnitudes of the gains apparent on NAEP for Texas fail to confirm the dramatic gains apparent on TAAS. Gains on NAEP in Texas are consistently far less than half the size (in standard deviation units) of Texas gains on TAAS. These results indicate that the dramatic gains on TAAS during the 1990s are more illusory than real. The Texas "miracle" is more hat than cattle.
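          To illustrate why an effect size makes gains on differently scaled tests comparable, here is a minimal sketch. It assumes one common variant of the metric (the change in mean score divided by a pooled standard deviation); the exact variant used in Part 7 is not restated in this summary. The function name and every number below are hypothetical placeholders chosen for easy arithmetic. They are not NAEP or TAAS results.

```python
import math

def effect_size(mean_before, mean_after, sd_before, sd_after):
    """Standardized gain: change in means divided by the pooled standard deviation.

    Assumption: equal weighting of the two standard deviations, which is one
    common convention among several in the meta-analysis literature.
    """
    pooled_sd = math.sqrt((sd_before ** 2 + sd_after ** 2) / 2)
    return (mean_after - mean_before) / pooled_sd

# HYPOTHETICAL values for two tests reported on different score scales.
gain_test_a = effect_size(mean_before=70.0, mean_after=76.0, sd_before=12.0, sd_after=12.0)
gain_test_b = effect_size(mean_before=210.0, mean_after=216.0, sd_before=30.0, sd_after=30.0)

# Both tests show a raw gain of 6 points, but the standardized gains differ
# because the score scales have different spreads.
print(gain_test_a, gain_test_b, gain_test_b / gain_test_a)
```

          In this made-up case the second test's gain is less than half the first's once both are expressed in standard deviation units, which is the kind of comparison the raw score scales alone cannot support.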
          The final section of Part 7 (Section 7.5) provided a brief review of other evidence concerning the state of education in Texas. Perhaps the most striking part of this review was the set of results from the Texas Academic Skills Program or TASP test during the 1990s. Between 1994 and 1997, TAAS results showed a 20% increase in the percentage of students passing all three exit level TAAS tests (reading, writing and math). But during the same interval, TASP results showed a sharp decrease (from 65.2% to 43.3%) in the percentage of students passing all three parts (reading, math, and writing) of the TASP college readiness test.

8.3 Testing and Accountability

          What might be the broader lessons from the Texas myth for education elsewhere? Surely there are many different ones that might be read into this story (such as the need to be wary of the party line emanating from large bureaucracies, which education in Texas seems to have become; and the importance of comparing alternative forms of evidence in order to begin to get at the truth about large and complex enterprises). But in closing, I comment briefly on only three of what I view as the broader lessons from the Texas myth story.
          Aims of Education. The Texas myth story surely helps remind us of the broader aims of education in our society. The dramatic gains apparent on TAAS in the 1990s are simply not borne out by results of other testing programs (such as the SAT, NAEP and TASP). But quite apart from test scores, surely one of the main outcomes of pre-collegiate education is how many students finish and graduate from high school. By this measure of success, surely the Texas system of education in which only two out of three young people in the 1990s actually graduated from high school should not be deemed a success, much less a miracle.
          Testing and Accountability. The TAAS testing program in Texas seems to have been spawned mainly by a yen for holding schools "accountable" for student learning. It is an unfortunately common manifestation of what has come to be called in the last several decades "outcomes accountability." As suggested above, however, quite apart from test scores, surely one of the most important outcomes of public education is how many young people finish schooling and graduate from high school. And this reminds us of the broader meaning of the term accountability (Haney & Raczek, 1994). In its broader meaning the word accountability refers to providing an account or explanation not just of consequences, but of conduct. The Texas myth story, it seems to me, reminds us of how vital it is when judging educational endeavors to return to the root meaning of the word accountability and inquire into conduct as well as consequences.
          It is of course always possible to come up with some sort of bureaucratic scheme, as in Texas, for weighing various sorts of data about schools and coming up with some kind of summary judgment about their quality. But anyone who believes in the rationality of such approaches has forgotten the old paradox of value from the field of economics. The paradox refers to the fact that many obviously useful commodities, such as air and water, have very low if any exchange values, whereas much less useful ones such as diamonds and gold, have extremely high value. According to Schumpeter's (1954) History of economic analysis, it was recognized as early as the 16th century, by "scholastic doctors" and natural philosophers that the exchange value or price of commodities derived not from any inherent characteristics of the commodities themselves but from their utility or "desiredness" and relative scarcity. Without wandering into a digression on the field of economic theory (concerning which I am an absolute amateur anyway), let me simply mention how this paradox was resolved by Kenneth Arrow. In 1950, Arrow published what has come to be known as his "impossibility theorem," in an article modestly titled "A difficulty in the concept of social welfare." In this article, Arrow proved mathematically that if there are at least three alternatives which members of society are free to order in any way, any social welfare function yielding an ordering based on those preferences violates one of three rational conditions (as long as trivial and dictatorial methods of aggregation are excluded). In short Arrow's "impossibility theorem" extended Pareto's finding about the immeasurability of general social welfare.
          Hazards of High Stakes Testing. More than anything though, the Texas miracle story shows us the hazards of high stakes testing. It is, of course, possible to impose a "whips and chains" test-based accountability system on schools (as Schrag, 2000, described the Texas approach). Yet the Texas miracle story shows us the need to return standardized testing to its rightful place, as a source of potentially useful information to inform human judgment, and not as a cudgel for implementing education policy.

