Neglecting Democracy in Education Policy: A-F School Report Card Accountability Systems

Education Policy Analysis Archives Vol. 25 No. 109

Sixteen states have adopted school report card accountability systems that assign A-F letter grades to schools. Other states are now engaged in deliberation about whether they, too, should adopt such systems. This paper examines A-F accountability systems with respect to three kinds of validity. First, it examines whether these accountability systems are valid as a measure: do these systems validly measure school quality? Second, it examines whether they are valid as a policy instrument: how far do A-F accountability systems fulfill the stated aims of their proponents (empowering parents, providing “simple” and “common sense” measures of educational quality, and so on)? Finally, it examines whether A-F systems are valid as a democratic framework: how well do these systems align with the broader goals of educating students for democratic citizenship and of incorporating parents and community members in democratic deliberation about policies for their public schools? The paper concludes that A-F accountability systems are invalid along each of these lines, and provides recommendations for democratically developing and implementing criteria for school assessment.

To be clear, we evaluate A-F accountability systems from the normative framework of democracy and democratic values.[3] We take public education to be the bedrock of democracy and the wellspring of robust democratic citizenship. On our view, there can be no evaluation of education policy, and no inquiry more generally, entirely shielded from values (Longino, 1990; Putnam, 2002). Because values cannot be drained away from policy evaluation, we stand upon defensible values, namely, democratic values.

School Report Cards
Sixteen states have adopted accountability systems that assign A-F grades to schools.[4] Following the passage of the Every Student Succeeds Act (ESSA), other states are now engaged in deliberation, often contentious, about whether they too should adopt such systems, and how they should be conceived and implemented. Measures used to determine A-F grades for schools vary by state but often include graduation rates, ACT/SAT participation and scores, standardized student achievement test scores, growth in academic test scores, and attendance rates.
A-F grades have associated rewards and punishments, which vary by state. In Florida, for example, the Opportunity Scholarship Program allows students who have attended schools earning either one "F" or three consecutive years of "D" grades to exit and enroll in higher-performing public schools within their district or any other district in the state, provided space is available.[5] The A-F accountability system in Indiana requires the State Board of Education to intervene with a menu of options in schools that have received an "F" grade. Options include merging the school with a nearby higher-performing school, assigning a "special management team" to operate all or some part of the school, closing the school, and revising the school's improvement plan, among others.[6] Such state sanctions are examples of direct or bureaucratic accountability: systems where state officials determine rewards and punishments.
Typically, however, A-F school grading systems also incorporate market accountability: systems that allow parents and students to make choices about leaving one particular school for another, taking funding with them. Vehicles for market accountability are often choice and voucher programs. For example, the Indiana Choice Scholarship Program provides eligible students with state funding for partial or full tuition costs at participating choice schools, including religiously affiliated schools.[7] Such programs make schools indirectly accountable; when information about their performance is disseminated in A-F grades, families decide whether students will remain in a school or not. Proponents of choice systems maintain that allowing parents to remove their children from schools receiving low grades will ultimately ensure that only high-performing schools survive.
A-F school grading systems have considerable intuitive appeal to policymakers and parents as a good way to convey the quality of schools, to foster parental participation, and to spur school improvement. There is reason to become skeptical of the validity of A-F school grading systems, however, when one considers their rationales and features more carefully, as we do in this paper. Below we look closely at rationales states have offered for implementing state A-F report card systems.

Rationales for School Report Cards
Implemented over the last 17 years or so, the A-F grading systems are a somewhat recent variation within the accountability movement in public education (Meens & Howe, 2015). Florida was the first to adopt an A-F system. Jeb Bush, then governor of Florida, worked with the state legislature to craft and implement his "A+ Education Plan" in 1999, which put school A-F grades at the center. Students who attended schools that received an "F" two out of four years were eligible either to attend a higher-performing public school or to receive a voucher that could be used to attend a participating private school (Figlio & Lucas, 2004). While Florida policymakers have substantially revised the original A+ Plan, A-F grades remain central to Florida's accountability system. Fifteen states have now followed Florida in constructing accountability measures around A-F school grades.
Rationales given for A-F systems are strikingly similar across states, as if they reverberate in an echo chamber. Florida is frequently cited as an obvious success of A-F systems, and other states frequently cite similar (or indeed, identical) rationales when they choose the A-F path. For example, Jeb Bush's Foundation for Florida's Future (n.d.) argues: "Assigning a letter grade (A-F) is a way to report a school's effectiveness in a manner everyone can understand. Used along with rewards for improving schools and support for schools that need to improve, grading schools encourages them to make student achievement their primary focus." Similarly, the Arizona Department of Education (2013) states "the A-F Letter Grade System was created to provide clear, easy to understand information to parents so that they could base their educational decisions on the best information available about the overall academic performance of schools and districts/charter holders." And in Utah, A-F proponents contend: "With this important accountability system in place, Utah is empowering everyone - whether school administrators, parents, classroom teachers or citizens - to make informed choices and to identify ways to strengthen and improve all of our schools for the benefit of every student in Utah" (Utah State Senate, 2013). School report cards, proponents suggest, "give schools a tool to encourage more parental and community involvement." Such involvement is assumed important because "schools with higher levels of parent and community involvement have a better chance of succeeding" (Utah State Senate, 2013).
Making an explicit link to the Florida system, Utah's school grading website prominently features a quote from Jeb Bush ("what gets measured gets done") and provides other rationales that reference Florida. The Indiana Department of Education (n.d.) suggests that "giving schools letter grades for their performance - just as we do for our students - ensures parents, students, educators and communities understand how their schools are performing." They write further that "Indiana's A through F grading system gives parents, students, educators and communities a clear and concise assessment of how well their schools are doing." The West Virginia Department of Education (n.d.) echoes Indiana: "giving schools letter grades for their performance - just as we do for our students - ensures parents, students, educators and communities understand how their schools are performing." Furthermore, "West Virginia's A-F school grading system gives parents, students, educators and communities clear and concise information on how well their schools are doing."

Private organizations such as Michelle Rhee's Students First, Jeb Bush's Foundation for Excellence in Education, and the American Legislative Exchange Council (ALEC) have added significant voices to the echo chamber, advocating for the creation of more such A-F accountability systems. Students First (2013a) had, until it merged with 50CAN, assigned A-F grades and GPA scores to states based on the extent to which they "empower parents," "elevate the teaching profession," and "spend wisely and govern well," which the organization took to require, among other policies, assigning A-F grades to all K-12 schools. Students First writes: "A simple, common sense solution is to provide families with easy-to-understand annual school report cards, much like parents already receive about their kids' performance" (2013b, p. 1). ALEC has also endorsed A-F letter grades. Describing the adoption of letter grades in North Carolina, ALEC contends that A-F grades are "a crucial step toward increasing transparency in the system"; such grades, one ALEC report argues, describe school performance "on a universally understood scale" (Ladner & Myslinski, 2014, p. 2).
The chorus in favor of A-F systems, then, seems to be singing the same refrain: A-F systems supposedly are clear, concise systems that let everyone know how schools are doing and encourage parents to be involved in school choices and systems. Embedded in these claims, however, are several assumptions that need to be closely examined. One assumption is that these systems accurately and adequately measure what they purport to measure (school quality) and that they actually advance goals they purport to advance (parental empowerment, democratic engagement and citizenship, and so on). Another assumption is that fostering the democratic aims of education need not be among the considerations that go into designing accountability systems and assessing their validity. The following segments provide a close examination of these assumptions, finding them questionable at best.

The Validity of School Report Cards as a Measure of School Quality
Do state A-F school grades serve as valid indicators of school quality? Space limitations do not permit a description of each of the 16 state systems. To be sure, there are differences among state plans (see Table 1 below for detail on individual state systems).
Despite their proliferation and variation, there has been relatively little credible research on how far these state systems validly measure school quality. What is known comes primarily from a set of papers produced by university researchers in Oklahoma at The Oklahoma Center for Education Policy and The Center for Education Research and Evaluation. These reports raise substantial doubts about the validity of the Oklahoma A-F system as a measure. To our knowledge, these papers provide the best and most rigorous examinations of the validity of A-F school grading systems as a measure of school quality to date, and so we rely heavily on them in this analysis.
We found that all state A-F school grading systems share four pivotal features with Oklahoma's: (1) school quality is summarized in a single composite letter grade[8] on (2) a 5-point categorical scale (3) using proficiency levels to measure academic achievement. And (4): A-F school report cards are composite scores of unmediated outcomes. This fourth feature implicitly assumes that the school itself is primarily, if not exclusively, responsible for student performance. Because the four features are, indeed, shared across all state A-F systems, the findings from Oklahoma provide a source of criticisms that generalize relatively straightforwardly across other state systems. Questions about and criticisms of each component follow.

A Single Composite Grade
A single composite score as an index of school qualities is a dubious proposition. It is by no means clear what a single grade can mean across such a diverse array of criteria: achievement, attendance rates, dropout rates, advanced class offerings, and so on (see Table 1 for an illustration of the range of possible criteria). Little, if any, attention is paid to how to justify combining the diverse components of each grade to render a value on the A-F scale. For example, in addition to whether or not to include attendance as a criterion, policymakers have to decide how heavily to weight it if they do: 10%? 20%? Should improvement in achievement levels be calculated, or should only raw achievement scores be included? The selection and weighting of criteria seem to have no basis other than the seat-of-the-pants intuitions of policymakers woefully lacking in technical knowledge and skills.
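The sensitivity of a composite grade to the choice of weights can be illustrated with a minimal sketch. Everything in it is hypothetical: the two schools, their indicator scores, both weighting schemes, and the cut scores are invented for illustration and are not taken from any state's formula.

```python
# Hypothetical indicator scores on a 0-100 scale (not real data).
schools = {
    "School X": {"achievement": 90, "growth": 60, "attendance": 70},
    "School Y": {"achievement": 70, "growth": 85, "attendance": 90},
}

# Two hypothetical weighting schemes, expressed in percent (each sums to 100).
weights_a = {"achievement": 70, "growth": 20, "attendance": 10}
weights_b = {"achievement": 30, "growth": 40, "attendance": 30}

# Hypothetical cut scores for translating a composite into a letter.
CUTS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def composite(indicators, weights):
    """Weighted average of the indicators, on the same 0-100 scale."""
    return sum(indicators[name] * w for name, w in weights.items()) / 100


def letter(score):
    """Return the first letter whose cut score the composite reaches."""
    for cutoff, grade in CUTS:
        if score >= cutoff:
            return grade
    return "F"


def grade_all(weights):
    """Grade every hypothetical school under one weighting scheme."""
    return {name: letter(composite(ind, weights)) for name, ind in schools.items()}


print(grade_all(weights_a))
print(grade_all(weights_b))
```

Under these assumed numbers the two schools trade places when the weights change, even though nothing about either school has changed. The sketch only shows that the ranking is a product of the weighting decision; it makes no claim about which weights any state actually uses.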

Five-Point Scale
A-F grades exemplify a crude categorical scale. This produces considerable imprecision. Schools with the same grade are represented as equivalent when they can differ substantially. Within the five categories differences are rendered invisible, and there is no way of knowing if the difference between an "F" and a "D" is of the same magnitude as the difference between a "D" and a "C," or if the difference between a "C" and "B" is of the same magnitude as the difference between a "B" and an "A." But the problem goes deeper than simply imprecise scaling. Successfully remedying the problem of the imprecision of the A-F scale assumes that the grades are potentially intelligible, if imprecise, indicators of school quality, which is by no means evident. The numerical intervals of computed composite scores that are translated into the various grades, like the weighting of the various criteria that go into the computations, have no firmer basis than unprofessional intuition. The fundamental problem here, that a more precise scale cannot remedy, is the assumption, discussed in (1), that a single composite score for school quality is meaningful.
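The imprecision of a five-category scale can be made concrete with a small sketch. The cut scores and the three composite scores are hypothetical, chosen only to show how a categorical scale can separate near-identical schools while equating quite different ones.

```python
# Hypothetical cut scores; no state's actual intervals are implied.
CUTS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def letter(score):
    """Return the first letter whose cut score the composite reaches."""
    for cutoff, grade in CUTS:
        if score >= cutoff:
            return grade
    return "F"


# Hypothetical composite scores for three schools.
composites = {"School P": 79.9, "School Q": 80.0, "School R": 89.9}
grades = {name: letter(score) for name, score in composites.items()}

print(grades)
```

With these assumed values, Schools P and Q differ by a tenth of a point yet receive different letters, while Schools Q and R differ by almost ten points and receive the same one.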

Proficiency Levels as Measures of Academic Achievement
The Oklahoma findings reveal serious problems of imprecision and lack of interpretability associated with the use of proficiency levels to represent the academic achievement component of school grades. Thirty-three percent of Oklahoma school grades are based on student achievement values. However, the numerical test scores are grouped into only four proficiency levels: unsatisfactory, limited knowledge, proficient, and advanced. It is these calculated proficiency levels that are used in the grading formula, and also in calculations of academic growth, weighted at 34% in the grading formula. The procedure of converting original test score data to proficiency levels and using the new proficiency data to produce values for achievement and growth introduces unnecessary imprecision because it "amounts to throwing away information about examinee test performance" (Oklahoma Center for Education Policy & Center for Educational Research and Evaluation [OCEP & CERE], 2013a, p. 12) and thereby masks otherwise detectable differences in student academic performance within proficiency levels (Dean Ho, 2008).
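A minimal sketch can show how grouping scores into proficiency levels discards information. The four level names come from the text above; the scale-score cut points and the student scores are hypothetical and do not reproduce Oklahoma's actual scale.

```python
from bisect import bisect_right
from collections import Counter
from statistics import mean

LEVELS = ["unsatisfactory", "limited knowledge", "proficient", "advanced"]
CUT_SCORES = [600, 700, 800]  # hypothetical lower bounds of the top three levels


def proficiency_level(score):
    """Map a scale score to one of the four proficiency levels."""
    return LEVELS[bisect_right(CUT_SCORES, score)]


def level_counts(scores):
    """Count how many students fall into each proficiency level."""
    return Counter(proficiency_level(s) for s in scores)


# Hypothetical scale scores for students in two schools.
school_1 = [700, 705, 710, 715]  # barely above the "proficient" cut
school_2 = [785, 790, 795, 799]  # just below the "advanced" cut

print(level_counts(school_1), level_counts(school_2))
print(mean(school_1), mean(school_2))
```

In this assumed example both schools report every student as "proficient," so any formula built on proficiency levels treats them as identical, although their average scale scores differ by more than 80 points.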
Such conflating of data muddies its interpretation. Empirical analysis of Oklahoma school grades revealed, for example, that there were practically no differences in average science and reading scores among "A," "B," and "C" schools. Students in "C" schools had higher average science scores than students in "B" schools. And students in "F" schools appeared to have had higher average reading and math achievement than students in "D" schools. Further, certain schools with lower letter grades performed better in mathematics than schools with higher letter grades (OCEP & CERE, 2013b, pp. 12-14). Here it may be asked: "If a letter grade, which is based primarily on standardized test scores,[9] does not necessarily tell us anything about school differences in reading, math, and science outcomes, what does it tell us" (OCEP & CERE, 2013b, p. 13)? The answer here seems to be that it tells us very little or nothing. To be meaningful, the letter grade would need to represent a school's performance pattern, but it turns out that within-school variation across subject areas, across grades, and across the academic year fluctuates a great deal. Thus, it is never clear what an "A" is or what an "F" indicates (OCEP & CERE, 2013b, p. 5).

A-F School Report Cards as Composite Scores of Unmediated Outcomes
The findings of the celebrated Coleman Report (Coleman et al., 1966), produced 50 years ago, have proved to be impressively robust: schools account for a remarkably small amount of the variance in student achievement scores (perceived as remarkably small in the mid-1960s) (Borman & Dowling, 2010). Credible empirical research continues to show that school effects typically account for less than 30% of student academic performance (Nye, Konstantopoulos, & Hedges, 2004; Rockoff, 2004; Rowan, Correnti, & Miller, 2002). Using only student academic performance and other isolated outcome measures to assign A-F school grades is, then, confusing, or even deceptive, because it ignores and obscures many important factors that contribute to school performance. Letter grades ignore, for example, the well-documented correlation between socioeconomic status and attendance and graduation rates (OCEP & CERE, 2013a, p. 5) and they attribute academic proficiency changes directly to schools that students attended only most recently (OCEP & CERE, 2013a, p. 15). The "primary assumption of the A-F accountability system, that student test scores can be dissected and manipulated into valid indicators of school performance, is simply false" (OCEP & CERE, 2013b, p. 8).
Two more recent papers examining the Oklahoma A-F system, produced by the same Oklahoma researchers, corroborate these concerns about the validity of school report cards as a measure of school quality. The papers document a number of flaws in the Oklahoma letter grades. The researchers find, for example, that the Oklahoma letter grades tend to hide, rather than reveal, achievement gaps. They write: "minority and FRL students in the lower ranking schools outperformed their minority and FRL peers in higher ranked schools… Further, FRL students in the lowest performing schools actually had higher average achievement than their FRL peers in the highest ranked schools" (Adams et al., 2016a, p. 15). More generally, they doubt the "informational significance" of A-F letter grades: their ability to validly measure and express school quality. They write: "After removing achievement variance attributed to factors unrelated to teaching or school effectiveness, letter grades were unable to differentiate schools by average student achievement… Informational significance is lost on grades that hide achievement variance within and between schools, making any diagnostic and improvement use of A-F grades ineffectual" (Adams et al., 2016b, p. 23). In sum, they find that "school grades do not accurately represent achievement patterns within schools, nor are they suitable for distinguishing between higher performing and lower performing schools" (Adams et al., 2016a, p. 19).
Despite such weaknesses, A-F school report cards are one among many school accountability systems spawned by No Child Left Behind's mania for assessment. State after state claims that school grades are intuitive and easy for parents and the public to understand, since they are analogous with subject matter grades, with which virtually everyone is familiar. School grades are thus touted as providing valuable information to parents in their decision-making about schools, facilitating increased and more effective participation on their part, and ultimately fostering school improvement.
These are largely claims about the validity of A-F school grading as a policy instrument, the topic of the next section. However, we make the preliminary observation here that it is unlikely that such grading systems can accomplish purported policy objectives if they fail on the prerequisite of validity, that is, if they do not in fact accurately measure school quality. And they do in fact fail: as we show above, they do not and cannot provide an accurate assessment of school quality. Although there is some evidence that parents do, indeed, find school report cards useful in evaluating schools, especially when presented with appealing graphics (Mikulecky & Christie, 2014, pp. 9-13), this is a case in which the perceived "face validity" of school report cards (the intuitive perception of validity) surely goes awry. "If [an A-F grading system] seems easy to understand, it is only because the use of a single indicator to represent something complex is familiar. We are used to letter grades. A truly comprehensive evaluation system is best not boiled down to a single value because it masks the very complexity it is trying to capture" (OCEP & CERE, 2013a, p. 18). The formulas by which school report cards are computed are often not readily available, and are inscrutably byzantine in any case. It would require a very atypical parent, indeed, to understand what the grades mean, particularly when it is by no means clear that they have any coherent meaning at all.
One final observation about the validity of A-F school grades as a measure of school quality: to our knowledge, no state A-F system includes among its criteria democratic citizenship, the ability to engage in democratic dialogue with diverse others, and other public and civic educational outcomes.[10] How far can a letter grade that makes no mention of democratic citizenship validly measure school quality in a democratic society?
In sum, there are very strong reasons to reject the validity of A-F school grading systems, as currently conceived and implemented, as a measure of school quality. But the problems that beset A-F school grading systems apply not just to current systems. There are no technical fixes: the single summary evaluation on a crude five-point scale is irremediably flawed.

The Validity of School Report Cards as a Policy Instrument
The question of validity as a policy instrument of A-F grading systems is the question of how far such systems succeed in fulfilling proponents' stated aims. Above, we detailed evidence of an "echo chamber," where rationales for A-F school grading systems were similar, or indeed identical, across the states.
We identified three rationales commonly articulated by proponents: (1) A-F school grades provide "simple" and "common sense" information to parents and communities about the education of their children. (2) By providing such information, A-F school grades encourage and empower citizens, parents, teachers, and administrators to participate in and take rational control of decisions about schooling. (3) A-F school grading systems work to improve schools to everyone's benefit, as enabled and fostered by the realization of rationales (1) and (2). We argue that there are good reasons to doubt each of these rationales.
Rationale 1 (letter grades provide parents and communities with clear information about school performance) is undermined thoroughly by the analysis of the previous section. However simple and common-sense school report cards may appear to the untrained eye, a modicum of technical analysis reveals them to be patently invalid representations of school quality. As previously observed, it follows that because school report cards are invalid as a representation of school quality, so must be policy instruments based upon them. The invalidity of school report cards as a representation of school quality leaves Rationale 1 adrift, anchored in nothing.
Like Rationale 1, Rationale 2 (A-F school grades encourage and empower citizens, parents, teachers, and administrators to participate in and take rational control over decisions about schooling) finds its warrant in no more than common sense, apparently, for supporters cite no empirical research in its defense. We found little empirical research that speaks directly to the issue. We did find, however, a small set of recent studies on the general relationship between state accountability systems and parents' and citizens' attitudes toward government, their political participation, and their involvement in the education of their children. When the findings of these studies are extrapolated to school report card systems, they undermine the claim that A-F grading empowers stakeholders.
Specifically, one study found that "parents residing in states with more developed assessment systems express significantly lower trust in government, substantially decreased confidence in government efficacy, and much more negative attitudes about their children's schools" (Rhodes, 2015, p. 3). Accountability policies "demobilize parents by excluding them from key educational decisions and enmeshing their children's schools in a punitive testing context that elicits parental anxiety and dissatisfaction" (p. 3). Significantly, parents in these states were less likely to participate substantively in the education of their children. When parents are alienated from democratic deliberation about public schooling, as they are in an A-F environment, they come to hold negative attitudes about schools in particular and government generally; in this way, they are actually separated from substantial democratic involvement with schools. Thus, rather than enhancing parental participation, more highly developed accountability systems, such as those exemplified by A-F school grading systems, actually suppressed it.[11]

Another recent study found very little evidence that the school performance information made available through school report cards in Ohio has been used by voters as they vote for school board members or by school board members as they make decisions about staffing. Indeed, the study finds "no evidence that voters act on these state or federal performance designations nor that school boards respond to them when making staffing decisions" (Kogan, Lavertu, & Peskowitz, 2016, p. 658). More generally, the study "indicates that despite the wide dissemination of simple and clear performance information, there is little evidence that electoral pressure served as a mechanism that motivated school board members to improve the quality of public education in Ohio" (p. 659). The study undermines the foundation of Rationale 2: if no evidence can be found that citizens and elected officials reliably use school performance information made available by school report cards, report cards cannot be said to empower citizens and elected officials to participate in and take rational control over decisions about schooling.

[11] A-F school grading systems meet many of Rhodes' criteria for determining which accountability systems count as "highly developed" and thereby suppress parental participation. These highly developed accountability systems include: (1) school ratings to measure school performance, (2) a statewide student identification system, allowing the state to link student test scores with schools or teachers, (3) rewards for high-performing or improving schools, (4) assistance to low-performing schools, and (5) sanctions for lower-performing schools. Hence Rhodes' arguments apply broadly to A-F systems.
Rationale 3 (A-F school grading systems work to improve schools to everyone's benefit, as enabled and fostered by the realization of Rationales 1 and 2) fails along with the others because of the cumulative relationship it bears to them. There are still further problems with this claim. As observed previously, the factors incorporated into A-F school report cards are confined to student academic performance and other outcome measures in isolation from the social, cultural, and economic context and from the policies, practices, and level of resources of schools. This is the source of two significant problems.
First, confining evaluation criteria to student academic performance and other outcome measures in isolation from the social, cultural, and economic context and from policies, practices, and resources of schools is unfair to teachers, administrators, students and others: it holds them fully accountable for outcomes that they have limited power to produce. Two of the cardinal requirements for fairly implementing high-stakes testing are: (1) that all students are taught in conditions that provide a fair opportunity to learn test material, and (2) that the validity of reporting categories (proficiency levels, for example, or A-F grades) be established (American Educational Research Association, 2000). Neither of these requirements is met by school report card systems.
The issue of fairness to those being held accountable is particularly germane to bureaucratic accountability, where rewards and sanctions follow directly from the report card evaluations and are assumed to be drivers of improvement. The so-called theory of action underlying bureaucratic accountability may be questioned (Lee & Reeves, 2012; National Research Council, 2011). Citing a recent white paper authored by an impressive group of educational testing policy scholars (Baker et al., 2010), the Oklahoma researchers contend that "it is a myth to think that using student test scores to punish or reward schools is a driver of improvement" (OCEP & CERE, 2013b, p. 27). In the view of these researchers, failure to improve academic outcomes emerges not from individual actors' failings, but rather from lack of necessary resources. Given that A-F letter grades and consequent interventions in Oklahoma do not meaningfully address profound differences in capacity and school resources, there is little reason to believe that they will strengthen schools.
The second significant problem with confining evaluation criteria to student academic performance and other outcome measures in isolation is that it precludes the capacity to produce the formative knowledge needed to improve performance on desired outcomes. In collapsing information from a limited number of outcome measures, grading plans divert attention from how school policies, practices, and resources interact with out-of-school factors and the characteristics of diverse students to produce (or fail to produce) desired educational outcomes. The focus on isolated outcomes, combined with the crude summary evaluations that grades on an A-F scale provide, undermines the claim that A-F grading systems function in general to improve schools. In fact, they are particularly ill-suited to address group-based gaps in achievement. In Oklahoma, for example, A-F letter grades tended to obscure, rather than reveal, within-school achievement gaps. Schools marked "A" and "B" were found to be least effective for minority students and students receiving free or reduced-price lunch (FRL) (OCEP & CERE, 2013b, p. 27). As stated before, FRL students attending "D" and "F" schools had better average math, reading, and science scores than FRL students in "A" and "B" schools. The measure of school quality embedded in the Oklahoma A-F system is blind to achievement gaps. Rather than making them visible and thus allowing communities and policymakers to address them, letter grades in this case have rendered them invisible, subsuming them into differences between schools.
Almost all state plans include achievement growth as a general criterion in addition to achievement growth in the lowest quartile as a distinct criterion. Growth measures serve as a way of controlling for the influence of different student characteristics by measuring the difference between student achievement at the beginning and the end of a given period of time, on the presumption that what happens in schools causes whatever differences exist. But this is hardly sufficient to overcome the problems associated with an exclusive focus on school outcomes: it neglects the role of social, cultural, and economic factors outside of schools, as well as of the policies, practices, and resources of schools, all of which play a significant role in producing those outcomes.
Before proceeding, we consider studies that have found that A-F accountability systems have driven limited school improvement. Examining letter grades in Florida and New York City, these studies find, in sum, that receiving an "F" grade boosts student achievement as measured by test scores, but that no other letter grade promotes school improvement. These studies typically suggest that school improvement associated with receipt of an F grade is spurred on by the "shaming effect" of school report cards. One study finds that schools in New York City "receiving a failing grade realized positive effects in English the first year of the sanction" but finds "no evidence that receiving letter grades other than F had positive effects" (Winters & Cowen, 2012, p. 313). Against expectations, the results of receiving a D grade "appear to [have] been negative, not just in year 1 but in the second year as well" (p. 326). Another study of New York City finds that "summary letter grades drove improvements in student test scores in New York City schools that received an F grade" but that the "magnitude of the effect did appear to drop over time" (Winters, 2016, p. 9). Yet another study of New York City found that "the new accountability system put in place in New York City had important effects in the months that followed its launch in the fall of 2007. Math and English test scores improved in schools that received very low accountability grades" (Rockoff & Turner, 2010, pp. 145-146). Finally, a study of the Florida A-F letter grade system found that "schools receiving an 'F' grade are more likely to focus on low-performing students, lengthen the amount of time devoted to instruction, adopt different ways to organize the day and learning environment of the students and teachers, increase resources available to teachers, and decrease principal control" (Rouse, Hannaway, Goldhaber, & Figlio, 2013, p. 275).
While these papers provide support for school letter grades in a limited range, we remain deeply skeptical of A-F systems. First, these studies presume that A-F letter grades are clear and meaningful measures of school quality to begin with. As noted above, there is good reason to doubt that letter grades validly measure and express school quality. Second, the positive effects of A-F letter grades are relatively minor, affecting only certain schools receiving F grades, and still fall well short of the educational benefits promised by their proponents. Indeed, these positive effects may well be outweighed by the negative consequences documented above. Third, as noted above, report cards neglect the bulk of the factors that account for student achievement: effects beyond the walls (and control) of schools. For this reason, A-F systems may well distract citizens and elected officials alike from democratic discussion about these out-of-school effects, including poverty and socioeconomic status. Fourth, A-F systems presume that the conception of schooling and achievement embedded within them is suitable for democratic society, which is by no means clear. We say more about this fourth concern below.
In summary, there are strong reasons to doubt that A-F school grades fulfill the aims articulated by their proponents and are valid as a policy instrument. Their neglect of contextual features, and of the policies, practices, and resources of schools, renders them ill-suited to drive school improvement. Rather than working to empower parents and community members in a way that promotes school involvement, they are more likely to alienate parents from democratic participation in the education of their children.

The Validity of School Report Cards as a Democratic Assessment Framework
Even if A-F school grades proved valid as a measure of school quality and valid as a policy instrument (which they do not), there are still strong reasons to hold that they are invalid as a democratic assessment framework. They are unsuited to guide schooling in democratic society for (at least) three reasons: first, they are blind to democratic educational outcomes; second, they impose a (neoliberal) conception of schooling with little apparent consideration of the range of competing educational and social visions; and, third, with anti-democratic consequences, they appear to presume that some "pure" conception of schooling and school quality, insulated from the political and ethical values of researchers, policymakers, and citizens, can be discovered and used to drive educational improvement. We detail each of these concerns below.

Neglecting Democratic Educational Outcomes
A-F systems appear to ignore entirely the fundamental place of schooling in preparing democratic citizens to engage in collaborative democratic deliberation. They are blind to democracy and democratic citizenship. No state A-F system measures directly the educational outcomes required to foster an effective democratic citizenry: civic engagement, the ability to engage with diverse others in authentic deliberation, understanding beliefs to be revisable and indeed revising them in light of contradictory evidence, working to maintain the conditions of democratic society, and so on. The general educational vision contained in A-F systems neglects, and undermines by crowding out, the role of schools in cultivating in students the prerequisite for democratic deliberation: democratic character, which includes the knowledge, abilities, and dispositions needed for effective participation in democratic politics. Michele Moses and John Rogers argue that democratic citizens must develop both capacities for and commitments to democratic deliberation, such as listening, weighing evidence, communicating with people from diverse backgrounds, and thinking critically about, rather than merely in accordance with, authority (2013, pp. 207-216). Except tangentially, no difference between "A" and "F" schools can tell us whether schools succeed in preparing students to be good democratic citizens or not. Schools that are granted "A" letter grades in existing accountability systems could be meeting these democratic educational ends considerably less well than schools receiving lower grades.
Post-NCLB accountability systems, which include A-F school grades, have driven a narrowing of the curriculum away from democratic educational outcomes, especially away from the curricular content necessary for cultivating the democratic character (Meens & Howe, 2015). The intense focus on content knowledge, particularly English and mathematics, created by accountability systems has significantly limited attention to other subjects and goals, including democratic outcomes (Robelon, 2011). There is little reason to believe that A-F systems will promote, without substantial revision, democratic education. Certainly they are not aimed directly at cultivating "critical habits of the mind and the inclination to deliberate and debate conscientiously on matters of social importance" which are central to democratic character (Howe & Meens, 2012, p. 12). A-F systems are thus invalid as a democratic framework: they do little to promote democratic educational ends and indeed risk crowding these ends out of schooling.
That A-F systems do not promote democratic education is not some abstract concern. Much hangs on whether or not all students, especially those who belong to historically marginalized groups, are given the tools necessary for participating in democratic politics. In democratic society, these students should be provided the abilities and knowledge for protesting the unjust circumstances into which they have been thrown, for giving voice to their experiences and making those voices forceful in democratic politics. Otherwise, their experiences and voices are denied, subsumed into dominant and narrow representations of how schools and society ought to be organized. And they are too often forced to comply with these dominant representations even as these representations diminish their own experiences and force them into alienating social and economic positions. Any accountability system that fails to recognize the responsibility to cultivate the democratic character might well be said to help maintain existing injustice along lines of social class, gender, race, sexual orientation, and so on. To deny these historically marginalized groups the very tools necessary for participating in democratic politics is to collaborate in the process of consciously reproducing the highly unequal status quo. In this way, existing A-F systems are complicit in maintaining the existing social order and, consequently, the power and status of those who benefit from contemporary power arrangements.
There is another side of this coin. When A-F systems neglect democratic educational outcomes, the problem is not only that historically marginalized groups are denied the tools needed for active democratic participation. A further, and less documented, problem is that academic, social, political, and economic elites are educated to be what Elizabeth Anderson (2010) calls "democratically incompetent." They too are denied the tools needed for robust democratic citizenship. While they have little trouble dominating political life, the elite are nonetheless incompetent, practicing an impoverished form of democratic citizenship at best: they are typically unresponsive to the needs and aspirations of a large swath of fellow citizens, and instead govern in their own image and, typically, to their own benefit. It is apparent that "certain kinds of knowledge, as well as ignorance, exist at both ends of the hierarchy of advantage" (Howe, 2015, p. 198). But school report cards do little, or nothing, to promote robust democratic citizenship at either end of the spectrum of power. In their neglect of democratic educational outcomes, then, A-F systems doubly exacerbate democratic inequality and consequent social and economic inequality.
Education policy that neglects democracy and democratic citizenship is not merely blemished; it is thoroughly wrong-headed from the start. Democratic values should not be seen as optional in education research and policy, one among many sets of values that might be promoted. Instead they should be seen as foundational, threaded into the fabric of good research and policy. No other institution is better situated to promote democratic citizenship than public schooling. Democracy is flimsy, no more than a pattern of behavior among citizens supported by institutions themselves constituted by patterns of behavior. Neglected in educational activity, policy, and research, it can wane. A-F systems, and education policy in general, cannot be properly evaluated in isolation from these normative considerations about the role of education in promoting and sustaining democracy.

Imposing (Neoliberal) Conceptions of Schooling and School Quality
Though in a democracy citizens should be invited into deliberation about schooling, A-F systems appear to impose a particular conception of schooling and school quality with little or no consideration of competing educational and social visions. Questions about the validity of school report cards as a measure of school quality and as a policy instrument cannot be, and should not be, abstracted from the broader normative discussion about the place of education within a robust democracy. Typically, however, there is little or no public deliberation about which specific outcomes need to be incorporated into assessment systems. For example, while such outcomes as job preparation commonly are promoted, there is little discourse about why such preparation is essential, how it is best defined, or how the need for such a practical outcome might be balanced with others, such as preparation for participation in active citizenship. Criteria reflect particular political commitments, and they currently are being imposed with little reflection on the range of possible educational and social values.
In contrast, in a democratic society the question of how schools ought to be structured should be subject to ongoing democratic deliberation (Gutmann, 1999). Implementation of particular visions should be open to revision as new reasons and contexts evolve. Proponents of A-F systems claim they produce democratic engagement as a matter of course, as when, for example, Indiana policymakers state: "The greatest benefit of the A through F school grading system is heightened community awareness and increased dialogue and action among education stakeholders" (Indiana Department of Education, n.d.). And yet, existing evidence suggests that A-F systems tend instead to stifle democratic control over educational structures.
But the problem is not only that A-F systems presume, and thereby impose, a conception of schooling and school quality. A further concern is that the presumed view, rooted in neoliberalism, is undesirable for democratic society. We say more about neoliberalism below, before describing how report cards tend to promote a distinctly neoliberal conception of schooling.
For almost forty years, neoliberalism has been the ascendant political and economic framework, remaking political and economic life (Burgin, 2012; Mirowski & Plehwe, 2009; Stedman Jones, 2012). It has shaped education research and education researchers alike. The core of neoliberalism is the conviction that free markets should be spread to more and more domains of human life. For the neoliberal, the market is the best and most efficient mechanism for producing and distributing goods (Harvey, 2005; Peck, 2013). It is seen, further, as happily compatible with individual human freedom.12 The neoliberal "targets institutions and activities which lie outside of the market, such as universities, households, public administrations and trade unions… in order to bring them inside the market through acts of privatization" (Davies, 2014, p. 310). Only through the extension of the market can efficiency and individual freedom be achieved.
The development of neoliberalism as a distinct political framework began, roughly, in the 1920s and 1930s (Stedman Jones, 2012). The early neoliberal intellectuals aimed to "reconstruct a neo-liberalism that remained true to the classical liberal commitment to individual liberty" (p. 3). They feared that individual liberty, and classical liberalism broadly, were threatened not only by spreading fascism and totalitarianism, but also by New Deal liberalism, British social democracy, and Keynesian economic theory and policy. They converged on the central neoliberal position: among all economic alternatives, the free market most reliably secures individual freedom by denying any individual or group centralized authority over economic structures.
Later neoliberal intellectuals refined neoliberalism, developing a more mature and coherent political framework centered on more radical advocacy of free market reform, deregulation and privatization, and monetarism. They became, in particular, more and more suspicious of any intervention into the free market. For example, Milton Friedman's Capitalism and Freedom "presented the market as the means both to deliver social goods and to deliver the ends, the good life itself" (Stedman Jones, 2012, p. 8). Friedman writes: "there is an intimate connection between economics and politics, that only certain combinations of political and economic arrangements are possible… in particular, a society which is socialist cannot also be democratic, in the sense of guaranteeing individual freedom" (1962, p. 8).
The widespread implementation, and eventual dominance, of neoliberalism began, roughly, in 1980 (Stedman Jones, 2012). The energy crisis, the debt crises, and "stagflation" during the 1970s created the economic, political, and ideological conditions in which neoliberal principles (fiscal discipline and austerity, privatization, deregulation, market reform, and more) seemed reasonable economic alternatives to reigning New Deal and Great Society liberalism and British social democracy. Neoliberal economic policy was adopted by the International Monetary Fund (IMF), the World Bank (WB), the World Trade Organization (WTO), the European Union (EU), and in the North American Free Trade Agreement (NAFTA). The infamous "structural adjustment" programs, administered by the IMF and the WB, spread free market economic policy throughout the world. Despite substantial challenge, especially recently during the "Great Recession," the neoliberal framework has proven durable. It remains the dominant organizing principle in social and economic life.
Unlike its ancestor classical liberalism, neoliberalism is an active force. It works to create the conditions (social, political, economic, and ideological) needed for the proper functioning of free markets. Rather than the classical liberal imperative to clear space for individual self-determination, it seeks to construct individuals with the knowledge, skills, and dispositions needed for proper interaction with those markets. Unlike classical liberalism, neoliberalism has strengthened rather than weakened state control and monitoring over human life:
Whereas classical liberalism represents a negative conception of state power in that the individual was to be taken as an object to be freed from the interventions of the state, neo-liberalism has come to represent a positive conception of the state's role in creating the appropriate market by providing the conditions, laws and institutions necessary for its operation. In classical liberalism the individual is characterized as having an autonomous human nature and can practice freedom. In neo-liberalism the state seeks to create an individual who is an enterprising and competitive entrepreneur. (Olssen, 1992, p. 340)
It is no coincidence that the widespread implementation of neoliberal policy beginning in 1980 corresponds neatly with a significant shift in education policy that began, roughly, with the publication of A Nation at Risk and culminated with the accountability systems spawned by No Child Left Behind, including school report card systems, which are but a new variation on the same general theme. During this period, education policy shifted away from the "equity regime," in which the federal government played a narrow role in education typically confined to promoting equal educational opportunity, to a broader and more activist new policy regime, in which the federal government seeks to improve education through punitive accountability systems (McGuinn, 2005).
At the heart of this new regime is the punitive neoliberal "audit culture," which calls for constant monitoring and assessment of schools, along with associated rewards and punishments, intended to drive educational improvement.
Alongside the rise of neoliberalism, we have witnessed a rapid proliferation of auditing, i.e., the use of business derived concepts of independent supervision to measure and evaluate performance by public agencies and public employees, from civil servants and school teachers to university lecturers and doctors: environmental audit, value for money audit, management audit, forensic audit, data audit, intellectual property audit, medical audit, teaching audit and technology audit emerged and, to varying degrees, acquired institutional stability and acceptance… Very few people have been left untouched by these developments. (Leys, 2003, p. 70)
A-F systems exemplify neoliberal audit culture: they seek to drive educational improvement by auditing schools and rewarding or punishing them according to audit results. And, we argue below, they exemplify the activist neoliberal drive to create and maintain the individual and institutional conditions required for free markets: they tend to promote a distinctly neoliberal view of schooling that seeks to cultivate individuals with the needed skills, knowledge, and dispositions to navigate and sustain market society.
The neoliberal view of schooling can be characterized by two central tenets. First, schooling should be economically oriented and prepare students to properly interface with markets. Markets safeguard individual freedom and promote efficiency. Schooling is one of the central institutions for sustaining markets: it can, and should, cultivate individuals with the skills, knowledge, and beliefs necessary for the proper functioning of markets. Second, where possible, the provision of schooling should be privatized and put on the market, allowing for market competition that will promote quality schooling chosen by consumers and cause shoddy schooling not chosen by consumers to wither away. Schools should not be "artificially" sustained if they cannot survive on the market. For the neoliberal, such tinkering with the market would be ethically and practically concerning, undermining the potential of the market to safeguard individual freedom and promote efficiency in the production, distribution, and consumption of schooling.
A-F systems appear consistent with, if not outright supportive of, the first tenet of the neoliberal view of schooling: schooling should be economically oriented, training students to properly participate in free market life. These systems commonly conflate education and education for economic ends. For example, consider the rationales given for A-F school grades in a presentation produced by the Louisiana Department of Education (DoE). The Louisiana DoE (2013) contends "American education outcomes are not competitive internationally." Reports that many other countries have outperformed the US educationally, the department suggests, have substantive economic consequences: "there is substantial cost to our country and our state associated with lower educational outcomes. Had the US closed the international achievement gap by 1998, the GDP could have been $1.3 trillion to $2.3 trillion higher in 2008." The department notes, further, "Louisiana graduates will struggle to compete for jobs" because of inadequate school outcomes. Most new jobs, they write, will require education after high school. A-F school grades are taken to be a part of the solution to both of these (economic) problems. We find very similar discussion in other states.
Some, but not all, A-F systems are generally supportive of the second tenet of the neoliberal view of schooling: privatizing the provision of schooling, such that market competition will promote quality schooling and eliminate poor schooling. Whether an A-F system supports the second tenet of neoliberal schooling depends on the accountability rewards and punishments associated with letter grades. Recall, for example, the Indiana Choice Scholarship Program, which provides eligible students with state funding for partial or full tuition costs at participating choice schools, including religiously affiliated schools. Recall too the Florida Opportunity Scholarship Program, which allows students who have attended schools earning either one "F" or three consecutive years of "D" grades to exit and enroll in higher-performing public schools within their district or any other district in the state, provided space is available. Whenever these "choice schools" are managed by corporations or non-profit organizations, A-F systems exemplify neoliberal privatization. Both report card systems move in the direction of, but do not fully endorse, the second tenet: while neither calls for the general privatization of the provision of schooling, both embrace the view that the market can drive educational improvement, allowing parents and students to make choices about leaving one particular school for another, supporting schools that are selected and pressuring those that are not.
Here and elsewhere, we find little to no discussion of non-market educational outcomes: cultivating, for example, good democratic citizens or ensuring that students have studied and worked with a diverse set of fellow citizens. In A-F systems, the neoliberal view of schooling (and especially its first tenet) typically crowds out democratic educational outcomes. To be clear, our worry is not that schools promote labor market skills, which can be a legitimate educational aim when held in proper balance with other educational aims. Rather, it is that A-F report cards, blind as they are to explicit democratic educational outcomes, risk crowding out education for robust democratic citizenship. A-F systems that promote the neoliberal view of schooling to the detriment of democratic educational outcomes are undesirable in democratic society.

Presuming "Pure" Conceptions of Schooling and School Quality
More broadly, report card systems appear to presume that some "pure," or at least broadly uncontroversial, conception of schooling and school quality can be found. They seek a conception of schooling and school quality, insulated from enduring moral-political reflection among citizens over education, that can be used to drive school improvement. But there can be no such pure conception: any legitimate view of schooling in democratic society will emerge from deliberation among citizens, shot through with the values and aspirations of those citizens (Howe, 2009). In presuming that pure conception, they allow a particular view of schooling, and associated moral and political values, to sneak in without scrutiny. Beneath the illusion of a value-neutral conception of schooling and school quality, they typically covertly promote an undesirable neoliberal view of schooling.
Consider two "domains" of questions in education research. In the technical domain, education researchers inquire into how far educational intervention X promotes educational outcome Y. The technical domain is the province of education researchers who possess the technical skills needed to answer technical questions. It is the domain, for example, of the statistician who draws on statistical methods to estimate the effects of class size reduction in a school district. In the normative domain, education researchers inquire into how far educational outcome Y is desirable, how far it conforms to the demands of robust democratic society. The normative domain is the jurisdiction of democratic citizens generally. Questions that fall in the normative domain should be subject to continued deliberation among researchers and citizens. (To be clear, the normative and technical domains are not cleanly separable in education research and policy: normative considerations will inevitably permeate the technical domain, while technical considerations will inevitably permeate the normative domain. We say more about this below.)
Embedded inevitably within school report cards is some conception of schooling and school quality. To be meaningful, A-F letter grades must contain normative content. They must adopt some position, shot through with values, about the proper function of schooling in democratic society. They must adopt some view of what measures should be used to indicate fulfillment, or not, of that function. Here we inhabit the normative domain, the realm of democratic citizens. Normative views about school quality and the legitimate aims of schooling should be subject to deliberation among citizens. To presume or to impose some view of schooling and school quality, without deliberation, is to run afoul of democracy.
In the case of A-F systems, education researchers and policymakers appear to have proceeded directly to the technical domain, skipping over the normative domain. They have presumed the particular (inevitably normative) conceptions of schooling and school quality embedded in report card systems, rather than holding them open to deliberation and weighing them against the range of competing educational views. These conceptions are presented as pure, or at least uncontroversial, insulated from ongoing moral-political reflection about education. A-F letter grades are likewise presented as pure or uncontroversial measures of school quality. But there can be no pure conception of schooling and school quality. Any conception of schooling will be laden with political values because schooling must always work toward some end. Any legitimate view of schooling in democratic society will emerge from deliberation among citizens, thoroughly saturated by the values of those citizens. The presumption of a pure conception of schooling is anti-democratic, alienating citizens from deliberation. In leaping over the normative domain and into the technical domain, researchers and policymakers have neglected the foundational role of democratic values in education research and policy.
Advocates of A-F systems might maintain that they should skip over the normative domain. They can, and should, bracket political and ethical values from their research. To insert their own values and aspirations into school letter grades would be anti-democratic. Instead, political and ethical questions (say, "what are the legitimate goals of schooling in democratic society?" and "what is a legitimate conception of school quality?") are relegated to policymakers, who are democratically accountable to citizens. Education researchers should seek only to answer technical questions using their technical expertise, which will be governed only by epistemic and technical concerns and not contaminated by political considerations. They seek only to produce technical knowledge (how far does educational intervention X promote educational outcome Y?) to be given to policymakers who will use that knowledge in deliberation.13
But this view is flawed. While education researchers and policymakers should, out of respect for democracy, remain vigilant about covertly embedding their values into A-F systems, this strategy backfires. Education research and policy, in general, cannot be insulated from political and ethical considerations (Howe, 2009; Putnam, 2002). Any conception of school quality, some version of which must be presumed in A-F systems, will be loaded with normative considerations. Failing to disclose that answers to questions within the technical domain will be shot through with moral and political values will bias deliberation among policymakers and citizens, silently promoting those values. Instead of disclosing the moral and political values that inevitably permeate A-F systems, this strategy masks them, shielding them from deliberation and criticism.
In sum, report card systems neglect the priority of democracy to education research and policy. They are unlikely to promote democratic educational outcomes. Rather than inviting citizens to deliberate about the host of possible educational and social visions that could be embedded in A-F letter grades, they appear to impose particular (typically neoliberal) conceptions of schooling and school quality. And they presume, wrongly, that pure, or at least broadly uncontroversial, conceptions of schooling and school quality can be found and used to drive school improvement. In doing so, they often covertly promote particular values and particular views of schooling and school quality, which are shielded from deliberation and scrutiny.
We find that report card systems are invalid as a democratic assessment framework. Because they are democratically invalid, A-F systems cannot be remediated with technical fixes. They are flawed normatively, beyond the reach of technical tinkering: they violate the general requirement for schooling in democratic society to promote democracy. They may well be irredeemably flawed, at least without a substantial consideration of the role of schooling and school accountability in democratic society.

Conclusion and Recommendations
We endorse three recommendations advanced by the Oklahoma researchers. First, policymakers should eliminate "the single grade, which cannot be composed without adding together unlike elements and promoting confusion and misunderstanding" (OCEP & CERE, 2013a, p. 6). Second, policymakers should develop "a report card format that uses multiple school indicators that more adequately reflect a school performance profile" (p. 6). Third, policymakers should enlist the services of assessment and evaluation experts in designing school accountability systems.
While we find these recommendations sound, we believe that alone they are too narrow: they fail to consider the role and responsibilities of an educational system within a democratic society. As stated above, these technical fixes alone cannot remedy the deeper democratic defects in report card systems. Therefore, we add our own recommendations to those above, noting that we believe these are relevant not only to A-F grading systems but to all school accountability systems.
13 For a well-known defense of this view, and one connected to neoliberalism, see Friedman (1953). Friedman maintains that positive economics, as distinct from normative economics, "is in principle independent of any particular ethical position or normative judgments… [It] can be an 'objective' science, in precisely the same sense as any of the physical sciences" (p. 2). He contends that political and ethical values can, and should, be filtered from positive economics.
Given the above discussion, we recommend that in determining accountability systems for schools, policymakers should enable democratic deliberation over the many possible purposes of schooling in a democratic society before determining assessment criteria. The indicators of "school quality" must be determined through authentic conversation, reflecting the voices and experiences of all members of our democratic society, not just the narrow vision of policymakers. We recommend further that policymakers should ensure that accountability systems promote, rather than neglect or inhibit, the formation of democratic character, which must be consciously cultivated. While democratic outcomes may not be the only legitimate goal for public schools, they surely should be counted among the most essential. Unless these modifications can be made, rendering A-F systems valid as a democratic assessment framework, we recommend that they be abandoned as irredeemable.