The Influence of Socioeconomic Factors on Kentucky's Public School Accountability System: Does Poverty Impact School Effectiveness?

Education Policy Analysis Archives Vol. 12 No. 37

Under the Commonwealth Accountability Testing System (CATS), Kentucky's public schools have been assigned individualized baseline and improvement goal indices based upon past school performance in relation to the 2014 statewide index goal of 100. Each school's CATS Accountability Index, a measure of school performance based upon both cognitive and non-cognitive measures, has then been compared to these individualized improvement goals for the purpose of designating schools as Meet Goal, Progressing, or Assistance Level (Kentucky Department of Education (KDE), 2000). Considered an interim target model, the design of CATS has been intended to negate the biasing effects of socioeconomic factors on school performance on accountability tests through the individualization of school goals (Ladd, 2001). Results of this study showed that 39.9% to 55.5% of the variance of the CATS indices was shared by school socioeconomic factors. Analysis of this interim target model for the 2000-2002 biennium showed that the model negated the biasing effects of socioeconomic factors for elementary and middle schools, but not for high schools. Moreover, analysis of the progress of schools toward their Improvement Goals in 2001 showed that both elementary and high schools from higher poverty backgrounds lagged significantly behind their more affluent peers, indicating inequitable capacity to meet improvement goals between the poorest and wealthiest schools. Adaptations to the present accountability system were suggested for the purpose of providing more accurate information to the public regarding the effectiveness of public schools in Kentucky.
The Commonwealth Accountability Testing System (CATS), Kentucky's public school accountability system, has been designed to communicate to the public the progress of schools toward Kentucky's education goals in terms of aggregated student achievement, inclusive of cognitive and non-cognitive measures. To determine rewards or sanctions, an accountability index has been calculated for each school based primarily on assessment results. This accountability index has then been compared to a school-specific improvement goal for each biennial accountability cycle (KDE, 2000).
Although the CATS system has been designed so that by 2014 the accountability index of all schools will have achieved the statewide goal of 100, or that schools will have been held accountable for their performance, it stands to reason that interim public opinion, interim public policy, or both regarding Kentucky's public schools will have been shaped by the Meeting Goal, Progressing, or Assistance Level designations assigned to schools based upon the attainment or non-attainment of these biennial improvement goals (KDE, 2000). Because of the impact of school performance designations on public opinion, it has been critical that the accountability system communicate accurately to the general public, parents, and the school community the relative effectiveness of public schools.
Since approximately 90% to 95% of each school's accountability index has been based upon student assessment results (KDE, 2000), and since approximately 52% to 62% of the variance in aggregate school or school district performance on accountability assessments has recently been shown to vary with student socioeconomic factors in both Illinois and Ohio (Lyons, 2001; Sutton & Soderstrom, 1999; Wilson & Martin, 2001), the question follows as to whether the CATS system has identified as Meeting Goal those Kentucky schools that have produced high student performance with respect to the socioeconomic background of their students, or simply schools with favorable socioeconomic factors. Moreover, do schools with favorable socioeconomic factors have the same capacity to meet their respective Improvement Goals as their higher poverty peers, as evidenced by the early attainment of these goals? Therefore, the purposes of this study were as follows:
(1) To determine whether a significant relationship existed between school and community socioeconomic variables and 2000-2002 school-level accountability indices for Kentucky's public elementary, middle, and high schools.
(2) To determine whether the application of individualized improvement goals to classify school performance negated the biasing effects of socioeconomic factors, as evidenced by Kentucky's 2000-2002 biennial school performance classifications for public elementary, middle, and high schools.
(3) To determine whether socioeconomic factors were related to the midpoint attainment of 2000-2002 improvement goals for Kentucky's public elementary, middle, and high schools.

Review of Related Literature
School accountability systems, which have generally been composed of standards for student performance that articulate statewide goals, assessment systems that measure student progress toward performance standards, and a system of rewards and sanctions regarding aggregate school progress toward state goals, have been widely viewed as controversial among educators. The ability of standardized measures to accurately capture student achievement, as well as the manner in which these measures have been applied to a system of rewards and sanctions, has become the focal point of the debate concerning school accountability (Hanushek & Raymond, 2001; Ladd, 2001; Pearson, Vyas, Sensale, & Kim, 2001).

The Public View of School Accountability
Although contentious in the education community, the concept of school accountability has been popular with the American people. In a survey of Americans prior to the 2000 election (Business Roundtable, 2000) regarding issues termed Extremely Important to respondents as voters when selecting a Presidential and Congressional candidate, 61% identified Improving Schools, a response rate which eclipsed other prominent issues such as Protecting Social Security (56%), Encouraging Traditional Moral Values and Standards (48%), Protecting Patients Healthcare Rights (47%), and Providing Healthcare Coverage (42%). It was interesting to note that public interest in improving education surpassed all other policy issues in the minds of voters during the 2000 election. Additionally, an annual nationwide survey of attitudes toward education conducted in 2001 indicated that when respondents were asked to grade their local public schools on a scale of A to F, 51% assigned a grade of A or B. This approval rating was the highest received on the poll since 1989 (Rose & Gallup, 2001).
The results of the aforementioned surveys pointed to a heightened interest in improving public schools amidst an increasing level of satisfaction with respondents' respective local schools. On its face a mixed message, the results of the surveys taken together communicated a perception on the part of the general public that their local schools have been performing satisfactorily, and yet might still be improved. Moreover, as evidenced by a mere 23% A or B designation, respondents showed little confidence in public schools when asked about the quality of public schools nationwide. In short, survey results implied that the majority of the general public was satisfied with their local public schools, but dissatisfied with what they understood to be true about public schools in other communities and states (Business Roundtable, 2000; Rose & Gallup, 2001).
Public support for accountability. Whether motivated by specific concern over local schools or a general concern over public schools across the country, survey respondents favored the use of systems of testing and accountability for public schools. Specifically, respondents were polled as to whether they favored President Bush's initiative to hold schools accountable for student performance on standardized tests, with 75% favoring such an initiative. Moreover, although only 31% of respondents indicated that they felt standardized tests were the best way of assessing student achievement, 55% supported the increased use of standardized tests for accountability purposes (Rose & Gallup, 2001).
Clearly, public opinion has supported increased school accountability, a trend not unnoticed by federal and state policymakers, as a variety of state-level education reform initiatives with accountability provisions have been enacted in recent years (Council of Chief State School Officers (CCSSO), 2000). With the signing into law of "No Child Left Behind" in January 2002, the shift toward increased public school accountability has reached the federal level as well (Center on Education Policy (CEP), 2002).

Purpose and Components of Accountability Models
Since public opinion regarding the quality of individual schools has been shaped by the specific measures referenced by accountability systems (e.g., test scores, student attendance rates, student dropout rates) and reported to the public through school report cards or other media, the measures used to characterize school performance and the structure of the accountability systems in general have been key issues in the debate over school accountability (Hanushek & Raymond, 2001; Ladd, 2001; Pearson et al., 2001).
In general, public school accountability models have been constructed around a framework of systemic goals, standards of school performance, a means of school performance measurement, and a system of rewards and sanctions assigned to schools based upon varying levels of school performance (Hanushek & Raymond, 2001). These accountability systems have sought to leverage change by opening schools to public scrutiny for the purpose of placing pressure upon schools to take steps to increase student test scores (Gullatt & Ritter, 2000; Ladd, 2001).
School progress toward state goals and standards has generally been publicized through detailed reports of school improvement indicators, referred to as school report cards. As of September 2000, all state education agencies had adopted the policy of issuing on an annual basis at least one school accountability indicator report. Forty-six states had adopted as policy the issuance of accountability indicator reports disaggregated at the school district level, and forty states disaggregated accountability indicators at the school level (CCSSO, 2000). States issuing accountability reports at the school level in addition to the district level did so to prevent school districts from hiding poorly performing schools within the aggregated results of the district (Ladd, 2001).
Goals and standards. Statewide educational goals and performance standards have served both economic and equity purposes. Economically, goals and standards have been aimed at closing the gap between the achievement of United States public school students and their international peers. Additionally, goals and standards have been viewed as a mechanism for achieving educational equity by raising the bar for all students, thereby reducing the disparity in achievement between disadvantaged students and their peers (Gratz, 2000). Whether focused on economic impact or equity, statewide goals and standards have been designed to serve as a catalyst for increasing student achievement (Hanushek & Raymond, 2001).
It has been noted that statewide goals, which often have been embedded within legislation, have tended to be lofty, yet overly vague and ambiguous (Hanushek & Raymond, 2001). Moreover, although focused on bringing about increases in student achievement (Hanushek & Raymond, 2001), accountability goals and standards have tended to over-promise and then under-perform, often as the result of improper policy implementation (Gratz, 2000). Although imperfect, with only 22 states receiving a grade of B- or higher in a recent report on state accountability systems, statewide educational goals and standards in all core subject areas have been adopted by 48 states and the District of Columbia, the exceptions being Rhode Island and Iowa ("Quality Counts", 2002).
Student assessment. The identification or development of valid and reliable measures of student performance for accountability purposes has generated significant controversy (Ladd, 2001). Critics have maintained that the multiple-choice format of most norm-referenced standardized tests has been incapable of capturing what students have known and have been able to do (Wiggins, as reported in Pearson et al., 2001). This criticism has led to the development of criterion-referenced tests aligned to state standards and administered either in place of or in addition to multiple-choice, norm-referenced tests. The format of these criterion-referenced tests has included extended response and short answer questions ("Quality Counts", 2002).
As of January 2002, 37 states have used criterion-referenced tests for accountability purposes in English and math at least once at the elementary, middle, and high school levels. Fourteen states have administered both criterion-referenced and norm-referenced tests in English and math in grades three through eight. The validity and reliability of these tests have been debated, with only 19 states having had their criterion-referenced tests externally aligned and reviewed. Additionally, only two states (Kentucky, Vermont) have incorporated student portfolios into their accountability system ("Quality Counts", 2002).
Consequences for individual and school performance. An underlying assumption of accountability systems has been that in the absence of real consequences for school or individual success or failure, there has been insufficient motivation to focus on the desired outcomes (Hanushek & Raymond, 2002). In 2000-2001, 18 states assigned rewards to schools performing at or above state standards, with two more states phasing in such policies over the next three years. Sanctions of one type or another were in place in 20 states for schools that were persistently low performing, with 11 states allowing student transfers, 9 allowing closure, 15 providing for school reconstitution, and 2 states withholding funds from low-performing schools. It was noted that in 2000-2001, six states provided for rewards for high-performing schools without the threat of sanctions for low-performing schools, and seven states provided for sanctions to low-performing schools without the incentive of rewards for high performance ("Quality Counts", 2002).
Policies concerning individual accountability in 2000-2001 were more varied than school or district policies. Four states based decisions regarding grade-level promotion on individual performance on the statewide assessment. Seventeen states based decisions regarding high school graduation on statewide exit or end-of-course exams, with six states basing exit exam or end-of-course assessments on tenth grade standards. It was noted that eight states provided for consequences for low-performing schools without holding individual students accountable, and four states provided for consequences for students performing poorly on state assessments without holding low-performing schools accountable ("Quality Counts", 2002).
Federal accountability policy. The passage of No Child Left Behind (NCLB) in January 2002 has provided for federally mandated consequences for schools failing to meet state-specified goals, much as with existing accountability measures in many states. However, federal legislation has called for states to disaggregate student scores on mandatory tests. These disaggregations must occur for students grouped in terms of ethnicity, income, disability, limited English proficiency, and migrant status. Schools must meet individual goals regarding the achievement gap that has been shown to exist between such groups (i.e., the gap between poor students and their peers) or face a range of consequences (CEP, 2002).
Specifically, schools failing to meet their performance goals for two consecutive years must receive technical assistance from the district, with students having the option to transfer to another school within the district. Schools failing to meet their goals for a third consecutive year will continue to receive technical assistance and must allow intra-district school choice, but must also allow eligible students to use their portion of Title I funds to purchase tutoring or other services directly from the district or an outside agency. A fourth consecutive year of failure would continue the previous consequences, but also impose re-staffing or other fundamental changes. After a fifth consecutive year of failure, governance must change (i.e., charter school, privatization of management services, state takeover of operations) (CEP, 2002). The severity of the consequences for school failure under NCLB has made it imperative that accountability measures and models be well designed, truly identifying effective and ineffective schools.
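The escalating schedule of consequences described above can be summarized as a simple lookup. The following is only an illustrative sketch: the function name and the wording of each label are hypothetical, and it assumes that the consequences remain cumulative through the fifth year, which the summary above states explicitly only through the fourth.

```python
def nclb_consequences(consecutive_failures):
    """Return the consequences in force after a given number of
    consecutive years in which a school missed its goals.

    Labels paraphrase the schedule summarized in the text (CEP, 2002);
    treating year five as cumulative is an assumption of this sketch.
    """
    ladder = [
        (2, "technical assistance from the district"),
        (2, "option to transfer to another school in the district"),
        (3, "Title I funds usable for tutoring or other services"),
        (4, "re-staffing or other fundamental changes"),
        (5, "change of governance"),
    ]
    # Keep every consequence whose trigger year has been reached.
    return [label for year, label in ladder if consecutive_failures >= year]


for years in range(1, 6):
    print(years, nclb_consequences(years))
```

A school in its first year of missed goals receives an empty list, and each later year adds to the consequences already in force.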

Accountability System Designs
Well-designed and implemented accountability models, characterized by clearly articulated academic standards and tests, have served several functions inclusive of (a) promoting more challenging curricula, (b) fostering collaboration among teachers within and across schools, and (c) creating more productive dialogue among teachers and parents. Poorly designed and implemented accountability systems have had the effect of impeding, rather than advancing, education reform by creating an overall climate of frustration in schools regarding the measurement of school effectiveness (Gandal & Vranek, 2001). The mandatory sanctions provided for by the NCLB Act highlight the need for fair and consistent accountability systems.
When judging the desirability of the design of an accountability system, the following criteria have been considered: (a) the usefulness of the measures used to determine school effectiveness in diagnosing weaknesses in individual schools; (b) the usefulness of the accountability results to parents in making decisions about the education of their children; and (c) the fairness of the accountability system to teachers and administrators regarding factors beyond their control (e.g., readiness to learn, home and environmental factors). Moreover, although it has been asserted that it is most appropriate to focus accountability on schools rather than on school districts or individual students (Ladd, 2001), accountability systems have varied in their approach, some focusing on all three of those levels ("Quality Counts", 2002). Three basic accountability system models have been implemented for individual schools, school districts, or both, each of which has inherent strengths and weaknesses: (a) accountability systems focusing on school-wide averages on test scores through comparison to a cut-off score or to the scores of other schools; (b) accountability systems designed to capture the value added by the school to student learning; and (c) accountability systems focusing on a target rate of improvement for each individual school (Ladd, 2001).
Comparisons of average scores. Accountability systems of this type have either used mandatory cut-off scores on student assessments to determine a school's effectiveness, or have compared schools or school districts to each other, determining effectiveness based upon rank order (Ladd, 2001). For example, the accountability system in Texas has used cut-off scores to determine school effectiveness (Texas Education Agency (TEA), 2001).
In Texas, schools have been classified as Exemplary, Recognized, Academically Acceptable/Acceptable, and Academically Unacceptable/Low Performing depending upon the percentage of their students passing the state's accountability tests in reading, writing, and mathematics, as well as the percentage of students who dropped out of school. Moreover, when student scores were disaggregated by ethnicity and socioeconomic status, the percentage of students within these groups passing the tests and remaining in school must also have met state standards (TEA, 2001).
On a philosophical level, this style of accountability system has been consistent with the spirit of standards-based accountability in that all students and schools have been assessed against a uniform standard. However, there have been concerns that, unaltered, this style of accountability system has not accounted for differences between the schools (i.e., socioeconomic factors) and that schools may have been categorized based as much on the socioeconomic differences as on the quality of the instructional program. Additionally, schools with advantaged populations and adequate scores have tended to become complacent, whereas schools with disadvantaged populations have struggled to meet seemingly unattainable state standards (Ladd, 2001).
Value-added designs. It has been noted that accountability systems work best when they measure what a school has added to the learning of children over a given year, and then hold the school accountable (Hanushek & Raymond, 2001). Termed the value-added approach, this style of accountability system has defined school effectiveness in terms of the gains in student achievement, rather than a uniform cut-off score. Through the extensive use of control variables, the value-added approach statistically adjusts for confounding factors (e.g., socioeconomic factors, prior school achievement) and therefore estimates the portion of gain scores on state assessments attributable to teacher effects for a given year. Tennessee has instituted such a value-added component as part of the state's accountability system (Sanders & Horn, 2002).
Value-added approaches, although in theory providing the type of information requisite to promote school improvement and to communicate most clearly with the public regarding school performance, have met with resistance for several reasons. In general, there has been little consensus regarding the variables that should be controlled for in the value-added analysis. Additionally, there has been concern regarding potential peer effects that would impact the credibility of the results. On a practical level, the data required to complete a value-added analysis have often been unavailable or cost prohibitive to gather (Ladd, 2001).
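The core idea of a value-added estimate can be illustrated with a deliberately simplified sketch. It assumes a single control variable (each student's prior-year score), an ordinary least squares line fitted across all students, and a school effect defined as the mean residual of that school's students. The function names and the data are hypothetical; operational systems such as Tennessee's rely on far more elaborate models and additional controls, so this should not be read as any state's actual method.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def value_added(records):
    """Estimate a school effect as the mean residual of its students.

    records: list of (school, prior_score, current_score) tuples.
    The single-predictor model is an assumption of this sketch.
    """
    prior = [r[1] for r in records]
    current = [r[2] for r in records]
    slope, intercept = fit_line(prior, current)
    residuals = {}
    for school, p, c in records:
        expected = intercept + slope * p  # score predicted from prior score
        residuals.setdefault(school, []).append(c - expected)
    return {school: mean(vals) for school, vals in residuals.items()}


# Hypothetical data: both schools enroll students with the same prior scores.
records = [("A", 40, 50), ("A", 60, 70), ("B", 40, 44), ("B", 60, 64)]
print(value_added(records))  # A is about +3, B is about -3
```

Because both hypothetical schools start from identical prior scores, the difference in their estimates reflects only the difference in current scores, which is the comparison a value-added design is intended to isolate.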
Interim target designs. Accountability systems that utilize interim target goals as short-term measures toward long-range statewide goals have served to base school effectiveness on school progress toward individualized improvement goals rather than on either progress toward an absolute cut-score or value-added scores. This approach has allowed schools to be identified as effective provided they meet the improvement target established for a specific time period. The CATS system in Kentucky has been designed in this fashion (Ladd, 2001).
Underlying the interim target approach has been the assumption that the interim school achievement targets would be reasonably attainable for a school functioning in an acceptable fashion. However, depending upon how the targets have been established, it has been possible that targets have been easier for some schools to meet than others. Schools starting nearer the long-term goal have not had to change their instructional practices significantly as compared to low-performing schools, which have typically served the poorest and most challenging students and which have had to make tremendous gains to meet their targets (Ladd, 2001).
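The unequal demands of interim targets can be made concrete with a small sketch. It assumes, purely for illustration, that targets rise in equal biennial steps from a school's baseline in 2000 to the statewide goal of 100 in 2014; the actual CATS goal-setting formula is not reproduced here, and the function name, default parameters, and baseline values are hypothetical.

```python
def interim_targets(baseline, start_year=2000, end_year=2014,
                    goal=100.0, cycle=2):
    """Return {year: target index} under an assumed equal-step trajectory
    from the baseline to the long-range goal."""
    n_cycles = (end_year - start_year) // cycle
    step = (goal - baseline) / n_cycles  # required gain per cycle
    return {start_year + cycle * k: baseline + step * k
            for k in range(1, n_cycles + 1)}


# Two hypothetical schools with different starting points.
low = interim_targets(58.0)
high = interim_targets(86.0)
print(low[2002], high[2002])   # 64.0 88.0
print(low[2014], high[2014])   # 100.0 100.0
```

Under this assumption the lower-scoring school must gain 6 index points per biennium while the higher-scoring school must gain only 2, which illustrates the concern that targets may be easier for some schools to meet than for others.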

Impact of Socioeconomic Status on Student Assessment Scores
Central to any accountability system has been the use of norm-referenced tests, criterion-referenced tests, or both (Ladd, 2001; "Quality Counts", 2002). Regardless of the type of test used, empirical studies have demonstrated that school scores on these tests reflect in large part student socioeconomic variables, which are beyond the control of the school, rather than reflecting variables within the control of the school that affect the achievement of students (Lyons, 2001; Sutton & Soderstrom, 1999; Wilson & Martin, 2000).
In an analysis of Grade 3 Reading and Mathematics scores from the 1994 Illinois Goal Assessment Program (IGAP) assessments, Sutton and Soderstrom (1999) found that 56% of the variance in mathematics scores and 70% of the variance in reading scores were associated with variables beyond the control of the school, such as the percentage of students at each school participating in the free and reduced lunch program. In the analysis of Grade 10 scores, variables beyond the control of the school accounted for 74% of the variance in reading scores and 62% of the variance in mathematics scores.
Similar results have been reported in analyses of school-level and district-level scores in Ohio. In a study of public schools in the Toledo (OH) school district, Wilson and Martin (2001) found that per capita income was the most dominant predictor of student test scores on both the Ohio Proficiency Test and the Metropolitan Achievement Test, accounting for 70.56% of the variance in student test scores. In a similar study of school district scores on the Ohio Proficiency Test, Lyons (2001) found that 58.5% of the variance in district means was associated with school district per capita median income, per pupil property wealth, and percent free/reduced lunch participation.
As noted earlier, the impact of demographics on student test scores has made it unclear whether school performance on a test has been a result of an effective instructional program or of exceptional student demographics. The potential for differential socioeconomic status to bias assessment scores has created a fundamental problem with the fairness of the system to the schools subject to the accountability system, the parents of the children attending these schools, and the taxpayers of the state who draw conclusions regarding the quality of the schools they support based upon the results of these accountability systems. The interrelationship of test scores and student demographics has been a focal point for critics of accountability (Kohn, 2001), but also a consideration for designers of school accountability systems. For accountability systems to promote the aforementioned collaboration within school communities, to communicate accurately with parents and the general public, and to be fair to schools of all types, a means of accounting for differences in demographics should be designed into the system.

Accountability in Kentucky
In 1992, Kentucky's accountability system, at that time the Kentucky Instructional Results Information System (KIRIS), was implemented as part of the Kentucky Education Reform Act (KERA). KIRIS was designed to reflect the degree to which schools were improving the effectiveness of their instructional program within the context of Kentucky's learning goals, as measured by both cognitive and non-cognitive indicators. A system of rewards and sanctions based upon schools' progress toward improvement goals was implemented as an incentive for schools to progress toward KIRIS goals (KDE, 1998).
With KIRIS, each individual school was assigned a KIRIS improvement goal every two years based upon the difference between each school's current KIRIS accountability index and the 20-year statewide goal of 100. This method of establishing improvement goals was intended to allow interim goals to be individualized for each school, with initially low-performing schools having to show larger biennial gains than schools that were initially high-performing (KDE, 1998).
Schools meeting or exceeding their improvement goal were designated as being in Rewards, which entitled them to receive monetary rewards from the state. Schools that failed to meet their improvement goal but scored at or above their baseline were identified as Maintaining. These schools were not eligible for rewards and were required to submit a school improvement plan. Schools whose KIRIS scores dropped below their baseline were designated as being In Decline. Depending upon the degree to which the school was In Decline, possible consequences ranged from a required school improvement plan to the assignment of a highly skilled educator and the option for parents to transfer students to another school (KDE, 1998).
From KIRIS to CATS. Highly criticized for several reasons, including the lack of a national norm-referenced test as part of each school's accountability index, KIRIS was replaced by the Commonwealth Accountability Testing System (CATS) in 1998. CATS differed from KIRIS in several ways, but most fundamentally in the addition of the Comprehensive Test of Basic Skills Survey Edition (CTBS-5) in three grade levels (3, 6, 9) to compose five percent of each school's CATS index. The remaining 90% to 95% of the CATS index for elementary, middle, and high schools was composed of the results of the Kentucky Core Content Test (KCCT), writing portfolios, and non-cognitive indicators (KDE, 2000).
As with KIRIS, CATS stipulated a unique improvement goal for each individual school to reach for each successive biennium. However, with CATS, rather than calculating a new improvement goal after each successive biennium based upon the discrepancy between current school performance and the state goal at that point in time, the improvement goal for each year from 1998-2014 was determined by extrapolating a line from each school's accountability index at the end of the 1998-2000 biennium to the statewide goal of 100 at the end of the 2012-2014 biennium. A Zone of Fairness was created to accommodate the standard error of measurement for the test each year. Figure 1 illustrates the growth chart format used to communicate the long-term accountability requirements for each school in the State of Kentucky from 2000 to 2014. Each biennium, schools were identified as Meeting Goals, Progressing, or Assistance based upon whether they scored above, within, or below the Zone of Fairness (KDE, 2000; KDE, 2002).
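The straight-line goal described above can be illustrated with a short sketch. The function names, the inclusive treatment of boundary scores, and the width of the Zone of Fairness used here are assumptions made for illustration; the actual zone is tied to the standard error of measurement and is not specified in this paper.

```python
def goal_line(baseline_index, year, start_year=2000, end_year=2014, state_goal=100.0):
    """Improvement goal for `year` on a straight line from the baseline to the state goal."""
    if not start_year <= year <= end_year:
        raise ValueError("year must fall between start_year and end_year")
    fraction_elapsed = (year - start_year) / (end_year - start_year)
    return baseline_index + (state_goal - baseline_index) * fraction_elapsed


def classify(index, goal, assistance_line):
    """Label a school by where its index falls relative to the two lines.

    Treating scores exactly on a line as belonging to the higher
    category is an assumption of this sketch.
    """
    if index >= goal:
        return "Meeting Goal"
    if index >= assistance_line:
        return "Progressing"
    return "Assistance"


# Hypothetical school starting at 40 in 2000 (the starting index shown in Figure 1).
goal_2002 = goal_line(40.0, 2002)
# The zone width below is invented for illustration only.
hypothetical_zone_width = 3.0
label = classify(47.0, goal_2002, goal_2002 - hypothetical_zone_width)
```

Under these assumptions, a school that starts at 40 must gain 60 index points over 14 years, so its goal rises by the same amount each biennium regardless of how it performed in the previous one.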

School Effectiveness and Demographics
As mentioned previously, school effectiveness in Kentucky has been defined in terms of progress toward a long-term goal, as gauged by the achievement of incremental improvement goals. By focusing on incremental school improvement, rather than absolute aggregate student achievement, CATS had been designed to accommodate differences in demographics as well as differences in initial performance that may have impacted student achievement. The goal has been to make school-based decisions to develop instructional strategies to meet each biennial Improvement Goal, incrementally leading each school to the 20-year absolute goal of 100 regardless of disparate school characteristics, such as socioeconomic status (KDE, 1998). It has been an underlying assumption of the interim target model in general (Ladd, 2001), and CATS specifically, that the use of interim targets toward a long-term accountability goal has mitigated any concerns regarding the possibility that socioeconomic differences between schools might bias the accountability system against high poverty schools. Therefore, the application of individualized improvement goals has been assumed to compensate for demographic differences, providing all schools with an equitable chance of meeting their improvement goals, and ensuring that the improvement goals established for a biennium challenge all schools equally to improve.
For elementary, middle, and high schools in the State of Kentucky, the following hypotheses were investigated:
(1) There will not exist a significant relationship between the 2000-2002 CATS Accountability Index and school/community demographic indicators for Kentucky's public elementary, middle, and high schools.
(2) There will not exist a significant relationship between the frequency that Kentucky's public elementary, middle, and high schools were designated as Meeting Goal, Progressing, and Assistance Level for the 2000-2002 accountability biennium and their relative socioeconomic status.
(3) There will not exist a significant relationship between the frequency that Kentucky's public elementary, middle, and high schools had exceeded their 2000-2002 Improvement Goal by the 2001 midpoint report and their relative socioeconomic status.

Method
Participants
To be included in the study, Kentucky elementary, middle, and high schools must have been structured in terms of grade levels so as to contain all three accountability grades for their respective school type. Specifically, schools must have been structured to contain grades three to five for elementary schools, six to eight for middle schools, and nine to twelve for high schools. Of the eligible schools, those that (a) did not participate in the federal free/reduced lunch program, (b) had extreme values in terms of socioeconomic variables, or (c) were identified as alternative schools were eliminated. Table 1 summarizes the grade level structure of both the population and sample for elementary, middle and high schools.
School Classifications. The Kentucky Department of Education reported the accountability classification of each school (Meets Goal, Progressing, Assistance-level) based upon the 2000-2002 biennial CATS index. Additionally, the 2001 CATS midpoint index was used to determine the relative progress of schools toward these biennial goals by comparing the 2001 midpoint index with their respective 2000-2002 improvement goals and baselines. Schools meeting or exceeding their Improvement Goal were classified as Exceeding Target, schools whose index was between their Improvement Goal and Baseline were classified as Below Target, and schools scoring below their Baseline were identified as Below Baseline. Tables 3 and 4 summarize the frequency with which schools were classified in each category for both the 2000-2002 biennium and the 2001 midpoint.

Procedure
Participant elimination. Elementary, middle, and high schools that did not participate in the federal free and reduced lunch program, that were classified as alternative schools, or that had extreme standard scores (z < -3 or z > 3) for any single socioeconomic indicator were eliminated from the study. The final sample consisted of 698 of 773 elementary schools, 231 of 350 middle schools, and 220 of 238 high schools. Tables 3 and 4 describe the sample in relation to grade level structure and the variables used in the study, respectively.
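The extreme-value screen described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the data are invented, and the use of the sample standard deviation is an assumption, since the paper does not state which form was used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standard scores using the sample standard deviation (an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def keep_mask(indicator_columns, cutoff=3.0):
    """Return True for each school whose z-score is within +/- cutoff on every indicator."""
    n = len(next(iter(indicator_columns.values())))
    keep = [True] * n
    for values in indicator_columns.values():
        for i, z in enumerate(z_scores(values)):
            if abs(z) > cutoff:
                keep[i] = False
    return keep


# Invented data: 19 ordinary schools and one extreme value on a single indicator.
example = {"hypothetical_indicator": [float(v) for v in range(40, 59)] + [400.0]}
mask = keep_mask(example)
```

A school is dropped if it is extreme on any single indicator, which matches the wording of the exclusion rule.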
Regression analysis for the 2000-2002 CATS Index. For each grade level group (elementary, middle, high), a stepwise multiple linear regression was performed using school and community socioeconomic variables as predictor variables and the 2000-2002 CATS Accountability Index as the criterion variable. Free and reduced lunch participation rates for 2001 and 2002 were averaged to compensate for the fact that the CATS indices represented two years.
Wealth quintiles. The school and community socioeconomic variables used in the study were converted to a categorical variable to allow for the use of Chi-square analysis. These categories, referred to as Wealth Quintiles, were derived from a standardized weighted average of the socioeconomic indicators found to be significant in the elementary, middle and high school multiple-regression analyses. Specifically, raw scores were converted to standard scores for each significant variable. These standard scores were used to create a weighted average, the weights of which reflected the proportion of the shared variance in the CATS index explained by the addition of each respective variable for each analysis. This weighted average of the standard scores was then used to form the Wealth Quintiles, with the poorest schools assigned to the first quintile and the wealthiest to the fifth.
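The construction of the Wealth Quintiles can be sketched as below. Several details are assumptions made for illustration rather than facts from the study: the function names, the example data and weights, the requirement that each indicator be oriented so that higher values mean greater wealth (for example, by negating a poverty rate), and the use of equal-sized rank-based groups.

```python
from statistics import mean, stdev


def standardize(values):
    """Convert raw scores to standard scores (sample standard deviation assumed)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def wealth_quintiles(indicators, weights):
    """Assign each school a quintile from 1 (poorest) to 5 (wealthiest).

    indicators: name -> raw values, oriented so that higher means wealthier.
    weights: name -> share of variance explained when that variable entered.
    """
    total_weight = sum(weights.values())
    z = {name: standardize(vals) for name, vals in indicators.items()}
    n = len(next(iter(z.values())))
    composite = [
        sum(weights[name] * z[name][i] for name in z) / total_weight
        for i in range(n)
    ]
    order = sorted(range(n), key=lambda i: composite[i])
    quintile = [0] * n
    for rank, i in enumerate(order):
        quintile[i] = rank * 5 // n + 1
    return quintile


# Invented example: ten schools, lunch participation negated so higher = wealthier.
example_indicators = {
    "negated_lunch_rate": [-80.0, -75.0, -70.0, -65.0, -60.0, -55.0, -50.0, -45.0, -40.0, -35.0],
    "median_income": [22000.0 + 2000.0 * k for k in range(10)],
}
example_weights = {"negated_lunch_rate": 0.35, "median_income": 0.05}  # hypothetical
quintiles = wealth_quintiles(example_indicators, example_weights)
```

Weighting by incremental variance explained lets the strongest predictor of the CATS index dominate the composite, so the quintiles reflect the socioeconomic dimensions that matter most for the index rather than treating all indicators equally.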
Determining wealth neutrality. A 5 x 3 Chi-square was used to determine whether a relationship existed between the socioeconomic level of a school and the school's performance relative to its improvement goal. Specifically, the Chi-square was applied to determine whether the Wealth Quintile assigned to a school was related to the accountability classification (Meets Goal, Progressing, Assistance-level) of the school.
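For readers unfamiliar with the test, the following sketch shows how expected counts and the Chi-square statistic are obtained from a 5 x 3 table of observed counts. The function name and the counts are hypothetical and are not taken from the study's tables.

```python
def chi_square_independence(observed):
    """Return (statistic, degrees of freedom, expected counts) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical counts: rows are Wealth Quintiles 1-5, columns are
# Meets Goal, Progressing, Assistance-level.
hypothetical_counts = [
    [10, 20, 14],
    [12, 22, 10],
    [15, 21, 8],
    [18, 20, 6],
    [24, 16, 4],
]
statistic, df, expected = chi_square_independence(hypothetical_counts)
```

With five quintiles and three classifications the test has (5 - 1) x (3 - 1) = 8 degrees of freedom, which is the value reported with each Chi-square result in this study.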
Determining capacity to meet goals. A similar methodology was applied to determine whether schools from all socioeconomic levels had equal capacity to meet their improvement goals. For these analyses, a 5 x 3 Chi-square analysis was again utilized. However, rather than relating the Wealth Quintile to the biennial accountability results, the Wealth Quintiles were related to midpoint CATS progress, as determined earlier (Exceeding Target, Below Target, Below Baseline).
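The midpoint classification and its cross-tabulation against Wealth Quintile can be sketched as follows. The function names and the example schools are hypothetical, and treating a score exactly equal to the baseline as Below Target is an assumption of this sketch.

```python
STATUSES = ("Exceeding Target", "Below Target", "Below Baseline")


def midpoint_status(midpoint_index, baseline, improvement_goal):
    """Classify a school's 2001 midpoint index against its goal and baseline."""
    if midpoint_index >= improvement_goal:
        return "Exceeding Target"
    if midpoint_index >= baseline:
        return "Below Target"
    return "Below Baseline"


def cross_tabulate(quintiles, statuses):
    """Build the 5 x 3 table of observed counts (quintile rows, status columns)."""
    table = [[0] * len(STATUSES) for _ in range(5)]
    for quintile, status in zip(quintiles, statuses):
        table[quintile - 1][STATUSES.index(status)] += 1
    return table


# Invented schools: (quintile, baseline, improvement goal, midpoint index).
schools = [
    (1, 50.0, 54.0, 49.0),
    (1, 48.0, 52.0, 50.0),
    (5, 70.0, 73.0, 74.0),
]
statuses = [midpoint_status(mid, base, goal) for _, base, goal, mid in schools]
table = cross_tabulate([q for q, _, _, _ in schools], statuses)
```

The resulting table has the same shape as the one used for the biennial classifications, so the same Chi-square procedure applies to both analyses.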

Limitations
Steps were taken to ensure that schools of each level (elementary, middle, high) were comparable in terms of the grade levels included. However, no steps were taken to determine what relationship, if any, grade level structures encompassing two or more levels (e.g., P-12, P-8) had with school classification. Additionally, approximately one-third of middle schools were excluded from the study due to grade level structure. As a consequence, results of the aforementioned analyses may not represent middle grades in all schools and districts.

Hypothesis 1
Results of the stepwise multiple linear regression indicated that a significant relationship did exist between the 2000-2002 Biennial CATS accountability index and school socioeconomic variables, community socioeconomic variables, or both for elementary (R = .634; R² = .402; p < .001), middle (R = .677; R² = .453; p < .001), and high schools (R = .737; R² = .542; p < .001). As a consequence, the hypothesis that there will not exist a significant relationship between the 2000-2002 CATS Accountability Index and school/community demographic indicators for Kentucky's public elementary, middle, and high schools was rejected for all school levels.

Hypothesis 2
Tables 5, 6 and 7 summarize the observed and expected classifications of elementary, middle and high schools, respectively. Results of the Chi-square analysis indicated that for elementary (χ² = 11.441, df = 8, p = .178) and middle schools (χ² = 8.238, df = 8, p = .411) there was not a significant relationship between socioeconomic factors and the accountability classification for the 2000-2002 biennium. However, for high schools (χ² = 45.251, df = 8, p < .001) there did exist a significant relationship between socioeconomic factors and accountability classification, with over twice as many of the wealthiest schools classified as Meeting Goal as would have been expected, while a little more than half as many of the poorest schools were classified as Meeting Goal as would have been expected. As a consequence, the hypothesis that there will not exist a significant relationship between the frequency that Kentucky's public elementary, middle, and high schools were designated as Meeting Goal, Progressing, and Assistance Level for the 2000-2002 accountability biennium and their socioeconomic status was retained for elementary and middle schools, and rejected for high schools.
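The probability values attached to these statistics can be checked without statistical software, because the upper-tail probability of a Chi-square distribution with an even number of degrees of freedom has a closed form: for df = 2m, p = exp(-x/2) multiplied by the sum of (x/2)^k / k! for k = 0 to m - 1. The sketch below applies that identity; the function name is hypothetical.

```python
from math import exp, factorial


def chi2_upper_tail_even_df(statistic, df):
    """Upper-tail probability of a Chi-square statistic when df is a positive even integer."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = statistic / 2.0
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


# Statistics reported above for Hypothesis 2, all with df = 8.
p_elementary = chi2_upper_tail_even_df(11.441, 8)
p_middle = chi2_upper_tail_even_df(8.238, 8)
p_high = chi2_upper_tail_even_df(45.251, 8)
```

Under this identity the elementary and middle school statistics should round to the probabilities reported above, while the high school statistic yields a probability far below .001.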

Hypothesis 3
Tables 8, 9 and 10 summarize the observed and expected frequencies with which elementary, middle and high schools exceeded their biennial improvement goals at the midpoint. Results of the Chi-square analysis indicated that there was not a significant relationship between school and community socioeconomic factors and classification of schools as Exceeding Target, Below Target or Below Baseline for middle schools (χ² = 11.630, df = 8, p = .168). However, for elementary (χ² = 17.806, df = 8, p = .023) and high schools (χ² = 39.218, df = 8, p < .001) a significant relationship did exist. As a consequence, the hypothesis that there will not exist a significant relationship between the frequency that Kentucky's public elementary, middle, and high schools had exceeded their 2000-2002 Improvement Goal by the 2001 midpoint report and their socioeconomic status was retained for middle schools, but rejected for elementary and high schools.
Table 8 Actual

Discussion
Analyses of the 2000-2002 CATS Accountability Report indicate a strong relationship exists between the socioeconomic status of schools and their achievement on the assessment tests, with shared variance between socioeconomic factors and the accountability assessments ranging from 39.7% to 60.5% and the shared variance between socioeconomic factors and the CATS accountability index ranging from 43.0% to 50.1%. The presence of such a strong and significant relationship between factors outside of school and student achievement is not new and is not unexpected. Recent analyses in Ohio and Illinois produced similar results. The passage of the NCLB Act, however, has added a new twist to the implications of such studies.
The logic behind public school accountability systems is that the establishment of state goals, standards, and assessments, followed by the systematic dissemination of school achievement information relative to state goals and standards, is intended to create political leverage for school improvement. This leverage is enhanced by the administration of rewards and consequences to schools based upon the attainment or non-attainment of their improvement goals, respectively. If a pattern exists whereby differential assessment results follow differential socioeconomic status, it becomes increasingly difficult for the school community, parents, and the general public to discern schools producing high student achievement due to effective instructional programs from schools producing high achievement due to the advantaged status of their students in the absence of an effective instructional program. Moreover, high poverty schools with exceptional instructional programs quite often are overlooked because their test scores are not as high as those of their more affluent peers.

The Impact of Poverty on Kentucky's Accountability Model
It is clear that in the absence of any means of "leveling the playing field", schools serving high poverty areas are placed in a difficult and arguably unfair position. States using accountability systems that hold all schools to the same goals at all times run the risk of setting the bar so high as to essentially sort schools based upon the nature of their demographics rather than the quality of their instructional program, or of setting the bar so low that it is attainable at any time by any school but serves little purpose in terms of leveraging school improvement.
The use of individualized improvement goals to establish fair, yet challenging targets for all schools has been the approach used by Kentucky. The results of the analysis indicated that despite the strong and significant relationship between the achievement test scores for elementary and middle schools and school socioeconomic status, there was not a significant relationship between the accountability classification of these schools and their socioeconomic status. This finding bodes well for the accountability system in general and the instructional programs of schools at the elementary and middle school level in Kentucky, indicating that elementary and middle schools at all socioeconomic levels are implementing programs to put them on track for proficiency in 2014.
Results of the same analyses for Kentucky's high schools did not bode as well. A significant relationship existed between socioeconomic factors and school accountability classification, essentially sorting schools based upon demographics as much as program effectiveness. Given that each high school's accountability index is calculated in the same way as for elementary and middle schools, and that improvement goals are determined in the same way as well, the question arises as to why the accountability system appears to be biased for high schools, but not for elementary or middle schools.

Capacity to Meet Improvement Goals
An assumption of interim improvement goal accountability models, such as CATS, is that all schools, regardless of background, have the capacity to attain their improvement goals provided they operate at a reasonable level of effectiveness. Conceptually, this capacity may include the capacity to implement changes, curricular and otherwise, through the organization, or the ability to engage parents and the public in the school improvement process. Additionally, factors such as finance may serve to build this capacity. Whatever constitutes this capacity, it is assumed that all schools at all levels are identical in this respect. Should this assumption be true, schools at all levels of poverty should have an equitable chance of meeting their school improvement goals at any point in time, in this case the midpoint of the biennium (2001).
Results of the Chi-square analyses indicate that only for middle schools did this assumption hold true. Elementary and high schools from the highest wealth categories were significantly more likely to have already met their 2000-2002 improvement goals as of the 2001 midpoint report. More specifically, the most affluent Kentucky high schools had met their biennial school improvement goals at twice the rate expected if the process were wealth neutral, whereas the poorest high schools had met the improvement goal at half the rate expected. Elementary results were significant, although not as pronounced, with poorer schools meeting their goals less frequently than expected, and more affluent schools meeting their improvement goals more frequently than expected.
Explanations of the impact of socioeconomic status on the CATS Accountability Index, the system of improvement goals used to classify schools, and the capacity of schools to meet their improvement goals could take several forms. For example, differences in student cohorts taking the tests from year to year, or differences in the availability of qualified staff in higher poverty areas, could both impact the results of student assessments. Regardless of the reason, the data indicate that for the 2000-2002 biennium, socioeconomic status impacted CATS sufficiently to bias the system in favor of high schools with lower levels of poverty. Moreover, when the 2001 midpoint results were considered, it became apparent that a relationship existed between the socioeconomic status of elementary and high schools and the early attainment of their improvement goals. This suggests that even with the individualized improvement goals, high poverty elementary schools may soon fall behind their more affluent peers as the improvement goals are raised. The critical nature of accountability under NCLB makes it imperative that states account for the potential biasing effects of poverty on their public school accountability systems.
Adaptations in Ohio and Texas. The impact of socioeconomic status on accountability measures has been addressed through policy in other states, including Texas and Ohio. Both states compare similar schools or school districts with one another, in addition to rating them against state standards. That is to say, schools and districts are still accountable for meeting uniform state accountability requirements, but relative school and school district achievement is also reported to the public in a systematic way.
Noting that it makes more sense to compare similar school districts, thus allowing parents to answer the essential question "How well is our district performing when compared to school districts with similar characteristics, challenges and resources?", the Ohio Department of Education provides a systematic way of comparing school district performance (ODE, 2002b). Specifically, Ohio communicates differential achievement of socioeconomically and demographically similar school districts to the public by comparing school district achievement with a sample of 20 districts identified as similar based upon poverty, size, socioeconomic status, overall property wealth and school type (urban, suburban, rural). Reports are made available online for interested parties to compare the accountability results (testing results, graduation rates, attendance rates) of these similar school districts (ODE, 2002).
Rather than comparing school districts, Texas generates a Comparable Campus Report that clusters 40 schools with comparable socioeconomic and demographic profiles for the purpose of comparing reading and mathematics. The Texas Learning Index (TLI), a value-added measure of individual student test score improvement, is calculated for comparable students at each school, with these results being aggregated to produce a school TLI. Schools within each comparison group are assigned to a quartile based upon their TLI. These quartile rankings, which are only applicable within the comparison group, communicate to parents, teachers and taxpayers the impact the school had upon student learning in a given year relative to schools of comparable socioeconomic and demographic composition (TEA, 2002).
Argument against comparing school scores. Since the implementation of the Kentucky Education Reform Act (KERA) in 1990, it has been asserted that the individualized target format of KIRIS, and then CATS, has made it inappropriate to compare schools to each other. Rather, schools were to be compared only to their own improvement goal during the reform effort, at least until 2014 when all schools will be held to the same standard. However, the reality is that the general public, inclusive of teachers and principals, looks to see how other schools have performed on CATS when gauging the performance of schools in the area. Moreover, as each successive biennium passes and schools move closer to being compared to the same standard in 2014, the argument against interschool comparisons becomes less compelling; 2014 is fast approaching.

General Conclusions
Conclusions of this study should not be misinterpreted to say that lower accountability standards should be established for high poverty schools. The mantra that "All Children Can Learn" applies in that for schools to improve, high standards must be held. The emphasis on closing the achievement gap between disadvantaged students and their peers reinforces the need for high standards. Rather, this study illustrates that the absence of a systematic approach to comparing accountability scores of schools with differential socioeconomic characteristics is unfair to the children, parents, teachers, and principals of high poverty schools.
School accountability systems are a political means of leveraging improvement through a balance of collaboration for school improvement and consequences for poor performance. Although much attention has been given to the appropriateness of the instruments used to measure school performance, the weakest link in the chain of accountability may be the ways in which results are reported and excellence is defined. In light of the implementation of the NCLB Act, the fact that as of 2002 the CATS accountability system shows potential for bias against high poverty schools, with no provision made to communicate completely to the public the performance of each school as compared to similar schools, is viewed as unjust to all stakeholders.
CATS and the NCLB Act. Since the passage of the NCLB Act, negotiations regarding the types of tests and standards to be utilized at the state and local level have resulted in some flexibility for states. However, the key federal standard regarding the yearly improvement of schools, adequate yearly progress (AYP), was not on the table for negotiation ("Public agenda", 2002). As states prepare to respond to federal government guidelines regarding the determination of adequate yearly progress for public schools, it is important that they examine the degree to which existing accountability measures and models have been influenced by external factors such as socioeconomic status.
Adapting CATS. Currently, when CATS results are communicated to the public through school report cards or press releases, socioeconomic differences between schools are not easily discernible from the minutiae of information conveyed. In the absence of a systematic way for the public to gauge how a given school did compared to schools statewide, or even in a given geographic area, parents and taxpayers will often make inappropriate comparisons between schools. These comparisons are unfair to teachers and principals as they strive to improve their schools, unfair to parents as they make decisions about their children's education, and unfair to taxpayers as they draw conclusions regarding how well their tax money has been spent.
As the CATS accountability system is altered to meet federal guidelines, it is time to implement changes that will ensure that the system recognizes school effectiveness independent of socioeconomic influences. We must embrace the fact that high standards for all children are needed, but we must not become blind to the impact socioeconomic factors can have on test scores.
The NCLB Act has raised the stakes higher than ever for Kentucky's public schools. Educators, parents, students, and taxpayers at large deserve the best possible information relative to the performance of Kentucky's public schools. Architects of the state's accountability system need to build on the strengths of CATS, yet rectify any weaknesses, providing an accountability system that accurately and fairly adjudicates the effectiveness of Kentucky's public schools.

Figure 1. Sample CATS Long-term Accountability Model for a school with an initial index of 40 in 2000. (Each biennium, schools are identified based on whether their average index for that biennium is above the Goal Line (Meets Goals), in the Zone of Fairness (Progressing), or below the Assistance Line (In Need of Assistance).)

Table 1 Distribution of the Population and Sample in terms of Grade Level Structure
Variables
School and community socioeconomic indicators and school accountability indices were obtained from the Kentucky Department of Education and the United States Census Bureau for this study. Descriptions of each variable are provided below, with sample summary statistics reported in Table 2.
CATS Indices. The CATS Accountability Indices for the 2000-2002 biennium and the 2001 midpoint were obtained from the Kentucky Department of Education (Kentucky
School and community socioeconomic status. School socioeconomic status was represented by the mean free and reduced lunch participation rate for each elementary, middle, and high school during the 2001 and 2002 school years (Kentucky Department of Education - October free and reduced price data - 2001-2002). Community socioeconomic status was represented by United States Census income and poverty statistics for the county in which each school was located (United States Census - Small area estimates - State and county poverty estimates, rates and median household income 1998). Specifically, the following indicators were used: (a) Median Household Income, (b) Overall Poverty Rate, (c) the Poverty Rate of Persons Under 17 years of age, and (d) the Poverty Rate of Related Persons ages 5 to 17.