Factor Structure of Opportunity to Learn for Students with and without Disabilities

Opportunity to Learn (OTL) stems from the basic premise that there is an important relationship between the quality and intensity of classroom instruction and students’ levels of academic success. For many students with disabilities, an emphasis on OTL has become a national priority, yet measuring its impact is a complex challenge. The first purpose of this study was to explore the factorial validity of OTL using indicators found in the 2005 4th grade National Assessment of Educational Progress (NAEP). The study entailed confirmatory factor analyses for potential OTL factors including teacher preparation, professional development, classroom activities, and access to technology. Separate factor analyses were conducted using the reading and mathematics datasets. The authors then examined the degree to which OTL factors influenced NAEP estimates of ability for both students with disabilities and their non-disabled peers. The following three OTL factors differentially predicted student scores: classroom activities (reading), student-constructed projects (reading), and using calculators for instruction (mathematics). For the remaining three reading factors and seven mathematics factors, there were no differences in the relationship between the factors and scores for students with and without disabilities.

with special needs, school performance is a result of the interaction between students' abilities and the instructional environment (Ysseldyke & Christenson, 1987). Yet measuring OTL across classrooms, districts, and states is a challenging and often elusive process (Porter, 1993, 1995; Schmidt & Maier, 2009). In its role as "The Nation's Report Card," the NAEP is a valuable resource in looking at OTL because of the national scope of its data collection process and the broad range of student, teacher, and school variables brought together into one data set.
To date, no study has identified a measure of OTL within the NAEP framework or analyzed the impact of OTL on NAEP scores for students with and without disabilities. Using confirmatory factor analysis, this study investigated OTL factor structures using teacher and student background variables within the NAEP dataset. In an analysis of impact on students with and without Individualized Education Program (IEP) plans, we considered the potential relevance of OTL factors in understanding differential levels of academic performance for students with disabilities (SWD). This work complements OTL studies that address the effects of curricular alignment (Porter, 2002) or the impact of tracking systems on student achievement (Oakes, 1985). The strengths and weaknesses of this factor analysis approach, as well as areas for future research, are discussed in the context of the need for adequate measures of OTL in our understanding of predictors of academic success for students both with and without disabilities (Ysseldyke & Christenson, 1987).

Opportunity to Learn
In a broad sense, OTL refers to a student's level of access to educational resources. Because the United States does not have a national curriculum and because funding is primarily received through local sources, levels of OTL can fluctuate substantially from classroom to classroom and from school to school (McDonnell, 1995). The current Common Core Standards initiative seeks to provide states with a common set of academic standards (Council of Chief State School Officers, 2010), partly in an effort to reduce variance between states and to raise the consistency and quality of accountability measures of teachers, schools, and districts. The focus of the OTL discussion has shifted over the past few decades. Original conceptualizations of OTL during the civil rights era were linked to concerns about educational inequity and access to resources and funding (e.g., the Coleman Report, 1966). Carroll's (1984) model of OTL focused mainly on the amount of time dedicated to instruction, whereas more recent models have emphasized student exposure to content via standards-based instruction (e.g., McDonnell, 1995; Schmidt & Maier, 2009). These shifts parallel new language found in the current Individuals with Disabilities Education Improvement Act (IDEIA, 2004) guidelines over previous versions of the law. A central component of IDEIA involves linking IEPs for eligible students with disabilities to the standards-based instruction and assessment framework in a far more explicit manner than 20 years ago.
OTL has been defined in various ways throughout the research literature and in educational policy (Porter, 1988, 1991, 1993, 1995; Schmidt & Maier, 2009). OTL is sometimes used to refer to the inputs, such as teacher quality and fiscal resources, or to system processes, such as implementation of class size reduction programs or standards-based reforms (Herman, Klein, & Abedi, 2000; Scherff & Piazza, 2008-09). At a descriptive level, OTL has become an important consideration when determining how well schools are translating inputs into positive student outcomes (McDonnell, 1995). In its role within the US educational policy arena, OTL is an important construct in discussing educational gaps, whether related to gaps between races, classes, linguistic status, or disability category (Abedi, Leon, & Kao, 2008; Herman et al., 2000). In a striking statement about the relationship between socio-economic status and education, Haberman (1991) referred to inadequate OTL core curriculum as the "pedagogy of poverty." There is a general concern that inequities in OTL may explain, in part, the significant achievement gaps seen in US public education.

OTL and Students with Disabilities
A key issue in OTL for SWD is whether or not they have access to the regular education core curriculum or, as has sometimes been the case, a less rigorous or even separate curriculum. The concern about access to grade-level content is based, in part, on a history of placements outside of the regular education classroom and an education philosophy that did not require that all students be taught and assessed based on the same content standards. However, beginning with the inclusion movement of the 1980s (Osgood, 2005) and the standards-based reform movement of the 1990s, and continuing through the current accountability reforms (Kornhaber & Orfield, 2001), educational structures and policies have increasingly emphasized the need for higher levels of OTL. Both the No Child Left Behind Act of 2001 (NCLB) and IDEIA include provisions that seek to ensure that students with disabilities are provided services that best meet their educational needs. According to IDEIA, schools are required to provide all students with disabilities an IEP that explicitly lays out what services the school will provide to ensure that students are receiving appropriate education in the least restrictive environment (LRE). In turn, placement in an LRE is to facilitate higher levels of OTL for students with disabilities.
For some students, particularly those with more significant disabilities, higher levels of OTL are not necessarily the result of placement in regular education classrooms but of changes in instructional approach (Cawthon, 2004; O'Sullivan, Ysseldyke, Christenson, & Thurlow, 1990). Classroom level variables, such as the intensity of instruction and use of IEPs to reach academic goals, have become a major focus of OTL-related research (Roach & Elliott, 2006). Using test scores on an alternate assessment as a marker of effectiveness, Roach and Elliott investigated factors that led to higher academic outcomes for students with significant disabilities. Using structural equation modeling, they found that access to content area curriculum significantly predicted student performance on standards-based alternate assessments. In an extension of this work, Roach et al. (2009) discussed the importance not only of more instructional time for students with significant disabilities, but also of raising student engagement to ensure that they access the general education curriculum. These two studies mark a new era in research on the impact of OTL on outcomes for SWD within accountability reform.

Measuring OTL
Just as there are multiple meanings of OTL, there are also varied approaches to its measurement (Herman et al., 2000; McPartland & Schneider, 1996). In a broad sense, OTL describes the system of education that predicts student performance, including the teacher, school, and larger contextual level variables (Scherff & Piazza, 2008-09). Wang (1998) asserts that the complexity of OTL necessitates multiple approaches to its measurement; however, developing valid indicators in research has been difficult (Herman et al., 2000). Ideally, OTL should, at minimum, describe students' access to resources, school conditions, school curriculum, and level and intensity of instruction (Herman et al., 2000; Wang, 1998). Alternative definitions have described OTL as a measure of content coverage (e.g., whether or not students are exposed to the appropriate content curriculum), content exposure (e.g., time-on-task and depth of teaching provided to students), content emphasis (e.g., what topics are selected for emphasis), and quality of instructional design (e.g., classroom teaching practices) (Schmidt & Maier, 2009). Depending on the study, OTL can be as concrete as whether or not students have access to college preparatory courses or as broad as whether they have received the "full benefit" of reforms aimed at closing achievement gaps (Wang & Goldschmidt, 2003).
Much of the research on efficacy of education, in general, focuses on student performance on standardized assessments as the primary outcome variable. Previous research has provided limited explicit consideration of a student's OTL test content in analyses of test performance for students with and without disabilities. The current study sought to measure OTL and its impact on test scores using a set of teacher and classroom variables available in the NAEP. Although not developed with a specific framework of OTL in mind, the NAEP is a large-scale, nationally representative dataset that focuses both on student outcomes and on a wide range of contextual factors. Further information about the NAEP dataset and the variable selection process for this study is provided in the methods section, below. Two research questions guided this study: (a) What is the factor structure of NAEP variables that appear to measure facets of OTL? and (b) Is there a differential impact of OTL factors on reading and mathematics scores for students with and without disabilities?

NAEP Dataset
Without a national K-12 curriculum and assessment framework, there are few benchmarks for overall student achievement in the United States (the Common Core Standards initiative under development is a move towards greater commonality in academic standards). The NAEP is one assessment that draws on a nationally representative sample of students in public and private education. Because each state's standardized assessments vary and are based on different content and proficiency standards, NAEP seeks to provide an objective, national yardstick by which to measure student achievement. This study utilized the 2005 4th grade NAEP datasets for reading and mathematics.

Variable Selection
The conceptualization of OTL in this article is multi-faceted and draws upon literature from a number of areas of research on school factors that affect student achievement. This study utilizes OTL factors that are conceptually linked to previous definitions of OTL and that can be operationalized using variables found in the NAEP dataset; as such, this study was limited by what NAEP collects and carries with it all of the challenges of a secondary data analysis design. Furthermore, selecting OTL variables relevant to students both with and without disabilities is a challenging task. Specific variables that are relevant to special education populations are not always available in large-scale datasets designed for a broader population, even when students with disabilities are included in the data collection process. OTL factors present in NAEP and discussed further below include teacher preparation, teacher professional development, content and intensity of instruction, and technology resources. Although the research often cuts across subject areas and student populations, distinctions are made between reading and mathematics content areas, where applicable.
Teacher preparation. As the primary facilitator of classroom learning experiences, a teacher's pedagogy and practice should influence student learning. In other words, students of a teacher with greater content expertise and pedagogical skills should see higher levels of learning than the same students would with a teacher who lacks a strong background or skills (Darling-Hammond & Youngs, 2002). Overall, there are mixed findings regarding the importance of a traditional teacher preparation background, usually associated with a bachelor's degree program, and the sufficiency of an alternative certification program that is typically shorter and relies more on in situ training experiences. Whereas Darling-Hammond and her colleagues found data to support the conclusion that traditional programs are necessary for successful teaching (e.g., Darling-Hammond, 2000; Darling-Hammond & Youngs, 2002; Darling-Hammond & Sykes, 2003), work looking at a teacher's "value added" to student performance downplays support for traditional teacher certification (Wright, Horn, & Sanders, 1997). At issue here is the strength of the connection between what a teacher brings to the classroom (either via training or other teacher characteristics) and measurable outcomes in student performance (Rowan, Chaing, & Miller, 1996; Wilson, Floden, & Ferrini-Mundy, 2001).
Findings about the effects of types of certification are often confounded with other factors, especially content area (i.e., mathematics, science, reading). One of the most consistent and robust overall findings is that students of teachers who hold a degree in mathematics demonstrate higher performance on mathematics assessments than students of teachers without a mathematics content area degree (Goldhaber & Anthony, 2003). For mathematics and science teachers, in contrast with English and history teachers, holding a degree in the subject area is a significant predictor of student achievement (Goldhaber & Brewer, 1997). Yet this overall finding is also complex and does not tease apart differences in instructional approach or in student population. For example, Gimbert, Bol, and Wallace (2007) assert that in hard-to-staff, urban settings, alternatively-certified mathematics teachers may be as effective as mathematics teachers with traditional training in raising student achievement.
Inclusion of essential teacher variables is important if teacher effects are a contributing factor to learning, particularly for students who may not have other resources to help them towards academic success. It may be, however, that individual factors cannot be meaningfully teased apart. For example, teachers may have a college degree in the subject area (such as mathematics) with certification obtained through multiple routes, ranging from the traditional bachelor's degree to a master's degree to alternative programs. Because the teacher preparation and qualification literature is complex and has not drawn clear conclusions as to active ingredients in effective teaching, this study's model of OTL included as many of the above variables as possible in a Teacher Preparation factor, but did not emphasize the primacy of one variable over another. The list of NAEP variables included in the Teacher Preparation factor is found in Table 4 for mathematics and Table 9 for reading.
Professional development. Professional development obtained during one's teaching career serves as a resource for continuing education and skill acquisition (Cohen & Hill, 2001). Many teachers are required to participate annually in a minimum number of professional development hours to maintain their credentials. The effects of professional development on teacher knowledge, instruction, and ultimately, student achievement are thus a critical area of research in education. To the extent that teacher capacity affects student learning, a teacher's level and type of professional development is a potential source of OTL resources for his or her students (Trimble, 2003).
Research on the effectiveness of professional development points to a great diversity in the quality of programs and a corresponding mixed set of results on changes for teachers or their students (Supovitz, 2001). There are two common critiques of teacher professional development: shortcomings in the format of training sessions and the importance of content area specificity in program effectiveness (Garet, Porter, Desimone, Birman, & Yoon, 2001; Thibodeau, 2008). In terms of training session formats, most workshops are delivered in lecture format and are not embedded in teachers' practices. When provided in this manner, professional development does not connect with the life and learning in a classroom (Polk, 2006). Guskey (2000) characterizes effective professional development as sustained, interactive, job-embedded, and collaborative in nature. Active learning opportunities within professional development, such as collaborative study groups, lead to more changes in teacher practices than passive instruction (Desimone, Porter, Birman, Garet, & Yoon, 2002; Desimone, Porter, Garet, Yoon, & Birman, 2002; Thibodeau, 2008). When implemented in conjunction with traditional workshop formats, follow-up activities within a teacher's classroom have been found to be essential to cement newly learned skills into active practice (Bauer, Reese, & McAllister, 2003).
Professional development may contribute to student learning as a direct influence on teacher knowledge (and expectations or attitudes), but the extent to which these translate into measurable effects on student achievement remains questionable. Inevitably, any professional development program must show its beneficial impact on student achievement to be viable as an evidence-based program (Porter, Blank, Smithson, & Osthoff, 2005; Resnick, 2005). When a professional development program contains the sustained, collaborative, and system-supported components described above, large-scale assessments can provide one picture of its impact on student achievement. The list of NAEP variables for Professional Development is found in Table 5 for mathematics; there was not a comparable set of variables for reading.
Content and intensity of instruction. Teacher preparation and professional development experiences provide educators with tools to provide students with quality learning experiences. How students come into contact with the outcome of teacher training falls under the general constructs of content and intensity of instruction. Content of instruction can include how an educator presents the information outlined on curricular standards, specific classroom activities, the rigor of student assignments, and how educators provide feedback to students during the learning process. Content of instruction is perhaps the most salient component of OTL that is present in different models in the literature. For example, Porter (2002) describes exposure to curricular content along two dimensions. The first dimension is the specific topic areas covered within the general domain, such as the relationship of planets and the sun within our solar system. These topics are typically listed as rows in a table and represent curricular standards and material aligned to content area assessments. Once the range of topics taught over the time frame (typically a semester or a year) is finished, teachers complete information about the classroom activities for each of the individual topics. The columns of the grid represent the depth of knowledge taught. For each specific topic, the teacher rates the activities along a continuum of cognitive rigor, such as memorize, apply, and extend. By including this dimension, the concept of what students are learning is tied not only to the topic area (e.g., planets) but also to the intellectual depth of learning demands (e.g., knowing the names of the planets vs. knowing which planets are inhabitable and why).
Intensity of instruction, at its most basic level, refers to the amount of time a student spends on a particular topic. For example, some schools operate on a year-round schedule with short breaks, resulting in more hours of instruction than students on a traditional nine-month calendar receive. Intensity can also be increased by focusing on a specific subject, such as reading, for more minutes within the school day. In this approach, time-on-task becomes the primary metric by which to measure OTL. In the NAEP, the variables in this study are mainly teacher self-report; only mathematics has a "time on task" measure, and it uses only one item rather than items on individual content areas. However, many of the categories of activities in mathematics are coupled with the instructional technology variables listed in the next section below. Reading did not have a time on task variable, but had many variables that related to classroom activities. The list of NAEP variables for Content and Intensity of Instruction is found in Table 6 for mathematics and Tables 10 and 11 for reading.
Instructional technology resources. The fourth main area of OTL measured in this article is the use of technology within elementary and secondary instruction. In this discussion we focus on the study of technology-rich environments, e.g., classrooms that utilize computers, internet resources, multi-media, and/or software packages specifically designed to increase student learning or construction of knowledge. Technology-rich can also refer to wholly online learning environments (i.e., virtual schools) that use web-based classroom platforms, video streaming, or other distance-learning tools (Barbour & Reeves, 2009). Although diverse in the approach and specific use of technology resources, what this body of literature has in common is the goal of studying if, how, and why technology use benefits students. Much of the focus in education reform has been on improving learning outcomes in science, technology, engineering, and mathematics (STEM) fields. The goal is to use technology as part of pedagogy that facilitates conceptual understanding over rote knowledge. In a meta-analysis that included technology use in science, Schroeder, Scott, Tolson, Huang, and Lee (2007) found an overall effect size of Cohen's d = .48, which is higher than meta-analyses of technology and reading (Knezek & Christensen, 2008). This meta-analysis hints at some impact on students, but it combined studies that included a wide range of technology-rich environments and measures of student knowledge. A similar overall effect size (d = .45) was found in a study of technology-rich learning in Taiwan, where there was a great national push for the development of critical thinking skills through the use of technology (Liao, Chang, & Chen, 2008).
More targeted research indicates that students may benefit from contexts where the technology resource is matched with a particular content area or cognitive skill (Tuzun, Yilmaz-Soylu, Karakus, Inal, & Kizilkaya, 2009). For example, a study investigating the impact of using Microsoft Excel within a high school Algebra class found no effect on the overall mean score of students on their final exam, but found that the variability in scores decreased and that students indicated feeling more positive about algebra as a subject area (Neurath & Stephens, 2006). Some schools use software as a way to help students study material for state exams. Use of a software program in Florida had a small impact on the reading and mathematics skills for students in late elementary grades, but no impact on student performance in middle or high school (Martindale, Pearson, Curda, & Pilcher, 2005). Software-specific studies, particularly those used for test preparation, have the advantage of an identifiable technology component, but are limited in their scope and in how they encompass the intent of a technology-rich environment. To the extent that technology integration represents best practices and incorporates pedagogical strategies that seek to increase self-efficacy and student engagement, quality technology use in teaching is one of the ways in which we can conceptualize a student's OTL. NCLB, with its focus on closing the achievement gaps between low socioeconomic status (SES) and high SES students, also includes special emphasis on increasing student access to technology-rich environments (Maninger, 2006; Rigeman & McIntire, 2005). Inequities in this investment for students from different backgrounds or with different educational needs represent a malleable component of OTL. The list of NAEP variables for Technology Resources in mathematics is found in Tables 6, 7, and 8; there were not comparable variables for reading in the NAEP dataset.

Study Sample
A stratified random sample was extracted from the 2005 4th grade NAEP datasets (one for reading, one for mathematics) that consisted of one randomly selected student within each teacher, and all teachers within all schools. This eliminated the need to account for classroom level nesting in the analysis. Teachers and schools with missing values on their identifying variables were eliminated from analysis datasets. For the OTL factor analysis, we first ran a series of constrained models for each factor. In order to provide a cross-validation sub-sample, this dataset was randomly split into two sub-samples: a calibration sub-sample (n = 14,270 and 14,220 for reading and mathematics, respectively) and a cross-validation sub-sample (n = 14,270 and 14,220 for reading and mathematics, respectively).
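The sampling steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' procedure: the record layout, the field names `teacher_id` and `student_id`, and the function `split_sample` are hypothetical, and the stratification used in the actual NAEP extraction is not reproduced.

```python
import random


def split_sample(records, seed=0):
    """Keep one random student per teacher, then split the result in half.

    `records` is a list of dicts with a hypothetical "teacher_id" key.
    Returns (calibration, cross_validation) lists.
    """
    rng = random.Random(seed)
    by_teacher = {}
    for rec in records:
        # Drop records whose identifying variable is missing.
        if rec.get("teacher_id") is None:
            continue
        by_teacher.setdefault(rec["teacher_id"], []).append(rec)
    # One student per teacher, so no classroom-level nesting remains.
    selected = [rng.choice(students) for _, students in sorted(by_teacher.items())]
    # Random split into two equally sized sub-samples.
    rng.shuffle(selected)
    half = len(selected) // 2
    return selected[:half], selected[half:]
```

Because each teacher contributes exactly one student, the two sub-samples never share a classroom, which is what makes the second half usable as an independent check.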
To capture differences between OTL for students with and without disabilities, students with IEPs were compared with students without IEPs in this analysis. Although not all students with disabilities had an IEP, the IEP variable captured students who received special education services. The focus in this study was therefore more specific than the general umbrella of students with disabilities and was specifically on those who had a service plan in place to meet their educational needs. If a student's disability did not require a plan to address academic needs, that student would not be a part of the IEP group. There were a total of 3,330 students with an IEP in the reading sample and 4,160 students with an IEP in the mathematics sample (across both validation and calibration subsets).
The NAEP includes a wide range of demographic variables in its data collection process. Study sample demographics for students with and without IEPs in the reading sample are provided in Table 1 (similar demographic distributions were found in the mathematics sample). There were a total of 3,330 students with IEPs and 25,210 without IEPs in the reading sample (across both validation and calibration samples). Exact percentages within each demographic characteristic vary depending on the level of missing data in each category. (Demographics terminology used in this article reflects the variable names and categories utilized by NAEP.) There was a greater percentage of males in the IEP group (65 percent) than in the non-IEP group (55 percent), reflecting the higher proportion of male than female students diagnosed with some types of disabilities (e.g., Autism). Race and ethnicity variables revealed a similar distribution between the IEP and non-IEP groups, with White students representing approximately 60 percent of the sample, followed by Black students at approximately 20 percent, and Hispanic students at 15 percent, with the remaining 5 percent consisting of Asian American/Pacific Islander, American Indian/Alaskan Natives, and Other. Although an imperfect measure of family income, the NAEP does provide information about student participation in the federal free and reduced lunch program. Eligibility for the program was higher in the IEP group (67 percent) than in the non-IEP group (49 percent). Finally, geographic distributions for both groups were essentially identical: across both the IEP and non-IEP groups, one-third of students lived in a Large/Mid-sized city, one-third lived in Urban fringe/Large town, and one-third lived in a Small Town/Rural area.
Within the IEP group, students had a wide range of disabilities. Results for the reading and mathematics samples are reported in Table 2. The proportion of students in reading and mathematics with each kind of disability was relatively similar across categories. The largest proportion of students with an IEP was for those with a Specific Learning Disability, approximately half of all students in the study sample. Students with a Speech Impairment formed the second largest group, with 930 (29 percent) and 1,060 (27 percent) in the reading and mathematics datasets, respectively. The "Other Health Impairment" category also had a sizable representation, with 280 (9.0 percent) and 360 (9.1 percent) students in reading and mathematics, respectively. The remaining categories of disabilities had fewer than 100 students each, including Hearing Impairment, Visual Impairment, Orthopedic Impairment, Brain Injury, Autism, and Developmental Delay.

Analyses
Confirmatory Factor Analyses
The first research question focused on the factor structure of a conceptually meaningful measure of OTL based on NAEP variables. Confirmatory factor analyses (CFAs) were conducted to test the fit of each content-specific category of measures (i.e., mathematics or reading). It would have been possible to estimate a single model containing all factors; however, the resulting model fit indices would describe the global fit and would have been less helpful in terms of identifying factor-specific misfit. Mplus software's weighted least-squares mean and variance adjusted estimation (WLSMV) procedure was used to estimate each model to handle the categorical indicators. In addition, the Huber-White covariance adjustment was used (by requesting TYPE=COMPLEX) to provide sandwich estimators of model parameters that corrected the associated standard errors for the dependence resulting from multiple teachers' responses per school. Last, we used Hu and Bentler's (1999) criteria for good model fit when assessing models' fit. This included a minimum of 0.95 for the CFI and TLI indices and a maximum of 0.05 for the RMSEA. CFI and TLI values greater than 0.90 were considered supportive of adequate model fit.
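As an illustration only, the fit criteria stated above can be written as a small decision rule. The function name `classify_fit` and the labels it returns are hypothetical, and the sketch reads the RMSEA cutoff of 0.05 as an upper bound, which is how that index is conventionally interpreted.

```python
def classify_fit(cfi, tli, rmsea):
    """Label data-model fit using the thresholds stated in the text.

    Assumption: RMSEA must be at or below 0.05, while CFI and TLI
    must be at or above 0.95, for fit to count as "good".
    """
    if cfi >= 0.95 and tli >= 0.95 and rmsea <= 0.05:
        return "good"
    # CFI and TLI above 0.90 are treated as supporting adequate fit.
    if cfi > 0.90 and tli > 0.90:
        return "adequate"
    return "poor"


# Example: strong CFI/TLI but RMSEA above the cutoff yields "adequate".
label = classify_fit(cfi=0.96, tli=0.96, rmsea=0.07)
```

Under this rule a model can have CFI and TLI above 0.95 and still fall short of "good" because of its RMSEA alone.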
For models that did not fit the data, indicators with standardized loadings lower than 0.3 were scrutinized for possible removal. Some indicators were removed due to lack of variability in responses. Others were removed based on substantive reasons in that the low loadings validly indicated misfit of the item with other items measuring the same factor. Last, some indicators were re-coded due to low frequencies in some of the scoring categories.
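The screening step for weak indicators might look like the following sketch. The function `flag_weak_indicators` and the item names in the example are hypothetical; the flagged items are only candidates for review, since the text makes clear that removal also depended on response variability and substantive judgment.

```python
def flag_weak_indicators(loadings, cutoff=0.3):
    """Return the names of indicators whose standardized loading is below the cutoff.

    `loadings` maps an indicator name to its standardized loading.
    Flagged items are candidates for scrutiny, not automatic removals.
    """
    return sorted(name for name, value in loadings.items() if value < cutoff)


# Hypothetical loadings for three made-up items.
example = {"item_a": 0.75, "item_b": 0.21, "item_c": 0.44}
weak = flag_weak_indicators(example)
```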
The final factor models resulting from the first set of calibration CFAs were next cross-validated using the second, independent cross-validation sub-sample. The same estimation procedure and software were used. Factor loadings, factor variances, and indicator thresholds were constrained equal across calibration and cross-validation samples. These analyses were conducted to assess the invariance of the calibration sample's results.

IRT Ability Estimates
Once the final OTL factor models were selected, the factors were used to predict NAEP's composite IRT ability estimates for both mathematics and reading. The NAEP dataset is designed to provide strong group-level estimates of achievement. In order to estimate individual achievement, the NAEP creates a multiply imputed score using five estimates of an individual's ability. We used the composites to create ability estimates based on student responses to assessment items corresponding to IRT theta values (Mislevy, Beaton, Kaplan, & Sheehan, 1992). The overall average theta value for the IEP group was M = -.804 (SE = .025) for mathematics and M = -.897 (SE = .026) for reading. The average theta value for the non-IEP group was M = .067 (SE = .009) for mathematics and M = .053 (SE = .008) for reading.
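The text does not state how the five imputed estimates were combined. A common choice for multiply imputed scores is the standard multiple-imputation combining rule (Rubin's rules), sketched below as an assumption rather than as the authors' documented procedure. The function name `combine_plausible_values` is hypothetical.

```python
from statistics import mean, variance


def combine_plausible_values(estimates, sampling_variances):
    """Pool a statistic computed once per imputed score (Rubin's rules).

    estimates: the statistic (e.g., a group mean) from each imputation.
    sampling_variances: the squared standard error from each imputation.
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(sampling_variances)   # average sampling variance
    between = variance(estimates)       # variance across imputations (m - 1 divisor)
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5
```

The between-imputation term is what keeps the pooled standard error from understating the uncertainty that comes from estimating individual ability with a short test.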
The last step in the analysis was to measure the differences in the impact of OTL factors on ability estimates of students with and without IEPs. While the individual factor structures were constrained to be equal across the two groups, the regression of each outcome on each OTL factor was allowed to vary across groups. This approach thus assumed that the OTL factor structures and loadings were the same across the two groups but did not assume that their impact on student outcomes was equal.

OTL Factor Structure
The results of the CFAs supported the assumption that there are several factors underlying the hypothesized measures of OTL using NAEP student and teacher variables. Results for the mathematics factor models will be described first, followed by the reading factor models.
Mathematics factor models. There were four mathematics factor models in this analysis: Subject Area Teacher Preparation, Professional Development, Technology, and Content-based Technology. The fit index results (Table 3) for the Subject Area Teacher Preparation factor for mathematics indicated only adequate data-model fit in the calibration sample. Table 4 contains the standardized factor loadings for the six measures of subject area mathematics preparation. We selected these items based on their description of a teacher's mathematics-specific subject area credentials at either the undergraduate or graduate level. The loadings are consistently strong, ranging from 0.754 up to 0.932. Strong support was found for the invariance of this model across both the calibration and validation sub-samples. Both the CFI and TLI supported data-model fit, although the RMSEA value (RMSEA = 0.07) did not quite meet the criterion for good model fit.
The second factor model in mathematics was for Professional Development. The CFI and TLI fit index results supported good model fit for the mathematics professional development factor (see Table 3). The RMSEA value, again, did not quite meet the criterion for good model fit (RMSEA = .06). This pattern of fit index results matched that found for the cross-validation model. There were two factors in this model: Professional Development Format (PD-Format) and Professional Development Content (PD-Content) (Table 5). The PD-Format items relate to the structure and location of the training sessions. For example, items describe opportunities ranging from a workshop to collaboration with a colleague. The second factor in this model relates to the content of professional development. For example, teachers described whether they learned about content standards, how to use manipulatives in instruction, or how to teach students from diverse backgrounds. Consistently moderate factor loading values were found on the PD-Format factor, ranging from 0.443 up to 0.823. Strong loadings were found on the PD-Content factor, with values ranging from 0.606 up to 0.923. A reasonably strong correlation was found between the Format and Content factors (r = 0.625).
The third factor model in mathematics was the General Technology model. The two-factor Mathematics Technology (MT) factor model fit the data well for the calibration sample (see Table 3). There were two factors in this model: Calculator Use and Digital Device Use (see Table 6). The items in the Calculator Use factor related to ways in which a teacher integrates calculators into mathematics instruction and classroom activities. Time on instruction as a single variable was included in this factor because it most closely aligned with the classroom instruction described in this factor. The Digital Device Use factor items describe whether students use a range of other technology such as a geometry sketchbook or a symbolic manipulator. These tools were deemed different enough from calculators in how they are used and in their availability to warrant a separate factor. With the exception of time on instruction, the MT-Calculator Use and MT-Digital Device Use factors had moderate to very strong standardized loadings, ranging in value from 0.566 up to 0.923. The two factors were not strongly related, with a correlation of 0.207.
The last factor model for mathematics was Content-Based Technology Use (Table 7). The activities described in this factor model require access to a computer or the internet, resources typically available in a computer lab but not ubiquitously in each classroom. Many of these factors describe a specific instructional purpose for the use of computer-based technology in mathematics classroom activities. This factor is in contrast with the calculator-based activities in the previous Technology factor. The CFI, TLI, and RMSEA results supported the fit of the calibration and cross-validation models of the four correlated Mathematics Content-Based Technology (MCT) factors (see Table 3). The first three MCT factors (MCT-Assessment, MCT-Instruction, and MCT-Software) are based on teachers' descriptions of their classroom activities. For example, the MCT-Assessment items include testing strategies such as using a computer to give mathematics tests to the whole class or to individual students. MCT-Instruction describes ways in which computers are integrated into instruction of mathematics content, such as presenting concepts or playing mathematics games. Finally, MCT-Software describes specific kinds of computer software used in class, such as graphing programs or word processing. Inter-correlations between the three factors are shown in Table 8. With the exception of using computers to make charts, standardized loadings on these factors are consistently strong, ranging from 0.631 up to 0.919.
Reading factor models. There were three main factor models within the reading subject area: Subject Area Teacher Preparation, Classroom Activities, and Assessment. For the calibration sample, the TLI and CFI met criteria for good model fit for the Reading Teacher Preparation (RTP) factor, although the RMSEA value (RMSEA = 0.06) did not (see Table 3). The standardized factor loadings for the RTP factor were consistently strong, ranging from 0.697 up to 0.970 (see Table 9).
Strong evidence was found supporting the fit of the two-factor Student Reading Activities (SRA) factor model to the calibration sample's data (see Table 3). There were two factors in this model: Teacher-reported activities and Student-reported reading activities. Some of the standardized loadings (see Table 10) on the two factors were moderate whereas others were quite strong, with values ranging from 0.366 up to 0.934. The correlation between the teacher-measured and student-based factors was again low (r = 0.101).
Strong support was found for the Reading: Student-Constructed Assessment factor model consisting of two sub-factors: Teacher-reported classroom assessment and Student-reported classroom projects. Teacher responses were modeled as indicators of the first factor while student responses were used as indicators of the second factor. Moderately strong standardized loadings (values ranging from 0.437 to 0.715) were found across the two factors (see Table 11). Only a low correlation (r = 0.186) was detected between the teacher and student factors.

Differential Impact of OTL factors on IEP and non-IEP student scores
The influence of OTL factors on ability estimates for IEP versus non-IEP groups was tested using a standardized Z-test statistic. Regression coefficient estimates (with Huber-White-corrected standard errors) for the IEP and non-IEP groups' regression equations are presented in Tables 12 and 13 for reading and mathematics scores, respectively. The regression coefficient for each OTL factor's prediction of the reading or mathematics outcome is interpreted such that a positive value represents a positive relationship between the relevant factor and test scores and a negative value represents a negative relationship. The Z-test statistic represents the significance of the difference between the two groups. In this analysis, a negative Z-test statistic means that the OTL factor had a stronger relationship with test scores for students with an IEP than for students without an IEP. Conversely, a positive Z-test statistic means that the OTL factor had a stronger relationship with test scores for students without IEPs than for students with IEPs. All analyses were conducted using an α-level of 0.05.
The relationship between NAEP reading scores and the OTL factors differed between students with and without an IEP for two of the five factors. The regression coefficient for the reading Classroom Activities factor's prediction of reading was −0.63 for students with an IEP as compared with 0.37 for students without IEPs. These coefficients differed significantly (Z = 2.86, p < .05), with a stronger, negative relationship between the factor and reading test scores for the IEP group. The coefficient representing the relationship between the level of student-constructed projects and reading scores was −0.65 for students with an IEP and −0.29 for students without an IEP. These coefficients differed significantly (Z = −4.03, p < .05).
The relationship between mathematics scores and the eight OTL factors differed significantly between the IEP and non-IEP groups for only one factor. The regression coefficient for predicting mathematics scores using the Math Technology - Calculator factor was 0.02 for students with IEPs and 0.09 for students without IEPs. These values differed significantly across the two groups (Z = −1.43, p < .05).
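For readers unfamiliar with this kind of comparison, a common form of the Z-test for the difference between two regression coefficients estimated in independent groups divides the difference in coefficients by the square root of the sum of their squared standard errors. The sketch below implements that generic form. It is an assumption that this matches the statistic reported here; the ordering of the groups in the numerator (IEP minus non-IEP), the function name `coefficient_difference_z`, and the input values are likewise illustrative rather than taken from Tables 12 and 13.

```python
import math
from statistics import NormalDist


def coefficient_difference_z(b_iep, se_iep, b_non_iep, se_non_iep):
    """Z-test for the difference between two independent regression coefficients.

    Assumption: the numerator is (IEP coefficient - non-IEP coefficient) and
    the two groups' estimates are treated as independent.
    Returns (z, two-tailed p-value).
    """
    standard_error = math.sqrt(se_iep ** 2 + se_non_iep ** 2)
    z = (b_iep - b_non_iep) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical coefficients and standard errors, for illustration only.
z, p = coefficient_difference_z(b_iep=0.5, se_iep=0.3, b_non_iep=0.1, se_non_iep=0.4)
print(round(z, 2), round(p, 3))
```

With a two-tailed test at an α-level of 0.05, the difference is significant when the absolute value of Z exceeds approximately 1.96.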

Limitations
There are a number of limitations to this study that must be considered when interpreting the significance of its results. The first is the nature of the dataset used to model OTL. The NAEP dataset is designed to provide a national profile of student achievement across different states and regions of the country. Reports are not available for individual students, teachers, or schools. As a result, students take only a sub-set of the NAEP assessment and only a small proportion of students within a single school participate. Consequently, the teachers of students who take the mathematics assessment may be different from the teachers whose students take the reading assessment. The NAEP design emphasizes group-level analyses, such as boys versus girls, but is not designed to provide interpretations of findings for the specific groups used in this study, students with and without an IEP. Thus, even though the data here are based on the multiply imputed individual scores to create a group IEP versus non-IEP variable, they were not designed that way in the original structure of the NAEP dataset. Each individual student's score still represents their participation in the NAEP, and it is on this basis that we aggregated students into our groups of interest.
A second limitation to this study is the makeup of the sample of students who had an IEP. Participation in the NAEP is not required, and although students with disabilities are included in this dataset, they are not necessarily representative of all students who have an IEP. The NAEP is a fairly rigorous assessment; although students could use accommodations if they met the criteria for the NAEP and for their own IEPs, students who would typically take an alternate assessment would not likely participate in the NAEP standardized assessment. It may be that the OTL factors identified in this study would have a different impact on students with IEPs in general than they did on the students who were included in the NAEP and who were randomly sampled in this study.
The third limitation of this study is that the NAEP variables were designed to gather relevant information for the assessment designers (i.e., the National Assessment Governing Board), but were not developed specifically to mirror the literature on OTL. While there is utility of this dataset for research purposes, and these investigations are encouraged, there are many pitfalls to conducting secondary data analyses on the NAEP. There is therefore an emphasis on some factors within OTL (e.g., teacher preparation) but not on others (e.g., time spent teaching individual content areas). Based on this review of extant research and viable factors in the dataset, the mathematics and reading factors represent different areas of the OTL literature. Furthermore, this study focused on 4th grade, such that teachers were most likely trained and credentialed as elementary grade teachers. The emphasis on subject area preparation and expertise may be different in the middle school and upper grades, when teachers specialize in a single content area such as English or science. The results of this analysis may be different for the 8th grade NAEP results, a grade where there is a different structure for teacher preparation and instruction. Finally, the sampling procedures of the NAEP make it difficult to merge the findings from this dataset with other potential sources of OTL information such as financial resources at the school, spending allocations, and family factors.

Model Fit for OTL Factor Structure
Preliminary results from this project indicate that it may be feasible to construct measures of facets of OTL for use with NAEP analyses. The RMSEA model fit values for factors in this analysis were sometimes close to, although still exceeding, the cutoffs supporting good model fit. With this in mind, the information available in the NAEP dataset includes a number of important contributors to educational quality. The potential utility of the OTL measure in future analyses may be stronger for some factors than for others. For example, it is quite meaningful to know whether a teacher's subject-specific (i.e., reading vs. mathematics) credentials contribute to student performance in the content area assessment. NCLB policies apply not only to student assessment, but also to required subject area preparation for teachers. If the OTL teacher preparation factor is shown to be a predictor of student performance, this result would lend credence to teacher credentialing policies. On the other hand, the technology factor may play a smaller role in applications of an OTL measure. Many of the technology factors were included in this measure of OTL because a large proportion of the items tied the technology to mathematics content. Yet those items that only mention the technology tools, and not content of instruction, may be less meaningful. Whether a student has access to graphing calculators is likely to play only a small role in student performance on the NAEP and is an area of lesser concern in the overall OTL conversation. That said, a multiple-factor approach to the conceptualization and measurement of OTL leads to greater applicability of the measure in future NAEP analyses.

Differential Impact on Student Scores
Among the OTL factors used in this analysis, two of the five reading factors and one of the eight mathematics factors showed evidence of differential impact on scores of students with and without IEPs. For reading, the two significantly different factors went in opposite directions. The Classroom Activities-Student Reported factor was a stronger predictor of test scores for students without IEPs whereas the Student Constructed Projects factor was a stronger predictor for students with IEPs. The Classroom Activities-Student Reported factor covers a broader range of activities than Student Constructed Projects, and mostly focused on students' perspectives about reading. Items in this factor included statements about "learning a lot" when reading or "read to learn real things" or "working in groups to read." The Student Constructed Projects factor, on the other hand, asked more specific task-oriented questions such as "making a presentation about something read," "writing a book report," and "doing a school project about something read." The Classroom Activities-Student Reported factor appears to focus on less concrete, generalized ideas about reading whereas the Student Constructed Projects factor itemized specific activities that required significant levels of time and critical thinking. In this light, the significant difference in outcomes for students with IEPs with higher levels of Student Constructed Projects may be an indicator of the added importance of those activities for this student group.
In mathematics, there was only one factor, Mathematics Technology Use-Calculator, that resulted in a differential impact on estimates of student abilities for students with versus without IEPs. The direction of the difference was negative, indicating that the level of calculator use had a significantly stronger relationship with test outcomes for students without IEPs, or those without disabilities, than for their peers with an IEP. The Calculator factor covered a wide range of instructional activities, including using a calculator for math lessons, checking homework, answering problems independently, graphing functions, and using the calculator for tests. These items did not focus on specific mathematics content, per se. However, it is relevant to note that calculators were a part of the NAEP assessment process; practice using a calculator in class may have helped students do better on the NAEP assessment.
Overall, these results do not suggest a significantly broad differential impact of OTL between students with and without IEPs. These results may be an artifact of the sampling process for the NAEP and its emphasis on students with IEPs who do not need an alternate assessment, and who are thus potentially functioning closer to grade level than their peers who could not meaningfully participate in a standardized assessment. If this is the case, the students with and without IEPs in this sample may actually have more similar characteristics than one would find in a school or district. The relative impact of OTL factors on student test scores might be different on an assessment that was more inclusive in its sample characteristics.

Implications
Given the broad use of the NAEP datasets in research and policy, a robust measure of OTL could be an important contribution to the literature. This study may ultimately contribute to more complete models of student progress that include OTL factors as predictors of student outcomes. From an equity perspective, differential impact of OTL is one way of measuring level of access to educational resources and whether they are equally "active" in improving outcomes for all students. An important note is that this conceptualization of equity looks not at whether students from two groups have the same level of OTL, but whether the OTL factors have the same level of impact on student outcomes. This study emphasizes the relationship between inputs and outputs in this process. In a sense, this analysis focuses on whether the factors typically identified as important in facilitating student learning have the same relationship to student performance. Findings from this study indicate that for the NAEP, factors developed from internal variables do not indicate a broad differential impact of OTL factors on student scores for students with and without IEPs. In other words, OTL factors measured here appear to have similar relationships to outcomes across both groups. In a program or policy focused on one of these factors, such as increasing technology use in mathematics, one would generally expect to see similar impact on student scores.
One fundamental assumption of this approach is that the same OTL factors are important for both students with and without disabilities. Although NCLB and similar accountability reforms are limited in how they differentiate student groups in expectations about content area standards or definitions of adequate progress, IDEIA is centered on the premise that students with disabilities require specialized services that are tailored to meet their educational needs. Even when in a full inclusion setting, a student with an IEP may receive different levels of instruction than their peers, with additional supports for classroom participation and a range of accommodations for assessment. Analysis in this study used the same OTL factor structure for students with and without IEPs because the purpose of this investigation was to see whether the relationships between these factors and student outcomes were the same for both groups. A more in-depth look at OTL for students with IEPs, apart from the OTL factors identified for both groups, could include factors such as special education teacher preparation, intensity of services and match to student needs, and implementation of accommodations in instruction and assessment, to name a few. The factor structure for students with and without IEPs would be different, with additional variables included for students with disabilities. In a confirmatory factor analysis, separate factor structures limit the ability to compare across groups; but if targeting an individual group, this approach could provide helpful context when interpreting student outcomes.
On an assessment policy level, NCLB's emphasis on assessment outcomes may eventually refocus the education reform discussion on OTL. Student performance on standardized tests is currently the primary indicator of school success. The Common Core Initiative will be an opportunity to further explore how the factor structures identified here may differentially (or not) affect students from different backgrounds, be it students with and without disabilities, students who are native English speakers and those who are English Language Learners, or those from different cultural and ethnic backgrounds. If information about OTL in a school is taken in conjunction with student performance on state assessments, policy makers may have a clearer understanding of efforts schools have made to improve student outcomes. For example, a school with high levels of OTL may receive recognition of those efforts to improve student outcomes. State NAEP results are already poised to become a more visible component of school report cards. Results from this project may encourage more nuanced assessment policies that consider student and school OTL when interpreting the significance of standardized test results.
NAEP analysis is a growing area of both primary and secondary data analyses. Understanding the relationship between OTL and student test scores could enrich research on educational outcomes, particularly those studies that use NAEP assessment results or similar large-scale datasets. Several of the highest rated NAEP Validity Research Priorities focus on how states would like to use NAEP data to support state results and to determine how well NAEP aligns with state goals (NAEP, 2003, p. 72). This raises the question regarding OTL: What exposure to NAEP test content is necessary for NAEP to be a useful benchmark of school success for states? Currently states vary widely not only in the curricula prescribed for students, but also in the alignment of state standards with the NAEP content. While it is not necessary for states to show perfect alignment, a reasonable match between classroom instruction and NAEP state assessment is necessary for there to be meaningful interpretations of NAEP results. OTL measures embedded within the NAEP dataset may be one way to complement alignment studies of test content. If states are measured, in part, based on their NAEP performance, links between curriculum, instruction, and NAEP outcomes may become a part of a more comprehensive conceptualization of OTL (Porter, 2002).
On a broader level, research using an OTL measure could provide a more nuanced understanding of the factors that contribute to the outcomes of educational reform, particularly those that hope to close historic achievement gaps. It may be useful to perform similar comparisons for other student groups to see if findings are corroborated with other research on the differential impact of OTL (Schmidt & Maier, 2009). The assessments that are being developed for the Common Core will also offer the potential to gain a more sophisticated understanding of the impact of large-scale reforms. For example, policies that seek to address malleable factors such as teacher preparation or types of classroom activities might look to an OTL measure within the Common Core Assessment framework as a check for impact of implemented changes on student achievement. This would be particularly useful when evaluating the potential for multiple changes in a system from a broad reform initiative, such as NCLB, Race to the Top, charter school initiatives, etc. In contrast with intervention designs where there is a single, often experimentally manipulated factor such as a new curriculum, education reforms are often implemented without the kinds of control groups and isolation of variables that might confound or co-occur with the intended reform. Unintended effects of a new initiative, such as reduction in instructional activities in another subject area, or co-occurring effects, such as the "value added" contribution of teacher preparation in the subject area, are important to measure when monitoring the potential effects of instructional change on student achievement. An OTL measure that tracks the context of learning, particularly when comparing outcomes for groups of students, may be a useful metric in understanding educational change on a large scale. Research and policy guided by a valid and reliable measure of OTL will have a more complete view of "what works" in education, for whom, and why.

About the Authors
Stephanie W. Cawthon
The University of Texas at Austin
Email: Stephanie.cawthon@mail.utexas.edu
Stephanie Cawthon is an assistant professor in the School Psychology Program in the Department of Educational Psychology at The University of Texas at Austin. Her expertise is in policy to practice issues related to students with disabilities, particularly in the areas of assessment accommodations and accountability reforms.

Natasha Beretvas
The University of Texas at Austin
Natasha Beretvas is a professor in the Quantitative Methods Area of the Department of Educational Psychology at The University of Texas at Austin. Her expertise is in innovative application of psychometric and statistical models; multilevel modeling with a focus on extensions to the conventional multilevel model to handle student mobility and other sources of data structure complexities; and meta-analytic techniques.

Alyssa Kaye
The University of Texas at Austin
Alyssa Kaye is a doctoral candidate in the School Psychology Program in the Department of Educational Psychology at The University of Texas at Austin. Her interests are in educational interventions for students with learning disabilities.

Leland Lockhart
The University of Texas at Austin
Leland Lockhart is a doctoral candidate in the Quantitative Methods area in the Department of Educational Psychology at The University of Texas at Austin. His interests are in multi-level modeling techniques for large-scale datasets.

Table 1.
Note: Sample demographics are provided together for both calibration and validation samples used with the reading assessment outcome. All values rounded to the nearest ten as per restricted data use guidelines.

Table 2.
Disability Classifications in Sample of Students with IEPs by Outcome
Note: Among students with IEPs, a total of 3,160 reported a disability category in the reading dataset and a total of 3,930 reported a disability category in the mathematics dataset. All values rounded to the nearest ten as per restricted data use guidelines.
**Indicates a total number that represents less than 0.5 percent of the dataset and is thus not reported to maintain confidentiality.

Table 3
Fit Indices for Mathematics and Reading Factor Models
Note: Degrees of freedom removed as per restricted data use guidelines.

Table 7
Standardized Factor Loadings for Mathematics Content-based Technology Factors

Table 8
Inter-factor Correlation Matrix among Mathematics Content-based Technology Factors

Table 9
Standardized Factor Loadings for Reading Subject Area Teacher Preparation Factor

Table 10
Standardized Factor Loadings for Reading Classroom Activities Factor

Table 11
Standardized Factor Loadings for Reading: Student-Constructed Assessment Factors

Table 12.
Regression Coefficients (and Standard Errors) Representing Relationship between Reading OTL Factors and Reading Outcomes for IEP and No IEP Groups and Z Statistics Testing Differences between Groups' Regression

Table 13.
Regression Coefficients (and Standard Errors) Representing Relationship between Mathematics OTL Factors and Mathematics Outcomes for IEP and No IEP Groups and Z Statistics Testing Differences between Groups'
Note. Huber-White covariance adjusted standard errors were estimated to control for dependencies resulting from multiple teachers sampled per school.