An Analysis of Predictors of History Content Knowledge: Implications for Policy and Practice

Education Policy Analysis Archives Vol. 25 No. 65

How and to what extent students learn history content is a complicated process, drawing on the instructional opportunities they experience, the policy prioritization of history/social studies instruction in schools, and their own cultural perspectives toward the past. In an attempt to better understand the complex interplay among these dimensions, we examined how student sociocultural characteristics, instructional exposure, and school-level variables relate to US History content knowledge. Using data from the 2010 National Assessment of Educational Progress Test on US History (NAEP-USH), multilevel analyses indicated that while sociocultural indicators (such as race, gender, and socioeconomic status) correlate with achievement, students' instructional exposure variables remain significant predictors of history content knowledge. Moreover, school context, such as building-level demographics and state testing policy, predicts between-school variance in content knowledge and moderates the achievement gap. Results also suggest that, while a substantial achievement gap remains, exposure to text-based instructional practices is associated with increased knowledge. Findings from this study have policy implications for the development of a more inclusive social studies curriculum, the advocacy of text-dependent instruction as a high-leverage practice among history teachers, and cautious consideration of tests as proxies for accountability in history education.


Introduction
Teaching history is a complex and multilayered process balancing priorities among content knowledge, curriculum requirements, assessments, cognitive demands, and learning approaches. Among these competing interests is an intricate relationship between how students learn history and what historical content students are taught in school contexts. Within the myriad of factors associated with teaching and learning in history education (Levstik, 2008; VanSledright, 2008), how students are taught, referred to as instructional exposure, affects students' ability to make sense of history. Previous research highlights the importance of inquiry, discipline-specific writing, content literacy, and critical thinking as essential to historical understanding (De La Paz, 2005; Nokes, 2010; Reisman, 2012; Shanahan, 2009; Wineburg & Reisman, 2015). Further complicating students' instructional exposures to history are the various curriculum and content mandates imposed through state policies. Accountability pressures can have unintended consequences, such as teaching to the test, that negatively affect teaching and learning (Au, 2007; Nichols, Glass, & Berliner, 2012). Alternative analyses of National Assessment of Educational Progress (NAEP) and other assessment data link accountability standards and testing to positive student achievement outcomes in other subject areas (Braun, 2004; Dee & Jacob, 2011; Hanushek & Raymond, 2006). However, empirical research examining the effects of testing and accountability on history achievement is largely absent.
History content knowledge is also influenced by the sociocultural context of the learner. Students' identity (ethnicity, class, gender, race, etc.) affects how students selectively make meaning of content (VanSledright, 2011; Wineburg, Mosborg, Porat, & Duncan, 2007). In other words, learners' levels of engagement with historical content and pedagogy are contingent upon their perceived positionality within the history curriculum as well as their own experiences. School context further confounds the learning process. School-level effects are well-documented predictors of student learning across disciplines (Fantuzzo, Lebeof, & Rouse, 2014; Goldsmith, 2011). School demographic factors correlate with teacher decision-making and student learning outcomes in social studies (Epstein, 2009; Fitchett, Heafner, & Lambert, 2014a, 2014b; Levinson, 2012; Segall, 2006).
In total, research suggests that students' historical learning is an intricate puzzle, contingent upon numerous within-school and outside-of-school dimensions. However, few studies have looked at these puzzle pieces collectively in order to uncover a more elaborate picture of their relationship to history content knowledge. Using data from the grade 12 NAEP United States history test (NAEP-USH), we sought to disentangle the interconnections of students' instructional exposure, sociocultural characteristics, and content knowledge within and across schools to better understand the relationships among these pieces and their relative association with students' learning of history.

The Case for History Content Knowledge
Over the last several years, history educators have shifted their prioritization. No longer focused on the simple collection of facts and ideas, they emphasize the "doing of history": the cognitive practices of investigation, interrogation, and interpretation of the past (Barton, 2012; Lee, 1994; VanSledright, 2011). Emphasis on inquiry privileges the process of history, rather than the narrative. In doing so, inquiry-based pedagogy teaches students to seek evidence to guide their thinking, assess the veracity of source material, and engage in critical thinking. History through inquiry promotes a democratization of ideas essential to a society.
Yet, narrative structures remain embedded in the study of history, providing a template for both the contextual and temporal understanding of the past. Furthermore, principles of cognitive theory propose that development of higher-order thinking, such as historical inquiry, is predicated upon (and influenced by) students' frames of reference or schema (Wertsch, 2002). Providing recommendations for the constructivist teacher, Brophy (2006) suggested that in order for students to make authentic connections in their learning, teachers must provide appropriate opportunities to develop knowledge of fundamental content and concepts. Otherwise, teachers run the risk of encouraging ambiguous relativism, leaving their students intellectually adrift. Wineburg (1998, p. 339) argued that it is important for historical interpreters to recognize the context and vocabulary of what they are reading, helping to "constrain" the meaning in order to avoid uninformed inquiry. Evaluating the prominent canon of the United States is consequential for understanding how Americans as a society collectively identify and perceive the past (Reich, 2011; VanSledright, 2011). This collective narrative, while subject to scrutiny (e.g., Howard, 2003), functions as a common frame of reference that allows for meaningful contextualization of source material, the anchoring of inquiry-based instruction, and construction of a common national identity (Reisman, 2012; Reisman & Wineburg, 2008).
In civic education, researchers have made similar claims regarding the importance of a common knowledge base (Galston, 2007; Zhang, Torney-Purta, & Barber, 2012). Specifically, they argue that individuals' knowledge of concepts and content regarding institutions, rights, and processes encourages voting and other forms of active civic participation. Using NAEP civics data, Niemi and Junn (1993, 1998) theorized an interconnected exposure and selection process to explain how and to what extent students retain civic knowledge. They posited that learners' exposure to civics was contingent upon their schooling experience, including exposure to content and pedagogy. Thus, access to curriculum along with optimal delivery of instruction increases students' exposure. Niemi and Junn, however, further contended that exposure to content failed to adequately explain what civic content students retain. Rather, selection of the content mitigates exposure; learners select (i.e., retain) material based upon their interest in the content, motivated by individual sociocultural elements such as race, class, parental involvement, gender, and past experience with the subject.

Influences on Historical Content Knowledge
Similar to civics processes espoused by Niemi and Junn, we posit that students' knowledge attainment in history is an interconnection of schooling experiences and cultural identification. The following sections highlight the importance of both the sociocultural context and formal instructional experiences as influences on students' history content knowledge.

The Influence of Sociocultural Characteristics
History, as a discipline, attempts to understand the past. Values and cultural norms, which influence how societies understand the past and prioritize historical accounts, shape learners' engagement with content knowledge. VanSledright (2011) and Wertsch (2002) refer to this canonical development as a freedom-quest narrative, emphasizing a collective national identity of progress. This traditional historical canon helps contextualize and construct historical arguments, while also serving to bind people culturally. Yet, historical interpretation and knowledge are socially constructed (Barton, 2012). How and to what extent individuals engage the past depends, to no small extent, upon their own subjectivity. History education researchers (VanSledright, 2011; Wineburg, Mosborg, Porat, & Duncan, 2007) note that a learner's positionality and sociocultural context within the curriculum influence acceptance of a particular historical narrative. For students to recognize value in the curriculum and remain motivated to learn, discipline-specific research suggests content should be presented as relevant and connected to the lives of learners (Alexander, 2003; Wineburg et al., 2007). Learners whose identities do not reflect the major figures and cultural attributes of history more often fail to connect with the content. Students, unmotivated to engage with curriculum incongruent with their cultural and social norms, retain less formal history knowledge (Chikkatur, 2013; Epstein, 2009; VanSledright, 2008).
Notable scholars of diversity education have critiqued the content of social studies curriculum, holding particular antipathy toward history (Gay, 2003; Howard, 2003; Ladson-Billings, 2003). They argue a lack of race-consciousness in the traditional history canon contributes to skepticism of the discipline among communities of color. Research suggests that Black students' interpretation and acceptance of historical canon varies from that of their White counterparts. In her study of how Black students interpret history differently from White counterparts, Epstein (2009) found Non-White students take a more critical view of the freedom-quest narrative compared to their White peers, and display outward pride for civil rights struggles and accomplishments. Concomitantly, Chikkatur (2013) found that African American students were more receptive to a curriculum with which they identified. Unfortunately, such episodes within American history canon are more additive than normative, making it difficult for students of color to find their place and identity within the curriculum.
Lack of gender equality in historical narratives is another point of contention. Female students report finding the masculine historical narrative less appealing, thereby influencing their interest in and retention of historical information over years of schooling (Fredrickson, 2004). Research suggests that women are more likely to appreciate history focused on social change, access to democracy, and civil rights (Crocco, 2008). However, akin to the exclusion of Non-Whites, these themes do not resonate in canonical representations of the past, which frequently privilege male-oriented motifs such as war, politics, and power (VanSledright, 2011). Complicating race and gender connections to the past, students from lower socioeconomic backgrounds often lack cultural capital, receive less parental involvement, and maintain lower academic expectations, all of which negatively impact their achievement (Berliner, 2006; Lee & Bowen, 2006). Future educational goals also serve as predictors of student engagement and interest in content (De La Paz, 2005; Smith & Niemi, 2001).

The Influence of Disciplinary Literacy
History education research suggests students exposed to a variety of instructional modes requiring them to engage with text-based resources are more likely to retain content (Reisman, 2012).
Researchers have linked instructional exposure associated with reading, writing, discussion of content, and analysis of documents to "core" teaching practices (Barton & Avery, 2016; Fogo, 2014). These instructional practices are also reflected as essential skills in the Common Core State Standards and manifest as forms of disciplinary literacy (National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010).
Disciplinary literacy is the confluence of information, experiences, and abilities possessed by those who create, communicate, and use knowledge within a specialized subject area (Shanahan & Shanahan, 2008). Processes to this end include reading, writing, listening, speaking, and thinking critically in ways meaningful within the context of disciplinary work. The aim of disciplinary literacy is to "transform students into disciplinary insiders" who are equipped with the specialized skills necessary to meet the demands and mores of content domains (Shanahan & Shanahan, 2012, p. 11). These specialized literacy skills give students agency when asked to comprehend and derive meaning from complex informational texts prevalent in history.
This meaning-making process requires a text-dependent interplay among reading material, task, and reader. More explicitly, disciplinary literacy in history education includes inquiry and author awareness, reading and analysis of various historical source material, writing as an extension of the text engagement, and the use of discussion to elaborate and qualify textual understanding (Giles, Wang, Smith & Johnson, 2013; Moje & Speyer, 2008; Monte-Sano, 2010; Wineburg & Reisman, 2015). It is through these tenets of disciplinary literacy that students engage content in multiple and meaningful ways, connecting content and concepts to working memory. Though few large-scale studies have attempted to ascertain whether such instructional practices are core practices that optimize students' learning potential, modest-scale studies suggest that exposure to these text-dependent methods has the potential to improve students' understanding and knowledge acquisition in history (Monte-Sano & De La Paz, 2012; De La Paz et al., 2014; Nokes, 2010; Reisman, 2012; VanSledright, 2011).
Discipline-specific literacy exposure is also associated with test performance. Reich (2009), in his analysis of how students respond to items on the New York Regents Exams, found that students' discipline-specific vocabulary and test-savvy skills were tied to cognitive reasoning on exam items. He further intimated that exposure to reading and writing within the field potentially improves test acumen on historical content assessments. Similarly, Reisman's (2012) experimental study of a program for reading historical documents found that students exposed to targeted reading strategies scored higher on history tests than those who did not receive disciplinary literacy instruction.
It is important to remember that the aforementioned influences on history knowledge attainment, while grouped for the purposes of this study, are interconnected and should not be compartmentalized. Sociocultural characteristics, such as race and class, can have substantial ramifications for the instruction of students, as research notes that students from non-White backgrounds and from lower socioeconomic families are more likely to receive less qualified social studies teachers and substandard instruction (Fitchett, 2010; Pace, 2011; Segall, 2006). Instructional effects are also pronounced among English language learners and students with special needs, many of whom receive substandard history/social studies instruction (Cho & Reich, 2008; Litner & Schweder, 2008; Szpara & Ahmad, 2006). This teaching divide intensifies unequal opportunities to learn for students and discourages authentic access to the curriculum.

Assessing History
Much debate has centered on the purposes and practices of historical assessment. Seixas and Ercikan (2015) noted there are multiple ways to appraise historical understanding, including knowledge retention, disciplinary procedures, and perspective-taking. They have contended that assessment multidimensionality strengthens the discipline as long as tests accurately measure what they claim to assess. Conflict regarding assessment has arisen in acknowledging what history educators should privilege in their assessments, and what students ought to be able to know and do. Because history education in the United States has traditionally focused on a canonical transmission of the past, assessments have primarily focused on memorization and factual recall (VanSledright, 2014). These exclusively multiple-choice tests often forsake construct validity (measuring what they purportedly assess) for responder reliability and grading efficiency. Moreover, they are often vulnerable to test-taking gamesmanship by students (Reich, 2009). The growing emphasis on disciplinary thinking in history education has led to challenges and criticisms of these traditional assessments. Advocates of a more skill-based assessment of history (Smith & Breakstone, 2015) have challenged conventional history tests, suggesting that they fail to gauge students' disciplinary understanding (i.e., the "doing" of history).
Further complicating history assessment policies is the lack of federal mandates for testing history or other social studies content (Fitchett et al., 2014a). Thus, it is difficult to evaluate students' understanding of history across state lines and consequently compare curricular policy and practices in history teaching and learning. The National Assessment of Educational Progress U.S. history test (NAEP-USH), described in more detail in subsequent sections, provides a viable alternative for examining history learning and teaching in the United States. In assessing both students' content knowledge and analytical skills (Lazer, 2015), NAEP-USH offers a national snapshot of history learning. Furthermore, it can be used to compare sub-groups of students to determine how sociocultural influences and exposure to various instructional practices (e.g., disciplinary literacy) are associated with student learning.

School Policy Context and History Content Knowledge
Within the study of history and the social studies, research also acknowledges that the aforementioned sociocultural and instructional exposure influences are consequential at the school level (Epstein, 2009; Levinson, 2012; Pace, 2011; Segall, 2006). School climate and the surrounding community culture influence how students prioritize and engage historical content knowledge. Moreover, testing culture and accountability requirements constrain history teacher decision-making (Saye et al., 2013) and can promote rote, lower-level instruction (Volger, 2006). Social educators have lamented that accountability and high-stakes testing have hijacked curriculum, converting learning contexts into cram schools focused on test preparation and offering minimal learning engagement and even fewer opportunities for complex, higher-level thinking (Volger & Virtue, 2008). However, competing studies suggest that, while accountability policies have shifted what history students learn, little evidence has indicated that they have actually changed how students learn (Grant, 2001; Patterson, Horner, Chandler, & Dahlgren, 2013; van Hover, 2006). Hence, others argued, accountability and testing hold little influence over history teaching.
Given this evidence, one might conclude that there has been little return on investment for educational policy promoting testing and curriculum standardization in history education; however, previous research from Fitchett et al. (2014a, 2014b) has hinted otherwise. Analysis of elementary social studies indicated that testing was associated with increased time spent on the subject. Moreover, testing, while associated with lower perceptions of autonomy among practitioners, was a more consistent predictor of instructional time than teachers' perceived classroom control, perhaps suggesting that accountability does increase access to time spent on history/social studies course content. Few studies in history/social studies have taken the next step: examining the association between state-level testing and overall student achievement. Therefore, it is necessary to delve into the broader (and more contentious) literature on relationships between accountability mandates and student achievement.
Among wider educational research and policy circles, there is considerable debate over the efficacy of accountability and state testing as appropriate measures for improving student achievement. Amrein and Berliner (2002), in their archival time-series analysis of NAEP math and reading scores, found little evidence that high-stakes testing policies at the state level were associated with higher achievement. In contrast, Rosenshine (2003), examining the same NAEP data with a control group design, concluded that states with testing policies had higher scores on average than states without high-stakes testing. He further posited that these gains were not due simply to test preparation, but rather a concerted effort to referee the curriculum content taught, which often aligns with NAEP standards. In a rebuttal, Amrein-Beardsley and Berliner (2003) incorporated Rosenshine's control design and found that, while states with high-stakes tests performed better than non-testing states on fourth grade math NAEP, results in reading and other grade bands were inconclusive. Moreover, they concluded that NAEP sampling frames often exclude students with learning disabilities and exceptionalities, thus making state comparison unreliable.¹ Although these competing studies offer important findings, they focused on the state level, without sufficiently accounting for variance at the building or student level. In response to these earlier studies, Dorn (2006) argued the need for more advanced, multilevel analysis of NAEP data to better examine teaching and learning across state-policy contexts.
Other robust educational policy analyses find increased accountability policies are associated with higher NAEP scores (Braun, 2004; Carnoy & Loeb, 2002; Hanushek & Raymond, 2006), implying higher standards and accountability measures such as testing externally motivate students and teachers, while regulating instructional content. From a cognitive perspective, research suggests testing and constant evaluations serve as potential tools for reinforcing knowledge, skills, and concepts. Carpenter and colleagues (2009) contended that testing has potentially positive effects on students' long-term historical memory. Sousa (2011), drawing on neurological science, affirmed that testing is an effective measure, when used to promote self-regulation, for gauging sustained learning outcomes. However, from a policy perspective, research has not examined the potential influence of testing and accountability policy on what students learn in history and the social studies.

Purpose of the Study
Whereas the majority of the previous research on historical understanding and achievement has focused on either sociocultural or instructional exposure influences, few studies have examined their complex interrelatedness on a large scale while also accounting for complex between-school characteristics, including the state policy climate. In this study, we analyzed the relationship among various sociocultural and instructional exposure influences on US history content knowledge outcomes for 12th graders using data from the National Assessment of Educational Progress US History Assessment (NAEP-USH). Four

Method

Participants
We analyzed NAEP-USH scores of 12th-grade public school students (n_student = 10,890) nested in schools (n_school = 410).² Small amounts of missing data were present for some of the student demographic characteristics. Multiple imputation was used to replace the missing data. Public schools were specifically chosen because of the inherent uniformity of curriculum and accountability policies found among public schools within a state compared with variable conditions placed upon teachers and students in private schools. Grade 12 was examined exclusively because it was posited that historical knowledge acquisition occurs throughout schooling experience; accordingly, the 12th-grade assessment represents an accumulation of knowledge.

Materials
NAEP US history assessment. For the dependent variable, we used student achievement estimates from a nationally representative sample of students who took the 12th-grade 2010 National Assessment of Educational Progress US History Assessment (NAEP-USH) to assess US history content knowledge (National Center for Education Statistics, 2011). Established in 1994, NAEP-USH is used primarily for research purposes by the U.S. Department of Education and outside researchers to examine historical knowledge across student sub-populations (Lazer, 2015). The federal government administers the test roughly every four years to approximately 10,000 students at three grade levels (grades 4, 8, and 12). NAEP-USH is a low-stakes assessment. The results from the test are not used for student grading purposes or to evaluate teacher performance. The value of the assessment lies in its national sampling frame and the generalizability of the results. Therefore, NAEP-USH serves as the most comprehensive cross-sectional assessment of US history content and conceptual knowledge in the nation.
Test items are constructed evenly around a series of four themes: change and continuity in American democracy; the gathering and interactions of peoples, cultures, and ideas; economic and technological challenges; and the changing role of America in the world (NAGB, 2010). Eight historical periods are represented on the test, including pre-colonial American history, the American Revolution, the U.S. Civil War, the emergence of modern America, World War II, and contemporary issues. Item development and content validation processes included groups of history teachers, history professors, and history education professors. Their recommendations were presented to the NAEP Assessment Governing Board, which made final decisions on item inclusion and content (NAGB, 2010). NAEP-USH at grade 12 includes 110 multiple choice, 36 short open-ended, and 13 extended open-ended items (Lazer, 2015).³ NAEP test developers claim the assessment measures cognitive processes of "historical knowledge and perspective" and "historical analysis and interpretation" (Lazer, 2015, p. 147). Given this claim of breadth and depth of historical thinking, test developers face competing demands: measuring breadth of content knowledge while also measuring analytical skills. Unlike traditional standardized assessments, however, NAEP-USH uses matrix sampling, whereby the test is divided into 10 blocks. Students complete two 25-minute blocks of the aggregate exam for a total test time of 50 minutes. Each block is paired with all other blocks at least once. These content-themed blocks (e.g., Great Depression and World War II) allow NAEP-USH to measure both students' historical knowledge and, to an extent, their ability to use discipline-specific cognitive processes across content domains. Because students take only a portion of the entire assessment (two of ten blocks), they avoid the potential test fatigue associated with a long, burdensome survey of content knowledge. NAEP-USH also attempts to counter issues of reliability associated with offering a narrow set of interpretative questions that require first-order historical knowledge a student might (or might not) already possess.
Because a student only takes a portion of the overall assessment, they do not receive individual scores. Rather, individual outcomes are turned into predictive scores based on student group characteristics. Predictive scores are statistically calculated into plausible values (PVs). These values are created by a multi-stage process. Item-response theory modeling is used to generate the mean and variance of the expected distribution of scores for subgroups of students. Researchers are provided five PVs for each student, which represent randomly sampled values from the expected distribution of the subgroup to which the student belongs. These PVs are standardized into composite scores with a range of 0 to 500.
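Because each student carries five PVs rather than one score, any statistic is typically estimated once per PV and the results are then pooled. The sketch below illustrates one common pooling approach (Rubin's combining rules for multiply imputed data); it is an illustration under stated assumptions, not the procedure reported in this study. The function name, the example means, and the sampling variances are hypothetical, and the sampling variance of each estimate is assumed to have been computed elsewhere.

```python
import numpy as np

def combine_plausible_values(estimates, sampling_variances):
    """Pool one statistic that was estimated separately on each plausible value.

    estimates: the statistic (e.g., a subgroup mean) from each PV.
    sampling_variances: the squared standard error of each estimate
    (assumed to be supplied by the analyst; not computed here).
    """
    est = np.asarray(estimates, dtype=float)
    var = np.asarray(sampling_variances, dtype=float)
    m = est.size                       # number of plausible values (five in NAEP-USH)
    pooled = est.mean()                # pooled point estimate
    within = var.mean()                # average sampling variance
    between = est.var(ddof=1)          # variance of the estimates across PVs
    total = within + (1 + 1 / m) * between
    return pooled, np.sqrt(total)      # pooled estimate and its standard error

# Hypothetical illustration: a subgroup mean estimated on each of five PVs.
pv_means = [288.0, 289.0, 287.0, 290.0, 286.0]
pv_vars = [1.0, 1.0, 1.0, 1.0, 1.0]
pooled, se = combine_plausible_values(pv_means, pv_vars)
print(pooled, se)
```

The between-PV term widens the standard error to reflect measurement uncertainty that a single averaged score would hide, which is why analyses generally avoid averaging the five PVs per student before modeling.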
The results of NAEP-USH and their usefulness to history educators have been widely scrutinized and debated. Detractors suggest that NAEP scores do not capture the complexity of historical thinking (Breakstone, 2014; Rothstein, 2004); however, others argue that tests like NAEP provide a baseline of students' understanding of the common historical narrative, a narrative up for critique but essential for contextualization (Reich, 2009, 2011; VanSledright, 2011; Wineburg, 1998). Given the criticism leveled against using NAEP-USH as a representative gauge of higher-order historical thinking, interpretation of student outcomes was tempered. This study defined NAEP-USH results as a measure of students' content knowledge acquisition and first-order reasoning rather than a range of more complex historical understanding processes espoused by previous researchers (cf. Barton, 2012; VanSledright, 2011).⁴
Sociocultural characteristic predictors. In addition to the assessment items, NAEP-USH also provides data on student demographics and instructional exposure. Informed by earlier research, sociocultural indicators served as independent variables in this study. They included dummy codes for gender, graduation, and free/reduced lunch status. Previous analyses of NAEP-USH found that students who identified as Black, Hispanic, or Native American followed similar trends on NAEP-USH test performance (Heafner & Fitchett, 2015). Therefore, race identification was dichotomized into a non-White, non-Asian indicator. Because cultural capital and educational access are multifaceted, we included parental college enrollment, plans to attend a four-year college, and need for special accommodations (designated Individual Education Plan and/or 504 plan) as sociocultural control variables.
To control for biased estimates from the low-stakes nature of NAEP-USH, a variable for student interest in US History (a standardized Likert-type scale) was incorporated into the models (Lazer, 2015; Niemi & Junn, 1998) as a sociocultural control variable. In addition, students who have not completed US history are potentially disadvantaged on the assessment; hence, they experience a limited opportunity to learn the content. Students' taking US history in 12th grade was included as a covariate. Moreover, controlling for measures of prior academic achievement reduces the likelihood of reverse causality, whereby students' achievement determines classroom practice (Podgursky, 2002). NAEP-USH, much like other cross-sectional NAEP datasets, does not include direct measures of prior achievement (Dorn, 2006).⁵ In lieu of prior academic performance, we also used students' age as a covariate among the sociocultural indicators, positing that students older than the modal age were either grade retained or were slower to mature academically (for other examples see Schmidt, Burrough, Zoido, & Houang, 2015).
Instructional exposure predictors. The grade 12 NAEP-USH provides student-reported data on instructional exposure, which were incorporated as independent variables in this study. Because teaching and learning in history are not specific to a single instruction type (Fallace, 2010; Levstik, 2008), we conducted a principal axis factor analysis with orthogonal rotation to determine whether instructional exposure items loaded onto similar constructs. Results indicated that the items loaded onto two factors. One factor, labeled Text-Dependent Instruction, included items measuring the frequency of discussing materials, reading from textbooks and other sources, engaging with historical documents, and writing short answers. The second factor, labeled Multimodal Instruction, included items measuring the frequency of working on group projects, giving class presentations, writing reports, going on field trips, listening to information online, and using books or computers in the library for schoolwork. One item, frequency of testing and quizzes, was not included in the factors due to lack of fit.⁶ Both factors exhibited adequate internal consistency reliability within the sample (α = .743 for Text-Dependent; α = .727 for Multimodal). The unidimensional factor loading patterns were similar to previous analyses of NAEP-USH data (Heafner & Fitchett, 2015; Smith & Niemi, 2003). Conceptually, the factors aligned with existing literature surrounding students' learning of history. As noted above, research on the teaching and learning of history points toward the importance of disciplinary literacy, which includes the reading, writing, and discussion of historical texts. Furthermore, given the potential to motivate students, efforts to "bring history alive" through varied experiential activities are popular in today's classrooms (cf. Wright-Maley, 2015). Therefore, it was determined that the two-factor solution adequately reflected (conceptually and statistically) students' instructional exposure to history. The two factors were scaled as composite scores, standardized, and included as independent variables.
⁵ NAEP-USH includes only cross-sectional data. Prior knowledge on the NAEP has previously been examined using High School Transcript Study data provided by NCES. Smith and Niemi (2001) used these data in an earlier examination of NAEP-USH. However, those data were not available for the 2010 NAEP-USH.
⁶ Previous analysis (Heafner & Fitchett, 2015) indicates that this item was not significantly associated with NAEP-USH scores. For the sake of model parsimony, we did not include it in this study.
In the classroom, exposure to instruction is rarely dichotomized into either Text-Dependent or Multimodal instruction. History teachers are likely to use a variety of instructional strategies (Fallace, 2010; Levstik, 2008). An interaction term was created to examine how students' exposure to Text-Dependent instruction, as moderated by Multimodal instruction, was associated with NAEP-USH performance. A scale score was created from the interaction term and standardized.
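As a rough illustration of the scaling steps described above, the following Python sketch computes Cronbach's alpha for a set of items, builds standardized composite scores, and forms a standardized interaction term from their product. The item responses, the variable names, and the choice to average items into a composite are assumptions made for illustration only; they are not NAEP data, and the scoring procedure actually used in the study may have differed.

```python
from statistics import mean, pstdev, pvariance

def standardize(values):
    """Return z-scores (mean 0, SD 1) for a list of numbers."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of responses per item."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]        # per-student sum score
    item_var = sum(pvariance(col) for col in items)   # sum of item variances
    return k / (k - 1) * (1 - item_var / pvariance(totals))

def composite(items):
    """Standardized composite: mean of each student's items, then z-scored."""
    return standardize([mean(row) for row in zip(*items)])

def interaction(a_z, b_z):
    """Product of two standardized composites, re-standardized."""
    return standardize([a * b for a, b in zip(a_z, b_z)])

# Hypothetical frequency responses from six students (1 = never ... 4 = daily).
text_items = [[1, 2, 3, 3, 4, 4], [1, 1, 3, 4, 4, 3], [2, 2, 2, 3, 4, 4]]
multi_items = [[4, 1, 2, 3, 1, 2], [3, 1, 2, 4, 2, 1]]

text_z = composite(text_items)                 # Text-Dependent composite
multi_z = composite(multi_items)               # Multimodal composite
text_x_multi = interaction(text_z, multi_z)    # standardized interaction term
alpha_text = cronbach_alpha(text_items)

print(round(alpha_text, 3))
```

Standardizing the product term, as the text describes, puts the interaction on the same scale as the two composites, so its coefficient can be read per standard deviation.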
Within instructional exposure variables, Advanced Placement US History (APUSH) class enrollment was operationalized both as a proxy for prior student achievement and as a control for the instructional exposure predictors. Entry into AP coursework is typically contingent upon a successful academic record, and previous research also notes that students taking APUSH courses experience instruction differently than students in non-AP learning environments (Saye et al., 2013; Smith & Niemi, 2001). We also included an indicator of online instruction as a control variable, positing that online teaching and learning environments are demonstrably different from traditional classroom environments. Descriptive statistics of the sample are included in Table 1.
School-level demographic characteristics and policy context. Because NAEP-USH samples regionally rather than statewide, it was not possible to analyze a three-level model to explore state-to-school-to-student effects. Rather, researchers incorporated state-level accountability data as building-level contextual variables, positing that state mandates would have a similar impact within schools of the same state. To measure US History assessment and testing policy, Education Week's 2010 Quality Counts survey of state-level testing policy was employed (Executive Summary, 2010). From available data, three state-level indicators assigned to the school level were created: test social studies at high school only, test social studies at middle and high school, and test social studies at all three grade levels (elementary, middle, and high school). Previous studies have used Quality Counts to designate similar state policy contexts (Fitchett et al., 2014a, 2014b). NAEP-USH provided quartile measures of school percentage free-reduced lunch and school percentage planning to attend a college/university. In this study, researchers included indicators of greater than 75% free-reduced lunch and less than 25% of students college-bound to examine the effects of concentrated poverty and lower than average post-secondary academic readiness on average history knowledge achievement. In addition, school-level percentages of Black and Hispanic enrollment were included as school-level predictors. School-level indicators of charter school identification and urbanicity were included as building-level control variables.⁷ Table 2 includes a list of all predictor and control variables modeled in this study.

Analytical Procedure
We conducted multilevel modeling (Raudenbush, Bryk, Cheong, Congdon, & du Toit, 2004) using HLM software to document relationships among student sociocultural characteristics, instructional exposure, school-level context, and US history content knowledge. The plausible values algorithm in HLM was used to model the dependent variable for each student. Models accounted for the nesting of students in schools and determined whether there were any significant building-level effects associated with NAEP-USH outcomes. Five models guided the analyses. To answer research question one, Model 1 examined the association between student-level sociocultural variables and controls and NAEP-USH outcomes. To answer research question two, Model 2 introduced student-level instructional exposure variables and controls to determine their unique contribution to the variance in history content knowledge. Model 3 included the interaction term and furthered our exploration of research question two. To answer research question three, Model 4 included school-level contextual and state policy variables to determine their unique contribution to between-building variance in historical content knowledge. To answer research question four, Model 5 examined the extent to which school building characteristics moderated the race effect (non-White, non-Asian) on NAEP-USH. The composite scores and interaction terms were group-mean centered at Level 1. The percentages of Black and Hispanic students were grand-mean centered. Student- and school-level sampling weights were used to provide more accurate coefficients. Robust standard errors were used to account for the complex sampling design.⁸
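The two centering choices mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the student scores, school identifiers, and enrollment percentages are hypothetical, and the function and variable names are ours rather than anything produced by the HLM software.

```python
from collections import defaultdict
from statistics import mean

def group_mean_center(values, groups):
    """Level 1 centering: subtract each school's own mean from its students."""
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return [v - group_means[g] for v, g in zip(values, groups)]

def grand_mean_center(values):
    """Level 2 centering: subtract the overall mean from every value."""
    m = mean(values)
    return [v - m for v in values]

# Hypothetical student-level composite scores and their schools.
text_score = [0.5, 1.5, -1.0, 0.0, 2.0]
school_id = ["A", "A", "B", "B", "B"]

# Hypothetical school-level percentages of Black enrollment.
pct_black = {"A": 10.0, "B": 40.0}

text_cwc = group_mean_center(text_score, school_id)
schools = sorted(pct_black)
pct_black_cgm = dict(
    zip(schools, grand_mean_center([pct_black[s] for s in schools]))
)

print(text_cwc)
print(pct_black_cgm)
```

Group-mean centering leaves each Level 1 coefficient describing differences among students within the same school, while grand-mean centering lets the intercept be read as the expected score for a school at the sample-average enrollment percentage.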

Results
The Intraclass Correlation Coefficient (ICC) for the unconditional model was .200, indicating that 20.0% of the variance in history knowledge outcomes was found between schools and 80.0% between students within schools, justifying the use of multilevel modeling. Table 3 presents results from each of the three analyses associated with the student-level effects for the full sample. Table 4 presents the results of the school-level analyses. Model 1 sociocultural predictors and covariates accounted for approximately 30% of the within-school variance associated with NAEP-USH (historical content knowledge). The inclusion of instructional exposure-type predictors and covariates in Models 2 through 5 accounted for an additional 9% of the within-school variance associated with historical knowledge acquisition, suggesting that approximately 39% of the variance within schools could be explained by the analysis.
⁸ The HLM equations for the full model:
At Model 1, approximately 67% of the between-school variance in NAEP-USH performance was explained. In Models 4 and 5, an additional 9% of the variance was explained from the school characteristics. Overall, the analyses accounted for approximately 76% of the between-school variance in performance.
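The variance proportions reported above follow from two simple ratios, sketched here in Python. The variance components below are hypothetical values chosen only so that the arithmetic reproduces the reported percentages; the study's actual estimates are not given in this section, and the function names are ours.

```python
def icc(tau00, sigma2):
    """Intraclass correlation: between-school share of total variance."""
    return tau00 / (tau00 + sigma2)

def variance_explained(null_var, model_var):
    """Proportional reduction in a variance component relative to the
    unconditional (null) model."""
    return (null_var - model_var) / null_var

# Hypothetical unconditional-model components (between, within).
tau00_null, sigma2_null = 200.0, 800.0

# Hypothetical residual components after adding predictors.
sigma2_model1, sigma2_final = 560.0, 488.0   # within-school
tau00_model1, tau00_final = 66.0, 48.0       # between-school

print(icc(tau00_null, sigma2_null))
print(variance_explained(sigma2_null, sigma2_model1))
print(variance_explained(sigma2_null, sigma2_final))
print(variance_explained(tau00_null, tau00_model1))
print(variance_explained(tau00_null, tau00_final))
```

Because each proportion is computed against the null-model component at its own level, the within-school and between-school percentages are not additive with one another.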

Research Question 1: Influence of Sociocultural Indicators on US History Content Knowledge
Findings indicate that a proportion of the variance in students' knowledge of historical content as measured by NAEP-USH can be attributed to sociocultural variables (see Table 3). Results confirm previous smaller-scale analyses showing that race, class, and gender are associated with variability in students' history content knowledge. When controlling for other variables, Non-White/Non-Asian minority status was associated with performance approximately one-third of a standard deviation lower on NAEP-USH compared to counterparts. Findings also reveal a gender gap, whereby male students consistently outperform female students on NAEP-USH by approximately .20 of a standard deviation. Eligibility for free-reduced lunch and non-college-bound status are also associated with lower performance in US history content knowledge.

Research Question 2: Influence of Instructional Exposure on US History Content Knowledge
While the inclusion of exposure-type variables makes a modest contribution to the overall model estimate, the analyses reveal important findings for social studies and history educators (see Table 3). In Model 2, each standard deviation increase in text-dependent instruction (including reading, writing, and discussing historical content) is associated with approximately 6.5 points higher performance on NAEP-USH (almost a quarter of a standard deviation in achievement). Conversely, each standard deviation increase in multimodal instruction (e.g., group projects and field trips) is associated with the inverse, a 7-point drop in performance on NAEP-USH. Since instruction in history/social studies is rarely bifurcated between these typologies, we included an interaction term in Model 3. Results suggest that multimodal instruction inversely moderates the positive effects of text-dependent instruction. The interaction of text-dependent and multimodal instruction was associated with a decrease of slightly more than one quarter of a standard deviation in student performance on the NAEP. Thus, for each standard deviation increase in the interaction (more exposure to both), the higher performance associated with text-dependent instruction is offset by multimodal instruction.

Research Question 3: Influence of School Characteristics and Accountability Policy on US History Content Knowledge
Models 4 and 5 accounted for a substantial amount of between-school variance (approximately 76%). Indicators of school-wide academic achievement (<25% of the school's students attending a 4-year college), the percentage of Black students, and state testing policy were the statistically significant school-level variables. Every 10% increase in the percentage of Black students in a school was associated with a 1.8-point lower average test performance. Testing policy, employed as a proxy for accountability structures, was associated with higher average building-level performance on NAEP-USH. Testing at the high school level was associated with a 3- to 4-point average increase on the test. Moreover, testing across all grade bands was associated with significant and approaching-significant increases when controlling for other building-level characteristics.

Research Question 4: Modeling the Achievement Gap
As a final step (Table 4, Model 5), we examined the persistent racial achievement gap that was found across Models 1 through 4. Specifically, we examined the extent to which building-level characteristics moderate the gap between Non-White, Non-Asian students and their classmates. Results from Table 4 indicated that each 10% increase in the percentage of Black students within the school was associated with a 1.6-point lower building-level average among Non-White, Non-Asian students on the NAEP-USH grade 12 test as compared to their counterparts. To further illustrate these findings, researchers graphed results from the study. Figure 1 indicates that among Non-White, Non-Asian (NONWHTAS) students and their classmates, text-dependent instruction was associated with increased test performance.
Figure 2 illustrates that NONWHTAS students with high-frequency text-dependent instruction outperformed White, Asian students with lower levels of text-dependent exposure when nested in schools with low to moderate Black student enrollment. However, schools identified as predominately Black were associated with lower average NAEP-USH performance for both race categories, even when accounting for text-dependent instruction. Lower performance was most pronounced among NONWHTAS students. Findings suggested that a systemic underperformance on NAEP-USH within predominately Black schools affected students across racial classification.

Discussion and Implications for Policy and Practice
This large-scale examination of students' history content knowledge from both sociocultural and instructional exposure dimensions suggests that while race, class, and gender serve as dominant predictors of student achievement on history assessments, exposure to text-based instruction makes a positive contribution to what and how students learn. Moreover, building-level demographics and policy context play a significant role in students' between-building achievement level and potentially moderate students' within-building historical content knowledge. These results offer the following implications for policy and practice of history education: a) the importance of acknowledging and countermanding the history knowledge achievement gap, b) the potential for text-dependent instruction as a high leverage instructional practice toward disciplinary literacy, and c) the role of accountability in historical content knowledge.

Acknowledging the History Content Knowledge Achievement Gap
Findings affirm a history content knowledge achievement gap related to the schooling environment and call into question how student communities with non-White, less affluent backgrounds choose to integrate the historical narrative content representative of NAEP-USH. Previous studies offer plausible explanations. Research suggests that the official canon, with an emphasis on White males, is not representative of the diverse population of U.S. schools (Brown, 2011; Cornbleth & Waugh, 1995). Lacking historical positionality beyond the marginalized or the victim, non-White students are more likely to look upon such a narrative with skepticism and dissonance (Chikkatur, 2013; Epstein, 2009; VanSledright, 1998). Consequently, they find learning history/social studies boring, which adversely affects their motivation to learn historical material (Stodolsky, Salk, & Glaessner, 1991; Tanaka & Murayama, 2014). Importantly, race and class should not be conflated. Results indicate that, when controlling for race, eligibility for free or reduced lunch was associated with lower NAEP-USH performance. Access to the official curriculum becomes a fundamental consideration in defining students' opportunities to learn. Given this study's findings, perhaps the official canon perpetuated in standards, curricula, and assessments across U.S. schools deserves further scrutiny. When accounting for other aspects of a student's cultural capital (including socioeconomic status and academic track), non-White, non-Asian students still significantly underperformed, suggesting a dissonance between how students of color and White students perceive history. Perhaps greater emphasis needs to be placed on supporting U.S. history curricula that are inclusive rather than additive and pluralistic rather than monochromatic.
Findings also indicate a discrepancy in historical content knowledge between male and female students. The gender gap remained persistent and aligned with a longstanding gender difference in US History (Zwick & Ercikan, 1989), in which males often prefer social studies content (including history) more than females do. Lower motivation and efficacy toward the study of history among female students is associated with a curriculum that prioritizes male-dominated spheres of politics and war (Crocco, 2006, 2008). If the NAEP does represent the nationalized narrative of the past, then perhaps the gender effect suggests a persistent and over-arching marginalization in how women and feminine topics are included within the history curriculum (Monaghan, 2014; Schmeichel, 2015). Findings from the study imply that additional work at the teaching and curricular level is necessary to improve gender equity and representation in the US history curricula.
Building-level analyses confirm that the racial achievement divide is not only within, but also between schools, a finding uncomfortably comparable to Levinson's (2012) study of civic understanding. The school percentage of Black students was inversely related to average test performance among all students. Modeling the achievement gap within race, the analyses found that the gap widens when the concentration of Black students is higher. These gaps might be partially explained by the lack of resources and adequate school staffing afforded poor and non-White student populations. Research conducted using the NCES Schools and Staffing Survey also suggests that students in predominately Black-enrolled schools are more likely to have a teacher who lacks a teaching license or certification in a social studies/history-related area (Fitchett, 2010). Moreover, the turnover rate, lack of adequate staffing, and dearth of resources in high-minority schools are well documented (Borman & Dowling, 2008; Boyd, Lankford, Loeb, & Wyckoff, 2005). A climate not supportive of the varying needs of history learners can have negative consequences for educational achievement (Ronfeldt, Loeb, & Wyckoff, 2013; Salinas, 2006). The reemergence of segregated schools coupled with a growing socioeconomic gap also contributes to school environments that lack teaching expertise and sufficient resources (Clotfelter, Ladd, & Vigdor, 2006; Mickelson, 2001; Rumberger & Palardy, 2005). Such schools struggle to meet the academic performance expectations of schools with greater fiscal resources. Further research is necessary to explore the between-school differences in how students understand history and what content knowledge they retain. Furthermore, additional study is needed to counter what Gutiérrez (2008, p. 357) refers to as a "gap gazing fetish," whereby researchers and academics fixate on large achievement discrepancies between White and Non-White students. Avoiding deficit thinking necessitates research that targets outlier exemplars, students of color who succeed in history classrooms. By understanding the characteristics of success, rather than highlighting deficiency, ambitious history education can promote practices that bolster underperforming student populations.

Encouraging Policy for the Practice of Text-Dependent Instruction
Though the majority of the variance within schools was attributed to sociocultural qualities, findings from this study indicate that instruction matters. The use of text-dependent instructional strategies, which includes using various source materials, is associated with greater acquisition of content knowledge, providing large-scale support of existing historical epistemologies (Guthrie, Klauda, & Ho, 2012; VanSledright, 2011; Wineburg & Reisman, 2015). Given these findings, perhaps it is time for history education to prioritize disciplinary literacy as a high leverage practice. As defined by teacher education researchers (e.g., Ball & Forzani, 2011; Lampert, 2009), high leverage practices are instructional strategies that teachers employ for substantial learning gains. From a cognitive perspective, domain learning models champion discipline-specific reading, writing, and discussion of content to support students' understanding of the past (Maggioni, VanSledright, & Alexander, 2009; VanSledright, 2012). Furthermore, findings from the current study bolster smaller-scale qualitative and experimental results suggesting that exposure to text-dependent instruction (an emphasis of domain learning in history) has positive implications for how students learn historical content (De La Paz, 2005; Monte-Sano, 2012; Reisman, 2012). While previous research has examined teachers' perspectives on core history teaching practices (cf. Fogo, 2014), the positioning of high leverage practices has rarely been explicitly supported in broader history/social studies literature and policy.
The C3 Framework can potentially help support practices that promote discipline-specific literacy skills in history and other social sciences. Drawing from the Common Core Anchor Standards (2010), the framework provides a useful inquiry arc, which privileges disciplinary literacies within historical context. Reading complex texts and engaging in critical discourse through narrative formation enables students to articulate learning in sophisticated ways that mirror the high-level learning demands and career expectations of contemporary society. In an era of faddish educational reforms, teachers and teacher educators should provide students (and preservice teachers) opportunities to engage in meaningful, content-rich activities that support reading and writing in the field. These potentially high leverage instructional practices can positively promote historical knowledge (Moje & Speyer, 2008; Monte-Sano, 2010; Reisman, 2012) and address student learning outcome differences (De La Paz, 2005; Swanson et al., 2015). The Stanford History Education Group's (n.d.) Reading Like a Historian program is another promising educational tool for engaging students in discipline-specific reading and writing. The program advocates many of the same instructional strategies associated with text-dependent instructional exposure.
Conversely, findings suggest that over-exposure to multimodal instruction is associated with a decrease in historical content knowledge. While these are engaging modes of instruction (Stein, 2009), findings from the current study suggest these methods fail to produce the content knowledge gains of text-dependent instruction. Findings should not be interpreted as declaring that other pedagogical forms such as museums, film, and group projects are wholly ineffective. Rather, analyses in the current study suggest that these forms of instruction do not provide optimal exposure if the instructional emphasis is content knowledge acquisition. While highly interactive and motivating, learning that accompanies these pedagogical models requires intellectual leaps between stimulating activities and content-specific knowledge (Stoddard, 2014). Traversing these gaps alone is challenging for most students and requires significant debriefing and discourse to process experiential learning in the context of existing schema. However, pushed for time and pressured for content coverage, teachers far too often forgo this process, leaving students with disjointed understanding.
The interaction effect between multimodal and text-dependent instruction further indicates that the former negates the positive content knowledge gains afforded by the latter. It is plausible that the inverse achievement outcomes associated with multimodal instruction are indicative of ineffective instructional uses. Wright-Maley (2015) found that teachers vacillate between "hard" and "soft" control over instruction, a tenuous task that leads to substantial variability in student learning. Likewise, Dack, van Hover, and Hicks (2015) noted how infrequently the use of experiential instruction effectively conveys historical/social studies content and concepts.
It is also important to recognize that NAEP-USH does not necessarily align with what multimodal instruction teaches. Perhaps this lack of association with knowledge attainment is an indicator that multimodal instruction, like project-based learning (PBL), develops different skills that, while motivating and effective for promoting affective dimensions of learning, do not readily translate to performance on tests like the NAEP-USH. Moreover, the NAEP-USH did not include a quality indicator. Thus, the analysis was limited to frequency of (or exposure to) multimodal instruction. Parker and colleagues (2013) found that students exposed to PBL scored higher than their counterparts on some measures of the AP Government exam, suggesting that quality instructional delivery is associated with student achievement. Nonetheless, when scaled together, multimodal exposure mitigated text-dependent effects. Faced with instructional choices and expectations to improve historical content knowledge, results suggest that text-dependent instruction is the optimal learning/pedagogical strategy given history knowledge as measured by NAEP-USH.

Careful Considerations of Accountability Policy and History Content Knowledge
It is no secret that few school leaders and even fewer teachers approve of the current wave of testing and accountability mandates pervasive in state educational policy across the nation. Moreover, the recent trend linking students' test scores to teacher performance is extremely controversial, with research suggesting that many of the accountability models and associated value-added models fail to accurately align with recognized standards of good teaching (Amrein-Beardsley, 2008; Polikoff & Porter, 2014). The current study found that attending a school with a required high school history test was associated with higher average performance on the NAEP-USH compared to students in non-test buildings. This finding suggests a policy of curriculum standardization and accountability exposes students to test-taking skills and reinforces content knowledge acquisition. The presence of testing creates a continuum of access to history/social studies instruction not prevalent in the absence of testing.
While it is not the purpose of this research to condone the use of testing as a high-stakes component of teacher evaluation, the use of the test as a referee for curriculum suggests that testing or other state-level curricular mandates might help promote increased content knowledge among students. Given that the 12th grade NAEP-USH measures an aggregate of a student's US History content knowledge, results imply that students who receive greater exposure to US history are more likely to retain content-specific information. Curriculum standardization, and testing as a byproduct, can serve as a policy tool for guaranteeing that content is taught and that all students have an opportunity to learn it. Given the differences found across schools, efforts to level curricular access, even in the form of accountability, can provide greater content learning opportunities for students. However, similar to Marchant and colleagues' (2006) NAEP analyses, we are hesitant to interpret our findings as decisively in favor of high-stakes tests to support student learning in history. The effects of testing have to be weighed against the pedagogical costs placed upon teachers and students. We also recognize that there are substantial debates on what history curricula and assessment should look like and contend that this is a separate issue outside the scope of the current study. We encourage and welcome the continued debate on the purposes and practices of history and the social studies and caution against findings being viewed as essentialist in their recommendations or implications.

Limitations
Due to NAEP's sampling frame, researchers were unable to acquire representation from every state. Therefore, the study's use of state data as a school-level variable should be interpreted cautiously. The NAEP student survey consists of self-reported data; however, efforts were made to control for response bias and effort by accounting for student interest in history as noted in previous research (Niemi & Junn, 1998). Moreover, research indicates that student-reported NAEP data are more reliable than teacher-reported data, particularly in later grades (Henke, Chen, & Goldman, 1999; Smith & Niemi, 2001). These limitations in the sampling frame prevented the development of a three-level multilevel model. The potential exists for between-state variance in students' historical content knowledge that was not examined in this model. In addition, student-reported data on instructional exposure measured quantity, not quality. Conclusions regarding the efficacy of text-based instruction and the relative ineffectiveness of multimodal instruction in this context should be interpreted cautiously. Lastly, critics argue that NAEP-USH is an imperfect tool for measuring historical understanding. We agree the assessment does not represent the ceiling for historical understanding. However, it offers the largest and most representative sample of students' historical knowledge in the US and can serve as a baseline for determining, on a national scale, students' canonical content knowledge. Understanding students' knowledge of history has implications for how we teach and drive policy for the discipline.

Conclusion
In this study, we examined students' history content knowledge as it related to sociocultural attributes, learning exposure, and school characteristics. Findings indicated that while student characteristics (such as race, gender, and socioeconomic status) were significantly associated with NAEP-USH outcomes, students' exposure to instructional strategies remained a substantial predictor of historical content knowledge. The researchers contend that while sociocultural achievement gaps continue, issues related to curriculum, such as content, delivery, gender equity, and access, are important considerations for leveling historical knowledge. Pedagogically, text-dependent instruction illustrates the potential for disciplinary literacy as a high leverage practice for promoting historical knowledge and understanding. Interestingly, accountability mandates (via testing) may serve an important role in historical content knowledge development. Testing is associated with a continuum of access to history/social studies content, which has significant implications for all students. Lastly, the disproportional lack of achievement among students in predominately Black schools should give researchers and education stakeholders pause. However, current findings suggest that students are not destined to fall within these gaps. Students of color, females, and less affluent students are not pre-ordained to perform poorly compared to wealthy, White/Asian males. Increased efforts to expose students to disciplinary literacy skills coupled with supportive curricular policies that assure access to curriculum can positively affect students' learning of history.

Figure 1. Association between Text-Dependent Instruction across NAEP-USH Performance and Race

Table 2
Note: All variables were included in the HLM model. Coefficients for predictors were included in Table 2. Full tables, which include results of control variables, are available upon request from the authors.

Table 3
Level I Fixed Effects Estimates for Models of Selection-Exposure Predictors of US History Content Knowledge (n=10,890)
Note. Coefficients displayed for independent variables of interest. HLM models included full school and teacher control variables.

Table 4
Level II Random Effects Estimates for Models of Selection-Exposure Predictors of US History Content Knowledge (n=410)
Note. *** p < .001, ** p < .001, * p < .05, † p < .10