david brat's liberal analysis of virginia education

Upload: zennie-abraham

Post on 14-Oct-2015

DESCRIPTION

David Brat's Liberal Analysis Of Virginia Education. David Brat ran on a conservative agenda, but this research paper shows the real Mr. Brat is really a lot like me: an analytical, middle-of-the-road liberal. Some in the Tea Party were tricked by their dislike for Eric Cantor.

TRANSCRIPT

SURF PAPER

Educating America's Children

Rachel Mathers

Dr. David Brat

Department of Economics & Business

SURF 2005

Randolph-Macon College, Ashland, Virginia 23005

804-752-7353

AEF - Economics

Prof. David Brat is the contact person

[email protected]

ABSTRACT

The purpose of this paper is to examine President Bush's No Child Left Behind Act and to relate this federal program to Virginia's Standards of Learning tests. With this broad literature in mind, I want to determine what factors drive student performance on Virginia SOL tests. The standard economic research on this topic by Hanushek concludes that there are very few policy variables which can alter student test scores. Card and Krueger have challenged this finding, but still no optimistic outcome exists. I wanted to run these tests for Virginia to reach my own conclusions based on the best science possible.

I provide an extensive literature review which reveals many criticisms of NCLB. Criticisms include setting unrealistic goals, gaming the system, and teaching to the test. Another criticism is that States either accept NCLB guidelines or lose federal funding. On the other hand, accountability has been much needed and recent NAEP scores are going up over time.

Since test scores are used to determine school quality, it is important to know which variables affect these scores. Unfortunately, my research shows that many of the major variables, such as the number of students receiving free school lunch, cannot be manipulated by policy. Broad economic growth or major redistribution would be required. On the other hand, less robust variables, such as truancy rates, can be controlled through the use of appropriate school policy. Further research will examine why high school variation explained (R²) is low and whether all variables relevant to our regression story have been captured.

I. Introduction

Why should we worry about holding schools responsible for the performance of their students? Here's why:

At Luther Burbank School, students cannot take textbooks home for homework in any core subject because their teachers have enough textbooks for use in class only ... Some math, science, and other core classes do not have even enough textbooks for all the students in a single class to use during the school day, so some students must share the same one book during class time ... The social studies textbook Luther Burbank students use is so old that it does not reflect the breakup of the former Soviet Union. Luther Burbank is infested with vermin and roaches and students routinely see mice in their classrooms ... The school library is rarely open, has no librarian, and has not recently been updated. Luther Burbank classrooms do not have computers ... The school heating system does not work well. In winter, children often wear coats, hats, and gloves during class to keep warm. Eleven of the 35 teachers at Luther Burbank have not yet obtained regular, non-emergency credentials.

Is this school in a developing nation? No. This is a description of Luther Burbank Middle School in San Francisco, California. In my view, it is a disgrace that we allow any school to be in such pitiful condition. How can this be?

My SURF project will address educational reform in the United States. It will first examine the economics of education literature, and this will serve as an introduction to the unique market for education. It will also examine many aspects of No Child Left Behind, the federal program which holds schools accountable for student performance. At the State level, I will analyze how Virginia's Standards of Learning project fits under the Federal NCLB program. Finally, I will do the research necessary to determine which variables truly have an effect on student test scores. This will enable me to assess whether and to what extent state and federal policy makers can truly reform education. How much can policy makers really do? I will close by discussing the policy implications of this research.

II. The Economics of Education

Before we can take a look at new education initiatives, we must first examine and understand the economics of education. The majority of this examination will rely on the work of economist Caroline Hoxby at Harvard University. Hoxby describes the way in which markets and incentives can create true reform in K-12 education. She begins by contrasting the K-12 market with the higher education market. How do they work?

Higher education in the United States is competitive in three significant ways. College students have much more decision-making power about where they will attend college because they are more flexible geographically than elementary and secondary students. Also, private institutions offer far more options at the college level than at the primary and secondary levels. Finally, and perhaps most significantly, college students and their families shoulder more of the burden of paying for their own educations than do their counterparts in the primary and secondary sector. This additional responsibility encourages better performance on the part of the students, and more competition among colleges for the students' prized dollars. For this and many other reasons, our higher education system is the envy of the world. Every country sends its best to the U.S.

The average primary and secondary school in the United States does not face nearly as much competition, for several reasons. The median metropolitan area in the United States has only four school districts; that is, parents can choose among only four possible elementary and secondary school districts. Clearly, the typical school district does not face as much competition as the typical college faces to attract students and parents. In addition, parents are relatively constrained in their choices among school districts. Even if many districts exist, switching between them usually requires a change of residence, which can be costly. Choosing an elementary and secondary school district is simply a more constrained choice than choosing a college. In other words, the way to demand a particular elementary or secondary school is to move, so there is much less competition at the elementary and secondary level than at the higher education level.

Finally, the amount of competition in primary and secondary education has decreased significantly over time in the United States. The first reason is school district consolidation. In 1950, there were more than 85,000 school districts in the United States. The nation now has fewer than 15,000 school districts. That amounts to almost a sixfold decrease in the number of school districts in the United States. Of course, many of the districts that were consolidated were rural, and their consolidation may not have had much effect on the competitiveness of elementary and secondary education. But the number of metropolitan school districts fell by more than 200 percent since 1950, and their consolidation has undoubtedly affected parents' degree of choice. Once again, we see that the demand for primary and secondary education does not function like the demand for most products.

In addition, schools do not have to work to find local approval since most of the school budget comes from the state, not localities. In 1940, the typical school district in the United States raised just under 70 percent of the money it spent. Most of the money was raised through local property taxes, and districts therefore had to be responsible to local homeowners and voters. A district that used its money inefficiently was likely to experience falling housing prices, low voter support for bond issues, falling property tax revenues, and tight school budgets. In contrast, the typical 1990s school district in the United States relies primarily on the state for its funding. Currently, 57 percent of the average school's funding comes from state and federal governments. As a result, school districts do not compete so hard for local public support. If a school district wants to obtain a generous budget, it may do better by focusing its attention on pleasing the state legislature rather than pleasing local parents.

Another issue that affects the K-12 education market is citizens' mindset about education. Market structure affects how people think about education. Empirically, we know that people think about K-12 education as an entitlement, but think about higher education as an investment choice that individuals make. These different patterns of thought are important because there is evidence that students are less engaged in their schooling when they regard it as an entitlement.

Ultimately, these differences contribute to the greater efficiency of higher education as opposed to K-12 education. The differing market structure of the primary-secondary and higher education sectors accounts for the fact that American colleges and universities are significantly more successful and efficient at producing education than are elementary and secondary schools. This is one reason why the higher education sector is the logical focus for federal policy that seeks to make the best use of tax money. Another reason is that federal money has more leverage in higher education than in elementary and secondary education. In sum, the disproportionate weight traditionally given to higher education in federal budgets seems appropriate.

Hoxby concludes that, ultimately, federal policy should attempt to help the higher education sector do more than educate and remediate, that is, provide education to those whose secondary schooling is inadequate. It should make the higher education sector a lever that puts pressure on elementary and secondary schools to perform. This can be done by allowing students to use their secondary school money for college courses, as is done in Minnesota. Also, colleges can be rewarded for setting tougher curricular and achievement standards, since these standards set an example for secondary schools.

As can be seen from Hoxby's work, the market structure of higher education is very different from that of primary and secondary education. Ultimately, it appears that the primary and secondary schools need the mechanism of competition in order to force positive changes in the schools. In order to do this, the demanders (i.e., parents) can be given more choice about which school best fits their children. Schools would have to improve or be forced out of the market because no parents would choose to send their children there.

II. No Child Left Behind

Educational reform has taken shape recently in the form of the No Child Left Behind Act, commonly referred to as NCLB. The President's FY 2004 Budget will result in a 41 percent increase ($3.9 billion) in Title I spending since the passage of the No Child Left Behind Act. The stated purpose of the act is to close the achievement gap with accountability, flexibility, and choice, so that no child is left behind. The ultimate goal is to establish a system in which 100 percent of children can meet high academic standards by the 2013-2014 school year.

How do we plan to reach this ambitious goal? Two major components of the act are that states develop adequate yearly progress (AYP) accountability benchmarks toward the achievement of the ten-year goal for schools, divisions, and as a state, and that schools employ teachers and paraprofessionals who are highly qualified. The forerunner of NCLB was the landmark Elementary and Secondary Education Act (ESEA) of 1965. ESEA and its multiple titles (Title I reading programs, for example) targeted dollars to communities with the greatest need. In order to receive Title I funds, states had to create challenging content and performance standards in at least reading and math, develop assessments that were aligned with those standards, and formulate plans to assist and ultimately sanction failing schools.

One of the key goals in educational reform has been to bridge the achievement gap between the poor and the rich. ESEA was the first federal statute that (as part of President Lyndon Johnson's Great Society program) provided a substantial, precedent-setting amount of federal money to local schools so that better education could be provided to historically underserved student groups such as minority and low-income students. Prior to 1965, almost all funds for operating our public schools had come from local tax dollars.

NCLB initiated a strict new era of accountability in the nation's public schools. School accountability is the centerpiece of President George W. Bush's education reform. With this new system of strict accountability of all schools in all states, the federal government is attempting to correct the evident gaps between students of different ethnic or socioeconomic backgrounds and ensure that each and every student is properly educated so that he or she can make a significant contribution to society.

II.1 The Basics of NCLB

The word most often associated with NCLB is accountability. The accountability sections of the new law refer to those parts of the legislation intended to hold public school educators directly responsible for the effectiveness of their instructional efforts. Inherent in this idea of accountability is the fact that actions will be taken if schools are not meeting the standards. Accountability systems have an overall influence on schools in two ways: through defining areas of particular attention for schools and through providing rewards or punishments.

How will the government know if a school is performing well? Standardized testing is the government's answer to this question. NCLB calls for all states to install annual reading and mathematics tests in grades three through eight, and one-time reading and mathematics tests in grades ten, eleven, or twelve no later than the 2005-6 school year. Starting in the 2007-8 school year, states must also administer science tests at least once in grades three through five, six through nine, and ten through twelve ... the tests must be standardized ... the tests should be diagnostic, providing relevant details about weaknesses in the students' mastery of what was supposed to be learned. Notice that it is up to the states to design and score the tests for their students. States are free to determine their own standards, to create their own tests, and to determine for themselves the scores that individual students must receive in order to be deemed proficient. Though the federal government does not design the tests, some very general guidelines are provided to states. For example, the law requires each state to clearly describe at least three levels of student achievement, namely basic, proficient, and advanced.

It is obvious that test scores are a key component of NCLB. These test scores are carefully scrutinized to determine whether schools are improving their scores and whether each subgroup of students is improving. Test scores are the fuel that makes the NCLBA run. Scores are tabulated for schools in the aggregate and must be disaggregated for a number of subgroups, including migrant students, disabled students, English-language learners, and students from all major racial, ethnic, and income groups. All of these scores are then used to determine whether schools are making adequate yearly progress.

Adequate yearly progress (AYP), in turn, is the linchpin of the NCLBA. Adequate yearly progress is tied to whether a sufficient percentage of students are performing proficiently on state tests. The NCLBA requires states to bring all students to the proficient level within twelve years of the Act's passage (i.e., by 2014), and states must ensure that their definitions of adequate yearly progress will enable the ultimate twelve-year goal to be met. To accomplish this, states must set a proficiency goal each year, and that percentage must rise periodically so that by 2014, it hits 100%. For a school to make adequate yearly progress, the student population as a whole, as well as each identified subgroup of students, must meet the same proficiency goal. In other words, all types of student populations must be improving.
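The all-or-nothing character of this rule can be made concrete with a minimal sketch. The function name, the group labels, and the percentages below are hypothetical and chosen only for illustration; the one thing taken from the text is that the school as a whole and every identified subgroup must meet the same annual proficiency goal.

```python
def makes_ayp(proficiency_by_group, annual_goal):
    """Return True only if every reported group meets the annual goal.

    proficiency_by_group maps a group label to the percentage of its
    students scoring proficient; annual_goal is the state's target (0-100).
    """
    return all(rate >= annual_goal for rate in proficiency_by_group.values())


# Hypothetical school (illustrative numbers, not real data).
school = {
    "all students": 62.0,
    "economically disadvantaged": 48.0,
    "English-language learners": 55.0,
}

print(makes_ayp(school, annual_goal=50.0))  # False: one subgroup is below 50
print(makes_ayp(school, annual_goal=45.0))  # True: every group is at or above 45
```

Note that a strong schoolwide average does not help: a single subgroup below the goal is enough for the school to miss AYP.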

Adequate yearly progress is thus less about yearly achievement gains than it is about hitting uniform benchmarks. All states must set a uniform bar for achievement for all schools and all subgroups of students within a school. The first benchmarks were based on test scores from 2001-2. Using these test scores, states had to establish a starting point for AYP that was the higher of the following two values: (1) the percentage of students in the lowest-achieving subgroup, statewide, who were performing proficiently; or (2) the threshold percentage of students performing proficiently in the lowest-performing quintile of schools statewide. If 30% of a state's poor students, for example, scored at the proficient level in 2001-2, while 40% of all students in the school at the twentieth percentile of achievement scored at the proficient level, the initial AYP bar must be at least 40% for all schools and all subgroups of students. The Act requires all schools within a state, regardless of whether they receive Title I funding, to make adequate yearly progress. Thus, the initial AYP bar is set low: all subgroups of students must at least match the proficiency rate of the lowest-achieving subgroup or of the lowest-performing quintile of schools, whichever is higher.
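The starting-point rule and the climb to 100% can be sketched as follows. The "higher of two values" rule and the 30%/40% example come from the text; the equal annual increments, the 2002 start year, and both function names are assumptions made for illustration, since the Act only requires that the goal rise periodically and reach 100% by 2014.

```python
def initial_ayp_bar(lowest_subgroup_pct, twentieth_percentile_school_pct):
    """Starting AYP bar: the higher of the two statewide values."""
    return max(lowest_subgroup_pct, twentieth_percentile_school_pct)


def linear_targets(start_pct, start_year=2002, end_year=2014):
    """Hypothetical schedule: equal yearly steps from the bar to 100 percent.

    Assumption: real state schedules could rise in uneven steps instead.
    """
    step = (100.0 - start_pct) / (end_year - start_year)
    return {
        year: start_pct + step * (year - start_year)
        for year in range(start_year, end_year + 1)
    }


bar = initial_ayp_bar(30.0, 40.0)  # the example from the text
targets = linear_targets(bar)

print(bar)            # 40.0
print(targets[2008])  # 70.0
print(targets[2014])  # 100.0
```

Under this even schedule, a state starting at 40% would have to raise its proficiency goal by five percentage points every year for twelve years.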

The Act requires a rate of improvement on test scores that will be incredibly difficult, if not impossible, to sustain. What happens if a school can't meet the rigorous AYP requirements? Those schools that receive federal funding and fail to make adequate yearly progress are identified as in need of improvement. They are also subject to a range of progressively more serious actions. After two consecutive years of failure, schools must develop a plan for improvement and are supposed to receive technical assistance. Students in those schools are also allowed to choose another public school, including a charter school, within the same district. After three years, students who have not already departed for greener pastures must be provided with tutoring services from an outside provider, public or private. Those schools that fail to make AYP for four consecutive years must take one of several measures, including replacing school staff or instituting a new curriculum, and those that fail for five years in a row must essentially surrender control to the state government, which can reopen the school as a charter school, turn over management to a private company, or take over the school itself.
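The escalating consequences just described can be laid out as a simple lookup. The thresholds and actions paraphrase the paragraph above; the data structure, the function name, and the assumption that earlier measures stay in force as later ones are added are illustrative choices, not statements of the Act's text.

```python
# (consecutive years of missed AYP, consequence), paraphrased from the text.
SANCTION_LADDER = [
    (2, "improvement plan, technical assistance, and public school choice"),
    (3, "tutoring services from an outside provider"),
    (4, "corrective measures such as replacing staff or a new curriculum"),
    (5, "restructuring: charter conversion, private management, or state takeover"),
]


def consequences(consecutive_years_failed):
    """List every consequence triggered so far for a Title I school.

    Assumption: measures accumulate rather than replace one another.
    """
    return [
        action
        for threshold, action in SANCTION_LADDER
        if consecutive_years_failed >= threshold
    ]


for years in (1, 3, 5):
    print(years, consequences(years))
```

A school with one missed year faces nothing yet, while a school with five consecutive misses has reached every rung of the ladder.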

What do economists think of school choice? Many economists find school choice appealing for two reasons. First, it broadens schooling options for families whose choices might otherwise be limited (by such constraints as low incomes, job location, and residential segregation). Second, the increased competition among schools may spur efficiency gains because the loss of student revenues provides schools with an incentive to improve. Although schools that do not receive Title I funds must in theory meet AYP and will have their test results reported, they do not face the public-choice, restructuring, or other accountability provisions that the NCLBA imposes on Title I schools. If a non-Title I school accepts Title I transfer students, however, this might convert it into a Title I school. It is unclear from the NCLBA whether this would happen automatically with even one transfer, or whether it only would happen if enough poor students transfer to bring the poverty level of the chosen school to the requisite level. Schools that accept transfer students who are poor or African American simply might stop doing so. If this happens, then these students will be stuck in the same underperforming school, and they may never receive an adequate education.

How will we know if a state is truly improving? The NCLBA requires that the National Assessment of Educational Progress (NAEP) reading and math tests be administered every two years to fourth and eighth graders. The NAEP is an extensive testing program that has been used for over thirty years to collect data about student achievement. Prior to the NCLBA, participation in the NAEP was voluntary, but now all states must participate. Nonetheless, only a random sample of students within each state must take the test, and scores are not reported for individual students or individual schools. The NCLBA does not indicate what is supposed to be done with the results of the NAEP, but supporters of the Act suggest that results on the NAEP will ensure the rigor of standards and tests used in each state. The validity and reliability of NAEP, often called the nation's report card, are well accepted. It is a test for which students cannot easily be prepped and, since the performance of individual school districts, schools, or students is not reported, there is little incentive to cheat or even to prepare for the test. Since the test was adopted before the advent of state accountability systems, it also provides a neutral standard for assessing the effects of state policies. Thus, improvement there reflects more general learning, not just responses to the specific state testing instruments.

One of the problems with allowing each state to create its own test is that we have no way of simply and accurately comparing results from state to state. However, the NAEP may serve as a makeshift, though not perfect, means of comparison. For example, if a state claims that its scores on its own test have increased by five percent, we would, naturally, expect some increase in its NAEP scores. If the NAEP scores have not increased, it is possible that the state has tweaked the test or the scoring scale so that more students would pass, or it is also possible that the teachers are merely teaching to the test instead of broadening students' overall knowledge.

The provisions of NCLB reach beyond mere testing. As for teachers, the NCLBA requires that Title I schools hire only highly qualified teachers for all subjects and that veteran teachers in such schools demonstrate that they are highly qualified by 2005-6. The Act also reaches beyond Title I schools and requires that all teachers of core academic subjects in non-Title I schools must be highly qualified by 2005-6. Pursuant to the Act and accompanying regulations, teachers are considered highly qualified if they are fully certified and have demonstrated competency in the subjects they teach. Competence is assumed if the teacher majored in the subject in college; alternatively, it can be demonstrated by passing a state test or, for existing teachers, by convincing state evaluators that they know their subject areas. With these requirements, the government is attempting to make sure that all schools have well-equipped teachers.

II.2 The Benefits of Accountability - NCLB

Despite the many criticisms of accountability, the rationale behind accountability is simple and logical. The economic rationale behind school accountability systems is that they will provide schools with an incentive to change and to improve their performance. When schools are held accountable, they have incentives to change because they know that there will be repercussions for their poor performance. In a sense, this is a way to force change in schools that have been performing poorly and haven't made the changes necessary on their own.

Another benefit of accountability is that it could add more of the competition that Hoxby suggested as an important part of the success of schools. If students begin using the school choice options linked to poorly performing schools, schools will have an additional incentive to improve so that they don't lose all of their students. In the future we could be seeing a much more competitive market for schooling. Parents can choose what food their children eat and what type of media they expose their children to, so why not have more choice in education? Of course, the logistics of large-scale choice could be messy, but it is unlikely that most parents would exercise their choice options.

II.3 Criticisms of NCLB

Criticisms are many, and they center on six main claims. First, many claim that NCLB sets unrealistic goals which can't possibly be met. Second, gaming is an inherent risk when there are unrealistic goals. Third, race and poverty are an issue with regard to school success. Fourth, school choice may not work in the way it was intended to work. Fifth, there is concern that some students in subgroups that routinely perform poorly on the tests will drop out. Lastly, there is concern about teaching to the test.

II.3.A Unrealistic Goals

NCLB has been criticized by many who claim that its goals are unrealistic and unattainable, thus creating incentives for states to cheat the system. Those who favor the Act emphasize its laudable goals and celebrate its tough accountability measures. Those who criticize the Act lament the heavy emphasis on testing and the inevitable teaching to the test that will follow. They also chastise the federal government for interfering with state and local control over education while failing to fund all of the costs associated with the Act. Commentators consider the ultimate goal of achieving 100% proficiency in twelve years to be utterly unrealistic.

What is the result of these unrealistic goals? The Act creates counterproductive incentives by establishing overly ambitious achievement goals and imposing significant sanctions for failing to meet those goals. It allows states to act on these incentives by leaving them free to create their own tests and scoring systems. This odd combination of regulatory stringency and laxity could well prove disastrous. It will encourage states to lower their standards, make their tests easier, or lower the scores needed to be deemed proficient. It will promote greater segregation by class and race. And, finally, it will help push talented teachers away from schools likely to be deemed failing, or from teaching altogether. In an attempt to drive education policy without intruding too greatly upon state authority, the federal government has combined regulatory stringency regarding AYP with regulatory laxity regarding the quality of standards and assessments. This will likely prove to be an unworkable compromise.

II.3.B Gaming

While sanctions are intended to prompt poorly performing schools to improve, they may, in fact, encourage those schools to find ways to game the system. The federal government can create all the monetary rewards and sanctions it likes, but if states are the sole judges of whether their standards are sufficiently rigorous, those rewards and sanctions will either be futile or counterproductive. States have four options. First, they could direct their energy and resources in an earnest effort to improve achievement, hoping against hope that they can pull off a miracle and meet the Act's goals. Second, states could stall by setting annual yearly progress goals in such a way as to postpone the need for large increases until later in the twelve-year period. If the Act is not modified or repealed, however, postponement will not be enough. A third strategy would be to ignore the federal mandate or to decline Title I funding. Although large-scale defection may indeed occur over time, at the moment it appears that all states are attempting to comply with the letter of the law. This leaves one last option: make the tests easier or lower the score needed to be considered proficient.

If a state can manipulate the system so that its schools appear to be doing well and don't receive sanctions, what's going to stop it from doing this? Louisiana, Colorado, Connecticut, and Texas have all tinkered with their scoring systems in order to increase the number of students who will be deemed proficient for purposes of the NCLBA. Other states may alter not just the scoring system but the tests themselves, making them easier to pass. States on their own might engage in a race to the bottom by creating easy tests that are not accurate proxies for quality but nonetheless give the impression that their schools are good. Hence the puzzle: Why would a state ever use a test or scoring system that does not make most schools look good? Why wouldn't states instead try to manipulate tests or scoring systems to give the impression that their schools are excellent? Thus, the sanctions of NCLB could merely result in easier tests, not better schools.

Prior to the NCLBA, test scores were somewhat ambiguous proxies for quality, because generally lower scores in one state could be interpreted as a sign of a truly rigorous test. Lower test scores generated by harder tests were also tolerable because sanctions were limited and relatively mild, at least for schools. Although states adopted accountability measures prior to the NCLBA, they reserved the most severe sanctions for students. In eighteen states, students had to (and still must) pass exit exams in order to graduate, while in others they had to (and still must) pass tests to be promoted from one grade to the next.

Some states that had good tests to begin with might actually reform for the worse in order to spare their schools from punishment. The requirement that an increasing percentage of students in every school achieve a certain test score each year is arbitrary and unrealistic, in that it establishes achievement goals without any reference to past achievement levels or rates of achievement growth. Many schools, including some that are considered effective, will be unable to meet these achievement targets. This will create pressure to make the targets easier to meet by dumbing down the tests or making scoring systems more generous. By this process, a law intended to raise academic standards may lower them.

Supporters of the Act nonetheless suggest that states will refrain from lowering their standards because of the NAEP. Recall that a sample of fourth and eighth grade students in every state will have to take the NAEP reading and math tests every other year. If a state's students perform well on state tests but poorly on the NAEP, the argument goes, the state will be embarrassed into raising its standards. This argument is not very persuasive. First, state tests and the NAEP need not be identical or even similar. A state whose students perform poorly on the NAEP could easily claim that it prepares students for the state tests but not the NAEP. Second, NAEP results are not reported for individual students or individual schools, but instead are reported statewide, and even then only for a couple of grade levels. It thus seems unlikely that state and local officials, or their constituents, will be bothered by a gap between state test results and NAEP results. Therefore, states may be able to lower their standards in order to make it seem as though a large percentage of students are passing the state tests, regardless of NAEP performance.

II.3.C Race and Poverty

NCLB may not even help the students it was most intended to assist. Professors Kane and Staiger concluded that schools that contain an African American or economically disadvantaged subgroup are much more likely to fail to make adequate yearly progress than those that do not. To improve the chances that a particular school or schools within a district make AYP, administrators have an incentive to minimize the number of African American or poor students in a school or district. Importantly, administrators need not exclude all such students. The NCLBA only requires the disaggregation of scores for a subgroup if it is sufficiently large to yield statistically reliable information. Because there is no single formula for determining this figure, the NCLBA allows states to determine the minimum size of subgroups.

Instead of assuring that minority students are meeting the standards, states could simply set a high number for the minimum size that subgroups must reach in order for their scores to affect whether a school meets its AYP goal. If this happens, these students will continue to be neglected, though it is unlikely that a state could set an extremely high minimum without being noticed.

II.3.D Lack of Real School Choice embedded in NCLB

Recall that the NCLBA allows students in Title I schools that fail to make AYP for two consecutive years to attend another public school within the same district. Only schools that have made AYP are eligible to receive transfer students. If there are no such schools within the district, the NCLBA and its regulations encourage but do not require districts to arrange for students to attend school in another district. The NCLBA regulations also suggest that lack of space in a good school within the same district is not a sufficient reason to deny students their right to choose another school. If minority and poor students disproportionately do worse on standardized tests, Title I schools with such students will be more likely to fail to make AYP. As a result, many minority and poor students will have the option to transfer. The schools to which they transfer are more likely to be white and middle class if, again, past performance on standardized tests is any indication.

As a result, the operation of the public school choice provision in the NCLBA may promote greater racial and socioeconomic integration. This is indeed a possibility, and for those who favor greater integration, it is a welcome one. There are reasons to be skeptical, however, that the choice provisions will play out in the way just described. First, it is important to recognize that interdistrict choice is not required by the NCLBA. In many metropolitan areas, segregation occurs between rather than within districts, and in these areas the NCLBA choice provision offers little hope of promoting integration. Second, where there is diversity within a given district, space constraints will surely limit the amount of movement. It is inconceivable that states and districts will abide by the regulation that suggests a lack of space is no excuse for failing to guarantee school choice; administrators of successful schools may claim that they lack much, if any, space for transfer students. Third, there is a little-noticed provision in the NCLBA that makes the school choice provision contingent on state permission. The NCLBA requires schools to offer choice unless they are prohibited from doing so by state law. Although this might be an extreme move, it is possible that, if nothing else works, states will enact laws prohibiting school choice. Taken together, all of these obstacles make it unlikely that the NCLBA requirement of offering choice will be sufficient to overcome the strong incentives to maintain or increase racial and socioeconomic segregation. Therefore, there may not be any real choice offered by the school choice provision of NCLB.

II.3.E Incentives for Dropouts

Schools may be so desperate to get rid of those students who perform poorly that they may actually discourage poor and minority students from staying in school. Schools, to the extent they can, will work to avoid enrolling those students who are at risk of failing the exams. The same pressure could lead schools to push low-performing students out, either to another school (if one can be found that will accept them) or out of the school system entirely. This temptation presumably will be strongest at the high school level, both because students most typically drop out at this stage and because low-performing high school students are most likely to be farthest behind. Given the connection between performance on tests, socioeconomic status, and race, the students most likely to be targeted for exclusion will be poor and/or racial minorities. One fewer student performing below the proficiency level increases the overall percentage of students who have hit that benchmark.

Instead of helping those students who are most in need of help, schools could push these students out, leaving them little hope of obtaining a job that pays above minimum wage and provides mentally stimulating work. The No Child Left Behind Act provides weak protection against this temptation. It requires that graduation rates be included as part of a school's determination of AYP, but it does not say what the rate must be, nor does it demand that the rate increase over a certain period of time. Moreover, graduation rates can only be counted against a school when determining AYP. A school with poor test scores, in other words, cannot point to a relatively high graduation rate and thereby make AYP.

On the other hand, a school with good test scores but low graduation rates could be at risk of failing to make AYP if the state sets a high target for graduation rates. States thus have little incentive to establish a demanding graduation rate. The lower that rate is set, of course, the easier it is for schools to push out students. To be fair, the NCLBA does require that information about graduation rates be made public. Disseminating this information is far from useless, but it remains to be seen whether simply publishing graduation rates will provide sufficient protection for students at risk of being pushed out. If it does not, and if dropout rates increase, the NCLBA could end up further harming those students who obviously need the most help, leaving them, quite literally, behind. Instead of receiving an adequate education, students would be pushed out of school before they could acquire the skills needed to succeed.

II.3.F Teaching to the Test

The goals of NCLB are so difficult to achieve that it is obvious that much instructional time will be devoted to test preparation. A system built solely on test scores, for example, filters out everything except student academic achievement. Similarly, if some subjects are tested and others are not, it is natural to think that attention will go more to the tested areas than the untested areas. Relatedly, part of the debate about testing holds that tests of lower-order skills tend to crowd out schools' attention to higher-order skills.

This could take some of the joy out of teaching and discourage potential employees from entering the profession in the first place. Establishing standards and requiring periodic testing might protect students against unmotivated teachers who, given the chance, would shirk their responsibilities. At the same time, however, reducing their autonomy can make teaching less attractive to very good teachers. Those teachers who can be trusted to motivate and teach their students may find teaching less rewarding the more they are shackled to state standards and tests. Protecting students against bad teachers can thus simultaneously deter good ones from entering or remaining in the profession.

England's experience, for example, is instructive. A little more than a decade ago, under Margaret Thatcher's leadership, the British government introduced a national curriculum, which described in precise detail what each child should learn in each grade. The British government also implemented a series of mandatory tests in English, math, and science for students at ages 7, 11, 14, and 16. The tests were designed as a form of handcuffs, to make sure that teachers followed the national curriculum. Ten years later, England is facing a severe shortage of teachers. Other factors have contributed to the shortage, of course, but the same trend exists elsewhere. As one report observed, in countries where accountability systems have undermined teacher autonomy, like England, Canada, and Australia, there is a recruitment crisis.

While the United States still gives states freedom in the design and scoring of their tests, are we moving in the direction of England's policy? While I think this is a possibility, it appears to be a remote one. If the federal government used a national test as the basis of state educational evaluation, the states' rights would be trampled, so this seems unlikely to happen. States would certainly protest such a move, so I do not see a national test as the cause for teachers to leave the profession in the near future. However, they could flee for other reasons. Schools with poor test scores, or even those that generally have good test scores but have one low-performing subgroup, will not make adequate yearly progress. Teachers in those schools will have to suffer the stigma of being associated with failing schools, which can limit future career opportunities. Teachers who remain in schools that consistently fail to make adequate yearly progress face the possibility of being fired or moved to another school.

In addition to these sanctions, imposed by the NCLBA, some state accountability systems also create the possibility that teachers in low-performing schools will be fired, or that they will face the dispiriting prospect of watching their colleagues receive bonuses for good test results. Attaching consequences to failure may be necessary to provide incentives to take the tests seriously. But it raises the costs associated with failure, which may make teaching even less attractive. These tests could potentially chase away some of the nation's best teachers from the profession.

II.4 What Changes Would Make NCLB Better?

Three changes that could improve the NCLB are highlighted below. These changes are the use of growth rates, the use of alternative measures of success, and the setting of realistic goals.

II.4.A Focus on Test Score Growth Rates and not Absolute Levels

It is quite easy to attack a piece of legislation, but it is much more difficult to think of constructive ways to make it better. One of the most common suggestions is to focus on the growth rates of test scores rather than the absolute scores. This would be a system that could reward a school for having amazing improvement even if the school still does not meet the rigorous standards. Reward for improvement could motivate these schools to keep up the good work, as opposed to sanctioning them for not reaching the absolute goals yet. Rather than focus on absolute achievement levels as the basis for school accountability, Ryan argues that the federal government and states should focus on rates of growth. Doing so would not only give a more accurate picture of school quality, and thus provide a fairer basis for school accountability; it would also diminish or eliminate the perverse incentives created by the No Child Left Behind Act.

II.4.B Use Alternative Measures of Success

Test scores should not be the only measure of a school's success because many factors can make a test score inaccurate. For example, there might be ambiguous items in the test itself; the child may not be feeling well on the day the test was taken; the child may have eaten too little, or too much, before the test; or the child may have been excessively anxious about the test. That is especially true for very important tests. Then, too, on any given day a student may simply get lucky and guess correctly most of the time or, in contrast, may have a bad-luck day and come up with many incorrectly guessed answers. Sometimes students, especially teenagers, do not take tests as seriously as educators hope, so they do not try to do their best.

In short, there are numerous factors that can reduce the accuracy of students' test scores. A test score should be seen only as a rough approximation of a student's actual achievement level. A value-added system of accountability would provide a more accurate picture of school quality and would not generate the same perverse incentives that have been unleashed by the NCLBA. No attention has been paid to whether the level of growth demanded is at all feasible, nor is much credit given to schools that accomplish more than average levels of growth but still fall short of the uniform AYP benchmark or the safe harbor target. If the levels of growth required are indeed not feasible, this creates the pressure, described above, to make the tests easier or to lower the scores necessary to be deemed proficient.

All of this suggests that a more appropriate basis for accountability would be one that isolates the quality of the school. This is precisely what so-called value-added methods of assessment attempt to do. Although this method is fairly complex in its details, its basic approach is to focus on achievement gains over time for the same individual or groups of students. The underlying supposition is that if we know how much a student's achievement has improved from one year to the next, we have a much better sense of what value the school has added to the student's academic performance. The reason is fairly simple: Exogenous factors that affect achievement will influence achievement every year that a student is tested. By focusing on gains made or lost by the same students, rather than overall levels of achievement, those exogenous factors are cancelled out. At least that is the idea. In attempting to isolate a school's performance, value-added approaches generally ignore absolute levels of achievement. Because the focus is on achievement growth, schools can be judged effective even if most or all of their students are performing below average, or below an established proficiency level. Value-added systems that take into account factors like race or socioeconomic status exacerbate this problem, at least politically, because they essentially establish different growth targets for different types of students. What makes these systems potentially more accurate measures of school performance is precisely what makes them politically controversial.

Because value-added systems do not create the same unproductive incentives, and do not paint as inaccurate a picture of school quality, on balance they seem preferable to the approach codified by the NCLBA. To be sure, they are an imperfect solution and carry the risk of asking too little of some students and schools. In effect, some schools may give up on the ultimate goal of 100 percent proficiency if they know that they will not be punished for missing AYP goals as long as the subgroups in their school improve (i.e., their value-added is positive).

Rewarding schools for meeting growth targets that are realistic may not provide sufficient incentives for schools to do any more than improve marginally on the status quo. Focusing on growth, moreover, necessarily tolerates different levels of absolute achievement for different students. A school whose fifth graders begin the year reading at the third-grade level and end reading at the fourth-grade level certainly has added value and done a decent job; but those students would still be a year behind in reading. Despite these shortcomings, a value-added approach seems clearly superior to the approach embodied in the No Child Left Behind Act.

How can we set growth targets for students? Because we do not have good information about what schools, teachers, and students are capable of achieving over a certain period of time, any accountability system is bound to ask for too little or too much. If it asks for too little, it may be self-limiting. If it asks for too much, it may be self-defeating, for all of the reasons I have described. It is tempting to suppose that a hybrid system, which combines a focus on absolute targets and growth in achievement, could solve the problem of asking too much and asking too little. But this is a false hope, primarily because new students enter the school system each year.

II.4.C Realistic Goals

Even the goals of NCLB themselves should be revised. It is very idealistic to proclaim that all students should pass their state tests at the proficient level by 2014, but this is a completely unrealistic goal. The federal government should set realistic goals that can be met, keeping in mind that it will never be the case that all students reach proficiency. A fundamental problem is that it is impossible to attain 100 percent proficiency on norm-referenced tests (where 50 percent of students by definition must score below the norm and some proportion must by definition score below any cut point selected), which are the kind of tests that have been adopted by an increasing number of states due to the specific annual testing requirements of NCLB. Using a definition of proficiency benchmarked to the National Assessment of Educational Progress (NAEP), one analyst has calculated that it would take schools more than one hundred years to reach such a target in all content areas if they continued the fairly brisk rate of progress they were making during the 1990s.

III. What Do Test Scores Imply? What to Do?

Once we have these test scores, how should we track and report them? There are several options that will be discussed below. The first way to look at test scores is called the status change model. The second way is the cohort gain model. The third option is the individual gain score model. Lastly, I review the various types of reporting to parents.

III.1 Status Change Model

Now that states must test their students for proficiency, the question is how the government should go about using these scores. One option is the status change model. The status change model for a school is calculated by aggregating performance across tested grades. We classify this model as cross-sectional, because it compares snapshots of the school scores across years (as opposed to tracking the performance changes for individual students across years). The status change model is by far the most common approach to assessing what is happening in schools. One issue with this approach is that we will not see the progress of individual students, which is needed to assess the actual contribution of the school.

III.2 Cohort Gain Model

A better option is the cohort gain model. Accountability is quite different when it focuses on the progress of students over time, which we classify as a longitudinal approach. One such approach is the cohort gain model. This approach tracks the performance of individual cohorts of students as they progress through school. Consider, for example, comparing the scores of third graders in 2001 with those of fourth graders in 2002. With a stable student body (i.e., with no in or out migration for the school), the historical school and nonschool factors would cancel out (because they influence a cohort's performance both in grade 3 and grade 4). The cohort gain score would then reflect what the school contributed to learning in grade 4 plus any differences in idiosyncratic test factors or measurement errors across the two grades. As a result, the cohort model would generally yield a closer measure of school inputs than the status model.

III.3 Individual Gain Score Model

An even more refined approach is the individual gain score model. With this approach, it is actually possible to track the school's contribution to an individual student's learning. This method also avoids the problem of student mobility in the calculation of a school's scores. The influence of mobility suggests an alternative measure for accountability, the individual gain score model. This approach improves on cohort change models because it analyzes data at the student level and can include all students with gain scores, not just the students in the original group. If we follow individual students across grades, any historical influences of families and nonschool factors wash out, and the average of individual gains across grades would more closely reflect school quality for the given grade.
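
To make the differences among the three tracking models concrete, the following minimal Python sketch computes each measure on a small set of student records. Everything in it is a hypothetical illustration: the student labels, the scores, and the function names are invented for this example and are not drawn from Virginia data or from any published implementation.

```python
from statistics import mean

# Hypothetical records: (student, year, grade, score). All values are invented.
RECORDS = [
    ("A", 2001, 3, 400), ("B", 2001, 3, 420), ("C", 2001, 3, 380),
    # In 2002 student C has left the school and student D has arrived.
    ("A", 2002, 4, 430), ("B", 2002, 4, 440), ("D", 2002, 4, 390),
    ("E", 2002, 3, 410), ("F", 2002, 3, 405),
]


def school_mean(year, grade=None):
    """Average score for a year, optionally restricted to one grade."""
    return mean(
        score
        for (_, y, g, score) in RECORDS
        if y == year and (grade is None or g == grade)
    )


def status_change(year0, year1):
    """Cross-sectional: change in the school-wide average between two years."""
    return school_mean(year1) - school_mean(year0)


def cohort_gain(year0, grade0, year1, grade1):
    """Longitudinal by cohort: later grade average minus earlier grade average."""
    return school_mean(year1, grade1) - school_mean(year0, grade0)


def individual_gain(year0, year1):
    """Average gain over only those students tested in both years."""
    first = {name: score for (name, y, _, score) in RECORDS if y == year0}
    second = {name: score for (name, y, _, score) in RECORDS if y == year1}
    tested_twice = first.keys() & second.keys()
    return mean(second[name] - first[name] for name in tested_twice)
```

On these invented records, status_change(2001, 2002) is 15, cohort_gain(2001, 3, 2002, 4) is 20, and individual_gain(2001, 2002) is 25. The three numbers differ only because of who is counted: the status measure mixes in the new third graders, the cohort measure is affected by student C leaving and student D arriving, and the individual measure uses only students A and B, who were tested in both years.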

III.4 Reporting Scores

In addition to the way scores are tracked, there are two ways that an individual student's scores can be reported. The two types of reporting are relative reporting and absolute reporting. A relative score report indicates how a student's performance compares to that of other students. For instance, when the parents of a fourth-grade student learn that she scored at the 67th percentile in mathematics on a standardized achievement test, this means she outperformed 67 percent of the fourth graders who were included in the test's norm group. A norm group is a representative sample of students who completed a standardized test soon after it was developed. A standardized test, by the way, is simply a test that must be administered and scored in a standard, predetermined manner. Because traditional standardized tests invariably use norm-group comparisons to interpret a given student's scores, many educators describe such tests as norm-referenced. That is because, to infer what a child's score actually means, educators must reference that child's score back to the scores made by the test's norm group.

Absolute score reports, on the other hand, focus on what it is that the child can or cannot do. The report might say something such as the student has mastered this skill at a proficient, or perhaps at an advanced, level. Educational tests built chiefly to provide absolute interpretations are sometimes referred to by educators as criterion-referenced because, when interpreting scores, educators reference the student's performance back to a clearly described criterion behavior such as a well-defined skill or body of knowledge. Absolute score reporting schemes are far more useful, instructionally, to both teachers and parents. That is because such reports indicate more clearly what it is the child already knows or, in contrast, what the child needs to work on. Relative scores fail to provide such clarity, for they offer only a comparative picture of the child's attainments. Typically, absolute score reports are supplied either as percent-of-mastery reports or performance labels such as basic, proficient, or advanced.
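
The contrast between the two reporting styles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the norm-group scores and the cut points are hypothetical, and the function names are invented for this example rather than taken from any testing program.

```python
def percentile_rank(score, norm_group):
    """Relative report: percent of norm-group scores strictly below `score`."""
    below = sum(1 for s in norm_group if s < score)
    return 100 * below / len(norm_group)


# Hypothetical cut points, listed from highest to lowest.
CUT_POINTS = [(500, "advanced"), (400, "proficient")]


def performance_label(score):
    """Absolute report: label based on fixed criteria, not on other students."""
    for cut, label in CUT_POINTS:
        if score >= cut:
            return label
    return "basic"


# Hypothetical norm group of ten students.
norm_group = [310, 350, 390, 420, 450, 480, 510, 540, 570, 600]
relative = percentile_rank(455, norm_group)   # depends on the norm group
absolute = performance_label(455)             # depends only on the cut points
```

The same score of 455 yields a relative result that would change if the norm group changed, while the absolute label stays fixed as long as the criteria do, which is the distinction drawn above.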

IV. Virginia's Standards of Learning

Virginia's tests to satisfy NCLB requirements are called the SOLs (Standards of Learning). U.S. citizens repeatedly cite improving schools as the number one priority for policy-makers. For this reason, the Commonwealth of Virginia undertook a major development aimed at creating clear, measurable objectives and promoting accountability of schools. Based on the work of over 5,000 individuals representing parents, teachers, businesses, and education officials, the Board of Education adopted the Standards of Learning (SOL) in 1995. The SOL formulated a set of requirements governing what teachers would teach and what students would be expected to learn. Emphasis would be placed on four core subjects: English, mathematics, science, and history and social science. The SOL would be tested throughout a student's career, with testing in 3rd, 5th, 8th, and 12th grades. Virginia uses graduation rates and attendance rates as other requirements (in addition to testing) necessary to meet AYP standards.

The SOLs significantly impact both schools and students. 2003-2004 is the first year in which unsatisfactory SOL test results will be used to potentially deny diplomas to students who would otherwise qualify based on classes taken and grades received. This is a hefty penalty for a student to face. Graduation from high school is now conditioned on passage of designated end-of-course SOL tests. Students are required to have 22 credits in order to receive a standard diploma. Beginning with the 2000-2001 ninth grade class (graduating class of 2004), six of these 22 credits must be verified credits. In order to receive verified credits, a student must take and pass an end-of-course SOL test. Of the six verified credits required to graduate, two verified credits must be in English and the remaining four credits may be in subjects of the student's choosing. Beginning with the ninth grade class of 2003-2004 (graduating class of 2008) and beyond, each high school student must earn two verified credits in English, one each in math, science, and history, and one in a subject selected by the student. In order to receive an advanced diploma, a student must have 24 total credits, and nine of those must be verified through end-of-course SOL tests.
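
The standard-diploma rules just described can be written out as a short check. The sketch below is a simplified reading of this paragraph only, not the regulation itself: the function name, the subject keys, and the choice to identify a class by the fall year in which it entered ninth grade (2000 for the 2000-2001 class, 2003 for the 2003-2004 class) are assumptions made for illustration.

```python
CORE_SUBJECTS = ("math", "science", "history")


def meets_standard_diploma(total_credits, verified, ninth_grade_fall_year):
    """Check the standard-diploma credit rules as summarized in the text.

    verified: dict mapping a subject name to its number of verified credits.
    ninth_grade_fall_year: fall year the student entered ninth grade.
    """
    if ninth_grade_fall_year < 2000:
        raise ValueError("rules shown here begin with the 2000-2001 ninth grade class")
    if total_credits < 22:
        return False                      # 22 total credits required
    if sum(verified.values()) < 6:
        return False                      # six must be verified credits
    if verified.get("english", 0) < 2:
        return False                      # two verified credits in English
    if ninth_grade_fall_year >= 2003:
        # Later classes also need one verified credit in each core subject;
        # the sixth verified credit may be in any subject.
        return all(verified.get(s, 0) >= 1 for s in CORE_SUBJECTS)
    return True                           # earlier classes: other four are free choice
```

For example, a student with 22 credits whose verified credits are two in English and four in math would satisfy the rule for the class entering in 2000 but not for the class entering in 2003, because the later rule also requires verified credits in science and history.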

Schools face losing their accreditation as a result of poor SOL performance by their students. By the end of school year 2002-03, a 70 percent passing rate on the SOL tests will have to be achieved in order for schools to attain Fully Accredited status. According to the Virginia Administrative Code, schools that are accredited with warning are subject to academic review and monitoring by State DOE staff. In order to help them improve their accreditation status, these schools are also required to develop a three-year school improvement plan with the assistance of parents and teachers.

IV.1 How is Virginia Doing?

Since the SOLs were first introduced, Virginia schools have gradually been improving. Graphs 1, 2, and 3 from schoolmatters.com display the scores over the past few years. Over the course of several years of SOL implementation, SOL test scores and pass rates have increased substantially. In addition, the number of fully accredited schools has increased substantially over the last five years from 118 in 1999-2000 to 1,414 in 2003-2004.

An important question is how much of a yearly increase in scores will be necessary to ensure that Virginia meets the deadline for 100% proficiency. The RaMP Up Target is a gauge of how much progress a school, district, or state must make each year (on average) in order to reach 100% in 2014. Virginia's RaMP Up Target is 2.4 percentage points per year.
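
The arithmetic behind such a target can be sketched as follows, assuming that "on average" means the remaining gap is spread evenly over the remaining years. The function name is invented for this example, and the baseline used in the usage line (76 percent proficient in 2004) is a hypothetical figure chosen for illustration; it is not reported in the source.

```python
def ramp_up_target(current_pct, current_year, deadline_year=2014, goal_pct=100.0):
    """Average percentage-point gain needed per year to reach the goal on time."""
    years_left = deadline_year - current_year
    if years_left <= 0:
        raise ValueError("the deadline has already passed")
    return (goal_pct - current_pct) / years_left


# Hypothetical baseline: 76 percent proficient in 2004, ten years before 2014.
example_target = ramp_up_target(76.0, 2004)
```

Under that hypothetical baseline the gap is 24 percentage points over 10 years, or 2.4 points per year. A lower baseline or a later starting year would raise the required annual gain.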

One additional area of concern is the graduation rate. Virginia devotes only 3.3 percent of its total taxable resources to education, less than 41 other states. Moreover, in 2003 the state spent approximately $16 per middle- and high-school student on dropout prevention, while spending $79,000 per young person incarcerated in juvenile prison. Something is certainly wrong with this picture. Of the students in Virginia who entered 9th grade in 1997, only 74% completed high school within four years, which is above the national average of 70%. Even though this figure is above the national average, it does not mean that there is no problem.

IV.2 Predictors of SOL Scores

When explaining variation in SOL scores, we look to regressions to tell us the impact of certain variables on the scores. These variables are called independent or right hand side variables. It is important to note that not all of these variables are under the control of the state or school policy, so this implies that not every problem is a direct result of school success or negligence. For example, if a student's poverty status contributes to his/her low SOL score, this is a variable that cannot be easily controlled by the school or state. If most of the major predictors of SOL scores happen to be variables such as this, this spells trouble for the school because these are problems that cannot be combated with policy changes. In this section, I will examine three variables which are not directly under the influence of policy makers. These are income level, single-parent households, and race.

Hanushek and Raymond provide an easy way to view educational inputs:

total student achievement = school inputs + other inputs,

where other inputs = ability + family + peers + history + measurement error. Obviously, the school is not the only influence on a student's score. In fact, school variable variation may not even be a major determinant. Because schools with low SOL scores will be considered as failing, it is important to identify what variables impact these standardized test scores.
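
The decomposition above also shows why gain-based measures are attractive. As a sketch, and under assumptions that are mine rather than Hanushek and Raymond's, let $S_{i,t}$ denote the cumulative school input for student $i$ through year $t$, let $O_i$ collect the non-school inputs and assume they do not change between the two test dates, and let $\varepsilon_{i,t}$ be measurement error. The symbols are introduced here only for illustration.

```latex
\begin{align*}
A_{i,t}   &= S_{i,t} + O_i + \varepsilon_{i,t} \\
A_{i,t-1} &= S_{i,t-1} + O_i + \varepsilon_{i,t-1} \\
A_{i,t} - A_{i,t-1} &= \left(S_{i,t} - S_{i,t-1}\right)
                     + \left(\varepsilon_{i,t} - \varepsilon_{i,t-1}\right)
\end{align*}
```

Under these assumptions the persistent non-school term $O_i$ drops out of the gain, leaving the school's contribution during year $t$ plus the difference in measurement errors. If family or peer influences change from year to year, they no longer cancel, so the result holds only to the extent that the constancy assumption does.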

IV.2.A Income Level

One variable not determined by policy is the income level of the child's family. Graph 4 displays the spread of income levels across Virginia. AGI (Adjusted Gross Income) per capita is a measure of the average personal income level of a community's residents. Principals indicated that parental income has an effect on test scores and attributed this relationship to the level of resources available to supplement a child's education in the home. If a child comes from a less affluent family, the family may not be able to afford to buy books, take their child to museums and other educational places, or spend much time instructing their child, as the parents may need to work multiple jobs. The parents themselves may not be very well educated. Schools in communities with a smaller proportion of college-educated graduates tend to also be located in localities that are less affluent, and in school divisions that pay their teachers less and spend less per student on instruction.

A recent study by Peter Tuerk, a doctoral student at the University of Virginia, drew a very strong connection between the percentage of highly qualified teachers in a school and success on the SOLs. The broad patterns which emerge from these studies are: 1) family characteristics such as income and education level of parents are overwhelmingly important to school performance, and 2) increased resource spending (including spending to reduce teacher/pupil ratios, increased faculty salaries, and per pupil expenditures in general) is not a significant explanatory variable. This indicates that pouring money into the problem will not solve anything. The direct relationship between spending and student proficiency is not very strong. This does not mean that money does not matter. It simply means that spending, by itself, does not determine performance. Therefore, we cannot merely compensate for poverty-filled areas by spending more and more money on the schools.

Statistical analysis indicates that poverty (percentage of students participating in the free and reduced-price lunch program), race (percentage of black students), and adult educational attainment (proportion of adults over 25 in the community who hold a college degree) are the three most powerful predictors of SOL test scores. These three factors explain almost two-thirds of the variation in test scores across Virginia divisions.

Why does poverty play such a large role in test scores? According to school principals, children who are raised in poverty or live in communities with a small proportion of college-educated adults may receive less academic support and encouragement from their parents, have less motivation and self-esteem, and receive less exposure to learning outside of school. They also tend to move more frequently and are exposed to more crime and violence in their neighborhoods. The better the teacher, the better the student performance, regardless of the student's background. Researchers disagree over which teacher characteristics matter the most: experience, education background, subject matter knowledge, or unquantifiable traits. But they generally agree that, whatever characteristic is chosen, better teachers tend to be found in middle class schools rather than in high-poverty schools. A clear correlation between poverty and performance does exist.

How does poverty or low income affect test scores? Poor parents often work multiple jobs, which leaves them little time to provide needed support to their children. Many poor parents also ask their children to take on additional family responsibilities that conflict with school, such as baby-sitting younger siblings or working to supplement the family's income. Moreover, parents with low educational attainment tend to have more difficulty assisting their children with their homework, because they may lack the knowledge needed to do so. Teachers believe that this lack of parental support for academic achievement creates a significant obstacle to student performance.

Principals also indicated that the lack of student motivation sometimes results from the parents' view of education: many poor parents who did not complete college or high school may not place much priority on education. These parents who view education negatively, or place little value on it, tend to have low academic expectations for their children, which results in low student motivation. Along with the lack of motivation, another effect of poverty and low adult educational attainment is low self-esteem. In schools with large numbers of students from poor homes or with parents who have limited education, low expectations are reinforced among peers.

IV.2.B Single-Parent Households

In addition, students from single-parent households may not perform as well on the tests. Principals interviewed by JLARC staff advanced three main reasons why having a large number of students coming from female-headed households may be associated with lower test scores. First, students (particularly boys) raised in female-headed households may lack a positive male influence in the home, a factor that can lead to behavioral issues often associated with lower performance. In addition, single mothers frequently have less time to supervise their child's academic progress and help with homework. Finally, schools may have to provide more social support and services to students who come from single-parent homes, which may divert resources from their academic mission.

IV.2.C Race

Race is another hot-button issue when it comes to test scores. Graphs 5 and 6 from schoolmatters.com display the different test scores for subgroups of students. Nonwhite is statistically significant (JLARC) and shows that, for our sample, for every 1% increase in the number of students who are nonwhite, a school will on average experience a 0.097% decrease in the number of students passing the exam. Several factors that may affect academic performance also tend to occur more frequently in places with higher concentrations of black students, and may largely explain the strong relationship that appears to exist between race and SOL test scores. Many of the factors previously described as coinciding with high levels of poverty also coincide with a high proportion of black pupils, which is to be expected because these two demographic characteristics frequently coexist.
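To make the size of the Nonwhite coefficient concrete, the short Python sketch below applies the reported slope of 0.097 to a hypothetical difference between two schools. The function name and the ten-point example are illustrative assumptions and are not taken from the JLARC analysis; the sketch also assumes that the effect is linear over the range considered and that it can be read in percentage points.

```python
# Slope reported in the text: the pass rate falls by 0.097 for each
# one-point rise in the nonwhite share (units assumed to be percentage points).
NONWHITE_SLOPE = -0.097


def predicted_pass_rate_change(change_in_nonwhite_share, slope=NONWHITE_SLOPE):
    """Predicted change in the pass rate, holding all other variables fixed.

    Hypothetical helper for illustration only.
    """
    return slope * change_in_nonwhite_share


# Hypothetical comparison: two otherwise identical schools whose
# nonwhite shares differ by 10 points.
change = predicted_pass_rate_change(10)
print(f"Predicted change in pass rate: {change:.2f} points")
```

Under these assumptions, a ten-point difference in the nonwhite share corresponds to a difference of roughly one point in the pass rate, which shows that the coefficient is statistically significant but small in magnitude.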

IV.3 Widespread Concerns About the SOLs

Many people have reservations about the SOLs. Some of the concerns expressed by principals and teachers are that the SOLs reduce teaching creativity, reduce the opportunity for enrichment activities, create too much pressure for the students, and limit time available for teaching higher-level critical thinking skills. In setting uniform academic standards, state policymakers necessarily take discretion away from teachers. Local teachers, with some direction from local officials, once determined not only how to teach but what to teach.

Another concern is that schools will find ways to game the system instead of actually improving. Accountability systems, no matter how well-designed, will have many incentives embedded within them for schools to game the system. The successful design of accountability systems hinges on the identification and closure of as many of these loopholes as possible. However, the likelihood that schools will find other mechanisms through which they can inflate their observed test performance for the purposes of accountability suggests that all aggregate test scores should be taken with a grain of salt, and not viewed as perfect indicators of school productivity.

IV.4 Gaming the System

In this section, I will examine specific ways that schools can game the system and artificially raise their percentage passing rates. The three instances I discuss here are teaching to the test, the use of school suspensions, and the manipulation of student nutrition.

IV.4.A Teaching to the Test

If schools do not think it is possible for their students to pass the tests, they may attempt to find ways to artificially raise the percentage of their students who are passing. One possibility is so-called teaching to the test, in which schools focus on test-preparation skills and tailor their instruction to subjects included on the examination with high probability. Although the practice is controversial, it is unclear whether teaching to the test is desirable or undesirable, especially when the test content is rigorous and wide-ranging. Another example of behavior that could tend to reduce the informative signal of aggregate test scores involves the assignment of students to special education. Several recent authors, including Cullen and Reback (2002), Figlio and Getzler (2002) and Jacob (2002), have shown that schools tend to respond to accountability systems and testing regimes by classifying more marginal students as disabled. One interpretation of these results is that schools are behaving in an insidious manner, reclassifying potentially low-performing students into test-excluded categories in order to make average test scores look better.

IV.4.B Suspensions

One way that schools could game the system is by giving students suspensions during the testing days so that they can exclude students they believe will perform poorly on the tests. During the testing window, potentially low-performing students could be given harsher punishments (longer suspensions) than potentially high-performing students receive for similar infractions, because the school may desire to have as many high-performing students as possible in school to take the examination but at the same time hopes to have more low-performing students stay home during testing periods. A study in Florida found that while schools always tend to assign harsher punishments to low-performing students than to high-performing students throughout the year, this gap grows substantially during the testing window. Moreover, this testing window-related gap is only observed for students in testing grades. In summary, schools apparently act on the incentive to re-shape the testing pool through selective discipline in response to accountability pressures. In nearly sixty percent of cases, two students suspended for the same offense receive differential suspensions.

IV.4.C Nutrition

Schools can also game the system through the manipulation of the school lunch program. This tactic can be used to artificially increase test scores. Several studies find that glucose improves short-term cognitive ability (Benton and Parker, 1998; and Pollitt, Cueto and Jacoby, 1998). The results indicate a substantial and significant gain in cognitive ability from the consumption of glucose, or empty calories, which provides a boost in energy. The increased calories consumed enhanced scores on a psychological battery, and on verbal intelligence. Using detailed daily school nutrition data from a random sample of Virginia school districts, we find that school districts having schools faced with potential sanctions under Virginia's Standards of Learning (SOL) accountability system apparently respond by substantially increasing calories in their menus on testing days, while those without such immediate pressure do not change their menus. Suggestive evidence indicates that the school districts that do this the most experience the largest increases in pass rates.

First, school lunch programs tend to target precisely the students most needed by schools to succeed in the accountability system: School accountability systems generally evaluate schools based on the fraction of students attaining a particular minimum acceptable threshold. The students who do not attain this threshold are overwhelmingly low-income, and thereby eligible for subsidized lunches. And since school meal take-up is unsurprisingly very high among those identified as eligible to participate in the program, the stomachs (and hence, according to the nutrition literature, the minds) of a very large fraction of the marginal students in a school are reached by the school nutrition program.

Second, this is one margin more easily manipulable by the school than most levers that schools might choose to affect student outcomes. Most school food service directors either have degrees in nutrition or have at least been certified by the American School Food Service Association, and are presumably familiar with the links between nutritional content and cognitive performance, so this margin may well be known to most school nutrition decision-makers. Moreover, unlike some of the methods of gaming more widely suggested, such as reclassification of students as disabled, school nutrition is much more difficult to audit, and importantly, the higher governmental agency that audits school nutrition is the U.S. Department of Agriculture, rather than the state or federal education authorities implementing an accountability system. Therefore, this is one form of gaming that schools can easily get away with, and have.

IV.5 Ethnicity and the Self-Fulfilling Prophecy

Teacher expectations could also play a role in student performance on the exams. The desire to improve scores among all student groups will not work if teachers don't give the same support and encouragement to all of their students. There is reason to believe that expectations matter: The recent work on teacher grading standards (Betts, 1995; Betts and Grogger, 2003; Figlio and Lucas, 2004) indicates that higher standards lead to improved student test scores. The consistent finding from this literature is that teachers take Black students less seriously than they do Whites (Ferguson, 1998).

If teachers don't expect much from their students, what motivation do the students have to try harder? Teachers' stereotypes, both positive and negative, influence children's cognitive performance. Jacobson and Rosenthal (1992) stress the importance of the self-fulfilling prophecy in the classroom. Teachers may expect less from children with names that sound like they were given by uneducated parents. These names, empirically, are given most frequently by Blacks, but they are also given by White and Hispanic parents. David Figlio notes, "I find that teachers tend to treat children differently depending on their names, and that these same patterns apparently translate into large differences in test scores." These results are consistent with the notion that teachers and school administrators may subconsciously expect less of students with names associated with low socio-economic status (names that are disproportionately given to Black children), and these expectations may possibly become a self-fulfilling prophecy.

The estimated relationship between names and test scores suggests that a reasonably large fraction of the Black-White test score gap can be explained by children's naming patterns. Because Black children are considerably more likely to be given names associated with low socio-economic status than are White children, one can calculate that around 15 percent of the Black-White test score gap may be due to differences in names given across the races.

This research about naming certainly paints a gloomy picture for the status of education and equality. I'm sure that many teachers don't even realize that they hold different expectations of students with different types of names. It's not necessarily a problem that can't be combated, though. A required teaching seminar dealing specifically with this issue could help alleviate this problem by, at the very least, bringing it to the attention of teachers and prompting them to take a look at their own expectations of students and how those expectations affect performance.

IV.6 Improving SOL Test Scores

What are the best performing schools doing to achieve good SOL results? This is an important question to ask, as the techniques used in the school districts that perform the best could be used to help other school districts improve. Based on interviews with principals that have achieved academic success, in both high-scoring and successful challenged schools, there appear to be nine effective practices that help promote student achievement on the SOL tests. It is important to note that my research findings may run counter to the suggestions below.

These practices are: (1) strong and stable leadership; (2) an environment conducive to learning; (3) an effective teaching staff; (4) data-driven assessment of student weaknesses and teacher effectiveness; (5) curriculum alignment and pacing; (6) the use of differentiation in teaching to meet the needs of all students; (7) an emphasis on academic remediation; (8) the use of teamwork and collaboration within grades and vertical integration across grade levels; and (9) the maximization of instructional time through attention to the structure and intensity of the school day.

In schools with students who aren't performing well, strict changes are necessary to make a difference. Challenged schools also have to take additional steps to create an environment conducive to learning. They have to focus more on controlling disruptive behavior and imposing discipline. Unlike schools without demographic challenges, these schools also have to go to greater lengths to motivate their students, build their self-esteem, and set high academic expectations for them, because the students may lack all three.

V. What Have Other States and Locals Done?

The main focus of this paper is NCLB and how Virginia has reacted to it, but it is also worthwhile to look at what other states and localities are doing. Specifically, I will examine the systems used in Florida and Chicago.

V.1 Florida

Florida schools have changed their instructional policies, practices, and allocations of resources within and between schools as a result of increased accountability. In November 1998, Florida voters elected Jeb Bush governor, and once he took office in January 1999 Governor Bush worked with the state legislature to implement his A+ education plan that coming summer. Its centerpiece is a system of accountability based largely on student test scores, in which each school would receive a letter grade ranging from A to F. Schools on both ends of the spectrum are affected: students attending schools rated F in two years out of a four-year window are eligible for school vouchers, or opportunity scholarships, that can be used to send a child to a private school or, alternatively, that make a child eligible to transfer to a C or higher-rated public school in the same district or an adjacent district. Schools receiving grades of A (or in subsequent years, increasing their letter grades from year to year) are eligible for financial rewards totaling about $100 per pupil that can be spent for purposes such as hiring teacher aides or providing teacher bonuses. Florida's system of accountability was a major inspiration behind NCLB, which is why many of its features are reflected in the act.

V.2 Chicago

Chicago Public Schools (ChiPS) began their test-based accountability policy in 1997. Jacob notes, "I find that math and reading scores on the high-stakes exam increased sharply following the introduction of the accountability policy." The first component of the policy focused on holding students accountable for learning, by ending a practice commonly known as social promotion whereby students are advanced to the next grade regardless of ability or achievement level. Under the new policy, students in third, sixth and eighth grades are required to meet minimum standards in reading and mathematics on the Iowa Test of Basic Skills (ITBS) in order to advance to the next grade. Students who do not meet the standard are required to attend a six-week summer school program, after which they retake the exams. Those who pass move on to the next grade; those who fail this second exam are required to repeat the grade. In conjunction with the social promotion policy, ChiPS also instituted a policy designed to hold teachers and schools accountable for student achievement. Under this policy, schools in which fewer than 15 percent of students scored at or above national norms on the ITBS reading exam were placed on probation. If they did not exhibit sufficient improvement, these schools could be reconstituted, which involved the dismissal or reassignment of teachers and school administrators.

Looking across all grades and subjects, several broad patterns become apparent. First, students in low-performing schools seem to have fared considerably better under the policy than comparable peers in higher-performing schools. Second, students who had been scoring at the 10th-50th percentile in the past fared better than their classmates who had either scored below the 10th percentile or above the 50th percentile. Students in 1998 were 1.7 percentage points more likely to correctly answer questions involving complex skills in comparison to cohorts in 1994 and 1996. The comparable improvement for questions testing basic skills was 3.9 percentage points, suggesting that under accountability students improved more than twice as much in basic skills as in more complex skills. Unlike in math, the improvements in reading performance appear to have been distributed equally across question types. This analysis suggests that test preparation may have played a large role in the math gains, but was perhaps less important in reading improvement.

VI. Conclusion of Literature Review

Without understanding the literature and national conversation above, it would be very difficult to leap into my regression study. Now that we've reviewed the literature of the field, both federal and state, that knowledge can form the backdrop for my own analysis, which is to come. My work fits into the ideas above because we are trying to determine which variables have a statistically significant effect on Virginia SOL scores. We can also analyze the effects over time for several test years. My work will show which variables are significant. It can answer the question of which variables we should be focusing on with education reform policy. However, this study is not flawless. For starters, the data we use is district-level data, not school-level data, so it won't be as accurate as it could be. It will miss some of the variation which could be captured if we had access to school-level or student-level data. Secondly, if the main variables that show up as significant are variables that policy cannot counteract, the recommendations for combating such problems will not be easy to enact. For example, if poverty is statistically significant, the obvious recommendation is to eliminate poverty, but this is a very unrealistic solution. In that case, suggestions will be given for minor and gradual improvements.

VII. My Research

My research will primarily focus upon updating and replicating the work of others. Specifically, I will follow the work of Brat and Pfitzner (2001). In Brat and Pfitzner's work, there is a discussion of the findings of Hanushek and Krueger. In short, Hanushek found no convincing effect on student performance for teacher/pupil ratios, teacher education, teacher salary, or expenditures per pupil. Graph 7 displays the modest gains in test scores over the last decade despite large increases in spending, and Graph 8 shows U.S. performance and spending compared to other countries.

However, Krueger advises us to be cautious about trusting these results. We must remember that this data is at the district level and not the school level, so there is most certainly some measurement error. However, since the district-level data are more readily available, that is the data I have chosen to utilize for this research.

This portion of the paper will examine my regression results. I will begin by discussing the variables which were not significant in the regressions. Secondly, I will discuss the regressions which include a striking variable which captures first grade aptitude - FirstApt. Lastly, I will examine the regressions excluding this variable. I will also discuss the general conclusions of this work.

VII.1 The Non-Significant Variables

Before we get to the main story, it is important to note that the main story has emerged through a lot of hard work. The construction of the data set was itself the most demanding task in this project. The most important part of this work is making sure that we have properly specified our model. If one variable is omitted, then our findings would be in question. We have carefully followed the literature in this respect and we have been careful to run all possible independent variables in our regressions, not just those found to be significant in the work of others. In reporting our results, therefore, it is important that we leave for others in the field the results of all research work done using this model, not just the punchline. This also ensures that there is integrity to the regressions we do provide in the end. I was not fishing for a story, but instead I let the data do the talking and I am reporting what the data revealed. In addition, I ran several tests to ensure that our eventual model specification is robust.

There are five types of variables that were found to be statistically insignificant in our model, which is why they were excluded from my final regressions. These five types of variables are alg8 (a proxy for rigor), income and poverty variables, a measure of the percentage of rural housing units, a citydummy variable, and a variable that accounts for teachers with master's degrees.

Alg8 was a variable used by Brat and Pfitzner to represent the percentage of students taking Algebra before the 9th grade. It was a proxy for academic rigor. Schools which encouraged and offered Algebra in the 8th grade were considered to be progressive in terms of rigor. Brat and Pfitzner found that this variable played a significant role in all of the regressions at the eighth grade level, and in none of the regressions at other grade levels for 1998. I tested this variable on a variety of the 2002 SOL scores, and it was not significant. Since this data was not updated for the 2002 school year, this could have skewed my results.

As far as income and poverty are concerned, there are many variables that can be chosen as proxies. I tested many such variables on a variety of different SOL testing grades before determining which was the most significant and most robust across all equations. I found that per capita income, median family income, median household income, and the percentage of children between age 5 and 17 in families considered below poverty level were all inferior to the variable that measures the percentage of students who received free lunch in the 2004-2005 school year. Therefore, the school lunch measure was used in my final regressions. It explained the most variation in student test scores.
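The proxy comparison described above can be illustrated with a minimal Python sketch that ranks candidate poverty proxies by the share of variation in pass rates each one explains. This is a simplified, one-variable stand-in for the multivariate regressions actually run in this paper. The function name, the variable names, and every number in the data are hypothetical and were made up for illustration; they are not drawn from the Virginia data set.

```python
from statistics import mean


def r_squared(x, y):
    """Share of variation in y explained by a one-variable OLS fit on x.

    For a single regressor this equals the squared correlation of x and y.
    """
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical district-level data (NOT from the paper).
pass_rate = [88.0, 81.0, 75.0, 70.0, 62.0, 58.0]
candidates = {
    "free_lunch_pct": [12.0, 20.0, 31.0, 38.0, 50.0, 57.0],
    "median_household_income": [71.0, 52.0, 60.0, 45.0, 49.0, 38.0],
}

# Fit each candidate separately and keep the one explaining the most variation.
scores = {name: r_squared(values, pass_rate) for name, values in candidates.items()}
best = max(scores, key=scores.get)

for name, r2 in scores.items():
    print(f"{name}: R^2 = {r2:.3f}")
print("Proxy explaining the most variation:", best)
```

A fuller comparison would also check, as the text describes, that the chosen proxy remains significant and robust across all of the test-score equations rather than relying on a single fit.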

A measure of the percentage of rural housing units was also found to be insignificant. I speculated that having more urban or more rural housing units could affect SOL scores, but the regressions don't back up this theory, so this variable was also excluded from my final regressions.

Closely related to this variable is the citydummy variable, which simply accounted for a Virginia school district being classified as either a city or a county. Richmond, for example, is coded as City, while Henrico and Hanover are coded as County. Does this matter? Like the rural housing measure, this variable was also insignificant in almost all equations.

Another variable that was insignificant in my regressions was a variable that measures the percentage of faculty with post-graduate degrees. This was another variable used in Brat and Pfitzner's work. This variable is from 1991 and 1992, though, so the age of the data could affect the results. Brat and Pfitzner experienced similar results with this variable. They found that the percentage of teachers with master's degrees was statistically significant in only four of the thirty-six regressions. Though many citizens believe this variable is important, the regression results don't fully support this notion.

VII.2 Regressions Including FirstApt

The regression results for each of the ten test scores for 2002 are reported in Table 1. The explanatory variable set includes:

Diploma: percentage of females 25 years and over with a Bachelor's, Master's, Professional School, or Doctorate Degree (from 2000 census)

Lunch: percent of students getting free lunch (school year 2004-2005) (from http://www.pen.k12.va.us/VDOE)

Truancy: percent of students with whom a conference was scheduled after the student had accumulated six absences during the school year (2003-2004) (from http://www.pen.k12.va.us/VDOE)

Spending: per pupil total current spending for support services (2003) (from http://ftp2.census.gov/govs/school/elsec03t.xls)

SPT: students per teacher (2002) (from http://www.schoolmatters.com)