The Relationship between School Funding and Student Achievement
in Kansas Public Schools
Florence Neymotin
Journal of Education Finance, Volume 36, Number 1, Summer
2010, pp. 88-108 (Article)
Published by University of Illinois Press
For additional information about this article
http://muse.jhu.edu/journals/jef/summary/v036/36.1.neymotin.html
The Relationship between School Funding
and Student Achievement in Kansas Public Schools
Florence Neymotin*
abstract
Recent changes in public school educational finance in the state of Kansas are
shown to have had little positive effect on student educational achievement. A
differences structure is used to determine the effect of changes in revenue per
student at the district level on changes in measures of student achievement.
Measures of achievement employed in the analysis are student test scores in math
and reading, as well as various measures of student persistence in schooling.
introduction
During the time period of 1997–2006, the state of Kansas witnessed drastic
changes in its financial approach to educational reform, as documented in the
School District Finance and Quality Performance Act.1 These changes affected
how the state distributes per student financial support to school districts in
Kansas. In particular, the state of Kansas has progressively moved towards a
redistributive system of financing education at the school district level. One
example of this sort of change is increasing school funding based on the number
of at-risk youth.
1. Kansas Legislative Research Department, 2006. Amendments to the 1992 School District Finance
and Quality Performance Act and the 1992 School District Capital Improvements State Aid Law
(Finance Formula Components).
Florence Neymotin is an Assistant Professor of Economics at Kansas State University.
*Funding for this project was provided by a grant from the University of Kansas School of Business
Center for Applied Economics. A previous short paper version of this article is available through the
University of Kansas Business School Center for Applied Economics Technical Report (#08-1205)
online at: http://www.business.ku.edu/_FileLibrary/PageFile/1041/TR08-1205--EducationSpending_
Neymotin.pdf.
The author would like to thank Art Hall, Dennis Weisman, and various anonymous reviewers for
many helpful comments, corrections, and suggestions. Outstanding research assistance was provided by
Urmimala Sen and Rashmi Dhankar. All mistakes are my own.
The current analysis of the amended Act finds different conclusions from
those in an earlier study, which analyzed the Act before its recent amendments.
John Deke examined the effect of the School District Finance and Quality
Performance Act from the 1989 through 1995 school years on the student dropout
rate.2 Deke’s 2003 study focused on the immediate impact of the Act and found
that, during the early 1990s in Kansas, a 20% increase in spending had the effect
of increasing a student’s probability of going on to college by 5%. The present
analysis uses more current data than Deke’s study and is, therefore, unique in
its ability to analyze the effects of the most recent amendments to the School
District Finance and Quality Performance Act on student outcomes.
In contrast to Deke’s results, the current analysis finds only weak evidence
that recent changes to school funding in Kansas had any role in increasing
graduation rates. There is also little evidence that changes in school funding improved student test scores.
The current analysis employs a differencing approach using district-level data
for the years before and after 2005. A differencing approach for this particular
time period is justified due to the large number of amendments to the School
District Finance and Quality Performance Act which occurred in the year 2005.
background and motivational elements
The history of education in the U.S. is one with varied systems of finance and
educational goals of both the educators and the governing legislative bodies.3
Until recently, education in urban schools was primarily seen as achieving the
goal of assimilation and indoctrination of immigrants and other non-traditional
groups—such as American Indians—with the values of “Americans.”4 Education
today, however, is recognized as a force that can yield many other benefits to
the individuals accruing the education, their peers, and to the society they live
in as a whole. In addition to increasing an individual’s earnings and longevity,
increased education is also found to foster increases in civic participation,
decreases in criminal activity, and a general heightening of the productive
capacities of society.5
2. J. Deke. 2003. A study of the impact of public school spending on postsecondary educational
attainment using statewide school district refinancing in Kansas. Economics of Education Review. 22:
275-284.
3. C. Goldin. 1999. A Brief History of Education in the United States. NBER Working Paper Historical
Paper 119. 1-76.
4. R.J. Murnane. 2008. Educating Urban Children. NBER Working Paper no. 13791. 1-45.
5. D. Card and A. Krueger. 1996. Labor Market Effects of School Quality: Theory and Evidence. In
W. Burtless (Ed.), Does money matter? The effect of school resources on student achievement and adult
success. 97-140. Washington D.C.: Brookings Institute Press. See: Jamison, E.A. et al. 2007. The Effects
of Education Quality on Income Growth and Mortality Decline. Economics of Education Review, 26(6):
There is now, and has been for some years, debate regarding the appropriate
measure of educational attainment.6 Two routes have generally been taken in
the economics literature in answering this question. The first route measures
educational attainment with years of completed schooling, and the second
route measures educational attainment in a broader sense via the test scores of
students.7
There are benefits and drawbacks to both of these methods of measuring
educational achievement. One of the clear benefits of using years of schooling
as an outcome measure is that it is more intuitive, easily defined, and the input
is clear—time in school. However, years of schooling as an outcome may not
actually be capturing what it should. It is not clear that actual physical presence in a classroom is equivalent to “learning,” and, similarly, it is unclear whether students whose test scores are higher needed to be physically present in school to achieve success.
A student’s test scores, on the other hand, by measuring not just his or her
physical presence in a classroom but also what has been absorbed, are a more
precise measure of what the student is actually learning. However, test scores
must take factors of the educational process into account which are not solely
school-based inputs. Test scores may reflect inherent abilities of the child, the
time the child puts into studying at home, or parental inputs into education
accrued in the home. For this reason, test scores may be a better measure of
achievement, but for these same reasons will be more difficult to manipulate.8
The current analysis takes the following approach: Test scores are used in
addition to measures of years of schooling “attained”—alternately termed
“persistence” in this article—as the outcomes of interest. In this way, it is possible
to determine how both of the classical measures of educational achievement are
771-788. See: H.M. Levin et al. 2007. The Public Returns to Public Educational Investments in African-American Males. Economics of Education Review. 26(6): 699-708; L. Lochner and E. Moretti. 2004. The Effect of Education on Crime: Evidence from Prison Inmates, Arrests, and Self-Reports. The American Economic Review. 94(1): 155-189; K. Milligan et al. 2004. Does education improve citizenship? Evidence from the United States and the United Kingdom. Journal of Public Economics. 88: 1667-1695.
6. E.A. Hanushek. 1986. The Economics of Schooling: Production and Efficiency in Public Schools.
Journal of Economic Literature. 24(3): 1141-1177.
7. D. Card and A. Krueger, 1996, op. cit.; D. Card and A.B. Payne. 2002. School Finance Reform, the Distribution of School Spending, and the Distribution of Student Test Scores. Journal of Public Economics. 83(1): 49-82. Card and
Payne note that several appropriate measures of educational achievement to use are student test scores and
measures of persistence such as the graduation rate or post-secondary attendance or college graduation
rates. This article is also particularly appropriate to the current analysis as it looks at changes in district
finances and how they affect student outcome measures. There are various papers using test scores as
the relevant outcome measure in the literature dealing with the effects of school finances on educational
achievement. For one example see: J. Guryan. 2001. Does Money Matter? Regression-Discontinuity
Estimates from Education Finance Reform in Massachusetts. NBER Working Paper 8269. 1-54.
8. Test scores have two additional benefits. They are a factor that is more variable in what employers
see, i.e. there are many individuals with the same level of schooling but different test scores. Test scores
are also considered more integral to increasing levels of societal production.
affected by changes in per pupil revenues.
Just as there are two avenues to measuring educational achievement in
schooling, there have been two prominent avenues for determining how
educational outcomes can be manipulated, which are through (1) changes in
total revenues per student and (2) changes in class size. The current analysis
focuses on the first avenue—namely it determines the effect of per pupil revenues
on measures of student achievement.9 Part of the reason for the popularity of this method is the availability of data on revenues per student through state departments of education and the Census Bureau.
The second possible way to measure school resources is through measuring
class size. This approach has encountered some obstacles in the literature due to
inadequate methods for ensuring that there is exogenous variation in class size.10
The main result of these studies is that reducing class size is indeed an effective method for increasing student educational achievement.11 Reduced class size is
particularly effective at helping students who are either “at risk”—experiencing
some type of behavioral problems—or in younger grades and hence, easier to
influence. Although this article does not employ class size data in the analysis,
this is a possible avenue for additional or future research.
In terms of policy, the first step to determining whether total revenues per
student affect student achievement is to document a relationship between
policy related to school funding and actual changes in the amount of funding
schools receive.12 This issue is complicated by the fact that individuals often
sort themselves into neighborhoods as a result of changes in school funding
and possibly either counteract or exacerbate the intended effects of changes in
9. One might be tempted to use formula grants or a more specific breakdown of student funding in
schooling when looking at the effect of finances on achievement. There are several problems with using
this approach. The first problem is that there is often a complicated relationship between the various
subtypes of funding. The second related problem is that funding is often allocated based on formula
grants, and when the funding runs out, the actual allocation may be somewhat haphazard. For this
reason, using total revenues or expenditures per student does not get into the minutiae and so avoids
these particular pitfalls.
10. C. Hoxby. 2000. The Effects of Class Size on Student Achievement: New Evidence from Population
Variation. Quarterly Journal of Economics. 115(4): 1239-1285.
11. E. Lazear. 2001. Educational Production. Quarterly Journal of Economics. 116(3): 777-803.
12. One of the earliest studies in this genre was an analysis of the effects of educational finance reform
in California on actual school finances. The major changes affecting California educational finance at
the time of the study were the advent of Proposition 13 and the Serrano ruling. R. Fernandez and R.
Rogerson. 1999. Education Finance Reform and Investment in Human Capital: Lessons from California.
Journal of Public Economics. 74(3): 327-350. See also: S.E. Murray et al. 1998. Education-Finance Reform
and the Distribution of Education Resources. The American Economic Review. 88(4): 789-812. It is
also possible to answer a similar question using methods of calibration rather than employing actual
documented policy changes. The authors of the study who do this find that switching from a system
of state finance of education to one of purely local educational finance would increase school district
educational spending. R. Fernandez and R. Rogerson. 1998. Public Education and Income Distribution:
A Dynamic Quantitative Evaluation of Education-Finance Reform. The American Economic Review.
88(4): 813-833.
funding.13 After showing that there is an effect of policy on changes in revenues
per student, some studies went further and looked at the effect of these funding
changes on student outcomes—with the aforementioned methodology.14 In
the case of Kansas, the time period of interest (1997–2006) did not witness the
enactment of any new major legislation affecting its school funding practices.
It did, however, witness a large number of amendments to its School District
Finance and Quality Performance Act. This is one of the major education acts
in Kansas, whose goal is the redistribution of finances to school districts to
equalize educational resources.15 Of the many amendments that were enacted to
the School District Finance and Quality Performance Act, those that went into
effect targeting at-risk students were perhaps the most important amendments
for the purposes of the current study.16
Deke’s 2003 study is the only prior analysis of the effects of the School District Finance and Quality Performance Act. It focuses on the years 1989–1995 and the initial impact of the act, and therefore does not account for the many amendments enacted over the last 13 years. The current analysis instead seeks
to determine how recent changes in per pupil revenues have caused changes
in measures of student achievement, that is, how recent changes in education
finance have affected the educational achievement of Kansans.17 The data
13. C. Hoxby. 2001. “All School Finance Equalizations Are Not Created Equal.” Quarterly Journal of Economics. 116(4): 1189-1231. The issue of student sorting by location is not directly considered; however, it should be noted that the period considered is one in which only amendments were made to the main legislation on school financing, with no new major legislation enacted, and individuals are less likely to sort in direct response to an amendment than to a larger piece of legislation. The related concern that individuals sort on school quality irrespective of knowing about the enactment of legislation is not formally treated in the analysis; however, an argument addressing the potential biases resulting from the estimation, including the sign of the bias, is made in footnote 28. It would represent an interesting extension of the current analysis to incorporate a methodology allowing for sorting on school quality, or to conduct an analysis specifically targeted to urban versus rural populations, where the issues of sorting would vary, the urban population being presumably more stable.
14. See J. Guryan. 2001. Guryan exemplifies the logic of examining school finance reform. In his work on Massachusetts schools, he first calculates the fraction of funding passed through to schools as a result of a particular Massachusetts policy change and then looks at the effect of this change in funding on student test scores. Guryan employs a regression discontinuity design with a simple pre- and post-reform structure to test for the effect of policy changes on changes in school funding. Because policy changes in Kansas were more gradual and cumulative, with an initial change in funding followed by several later changes and amendments to the funding structure, the current analysis is not the right venue for that technique.
15. In September 2006, the Kansas Legislative Research Department published a document detailing in
great specificity the particular changes made to the School District Finance and Quality Performance Act.
As can be seen from this document, and as noted later on in this article, the majority of significant changes
to the act were enacted for the 2005 school year. See: Kansas Legislative Research Department, 2006.
16. In conversations with Kansas legislators, it became apparent that the most important and recent
changes to this act were in targeting special education students and at-risk student populations. Because
of the nature of the current analysis, targeting at-risk students will clearly affect empirical results.
17. Notice that the effects of legislation on changes in per pupil revenues are not tested per se. Although
trends in revenues per student (unadjusted for other factors in districts changing over time) can be seen,
employed in the current analysis comprise the longest time frame which can
reasonably be used to capture the effects of the amendments to the act rather
than the act itself.18
data
Information on school district level measures of student achievement, including test scores, graduation rates, and dropout rates, comes from the Kansas State Department of Education (KSDE).19 Subject test scores used are
math, reading, science, and social studies. Specifically, the test scores provide
information on the percentage proficient in each grade. Information on school
district characteristics, revenues per student, as well as an alternative measure
of student achievement—the diploma rate—come from the National Center
for Education Statistics (NCES).20 Multiple measures of student persistence are
employed in the analysis because these measures were compiled by different
agencies and will, therefore, serve as a robustness check in the analysis. The
population for the analysis includes all school districts in the state of Kansas.
The general period covered in this analysis is the 1997–2006 time period.
While data for student persistence—dropout rates, fraction receiving diplomas,
and the graduation rate—are available for the full time period in question, data
for reading and math test scores are only available for the years 2004–2006.
Data for test scores in science are available for the years 2003 and 2005, and in social science for 2005 only. The reason for the piecemeal nature of the
test-score data is that test scores were not uniformly administered each year,
and only certain test scores were administered in each particular year for all
school districts. It is also not possible to employ earlier test-score data due to
a change in the nature of testing in Kansas and tests prior to 2004, which are
this is not a formal element of the analysis. The main purpose of discussing the nature of legislative
processes over this time period is to motivate the empirical analysis of the effect of total revenues per
student on student achievement, rather than constituting a separate part of the analysis interesting in its
own right. In some sense, due to the gradual nature of the enactment of the amendments, it would be
much more difficult to determine the effect of amendments on actual changes in revenues per student.
This is in contrast to a more structural approach or a regression discontinuity approach as discussed in
earlier papers.
18. In order not to capture the effects of the act, but rather only the effects of its amendments, the data were chosen to begin with the year 1997. The latest data currently available to allow for a consistent end
date for all measures was the year 2006. For the test-score data, not all tests were given in each of the
relevant years; however, the two main tests (reading and mathematics) were given in the years 2004,
2005, and 2006 so the differencing portion of the analysis uses the 2004 and 2006 years for the analysis.
19. To be precise, the fraction of students receiving diplomas is used as the outcome measure of persistence. This measure is constructed for school district k in year l as:
Fraction_Diplomas_{k,l} = (# Diplomas_Awarded_{k,l}) / (Enrollment_Grade_12_{k,l})
20. CPI estimates are used to correct total revenues per student for inflation. The year 1997 is
arbitrarily chosen to have the basket price of 100.
not comparable in nature to tests in 2004 and afterwards. For this reason, while
the test score portion of the analysis provides an interesting counterpoint to the
persistence rate analysis, it is the persistence rates that represent the longer time
frame upon which to base results of the analysis and are, therefore, the more
interesting portion of the analysis.
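The persistence measure of footnote 19 and the inflation adjustment of footnote 20 can be sketched in a few lines. The numbers below are purely illustrative (the CPI levels are approximate annual CPI-U values and the district counts are invented), not the actual KSDE or NCES data:

```python
# Illustrative stand-ins for the KSDE/NCES inputs described above.
diplomas_awarded = 94        # diplomas awarded in district k, year l
enrollment_grade_12 = 100    # grade-12 enrollment in district k, year l

# Footnote 19: Fraction_Diplomas = diplomas awarded / grade-12 enrollment.
fraction_diplomas = diplomas_awarded / enrollment_grade_12

# Footnote 20: deflate nominal revenues with the CPI rebased so 1997 = 100.
cpi_raw = {1997: 160.5, 2006: 201.6}     # approximate CPI-U annual averages
cpi_base = {yr: 100 * v / cpi_raw[1997] for yr, v in cpi_raw.items()}
nominal_revenue_per_student = {1997: 7_500, 2006: 11_800}
real_revenue_per_student = {
    yr: 100 * nominal_revenue_per_student[yr] / cpi_base[yr]
    for yr in nominal_revenue_per_student
}
print(fraction_diplomas, real_revenue_per_student)
```

With these hypothetical inputs the 2006 figure deflates to roughly $9,400 in 1997 dollars, in line with the magnitudes reported in Table 1.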
methodology
Estimation begins with a cross-sectional Ordinary Least Squares (OLS)
regression analysis of the effect of total revenues per student on measures of
persistence after including school district characteristics as control variables.
Regression analysis is employed because it allows for a determination of the independent effect of each of the right-hand side variables on the left-hand side variable. In this analysis, the left-hand side, or “outcome,” variable in each of the various cross-sectional regressions is one of the different measures of student achievement and persistence, while the right-hand side variables are total revenue per student and school-district characteristics.
Specifically, the model employed is for school district k in year l:
Persistence_{k,l} = β0 + β1 TRS_{k,l} + β2 DistSchls_{k,l} + β3 DistPop_k + ε_{k,l}   (1)
Where Persistence is either the dropout rate, the fraction receiving diplomas,
or the graduation rate; TRS is total revenues per student; DistSchls are the
variables describing the school district, i.e. the pupil teacher ratio, the fraction
on free lunch, the number of full-time equivalent teachers, and total enrollment;
DistPop are the variables describing the population in the school district, i.e. the
fraction of 5–17 year olds under the poverty line, median family income, the
fraction of males and of females in the labor force, and fraction of individuals
with varying levels of education.21
The standard assumption used throughout the analysis is that a school
district’s quality is proxied by its observable characteristics.22 Since student
achievement is affected by school district quality, and the measure of total
revenues per student will also be related to school district characteristics, leaving
school district characteristics out of the regression will cause a biased measure
of the relationship between persistence and total revenues per student. It is for
this reason that characteristics of school districts are essential to include in the
21. As seen in Table 2, the education variable is broken into the fraction of individuals who have (1)
a college degree or higher, (2) associate degree, (3) high school diploma or GED, (4) between 9 and 12
years schooling but no diploma, (5) between 1 and 8 years of schooling, (6) no schooling—the omitted
category.
22. The chosen set of school district characteristics are standard in their use in the literature and
should, in all likelihood, capture the major characteristics of school districts that are relevant for
inclusion in the present analysis.
The Relationship between School Funding and Student Achievement
95
analysis to determine the independent effect of total revenues per student on
student achievement, represented by β1 in the previous regression. In this and
all later parts of the analysis heteroskedasticity-robust standard errors are used.
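The cross-sectional estimation of equation (1) with heteroskedasticity-robust standard errors can be sketched directly in NumPy. The data below are simulated (with a single DistSchls control and no DistPop block, for brevity); the coefficient values are invented and are not the paper's estimates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in for a cross-section of Kansas districts.
n = 300
trs = rng.normal(8.0, 1.0, n)             # total revenues per student ($000s)
pupil_teacher = rng.normal(13.0, 1.5, n)  # one DistSchls control for brevity
persistence = 0.90 + 0.005 * trs - 0.002 * pupil_teacher + rng.normal(0, 0.02, n)

# Design matrix with a constant term.
X = np.column_stack([np.ones(n), trs, pupil_teacher])
y = persistence

# OLS coefficients: beta_hat = (X'X)^{-1} X'y.
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta

# Heteroskedasticity-robust (HC1) covariance:
# (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}, scaled by n/(n-k).
k = X.shape[1]
meat = X.T @ (X * resid[:, None] ** 2)
cov = XtX_inv @ meat @ XtX_inv * n / (n - k)
robust_se = np.sqrt(np.diag(cov))

print("beta_1 (effect of TRS):", beta[1], "robust SE:", robust_se[1])
```

In practice the same regression would typically be run with a packaged estimator (e.g. a robust-covariance option in an OLS routine); the explicit sandwich formula is shown here only to make the robust-standard-error step concrete.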
When using school district test scores—percentage proficient in the grade—
as the outcome measure, the same general structure was employed. One change, however, is that regressions are run separately for each year i and grade g, across districts k. Specifically,
TestScore_{k,i,g} = β0 + β1 TRS_{k,i} + β2 DistSchls_{k,i} + β3 DistPop_k + ε_{k,i,g}   (2)
Although the cross-sectional regressions are used to determine the initial
relationship between measures of school funding and measures of student
achievement and persistence, they will clearly produce biased regression
coefficients, i.e. despite the fact that controls for school district quality have
been included, the true independent effect of total revenues per student on test
scores will not have been captured. The reason is that—among other problems
of selection bias—parents will choose a location to live in based on their own
socioeconomic status (SES) and desire for their children to do well in school.
In order to alleviate this issue of selection bias, a differences structure is next
employed to determine how changes in school funding are related to changes in
student achievement. To explain the meaning of the differences regression in this
context, it is useful to contrast the baseline OLS regression just employed to the
differences OLS regression which is next employed. In the baseline regression, the left-hand side variable is the level of student achievement in the district, and on the right-hand side are the levels of total revenues per student and school district characteristics. The difference regression instead uses the change in student achievement between an initial and a final year—for persistence this is the 1997–2006 change and for test scores the 2004–2006 change—as the left-hand side variable, with changes in total revenues per student and changes in school district characteristics as the right-hand side variables. The initial level of school district characteristics—1997 levels for persistence, and 2004 levels for test scores—is also included to allow for a nonlinear relationship between
student achievement and school quality. The described methodology is parallel
in structure to that used in J. Deke (2003). The reasoning for using this type of
analysis is as follows: Major changes to school funding which occurred during
this time period were due to amendments to the School District Finance and
Quality Performance Act. This act was redistributive in nature so an increase in
school funding would have occurred for lower-performing schools. In this way,
focusing on a “flow” analysis will alleviate the issue of selection bias since parents
will not be selecting school locations conditional on the same characteristics
which are causing schools to see increases in total revenues per student.23
The full regression used to determine the effect of total revenues per student
on persistence for school district k between years i and j is thus:
ΔPersistence_k = γ0 + γ1 ΔTRS_k + γ2 ΔDistSchls_k + γ3 DistSchls_{k,i} + γ4 DistPop_k + u_k   (3)
where
ΔTRS_k = TRS_{k,j} – TRS_{k,i}
ΔPersistence_k = Persistence_{k,j} – Persistence_{k,i}
ΔDistSchls_k = DistSchls_{k,j} – DistSchls_{k,i}
In the regressions, the initial level of DistSchls is used to allow for the possibility
of a nonlinear relationship between DistSchls and Persistence, i.e. initial levels of
school district characteristics are employed in the analysis. Only the level—and
not the differenced amount—of DistPop is used because it comes from Census
2000 data where only one year of data is available.
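The differencing step of equation (3) can be sketched as follows, again on simulated data (one DistSchls control, DistPop omitted, and invented coefficient values rather than anything estimated from the Kansas files):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300

# Simulated district panel for the initial year i=1997 and final year j=2006.
trs_97 = rng.normal(7.5, 1.0, n)
trs_06 = trs_97 + rng.normal(1.9, 0.5, n)        # funding generally rose
schls_97 = rng.normal(13.5, 1.5, n)              # one DistSchls control
schls_06 = schls_97 + rng.normal(-1.0, 0.5, n)
pers_97 = 0.94 + rng.normal(0, 0.02, n)
pers_06 = pers_97 + 0.004 * (trs_06 - trs_97) + rng.normal(0, 0.02, n)

# First differences, as in equation (3).
d_pers = pers_06 - pers_97
d_trs = trs_06 - trs_97
d_schls = schls_06 - schls_97

# Right-hand side: the changes plus the 1997 LEVEL of DistSchls, which
# allows a nonlinear persistence/quality relationship as described above.
X = np.column_stack([np.ones(n), d_trs, d_schls, schls_97])
gamma = np.linalg.lstsq(X, d_pers, rcond=None)[0]
print("gamma_1 (effect of a change in TRS):", gamma[1])
```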
Once again, test-score information follows the same relationship for the
outcomes of reading and math proficiency. Specifically, for grade g in school
district k between years i and j:
ΔTestScore_{g,k} = γ0 + γ1 ΔTRS_k + γ2 ΔDistSchls_k + γ3 DistSchls_{k,i} + γ4 DistPop_k + u_{g,k}   (4)
where
ΔTRS_k = TRS_{k,j} – TRS_{k,i}
ΔTestScore_{g,k} = TestScore_{g,k,j} – TestScore_{g,k,i}
ΔDistSchls_k = DistSchls_{k,j} – DistSchls_{k,i}
A limitation of the analysis is the short timeframe for which the differencing
analysis is available for test-score data. The differencing regressions where test
scores are an outcome, therefore, serve as an interesting additional result; however,
the more consistent measures of student achievement in the differencing portion
of the analysis are student persistence.24
Robustness Analysis
To allow for the possibility that results were affected by censoring of observations,
all regressions were additionally run using a Tobit regression structure. The initial
assumptions regarding variables to be included in the analysis are all the same
as in the previous sections; however, now the additional condition of censoring
is allowed for in the analysis. Tobit regression is applicable in cross-sectional
23. If anything, the effect of per pupil revenues on achievement will be underestimated since
presumably higher achieving students are moving away from increasing school funding during this
time period. This will be true if the only changes in funding were due to the amendments. This should
be kept in mind when analyzing the final results of the analysis.
24. It is also possible that the act will go into effect with a time lag and, for this reason, the effect of the
last year or two of amendments may not fully show up if they require more than a year or two to fully
take effect. This is a possibility that should be noted in any estimation of results of an educational act
and is again noted in the conclusion.
regressions as well as in difference regressions in keeping with the structure in
the earlier section. A Tobit analysis was especially relevant in the cases where top
censoring might be suspected. Generally, the Tobit regression structure assumes
there is a latent outcome variable y* such that the true regression is y* = Xβ + ε, with ε | X ~ N(0, σ²). In reality, only y is observed, where y = max(0, y*) in bottom-censored regressions and, alternatively, y = min(1, y*) in top-censored regressions such as those in the present analysis.
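A minimal maximum-likelihood sketch of this top-censored Tobit model, on simulated data, is shown below. A general-purpose optimizer is used since no canned Tobit routine is assumed available; all parameter values are invented for illustration:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 400

# Latent model y* = X beta + eps, eps ~ N(0, sigma^2); we observe
# y = min(1, y*), the top censoring relevant to rates bounded at 1.
X = np.column_stack([np.ones(n), rng.normal(0, 1, n)])
beta_true, sigma_true = np.array([0.95, 0.05]), 0.05
y_star = X @ beta_true + rng.normal(0, sigma_true, n)
y = np.minimum(1.0, y_star)
censored = y >= 1.0          # observations stuck at the upper bound

def neg_loglik(params):
    b, log_s = params[:-1], params[-1]
    s = np.exp(log_s)        # parameterize log(sigma) to keep sigma > 0
    xb = X @ b
    # Uncensored points contribute the normal density; censored points
    # contribute the probability mass P(y* >= 1).
    ll_unc = norm.logpdf(y, loc=xb, scale=s)[~censored].sum()
    ll_cen = norm.logsf(1.0, loc=xb, scale=s)[censored].sum()
    return -(ll_unc + ll_cen)

res = minimize(neg_loglik, x0=np.array([0.9, 0.0, np.log(0.1)]), method="BFGS")
print("Tobit estimates:", res.x[:-1], "sigma:", np.exp(res.x[-1]))
```

A plain OLS fit on the censored y would be biased toward zero for districts clustered at the bound, which is why the censored likelihood is the relevant robustness check here.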
results and findings
Summary Statistics
Table 1 displays trends over time in total revenues per student, student persistence,
student test scores, and school district educational characteristics. Overall,
Table 1 paints a picture of school districts increasing funding and resources for students while also experiencing higher rates of persistence and achievement on
test scores. To conclude that these relationships were more than correlations,
however, requires significant analysis beyond these simple summary statistics.
It is for this reason that after describing the summary statistics in Table 1 and
Table 2 in more detail, results from the regression analysis allowing for a more
in-depth analysis will be discussed.
In Table 1, inflation-adjusted total revenues per student increased over this
time period from approximately $7,500 per student in 1997 to $9,400 per student
in 2006. The one clear exception to the upwards trend in per pupil revenues was
a sharp decrease in the year 2004.
Measures of persistence also exhibit a generally consistent trend of improvement
over time, with the fraction receiving diplomas and the graduation rate
increasing and the dropout rate decreasing. For instance, the
graduation rate rose from 89.7% in 1997 to 91.3% in 2006. Similarly, the fraction
receiving diplomas went from 94% in 1997 up to 97% in 2006. Dropout rates
also display a generally downward trend over the period 1997–2004, going from
1.7 dropouts per 100 students in 1997 to a low of 0.92 dropouts per 100
students in 2004; however, they then increased sharply in the 2005–2006
period. In terms of test scores, the data for math and reading are somewhat
more limited, containing information only for the 2004–2006 period. Over
this three-year window, student test scores in both math and reading were
generally increasing over time for all relevant grades.
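The revenue trend described above can be checked directly from the Table 1 series (values in thousands of dollars, as rounded in the table):

```python
# Inflation-adjusted total revenues per student, 1997-2006 (Table 1, $000s).
rev = [7.5, 7.7, 7.8, 7.9, 7.8, 8.3, 8.7, 8.4, 8.7, 9.4]

growth = rev[-1] / rev[0] - 1  # overall real growth across the decade (~25%)
down_years = [y for y, prev, cur in zip(range(1998, 2007), rev, rev[1:])
              if cur < prev]   # years in which revenues fell from the prior year
```

At the table's level of rounding, revenues fell in 2001 (a slight dip) and in 2004 (the sharp decrease noted above).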
During the 1997–2006 time period, school districts experienced a movement
towards higher enrollment levels, more full-time equivalent teachers, and a lower
Table 1. KS School District Characteristics 1997–2006

Panel A: School District
Information                      1997   1998   1999   2000   2001   2002   2003   2004   2005   2006
Revenues Per Student
  (in thousands)                  7.5    7.7    7.8    7.9    7.8    8.3    8.7    8.4    8.7    9.4
Pupil Teacher Ratio              13.5   13.3   13.1   12.9   13.0   13.2   13.1   13.0   12.8   12.3
Full Time Equivalent Teachers   103.3  104.8  108.0  107.3  108.4  107.4  107.8  109.3  111.9  115.2
Total Enrollment                 1536   1539   1538   1536   1543   1548   1556   1562   1564   1599
Fraction Free Lunch              0.33   0.33   0.21   0.22   0.22   0.24   0.25   0.26   0.26   0.26

Panel B: Persistence
Information                      1997   1998   1999   2000   2001   2002   2003   2004   2005   2006
Dropout Rate                      1.7    1.6    1.5    1.4    1.3    1.0    0.9    0.9    1.5    1.6
Graduation Rate                  89.7   89.8   89.9   90.6   91.1   89.2   90.0   91.7   91.1   91.3
Fraction Diplomas                0.94   0.94   0.94   0.94   0.94   0.93   0.94   0.95   0.95   0.97

Panel C: Proficiency Rates
(1997–2003 not available)        2004   2005   2006
Grade 4 Math                     82.7   87.5   83.5
Grade 7 Math                     68.0   72.1   72.5
Grade 10 Math                    52.0   54.8   61.0
Grade 5 Reading                  71.7   78.1   79.8
Grade 8 Reading                  75.4   78.8   80.4
Grade 11 Reading                 63.0   65.8   79.1
pupil-teacher ratio. The fraction on free lunch declined from the higher levels
(33%) of the 1997–1998 period, but rose somewhat from the 1999–2001 lows
(21–22%). The number of full-time equivalent teachers went from a low of 103.3
in 1997 to a high of 115.2 in 2006, while average enrollment went from a low
of 1,536 students in 1997 to a high of 1,599 students in 2006.
The remaining information on school districts used in this analysis
comes from Census 2000 data. Table 2 displays these characteristics of Kansas
school districts. Just over 10% of students in Kansas school
districts were living below the poverty line, with a low of approximately 0% and
a high of almost 40% living in poverty. The median family income in the typical
Kansas school district stood at approximately $44,000, with some districts having a
median income as high as $100,000 and others as low as $31,000. It is also true
that approximately 80% of Kansans had at least a high school diploma, and 72%
of men were participating in the labor force, as were 58% of women. These
local characteristics of school districts are used as controls in the
following regressions.
Table 2. Kansas School District Characteristics (Census 2000)

                                  Mean    S.D.  Minimum  Maximum
Fraction of Children in Poverty   0.11    0.06     0.00     0.40
Median Family Income             44005    8724    31100   102987
Fraction Males in Labor Force     0.72    0.06     0.37     0.92
Fraction Females in Labor Force   0.58    0.06     0.44     0.73
Fraction with College Degree      0.24    0.09     0.07     0.79
Fraction with Associates Degree   0.08    0.03     0.02     0.16
Fraction Who are HS Grads         0.47    0.08     0.12     0.67
Fraction 9-12 Years School        0.12    0.04     0.02     0.24
Fraction 1-8 Years School         0.08    0.05     0.01     0.40
Regression Analysis
Table 3 displays cross-sectional regressions where each of the outcome variables
chosen consists of various measures of persistence in panel A and various test
scores by grade in panel B. The panel A regressions are run for each of the years
1997–2006 and the panel B regressions are run for selected years in the 2003–
2006 range with the years chosen depending on availability of a particular test
in a given year. The regressions progressively add in the DistPopul and DistEduc
controls—labeled CENSUS INFO and SCHOOL INFO respectively. This means
that only the last column of each persistence regression for each year corresponds
to equation (1), and only the last column of each test-score regression for each
grade-year combination corresponds to equation (2). Due to this structure,
Panel A contains the results from 90 separate regressions, while Panel B contains
the results of 82 separate regressions, for a total of 172 separate regressions
in Table 3.
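The nested cross-sectional specifications (revenues only, then adding CENSUS INFO, then SCHOOL INFO), estimated with heteroskedasticity-robust standard errors as in the tables, can be sketched as below. The data are simulated and the variable names are hypothetical stand-ins; the helper is a generic HC1 implementation, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
rev = rng.normal(8, 1, n)          # revenues per student ($000s), simulated
census = rng.normal(size=(n, 2))   # stand-ins for the CENSUS INFO controls
school = rng.normal(size=(n, 2))   # stand-ins for the SCHOOL INFO controls
grad = 0.85 + 0.005 * rev + rng.normal(scale=0.03, size=n)  # outcome

def ols_hc1(y, X):
    """OLS coefficients with HC1 (robust) standard errors."""
    n_obs, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    u = y - X @ b
    meat = (X * u[:, None] ** 2).T @ X      # sum of u_i^2 * x_i x_i'
    V = n_obs / (n_obs - k) * XtX_inv @ meat @ XtX_inv
    return b, np.sqrt(np.diag(V))

specs = [
    np.column_stack([np.ones(n), rev]),                  # revenues only
    np.column_stack([np.ones(n), rev, census]),          # + CENSUS INFO
    np.column_stack([np.ones(n), rev, census, school]),  # + SCHOOL INFO
]
results = [ols_hc1(grad, X) for X in specs]
```

Each entry of `results` pairs the coefficient vector with its robust standard errors, mirroring one column of a Table 3 panel.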
Panel A of Table 3 displays the effects of total revenues per student on the
three measures of student persistence in the data—namely the fraction receiving
diplomas, the dropout rate, and the graduation rate. These regressions suggest that
an increase in total revenues per student improves student persistence
(i.e., lower dropout rates alongside higher graduation rates and fractions receiving
diplomas), as evidenced by the generally positive coefficients on total revenues
per student in the regressions for the fraction receiving diplomas and graduation
rates, and the negative coefficients in the regressions where the dropout rate is the
left-hand-side variable. This provides some slight evidence of positive effects
of revenues on persistence, but these effects do not appear to survive the
inclusion of district controls. One anomaly is the negative and significant relationship
Table 3. Effect of Total Revenues Per Student on Persistence and Test Scores
Panel A: Persistence 1997–2006 (outcomes: Receiving Diplomas, Dropout Rate, Graduation Rate; specifications progressively add CENSUS INFO and SCHOOL INFO)
Panel B: Test Scores, Selected Years (Mathematics Grades 4, 7, and 10; Reading Grades 5, 8, and 11; Science Grades 4, 7, and 10; Social Science Grades 6, 8, and 11)
[Coefficient estimates could not be recovered legibly from this copy.]
Note: Test score regressions are run at the grade level for each school district while persistence regressions are run at the school district level. Robust standard errors are employed in all regressions. CENSUS INFO includes district-level averages for the fraction of women and men in the labor force, median family income, the fraction of children living in poverty, and the fraction of individuals in each of five educational groups (1–8, 9–12, High School Degree, Associate Degree, College or more). SCHOOL INFO includes the pupil teacher ratio, the school enrollment, the number of full time equivalent teachers, and the fraction of students on free lunch. Absolute values of t-statistics in brackets. *Significant at 5% level. **Significant at 1% level.
between total revenues per student and the fraction receiving diplomas for the
fully controlled regression in the year 1997. Since this pattern does not persist in
the data and is not evident at all until including district controls, this particular
anomaly does not appear to be a relevant concern.
Panel B of Table 3 shows that there was also little effect of total revenues
per student on student test scores in any grade. Once again, only one of the
regressions including both sets of district controls exhibits a significant coefficient
on total revenues per student. This is the case for reading in grade 8 in 2004. In all other
cases, any initial positive effects—present mostly in 2004 and to some extent
in 2003—disappear when adding district controls to the regression. Overall, at
the cross-sectional level there does not appear to be a significant effect of total
revenues per student on test scores or persistence after controlling for school
district characteristics. It is also important to note that there is more evidence
for the persistence than the test-scores regressions due to the longer timeframe
for the persistence regressions.
Although the numerical coefficients on the school district control variables
are not displayed in the tables due to space constraints, it is worth noting
the presence or absence of significant relationships between them and
the outcome measures. In these cross-sectional regressions, the pupil-teacher ratio,
number of full-time equivalent teachers, enrollment, and fraction on free lunch
all have significant relationships with the dropout and graduation rates. Of these
school district controls, only the pupil-teacher ratio is significantly related to the
fraction receiving diplomas in any instance. There are also some instances of
a significant relationship between measures of persistence and both (a) the average
level of schooling in the district and (b) the number of five- to seventeen-year-olds
below the poverty line. The importance of these Census measures
of district characteristics for persistence should not be exaggerated, however, since
their significance disappears when also controlling for the pupil-teacher ratio,
full-time equivalent teachers, enrollment, and fraction on free lunch.
In the cross-sectional test-score regressions, the pupil-teacher ratio,
full-time equivalent teachers, and enrollment are significantly related to
student test scores in almost no instances. The only variable significantly
related to test scores in multiple regressions is the fraction on free lunch. In
a similar vein, there is sometimes an effect of median family income and the
number of five- to seventeen-year-olds below the poverty line on test scores.
These effects appear more prevalent for science test scores than for
math, reading, or social science.
Taken together, there is some evidence of a relationship between the district
controls and persistence in the cross section, particularly for district measures
related to the schools themselves, that is, the pupil-teacher ratio, enrollment, fraction
on free lunch, and number of full-time equivalent teachers. These relationships
are weaker for test scores and tend to center slightly more on the income
variables: median family income, fraction on free lunch, and number in poverty.
Table 4 displays results from the Tobit regressions, where the structure is exactly
the same as in Table 3. As Panels A and B of Table 4 demonstrate, there is no
substantive change from the patterns seen in Table 3 under this robustness
analysis. The one notable departure in Table 4 is the higher level of significance
for the Tobit cross-sectional persistence regressions including district-level
controls. This is particularly true for the fraction receiving diplomas and for
the dropout rate in the last several years of the data. One anomaly of note is
the positive effect of revenues on dropouts—the opposite direction expected—
in 2005. The reasoning for this anomaly could be as follows: the majority of
amendment changes went into effect during the 2005 school year, and there was
a significant change in the amount of funding targeted towards at-risk students
during that year, so it is natural to observe a positive relationship between total
revenues per student and the dropout rate, since funding was directed towards
schools that were doing poorly at that point in time. It is also
possible, as evidenced by Table 4, that this increase in funding did not adversely
affect graduation rates or the fraction receiving diplomas during
that year. The longitudinal regressions that follow, however, help to explain
in more detail how funding changes affected schools over the entire time period.
The other piece of information gleaned from these regressions is the decrease
in the significance of the effect of total revenues per student in the test-score
regressions. This provides further evidence that revenues were not helping to
improve test scores.
It would be unwise to end the analysis at this point, for the reason mentioned
earlier regarding possible selection bias; it is therefore necessary to obtain
results from the differenced regressions in Table 5, as well as the Tobit
difference regressions in Table 6.
Table 5 displays results of the difference regressions and shows how changes
in revenues per student affected changes in persistence (in panel A) and
changes in math and reading scores (in panel B). Panel A displays results from
all three types of persistence of interest, while Panel B results are broken out
by grade of analysis. Once again, controls are added progressively, along with
initial levels of the school district characteristics, so that for Panel A
only the fourth column of each set of persistence regressions corresponds to the
regression specified in equation (3), and in Panel B the fourth column of each
grade-subject combination corresponds to the regression specified in equation
(4). The structure of Table 5 means that there are 12 separate regressions
Table 4. Tobit Regressions: Effect of Total Revenues Per Student on Persistence and Test Scores
Panel A: Persistence 1997–2006 (outcomes: Receiving Diplomas, Dropout Rate, Graduation Rate; specifications progressively add CENSUS INFO and SCHOOL INFO)
Panel B: Test Scores, Selected Years (Mathematics Grades 4, 7, and 10; Reading Grades 5, 8, and 11; Science Grades 4, 7, and 10; Social Science Grades 6, 8, and 11)
[Coefficient estimates could not be recovered legibly from this copy.]
Note: Test score regressions are run at the grade level for each school district while persistence regressions are run at the school district level. Robust standard errors are employed in all regressions. CENSUS INFO includes district-level averages for the fraction of women and men in the labor force, median family income, the fraction of children living in poverty, and the fraction of individuals in each of five educational groups (1–8, 9–12, High School Degree, Associate Degree, College or more). SCHOOL INFO includes the pupil teacher ratio, the school enrollment, the number of full time equivalent teachers, and the fraction of students on free lunch. Absolute values of t-statistics in brackets. *Significant at 5% level. **Significant at 1% level.
Table 5. Effect of Changes in Revenue per Student on Student Outcomes

Panel A: Effect on Persistence (2006–1997)
                      Change in Dropout Rate     Change in Grad Rate          Change in Frac Diplomas
CENSUS INFO           NO    YES   YES   YES      NO    YES    YES   YES       NO    YES   YES   YES
SCHOOL INFO           NO    NO    YES   YES      NO    NO     YES   YES       NO    NO    YES   YES
INITIAL LEVELS        NO    NO    NO    YES      NO    NO     NO    YES       NO    NO    NO    YES
Change Rev. per Stu. -0.09 -0.106 0.12  0.196   -0.18 -0.222  0.084 -0.005    0.014 0.014 0.011 0.023
                     [0.64][0.68][1.21][1.32]   [0.44][0.53] [0.19][0.01]    [0.84][0.92][0.75][1.28]
Observations          281   281   281   281      280   280    280   280       292   292   292   292

Panel B: Effect on Test Scores (2006–2004)
                      Change in Math Scores
                      4th Grade                  7th Grade                    10th Grade
CENSUS INFO           NO    YES   YES   YES      NO    YES   YES   YES        NO    YES   YES   YES
SCHOOL INFO           NO    NO    YES   YES      NO    NO    YES   YES        NO    NO    YES   YES
INITIAL LEVELS        NO    NO    NO    YES      NO    NO    NO    YES        NO    NO    NO    YES
Change Rev. per Stu. -1.05 -0.848 -0.514 -1      1.06  1.335 0.626 1.834      1.888 2.532 1.241 1.769
(2006–2004)          [0.77][0.62][0.37][0.78]   [0.86][0.94][0.42][1.21]     [1.35][1.71][0.74][0.96]
Observations          264   264   264   264      265   265   265   265        271   271   271   271

                      Change in Reading Scores
                      5th Grade                  8th Grade                    11th Grade
Change Rev. per Stu. -1.049 -1.285 -1.045 -0.64 -0.97 -1.391 -2.087 -1.812    0.372 0.329 -0.73 -1.12
(2006–2004)          [1.06][1.18][0.89][0.56]   [0.73][1.01][1.48][1.06]     [0.38][0.32][0.65][1.01]
Observations          265   265   265   265      270   270   270   270        267   267   267   267

NOTE: Test score regressions are run at the grade level for each school district while persistence regressions are run at the
school district level. Robust standard errors are employed in all regressions. CENSUS INFO includes district-level averages for
the fraction of women and men in the labor force, median family income, the fraction of children living in poverty, and the
fraction of individuals in each of five educational groups (1–8, 9–12, High School Degree, Associate Degree, College or more).
SCHOOL INFO includes changes between 1997 or 2004 (for persistence or test scores respectively) and 2006 which occurred
in the following variables: the pupil teacher ratio, the school enrollment, the number of full time equivalent teachers, and the
fraction of students on free lunch. For the test score regressions, INITIAL LEVELS includes information on the 2004 values of
the pupil teacher ratio, the school enrollment, the number of full time equivalent teachers, and the fraction of students on free
lunch, while for the persistence regressions INITIAL LEVELS includes the 1997 values of these same variables. Absolute values
of t-statistics in brackets. *Significant at 5% level. **Significant at 1% level.
contained in Panel A and 24 separate regressions contained in Panel B, for a
total of 36 separate regressions in Table 5.
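The long-difference construction behind Table 5 can be sketched as follows on simulated district data; the column names are hypothetical stand-ins for the actual variables:

```python
# Sketch of the long-difference setup: a district-by-year panel is reshaped
# so that 2006-minus-1997 changes in an outcome can be regressed on the
# corresponding change in revenues per student. Data are simulated.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
districts = [f"D{i}" for i in range(50)]
panel = pd.DataFrame({
    "district": districts * 2,
    "year": [1997] * 50 + [2006] * 50,
    "rev_per_stu": rng.normal(8, 1, 100),   # $000s per student
    "grad_rate": rng.normal(0.9, 0.03, 100),
})

wide = panel.pivot(index="district", columns="year")
d_rev = wide[("rev_per_stu", 2006)] - wide[("rev_per_stu", 1997)]
d_grad = wide[("grad_rate", 2006)] - wide[("grad_rate", 1997)]

# OLS of the change in the outcome on the change in revenues (plus intercept).
X = np.column_stack([np.ones(len(d_rev)), d_rev])
beta, *_ = np.linalg.lstsq(X, d_grad.to_numpy(), rcond=None)
```

Differencing in this way removes any time-invariant district characteristic, which is the motivation for moving beyond the cross-sectional estimates.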
Panel A of Table 5 shows no regressions where the effect of changes in total
revenues per student on changes in persistence reaches conventional levels of
statistical significance. Similarly, panel B of Table 5 shows no regressions where
the effect of changes in total revenues per student on changes in test scores
reaches conventional levels of statistical significance—either positive or negative.
Both panels of Table 5 support the idea that there is little or no effect of total
revenues per student over this time period on persistence or test scores. Once
again, the results for test scores are measured over a three-year period while
persistence is measured over a ten-year period, making the persistence results
the more trustworthy of the two.
Once again, it is useful to note which of the control variables were significantly
related to measures of change in persistence and test scores. In Panel A, there
were almost no regressions with a significant relationship between the control
variables and changes in persistence. The only variables that at times displayed
significant coefficients were the initial level of full-time equivalent teachers
and the average level of schooling attained in the school district in Census 2000
data. In Panel B, only changes in the pupil-teacher ratio showed up as
significantly related to changes in test scores. In a few cases, there was a
significant relationship with the Census 2000 characteristics of average schooling
and median family income in the district; however, these cases were rarer.
Table 6 replicates the exact structure of Table 5 and displays the same general
pattern of results; the only difference is that Table 6 employs a Tobit
regression for this robustness portion of the analysis. In all but one scenario,
there is no statistically significant effect of changes in revenues on changes in
either persistence or test scores. The one exception, where the t-statistic on the
coefficient of "Change in Revenues per Student" clears the conventional 5%
significance threshold, occurs when full controls are employed in the
regression—specifically, census information, school information, and initial
levels are included—and the outcome measure is the change in dropout rates
between 1997 and 2006. In this case, the coefficient on changes in revenues is
0.195 with a t-statistic of 2.15, above the threshold to be statistically
significantly different from zero at the 5% level. This coefficient may be
positive because of the anomalous nature of what occurred in 2005 in
particular—as seen in the cross-sectional regressions in Table 4—which also
shows up in the differenced regressions. It is thus possible that there is some
selection of funding affecting dropouts through targeting of the at-risk student
population. It should be noted, however, that in no other instance is there a
statistically significant effect at the 5% level of changes in revenues on either
persistence or test-score changes. T-statistics on changes in revenue per
student in all other cases range from a low close to 0 to a high—approaching,
but not attaining, conventional significance—above 1.9 in absolute value. In the
majority of cases, the t-statistics are below one in absolute value, similar to
the results in Table 5, giving very little reason to believe that the associated
coefficients differ from zero. In the subset of cases where the t-statistics
approach statistical significance, the effects show an increase in revenues
associated with an improvement in student outcomes. Because these results are
not statistically significant at conventional levels, however, no firm
conclusions should be drawn from that set of results.
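The significance cutoffs used in this discussion correspond to two-sided p-values under the large-sample normal approximation; a quick illustrative check:

```python
from scipy.stats import norm

def two_sided_p(t_abs):
    """Two-sided p-value for an absolute t-statistic, using the
    large-sample normal approximation."""
    return 2 * norm.sf(t_abs)

p_215 = two_sided_p(2.15)  # the one significant Table 6 coefficient
p_190 = two_sided_p(1.90)  # approaching, but not attaining, significance
```

A |t| of 2.15 implies a p-value just under 0.05, while 1.90 falls just short, consistent with the characterization above.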
Table 6. Tobit Regressions: Effect of Changes in Revenue per Student on Student Outcomes

Panel A: Effect on Persistence (2006–1997)
                      Change in Dropout Rate     Change in Grad Rate           Change in Frac Diplomas
CENSUS INFO           NO    YES   YES   YES      NO     YES    YES   YES       NO    YES   YES   YES
SCHOOL INFO           NO    NO    YES   YES      NO     NO     YES   YES       NO    NO    YES   YES
INITIAL LEVELS        NO    NO    NO    YES      NO     NO     NO    YES       NO    NO    NO    YES
Change Rev. per Stu. -0.091 -0.107 0.12 0.195   -0.176 -0.215  0.085 -0.001    0.014 0.014 0.011 0.024
                     [1.04][1.22][1.33][2.15]*  [0.34] [0.42] [0.15][0.00]    [1.56][1.58][1.11][1.95]
Observations          281   281   281   281      274    274    274   274       276   276   276   276

Panel B: Effect on Test Scores (2006–2004)
                      Change in Math Scores
                      4th Grade                  7th Grade                     10th Grade
CENSUS INFO           NO    YES   YES   YES      NO    YES   YES   YES         NO    YES   YES   YES
SCHOOL INFO           NO    NO    YES   YES      NO    NO    YES   YES         NO    NO    YES   YES
INITIAL LEVELS        NO    NO    NO    YES      NO    NO    NO    YES         NO    NO    NO    YES
Change Rev. per Stu. -1.074 -0.889 -0.544 -1.03  1.088 1.392 0.69  1.902       1.922 2.576 1.282 1.834
(2006–2004)          [0.90][0.72][0.41][0.73]   [0.75][0.93][0.43][1.13]      [1.48][1.93][0.93][1.26]
Observations          264   264   264   264      265   265   265   265         271   271   271   271

                      Change in Reading Scores
                      5th Grade                  8th Grade                     11th Grade
Change Rev. per Stu. -1.061 -1.309 -1.062 -0.65 -0.932 -1.344 -2.06 -1.774     0.371 0.332 -0.725 -1.107
(2006–2004)          [0.94][1.13][0.86][0.49]   [0.88][1.26][1.80][1.45]      [0.32][0.28][0.59][0.81]
Observations          265   265   265   265      270   270   270   270         267   267   267   267

NOTE: Test score regressions are run at the grade level for each school district while persistence regressions are run at the
school district level. Robust standard errors are employed in all regressions. CENSUS INFO includes district-level averages for
the fraction of women and men in the labor force, median family income, the fraction of children living in poverty, and the
fraction of individuals in each of five educational groups (1–8, 9–12, High School Degree, Associate Degree, College or more).
SCHOOL INFO includes changes between 2004 and 2006 which occurred in the following variables: the pupil teacher ratio,
the school enrollment, the number of full time equivalent teachers, and the fraction of students on free lunch. For the test score
regressions, INITIAL LEVELS includes information on the 2004 values of the pupil teacher ratio, the school enrollment, the
number of full time equivalent teachers, and the fraction of students on free lunch, while for the persistence regressions INITIAL
LEVELS includes the 1997 values of these same variables. Absolute values of t-statistics in brackets. *Significant at 5% level.
**Significant at 1% level.
conclusions
Changes in the School District Finance and Quality Performance Act over 1997–
2006 had little effect on student persistence or test scores. It should be kept in
mind, however, that the current results may understate the effect of
school funding on student achievement due to persistent problems of selection
into schooling, which could not be corrected in the current analysis. It is
also possible that some of the amendment effects appear with a lag and are thus
not picked up by an analysis covering only the years through 2006.
It is also important to note that the availability and allocation of resources
is not equivalent to the ability and means to use these resources effectively to
help students. Teacher and administrator ability/training/salary, administrative
structure, and parental involvement all play a role in how effectively resources
are employed in helping students to succeed.25 Besides these factors, at issue is
whether schools that actually need and know how to use funds are the ones that
receive them. It is alternatively possible that funds are allocated so as to satisfy political exigencies rather than school districts’ direct concerns.
The diversity and demographic makeup of student populations are also important for schools to consider when deciding how to create the best environment for students to succeed. In the cross-sectional regressions,
there was some slight evidence that characteristics of school districts other than
funding were in some instances significantly related to student outcomes.26 The
pupil-teacher ratio, fraction on free lunch, enrollment levels, and number of
full-time equivalent teachers were, at times, significant predictors of persistence
in the cross section if not in the difference analysis. Some of these, such as the pupil-teacher ratio and the number of full-time-equivalent teachers, can be partially manipulated by school districts. There was also some slight evidence
that income and poverty levels of students and their school district areas were
related to persistence and test-score achievement. This should be noted since it
implies that administrators need to account for the underlying populations and
neighborhoods and learn to use the correct strategies for areas with higher rates of poverty and lower median family income levels.27
All of these various issues are key factors to consider in moving forward to
decide on the best school policies related to levels of school funding, as well as the distribution and uses of those funds, so as to best meet student needs and foster student achievement.
25. The literature on parental involvement consistently shows a positive effect of parental involvement on student achievement, particularly for minority and low-income students. A.J. Houtenville and K. Smith Conway, 2008. Parental Effort, School Resources, and Student Achievement. Journal of Human Resources. 43(2): 437-453. There is also evidence that allocating funding to raising teacher salaries—either to attract better teachers or to encourage current teachers to work harder—serves to increase test scores. L. Chaudhary, 2009. Education Inputs, Student Performance and School Finance Reform in Michigan. Economics of Education Review. 28(1): 90-98. It is possible to assess whether teachers and administrators understood how to use an influx of funds and, therefore, whether such an influx improves achievement. L. Goe, 2006. Evaluating a State-Sponsored School Improvement Program through an Improved School Finance Lens. Journal of Education Finance. 31(4): 395-419.
26. G. Galster et al. 2003. The Influence of Neighborhood Poverty during Childhood on Fertility, Education, and Earnings Outcomes. Housing Studies. 22(5): 723-751.
27. R.C. Pianta and R.J. Walsh. 1996. High Risk Children in Schools: Constructing Sustaining Relationships. New York, NY: Routledge Publishers.
References:
Card, D. and Krueger, A. 1996. Labor Market Effects of School Quality: Theory and Evidence. In
G. Burtless (Ed.), Does money matter? The effect of school resources on student achievement and
adult success. 97-140. Washington D.C.: Brookings Institution Press.
Card, D. and Payne, A. 2002. School Finance Reform, the Distribution of School Spending, and the
Distribution of Student Test Scores. Journal of Public Economics. 83(1): 49-82.
Chaudhary, L. 2009. Education Inputs, Student Performance and School Finance Reform in
Michigan. Economics of Education Review. 28(1): 90-98.
Deke, J. 2003. A study of the impact of public school spending on postsecondary educational
attainment using statewide school district refinancing in Kansas. Economics of Education
Review. 22: 275-284.
Fernandez, R. and Rogerson, R. 1999. Education Finance Reform and Investment in Human
Capital: Lessons from California. Journal of Public Economics. 74(3): 327-350.
Fernandez, R. and Rogerson, R. 1998. Public Education and Income Distribution: A Dynamic
Quantitative Evaluation of Education-Finance Reform. The American Economic Review. 88(4):
813-833.
Galster, G., Marcotte, D.E., Mandell, M., Wolman, H., and Augustine, N. 2003. The Influence of
Neighborhood Poverty during Childhood on Fertility, Education, and Earnings Outcomes.
Housing Studies. 22(5): 723-751.
Goe, L. 2006. Evaluating a State-Sponsored School Improvement Program through an Improved
School Finance Lens. Journal of Education Finance. 31(4): 395-419.
Goldin, C. 1999. A Brief History of Education in the United States. NBER Historical Working
Paper 119. 1-76.
Guryan, J. 2001. Does Money Matter? Regression-Discontinuity Estimates from Education
Finance Reform in Massachusetts. NBER Working Paper 8269. 1-54.
Hanushek, E.A. 1986. The Economics of Schooling: Production and Efficiency in Public Schools.
Journal of Economic Literature. 24(3): 1141-1177.
Houtenville, A.J., and Smith Conway, K. 2008. Parental Effort, School Resources, and Student
Achievement. Journal of Human Resources. 43(2). 437-453.
Hoxby, C. 2001. All School Finance Equalizations Are Not Created Equal. Quarterly Journal of
Economics. 1189-1231.
Hoxby, C. 2000. The Effects of Class Size on Student Achievement: New Evidence from Population
Variation. Quarterly Journal of Economics. 115(4): 1239-1285.
Jamison, E.A., Jamison, D.T., and Hanushek, E.A. 2007. The Effects of Education Quality on
Income Growth and Mortality Decline. Economics of Education Review, 26(6): 771-788.
Kansas Legislative Research Department. 2006. Amendments to the 1992 School District Finance
and Quality Performance Act and the 1992 School District Capital Improvements State Aid
Law (Finance Formula Components).
Lazear, E. 2001. Educational Production. Quarterly Journal of Economics. 116(3): 777-803.
Levin, H.M., Belfield, C., Nuenning, P., and Rouse, C. 2007. The Public Returns to Public
Educational Investments in African-American Males. Economics of Education Review. 26(6):
699-708.
Lochner, L. and Moretti, E. 2004. The Effect of Education on Crime: Evidence from Prison Inmates,
Arrests, and Self-Reports. The American Economic Review. 94(1): 155-189.
Milligan, K., Moretti, E., and Oreopoulos, P. 2004. Does education improve citizenship? Evidence
from the United States and the United Kingdom. Journal of Public Economics. 88: 1667-1695.
Murnane, R.J. 2008. Educating Urban Children. NBER Working Paper no. 13791. 1-45.
Murray, S.E., Evans, W.N., and Schwab, R.N. 1998. Education Finance Reform and the Distribution
of Education Resources. The American Economic Review. 88(4): 789-812.
Pianta, R.C. and Walsh, R.J. 1996. High Risk Children in Schools: Constructing Sustaining
Relationships. New York, NY: Routledge Publishers.
High-Stakes Testing and Student Achievement:
Problems for the No Child Left Behind Act
by
Sharon L. Nichols
Assistant Professor
University of Texas at San Antonio
Gene V Glass
Regents’ Professor
Arizona State University
David C. Berliner
Regents’ Professor
Arizona State University
Education Policy Research Unit (EPRU)
Education Policy Studies Laboratory
College of Education
Division of Educational Leadership and Policy Studies
Box 872411
Arizona State University
Tempe, AZ 85287-2411
September 2005
Education Policy Studies Laboratory
EPSL | Education Policy Research Unit
EPSL-0509-105-EPRU
http://edpolicylab.org
Education Policy Studies Laboratory
Division of Educational Leadership and Policy Studies
College of Education, Arizona State University
P.O. Box 872411, Tempe, AZ 85287-2411
Telephone: (480) 965-1886
Fax: (480) 965-0303
E-mail: epsl@asu.edu
http://edpolicylab.org
This research was made possible by a grant from the Great Lakes Center
for Education Research and Practice.
TABLE OF CONTENTS
Executive Summary ........................................................................................................... i
Introduction....................................................................................................................... 1
Why High-Stakes Testing?............................................................................................... 3
No Child Left Behind: Changing the Landscape of Accountability.................................... 5
High-Stakes Testing and Achievement................................................................................ 8
Conclusions From the Research .................................................................................... 10
Measuring High-Stakes Testing Pressure..................................................................... 11
Existing Systems................................................................................................................ 11
The Present Definition of High-Stakes Testing................................................................. 16
Measurement Part I: Creating a Pressure Rating Index .................................................. 19
Measurement Part II: High-Stakes Pressure Over Time.................................................. 31
Methodology .................................................................................................................... 36
Procedures ........................................................................................................................ 36
Participants....................................................................................................................... 39
Feedback on Method......................................................................................................... 39
Method of Analysis............................................................................................................ 40
Data................................................................................................................................... 43
Results .............................................................................................................................. 43
Part I: Carnoy and Loeb Replication ............................................................................... 43
Part II: Relationship of Change in PRI and Change in NAEP Achievement ................... 72
Part III: Relationship of Change in PRI and Change in NAEP Achievement for
“Cohorts” of Students....................................................................................................... 77
Part IV: Antecedent-Consequent Relationships Between Change in PRI and Change in NAEP
Achievement ...................................................................................................................... 79
Discussion....................................................................................................................... 101
Replication of Carnoy and Loeb ..................................................................................... 101
Progression ..................................................................................................................... 103
PRI Change and NAEP Gains ........................................................................................ 103
Limitations and Future Directions.................................................................................. 108
Notes & References ....................................................................................................... 111
Appendices ...................................................................................................................... 113
External Review Panel .................................................................................................... 336
High-Stakes Testing and Student Achievement:
Problems for the No Child Left Behind Act
Sharon L. Nichols
University of Texas at San Antonio
Gene V Glass
Arizona State University
David C. Berliner
Arizona State University
Executive Summary
Under the federal No Child Left Behind Act of 2001 (NCLB), standardized test
scores are the indicator used to hold schools and school districts accountable for student
achievement. Each state is responsible for constructing an accountability system,
attaching consequences—or stakes—for student performance. The theory of action
implied by this accountability program is that the pressure of high-stakes testing will
increase student achievement. But this study finds that pressure created by high-stakes
testing has had almost no important influence on student academic performance.
To measure the impact of high-stakes testing pressure on achievement and to
account for the differences in testing pressure among the states, researchers created the
Pressure Rating Index (PRI). The PRI was used in two ways: correlations between the
PRI and National Assessment of Educational Progress (NAEP) results from 1990 to
2003 in 25 states were analyzed, and the PRI was used in replications of previous
research. These analyses revealed that:
• States with greater proportions of minority students implement accountability systems that exert greater pressure. This suggests that any problems associated with high-stakes testing will disproportionately affect America's minority students.
• High-stakes testing pressure is negatively associated with the likelihood that eighth and tenth graders will move into 12th grade. Study results suggest that increases in testing pressure are related to larger numbers of students being held back or dropping out of school.
• Increased testing pressure produced no gains in NAEP reading scores at the fourth- or eighth-grade levels.
• Prior increases in testing pressure were weakly linked to subsequent increases in NAEP math achievement at the fourth-grade level. This finding emerged for all ethnic subgroups, and it did not exist prior to 1996. While the authors believe a causal link exists between earlier pressure increases and later fourth-grade math achievement increases, they also point out that math in the primary grades is far more standardized across the country than the math curriculum in middle school and, therefore, drilling students and teaching to the test could have played a role in this increase. This interpretation is supported by the lack of evidence that earlier pressure increases produced later achievement increases for eighth-grade math achievement or for fourth- and eighth-grade reading achievement.
The authors conclude that there is no convincing evidence that the pressure
associated with high-stakes testing leads to any important benefits for students’
achievement. They call for a moratorium on policies that force the public education
system to rely on high-stakes testing.
Introduction
Supporters of the practice of high-stakes testing believe that the quality of
American education can be vastly improved by introducing a system of rewards and
sanctions for students’ academic performance.1 When faced with large incentives and
threatening punishments, administrators, teachers, and students, it is believed, will take
schooling more seriously and work harder to obtain rewards and avoid humiliating
punishments. But educators and researchers have argued that serious problems
accompany the introduction of high-stakes testing. Measurement specialists oppose high-stakes testing because using a single indicator of competence to make important decisions
about individuals or schools violates the professional standards of the measurement
community.2 Other critics worry that the unintended effects of high-stakes testing not
only threaten the validity of test scores, but also lead to “perverse”3 and “corrupt”
educational practice.4 Teachers report that the pressure of doing well on a test seriously
compromises instructional practice.5 And still others worry that the exaggerated pressure
on students and teachers to focus on test preparation is thwarting teachers’ intentions to
care for students’ needs apart from those that lead to the scores they receive on
examinations.6 It is also argued by many that the measurement systems we currently
have cannot support the demands of those who make educational policy.7
The assumption embedded in the current promotion of a high-stakes
accountability model of education is that students and teachers need to work harder and
that by pressuring them with the threat of sanctions and enticing them with financial
incentives, they would expend more effort and time on academic pursuits, and thus
Page 1 of 336
This document is available on the Education Policy Studies Laboratory website at:
http://www.asu.edu/educ/epsl/EPRU/documents/EPSL-0509-105-EPRU.pdf
learning would increase. This rationale is problematic for several reasons. Learning is a
complicated endeavor and as most educators would argue, extrinsic rewards alone cannot
overcome the range of background experiences and individual differences in learning and
motivation students bring to school.8 Still, with significant bipartisan support, the
passage of the No Child Left Behind Act (NCLB) of 2001 instantiated this notion of
academic accountability in education—at least for now. But, is it working? Does the
threat of rewards and sanctions increase achievement?
Although the literature on the mostly harmful and unintended effects of high-stakes testing is growing rapidly,9 existing research on the relationship between high-stakes testing and its intended impact on achievement is mixed and inconclusive. Some studies find no evidence that high-stakes testing impacts achievement.10 Others argue that the data for or against are not sufficiently robust to reject outright the use of high-stakes testing for increasing achievement.11 And others report mixed effects, finding
high-stakes testing to be beneficial for certain student groups but not others.12
One potential explanation for the mixed conclusions about the effectiveness of
high-stakes testing on achievement could lie in measurement differences in the
characterization of a high-stakes testing state (i.e., which states truly have high-stakes and
which only appear to have them?). Some researchers study the issue using a two-group
comparison—analyzing achievement trends in states with high-stakes testing policies
against those without.13 Others have studied the issue by rating states along a continuum
from low- to high-stakes (i.e., a low-stakes state has fewer consequences for low
performance than a high-stakes state). As more states implement high-stakes testing, the
rating measurement approach becomes more important than the two-group comparison
approach. Exploring new measurement methods is one goal of this report.
This study adds to the literature in two important ways. First, we employ
qualitative and quantitative methods to measure the pressure on teachers, students, and
parents exerted by a “high-stakes testing” system. An examination of the research on
accountability implementation both before and after NCLB was signed into law
uncovered the inadequacies of existing measurement approaches for capturing the range
of pressures that high-stakes testing exerts on students and educators; at the very least,
those approaches showed little agreement from one study to the next. Thus, a significant goal of this
study is to create a more valid system for measuring the pressure that high-stakes testing
systems apply to educators and their students. Our second goal is to use this newly
created rating system to conduct a series of analyses to examine whether the practice of
high-stakes testing increases achievement. This is addressed in two ways. First, findings
from research by Carnoy and Loeb14 (whose recent report concluded that the strength of a
state’s accountability model is related to math achievement gains, specifically for
minority students and for eighth graders) are challenged: this research replicates their
analyses, but replaces their high-stakes pressure index with ours. Second, a series of
analyses is computed to investigate the relationship between high-stakes testing
implementation and achievement trends over time.
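Conceptually, the second set of analyses pairs each state's change in testing pressure with its change in NAEP achievement over the same period. A minimal sketch of that change-on-change correlation, using purely illustrative numbers (the state labels, PRI values, and scores below are invented placeholders, not the study's data):

```python
def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# (state, early PRI, later PRI, early NAEP score, later NAEP score)
# -- hypothetical values for illustration only.
states = [
    ("A", 1.0, 4.0, 220.0, 226.0),
    ("B", 2.0, 2.5, 222.0, 224.0),
    ("C", 1.5, 3.0, 218.0, 223.0),
    ("D", 3.0, 3.5, 225.0, 226.0),
]

# Change in pressure and change in achievement for each state.
pri_change = [p2 - p1 for _, p1, p2, _, _ in states]
naep_change = [s2 - s1 for _, _, _, s1, s2 in states]

r = pearson_r(pri_change, naep_change)
print(round(r, 3))
```

The design question the report raises is what feeds the two lists: a two-group comparison collapses `pri_change` to a binary indicator, while a continuum rating like the PRI preserves the full range of accountability pressure across states.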
Why High-Stakes Testing?
The publication of A Nation at Risk15 alarmed citizens with its claim that the
American public education system was failing. As the report noted, it was believed that
if the education system did not receive a major overhaul, our economic security would be
severely compromised. American culture has internalized this claim to such a degree that
questions about how to solve this “crisis” continue to be at the top of many policy
makers’ agendas. Although our education system is not as bad off as some would have
the public believe,16 the rhetoric of a failing education system has led to a series of
initiatives that have transformed the role and function of the American public school
system. High-stakes testing holds a prominent place in this transformation.
The earliest and most common form of high-stakes testing was the practice of
attaching consequences to high school graduation exams (i.e., students had to pass a test
to receive a high school diploma). New York’s Regents examinations served this
purpose for over 100 years17 and states such as Florida, Alabama, Nevada, and Virginia
had instituted high-stakes graduation exams at least as far back as the early to mid
1980s.18 But in the years since A Nation at Risk, the rhetoric of high expectations,
accountability, and ensuring that all students—especially those from disadvantaged
backgrounds—have an equal opportunity to receive quality education has been
accompanied by a series of federal initiatives, including George H. W. Bush’s America
2000, Clinton’s Goals 2000, and Clinton’s 1994 re-authorization of the 1965 Elementary
and Secondary Education Act, along with subsequent education “policy
summits.” In combination, these initiatives have
progressively increased the demands on teachers and their students and have laid the
groundwork for what was to come next—an unprecedented federal intervention in state-level education policy making19 that directs all states toward a single goal (i.e., 100
percent of students reaching “proficiency”) via a single system of implementation (i.e.,
standards-based assessment and accountability).
No Child Left Behind: Changing the Landscape of Accountability
The construction and passage of the No Child Left Behind Act (NCLB) occurred
under the leadership of Rod Paige and George W. Bush. In Texas, in the decade before
they went to Washington, Bush as governor and Paige as superintendent of the Houston
school district had built and implemented a controversial high-stakes accountability
system that placed increasing demands and expectations on students. And while other
states were also implementing accountability systems
(Kentucky and New York among others), Texas’s “success” in holding students and
educators accountable for learning was quite visible. Although the “myth” of Texas’s
success has been critically examined and documented,20 that scrutiny came too late (or,
more likely, few paid close attention), and NCLB, influenced by the programs
implemented in Texas and elsewhere, was passed in 2001 and signed into law on
January 8, 2002.21
The goal of NCLB was ambitious—to bring all students up to a level of academic
“proficiency” within a 15-year period. As of the day it was signed into law, states had to
initiate a strategic plan for meeting the range of assessment and accountability provisions
the law mandated. States that did not were threatened with the loss of billions in Title I
funding (see Table 1 for an overview of the law’s major mandates). At the core of these
mandates is the requirement that states adopt a system of accountability defined by
sanctions and rewards to be applied to schools, teachers, and students in the event they
do not meet pre-defined achievement goals (see Table 2 for an outline of NCLB-defined
rewards and sanctions).
Table 1: Overview of Requirements for States Under NCLB
1. All states must identify a set of academic standards for core subject areas at each grade level;
2. States must create a state assessment system to monitor student progress toward meeting these state-defined standards;
3. States must require schools and districts to publish report cards identifying academic achievement of their students, in aggregate and disaggregated by ethnicity and other subgroups (e.g., racial minorities, students for whom English is a Second Language (ESL), and special education students);
4. States must create a system of labels that communicate to the community how local schools and districts are performing;
5. States must create a plan (i.e., Adequate Yearly Progress or AYP) that would ensure 100 percent of their students will reach academic proficiency by the year 2014-2015; and
6. States must come up with a system of accountability that includes rewards and sanctions to schools, educators, and students that are tied to whether they meet the state’s goals outlined in the AYP plan.
Source: No Child Left Behind Act (NCLB) of 2001 § 1001, 20 U.S.C. § 6301. Retrieved February 18,
2005, from: http://www.ed.gov/policy/elsec/leg/esea02/107-110.pdf
The law is massive and forces states to allocate significant resources in the form
of time, energy, and especially money towards its implementation—implementation that
has been especially cumbersome22 if not potentially counterproductive to the goals of
schooling.23 Most states were not ready to comply with the range of demands from
NCLB. Some did not have any assessment system in place, whereas others were
just beginning to pilot theirs. Similarly, some states were already holding students and
their teachers accountable, whereas others had no plans or intentions of doing so. The
demands associated with NCLB have caused problems and challenges for many states.
In the first two to three years of implementation, most states have experienced significant
financial and logistical barriers in implementing two of the primary accountability
provisions stipulated under NCLB: provision of supplementary services and allowing
students to transfer out of “under performing” schools.24 And, in many cases, the
demands of the law have been met with negativity by those it arguably impacts the
most—teachers.25 Ultimately, the pace at which states are able to enact and implement
the range of accountability provisions outlined by NCLB varies a great deal. It is this
incredible range of accountability implementation that makes the study of high-stakes
testing impact more complicated, but it is this complexity that is addressed by this study.
Table 2: NCLB Sanction and Reward Guidelines
Sanctions
1. Schools failing to meet adequate yearly progress (AYP) for two consecutive years must be identified as needing improvement. Technical assistance is to be provided and public school choice offered;
2. Schools failing to meet AYP for three years must offer pupils from low-income families the opportunity to receive instruction from supplemental services (plus corrective actions in #1 above);
3. Schools failing to meet AYP for four consecutive years must take one of the following specified “corrective actions”:
   a. Replacing school staff, appointing an outside expert to advise the school, extending the school day or year, or changing the school’s internal organizational structure (plus corrective actions in 1 and 2 above).
4. Schools that fail to meet AYP for five consecutive years must be “restructured.” Such restructuring must consist of one or more of the following actions:
   a. Reopening as a charter school, replacing all or most school staff, state takeover of school operations (if permitted under state law), or other major restructuring of school governance (plus 1-3 above).
Rewards
1. States must develop strategies related to high-performing schools, or those showing improvement, such as:
   a. Academic achievement awards: receiving recognition when they close the achievement gap or when they exceed AYP for two consecutive years.
   b. “Distinguished schools” designations: identifying those schools that have made the greatest gains as “models” for low-performing schools.
2. Financial awards to teachers in schools that have made the greatest gains.
Source: No Child Left Behind Act (NCLB) of 2001 § 1001, 20 U.S.C. § 6301. Available online, accessed
February 18, 2005, from: http://www.ed.gov/policy/elsec/leg/esea02/107-110.pdf
High-Stakes Testing and Achievement
A series of studies have emerged attempting to examine the effects of high-stakes
testing on student achievement. Amrein and Berliner, Rosenshine, and Braun26 debated
the merits of high-stakes testing for improving achievement, often locating their
conflicting conclusions in the statistical analyses they applied. Amrein and Berliner used
time trend analysis to study the effectiveness of high-stakes testing on achievement at
both the K-8 and high school levels. They analyzed achievement trends across time in
high-stakes testing states against a national average. Their extensive and descriptive set
of results is organized by state, noting whether there was “strong” or
“weak” evidence that achievement had “increased” or “decreased” in
fourth- and eighth-grade National Assessment of Educational Progress (NAEP) scores in
math and reading. They concluded that “no consistent effects across states were noted.
Scores seemed to go up or down in random pattern after high-stakes test were introduced,
indicating no consistent state effects as a function of high-stakes testing policy.”27
In a reanalysis of the data addressing what were viewed as flaws in Amrein and
Berliner’s method and design—namely a lack of control group—Rosenshine found that
average NAEP increases were greater in states with high-stakes testing policies than those
in a control group of states without. Still, when he disaggregated the results by state,
Rosenshine concluded that “although attaching accountability to statewide tests worked
well in some high-stakes states it was not an effective policy in all states.”28 Again, no
consistent effect was found.
In a follow-up response to Rosenshine, Amrein-Beardsley and Berliner 29 adopted
his research method using a control group to examine NAEP trends over time, but they
also included in their analysis NAEP exclusion rates.30 They concluded that although
states with high-stakes tests seemed to outperform those without high-stakes tests on the
fourth grade math NAEP exams, when controlling for exclusion rates, they found that
this difference disappeared. They argued NAEP achievement in high-stakes testing states
is likely to be inflated by the exclusion of greater numbers of lower achieving students.
Braun also critiqued Amrein and Berliner on methodological grounds. In his
analysis of fourth- and eighth-grade math achievement (he did not look at reading) across
the early 1990s, he found that when standard error estimates are included in the analyses,
NAEP gains were greater in states with high-stakes testing than in those without, in spite
of exclusion rate differences. He concludes, “The strength of the association between
states’ gains and a measure of the general accountability efforts in the states is greater in
the eighth grade than in the fourth.”31 However, in a separate analysis following cohorts
of students (1992 fourth-grade math and 1996 eighth-grade math; 1996 fourth-grade math
and 2000 eighth-grade math), he found that high-stakes testing effects largely
disappeared: as students progressed through school, there was no difference in
achievement trends between states with high-stakes testing and those without. His
conclusions about the usefulness of high-stakes testing as a widespread policy are
tentative: "With the data
available, there is no basis for rejecting the inference that the introduction of high-stakes
testing for accountability is associated with gains in NAEP mathematics achievement
through the 1990s.”32
Carnoy and Loeb provide yet another set of analyses to describe the impact of
high-stakes testing using a completely different approach for measuring accountability
and focusing on effects by student ethnicity. In contrast to others who adopted Amrein
and Berliner’s initial categorization, Carnoy and Loeb operationalize “high-stakes
testing” in terms of the “strength” of the accountability in each state, rating each state on
a 5-point scale to perform a series of regression analyses. Their analysis leads them to
conclude that accountability strength is significantly related to math achievement gains
among eighth graders, especially for African American and Hispanic students.
Carnoy and Loeb also consider the relationship between students' grade-to-grade
progression rates and the strength of accountability. Others have argued that high-stakes
testing influences a greater number of students, especially minority students, to drop out
of school.3...