Reply To Greene, Peterson and Du:
"The Effectiveness of School Choice in Milwaukee: A Secondary Analysis of Data from the Program's Evaluation"
John F. Witte
Department of Political Science
Robert La Follette Institute of Public Affairs
University of Wisconsin-Madison
Madison, WI 53706
608-263-2029
jfwitte@facstaff.wisc.edu
23 August, 1996
Executive Summary
The Greene, Peterson, and Du paper (hereafter Peterson, et al. in deference to the senior author) is a confusing, tortured effort to find any evidence that students enrolled in private schools under the Milwaukee Parental Choice Program (MPCP) do better than any students in the Milwaukee Public Schools (MPS). This reply provides evidence and argument to conclude the following:
I leave it to the social science community to judge the scientific quality of this study and of our own studies. I hope the media and political analysts will ask why this study was done in this manner. What is the ultimate intent? Is the audience researchers, or judges and politicians? Who is funding, backing, and promoting this research? And are the interests of poor, inner-city children the real object of this work, or is the ultimate aim to aid the wider group of families whose children attend private schools in the United States?
Reply To Greene, Peterson and Du:
"The Effectiveness of School Choice in Milwaukee: A Secondary Analysis of Data from the Program's Evaluation"
John F. Witte
Department of Political Science
Robert La Follette Institute of Public Affairs
University of Wisconsin-Madison
23 August, 1996
This reply is divided into two parts. The first, comprising sections 1 and 2 below, describes the main points of the Peterson, et al. paper and what they claim we said and did. It also responds to the false charge that we withheld data during the course of this study.
The second half of this reply critiques the secondary analysis they have attempted. It questions their basic approach, which is unique in the analysis of educational achievement, is theoretically incorrect, and leads to two forms of bias - both of which favor choice-student achievement. Although we agree that the non-selected choice students are theoretically interesting, in practice, because of attrition from both the control group and the MPCP, the comparisons made by Peterson, et al. are highly misleading. Although the presentation in the paper makes it very difficult to interpret what they did, we question whether their results support the very definitive statements they have already made in the media and before the courts. Finally, we also address the question of whether our use of a large, random sample of MPS students, with suitable control variables, is a useful referent group. We strongly argue that it is, and we note that Peterson recently testified in court in favor of expansion of the program to a much larger group of students. Ironically, his testimony was based on the Peterson, et al. study, which does not include the broader MPS population.
1. What the Peterson, et al. Paper Says
1.1 This paper argues that the evaluations written by myself and teams of researchers over five years have been inadequate in that they focused on a random sample of MPS students as a control group instead of the students who applied to choice and were not selected. Those students would theoretically provide a control for the unmeasured selectivity bias which has plagued comparisons between public and private schools (Witte, 1992, 1996). Their study relies exclusively on such a comparison, using "randomized block experimental data." No citation for or description of this technique is included in the paper. They claim that this analysis shows no difference for students in the first and second years in choice schools, but "substantively significant" results for students in the third and fourth years. The conclusion seems to be that if students are just exposed long enough to private schools, their achievement will increase relative to public school students.
1.2 They also argue that: (1) we have been biased in our reports against choice; (2) our analyses, which have indicated no difference in achievement gains by choice students when compared to a random sample of MPS students, are based on an inappropriate control group consisting of that random sample of MPS students; and (3) the traditional multivariate regression methods we used are inadequate to control for differences between the sample populations.
1.3 Although they claim that the "certainty with which conclusions can be drawn is restricted by certain data limitations" (p. 3), the results of an estimation procedure devoid of any statistical controls, and without a control for prior test scores, have already appeared graphically in a Wall Street Journal article (8-14-96) written by Greene and Peterson. There is no mention of uncertainty in the article, and they also make the claim that: "If similar success could be achieved for all minority students nationwide, it could close the gap between white and minority test scores by at least a third, possibly more." There is no mention in the article of the fact that they limited their study to Black and Hispanic students, yet they have no qualms about generalizing the results to minority-white gaps. The political intent of this report, released on the first day of the Republican National Convention and two days before a critical legal hearing on choice in Wisconsin (where Peterson testified in favor of expansion of the MPCP), seems obvious.
1.4 They claim: "For five years the researchers [Witte] did not release data from the evaluation for secondary analysis by other members of the scholarly community. But in February, 1996 they made the data available on the World Wide Web" (p. 3).
2. What We Said and Did

2.1 What We Said. Experimental design issues are dealt with in the following section. However, our conclusions on the MPCP have been public record for five years. What we have reported are mixed results. On the positive side, we have emphatically declared that the program, relative to the MPS random sample, has not creamed off the best students and families. The students who applied to choice came from below-average SES families, were not doing well in MPS, and their parents were very dissatisfied with MPS schools (more so than the parents in the MPS sample). In terms of results: test scores did not improve and did not differ from MPS scores, and attrition from the program averaged 30% per year (not including three private schools which went bankrupt in the midst of the school year); but there was also increased satisfaction among parents, high parental involvement in the choice schools, and improved financial, staff, and physical conditions in the choice schools.
2.2 The summary conclusion concerning the program, repeated in both the Fourth and Fifth Year Reports, was:
Honorable people can disagree on the importance of each of these factors. One way to think about it is whether the majority of students and families involved are better off because of this program. The answer of the parents involved, at least those who responded to our surveys, was clearly yes. This is despite the fact that achievement, as measured by standardized tests, was no different than the achievement of MPS students. Obviously the attrition rate and factors affecting attrition indicate that not all students will succeed in these schools, but the majority remain and applaud the program. (Witte, et al, 1994, p. 28; Witte, et al, 1995, p. 17)
2.3 Did We Make Data Available in a Timely Manner? Peterson claims that we withheld data from researchers until February, 1996. The truth is that Peterson was offered all the data available at the time in December, 1994. A certified letter to Peterson from the Wisconsin Department of Public Instruction is enclosed as Appendix A. Peterson failed to respond to the letter, but continues to lie about the availability of the data. Earlier, Mr. George Mitchell made a similar request, and data were prepared for him in December, 1992. After making the request, he also failed to respond or to pick up the data which had been prepared (letter in Appendix B). Following final data collection and coding in January, 1996, we prepared data, codebooks, and over 50 separate data files including millions of pieces of information. We loaded the data, our reports, and other papers on a World Wide Web site in February, 1996. The site address is: http://dpls.dacc.wisc.edu/choice/choice_index.html.
3. Is the Use of a "Randomized Block Design" Appropriate?
3.1 We are uncertain, because the design is never described, the technique is not cited, and cell sample sizes are not given. The authors claim that randomized block designs are a superior way to analyze natural experimental data in comparison with the standard multivariate approaches used by the vast majority of scholars. Witte, who has reviewed hundreds of achievement estimation research papers (Witte, 1992, 1996), has never seen this method used before. It appears to be a fallback to the matched-sample designs sometimes used in experiments in psychology, sociology, and medical trials. This might be an appropriate method in circumstances where the relevant blocking factors are clear. However, when trying to understand educational achievement, it is far from clear what the relevant blocking factors might be. In this study they "block" on race and grade. Why? Why not gender? Why not income? Why not parent education? All these variables have been demonstrated by prior research to be related to achievement.
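If, as their block numbering (1 to 72) suggests, the blocks are race by grade by application cohort, then 2 racial groups times 9 grades (K-8) times 4 application years gives exactly 72 blocks. A minimal sketch of how such a blocking grid and its cell sizes could be tabulated follows; the applicant records here are entirely hypothetical, since real block membership cannot be reconstructed from their paper:

```python
from collections import Counter
from itertools import product

# Hypothetical applicant records: (race, grade, application_year).
# Illustrative values only -- the paper does not report block membership.
applicants = [
    ("African American", "K", 1990),
    ("African American", "K", 1990),
    ("Hispanic", "1", 1992),
    ("African American", "3", 1993),
]

races = ["African American", "Hispanic"]
grades = ["K", "1", "2", "3", "4", "5", "6", "7", "8"]
years = [1990, 1991, 1992, 1993]

# 2 races x 9 grades x 4 cohorts = 72 blocks.
blocks = list(product(races, grades, years))
assert len(blocks) == 72

# Tabulate cell sizes -- exactly the information missing from the paper.
cell_sizes = Counter(applicants)
for block in blocks:
    n = cell_sizes.get(block, 0)
    if 0 < n < 4:
        print(f"Sparse block {block}: n = {n}")
```

The point of the sketch is that producing such a cell-size table is trivial once the blocking factors are fixed, which makes its absence from their paper all the more conspicuous.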
3.2 A very important question, which simply cannot be answered definitively from this paper, is the number of choice and non-selected choice students in each of their blocks. They allude in several places to small cells or cells with no students, but they fail to indicate whether they used matched pairs in each cell, or how many students might be in the cells. Their Table 1 is simply the numbering scheme for blocks (1 to 72), never referred to again in the text or tables. Why not give cell sizes in this table? Because the design is not described, we have no way of knowing how many choice students were matched or included in each cell, but we have counted the total number of "control" group students who have any MPS test after they applied to choice. The results, sorted according to their "block design," are presented in Table 1 below. The results are frightening. As summarized at the bottom of Table 1, 23 of the 72 blocks have no control-group students, and 20 other blocks have only 1, 2, or 3 students. Thus 60% of the blocks have fewer than 4 control students.
4. Is the Use of Non-Selected Choice Students As A Control Group Appropriate?
4.1 In Theory, Yes; In Practice, No. And in this case the design clearly biases the results in favor of choice students. As I stated years ago in my research proposal for a grant from the Spencer Foundation to fund this research, non-selected choice students could provide a unique opportunity to control for the unmeasured selectivity bias which has plagued studies trying to estimate differences in achievement between public and private school students (Witte, 1992, 1996). However, as the authors note: "The analysis depends on the assumption that the missing cases do not differ significantly from those remaining in the sample" (p. 8). This is true. To test this they analyze (without statistical significance tests, which they claim are inappropriate) the differences between selected and non-selected students on demographic and prior test data. They claim there are no meaningful differences. What they fail to analyze are the differences between the non-selected students who "continue" in the experiment by returning to MPS and those who go elsewhere to school. The latter group is comparable to being selected for the placebo shot, but then saying: "No thank you, I quit." As can be seen in Table 2, the differences between the students who continue in MPS (for whom we have subsequent test data) and those who do not point to a residual control group which comes from poorer families and has less-educated parents who are less involved in their child's education. This creates a major bias in using this control group. And of course the bias will favor choice students when they are compared to this group.
4.2 There is a further problem in that the non-selected students who did return to MPS were not tested in every year, as the choice students were. Moreover, who was tested in MPS varied by income: students in the Chapter 1 program (AFDC or free-lunch eligibility as the primary qualification) were required to be tested every year, while those not in Chapter 1 were not. Thus, again, a larger proportion of the control group with test data was likely to be poor, further biasing the results in favor of choice students. The Chapter 1 problem adds to the need to control for income (or free-lunch eligibility) in all estimates. The authors control for income in only one analysis (Table 5), and the probability results of that analysis are horrendous (see section 5.1 below).
4.3 The problems do not stop there, however. The only positive results the authors claim are for third- and fourth-year choice students, the theory being that if students remain in the private schools long enough, the magic of private-school education will have time to work. Given attrition rates of 30% per year, and the fact that three private schools closed in the middle of the school year between 1990 and 1995 (these schools were not included in the attrition rates), there is obviously a major problem in basing program success on the "survivors" after three or four years. If the attrition group is random, no bias will result. The question is whether the students who remain are a random sample of those who began, or whether in fact there is also selection out of the experimental group just as there was selection out of the control group.
4.4 Again, the answer to this question is striking, and once again it biases the results of the Peterson, et al. study in favor of choice. We analyzed attrition over four years in our cumulative Fourth Year Report. Table 18 of that report is reproduced here as Table 3. It is very clear that the students who left the MPCP on average had lower prior test scores in both reading and math, and, for students who had both pre- and post-tests one year apart, the decline in their scores was considerably greater than that of the students who continued (the two-tailed p for math is .008; for reading, .124). In addition, in all categories the parents of the students who left were less involved than those who stayed - although only involvement in organizations is significant. Finally, very clearly, the students who left were much less satisfied with the private schools (DisChScl) than those who remained.
4.5 What this means is that over time the remaining students in the private schools are a continuously refined group. Leaving these schools is a combination of family choice, counseling out, and expulsion. Because the schools do not have to hold expulsion hearings, and there was no appeal if families were counseled out, it is impossible to determine how many students fall into which category. However, for this analysis it does not matter why they left; what matters is that those who left were clearly not a random sample of those who began the program.
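The mechanics of this survivor bias can be shown with a deliberately simple, entirely hypothetical example: even if no student's score changes at all, removing the lowest scorers each year raises the surviving group's mean.

```python
# Toy cohort of NCE-style test scores (hypothetical values, not real data).
scores = [22, 28, 31, 35, 38, 40, 42, 45, 48, 51]
full_mean = sum(scores) / len(scores)  # 38.0

# Roughly 30% attrition per year for three years, with the leavers drawn
# from the bottom of the distribution (as Table 3 indicates they were).
survivors = sorted(scores)
for _ in range(3):
    cut = round(len(survivors) * 0.3)
    survivors = survivors[cut:]  # the lowest scorers exit

survivor_mean = sum(survivors) / len(survivors)  # 48.0

# No individual score changed, yet the surviving group looks much stronger.
assert survivor_mean > full_mean
```

The third- and fourth-year "survivors" in their analysis are the product of exactly this kind of non-random refinement, which is why comparing them to a control group that has itself been depleted in the opposite direction is doubly misleading.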
4.6 In conclusion, while the use of non-selected versus selected choice students as a natural experiment is theoretically of interest, in practice, because neither the non-selected choice students who continue in MPS nor the students who remain in choice are random samples of the beginning groups, the design is extremely questionable, and both forms of bias clearly favor superior results for choice students who remain in the program three or four years.
5. Do Their Results, However Derived, Support Their Conclusions?
5.1 Only If One Has Faith in Misspecified Models, Based on Small Sample Sizes, Which Disregard Conventional Probability Levels. Throughout this paper, the authors use the terms "substantively significant" or "substantively important" rather than "statistically significant." And for good reason: most of their results are not statistically significant at conventional levels. They attempt to misdirect the reader into looking only at one-tailed probability tests because they hypothesize that choice students will perform better (rather than testing for deviation from 0 between the groups). That argument is absurd given that their coefficients go in both positive and negative directions, and that they report similar results from our previous studies. The arrogance of this cavalier approach to statistical inference reaches its height in Table 5, the only model which attempts to control for anything other than gender. The two-tailed tests reported in that table for the third- and fourth-year choice students in math and reading are .11, .28, .17, and .59.
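For readers checking the arithmetic, the relation between one- and two-tailed p-values is a simple halving or doubling, and the halving is legitimate only when the coefficient runs in the hypothesized direction. A small sketch, using the four two-tailed probabilities from their Table 5 quoted above:

```python
def one_tailed_from_two(p_two, sign_matches_hypothesis):
    """One-tailed p from a two-tailed p. Halving is valid only when the
    coefficient runs in the hypothesized direction; otherwise the
    one-tailed p is large, not small."""
    return p_two / 2.0 if sign_matches_hypothesis else 1.0 - p_two / 2.0

# The two-tailed p-values from their Table 5 (third- and fourth-year
# choice students, math and reading):
table5_two_tailed = [0.11, 0.28, 0.17, 0.59]

# None is significant at the conventional two-tailed .05 level:
assert all(p > 0.05 for p in table5_two_tailed)

# Even halved to one-tailed values -- generously assuming every
# coefficient had the hypothesized pro-choice sign -- none clears .05:
one_tailed = [one_tailed_from_two(p, True) for p in table5_two_tailed]
assert all(p > 0.05 for p in one_tailed)

# And a coefficient with the "wrong" sign makes the one-tailed p very
# large rather than very small:
print(one_tailed_from_two(0.11, False))
```

This is why the one-tailed framing is not an innocent choice: it silently assumes away the negative coefficients that appear in their own results.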
5.2 The publicity barrage surrounding the release of this paper - before it was subject to any kind of peer review - centered on the claims derived from Table 4, as highlighted and graphically depicted in a Wall Street Journal article. That table is based on post-test scores only (with no control for prior scores and thus no indication of value-added achievement), and with only a control for gender. By using only the post-test score, all the biases of the differential samples - a poorer control group and higher-performing third- and fourth-year choice survivors - come through in unmitigated fashion. Further, they fail to control for a host of other variables which have repeatedly been shown to be related to educational attainment (mother's education, family income, parental expectations, parental involvement, etc.). This creates major misspecification problems which bias the coefficients upward. Finally, as noted above, the results for reading do not even approach standard levels of statistical significance (p-values of .16 and .25). But there is no doubt that among choice supporters these simple numbers will be endlessly referred to, grow in certainty, and perhaps even in size; larger graphs would make them appear even more impressive.
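The difference between a post-test-only comparison and a value-added comparison can be seen in a deliberately simple, hypothetical example: two groups that gain identical amounts, where one group merely starts with higher scores.

```python
# Hypothetical (made-up) students: (prior_score, post_score, in_choice).
# Both groups gain exactly 2 points; the "choice" group simply starts higher.
students = [
    (40, 42, True), (44, 46, True), (48, 50, True),
    (30, 32, False), (34, 36, False), (38, 40, False),
]

def mean(xs):
    return sum(xs) / len(xs)

choice_post = mean([post for _, post, c in students if c])
control_post = mean([post for _, post, c in students if not c])

# Post-test-only comparison: a large, entirely spurious "effect".
post_only_effect = choice_post - control_post  # 10.0

# Value-added comparison (gain scores, controlling for the prior): zero.
choice_gain = mean([post - prior for prior, post, c in students if c])
control_gain = mean([post - prior for prior, post, c in students if not c])
value_added_effect = choice_gain - control_gain  # 0.0

print(post_only_effect, value_added_effect)
```

When, as shown above, the surviving choice group and the depleted control group differ systematically in prior achievement and family background, a post-test-only estimate cannot distinguish a school effect from a starting-point difference.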
5.3 The one value-added model they test, with prior tests as a control, is based on only 26 choice and control students in the fourth year; it produces an insignificant but negative coefficient for reading (favoring the control group), and the positive coefficient for math has a probability of .31. In addition, the model contains no other controls, such as family income or parent education.
6. Is the Comparison Between Choice and a Random Sample of MPS Students Inappropriate?
6.1 Absolutely Not. The authors spend over half their paper not analyzing, appropriately presenting, or explaining their own analysis, but criticizing the design and presentation of our research. Their major claims are: (1) that we should not be comparing choice students to MPS students because it is an inappropriate sample; and (2) that because MPS students do not have the same characteristics as choice students, we cannot adequately control for the differences between the populations.
6.2 The first of these claims depends on how the results are to be generalized and for what policy purpose. The groups they study are in one sense unique. Applications to the MPCP ranged from 577 to 1049 over four years, compared with approximately 60,000 other MPS students who were eligible each year. As stated earlier, the choice applicants were doing poorly in MPS and their parents were very alienated from the system. If the program is only to apply to disaffected minority students who are looking for an alternative to public schools because they are doing poorly in MPS schools, then comparison to the wider population would not be as meaningful in a policy sense. However, if the intent is to expand this program to include a much wider set of students, then the MPS control group is not only appropriate but essential. And that expansion of the program is not only what has been passed by our legislature; it is also what Peterson argued for before a Dane County, WI court on August 14, 1996. Ironically, he used this study - released two days earlier, with its narrow and biased control group - to support his arguments to expand the program to include up to 15,000 students in both independent and parochial schools.
6.3 The second claim - that because the MPS and choice students have different characteristics, accurate estimates of achievement gains cannot be made - would, if true, invalidate mountains of multivariate research attempting to compare different types of schools with different types of students. It would wipe out all the literature on public and private school differences going back two decades (including the pro-private-school Coleman, et al. studies (1982, 1987) and the Chubb and Moe (1990) study). The issue involves the adequacy of controls used in estimation models. Our studies have relied on a wide range of estimation models, with different levels of controls depending on the data sources and sample sizes. Despite what the authors claim on pp. 22-24, we have estimated these models for each year as well as a "stacked" four-year model (Fourth Year Report, Tables 12-15). Unlike this study, all of our regression estimates use "value-added" models, which control for prior learning as a way of gauging the effects of schools in terms of what they "add" to a student's achievement. We have also been very careful to run regressions with and without survey variables, and we have tested the results of including survey variables using weights to offset race and income response biases (Fourth Year Report, Tables 12-15; Fifth Year Report, Tables 11 and 12). Finally, we have modeled choice effects as a single indicator variable (in choice or not) and, alternatively, as a series of indicator variables designating the years students were in choice. In all of our estimates, with all of these various controls - far more than employed in this study - there is no consistently different effect for choice versus MPS students. And the years in choice make no difference.
7. A Final Note
Because of the media and political impact of this paper, I felt compelled to reply - once. Given the quality and obvious intent of this research, I do not envision responding to any subsequent research or writings these authors produce. It is my hope that other scholars will avail themselves of the data we have spent so much time collecting and preparing for dissemination. That would provide a real forum for scientific and then policy discussion of this very important issue.
TABLE 1. NON-SELECTED CHOICE (CONTROL GROUP) STUDENTS WITH SOME POST-APPLICATION TEST.7

| Grade and Ethnicity | Year of Application: 1990 | 1991 | 1992 | 1993 |
|---|---|---|---|---|
| Kindergarten | | | | |
| African American | 6 | 2 | 0 | 0 |
| Hispanic | 0 | 0 | 0 | 0 |
| First Grade | | | | |
| African American | 17 | 8 | 19 | 19 |
| Hispanic | 5 | 0 | 1 | 0 |
| Second Grade | | | | |
| African American | 11 | 6 | 18 | 8 |
| Hispanic | 6 | 0 | 0 | 0 |
| Third Grade | | | | |
| African American | 3 | 2 | 17 | 13 |
| Hispanic | 2 | 0 | 1 | 0 |
| Fourth Grade | | | | |
| African American | 3 | 4 | 12 | 13 |
| Hispanic | 3 | 0 | 2 | 0 |
| Fifth Grade | | | | |
| African American | 12 | 2 | 15 | 14 |
| Hispanic | 3 | 0 | 3 | 0 |
| Sixth Grade | | | | |
| African American | 7 | 10 | 11 | 11 |
| Hispanic | 2 | 0 | 1 | 0 |
| Seventh Grade | | | | |
| African American | 5 | 6 | 16 | 11 |
| Hispanic | 3 | 3 | 1 | 0 |
| Eighth Grade | | | | |
| African American | 3 | 1 | 4 | 4 |
| Hispanic | 0 | 1 | 0 | 0 |
| Total | | | | |
| African American | 67 | 41 | 112 | 74 |
| Hispanic | 24 | 4 | 9 | 0 |

Table Summary: Of the 72 blocks, 23 contain no control-group students and 20 others contain only 1, 2, or 3 students; thus 60% of the blocks have fewer than 4 control students.

7. Because choice students applied to choice in multiple years, and because the paper does not explain how choice students were matched or placed in blocks, it is impossible, based on their study, to construct a cell-size table for selected choice students. The authors do not provide any block sizes for their study.
TABLE 2. CHARACTERISTICS OF NON-SELECTED CHOICE STUDENTS WHO RETURN TO MPS (AND ARE IN THE STUDY) AND THOSE WHO DO NOT (AND ARE NOT IN THE STUDY).1

| Characteristic | Returning to MPS (In the Study) | Not Returning to MPS (Not in the Study) |
|---|---|---|
| Mother's Education: Mean Years | 3.9 | 4.2 |
| (N) | (135) | (89) |
| Annual Family Income: Mean $ | $11,666 | $12,114 |
| (N) | (129) | (90) |
| Family Size: Mean Number of Children | 2.8 | 2.5 |
| (N) | (136) | (95) |
| Parent Contacting School: Scale Mean (Higher = More) | 7.95 | 9.36 |
| (N) | (122) | (66) |
| School Contacting Parent: Scale Mean (Higher = More) | 3.06 | 3.16 |
| (N) | (121) | (68) |
| Parent in School Organizations: Scale Mean (Higher = More) | 2.21 | 2.61 |
| (N) | (119) | (66) |
| Parent Involvement at Home: Scale Mean (Higher = More) | 8.66 | 9.00 |
| (N) | (128) | (80) |

1. Parental involvement scale questions and statistical properties are available for all subsets in our studies in Witte, et al., Fourth Year Report: Milwaukee Parental Choice Program.
TABLE 3. Mean Differences Between Continuing and Attrition Choice Students: 1990-1993.

| Variable/Scale | Continuing: Mean | Std. Dev. | (N) | Attrition: Mean | Std. Dev. | (N) | p |
|---|---|---|---|---|---|---|---|
| Prior RNCE | 38.64 | 15.97 | (832) | 36.87 | 17.68 | (326) | .115 |
| Post RNCE | 38.33 | 15.54 | (1142) | 36.62 | 17.01 | (413) | .073 |
| RNCE Diff. | -0.50 | 14.45 | (805) | -1.86 | 12.76 | (312) | .124 |
| Prior MNCE | 40.20 | 18.13 | (844) | 37.95 | 18.83 | (326) | .065 |
| Post MNCE | 40.81 | 18.39 | (1191) | 37.80 | 18.07 | (417) | .004 |
| MNCE Diff. | 0.91 | 15.24 | (818) | -1.79 | 14.95 | (306) | .008 |
| DisChScl | 13.27 | 4.20 | (757) | 14.98 | 5.65 | (173) | .000 |
| PiParScl | 10.24 | 4.54 | (938) | 10.54 | 4.79 | (248) | .391 |
| PiSclPar | 4.31 | 3.06 | (936) | 4.48 | 2.93 | (247) | .413 |
| PiSclOrg | 3.43 | 1.56 | (919) | 3.45 | 1.63 | (237) | .837 |
| ParChild | 9.98 | 3.65 | (926) | 8.87 | 3.66 | (243) | .950 |
| EdImport | 11.38 | 1.89 | (928) | 11.39 | 1.87 | (239) | .912 |
| DistChScl | 3.07 | 2.42 | (1300) | 3.36 | 2.92 | (547) | .046 |
| Income (K$) | 11.40 | 7.42 | (855) | 11.16 | 8.23 | (294) | .665 |

Variables: Prior and post-math (MNCE) and reading (RNCE); DisChScl = dissatisfaction with choice school; PiParScl = parent contact with school; PiSclPar = school contact with parent; PiSclOrg = parent involvement in school organizations; ParChild = parent involvement at home; EdImport = importance of education; DistChScl = distance from choice school (miles).

Source: Witte, et al., Fourth Year Report, Table 18, p. 23.
References

Chubb, John, and Moe, Terry. 1990. Politics, Markets, and America's Schools. Washington, D.C.: Brookings Institution.

Coleman, James, Hoffer, Thomas, and Kilgore, Sally. 1982. High School Achievement. New York: Basic Books.

Coleman, James, and Hoffer, Thomas. 1987. Public and Private High Schools. New York: Basic Books.

Witte, John F. 1992. "Private Versus Public School Achievement: Are There Findings That Should Affect the Educational Choice Debate?" Economics of Education Review 11 (December): 371-394.

________. 1996. "School Choice and School Performance." In Helen Ladd (ed.), Holding Schools Accountable: Performance-Based Reform in Education. Washington, D.C.: Brookings Institution.

Witte, John F., Thorn, Chris, Pritchard, Kimberly, and Claibourn, Michele. 1994. Fourth Year Report: Milwaukee Parental Choice Program. Madison, WI: Wisconsin Department of Public Instruction.

Witte, John F., Thorn, Chris, and Pritchard, Kimberly. 1995. Fifth Year Report: Milwaukee Parental Choice Program. Madison, WI: Wisconsin Department of Public Instruction.