Tuesday, August 31, 2010
California Teachers: Hot or Not?
Value Added and Reduced.
This week, the LA Times published the findings of a study of teachers and effectiveness. 6,000 third- through fifth-grade teachers from all over California were rated and ranked from “most effective” to “least effective.” The top 100 were highlighted in their own column.
Bombshell! Teachers across California must be rattled. In a world where tenure has reigned supreme for decades, suddenly teachers are being reduced to numbers. A California fourth-grade teacher can look herself up on the LA Times website and find her rank. And so can anyone else.
How did they decide who was Hot and who was Not?
The LA Times developed this ranking from one statistical value derived from one source: standardized tests. While not complicated, it does take a little explanation:
1. Researchers compared each student's performance on statewide standardized tests from one year to the next.
2. The difference reflected whether the student scored higher, lower, or the same.
3. After each student's net change score was tabulated, the students were grouped by teacher.
4. Each teacher's average net change score was computed from the scores of that group.
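The four steps above boil down to very little arithmetic. Here is a minimal sketch in Python with made-up students, teachers, and scores (all hypothetical; the Times' actual model, described in Buddin's paper, adjusts for much more than this):

```python
# Steps 1-4 of the "Value Added" calculation, with hypothetical data.
from collections import defaultdict

# (student, teacher, last_year_score, this_year_score)
records = [
    ("Ana",   "Ms. Ruiz", 62, 71),
    ("Ben",   "Ms. Ruiz", 80, 78),
    ("Carla", "Mr. Ito",  55, 60),
    ("Dev",   "Mr. Ito",  90, 95),
]

# Steps 1-2: net change per student; Step 3: group those changes by teacher
changes = defaultdict(list)
for student, teacher, last_year, this_year in records:
    changes[teacher].append(this_year - last_year)

# Step 4: each teacher's average net change is the "value added" score
value_added = {t: sum(c) / len(c) for t, c in changes.items()}
print(value_added)  # {'Ms. Ruiz': 3.5, 'Mr. Ito': 5.0}
```

Notice that the score never looks at how high or low the students' marks actually were, only at how much they moved.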
This “Value Added” score, the Times argues, allows anyone to see what “value” any teacher has added to any of his students. Simple. Clean. A raw piece of data-- just what we love.
Ah, but not so fast. Data's beauty is in the eyes of the beholder.
Those who challenge the data have some great arguments against the analysis. First, the entire analysis hangs on the quality of the standardized tests. A 'good standardized test' has proven to be the single hardest thing to build in the realm of student evaluation. Holy Grail hard.
Second, it's not enough for ONE test to be terrific: the first-year test AND the second-year test must each give an accurate measure of a student's acumen-- and so must every subsequent test used this way. Furthermore, these tests must be built together in order to measure growth; the tests must work in concert to have comparative value.
Finally, the 'Value Added' score doesn't measure ability-- it only measures change. Take one example of the many scenarios that skew the data. Think about what happens when comparing a teacher with kids who start with an average of 95/100 to a teacher with kids who start out with an average of 45/100. The teacher with the better kids, weirdly, has a terrific chance of being ranked as a poor teacher.
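The scenario above, sometimes called a ceiling effect, is easy to demonstrate with made-up numbers (hypothetical classes, test capped at 100):

```python
# Hypothetical illustration of the ceiling effect: students near the
# top of a capped scale have almost no room to show measurable growth.
start_high = [95, 97, 94]   # teacher A's class, already near the ceiling
end_high   = [97, 98, 96]   # tiny gains are all the scale allows

start_low  = [45, 40, 50]   # teacher B's class, plenty of headroom
end_low    = [60, 55, 62]   # big gains are possible

def avg_change(start, end):
    return sum(e - s for s, e in zip(start, end)) / len(start)

print(avg_change(start_high, end_high))  # about 1.67: teacher A looks weak
print(avg_change(start_low, end_low))    # 14.0: teacher B looks like a star
```

Both teachers may be doing excellent work, but the change-only metric can rank the one with stronger students far lower.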
However, those who support the data build a strong case, too. First, every teacher (and student) is measured using the same test. Consequently, no matter how 'good' or 'bad' the test is, everyone is handled with equal fairness (and unfairness). This is true even when comparing first-year to second-year tests-- it's fair because everyone is measured with, theoretically, the same stick.
Second, no small number of students who had 'a bad day' will skew the data simply because all of the students took the test. Arguably, a nearly equal number of students would have had 'a great day' taking the test. The size of the sample smooths out the inconsistencies or anomalies. The more years this data is collected, the less the effect of accidentals.
Finally, supporters say, the data show that where a kid comes from (class, ethnicity, and gender) is far less important than the kind of teacher she has. This is good because we can finally separate the data about student achievement from the data about teacher achievement. The Value Added model focuses on growth as opposed to talent or natural endowment. (Richard Buddin, who wrote a paper about the data collection, loves to say the word 'endowment.' You have to love using that word earnestly. I implore you to work the word 'endowment' into conversations you have today.)
Find Buddin's work here: http://www.latimes.com/media/acrobat/2010-08/55538493.pdf
These are just the arguments about the data and its collection. The hot and nasty arguments are about philosophy, cultural values, and professionalism. Rightfully, some of the argument centers on whether or not information about teachers' evaluations should be made public.
I'd like to offer a broader perspective.
The idea of ranking teachers is ridiculous. It doesn't serve to improve the profession; it only serves the media's love of bite-sized information that allows for thin conclusions. Henry I. Braun, in his treatment of Value Added Assessment, warns of the dangers of "casual interpretation" of data. The amount of ugliness that will follow the Times posting this information may very well swallow any other productive dialogue.
It takes a quality blindfold to pretend that something is rank-able just because it has a number attached to it. It's a shame that the solid journalism that supported this research has been shared in this way. The LA Times' decision to rank the teachers has turned California into a Miss America contest. A beauty pageant isn't about beauty: it's about winners and losers. Revamping education will require a much wider, more informed view than that.
That said, evaluating teachers (like evaluating anything) requires some hard statistical data. Toss the ranking; keep the data. The “Value Added” score provides non-anecdotal information about a teacher. For this reason, this statistical information, despite its potential flaws, is not only useful. It is important.
Stats like “Value Added” are essential. And, they are the future.
As you will read in my long-awaited final installment of “Scapegoats and Saviors IV,” at least part of teacher evaluation must come from a number like this. It cannot-- and should not-- be avoided.
Check out the LA Times to read the article yourself. I'd love to hear what you think about it.
If you really want to dork out (and I highly recommend doing so), peruse the scholarly work (including that of Buddin and Braun) on how and why they did the Value Added model.