Tuesday, August 31, 2010

California Teachers: Hot or Not?

Value Added and Reduced.

This week, the LA Times published the findings of a study of teacher effectiveness. Some 6,000 third- through fifth-grade teachers, from all over California, were rated and ranked from “most effective” to “least effective.” The top 100 were highlighted in their own column.

Bombshell! Teachers across California must be rattled. In a world where tenure has reigned supreme for decades, suddenly teachers are being reduced to numbers. A California fourth grade teacher can look herself up, on the LA Times website, and find her rank. And so can anyone else.

How did they decide who was Hot and who was Not?

The LA Times based this ranking on one statistical value derived from one source: standardized tests. While not complicated, it does take a little explanation:
1. Researchers compared a student's performance on state-wide standardized tests from one year to the next.
2. The difference in scores reflected a higher score, a lower score, or the same score.
3. After each student's net change score was tabulated, the students were grouped by teacher.
4. Each teacher group's average net change score was determined based on these numbers.
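
For those who like to see the arithmetic, here is a minimal sketch of the four steps as summarized above. The student records, teacher names, and the function name value_added are all made up for illustration; this is not the actual model or data behind the Times rankings, just the simple "average net change" idea.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (student, teacher, last year's score, this year's score)
records = [
    ("s1", "Teacher A", 60, 70),
    ("s2", "Teacher A", 55, 60),
    ("s3", "Teacher B", 80, 78),
    ("s4", "Teacher B", 90, 90),
]

def value_added(records):
    """Average each teacher's students' year-over-year score change."""
    changes_by_teacher = defaultdict(list)
    for _student, teacher, prior, current in records:
        # Steps 1-2: net change can be positive, negative, or zero
        changes_by_teacher[teacher].append(current - prior)
    # Steps 3-4: group by teacher, then average the net changes
    return {t: mean(changes) for t, changes in changes_by_teacher.items()}

scores = value_added(records)
# Ranking teachers is then just a sort on that single number
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores)
print(ranking)
```

Notice how little goes into the final number: two test scores per student and nothing else.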

This “Value Added” score, the Times argues, allows for anyone to see what “value” any teacher has added to any of his students. Simple. Clean. A raw piece of data-- just what we love.

Ah, but not so fast. Data's beauty is in the eyes of the beholder.

Those who challenge the data have some great arguments against the analysis. First, the data all hang on the quality of the standardized tests. A 'good standardized test' has proven to be the single hardest thing to build in the realm of student evaluation. Holy Grail hard.

Second, it's not that ONE test has to be terrific: the first year AND the second year test must give an accurate measure of a student's acumen. And then every subsequent test that is used in this manner. Furthermore, these tests must be built together in order to measure growth; the tests must work in concert in order to have comparative value.

Finally, the 'Value Added' score doesn't measure ability-- it only measures change. Take one example of the many scenarios that skew the data. Think about what happens when comparing a teacher with kids who start with an average of 95/100 to a teacher with kids who start out with an average of 45/100. Kids who already score 95 have almost no room to rise, so their measured change stays small no matter how well they are taught. The teacher with the better kids, weirdly, has a terrific chance of being ranked as a poor teacher.
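
A quick sketch of that ceiling problem, under a big simplifying assumption: both teachers would add the same 10 points of "true" growth if the test had no maximum score. The class rosters and the function name average_gain are hypothetical.

```python
MAX_SCORE = 100  # the test cannot report anything above this

def average_gain(start_scores, true_growth):
    """Average measured gain when every score is capped at MAX_SCORE."""
    gains = [min(s + true_growth, MAX_SCORE) - s for s in start_scores]
    return sum(gains) / len(gains)

high_class = [95, 95, 95, 95]  # kids who start near the top
low_class = [45, 45, 45, 45]   # kids with plenty of room to grow

high_gain = average_gain(high_class, 10)
low_gain = average_gain(low_class, 10)
print(high_gain, low_gain)
```

Same teaching, by assumption, yet the class that started at 95 can only show half the gain of the class that started at 45.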

However, those who support the data build a strong case, too. First, every teacher (and student) is measured using the same test. Consequently, no matter how 'good' or 'bad' the test is, everyone is handled with equal fairness (and unfairness). This is true even in comparing first year to second year tests-- it's fair because everyone is measured with, theoretically, the same stick.

Second, no small number of students who had 'a bad day' will skew the data simply because all of the students took the test. Arguably, a nearly equal number of students would have had 'a great day' taking the test. The size of the sample smooths out the inconsistencies or anomalies. The more years this data is collected, the less the effect of accidentals.
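
The smoothing argument can be sketched with a small simulation. It assumes that 'good day' and 'bad day' effects are random noise centered on zero (modeled here as a bell curve with a made-up spread of 10 points), which is exactly the assumption supporters are making. The function name and all numbers are hypothetical.

```python
import random
from statistics import mean, pstdev

def spread_of_group_averages(group_size, trials=200, seed=0):
    """How much a group's average day-to-day noise varies across trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    averages = []
    for _ in range(trials):
        # Each student's score is nudged up or down by random luck
        noise = [rng.gauss(0, 10) for _ in range(group_size)]
        averages.append(mean(noise))
    return pstdev(averages)

small_group = spread_of_group_averages(5)
large_group = spread_of_group_averages(500)
print(small_group, large_group)
```

The larger the group, the closer the lucky and unlucky days come to canceling out. Worth noting: a single teacher's class in a single year looks a lot more like the small group than the large one, which is why collecting more years of data matters.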

Finally, supporters say, the data show that where a kid comes from (class, ethnicity, and gender) is far less important than the kind of teacher she has. This is good because we can finally separate the data about student achievement from the data about teacher achievement. The Value Added model focuses on growth as opposed to talent or natural endowment. (Richard Buddin, who wrote a paper about the data collection, loves to say the word 'endowment.' You have to love using that word earnestly. I implore you to work the word 'endowment' into conversations you have today.)

Find Buddin's work here: http://www.latimes.com/media/acrobat/2010-08/55538493.pdf

These are just the arguments about the data and its collection. The hot and nasty arguments are about philosophy, cultural values, and professionalism. Rightfully, some of the argument centers on whether or not information about teachers' evaluations should be made public.

I'd like to offer a broader perspective.

The idea of ranking teachers is ridiculous. It doesn't serve to improve the profession; it only serves the media's love of bite-sized information that allows for thin conclusions. Henry I. Braun, in his treatment of Value Added Assessment, warns of the dangers of "casual interpretation" of data. The amount of ugliness that will follow the Times posting this information may very well swallow any other productive dialogue.

It takes a quality blindfold to pretend that something is rank-able just because it has a number attached to it. It's a shame that the solid journalism that supported this research has been shared in this way. The LA Times' decision to rank the teachers has turned California into a Miss America contest. A beauty pageant isn't about beauty: it's about winners and losers. Revamping education will require a much wider, more informed view than that.

That said, evaluating teachers (like evaluating anything) requires some hard statistical data. Toss the ranking; keep the data. The “Value Added” score provides non-anecdotal information about a teacher. For this reason, this statistical information, despite its potential flaws, is not only useful. It is important.

Stats like “Value Added” are essential. And, they are the future.

As you will read in my long-awaited final installment of “Scapegoats and Saviors IV,” at least part of teacher evaluation must come from a number like this. It cannot-- and should not-- be avoided.

Check out the LA Times to read the article yourself. I'd love to hear what you think about it.

If you really want to dork out (and I highly recommend doing so), peruse the scholarly work (including that of Buddin and Braun) on how and why they did the Value Added model.
lookee here:


  1. What about those teachers with students scoring at a level that does not allow for improvement? Say the teacher of an AP class who has students who regularly ace standardized tests; how can they be graded next to teachers who have students with room for improvement? Standardized testing is rather flawed, and to have a teacher be graded on their ability to have students adhere to its demands seems rather over-simplistic. And why do teachers have to be boiled down to numbers anyway?

  2. "And why do teachers have to be boiled down to numbers anyway?"

    It's a great question. I think that we must avoid reducing the way we value teachers. Too often, we want to rate everything quickly and simply-- this is a mistake. You say it's "over-simplistic." I agree entirely.

    You also bring up a serious problem with the Value Added model. Yes, teachers of high scoring kids would be at a disadvantage. Every number will have a problem.

    My idea: use numbers sparingly. I am convinced we need to have a quantitative value in teacher evals-- for practical and political reasons. I'll explain how I think we should do this here in the next week.

    Thanks for reading and commenting, Anonymous. :)

  3. Ahhh numbers. They are so beautifully simple and yet so tricky to attach real meaning to. Accountability in education is an essential, overdue, wonderful component to how we now look at teaching. Looking at outcomes in education is as essential as it is when you are managing a business. Numbers give the ability to really look at statistics that measure success. However, sometimes they can lead us to conclusions that are not clear or even accurate, especially when dealing with sloppy things like the education of a child or the effectiveness of a teacher.