Sunday, June 12, 2011

Evaluating Teachers’ Performance

The Michigan State Legislature has decreed that, beginning this fall, all public school districts must annually evaluate all teachers and principals in a way that considers student growth over time as a significant factor, and must translate those evaluations into satisfactory or unsatisfactory ratings for each staff member.

On the face of it, the plan sounds reasonable: why not measure educators’ performance to a significant degree by learning outcomes for students? After all, good learning outcomes are arguably the primary purpose of schooling, are they not?

My question would be whether this is the best way to achieve them. There are other issues, such as the reliability of our testing methods for determining student knowledge and skill levels, the difficulty of evaluating teachers who do not teach tested subjects, and the efficiency of trying to improve teaching with threats rather than professional development and support. But today, I want to consider the simple cost of such evaluations and a different way to spend that kind of money to better achieve our goal of student learning.

What will it cost?

To start with, we can throw out the MEAP tests for this purpose. In order to fairly evaluate student growth over time, we need to test students at the beginning of each year to establish baselines and at the end of each year to measure growth. To give the testing an educational role as well as an evaluative one, it would also have to be given several times during the year, to enable course corrections and interventions for students who are not doing as well as expected.

You can get an idea of what this will cost by what the Ann Arbor Public Schools (AAPS) has just committed to spending for norm-referenced Northwest Evaluation Association (NWEA) testing:

  • $62,856 annually for K–2 testing (plus Scarlett MS)
  • $49,491 annually for grades 3–5 testing
  • $51,611 annually (starting next year) for grades 6–8 testing
  • $31,500 for the new server space this testing will require (one-time expenditure)
  • Unknown but significant amount for new computers required for middle school testing
  • An estimated $50,000 in two years to pilot software for another evaluation model

That amounts to about $164,000 per year in K–8 testing costs alone, after a significant up-front investment in computers and servers to handle it, followed by a costly software pilot, presumably leading to a much more expensive broader implementation later. This is not chump change. Can you say, “unfunded mandate”? Note that districts are supposed to come up with the funds for this evaluation system at the exact same time that our per-pupil funding has been cut by $470 and our retirement costs have increased by 18% (even with the one-year reprieve in the rate increase).
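For anyone who wants to check the arithmetic behind that figure, here is a minimal, purely illustrative tally in Python of the recurring annual line items above; the server space, new computers, and the two-years-out software pilot are one-time or future costs, so they are left out of the yearly total:

    # Quick tally of the recurring annual NWEA testing costs listed above.
    # One-time items (server space, new computers) and the future software
    # pilot are excluded, since they are not yearly operating costs.

    annual_testing_costs = {
        "K-2 (plus Scarlett MS)": 62_856,
        "grades 3-5": 49_491,
        "grades 6-8 (starting next year)": 51_611,
    }

    total = sum(annual_testing_costs.values())
    print(f"Recurring K-8 testing cost: ${total:,} per year")
    # -> Recurring K-8 testing cost: $163,958 per year (about $164,000)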

My understanding of the NWEA testing is that it is actually efficient and useful. Because the test responds dynamically to student input, its difficulty level adjusts up or down to give a good estimate of just what each individual student knows and can do. And because it is computerized, with nearly instant results, it provides specific information that is useful to teachers.

The next step

But testing, whether for teacher evaluation or to allow timely adaptation by teachers, is only part of what we need. It may diagnose a problem, but it does not solve it. Suppose a child is not learning, or a teacher does not seem to be teaching effectively; then what? To pursue our goal of student achievement, we need ways to help teachers do their job better: professional development that works. Montgomery County, Maryland, where I grew up, has an innovative way of coaching teachers who need help. Its Peer Assistance and Review (PAR) program formally mentors both new teachers and veterans who are underperforming, according to their principals’ evaluations.

Intensive help in the form of modeling, planning, coaching, and reviewing instruction is provided for a year by experienced and highly qualified Consulting Teachers (CTs). CTs induct new teachers into the school culture, providing practical tips and demonstrating what good teaching looks like. They also give struggling teachers intensive support and assistance to improve their practice. CTs receive the same Observing and Analyzing Teaching courses offered to principals. After a three-year rotation in the CT position, they return to the classroom, bringing improved leadership and communication skills back to colleagues who are not in such dire need of help.

After these year-long interventions, a PAR panel of teachers and principals reviews the CT data and report and evaluates whether staff are meeting the district’s six standards of effective teaching: commitment to students and their learning; knowledge of their subject and how to teach it; maintaining a positive learning environment; continually assessing student progress and adapting instruction to improve it; commitment to continuous improvement in their practice; and exhibiting a high degree of professionalism. The panel then recommends that teachers be returned to the regular professional growth cycle, be given a second year of support, or be dismissed.

While the point of the system is not simply to find and fire “bad teachers,” teachers who need to improve but do not are, in fact, dismissed. According to a recent New York Times recap of the PAR program, panels have voted to fire 200 teachers in the past 11 years, and 300 more have left rather than go through the PAR process. (Keep in mind that school districts in Maryland are county-wide and therefore very large; MCPS has nearly 150,000 students.) For comparison, in the ten years before PAR, only five teachers were fired.

What is more important, though, is that hundreds of new teachers were effectively mentored through their always-difficult first year, and hundreds more struggling teachers were helped to become the effective and professional instructors that all children deserve. Moreover, the program’s careful design, and equal panel representation of teachers and administrators chosen by their unions, have resulted in real trust and buy-in by all parties. The program is seen not as merely punitive but as a genuine opportunity to improve professional practice. Even before that trust was built, a 2004 report found that most tenured teachers who were put into the program were grateful for it afterward, acknowledging that it made them better teachers.

In other words, the PAR program improves student outcomes by improving teacher practice. Yet it does not meet federal standards for school improvement.

Elevating means above ends

Federal school improvement grants awarded through Race to the Top are intended to accelerate student growth, but the program is very specific about just how that growth must be demonstrated. Specifically, districts are required to evaluate teacher quality by means of students’ test scores, just as Michigan now requires. MCPS, already getting results the rest of us would envy, had to turn down the $12 million it could have received from Maryland’s RttT grant. As its superintendent noted, “We don’t believe the tests are reliable. You don’t want to turn your system into a test factory.”

Well, that horse is already out of the barn here, I’d say. Tests are moving, metaphorically, from “the important thing” to “the only thing.” We seem to have forgotten their point.

Another way

Suppose, instead, we diverted some of the additional money we will now have to spend on testing to a proven professional development model like PAR. A handful of master teachers assigned to coach new and struggling staff on an ongoing basis could do more than define the outlines of a problem, as testing does. They could actually solve it. Building better teachers produces better student achievement. Is that so hard to believe?