A Broad Notion of Teacher Effectiveness: Great Teaching
often what they recall. Learning to set goals, take risks and responsibility, or simply believe in
oneself are often fodder for fond thanks—alongside mastering pre-calculus, becoming a critical
reader, or remembering the capital of Turkmenistan.
It’s a dynamic mix, one that captures the broad charge of a teacher: to teach students the skills
they’ll need to be productive adults. But what, exactly, are these skills? And how can we determine
which teachers are most effective in building them?
Test scores are often the best available measure of student progress, but they do not capture every
skill needed in adulthood. A growing research base shows that non-cognitive (or socio-emotional)
skills like adaptability, motivation, and self-restraint are key determinants of adult outcomes.
Therefore, if we want to identify good teachers, we ought to look at how teachers affect their
students’ development across a range of skills—both academic and non-cognitive.
A robust data set on 9th-grade students in North Carolina allows me to do just that. First, I create a
measure of non-cognitive skills based on students’ behavior in high school, such as suspensions
and on-time grade progression. I then calculate effectiveness ratings based on teachers’ impacts on
both test scores and non-cognitive skills and look for connections between the two. Finally, I explore
the extent to which measuring teacher impacts on behavior allows us to better identify those truly
excellent educators who have long-lasting effects on their students.
I find that, while teachers have notable effects on both test scores and non-cognitive skills, their
impact on non-cognitive skills is 10 times more predictive of students’ longer-term success in high
school than their impact on test scores. We cannot identify the teachers who matter most by using
test-score impacts alone, because many teachers who raise test scores do not improve non-
cognitive skills, and vice versa.
These results provide hard evidence that measuring teachers’ impact through their students’ test
scores captures only a fraction of their overall effect on student success. To fully assess teacher
performance, policymakers should consider measures of a broad range of student skills, classroom
observations, and responsiveness to feedback alongside effectiveness ratings based on test scores.
A Broad Notion of Teacher Effectiveness
Individual teacher effectiveness has become a major focus of school-improvement efforts over the
last decade, driven in part by research showing that teachers who boost students’ test scores also
affect their success as adults, including being more likely to go to college, have a job, and save for
retirement (see “Great Teaching,” research, Summer 2012). Economists and policymakers have used
students’ standardized test scores to develop measures of teacher performance, chiefly through a
formula called value-added. Value-added models calculate individual teachers’ impacts on student
learning by charting student progress against what they would ordinarily be expected to achieve,
controlling for a host of factors. Teachers whose students consistently beat those odds are
considered to have high value-added, while those whose students consistently don’t do as well as
expected have low value-added.
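The logic of a value-added model can be sketched in a few lines. The sketch below is a simplified illustration, not the paper's actual specification: it predicts each student's 9th-grade score from the prior-year score alone and averages the residuals by teacher, whereas real models control for many more factors. The simulated data and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: 200 students assigned to 10 teachers (hypothetical numbers).
n_students, n_teachers = 200, 10
teacher = rng.integers(0, n_teachers, n_students)
prior_score = rng.normal(0, 1, n_students)        # 8th-grade score (standardized)
teacher_effect = rng.normal(0, 0.2, n_teachers)   # "true" teacher impacts
score = 0.7 * prior_score + teacher_effect[teacher] + rng.normal(0, 0.5, n_students)

# Step 1: predict 9th-grade scores from prior achievement (OLS with intercept).
X = np.column_stack([np.ones(n_students), prior_score])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)

# Step 2: a teacher's value-added is the average residual of their students --
# how far the students beat (or fell short of) their expected scores.
residual = score - X @ beta
value_added = np.array([residual[teacher == t].mean() for t in range(n_teachers)])
print(value_added.round(2))
```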
At the same time, policymakers and educators are focused on the importance, for longer-term
adult outcomes, of student skills not captured by standardized tests, such as perseverance and the
ability to collaborate with others. The 2015 federal Every Student Succeeds Act allows states to consider how well
schools do at helping students create “learning mindsets,” or the non-cognitive skills and habits that
are associated with positive outcomes in adulthood. In one major experiment in California, for
example, a group of large districts is tracking progress in students’ non-cognitive skills as part of
their reform efforts.
Is it possible to combine these two ideas by determining which individual teachers are most effective
at helping students develop non-cognitive skills?
To examine this question, I look to North Carolina, which collects data on test scores and a range of
student behavior. I use data on all public-school 9th-grade students between 2005 and 2012,
including demographics, transcript data, test scores in grades 7 through 9, and codes linking scores
to the teacher who administered the test. The data cover about 574,000 students in 872 high
schools. I focus on the 93 percent of 9th-grade students who took classes for which teachers also
receive traditional test-score value-added ratings: English I and one of three math classes
(algebra I, geometry, or algebra II).
I use these data to explore three major questions. First, how predictive is student behavior in 9th
grade of later success in high school, compared to student test scores? Second, are teachers who
are better at raising test scores also better at improving student behavior? And finally, what measure
of teacher performance is more predictive of students’ long-term success: impacts on test scores, or
impacts on non-cognitive skills?
The Predictive Power of Student Behavior
To explore the first question, I create a measure of students’ non-cognitive skills by using the
information on their behavior available in the 9th-grade data, including the number of absences and
suspensions, grade point average, and on-time progression to 10th grade. I refer to this weighted
average as the “behavior index.” The basic logic of this approach is as follows: in the same way that
one infers that a student who scores higher on tests likely has higher cognitive skills than a student
who does not, one can infer that a student who acts out, skips class, and fails to hand in
homework likely has lower non-cognitive skills than a student who does not. I also create a test-
score index that is the average of 9th-grade math and English scores.
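As an illustration of how such an index might be assembled, each behavior measure can be standardized and then averaged, with "bad" behaviors entering negatively. The equal weights and sign conventions below are my own assumptions for the sketch, not the weights used in the paper, and the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 9th-grade behavior data for 500 students.
absences    = rng.poisson(5, 500).astype(float)
suspensions = rng.poisson(0.3, 500).astype(float)
gpa         = np.clip(rng.normal(2.8, 0.7, 500), 0, 4)
on_time     = rng.binomial(1, 0.9, 500).astype(float)  # progressed to 10th grade

def zscore(x):
    return (x - x.mean()) / x.std()

# Equal-weight index (an assumption): higher values indicate stronger
# non-cognitive skills, so absences and suspensions enter with negative signs.
behavior_index = (-zscore(absences) - zscore(suspensions)
                  + zscore(gpa) + zscore(on_time)) / 4
print(float(behavior_index.mean()), float(behavior_index.std()))
```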
I then look at how both test scores and the behavior index are related to various measures of high-
school success, using administrative data that follow students’ trajectories over time. The outcomes I
consider include graduating high school on time, grade-point average at graduation, taking the SAT,
and reported intentions to enroll in a four-year college. Roughly 82 percent of students graduated, 4
percent are recorded as having dropped out, and the rest either moved out of state or remained in
school beyond their expected graduation year. Because I am interested in how changes in these skill
measures predict long-run outcomes, I control for the student’s test scores and behavior in 8th
grade. In addition, my analysis adjusts for differences in parental education, gender, and
race/ethnicity.
My first set of results shows that a student’s behavior index is a much stronger predictor of future
success than her test scores. Figure 1 plots the extent to which increasing test scores and the
behavior index by one standard deviation, equivalent to moving a student’s score from the median to
the 85th percentile on each measure, predicts improvements in various outcomes. A student whose
9th-grade behavior index is at the 85th percentile is a sizable 15.8 percentage points more likely to
graduate from high school on time than a student with a median behavior index score. I find a
weaker relationship with test scores: a student at the 85th percentile is only 1.9 percentage points
more likely to graduate from high school than a student whose score is at the median. The behavior
index is also a better predictor than 9th-grade test scores of high-school GPA and the likelihood that
a student takes the SAT and plans to attend college.
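The percentile framing rests on a normal approximation: on a standard normal distribution, a score one standard deviation above the median falls at roughly the 84th to 85th percentile, which can be verified with the standard library:

```python
import math

# P(Z <= 1) for a standard normal Z, computed via the error function.
def normal_cdf(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(normal_cdf(1.0), 3))  # ~0.841, i.e. roughly the 84th-85th percentile
```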
While these patterns reveal that the behavior index is a good predictor of educational attainment,
they are descriptive. They do not show that teachers affect these behaviors, and they do not show
that teacher impacts on these measures will translate into improved longer-run success. I next
examine these more causal questions.
Applying Value-Added to Non-Cognitive Skills
The predictive power of the behavior index suggests that improving behavior could yield large
benefits, but it leaves open the question of whether teachers who improve student behavior are
different from teachers who improve test scores. This is important, because if teachers who are more
effective at raising test scores are also more effective at improving behavior, then we will not
improve our ability to identify teachers who improve long-run student outcomes by estimating
teacher impacts on behavior. In contrast, if the group of teachers who are effective at improving test
scores includes some who are above average, average, or even below average at improving
behavior, then having non-cognitive effectiveness ratings will allow us to identify truly excellent
teachers who may have the largest impact on longer-run outcomes by improving both test scores
and behavior.
To assess this, I employ separate value-added models to evaluate the unique contribution of
individual teachers to test scores and to the behavior index. I group teachers by their ability to
improve behavior, and plot the distribution of test-score value-added among teachers in each group.
If teachers who improve one skill are also those who improve the other, the average test-score
value-added should be much higher in groups with higher behavior value-added, and there should
be little overlap in the distribution of test-score value-added across the behavior value-added groups.
That’s not what the data show. Although teachers with higher behavior value-added tend to have
somewhat higher test-score value-added, there is considerable overlap across groups (see Figure
2). That is, although teachers who are better at raising test scores tend to be better at raising the
behavior index, on average, effectiveness along one dimension is a poor predictor of the other. For
example, among the bottom third of teachers with the worst behavior value-added, nearly 40 percent
are above average in test-score value-added. Similarly, among the top third of teachers with the best
behavior value-added, only 58 percent of teachers are above average in test-score value-added.
This reveals not only that many teachers who are excellent at improving one skill are poor at
improving the other, but also that knowing a teacher’s impact on one skill provides little information
on the teacher’s impact on the other.
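The overlap pattern in Figure 2 is what one would expect when the two value-added measures are only weakly correlated. The small simulation below (the correlation of 0.2 is an illustrative assumption, not the paper's estimate) reproduces the qualitative finding that a large share of bottom-third behavior teachers are nonetheless above average on test-score value-added:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000  # simulated teachers

# Two weakly correlated value-added measures (rho = 0.2 is an assumption).
rho = 0.2
test_va = rng.normal(0, 1, n)
behavior_va = rho * test_va + np.sqrt(1 - rho**2) * rng.normal(0, 1, n)

# Among the bottom third on behavior value-added, what share is still
# above average (here: above the median) on test-score value-added?
bottom_third = behavior_va <= np.quantile(behavior_va, 1 / 3)
share = (test_va[bottom_third] > np.median(test_va)).mean()
print(round(float(share), 2))
```

With a weak positive correlation, roughly two in five bottom-third teachers land above the test-score median, in the same ballpark as the 40 percent figure reported above.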
https://www.educationnext.org/full-measure-of-a-teacher-using-value-added-assess-effects-student-behavior/