Public Disclosure Authorized
WPS6043
Policy Research Working Paper
6043
Performance-related Pay in the Public Sector
A Review of Theory and Evidence
Zahid Hasnain
Nick Manning
Jan Henryk Pierskalla
The World Bank
East Asia and Pacific Region
Poverty Reduction & Economic Management Sector Department
Poverty Reduction and Economic Management Network
Public Sector Governance
April 2012
Abstract
The objective of this paper is to provide a review of the
theoretical and, in particular, empirical literature on
performance-related pay in the public sector spanning the
fields of public administration, psychology, economics,
education, and health with the aim of distilling useful
lessons for policy-makers in developing countries.
This study to our knowledge is the first that aims to
disaggregate the available evidence by: (i) the quality
of the empirical study; (ii) the different public sector
contexts, in particular the different types of public sector
jobs; and (iii) geographical context (developing country
or OECD settings). The paper’s main findings, based on
a comprehensive review of 110 studies of public sector
and relevant private sector jobs are as follows. First, we
find that overall a majority (65 of 110) of studies find a
positive effect of performance-related pay, with higher
quality empirical studies (68 of the 110) generally more
positive in their findings (46 of the 68). These show that
explicit performance standards linked to some form of
bonus pay can improve, at times dramatically, desired
service outcomes. Second, however, these more rigorous
studies are overwhelmingly for jobs where the outputs or
outcomes are more readily observable, such as teaching,
health care, and revenue collection (66 of the 68). There
is insufficient evidence, positive or negative, of the effect
of performance-related pay in organizational contexts
that are similar to that of the core civil service,
characterized by task complexity and the difficulty of
measuring outcomes, to reach a generalized conclusion
concerning such reforms. Third, while some of these
studies have shown that performance-related pay can
work even in the most dysfunctional bureaucracies in
developing countries, there are too few cases to draw firm
conclusions. Fourth, several observational studies identify
problems with unintended consequences and gaming
of the incentive scheme, although it is unclear whether
the gaming results in an overall decline in productivity
compared to the counterfactual. Finally, few studies
follow up performance-related pay effects over a long
period of time, leaving the possibility that the positive
findings may be due to Hawthorne Effects, and that
gaming behavior may increase over time as employees
become more familiar with the scheme and learn to
manipulate it.
This paper is a product of the Poverty Reduction & Economic Management Sector Department, East Asia and Pacific
Region; and the Public Sector Governance and Poverty Reduction and Economic Management Network. It is part of a
larger effort by the World Bank to provide open access to its research and make a contribution to development policy
discussions around the world. Policy Research Working Papers are also posted on the Web at http://econ.worldbank.org.
The authors may be contacted at zhasnain@worldbank.org and nmanning@worldbank.org.
The Policy Research Working Paper Series disseminates the findings of work in progress to encourage the exchange of ideas about development
issues. An objective of the series is to get the findings out quickly, even if the presentations are less than fully polished. The papers carry the
names of the authors and should be cited accordingly. The findings, interpretations, and conclusions expressed in this paper are entirely those
of the authors. They do not necessarily represent the views of the International Bank for Reconstruction and Development/World Bank and
its affiliated organizations, or those of the Executive Directors of the World Bank or the governments they represent.
Produced by the Research Support Team
Performance-related Pay in the Public Sector: A Review of Theory and
Evidence
Zahid Hasnain1, Nick Manning2, and Jan Henryk Pierskalla 3
JEL Codes: H11, H83, I18, I28
Acknowledgements: This paper has benefitted greatly from comments by Mike Stevens, Willy McCourt, Mariano
Lafuente, Gary Reid, and Svetlana Proskurovska.
1 World Bank (zhasnain@worldbank.org)
2 World Bank (nmanning@worldbank.org)
3 Duke University (jhp5@duke.edu)
Contents
1. Introduction
2. Theoretical debates
   Expectancy and reinforcement theory
   Incentive and principal-agent theory
   Behavioral economics: intrinsic versus extrinsic motivation
3. Organizing the empirical evidence: “Craft” and “Coping” jobs
   Methodological approaches
4. The empirical literature reviewed
   Observational studies
      Public sector coping jobs
      Public sector craft jobs
         Tax administration, job placement
         Teaching
         Health care jobs
      Private sector: Craft or coping jobs
   Experimental studies
      Meta-studies
      Laboratory experiments
      Public sector craft jobs
         Tax administration, job placement
         Teaching
         Health sector
      Private sector: Craft or coping jobs
5. Assessing the evidence
6. Summary
Appendix A: List of empirical PRP studies reviewed
Appendix B: List of High Quality Studies of Craft and Coping Jobs
References
Figures
Figure 1: Aggregate findings on performance-related pay
Figure 2: Findings by internal and external validity
Figure 3: Findings by job type
Figure 4: Findings for craft and coping tasks by research quality and country context
Tables
Table 1: James Q. Wilson’s classification of job type
Table 2: Studies by country environment, methodology, and job type
Table 3: Findings of high quality craft and coping studies by sector and country context
Boxes
Box 1: The elements of pay flexibility
1. Introduction
Performance-related pay (PRP) has been introduced in many countries as a possible tool for improving
the productivity and accountability of the public sector. Over the past fifteen years, a majority of OECD
countries have implemented PRP in the central administration (core civil service), in specialized entities
such as revenue administration, and for key service delivery staff such as teachers and medical personnel.
Middle income countries, and to some extent low income countries, perhaps drawing on the OECD
example, have also experimented with PRP in an attempt to inject performance-orientation in otherwise
dysfunctional bureaucracies. A vast theoretical and empirical literature has analyzed various dimensions
of PRP, and there is now a small but growing body of robust evidence on the impact of PRP that is
shedding new light on what is achievable and under what specific conditions.
The objective of this paper is to provide a review of the theoretical and in particular empirical literature
on PRP in the public sector spanning the fields of public administration, psychology, economics,
education, and health with the aim of distilling useful lessons for policy-makers in developing countries.
This is by no means the first comprehensive review of this literature; but it is, to our knowledge, the first
that aims to disaggregate the available evidence by (i) the quality of the empirical study; (ii) the different
public sector contexts, in particular the different types of public sector jobs; and (iii) geographical context
(developing country or OECD settings). The intention in so doing is to ensure that the findings from the
empirical literature are appropriately nuanced.
PRP is a compensation arrangement in which the final salary of an employee is a function of some form
of measured “performance”, where how performance is measured, who measures it, and how it is linked
to salary can all vary considerably and are key aspects of the design of the scheme. Performance can be
based on qualitative assessments or quantitative measures of inputs (effort allocation, attendance,
voluntary contributions at the workplace and skills acquisition), outputs (completion of pre-agreed tasks
or number of clients/cases served), or outcomes (student test scores, official client evaluations, service
utilization rates or revenue creation). Salary can either be wholly a function of performance, for example
as piece-rate pay in a manufacturing setting or commission-based salary in a sales environment, or a
combination of base pay and one-off bonuses or merit increases of base pay. Bonuses and merit increases
can be awarded on an individual, small team or larger departmental basis. Evaluations can be
implemented by direct supervisors, human resource specialists, peer panels or outside agencies. Once
performance has been measured, it has to be evaluated against a performance standard. This standard can
be based on individually pre-agreed goals, absolute performance against minimum or scaled standards,
relative improvements against past performance, rank-order of performance in a tournament evaluation or
relative performance measured against co-workers, other teams of co-workers, other schools or agencies
nationally or regionally.
PRP in the form of bonuses or merit increases to basic pay has been used more frequently in the OECD in
recent years. According to one estimate, approximately two-thirds of OECD countries have introduced
PRP in some form or the other (OECD 2005). The United Kingdom, Switzerland and the Czech Republic
apply PRP more extensively than countries such as New Zealand, Austria and the Netherlands. In
Finland, for example, the proportion of basic salary that PRP can represent can amount to over 40% of the
total. In the US, while PRP is of limited use in the core public administration, it is at the center of efforts
to improve teacher accountability over the past decade in the context of the No Child Left Behind Act.
The literature suggests that there are similar movements underway in middle income countries and,
perhaps more sporadically, in lower income countries where PRP is more often referred to in the health
and education sectors than in the core administration.
Drawing on contract theory and the problems of moral hazard, a significant chunk of the academic
literature has examined the impact of PRP on increasing effort and reducing shirking. In situations where
effort is unobservable, fixed pay contracts provide little ability for employers to influence employee
effort. This is especially likely to be the case in traditional civil service jobs characterized by uniform pay
for jobs in similar grades, pay increases based largely on seniority, and negligible probability of
termination. Contracts that tie observable outputs, which are correlated with unobservable effort, to
desired pay incentives can mitigate such problems. Some of the literature has attempted to move beyond
effort and to introduce the concept of “engagement” as a measure of an employee’s emotional and
intellectual commitment to their employing organization and its success, encompassing both commitment
(“I like working here”) and organizational citizenship (“I am prepared to go the extra mile”), with the
question then being what impact PRP can have on staff engagement. Contract theory also suggests that
PRP can help address the problem of adverse selection by encouraging high ability individuals who will
do better under a performance pay scheme to join the agency and similarly discouraging low ability
individuals (the “sorting” effect).
Critics counter that PRP does not work when tasks are multi-dimensional as it results in “gaming”
behavior whereby effort is only allocated towards what is observed and measured which may not improve
overall outcomes. A huge literature has examined the challenges in effectively designing these schemes,
particularly in public sector contexts where tasks are complex and outcomes cannot be easily measured.
The psychological and behavioral economics literature in addition has argued that individuals are also
motivated by intrinsic concerns about the inherent social value of the job — particularly in the public
sector — and that PRP which explicitly focuses on extrinsic benefits might crowd-out intrinsic motivation
thereby reducing worker productivity.
In reviewing the empirical evidence, 153 studies in total, the paper is explicit in focusing largely on job
types that are relevant to the public sector and characterized by task complexity and unobservability of
effort, thereby excluding studies that examine PRP for simple manufacturing jobs or other repeatable
tasks. These jobs, borrowing James Q. Wilson’s typology, are classified as “craft” and “coping” jobs.
This narrows the list to 110 studies. Reviewing these, the paper draws the following conclusions. First, a
majority (93 of the 153) of all studies find some positive effect of PRP, and a majority (65 of 110) of
studies of craft and coping jobs show positive findings. Limiting the analysis to craft and coping jobs also
shows that, while early case studies found largely inconclusive evidence on the impact of PRP on staff
morale, effort, and productivity, work over the last 10 to 15 years based on more systematic
observational studies and experimental evaluations in the laboratory and in the field has generally found
that explicit performance standards linked to some form of bonus pay can improve, at times dramatically,
desired service outcomes: 46 of the 68 high quality studies of craft and coping jobs showed a positive
effect of PRP.
Second, however, these more rigorous studies are usually for jobs where the outputs or outcomes are
more readily observable, such as revenue collection, teaching and health care. There is simply not enough
robust evidence, positive or negative, of the effect of PRP in organizational contexts that are similar
to that of the core civil service, characterized by task complexity and the difficulty of measuring
outcomes, to reach a generalized conclusion for reform. To be specific, of the 68 high quality studies, only
2 were for contexts similar to core civil service jobs. Third, while some of these studies have shown that
PRP can work even in the most dysfunctional bureaucracies, there is limited evidence from developing
country contexts (only 10 high quality studies), which are marked by considerable discretionary as opposed
to rule-based behavior and by significant politicization that can negatively affect the overall credibility and
legitimacy of the incentive scheme. Fourth, several observational studies identify problems with
unintended consequences and gaming of the incentive scheme, although it is unclear whether the gaming
results in an overall decline in productivity compared to the counterfactual. Finally, few studies follow up
PRP effects over a long period of time, leaving the possibility that the positive findings may be due to
Hawthorne Effects (see footnote 4), and that gaming behavior may increase over time as employees become more
familiar with the PRP scheme and learn to manipulate it.
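
The counts quoted in the preceding paragraphs can be turned into shares with a few lines of arithmetic. The sketch below is only an illustration: the counts are those stated in the text, while the labels and the helper name `share` are introduced here for convenience and are not part of the original paper.

```python
# Counts as reported in the text: (number of studies, size of the reference group).
# The dictionary labels are illustrative shorthand, not the paper's terminology.
COUNTS = {
    "all studies with a positive finding": (93, 153),
    "craft/coping studies with a positive finding": (65, 110),
    "high quality craft/coping studies with a positive finding": (46, 68),
    "high quality studies in core civil service-like contexts": (2, 68),
    "high quality studies in developing country contexts": (10, 68),
}


def share(count: int, total: int) -> float:
    """Return count / total as a fraction between 0 and 1."""
    if total <= 0:
        raise ValueError("total must be positive")
    return count / total


for label, (count, total) in COUNTS.items():
    # Print each reported count alongside the share it implies.
    print(f"{label}: {count} of {total} = {share(count, total):.1%}")
```

Read this way, roughly three in five of the craft and coping studies and roughly two in three of the high quality ones report a positive effect, while the core civil service-like and developing country subsets are each a small fraction of the 68 high quality studies.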
This literature review focuses almost entirely on the individual incentive effects of PRP as this is what the
bulk of the literature has emphasized. It should be noted that there are, at least in the policy literature,
potential agency-level and public sector wide effects of PRP that have to date been underexplored in the
academic literature. PRP can provide a mechanism for conveying to staff the increasing expectations of
agency performance, in effect changing the agency-level culture of performance (“this is how hard we
work around here‖) through the multitude of individual performance appraisal discussions (Marsden
2004). PRP can also have wider public sector impacts on the fiscal sustainability of the wage bill, public
sector pay competitiveness, and social objectives, such as gender equity. PRP may contribute to cost
containment by limiting pay increases to less costly performance bonuses (Marsden and French 1998).
Some studies have also suggested that PRP may have a gender dimension as men are more likely to see
the arrangement as fair and reasonable. Finally, PRP in the OECD countries has been introduced in the
context of other changes in pay policy, in particular moves towards differentiated pay arrangements
whereby pay for similar jobs varies across government agencies and occupational groups, and the
delegation of pay setting authority from central human resource ministries to line ministries and agencies
(see Box 1). What remains unexplored are the potential linkages between these reforms and whether or
not PRP caused or was the effect of these other changes in pay policy.
Box 1: The elements of pay flexibility
OECD countries have moved towards flexible pay arrangements in the public sector, which are in essence some
combination of three key features, or design elements:
Performance-related pay: Enabling pay to differ for civil servants doing the same job by linking a portion of
the civil servants’ pay to individual or group effort or performance with performance targets set either at the
individual level and/or the group level.
Differentiation: Pay differences within and across government ministries, departments, and agencies for civil
servants doing the same job - based on the need to attract and retain qualified staff for those jobs, or to persuade
staff to accept new working arrangements. The need for special incentives for attraction or retention can be
based on labor market conditions and/or cost of living in the localities where agencies operate or as a function
of the specific skills that the agency competes for in the labor market, based for example on job evaluations and
labor market surveys. Differentiation can affect: (i) all staff paid by the entity, with the result that there are
agency-specific pay scales; (ii) particular occupational groups or cadres; (iii) specific individuals with scarce
but vital skills; or (iv) specific locations entailing particular hardship.
Delegation: Transferring authority over pay setting and human resource management from a central civil
service agency to ministries, agencies, or departments. The transfer may entail transferring from the central to
the entity level (or a mixture): pay negotiations (if any); setting the overall wage bill; and design of pay scales.
While the three dimensions of pay individualization, differentiation and delegation describe analytically distinct
elements of pay flexibility, reform examples in the real world often combine all elements. There are three reasons
why components of pay flexibility tend to go together: a shared set of assumptions that underpin them; a
complementarity in the objectives that they are aiming for; and sequencing. First, many of these reforms took place
in the context of New Public Management that emphasized giving managers the authority to manage in exchange for
tighter accountability standards. This move reflected wider changes in the economy where there has been a
significant move away from industry and sector-wide negotiating arrangements with significant government
involvement towards more local and agency-specific arrangements.
4 Hawthorne Effect is the phenomenon whereby a subject modifies his or her behavior simply in response to the
fact that they know they are being studied, and not because of the particular treatment of interest.
Second, the three components of pay flexibility likely complement each other in producing outcomes at the
individual, agency, and public-sector wide level. For example, differentiated pay-setting is widely assumed to allow
the agency to set pay at a level which is appropriate for the given task in the specific labor market in which the
agency operates. However, to the extent that it also disrupts sector-wide pay negotiations, allowing government as
employer to limit pay increases to the minimum necessary to maintain performance without the “leveling up” effect
of a common pay framework, it can also have some effect in reducing pressures on the aggregate public sector wage
bill. Delegated pay setting by giving managers authority to engage their staff in determining how best to keep the
agency functioning within the hard constraints is also likely to have some effect on the culture of performance at the
agency level.
There may also be a logical sequencing between the components. For example, delegation can lead to PRP since
if pay-setting authority and accountability for results have been transferred to agency managers, they will look for
mechanisms that maximize the likelihood that the results will be achieved at minimum cost. Delegation also
automatically leads to differentiation to the extent that “local supplements will be used as a means of topping up the
earnings of those professional staff in demand from private sector employers rather than of those for whom the state
acts as a near monopsonist employer” (Grimshaw 1998a, p.7).
This paper is organized as follows. The next section details the main theoretical arguments for and against
PRP and their hypothesized impact on staff effort and productivity. The bulk of the paper — sections 3 to
5 — focuses on the empirical evidence. The presentation of the empirical evidence is done with the aim
of differentiating the findings by both the methodology of the study and the relevance of the study to
different types of public sector jobs, such as the core civil service, tax administration jobs, teaching, and
health sector jobs. Section 6 summarizes the main findings and the final section points to the underexplored
agency and public sector wide potential effects of PRP and provides suggestions for future research.
It should also be noted here what this paper is not about. It is clear that pay policy is only one factor
affecting staff incentives and public sector performance; other human resource management
considerations including recruitment, personnel management, training, and organizational management
practices are undoubtedly important. The paper reviews the empirical evidence from research that is
limited to PRP, assuming that all else is constant; this limitation is both to keep the task manageable and
in order to examine a variable which has a particular relationship with other aspects of human resource
reform. It cannot conclude that, when PRP “works”, it is more important than other variables.
These are important issues which merit a different study.
2. Theoretical debates
Theoretical debates on PRP have been evolving in the context of private businesses (Prendergast 1998,
1999), the general public sector (Dixit 1999; Burgess and Ratto 2003; Perry, Mesch, et al. 2006) and on
specific occupations such as teachers (Neal 2011). The theoretical arguments can be roughly divided
between early psychological theories on human motivation and training, popular in public administration
research, on the one side and core economic theories on incentive structures and principal-agent problems
on the other, with behavioral economics building a bridge between the two.
Expectancy and reinforcement theory
Public administration research on PRP usually relies on what is often called “expectancy” (Vroom 1964;
Porter and Lawler III 1968) and “reinforcement” theory (Skinner 1969; Luthans 1973). Expectancy theory
builds on psychological insights about repeated behavioral patterns and learning under positive and
negative stimuli. In its simplest form the theory suggests that explicit incentives in the form of
performance pay work under two conditions: first, employees need to believe that increased effort leads
to increased performance; second, they need to believe that increased performance leads to desired outcomes
and is recognized by management. If the two conditions are met, employees form a behaviorally salient expectation about a
future reward and adjust their work effort upwardly. Reinforcement theory stresses the effect of
cultivating a behavioral norm of high work effort through reinforcing behavior with positive rewards.
Apart from the direct link between performance and individual rewards, advocates in the field of public
administration highlight secondary effects associated with performance pay: it helps to recruit and retain
highly-skilled and/or motivated staff who presumably would do better under such an arrangement;
increases the awareness of organizational goals by defining explicit performance standards; weakens the
power of public sector unions; makes managers more responsible; signals core organizational goals to
outside actors; increases the link between individual and organizational job goals; reduces the overall
wage bill by moving away from automatic pay increases; and leads to an increase in overall job
satisfaction through the individual recognition of employee efforts (Marsden 2004; OECD 2005b;
Marsden 2009).
Critics of performance pay point out that the two conditions of expectancy theory are not always met and
it is in principle difficult to design performance pay schemes that work as intended. Furthermore, critics
argue that humans do not always approach work effort and the assessment of salary in an entirely rational way,
which invalidates simple theoretical models based on rational-actor assumptions. In addition, measuring
performance in the public sector is often fraught with difficulty. Many core public servants perform
services on a daily basis that are hard to measure or are non-measurable, or produce outputs that are not
market-priced. For example, early critics of test score based school and teacher evaluations argued that
teacher performance cannot be neatly summarized by mechanical student test scores and such practice
usually invites behavior that contradicts the overall goals of the teaching profession (Murnane and Cohen
1986). Interestingly, these criticisms were already leveled against early forms of test-based performance
pay in British schools in the 19th century (Gratz 2009). Using explicit and objective performance
measures can induce tunnel vision, myopia and measure fixation (Propper and Wilson 2003).
Additionally, there is a lack of clarity about who determines and evaluates performance, and civil servants
often work in large teams under the supervision of multiple managers, complicating the attribution of
performance and responsibility of evaluation. Some authors see as a necessary condition the presence of
high levels of trust and transparency between employees and management, to avoid arbitrary
implementation and worker dissatisfaction (Kellough and Lu 1993).
An influential article (Kerr 1975) identified scenarios in which well-intended incentive schemes ended up
favoring behavioral responses by employees that fulfill performance criteria, but taken to the extreme
contradict overall organizational goals and standards. A different strand of criticism focuses on other
motivations underlying public servants' effort. Apart from pure monetary rewards as expressed in salary,
civil servants, it is argued, are motivated by notions of altruism, prosocial behavior and commitment to
institutional goals (Perry and Hondeghem 2008), which are seen to compete with or even stand in conflict
with explicit monetary incentives.
Incentive and principal-agent theory
Several surveys have synthesized the theoretical literature on pay incentive systems for the private sector
(Prendergast 1998, 1999), for the public sector (Dixit 1999; Burgess and Ratto 2003) and specialized
occupations such as teaching (Neal 2011). The most basic argument for incentive pay is founded in a
simple microeconomic principal-agent model of labor relations, in which a principal (the employer) wants
to induce an agent (the employee) to perform a certain task. Such principal-agent relationships are
commonly affected by two problems (Dixit 1999): moral hazard and adverse selection.
Moral hazard describes a scenario in which the agent's actions affect the principal's payoffs, but the action
is not directly observable to the principal. This situation arises naturally in the workplace setting (public
or private), i.e. the employee's effort at work is not directly observable, but influences productivity and
outcomes, which the employer cares about. Contracts that tie observable outputs, which are correlated
with unobservable effort, to desired pay incentives can mitigate inefficiencies in the principal-agent
relationship from the perspective of the employer. Offering a fixed pay contract gives the employer little
leverage to influence employee effort after hiring decisions have been made. The incentive problem is
exacerbated if employees are hard to fire. Bonus or merit pay schemes are therefore one way of designing
incentive schemes that address moral hazard.
In the case of adverse selection the agent has access to private and valuable information at the time of
contract signing. To induce the agent to reveal this private information, the principal must offer attractive
contract terms. Adverse selection in the public sector plays an important role in civil service recruitment,
where low and high-skill applicants are hard to distinguish based on public information. Public agencies
need to offer contracts that induce high-quality applicants to apply and deter low-quality applicants from
misrepresenting their qualifications. Merit pay systems are argued to alleviate this sorting problem and
attract higher-quality personnel who expect to perform well under a system of merit pay, while traditional
fixed salary scales are seen to attract lower-quality applicants (Delfgaauw and Dur 2008).
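The sorting argument can be illustrated with a minimal sketch. The pay formula, the salary levels and the evenly spread ability pool below are hypothetical and chosen only for illustration; they are not taken from Delfgaauw and Dur or any other study cited here.

```python
# Illustrative sorting (adverse selection) sketch; all numbers are hypothetical.

def preferred_contract(ability, fixed_salary=100.0, base_pay=70.0, bonus_rate=60.0):
    """Return the contract an applicant with the given ability prefers.

    Merit pay is assumed to be base_pay + bonus_rate * ability, where ability
    lies in [0, 1] and stands in for expected measured performance.
    """
    merit_pay = base_pay + bonus_rate * ability
    return "merit" if merit_pay > fixed_salary else "fixed"


def mean(values):
    return sum(values) / len(values)


# A hypothetical applicant pool with abilities spread evenly over [0, 1].
abilities = [i / 100 for i in range(101)]
merit_pool = [a for a in abilities if preferred_contract(a) == "merit"]
fixed_pool = [a for a in abilities if preferred_contract(a) == "fixed"]

# Only applicants above the break-even ability (100 - 70) / 60 = 0.5 pick merit pay.
print(f"merit-pay pool: n={len(merit_pool)}, mean ability={mean(merit_pool):.2f}")
print(f"fixed-pay pool: n={len(fixed_pool)}, mean ability={mean(fixed_pool):.2f}")
```

Under these assumptions the merit-pay contract draws only applicants above the break-even ability, which is the self-selection effect described above. The sketch ignores risk aversion and intrinsic motivation.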
The need to mitigate moral hazard and adverse selection is often invoked to advocate forms of performance pay in
the public sector. Such incentive schemes fundamentally require the ability to measure some relevant
outputs, to design a scheme that properly links unobserved actions to outcomes, and to offer bonuses that
induce agents to increase effort. Incentives work best if the agent's actions are tightly linked to observable
outcomes, i.e. when random noise does not overpower the incentive effects. Incentive schemes are
also affected by the risk aversion of employees. Since an incentive scheme ties pay to outcomes that are
only partially under the control of the agent, making final pay outcome-dependent decreases the utility of
risk-averse employees, who usually demand an upward adjustment of average pay to compensate for the
increase in risk. Even with very simple models, the optimality of the incentive scheme is sensitive to
important design aspects, like the schedule of bonuses (linear, stepwise or other) and depends on the
particularities of the employee's task.
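The interplay of noise, risk aversion and incentive strength can be made concrete with the standard textbook linear-contract model. What follows is a sketch under assumptions that are not taken from the studies cited above: output is x = e + ε with ε normally distributed with mean zero and variance σ², the wage is linear, w = α + βx, effort costs (c/2)e², the principal is risk neutral, and the agent has constant absolute risk aversion r.

```latex
\begin{align*}
\text{Agent's certainty equivalent:}\quad
  & CE(e) = \alpha + \beta e - \tfrac{c}{2} e^{2} - \tfrac{r}{2}\beta^{2}\sigma^{2} \\
\text{Effort choice (first-order condition):}\quad
  & \beta - c\,e = 0 \;\Longrightarrow\; e^{*} = \frac{\beta}{c} \\
\text{Total surplus at } e^{*}:\quad
  & S(\beta) = \frac{\beta}{c} - \frac{\beta^{2}}{2c} - \frac{r}{2}\beta^{2}\sigma^{2} \\
\text{Optimal bonus rate:}\quad
  & S'(\beta) = 0 \;\Longrightarrow\; \beta^{*} = \frac{1}{1 + r c \sigma^{2}}
\end{align*}
```

In this sketch the term (r/2)β²σ² is the risk premium, i.e. the upward adjustment of average pay that risk-averse employees require, and the optimal bonus rate falls as measured output becomes noisier (larger σ²) or as employees become more risk averse (larger r).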
A well-known criticism of performance pay arrangements is that when tasks are multi-dimensional,
incentivizing only some tasks that are observable and measurable will not necessarily improve overall
outcomes, but rather lead to a substitution of effort from the unobservable to the observable
tasks, which under some circumstances (depending on whether the different tasks are complements or
substitutes, and on the functional relationship between the tasks and the outcomes) can even
lead to worse outcomes (Holmstrom and Milgrom 1991). For example, the task of teaching can involve
both instruction based on sound curricula and coaching on test-taking strategies, and poorly designed
incentive schemes can encourage teachers to re-allocate effort to the latter and away from the former
("teaching to the test") to the detriment of human capital accumulation. Since it is generally hard to
accurately measure and evaluate any aspect of public sector jobs, it is hard to devise effective civil service
schemes that have the intended consequences with regard to final outcomes.
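The effort-substitution mechanism can be sketched numerically. The example below is only loosely in the spirit of the multi-tasking argument and is not the Holmstrom and Milgrom model itself: the fixed effort budget, the square-root outcome function, the agent's objective and the bonus values are all illustrative assumptions.

```python
import math

# Illustrative multi-tasking sketch; the functional forms and numbers are assumptions.

def true_outcome(share_measured):
    """Overall outcome when a fixed effort budget of 1 is split across two tasks.

    Both tasks matter and have diminishing returns, so a balanced split is best.
    """
    return math.sqrt(share_measured) + math.sqrt(1.0 - share_measured)


def chosen_share(bonus, intrinsic_weight=1.0, grid_size=1001):
    """Effort share the agent puts into the measured task, found by grid search.

    The agent values the bonus earned on the measured task plus, with weight
    intrinsic_weight, the true overall outcome.
    """
    shares = [i / (grid_size - 1) for i in range(grid_size)]
    return max(shares, key=lambda s: bonus * s + intrinsic_weight * true_outcome(s))


for bonus in (0.0, 0.5, 2.0):
    share = chosen_share(bonus)
    print(f"bonus={bonus:.1f}  share on measured task={share:.3f}  "
          f"true outcome={true_outcome(share):.3f}")
```

With no bonus the agent splits effort evenly. As the bonus on the measured task grows, effort shifts towards that task and the true outcome declines, which is the "teaching to the test" pattern in miniature.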
Selecting appropriate performance measures to address this problem has spawned its own
theoretical and empirical debates (Courty, Heinrich, et al. 2005). A problem related to the multi-tasking
argument is the gaming of, or cheating on, incentive systems. Typical examples are outright
manipulation of results; cream-skimming, i.e. the manipulative selection of clients to improve program
effects (Heckman, Heinrich, et al. 1997); or even the provision of high-calorie food to students on test
days (Figlio and Winicki 2005). The problem of gaming performance standards is equally relevant in a
dynamic context, requiring ongoing adjustments by the principal (Courty and Marschke 2003). Good luck
early in the evaluation period can induce increased slack for the rest of the period, a form of
incentive gaming that has been found to be empirically relevant, as an influential study of Navy recruiters
has shown (Asch 1990). To counteract excessive gaming of incentive schemes, it has been suggested, in
the context of student test scores and teacher merit pay, to use evaluation systems that are independent of
output measurements, i.e. in the case of teacher evaluations, to measure teacher contributions with tests
different in form and content from the instruments designed to track overall student and school progress or
national exams (Neal 2011). Additionally, relative performance schemes in which employees are ranked
against each other, potentially in a formal tournament setting, are much harder to manipulate (Barlevy and
Neal 2011; Neal 2011).
However, Marsden (2009) notes that gaming is not necessarily restricted to ingenious behaviors on the
part of staff. Managers required to implement performance-pay arrangements might conclude that, in
bargaining with their staff over efforts and results, it is tempting and doubtless easier to "collude
with their subordinates: to go through the motions and fill in the forms for goal setting and appraisal, but
not to worry about the reality" (Marsden 2009, p.5).
As outlined above, incentive schemes do not necessarily have to reward individual performance, but can
focus on rewarding teams. Rewarding team performance can have certain advantages, ranging from
reduced evaluation costs, to avoiding harmful competition between employees. However, basing rewards
on team outputs can also lead to problems of free-riding where some team members willfully reduce their
efforts in the expectation of relying on the work of others. The strength of free-riding problems depends
on the size of the team and internal monitoring and punishment norms (Dixit 1999).
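The dependence of free-riding on team size can be seen in a simple worked example. The setup is an illustrative assumption rather than a model from the cited literature: N identical team members, team output equal to the sum of individual efforts, a bonus pool of β per unit of team output shared equally, an individual effort cost of (c/2)e², and no peer monitoring or punishment.

```latex
\begin{align*}
\text{Member } i \text{ maximizes:}\quad
  & \frac{\beta}{N}\sum_{j=1}^{N} e_{j} - \frac{c}{2} e_{i}^{2} \\
\text{First-order condition:}\quad
  & \frac{\beta}{N} - c\,e_{i} = 0 \;\Longrightarrow\; e_{i}^{*} = \frac{\beta}{N c} \\
\text{Individual bonus on own output } (N = 1):\quad
  & e^{*} = \frac{\beta}{c}
\end{align*}
```

Under these assumptions each member's effort shrinks in proportion to 1/N, because a member bears the full cost of extra effort but receives only a 1/N share of the resulting bonus. Internal monitoring and punishment norms, which this sketch leaves out, would work against that dilution.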
Picking the correct size of bonus brings its own challenges. Small bonuses will have little incentive
effect and fall short of expectations, while large bonuses can lead employees to treat incentive
schemes as pure lotteries, especially if outcomes are strongly stochastic (e.g. student test scores (Neal
2011)), and can encourage cheating.
A reverse problem can arise if employees have to deal with multiple principals, a common feature of
public service hierarchies. If different principals value different outputs, have different information
available and have little ability or incentive to coordinate, separately designed incentive schemes are
likely to fail (Dixit 1999).
Behavioral economics: intrinsic versus extrinsic motivation
Building on the argument that worker motivation is also driven by intrinsic concerns, behavioral
economists have advanced a line of argument that casts additional doubt on the feasibility of performance
pay arrangements, specifically in public service settings. Introducing explicit monetary incentives for
employees with strong intrinsic motivation can crowd out that intrinsic motivation, i.e.
workers change their perception of the organizational goals and values, leading to an overall reduction
of effort. Even small changes in the pecuniary reward structure can induce a change in attitudes, with workers
switching from seeing the task as a partially voluntary contribution to seeing it as a low-paid contract service without
any ownership stake. Crowding-out can be especially salient if performance pay is introduced using
antagonistic framing and can stifle creativity and collaboration (Frey and Osterloh 1999). While the
theory of intrinsic motivation has been proposed by psychologists, formal treatments of the trade-off
between extrinsic and intrinsic motivation have been developed (Kreps 1997; Benabou and Tirole 2003,
2006) and integrated in the wider context of public goods provision (Besley and Ghatak 2004).
The debate around the weight of intrinsic motivation within overall incentives has been crystallized in the
discussion led by Le Grand about whether public service workers are "knaves or knights" (Le Grand 2003).
Le Grand argues that post-war public administration theory in the UK (and in Europe more generally) saw
public servants as public-spirited altruists, a misleading interpretation of reality which was recognized in,
although not adequately redressed by, the 1980s introduction of various New Public Management
reforms.
On the opposing side of the argument, Pink (2009) has developed the critique of monetary and other
extrinsic incentives into a broader theory, hypothesizing that they are both counterproductive, as they
frequently undermine intrinsic incentives, and unnecessary, as intrinsic incentives can be harnessed and
used to maximize individual productivity. His theory suggests that tasks can be constructed to maximize an
individual's sense of: (i) autonomy (drawing inter alia on the "self-determination theory"
expounded in the overview paper by Ryan and Deci (2000) and the cross-country attitude survey
undertaken by Chirkov, Ryan, et al. (2003)); (ii) mastery (continuous incremental learning and
improvement rather than distant targets, citing inter alia Sauermann and Cohen (2008), who use data on
over 11,000 industrial scientists and engineers to show that intrinsic motives, particularly the desire for
intellectual challenge, appear to benefit innovation more than extrinsic motives such as pay); and (iii)
purpose (drawing inter alia on Niemiec, Ryan, et al. (2009), who show from a follow-up study of 246
students who graduated from two US colleges that those who met intrinsic aspirations for personal
growth, close relationships, community involvement, and physical health had better scores for
psychological satisfaction and health than those who pursued attainment of the extrinsic aspirations for
money, fame, and image).
Another psychological argument, known as the "Yerkes-Dodson law", highlights the phenomenon of
"choking under pressure" (Ariely, Gneezy, et al. 2009). If individual salary is subject to high-stakes
pressure, individuals experience increased arousal and shift from automatic to controlled behavior, a
narrowing of attention and a pre-occupation with the reward, all lessening the chances of success. The
argument is that performance has an "inverse U" relationship with the level of the incentive payment,
with performance improving at low and moderate levels of incentive payments as compared to no
payments, but then being worse at very high levels of payment compared to moderate, low, or even no
payments.
Lastly, behavioral economists have identified the possibility of satisficing instead of maximizing
behavior. Employees might exert effort until a certain minimum level of reward is reached and then
substitute increased leisure or idle time for additional labor supply. A study of New York cab drivers
identified the prevalence of satisficing behavior and called into question the effectiveness of performance pay
schemes if the satisficing threshold is reached quickly (Camerer, Babcock, et al. 1997).
The proponents of "self-determination theory" clearly have a strong and intuitively attractive point to
make, but it is not evident that the case for intrinsic incentives overwhelms consideration of extrinsic
incentives. Three areas of uncertainty hang over conclusions in this area. First, autonomy, mastery and
purpose are more feasible in some tasks than in others. Second, intrinsic motivation can take two forms
(being motivated by the inherent nature of the job or being motivated to earn the respect of one's peers),
and group-based performance schemes, to the extent that they encourage teamwork, could increase
intrinsic motivation. Finally, the theory is largely addressing the moral hazard problem inherent in
principal-agent relationships, suggesting that staff can sense that they are being treated instrumentally
through the use of extrinsic incentives and so will be motivated to cheat. There are different implications
for the self-selection/sorting argument if the general population can be divided between predominantly
intrinsically and extrinsically motivated people, since under those circumstances an explicit system of
performance pay will attract extrinsically motivated applicants to the civil service.
3. Organizing the empirical evidence: Craft and Coping jobs
To summarize, the main argument for PRP is that linking incentives to inputs, outputs and outcomes can:
(i) ameliorate moral hazard, inducing more effort, including the selection of better working methods to
the extent that the individual has some autonomy; and (ii) address adverse selection by encouraging high
ability individuals who will do better under a performance pay scheme to join the agency and similarly
discouraging low ability individuals. Both of these linkages suggest contemporaneous and over-time
effects of incentive schemes. Comparatively little research focuses on the self-selection or sorting effects
of competitive performance schemes; the larger part of the literature deals with the stringent contextual
conditions necessary for performance incentives to engender increased effort.
These contextual factors can be disaggregated into two categories: first, variables that are characteristics
of the job that the individual performs; and second, variables that are characteristics of the technical design
of the performance scheme itself. The key "job" variables identified in the literature are:
• Measurability of the goal or outcome from the job: In some jobs outcomes are easier to measure
than in others. Examples are production work on the factory assembly line or sales jobs, to be
contrasted with managerial jobs in line ministries or large private bureaucracies;
• Multi-dimensionality of the actions that produce the outcome: The overall outcome of the job
may depend on a single action or on multiple actions or activities. In the latter case performance pay can create
perverse outcomes by encouraging effort allocation towards the actions that positively impact the
performance measure being used but negatively impact the overall outcome;
• Observability and measurability of the actions that produce the outcome (Wilson 1989): Most
public sector jobs involve multi-dimensionality of tasks; however, what distinguishes some tasks
from others is whether or not the actions are observable. As James Q. Wilson noted, in some jobs
it is much easier to measure whether or not an action was performed (e.g. health safety
regulations were drafted) than whether or not the outcome, improved occupational safety,
occurred. As Dixit (1999) notes, the principal terrain of incentive theory is those jobs where
actions are not observable;
• Controllability of the outcomes (Bruns et al. 2011): The extent to which the outcome is a function
of the efforts of the individual or is influenced by other factors beyond the individual's control.
Some of the main design choices in incentive schemes are:
• Predictability of the incentive (Bruns et al. 2011): the probability of the agent receiving the
incentive if the measured outcome is achieved. If the probability is close to either 0 or 1, then the
incentive will have no impact. The incentive is clearest, for example, in piece-work settings,
common in sales or blue-collar jobs in the private sector;
• The size and nature of the incentive payment: While the incentive effect should theoretically
increase with the size of the bonus, the Yerkes-Dodson law points to the adverse consequences of very
large bonuses. Large bonuses can also create incentives for cheating. Group-based schemes could
encourage teamwork but also result in free-riding behavior;
• The nature of the performance evaluation and the performance standard: Whether this should
be an objective evaluation based on quantitative performance targets or a subjective
evaluation, what the benchmark is against which performance is evaluated, and who should be responsible
for the evaluation.
The general criticism is that performance pay cannot be implemented in the public sector because of
political difficulties in selecting appropriate design features to address the complexity of the job variables
(for example, bonuses are given to everyone, thereby rendering the bonus a completely predictable salary
supplement and not an incentive payment). However, this criticism glosses over the many different
organizational contexts within the public sector. For example, the outcomes of schools and tax authorities
are more measurable than those of central policy or administrative units in ministries and departments.
Moreover, many large private sector bureaucracies also approximate public bureaucracies on some of
these variables: while there is the profit "bottom line", the controllability of outcomes can be low and
the tasks that generate the outcome can be multidimensional. In examining the empirical evidence,
therefore, it is very important to distinguish between these different contexts in order to better understand
under what conditions these schemes can or cannot work.
This paper borrows and slightly modifies Wilson's typology (Table 1) to organize the empirical evidence
so as to present these contextual nuances.5 Jobs can be characterized by whether or not the job's outputs
are easily measurable and whether or not the actions in the job to produce the output, or the internal
production process, are observable. The matrix provides a framework within which to organize the
empirical evidence by job type, with the simplifying assumption that jobs with multiple dimensions are
located within the cell that represents the most complex of those dimensions. The top left box describes
"Production Jobs" in which outcomes are easily measurable, the production process consists of
repeatable, mechanical tasks that are observable to an outside monitor, and controllability is likely to be
high. Typical examples are manufacturing factory-floor jobs, sales jobs, and municipal services like
garbage collection. If the production process is not directly observable, but outputs remain measurable,
such jobs are termed "Craft Jobs". With recent advances in measuring learning outcomes, teaching can be
classified as a public service in which the exact process of production is hard to fix, but, at least to a
certain degree, desired outcomes are quantifiable. Similarly some of the outcomes of healthcare,
particularly preventative services like child immunization, are also more measurable. Other examples
include tax collection, job placement services, and auditing.
In the bottom row are "Procedural Jobs" and "Coping Jobs". Both are characterized by outcomes that are
difficult to measure, but again differ in the observability of the production process to an outsider.
Procedural jobs like the military have clearly defined inputs, whereas coping jobs, such as administrative jobs in general
policy units of the central government, neither produce easily measurable outputs nor have transparent
production processes. Coping jobs present the most challenging functional contexts for PRP.
The general literature on performance pay is quite positive in relation to production jobs. Much of the
earlier literature on incentive payments in the private sector looked at the impact of piece-rate
compensation on productivity in production organizations involving manufacturing jobs or on similar
repetitive tasks. Stajkovic and Luthans (2003) conducted a meta-analysis of 72 empirical studies that
investigate the impact of financial, as well as other, incentives on various measures of performance in
organizational settings. These studies were conducted in a variety of private organizational settings, and
included both industrial and service sector organizations. The study showed that financial incentives alone
improved task performance by 23 percent, whereas financial incentives combined with social recognition
and positive feedback from superiors increased task performance by 45 percent.
5 Wilson had originally used this framework to classify organizations and not jobs, the implicit assumption being
that organizations were homogeneous in the tasks that they performed. This adaptation of the framework to
jobs does not change the logic of the typology, and is consistent with the proposal made by Pritchett and
Woolcock (2004) to classify decision-making functions in the public sector according to the discretion inherent
in the task (vs. simple rule-following) and the number of transactions necessary to deliver the service.
Tasks which are necessarily discretionary and transaction-intensive are "practice" tasks in Pritchett's logic and
"coping" tasks in Wilson's view (but essentially refer to the same type of job) and, as Pritchett notes, "the
provision of key, discretionary, transaction-intensive services through the public sector is the mother of all
institutional and organizational design problems" (Pritchett and Woolcock 2004, p.196).
Table 1: James Q. Wilson's classification of job types

| Outputs from the job | Actions or internal production process of the job: observable | Actions or internal production process of the job: not observable |
|---|---|---|
| Relatively easily measurable | Production job: Simple repetitive stable tasks, specialized skills. Examples: Manufacturing, sales, simpler municipal services (garbage collection). | Craft jobs: Application of general sets of skills to unique tasks, but with stable, similar outcomes. Examples: Auditing; revenue collection; teaching; medical practice; job placement work. |
| Not easily measurable | Procedural job: Specialized skills; stable tasks, but unique outcomes. Examples: Military. | Coping job: Application of generic skills to unique tasks, but outcomes cannot be evaluated in absence of alternatives. Examples: Administration; managerial jobs in large private sector organizations. |
A well-known study by Lazear (2000) uses individual-level worker data from a glass company and
estimates a 44% increase in productivity after switching to a piece-rate salary schedule. In a structurally
similar setting, the introduction of piece-rate pay in a Canadian tree planting business has also been found
to strongly affect worker productivity and profits (Paarsch and Shearer 1999).
However, production jobs, as well as procedural jobs, are of limited relevance to the public sector.
Rather, given that the loci for incentive schemes, following the principal-agent literature, are mostly in
contexts where actions are not easily observable but outcomes may or may not be, the focus of this paper
will be on jobs in the right hand column of Table 1. Note that some of the studies of the private sector that
have focused on incentives for rank-and-file "knowledge workers" in large organizations can also be
classified in this right hand column and will be included in the discussion.
Methodological approaches
The vast empirical literature on PRP utilizes a range of methodological approaches. Early studies on
performance pay in the public sector were largely qualitative case studies. As elements of civil service
reform and incentive schemes were introduced in OECD countries throughout the 1980s and 1990s,
academics and policy experts wrote initial studies and reports that summarized the main descriptive
features of reform attempts, and chronicled the reform process and implementation. Overall evaluation of
the success of incentive schemes was based on qualitative impressions of practitioners. Later studies
approached questions of performance pay slightly more systematically, comparing several reform cases
with each other and using convenience samples of employees and senior management, subject to
performance pay, to collect data on self-assessed motivation and satisfaction with newly introduced
performance pay. While attempting to provide a systematic evaluation of the successes and failures of
performance pay reform, to a large extent these studies rely on weak research designs. Despite their
descriptive value, inferences based on the single or comparative evaluation of reform cases, without the
use of counterfactuals, cannot establish the existence or absence of potential program effects.
Furthermore, studies that rely on convenience samples and on survey instruments that measure only self-assessed
motivation suffer from selection and perception bias, while ignoring information on actual
outcomes.
Although the tenor of this initial wave of research was rather discouraging about the results on the
effectiveness and popularity of performance pay among staff, incentive schemes for the public sector did
not lose their appeal to policy makers (Marsden 2009). An intense empirical debate on teacher incentives
was sparked in the context of local American school reforms and a national debate on test-score-based
school accountability. A series of papers by education economists used the opportunity of local reform
attempts to collect data on samples of schools and students subject to new performance pay arrangements
and on comparable control populations. Because these studies measured detailed public service outcomes (students' grades and
test scores, drop-out rates and attendance records) and gathered detailed information on bonus programs, the
quantitative evaluation of this new data allowed a more stringent test of the competing theoretical
arguments. The advantage of these studies compared to prior work is their ability to disentangle the effect
of teacher effort from other factors that also determine student outcomes.
While an improvement, many of these quantitative observational studies and similar work on private
companies and the public service still fall short of an ideal research design for causal inference on
program effects. The gold standard for program evaluation is the randomized controlled trial
(RCT), in which treatment assignment to subjects is randomized and unrelated to other observable and
unobservable characteristics. This randomization allows the estimation of the treatment effect by a
comparison of treated and control units. Observational studies on the other hand rely on treatment and
control groups created not by controlled random assignment, but produced by social processes, often
related to the research question at hand. Issues of selection bias and confounding factors can undermine
the internal validity of such studies.
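The difference between randomized and self-selected treatment can be illustrated with a small simulation. The data-generating process below (an unobserved ability term, abler employees opting into the bonus scheme, a hypothetical true effect of 2.0) is an assumption made purely for illustration and does not reproduce any study reviewed here.

```python
import random

# Illustrative simulation; the data-generating process and all numbers are assumptions.
TRUE_EFFECT = 2.0  # hypothetical effect of the bonus scheme on measured output


def estimated_effect(randomized, n=20000, seed=0):
    """Difference in mean outcomes between treated and control employees."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)  # unobserved by the evaluator
        if randomized:
            # RCT: a coin flip decides treatment, independent of ability.
            in_scheme = rng.random() < 0.5
        else:
            # Observational: abler employees are more likely to opt in.
            in_scheme = ability + rng.gauss(0.0, 1.0) > 0.0
        outcome = 10.0 + 3.0 * ability + TRUE_EFFECT * in_scheme + rng.gauss(0.0, 1.0)
        (treated if in_scheme else control).append(outcome)
    return sum(treated) / len(treated) - sum(control) / len(control)


print(f"true effect:            {TRUE_EFFECT:.2f}")
print(f"randomized estimate:    {estimated_effect(randomized=True):.2f}")
print(f"observational estimate: {estimated_effect(randomized=False):.2f}")
```

In this setup the randomized comparison recovers something close to the true effect, while the naive observational comparison is inflated because it also picks up the ability gap between those who opt in and those who do not.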
Utilizing the power of a randomization framework, several behavioral economists have used laboratory
experiments to test hypotheses with regard to incentive schemes. Laboratory experiments offer at least
two distinct advantages over observational studies: the researcher can use randomization to ensure
unbiased and consistent estimation of treatment effects; and researchers can design their experiment to
directly relate to theoretical questions at hand. The review of the theoretical literature has identified the
importance of design details and the plethora of possibilities when it comes to performance pay
arrangements. Observational studies have to rely on bonus pay schemes that have been implemented in
real life, which do not necessarily cover all interesting variations and often mix different elements that
conflate theoretical questions of interest. Being able to design a clean and tailored experiment to trace the
effect of intrinsic versus extrinsic motivation in a linear bonus pay scheme for example is a huge
advantage which laboratory experiments offer. A wave of articles in the late 1990s and 2000s explores
various issues of performance pay. While offering certain advantages over observational studies,
laboratory experiments often use notoriously small samples and student subjects that share few
characteristics with actual workers or public servants. Furthermore, laboratory experiments can hardly
ever replicate real work place settings or offer bonus schemes that remotely approach bonus sizes
common even in only moderately incentivized performance pay schemes. This raises concerns about
sampling and the representativeness of the subject pool, as well as about the comparability of
laboratory treatments and real-world bonus programs. Researchers generally attribute low external validity
to laboratory experiments and caution against the isolated interpretation of results derived from a single
experiment.
The most recent attempts at addressing the issue of proper causal inference on performance pay and
increasing the representativeness of results are field RCTs. In a field RCT researchers are able to randomize key
features of an actual policy program that serves the actual population of interest. The advantage of such
a field experiment is the similarity of the target population and of the structure of the incentive program to real-world settings, paired
with the randomization of treatment. Although field RCTs are time- and resource-intensive studies, several
teams of researchers have implemented similar studies in different contexts around the world,
considerably adding to the empirical understanding of performance pay.
4. The empirical literature reviewed
Given that the literature has analyzed incentive pay for a broad set of jobs using an array of
methodological approaches, the following presentation of the empirical literature is sorted by
(a) the nature of the job, using Wilson's typology, and (b) the methodological approach. In total 153
empirical studies of PRP (see the Appendix for the full list) were considered in this review, of which 110
are for craft and coping jobs (Table 2). The research to date on the subject has largely focused on
advanced countries: in the review 127 studies are in OECD contexts, and only 26 are in developing
country settings. The literature has also focused largely on craft jobs and production jobs, with no
experimental studies to date on coping jobs.
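Table 2: Studies by country environment, methodology, and job type

| Country and methodology | Production jobs | Procedural jobs | Coping jobs | Craft jobs | Unclassified | Total |
|---|---|---|---|---|---|---|
| OECD study | 27 | 0 | 15 | 72 | 13 | 127 |
| OECD: Observational | 14 | 0 | 15 | 59 | 13 | 101 |
| OECD: Field RCT | 7 | 0 | 0 | 13 | 0 | 20 |
| OECD: Lab. experiment | 6 | 0 | 0 | 0 | 0 | 6 |
| Developing country study | 1 | 0 | 1 | 22 | 2 | 26 |
| Developing country: Observational | 0 | 0 | 1 | 15 | 2 | 18 |
| Developing country: Field RCT | 0 | 0 | 0 | 6 | 0 | 6 |
| Developing country: Lab. experiment | 1 | 0 | 0 | 1 | 0 | 2 |
| Total | 28 | 0 | 16 | 94 | 15 | 153 |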
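The counts in Table 2 can be cross-checked against the figures quoted in the text (153 studies in total, 127 in OECD and 26 in developing country settings, 110 on craft and coping jobs). The short script below is only a consistency check. The assignment of each column of numbers to a job type is a reading of the table layout, supported by the fact that the sums agree with the text, and the variable names are hypothetical.

```python
# Counts transcribed from Table 2; column order follows JOB_TYPES.
JOB_TYPES = ["production", "procedural", "coping", "craft", "unclassified"]

OECD = {
    "observational":  [14, 0, 15, 59, 13],
    "field_rct":      [7, 0, 0, 13, 0],
    "lab_experiment": [6, 0, 0, 0, 0],
}
DEVELOPING = {
    "observational":  [0, 0, 1, 15, 2],
    "field_rct":      [0, 0, 0, 6, 0],
    "lab_experiment": [1, 0, 0, 1, 0],
}


def column_totals(block):
    """Sum each job-type column over the methodology rows of one country group."""
    return [sum(col) for col in zip(*block.values())]


oecd_by_job = column_totals(OECD)
developing_by_job = column_totals(DEVELOPING)
all_by_job = [a + b for a, b in zip(oecd_by_job, developing_by_job)]
totals = dict(zip(JOB_TYPES, all_by_job))

print("OECD studies:", sum(oecd_by_job))
print("Developing country studies:", sum(developing_by_job))
print("All studies:", sum(all_by_job))
print("Craft plus coping:", totals["craft"] + totals["coping"])
```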
Observational studies
Public sector coping jobs
Observational studies of performance-related pay for public sector coping jobs suggest somewhat
limited effectiveness of this component of pay flexibility for these activities. They also focus
overwhelmingly on OECD settings.
A series of OECD reports and associated discussion papers chronicle the type and extent of pay-related
civil service reforms in advanced industrialized countries (OECD 1993, 1996, 1997b; Kim 2002; Burgess
and Ratto 2003; OECD 2004a, 2005a, b; Perry, Mesch, et al. 2006; Ketelaar, Manning, et al. 2007;
Rexed, Moll, et al. 2007; OECD 2008, 2009; Perry, Engbers, et al. 2009). Generally, these concern coping
jobs in that they refer to, or imply that they are referring to, managers with complex tasks.
Cardona (2007) reviews incentive programs in the US, particularly the Performance Management and
Recognition System, the UK's Inland Revenue Service performance scheme and similar attempts in
Australia. The study documents several common issues in the implementation of performance pay:
employees are hardly ever scored less than satisfactory in their evaluations, bonus systems were designed
so that only very few employees actually received any payments, and the majority of staff found the
system de-motivating and liable to incite jealousies. Straberg (2010) highlights the problem of perceived
unfairness following the introduction of performance pay in an OECD country, although the study also notes that
there was no empirical linkage between pay justice perceptions and workplace behaviors. Managers
equally found few positive changes resulting from the introduction of performance pay. As context,
multiple studies confirm the political and operational difficulties of successfully introducing any major
program of pay reform within the public service (World Bank 1999; Kiragu and Mukandala 2003;
Independent Evaluation Group 2008).
Brudney and Condrey present a pair of typical studies drawing on non-representative surveys of federal
managers and government employees in the US who were subject to performance-pay arrangements
(Condrey and Brudney 1992; Brudney and Condrey 1993). While largely descriptive in nature, they find
that 17% of managers report an increase in motivation, but prior attitudes about merit pay affect this
result. It is unclear though whether the documented increase of motivation is solely an effect of
performance pay, since no control units are used to rule out the influence of other possible confounders or
if self-reported motivation is in any way linked to actual work outputs. Similar work on performance pay
for civil servants in the US state of Georgia also finds highly critical opinions of staff members with
regard to explicit evaluation and selective bonus pay (Kellough and Nigro 2002). A more recent study
using survey data on US city managers finds higher employee satisfaction when performance pay is
used (Stazyk 2010). The research does, however, highlight the complexity of the issue, as it suggests that
there is a crowding-out effect on extrinsic motivation for staff with a distinctively high level of intrinsic
motivation — but ultimately this does no harm to effort or to job satisfaction.
(Dowling and Richardson 1997) evaluate the effect of performance pay on UK National Health Service
managers. Because the study focuses on management rather than physicians, it is more closely related to
core public administration jobs than to healthcare jobs. Using self-reported data from a survey, they find a
modest positive effect of pay incentives on manager motivation and effort.
Whether PRP should be used is one thing; when and where it can be used is another. (Dahlstrom and
Lapuente 2009) theorize, using transaction cost economics, that when senior officials share a career path
with elected politicians, and thus are unlikely to act impartially, PRP is used less often because those
officials are unable to provide credible commitments to their staff. Why would one work hard to achieve a
goal if it transpires that the goalposts might be arbitrarily moved for political reasons at the last moment?
They find empirical support for this prediction.
Public sector craft jobs
Observational studies of performance-related pay for public sector craft jobs are more optimistic
about its effectiveness. There is more evidence to draw on from developing countries, particularly
in relation to teaching and health care.
Tax administration, job placement
Revenue authorities are examples of public sector agencies that can be classified as craft organizations
where outputs — number of audits conducted and tax fines collected — are more easily measurable and
there is a clearer link between the efforts of, for example, individual tax auditors and revenue collection.
A good example of a detailed PRP scheme with high-powered performance incentives comes from the
federal Brazilian tax collection agency. In 1988 the Brazilian government created a bonus program for tax
officials that rewarded the identification of tax violations. Base salary was augmented each month
by both individual and group rewards. The group reward was calculated based on the relative performance of
one local agency versus others, with relative performance measured based on total fines collected,
attainment of pre-defined quotas (total tax collection, number of inspections, collection of overdue taxes)
and the size of the agency. Individual rewards were based on monthly evaluations by the direct
supervisor, which combined objective performance criteria and managerial discretion and rated
employees on a scale from zero to 70. Each employee who scored more than 21 points was entitled to an
individual reward with the value being determined by the overall availability of funds (which are
proportional to the collected fines) and the performance of co-workers. It was not unusual for total bonus
payments to reach 200% of base pay.
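The individual-reward rule described above can be illustrated with a short sketch. Only the zero-to-70 rating scale and the 21-point eligibility threshold come from the description above. The proportional-to-score split of the bonus pool, the function name and the example figures are illustrative assumptions, since the exact allocation formula is not given here.

```python
from typing import Dict

# From the text: supervisors rate employees on a scale from zero to 70,
# and only scores above 21 entitle an employee to an individual reward.
MAX_SCORE = 70
ELIGIBILITY_THRESHOLD = 21


def allocate_individual_rewards(scores: Dict[str, int], bonus_pool: float) -> Dict[str, float]:
    """Split a bonus pool among eligible employees.

    ASSUMPTION: the pool is shared in proportion to each eligible employee's
    score. The text only says the reward depends on available funds and on
    the performance of co-workers; this rule is one hypothetical way to
    capture both features.
    """
    for name, score in scores.items():
        if not 0 <= score <= MAX_SCORE:
            raise ValueError(f"score for {name} must lie between 0 and {MAX_SCORE}")

    eligible = {name: s for name, s in scores.items() if s > ELIGIBILITY_THRESHOLD}
    total_eligible_points = sum(eligible.values())

    # Ineligible employees receive nothing; the division is only evaluated
    # for eligible employees, so an empty eligible set is safe.
    return {
        name: (bonus_pool * s / total_eligible_points if name in eligible else 0.0)
        for name, s in scores.items()
    }


if __name__ == "__main__":
    # Hypothetical monthly ratings and a hypothetical pool of 900 currency units.
    print(allocate_individual_rewards({"a": 60, "b": 30, "c": 20}, 900.0))
```

Under this assumed rule, one employee's reward falls when co-workers score higher, which is the feature of the scheme that ties individual payouts to the performance of colleagues.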
(Kahn, De Silva, et al. 2001) found that this incentive scheme resulted in a 75% increase in fines per
inspection. At the same time, they also found substantial regional variation, with responses ranging from
19% to 145%. The authors caution that diverse management techniques resulted in some regions
targeting wealthier sources (such as corporations) more aggressively, which points to the potential
negative effects of such high-powered incentive schemes, which may encourage extortion. Unfortunately,
limited data prevented the authors from examining these social costs further.
Another study of revenue authorities is by (Bertelli 2006), which explores the potential tradeoff between
intrinsic and extrinsic motivation in the Internal Revenue Service in the US. The IRS implemented a
paybanding system that imposed high-powered performance incentives on supervisors, but not on
non-supervisory personnel. Using data from the 2002 Federal Human Capital Survey, the author showed that
the incentive scheme crowded in intrinsic motivation at the lowest pay levels, and crowded it out at the
highest levels.
(World Bank 2001) used survey data from revenue departments in 14 low-, middle- and high-income
countries, and detailed case study evidence from 7 of those, to review the effectiveness of bonus and
salary supplement systems as a means to enhance effectiveness in revenue departments. They concluded
that the "circumstantial evidence" suggests that bonus systems do indeed seem to have an impact on
organizational effectiveness. They note that in a number of countries the introduction of bonus systems
has had a measurable impact on recruitment and retention of employees. However, they note that the
success of bonus systems relies heavily on "legitimacy", i.e. the internal and external "acceptance" of the
bonus system.
A set of studies of performance incentives in a similar organizational context is that of the US Job
Training Partnership Act (JTPA). Under the Act, 620 semi-autonomous training centers were responsible
for implementing job training programs for the indigent, and were given financial incentives tied to labor
market outcomes — employment status, earnings — of the trainees. These bonuses were given to the
training centers thereby augmenting their budgets but could not be used to supplement staff salaries.
(Courty and Marschke 2004) find evidence that gaming was prevalent among agency staff in the
choice of the termination date of training for participants, which, while increasing organizational
bonuses, imposed a cost on the participants in terms of earnings. Similar effects have been found by related
studies of the program (Heckman, Heinrich, et al. 1997).
An early quantitative observational study of performance pay in the public service was conducted by
(Asch 1990), who collected data on the behavior of Navy recruiters subject to a point-based performance
system. The incentive consisted of a point-scheme for the quality of recruited candidates, a fixed time
frame for evaluation and a minimum threshold of points needed to qualify for a bonus. Asch shows that
the incentive scheme did increase the effort of recruiters and led to the recruitment of more high-quality
candidates, but also induced recruiters to exhibit a form of gaming behavior: they increased recruiting
efforts early in the cycle and, once they expected to reach the bonus level, reduced their
effort for the remainder of the evaluation period.6
Teaching
The literature on performance pay for teachers, essentially a craft job, also shows a variety of research
designs. Some researchers use qualitative case studies (Murnane and Cohen 1986) or perception surveys
of teachers subject to performance pay to measure effects on performance (Heneman III and Milanowski
1999; Kelley 1999). These studies often show a low degree of satisfaction with bonus systems and
explicit evaluation. By contrast, studies focusing on actual outcomes often find rather encouraging results.
A series of papers use the introduction of performance pay for teachers to quantitatively assess the effects
on student outcomes, drawing on a variety of data sources and structures.
In the American context researchers have evaluated the effects of teacher quality on student outcomes
(Goldhaber and Brewer 2000; Hanushek and Rivkin 2006; Clotfelter, Glennie, et al. 2007). Since
performance pay is argued to be one important tool for attracting and retaining highly-qualified teachers,
it is important to determine the effectiveness of merit systems in that regard. (Clotfelter, Diaz, et al. 2004;
Clotfelter, Glennie, et al. 2008) show, using detailed data from North Carolina's schools, that
accountability and performance pay systems contribute positively to retaining quality teachers. The
introduction of merit pay can also be linked to student test scores, but with varying empirical robustness.
(Cooper and Cohn 1997) find, for a sample of over 500 South Carolina classes, a positive effect of merit
awards for teachers on mathematics and reading test score achievements.
Cross-sectional studies using data from the American National Educational Longitudinal Survey have
been used to show a positive link between individual merit awards for teachers and student test scores
(Figlio and Kenny 2007). Positive effects of performance pay have also been found in Arkansas
kindergartens (Winters, Ritter, et al. 2009). To mitigate problems of causal inference, researchers can
sometimes use difference-in-difference estimation, exploiting geographic differences in the phasing-in
of reforms (Eberts, Hollenbeck, et al. 2002; Atkinson, Burgess, et al. 2004). While (Eberts, Hollenbeck, et
al. 2002) find no effects on student test scores and even slightly negative effects on other outcomes, their
analysis relies on student-level data from only two schools. A more thorough and systematic
difference-in-difference analysis by Atkinson et al. finds clear positive effects of performance pay for British
schools. By utilizing particular features of Tennessee's Career Ladder System and the Project STAR field
experiment, (Dee and Keys 2004) are able to link teachers' quality assessments, as expressed in the career
ladder grouping, to student test scores. They find that the official career ladder system had only mixed
success in rewarding teachers with the highest test score gains, but nonetheless teachers with merit awards
had positive effects on students' math scores. They found, however, no statistically significant effects on
reading scores.
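The difference-in-difference logic used in several of the studies above can be stated compactly. The notation is introduced here only for illustration and is not taken from any of the cited papers: $\bar{Y}$ denotes a mean student outcome, $T$ and $C$ denote schools exposed and not (yet) exposed to the reform, and "pre" and "post" denote the periods before and after the phase-in.

```latex
\hat{\delta}_{\mathrm{DiD}}
  = \left(\bar{Y}_{T,\mathrm{post}} - \bar{Y}_{T,\mathrm{pre}}\right)
  - \left(\bar{Y}_{C,\mathrm{post}} - \bar{Y}_{C,\mathrm{pre}}\right)
```

The estimate attributes to performance pay only the change in exposed schools beyond the change observed in comparison schools over the same period. It is credible only if both groups would otherwise have followed parallel trends, which is why analyses based on very few schools are treated with caution.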
6
An early example of significant gaming is described by (Wilms and Chapleau 1999), who
note that performance-based pay began in the UK in about 1710, with teachers' salaries based on test scores in reading,
writing and arithmetic. The rationale was that it would help keep students from poor families in school, where
they could learn the basics. In reality, the incentives encouraged teachers to narrow the curricula to include
only easily assessed subjects, and cheating by both inspectors and teachers made the system ultimately
untenable. The system was dropped in the 1890s. A similar scheme was introduced briefly in Canada in 1876,
but it ran into similar difficulties and was terminated around the same time.
Several studies from the context of American school reform also document the role of unintended side
effects of explicit accountability programs. Large-scale testing of students as part of the 2001 No Child
Left Behind Act ties student test scores to important resource allocations for schools, giving schools
incentives to improve student learning, but also to increase pure test-taking ability or to engage in outright
cheating (Jacob and Levitt 2003; Jacob 2005). Quite surprisingly, (Figlio and Winicki 2005), using daily
lunch menu data from a random sample of 23 school districts in Virginia, show that even the caloric
content of school lunches was adjusted upward on test days to improve cognitive ability.
An interesting study of private schools in India assesses the role of teacher unionization on student
outcomes (Kingdon and Teal 2008). While not explicitly evaluating performance pay, unionization of
teachers represents an increase in job security and uniform, higher pay, without being linked to explicit
performance standards. (Kingdon and Teal 2008) find strong negative effects of unionization on student
outcomes, using a pupil fixed-effects design that exploits within-pupil, across-subject variation. The study by (Ladd 1999),
mentioned above, on school accountability in Dallas, uses panel data and finds positive effects of merit
pay on student performance and dropout rates.
A set of observational studies (Lavy 2008, 2009) uses data from an Israeli policy experiment with
tournament-based teacher competition for bonuses. Using regression discontinuity and
difference-in-difference designs to approximate random treatment assignment, the studies show significant gains in
student achievement. The studies also assess potential mechanisms for the link between performance pay
and improved test scores, identifying changes in teaching methods, enhanced after-school teaching and
increased teacher responsiveness, rather than test-score manipulation, as important causal channels.
A comprehensive study using cross-national data on performance systems in schools and PISA test scores
also finds a positive association between pay-for-performance type reforms, improved teacher quality
and student test scores (Woessman 2010).
Health care jobs
A fairly large literature has dealt with the role of performance pay in improving health care delivery and
services. General literature reviews (Petersen, D., et al. 2006; C. and N 2009) reflect the diversity in
empirical approaches and the strong focus on OECD country experiences. In particular performance
incentives in the UK NHS system and several U.S. insurance systems have received attention in the
research literature.
A series of studies has evaluated the potential effects of financial incentives for primary care physicians.
The British NHS introduced performance-pay elements into the remuneration of family practitioners in
2004. (Doran, Fullwood, et al. 2006) use data on over 8000 family practices and evaluate the effect of
performance pay on patient outcomes and find overall high performance in the first year of the incentive
scheme, but also evidence of "gaming" through the exclusion of patients. (Campbell, Roland, et al. 2005;
Campbell, Reeves, et al. 2007) also analyze the role of financial incentives in a stratified random sample
of British general practices focusing on care for coronary heart disease, asthma and type 2 diabetes,
finding a substantial effect of financial incentives introduced in 2004. (Steel, Maisey, et al. 2007) find
positive effects on asthma and hypertension treatment, as do (Vaghela, Ashworth, et al. 2009).
(Chalkley, Tilley, et al. 2010), using evidence derived from a natural experiment in the UK publicly
funded dental care system, analyzed the effects of using incentive pay that provided explicit rewards for
increased service provision against the alternative of offering an employment-like relationship. They
found that dentists who were moved from a quasi-employment arrangement to an activity-based incentive
contract increased their activity in the publicly funded service by 26%. They also found evidence of
considerable variation between suppliers, which suggests that factors such as an individual's intrinsic
motivation, professional standards, and preferences were important moderators of financial incentives.
In the U.S. context, several insurance providers and health maintenance organizations have experimented
with elements of financial incentives for care providers in various states. An early study of health
maintenance organization (HMO) managers' views on financial incentives found mixed support for the
effectiveness of performance pay (Hillman, Pauly, et al. 1991).
More recently, several studies have found fairly positive effects of performance incentives on health
services, patient outcomes and satisfaction. (Safran, Rogers, et al. 2000) use a cross-sectional study of
Massachusetts adults to assess the effects of various health-maintenance organizations (HMO) and their
specific contract elements on primary care. One of the results links financial incentives for physicians to
patient satisfaction. An evaluation of a performance pay pilot program for physicians found meaningful
improvements for diabetes patients compared to the control group (D. and Horrigan 2005). Small positive
or mixed effects were also found in studies by (Amundson, Solberg, et al. 2003; Casalino, Gillies, et al.
2003; McMenamin, Schauffler, et al. 2003; Levin-Scherz, DeVita, et al. 2006; Coleman, Reiter, et al.
2007; Felt-Lisk, Gimm, et al. 2007; Mandel and Kotagal 2007; Young, Meterko, et al. 2007).
(Rosenthal, Frank, et al. 2005) analyze a natural experiment, comparing quality improvements in two
physician groups in the U.S. from 2001 to 2004. They find improvements in cervical cancer screenings
but not in other outcomes, with the scheme largely rewarding practices with a high baseline performance. A cross-sectional
study of primary care physicians who contracted with Medicaid managed care organizations in 2002 in
California found a partially positive effect of incentive pay on STD care (Pourat, Rice, et al. 2005).
(Lindenauer, Remus, et al. 2007) analyze the effects of public reporting and pay-for-performance in
hospital care in a Medicare/Medicaid demonstration project. Hospitals participating in the performance
scheme show a significant improvement in overall measures of patient care quality, including care for
heart failure, acute myocardial infarction and pneumonia, by up to 16% compared to the control group. In
a related, but patient-level study (Glickman, Ou, et al. 2007) evaluate the largest pay-for-performance
pilot project in the U.S., finding no conclusive effects for several treatments and patient outcomes.
Similarly, (Pearson, Schneider, et al. 2008) find that pay-for-performance elements in physician contracts
in Massachusetts did not add any significant gains above and beyond secular improvement in a time
period from 2001-2003. In a study of public community health centers in Houston, (Gavagan, Du, et al.
2010) also find no effects of performance pay, while (Chung, Palaniappan, et al. 2010) find no effects for
primary care physicians in California. Mirroring the results found in other areas, while financial
incentives can improve particular behavioral responses of staff members, it is difficult to design an
incentive scheme that does not also produce unintended consequences and reward unwanted behavior.
(Shen 2003) provides evidence of gaming and selection effects of financial incentives for substance abuse
care providers. (Li, Hurley, et al. 2011) utilize a natural experiment in Ontario, assessing the effect of
performance-related pay on physician behavior and targeted primary care provision. They do find positive
results for some, but not all financial incentives, providing a cautionary message with regard to the
potential impact of performance pay.
Importantly, all these studies evaluate performance pay for specific, easily measurable tasks, in a highly
institutionalized environment with powerful monitoring capabilities. Observational studies dealing with
performance pay in the health care sector in the developing country context are much fewer in number
and can draw on fewer large-scale experiences. Despite the lack of widespread use of incentive pay in the
developing world, incentive-focused reforms for doctors and nurses have received some attention.
(Vujicic 2009) outlines the potential benefits of performance pay for health services staff, but also
identifies the risks of supplier-induced demand and cost-explosions. He also correctly distinguishes
between performance pay within health care units and the general contracting-out of health services to
NGOs. Although the contracting of services often includes overall performance targets and entails the
introduction of staff performance incentives within contracted facilities, this review focuses
only on performance pay elements in health care facilities.7
Studies focusing on staff-level performance pay in low and middle income countries generally find
positive results, but largely illustrate the lack of systematic findings and evidence. (McNamara 2005)
discusses six cases of payment for quality in the health services sector across developed and
developing countries, with the cases in Nicaragua and Haiti having had a positive effect. The Nicaraguan
reform efforts, however, combined decentralization of decision-making authority and increased local
accountability with explicit performance agreements. While the reforms were judged to have led to an overall
improvement (Jack 2003), it is hard to disentangle the effects of each reform element.
Similarly, a recent study by (Witter, Zulfiqur, et al. 2011) evaluated a pay-for-performance arrangement in an
NGO-led health project in the Battagram district of Pakistan and found that general
services provision had improved, but that the effect of the performance-based elements was unclear. The study
highlights the weak link between bonus pay and performance, as well as the low level of
monetary incentives in relative terms.
A study of health care reform efforts in two Rwandan districts shows that the use of performance
elements paired with increased autonomy seems to offer a viable and cost-effective way to improve health
care delivery (Meessen, Musango, et al. 2006). In a later study, (Meessen, Kashala, et al. 2007) evaluate
the performance of 15 health centers in Kabutare, Rwanda. They document a sharp increase in staff
productivity after the introduction of output-based bonuses. (Soeters, Habineza, et al. 2006) highlight the
potential applicability of the Rwandan experience in sub-Saharan Africa more generally.
Similarly, efforts to improve health services provision in Haiti using performance-based payment for
NGOs in a USAID pilot project showed encouraging effects on immunization coverage and
organizational behavior (Eichler, Auxila, et al. 2001).
A recent book by (Eichler and Levine 2009) outlines the general argument for the use of explicit incentive
schemes in the provision of health services in developing countries. Apart from demand-driven financial
incentives through conditional cash transfer programs, they discuss the role of performance-pay and
contracting-out of services provision. They review evidence from various contexts, in particular
experiences with contracting NGOs in Afghanistan and project evaluations in Haiti and Nicaragua,
overall advocating the increased use of financial incentives.
Private sector: Craft or coping jobs
Observational studies on performance-related pay in private sector craft or coping jobs generally
suggest that PRP has a positive impact, but highlight the importance of careful design.
There is a newer and growing empirical literature that has looked at incentive payments in private
organizational settings that are closer to those of ministries and departments, characterized by low
measurability and low controllability of tasks. These organizational settings are akin to craft or, in some
cases, coping organizations. Analyses of incentive schemes in such organizational settings also exhibit a
large variation in the research designs. (Beer and Cannon 2004) study the failure of 13 incentive plans at
Hewlett Packard using interviews and internal documents, finding that managers abandoned the programs
due to the perceived costs. Providing slightly more comprehensive evidence, (Beer and Katz 2003), in a
worldwide survey of 205 top managers, document the weak support for incentive schemes among
management. Both studies echo some concerns identified in similar public sector studies, but suffer from
weaknesses in their research design.
7
(Loevinsohn and Harding 2005) review the success of ten contracting-out projects in the developing world,
finding largely encouraging results.
A number of quantitative observational studies improve upon prior work by using more representative
samples. (Belfield and Marsden 2003) use panel data from a large UK workplace survey, which includes
both piece-rate jobs and "knowledge work" jobs in which performance is based on achievement of
previously agreed goals, and find strong effects of individual pay-for-performance, but only conditional
on the monitoring regime. Another study uses the British Household Panel Survey to distinguish the
productivity and sorting effects of performance pay, finding that jobs with performance-related pay attract
workers of higher ability and induce workers to provide greater effort (Booth and Frank 1999).
(Blasi, Freeman, et al. 2008) econometrically examine the relationship between various forms of profit
sharing and stock options on staff turnover, absenteeism, effort, and other productivity measures. The
analysis is based on two large surveys of private sector firms, and finds statistically significant positive
linkages between these shared capitalism schemes and perceptions of workplace performance such as
turnover, loyalty, and worker effort. These incentives have the strongest impact when combined with
other organizational variables such as competitive wages, training, and employee involvement in
workplace policies.
(Hochberg and Lindsey 2010) is one of the few empirical examinations of the impact of stock options for
rank-and-file employees on firm performance (as opposed to the impact of options for top executives, on
which there is a large literature). Since stock options are a group incentive based on overall company
performance, incentives to free-ride are very high. However, other literature suggests that
non-executive compensation may also increase cooperation and encourage mutual monitoring among
co-workers. Using a large database for a broad set of firms, and explicitly controlling for endogeneity, the
study shows that stock options exert a positive effect on firm performance. The study also finds that this
effect is stronger in smaller firms, consistent with the free-riding hypothesis, and also stronger in firms with
higher growth opportunities, where the monetary incentive is higher.
(Aboody, Johnson, et al. 2007) also examine the impact of executive and non-executive stock options on
firms' operating performance, based on an empirical investigation of a sample of 1300 firms and
contrasting between firms that re-priced their options to make them an attractive financial incentive
versus those that did not. The study finds that while firms that repriced their options had a larger increase
in operating income and cash flows compared to non-repricers, this impact was entirely due to executive
stock options. The repricing of non-executive stock options had no impact, consistent with the free-rider
argument, and in contrast to (Hochberg and Lindsey 2010).
Some studies from the private sector have also highlighted other conditional variables, such as strength of
social networks, that impact pay incentives. (Bandiera, Barankay, et al. 2005) use data from a farm labor
operation to compare the effects of piece-rate versus relative incentive schemes, showing that individual
piece-rates enhance effort irrespective of social relations among workers, whereas the effects of relative
incentive schemes, which impose negative externalities among workers, are diminished when workers have
stronger social ties. On the other hand, if incentives explicitly recognize team efforts, team rewards can
outperform individual piece-rate wages, as one study on team work in a garment plant has shown
(Hamilton, Nickerson, et al. 2003). Data from a large group incentive scheme at Continental Airlines even
found positive effects in the presence of strong free-rider incentives (Knez and Simester 2001).
Experimental studies
Among experimental studies, a distinction has to be made between laboratory and field experiments, with
each having its own advantages and disadvantages. Laboratory experiments are characterized by their
strong control over experimental design and treatment specification, but often lack the ability to use
representative subjects or to simulate real-world settings convincingly. Field experiments combine the
randomized assignment of treatment with actual real-world programs and participants, but are often more
constrained in their design choices, face stronger exogenous pressures to succeed and are
resource-intensive.
Meta-studies
(Jenkins, Mitra, et al. 1998) provided the first meta-analysis of the psychological literature on the impact
of financial incentives on performance and concluded, based on analysis of 47 studies, that these
incentives resulted in a 12% improvement in performance quantity and a negligible effect on performance
quality. The study however had several limitations as it did not distinguish between types of incentive
programs, complexity of tasks, and differences in organizational settings.
Building on the Jenkins study, (Condley, Clark, et al. 2003) conducted a meta-analysis of 64 field and
laboratory experiments, as well as observational studies, on the impact of monetary and non-monetary
individual and group incentives on performance. The criteria for inclusion in the analysis were
that studies had to (a) use a control group or a pre-treatment measure of average performance; (b)
involve the use of incentives to enhance performance; and (c) report some statistical data. The studies
included private sector settings, public sector, as well as laboratory experiments with college students.
Within the public sector, however, the vast majority were of schools, with only one study in a core
government setting, thereby greatly limiting the generalizability of the findings to the core civil service.
Importantly, the analysis also distinguished between studies that looked at cognitive (38 studies) versus
mechanical manual tasks (26 studies), and the measurability of the tasks, looking at both quantitative and
qualitative performance targets. The results of the meta-analysis showed that employees and other
research participants who received performance incentives achieved an average 22% increase in work
performance. The findings were irrespective of the settings, although again it should be noted that there
was only one study of performance incentive in a government agency. Monetary incentives were found to
be more effective than non-monetary gifts, and group-based incentives were significantly more effective
(48% increase in performance) than individual incentives (19% increase). While the incentives worked
for both cognitive and physical tasks, the gains were higher for the latter (30% increase compared to 20%
increase). No significant difference was found based on the measurability of the tasks.
(Weibel, Rost, et al. 2009) conduct a meta-analysis of 46 high-quality empirical studies published in the
fields of economics and psychology, covering both simple and complex tasks and both
quantitative and qualitative outcome measurements. Overall the study finds a statistically significant and
positive effect of pay for performance on performance; however, and in contrast to the above, the effect
was significantly positive for simple tasks, while a smaller, but still statistically significant, negative effect
was found for complex tasks. The authors argue that this negative effect is due to the reduction in intrinsic
motivation brought about by the incentive scheme.
Laboratory experiments
Building on these psychological studies, a number of behavioral economists have done laboratory
experiments to explore the different aspects of performance pay, including the functional relationship
between bonus sizes and performance, the incentive and sorting effects of performance pay, the impact of
different types of bonuses, and the possible tradeoffs between extrinsic and intrinsic motivation.
(Ariely, Gneezy, et al. 2009) explore the effect of bonus size on performance in laboratory experiments
using subjects in the US and India, with 24 and 87 participants respectively. Participants had to solve
cognitive tasks under time pressure and were incentivized with bonuses that varied from small to large
relative to their normal pay. They found evidence for an "inverse-U" relationship between bonus size and
performance, with a "choking-under-pressure" effect in which bonuses at very high levels lead to a
worsening of performance compared to bonuses at low and moderate levels.
An experiment with 115 Australian students that tried to distinguish the potential incentive and sorting
effect of performance pay found supportive evidence for both hypotheses (Cadsby, Song, et al. 2007). In
addition they found not only that low productivity subjects were less likely to sort into pay-for-performance
jobs, but also that subjects with higher levels of risk-aversion avoided pay-for-performance,
suggesting important unintended side effects. An experimental comparison of piece-rate, team rewards
and relative performance schemes found that piece-rate systems and team rewards overall induce similar
effort levels (free-riding in team rewards is compensated by higher effort contributions). Effort was
higher, but also more variable in tournament-based reward systems. The experiment also revealed that
subjects' attitudes about the varying reward systems differed widely (van Dijk, Sonnemans, et al. 2001).
(Straberg 2010) showed, in an empirical study of the perceived impact of performance-related
pay in Sweden, that men were much more likely to see the arrangement as fair and reasonable.
Tackling the problem of multi-dimensionality of many tasks, (Fehr and Schmidt 2004) conduct an
experiment with university students to understand the effects of varying bonus schemes on effort
provision on two distinct tasks, only one of which is contractible. They find that simple piece-rate
contracts lead to a focusing on the contractible task, while bonus arrangements designed to be more
encompassing and to explicitly address the multi-tasking problem also induce participants to spend time
on the second task.
The issue of extrinsic versus intrinsic motivation and how performance pay can change the perception of
salary arrangement is the subject of a laboratory experiment by (Gneezy and Rustichini 2000). They use
high school and university students in Israel and offer them bonuses of different sizes for specific tasks.
The results suggest that subjects showed higher levels of productivity when offered large rewards, but
small rewards led to worse performance than offering no monetary reward at all. This suggests the
importance of framing of performance pay — if bonuses adequately communicate the importance of
performing assigned tasks well compared to the overall goals of the organization, they can work, but if
bonuses trigger a change in evaluation of the worker relationship, crowding-out of intrinsic motivation
can worsen productivity.
An interesting laboratory experiment recruits future teachers in India to assess the possibility of gaming
effects under performance pay. The experiment assesses teacher efforts when rewards are a function of
average student test scores. In a situation with strong social heterogeneity and prejudice, teachers might
focus on assisting high-status students, while neglecting lower caste pupils. The experiment reveals that
poorly designed incentive plans lead to such a misallocation of teacher effort, producing an unequal
distribution of effort across student groups, but properly designed incentives can mitigate such behavior
(Jain and Narayan 2011).
Public sector craft jobs
The evidence from experimental studies for public sector craft jobs is basically similar to that from
observational studies – there is generally a positive impact but the limitations are also similar, in that the
evidence is largely from OECD countries with the significant exception of teaching and health care where
several significant experiments have been undertaken in developing countries.
Tax administration, job placement
(Burgess, Propper, et al. 2010) use a randomized controlled trial (RCT) to examine the impact of a pilot
team-based incentive scheme introduced in 2002 in Her Majesty's Customs and Excise (HMCE), the
indirect tax assessment and collection agency of the UK government. Each team consisted of a small
number of tax offices, and ranged from 150 to 280 workers; there were two treatment teams that received
two different types of incentive payments (one a bonus that was a fixed percentage of the officer's
salary and the other a flat rate bonus) and a control team. The incentive scheme for both the treatment
teams consisted of meeting a set of targets on revenue collection and conduct of audits, with an average
bonus size of approximately 3% of annual salary. If the target was met then all staff in the team received
the bonus. The authors use detailed data from the HMCE's performance management system and
personnel records to show that the tax yield increased for both the treatment teams relative to the control
group, and that these increases were due to more time spent auditing that resulted in the recovery of
greater tax revenue. The study also found that the strategies of the two incentivized teams were different,
with the managers in team 2 allocating more incentive tasks to efficient workers than managers in team 1.
Whether the flat rate bonus structure contributed to this task allocation was not examined.
(Burgess, Propper, et al. 2011) conducted another RCT to examine the impact of a team-based incentive
scheme introduced in a large UK public agency, Jobcentre Plus, which is tasked with placing the
unemployed into jobs and administering welfare benefits. Each team consisted of a district, which had
several offices, ranging from 250 to 1500 staff. There were 17 treatment districts and 73 control districts.
The incentives were based on achieving both quantitative and qualitative targets — number of individuals
placed in jobs, customer and employer service, and reducing benefit calculation error and fraud. The job
placement target was weighted by the types of clients who found employment (e.g. highest weights to an
unemployed single parent) with extra points if the person retained employment. While individuals worked
in offices, the targets were set at the district level, which consisted of many offices that operated
independently of each other, a potential flaw in the scheme's design. The study was explicitly designed to
assess the impact of incentives given multi-dimensionality of tasks, as well as possible free-riding given
the nature of the team incentive.
The study's findings were that while overall there was little difference between the treatment group and
the control group on job placement, in smaller teams (fewer offices per district, and smaller offices) the
incentives resulted in 10% greater job placements than in the control group, with the effect declining for
larger districts such that there was a significant negative impact on productivity for the largest quartile of
districts. This suggests considerable free-riding behavior in larger teams. The authors also found that none
of the quality measures were significant in the treatment group, irrespective of team size, suggesting that
measurement problems were important.
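The free-riding interpretation can be illustrated with a stylized calculation. The sketch below is not the model estimated by Burgess, Propper, et al.; it assumes, purely for illustration, that a team bonus is proportional to total team output and shared equally among N members, and that each member bears a quadratic effort cost. Under those assumptions each member's privately optimal effort is b / (N * c), so effort falls as team size grows. The function name and parameters are hypothetical.

```python
def equilibrium_effort(team_size: int, bonus_per_unit: float = 1.0, cost: float = 1.0) -> float:
    """Privately optimal effort under an equally shared team bonus (stylized assumption).

    Each member i chooses effort e_i to maximize
        (bonus_per_unit / team_size) * sum_j(e_j) - cost * e_i**2 / 2,
    whose first-order condition gives e_i = bonus_per_unit / (team_size * cost).
    """
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    if cost <= 0:
        raise ValueError("cost must be positive")
    return bonus_per_unit / (team_size * cost)


if __name__ == "__main__":
    # District sizes reported above ranged from 250 to 1500 staff.
    for n in (1, 250, 1500):
        print(n, equilibrium_effort(n))
```

This simple setup only reproduces a declining incentive effect in larger teams; it cannot by itself generate the negative impact reported for the largest quartile of districts.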
Teaching
A number of field experiments have evaluated the impact of performance pay for teachers on reducing
absenteeism and improving learning outcomes. The findings are generally mixed. (Duflo,
Hanna, et al. 2010) show that random assignment to monitoring and financial incentives for teachers in
rural India led to a strong reduction of teacher absenteeism and increased students' test scores by
approximately 0.17 standard deviations. Units of observation for the study were single-teacher schools
run by an NGO. The NGO selected 120 schools for testing the monitoring and incentive program, 60 of
which were randomly assigned to the treatment group. In the treatment group, teachers had to use
tamper-proof cameras to document their classroom presence at the start and end of each school day.
Teacher salary was then made a function of the days in attendance, ranging from 50% to 130% of the pay
in the fixed wage control group.
(Kremer and Chen 2001) by contrast show that subjective monitoring arrangements by an individual in
the institutional hierarchy (like the headmaster of a school) may not work in developing country settings
because the monitor might shirk, try to avoid confrontation, or collude with the workers. In Kenya, the
Early Childhood Education Project offered substantial material incentives to teachers (bicycles) with good
attendance as reported by their supervisor. Yet, the study found no effect of this program on absences as
there was considerable cheating. In every school, the headmaster reported sufficient attendance for the
teacher to receive the prize; however, when the research team independently verified absence through
unannounced visits in both treatment and comparison schools, they found that the absence rate was
at exactly the same high level in treatment and in comparison schools. This and the Duflo study
suggest that impersonal, external monitoring by a camera coupled with a clear, credible, and automatic
threat of punishment and promise of reward was the key design feature for program success.
Studies of performance pay linked to student outcomes are similarly mixed. A field experiment in 50
Kenyan schools linking teacher salaries to student test scores failed to find lasting effects (Glewwe, Ilias,
et al. 2010). Teacher attendance did not improve; teachers did not adjust their teaching methods or
conduct more preparation sessions. Students in treated schools did perform better during the program
duration, but these gains did not carry beyond the study period.
A field experiment conducted in NYC public schools also failed to find statistically significant effects of
team incentives for teachers on student outcomes (Fryer 2011). In 2007 New York City launched a pilot
program of financial incentives to teachers in 400 low-performing schools with the goal of improving
student outcomes. There were explicit eligibility criteria for schools to be part of the pilot program; about
half were randomly selected to receive treatment, which consisted of a school bonus of $3,000 per staff
member if certain standards were reached. The study did not find any effect of the financial incentives on
teacher or student behavior. The surprising null finding may reflect a lack of
strong individual incentives, driven by the small size of the average bonus (4% of annual teacher salary),
free-riding of teachers within schools and the overall complexity of the incentive scheme. A related study
that also assesses the effects of the NYC group incentive program on classroom activities and teacher
turnover and qualification, apart from test scores and teacher effort, similarly finds no effects (Goodman
and Turner 2010). A three-year experimental evaluation of the Project on Incentives in Teaching (POINT)
in Metropolitan Nashville schools also found no significant effects of bonus incentives on student test
scores (Springer, Ballou, et al. 2010).
On the other hand a large-scale field experiment in a representative sample of 300 government-run rural
primary schools in India found that bonus pay linked to the mean improvement of student test scores in an
independent learning assessment led to a statistically significant and substantively meaningful
improvement of student outcomes (Muralidharan and Sundararaman 2009). In the treatment group scores
were higher by 0.28 standard deviations in math tests and by 0.16 standard deviations in language tests,
across "conceptual" and "mechanical" parts of the test. They also find positive spill-over effects to
subjects not part of the official student assessment.
Health sector
A number of randomized-controlled trials have been implemented to determine the role of performance
pay on health worker productivity, patient treatment and outcomes. Similar to studies on healthcare
relying on observational data, the majority of studies assess these questions in the context of OECD
health care systems. (Prentice, Burgess, et al. 2007) point out that improving quality rather than quantity
of output is the primary focus for many, if not all, performance pay schemes implemented in the
healthcare sector.
One of the first studies to employ an experimental design was work on performance incentives for nursing
homes (Norton 1992). He finds that nursing homes assigned to the treatment group show better resident
health outcomes and shorter stays.
(Kouides, Bennett, et al. 1998) implement a randomized-controlled trial, offering a randomly selected set
of primary care physicians financial incentives based on influenza immunization rates of the elderly, as
part of a Medicare demonstration project. Doctors in the treatment group were eligible to receive a $0.80
payment per shot for an immunization rate of 70% and $1.60 for each shot, if an immunization rate of
85% was attained. The experiment finds a difference of 7% in the immunization rate between treatment
and control groups.
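The two-tier payment schedule described above can be written out as a short calculation. This is a sketch under stated assumptions rather than the trial's actual payment rule: it assumes that the thresholds are inclusive, that the per-shot payment applies to every shot once a threshold is reached, and that the immunization rate is simply shots given divided by eligible patients. The function name and arguments are hypothetical.

```python
def influenza_bonus(shots_given: int, eligible_patients: int) -> float:
    """Bonus in dollars under the two-tier schedule (illustrative assumptions).

    $0.80 per shot at an immunization rate of at least 70%,
    $1.60 per shot at an immunization rate of at least 85%,
    and no bonus below 70%.
    """
    if eligible_patients <= 0:
        raise ValueError("eligible_patients must be positive")
    if not 0 <= shots_given <= eligible_patients:
        raise ValueError("shots_given must be between 0 and eligible_patients")

    rate = shots_given / eligible_patients
    if rate >= 0.85:
        per_shot = 1.60
    elif rate >= 0.70:
        per_shot = 0.80
    else:
        per_shot = 0.0
    # Round to cents to avoid floating-point noise in the reported amount.
    return round(per_shot * shots_given, 2)


if __name__ == "__main__":
    for shots in (69, 70, 85):
        print(shots, influenza_bonus(shots, 100))
```

Under these assumptions, a physician with 100 eligible patients would receive nothing for 69 shots, $56 for 70 shots and $136 for 85 shots, which illustrates how small the payments were relative to the discontinuities at the thresholds.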
On the other hand, (Hillman, Ripley, et al. 1998; Hillman, Ripley, et al. 1999) use two RCT designs to
incentivize cancer screenings for women of age 50 and above and pediatric immunizations, respectively.
In both studies the authors document no significant difference between treatment and control groups.
Similarly, an RCT implemented by (Grady, Lemkau, et al. 1997) finds no clear effects of financial
incentives on mammography referrals by primary care physicians.
In contrast, a set of studies (Fairbrother, Hanson, et al. 1999; Fairbrother, Siegel, et al. 2001), also
focusing on pediatric immunizations, finds that performance incentives increased immunization rates by
several percentage points compared to the control group. A randomized field trial at the clinic-level found
that financial incentives improved treatment of smoking cessation outcomes (Roski et al. 2003). Work on
performance pay for cognitive services interventions by pharmacists also finds positive effects
(Christensen et al. 2000).
To our knowledge, the only two available randomized-controlled trials on performance pay in health care
in a low income country are a study by Basinga et al. (2010) in Rwanda and a study by Singh (2010) in
India. Basinga et al. use an RCT design to evaluate performance pay in Rwandan primary health care
centers. The authors took advantage of a sequenced roll-out of the scheme across Rwandan health care
facilities, collecting data on child preventive care and prenatal delivery. To isolate the performance-pay
effect from a general increase in resources, comparison facilities received an equivalent increase in their
budgets. The study uses information from 166 facilities and 2158 households. They find large effects on
all central outcome measures, but with particularly striking effects for services with the highest payoffs
and smallest necessary staff effort.
(Singh 2010) treated three groups of mothers and staff providing child care and nutritional advice to them
in Chandigarh, India: in one group the workers received performance pay; in a second group the
workers had no performance pay but the women that they worked with were separately given factual
information about nutrition; and the third group received both treatments. The study found that children's
weights improved only in the third group compared to the control group.
It is noteworthy that nearly all studies on the health care sector so far focus on fairly narrow types of
performance pay and specific, single outcome measures in preventative care, not necessarily overall
multidimensional patient treatments and outcomes.
Private sector: Craft or coping jobs
The evidence from experimental studies for private sector craft or coping jobs is largely from OECD
countries. In a field experiment from the private sector (Bandiera, Barankay, et al. 2006), some managers
were treated with the introduction of a performance-pay system and productivity of lower-tier workers
was used as an outcome measure. The study finds evidence of both an incentive and sorting effect, i.e.
managers support their high-productivity workers and fire the least qualified employees. Evidence from a
quasi-experiment in a private sector company found that the introduction of a new appraisal system that
feeds into performance pay can improve trust in top-level management (Mayer and Davis 1999). An
experimental treatment of monitoring efforts by management in a call center found that employees largely
behave according to a rational-cheater model of human behavior, highlighting the importance of
performance measurement, but at the same time a substantial portion of employees remains unaffected by
monitoring attempts (Nagin, Rebitzer, et al. 2002).
5. Assessing the evidence
To assess the overall evidence, the 152 studies that were reviewed in this paper (see the Appendix for a
list) were grouped into three categories: positive if their findings provide positive evidence for the
effectiveness of incentive schemes (see footnote 8); neutral if the study is largely descriptive or finds contradictory
evidence; and failed if the evidence indicates no effect or negative effect of performance pay. Figure 1
shows the overall frequency of results. A majority of studies (93 out of the 152) presents supportive
evidence for some form of effect of performance pay schemes, with experimental studies showing more
positive findings than observational ones.
In drawing lessons however, it is important to distinguish the findings more systematically by the
research quality of the concerned study. Study quality was ranked in two different ways. First, each study
was assessed for its 'internal validity', or strength of the causal arguments being made, using a five-point
ranking (from weak to strong) as follows:
1. no empirical study or faulty research design
2. descriptive; small sample size
3. secondary data analysis and/or descriptive data analysis; small sample size; some statistical
analysis
4. quasi-experimental design; reasonable sample size; conclusions based on statistical analysis
5. laboratory experiments; randomized controlled trial; large sample size; strong statistical analysis;
strong conclusions
Footnote 8: Inevitably, there is some subjectivity in the classification of studies. Studies were rated as positive if there was
general evidence on the basic functionality of incentive schemes, even if additional results qualify the effect,
e.g. studies on crowding-out of intrinsic motivation generally still find positive effects of explicit incentives.
Second, studies were also evaluated on the dimension of "external validity", or to what extent the causal
connections drawn in the specific context of the study would remain valid if replicated in other contexts.
So for example, lab experiments and RCTs offer very strong evidence about causality (high internal
validity), but in a specific context — they tell us the average impact of a particular intervention in a
particular location with a particular sample at a particular point in time. They are often accused of being
low on external validity as the study subjects (usually college students in the case of laboratory
experiments) are not representative of the general population, or in this case the population of interest
(civil servants) and the requirements of the experiment imply very particular conditions that may not
approximate real world settings.
Figure 1: Aggregate findings on performance-related pay
[Figure not reproduced. Panels: aggregate findings (number of studies rated failed, neutral or positive); findings by study type (observational, field experiment, lab experiment).]
Figure 2 (left panel) shows the overall results by the measure of internal validity. The majority of high
quality studies (score of 4 and above) show a positive finding, while a majority of the low quality studies
(score of 1 or 2) show a negative or neutral finding. When assessing the findings according to the external
validity of studies (Figure 2, right panel), again more positive results are found in more externally valid
studies.
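The quality bands used in this assessment can be stated compactly. The sketch below restates the five-point internal validity ranking and the cut-offs mentioned in the text (4 and above as high quality, 1 or 2 as low quality); the label "middle" for a score of 3 and all names in the code are assumptions introduced here for illustration only.

```python
# Five-point internal validity ranking, as listed in the text.
INTERNAL_VALIDITY = {
    1: "no empirical study or faulty research design",
    2: "descriptive; small sample size",
    3: "secondary and/or descriptive data analysis; small sample size; some statistical analysis",
    4: "quasi-experimental design; reasonable sample size; conclusions based on statistical analysis",
    5: "laboratory experiment or randomized controlled trial; large sample size; strong statistical analysis",
}


def quality_band(score: int) -> str:
    """Map an internal validity score to the band used in the discussion.

    'high' = 4 or 5 and 'low' = 1 or 2 follow the text; 'middle' for a
    score of 3 is a label assumed here, not one used in the paper.
    """
    if score not in INTERNAL_VALIDITY:
        raise ValueError("score must be an integer from 1 to 5")
    if score >= 4:
        return "high"
    if score <= 2:
        return "low"
    return "middle"


if __name__ == "__main__":
    for s in sorted(INTERNAL_VALIDITY):
        print(s, quality_band(s), "-", INTERNAL_VALIDITY[s])
```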
Figure 2: Findings by internal and external validity
[Figure not reproduced. Left panel: findings by measures of internal validity (number of failed, neutral and positive studies at each ranking; 1 = lowest quality, 5 = highest quality). Right panel: findings by measures of external validity (number of failed, neutral and positive studies for low and high external validity).]
Parsing the evidence by job type, a majority of studies of craft and production jobs show positive results,
while a majority of studies of coping jobs find negative or neutral results (Figure 3, left panel). These
limitations of PRP in the demanding contextual environments of coping jobs are in line with the
theoretical arguments, though the total number of studies of coping jobs (16 in this review) is too small
to allow generalizations with any degree of confidence. Analyzing studies of relevance to the public
sector (i.e. craft or coping jobs only, or 110 studies in the review) by country context reveals that,
counter-intuitively, the weight of the evidence is somewhat stronger for developing countries (Figure 3,
right panel). However, it must be emphasized that the number of studies for developing countries is low.
Figure 3: Findings by job type
[Figure not reproduced. Left panel: findings by job type (number of failed, neutral and positive studies for production, craft and coping jobs). Right panel: relevant studies (craft or coping job) by country context (OECD, developing country).]
When the quality of study (internal validity) is considered separately for studies of craft and coping jobs,
again more rigorous research shows on average more positive findings.
Figure 4 illustrates the breakdown of results by study quality for relevant studies. The left panel shows
that a majority of high quality studies (ranked 4 and 5 in the measure of internal validity) of craft and
coping jobs show positive findings (46 out of 68) while a majority of low quality studies show negative or
neutral results. However, the dearth of evidence on coping jobs is even more apparent for high quality
studies as the bulk of the literature has focused on craft jobs, in particular teaching and healthcare (right
panel). The findings for craft jobs are generally more positive in these high quality studies in developing
country contexts than in OECD contexts, although the number of such studies is very small and largely
limited to health and education (Table 3).
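The headline counts can be converted into shares with simple arithmetic. The sketch below uses only counts stated in the paper: 93 of 152 reviewed studies, 65 of 110 studies of craft or coping jobs (the latter as reported in the abstract), and 46 of 68 high quality studies of craft or coping jobs. The remainder for lower quality relevant studies is derived here by subtraction and is not a figure reported in the paper; it also assumes the total of 110 rather than the 109 shown in the Figure 4 panel title.

```python
# (positive, total) pairs taken from counts stated in the paper.
COUNTS = {
    "all reviewed studies": (93, 152),
    "craft or coping studies": (65, 110),
    "high quality craft or coping studies": (46, 68),
}


def positive_share(positive: int, total: int) -> float:
    """Share of studies with a positive finding."""
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("counts must satisfy 0 <= positive <= total and total > 0")
    return positive / total


SHARES = {name: positive_share(p, t) for name, (p, t) in COUNTS.items()}

# Derived by subtraction (an assumption of this sketch, not reported in the paper):
# relevant studies that are not rated 4 or 5 on internal validity.
LOWER_QUALITY = (
    COUNTS["craft or coping studies"][0] - COUNTS["high quality craft or coping studies"][0],
    COUNTS["craft or coping studies"][1] - COUNTS["high quality craft or coping studies"][1],
)

if __name__ == "__main__":
    for name, share in SHARES.items():
        print(f"{name}: {share:.1%}")
    print(f"lower quality craft or coping studies (derived): {positive_share(*LOWER_QUALITY):.1%}")
```

On these counts, roughly two thirds of the high quality relevant studies are positive, against fewer than half of the remaining relevant studies, which is consistent with the pattern described in the text.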
6. Summary
Overall the body of evidence paints a supportive picture of performance pay in craft jobs within the
public sector, but less so for coping jobs. So, as incentive theory would predict, PRP seemingly has a
greater role to play in jobs where the outputs are more readily observable, such as teaching and health
care jobs, than it does in more general administration. That these are jobs where the day to day actions of
staff are unobservable does not seem to be an obstacle — apparently confounding, at least in the short
term, the behavioral economics concern about crowding out intrinsic incentives. It is also in relation to
craft jobs that there have been more observational and experimental studies in developing countries —
and generally the evidence from those settings is more positive than in OECD settings.
Among the observational studies, work analyzing the introduction of performance pay in the private
sector finds nearly uniform support for the effectiveness of explicit incentive schemes. Observational
studies of PRP for craft jobs in the public sector highlight that measurability of effort and output is more
difficult in that environment and present slightly more mixed findings, but it seems that the better output
measurement and performance assessment are, the better performance pay works. The study of Brazilian
tax collectors is a good example of public sector work in which performance pay can be successfully
implemented. Equally, Lavy's study of Israeli teacher competition (Lavy 2009) shows the ability of a
well-designed performance scheme to function properly. At the same time, several observational studies
identify problems with unintended consequences, generically subsumed under "gaming" the incentive
scheme, that can run counter to the original intentions of the reforms. With the current state of evidence,
though, it remains unclear whether incidents of gaming have a net negative effect in the presence of
increased productivity. Furthermore, while explicit incentive schemes certainly increase the opportunity
for gaming, standard civil service arrangements have their own unintended incentive effects, i.e.
employees will engage in behavior that increases the chances of easy work assignments or promotions,
and it is simply unknown whether existing forms of gaming are worse than similar behavior under
performance pay.
Figure 4: Findings for craft and coping tasks by research quality and country context
[Figure not reproduced. Left panel: relevant studies by quality (total 109), showing the number of failed, neutral and positive studies at each internal validity ranking (1 = lowest quality; 5 = highest quality). Right panel: findings for high quality (4 and 5) relevant studies only, for craft jobs and coping jobs.]
Table 3: Findings of high quality craft and coping studies by sector and country context
                         Craft jobs                            Coping jobs
                         Education   Health   Tax   Other      Public   Private
OECD                     19          27       2     7          1        1
  Positive               12          16       1     6          1        1
  Negative or neutral    7           11       1     1          0        0
Developing country       6           4        1     0          0        0
  Positive               4           4        1     -          -        -
  Negative or neutral    2           0        0     -          -        -
Total                    25          31       3     7          1        1
Moving to studies that attempt to fulfill the gold standard of experimental design, the evidence overall
again speaks in favor of the potential utility of performance pay for craft jobs. Comparing various
laboratory experiments, the results suggest that indeed explicit performance incentives can work, but the
studies employ easily measurable performance indicators and use fairly unrepresentative subject pools.
Both concerns should caution policy makers against accepting the results independently of other research.
On the other hand, similar results have been found across a varied set of experimental settings, test
locations and subject pools and overall findings do resonate with the observational literature, improving
overall credibility and external validity.
The strongest form of evidence comes from field experimental studies for craft jobs that neatly address
concerns of internal and external validity. Here, evidence is somewhat more mixed. Several studies of
teacher incentive programs found no or transient effects of bonus pay systems in the context of US
schools but in the developing world, evidence has been more positive. The discrepancy between teacher
incentives in the developed and developing world could on the one hand stem from the relative magnitude
of incentives compared to normal salary or on the other hand come from higher marginal effects in the
education production function in developing countries. Many factors enter the production of education, all
of which are likely lacking in many developing country schools. Improving one input aspect, e.g. teacher
presence and effort, could have conceivably larger marginal effects than the same input improvement in a
developed country school.
However, three limitations of these research findings loom large. First, the number of high quality studies
overall is limited, and the number for developing countries is particularly low. This is especially the case
for coping jobs reflective of the work of the core public administration, for which there is at this stage
simply not enough evidence on whether or not PRP can work. This small pool of research studies makes
it hard to disagree with the observation that: "much of the early evidence is not very robust.
Consequently, a meta-analysis in which it is concluded that the 'bulk' of the evidence shows this or that
cannot be provided as – despite the strong positions taken by the proponents or opposers of attempts to
incentivise the public sector – there is no bulk" (Prentice, Burgess, et al. 2007, pp.12-13).
Second, the studies inevitably tend to look at the impact of PRP in the short term. Is it possible that the
positive results are the result of Hawthorne effects or new incentives which degrade over time as attitudes
towards work change? It is generally accepted that "(p)erformance related pay requires a long list of
supportive local conditions before it stands a good chance of working as intended" (Pollitt and Dan
2011b, p.46) and it seems possible that the supportiveness of those conditions may only emerge over the
longer term. In particular, to fully explore the persistent concern about crowding-out of intrinsic
motivation, long term studies are necessary to assess how the design of incentive schemes and the way they
are communicated to employees can minimize the risk of employees changing attitudes toward their task.
Third, existing research exclusively focuses on the mechanical aspects of performance pay reforms, i.e.
how to measure performance, how to design the incentive scheme and how large the bonus should be. No
study explicitly considers contextual factors that can affect the operation of even the most well-designed
incentive schemes. Most glaringly, the role of politicized bureaucracies has not been addressed properly.
We need a better understanding of how politics can subvert flexible pay reforms and create important
unintended side-effects, and it is worth investigating how performance pay can potentially be used as a
tool for de-politicizing the civil service and increasing transparency and meritocracy. Interestingly, the move
away from individualized pay in various areas of the civil service, to uniform salary scales, was partly
driven by the prevalence of patronage politics and clientelistic linkages that permeated the civil service in
19th century Western countries. Research could draw on this rich historical data and contemporary
experience of civil service reform in developing countries to inform our understanding of performance
pay. Apart from the influence of political parties or influential patrons, the role of civil service unions
needs more attention. While most unions oppose performance pay in principle as a threat to union power,
a better understanding of the role of unions in reform initiation and implementation could vastly improve
future reform attempts. Countries vary dramatically in their wage bargaining structures and the strength of
unions; introducing performance pay in one scenario is likely to have vastly different effects than in
others.
Further studies will need to broaden the coverage of developing countries and review the evidence over a
longer time scale. If they continue to show a broadly positive association between PRP and performance
in craft jobs, they will also need to shed light on a number of important design questions concerning
individual versus group rewards, size of bonus, and types of performance measures used. They will also
need to unpack the political process underlying reform initiation, implementation and sustained support
for reforms — are the ideological or other pressures for PRP at odds with the long term fine-tuning which
is undoubtedly necessary to maintain any positive results?
Pending that further research, the guidance for practitioners seems to be that PRP for craft jobs is a
feasible possibility for performance improvement and has some evidence behind it, but that careful
design and piloting, accompanied by a technical willingness and a political ability to change direction, are
key. For coping jobs, however, there is at this stage simply not enough robust evidence to draw any
conclusions either way.
This review has examined almost exclusively the effects of PRP on effort, as that has been the focus of
much of the theoretical and empirical literature. However, there is a small policy literature that also points
to potential agency-level and broader public sector-wide effects that merit further, more rigorous empirical
examination. PRP's contribution to agency-level productivity through the sorting channel (attracting
higher quality workers who are likely to do better under such a scheme) has already been discussed, and there
is some fairly robust evidence of this in developed country contexts (Booth and Frank 1999; Bandiera,
Barankay, et al. 2006; Cadsby, Song, et al. 2007). Individuals sorting into jobs with performance pay are
on average more highly educated, more qualified, less risk-averse and more likely to be male. In a comprehensive laboratory
experiment that features the self-selection of participants into performance schemes and also compares
fixed wage, piece-rate and tournament-based rewards, researchers found clear incentive effects for
variable payment schemes, driven by sorting of individuals. Individual-level characteristics like self-assessment, risk preferences, gender and social preferences systematically predict sorting decisions
(Dohmen and Falk 2007).
In consequence, it is possible that PRP would have a clearer effect on individual incentives if the sorting
effect had longer to work its way through the system. To evaluate this effect, however, future studies need
a longer time frame to assess changes in recruitment and workplace behavior.
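The sorting logic can be illustrated with a minimal sketch. It is not drawn from any of the studies cited: it assumes, purely for illustration, that each worker knows their own output per period, that workers are risk-neutral, and that they simply choose whichever scheme pays more. The function name, the wage parameters and the productivity figures are all hypothetical.

```python
from statistics import mean


def sort_workers(productivities, fixed_wage, piece_rate):
    """Split workers by the pay scheme a risk-neutral worker would choose."""
    piece, fixed = [], []
    for q in productivities:
        # A worker opts into piece-rate pay only if it pays strictly more
        # than the fixed wage at their own (known) level of output.
        if piece_rate * q > fixed_wage:
            piece.append(q)
        else:
            fixed.append(q)
    return piece, fixed


# Hypothetical output per period for eight workers (illustrative only).
workers = [4, 6, 8, 10, 12, 14, 16, 18]
piece, fixed = sort_workers(workers, fixed_wage=10, piece_rate=1.0)

# Average output differs between the two groups even though
# nobody in this sketch has changed their effort.
print(mean(piece), mean(fixed))
```

In this sketch the piece-rate group is more productive on average purely through self-selection, which is why a longer observation window is needed to separate sorting from incentive effects.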
It is important to note that in OECD countries, performance-related pay has been introduced at a time of
increasing "work intensification" through control measures such as the use of ICT to track individual
production and working habits (Green 2001; Beynon, Grimshaw, et al. 2002). These are somewhat hard
drivers of changes in what Marsden (2004) refers to as the "effort bargain": the implicit
understanding between staff and management about "how hard we work around here". PRP can offer an
additional and less coercive point of entry into renegotiating this bargain. Marsden (2004) cites two
examples of changes in working practices that were sought in hospital management and in the tax service,
both in the UK. In both cases, the task of implementing the changes fell to managers who were under
pressure from their staff to be lenient with work assignments and generous with pay increases. In both
cases, he argues, individual incentives were only a modest part of the function of performance-related
pay. Its real contribution was "to enable management to redefine the established performance norms in
their organization, and then to obtain effective compliance with those norms, with the explicit or tacit
agreement of as many employees as possible" (Marsden 2004, p.351). In sum, by incrementally ratcheting
up the performance expectations through the many thousands of performance appraisal discussions, the
informal agency working culture was changed.
PRP can also be considered as an element in overall cost containment, as it provides the employer with the
additional tactical option of proposing that pay increases should only be provided in the form of (less
costly) enhanced performance bonuses (Marsden and French 1998). French (2005) identifies the
significant additional control that performance-related pay gave managers within the UK Inland Revenue
in the early 1990s, allowing them to cease automatic cost-of-living allowances, to restrict additional
pay increases largely to the performance component of pay, and to combine the latter with a de facto forced curve for
allocating performance-based rewards. French (2005) also highlights how individual negotiations within
a performance-related pay scheme in the UK allowed managers to sever the previous linkages between
the level of complexity of the tasks required of staff and their grade. Under the performance-related pay
scheme, targets were negotiated individually and not by grade.
Finally, at least in OECD countries, PRP has usually been accompanied by other changes in pay policy,
of which the two most noteworthy are differentiation (creating pay differences within and across
government agencies for civil servants doing the same job, based on the need to attract and retain
qualified staff for these jobs) and delegation of pay-setting authority away from a central civil service
agency to ministries, agencies, or departments. The linkages between PRP and pay differentiation and
delegation remain underexplored. For example, it is possible that delegation leads to PRP,
since managers have an incentive to look for mechanisms that maximize the results that can be achieved at
minimum cost.
Appendix A: List of empirical PRP studies reviewed
No. | Authors | Year | Type | Quality | Effect | Country context | Type of job
1 | Aboody et al. | 2010 | observational | 4 | positive | OECD | Coping
2 | Amundson et al. | 2003 | observational | 3 | positive | OECD | Craft
3 | Ariely et al. | 2008 | lab experiment | 5 | positive | Developing | Production
4 | Asch | 1990 | observational | 4 | positive | OECD | Craft
5 | Atkinson et al. | 2004 | observational | 4 | positive | OECD | Craft
6 | Ballou | 2001 | observational | 3 | neutral | OECD | Craft
7 | Bandiera, Barankay and Rasul | 2005 | observational | 4 | positive | OECD | Production
8 | Bandiera, Barankay and Rasul | 2006 | field experiment | 5 | positive | OECD | Production
9 | Basinga et al. | 2010 | field experiment | 5 | positive | Developing | Craft
10 | Beaulieu and Horrigan | 2005 | observational | 4 | positive | OECD | Craft
11 | Beer and Cannon | 2004 | observational | 2 | failed | OECD | Production
12 | Beer and Katz | 2003 | observational | 3 | positive | OECD | Production
13 | Belfield and Marsden | 2003 | observational | 4 | positive | OECD | Production
14 | Bender and Elliott | 2003 | observational | 4 | positive | OECD | Unclassified
15 | Bertelli | 2006 | observational | 4 | neutral | OECD | Craft
16 | Blasi et al. | 2008 | observational | 4 | positive | OECD | Craft
17 | Booth and Frank | 1999 | observational | 4 | positive | OECD | Production
18 | Brudney and Condrey | 1993 | observational | 3 | positive | OECD | Coping
19 | Burgess et al. | 2010 | field experiment | 5 | positive | OECD | Craft
20 | Burgess et al. | 2004 | observational | 4 | positive | OECD | Craft
21 | Burks, Carpenter and Goette | 2009 | field experiment | 5 | positive | OECD | Production
22 | Cadsby, Song and Tapon | 2007 | lab experiment | 5 | positive | OECD | Production
23 | Camerer et al. | 1997 | observational | 4 | neutral | OECD | Production
24 | Campbell et al. | 2007 | observational | 4 | positive | OECD | Craft
25 | Campbell et al. | 2005 | observational | 3 | positive | OECD | Craft
26 | Cardona | 2007 | observational | 1 | failed | OECD | Unclassified
27 | Casalino et al. | 2003 | observational | 4 | positive | OECD | Craft
28 | Chalkley et al. | 2010 | observational | 4 | positive | OECD | Craft
29 | Christensen et al. | 2000 | field experiment | 5 | positive | OECD | Craft
30 | Chung et al. | 2007 | observational | 3 | failed | OECD | Craft
31 | Clark et al. | 1995 | observational | 4 | neutral | OECD | Craft
32 | Clotfelter et al. | 2004 | observational | 4 | failed | OECD | Craft
33 | Clotfelter et al. | 2008 | observational | 4 | positive | OECD | Craft
34 | Clotfelter, Ladd and Vigdor | 2007 | observational | 4 | positive | OECD | Craft
35 | Cohn | 1996 | observational | 2 | neutral | OECD | Craft
36 | Coleman et al. | 2007 | observational | 4 | positive | OECD | Craft
37 | Condrey and Brudney | 1992 | observational | 3 | neutral | OECD | Coping
38 | Cooper and Cohn | 1997 | observational | 4 | positive | OECD | Craft
39 | Courty and Marschke | 2003 | observational | 4 | positive | OECD | Craft
40 | Daley | 1993 | observational | 3 | positive | OECD | Coping
41 | Dee and Keys | 2004 | observational | 4 | positive | OECD | Craft
42 | Dohmen and Falk | 2007 | lab experiment | 5 | positive | OECD | Production
43 | Doran et al. | 2006 | observational | 1 | positive | OECD | Craft
44 | Dowling and Richardson | 1997 | observational | 4 | positive | OECD | Coping
45 | Duflo, Hanna and Ryan | 2007 | field experiment | 5 | positive | Developing | Craft
46 | Eberts, Hollenbeck and Stone | 2002 | observational | 4 | failed | OECD | Craft
47 | Eichler and Levine | 2009 | observational | 4 | positive | Developing | Craft
48 | Eichler, Auxila and Pollock | 2001 | observational | 2 | positive | Developing | Craft
49 | Eriksson and Villeval | 2004 | lab experiment | 5 | positive | OECD | Production
50 | Fairbrother et al. | 1999 | field experiment | 5 | positive | OECD | Craft
51 | Fairbrother et al. | 2001 | field experiment | 5 | positive | OECD | Craft
52 | Fehr and Goette | 2007 | field experiment | 5 | positive | OECD | Production
53 | Fehr and Schmidt | 2004 | lab experiment | 5 | positive | OECD | Production
54 | Felt-Lisk et al. | 2007 | observational | 4 | neutral | OECD | Craft
55 | Figlio and Kenny | 2007 | observational | 4 | positive | OECD | Craft
56 | Figlio and Winicki | 2005 | observational | 4 | neutral | OECD | Craft
57 | Fryer | 2011 | field experiment | 5 | failed | OECD | Craft
58 | Gavaghan et al. | 2010 | observational | 4 | failed | OECD | Craft
59 | Gerhart and Milkovich | 1990 | observational | 4 | positive | OECD | Production
60 | Glewwe, Ilias and Kremer | 2010 | field experiment | 5 | neutral | Developing | Craft
61 | Glickman et al. | 2007 | observational | 4 | neutral | OECD | Craft
62 | Gneezy and Rustichini | 2000 | lab experiment | 5 | positive | OECD | Production
63 | Goldhaber and Brewer | 2000 | observational | 4 | positive | OECD | Craft
64 | Goodman and Turner | 2010 | field experiment | 5 | failed | OECD | Craft
65 | Grady et al. | 1997 | field experiment | 5 | failed | OECD | Craft
66 | Gratz | 2009 | observational | 2 | neutral | OECD | Craft
67 | Grimshaw | 1998 | observational | 1 | failed | OECD | Unclassified
68 | Hamilton, Nickerson and Owa | 2003 | observational | 4 | positive | OECD | Craft
69 | Heckman, Heinrich and Smith | 1997 | observational | 4 | neutral | OECD | Production
70 | Heneman and Milanowski | 1999 | observational | 3 | neutral | OECD | Craft
71 | Hillman et al. | 1998 | field experiment | 5 | failed | OECD | Craft
72 | Hillman et al. | 1999 | field experiment | 5 | failed | OECD | Craft
73 | Hillman et al. | 1991 | observational | 3 | neutral | OECD | Craft
74 | Hochberg and Lindsey | 2010 | observational | 4 | positive | OECD | Craft
75 | Ingraham | 1993 | observational | 1 | failed | OECD | Unclassified
76 | Jack | 2003 | observational | 2 | positive | Developing | Craft
77 | Jacob | 2005 | observational | 4 | positive | OECD | Craft
78 | Jacob and Levitt | 2003 | observational | 4 | neutral | OECD | Craft
79 | Jain and Narayan | 2011 | lab experiment | 5 | positive | Developing | Craft
80 | Kahn, Silva and Ziliak | 2001 | observational | 4 | positive | Developing | Craft
81 | Kelley | 1999 | observational | 2 | neutral | OECD | Craft
82 | Kellough and Nigro | 2002 | observational | 3 | failed | OECD | Unclassified
83 | Kellough and Selden | 1997 | observational | 3 | neutral | OECD | Unclassified
84 | Kerr | 1975 | observational | 1 | failed | OECD | Unclassified
85 | Ketelaar, Manning and Turkisc | 2006 | observational | 3 | neutral | OECD | Unclassified
86 | Kim | 2002 | observational | 2 | neutral | OECD | Unclassified
87 | Kingdon and Teal | 2008 | observational | 4 | positive | Developing | Craft
88 | Kiragu and Mukandala | 2003 | observational | 2 | neutral | Developing | Unclassified
89 | Knez and Simester | 2001 | observational | 4 | positive | OECD | Production
90 | Koretz | 2002 | observational | 1 | failed | OECD | Craft
91 | Kremer and Chen | 2001 | field experiment | 5 | failed | Developing | Craft
92 | Kouides et al. | 1998 | field experiment | 5 | positive | OECD | Craft
93 | Ladd | 1999 | observational | 4 | positive | OECD | Craft
94 | Lavy | 2008 | observational | 4 | positive | OECD | Craft
95 | Lavy | 2009 | observational | 4 | positive | OECD | Craft
96 | Lazear | 2000 | observational | 4 | positive | OECD | Production
97 | Lazear | 2003 | observational | 1 | positive | OECD | Production
98 | Le Grand | 2007 | observational | 2 | positive | OECD | Unclassified
99 | Leford, Lawler and Mohrman | 1995 | observational | 1 | neutral | OECD | Production
100 | Levin-Scherz et al. | 2006 | observational | 4 | positive | OECD | Craft
101 | Li et al. | 2011 | observational | 4 | neutral | OECD | Craft
102 | Lindenauer et al. | 2007 | observational | 4 | positive | OECD | Craft
103 | Mandel and Kortagal | 2007 | observational | 3 | positive | OECD | Craft
104 | Marsden | 2004 | observational | 3 | neutral | OECD | Unclassified
105 | Marsden | 2009 | observational | 1 | neutral | OECD | Unclassified
106 | Marsden and French | 1998 | observational | 3 | neutral | OECD | Unclassified
107 | Mayer and Davis | 1999 | field experiment | 5 | positive | OECD | Production
108 | McMenamin et al. | 2003 | observational | 4 | positive | OECD | Craft
109 | McNamara | 2005 | observational | 2 | neutral | Developing | Craft
110 | Meessen et al. | 2006 | observational | 4 | positive | Developing | Craft
111 | Meessen, Kashala and Musang | 2008 | observational | 3 | positive | Developing | Craft
112 | Milkovich and Wigdor | 1991 | observational | 2 | positive | OECD | Production
113 | Muralidharan and Sundararaman | 2009 | field experiment | 5 | positive | Developing | Craft
114 | Murnane and Cohen | 1986 | observational | 1 | failed | OECD | Craft
115 | Nagin et al. | 2002 | field experiment | 5 | neutral | OECD | Production
116 | Norton | 1992 | field experiment | 5 | positive | OECD | Craft
117 | Odden and Kelley | 2002 | observational | 2 | positive | OECD | Craft
118 | OECD | 1996 | observational | 2 | neutral | OECD | Unclassified
119 | OECD | 1997 | observational | 2 | neutral | OECD | Unclassified
120 | OECD | 2005 | observational | 2 | neutral | OECD | Unclassified
121 | OECD | 2008 | observational | 2 | neutral | OECD | Unclassified
122 | Paarsch and Shearer | 1999 | observational | 4 | positive | OECD | Unclassified
123 | Pearson et al. | 2008 | observational | 4 | failed | OECD | Craft
124 | Pires | 2007 | observational | 2 | failed | Developing | Unclassified
125 | Pourat et al. | 2005 | observational | 4 | positive | OECD | Craft
126 | Reid | 1992 | observational | 2 | positive | Developing | Unclassified
127 | Rexed et al. | 2007 | observational | 2 | neutral | OECD | Unclassified
128 | Rogers and Vegas | 2009 | observational | 1 | positive | Developing | Craft
129 | Rosenthal et al. | 2005 | observational | 4 | neutral | OECD | Craft
130 | Roski et al. | 2003 | field experiment | 5 | positive | OECD | Craft
131 | Safran et al. | 2000 | observational | 4 | positive | OECD | Craft
132 | Schick | 1998 | observational | 1 | failed | OECD | Unclassified
133 | Shaw, Gupta and Delery | 2002 | observational | 4 | positive | OECD | Production
134 | Shearer | 2004 | field experiment | 5 | positive | OECD | Production
135 | Shen | 2003 | observational | 4 | failed | OECD | Craft
136 | Singh | 2010 | field experiment | 5 | positive | Developing | Craft
137 | Soeters and Griffiths | 2003 | observational | 2 | positive | Developing | Craft
138 | Soeters et al. | 2006 | observational | 3 | positive | Developing | Craft
139 | Springer et al. | 2010 | field experiment | 5 | failed | OECD | Craft
140 | Stajkovic and Luthans | 2001 | field experiment | 5 | positive | OECD | Production
141 | Stazyk | 2010 | observational | 4 | positive | OECD | Unclassified
142 | Steel et al. | 2007 | observational | 3 | positive | OECD | Craft
143 | Straberg | 2010 | observational | 3 | neutral | OECD | Craft
144 | Streib and Nigro | 1993 | observational | 3 | neutral | OECD | Unclassified
145 | Vaghela et al. | 2009 | observational | 3 | positive | OECD | Craft
146 | van Dijk, Sonnemans and van W | 2001 | lab experiment | 5 | positive | OECD | Production
147 | Vegas | 2005 | observational | 3 | positive | Developing | Craft
148 | Vegas and Umansky | 2005 | observational | 3 | positive | Developing | Craft
149 | Willems, Janvier and Henderic | 2006 | observational | 2 | neutral | OECD | Unclassified
150 | Witter et al. | 2011 | observational | 3 | neutral | Developing | Craft
151 | Woessmann | 2010 | observational | 4 | positive | OECD | Craft
152 | World Bank | 2001 | observational | 3 | positive | Developing | Craft
153 | Young et al. | 2007 | observational | 4 | positive | OECD | Craft
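One way to reproduce the kind of disaggregation reported in this review is to cross-tabulate the rows of Appendix A by job type and effect, with and without a quality filter. The sketch below is illustrative only: it hard-codes just the first six rows of the table, and the cut-off of 4 for a "high quality" study is an assumption made for this example, not a definition stated in this appendix. The function and variable names are hypothetical.

```python
from collections import Counter

# First six rows of Appendix A: (authors, year, quality, effect, job type).
rows = [
    ("Aboody et al.", 2010, 4, "positive", "Coping"),
    ("Amundson et al.", 2003, 3, "positive", "Craft"),
    ("Ariely et al.", 2008, 5, "positive", "Production"),
    ("Asch", 1990, 4, "positive", "Craft"),
    ("Atkinson et al.", 2004, 4, "positive", "Craft"),
    ("Ballou", 2001, 3, "neutral", "Craft"),
]

HIGH_QUALITY = 4  # assumed cut-off, used for this example only


def tally(rows, min_quality=1):
    """Count studies by (job type, effect) at or above a quality score."""
    return Counter(
        (job, effect)
        for _authors, _year, quality, effect, job in rows
        if quality >= min_quality
    )


all_counts = tally(rows)
high_counts = tally(rows, HIGH_QUALITY)

# Craft-job studies with a positive effect: all studies vs. high quality only.
print(all_counts[("Craft", "positive")], high_counts[("Craft", "positive")])
```

Extending `rows` to the full table would give counts by job type, effect and quality for the whole sample; the six rows used here are too few to support any conclusion.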
Appendix B: List of High Quality Studies of Craft and Coping Jobs
EDUCATION

Country Context: OECD
Type of study: Observational
Findings
Positive
1. Atkinson et al. (2004)
2. Clotfelter et al. (2008)
3. Clotfelter, Ladd, & Vigdor (2007)
4. Cooper & Cohn (1997)
5. Dee & Keys (2004)
6. Figlio & Kenny (2007)
7. Goldhaber & Brewer (2000)
8. Jacob (2005)
9. Ladd (1999)
10. Lavy (2008)
11. Lavy (2009)
12. Woessmann (2010)
Neutral
1. Figlio & Winicki (2005)
2. Jacob & Levitt (2003)
Negative
1. Clotfelter et al. (2004)
2. Eberts et al. (2002)

Country Context: OECD
Type of study: Experimental
Findings
Positive
No studies
Neutral
No studies
Negative
1. Fryer (2011)
2. Goodman & Turner (2010)
3. Springer et al. (2010)

Country Context: Developing
Type of study: Observational
Findings
Positive
1. Kingdon & Teal (2008)
Neutral
No studies
Negative
No studies

Country Context: Developing
Type of study: Experimental
Findings
Positive
1. Duflo et al. (2007)
2. Muralidharan & Sundararaman (2009)
3. Jain & Narayan (2011)
Neutral
1. Glewwe et al. (2010)
Negative
1. Kremer & Chen (2001)

HEALTH

Country Context: OECD
Type of study: Observational
Findings
Positive
1. Beaulieu & Horrigan (2005)
2. Campbell et al. (2007)
3. Casalino et al. (2003)
4. Chalkley et al. (2010)
5. Coleman et al. (2007)
6. Levin-Scherz et al. (2006)
7. Lindenauer et al. (2007)
8. McMenamin et al. (2003)
9. Pourat et al. (2005)
10. Safran et al. (2000)
11. Young et al. (2007)
Neutral
1. Clark et al. (1995)
2. Felt-Lisk et al. (2007)
3. Glickman et al. (2007)
4. Li et al. (2011)
5. Rosenthal et al. (2005)
Negative
1. Gavaghan et al. (2010)
2. Pearson et al. (2008)
3. Shen (2003)

Country Context: OECD
Type of study: Experimental
Findings
Positive
1. Fairbrother et al. (2001)
2. Fairbrother et al. (1999)
3. Kouides et al. (1998)
4. Norton (1992)
5. Roski et al. (2003)
Neutral
No studies
Negative
1. Grady et al. (1997)
2. Hillman et al. (1998)
3. Hillman et al. (1999)

Country Context: Developing
Type of study: Observational
Findings
Positive
1. Eichler & Levine (2009)
2. Meessen et al. (2006)
Neutral
No studies
Negative
No studies

Country Context: Developing
Type of study: Experimental
Findings
Positive
1. Basinga et al. (2010)
2. Singh (2010)
Neutral
No studies
Negative
No studies

REVENUE ADMINISTRATION

Country Context: OECD
Type of study: Experimental or observational
Findings
Positive
1. Burgess et al. (2010)
Neutral
1. Bertelli (2006)
Negative
No studies

Country Context: Developing
Type of study: Experimental or observational
Findings
Positive
1. Kahn et al. (2001)
Neutral
No studies
Negative
No studies

OTHER (Job placement, recruitment, private sector)

Country Context: OECD
Type of study: Experimental or observational
Findings
Positive
1. Asch (1990)
2. Burgess et al. (2004)
3. Courty & Marschke (2003)
4. Christensen et al. (2000)
5. Blasi et al. (2008)
6. Hochberg & Lindsey (2010)
Neutral
1. Heckman et al. (1997)
Negative
No studies

COPING JOBS (Health managers, private sector)

Country Context: OECD
Type of study: Experimental or observational
Findings
Positive
1. Aboody et al. (2010)
2. Dowling & Richardson (1997)
Neutral
No studies
Negative
No studies
References
Aboody, D., N. Johnson and R. Kasznik (2010), 'Employee Stock Options and Future Firm Performance:
Evidence from Option Repricings', Journal of Accounting and Economics, 50, 74-92.
Ahmad, S. and R. G. Schroeder (2003), 'The Impact of Human Resource Management Practices on
Operational Performance: Recognizing Country and Industry Differences', Journal of Operations
Management (21), 19-43.
Amundson, G., L. I. Solberg, M. Reed, E. M. Martini and R. Carlson (2003), 'Paying for Quality
Improvement: Compliance with Tobacco Cessation Guidelines', Joint Commission Journal on
Quality and Safety, 29 (2), 59-65.
Andrews, M. (2008a). Are One-Best-Way Models of Effective Government Suitable for Developing
Countries? Harvard Kennedy School, Cambridge, Mass.
Andrews, M. (2008b). Good Government Means Different Things in Different Countries. Kennedy School
of Government, Cambridge, MA. http://web.hks.harvard.edu/publications/getFile.aspx?Id=324.
Antwi, J. and D. Phillips (2011). Wages and Health Worker Retention: Evidence from Public Sector
Wage Reforms in Ghana. World Bank, Washington DC.
Ariely, D., U. Gneezy, G. Loewenstein and N. Mazar (2009), 'Large Stakes and Big Mistakes', Review of
Economic Studies (76), 451-469.
Arrowsmith, J. and P. Marginson (2008). Wage Flexibility. European Foundation for the Improvement of
Living and Working Conditions, Dublin.
Asch, B. J. (1990). Navy Recruiter Productivity and the Freeman Plan. RAND Corporation, Santa Monica,
CA.
Atkinson, A., S. Burgess, B. Croxson, P. Gregg, C. Propper, H. Slater and D. Wilson (2004). Evaluating
the Impact of Performance-Related Pay for Teachers in England (Working Paper No.04/113).
Centre for Market and Public Organisation, Bristol UK.
Atkinson, A. B. (1999). Is Rising Inequality Inevitable? A Critique of the Transatlantic Consensus. UNU
World Institute for Development Economics Research, Helsinki.
Atkinson, J. and N. Meager (1986), Changing Patterns of Work: How Companies Achieve Flexibility to
Achieve New Needs, London, National Economic Development Organisation.
Balfour, D. and B. Wechsler (1996), 'Organisational Commitment: Antecedents and Outcomes in Public
Organisations', Public Productivity and Management Review, 29, 256-277.
Bandiera, O., I. Barankay and I. Rasul (2005), 'Social Preferences and the Response to Incentives:
Evidence from Personnel Data', Quarterly Journal of Economics, 120 (3), 917-962.
Bandiera, O., I. Barankay and I. Rasul (2006). Incentives for Managers and Inequality among Workers:
Evidence from a Firm Level Experiment (Discussion Paper No. 2062). Institute for the Study of
Labor, Bonn, Germany.
Barber, L., S. Hayday and S. Bevan (1999). From People to Profits (IES Report 355). Institute for
Employment Studies, London.
Bardhan, P. (2002), 'Decentralization of Governance and Development', Journal of Economic
Perspectives 16 (4), 185-205.
Bardhan, P. and D. Mookherjee (eds) (2006), Decentralization and Local Governance in Developing
Countries. A Comparative Perspective, Cambridge Mass., MIT Press.
Barlevy, G. and D. Neal (2011). Pay for Percentile (Working Paper 17194). National Bureau for
Economic Research, Cambridge, Mass.
Beer, M. and M. D. Cannon (2004), 'Promise and Peril in Implementing Pay-for-Performance', Human
Resource Management, 43 (1), 3-48.
Beer, M. and N. Katz (2003), 'Do Incentives Work? The Perceptions of a Worldwide Sample of Senior
Executives', People and Strategy, 26 (3), 30-44.
Belfield, R. and D. Marsden (2003), 'Performance Pay, Monitoring Environments, and Establishment
Performance', International Journal of Manpower, 24 (4), 452-489.
Benabou, R. and J. Tirole (2003), 'Intrinsic and Extrinsic Motivation', Review of Economic Studies (70),
489-520.
Benabou, R. and J. Tirole (2006), 'Incentives and Prosocial Behavior', American Economic Review, 96
(5), 1652-1678.
Bender, K. A. and R. F. Elliott (1997), 'Decentralization and Pay Reform in Central Government: A Study
of Three Countries', British Journal of Industrial Relations, 35 (3), 447-475.
Bender, K. A. and R. F. Elliott (2003), Decentralised Pay Setting. A Study of the Outcomes of Collective
Bargaining Reform in the Civil Service in Australia, Sweden and the UK, Farnham, UK, Ashgate.
Bertelli, A. M. (2006), 'Motivation Crowding and the Federal Civil Servant: Evidence from the U.S.
Internal Revenue Service', International Public Management Journal, 9 (23).
Besley, T. and M. Ghatak (2004). Competition and Incentives with Motivated Agents (Working Paper).
London School of Economics and Political Science, London.
Bevan, G., K. Sisson and P. Way (1981), 'Cash Limits and Public Sector Pay', Public Administration, 59,
379-398.
Beynon, H., D. Grimshaw, J. Rubery and K. Ward (2002), Managing Employment Change: The New
Realities of Work, Oxford, Oxford University Press.
Blasi, J. R., R. B. Freeman, C. Mackin and D. L. Kruse (2008). Creating a Bigger Pie? The Effects of
Employee Ownership, Profit Sharing, and Stock Options on Workplace Performance. NBER,
Cambridge, Mass.
Booth, A. L. and J. Frank (1999), 'Earnings, Productivity, and Performance-Related Pay', Journal of
Labor Economics, 17 (3), 447-463.
Brown, W., S. Deakin, M. Hudson, C. Pratten and P. Ryan (1998). The Individualisation of Employment
Contracts in Britain. Centre for Business Research, University of Cambridge, Cambridge, UK.
Brown, W. and D. Marsden (2010). Individualisation and Growing Diversity of Employment
Relationships. London School of Economics and Political Science, London.
Brudney, J. L. and S. E. Condrey (1993), 'Pay for Performance: Explaining Differences in Managerial
Motivation', Public Productivity & Management Review, 17 (2), 129-144.
Bruns, B., D. Filmer and H. A. Patrinos (2011), Making Schools Work: New Evidence on Accountability
Reforms, Washington DC, World Bank.
Burgess, S. and P. Metcalfe (1999). Incentives in Organisations: A Selective Overview of the Literature
with Application to the Public Sector. University of Bristol, CMPO and CEPR, Bristol.
Burgess, S., C. Propper, M. Ratto, S. Scholder and E. Tominey (2010), 'Smarter Task Assignment or
Greater Effort: What Makes the Difference on Team Performance?', Economic Journal, 120
(547), 968-989.
Burgess, S., C. Propper, M. Ratto and E. Tominey (2011). Incentives in the Public Sector: Evidence from
a Government Agency (Working Paper No. 11/265). Centre for Market and Public Organisation,
Bristol UK.
Burgess, S. and M. Ratto (2003), The Role of Incentives in the Public Sector: Issues and Evidence
(Working Paper), Bristol, UK, Centre for Market and Public Organisation.
C., E. and P. N (2009), 'Performance-Based Payment: Some Reflections on the Discourse, Evidence and
Unanswered Questions', Health Policy and Planning, 1 (7).
Cadsby, C. B., F. Song and F. Tapon (2007), 'Sorting and Incentive Effects of Pay-for-Performance: An
Experimental Investigation', Academy of Management Journal, 50 (2), 387-405.
Camerer, C., L. Babcock, G. Loewenstein and R. Thaler (1997), 'Labor Supply of New York City
Cabdrivers: One Day at a Time', Quarterly Journal of Economics, 112 (2), 407-441.
Campbell, S. M., D. Reeves, E. Kontopantelis, E. Middleton, B. Sibbald and M. Roland (2007), 'Quality
of Primary Care in England with the Introduction of Pay for Performance', New England Journal
of Medicine 357, 181-190.
Campbell, S. M., M. Roland, E. Middleton and D. Reeves (2005), 'Improvements in the Quality of
Clinical Care in English General Practice: Longitudinal Observational Study', British Medical
Journal, 331 (1121-3).
Cardona, F. (2007). Performance-Related Pay in the Public Service in OECD and EU Member States.
OECD SIGMA, Paris.
Casalino, L., R. Gillies, S. Shortell, J. Schmittdiel, T. Bodenheimer, J. Robinson, T. Rundall, N. Oswald,
H. Schauffler and M. Wang (2003), 'External Incentives, Information Technology, and Organized
Processes to Improve Health Care Quality for Patients with Chronic Diseases', Journal of the
American Medical Association, 289 (4), 434-441.
Castaño, R., R. Bitran and U. Giedion (2004). Monitoring and Evaluating Hospital Autonomization and
Its Effects on Priority Health Services. Abt Associates, Bethesda, MD.
Chalkley, M., C. Tilley, L. Young, D. Bonetti and J. Clarkson (2010), 'Incentives for Dentists in Public
Service: Evidence from a Natural Experiment', Journal of Public Administration Research and
Theory, 207-223.
Chaudhury, N., J. Hammer, M. Kremer, K. Muralidharan and F. H. Rogers (2006), 'Missing in Action:
Teacher and Health Worker Absence in Developing Countries', Journal of Economic
Perspectives, 20 (1), 91-116.
Chirkov, V. I., R. M. Ryan, Y. Kim and U. Kaplan (2003), 'Differentiating Autonomy from Individualism
and Independence: A Self-Determination Theory Perspective on Internalization of Cultural
Orientations and Well-Being', Journal of Personality and Social Psychology, 84, 97-110.
Chomitz, K. M., G. Setiadi, A. Azwar, N. Ismail and Widiyarti (1997). What Do Doctors Want?: Two
Empirical Estimates of Indonesian Physicians' Preferences Regarding Service in Rural and
Remote Areas. World Bank, Washington DC.
Christensen, T. and P. Lægreid (eds) (2011), The Ashgate Research Companion to New Public
Management, Farnham, UK, Ashgate.
Chu, K.-y. (ed.) (1991), Public Expenditure Handbook : A Guide to Public Expenditure Policy Issues in
Developing Countries, Washington DC, IMF.
Chung, S., L. P. Palaniappan, L. M. Trujillo, H. R. Rubin and H. S. Luft (2010), 'Effect of Physician-Specific
Pay-for-Performance Incentives in a Large Group Practice', American Journal of
Managed Care, 16 (2), 35-42.
CIPD (2006). Working Life: Employee Attitudes and Engagement. Chartered Institute of Personnel and
Development, London.
Clotfelter, C., R. A. Diaz, H. Ladd and J. Vigdor (2004), 'Do School Accountability Systems Make It
More Difficult for Low-Performing Schools to Attract and Retain High-Quality Teachers?',
Journal of Policy Analysis and Management, 23 (2), 251-271.
Clotfelter, C., E. Glennie, H. Ladd and J. Vigdor (2007). How and Why Do Teacher Credentials Matter
for Student Achievement? (Working Paper 2). National Center for Analysis of Longitudinal Data
in Educational Research, Washington DC.
Clotfelter, C., E. Glennie, H. Ladd and J. Vigdor (2008), 'Would Higher Salaries Keep Teachers in High-Poverty Schools? Evidence from a Policy Intervention in North Carolina', Journal of Public
Economics, 92, 1352-1370.
Cohen, A. (1991), 'Career Stage as a Moderator of the Relationship between Organisational Commitment
and Its Outcomes: A Meta-Analysis', Journal of Occupational Psychology, 64, 253-268.
Cohen, A. (1993), 'Age and Tenure in Relation to Organisational Commitment: A Meta-Analysis', Basic
and Applied Social Psychology, 14, 143-159.
Coleman, K., K. L. Reiter and D. Fulwiler (2007), 'The Impact of Pay-for-Performance on Diabetes Care
in a Large Network of Community Health Centers', Journal of Health Care for the Poor and
Underserved, 18 (4), 966-983.
Common, R. (1998), 'Convergence and Transfer: A Review of the Globalisation of New Public
Management', International Journal of Public Sector Management, 11 (6), 440-448.
Condley, S., R. Clark and H. Stolovitch (2003), 'The Effects of Incentives on Workplace Performance: A
Meta-Analytic Review of Research Studies', Performance Improvement Quarterly, 16 (3), 46-63.
Condrey, S. E. and J. L. Brudney (1992), 'Performance-Based Managerial Pay in the Federal
Government: Does Agency Matter?', Journal of Public Administration Research and Theory, 2 (2), 157-174.
Cooper, S. T. and E. Cohn (1997), 'Estimation of a Frontier Production Function for the South Carolina
Educational Process', Economics of Education Review, 16 (3), 313-327.
Courty, P., C. Heinrich and G. Marschke (2005), 'Setting the Standard in Performance Measurement
Systems', International Public Management Journal, 8 (3), 1-27.
Courty, P. and G. Marschke (2003), 'Dynamics of Performance-Measurement Systems', Oxford Review
of Economic Policy, 19 (2), 268-284.
Courty, P. and G. Marschke (2004), 'An Empirical Investigation of Gaming Responses to Explicit
Performance Incentives', Journal of Labor Economics, 22 (1), 23-56.
Cutler, T. and B. Waine (2005), 'Incentivizing the Poor Relation: 'Performance' and the Pay of Public
Sector 'Senior Managers'', Competition and Change, 9 (1), 75-87.
Beaulieu, N. D. and D. R. Horrigan (2005), 'Putting Smart Money to Work for Quality Improvement', Health
Services Research, 40 (5), 1318-1334.
Dahlstrom, C. and V. Lapuente (2009), 'Explaining Cross-Country Differences in Performance-Related
Pay in the Public Sector', Journal of Public Administration Research and Theory, 20, 577-600.
Dee, T. S. and B. J. Keys (2004), 'Does Merit Pay Reward Good Teachers? Evidence from a Randomized
Experiment', Journal of Policy Analysis and Management 23 (3), 471-488.
Delfgaauw, J. and R. Dur (2008), 'Incentives and Worker's Motivation in the Public Sector', The
Economic Journal, 118, 171-191.
Dell'Aringa, C. and N. Lanfranchi (1999), 'Pay Determination in the Public Service: An International
Comparison', in R. Elliott and C. Lucifora (eds.) Public Sector Pay Determination in the
European Union, Basingstoke, MacMillan Press, pp 29-70.
Dell'Aringa, C., C. Lucifora and F. Origo (2007), 'Public Sector Pay and Regional Competitiveness: A
First Look at Regional Public-Private Wage Differentials in Italy', The Manchester School, 75
(4), 445-478.
Dewatripont, M., I. Jewitt and J. Tirole (1999a), 'The Economics of Career Concerns, Part 1: Comparing
Information Structures', Review of Economic Studies, 66, 183-198.
Dewatripont, M., I. Jewitt and J. Tirole (1999b), 'The Economics of Career Concerns, Part 2: Application
to Missions and Accountability of Government Agencies', Review of Economic Studies, 66, 199-217.
Dickens, W. T. and L. F. Katz (1987). Interindustry Wage Differences and Industry Characteristics
(NBER Working Paper No. W2014). National Bureau of Economic Research, Washington DC.
Dixit, A. (1999), 'Incentives and Organization in the Public Sector. An Interpretative Review', The
Journal of Human Resources, 34 (4), 696-727.
Dixit, A. (2002), 'Incentives and Organizations in the Public Sector: An Interpretative Review', Journal of
Human Resources, 37 (4), 696-727.
Dohmen, T. and A. Falk (2007). Performance Pay and Multi-Dimensional Sorting - Productivity,
Preferences and Gender (Working Paper). Institute for the Study of Labor, Bonn, Germany.
Doran, T., C. Fullwood, H. Gravelle, D. Reeves, E. Kontopantelis, U. Hiroeh and M. Roland (2006),
'Pay-for-Performance Programs in Family Practices in the United Kingdom', New England Journal of
Medicine, 355, 375-384.
Dowling, B. and R. Richardson (1997), 'Evaluating Performance-Related Pay for Managers in the
National Health Service', The International Journal of Human Resource Management, 8 (3), 348-366.
Duflo, E., R. Hanna and S. P. Ryan (2010). Incentives Work: Getting Teachers to Come to School. MIT
(Department of Economics and J-PAL) and the Kennedy School of Government, Cambridge,
Mass.
Eberts, R., K. Hollenbeck and J. Stone (2002), 'Teacher Performance Incentives and Student Outcomes',
Journal of Human Resources, 37 (4), 913-927.
Eichler, R., P. Auxila and J. Pollock (2001), 'Performance-Based Payment to Improve the Impact of
Health Services: Evidence from Haiti', World Bank Institute Online Journal (April 2001).
Eichler, R. and R. Levine (2009). Performance Incentives for Global Health: Potential and Pitfalls. Center
for Global Development, Washington DC.
Eyck, K. V. (2003). Flexibilizing Employment: An Overview. ILO, Geneva.
Fairbrother, G., K. L. Hanson, S. Friedman and G. C. Butts (1999), 'The Impact of Physician Bonuses,
Enhanced Fees, and Feedback on Childhood Immunization Coverage Rates', American Journal of
Public Health, 89 (2), 171-175.
Fairbrother, G., M. J. Siegel, S. Friedman, P. D. Kory and G. C. Butts (2001), 'Impact of Financial
Incentives on Immunization Rates in the Inner City: Results of a Randomized Controlled Trial',
Ambulatory Pediatrics, 1 (4), 206-212.
Farnham, D. and S. Horton (2000), 'The Flexibility Debate', in D. Farnham and S. Horton (eds.) Human
Resources Flexibilities in the Public Services, London, MacMillan Press.
Fehr, E. and K. M. Schmidt (2004), 'Fairness and Incentives in a Multi-Task Principal-Agent Model',
Scandinavian Journal of Economics, 106 (3), 453-474.
Felt-Lisk, S., G. Gimm and S. Peterson (2007), 'Making Pay-for-Performance Work in Medicaid', Health
Affairs, 26 (4), 516-527.
Fields, G. S. and H. J. Wan (1989), 'Wage-Setting Institutions and Economic Growth', World
Development, 17 (9), 1471–1483.
Figlio, D. N. and L. W. Kenny (2007), 'Individual Teacher Incentives and Student Performance', Journal
of Public Economics, 91, 901-914.
Figlio, D. N. and J. Winicki (2005), 'Food for Thought: The Effects of School Accountability Plans on
School Nutrition', Journal of Public Economics, 89, 381-394.
French, S. (2005). Performance-Related Pay in the UK Public Services: Unraveling the Contradictions.
New Developments in Public Sector Pay-setting, Queens University Belfast and the UK Labour
Relations Agency.
Frey, B. S. and M. Osterloh (1999). Pay for Performance - Immer Empfehlenswert? Zeitschrift für
Führung und Organisation, Münster, Germany.
Fryer, R. G. (2011). Teacher Incentives and Student Achievement: Evidence from New York City Public
Schools (NBER Working Paper 16850). National Bureau for Economic Research, Washington
DC.
Fudge, C. (1990), 'Flexibility Reconsidered: Selected Issues', in OECD (ed.) Flexible Personnel
Management in the Public Service, Paris, OECD, pp 91-99.
Gallup (2011). Employee Engagement: What's Your Engagement Ratio? Washington DC, Gallup
Consulting. http://www.gallup.com/consulting/121535/Employee-Engagement-OverviewBrochure.aspx.
Gavagan, T., H. Du, B. Saver, G. Adams, D. Graham, R. McCray and K. Goodrick (2010), 'Effect of
Financial Incentives on Improvement in Medical Quality Indicators for Primary Care', Journal of
the American Board of Family Medicine, 23, 622-631.
Georgellis, Y., E. Iossa and V. Tabvuma (2011), 'Crowding out Intrinsic Motivation in the Public Sector',
Journal of Public Administration Research and Theory, 21 (3), 473-493.
Gerber, A. and N. Malhotra (2008), 'Do Statistical Reporting Standards Affect What Is Published?
Publication Bias in Two Leading Political Science Journals', Quarterly Journal of Political Science, 3,
313-326.
Glewwe, P., N. Ilias and M. Kremer (2010), 'Teacher Incentives', American Economic Journal: Applied
Economics, 2 (3), 205-227.
Glickman, S., F. Ou, E. DeLong, M. Roe, B. Lytle, J. Mulgund, J. Rumsfeld, W. Gibler, E. Ohman, K.
Schulman and E. Peterson (2007), 'Pay for Performance, Quality of Care, and Outcomes in Acute
Myocardial Infarction', Journal of the American Medical Association, 297 (21), 2373-2380.
Gneezy, U. and A. Rustichini (2000), 'Pay Enough or Don't Pay at All', The Quarterly Journal of
Economics, 115 (3), 791-810.
Goldhaber, D. D. and D. J. Brewer (2000), 'Does Teacher Certification Matter? High School Teacher
Certification and Student Achievement', Educational Evaluation and Policy Analysis, 22 (129).
Goodman, S. and L. Turner (2010). Teacher Incentive Pay and Educational Outcomes: Evidence from the
NYC Bonus Program (Working Paper). PEPG Conference "Merit Pay: Will It Work? Is It
Politically Viable?". Harvard Kennedy School, June 3-4, 2010.
Grady, K., J. Lemkau, L. N. and C. Caddell (1997), 'Enhancing Mammography Referral in Primary Care',
Preventive Medicine, 26, 791-800.
Gratz, D. B. (2009), The Peril and Promise of Performance Pay. Making Education Compensation Work,
Lanham, MD, Rowman & Littlefield.
Green, F. (2001), 'It's Been a Hard Day's Night: The Concentration and Intensification of Work in Late
Twentieth-Century Britain', British Journal of Industrial Relations, 39 (1), 53-80.
Grimshaw, D. (1998a). National Systems of Public Sector Pay: Implications for 'Welfare Outcomes' and
Economic Stability. The ESRC Labour Studies Seminars - 27th November 1998: Reinventing the
State, Centre for Comparative Labour Studies, Department of Sociology, University of Warwick,
Economic and Social Research Council,
http://www.csv.warwick.ac.uk/fac/soc/complabstuds/confsem/Grimshaw.htm.
Grimshaw, D. (1998b). National Systems of Public Sector Pay: Implications for "Welfare Outcomes" and
Economic Stability. The ESRC Labour Studies Seminar. London.
Grimshaw, D., K. Jaehrling, M. van der Meer, P. Méhaut and N. Shimron (2007), 'Convergent and
Divergent Country Trends in Coordinated Wage Setting and Collective Bargaining in the Public
Hospitals Sector', Industrial Relations Journal, 38 (6), 591–613.
Groshen, E. L. (1991), 'Sources of Intra-Industry Wage Dispersion: How Much Do Employers Matter?',
Quarterly Journal of Economics, 106 (3), 869-884.
Gruening, G. (2001), 'Origin and Theoretical Basis of New Public Management', International Public
Management Journal, 4, 1-25.
Hakimi, E., N. Manning, S. Prasad and K. Prince (2004), Asymmetric Reforms: Agency-Level Reforms
in the Afghan Civil Service, South Asia Region: PREM Working Paper Series, Washington DC,
World Bank.
Hamilton, B. H., J. A. Nickerson and H. Owan (2003), 'Team Incentives and Worker Heterogeneity: An
Empirical Analysis of the Impact of Teams on Productivity and Participation', Journal of Political
Economy, 111 (3), 465-497.
Hammer, J. S. and N. Chaudhury (2004), 'Ghost Doctors: Absenteeism in Bangladeshi Health Facilities',
World Bank Economic Review, 18, 423-441.
Hanushek, E. A. and S. G. Rivkin (2006), 'Teacher Quality', in E. Hanushek and F. Welch (eds.)
Handbook of the Economics of Education, Amsterdam, North-Holland, Chapter 18.
Heckman, J., C. Heinrich and J. Smith (1997), 'Assessing the Performance of Performance Standards in
Public Bureaucracies', The American Economic Review, 87 (2), 389-395.
Heintzman, R. and B. Marson (2005), 'People, Service and Trust: Is There a Public Sector Service Value
Chain?', International Review of Administrative Sciences, 71 (4), 549-575,
http://ras.sagepub.com/cgi/content/short/71/4/549.
Heneman III, H. G. and A. T. Milanowski (1999), 'Teacher Attitudes About Teacher Bonuses under
School-Based Performance Award Programs', Journal of Personnel Evaluation in Education, 12
(4), 327-341.
Hillman, A., M. Pauly, K. Kerman and C. Martinek (1991), 'HMO Managers' Views on Financial
Incentives and Quality', Health Affairs, 10 (4), 207-219.
Hillman, A., K. Ripley, N. Goldfarb, I. Nuamah, J. Weiner and E. Lusk (1998), 'Physician Financial
Incentives and Feedback: Failure to Increase Cancer Screening in Medicaid Managed Care',
American Journal of Public Health, 88 (11), 1699-1701.
Hillman, A., K. Ripley, N. Goldfarb, J. Weiner, I. Nuamah and E. Lusk (1999), 'The Use of Physician
Financial Incentives and Feedback to Improve Pediatric Preventive Care in Medicaid Managed
Care', Pediatrics, 104, 931-935.
Hochberg, Y. V. and L. Lindsey (2010), 'Incentives, Targeting and Firm Performance: An Analysis of
Non-Executive Stock Options', Review of Financial Studies, 23 (11).
Holmstrom, B. (1982), 'Managerial Incentive Problems: A Dynamic Perspective', Review of Economic
Studies, 1 (169-182).
Holmstrom, B. and P. Milgrom (1991), 'Multitask Principal-Agent Analyses: Incentive Contracts, Asset
Ownership, and Job Design', Journal of Law, Economics & Organization, 7, 24-52.
Hood, C. (2005), 'Public Management: The Word, the Movement, the Science', in E. Ferlie, L. Lynn Jr.
and C. Pollitt (eds.) The Oxford Handbook of Public Management, Oxford, OUP, pp 7-26.
Hood, C. and R. Dixon (2010), 'The Political Payoff from Performance Target Systems: No-Brainer or
No-Gainer?', Journal of Public Administration Research and Theory, 281-298.
Houston, D. J. (2009), 'Motivating Knights or Knaves? Moving Beyond Performance-Related Pay for the
Public Sector', Public Administration Review, 69 (1), 43-56.
Hutton, W. (2010), Fair Pay in the Public Sector: Interim Report, London, H.M. Treasury.
Independent Evaluation Group (2008), Public Sector Reform: What Works and Why?, Washington DC,
World Bank.
Ipsos-MORI (2006). Change Management and Leadership: The Challenges for the Public Sector.
Ipsos-MORI, London.
Jack, W. (2003), 'Contracting for Health Services: An Evaluation of Recent Reforms in Nicaragua',
Health Policy and Planning, 18 (2), 195-204.
Jackson, S. E., R. S. Schuler and S. Werner (2012), Managing Human Resources, Mason, Ohio,
South-Western.
Jacob, B. A. (2005), 'Accountability, Incentives and Behavior: The Impact of High-Stakes Testing in the
Chicago Public Schools', Journal of Public Economics, 89, 761-796.
Jacob, B. A. and S. D. Levitt (2003), 'Rotten Apples: An Investigation of the Prevalence and Predictors of
Teacher Cheating', Quarterly Journal of Economics, 118 (3), 843-877.
Jain, T. and T. Narayan (2011), Incentive to Discriminate? An Experimental Investigation of Teacher
Incentives in India (Working Paper), Indian School of Business.
Jenkins, G. D., A. Mitra, N. Gupta and J. D. Shaw (1998), 'Are Financial Incentives Related to
Performance? A Meta-Analytic Review of Empirical Research', Journal of Applied Psychology,
83 (5), 777-787.
Kahn, C. M., E. C. De Silva and J. P. Ziliak (2001), 'Performance-Based Wages in Tax Collection: The
Brazilian Tax Collection Reform and Its Effects', The Economic Journal, 111, 188-205.
Karlson, N. and H. Lindberg (2011). The Decentralization of Wage Bargaining. The Ratio Institute,
Stockholm.
Kelley, C. (1999), 'The Motivational Impact of School-Based Performance Awards', Journal of Personnel
Evaluation in Education, 12 (4), 309-326.
Kellough, J. E. and H. Lu (1993), 'The Paradox of Merit Pay in the Public Sector: Persistence of a
Problematic Procedure', Review of Public Personnel Administration, 13, 45-64.
Kellough, J. E. and L. G. Nigro (2002), 'Pay for Performance in Georgia State Government: Employee
Perspectives on GeorgiaGain after 5 Years', Review of Public Personnel Administration, 22 (2),
146-166.
Kernaghan, K. (2011), 'Getting Engaged: Public-Service Merit and Motivation Revisited', Canadian
Public Administration, 54 (1), 1-21.
Kerr, S. (1975), 'On the Folly of Rewarding A, While Hoping for B', The Academy of Management
Journal, 18 (4), 769-783.
Ketelaar, A., N. Manning and E. Turkisch (2007). Performance Based Arrangements for Senior Civil
Servants - OECD Experiences (OECD Governance Working Paper). Paris.
Kim, P. S. (2002). Strengthening the Pay-Performance Link in Government: A Case Study of Korea.
Governing for Performance in the Public Sector: OECD-Germany High-Level Symposium.
Berlin.
Kingdon, G. and F. Teal (2008), Teacher Unions, Teacher Pay and Student Performance in India: A Fixed
Effects Approach (CESifo Working Paper No. 2428), Munich, Germany, Ifo Institute, Center for
Economic Studies.
Kiragu, K. and R. Mukandala (2003). Public Sector Pay Reform - Tactics Sequencing and Politics in
Developing Countries: Lessons from Sub-Saharan Africa. PricewaterhouseCoopers and
University of Dar es Salaam, Dar es Salaam, Tanzania.
Knez, M. and D. Simester (2001), 'Firm-Wide Incentives and Mutual Monitoring at Continental Airlines',
Journal of Labor Economics, 19 (4), 743-772.
Knudsen, R. and L. Pedersen (1993). Wage Determination and Sex Segregation in Employment in
Denmark. Manchester School of Management, UMIST, Manchester.
Kouides, R., N. Bennett, B. Lewis, J. Cappuccio, W. Barker and M. LaForce (1998), 'Performance-Based
Physician Reimbursement and Influenza Rates in the Elderly', American Journal of Preventive
Medicine, 14 (2), 89-95.
Kremer, M. and D. Chen (2001). An Interim Report on a Teacher Attendance Incentive Program in Kenya
(Mimeo). Cambridge, Mass., Harvard University.
Kremer, M., K. Muralidharan, N. Chaudhury, J. S. Hammer and F. H. Rogers (2004), 'Teacher Absence
in India: A Snapshot', Journal of the European Economic Association, 3, 2-3.
Kreps, D. M. (1997), 'Intrinsic Motivation and Extrinsic Incentives', The American Economic Review, 87
(2), 359-364.
Krueger, A. B. and L. H. Summers (1988), 'Efficiency Wages and the Inter-Industry Wage Structure',
Econometrica, 56 (2), 259-293.
Kumar, P., G. Murray and S. Schetagne (1999), Workplace Change in Canada: Union Perceptions of
Impacts, Responses and Support Systems, Kingston Ontario, Queens University.
Ladd, H. F. (1999), 'The Dallas School Accountability and Incentive Program: Evaluation of Its Impacts
on Student Outcomes', Economics of Education Review, 18, 1-16.
Lafuente, M. and N. Manning (2010). Executive-Legislative Authority over Public Servants' Pay: Lessons
from Paraguay. World Bank, Washington DC.
Lavy, V. (2008). Gender Differences in Market Competitiveness in a Real Workplace: Evidence from
Performance-Based Pay Tournaments among Teachers (NBER Working Paper No. 14338).
National Bureau for Economic Research, Washington DC.
Lavy, V. (2009), 'Performance Pay and Teachers' Effort, Productivity and Grading Ethics', American
Economic Review, 99 (5), 1979-2011.
Lazear, E. (1989), 'Pay Equality and Industrial Politics', Journal of Political Economy, 97, 561-580.
Lazear, E. P. (1981), 'Agency, Earnings Profiles, Productivity and Hours Restrictions', The American
Economic Review, 71 (5), 606-620.
Lazear, E. P. (2000), 'Performance Pay and Productivity', The American Economic Review, 90 (5),
1346-1361.
Le Grand, J. (2003), Motivation, Agency and Public Policy: Of Knights and Knaves, Pawns and Queens,
New York, Oxford University Press.
Levin-Scherz, J., N. DeVita and J. Timbie (2006), 'Impact of Pay-for-Performance Contracts and Network
Registry on Diabetes and Asthma: HEDIS Measures in an Integrated Delivery Network', Medical
Care Research and Review, 63 (1), 14S-28S.
Li, J., J. Hurley, P. DeCicca and G. Buckley (2011). Physician Response to Pay-for-Performance:
Evidence from a Natural Experiment (Working Paper 16909). National Bureau for Economic
Research, Cambridge, Mass.
Lindauer, D. L. and B. Nunberg (eds) (1994), Rehabilitating Government, Washington DC, World Bank.
Lindenauer, P., D. Remus, S. Roman, M. Rothberg, E. Benjamin, A. Ma and D. Bratzler (2007), 'Public
Reporting and Pay for Performance in Hospital Quality Improvement', New England Journal of
Medicine, 356 (5), 486-496.
Loevinsohn, B. and A. Harding (2005), 'Buying Results? Contracting for Health Service Delivery in
Developing Countries', The Lancet, 366, 676-681.
Luthans, F. (1973), Organizational Behavior, New York, NY, McGraw-Hill.
Maguire, M. (1993), 'Pay Flexibility in the Public Sector -- an Overview', in OECD (ed.) Pay Flexibility
in the Public Sector, Paris, OECD, pp 9-18.
Mandel, K. and U. Kotagal (2007), 'Pay for Performance Alone Cannot Drive Quality', Archives of
Pediatric Adolescent Medicine, 161 (7), 650-655.
Mangham, L. (2007). Addressing the Human Resource Crisis in Malawi's Health Sector: Employment
Preferences of Public Sector Registered Nurses (ESAU Working Paper 18). Overseas Development
Institute, London.
Manning, N. (2001), 'The Legacy of the New Public Management in Developing Countries', International
Review of Administrative Sciences, 67 (2), 297-312.
Manning, N. and N. Parison (2003), International Public Administration Reform: Implications for the
Russian Federation, Moscow, Higher School of Economics, with the World Bank.
Marsden, D. (2004), 'The Role of Performance-Related Pay in Renegotiating the "Effort Bargain": The
Case of the British Public Service', Industrial and Labor Relations Review, 57 (3), 350-370.
Marsden, D. (2009). The Paradox of Performance Related Pay Systems: Why Do We Keep Adopting
Them in the Face of Evidence That They Fail to Motivate? Centre for Economic Performance,
London School of Economics, London.
Mathieu, J. and D. Zajac (1990), 'A Review and Meta-Analysis of the Antecedents, Correlates, and
Consequences of Organisational Commitment', Psychological Bulletin of the American
Psychological Association, 108, 171-194.
Mayer, R. C. and J. H. Davis (1999), 'The Effect of the Performance Appraisal System on Trust for
Management: A Field Quasi-Experiment', Journal of Applied Psychology, 84 (1), 123-136.
McMenamin, S. B., H. H. Schauffler, S. M. Shortell, T. G. Rundall and R. R. Gillies (2003), 'Support for
Smoking Cessation Interventions in Physician Organizations: Results from a National Study',
Medical Care, 41, 1396-1406.
McNamara, P. (2005), 'Quality-Based Payment: Six Case Examples', International Journal for Quality in
Health Care, 17 (4), 357-363.
Meessen, B., J. Kashala and L. Musango (2007), 'Output-Based Payment to Boost Staff Productivity in
Public Health Centres: Contracting in Kabutare District, Rwanda', Bulletin of the World Health
Organization, 85 (2), 108-115.
Meessen, B., L. Musango, J. Kashala and J. Lemlin (2006), 'Reviewing Institutions of Rural Health Centres:
Performance Initiative in Butare, Rwanda', Tropical Medicine and International Health, 11 (8),
1303-1317.
Milkovich, G. and A. Wigdor (1991), Pay for Performance: Evaluating Performance Appraisal and Merit
Pay, Washington, DC, National Academy Press.
Mills, Z., S. Dahal, C. Garrity and N. Manning (2011). Wage Bill and Pay Compression Summary Note.
World Bank, Washington DC.
Moynihan, D. (2008), The Dynamics of Performance Management, Washington, DC, Georgetown
University Press.
Moynihan, D. P. and S. K. Pandey (2007), 'The Role of Organizations in Fostering Public Service
Motivation', Public Administration Review, 67 (1), 40-53.
Muralidharan, K. and V. Sundararaman (2009). Teacher Performance Pay: Experimental Evidence from
India (NBER Working Paper 15323). National Bureau for Economic Research, Washington DC.
Muralidharan, K. and V. Sundararaman (2011), 'Teacher Opinions on Performance Pay: Evidence from
India', Economics of Education Review, 30 (3), 394-403.
Murnane, R. J. and D. K. Cohen (1986), 'Merit Pay and the Evaluation Problem: Why Most Merit Pay
Plans Fail and Few Survive', Harvard Educational Review, 56 (1), 1-17.
Nagin, D. S., J. B. Rebitzer, S. Sanders and L. J. Taylor (2002), 'Monitoring, Motivation, and
Management: The Determinants of Opportunistic Behavior in a Field Experiment', American
Economic Review, 92 (4), 850-873.
Ndetei, D. M., L. Khasakhala and J. O. Omolo (2008). Incentives for Health Worker Retention in Kenya:
An Assessment of Current Practice. Africa Mental Health Foundation and Institute of Policy
Analysis and Research, Dar es Salaam, Kenya.
Neal, D. (2011). The Design of Performance Pay in Education (NBER Working Paper 16710). National
Bureau for Economic Research, Washington DC.
Niemiec, C. P., R. M. Ryan and E. L. Deci (2009), 'The Path Taken: Consequences of Attaining Intrinsic
and Extrinsic Aspirations in Post-College Life', Journal of Research in Personality, 43 (3), 291-306.
Niskanen, W. (1973), Bureaucracy: Servant or Master, London, Institute of Economic Affairs.
Norton, E. (1992), 'Incentive Regulation of Nursing Homes: Specification Tests of the Markov Model', in
D. Wise (ed.) Topics in the Economics of Aging, Chicago, University of Chicago Press, pp 275-304.
Nunberg, B. (1988). Public Sector Pay and Employment Reform (World Bank Working Paper). World
Bank, Washington DC.
Nunberg, B. and J. Nellis (1990), Civil Service Reform and the World Bank (World Bank Working
Paper), Washington DC, World Bank.
Nunberg, B. and R. Taliercio (2012). Making Things Worse: Do Aid Donors Undermine Civil Service
Reforms? (Unpublished Manuscript). Washington DC.
O'Brien, J. and M. O'Donnell (2007), 'From Workplace Bargaining to Workplace Relations: Industrial
Relations in the Australian Public Service under the Coalition Government', in M. J. Pittard and
W. Phillipa (eds.) Public Sector Employment in the Twenty-First Century, Canberra, Australian
National University Press.
Odden, A. and C. Kelley (2002), Paying Teachers for What They Know and Do, Thousand Oaks, CA,
Corwin Press.
OECD (1993). Pay Flexibility in the Public Sector. OECD, Paris.
OECD (1996), Pay Reform in the Public Service: Initial Impact on Pay Dispersion in Australia, Sweden,
and the United Kingdom, Paris, OECD PUMA.
OECD (1997a), Measuring Public Employment in OECD Countries: Sources, Methods and Results,
Paris, OECD.
OECD (1997b), Trends in Public Sector Pay in OECD Countries, Paris, OECD/PUMA.
OECD (2004a). Trends in Human Resources Management Policies in OECD Countries. An Analysis of
the Results of the OECD Survey on Strategic Human Resources, Paper Presented to the Human
Resources Management Working Party. OECD, Paris.
OECD (2004b), 'Wage-Setting Institutions and Outcomes', in J. Martin (ed.) OECD Employment
Outlook, Paris, OECD, pp 127-181.
OECD (2005a), Modernising Government: The Way Forward, Paris, OECD.
OECD (2005b), Performance-Related Pay Policies for Government Employees, Paris, OECD.
OECD (2008), The State of the Public Service, Paris, OECD.
OECD (2009), Government at a Glance, Paris, OECD.
OECD (2011a). 2010 Human Resources Management Composites: Theoretical Framework, Construction
and Weighting. OECD, Paris.
OECD (2011b), Government at a Glance, Paris, OECD.
OECD Working Party of Senior Budget Officials (2011). Restoring Public Finances. Public Governance
and Territorial Development Directorate, OECD, Paris.
Osterloh, M. and J. Frost (2002), 'Motivation and Knowledge as Strategic Resources', in B. S. Frey and
M. Osterloh (eds.) Successful Management by Motivation: Balancing Intrinsic and Extrinsic
Incentives, New York, Springer-Verlag, pp 27-51.
Paarsch, H. J. and B. S. Shearer (1999), 'The Response of Worker Effort to Piece Rates', Journal of
Human Resources, 34 (4), 634.
Painter, M. (2006), 'Sequencing Civil Service Pay Reforms in Vietnam: Transition or Leapfrog',
Governance, 19 (2), 325-346.
Palmer, D. (2006), 'Tackling Malawi's Human Resources Crisis', Reproductive Health Matters, 14 (27),
27-39.
Pearson, S., E. Schneider, K. K., K. Coltin and J. Singer (2008), 'The Impact of Pay-for-Performance on
Health Care Quality in Massachusetts, 2001-2003', Health Affairs, 27 (4), 1167-1176.
Perry, J. L., T. A. Engbers and S. Y. Jun (2009), 'Back to the Future? Performance-Related Pay, Empirical
Research and the Perils of Persistence', Public Administration Review, 69 (1), 39-51.
Perry, J. L. and A. Hondeghem (eds) (2008), Motivation in Public Management: The Call of Public
Service, Oxford, Oxford University Press.
Perry, J. L., D. Mesch and L. Paarlberg (2006), 'Motivating Employees in a New Governance Era: The
Performance Paradigm Revisited', Public Administration Review, 66 (4), 505–514.
Petersen, L. A., L. D., T. U. Woodard, C. Daw and S. Sookanan (2006), 'Does Pay-for-Performance
Improve the Quality of Health Care?', Annals of Internal Medicine, 145 (4), 265-272.
Pfeffer, J. (1998a), The Human Equation: Building Profits by Putting People First, Cambridge, Mass.,
Harvard Business School Press.
Pfeffer, J. (1998b), 'Seven Practices of Successful Organizations', California Management Review, 40 (2),
96–124.
Pink, D. H. (2009), Drive: The Surprising Truth About What Motivates Us, New York, Riverhead.
Podsakoff, P. M. and S. B. Mackenzie (1994), 'Organisational Citizenship Behavior and Sales Unit
Effectiveness', Journal of Marketing Research, 31, 351-363.
Pollitt, C. (1993), Managerialism and the Public Services, Oxford, Blackwell.
Pollitt, C. (1995), 'Justification by Works or by Faith', Evaluation, 1 (2), 133-154.
Pollitt, C. and S. Dan (2011a). The Impacts of the New Public Management in Europe: A Meta-Analysis.
COCOPS, Erasmus University, Rotterdam.
Pollitt, C. and S. Dan (2011b), The Impacts of the New Public Management in Europe: A Meta-Analysis
(COCOPS Working Paper No. 3), Brussels, European Commission.
Porter, L. W. and E. E. Lawler III (1968), Managerial Attitudes and Performance, Homewood, IL, Dorsey
Press.
Pourat, N., T. Rice, M. Tai-Seale, G. Bolan and J. Nihalani (2005), 'Association between Physician
Compensation Methods and Delivery of Guideline-Concordant STD Care: Is There a Link?', The
American Journal of Managed Care, 11, 426-432.
Prendergast, C. (1998). What Happens within Firms? A Survey of Empirical Evidence on Compensation
Policies (NBER Working Paper). National Bureau for Economic Research, Washington DC.
Prendergast, C. (1999), 'The Provision of Incentives in Firms', Journal of Economic Literature, 37 (1),
7-63.
Prentice, G., S. Burgess and C. Propper (2007), 'Performance Pay in the Public Sector: A Review of the
Issues and Evidence'.
Pritchett, L. and M. Woolcock (2004), 'Solutions When the Solution Is the Problem: Arraying the
Disarray in Development', World Development, 32 (2), 191-212.
Propper, C. and D. Wilson (2003), The Use and Usefulness of Performance Measures in the Public Sector
(CMPO Working Paper Series No. 03/073), Bristol, UK, The Centre for Market and Public
Organisation.
Rafferty, A. M., J. Maben, E. West and D. Robinson (2005). What Makes a Good Employer?
International Council of Nurses, Geneva,
http://www.icn.ch/images/stories/documents/publications/GNRI/Issue3_Employer.pdf.
Rexed, K., C. Moll, N. Manning and J. Allain (2007). Governance of Decentralised Pay Setting in
Selected OECD Countries (OECD Working Papers on Public Governance, 2007/3). OECD, Paris,
http://caliban.sourceoecd.org/vl=7179447/cl=20/nw=1/rpsv/cgi-bin/wppdf?file=5l4qdflvl56d.pdf.
Rosenthal, M., R. Frank, Z. Li and A. Epstein (2005), 'Early Experience with Pay-for-Performance',
Journal of the American Medical Association, 294 (14), 1788-1793.
Ryan, R. M. and E. L. Deci (2000), 'Self-Determination Theory and the Facilitation of Intrinsic
Motivation, Social Development, and Well-Being', American Psychologist, 55 (1), 68-78.
Safran, D., W. Rogers, A. Tarlov, T. Inui, D. Taira, J. Montgomery, J. Ware and C. Slavin (2000),
'Organizational and Financial Characteristics of Health Plans. Are They Related to Primary Care
Performance?', Archives of Internal Medicine, 160, 69-76.
Samaratunge, R., Q. Alam and J. Teicher (2008), 'The New Public Management Reforms in Asia: A
Comparison of South and Southeast Asian Countries', International Review of Administrative
Sciences, 74 (1), 25-46.
Sauermann, H. and W. M. Cohen (2008). What Makes Them Tick? Employee Motives and Firm
Innovation (NBER Working Paper No. 14443). NBER, Cambridge MA,
http://www.nber.org/papers/w14443.pdf.
Schick, A. (1998), 'Why Most Developing Countries Should Not Try New Zealand's Reforms', World
Bank Research Observer (International), 13, 23-31.
Shen, Y. (2003), 'Selection Incentives in a Performance-Based Contracting System', Health Services
Research, 38 (2), 535-552.
Singh, P. (2010). Performance Pay and Information: Reducing Child Malnutrition in Urban Slums.
London School of Economics, London.
Skinner, B. F. (1969), Contingencies of Reinforcement, New York, NY, Appleton-Century-Crofts.
Soeters, R. and F. Griffiths (2003), 'Improving Government Health Services through Contract
Management: A Case from Cambodia', Health Policy and Planning, 18 (1), 74-83.
Soeters, R., C. Habineza and P. Peerenboom (2006), 'Performance-Based Financing and Changing the
District Health System: Experience from Rwanda', Bulletin of the World Health Organization, 84,
884-889.
Springer, M. G., D. Ballou, L. Hamilton, V.-N. Le, J. R. Lockwood, D. F. McCaffrey and M. P. B. M.
Stecher (2010). Teacher Pay for Performance: Experimental Evidence from the Project on
Incentives in Teaching. National Center on Performance Incentives at Vanderbilt University,
Nashville, TN.
Stajkovic, A. D. and F. Luthans (2003), 'Behavioral Management and Task Performance in
Organizations: Conceptual Background, Meta-Analysis, and Test of Alternative Models',
Personnel Psychology, 56, 155-194.
Stazyk, E. C. (2010). Crowding out Intrinsic Motivation? The Role of Performance-Related Pay.
American University, School of Public Affairs, Washington DC.
Steel, N., S. Maisey, A. Clark, R. Fleetcroft and A. Howe (2007), 'Quality of Clinical Primary Care and
Targeted Incentive Payments: An Observational Study', British Journal of General Practice, 57,
449-454.
Stevens, M. and S. Tegemann (2004), 'Comparative Experience with Public Service Reform in Ghana,
Tanzania and Zambia', in S. Kpundeh and B. Levy (eds.) Building State Capacity in Africa,
Washington DC, World Bank, pp 43-86.
Straberg, T. (2010). Employee Perspectives on Individualised Pay: Attitudes and Fairness Perceptions.
Department of Psychology. Stockholm, University of Stockholm. PhD.
Therkildsen, O., P. Tidemand, B. Bana, A. Kessy, J. Katongole, M. B. Ddiba and M. Nielsen (2007).
Staff Management and Organisational Performance in Tanzania and Uganda: Public Servant
Perspectives. Danish Institute for International Studies, Copenhagen, Denmark.
Thompson, J. R. (2006), 'The Federal Civil Service: The Demise of an Institution', Public Administration
Review, 66 (4), 496-503.
Thompson, J. R. and S. L. Fulla (2001), 'Effecting Change in a Reform Context: The National
Performance Review and the Contingencies of "Microlevel" Reform Implementation', Public
Performance and Management Review, 25 (2), 155-175.
Vaghela, P., M. Ashworth, P. Schofield and M. C. Gulliford (2009), 'Population Intermediate Outcomes
of Diabetes under Pay-for-Performance Incentives in England from 2004 to 2008', Diabetes Care,
32, 427-429.
Valentine, T. R. (2002). A Medium-Term Strategy for Enhancing Pay and Conditions of Service in the
Zambian Public Service (Final Report). Management Development Division, Cabinet Office,
Lusaka, Zambia.
van Dijk, F., J. Sonnemans and F. van Winden (2001), 'Incentive Systems in a Real Effort Experiment',
European Economic Review, 45 (2), 187-214.
Vance, R. J. (2003). Employee Engagement and Commitment: A Guide to Understanding, Measuring and
Increasing Engagement in Your Organization. Society for Human Resource Management
Foundation, Alexandria, VA.
Vandenberg, R. and C. Lance (1992), 'Satisfaction and Organisational Commitment', Journal of
Management, 18, 153-167.
Vegas, E. and I. Umansky (2005). Improving Teaching and Learning through Effective Incentives: What
Can We Learn from Education Reforms in Latin America. World Bank, Washington DC.
Vroom, V. H. (1964), Work and Motivation, Hoboken, NJ, Wiley.
Vujicic, M. (2009). How You Pay Health Workers Matters: A Primer on Health Worker Remuneration
Methods. World Bank, Washington DC.
Wagstaff, A. and M. Claeson (2004), The Millennium Development Goals for Health: Rising to the
Challenges, Washington DC, World Bank.
Wallerstein, M. (1999), 'Wage-Setting Institutions and Pay Inequality in Advanced Industrial Societies',
American Journal of Political Science, 43 (3), 649–680.
Weber, M. (1978), Economy and Society (Vol. 2), Berkeley, CA, University of California Press.
Weibel, A., K. Rost and M. Osterloh (2009), 'Pay for Performance in the Public Sector - Benefits and
(Hidden) Costs', Journal of Public Administration Research and Theory, 20 (2), 387-412.
White, G. (2000), 'Pay Flexibility in European Public Services: A Comparative Analysis', in D. Farnham
and S. Horton (eds.) Human Resources Flexibilities in the Public Services, London, MacMillan
Press, pp 255-279.
Wilms, W. W. and R. R. Chapleau (1999), 'The Illusion of Paying Teachers for Student Performance',
Education Week, 19 (10).
Winters, M. A., G. W. Ritter, J. P. Greene and R. Marsh (2009), 'Student Outcomes and Teacher
Productivity and Perceptions in Arkansas', in M. G. Springer (ed.) Performance Incentives: Their
Growing Impact on American K-12 Education, Washington DC, Brookings Institution Press.
Witter, S., T. Zulfiqur, S. Javeed, A. Khan and A. Bari (2011), 'Paying Health Workers for Performance
in Battagram District', Human Resources for Health, 9 (23).
Woessmann, L. (2010), Cross-Country Evidence on Teacher Performance Pay (CESifo Working Paper No.
3151), Munich, Germany, Ifo Institute, Center for Economic Studies.
World Bank (1999). Civil Service Reform: A Review of World Bank Assistance: Report No. 19211.
OED, World Bank, Washington DC.
World Bank (2001). Salary Supplements and Bonuses in Revenue Departments (Final Report). World
Bank, Washington DC.
World Bank (2004). Labor Markets in Europe and Central Asia. World Bank, Washington DC.
World Bank (2007), What Do We Know About School-Based Management, Washington DC, World
Bank.
World Bank (2009). Pay Policy Reform: Building a Foundation for Public Sector Performance through
Improved Public Sector Pay Policy by Using a "Single Pay Spine". World Bank, Washington DC.
Yemin, E. (1993), 'Labour Relations in the Public Service: A Comparative Overview', International
Labour Review, 132 (4), 469-490.
Young, G., M. Meterko, H. Beckman and E. Baker (2007), 'Effects of Paying Physicians Based on Their
Relative Performance for Quality', Journal of General Internal Medicine, 22 (6), 872-887.