The audit society and its enemies

Niklas Bengtsson, Per Engström

28 October 2014



In the second season of The Wire, thirteen dead women are discovered in a cargo container in Baltimore. Because the case has a low probability of being solved, the managers of the police force immediately start arguing over which department is responsible for investigating it. The incentives are clear – whoever gets the case will have poor statistics to show when the management of the department is audited at the end of the year. The consequence is bribes, corruption, and a dysfunctional police force.

It is clear that critics of the ‘audit society’ and the so-called ‘new public management’ doctrines have gained momentum in recent years. In the economics profession, the critique is fuelled by results in behavioural economics, and at the centre of the critique is the so-called motivation crowding-out hypothesis. In a recent article in Finanz und Wirtschaft (11 July 2014), one of the paradigm’s key proponents, Professor Bruno Frey, notes that “New public management is out”. Performance pay and similar schemes, he argues, have filled bureaucratic organisations with self-interested profit seekers rather than mission-oriented individuals. The apparent failure of new public management reflects an intellectual failure of ‘pure’ agency models (and a verification of more behavioural approaches to economic organisation).

In this column, we wish to convey that the field-experimental evidence on motivation crowding-out among public servants is not as clear-cut as one might hope. In particular, the mere insight that agents are motivated by other things than self-interest is not enough to throw performance-based evaluations out the window. The conclusion is partly informed by a recent study of ours, Bengtsson and Engström (2014).

Motivation crowding-out

Let us first make a (perhaps obvious) point – evaluations and audits are costly, and at some point the marginal costs of audits will exceed the marginal benefits. However, the key argument against audits is not the direct costs, but that extrinsic incentives will crowd out intrinsic motivation. In plain English, monetary carrots and sticks – such as performance pay and threats to cut funding – will demotivate civil servants and make them less productive. Further, civil servants, being motivated by a strong sense of moral duty and mission, are particularly susceptible to motivation crowding-out.

There is now a large body of evidence documenting how extrinsic incentives crowd out motivation. Frey and Jegen (2001) and Kunz and Pfaff (2002) review the psychological research on how material incentives may affect intrinsic motivation, and Bowles and Polania-Reyes (2009) survey the economic literature. These types of ‘incentive puzzles’ have been observed in the field as well, when donating blood (Mellström and Johannesson 2008) or participating in fundraising campaigns for cancer (Gneezy and Rustichini 2000). The literature’s relevance to the audit society is most explicitly spelled out in a much-cited contribution by Falk and Kosfeld (2006) on the “hidden costs of control”. In a laboratory experiment, they find that when principals exert control over agents, the agents respond by reducing their prosocial behaviour. Although subsequent studies have been unable to replicate some specific aspects of Falk and Kosfeld’s findings (Ploner and Ziegelmeyer 2007, Hagemann 2007), the mechanisms unveiled in Falk and Kosfeld are intuitive and in accordance with the rest of the crowd-out literature.

While there is no doubt that crowding-out mechanisms exist, there is also no doubt that they do not exist everywhere. For instance, Lazear (2000) refutes all forms of motivational crowding-out in a study of performance pay at an auto glass company, and Nagin et al. (2002) report evidence supportive of conventional agency models at a telephone solicitation company. These are hardly the only papers that find evidence in support of conventional agency theory. We cite these two here because they deliberately address crowding-out mechanisms in their motivations.

The question is thus how far outside the lab the experimental results on the hidden costs of control reach. Where should we look? On this issue, Meier (2006) soberly points out that “intrinsic motivation can only be crowded out by extrinsic incentives if people have intrinsic motivation to begin with”, arguing that agents delivering public goods should be more susceptible to motivational crowding-out than auto mechanics. The same point is raised by Frey and Jegen (2001), who single out the non-profit sector as a more relevant context for crowding-out mechanisms. Guided by these considerations, Frey and Götte (1999) and Besley and Ghatak (2005) study motivational crowding-out in non-profit environments.

Although the behavioural literature has spurred some important empirical work in the area, it is a misunderstanding that traditional agency models have been unable to incorporate altruistic motivation as a rationale for trust-based contracts. In a much-cited paper, Aghion and Tirole (1994) explained that with asymmetric information and incomplete contracts, a ruler might find it in the public’s best interest to trust experts with some autonomy (and even to commit to implement the expert’s proposal without knowing what it might entail). The reason is that the expert will not do his or her best if he or she fears that a misinformed ruler will overrule the expert’s decision. Key in Aghion and Tirole’s agency model is that the expert – the civil servant – is mission-oriented and cares about the direction of society.

Are crowding-out effects relevant for policy?

Under what circumstances are trust-based contracts important? Both conventional and behavioural economics seem to agree that the detrimental effects of audits and control are greatest in environments that are ‘non-profit’, such as among voluntary workers operating in developing countries. (Curiously, this is also an area where calls for audits and evaluations have been strongest – and where such calls have to a large extent been met.)

The Swedish non-profit sector provides a case in point. Since 2005, the Swedish foreign aid agency (Sida) has distributed approximately $150–200 million annually to independent non-profit organisations. The monitoring of these funds has been delegated to so-called framework organisations, which are chaired by members of the non-profit sector themselves. Sida officially describes the system as ‘based on trust’. This trust-based relationship was criticised in 2008 by the National Audit Office of Sweden, which spurred a debate between the two government authorities. The debate reflected the clash between modern public management practices and the more traditional view of the mission-oriented civil servant. The National Audit Office pointed out that Sida’s partners were particularly susceptible to corruption, given the lack of proper monitoring. The chair of the Swedish Red Cross, Bengt Westerberg, answered that evaluations are important but that too much could ‘spread paralysis’ among non-profit workers. Unmoved, the National Audit Office, in turn, called for a stricter ‘control environment’ at Sida.

The experiment

At this point, around 2009, one of us presented previous research at Sida, and we became involved in a discussion of how to meet the National Audit Office’s demands. In particular, we were asked if there was a way to analyse whether the ‘control environment’ was not already too strict. We answered as most empirical economists would answer such a question – randomly assign threats of audits and see what happens!

Later the same year we designed a policy experiment meant to examine how non-profit workers react to more audits (the experiment is published in this summer’s issue of the Economic Journal). In essence, the experiment randomly assigned threats of audits to a sample of non-profit organisations in Sweden. We chose organisations with roots in traditional social movements (Christian churches or trade unions) and in more modern missions (fair trade, sustainable development, women’s rights etc.). Key was that they focused on promoting human development issues. The experimental intervention aimed for maximum simplicity. At the beginning of the year, a random selection of organisations (the treatment group) was informed that their financial documentation would be subject to an additional, special audit, conducted by the financial principal, Sida, itself. The organisations were also informed that they would risk losing future funds if Sida detected any irregularities. Non-selected organisations (the reference group) received no information about Sida’s upcoming audit at all. After the end of the fiscal year, the performance of all organisations was evaluated, and their financial documentation was reviewed by an accountant.

Given that we were part of the design from the start, we were able to set up rules governing the experimental trial in order to preserve the field context as much as possible. The accountant hired to review the accounts was blind to treatment status – in fact she was not aware of the experiment at all. We deliberately abstained from introducing ‘artefactual’ treatments. All contacts with the organisations were made through Sida representatives; the experimental design and our involvement as researchers were not mentioned to the non-profit organisations until after the experiment. We also established strict communication guidelines (a so-called Sprachregelung) at Sida – all incoming phone calls from the organisations in the sample were directed to a designated contact at Sida, who was instructed not to give away any information about the experiment other than what had already been communicated through the threat of audit.
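
The logic of the design, random assignment followed by a comparison of group averages, can be sketched in a few lines of Python. This is an illustration only, not the procedure or code used in the study: the organisation labels, the sample size, the share of treated organisations, and the outcome variable are all hypothetical, and the outcome data are simulated with no treatment effect built in.

```python
import random
from statistics import mean


def assign_treatment(org_ids, share_treated=0.5, seed=0):
    """Randomly split organisations into a treatment and a reference group.

    share_treated and seed are illustrative assumptions, not values
    taken from the study.
    """
    rng = random.Random(seed)
    shuffled = list(org_ids)
    rng.shuffle(shuffled)
    n_treated = round(len(shuffled) * share_treated)
    return set(shuffled[:n_treated]), set(shuffled[n_treated:])


def difference_in_means(outcomes, treated, reference):
    """Average outcome in the treatment group minus the reference group."""
    return (mean(outcomes[org] for org in treated)
            - mean(outcomes[org] for org in reference))


if __name__ == "__main__":
    # Hypothetical organisation labels; the real sample is not reproduced here.
    orgs = [f"org_{i:02d}" for i in range(40)]
    treated, reference = assign_treatment(orgs)

    # Simulated outcome (share of the grant returned unused), drawn from the
    # same distribution for every organisation, so no true effect is present.
    sim = random.Random(42)
    share_returned = {org: sim.uniform(0.0, 0.1) for org in orgs}

    print(difference_in_means(share_returned, treated, reference))
```

Because assignment is random, the two groups are comparable on average, so a difference in average outcomes can be attributed to the threat of audit rather than to pre-existing differences between organisations.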

Results: the threat of audit improved all aspects of efficiency

Based on the accountant’s report, we found that expenditures among treated organisations were more carefully justified. This result is hardly surprising since the treatment group was explicitly informed that their financial documentation would be scrutinised. However, we also found that non-monitored organisations exhibited a distinct expenditure maximisation effect, spending all but a few percentage points of the Sida-funded budget. By contrast, organisations in the treatment group were more likely to return unused funds to Sida.

That the treated organisations were more moderate in their spending was an interesting result, but in isolation, this specific result neither refuted nor verified the crowding-out hypothesis. Fewer irregularities and less spending could very well reflect lower productivity (consistent with a crowding-out effect). However, the tendency to return funds did not seem to come at the expense of reduced outreach of the projects. On the contrary, based on the narrative reports, the treatment group claimed to have reached a higher degree of outreach relative to the reference group.

We foresaw the obvious objection that talk is cheap – self-reported outreach is a mere reflection of ‘creative book-keeping’, a typical malign symptom of new public management. We therefore equipped treatment-blind assistants with media search engines to measure the projects’ actual outreach in media. The result showed that treated organisations were significantly more likely to be mentioned in local media compared to projects in the reference group. This effect was particularly dramatic among groups that also cut back on spending.

Finally, our intervention did not spur any obvious long-run selection effects (say, towards more self-interested organisations). Organisations treated with increased monitoring were as likely to re-apply for new funds the year after as non-monitored organisations.


The motivation crowding-out hypothesis appears to add some flesh to the proposal that there are limits to evaluation and audits (leaving aside the direct costs). According to this theory, the reason audits do not work is that mission-oriented agents become demotivated when results are monitored. However, there is still much more to be learned about when, exactly, financial carrots and sticks have disincentive effects. Our study shows that it is not enough to simply look for a sector populated by ‘altruists’. We were able to boost the outreach of humanitarian issues in Sweden using a simple financial nudge.

This is not to say that Michael Ghiselin’s quip (“scratch an altruist and watch a hypocrite bleed”) is correct. Our interactions with the non-profit workers made it clear (at least to us) that non-profit workers are indeed not hypocrites. In fact, most of them cared deeply about their organisation’s mission, so deeply that some of them appeared to welcome the audit because it gave them a chance to prove their true quality against competing missions (Sida gives funds to both Catholic organisations and sexual rights organisations – organisations with vastly different opinions on some issues). The most confident non-profit organisations thus welcomed the audit, as they perceived the audit as legitimate (see Schnedler and Vadovic 2011 for laboratory evidence of lower hidden costs under legitimate audits). We hope that once the hype against new public management is over, public opinion will converge towards the relevant issue: When are audits legitimate? When are they not?


