Targets & Indicators – Measure for Measure or Comedy of Errors ?

This is a paper I wrote in 1999, when a college Principal. I uploaded it in June 2020, but as I haven’t updated it, it is a bit of a historical document.  Many of the issues, though, remain, even if the field of combat changes.  When I hear the Johnson government explaining why international comparisons of Covid mortality are not really fair to the UK, my mind wanders back …

Summary

The current system of providing public services by purchaser/provider split, and management by audit is weakened by the difficulties of identifying accurate performance indicators (PIs).  The problems – data capture, social adjustment, fraud, tariff farming, cost and inflexibility – are greater than has previously been thought.  The selection of proxies for PIs often leads to distortion of provider effort. 

Government does need to make secure judgements about value for money and quality.  It also needs evidence on which to base the selection of preferred suppliers in a managed supply chain relationship.  To be effective in their role, PIs should be used with caution, and seen as supporting the management of supply chain relations rather than as making final judgements of provider quality or cost.  Judgements need to be supplemented by a strong research base, and by developing longer-term indicators.  This will enable the development, through time, of a third way to manage a more effective post-16 sector – taking the best from the traditional public sector virtues and market disciplines whilst avoiding fetishistic contract compliance and audit.

 

1.         Introduction

1.1     There has in recent years been a major change in the way we run the public sector, a change that has drawn little attention from public or press.  Public services used to be delivered more or less directly by central or local government.  No longer.  What happens is this.  Firstly, an agency at arm’s length from government is created.  This agency is told the tasks that it is required to do and given the funds to get on with it.  The agency then purchases the provision from institutions that have been made independent of local or national government.  They have Chief Executives, develop mission statements, are required to make strategic plans and operating statements, undertake consumer surveys, hit targets.  Any initiatives or pilots beyond the base are launched via ‘challenge funds’ in which institutions and agencies are invited to bid for the opportunity to run a project for additional cash.  Compliance with targets and conditions is checked by exhaustive audit processes: institutional success is judged by performance indicators.  Benchmarks and league-tables are used to judge how efficiently one institution performs in relation to another.

1.2     With the obsession with franchise and accountability, control and cuts and sleaze and staff contracts, we have failed to give enough importance to the coming of this new world to further education.  Indeed, one of the besetting sins of the FE sector has been its failure to see itself as part of public sector trends, to think that its griefs are private and its problems specific.  Colleges may have student volume targets and need to hit performance indicators for unit cost, retention and results, but the truth is that All God’s Children Got Targets.  Prisons have targets for multi-occupied cells, suicides and escapes.  Job Centres have targets for job placement for unemployed people, magistrates’ courts and operating theatres need to get the right number of clients through and out the other side, and (as college 16-hour students know to their cost) Social Security clerks have targets for disqualifying benefit claimants.  Police have clear-up rates, council housing has rental rates to hit.  The population itself even has guidance as to how many alcohol units to consume – just about the only government target that harassed public service managers regularly manage to exceed.

1.3     The trend towards governmental target mania has not escaped the media.  “Apart from well publicised targets such as reductions in hospital waiting lists and the size of infant classes, precise performance targets have now been laid down for everything from the state of soldiers’ and sailors’ teeth to the number of specimens to be gathered by the Botanic Gardens at Kew” (Sunday Times Feb 13th 2000).

1.4     We now learn that the Treasury is setting targets for other Departments as part of the Comprehensive Spending Review – and setting them at very detailed levels, looking for them to justify sums of money at the £10m level.  Cash will be reclaimed from central government departments that do not hit their own targets.  So the Treasury sets targets for the DfEE, which sets them for the FEFC, which sets them for colleges, which set them for their own schools and faculties, which set them for course managers.  Targets, like Jonathan Swift’s fleas, go on ad infinitum.

2.       The Attraction of Performance Indicators

2.1     The logic is pretty simple.  Public services – training places, hospital operations, prison custody – are a commodity like any other.  They should be purchased for the public in a way that delivers value for money.  Value for money is best assessed if we specify and then measure what we expect to get for our money.  The agency approach can also be argued to get round some other problems.  One was what had become known as “producer capture”, in which public services got to be run for the convenience of their staff rather than their users.  Another was the way that power had been in the hands of local government or amateur committees rather than expert managers.

2.2     The advantages of performance indicators to those charged with running the agency-centred but still massively complex and expensive public service are obvious.  Firstly, performance indicators can be seen as part of an attempt to increase customer focus and improve quality (Ghobadian and Ashworth 1994).  How quickly are patients being seen ?  Which schools are getting the best results ?  Secondly, performance indicators can be used to help focus effort on matters of strategic importance.  The attraction for politicians of a well-designed performance review system lies in its power to operationalise manifesto commitments.  By extension, performance measurement is consistent with the government’s desire for greater accountability.  It can apparently increase control over decentralised activities and (one step further again) help with the management of compulsory competitive tendering.  To the extent that they inform administrative judgements about relative institutional performance, they can create a quasi-competitive market as funding agencies can select more effective and lower-cost providers.

2.3     At its best, analysis of valid numbers can be used to tell a pragmatic government which policies appear able to deliver its goals.  The danger is that it can also be used uncritically to justify cuts in public spending.  It is always possible to find one hospital, one college or one police service that can achieve benchmark results at low cost.  “Calling for efficiency improvements through the better management of performance allowed the government to cut public expenditure without necessarily advocating or, more significantly, being seen to advocate service level depletion, a process facilitated by the politically irresistible “value for money” tag.  It was difficult to oppose the concept of value for money without seeming to advocate or at least defend waste and inefficiency” (Ball and Monaghan 1996).

2.4     And whilst we are wading in the muddy waters of cynicism, let’s not forget one last advantage of agency/PI management to a politician – the diversion of blame.  David Blunkett’s promise to resign if the national learning targets are missed is not only courageous, it is rare.  The logic nowadays is different.  It goes like this – the government has chosen the goals – safe railways, making absent parents pay, high standards in schools.  They’ve given clear orders and handed over the cash with the task.  Failure must therefore be due to an incompetent agency that has fallen down on the job.  When a dangerous prisoner escapes, or a train crashes, ministers no longer resign – they call in the head of the Prisons Agency or Railtrack for a dressing down.

3.       The problems with performance indicators

 

3.1     For whatever reason, and of whatever type, the march of the performance indicators was irresistible, despite evidence of substantial problems in their usefulness.  It may be worth reviewing these problems systematically.

3.2     Social factors  We spotted early on that the targets and league tables were weak in allowing for social factors.  Of course there was in the past some indefensible justification of poor provision on social grounds.  David Blunkett once memorably told me, when he was leader of Sheffield Council and I was opening a new college in his ward, “I don’t want you coming back in two years and telling me the results are rubbish because everyone’s disadvantaged”.  We must make educational success available to all students, whatever their background: the energy the government is giving to this task is heartening.  But one of the most secure findings of educational research remains the link between social class and attainment.  Judgements that make no allowance for context let the schools in the suburbs off easy, and leave the inner city school (and for that matter police force or rent office) demoralised.  The first performance tables for New Deal job placement success placed 10 of the 15 ‘best performing’ New Deal areas in rural Scotland, and 15 of the 20 ‘worst performing’ in inner London.  Lambeth was placed 119th out of 144 in its performance.

3.3     For as long as social inclusion is part of the public policy agenda, we must get this measure of achievement against the grain right.  The problem is knowing how much of a discount is appropriate.  Bradford will never match the primary school attainment levels of Kingston-upon-Thames, but what levels should be expected? Converting raw scores into genuine measures of performance is an important task for the research community.  In the meantime, benchmarks may be better than league tables.  As soon as the idea of ‘clustering’ results in comparable areas was introduced, Lambeth’s New Deal job placement score rose to 2nd out of 17.
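To make the clustering point concrete, here is a toy sketch in Python.  The provider names, cluster labels and placement rates are invented, and this is not the DfEE’s actual method; it simply shows how the same figures yield a raw league table and a fairer within-cluster comparison.

```python
# Toy sketch of 'benchmarks rather than league tables': provider names, cluster
# labels and placement rates are invented; this is not the DfEE's actual method.
from collections import defaultdict

providers = [
    # (provider, family of comparable areas, New Deal job-placement rate)
    ("Rural A", "low-unemployment cluster", 0.62),
    ("Rural B", "low-unemployment cluster", 0.58),
    ("Lambeth", "inner-city cluster", 0.31),
    ("Inner X", "inner-city cluster", 0.27),
    ("Inner Y", "inner-city cluster", 0.24),
]

# Raw national league table: inner-city providers inevitably fill the bottom places.
raw_table = sorted(providers, key=lambda p: -p[2])
print("Raw ranking:", [name for name, _, _ in raw_table])

# Clustered comparison: each provider is judged only against its own family,
# so a provider working in a tough area can come out near the top of its cluster.
clusters = defaultdict(list)
for name, family, rate in providers:
    clusters[family].append((name, rate))
for family, members in clusters.items():
    ranked = [name for name, _ in sorted(members, key=lambda m: -m[1])]
    print(f"{family}: {ranked}")
```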

3.4     Creaming  The damage from ignoring context could be worse than inappropriate judgements of quality, as managers alter their institutional policies to look good.  Economists looking at planning systems note that “creaming” is encouraged, where easier targets or clients are given preference over more difficult (and often more deserving) cases.  It would be sad if those working on the toughest tasks chose to ignore difficult-to-place trainees, or students with unsuccessful secondary backgrounds[1].  OECD (1988) has found creaming to be widespread in training schemes across the world.  It is already happening in one crucial area outside education, where I predicted last year that publicly comparing hospital mortality rates would lead to a reluctance to treat – or even admit – seriously ill patients.  “The Independent” duly reported on 7 September 1999 that “The Chief Executive of a London teaching hospital has warned that surgeons could refuse to operate on high-risk patients if the emphasis on success rates for individual doctors increased”.

3.5     Measuring the wrong things  Those collecting performance indicators sometimes ask for numbers that don’t actually measure what they want to know.  Sometimes it is because we are measuring what is easy, not what matters.  There are some particularly fatuous examples that can be quoted.  The Prison Service’s record in preparing prisoners for release is judged by how many visits above the minimum they receive in their final months in gaol.  Local government responsiveness is assessed by seeing how quickly phones are answered: an unhelpful response after three rings scores better than solving the problem after four[2].   FE figures are rarely that gross, but the numbers don’t always show real performance.  For example, reliance on fixed census dates makes retention measures very sensitive to the date a course starts.  A more important example from the FE world is the very poor collection of destination data.  This must be the paramount indicator of success for an educational institution – are people in jobs, earning more money, off drugs, staying out of prison, living independently ?  It would answer the government’s proper desire for outcomes, not outputs.  Yet because it is expensive and difficult to collect, quality information (outside the sixth form college sector) is rarely available[3].
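The census-date problem can be shown with a minimal sketch: two courses with identical weekly drop-out look very different at a fixed census date simply because they started at different times.  The drop-out rate, dates and cohort size below are invented for illustration, not drawn from FEFC data.

```python
# Toy sketch: two courses with identical weekly drop-out, measured at a fixed
# census date.  Drop-out rate, dates and cohort size are invented, not FEFC data.
from datetime import date

WEEKLY_DROPOUT = 0.02           # assume both courses lose 2% of students a week
CENSUS = date(1999, 11, 1)      # hypothetical fixed census date

def apparent_retention(start: date, enrolled: int = 100) -> float:
    """Share of the enrolled cohort still on course at the census date."""
    weeks_elapsed = max((CENSUS - start).days // 7, 0)
    still_there = enrolled * (1 - WEEKLY_DROPOUT) ** weeks_elapsed
    return still_there / enrolled

september_course = apparent_retention(date(1999, 9, 6))    # eight weeks before census
october_course = apparent_retention(date(1999, 10, 25))    # one week before census

print(f"September-start course: {september_course:.0%} retained at census")
print(f"Late-October course:    {october_course:.0%} retained at census")
# Identical teaching and identical drop-out behaviour: the later start simply
# looks better because fewer weeks have elapsed when the snapshot is taken.
```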

3.6     Data quality  Even when the measure is right, apparent differences in institutional performance may just show variations in ability to collect data: Lambeth College’s declared pass rate is twice that of some comparable inner London colleges, but I don’t think we’re twice as good.  Or differences may be about managing data.  The number of rail complaints fell when railway companies made forms more difficult to obtain: the number of sexual crimes reported rose when police took a more sympathetic attitude to victims.  The problem is worse when information is unrelated to incentives or judgements of performance: managers only put resources into the areas where results count.  The old FE Annual Monitoring Survey was held in contempt by people who had better things to do, and sending nonsense data to head office is not unknown in other sectors.  The way that agencies now get round this is to tag resource to data: this gives providers an incentive to get data in, and allows audit to perform a dual function of monitoring performance alongside its traditional role of assuring probity.  The problem is that it also gives institutions financial incentives to massage the figures.

3.7     Fiddling the books  This rarely comes to out-and-out fraud, though that certainly happens[4].  If there are financial penalties for student drop-out, there will be colleges that claim 100% retention.  If you only get paid for a piece of adult training if the trainee gets a job or passes an NVQ, then those job outcomes or qualifications will appear.  It is a lucky TEC that is not carrying a provision in its accounts against past fraud.  I’ll move to FE in a minute, but here are a few interesting cases reported in the press:

  • in the US, the reputation of some well-regarded “zero-tolerance” police chiefs has fallen as it has been found that their impressive figures were the result of distortion and misreporting: crimes were either left unrecorded or downgraded. Car theft became vandalism, burglary became lost property – all to ensure the serious crime rate fell and the detection rate rose[5].
  • privatised rail operators have extended scheduled journey times to make it easier to avoid penalties on their punctuality targets.
  • a city in China caused alarm for environmental health officials when it reported many more mentally handicapped children than expected. On investigation, it turned out that the local school system, realising that students with special needs could be left out of league tables, classified all who stood little chance of a pass as disabled.  Some of the kids moved to other school districts and did very well.
  • New York schools were discovered, according to the Independent of 9 December 1999, to be routinely falsifying the tests that show pupil progress, in order to look good in government tables and attract extra funding.

3.8     “The more any social indicator is used for social decision making, the greater the corruption pressures upon it.” (Campbell 1979).  And it’s not just social policy.  Major errors in the US Army’s official estimates of Vietcong numbers (which ignored guerrillas, supporters and anyone not in military uniform) were covered up to make it appear that the US was winning (Adams 1995).  Agency theory shows how monitoring is needed where there are “hidden actions” or “information asymmetries” which expose principals to “moral hazards” or “adverse selection” (Wallace 1980).  But monitoring is rarely deep or clever enough.  It didn’t catch Robert Maxwell or Nick Leeson, BCCI or Halton College.  Or General Westmoreland.

3.9     Proxy padding   More common than straight dishonesty is tariff farming.  Colleges short of units enter a whole cohort for additional qualifications: all the “A” level language students do Institute of Linguists courses, all the art & design students do RSA CLAIT[6].  The franchise scam, where colleges bought growth rates of up to and above 100% by badging existing students from private trainers or profitable companies, is another aspect of this.  Again, colleges have a huge financial disincentive to classify a student as having left, and so the FE league tables show implausibly high retention rates alongside implausibly low achievement rates[7].  This is all horribly well-known to economists, who speak of Goodhart’s Law to describe how almost any statistic will become distorted by being used as a control number (Goodhart 1975).

3.10   The result is that the published performance indicators rarely provide sound evidence of underlying real performance[8] – unless we are talking about the performance of nimble minds in the management information industry.  As a leading FE Principal once said “They want units, I’ll give ‘em units”.  This is well known in the literature of central planning.  Gray (1997) uses the example of the “production fetishism” of the Soviet planning system to illustrate the “gaming” effects of output monitoring.  In Russia, for example, production targets encouraged over-production of poor quality goods.  When targets for glass production were defined in tons, the factory had an incentive to produce undesirably thick glass, but when targets changed to square metres the glass was made too thin.

3.11   The problem is that in the absence of a real market, and unwilling to trust public servants, funding agencies have to look for proxies to measure: this is where much of the distortion comes in.  They can’t measure efficient health care, so they measure the length and duration of the waiting list.  So hospitals concentrate on procedures that reduce waiting lists rather than medical need.  They can’t measure a good education, so they have to rate schools on whether they get pupils to five grade Cs at GCSE.  Performance of the lower ability cohort then falls as schools concentrate resources only on those likely to get to that level.  TECs are in the same dilemma.  Having to hit their MPLs – “minimum performance levels” – measured as so many NVQs or training places, they are under great temptation to provide the qualifications and courses that will tick the box – short, low-level, cheap – rather than those the economy needs – long, high-level, costly.

3.12   Closer to home, the recent expansion in FE of first-aid and food hygiene courses is not because the economy needs them, but because the price the FEFC used to pay for them offered such a good profit margin.  The FEFC’s line is now to say that the funding system was fine, but it was knocked off course by the unexpected entrepreneurialism of the FE sector. They should have known better. “The central ideas of economic theory are very simple.  They boil down to little more than the proposition that people will usually take advantage of opportunities …” (Krugman 1999).

3.13   Outcomes not process  This would all be an amusing game were it not for the fact that resources are diverted into activities that yield good indicators and away from those that don’t fit.  A construction training centre shut last year in Camberwell because TECs are obliged by their funding regime[9] to hold back 75% of the money until trainees have a steady job: but in the low end of the building trades, jobs rarely last 28 consecutive days[10], so the trainers can’t get paid.  An American example, and maybe not totally fictional, is to be found in Michael Connelly’s wonderful Harry Bosch books, which feature the contempt of our flawed but decent hero for the ‘number squads’ who exist to give the LA Police Department impressive arrest and clear-up rates for public consumption, irrespective of real work or serious criminals.

3.14   Short-term v long-term  The emphasis becomes one of outputs rather than process, of doing the minimum to get by.  “Innovation involves risk and risk is not rewarded.”  (Soviet quote in Gray 1997).  Projects with a quick payoff will tend to be preferred in a system which needs to prove the value of the current year’s expenditure by achievements before year-end.  Trainers financed by NVQ aim for fast turnaround whilst the broad skills we need to develop for the new employment world are likely to be a casualty of a number-driven system.  Sixth form college staff lamented the passing of the entitlement curriculum as the qualification-driven funding and quality assessment system drove down teaching hours.  Inner city colleges found that innovative courses aimed at rescuing the young disaffected might take people off benefit and build confidence – but they certainly have retention and achievement rates below approved benchmarks.  A stark illustration of the pull between PIs and innovation can be expressed in a single question – what will be the retention and achievement statistics of the UfI/Learndirect ?

3.15   The tension between short term indicators and long term improvement is particularly acute in the school sector.  One of Lambeth College’s partner schools is a tough, but improving, secondary in the north of the Borough.  It has, this year, for the first time attracted a full intake of local 11 year olds: but it will be five years before this is reflected in its GCSE league table performance – time enough for the local community to take fright at the poor PIs, send its children elsewhere, and turn the cycle downwards again.

3.16   Paying the cost  Another great disadvantage of the new wisdom is cost.  The army of administrators and auditors needed by the new management system must be paid for before any efficiency gain can be claimed.  This is a hidden cost, skimmed from education budgets.  It should worry a government aiming to “ensure the money reaches the learner rather than being tied up in administration” (DfEE Press Release, 29 October 1999).  We have 8 management information staff at Lambeth, collecting and checking information for the Funding Council: this is a comparatively modest set-up for a college our size[11].  One London TEC spent nearly £1m last year on extra accountancy staff to ensure it was compliant with Government Office requirements.  Even then, it is likely that different auditors will have different interpretations, making comparisons based on their approved data flawed[12].  What is ironic, as we have noted earlier, is that compliance auditing of this sort is not good at uncovering fraud, which almost always comes to light in other ways[13].  Just as wags sometimes ask why there is only one Monopolies Commission, perhaps we should ask where the value for money in audit lies.

3.17   It cannot be just the institutions who are paying the cost: the agencies are also running to stand still.  TECs have existed for ten years, yet last year – between April and October – there were 22 supplementary changes to their financial agreements with Government Offices.  Incorporation happened in 1993.  The FEFC has already published almost 400 Circulars and the pace does not slacken: the number of circulars issued has increased in each of the last four years and they now come more or less weekly.  This is not a criticism of the Council or its staff – I believe that if we want to run a system via the target and audit culture it will indeed require this sort of detail[14].

3.18   Distorting the managerial task  What institutional managers have to do in a world of audit and target and performance indicators and league tables is comply[15].  Local discretion – the style of the college plan, the length of the New Deal Gateway – is limited.  It is not that responding to the agency above you gets in the way of the job  – it becomes the job.  Staff meetings are given over to explaining to baffled teachers and technicians why we have to comply.  Management information is shaped to give the agency what it wants, not managers what they want[16].  Each new Circular must be scrutinised for its effect on the college.  In the longer term, a generation of managers comes through who see the whole job as responding to head office.  Initiative and flair, local responsiveness and job satisfaction will ebb away until we get to the point of complaining, with Inspector Truscott in “Loot”, “how dare you involve me in a situation for which no memo has been issued !”

3.19   This has already happened in the TECs.  Impressive senior industrialists have been recruited to their boards.  But rather than discuss the training and skills needs of the region or industries they know about, they concentrate on the latest number of adult starts, cash reserves and performance levels.  What the whole movement was set up for has been lost in a fog of targets and indicators.  In a huge irony, the attempt to introduce the market ends in Stalinism.

4.       How it works in post-16 education and training

4.1     The world of indicators and targets came to further education soon after incorporation.  The 1992 Further and Higher Education Act established a Further Education Funding Council, and took nearly 500 colleges out of local government control.  This was presented at the time as liberating colleges to be more entrepreneurial and flexible: at a famous reception, John Major asked the assembled Principals and Chairs “Isn’t it great to be free ?”

4.2     We soon discovered we were about as free as a Marks and Spencer shirt supplier.  Our liberation was quickly followed by the need to share Strategic Plans on a common format, to break that plan down into lists of actions by whom and by when, to make detailed returns to the Funding Council three times a year, and to hit ever more demanding targets of growth, unit cost and achievement.  A new funding system was introduced based on breaking all our work down into units: a college’s unit earnings depended not only on its student numbers, but also on its programme mix, its ability to raise fee income, its record in retention and achievement, how much learning or childcare support its students needed and, more recently, how many low-income students it attracted [17].  League tables were established, with six major indicators [18] taking pride of place:

PI 1.       Achieving the funding target.  Each year colleges are set targets for their activity in the coming year, based on student enrolments, additional learning support and so on.  By seeing how close we get to our funding target, the FEFC judges the effectiveness of our planning and target setting.

PI 2.       Student enrolment trends  The FEFC also looks at how fast enrolments are growing (or falling).  This shows them whether colleges are providing programmes that meet the needs of students.

PI 3.       Student continuation  The proportion of enrolled students who are still attending in the summer term, used to show the appropriateness as well as the effectiveness of learning programmes.

PI 4.       Student achievements  The percentage of completing students who attained their main learning goal is used to provide an indicator of student achievements (a toy calculation of PIs 3 and 4 is sketched after this list).

PI 5.       Attainment of NVQs  The FEFC needs to know the numbers of young people achieving NVQ2 and NVQ3 and the number of adults achieving NVQ3 or higher in order to see how the college is contributing to meeting the national targets for education and training.  It is difficult to see what this indicator adds to PI 2 and PI 4 apart from a marketing device for the NVQ sacred cow.

PI 6.       Average level of funding (ALF)  The college’s cost-efficiency, measured as the amount of cash per funding unit.  This has been abandoned – ostensibly because unit cost is now determined centrally according to convergence tram-lines, but possibly because of the known flaws in its calculation.
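As flagged under PI 4, here is a toy calculation of the two headline rates.  The record layout and the sample cohort are my own invention, not the FEFC’s data specification; the point is simply that the two indicators use different denominators, which is one reason they can pull against each other.

```python
# Toy sketch of the two headline rates.  The record layout and the sample cohort
# are invented for illustration; they are not the FEFC's data specification.
from dataclasses import dataclass

@dataclass
class StudentRecord:
    enrolled: bool            # counted at enrolment
    still_attending: bool     # still on course in the summer term
    achieved_main_goal: bool  # attained the main learning goal

def continuation_rate(records: list[StudentRecord]) -> float:
    """PI 3: share of enrolled students still attending in the summer term."""
    enrolled = [r for r in records if r.enrolled]
    return sum(r.still_attending for r in enrolled) / len(enrolled)

def achievement_rate(records: list[StudentRecord]) -> float:
    """PI 4: share of completing students who attained their main learning goal.
    The denominator excludes early leavers, which is one reason high retention
    and high achievement can pull against each other."""
    completers = [r for r in records if r.enrolled and r.still_attending]
    return sum(r.achieved_main_goal for r in completers) / len(completers)

# A cohort of 10: 8 complete the course, 6 of those achieve the main goal.
cohort = ([StudentRecord(True, True, True)] * 6 +
          [StudentRecord(True, True, False)] * 2 +
          [StudentRecord(True, False, False)] * 2)

print(f"PI 3 continuation: {continuation_rate(cohort):.0%}")   # 80%
print(f"PI 4 achievement:  {achievement_rate(cohort):.0%}")    # 75%
```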

4.3     These are the headline indicators, but others have also come into play.  Assessments of quality routinely use the number of lessons graded satisfactory or better during inspection visits, or the number of colleges with management that is less than satisfactory.  Financial assessments quote the proportion of colleges that achieve grade A status – meaning that they evidently have the resources to achieve their business plans.

4.4     Interestingly, other sectors of the public training world have different indicators.  TECs have 8 ‘minimum performance levels’ (MPLs) in total, measured at the mid-year and end-year points.  The MPLs are:

– Youth starts

– Youth output points

– MA ethnic minority starts

– Basic Employability (BE) starts

– BE output points

– Adult jobs/self employment

– Adult ethnic minority outcomes

– IIP recognitions for organisations with 10 or more employees

4.5     Although these figures are arranged in league tables by the DfEE, they are essentially thresholds not absolute values.  And not only is the style of indicators different, so is the use to which they are put.  There is no pass or fail mark in the FEFC indicators, nor is there any apparent adverse consequence of performing poorly[19].  Indeed, a bad enough performance will bring a generous Standards Fund cheque flopping onto your college welcome mat.  By contrast, a TEC that falls short of its mid-year or end-year minimum performance levels will find that the regional government office reduces its managerial freedom significantly – in particular, the ability to vire between programmes.  The idea of using minimum performance levels as a licence to trade[20] as an educational supplier might be attractive to an agency keen to ensure low entry barriers but anxious about quality.

4.6     The existence of differing targets and performance levels in similar programmes presents interesting choices to managers in the system, who often have to deal with clients from different agencies doing the same course.  Take the case of a further education college with European and FEFC funding.  To avoid double-funding, FEFC students are discounted when put forward for ESF funding.  As a result, managers are often faced with a choice at the end of the year as to which agency’s targets to hit – FEFC or regional Government Office.  It could also be argued that there are internal tensions between performance indicators – for example between high retention and high achievement (early leavers are likely to be students struggling with the course) or between standards and widening participation.  This makes the FE manager’s job – and maybe our analysis of their behaviour – rather like the idea of ‘satisficing’ found in the theory of the firm.

5.       So what should we do ?

5.1     None of this is to deny that we need good performance indicators.  It does matter that a child is three times more likely to die in the same heart operation in a hospital in Bristol than in one in Birmingham.  We still need the numbers to inform judgements on effectiveness, value for money, flexibility, quality.  No one wants to go back to a sluggish public sector of council estates and sink schools.  But when creating a new post-16 education and training world, we must acknowledge the weaknesses of the measures we have used to judge the workings of the system we inherit.  We need to blend the best of the agency/PI world with the best of the traditional public sector in (I hesitantly suggest) a third way.  I suggest below that we need to take three decisions:

–    make sure the numbers are clean and useful for managers – defined in terms that make sense to those compiling them and those using them.

–    supplement them with research and evaluation to a much greater degree than at the moment

–    use them to develop effective long-term purchaser-supplier relationships, rather than as headlines for political and public use.  Performance indicator gaming is a consequence of poor communication and mistrust, and can be reduced by engendering a sense of common effort.

5.2     Clean up the numbers  Our first task is to make the numbers collected useful for judgements to be made. We are talking about the 3 Ds: 

–    the dimension – the aspect of performance that the manager or agency wants to know about  (local government responsiveness, college student wastage)

–    the definition – how the dimension is to be captured (how quickly the phone is answered, whether an enrolled student is still on course)

–    the data – giving precise instructions to those collecting the information, which may involve proxies or cut-off dates, closeness to prescribed target levels or whatever (how many calls were or were not answered inside four rings; students in class after May 1st as a percentage of those enrolled on November 1st). 

All three need to be right to get meaningful figures – but even so they would still leave the problems of context and gaming I mentioned earlier. 
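Taking the telephone-answering example through the three Ds gives a toy sketch of how a precise data instruction can reward a quick but useless answer over a slower helpful one.  The call log and the four-ring threshold below are assumptions for illustration, not a real local government data specification.

```python
# Toy sketch of the three Ds for the responsiveness example.  The call log and
# the four-ring threshold are assumptions for illustration, not a real data spec.
from dataclasses import dataclass

@dataclass
class Call:
    rings_before_answer: int
    problem_solved: bool

# Dimension:  local government responsiveness.
# Definition: how quickly the phone is answered.
# Data:       percentage of calls answered inside four rings.
def responsiveness_score(calls: list[Call]) -> float:
    return sum(c.rings_before_answer <= 4 for c in calls) / len(calls)

office_a = [Call(2, False)] * 10   # answers quickly, solves nothing
office_b = [Call(5, True)] * 10    # answers on the fifth ring, solves everything

print(f"Office A scores {responsiveness_score(office_a):.0%}")   # 100%
print(f"Office B scores {responsiveness_score(office_b):.0%}")   # 0%
# The data instruction is met perfectly by the less useful office: the
# dimension we actually cared about (helping the caller) has been lost.
```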

5.3     My recommendations would therefore include:

  • developing a suite of performance indicators that measure what is important – skills gaps – social inclusion – value added – perhaps even changes in local output – not what is easy.  Recognise that some of these – were access students successful in graduating, did Business Start courses create enduring SMEs, has that single parent stayed off benefit – will require years to assess.  Recognise too that judgements of college performance will need to be made using the suite, not by what Charles Handy described as the ‘tyranny of the single number’.
  • ensuring that targets and performance indicators have a social context, a carefully considered ‘degree of difficulty’ rating. Don’t compare intensive care with general wards.  Cluster providers in families for comparison. To be fair, this is now recognised by New Deal and, more grudgingly, by the FEFC.

  • develop some partnership goals and targets that have to be achieved by teams of institutions working together. This is already in place in learning partnership plans, and needs to be developed to full usefulness.

  • remove any perverse incentives. Scoring schools on their average grade rather than their number of students with five Cs will encourage raising achievement across the board.  Funding franchised work with employers at the actual contract cost will support partnerships that make educational sense, whilst offering nothing to the fast-buck merchants.

  • Keep stability between years – don’t keep ducking and diving. The better understood a PI is, the easier providers will find it to give good figures, and the cuter auditors will be at spotting malpractice.  FEFC retention and achievement figures are now tip-toeing towards usefulness – but after four years of noise and nonsense.  This implies running indicators as pilots for a number of years before publication – a promise that was broken in primary school tests.  With 2000 or more providers, we cannot afford to jump around with different measures of quality or value for money from year to year.
  • Measure what you want to manage, not proxies – for example, satisfied callers not phone rings, students not funding units[21].

  • Make sure that figures are comparable across sectors – that (for example) school and college data for retention and exam success are measured the same way (which they currently aren’t).

  • We must have some input measures as well as output measures – e.g. PCs per student, teaching hours, number of student counsellors per full-time student. The period of hygiene that moved institutions away from judging quality only in terms of inputs – carpets in the classrooms, how many teachers, how many library books – was useful, but we need to move on.
  • ensure that performance indicators are compatible with internal target setting (unlike the FEFC’s current practice in results, for example). Thus, managers will have an incentive to ensure the numbers are accurate for their own purposes: it will also minimise cost.  Further, providers should be encouraged to develop their own performance indicators for the areas of concern to them, and share them with colleagues.  (It strikes me that the source of the ‘gaming’ problem is that performance indicators were initially devised for internal management use – the corruption came when they came to be used by external bodies for control.)
  • even after all this, view the figures with very great circumspection. They could be used to inform the decision about collecting more information – which colleges are ready for their next inspection, for example. But as Einstein is alleged to have said, “not everything that counts can be counted, and not everything that can be counted counts”.  Users of data need to know the limitations involved in collection and behaviour lower down the food chain.

5.4     Evaluation and research  It is often said that FE has a poor research base.  This is no longer true; but it is the case that FE makes little use of research findings for policy choices.  Alongside a suite of reliable performance indicators we must build longer-term evaluation – measuring the success of differing approaches in reducing drop-out, achieving progression to higher-level courses, success in HE and employment, finding out who is in work and what they earn.  Long-term research is needed: an Access course may have good results and retention, but do its graduates make it through higher education ?  A training programme may be low cost, but are the clients still as likely to be unemployed three years, five years later ?  A national programme needs to be co-ordinated that will link strong research with purposeful dissemination: FEDA seems well placed to take this role.

6.       Rebuilding a public service

6.1     Performance indicators may be configured this way and that, or fiddled one way or another, and research findings may support or criticise current practice.  The most important question to ask about them is – “so what ?”  The point, as someone once remarked, is not to interpret the world but to change it (Marx 1888).  What do we want PIs for ?  I would argue that the most profitable use would be to inform the long-term relationship between the Learning and Skills Councils and the providers.  “The voice of many evaluators lies behind the view that performance indicators should be used within a developmental rather than an accountability approach.  ‘Performance indicators should be used as “guides, not answers”, as “tin-openers” rather than “the dials on the dashboard of a car”, for “systematic thinking, rather than the rigorous allocation of blame”, for “learning rather than control”’” (Jackson 1998).

6.2     Building the partnership.  But this implies that there is a long-term relationship.  In my view, it is essential that we commit ourselves to developing a quality infrastructure through time.  The UK skills problem is long term and deep seated.  Our problems of technical education have been noted by Adam Smith and Alfred Marshall: the 1884 Royal Commission was established to look into the problem.  The Professor of Social Economics at LSE wrote illuminatingly on the subject as it faced the Macmillan government (Williams, 1963).   There are no quick fixes.  Skills gaps and social exclusion will only be addressed if we regard the existence of strong institutions in vocational education as being as important as it is in the university sector, in hospitals or in the private schools.  We must stop regarding colleges as just “suppliers”, in some way part of the problem rather than part of the solution.

6.3     Stability and continuity is essential for success.  No-one will invest in expensive equipment or new buildings if they are likely to lose a contract in  a year or so.  Allocating high volume courses to the cheapest provider will not help, either.  Specialist and high level provision feeds off general high volume courses: selling low level carpentry courses to the lowest bid imperils stone-masonry or stained glass.  Expertise in meeting the needs of disaffected, or ways to challenge the most able, are built through time on the basis of commitment to a community.  The collaborative local system implied by Learning Partnerships cannot work without stability of institutions – and lifelong learning will be strengthened if the college is still there when people come back to it.  This is not an easy way out: I have found in my own job that some stability is required to chase down quality and cost issues.  Real improvement is a slow, slogging process.  Dramatic changes in PIs/targets are either fraudulent or recovery from dreadfulness.

6.4     An FE Principal arguing for stability and continuity could be seen as self-serving, but it is important nonetheless.  Change in the public sector can be swifter than in markets.  Colleges have been effective in responding to secular change – the collapse of apprenticeship, the decline of manufacturing and mining, the feminisation of the workforce, the coming of information technologies.  What has destabilised them has been new funding systems or changes in government priorities.  Policy is bound to be capricious – whatever happened to Regional Advisory Councils, the Education Reform Act, ET, NAFE planning, the ILEA, polytechnics, the MSC, WRFE, grant-maintained schools, the demand-related element, TECs ?  There is even a good case for saying that strategic planning in the public sector, as currently configured, is an impossibility.  By contrast, institutional continuity is greater in capitalist markets than in the public sector. The dominant firms, with some exceptions like Microsoft, have been around for decades, sometimes more.  The moral in the public sector is to presume in favour of continuity.

6.5     Empower the providers  Then we must loosen up the control freak culture.  There must be greater freedom to approach problems in new ways.  This calls for a removal of in-year funding penalties (so colleges can ‘fail forward’ without immediate financial punishment).  The power of local LSCs to deliver a part of the budget outside the funding tariff must be used to support imaginative initiatives that hit government goals – reaching out to the disaffected, establishing connections between training and work, launching programmes that do not fit conventional qualification aims.  This must be managed in a way different to the existing reliance on challenge bids, enabling those delivering the service to develop ideas of what works on the ground.  I would argue for budgets based on agreed plans, not in-year achievement.  This would add purpose and interest to the preparation of Strategic Plans.  Working like that will also have some chance of re-moralising workers in the public sector, whose enthusiasm and commitment has held much together over the difficulties in recent years.

6.6     The argument for continuity and freedom is not a soft option, nor is it hostile to the idea of developing a new learning market.  Capitalism has moved on.  The relationship between purchaser and supplier in the advanced private sector is no longer one of ‘open outcry’, or stack it high and sell it cheap.  In today’s knowledge economy, companies learn together.  Supply chains are built carefully through time.  Business partners are involved in new designs, in developing new products.  This is not to say that buyers do not demand efficiency and quality from their suppliers – or that they never choose to drop a supplier altogether.  Indeed, it can be argued that the growth of performance indicators and the audit culture is because of agencies’ reluctance to let public institutions die.  As Julian Gravatt, the percipient Registrar at Lewisham College, has remarked, one cannot argue for an economy of strong public institutions whilst sheltering every weak one.  Perhaps the Faustian bargain for the next stage of post-16 development is looser control for less security.

6.7     Just as the idea of a spot market in training runs counter to what is really happening in modern capitalism, the audit culture is dissonant not only with any idea of well-regulated professionalism, but also with modern management theory.  Glance down Deming’s famous 14 points – “drive out fear”, “constancy of purpose”, “eliminate arbitrary numerical measures” – and see how many fit the compliance world.

6.8     Crude ideas of “purchaser/provider separation”, a false dichotomy between the interests of learners and those of colleges, and illusions that indicators exist which can give clean snapshots of performance will lead to more systems combining the worst aspects of markets – uncertainty and lack of accountability – with the inflexibility and bureaucracy of planning.  They arise from an outdated model of how modern markets are now configured, and will leave policy makers and their servants, like unsuccessful generals, fighting the last war.

7.       Conclusion
7.1     When I started this line of enquiry, I thought it was an interesting backwater that might be good for a few gags.  Over the past couple of months, I’ve come to another view – that this is actually a very important topic.  Discussions with interested colleagues in and outside education have indicated that figures are much less reliable, and gaming much more common, than is realised by the makers of high policy.
7.2     But the topic goes deeper than that, and asks what is the difference between public and private endeavours ?  How do we judge how well we are reaching our public goals ?  How can we balance the need for a stable development of public institutions against complacency and poor quality ?  This is not a trivial matter, and there are no clear answers.  It is my view that time will show that the bean-counters have led us up a cul-de-sac.  The way forward must involve regenerating the support and enthusiasm of those delivering public services to the purpose of genuinely improved performance and efficiency.  But we cannot avoid a difficult journey towards a re-invented public sector.

 

References and Further Reading

Adams, Sam (1995) ‘War of Numbers: An Intelligence Memoir’, Steerforth

Ball, Rob and Claire Monaghan (1996) ‘Performance Review:  The British Experience’, Local Government Studies 22(1): 40-58.

Campbell, D. T. (1979) ‘Assessing the Impact of Planned Social Change’, Evaluation and Program Planning 2:67-90.

Connelly, Michael (1996) ‘The Black Ice’ (Orion)

DETR/Audit Commission (1999) ‘Performance Indicators for 2000/2001: Best Value and local authority performance indicators for 2000/2001’

FEFC (1994) ‘Circular 94/31 Measuring Achievement’

FEFC (1999) ‘Performance Indicators 1997-98; Further Education Colleges in England’

Ghobadian, Abby and John Ashworth (1994) ‘Performance Measurement in Local Government – Concept and Practice’, International Journal of Operations and Production Management 14(5): 35-50.

Goodhart, Charles (1975) “Problems of monetary management: The U.K. experience” in: Courakis (ed), (1981) Inflation, Depression and Economic Policy in the West (Totowa)

Gray, Anne (1997) ‘Contract Culture and Target Fetishism.  The Distortive Effects of Output Measures on Local Regeneration Programmes’, Local Economy 11(4): 343-357.

Jackson, Annabel  (1998) ‘The ambiguity of performance indicators’ in  ‘Performance Measurement: Theory and Practice’.  Editors Andy D Neely and Daniel B Waggoner.  Centre for Business Performance, The Judge Institute of Management Studies.

Krugman, Paul (1999) ‘The Accidental Theorist’ (Penguin)

Likierman, Andrew (1993) ‘Performance Indicators: 20 Early Lessons from Managerial Use’, Public Money and Management Oct/Dec 1993

Marx, Karl ‘Theses on Feuerbach’ (written 1845, pub. 1888)

Moorse, Rosemary and Dixon, Stella (1999) ‘Measuring Performance – Improving Quality’, FEDA Bulletin Vol. 2 Number 11

OECD (1988)  ‘Measures to Assist the Long-Term Unemployed: Recent Experience in Some OECD Countries’.  Paris: OECD.

Orton, Joe (1967) ‘Loot’ in ‘Orton: Complete Plays’  Methuen (1998)

Perry, Adrian (1998) ‘Benchmarks or Trade-marks – how we are managing FE wrong’  Visiting lecture to University of Greenwich, unpublished.

Wallace, W. A. (1980)  The Economic Role of the Audit in Free and Regulated Markets. New York: University of Rochester.

Williams, Gertrude (1963) ‘Apprenticeship in Europe – The lesson for Britain’ (Chapman and Hall)

[1]  Note the increasing exclusion of difficult pupils from schools – which rose 500% between 1990/91 and 1995/96.  They get in the way of how we currently measure raising standards and have to go.

[2]  Another ‘measuring the wrong things’ story.  An angry MP demanded to know why her local Education Authority was at the bottom of the league for drawing down funds for nursery places, for eliminating outside toilets in primary schools, and in school maintenance grants.  The reasons are that (a) it has 120% coverage of nursery places and needs no more (b) it replaced outside toilets twenty years ago (c) the school buildings have been well maintained and require little remedial work.  Far from being the sign of a bad LEA, the league position was a mark of honour.

[3]  Perhaps we should take a leaf out of the book of the American National Foundation for the Teaching of Entrepreneurship.  They used a private eye to track down alumni of their inner-city owner-manager programmes !

[4]  Stories of Welsh primary schools that over-claimed pupils, and of doctors with implausible patient lists, both surfaced during the preparation of this paper.

[5] Independent on Sunday 19/10/98

[6]  In the TEC world, the number of NVQs per trainee can be boosted by putting able apprentices in for a number of awards – NVQ1, NVQ2 and NVQ3 shortly after one another.

[7]  For example, reported retention for colleges in deprived areas on NVQ3 level courses is 75%, with pass rates at 57%.  Give me a break.  There is another reason for reported achievement being lower than actual – namely the difficulty of collecting results from many examination boards for a transient adult student population.  But if the funding methodology favoured results ahead of retention …

[8]  FEFC inspectors rarely believe the published PIs and always send in a statistical hit squad before inspection.

[9] rumoured to be devised at the insistence of Michael Portillo when Employment Minister

[10] and the reason for this is that when a job outcome was defined (and paid) after seven days, numerous scams happened.  Again, the difficulties of establishing a quasi-market come to the fore.

[11] David Eade, a respected Principal at Barnsley, estimates that his college – famously efficient and lean – spends £1m a year on collecting and providing management information for funding agencies.

[12] and so more guidance is needed – and tirelessly supplied – for auditors.  Colleges framing self-assessment reports on audit quality will be helped by the FEFC’s list of 119 (count them) points to check.  A similar picture is to be found in equal opportunity, where a college filling out the FEDA good practice check will find itself printing off upwards of fifty pages.

[13]  In a tragic recent example, the Bristol baby heart operation scandal was unveiled by a whistleblower, not by analysis of PIs.

[14]  An interesting interchange from the PAC investigation into Halton College (26/4/99) :

“ (Mr Twigg) Does it concern you that since 1992 roughly 400 circulars plus have been sent to colleges, many of them relating to financial regulations, audits, financial matters and yet we are still seeing the problems we are looking at today in terms of Bilston, in terms of Halton and you may be aware that today we have heard on the news about Wirral College and major problems with overspending there. Is it not the case that the quantity of advice from the FEFC has not been matched by the quality?

(Mr Bichard) We should always be concerned at the amount of advice which is going out both to colleges and to schools. We have, however, been going through the establishment of a new sector, we have been setting up new funding arrangements, we have had need to offer an enormous amount of guidance to colleges through the FEFC so I am not surprised at the amount which is going out. We should always be concerned about the quality and the clarity of that advice.”

[15] The new managerial wisdom is strikingly like the ways we use to manage our own poor performers in-house – setting clear goals and targets and checking all the time.  This gives a clue to how much the policy makers now trust the public servants below them.

[16] a classic example of this is the absence of any workable electronic register system that counts student attendance.  Colleges are actually worse at this than they were in the early ‘90s.

[17] following the report of the Kennedy Committee, premium payments are now made for students from deprived areas.

[18] see FEFC Circular 94/31.  Yes, it is that long ago.

[19]  There are only two I can think of.  One is the compulsion for a college in financial category C to agree a rescue plan with the FEFC; and there is a ‘slap on the wrist’ for colleges getting unsatisfactory inspection grades in that they may not expand areas of poor provision until improvement is confirmed by re-inspection.

[20]  A bit like allowing clubs to enter the Football League when their ground safety is up to scratch ?  Or requiring contractors to follow building regulations ?

[21]  Between 1990 and 1993, college funding was based on FTE students.  Enrolments boomed.  From 94 onwards, it was based on funding units.  Unit totals boomed, but actual student growth slowed to a crawl.