by MERLIN CROSSLEY 

I saw a question on Twitter – what one thing would you do to improve academia?

One of my colleagues answered ‘get rid of metrics’.

Several people have suggested this.

I keep thinking about it.

But metrics persist partly because “if you can’t measure it, you can’t manage it” – or, since some people dispute that, perhaps “if you can’t measure it, you can’t prioritise it, invest in it, allocate scarce resources to it, justify those allocations, or sustain and increase the funding”.

Ideally, measuring things transparently provides protection against disputes and disharmony. In a perfect world, metrics would also open doors for outsiders who can indisputably demonstrate their worth. Sometimes good scores drive inclusion.

But the world is still not perfect.

Firstly, the measurements are always imperfect. One type of error is simple inaccuracy. That’s something we’re all used to, and we can take degrees of certainty into account when making judgements.

The bigger problem is that most metrics measure the wrong things. Put simply, numbers are designed to measure quantity, not quality.

This creates perverse incentives that ultimately drive the wrong behaviours.

This made me think about what we measure. For research we measure quantities, like the number of papers and citations, and then attempt to overlay these with a quality measure using impact factors and H-indices.

But, of course, these are all interrelated. They are all just quantity measures, and the relationships turn out to be remarkably simple. Apparently, someone’s H-index is typically about half the square root of their total citations, or a third of their total number of papers. An impact factor is just the average number of citations per paper over two calendar years.
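To make the arithmetic concrete, here is a minimal sketch in Python (my own illustration, using made-up citation counts for a hypothetical researcher, not data from any real record) showing the H-index as a pure quantity calculation, alongside the two rules of thumb above:

import math

def h_index(citations):
    # The H-index is the largest h such that at least h papers
    # have at least h citations each.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical citation counts for 27 papers by one researcher.
citations = [120, 80, 45, 40, 33, 30, 25, 20, 12, 9, 7, 5, 4,
             3, 3, 2, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0]

print(h_index(citations))             # 9    - the H-index itself
print(math.sqrt(sum(citations)) / 2)  # ~10.6 - half the square root of total citations
print(len(citations) / 3)             # 9.0  - a third of the number of papers

Being rules of thumb, the estimates will never match an individual record exactly, but every number involved is a count of something. A journal’s two-year impact factor is the same kind of quantity: the citations received in one year by the papers the journal published in the previous two years, divided by the number of those papers.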

To me the problem is that quantity measures end up masquerading as quality measures.

The Declaration on Research Assessment (DORA), the Leiden Manifesto for research metrics, and other statements warn us not to take metrics too seriously. To me the best way to heed this warning is to recognise that metrics measure quantity, not quality. Measuring quantity can be part of what we do, but we should remember that it should never be all that we do.

When it comes to teaching – defining what good teaching is, and what we want to encourage and invest in – things are equally challenging, if not more so.

It is probably easier to answer the question – what is art? – than what is good teaching?

Like most questions about quality, the answer is partly in the eye of the beholder.

I think one day, with large-scale electronic teaching, we may be able to measure “learning gain”, but at present we can’t really run double-blind controlled experiments, changing one variable at a time. So we rely heavily on student feedback to get a feel for whether the teaching is going well. In my institution we call these MyExperience surveys, to emphasise that we are asking the students about their experience, not asking them to evaluate the content or teaching (though it is a subtle distinction).

Then, despite everything, we tend to look at the numbers, i.e. the percentage satisfaction. And again one is tempted to see this as a quality measure – a measure of whether the teaching was good. But we don’t really know. The exams and assessments might tell us whether the students learnt something (if we also know their starting point), but the student surveys just tell us about the student experience. Knowing about the student experience is important, but it is not the same as knowing about good teaching.

We find it hard to communicate this distinction, though, and the percentage-agreement numbers quickly take over and gain prominence. Perhaps this is not too bad: here the “perverse incentive” may have a positive side and may drive up student satisfaction. But there is a constant worry that this may come at the cost of optimal learning.

Perhaps it would be more useful to treat the surveys as measures of quantity, not quality – to reduce their power and just look at how many students responded. That would tell us how many students each lecturer is reaching.

It’s an interesting thought – a bit like counting how many people read a book, watch a film, or download a music track. The count does not tell us the quality of the experience the book or film or band provides, but box-office sales do play a large role in determining which films, books, or bands become classics. Popularity measures, combined with expert review, can be helpful.

I also think institutions should thank lecturers who reach a lot of students.

But won’t this drive another perverse incentive?

Won’t lecturers try to maximise how many students fill in the surveys? Won’t people compete to teach in the large first-year courses? And above all this, won’t academics try to optimise the student experience?

I hope they will.

I’ve come to the conclusion that quantity measures are part of life. More than that, I’d say all measures relate to quantity: quality cannot be measured, only agreed upon. Numbers measure quantity and words measure quality. Definitions of quality can be manipulated to serve a variety of agendas, not all of them good, so quantity has a role to play.

Once one recognises that metrics relate to quantity, not quality, one thinks differently and becomes less enslaved to the numbers. Then one can begin to think about the question: what qualities do I want to drive, and how will I do that fairly, without conscious or unconscious bias? That is what I would call an art rather than a science, and something I’ll keep discussing with colleagues on Twitter and beyond.

Prof. Merlin Crossley

Deputy Vice-Chancellor Academic

UNSW SYDNEY

