Tuesday 24 July 2012

How do we measure an individual's research impact?

This is a subject that I thought I already understood fairly well, having expressed on this blog and elsewhere various sceptical views about the whole notion of impact, including some of the commentary about it that has come out in connection with REF discussions and the like.

Some of the debate about impact has certainly been pitched at a fairly low level, and has often taken us little further than the well-worn territory of either assessing 'research quality' - of an individual or a department (or, more formally, a unit of assessment) - in quite subjective terms, or making the assessment on the basis of publication numbers, journal rankings and citation counts. Not only has this not advanced matters much beyond the old RAE system, but it is open to all the familiar arguments: how many articles to count; how far back to go; how to judge the quality of an individual article, etc. Like a lot of important things in life, research quality seems to be one of those things where 'we know it when we see it'.

However, a few days ago I was intrigued to find that I didn't really understand this topic of measuring research impact as well as I thought. It happened like this. I had been approached by a couple of academic bodies in Italy to help them with two tasks: (i) assessing the research quality of individual academics, by reading and ranking a selection of their papers; (ii) assisting Italian universities to make new appointments at professorial level. I presume my name was put forward by a colleague in Italy, but in order to take things forward the Italian academic bodies naturally asked me to provide my CV and to complete a couple of simple forms. Whether the Italians ask me to do these academic tasks in the end remains to be seen, but completing their forms was unexpectedly challenging.

One of the forms asked for the h-index and g-index of my research, which left me totally baffled as I had never heard of either. Just to deal with the forms I did a quick Google search using the string 'Paul Hare h-index'. To my surprise this came back with h = 5 and g = 7. I still had no idea what they meant, so I just put them on the forms and sent them off as requested.

Once that was done, though, I had to investigate further.

It turned out that the h-index originated in a 2005 paper by the physicist Jorge Hirsch, published in the Proceedings of the National Academy of Sciences. His idea is a really simple one. If a researcher has an h-index of 22, say, it means they have 22 papers with at least 22 citations each, but not 23 papers with at least 23 citations each. In other words, the h-index is the largest number h such that h of the researcher's papers have each been cited at least h times. Thus someone might have published 80 or so papers, most of which have only a handful of citations, or even none. With 80 papers, the h-index could turn out to be 25, if that researcher - among all his/her output - has 25 papers with at least 25 citations each (and not 26 papers with at least 26 citations each).
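To make the definition concrete, here is a minimal Python sketch of the calculation. The function name h_index and the citation counts in the example are my own illustrative assumptions, not taken from any real database or from Hirsch's paper.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each.

    `citations` is a list with one citation count per paper (hypothetical data).
    """
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            # Counts only get smaller from here, so no larger h is possible.
            break
    return h


# Illustrative case matching the example in the text: 80 papers, of which
# 25 have 25 citations each and the remaining 55 have only 3 each.
example = [25] * 25 + [3] * 55
print(h_index(example))
```

Notice that the 55 lightly cited papers make no difference to the result, and neither would extra citations to the top 25: the index only looks at where the ranked list crosses the line 'citations = rank'.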

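The g-index, by the way, is a modification of the h-index that gives more weight to a researcher's most highly cited articles: it is the largest number g such that the g most-cited papers have together received at least g squared citations.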
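For comparison, here is a similar sketch for the g-index, assuming the usual definition: the largest g such that the g most-cited papers have together received at least g squared citations. The function name g_index and the sample counts are hypothetical, and this version assumes g cannot exceed the number of papers (some variants relax that).

```python
def g_index(citations):
    """Return the largest g such that the top g papers total at least g*g citations.

    `citations` is a list with one citation count per paper (hypothetical data).
    This sketch caps g at the number of papers.
    """
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    running_total = 0
    g = 0
    for rank, cites in enumerate(ranked, start=1):
        running_total += cites
        if running_total >= rank * rank:
            # The top `rank` papers together reach the rank-squared threshold.
            g = rank
    return g


# A made-up researcher with five papers.
sample = [10, 5, 3, 1, 0]
print(g_index(sample))
```

Because the test is applied to the running total rather than to each paper separately, one or two very heavily cited papers can pull g well above h, which would be consistent with my g values coming out larger than my h values.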

What you realise when you think about all this is that anyone's measure of h (or g) will depend on the specific database that is examined - what papers are included in it, what citations are counted, etc. This means it's not easy to get a wholly reliable measure of h, because the available research publication and citation databases are nowhere near complete enough. Whatever database is used, a given researcher's h value can only stay the same or rise over time as citations accumulate, implying that it's probably not a great indicator for early-career researchers. Moreover, if different databases are used to investigate a given individual, one can expect to find different h values. All one can say, then, assuming that databases miss papers and citations rather than invent them, is that the person's true h-value is at least as large as the largest value one manages to find.

To illustrate the effect of using an alternative database, I downloaded an amusing bit of free software called Publish or Perish (Harzing, A.W., 2007, Publish or Perish, available at this link). This searches through the Google Scholar database, and using this approach I found the following values for my research: h = 20 and g = 33. I still don't really know whether these values are good, bad or indifferent, as I haven't checked the corresponding values of any colleagues and haven't seen these research performance indicators reported before.

Should we try to use indicators like this in the forthcoming REF? It seems to me worth thinking about, despite the practical difficulties I have mentioned above.
