Article-Level Download Metrics – What Are They Good For?

September 18, 2009 Liz Allen Publishing

This guest blog, on the addition of article usage data to the PLoS Article-level metrics program, is written by Cameron Neylon, a biological scientist who is an Academic Editor on PLoS ONE, runs the blog Science in the Open, and works in the UK at the Science and Technology Facilities Council. He has also created a video on the previous release of Article-level metrics, which focuses on how to use social bookmarks.

We live in a world where we demand instant gratification and the most up-to-date information. Getting information faster than anyone else, and learning how to use it to one’s advantage, is the best way to get ahead. At the same time, we are drowning in information, because we cannot effectively filter the growing volume and diversity of incoming data.

As scientists, we know there are more relevant papers out there than we can possibly read. And that is before we consider the world beyond papers: data, conferences to attend, people to speak to. We need ways to bring the things that are relevant to us to our attention so that we can take a closer look. As authors of papers, and as applicants for promotion or for our next job, we also need information on how important our papers are and how well they are doing.

Traditionally, we might have looked to the name of the journal as a way of determining how important a paper was, but this metric is breaking down. It is breaking down due to the development of new types of journals, such as PLoS ONE, which takes papers from across the spectrum, from groundbreaking to those only of interest to a small group of people, and which is on course to become the largest journal in the world next year. It is breaking down because we now find papers primarily by searching whole databases, not by scanning a single issue’s table of contents. But above all, it is breaking down because of a growing recognition that judging a paper, or even a person, by the journal in which their work appears is not a good measurement. We need to measure the paper itself, not the container.

Article-level metrics recognize the need to measure individual papers, and PLoS continues to lead the way with this week’s release of download counts for every paper in the PLoS stable, complementing the existing information on citations, bookmarking, and online coverage. So what do download metrics add to this mix?

Primarily they add speed. Citations take a long time to build up and be registered, so although they may provide a clearer idea of the influence of a paper in the long term, they are not a great deal of use in the weeks and months after a paper is released. Online bookmarking can be more rapid but isn’t mainstream enough to provide good statistical information yet. But (just about) everyone hits the Web site and downloads a paper, whether they are an online geek or not, making the numbers a good representation of community use.

On the other hand, downloads are sensitive to new kinds of effects. Hit the front page of Slashdot and your paper will get a huge spike in downloads. Having it set as reading for a large class could skew the numbers away from “serious” downloads, and there is the distinct possibility of Web crawlers going mad and downloading a paper thousands of times. And all of this is before we take into account the potential for people to try to “game” the counts by sitting in their office downloading the paper over and over again. Sad, but it will almost certainly start to happen.

Aside from the robots and the gaming, my feeling is that these are things that we should be measuring. Actual public interest in your paper? Real educational value being gained from it? These are things that you want to know about. It would be even better if we could separate these out, and I find the prospect of comparing download and citation metrics in the future quite exciting. But in the meantime, download counts give us a new measure that we can compare with what we already have available.

And, at the end of the day, that is how I think we should see these new metrics: as more information. It isn’t yet clear how best to use these measures, but it is up to us as scientists, who, after all, make our living out of measurement and analysis, to figure that out. The approach PLoS is taking, of simply presenting the data and as much of it as possible, is to me exactly the right one. It is not the responsibility of journals to tell us how to measure and report things; it is up to us.

In the end, there is only one way of determining whether a particular paper is important and relevant to you personally, and that is to read it, digest the information, critically analyze it, and come to your own conclusions. You can’t avoid this and you shouldn’t. Where download and other article-level metrics can help is in deciding how much time you want to invest in a given paper. We need better ways of making that decision, and more data can only help.