Green and Gold
2 July 2012
Like most universities in Britain, we have an open access repository where anyone can go, online, to find publications and reports by researchers at Salford. Since we launched the University of Salford Institutional Repository (USIR) in 2009, we have built up a digital collection of 7,375 items, of which 4,105 have a full text document attached, and of which 2,616 are publicly accessible. Over the last three years, downloads of papers in our collection have increased tenfold, from 29,135 in 2009 to 297,635 over the past year. There is good evidence, from both the UK and elsewhere, that when research results are made freely available in online repositories, rates of citation rise. And increased rates of citation are a good indication that new knowledge is spreading. This is the energy that drives the research and innovation system.
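As a rough check on the figures quoted above, the proportions and the growth in downloads can be worked out directly. This is only a sketch: the variable names are illustrative, and because the text does not say whether the 2,616 publicly accessible items are counted against the whole collection or only against the items with full text, both readings are shown.

```python
# Figures as quoted in the paragraph above; names are illustrative only.
total_items = 7_375
with_full_text = 4_105
publicly_accessible = 2_616
downloads_2009 = 29_135
downloads_past_year = 297_635


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of the collection with a full text document attached.
full_text_share = share(with_full_text, total_items)

# Two possible readings of "publicly accessible" (an assumption either way).
public_share_of_all = share(publicly_accessible, total_items)
public_share_of_full_text = share(publicly_accessible, with_full_text)

# Growth in downloads between 2009 and the past year.
growth_factor = round(downloads_past_year / downloads_2009, 1)

print(f"Full text attached: {full_text_share}% of all items")
print(f"Publicly accessible: {public_share_of_all}% of all items")
print(f"Publicly accessible: {public_share_of_full_text}% of full-text items")
print(f"Download growth: about {growth_factor} times")
```

On these numbers, a little over half of the collection has full text attached, and the growth in downloads is slightly more than tenfold.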
USIR is a medium-sized university repository and is “green”. This means that, for the most part, it contains either summary data about publications (rather than full text) or final drafts of publications which their authors have posted before submitting the text to a journal for publication. Submitting the final draft avoids violating copyright, which an author usually signs away to a publisher at the last stage of the publication cycle. Green repositories also contain unpublished, or self-published, reports and information; the so-called “grey” literature which is increasingly important, given new and easily available forms of digital distribution.
“Green” repositories are of significance in making research results widely available, and will continue to be so. But their main limitation is that they do not – and cannot – contain everything. In particular, they cannot contain the “version of record” when this is protected by a copyright that has been ceded to a publisher, and the publisher requires payment to view the paper, either via a journal subscription paid by an individual or an institution, or by means of a “pay to view” charge via a commercial website. While access to the “green” version of a research paper is very useful in scanning a field for new work, only the version of record has research results corrected after review, final forms of diagrams, tables and photographs, and the final pagination for the purposes of citation. A researcher finding a new and exciting paper in a “green” repository will often need to get access to the published version as well, either via an online subscription to the journal licensed after payment by a library, or through a one-off payment via the publisher’s web page.
A further limitation of “green” open access – and one of growing significance – is the barrier it poses for text and data mining. Text and data mining have made remarkable advances in recent years, and now digital robots are able to trawl vast amounts of widely distributed information, finding and sorting that which is relevant to a search and stripping numbers from tables and shapes and patterns from images. Social media enterprises such as Facebook and Google do this to everything you write online via their sites, sifting out commercial information (unless you enable privacy controls). But, increasingly, key research areas such as computational biology depend on similar forms of digital robotics to collect information vital for drug design, disease control, economic trends or climate change. To be effective – and accurate – scientific text and data mining systems need to trawl versions of record of research publications, and this will not work when thousands of pay-walls get in the way, or when restrictive formats, such as PDF, are used in an attempt to restrict re-use of published papers.
These limitations are increasingly making the combination of “green” repositories and subscription publishing an unsatisfactory compromise. And this compromise may also be keeping the costs of subscriptions (or licences) to academic journals artificially high. Because publishers price whole journals (and bundles of journals, known as the “Big Deal”) as packages, and negotiate with large consortia of libraries, there is little direct relationship between researchers, who choose where to submit their articles, and their publishers. If a quality retailer, perhaps selling sports equipment, priced its products too high, customers would go elsewhere and the reputation of the shop would fall. It would have to innovate, lower its prices or go out of business. But until recently the ever-rising price of academic journals has been unclear to many academics because subscription costs are paid through libraries, and electronic journals appear to be free at the point of use as long as the researcher has a recognised and authorised electronic identity. While many publishers’ prices have been reasonable, others have not. Elsevier’s reported profit margins of close to 40 percent have, rather dramatically, alerted the broader academic community to what university librarians have been saying for years – that the current system of distribution and pricing is unsustainable.
An alternative – already well established – is “gold” open access, in which publication costs are paid before publication or by other means, allowing the publisher to permit wider distribution without damaging loss of revenue. But the term can be misleading, because it has been used to embrace rather different approaches. In some cases, “gold” means upfront payment for limited distribution rights; a paper may be distributed but not re-used in any way, including text or data mining, without further charges. This form of gold does not help much with current challenges and may just be a way of adding publication charges, in parallel with licensing and pay-wall costs. Gold can also mean the upfront payment of all costs as well as a reasonable margin. This should allow unrestricted distribution and re-use rights, signalled by the Creative Commons CC-BY licence. Rather than calling this approach “gold”, it is probably better, if more clumsy, to refer to this as full Article Processing Charges, or full APCs for short.
A central point of contention is whether or not publication with a CC-BY licence (and therefore with no restriction on distribution or re-use) will result in libraries cancelling journal subscriptions. It may seem intuitive that this would be the case; if all the major research papers are freely available, why subscribe? Some publishers have argued vehemently that, unless APCs are high, their industry will die. Some have advocated national licences as an alternative. These would make all research publications available while protecting publishers’ revenue streams through mandatory subscriptions. Some publishers want a regulated system, in which open access to publications for which there have not been full APCs is prohibited for considerable periods of time. This is often conceptualised as a prohibition of open access for the “half-life” of the article – the time over which half of all eventual readership will happen. In some cases, this half-life could be five years.
In reality, there is little independent and systematic evidence that academic journal subscriptions are cancelled if parts of the journal content are made open access, and some studies suggest the opposite. This could be because academic journals often contain valuable editorial content that is additional to research papers, because researchers often need to browse through whole journals to keep up with specific fields, and because libraries need to ensure that they have full and coherent runs of journal titles online, as they did traditionally with paper copies. Innovations by a range of publishers are showing how new business models can move with, and anticipate, the changing world that is enabled by new digital technologies.
Important here will be the maximum embargoes that funding bodies such as the Wellcome Trust and the research councils permit on research results arising from their grants. In particular, the public interest in publicly-funded research requires that these restrictions are short-lived, perhaps no longer than six months. It will also be important for research councils to follow the lead of the Wellcome Trust in providing appropriate mechanisms for funding APCs. These moves will provide the conditions for a “mixed economy” in which the transition towards full APCs and CC-BY licences continues and in which academic journal prices are not kept artificially high because subscriptions double up as toll-gates, restricting access to essential research results that have often been funded with public money.
These are the issues that we have addressed in our report, “Accessibility, sustainability, excellence: how to expand access to research publications”, commonly called the Finch Group Report. In essence, our proposals seek to map out the transition, which could take a decade or more, from the current uneasy combination of “green” open access, various forms of “gold”, online subscriptions and pay-to-view, towards a system in which all APCs are met up-front, with CC-BY licences, across all major science and knowledge systems. Given the inherent nature of digital production and distribution, this is as inevitable for academic publishing as online music distribution and digital video on demand are for the entertainment industry. What are needed are careful and sensible policies that steer this transition, in the best interests of researchers and those who depend on the new knowledge that they make and need to share.
Some first responses to the Finch Group Report have criticised us because, somehow, we did not propose that publishers’ prices should be regulated. But it’s difficult to see how this could have been done, since there is no evident national mechanism for price control, let alone for the sort of international agreements that would be required, since many leading academic publishers are not based in Britain.
Even if price controls were possible, they could be counter-productive. It is a reasonable expectation that moving to full APCs and CC-BY licences – particularly if this is required by funding councils for research that is the outcome of public funds – could promote innovation and new business models that will drive publishing charges down. There is good evidence that this is already happening.
The Report has also been criticised because it attempts to model the costs of the transition, in an environment in which research funding is extremely tight. But it’s important to go to the detail here. Although the headline figure in the Finch Group Report is £50-£60m per annum, this is a bundle of associated costs. The estimate for the move to full APCs is £28m per annum. This deliberately assumes average APCs that are 20 percent higher than the average recorded by the Wellcome Trust (which is the best available model we have for future arrangements), and that the rest of the world moves more slowly than the UK (meaning that universities have to maintain subscriptions in parallel with open access, so that researchers can get access to research published elsewhere). It is quite possible that innovation will drive down the average costs of APCs more rapidly than we have assumed, and that the rest of the world will adopt full APC systems more quickly. If this happens annual transition costs will decline more rapidly; they are unlikely to rise above those published in the Finch Group Report, because we have been deliberately gloomy.
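The arithmetic behind these figures can be laid out as a short sketch. It uses only the numbers quoted above, plus one loudly stated assumption that is not taken from the Report: that the APC component scales in direct proportion to the average APC. The variable names are illustrative.

```python
# All amounts are in millions of pounds per annum, as quoted above.
headline_low_m, headline_high_m = 50.0, 60.0  # headline range
apc_component_m = 28.0                        # estimate for the move to full APCs
apc_uplift = 0.20                             # assumed APCs 20 percent above the Wellcome average

# What remains of the headline bundle once the APC component is taken out.
other_costs_low_m = headline_low_m - apc_component_m
other_costs_high_m = headline_high_m - apc_component_m

# ASSUMPTION (not from the Report): the APC component is proportional to the
# average APC, so removing the 20 percent uplift divides it by 1.2.
apc_without_uplift_m = apc_component_m / (1 + apc_uplift)
cushion_m = apc_component_m - apc_without_uplift_m

print(f"Other associated costs: £{other_costs_low_m:.0f}m to £{other_costs_high_m:.0f}m")
print(f"APC component at the Wellcome average: about £{apc_without_uplift_m:.1f}m")
print(f"Cushion built in by the 20 percent uplift: about £{cushion_m:.1f}m")
```

Under that proportionality assumption, the 20 percent uplift alone accounts for a few million pounds a year of the APC estimate, which illustrates how cautious assumptions feed into the headline figure.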
An important piece of research by Alma Swan and John Houghton, commissioned by the UK Open Access Implementation Group and now published, adds to the Finch Group analysis by modelling the implications of the transition from “green” to “gold” for four different kinds of universities. The key point is that, once this inevitable transition is completed, all universities will have lower overall costs.
UCL’s David Price has argued that we have got it only partially right, and that the best course is to stay with the present combination of “green” and subscription publishing, augmented by licence extensions to allow, for example, biomedical researchers in the NHS to get the same “free at the point of use” effect as their colleagues who have the good fortune to be affiliated with universities. In fact, our report recommends licence extensions such as these (these are part of the composite headline figure of estimated costs during the transition). But it’s difficult to see how this compromise can benefit anyone, because versions of record will remain protected by licence fees and pay-walls, important advances such as text and data mining will be inhibited, and some publishers’ charges will probably remain artificially high.
The next moves rest with the research councils (in setting their policies for publication of research paid for by public funds), with HEFCE (in deciding the rules of recognition for research outputs that can be submitted in the 2020 Research Excellence Framework or its equivalent), and with Government (through enabling policy). A key aspect of these policy decisions will be the maximum period for which a research publication, paid for by public money, may be embargoed before open access distribution under a CC-BY licence, since this will be the accelerator for the speed of transition to full open access publishing.
Whatever the details of the outcome, it needs to be one that extends and facilitates open access to research results. Without open access, huge potential advances in research and the dissemination of new knowledge will be severely constrained.
Accessibility, sustainability, excellence: how to expand access to research publications (the Finch Group Report): available at http://www.researchinfonet.org/publish/finch/
Richard Poynder: “The Finch Report: UCL’s David Price Responds”. http://poynder.blogspot.co.uk/2012/06/finch-report-ucls-david-price-responds.html
Wellcome Trust responds to Finch Report on open access: http://www.wellcome.ac.uk/News/Media-office/Press-releases/2012/WTVM055650.htm
Alma Swan and John Houghton, Going for Gold? The costs and benefits of Gold Open Access for UK research institutions: further economic modelling. Report to the UK Open Access Implementation Group. June 2012