17 Sept 2010

Open access: the saviour for Chinese journals?

Discussing the announcement that the Chinese government is going to crack down on poor-quality journals, a Nature editorial puts forward the welcome view that moving towards open access might be the best approach for Chinese publishers:

"The best opportunity to revive Chinese publishing, whether in Chinese or English, probably lies in an open-access platform — increasingly popular in Western journals. Many Chinese journals already charge authors a publication fee, so should be able to make a smooth transition to the open-access model, in which they are supported by fees rather than by subscription revenues."

5 Sept 2010

What is the scientific paper? 4: Access

This is a guest post by Joe Dunckley
Completing the series exploring the question "what is the scientific paper?", reposted from my old blog, and originally written following Science Online 2009. As I reminded people at the time, these were just my own half-thought-through ideas, not the policy or manifesto of anyone or anything I'm affiliated with.
A friend of mine once told me how much she hated "the proliferation of these bioinformatics papers." All these simulations and models of what happens in real life. All of it utterly useless -- since when was the stuff that comes out of a computer worth anything? None of it even remotely reflects anything that happens in real life. And the methodology papers -- the endless methodology papers. They're making yet another neural network and modifying a Bayesian something-or-other, when they haven't even found where they left the Markov models yet! How can you have so many of these methodology papers? Clearly they can be no more than incremental advances. (Of course, BLAST is an exception -- it's old enough to have been around and heard of when we were undergrads, and is therefore a perfectly legitimate and mainstream molecular biology tool.)
Similarly, some people still voice their skepticism about the need for open access. Access isn't really a problem, is it? These open access advocates are just making facile arguments about how the people who pay for scientific research should have some kind of say regarding its dissemination.[1] Come on, really, show me, who is in want of access? Everyone (everyone who matters) already has subscriptions, right? Access isn't a problem. And the open access "movement" isn't an ideology. It's just another business model.
And then, yesterday afternoon m'colleague shouted for advice handling an author of a scientific manuscript who was questioning the need to deposit her not inextensive collection of genomes in a database. I don't blame the author for wanting to get out of the chore -- she had a lot of data, and depositing it will be a dull repetitive task. M'colleague was trying to write a letter and struggling to put into words the reason why we mandate deposition of sequence data, and why merely including them as supplementary MS Word files isn't good enough.
These attitudes, you will have noticed, have one particular thing in common: they all completely miss the fact that the biomedical sciences have moved on in the past quarter century. In almost every field (let's not wake the poor taxonomists) the science being done and the science being published today are not quite like that of 25 years ago. Even if the science of today were like that of 25 years ago the case for open data sharing would be strong enough; as it is, it's simply absurd to think that open sharing of data isn't worth doing.
--
Individual scientific papers -- the basic units of scientific research -- are rarely exciting; rarely even interesting. Where nerds get excited about science, it's where science offers a beautiful explanation for how the world works. And scientific papers don't do that. They offer some speculative interpretations of data on obscure problems in obscure systems. It is the literature as a whole -- hundreds of dull papers put together -- which tells a complete and exciting story. The sum is more than the parts -- the theory is more than the data.
In the field I know best -- cancer cell biology -- 99 in 100 papers published are tedious details, discovered with a science-by-numbers formula. The (anti-)proliferative effect of one abbreviation interacting with another abbreviation in three-letter-acronym-and-a-number cells, concluding with a suggestion that the authors' work might have implications for cancer treatment and a note that further work is necessary. Or even better, the complete lack of anything interesting at all happening when the first abbreviation interacts with the second. The abbreviations and their effects have been studied, in combination with others, in all of the most widely used three-letter-acronym-and-a-number cell-types, and somebody is scraping the barrel.
But the tedious details put together add up to an understanding of how the cell works and how it goes wrong. The details could be put together by a human, going through the thousands of papers on the topic, assembling the facts and finding the trends. Or, more plausibly, given the amount of tedious details out there, they could be assembled by a computer, with a database and a clever algorithm. Except that four in every five of those tedious details, discovered at great expense to taxpayers, will be inaccessible to that clever algorithm. They will be locked away in the basements of university libraries, hidden in human-readable prose that humans will never read. The results of billions of pounds of work searching for an understanding of cancer and a better chance at defeating it will be worthless, because they will never be amongst the parts that add up to the greater whole.
So I told m'colleague to explain to her author that unless she deposits her genome sequences, the last three years of her professional life will ultimately have been wasted. An average paper in a high-volume mid-tier journal that will be glanced at by a few colleagues when published. Another bullet point on a CV. They will never further science beyond that. They won't contribute any important discovery or real advance to the field. They will be forgotten. Nobody will seek them out when the time comes to make the leap forward.
That's just where biology is at these days: lots of tiny fragments of data, spread thin through the literature. The most interesting and important unanswered questions will require the synthesis of that work. The most interesting and important questions can't be answered without the heap of data that has already been produced, but which is locked away.
On machine-readable data, Mike Ellis says, "at some point in the future, you'll want to do 'something else' with your content. Right now you have no idea whatsoever what that something else might be." This is especially true in science: at some point in the future, tedious data obtained at great expense, as part of the bigger picture, will finally be important and valuable. Right now, you can have no idea how important.
Publishers are allowed to get away with keeping science closed, holding it back, and wasting public money because there are still sufficient numbers of scientists who let them -- who have themselves failed to grasp that the world and science have changed.

2 Sept 2010

What is the scientific paper? 3: The metric

This is a guest post by Joe Dunckley
Continuing the series exploring the question "what is the scientific paper?", reposted from my old blog, and originally written following Science Online 2009. The topic of this post was originally discussed on FriendFeed, here.
On my recent post, what is wrong with the scientific paper?, Steve Hitchcock said that the most important problem with the paper is access, and that when we solve the problem with access, everything else will follow. I agree that access is hugely important, I recognise that we haven't won everyone over yet, and I know we do have to continue working away at the access problem, so I will devote a future post to reviewing that topic. But having thought about it a little longer, I am more convinced than ever that it is not access that is the big problem which is holding back the paper and journal, and open access is not the solution from which all others follow and fall into place.
There is one big problem, a single great big problem from which all others follow. The great ultimate cause is not, as I said last week, the journal. It is more basic than that. It is the impact factor. The journal is the problem with disseminating science, but the reason it has become the problem, the reason people let the problem continue is the impact factor. The impact factor is a greater problem than the access problem, because the former stands in the way of solving the latter. The impact factor is a great big competition killer; by far the greatest barrier to innovation and development in the dissemination of science.
Scientists can look at all of the problems with disseminating science, and they can look at us proposing all of these creative and extravagant solutions. They might agree entirely with our assessment of the state of the scientific paper and of the journal, and they can get as excited as us at the possibilities that flow from new technologies. But blogs and wikis are mere hobbies, to be abandoned when real work starts piling up; databases a dull chore, hoops to jump through when preparing a paper. So long as academics can get credit for little else besides publishing in a journal -- a journal with an impact factor -- any solution to publishing science outside of the journal will never be anything more than a gimmick, a hobby that takes precious time away from career development.
In a worse position than blogs and wikis, where cheap easy products are openly available, are the wonderful but complicated ideas that would benefit from financial backing to implement — the databases, and open lab notebooks, and the like — but which are currently artificially rendered unviable because no scientist could ever afford to waste time and money on a product that isn't a journal with an impact factor. No scientist can try something new; no business can offer anything new. Even such an obviously good idea and such a tame and simple advance as open access to the scientific paper has taken over a decade to get as far as it has in part because it takes so long for start-up publishers with a novel business model to develop a portfolio of new journals with attractive impact factors.
I am not a research scientist. I don't have to play the publish-or-perish game. So I have no personal grudge; no career destroyed or grant lost by rejection from a top-tier journal. It doesn't bother me how much agony, absurdity, and arbitrary hoop-jumping research scientists have to go through in their assessments and applications. But it bothers me greatly that, by putting such weight on the publication record — not actual quantity and quality of science done, but a specific proprietary measure of the average impact of the journals (and journals alone) that it's published in — public institutions across the world are distorting markets, propping up big established publishers, and destroying innovation in the dissemination of science. End the malignant metric and everything else will follow.