4 Oct 2010

Editing Wikipedia - for scientists

Wikipedia is now one of the most visited websites, and is probably the biggest source of fully free information. Wikipedia, and Wikimedia in general, fit well in the "open movement" alongside open source, open access, and open data. Many people, including scientists, find Wikipedia invaluable and read it on a daily basis, and some have even used it as a source, but you may find that its coverage is wrong or scanty. You can shrug and move on, or you can fix it yourself - and leave it better for the next reader.

If you're not contributing to Wikipedia already, as a scientist you're very well placed to do so, as two of the main rules should be second nature: citing your sources and presenting the work of others neutrally. I could go into much more detail, but Darren Logan and his Cambridge colleagues have already written a brilliant guide in PLoS Computational Biology that is recommended reading for those who are as yet unfamiliar with the ins and outs of becoming a Wikipedian.

Joining PLoS ONE

I'm excited to say I've just started as an Associate Editor with PLoS ONE at the Public Library of Science, after freelancing with them since the beginning of the year.

It's interesting timing in the wake of a surge in submissions post-Impact Factor and the recent brickbats hurled at the journal by PZ Myers and David Gorski, but I'm looking forward to helping the journal go from strength to strength.

17 Sept 2010

Open access: the saviour for Chinese journals?

Discussing the announcement that the Chinese government is going to crack down on poor quality journals, a Nature editorial puts forward the welcome view that moving towards open access might be the best approach for Chinese publishers:

"The best opportunity to revive Chinese publishing, whether in Chinese or English, probably lies in an open-access platform — increasingly popular in Western journals. Many Chinese journals already charge authors a publication fee, so should be able to make a smooth transition to the open-access model, in which they are supported by fees rather than by subscription revenues."

5 Sept 2010

What is the scientific paper? 4: Access

This is a guest post by Joe Dunckley
Completing the series exploring the question "what is the scientific paper?", reposted from my old blog, and originally written following Science Online 2009. As I reminded people at the time, these were just my own half-thought-through ideas, not the policy or manifesto of anyone or anything I'm affiliated with.
A friend of mine once told me how much she hated "the proliferation of these bioinformatics papers." All these simulations and models of what happens in real life. All of it utterly useless -- since when was the stuff that comes out of a computer worth anything? None of it even remotely reflects anything that happens in real life. And the methodology papers -- the endless methodology papers. They're making yet another neural network and modifying a Bayesian something-or-other, when they haven't even found where they left the Markov models yet! How can you have so many of these methodology papers? Clearly they can be no more than incremental advances. (Of course, BLAST is an exception -- it's old enough to have been around and heard of when we were undergrads, and is therefore a perfectly legitimate and mainstream molecular biology tool.)
Similarly, some people still voice their skepticism about the need for open access. Access isn't really a problem, is it? These open access advocates are just making facile arguments about how the people who pay for scientific research should have some kind of say regarding its dissemination.[1] Come on, really, show me, who is in want of access? Everyone (everyone who matters) already has subscriptions, right? Access isn't a problem. And the open access "movement" isn't an ideology. It's just another business model.
And then, yesterday afternoon m'colleague shouted for advice on handling an author of a scientific manuscript who was questioning the need to deposit her not inextensive collection of genomes in a database. I don't blame the author for wanting to get out of the chore—she had a lot of data, and depositing it would be a dull, repetitive task. M'colleague was trying to write a letter and struggling to put into words the reason why we mandate deposition of sequence data, and why merely including them as supplementary MS Word files isn't good enough.
These attitudes, you will have noticed, have one particular thing in common: they all completely miss the fact that the biomedical sciences have moved on in the past quarter century. In almost every field (let's not wake the poor taxonomists) the science being done and the science being published today are not quite like those of 25 years ago. Even if the science of today were like that of 25 years ago, the case for open data sharing would be strong enough; as it is, it's simply absurd to think that open sharing of data isn't worth doing.
Individual scientific papers -- the basic units of scientific research -- are rarely exciting; rarely even interesting. Where nerds get excited about science, it's where science offers a beautiful explanation for how the world works. And scientific papers don't do that. They offer some speculative interpretations of data on obscure problems in obscure systems. It is the literature as a whole -- hundreds of dull papers put together -- which tells a complete and exciting story. The sum is more than the parts -- the theory is more than the data.
In the field I know best -- cancer cell biology -- 99 in 100 papers published are tedious details, discovered with a science-by-numbers formula. The (anti-)proliferative effect of one abbreviation interacting with another abbreviation in three-letter-acronym-and-a-number cells, concluding with a suggestion that the authors' work might have implications for cancer treatment and a note that further work is necessary. Or even better, the complete lack of anything interesting at all happening when the first abbreviation interacts with the second. The abbreviations and their effects have been studied, in combination with others, in all of the most widely used three-letter-acronym-and-a-number cell-types, and somebody is scraping the barrel.
But the tedious details put together add up to an understanding of how the cell works and how it goes wrong. The details could be put together by a human, going through the thousands of papers on the topic, assembling the facts and finding the trends. Or, more plausibly, given the amount of tedious details out there, they could be assembled by a computer, with a database and a clever algorithm. Except that four in every five of those tedious details, discovered at great expense to taxpayers, will be inaccessible to that clever algorithm. They will be locked away in the basements of university libraries, hidden in human-readable prose that humans will never read. The results of billions of pounds of work searching for an understanding of cancer and a better chance at defeating it will be worthless, because they will never be amongst the parts that add up to the greater whole.
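A toy sketch makes the contrast concrete. All of the gene names, cell lines, and field names below are invented for illustration; the point is only that three "findings" published as structured records can be aggregated by a few lines of code, whereas the same findings buried in prose would first need error-prone text mining:

```python
import csv
import io
from collections import Counter

# Three findings published as machine-readable records (invented examples).
structured = """gene,cell_line,effect
TP53,HeLa,anti-proliferative
TP53,MCF7,anti-proliferative
MYC,HeLa,proliferative
"""

# The same kind of finding locked in prose: no algorithm aggregates this trivially.
prose = "We observed that TP53 exerted an anti-proliferative effect in HeLa cells."

# Aggregating across the structured records takes one line...
effects = Counter(row["effect"] for row in csv.DictReader(io.StringIO(structured)))
print(effects["anti-proliferative"])  # -> 2
```

Multiply the structured corpus by a few hundred thousand papers and you have the raw material for the "clever algorithm"; multiply the prose version and you have a basement full of unread PDFs.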
So I told m'colleague to explain to her author that unless she deposits her genome sequences, the last three years of her professional life will ultimately have been wasted. An average paper in a high-volume mid-tier journal that will be glanced at by a few colleagues when published. Another bullet point on a CV. They will never further science beyond that. They won't contribute any important discovery or real advance to the field. They will be forgotten. Nobody will seek them out when the time comes to make the leap forward.
That's just where biology is at these days: lots of tiny fragments of data, spread thin through the literature. The most interesting and important unanswered questions will require the synthesis of that work. The most interesting and important questions can't be answered without the heap of data that has already been produced, but which is locked away.
On machine-readable data, Mike Ellis says: "at some point in the future, you'll want to do 'something else' with your content. Right now you have no idea whatsoever what that something else might be." This is especially true in science: at some point in the future, tedious data obtained at great expense, as part of the bigger picture, will finally be important and valuable. Right now, you can have no idea how important.
Publishers are allowed to get away with keeping science closed, holding it back, and wasting public money because there are still sufficient numbers of scientists who let them -- who have themselves failed to grasp that the world and science have changed.

2 Sept 2010

What is the scientific paper? 3: The metric

This is a guest post by Joe Dunckley
Continuing the series exploring the question "what is the scientific paper?", reposted from my old blog, and originally written following Science Online 2009. The topic of this post was originally discussed on FriendFeed, here.
On my recent post, what is wrong with the scientific paper?, Steve Hitchcock said that the most important problem with the paper is access, and that when we solve the problem with access, everything else will follow. I agree that access is hugely important, I recognise that we haven't won everyone over yet, and I know we do have to continue working away at the access problem, so I will devote a future post to reviewing that topic. But having thought about it a little longer, I am more convinced than ever that it is not access that is the big problem which is holding back the paper and journal, and open access is not the solution from which all others follow and fall into place.
There is one big problem, a single great big problem from which all others follow. The great ultimate cause is not, as I said last week, the journal. It is more basic than that. It is the impact factor. The journal is the problem with disseminating science, but the reason it has become the problem, the reason people let the problem continue is the impact factor. The impact factor is a greater problem than the access problem, because the former stands in the way of solving the latter. The impact factor is a great big competition killer; by far the greatest barrier to innovation and development in the dissemination of science.
Scientists can look at all of the problems with disseminating science, and they can look at us proposing all of these creative and extravagant solutions. They might agree entirely with our assessment of the state of the scientific paper and of the journal, and they can get as excited as us at the possibilities that flow from new technologies. But blogs and wikis are mere hobbies, to be abandoned when real work starts piling up; databases, a dull chore, hoops to jump through when preparing a paper. So long as academics can get credit for little else besides publishing in a journal — a journal with an impact factor — any solution to publishing science outside of the journal will never be anything more than a gimmick, a hobby that takes precious time away from career development.
In a worse position than blogs and wikis, where cheap easy products are openly available, are the wonderful but complicated ideas that would benefit from financial backing to implement — the databases, and open lab notebooks, and the like — but which are currently artificially rendered unviable because no scientist could ever afford to waste time and money on a product that isn't a journal with an impact factor. No scientist can try something new; no business can offer anything new. Even such an obviously good idea and such a tame and simple advance as open access to the scientific paper has taken over a decade to get as far as it has in part because it takes so long for start-up publishers with a novel business model to develop a portfolio of new journals with attractive impact factors.
I am not a research scientist. I don't have to play the publish-or-perish game. So I have no personal grudge; no career destroyed or grant lost by rejection from a top-tier journal. It doesn't bother me how much agony, absurdity, and arbitrary hoop-jumping research scientists have to go through in their assessments and applications. But it bothers me greatly that, by putting such weight on the publication record — not actual quantity and quality of science done, but a specific proprietary measure of the average impact of the journals (and journals alone) that it's published in — public institutions across the world are distorting markets, propping up big established publishers, and destroying innovation in the dissemination of science. End the malignant metric and everything else will follow.

30 Aug 2010

What is the scientific paper? 2: What's wrong?

This is a guest post by Joe Dunckley
Once again, this is a re-post of something I wrote on my old blog a year ago after the Science Online conference, looking at the future of the scientific paper. As I reminded people at the time, these were just my own half-thought-through ideas, not the policy or manifesto of anyone or anything I'm affiliated with.
So in response to the Science Online conference, we've been thinking about the question, "what is the scientific paper?" I already gave my answer to that a couple of weeks ago, but promised to have a go at answering the more interesting question, "what is wrong with the scientific paper?"
I've been thinking through how to sum up the answer all week, and I'm afraid the simple answer is, "the journal". The journal is what's wrong with the scientific paper. Or rather, the journal is what is holding back the development of efficient modern methods of disseminating science. So I thought I'd spend this second post making some observations on what the scientific journal traditionally is and does; what I think the modern journal shouldn't be doing; and a couple of case studies of alternative technologies that disseminate certain kinds of scientific communications better than a journal ever could.

What is the (traditional) scientific journal?
  • The journal is a collection of scientific papers limited to some kind of theme coherent enough to make it worth reading and buying.
  • The journal is led by a charismatic editor-in-chief and editorial board who attract people to publish in the journal.
  • The journal is printed on pages. It can do text, still pictures, graphs, and small tables.
  • The journal publishes a sufficiently large number of papers to make it worth printing several issues each year, but a sufficiently small number of papers to make each issue manageable.
  • The purpose of the journal is to be read and cited by other scientists.
  • The purpose of the journal is to be purchased by university libraries.
  • The journal provides a peer-review, copy-editing, marketing and media relations service to their scientists.
  • Publishing in a journal provides a way for scientists to be cited and credited for their work, based on the reputation of that journal.
  • The journal decentralises scientific publishing, allowing individual pockets of innovation within the publishing world, but making change overall very slow.
What should the modern journal (not) be doing?
It is perhaps rather foolish for somebody who works for a publisher of journals -- who works developing technologies for a publisher of journals -- to say that the problem with publishing science is the journal. It would be even more foolish for me to say that publishers perhaps shouldn't be trying to fix the problem with technology. Here are a couple of interesting technological advances that the more forward-thinking journals have come up with lately.
  • At Sci Online, Theo Bloom demonstrated iSee, a structural biology visualisation applet for your "supplementary information". In the same category is J. Cell Biol's DataViewer, which is presented to us as a device for visualising raw microscopy data. Did you know that the results that come out of modern microscopes are not just pretty static pictures, but vast datasets full of hidden information? The JCB DataViewer unlocks that hidden information, by providing it and an interface to it as "supplementary information" with a paper.
  • PLoS Currents: all the constraints and benefits of a traditional journal, but without the peer-review. Solves the problem of delays in publication. Publishes items that look just like the traditional paper.
Should publishers and journals be doing these things? When you look more closely at JCB's DataViewer, you find that, useful though it may be, most of its power and potential is currently wasted. The DataViewer is presented to us as a device for visualising the supplementary information of a paper; in fact, it is a potentially important database of microscopy datasets with a handy graphical interface attached. Restricted to a single journal, the database functionality lies unused.
PLoS Currents? This is supposed to be a solution to the problem of delays in publishing special types of science deemed to be important and timely enough to need rapid communication to peers in the field. What have PLoS done? What makes PLoS Currents unique? How does it speed up intra-field communication of those important results? It drops one single aspect of the paper: peer review. In all other respects, PLoS Currents does all it can to make its papers look like the scientific paper, and its "journal" look like the scientific journal. Scientists are still asked to spend hours writing up these important timely results, with an abstract, introduction, methods, results, conclusions and references, with select figures and graphs and tables. Nobody has the imagination to go beyond the paper-journal-publisher model. We would sooner give up peer review than publish science in anything that doesn't look like papers have looked for a century.
Or how about Journal of Visualised Experiments? JOVE is, for some inexplicable reason, held up as a brilliant example of innovation in publishing science -- of making the most of the new technology provided by the web. Those who point out that, well, it's not really a "journal", is it?, are chastised for their own lack of imagination. But surely it's those who can't conceive of a publishing format branded as anything other than the "Journal of ..." who are lacking the imagination.
Final example: while thinking about this post, PLoS Computational Biology kindly came up with the absurd idea of being a software repository. NO! Software repositories already make perfectly good software repositories, and there are plenty of them. Trying to turn a journal into a software repository is a suboptimal solution to a problem that disappeared long ago -- long before scientific publishers could have imagined that the problem even existed.
Breaking out of the journal
The web makes all sorts of new methods of publishing, communicating, disseminating science possible. It also comes with all sorts of well developed and widely used solutions to the problems of disseminating science. The big old publishers haven't even realised the web has happened, let alone thought about what to do with it. The hip young publishers know what's possible, and they want to be the ones to realise the possibilities. Good on the hip young publishers. But with each new possibility, scientists should be asking whether publishers, even the hip young ones, are really right for the job. Sometimes they are. Sometimes not.
GenBank, the database of gene sequences and genome projects, had to happen. Journals simply can't publish the raw results from a whole genome sequencing project. (Though I don't suppose they gave up without trying.) And GenBank comes with dozens of benefits that papers, when spread across a decentralised system of journals, just can't have. Yes, I know that databases aren't the optimal solution for every variety of data, but they are suitable -- desirable; even required -- for more of them than you might think. The microscopy data in the JCB DataViewer (or the structural data in iSee) would, I suspect, be of much greater value were it branded as a standalone public database with a fancy front-end, than as a fancy visualisation applet for some scattered and hidden supplementary files, restricted to a single journal.
Like it or not, science increasingly depends on data being published in public machine readable formats. Those who spend their days looking one-at-a-time at the elements of a single cell signalling pathway in every tumour cell line available to them are wasting our money if they bury their data in a fragmented and closed publication record. Nobody reads those papers, and the individual fragments of data don't tell us anything. Journal publishers think they can ensure that data is correctly published, but so far their only great successes are with the likes of GenBank and MIAME, where journals have ensured that data be deposited in public databases outside of the journal format.
arXiv. Does this need any explanation? What does PLoS Currents offer that isn't already solved better by pre-print servers? Just a brand name that makes it look as though it's a journal. If you require rapid dissemination of important timely results and you want to go to the effort of writing a full traditional scientific paper, put it on a pre-print server while it's going through peer review in a real journal. Don't just abandon peer review while making it look like you've just published a real paper in a real journal.
Better yet, don't write a proper traditional paper. If you need rapid communication of important timely results, why waste time with all of the irrelevant trimmings of a scientific paper? The in-depth background and discussion and that list of a hundred references. Put these critical results on a blog with a few lines of explanation, and later submit the full paper for peer review in a real journal.
Credit where it's due
All the real scientists reading -- the ones looking for jobs and grants and promotion and tenure -- have spotted the one great big flaw in all these suggestions: credit. At least a paper in PLoS Currents can be listed in a CV. Nobody even reads blogs, let alone cites them. How can you get a grant on the back of a blog post? Am I suggesting you should be able to get a grant on the back of a blog post?
Maybe. I don't know. I don't think so. At the moment, publishing papers in journals is pretty much all a researcher can get any credit for. Asking researchers to go beyond the paper-in-journal format is going to create problems of assigning credit, and I don't know exactly what the solution to that problem might be. Simply, I haven't put much effort into considering solutions. I'm a consumer rather than creator of science, so that particular problem doesn't keep me awake at night. But there surely are solutions -- plenty of them.
Fact is, it's quite obvious to anyone in or observing science that the current method of ensuring that scientists are credited for their hard work is really quite broken. Trying to cram every new kind of "stuff" into that broken system is hardly helping.
Business models
Meanwhile, the publishers will be asking how we see the business models for these non-journal based methods of publishing working. Frankly, I'm not really interested. But then, JOVE is hardly the beacon of business success anyway. If publishers want science publishing to be a business, they need to find the new business models that work without strangling science. Otherwise, they're liable to find out that, on the web, some institutions and individual scientists can do a better job of disseminating science than the professionals can, and out of their own pocket.
The paper of the future
I don't necessarily think that anybody should stop writing papers -- perhaps not even the ones that nobody reads. The paper solves several problems better than any other proposed solution. A peer reviewed scientific paper, in a journal if you like, is as good a way as any to provide a permanent record of a unit of science done, and of a research group's interpretation of the significance of that unit of science. And it needn't change all that much. Making them shorter and a lot less waffley would be to my taste -- there's no need to put that much effort into words that won't be read. And give them semantic markup, animations, and comment threads, if you like. But don't pretend that those things are anything more than incremental advances. The real revolutions in the dissemination of science can only occur beyond the shackles of the traditional paper and journal. Every new Journal of Stuff is another step back.
Updates for 2010
Peter Murray-Rust has been saying interesting things about domain-specific data repositories, which I am sure are worth paying more attention to than I have yet had time to.
When I originally posted this, I was challenged for not mentioning the problem of closed-access journals at all; that problem is addressed in the subsequent posts.

17 Aug 2010

What is the scientific paper? 1: Observations

This is a guest post by Joe Dunckley
Last year, after Science Online, I wrote a series of posts inspired by Ian Mulvany's question, what is the scientific paper? Those were originally posted on my old blog; now, with SoLo approaching once again, seems like a good time to revisit them, while migrating them over to Journalology.
Science Online charged us with answering the question, what is the scientific paper? Here is the answer. It comes from the perspective of somebody who has been middle author on just two, but who has spent a little bit of time working with them and with people who think a lot about them.
What does the scientific paper look like?

  • It's a few thousand words -- probably between 4 and 15 pages long (but can be <1 or >100 pages).
  • It's mostly prose text, with a little bit of graphs, tables, and pictures.
  • It has a set matter-of-fact style and structure.
  • It's written in (American) English.[1]
What is in the scientific paper?
  • Who did the science.
  • Why the science was done.
  • How the science was done.
  • Data!
  • The authors' interpretation of what was achieved by doing the science.
  • Pointers to the other bits of science mentioned.
Where is the scientific paper?
  • It is in a journal, available in one or both of:
    • printed on 4-15 sheets of dead trees, between a pair of glossy (or not so glossy) covers in the basement of a library.
    • a journal website, possibly with technology deliberately designed to make it difficult and expensive to get to, probably only available in a clunky and poorly designed PDF file.
  • It might also be in-part or in-full in a searchable database, like PubMed.
  • If you're really lucky, it is available as HTML and XML.
What is the scientific paper for?
  • It aims to be a complete, objective, reliable, and permanent record of a unit of science done.
  • It's a way of telling your field what you've done.
  • It's a way of telling your field what you've found.
  • It's a way of giving data and resources to your field.
  • It's a (the?) way of proving to your (potential) employer/funder that you have done something worthwhile.
  • It's a way of making money for publishers.
How is the scientific paper made?
  • The authors are given some money and lab space on the condition that they use it to do some science and write a paper about it.[2]
  • The authors do some science and write a paper about it.
  • They give it to a journal. The journal thinks about it.
  • Peer review! Months of scrutiny, discussion, and revisions.
  • Production! The words are turned into PDFs and printed pages.
What is the scientific paper not?
  • Part of a conversation.
  • Quick and efficient.
  • Diverse and flexible.
  • Possible to edit after acceptance by the journal (except in extreme circumstances, and via slow and unsatisfactory mechanisms).
  • Possible to edit by anybody except "the authors".
  • A way of making your data and resources reusable.
  • A way of telling the layperson what you've done and found.
Wait, that wasn't really what the question meant, you say? Well, indeed. But before we get to the real questions -- "what's wrong with the scientific paper?" and "what do you suppose we do about that?" -- it's good to define some terms and lay out the basics. Do you think I've got any of my observations wrong, or think I've overlooked some important property of the scientific paper? Do say -- it would be good to try to agree on what the paper is before going any further.
  1. Thanks to Hannah, who added this point in the comments on the old blog.
  2. Thanks to Cameron Neylon, ditto.

Incentivising academic fraud

This is a guest post by Joe Dunckley
Catching up with the newsfeeds after a week working in Beijing (where citizens are saved from reading such subversive content as Journalology, since Blogspot blogs are all blocked), I notice the Economist discussing academic fraud in China.

Being the Economist, it attempts to explain China's fraud epidemic by focusing on incentives:

China may be susceptible, suggests Dr Cong Cao, a specialist on the sociology of science in China at the State University of New York, because academics expect to advance according to the number, not the quality, of their published works. Thus reward can come without academic rigour. Nor do senior scientists, who are rarely punished for fraud, set a decent example to their juniors.
The trouble with this explanation is that these same incentives apply in many -- most -- other countries also. Science everywhere is plagued by the publish-or-perish game and the incentives it generates. Academic careers stand and fall on the basis of publication counts. Some countries at least try to judge quality of output in addition to quantity, but most methods are no more sophisticated than that used by China -- and every method has its incentives for fraud.

Nor does a lack of disincentives in China explain why it stands out. Fraud is rarely satisfactorily punished anywhere. If it is even discovered at all, the photoshopped figures and made-up numbers become an accident; the original data was lost sometime after that project was completed; the grad student who handled that particular experiment has moved on, and can no longer be contacted. A researcher getting fired for fraud is big news, not because fraud is rare, but because failing to weasel out of an allegation is rare.

It is my fear that China is perceived as having a higher rate of fraud than other countries not because it does, but because Chinese researchers aren't very good at it yet. Their fiddled figures are crude and easily spotted; their fictitious facts are amateur inventions that cannot be believed. The worrying thing about these rough and unrefined fabrications is not that they themselves, easily found out and struck from the record, exist. The worrying fact is that they must be the tip of a great iceberg; 99% of the fakes are unseen, produced by forgers skilled enough to mask their work in convincing disguises and cover their tracks perfectly. As science in China matures, the student-to-supervisor ratio falls, and natural selection picks the cleverest conmen, the epidemic of clumsy and primitive fraud will end. That's when China joins the ranks of countries experiencing advanced and undetectable fraud epidemics.

Discussing fraud as a symptom of a Chinese problem -- of a failure of Chinese academic administration or a flaw in the Chinese culture and psyche -- is a nice distraction from the uncomfortable fact that fraud is a symptom of a global problem -- of failing academic administration everywhere. The Chinese copied the publish-or-perish game from the west. Soon they'll get good at it.

10 Aug 2010

New word - evoluating

"Evoluating". It's probably an attempt to use the French "évoluer" in English; I think it means "evolving".

6 Aug 2010

The Scientist has an attack of CNS disease

The Scientist this week tells us that
"Peer review isn’t perfect [who knew?]— meet 5 high-impact papers that should have ended up in bigger journals."
Wait, what? These high-impact papers got those citations despite ending up in "second tier" journals, so I doubt the authors have been crying into their beer about this "injustice". This is an example of CNS Disease, a term coined by Harold Varmus to characterise the obsession with Cell, Nature and Science. Not all high-impact papers must be published in one of these journals, and not all papers published in these journals will be high impact. Biomedical publishing is not just a game in which editors sort articles by predicted future impact - at least, I hope it's not.

Authors choose their publication venue for all sorts of reasons, and it's hard to predict which new work will set the world on fire. Take BLAST - it was a "quick and dirty" algorithm that gave similar results to the Smith and Waterman algorithm only much faster, and the gain in speed came at a loss of accuracy. Only use by scientists in practice could decide whether this was a good approach. Focussing on the umpteen thousand citations to BLAST is missing the point: the important thing about BLAST is the millions or billions of hours of computer time saved by using it. As Joe, the other denizen of Journalology Towers, said recently:
"Lord protect us from the idea that an academic publication might have any value beyond its ability to accumulate citations."

31 Jul 2010

The stuff we didn't have time to blog about in July

This is a guest post by Joe Dunckley
Some old fashioned publishers are still claiming that open-access mandates -- by forcing the publishers to acknowledge that the internet has happened and that this event makes the status quo business model that they cling to wasteful and unsustainable -- will "stifle innovation". In other news, war has been found to be peace and it was discovered that freedom is slavery.

An unexpected and not entirely welcome development in open science: the Information Commissioner -- charged with enforcing the UK's freedom of information act -- has ruled that data collected by a Queen's University Belfast researcher falls under the remit of the act, and the data must now be released. This seems to be something of an accidental victory for open science, though it is rather unfortunate that it should come about as the result of a stunt by a climate change denier, and not as part of a planned, consensual, and multilateral shift in academic culture. I've yet to see much written on the repercussions of the decision (though I am a little behind on reading).

WikiReviews? A potentially interesting project for collaborating on "living" review articles, initially on cancer, introduced by George Lundberg.

Scientists who end up in industry could inadvertently find themselves in trouble when the natural tendency of the scientist to share information for the benefit of mankind conflicts with the natural tendency of big companies to jealously and zealously guard everything they have. In the US, researchers innocently publishing a scientific paper can face (at least, the threat of) decades in prison for industrial espionage if they're not very careful.

5 Jul 2010

Elsevier experiments with peer review

Well I never. I've been advocating the adoption of open peer review and community peer review for a while now; I didn't expect one of the pioneers of community peer review to be Elsevier, but they've surprised me.

On 21 June, they announced a three-month trial of what they are calling PeerChoice on Chemical Physics Letters, which allows potential reviewers to volunteer to review papers. As Ida Sim points out, this doesn't open up peer review in the sense of making it more transparent, but it should help speed up peer review and it might avoid the bias caused by editors selecting from a limited pool of the same 'usual suspect' reviewers.

The devil is in the details: who gets to be in the pool of potential reviewers; how will you motivate reviewers to volunteer, when getting reviewers to agree when directly inviting them can be hard enough; will volunteers be vetted for suitability for that article; is this alongside or instead of editorial selection? These questions aside, let's hope it's a success.

Edit: There are some answers on the hidden-away page about PeerChoice - PeerChoice is supplementary to editor-invited reviewers. Registered reviewers will see titles and abstracts and be allowed to download the manuscript if they agree to provide a "timely review." There doesn't appear to be a vetting/vetoing system, but the editor still makes the decision. The trial is on nanostructures and materials; the results might not be applicable outside that very narrow field, as scholars in different fields react in very different ways to variations in the peer review process.

9 Jun 2010

Green is no goal

This is a guest post by Joe Dunckley
To achieve a sufficiently large but distant win, it is worth sacrificing a much smaller but nearer win if the latter stands in the way of, or distracts and delays, the larger achievement. To achieve a small but near win, it is not worth sacrificing a much larger but more distant win. But the difference in magnitude must be sufficiently large, and the difference in distance sufficiently small, to make delaying the gratification really pay off. Speculation and argument over the sizes and distances and relative probabilities of success and incompatibilities of the competing achievements fuel many a political argument.
Like "green" open access.
Green open access is simple: for every scientific journal paper, at least one of the authors must take action to ensure that the paper is freely available to the world online, somehow. They can deposit the text in PubMed central, or put a crude PDF of a draft version on their website. The increasingly preferred method for many advocates of green OA, though, is the institutional repository: each university library manages its own database of affiliated researchers' papers. This will solve the problem: the inability of people to read a paper that they want to read.
A heresy for you: access is not an interesting problem. The stubborn toll access publishers are correct when they say that most people can read most of the papers that they want to read. Yes, it takes emails to the authors, piracy amongst friends, and borrowed passwords, and yes it is a real problem, and no, the toll access publishers do not have any excuse to do nothing about it. But it's not an interesting problem any more. Letting us read a paper for free, without having to log in or pester the author, once the paper is twelve months old, is not a revolution.
There are other similarly dull problems in science and publishing that green OA doesn't address. Like how to save university libraries from the parasitic subscription access publishers that are slowly killing their helpless hosts. Green OA tells parasitic publishers that they can continue draining libraries of their budgets with subscription bundles to poor quality journals that few people want to read, so long as they open access to the papers after twelve months. Now, as libraries face their greatest budget squeezes of the recession, is the perfect time for them to get some guts, speak up, say 'no', and shake off these parasites once and for all, before somebody comes along and hides them behind a bigger and stickier sticking plaster. Students should be rioting at the news that they are expected to do without textbooks and computers because their library has chosen instead to spend several tens of thousands of pounds on a package of obscure and substandard journals. Instead, we're distracted by green OA, told that it is the one thing that academia desperately needs.
More interesting than these little problems are the opportunities that are currently presented to us: the real revolutions. Open, structured, reusable data has already demonstrated its revolutionary credentials in the field of genomics. Genome data that can be searched and mined by powerful computers and clever algorithms has enabled cheap and easy high-throughput hypothesis testing, and even hypothesis generation: it has led to countless discoveries that weren't on anybody's mind when they set out to collect the data, because the database as a whole is worth far more than the sum of the individual data gathering experiments. There are vast quantities of data in the literature: from microscopy to biogeography, epidemiological trends to drug toxicity. There are great and important discoveries waiting to be made in that data. But they're not being made, because unlike with genomics, no organisation has made the effort to build the database; no campaign group has achieved a mandate that the data be made open and reusable. Instead, the data, where it is available at all, is locked away in non-standard tables within unstructured PDF files, distributed across largely subscription access journals that reserve all rights to reuse. Gold open-access at least, by making literature mining possible, doesn't stifle these new open data opportunities, even if it's not the full solution; green open-access, by focussing on the need for access to human readable literature, distracts us from these possibilities entirely.
Or open notebook science: a model that, by getting scientists to discuss their ideas and publish their experiments in the open in real time, would force a revolution in the way that scientists work, the way that groups compete and collaborate, and the way that careers are evaluated and achievement rewarded; a revolution to the whole rhythm and pace of scientific discovery and the individual scientist's working life. A revolution that rather makes the whole issue of access to journal papers go away altogether.
Green OA advocates argue that open science, open data, and ONS are vague and fantastical distractions from the pressing matter of human access to journal articles; that we shouldn't waste time thinking about the former until we have solved the latter. I believe that green OA is a mundane and increasingly irrelevant distraction from the real problems and opportunities that are available for science to solve and grasp, but for a limited time only. The long-term achievements are too big to risk for the sake of such a small one.
Why spend time designing a better horse shoe when you could be inventing the railway train?

(Apologies for the unpolished post - this was made from my phone on the side of a Welsh mountain.)

1 Jun 2010

Literature hacks: PubMed searches by RSS

This is a guest post by Joe Dunckley
There are all sorts of ways you could find out about new articles that you might want to read. There's that big room across campus that's full of old writings on paper, but that's too far away and they have some silly rule about not eating your lunch near their writings on paper, and anyway you're not sure you still have the card that lets you in. You can't trust your colleagues to point out an article that isn't crushingly mediocre, unless it's because it concerns a species or a disease whose name sounds mildly amusingly puerile, but those ones are never actually remotely related to your work. You subscribe to electronic tables of contents, but these days everyone's publishing in PLoS ONE, and you're not wading through their contents every week in the hope of finding the occasional thing that's relevant. You could regularly search PubMed, but that means typing in keywords over and over, and wading through the results asking yourself, "have I seen this paper already, or do I just feel like I've seen this paper already?"

So you could subscribe to email alerts for your PubMed searches, but my god, man, what the hell do you think you're doing? What, you haven't got enough email already? Make you feel special, having your phone stop you every five minutes with unimportant impersonal notifications? If it's not private, not time-critical, and does not require a reply, it should not be pestering you with an email. That article has taken ten years to get from concept to publication, it can wait a little longer for you to read it -- not that you even read more than one in every twenty of the articles you're alerted to.

Which is why it should be obvious to our readers that they should be using HubMed's RSS feeds of PubMed searches, with their Google Reader, to keep up with the literature. New articles will accumulate and be available to scroll through in the sophisticated and cleanly laid out environs of Google Reader, when it's convenient for you to read them. Reader will tick off items that you've seen and present to you items that you haven't yet seen, without ever screaming "look at me, look at me right now!"

Update: Since I originally wrote this, PubMed released their major update, introducing their own implementation of RSS saved searches, which looks at least as good as that of HubMed, and takes less effort to set up -- just click the RSS button next to the search box on the search results page.
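If you like to see how little magic there is behind a feed reader: an RSS feed is just an XML document full of item entries, and the titles are trivial to pull out with any XML parser. A minimal sketch in Python, run against an inline sample feed (the sample content and names are illustrative - a real feed URL would come from the RSS button on your PubMed or HubMed results page):

```python
import xml.etree.ElementTree as ET

# Inline stand-in for a real saved-search feed, illustrative only.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>pubmed: my saved search</title>
    <item><title>Paper one</title><link>http://example.org/1</link></item>
    <item><title>Paper two</title><link>http://example.org/2</link></item>
  </channel>
</rss>"""

def feed_titles(xml_text):
    """Return the title of every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    return [item.findtext("title") for item in root.iter("item")]

print(feed_titles(SAMPLE_FEED))  # -> ['Paper one', 'Paper two']
```

A feed reader does essentially this on a schedule, remembering which items it has already shown you - which is exactly the "have I seen this paper already?" bookkeeping you no longer have to do in your head.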

19 May 2010

"Predatory" open access publishers

The Charleston Advisor has published an interesting analysis of some of the recent open access 'upstarts', titled "“Predatory” Open-Access Scholarly Publishers". They include some that I've noted before such as Bentham Open and Scientific Journals International.

As I would have expected, Libertas Academica and its sister publisher Dove Press do better than the others included in this review, but they are still far from passing with flying colours. The reviewer, Jeffrey Beall of Auraria Library, University of Colorado Denver, places a very clear "author beware" sign on:

  • Academic Journals
  • ANSINetwork
  • Bentham Open
  • Insight Knowledge
  • Knowledgia Review
  • Science Publications
  • Scientific Journals International
Beall's summary is worth repeating:
"These publishers are predatory because their mission is not to promote, preserve, and make available scholarship; instead, their mission is to exploit the author-pays, Open-Access model for their own profit.
They work by spamming scholarly e-mail lists, with calls for papers and invitations to serve on nominal editorial boards. If you subscribe to any professional e-mail lists, you likely have received some of these solicitations. Also, these publishers typically provide little or no peer-review. In fact, in most cases, their peer review process is a façade.
None of these publishers mentions digital preservation. Indeed, any of these publishers could disappear at a moment’s notice, resulting in the loss of its content."
I'd not touch any of them with a bargepole.

13 May 2010

Why you can't copy abstracts into Wikipedia

This is a guest post by Joe Dunckley
This is an archival repost of something first published elsewhere a year ago.

I am not a lawyer, but I do have six years' experience of Wikipedia, was once a very prolific Wikipedian, and, despite my lack of activity there in more recent years, am apparently still an "admin" on the English language Wikipedia. This, coupled with working for an open-access publisher, means that I have also picked up a little knowledge of (mostly US & UK) copyright over the years. Since I can't boil all that down to just 250 characters (or whatever the limit is), this post serves to answer this question, raised at FriendFeed: 'Does an article in pubmed belong to the "legal public domain", can I copy and paste it in wikipedia?'

The answer is 'no'. I don't endorse this position, and I'm not trying to be a killjoy, but it is the correct answer nonetheless. Since there appears to be some confusion over why the answer is 'no', let me explain. First I'll define some terms, then cover the copyright status of journal abstracts, and finally explain why the policy of Wikipedia must be to exclude abstracts.

First the definitions. Don't quote me on these. Like I say, IANAL. These are all just definitions that I have picked up over the years in the context of Wikipedia and open-access. In order of increasing protection of rights:

  • Public domain: completely exempt from all rights given by copyright law. Anyone can reprint it, remix it, and make money selling it, with no obligations.
  • Public/copyleft licensed, e.g. GFDL, CC-BY: the producer of the work has asserted their ownership and claimed their rights, but has voluntarily given everyone in the world permission to do certain things with the work without having to ask first. There are actually several tiers of these licenses.
  • Copyright: you're not allowed to do anything with the work, unless the copyright owner has said you can.
Public domain is not a synonym for "publicly available". Something is not "in the public domain" just because it is on the internet -- indeed, most of the internet is not public domain, it falls in that third category. There is no presumption that you are allowed to copy and paste material all over the internets. Perhaps there are corners of the internet where that is de facto the case, and perhaps it would be great if everything was public domain or copyleft, but it's not. Napster was a place where music was de facto public domain, before the recording industry reminded them that the law doesn't work that way. However, there is a fourth area to copyright: fair use.

Fair use is not a fourth category, like the categories above. Fair use is just a set of exemptions to copyright protections. It allows you to make use of copyrighted material without the owner's permission to do so. However, it is very limited: you may only use a limited amount of the copyright material, and you can only do a limited range of things with it. If you want to use something copyrighted and say that you are doing so under fair use provisions, you have to make the case for your specific creation being fair use of the material. Getting away with claiming fair use for an abstract in PubMed does not mean that you will get away with it for Wikipedia, or some other creation. And the copyright owner is always within their rights to object to your fair use claim.

The copyright status of journal abstracts. Copyright to most journal abstracts will be owned by the journal's publisher (or society). Copyright to others will be owned by the authors. For open-access papers, the copyright is usually owned by the authors, but the journal has made sure that they have released it under a copyleft license, allowing you to do lots of things with their work. Papers written by employees of US federal agencies in the course of their employment will be public domain, as will very old papers.

It is true that there is a culture amongst scientists of free movement of published ideas. Copyright is worthless to a scientist, who actively wants his ideas to spread, so long as he is cited and acknowledged. Scientists freely share and reprint things like abstracts.

Scientists assume that publishers feel the same about all use of "their" material. Note the fierce and desperate opposition some of the traditional publishers raise against the open-access movement, though. Ideas mean different things to a scientist and to a (traditional) publisher. You shouldn't presume that publishers will react in the same laid-back way as scientists do when the words that "belong" to them are used in novel ways.

Note that PubMed carefully argues the case that its use of abstracts falls under fair use provisions. It doesn't just say "yeah, whatever, everyone freely reproduces abstracts, no one cares."

Why you can't turn abstracts into Wikipedia articles. Wikipedia can't be laid back about copyright any more than PubMed can. Wikipedia is now, what, a top-ten website by most metrics? People notice things that are put on Wikipedia. If you start putting abstracts on it, somewhere a publisher will notice, not like it, and have the material removed. You could claim fair use, but (and remember, IANAL), I very much doubt you would be successful: an encyclopedia is very different to an index, and in Wikipedia you are remixing the material. Well, whatever. One page gets deleted. No lasting harm done. End of story?

Not exactly. Wikipedia is GFDL. What you put on Wikipedia gets copied to hundreds of mirrors and put in paper versions. People use it for whatever commercial purposes they want, and it gets remixed to death. It's difficult to undo what goes into Wikipedia. That is why, when you write in Wikipedia, you must declare either that the words are your own, or that they are already released under a compatible copyleft license. You are not just giving permission for your words to be used on Wikipedia, you are giving permission for your words to be reused and remixed for virtually any purpose. This is why Wikipedia has to be pretty careful not to let copyright violations through.

This is also why Wikipedia does not actually allow any text to be contributed as fair use (except when marked as quotations): the permissions granted by Wikipedia are just too great for the fair use claim to be defensible.

But can't Wikipedia make an exception for abstracts? Theoretically, perhaps it could be done, but sadly, the reality is 'no'. Wikipedia is too big, too old, too well known, too bureaucratic. Wikipedia's policy on copyrights is well established; it must be generalist, covering all fields and all nations, and it can't afford to be lax. To come up with exceptions to the policy would be too difficult for such a generalist site with such a tiny legal team. The Wikipedians would have to establish beyond doubt that publishers were happy for their abstracts to be used not just on the encyclopedia, but by anyone, anywhere, for virtually any purpose, reprinted and remixed. And that sounds like the open-access movement to me.

Conclusions. You can't put (non-open access) abstracts on Wikipedia. It would be nice if the gentlemen's agreement whereby publishers overlooked reuse of their material by scientists extended to all spheres, but that ain't necessarily so. Of course, it would be great if it were so, and the story is just one of thousands which emphasise the need for a more rational and restricted copyright system.

Amusing typo of the day

The authors of a manuscript say their work has been approved by an "intuitional review board". I suppose it just knows when a study is ethical.

12 May 2010

Medical Hypotheses' editor is sacked

So Bruce Charlton's editorship at Medical Hypotheses comes to an end, and I must raise a small cheer. Schadenfreude is an ugly thing, but this journal was a boon to fringe 'scientists' everywhere, giving them the apparent legitimacy of publishing in a 'proper journal' (owned by Elsevier, indexed in PubMed) without the pesky hurdle of peer review. It was no surprise that it favoured kooks, having been set up by David Horrobin, a pusher of evening primrose oil.

The final straw was allowing AIDS denialists a platform, and the subsequent outcry from scientists and Charlton's inability to see what he did wrong led Elsevier to pull the plug. Charlton thinks that as an editor he has a perfect right to publish whatever papers he wishes, but unaccountable editorial control is no way to run a journal. Poor editorial decisions should have consequences, and the lack of any peer review or other quality control on Medical Hypotheses (the only criterion being that a paper was 'interesting') always doomed it to be derided by serious scientists and medics.

Will the new (and improved?) Medical Hypotheses see any more gems like too much sex causing RSI, kissing evolving to spread germs, cancer being caused by stopping smoking, masturbation being good for relieving a bunged up nose, or the origin of belly button fluff?

18 Mar 2010

Peer review in the dock

This is a guest post by Joe Dunckley
Academic publishing, and peer review in particular, was headline news in February -- from stem cell researchers claiming that their work was being sabotaged by reviewers with conflicts of interest, to mainstream news noticing the absurdity of the impact factor situation. BBC Radio 4 must have decided that now was a good time to air an unedited repeat of 2008's documentary Peer Review in the Dock. So now certainly seems like a good time to post an unedited repeat of my comments from the time.


A few thoughts on Peer Review In The Dock (this evening, Radio 4).

  1. Nobody has ever questioned whether peer review is really needed: wrong. A lot of people have questioned this, and many experiments have been tried. The most prominent recent example is probably PLoS ONE (no reference to this in the programme). They very rapidly discovered that, yes, a minimum standard of peer review is required when running a journal. But perhaps moving to a non-review model is like communism: you need to have world revolution for it to have any chance of working; going it alone will just lead to your own collapse.
  2. Peer-reviewers aren't trained: somewhat misleading. Reviewers, at least in the publishing model that I am familiar with, are actively publishing research scientists of at least medium seniority. Most will, while pursuing their doctorates, have participated in "journal clubs" (where the grad students get together to shred a published paper), and many will also have co-reviewed manuscripts alongside their supervisors (not strictly allowed, but very widespread). What all students certainly are trained to do, even at undergraduate level, is not to take the truth of published work for granted, and to watch for potential flaws. To teach science is to teach scepticism. Which brings me on to the next point...
  3. Reviewers aren't all that great at spotting errors: so what? Academics and publishers know this. The system is designed this way. Review is supposed to be a basic filter for sanity and competence; it is only journalists who hear "peer-reviewed" and think it is the definitive stamp of authenticity. Like democracy and trial-by-jury, it is not used because it works, but because it fails less disastrously than the alternatives. (Incidentally, their example of introducing deliberate errors to a paper and seeing who notices them is not entirely fair: most papers are not only reviewed by the journal's reviewers, but by the authors' colleagues before they submit the manuscript, and by editors before review.)
  4. The last part of the programme was devoted to publication bias. Publication bias is a big problem. But it has little, if anything, to do with peer-review, and everything to do with publisher policies and author dishonesty. The only conceivable connection it has with peer-review is that some people still mistakenly believe that negative results aren't worth publishing at all -- something that journals like BMC Research Notes and PLoS ONE, and initiatives like trial registration are explicitly tackling.
The programme explored what is an interesting issue in academic publishing at the moment (there are more interesting issues, of course), but, I think, from the wrong perspective. While it discussed many very real problems with the system, these problems are all well known and acknowledged; for decades people have explored solutions, and there are many interesting current developments. The makers of the programme seemed mostly unaware of these.

This is, of course, the limitation of having a half-hour national radio programme about a topic like academic publishing.

3 Mar 2010

Peter Suber's open access word contest

Peter Suber, the guru of open access, has challenged readers of the SPARC Open Access Newsletter to come up with a new word.

English speakers need a verb that means "to provide OA to". It should be as succinct as "sell" for use in sentences such as, "We sell the print edition but ____ the digital edition."
Oh, the joys of verbing a noun. Here are my entries:
  • "Openpublished"; "to openpublish" (or "open-published", "to open-publish"). Apparently there is already a meaning of "open publishing", which is to make the process of publishing transparent (open peer review would be an aspect of this, as well as Indymedia and Wikinews), but I think the term is little used.
  • "Commoned"; "to common". Meaning "to place into the commons", as most OA publishing uses the Creative Commons licenses.
  • "Publicked"; "to publick". Meaning "to make public". It's an archaic word, used by Joyce in Finnegan's Wake, sometimes meaning "published", sometimes meaning "populated", and recently resurfacing to mean making a private message public.
  • "Freeshared"; "to freeshare" (echoing freeware and shareware). This term is already a synonym for freecycling, and for a defunct image upload site.
  • "Openshared"; "to openshare" (echoing open source and shareware). This term is already used for an icon that represents the open sharing of content - an icon that could be adopted by the open access movement.
  • "Copylefted"; "to copyleft". Using the existing term, which refers to Creative Commons and GNU GPL licenses among others.
  • Referring to libre and gratis: "Libred"; "to libre". "Gratised"; "to gratis".
Can you do better? Seize the glory by emailing Peter.

The Open Share icon is under a Creative Commons Attribution-Share Alike 3.0 Unported License from http://www.openshareicons.com/.

3 Feb 2010

Article-level metrics

This is a guest post by Joe Dunckley
Guys, are you sure you've thought this through? I mean, they're nice. They're fun. Data is fun. Seeing that somebody somewhere has read something you've written is satisfying and reassuring. It's good to know that you've sparked a conversation, and gotten people recommending you to their friends. But you think that it can't possibly go wrong? You think we should roll it out as the universal metric right now, and sort out the details later?

The impact factor is just data. It's nice for a publisher to know that people are reading the papers that they publish, and that all their hard work is having some effect. Nobody could have intended the absurd situation we have now, in which careers are made and broken according to a journal citation index. Give out article-level metrics and they're soon going to stop being a bit of fun. People are going to use them and abuse them. It's what people do with data.

If you're going to reduce somebody's life's work to a number, it would certainly be less absurd to pick a number that is in some way relevant to that work, rather than relevant to the work that a whole bunch of other people did several years earlier. But only a little bit. What are article-level metrics representing? The quality of the work, or the controversy you've stirred up? The web-savviness of the field? The number of friends you have? There is already huge variation between the kinds of impact factors that medical journals get compared to, say, the sort that ecology journals get. What if, at the article level, breast cancer turns out to be inherently more comment-worthy than bowel cancer? If tenure and funding committees are willing to use something as absurd as an impact factor when making a decision, do you think that they're going to give a damn about the inherent variation in readership between fields? All those bowel cancer researchers better start reading up on their breast cancers now.

But what happens when article-level metrics really start to mean something? Give a large enough number of people an incentive to cheat and some of them are going to cheat. Remember when the World Journal of Gastroenterology boosted its impact factor with a little citation loading? How are you going to stop academics from doing what flickr users do to get themselves into the site's front-page "Explore" section for the day's best photos -- posting their stuff to "I'll leave an inane comment on yours if you leave an inane comment on mine" groups? What happens when academics spend ever increasing hours marketing their work to each other? How long is it going to be before journals and universities are competing for researchers by advertising how good their average article-level metrics are? Before journals and universities open departments dedicated to pressuring people into reading and commenting and blogging their articles?

What happens when pharmaceutical companies get in on the act? What about the ideas that get ignored for ten years, before it becomes apparent how important they are -- do the metrics count for the original paper, or for the review article that reignites the interest? What about the assholes, the trolls, the groupthink...?

Article-level metrics are a bit of fun. It's possible for them to remain a bit of fun. But it's going to take a lot of forethought and vigilance to make sure that is so.

1 Feb 2010

Literature hack: context search

This is a guest post by Joe Dunckley
Ryan Gregory has just started a new blog: Hackademe. It's a Lifehacker for academics, sharing his tips for scientists who are struggling to cope with all the shiny distractions around them. Here at Journalology towers, we have a whole bunch of these hacks lying around, and I hope we won't be treading on Ryan's toes if we share some of our literature searching and managing tips. (Though Ryan has already discussed reference manager software on his day-old blog, so perhaps he has this one covered...)

A very simple one to get us started, then: install Context Search. Firefox comes with a built-in right-click tool for easily searching Google for the text on the page that you have highlighted. Context Search replaces that search-Google tool with a search-any-number-of-search-engines tool.

You can add all your own favourite search engines to the menu: just go to the search engine's website and use the drop-down menu in the toolbar search box.
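Under the hood, a context search is nothing fancier than substituting your highlighted text into the chosen engine's URL template and opening the result. Here's a minimal Python sketch of the idea; the URL templates and the `context_search_url` helper are illustrative stand-ins, not the extension's actual internals:

```python
from urllib.parse import quote_plus

# Each engine is just a URL template with a {terms} placeholder.
# These templates are illustrative examples, not Context Search's real data.
ENGINES = {
    "pubmed": "http://www.ncbi.nlm.nih.gov/pubmed?term={terms}",
    "google": "http://www.google.com/search?q={terms}",
}

def context_search_url(engine, highlighted_text):
    """Build the URL a context-search tool would open for the selected text."""
    template = ENGINES[engine]
    # quote_plus() URL-encodes the selection (spaces become '+').
    return template.format(terms=quote_plus(highlighted_text))

print(context_search_url("pubmed", "sonic hedgehog"))
# http://www.ncbi.nlm.nih.gov/pubmed?term=sonic+hedgehog
```

That's all a "search engine" amounts to from the browser's point of view: a template with a hole where your selection goes.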

Now you can run through PubMed every unfamiliar gene, disease or researcher you stumble upon while reading, with three easy clicks and no typing. (Warning: novices may find themselves up at two in the morning having followed a long chain of context searches from cell signalling pathways to YouTube videos of snow ploughs on speeding trains.)

Dudes, I don't know how you coped in the olden days of typing your search terms into PubMed, and yet people seriously try to tell me that mankind once worked with "card catalogues" and "interlibrary loan". I'm not buying it, you guys.

31 Jan 2010

A piece of peer review history

This is a guest post by Joe Dunckley
I love browsing PubMed Central. How about this, from May 1985. The paper, Exaggerated responsiveness to thyrotrophin releasing hormone: a risk factor in women with coronary artery disease, by Fowler and Dean in the BMJ, is not in itself a work of unusual historical significance. But the associated sections in the document could be: is this where open peer review began?

In a letter published the day after I was born, Thomas Walever said:

SIR,-I congratulate you on your publication of the paper by Drs J W Dean and P W B Fowler (25 May, p 1555) complete with the referees' comments and authors' replies. This correspondence indeed illustrates some of the problems with the peer review system. It was a help to know that others have had some of the same problems that we have had as authors. While comments made by referees are often helpful, it is indeed distressing when one does not agree with the criticism of the methods. This is especially true, it seems, for statistical matters; and this was well illustrated here. It has even happened that a major criticism was that something was not done which in fact had been done and was clearly stated to be so in the manuscript.

As a suggestion, perhaps matters might be improved if we had to sign our names to our reviews. There would be many problems with this, but the comments made to the authors would probably be more careful, considerate, and constructive. Some journals already suggest this as an option; perhaps the practice should be encouraged?
Twenty-five years on, and we do not have universal open peer review. Further experiments have been conducted, though, led by the BMJ, which now uses signed reviews (but, so far as I can tell, does not publish the reviewer reports). In 1999, a blinded trial of open review by the BMJ editors found that open review didn't really make a difference to the quality of reports, and probably lengthened the time taken for review. Editor Richard Smith, campaigner for open science and publishing reform, stubbornly decided to adopt it anyway, citing ethical reasons: you cannot be tried by an anonymous judge. (In his editorial, Smith also declares his intentions to publish reports alongside the accepted papers and to introduce live community review, neither of which has come to fruition, so far as I can see.)

Other trials have had more positive findings regarding the benefits of open review, such as more thorough reports, but the utilitarian merits of open review remain contested. This Nature Neuroscience editorial from 1999 complained (without evidence) that open review is likely to lead to "bland" and "timid" reviews, in which technical deficiencies are identified but no comment is made regarding the interest level of a paper. If true, this would, of course, be a problem for high-end journals like Nature Neuroscience, which select on the basis of interest level.

It's nearly ten years since the BMJ's switch to open review, and, so far as I know, the model is still restricted to a minority of medical journals and hardly any journals in other fields. The BMJ's initial hope to publish reviewer reports seems to have been forgotten. Open peer review seems to be just another area where publishers are still looking at new technology and discussing where they want to go, while everyone on the internet is asking "are you guys coming, or what?"

This is an edited re-post of something previously posted on cotch dot net.

20 Jan 2010

Fraud epidemic in China?

This is a guest post by Joe Dunckley
Last week's Nature included a news feature on scientific misconduct in China which contained the extraordinary (but not unbelievable) claim that a third of all researchers at "top institutions" in China admitted to plagiarism, falsification, or fabrication. The feature contains the extreme example of the systematic fabrication of crystal structures, but one would hope that the majority of the confessions of misconduct represent no more than the borrowing of a few paragraphs by those for whom English is not a first language (a crime, but not a hanging offence). But every publisher has its examples of photoshopped figures and impossible datasets, and it's hard to deny that certain countries pop up more often than others.

The fun part of the article, though, is the comments thread, which is full of readers speculating about why China should have such a high rate of misconduct. A drive for quick success and control of the sector by short-sighted bureaucrats with no real understanding of science are suggested. A rapid expansion of science with a bottom-heavy hierarchy and insufficient supervision, or else the fierce competition and pressure of a publish-or-perish world. Perhaps it's just in the nature of communist societies? Even the impact factor is cited as a contributory cause, and the blame somehow shifted onto the publishers.

Nobody seems to consider the simple possibility that researchers from other countries might have access to better tools for disguising their fraud.

16 Jan 2010

Reviewing Medical Hypotheses

This is a guest post by Joe Dunckley
Zoë Corbyn writes in the Times Higher this week that Elsevier have "started an internal review" of legendary journal Medical Hypotheses following its publication last year of two hiv/aids denialism papers (covered in Bad Science here and Respectful Insolence here). One of the offending papers, lead-authored by notorious aids denialist Peter Duesberg, took a mere two days from submission to acceptance by the peer-review-shunning "journal", and had already been rejected by all of the real hiv/aids journals for making such embarrassing claims as that Uganda's population increase proves that hiv cannot cause aids.

It would be a shame to lose the journal that gave us Ejaculation as a potential treatment of nasal congestion in mature males and the equally entertaining response, Ejaculation as a treatment for nasal congestion in men is inconvenient, unreliable and potentially hazardous, but at the same time, we have to consider whether we are really comfortable continuing to humour the confused outbursts of Bruce Charlton.

It's interesting to note that the best defence for the journal's existence that Corbyn could find was this: "while peer review worked for 'normal science', it also had the power to suppress radical ideas." The defence comes from intelligent design creationist Steve Fuller, whose own ideas not even Med Hypotheses, I think, has sunk so low as to publish.


This is a guest post by Joe Dunckley
I am Joe. You might remember me from such websites as cotch dot net, friendfeed, and twitter. I work for an STM publishing company, formerly editing molecular biology journals, but now helping to develop its web publishing technology. I will be posting my ill-considered thoughts about science publishing alongside Matt's on Journalology, so that I don't have to bore the readers of my real blog with them. Do follow my vanity site for updates on all my other blogging.
I am Joe. You might remember me from such websites as cotch dot net, friendfeed, and twitter. I work for a STM publishing company, formerly editing molecular biology journals, but now helping them develop their web publishing technology. I will be posting my ill-considered thoughts about science publishing alongside Matt's on Journalology, so that I don't have to bore the readers of my real blog with them. Do follow my vanity site for updates on all my other blogging.