30 Jan 2007

Tools to search the literature, and PubReMiner plugin

Recently I came across PubMed PubReMiner, created by Jan Koster. I've been very struck by this tool, which I think is pretty much the best way to search PubMed.

I have previously tried a number of different tools (see my list of Tools to search the literature in the sidebar), and Google Scholar by far outstrips a standard PubMed search due to the use of the PageRank algorithm to pull the most prestigious work to the top of the results. The PageRank algorithm doesn't just count citations; rather, it weights each one by how often the referring article has itself been cited. A citation from a source that is itself heavily cited counts more than one from a source that nobody has ever cited.
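To make the weighting idea concrete, here is a rough Python sketch of a PageRank-style calculation on a tiny, invented citation graph. It is not Google Scholar's actual ranking code; the paper names, the damping factor of 0.85 and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(citations, damping=0.85, iterations=50):
    """Rough PageRank sketch. `citations` maps a paper to the papers it cites."""
    # Collect every paper mentioned, whether it cites or is only cited.
    papers = set(citations)
    for cited in citations.values():
        papers.update(cited)
    papers = sorted(papers)
    n = len(papers)

    # Start with equal rank everywhere, then repeatedly pass rank along citations.
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for paper in papers:
            cited = citations.get(paper, [])
            if cited:
                # A paper splits its own rank among the papers it cites.
                share = damping * rank[paper] / len(cited)
                for target in cited:
                    new_rank[target] += share
            else:
                # A paper that cites nothing spreads its rank evenly.
                share = damping * rank[paper] / n
                for target in papers:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Invented example: A and B each have exactly one citation.
# A is cited by "hub", which three other papers cite; B is cited by
# "lone", which nobody cites.
example = {
    "x1": ["hub"],
    "x2": ["hub"],
    "x3": ["hub"],
    "hub": ["A"],
    "lone": ["B"],
}
ranks = pagerank(example)
for paper in sorted(ranks, key=ranks.get, reverse=True):
    print(paper, round(ranks[paper], 3))
```

In this toy graph A and B have the same raw citation count, but A comes out ahead because its single citation comes from a well-cited paper.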

I've tried out Kfinder, which takes an abstract or other text as the input, and suggests keywords based on the frequency of occurrence of improbable words. You select keywords, and it returns researchers who match that search in Medline at least twice. Kfinder is quite slow and limited to Medline, but it is intuitive, and a good start to selecting keywords if you haven't had much practice.
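I don't know how Kfinder scores words internally, but the general idea of picking out "improbable" words can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function name `suggest_keywords`, the tiny background frequency table, and the scoring rule (count in the text multiplied by the negative log of the word's background probability).

```python
import math
import re
from collections import Counter


def suggest_keywords(text, background, top_n=5, unseen_prob=1e-6):
    """Rank words that are frequent in `text` but rare in general usage.

    `background` maps a word to its (assumed) probability in ordinary text;
    words missing from it are treated as very rare (`unseen_prob`).
    """
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    scores = {
        word: count * -math.log(background.get(word, unseen_prob))
        for word, count in counts.items()
    }
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


# Hypothetical background probabilities, invented for this example.
background = {"the": 0.05, "of": 0.03, "in": 0.02, "gene": 0.0005}
abstract = "The transfer of the gene in the genome: horizontal gene transfer"
print(suggest_keywords(abstract, background, top_n=4))
```

Common words such as "the" score poorly even when they occur often, while rarer terms rise to the top of the suggestions.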

PubNet from the Gerstein lab looks really promising. It visualizes the network resulting from a query to Medline. The network to the left, focussed around Howard Ochman and Emmanuel Lerat, clearly shows a network of collaborating colleagues, but I only ran that search because I knew of the network already. It could be useful, but I've not found the time to devote to exploring its possibilities, and it takes a while to generate the visualization at times. If it were quicker and easier to navigate the results, I might use it.

I've only had a quick play with Authoratory, and while the concept is excellent (automatically mining information from the results of PubMed searches), the delivery is lacking. When Deborah Saltman, our Editorial Director for Medicine, tried it she found that she was missing from the results, and the keyword search doesn't take Boolean searches yet. A definite work-in-progress.

e-Biosci is clever in that it accepts any text as input (an abstract, or even a whole manuscript, although it was quite sluggish!) and calculates the concepts contained within. You can add and remove concepts to refine your search, and weight how important they are, and then search using these concepts in Medline abstracts and some full text, including BioMed Central's. The advantage of this approach is that you never need to think about appropriate keywords or search terms; the disadvantage is that some concepts are quite diverse. A good example is that an abstract about physician uncertainty in medical decision-making returned some physics articles near the top! I find that it can return items that you probably wouldn't have found otherwise, and can be very accurate at times.

eTBLAST is one of the big hitters in the field. It runs searches against Medline automatically when given an input of text, much as e-Biosci does, and returns a list of related articles. You can then get a list of experts in the field, journals to submit to, the history of publishing in this field and several more features. eTBLAST does all the thinking for you, but it does take its precious time. It can take minutes for the results to be returned, which makes me think that the option to have the results emailed is the only way it will get routinely used.

But, as I said at the top, PubMed PubReMiner is my current favourite. Why? Well, it takes standard PubMed queries, which makes it very easy to start using. It is quick and unfussy, and returns the results in easy-to-read columns: a list of the most common journals in the results, a list of the authors who appear most often, and a list of words that most commonly appear in the abstracts, as well as MeSH terms, affiliations and the publications by year. It is simple, but highly effective.
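The kind of tallying that makes those columns so useful is easy to picture. Below is a minimal Python sketch, not PubReMiner's own code: it assumes the search results have already been fetched and parsed into simple dictionaries, and the field names (`journal`, `authors`, `year`, `abstract`) and the sample records are invented placeholders.

```python
import re
from collections import Counter


def summarise(records, top_n=3):
    """Count the most common journals, authors, years and abstract words."""
    journals = Counter(r["journal"] for r in records)
    authors = Counter(a for r in records for a in r["authors"])
    years = Counter(r["year"] for r in records)
    # Count each word once per abstract, so one wordy abstract cannot dominate.
    words = Counter(
        w
        for r in records
        for w in set(re.findall(r"[a-z]+", r["abstract"].lower()))
    )
    return {
        "journals": journals.most_common(top_n),
        "authors": authors.most_common(top_n),
        "years": sorted(years.items()),
        "words": words.most_common(top_n),
    }


# Placeholder records, made up purely to show the shape of the data.
records = [
    {"journal": "Journal X", "authors": ["Author A", "Author B"],
     "year": 2005, "abstract": "Genome reduction in bacteria"},
    {"journal": "Journal X", "authors": ["Author A"],
     "year": 2006, "abstract": "Genome evolution and gene loss"},
    {"journal": "Journal Y", "authors": ["Author C"],
     "year": 2006, "abstract": "Gene transfer between species"},
]
summary = summarise(records)
for column, values in summary.items():
    print(column, values)
```

A real version would also want to drop stop words such as "in" and "and" before counting, and to tally MeSH terms and affiliations in the same way.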

I liked it so much that I made a Firefox search plugin for it. After vainly following a tutorial, I found that searchplugins.net has a plugin generator, which I've used to create one for PubReMiner, complete with a logo. It is set to the default limit of 1,000 abstracts. You can view the source code, and search for it under PubMed or PubReMiner. You can also install it now.


Outplay said...

Authoratory now has full text search, which makes finding interesting people so much easier!

Martin said...

PubNet was quite helpful during an evaluation of our institute when we had to document the development of a collaboration with another institute. Unfortunately, this requires quite some work on the search strings to limit the data to certain topics and to the periods of interest.
In addition we used it once to create a representation of all co-authors of one of our professors as a gift when he retired. Luckily, his name wasn't Smith.