The many faces of PubMed search


The number of third-party tools for searching PubMed data seems to be growing. As the NLM prepares to roll out a new search interface, companies are starting to offer alternative interfaces for searching this important collection. The attraction is obvious: a large, motivated group of searchers, an important information need, and a manageable collection size. A decade ago, over 20 million searches were done monthly through the NLM site, and the numbers are surely higher today; the collection is large but not huge, currently over 17 million entries (some with full text), occupying somewhat more than 60GB of disk space. Thus we see an increasing number of sites offering search over this collection, including PubGet, GoPubMed, TexMed, and HubMed. The offerings range from basic to flashy, and appear to be aimed at different groups of searchers.

PubGet, for example, attempts to simplify access to the full text of articles rather than showing just abstracts. (See this article for a good review of its strengths and weaknesses.) GoPubMed has a slick interface that offers a limited form of faceted search through which a query can be refined, but the interface is inconsistent (multiple facets cannot be combined). TexMed offers a minimalistic interface with the hook that bibliographic references can be downloaded easily for the retrieved documents. HubMed is another interactive web site that can display abstracts inline with search results, can export search results like TexMed, and has links to a range of other search tools for each article. Because searches often retrieve many documents, most of these sites (including the NLM PubMed interface) offer a way to save documents persistently, although not all of them allow results from different searches to be kept separate.
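As a rough sketch of what such export features do behind the scenes, here is one way to pull MEDLINE-formatted records for a handful of articles via NCBI's E-utilities (efetch). The PMIDs below are placeholders, and none of the reviewed tools necessarily works this way; this is just an illustration under those assumptions.

```python
# Sketch: export bibliographic records for a few PubMed IDs via NCBI E-utilities.
# The PMIDs are placeholders; substitute IDs from your own search results.
import urllib.parse
import urllib.request

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

def fetch_medline(pmids):
    """Return MEDLINE-formatted records for the given PubMed IDs."""
    params = urllib.parse.urlencode({
        "db": "pubmed",
        "id": ",".join(pmids),
        "rettype": "medline",  # plain-text MEDLINE format; most reference managers can import it
        "retmode": "text",
    })
    with urllib.request.urlopen(f"{EFETCH}?{params}") as resp:
        return resp.read().decode("utf-8")

if __name__ == "__main__":
    print(fetch_medline(["12345678", "23456789"]))  # placeholder PMIDs
```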

While I will not attempt a thorough analysis of the search effectiveness of these tools, I found that they return radically different result sets for the one query (“colon cancer survival rates”) I tried. As a reference point, PubMed produced 4093 matches after translating my query into a somewhat more complicated expression. HubMed also returned 4093 hits, but did not explain its ordering of the results. GoPubMed seems to have returned the same results (in the same order) as PubMed, while PubGet offered 3994 results with no obvious cues as to how it managed this feat. Finally, TexMed returned 250 documents. While these interfaces show considerable attention to interface design, it is not clear from casual inspection whether they will actually improve the effectiveness of serious medical search. Certainly none of them wins any awards for transparency in search, a characteristic that is important for the recall-oriented searching practiced by medical reference librarians.
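For anyone who wants to reproduce the baseline numbers, the raw PubMed hit count and query translation can be checked directly against NCBI's E-utilities (esearch). This is only one way to verify the figures, not the method any of these tools uses, and the count will drift as the collection grows.

```python
# Sketch: fetch PubMed's hit count and expanded query for a free-text search
# via NCBI E-utilities (esearch).
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_count(query):
    """Return (hit count, PubMed's expanded query) for a free-text search."""
    params = urllib.parse.urlencode({"db": "pubmed", "term": query, "retmax": 0})
    with urllib.request.urlopen(f"{ESEARCH}?{params}") as resp:
        root = ET.fromstring(resp.read())
    count = int(root.findtext("Count"))
    translation = root.findtext("QueryTranslation")
    return count, translation

if __name__ == "__main__":
    n, expanded = pubmed_count("colon cancer survival rates")
    print(n)         # total matching documents (will change over time)
    print(expanded)  # the more complicated expression PubMed actually runs
```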

Update: I forgot to mention Hakia, a web site that can search a range of collections, including PubMed. Each query match is shown with an in-line snippet highlighting the matching phrase, but Hakia does not show abstracts inline, requiring the searcher to click a link that loads the matching document entry in PubMed. Unfortunately, it reports neither the total number of documents retrieved nor the order in which they are presented. Counting manually, I identified 186 matching documents.

2 Comments

  1. Sarah Vogel says:

    Thanks for sharing your experience. The difference in hits is very interesting. That’s the kind of thing that makes me want to dig around under the hood and figure out what’s going on! I’ve played with all of these a bit but haven’t done any serious evaluations, since my brief looks didn’t make me think they would be particularly useful resources for me. Several colleagues have mentioned that they find GoPubMed useful for analyzing results for search terms (e.g., http://laikaspoetnik.wordpress.com/2008/09/24/finding-assigned-mesh-terms-and-more-pubreminer/).

  2. One approach to understanding the overall behavior of these tools is to use a test collection of queries with known relevant documents to assess their recall and precision. While this may not give a deterministic explanation of a particular result set, it might still give searchers some confidence about the overall effectiveness of different algorithms. Of course, care would need to be taken to characterize queries appropriately: it’s well known from TREC experiments, for example, that some topics are much harder than others in terms of systems’ (or users’) ability to find relevant documents.

    I am sure that test collections of this sort have been devised for PubMed in the past; two challenges that need to be addressed are
    a) to keep them up to date as the collection changes, and b) to archive the results of these analyses in a publicly accessible way so that future researchers and system developers can build on well-documented results. (A minimal sketch of such a recall/precision calculation follows below.)
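A minimal sketch of the recall/precision calculation described in the comment above, assuming a hand-built set of judged-relevant PMIDs and one tool's ranked result list; all of the IDs here are invented placeholders, and real qrels would come from expert relevance judgments, as in TREC-style evaluations.

```python
# Sketch: recall/precision for one query against a hand-built test collection.
# The PMID sets below are invented placeholders.

def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved list against a relevant set."""
    retrieved_set = set(retrieved)
    hits = len(retrieved_set & relevant)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

if __name__ == "__main__":
    relevant = {"111", "222", "333", "444"}          # judged-relevant PMIDs (placeholder)
    retrieved = ["111", "555", "222", "666", "777"]  # one tool's result list (placeholder)
    p, r = precision_recall(retrieved, relevant)
    print(f"precision={p:.2f} recall={r:.2f}")
```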
