The Gloves Come Off: Google Scholar vs. PubMed

Logos courtesy of PubMed and Google

An interesting article has been making the rounds of health library listservs. Comparing search results from PubMed and Google Scholar based on 4 clinical questions, the authors (from Texas Tech University Health Sciences Center and Carnegie Mellon University) conclude:

“PubMed searches and Google Scholar searches often identify different articles. In this study, Google Scholar articles were more likely to be classified as relevant, had higher numbers of citations and were published in higher impact factor journals. The identification of frequently cited articles using Google Scholar for searches probably has value for initial literature searches.”

This is where those critical analysis skills come in handy, and it helps to know some of the behind-the-scenes workings of the tools you’re working with – in this case, PubMed and Google Scholar. Google’s immense popularity – especially among students – makes this a highly relevant discussion, and so I’ve made a quick comparison table that addresses much of the umbrage that has poured out (in librarian circles, in any case) in response to the article:

                                     | PubMed                 | Google Scholar
Transparency: What’s being searched? | MEDLINE dataset        | academic papers from sensible websites
Search result ranking                | By date of publication | Super-secret relevancy algorithm

There are undoubtedly benefits to using Google Scholar – you can capture things outside the scope of your more traditional, published biomedical lit, including dissertations, conference abstracts, posters, and other content that can go into institutional repositories (such as the University of Michigan’s Deep Blue) but not into the MEDLINE dataset.

BUT. There are also serious drawbacks to using Google Scholar, and personally I think this is what users really need to understand. One MEDLIB-L discussion participant put it succinctly:

“you’re searching a database that you have no editorial definition for or ability to reproduce your results and/or search string.” 

Google is famous (notorious?) for not divulging exactly what its bots crawl and index – we’ve covered it briefly before, when Scholar introduced a version of citation metrics. That means if you’re searching with the assumption that your search is grabbing everything – well, I hesitate to say that you’re wrong, but you can’t be certain that you’re right.
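To make the reproducibility point concrete, here is a minimal sketch of what "being able to reproduce your search string" can look like on the PubMed side. It assumes NCBI's public E-utilities `esearch` endpoint and its `db`, `term`, `retmax`, and `retmode` parameters; the `SearchRecord` class, the example query, and the date are all hypothetical and only there for illustration. Nothing is sent over the network; the code just records a search so it can be rerun later.

```python
from dataclasses import dataclass
from datetime import date
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed here as the way to rerun a PubMed search).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


@dataclass(frozen=True)
class SearchRecord:
    """Hypothetical record of one search: which tool, what string, when, how many hits kept."""
    tool: str
    query: str
    run_on: date
    max_results: int

    def pubmed_url(self) -> str:
        # Build a URL that anyone can use to rerun the same search string.
        params = {
            "db": "pubmed",
            "term": self.query,
            "retmax": self.max_results,
            "retmode": "json",
        }
        return f"{ESEARCH}?{urlencode(params)}"


# Illustrative values only; not taken from the study discussed above.
record = SearchRecord(
    tool="PubMed",
    query="asthma AND inhaled corticosteroids",
    run_on=date(2013, 1, 15),
    max_results=20,
)
print(record.pubmed_url())
```

I'm not aware of a documented equivalent for Google Scholar, which is the gap the MEDLIB-L comment is pointing at: you can write down the words you typed, but not the definition of what they were run against.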

Knowing how your tool works on the back end is important, because it will affect your search results, but that knowledge is not a substitute for covering your bases. The bottom line? Don’t default to one search tool – especially if your search is meant to be thorough or comprehensive.
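If you do run the same question through both tools, a quick way to see why one source isn't enough is to compare the two result lists. The sketch below is a hypothetical helper, not anything from the study: it assumes you have already reduced each result to some shared identifier (a DOI, say), and the identifiers shown are placeholders.

```python
def compare_results(pubmed_ids, scholar_ids):
    """Split two result lists into shared and tool-specific articles."""
    pubmed, scholar = set(pubmed_ids), set(scholar_ids)
    return {
        "both": pubmed & scholar,          # found by both tools
        "pubmed_only": pubmed - scholar,   # missed by Google Scholar
        "scholar_only": scholar - pubmed,  # missed by PubMed
        "combined": pubmed | scholar,      # what searching both gives you
    }


# Placeholder identifiers for illustration only.
pubmed_hits = ["article-A", "article-B", "article-C"]
scholar_hits = ["article-B", "article-C", "article-D", "article-E"]

overlap = compare_results(pubmed_hits, scholar_hits)
for label, ids in overlap.items():
    print(label, sorted(ids))
```

Anything that lands in `pubmed_only` or `scholar_only` is an article you would have missed by defaulting to a single tool.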

4 thoughts on “The Gloves Come Off: Google Scholar vs. PubMed”

  1. I don’t think the article is very good. They compare the relevance of the first page of results for PubMed and Scholar. PubMed doesn’t do relevance ranking, so… it will lose. And to compare, you have to create equal search strategies, but these strategies differ so much that you cannot compare them. The peer reviewers should have done a better job on this one and not accepted it this way.

    1. Thanks for your comment! You definitely bring up some valid points – the study in question was only looking at the top 20 search results for each clinical question, and whether that is really enough of a sample is a good thing to think about.

      Much of the discussion about the study revolves around the issues you raise, yet I hope that it still serves as a good jumping off point to discuss the necessity of searching numerous sources for truly comprehensive results, as well as the importance of understanding how the research tools we use regularly are built.
