Journal Impact Factor: Science’s Misused Metric and its Limitations
Submitted by: Sana Batool
Macaulay Honors College
Course: The Purpose, Practice and Politics of Science (MHC 360)
Instructor: Dr. Harold Varmus
Date: May 16th, 2018
Introduction
Why are some journals, like Nature, Science, and Cell, better known than others? Part of the reason is their high Journal Impact Factor, abbreviated as JIF, which is by far the most discussed bibliometric indicator in the scientific community. Eugene Garfield first mentioned the idea of JIF in 1955, but it was not until 1975 that he first reported the impact factor for scientific journals. It was primarily created to help US librarians make subscription decisions using an objective quantitative method. Since its inception, JIF has gradually evolved into a tool not only for quantifying the impact of journals but also for measuring the importance of the individual articles published in them. However, JIF is far from being an appropriate measure of scientific impact, and it should not be used as a proxy for scientific quality and productivity. Evaluation committees should instead base their decisions on grants and job positions on a combination of more appropriate article-level metrics and on biographical sketches that briefly list the major accomplishments of a candidate.
Limitations of JIF
Many skeptics of JIF point out its manipulable nature, which journal editors could exploit to increase the impact factor of their journals. JIF is reported annually in the Journal Citation Reports (JCR), currently published by Clarivate Analytics. It is calculated as a mean value of citations (see Fig. 1), for which published documents are categorized as citable and non-citable items. Research and review articles are considered citable items, whereas editorials, letters to the editor, and news items are considered non-citable. In the calculation of JIF, the numerator includes citations accumulated by both citable and non-citable items, whereas the denominator, which accounts for the number of articles published, includes only the citable items. This asymmetry between the numerator and denominator inflates the JIF of journals that publish many more non-citable items than others. Moreover, review articles are compilations of previously published studies and are often heavily cited. Journal editors could therefore increase the impact factor of their journals by publishing more non-citable items and review articles.
$$\text{2018 Journal Impact Factor} = \frac{\text{Number of citations received by Journal X in 2018 for \textbf{citable and non-citable} items published in 2016--2017}}{\text{Number of \textbf{citable} items published in Journal X in 2016--2017}}$$

Fig. 1. Formula used for the calculation of JIF, reported annually in the JCR.
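To make this asymmetry concrete, below is a minimal Python sketch of the Fig. 1 formula, using hypothetical citation counts rather than real JCR data. It shows how a journal whose non-citable front matter attracts citations earns a higher JIF for the same research output.

```python
# A minimal sketch of the Fig. 1 formula with hypothetical numbers,
# illustrating how the numerator/denominator asymmetry inflates JIF.

def journal_impact_factor(citations_to_citable, citations_to_non_citable,
                          num_citable_items):
    """Citations to ALL items, divided by the count of CITABLE items only."""
    return (citations_to_citable + citations_to_non_citable) / num_citable_items

# Journal A: 100 research/review articles, no front matter.
jif_a = journal_impact_factor(300, 0, 100)

# Journal B: identical research output, plus editorials and news items
# that attract 80 extra citations but are excluded from the denominator.
jif_b = journal_impact_factor(300, 80, 100)

print(jif_a)  # 3.0
print(jif_b)  # 3.8 -- inflated purely by citations to non-citable items
```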
Another means of editorial manipulation of JIF is self-citation: authors and editors who control citation practices can use it to inflate JIF. This systematic flaw encourages authors to add unnecessary citations to their journal’s recently published articles in order to raise its JIF. An analysis published in the JCR in 2002 revealed that, across all disciplines, about 12% of the citations for a journal were, on average, self-citations. There have been reported cases of journals returning papers to authors and asking them to add more citations to articles published in those same journals.
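As a rough numerical illustration (the counts below are hypothetical; only the 12% average comes from the 2002 JCR analysis cited above), the contribution of self-citations to a journal’s JIF can be computed directly:

```python
# Hypothetical illustration of how self-citations pad the JIF numerator.
total_citations = 500   # all citations counted in the JIF numerator
self_citations = 60     # citations from the journal to its own recent papers
citable_items = 120     # the JIF denominator

self_citation_rate = self_citations / total_citations  # 0.12, the JCR average
jif_with_self = total_citations / citable_items                         # ~4.17
jif_without_self = (total_citations - self_citations) / citable_items   # ~3.67

print(self_citation_rate, jif_with_self, jif_without_self)
```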
JIF is also criticized for the brief citation window used in its calculation. As indicated by Fig. 1, JIF only takes into account citations over a two-year window, which captures only the short-term impact of articles. Dr. Rob Hyndman, Professor of Statistics at Monash University, noted in a blog post that this two-year period emphasizes recency over the longevity of scholarly documents and does not measure the real impact of articles. He gave the example of one of his papers, which was cited only 47 times in the first ten years after publication but accumulated 457 citations over the next ten years. If JIF were calculated over a longer window, this article would perhaps dramatically increase the JIF of the journal it was
published in.
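To see how much the two-year window can miss, here is a stylized Python example based on Hyndman’s paper; the decade totals (47 and 457 citations) come from his account, while the year-by-year split is hypothetical:

```python
# Stylized citation history of a "slow-burning" paper: about 47 citations
# in years 1-10 and 457 in years 11-20 (decade totals from Hyndman's
# example; the yearly breakdown below is invented for illustration).
yearly = [1, 2, 3, 4, 5, 6, 6, 6, 7, 7,            # years 1-10  (sum = 47)
          30, 35, 40, 45, 48, 50, 52, 50, 55, 52]  # years 11-20 (sum = 457)

two_year_window = sum(yearly[:2])   # citations visible to a JIF-style window
lifetime = sum(yearly)              # the paper's actual accumulated impact

print(two_year_window, lifetime)    # 3 vs. 504
print(two_year_window / lifetime)   # ~0.006 -- under 1% of impact captured
```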
Although the number of citations an article receives does provide substantial data for gauging its impact in its field, it is equally important to consider the nature of those citations, which can be positive or negative depending on whether the paper is credited or criticized. A journal would still see an increase in its JIF even if its papers were cited as examples of weak studies. For instance, a paper on arsenic-loving bacteria published in Science in 2010 has been cited over 500 times, but it has had essentially no impact in the field because its results remain irreproducible, and it is mostly cited by others to criticize the study. Dr. Stephen Curry, a structural biologist at Imperial College and a longtime critic of JIF, says that the fixation on JIF is problematic: if a single number could really quantify the value of a scholarly document, then this Science paper, which he calls a “worthless piece,” would be considered impactful.
One of the major shortcomings of JIF is that it is calculated as a mean rather than a median. This means that a minority of heavily cited papers in a journal can account for the vast majority of that journal’s citations and raise its JIF, producing a highly skewed citation distribution. For instance, an article in Nature might be considered impactful because of the journal’s high JIF yet in reality have zero citations. In a 2016 paper by Lariviere et al., the citation distribution plots of 11 different journals showed extensive overlap, demonstrating that JIFs cannot be used to measure the impact of an individual article. It has also been found that, on average, only about 28% of citable items accumulate citations equal to or greater than the impact factor of their journals, while about 70% of them fall below the reported JIF.
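The mean-versus-median problem is easy to demonstrate. The sketch below uses hypothetical citation counts for twenty papers in one journal to show the skew Lariviere et al. documented: the mean (which is what JIF reports) sits far above what a typical paper receives.

```python
from statistics import mean, median

# Hypothetical citation counts for 20 papers in one journal: most papers
# are cited a handful of times, a few are cited heavily.
citations = [0, 0, 1, 1, 1, 2, 2, 2, 3, 3,
             3, 4, 4, 5, 6, 8, 10, 40, 120, 300]

print(mean(citations))    # 25.75 -- the JIF-style average
print(median(citations))  # 3.0   -- what the typical paper actually gets

# Fraction of papers that reach the journal's average:
share = sum(c >= mean(citations) for c in citations) / len(citations)
print(share)              # 0.15 -- only 3 of 20 papers reach the mean
```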
In an effort to improve JIF, Dr. Randy Schekman, a 2013 Nobel Laureate and editor-in-chief of eLife, visited Thomson Reuters, which previously owned the JCR, with a group of scientists a few years ago. They argued that the skewness of citation distributions makes the average value of citations meaningless, and they suggested calculating JIF as a median. Thomson Reuters responded that they had previously tried to calculate JIF as a median, but it had resulted in the same impact factor for multiple journals. Since the scientific community could not then differentiate between journals with the same JIF, Thomson Reuters abandoned this approach, saying, “we have always calculated it as a mean and that’s what we are going to continue doing.” Dr. Schekman calls this a “bogus excuse” and says that the publishers of JIF see themselves as an “Olympic clock timekeeper” that wants to avoid ties at any cost.
Critics of JIF
Dr. Schekman made a public statement after he won the Nobel Prize in 2013, announcing that his lab would boycott the so-called “luxury journals,” namely Nature, Science, and Cell. He has compared these journals to fashion designers who produce limited-edition bags, and he says that chasing journals with high JIF is a “phony obsession” that impedes the progress of science and “makes life miserable for the biomedical scientists.” Soon after his statement in 2013, Nature stopped advertising its impact factor on its website and called it a false metric.
Dr. Stefano Bertuzzi, CEO of the American Society for Microbiology (ASM), is another well-known critic of JIF and has called for the complete abandonment of the JIF system, which would redirect scientists’ attention back to the content of a paper instead of the prestige of the journal. As the Executive Director of the American Society for Cell Biology (ASCB) in 2012, he coordinated the San Francisco Declaration on Research Assessment (DORA) after discussing growing concerns over the misuse of JIF at the society’s annual meeting.
DORA is an important document that lays the foundation for the movement against JIF. It encourages scientists on hiring, promotion, and grant committees to judge papers based on their content instead of where they are published. DORA also calls on scientists to submit their manuscripts to journals that best fit the scope of their studies instead of blindly pursuing journals with high JIF. Dr. Schekman was one of the first signatories, among other eminent scientists who showed support for DORA. As of this writing, DORA has been signed by 12,047 individuals and 475 organizations.
Consequences of misapplication of JIF
Institutions and individuals are both responsible for promoting JIF in the research evaluation context, and this has multiple consequences. In some countries, the obsession with publishing in high-JIF journals is so strong that institutions have adopted cash-for-publication policies. According to a recent article, Chinese scientists can earn over $100,000 for a paper published in prestigious Western journals like Nature and Science. Such policies not only externalize the incentive to do scientific research but also promote fraud, as desperate scientists falsify and fabricate data to be able to publish in top journals. Dr. Schekman says that “such obsession has made biomedical publishing very uncomfortable and led to fraud, as the scientists feel the need to jazz up their results and sensationalize their data just to publish in luxury journals.” In some cases, journals have even asked authors to change the titles of their papers to make them more captivating. Such practices subtly push researchers to choose topics and dissemination venues based on a single number that ends up steering science across various fields.
As Dr. Curry rightly points out, young researchers are the hardest hit by the current system of evaluation. Many students may want to pursue fascinating and novel questions but are dissuaded by JIF. In a correspondence, Gomar points out that many PhD students are under constant pressure from their principal investigators to have at least one Nature, Science, or Cell publication to bolster their CVs for jobs. Top journals like Nature have an acceptance rate of only about 5%, which means few PhD students end up with a publication in these journals by the end of their doctoral program; consequently, their chances of obtaining a postdoctoral fellowship decrease, which narrows their career options. Young researchers therefore tend to focus more on where to publish than on what to publish, resulting in a loss of originality and creativity in science.
Dr. Sven Hendrix, a neuroanatomy professor in Belgium, has called impact factors the “currency of fame” in academia and emphasized that in an environment where JIF is highly valued, as in the scientific community, the only researchers who climb the ladder of success are those whose scientific endeavors are dictated by the desire to publish in top-notch journals. Dr. Bertuzzi calls this a “deeply rooted cultural issue” that not only controls the fate of young researchers but also slows the dissemination of scientific knowledge: it can take scientists several months to publish a paper in a prestigious journal. The misapplication of JIF also distorts scientific progress, because most PhD students and postdocs are attracted to labs with many Nature and Science publications and overlook labs that may do meaningful science but publish in lesser-known journals.
Persistence of the misuse of JIF
Despite the significant shortcomings of the JIF system and the numerous consequences of its misuse, it continues to be the main tool for research assessment in the scientific community. Funding agencies use it to award grants; hiring and promotion committees use it to fill academic positions at universities; tenure committees use it to grant tenure to faculty members; and principal investigators use it to hire postdoctoral and graduate students. Many have referred to this obsession as “journal mania” and “impactitis.”
“Impact factor mania” persists for a number of reasons. The scarcity of grants and jobs, combined with the increasing number of postdocs in science, forces evaluation committees to discriminate among many similar candidates and grant proposals. Accomplishing this task with a single quantitative bibliometric tool is appealing, as a single number often gives the illusion of being an objective measure.
Moreover, the complexity of science that has resulted from hyper-specialization over the past few decades has made research assessment arduous across multiple fields. JIF offers a convenient surrogate measure of quality for scientists who have to evaluate papers outside the scope of their expertise. Some critics have also attributed the continued use of JIF to laziness on the part of the committees responsible for evaluating scientific work.
Proponents of JIF
Although most of the literature discussing JIF criticizes the system and encourages either its abandonment or replacement, a small proportion of scientists think otherwise. Dr. Ludo Waltman, deputy director of the Centre for Science and Technology Studies (CWTS) at Leiden University in the Netherlands, is one such scientist who does not necessarily agree with the critics of JIF. Although he acknowledges the shortcomings of the JIF system, he does not believe that it should be abandoned. He argues that people who use JIF to compare journals for subscription purposes are indirectly comparing the papers in those journals, which makes it a valid tool for assessing the quality of papers as well. He says that critics should be consistent in the way they treat JIF: if they criticize the system, they should accept the full consequences of their criticism and reject any use of JIF, because it is impractical to separate these two ways of using Garfield’s bibliometric system.
Dr. Waltman also seems to favor JIF because of the need for heuristics for quick decision-making in a hypercompetitive field like science. He says that reading biosketches for all candidates might be too time-consuming, and JIF provides a shortcut for narrowing the application pool down to a smaller number of candidates, who can then be subjected to in-depth evaluation. Dr. Schekman refutes this argument by pointing out that reading a brief biosketch is far less work than reading letters of recommendation, and that spending a few extra minutes evaluating applications is a small cost to pay for a better and fairer assessment of scientific quality, free of the fallacious JIF system. Dr. Bertuzzi holds similar views and says, “there are no shortcuts for research assessment.”
Alternative Ways to Evaluate Scientific Papers and Authors
Since JIF is a journal-level metric, it is inappropriate to use it to assess individual articles; replacing it with article-level alternatives would therefore be more effective. Two prominent alternatives are the h-index, an author-level metric introduced by Jorge Hirsch in 2005, and the Relative Citation Ratio (RCR), an article-level metric created by an NIH working group in 2015. Such metrics reflect the number of publications of individual scientists as well as the number of citations of individual papers, allowing authors to be compared without bias toward the prestige of journals.
Just like JIF, these metrics have their own downsides. Dr. Curry describes the h-index as fundamentally flawed because it relies heavily on the quantity of papers. For instance, Dr. Harry Kroto won the Nobel Prize in Chemistry in 1996 largely on the strength of a single influential paper, but he ranks only 264th by h-index because of his relatively low number of publications. The h-index also unduly favors senior scientists, as young researchers with only a handful of papers score low no matter how impactful their papers are.
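A short implementation clarifies why this happens: the h-index is capped by an author’s paper count, so one landmark paper can never score more than 1. (A minimal sketch; real rankings use curated citation databases.)

```python
def h_index(citation_counts):
    """Largest h such that the author has h papers each cited >= h times."""
    h = 0
    for rank, cites in enumerate(sorted(citation_counts, reverse=True), start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# One landmark paper cited 10,000 times still yields h = 1 ...
print(h_index([10_000]))   # 1
# ... while ten modestly cited papers beat it easily.
print(h_index([12] * 10))  # 10
```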
RCR has also caught the attention of critics, many of whom have pointed out the restrictive nature of the system and raised concerns about its relevance to disciplines beyond biomedicine. A recent paper by Janssens et al. breaks down the numerator and denominator used in the calculation of RCR and shows that RCR may decrease for older publications, putting established researchers at a disadvantage. Since RCR is a fairly new metric, it warrants further research and perhaps a few adjustments to provide a more robust bibliometric indicator and, potentially, a better replacement for JIF.
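In outline, RCR benchmarks an article’s citation rate against the expected rate of its field. The simplified sketch below, which treats the field rate as a given constant rather than deriving it from co-citation data as NIH’s iCite does, illustrates Janssens et al.’s point that an older paper whose yearly citations taper off sees its RCR fall:

```python
# Simplified sketch of the Relative Citation Ratio idea -- NOT the full
# NIH iCite algorithm, which derives the field rate from co-citation data.

def relative_citation_ratio(citations, years_since_publication,
                            field_rate_per_year):
    """Article citations per year, divided by the field's expected rate."""
    article_rate = citations / years_since_publication
    return article_rate / field_rate_per_year

# 60 citations in 5 years, in a field averaging 4 citations/paper/year:
print(relative_citation_ratio(60, 5, field_rate_per_year=4.0))    # 3.0

# The same paper at 20 years with 100 citations total: its rate, and
# hence its RCR, has dropped even though total citations grew.
print(relative_citation_ratio(100, 20, field_rate_per_year=4.0))  # 1.25
```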
Dr. Curry encourages journals to display citation distributions alongside their JIF on their websites. This would reveal the full extent of the skew of these distributions and give readers a realistic view of JIF. It would serve as a countermeasure to the wider trend in science of inappropriately relying on JIF for the evaluation of researchers, and it would perhaps attenuate JIF’s influence on the assessment of scholarly documents.
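Producing such a display is straightforward. A minimal sketch (hypothetical per-article counts; matplotlib assumed available) plots a journal’s citation distribution with its mean and median marked, making the skew behind the headline JIF immediately visible:

```python
import matplotlib.pyplot as plt
from statistics import mean, median

# Hypothetical per-article citation counts for one journal's two-year window.
citations = ([0] * 15 + [1] * 20 + [2] * 18 + [3] * 12 + [4] * 8 +
             [5] * 6 + [8] * 4 + [15] * 3 + [60, 140])

plt.hist(citations, bins=range(0, 150, 5))
plt.axvline(mean(citations), linestyle="--",
            label=f"mean (JIF-like) = {mean(citations):.1f}")
plt.axvline(median(citations), linestyle=":",
            label=f"median = {median(citations):.1f}")
plt.xlabel("Citations per article")
plt.ylabel("Number of articles")
plt.legend()
plt.show()
```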
The abundance of available metrics in science can be attributed to the natural tendency of scientists to rely on numbers for research assessment. The majority of JIF’s skeptics discourage reliance on any one particular metric when deciding on grant awards or hiring researchers. “The complexity of science cannot be discerned by a single number,” says Dr. Bertuzzi. Ideally, all the papers of all candidates would be read before making decisions about grants and jobs, but given time constraints and the hyper-specialized nature of science, this is not a realistic solution.
Biographical sketches provide a useful alternative for evaluation committees assessing individual scientists. A biosketch highlights an individual’s accomplishments and briefly describes his or her major contributions to the field. Dr. Harold Varmus proposed the inclusion of the biosketch in National Institutes of Health (NIH) grant applications during his time as director of the NIH between 1993 and 1999. It replaced the traditional bibliography and enabled grant committees to familiarize themselves with the content of a scientist’s research before making decisions on grant allocations and awards. Since then, the NIH has required a biosketch from scientists who are applying for or renewing grants. Similarly, the biosketch should be made a mandatory part of every grant and job application, because a single number from any metric system cannot fully gauge the merit of scientific papers, and informed decisions can only be made when committees are acquainted with the scientific work itself.
Garfield’s bibliometric indicator, without a doubt, has significant deficiencies that render it an unfit surrogate for assessing research results and scientific productivity. It has had adverse effects on the scientific community, negatively transforming the publication industry and overemphasizing the prestige of journals in grant allocation and hiring decisions. As a consequence, the research interests of scientists in numerous fields are redirected, damaging scientific progress and discovery.
Individual articles should be evaluated based on the quality of their content, and individual scientists should be assessed based on what they publish rather than where they publish. This is possible if evaluation committees judge applications using the biographical sketches of candidates along with a combination of more appropriate article-level metrics such as the RCR or the h-index. Eradicating JIF would require a cultural change, and it may not occur overnight. Funding agencies, evaluation committees, and principal investigators can gradually uproot the JIF system and reform publication practices in science by discouraging its use as a proxy for scientific impact. They can begin by signing the San Francisco DORA.
References
Bertuzzi, S. (2018, April 27). Personal phone interview.

Curry, S. (2018, May 1). Personal Skype interview.

Schekman, R. (2018, May 6). Personal Skype interview.

Waltman, L. (2018, May 7). Personal Skype interview.

Garfield, E. (2006). The History and Meaning of the Journal Impact Factor. JAMA, 295(1), 90. doi:10.1001/jama.295.1.90

Hyndman, R. J. (2017, June 21). Why I'm not celebrating the 2016 impact factors. Retrieved from https://robjhyndman.com/hyndsight/2016-impact-factors/

Casadevall, A. (2015, October 13). Retrieved from http://mbio.asm.org/content/6/5/e01593-15.full

Liu, X., Gai, S., & Zhou, J. (2016). Journal Impact Factor: Do the Numerator and Denominator Need Correction? PLOS ONE, 11(3). doi:10.1371/journal.pone.0151414

Kiesslich, T., Weineck, S. B., & Koelblinger, D. (2016). Reasons for Journal Impact Factor Changes: Influence of Changing Source Items. PLOS ONE, 11(4). doi:10.1371/journal.pone.0154199

Lariviere, V., Kiermer, V., MacCallum, C., McNutt, M., Patterson, M., Pulverer, B., Swaminathan, S., Taylor, S., & Curry, S. (2016, September 11). A simple proposal for the publication of journal citation distributions. bioRxiv 062109. doi:10.1101/062109

Bornmann, L., Marx, W., Gasparyan, A. Y., et al. (2012). Rheumatology International, 32, 1861. doi:10.1007/s00296-011-2276-1

Curry, S. (2012, August 13). Sick of Impact Factors. Retrieved from http://occamstypewriter.org/scurry/2012/08/13/sick-of-impact-factors/

Cagan, R. (2013). The San Francisco Declaration on Research Assessment. Disease Models & Mechanisms, 6(4), 869–870. doi:10.1242/dmm.012955

Emerging Technology from the arXiv. (2017, July 12). Chinese scientists can be paid up to $165K for publishing a single paper in a top Western journal. Retrieved from https://www.technologyreview.com/s/608266/the-truth-about-chinas-cash-for-publication-policy/

Gomar, F. S. (2014, March 1). How does the journal impact factor affect the CV of PhD students? Retrieved from http://embor.embopress.org/content/15/3/207

Callaway, E. (2016, July 8). Beat it, impact factor! Publishing elite turns against controversial metric. Retrieved from https://www.nature.com/news/beat-it-impact-factor-publishing-elite-turns-against-controversial-metric-1.20224#/b1

Casadevall, A. (2014, March 18). Retrieved from http://mbio.asm.org/content/5/2/e00064-14.full

Hendrix, S. (2018, March 8). Do I need Nature or Science papers for a successful career in science? Retrieved from https://www.smartsciencecareer.com/do-i-need-nature-papers/

Lariviere, V., & Sugimoto, C. (2018, March 5). The Journal Impact Factor: A brief history, critique, and discussion of adverse effects. Retrieved from https://arxiv.org/abs/1801.08992v2

Badner, A. (2015, January 8). The Growing Big Three Boycott: Understanding why scientists are publically boycotting Cell, Nature and Science Journals. Retrieved from http://www.imsmagazine.com/the-growing-big-three-boycott-understanding-why-scientists-are-publically-boycotting-cell-nature-and-science-journals/

Curry, S. (2018, February 7). Let's move beyond the rhetoric: It's time to change how we judge research. Retrieved from https://www.nature.com/articles/d41586-018-01642-w

Waltman, L. (2016, July 11). The importance of taking a clear position in the impact factor debate. Retrieved from https://www.cwts.nl/blog?article=n-q2w2c4

Bertuzzi, S. (2016, March 17). A New and Stunning Metric from NIH Reveals the Real Nature of Scientific Impact. Retrieved from https://www.ascb.org/activation-energy/a-new-and-stunning-metric-from-nih-reveals-the-real-nature-of-scientific-impact/

Ball, P. (2012, January 6). The h-index, or the academic equivalent of the stag's antlers. Retrieved from https://www.theguardian.com/commentisfree/2012/jan/06/bad-science-h-index

Janssens, A. C., Goodman, M., Powell, K. R., & Gwinn, M. (2017). A critical evaluation of the algorithm behind the Relative Citation Ratio (RCR). PLOS Biology, 15(10). doi:10.1371/journal.pbio.2002536

Editage Insights. (2013, November 4). Why you should not use the journal impact factor to evaluate research. Retrieved from https://www.editage.com/insights/why-you-should-not-use-the-journal-impact-factor-to-evaluate-research