Semantometrics: Towards full text-based research evaluation
Petr Knoth and Drahomira Herrmannova
Semantometrics.org
Towards full-text based research metrics: Exploring semantometrics
13th June 2016: Announcement of the report release: https://scholarlyfutures.jiscinvolve.org/wp/2016/06/towards-full-text-based-research-metrics-exploring-semantometrics/
Report available at: http://repository.jisc.ac.uk/6376/1/Jisc-semantometrics-experiments-report-final.pdf
Current impact metrics
• Pros: simplicity
• Cons: insufficient evidence that they capture quality and research contribution, ad-hoc/established axiomatically
The crisis of research evaluation?
Figure: Rejection rates vs Journal Impact Factor (JIF) according to (da Silva, 2015).
Problems of current impact metrics
• Sentiment, semantics, context and motives [Nicolaisen, 2007]
• Popularity and size of research communities [Brumback, 2009; Seglen, 1997]
• Time delay [Priem and Hemminger, 2010]
• Skewness of the distribution [Seglen, 1992]
• Differences between types of research papers [Seglen, 1997]
• Ability to game/manipulate citations [Arnold and Fowler, 2010; PLoS Medicine Editors, 2006]
Alternative metrics
• Alt-/Webo-metrics etc.
– Impact still dependent on the number of interactions in a scholarly communication network (downloads, views, readers, tweets, etc.)
Semantometrics
Contribution to the discipline assessed by using the article manuscript
Many possibilities for semantometrics …
• Detecting whether good research practices were followed (sound methodology, research data/code shared …)
• Detecting paper type …
• Analysing citation contexts (tracking facts propagation) …
• Detecting the sentiment of citations …
• Normalising by size of community that is likely to read the research …
• Detecting good writing style …
Semantometrics – contribution metric
Hypothesis: Added value of publication p can be estimated based on the semantic distance from the publications cited by p to publications citing p.
Detailed explanation: http://semantometrics.org
Contribution metric
• Based on semantic distance between citing and cited publications
– Cited publications – state-of-the-art in the domain of the publication in question
– Citing publications – areas of application
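The idea of measuring the distance between cited and citing publications can be illustrated with a small sketch. This is not the authors' implementation: it assumes plain bag-of-words vectors and cosine distance, takes the unweighted mean over all (cited, citing) pairs, and omits any normalisation the real metric may apply. The function names `bag_of_words`, `cosine_distance` and `contribution` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Term-frequency vector of a text (assumption: whitespace tokenisation)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty text as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def contribution(cited_texts, citing_texts):
    """Mean distance over all (cited, citing) pairs; 0.0 if either set is empty.

    Simplified illustration only: no weighting or normalisation is applied.
    """
    if not cited_texts or not citing_texts:
        return 0.0
    cited = [bag_of_words(t) for t in cited_texts]
    citing = [bag_of_words(t) for t in citing_texts]
    total = sum(cosine_distance(a, b) for a in cited for b in citing)
    return total / (len(cited) * len(citing))


# Made-up texts: a paper cited only within its own topic scores lower
# than one that is picked up by a different field.
cited = ["graph theory algorithms"]
narrow = contribution(cited, ["graph theory algorithms survey"])
bridging = contribution(cited, ["protein folding biology"])
print(narrow < bridging)
```

Under these assumptions, a publication whose citing papers use the same vocabulary as the papers it cites receives a value near 0, while one that connects unrelated vocabularies approaches 1.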
Contribution metric – a practical example
• Below- and above-average publication
Experiment – dataset
• Obtained by merging three open datasets:
– Connecting Repositories (CORE) – OA publications, metadata and full-texts
– Microsoft Academic Graph (MAG) – citation network
– Mendeley – publication texts (abstracts) and readership information
• Over 1.6 million CORE publications, over 12 million publications in total
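How three such sources might be joined can be sketched as follows. The slides do not say how records were matched, so this sketch assumes a shared DOI as the join key; the record fields (`doi`, `full_text`, `citations`, `abstract`, `readers`) and the function names are hypothetical and do not reflect the actual CORE, MAG or Mendeley schemas.

```python
def normalise_doi(doi):
    """Lower-case a DOI and strip a resolver prefix (assumed matching key)."""
    doi = doi.strip().lower()
    prefix = "https://doi.org/"
    return doi[len(prefix):] if doi.startswith(prefix) else doi


def merge_datasets(core, mag, mendeley):
    """Join CORE, MAG and Mendeley records on a normalised DOI.

    Keeps only publications found in both CORE and MAG; Mendeley data
    is attached when available.
    """
    mag_by_doi = {normalise_doi(r["doi"]): r for r in mag}
    mendeley_by_doi = {normalise_doi(r["doi"]): r for r in mendeley}
    merged = []
    for rec in core:
        key = normalise_doi(rec["doi"])
        if key not in mag_by_doi:
            continue  # no citation data for this publication
        extra = mendeley_by_doi.get(key, {})
        merged.append({
            "doi": key,
            "full_text": rec["full_text"],             # from CORE
            "citations": mag_by_doi[key]["citations"],  # from MAG
            "abstract": extra.get("abstract"),          # from Mendeley
            "readers": extra.get("readers", 0),         # from Mendeley
        })
    return merged


# Made-up records for illustration only.
core = [{"doi": "10.1/A", "full_text": "..."}, {"doi": "10.1/b", "full_text": "..."}]
mag = [{"doi": "https://doi.org/10.1/a", "citations": 3}]
mendeley = [{"doi": "10.1/a", "abstract": "...", "readers": 7}]
merged = merge_datasets(core, mag, mendeley)
print(len(merged))
```

In practice many records lack a DOI, so a real pipeline would probably need a fallback such as matching on normalised titles.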
Experiment – dataset statistics

| Statistic | Value |
|---|---|
| Articles from CORE matched with MAG | 1,655,835 |
| Average number of received citations | 16.09 |
| Standard deviation (citations) | 66.30 |
| Max number of received citations | 13,979 |
| Average readership | 15.94 |
| Standard deviation (readership) | 42.17 |
| Max readership | 15,193 |
| Average contribution value | 0.89 |
| Standard deviation (contribution) | 0.0810 |
| Total number of publications | 12,075,238 |
Experiment – dataset statistics
• Citation and readership distribution
Experiment – dataset statistics
• Contribution distribution
Experiment – dataset statistics
• Relation between citations and readership
Experiment – results
• No direct correlation between contribution measure and citations/readership
• When working with mean citation, readership and contribution values a clear behavioral trend emerges
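One way to read "working with mean values" is to group publications by citation count and average the contribution within each group. The sketch below illustrates that aggregation; the exact grouping used in the experiment is not given in the slides, so grouping by exact citation count is an assumption, and the function name and record fields are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def mean_contribution_by_citations(publications):
    """Map each citation count to the mean contribution of publications
    with that count (assumed grouping: one group per exact count)."""
    groups = defaultdict(list)
    for pub in publications:
        groups[pub["citations"]].append(pub["contribution"])
    return {count: mean(values) for count, values in sorted(groups.items())}


# Made-up values for illustration only; not taken from the experiment.
publications = [
    {"citations": 1, "contribution": 0.80},
    {"citations": 1, "contribution": 0.90},
    {"citations": 5, "contribution": 0.95},
]
trend = mean_contribution_by_citations(publications)
print(trend)
```

Averaging within groups smooths out the per-publication noise, which is why a trend can appear in the means even when individual values show no direct correlation. The same function applies to readership by grouping on a readership field instead.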
Experiment – results
• Relation between mean contribution and citations
Experiment – results
• Relation between mean contribution and readership
Current impact metrics vs semantometrics

| Unaffected by | Current impact metrics | Semantometrics |
|---|---|---|
| Citation sentiment, semantics, context, motives | ✗ | ✔ |
| Popularity & size of res. communities | ✗ | ✔ |
| Time delay | ✗ | ✗/✔* |
| Skewness of the citation distribution | ✗ | ✔ |
| Differences between types of res. papers | ✗ | ✔ |
| Ability to game/manipulate the metrics | ✗ | ✗/✔** |

* reduced to 1 citation
** assuming that self-citations are not taken into account
Metrics for evaluating article sets
• Encourage focusing on quality rather than quantity
• Comparable regardless of discipline, seniority, etc.
Evaluating research metrics
• Need for a data-driven approach
– Ground truth – human judgments
– Many facets of performance (societal impact, economic impact, rigour, originality/novelty)
WSDM Cup – work on new metrics
• The goal of the challenge is to assess the query-independent importance of scholarly articles, using data from the Microsoft Academic Graph (>120M papers).
• Human judgements
• But no full text in MAG
Dataset for semantometrics research
By connecting MAG, CORE and Mendeley data, we have a dataset to study semantometrics.
Conclusions
• Full-text necessary for research evaluation
• Semantometrics are a new class of methods.
• We are studying one semantometric method to assess the research contribution
• Need for a data-driven approach for evaluating metrics
References
• Jeppe Nicolaisen. 2007. Citation Analysis. Annual Review of Information Science and Technology, 41(1):609-641.
• Douglas N Arnold and Kristine K Fowler. 2010. Nefarious numbers. Notices of the American Mathematical Society, 58(3):434-437.
• Roger A Brumback. 2009. Impact factor wars: Episode V -- The Empire Strikes Back. Journal of Child Neurology, 24(3):260-262, March.
• The PLoS Medicine Editors. 2006. The impact factor game. PLoS Medicine, 3(6), June.
• Jason Priem and Bradley M. Hemminger. 2010. Scientometrics 2.0: Toward new metrics of scholarly impact on the social Web. First Monday, 15(7), July.
• Per Ottar Seglen. 1992. The Skewness of Science. Journal of the American Society for Information Science, 43(9):628-638, October.
• Per Ottar Seglen. 1997. Why the impact factor of journals should not be used for evaluating research. BMJ: British Medical Journal, 314(February):498-502.