CiteSearch: Multi-faceted Fusion Approach to Citation Analysis
Kiduk Yang and Lokman Meho
Web Information Discovery Integrated Tool Laboratory, Keimyung University, Korea
American University of Beirut, Lebanon
October 27, 2010



Page 2: Kiduk yang citesearch

CiteSearch: What, Why, & How

Goal
• Quality assessment of scholarly publications

Motivation
• Lack of a comprehensive citation database
• Limitations of conventional citation analysis
  - One-dimensional assessment
  - Misleading evaluation

Approach
• Multi-faceted, fusion-based citation analysis
  - Combine data from multiple citation databases
  - Assess quality using various quality evaluation measures

Page 3: Kiduk yang citesearch

CiteSearch Study: Overview

Objectives
• Investigate the current citation analysis environment
• Test the viability of the CiteSearch system

Method
• Search citation databases and compare the results

Setup
• Study sample
  - Publications of 15 SLIS faculty members (approx. 1,100 publications)
• Databases used
  - Google Scholar, Scopus, Web of Science
• Citation sources
  - Journals and conference papers in 1996-2005

Page 4: Kiduk yang citesearch

Citation Databases (data as of 2006)

Web of Science
• Breadth of coverage: 36M records; 8,700 titles; journals (240 open access) & conference papers
• Coverage years: A&HCI 1975-; SCI 1900-; SSCI 1956-
• Subject area: All

Scopus
• Breadth of coverage: 28M records; 15,000 titles; journals (500 open access) & conference papers
• Coverage years: 1996-present (with cited references); 1966-present (without cited references)
• Subject area: All

Google Scholar
• Breadth of coverage: 500M records; titles: unknown; 30+ document types
• Coverage years: Unknown
• Subject area: All

• Data collection
  - WoS: 100 hours
  - Scopus: 200 hours
  - GS: over 3,000 hours

Page 5: Kiduk yang citesearch

Scopus and WoS: Citation Count

Scopus vs. WoS
• 14% (278) more citations by Scopus (2,301 vs. 2,023)
  - More comprehensive coverage by Scopus (15,000 vs. 8,700 periodicals)

Scopus + WoS
• Scopus increases WoS citations by 35% (710)
• WoS increases Scopus citations by 19% (432)
• Relatively low overlap (58%) and high uniqueness (42%)

Venn diagram values
• Scopus: 2,301
• Web of Science: 2,023
• Both: 58% (1,591)
• Scopus only: 26% (710)
• WoS only: 16% (432)
• Scopus ∪ WoS: 2,733
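The percentages on this slide follow from the three Venn regions (both, Scopus only, WoS only). A minimal sketch of that arithmetic, using the counts reported above; the function name `overlap_summary` and its keys are illustrative, not part of CiteSearch:

```python
def overlap_summary(both: int, only_a: int, only_b: int) -> dict[str, float]:
    """Derive totals and percentages from the three regions of a two-set Venn diagram."""
    total_a = both + only_a          # everything database A found
    total_b = both + only_b          # everything database B found
    union = both + only_a + only_b   # distinct citations across A and B
    return {
        "total_a": total_a,
        "total_b": total_b,
        "union": union,
        "overlap_pct": 100 * both / union,
        "unique_pct": 100 * (only_a + only_b) / union,
        # How much A adds on top of B's count, and vice versa.
        "a_adds_to_b_pct": 100 * only_a / total_b,
        "b_adds_to_a_pct": 100 * only_b / total_a,
    }


# Region sizes reported on the slide: both = 1,591, Scopus only = 710, WoS only = 432.
summary = overlap_summary(both=1591, only_a=710, only_b=432)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

Here A stands for Scopus and B for WoS. The same function applies to any pair of sources once the overlap has been counted.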

Page 6: Kiduk yang citesearch

Impact of Scopus By Research Area
• Varies significantly between research areas

Page 7: Kiduk yang citesearch

Impact of Scopus on Faculty Members Relative Ranking

Scopus significantly alters the relative ranking of faculty members who appear in the middle of the rankings

Page 8: Kiduk yang citesearch

Scopus + WoS: Citation Count By Document Type

Conference Papers Only
• Scopus: 359
• WoS: 229
• Both: 18% (92)
• Scopus only: 54% (267)
• WoS only: 28% (137)
• Scopus ∪ WoS: 496

Page 9: Kiduk yang citesearch

Scopus + WoS: Summary of Results

Coverage
• Varies greatly between research areas
  - Increase in citations ranges from 5% to 99% by combining results from both databases
• Scopus has much better coverage of conference proceedings
  - Overlap: 18%; Scopus only: 54%; WoS only: 28%

Ranking by citation count
• Relative ranking of faculty members changes significantly for those in the middle

Page 10: Kiduk yang citesearch


Google Scholar Citations By Document Type

Page 11: Kiduk yang citesearch


Citations By Language

Page 12: Kiduk yang citesearch


Impact of GS By Research Area

Page 13: Kiduk yang citesearch


Impact of GS on Faculty Members Relative Ranking

GS does not significantly alter the rankings of faculty members

Page 14: Kiduk yang citesearch

GS vs. Scopus ∪ WoS
• GS increases Scopus ∪ WoS citations by 93% (2,552)
• Scopus ∪ WoS increases GS citations by 26% (1,104)
• GS identifies 53% (or 1,448) more citations than Scopus ∪ WoS
• GS has much better coverage of conference proceedings
  - 1,849 by GS vs. 496 by Scopus ∪ WoS
• GS has over twice as many unique citations as Scopus ∪ WoS
  - 2,552 vs. 1,104, respectively

Venn diagram values
• Google Scholar: 4,181
• Scopus ∪ WoS: 2,733
• Both: 31% (1,629)
• GS only: 48% (2,552)
• Scopus ∪ WoS only: 21% (1,104)
• GS ∪ Scopus ∪ WoS: 5,285

Page 15: Kiduk yang citesearch

CiteSearch Study: GS + Scopus + WoS

Venn diagram values
• Google Scholar: 4,203
• Scopus: 2,308
• WoS: 2,025
• GS ∪ Scopus ∪ WoS: 5,307
• Region values, in transcript order: 4.3% (230), 18.3% (970), 48.3% (2,561), 11.7% (617), 8.2% (435), 3.8% (204), 5.3% (282)
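A three-database Venn diagram has seven regions. As a rough sketch of how such a breakdown could be computed, assume each database's results have already been reduced to a set of citation identifiers. The identifiers below are made-up toy values, not study data, and `venn3_regions` is a hypothetical helper:

```python
def venn3_regions(gs: set[str], scopus: set[str], wos: set[str]) -> dict[str, int]:
    """Count the seven mutually exclusive regions of a three-set Venn diagram."""
    return {
        "gs_only": len(gs - scopus - wos),
        "scopus_only": len(scopus - gs - wos),
        "wos_only": len(wos - gs - scopus),
        "gs_scopus_only": len((gs & scopus) - wos),
        "gs_wos_only": len((gs & wos) - scopus),
        "scopus_wos_only": len((scopus & wos) - gs),
        "all_three": len(gs & scopus & wos),
    }


# Toy identifiers for illustration only.
gs = {"c1", "c2", "c3", "c4", "c5"}
scopus = {"c3", "c4", "c6"}
wos = {"c4", "c5", "c6", "c7"}

regions = venn3_regions(gs, scopus, wos)
union_size = len(gs | scopus | wos)
for name, count in regions.items():
    print(f"{name}: {count} ({100 * count / union_size:.1f}%)")
```

Because the regions are mutually exclusive, their counts add up to the size of the union, which gives a quick consistency check on any reported breakdown.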

Page 16: Kiduk yang citesearch

GS + Scopus ∪ WoS: Summary of Results

Coverage
• Varies greatly between research areas
  - 23% to 144% increase by combining GS & Scopus ∪ WoS
  - 5% to 98% increase by combining Scopus & WoS
• GS has strong coverage in CS & IS
  - HCI, IR, computational linguistics, social informatics
• Scopus ∪ WoS has stronger coverage in LS
  - Bibliometrics, collection development, information policy
• GS provides significantly better coverage of non-English materials
  - GS (7%); Scopus (1%); WoS (1%)

Ranking
• No significant changes in relative ranking of faculty members

Page 17: Kiduk yang citesearch

Findings
• Scopus, WoS, and GS complement rather than replace each other
• GS can be useful in showing evidence of broader international impact than could possibly be done through Scopus and WoS
• GS can be very useful for citation searching purposes; however, it is not conducive to large-scale comparative citation analyses
• Scopus significantly alters the relative citation ranking of scholars as measured by Web of Science; GS does not

Page 18: Kiduk yang citesearch

Conclusions

Multiple sources of citations should be used to generate accurate citation counts and rankings
• Citation databases complement one another
• Small overlap between sources may significantly influence relative ranking

Multi-faceted citation analysis is needed
• Citation coverage varies by research area, document type, language

CiteSearch can greatly facilitate citation analysis
• Enormous effort is required to
  - Refine search strategy
  - Parse search results
  - Eliminate noise (duplicate citations)
  - Extract & normalize citation metadata

Page 19: Kiduk yang citesearch

CiteSearch System: Work-in-Progress

Federated Citation Search
• To compile comprehensive & usable citation data
1. Query multiple citation databases
2. Filter out noise
   • e.g., invalid, duplicate citations
3. Extract & normalize metadata
   • bibliographical metadata (e.g., title, author, year, source)
   • citation metadata (e.g., doctype, subject, language)
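The filtering and normalization steps can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Citation` record, the matching key (normalized title, first author, year), and the sample records are hypothetical and do not describe how CiteSearch itself matches citations.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    title: str
    first_author: str
    year: int
    source_db: str  # e.g. "GS", "Scopus", "WoS"


def normalize_title(title: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def dedup_key(c: Citation) -> tuple[str, str, int]:
    """Assumed matching key: normalized title, first author, and year."""
    return (normalize_title(c.title), c.first_author.strip().lower(), c.year)


def merge_citations(records: list[Citation]) -> dict[tuple[str, str, int], set[str]]:
    """Drop invalid records and merge duplicates, keeping the databases that found each."""
    merged: dict[tuple[str, str, int], set[str]] = {}
    for rec in records:
        if not rec.title.strip() or rec.year <= 0:
            continue  # treat records without a title or year as noise
        merged.setdefault(dedup_key(rec), set()).add(rec.source_db)
    return merged


# Hypothetical records, not study data.
records = [
    Citation("Citation Analysis: A Review", "Smith", 2004, "WoS"),
    Citation("citation analysis - a review", "smith ", 2004, "GS"),
    Citation("", "Lee", 2005, "Scopus"),
    Citation("Web Impact Factors", "Lee", 2003, "Scopus"),
]
for key, sources in merge_citations(records).items():
    print(key, sorted(sources))
```

Keeping the set of source databases per merged citation is what makes overlap counts possible later. Note that this ASCII-only title normalization would strip non-English titles, so a real system would need a Unicode-aware rule.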

Multi-faceted Citation Analysis
• To produce multi-faceted quality/impact assessment measures that
  - account for variance in citation quality (e.g., weighted citation counts, CiteRank)
  - consider various facets of the evaluation metric (e.g., document type, language)
  - accommodate different aspects of quality assessment (e.g., H-Index, Mentor-Index)
1. Compute citation-based quality scores (CQS) for each publication
2. Compute CQS for authors, schools, publishers using publication CQS
3. Compute CQS for each publication weighted by author/school/publisher scores
4. Compute CQS for authors, schools, publishers using weighted publication CQS
5. Repeat steps 3 and 4 until convergence
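The slides do not give formulas for the five steps, so the following is only one possible reading, simplified to authors (no schools or publishers). The base score (citation count divided by the maximum), the mean as the aggregate, and the blending weight `alpha` are all assumptions made for this sketch, as are the toy inputs.

```python
def author_scores(pub: dict[str, float], authors_of: dict[str, list[str]]) -> dict[str, float]:
    """Steps 2 and 4 (assumed): an author's score is the mean score of their publications."""
    totals: dict[str, float] = {}
    counts: dict[str, int] = {}
    for p, names in authors_of.items():
        for a in names:
            totals[a] = totals.get(a, 0.0) + pub[p]
            counts[a] = counts.get(a, 0) + 1
    return {a: totals[a] / counts[a] for a in totals}


def iterate_cqs(citations, authors_of, alpha=0.5, tol=1e-9, max_iter=1000):
    """Alternate publication and author scores until the publication scores stop changing."""
    top = max(citations.values()) or 1
    base = {p: c / top for p, c in citations.items()}  # step 1 (assumed normalization)
    pub = dict(base)
    for _ in range(max_iter):
        author = author_scores(pub, authors_of)
        # Step 3 (assumed): blend the base score with the mean score of the authors.
        new_pub = {
            p: (1 - alpha) * base[p]
            + alpha * sum(author[a] for a in authors_of[p]) / len(authors_of[p])
            for p in base
        }
        delta = max(abs(new_pub[p] - pub[p]) for p in base)
        pub = new_pub
        if delta < tol:  # step 5: stop once the scores have converged
            break
    return pub, author_scores(pub, authors_of)


# Toy inputs: citation counts per publication and each publication's authors.
citations = {"p1": 10, "p2": 2, "p3": 0}
authors_of = {"p1": ["A"], "p2": ["A", "B"], "p3": ["B"]}
pub_cqs, author_cqs = iterate_cqs(citations, authors_of)
print(pub_cqs)
print(author_cqs)
```

With `alpha` below 1, each pass pulls the scores toward a fixed point, so the loop terminates. Every publication needs at least one author for the averaging to work.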

Page 20: Kiduk yang citesearch


CiteSearch System: Architecture

Page 21: Kiduk yang citesearch


End
