Literature Search
http://www.pds.ewi.tudelft.nl/~iosup/Courses/2012_aiosup_lit_search.ppt
IN 3305
Alexandru Iosup. Initial slides by Tomas Klos. Course manager: Peter van Nieuwenhuizen.
Parallel and Distributed Systems Group
http://www.pds.ewi.tudelft.nl/



Page 2:

Literature Surveys: At the Core of Innovation

Given a problem (topic of interest), answer questions about it:
• What solutions exist?
• What is the most influential solution?
• What is the rate of innovation in the field?
• Where and how can I innovate?

Answer them by surveying (understanding, interpreting, and summarizing) the body of related (scientific) knowledge.

IN3305’s study goal: “getting acquainted with scientific literature” (“kennismaken met wetenschappelijke literatuur”)

Page 3:

Innovation is a Vital Competitive Tool

• Innovation = novel application of knowledge
• Innovation favors small (but efficient) countries
• High-tech companies tend to be more innovation-intensive

Source: Economist Intelligence Unit, A new ranking of the world’s most innovative countries, April 2009, http://graphics.eiu.com/PDF/Cisco_Innovation_Complete.pdf

Page 4:

What is Novel? The Overwhelming Growth of Knowledge

“When 12 men founded the Royal Society in 1660, it was possible for an educated person to encompass all of scientific knowledge. […] In the last 50 years, such has been the pace of scientific advance that even the best scientists cannot keep up with discoveries at frontiers outside their own field.” Tony Blair, PM Speech, May 2002

[Figure: Number of Publications, 1993–1997 and 1997–2001. Data: King, The scientific impact of nations, Nature ’04.]

Page 5:

The “Size” of a Research Topic

• Grid Computing
  • Billions of $ in research investment
  • 2,500 PhDs (my est.)
  • Over 15,000 scientific publications (my est.) in 15 years
  • Several surveys of 100-200 articles each
• Grid Scheduling
  • Conferences: Grid, CCGrid, HPDC, SC, IPDPS, ICDCS, …
  • Journals: TPDS, CCPE, FGCS, JoGC, …
• Peer-to-Peer Search Methods
  • Survey of over 300 articles after 5 years of research

Page 6:

How to Talk About Books You Haven’t Read

• “There is more than one way not to read”
  • Not opening the book
• You cannot read everything
  • How many books can a librarian read?
  • How many books can you read? Let’s estimate
• Librarians can talk about every book in the library (every book out of millions)

There exists a system to (not) read
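The "let's estimate" question on this slide can be worked through as a back-of-the-envelope calculation. The sketch below is only an illustration: the reading pace, the number of reading years, and the library size are all assumptions, not figures from the slides.

```python
# Back-of-the-envelope estimate; every number below is an assumption.
BOOKS_PER_WEEK = 1          # assumed reading pace
READING_YEARS = 60          # assumed length of a reading lifetime
LIBRARY_SIZE = 1_000_000    # the slide says "millions"; one million is a lower bound


def lifetime_books(books_per_week: float, years: int) -> int:
    """Number of books read over a lifetime at a constant weekly pace."""
    return round(books_per_week * 52 * years)


books = lifetime_books(BOOKS_PER_WEEK, READING_YEARS)
share = books / LIBRARY_SIZE
print(f"{books} books, {share:.2%} of the library")
```

Under these assumptions a lifetime of steady reading covers well under 1% of the library, which is the motivation for a system to (not) read.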

Page 7:

Outline

From the IN3305 study goals:
1. “Getting acquainted with scientific literature” (“kennismaken met wetenschappelijke literatuur”)
2. To read or not to read?
3. What is “scientific literature”? (input and output)
4. Measuring and assessing Quality
5. Useful sites and tools
6. On gaming the citation indices (unethical)
7. Conclusion

Page 8:

Literature = input

• Citations
  • Place your work in context
  • Give credit to previous work
  • Support your arguments
  • Show your marginal contribution
  • Prevent plagiarism
• Read what you cite! (prevent superfluous citing)

This does NOT mean:
• “You should read everything”
• “You cannot also read what you don’t cite”

Page 9:

Literature = input

Sources: peer-reviewed
• Textbook/monograph: for teaching and background
  • Complete treatment of a topic
  • Cite a textbook? Mention chapter or page number
• Journal article
  • More space, more detail, and more thorough than a conference paper
  • Sometimes old news at publication date (lag)
• Paper in edited volume
  • Multiple papers, review of state-of-the-art
  • Cite individual papers
• Paper in conference proceedings
  • Recent results
  • Conference quality; publisher of proceedings?

Page 10:

Sources: not peer-reviewed

• Working papers, Preprints
  • Up-to-date, spread ideas
  • “Open access”
  • Computing Research Repository (CoRR): http://arxiv.org/corr/home
• Websites
• ‘Personal communication’

Page 11:

Literature = output

• Publish to conferences and journals
• Peer-review (for conferences, journals):
  • (double) blind review:
    • Accept, with/without (major) revisions
    • Reject
• Acceptance rate, e.g., 25% (not bad)
  • (Nature: 10% of articles are reviewed)
• Time to print: up to 1.5 years for journals, 3-6 months for conferences
• Measuring scientific output: “scientometrics”

Q What do you think about this situation?

Page 12:

Quality?

• Reputation: ACM, IEEE, Springer, Elsevier, MIT/Princeton/Oxford/… University Press
• SCIgen - An Automatic CS Paper Generator: http://pdos.csail.mit.edu/scigen/
  • Accepted (non-reviewed) for: 2005 World Multi-Conference on Systemics, Cybernetics and Informatics (another one: an Elsevier journal!)

Page 13:

Scientometrics

• Scientometrics, “measuring and analyzing science”
• Bibliometrics, “study or measurement of texts and information”
• Citation analysis
  • Which papers cite a paper / does a paper cite?
  • Authority of countries, research groups, individual authors, journals/conferences, individual papers

Q What is a citation?

• “Publish or perish”: quality vs quantity
• (“80% of all published papers are not cited”)

Q Conference or journal? Which conference or journal?

Page 14:

Citation Databases

• Commercial
  • Science Citation Index (Web of Science / Institute for Scientific Information, ISI)
  • Scopus (Elsevier)
• Free
  • Google Scholar: better coverage than ISI
  • CiteSeer (computer science)
  • ArNetMiner (computer science)
  • RePEc (economics)
• More: en.wikipedia.org/wiki/List_of_academic_databases_and_search_engines

Page 15:

Comparing Countries

[Figure. Axes: citation rate per paper (normalized); citation intensity = #Citations/GDP. Data: King, The scientific impact of nations, Nature ’04.]

Page 16:

Comparing Groups or Individuals [1/3]

• An idea: Google PageRank principle
  • Web: network of sites, linking to each other
  • Science: network of papers, citing each other

[Figure panels: World Wide Web’s Links Network; Academic Citations Network. Axis: Time.]

Q What do you think about this approach?
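To make the PageRank idea concrete, here is a minimal sketch of power-iteration PageRank applied to a citation network. The damping factor of 0.85, the fixed iteration count, and the toy graph are assumptions chosen for illustration; the slides do not prescribe a particular variant.

```python
def pagerank(cites, damping=0.85, iterations=50):
    """Power-iteration PageRank on a citation graph.

    cites maps each paper to the list of papers it cites.
    """
    papers = set(cites)
    for targets in cites.values():
        papers.update(targets)
    n = len(papers)
    rank = {p: 1.0 / n for p in papers}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in papers}
        for paper in papers:
            targets = cites.get(paper, [])
            if targets:
                # A paper passes its rank on to the papers it cites.
                share = damping * rank[paper] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A paper that cites nothing spreads its rank evenly.
                share = damping * rank[paper] / n
                for p in papers:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: B, C, and D all cite A; D also cites C.
toy = {"A": [], "B": ["A"], "C": ["A"], "D": ["A", "C"]}
scores = pagerank(toy)
print(sorted(scores, key=scores.get, reverse=True))
```

In this toy graph the most cited paper ranks first, and C outranks B because it receives a citation while B receives none. One thing to consider for the question above: citations only point backward in time, so older papers have had more opportunity to accumulate rank.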

Page 17:

Comparing Groups or Individuals [2/3]

• Journals: Journal Impact Factor
• Personal: h-index (Hirsch, 2005): a scientist has index h if h of his/her N papers have at least h citations each, and the other (N − h) papers have no more than h citations each.
• g-index (Egghe, 2006): the highest number g such that the g most cited articles have together attracted at least g² citations.
• Extensions: e-index; group evaluation

Q What about conferences?
Q Really, what is a citation?
Q (unethical) How to abuse citation indices?
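The two definitions above translate directly into code. The sketch below is one straightforward reading of them; the citation counts are hypothetical, and capping g at the number of papers is an assumption (some variants allow g to exceed it by padding with zero-citation papers).

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the g most cited papers together have
    at least g**2 citations (capped at the number of papers)."""
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical citation counts for one author's five papers.
example = [10, 8, 5, 4, 3]
print(h_index(example), g_index(example))
```

For this example the h-index is 4 and the g-index is 5: the g-index rewards the extra citations of the top papers, which the h-index ignores.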

Page 18:

Journal Impact Factor (JIF)

• Many journals have no impact factor
• JIF is the average number of citations in a given year to papers published in a journal in the 2 previous years.
• For journal x, 2010:

\[
\mathrm{JIF}(x, 2010) = \frac{\text{number of citations in 2010 to papers in journal } x \text{ from the period 2008--2009}}{\text{total number of papers in journal } x \text{ in the period 2008--2009}}
\]

• What does an average value mean?
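A small numeric sketch may help with the closing question. The per-paper citation counts below are invented for illustration; they show how a single highly cited paper can pull the average (the JIF) far above what a typical paper in the journal receives.

```python
from statistics import median


def journal_impact_factor(citations_per_paper):
    """JIF for one year: citations received that year by papers from the
    two previous years, divided by the number of those papers."""
    if not citations_per_paper:
        raise ValueError("no papers in the two-year window")
    return sum(citations_per_paper) / len(citations_per_paper)


# Hypothetical journal: 10 papers from 2008-2009 and the citations
# each of them received in 2010.
citations_2010 = [40, 3, 2, 1, 1, 1, 1, 1, 0, 0]

jif = journal_impact_factor(citations_2010)
typical = median(citations_2010)
print(f"JIF = {jif}, median citations per paper = {typical}")
```

Here the JIF is 5.0 while the median paper has a single citation, so the average says little about any individual paper.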

Page 19:

Journal Impact Factors, 2004

[Figure: 2004 Science Journals Impact Factors (Source: ISI). JIF (log scale, 0.001 to 100) against Journal Rank (0 to 5000). Annotations: highest JIF ~30; very high JIF ≥15; ≥1 citation/publication (last 2 years).]

Page 20:

CS Impact Factors, 2005

[Figure: 2005 Impact Factor CS Journals (Source: ISI). JIF (log scale, 0.01 to 10) against Journal Rank (0 to 300).]

• CS: highest JIF ~8; very high JIF ≥2
• All: highest JIF ~30; very high JIF ≥15

Q What do you think about this situation?

Page 21:

Comparing Groups or Individuals [3/3]

For Computer Science
• Conference proceedings are to be preferred to journals
• ISI Web of Science and Elsevier Scopus are not good impact indicators: poor, albeit improving, coverage
• Google Scholar is a better impact indicator than ISI WoS and Elsevier Scopus; ArNetMiner is reasonable
• DBLP is a good, selective source, but has no citation links
• Expert knowledge is required to select the best topical conferences and journals (regardless of their acceptance ratios and impact factors)

Q Problems with this approach?

Page 22:

Outline

From the IN3305 study goals:
1. “Getting acquainted with scientific literature” (“kennismaken met wetenschappelijke literatuur”)
2. To read or not to read?
3. What is “scientific literature”? (input and output)
4. Measuring and assessing Quality
5. Useful sites and tools
6. On gaming the citation indices (unethical)
7. Conclusion

Page 23:

Method To Find Sources

• Browse:
  • Google Scholar: http://scholar.google.com/
  • DBLP: http://dblp.uni-trier.de/
  • Others: TU Delft library tools
• Study author using Publish or Perish
• Look at author homepages
• Follow links and citations (forward and backward)
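Following citations forward and backward amounts to a breadth-first walk over the citation graph. The sketch below assumes the citation data is already available as a plain dictionary; the paper names and the two-hop limit are hypothetical, and no real Google Scholar or DBLP API is used.

```python
from collections import deque


def snowball(seed, cites, max_hops=2):
    """Collect papers reachable from `seed` by following citations
    backward (papers it cites) and forward (papers that cite it)."""
    # Invert the graph once so forward lookups are cheap.
    cited_by = {}
    for paper, refs in cites.items():
        for ref in refs:
            cited_by.setdefault(ref, []).append(paper)

    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        paper = queue.popleft()
        if hops[paper] == max_hops:
            continue  # do not expand beyond the hop limit
        neighbours = cites.get(paper, []) + cited_by.get(paper, [])
        for other in neighbours:
            if other not in hops:
                hops[other] = hops[paper] + 1
                queue.append(other)
    return hops


# Hypothetical citation data: each paper maps to the papers it cites.
cites = {
    "survey": ["p1", "p2"],
    "p1": ["classic"],
    "p2": ["classic"],
    "followup": ["survey"],
    "unrelated": ["other"],
}
found = snowball("p1", cites, max_hops=2)
print(found)
```

The hop limit matters in practice: without it the walk tends to reach most of a field, which defeats the purpose of selecting what to read.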

Page 24:

Google Scholar

• “cited by”
• Relevant authors
• TU Delft SFX linking
• Import into BibTeX

Page 25:

Google Scholar at Work

Page 27:

Google Scholar at Work

From home: use VPN!

Page 29:

DBLP

• “lists more than one million articles” (April 2008)
• Indexes:
  • Authors
    • Now also “Faceted search”, “CompleteSearch”
  • Conferences
  • Journals
  • Series
  • Subjects

Page 30:

DBLP at Work

Page 31:

DBLP at Work

Page 34:

TU Delft Library

• Search
  • http://www.library.tudelft.nl/ws/search/
  • e.g. “information by subject” -> computer science
• TUlib
  • “how to find and use scientific information”
  • http://www.library.tudelft.nl/tulib/

Page 35:

Harzing’s Publish or Perish

• Uses Google Scholar data
• Calculates many indices
  • Number of citations (also per year / article / author /…)
  • Hirsch’s h-index
  • Zhang’s e-index (excess in h-index set)
  • Egghe’s g-index
  • …
• Similar online tool: ArNetMiner
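As an illustration of the "excess in h-index set" idea, the sketch below computes the e-index as the square root of the citations that the h most cited papers hold beyond the h² needed for the h-index. This is a sketch with hypothetical citation counts, not the code used by Publish or Perish.

```python
from math import sqrt


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    # In a descending list, count >= rank holds exactly for ranks 1..h.
    return sum(1 for rank, count in enumerate(ranked, start=1) if count >= rank)


def e_index(citations):
    """Square root of the excess citations in the h-core:
    e**2 = (citations of the h most cited papers) - h**2."""
    ranked = sorted(citations, reverse=True)
    h = h_index(ranked)
    excess = sum(ranked[:h]) - h * h
    return sqrt(excess)


# Hypothetical citation counts for one author's five papers.
example = [10, 8, 5, 4, 3]
print(h_index(example), round(e_index(example), 2))
```

Two authors with the same h-index can differ widely in e-index, which is why the tool reports both.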

Page 36:

Publish or Perish (http://www.harzing.com/pop.htm)

Page 37:

Outline

From the IN3305 study goals:
1. “Getting acquainted with scientific literature” (“kennismaken met wetenschappelijke literatuur”)
2. To read or not to read?
3. What is “scientific literature”? (input and output)
4. Measuring and assessing Quality
5. Useful sites and tools
6. On gaming the citation indices (unethical)
7. Conclusion

Page 38:

Unethical! How to Game the Citation System?

[Figure: (part of) Collaboration graph]

Page 39:

[Figure: All authors with Erdős number 1]

Note: The h-index was “invented” almost a decade after Erdős’s death.
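An Erdős number is the shortest co-authorship distance to Erdős in the collaboration graph, so it can be computed with a breadth-first search. The sketch below uses made-up author lists with placeholder names; only the technique is meant to carry over.

```python
from collections import deque
from itertools import combinations


def collaboration_graph(papers):
    """Build an undirected co-authorship graph from lists of authors."""
    graph = {}
    for authors in papers:
        for author in authors:
            graph.setdefault(author, set())
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def collaboration_distance(graph, source):
    """Breadth-first search: shortest co-authorship distance from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        author = queue.popleft()
        for coauthor in graph[author]:
            if coauthor not in dist:
                dist[coauthor] = dist[author] + 1
                queue.append(coauthor)
    return dist


# Hypothetical author lists; A, B, C, D, and E are placeholder names.
papers = [["Erdős", "A"], ["A", "B"], ["B", "C"], ["D", "E"]]
numbers = collaboration_distance(collaboration_graph(papers), "Erdős")
print(numbers)
```

Authors who are not connected to the source, such as D and E here, simply do not appear in the result; by convention their Erdős number is infinite.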

Page 40:

Collaboration Graph Degree Distribution

[Figure label: Erdős]

Page 41:

Collaboration Graph: Connected Components Distribution

[Figure label: Giant Component]
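Both distributions shown on these slides can be computed from an adjacency-set representation of the collaboration graph. The sketch below uses a small hypothetical graph; on real data the largest component would be the "giant component" of the figure.

```python
from collections import Counter


def degree_distribution(graph):
    """Map each degree (number of co-authors) to the number of authors with it."""
    return Counter(len(neighbours) for neighbours in graph.values())


def connected_components(graph):
    """Return the connected components as sets, largest first."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Hypothetical co-authorship graph: author -> set of co-authors.
graph = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"},
    "E": {"F"}, "F": {"E"},
    "G": set(),
}
degrees = degree_distribution(graph)
components = connected_components(graph)
print(dict(degrees), [len(c) for c in components])
```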

Page 42:

Interested?

• Mark Newman answers: “who is the best connected scientist?”
• Other references
  • Erdős Number Project: http://www.oakland.edu/enp/ and http://harveycohen.net/erdos/ -- Jerry Grossman and Smarty
  • Kevin Bacon Oracle: is Kevin Bacon the center of the Hollywood movie industry? (or Sean Connery? or Christopher Lee?) http://oracleofbacon.org/

Page 43:

More on the (unethical) Gaming of the Citation Indices

• Self-cite, self-cite, self-cite
• Journals asking submitters to cite the journal’s papers
• Program committee members and reviewers asking for their own work to be cited (when not necessary)
• Not citing old work because it’s old: “killing” old results now allows you to republish them later
• Work on a popular topic: more people, more citations, more chances
• (Google Scholar-only) Blog, Tweet, and FB daily about your papers. Ask your friends to re-post.
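One simple defence against the first tactic is to recount citations with self-citations left out. The sketch below treats a citation as a self-citation whenever the citing and cited papers share an author; that rule, the paper identifiers, and the author names are assumptions made for illustration.

```python
def citation_counts(citations, authors, exclude_self=True):
    """Count citations per paper.

    citations: list of (citing_paper, cited_paper) pairs.
    authors: maps each paper to the set of its authors.
    A citation counts as a self-citation when both papers share an author.
    """
    counts = {paper: 0 for paper in authors}
    for citing, cited in citations:
        shared = authors[citing] & authors[cited]
        if exclude_self and shared:
            continue
        counts[cited] += 1
    return counts


# Hypothetical papers and citations.
authors = {
    "p1": {"alice"},
    "p2": {"alice", "bob"},
    "p3": {"carol"},
}
citations = [("p2", "p1"), ("p3", "p1"), ("p3", "p2"), ("p2", "p3")]

raw = citation_counts(citations, authors, exclude_self=False)
clean = citation_counts(citations, authors)
print(raw, clean)
```

Comparing the raw and cleaned counts per author gives a rough self-citation rate; it does not catch citation cliques, where authors cite each other rather than themselves.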

Page 44:

How to Talk About Books You Haven’t Read

There exists a system to (not) read

1. Know where to find sources
  • Trustworthy: DBLP, ACM DL, Google Scholar
  • Less trustworthy: CoRR, …
2. Know how to find good sources
  • Number of citations: Google Scholar+Others
  • H-index: Publish or Perish (the program)
  • Try to avoid or down-weight citation cliques
3. Select from the good sources

Page 45:

Questions?