
OSSLM-2016

Open Source Software (OSS) for Bibliometrics and Scientometrics

WORKSHOP MANUAL

Prepared by

Dr. S. K. Jalal

Deputy Librarian, Central Library, IIT Kharagpur

National Workshop on Open Source Software for Library

Management

(OSSLM 2016) – June 13-18, 2016

Jointly Organized by

Central Library

Indian Institute of Technology Kharagpur Kharagpur – 721302, West Bengal, India

&

National Digital Library (NDL) NMEICT Project, MHRD, Govt. of India


Abstract

This manual deals with three important open source software packages: Publish or Perish (POP), SciMAT and Bibexcel. These tools help in analysing the publications of an author, an institute or a specific subject.

1. Introduction

Many open source software packages are available on the Web for bibliometric and scientometric analysis. Bibliometrics mainly deals with the application of statistical and mathematical methods to printed documents, whereas scientometrics applies these techniques to scientific documents in particular, and to the analysis and modelling of science itself. A scientific document usually means an article, letter, review or proceedings paper. Citation analysis is one of the most popular methods in bibliometrics. It is the examination, visualization and representation of the frequency and patterns of citations of published documents. Authorship pattern, publication analysis, bibliographic coupling and co-citation analysis are some of the important measures in citation analysis.

Some common features of scientometric software are as follows:

• Preparation of a list of authors;
• Preparation of a list of titles;
• Preparation of a list of journals;
• Finding out the authorship pattern;
• Mapping and visualization of a discipline (e.g. Physics, Chemistry etc.);
• Analysis of data imported from other data sources (e.g. Scopus, WoS);
• Metrics-based evaluation (h-index, number of citations);
• Creation of maps and generation of networks.


2. Open Source software for Bibliometrics Analysis

Some open source software predominantly used in Bibliometric and Scientometric analysis

are given below:

S.N.  Name of OSS               Available at
1.    POP (Publish or Perish)   http://www.harzing.com/resources/publish-or-perish
2.    SciMAT                    http://sci2s.ugr.es/scimat/
3.    VOSViewer                 http://www.vosviewer.com/
4.    BibExcel                  http://homepage.univie.ac.at/juan.gorraiz/bibexcel/
5.    CiteSpace II              http://cluster.cis.drexel.edu/~cchen/citespace/
6.    Gephi                     https://gephi.org/
7.    HistCite                  http://histcite.software.informer.com/12.3/
8.    NodeXL                    http://nodexl.codeplex.com/
9.    Pajek                     http://vlado.fmf.uni-lj.si/pub/networks/pajek/
10.   Ucinet                    https://sites.google.com/site/ucinetsoftware/home

Of these, only a few are discussed here.

3. POP – Publish or Perish

3.1 Introduction

Publish or Perish (Version: 4.25.1) is a software program that retrieves and analyses academic citations. It obtains raw citation data from Google Scholar and Microsoft Academic Search, analyses them and computes a number of metrics.

3.2 Important Metrics

• Total number of papers and total number of citations: POP reports both totals for a set of publications.

• Average citations per paper, citations per author, papers per author and citations per year.

• Hirsch's h-index and related parameters: A scientist has index h if h of his/her Np papers have at least h citations each, and the other (Np-h) papers have no more than h citations each.

• Egghe's g-index: Given a set of articles ranked in decreasing order of the number of citations that they received, the g-index is the (unique) largest number such that the top g articles received (together) at least g² citations.

• The contemporary h-index: It adds an age-related weighting to each cited article, giving less weight to older articles. For an article published during the current year, its citations count four times. For an article published 4 years ago, its citations count only once (4/4). For an article published 6 years ago, its citations count 4/6 times, and so on.

• The individual h-index (hI): It divides the standard h-index by the average number of authors in the articles that contribute to the h-index, in order to reduce the effects of co-authorship; the resulting index is called hI.

• The age-weighted citation rate (AWCR) and the AW-index: For the AWCR, the number of citations to a given paper is divided by the age of that paper, and these values are summed over all papers. The AW-index is defined as the square root of the AWCR. Jin defines the AR-index as the square root of the sum of all age-weighted citation counts over all papers that contribute to the h-index.
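The definitions above can be checked by hand on a small list of citation counts. The following is a minimal Python sketch, not POP's own implementation: the function names, the sample data and the treatment of a current-year paper as having age 1 are assumptions made for illustration.

```python
from math import sqrt


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites < rank:
            break
        h = rank
    return h


def g_index(citations):
    """Largest g such that the top g papers together have at least g^2 citations."""
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


def paper_age(year, current_year):
    """Age in years. ASSUMPTION: a paper from the current year counts as age 1."""
    return max(current_year - year, 1)


def contemporary_h_index(papers, current_year, weight=4):
    """h-index computed on age-weighted counts (weight * citations / age)."""
    scores = [weight * cites / paper_age(year, current_year)
              for year, cites in papers]
    return h_index(scores)


def awcr(papers, current_year):
    """Age-weighted citation rate: citations divided by age, summed over papers."""
    return sum(cites / paper_age(year, current_year) for year, cites in papers)


def aw_index(papers, current_year):
    """AW-index: square root of the AWCR."""
    return sqrt(awcr(papers, current_year))


# Made-up sample data: (publication year, citations).
PAPERS = [(2016, 3), (2012, 8), (2010, 6)]

if __name__ == "__main__":
    counts = [cites for _, cites in PAPERS]
    print("h-index:", h_index(counts))
    print("g-index:", g_index(counts))
    print("contemporary h-index:", contemporary_h_index(PAPERS, 2016))
    print("AWCR:", awcr(PAPERS, 2016), "AW-index:", aw_index(PAPERS, 2016))
```

With 2016 as the current year, the weights applied to the three sample papers are 4, 4/4 and 4/6, matching the worked example in the contemporary h-index entry.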

The results are displayed on screen and can be copied to Microsoft Word or Excel.

3.3 Installation

3.3.1 Download and Install POP (Windows)

The Publish or Perish software is a Microsoft Windows application that can also be installed on Linux computers with the aid of a suitable emulator such as Wine.

1. Download the Publish or Perish installer from Harzing.com: http://www.harzing.com/download/PoPSetup.exe (Publish or Perish installer for Windows, 949 KB; Version 4.25.1, 17 January 2016).

2. Start the PoPSetup.exe installer by double-clicking on the file that you just downloaded.

3. On most systems, a security warning dialog box will appear. Click Run, Continue or Yes after you have verified that the publisher's name is Tarma Software Research Ltd.

4. The installer will now start. Follow the instructions on the screen to confirm your acceptance of the license agreement and to install the Publish or Perish software on your computer.

3.3.2 Download and Install POP (LINUX)

Wine is an open source implementation of the Windows API on top of UNIX. Wine provides the programs and libraries that allow you to run many Windows applications (including Publish or Perish) unchanged on your Linux system and other supported Unix-like operating systems.


3.3.3 Installing Publish or Perish using Wine

Once you have Wine installed, you can install Publish or Perish using its normal Windows installer according to the following procedure.

a) Download the Publish or Perish software installer from the Harzing.com web site: Publish or Perish installer for Windows (949 KB), Version 4.25.1 (17 January 2016).

b) Open a File Browser window and go to the PoPSetup.exe file that you just downloaded. Right-click on the file and choose Properties from the popup menu.

c) In the PoPSetup.exe Properties window that appears, check the "Allow executing file as program" box.

d) Click Close to dismiss the Properties window, then double-click the PoPSetup.exe file to start the Publish or Perish installer.

e) Follow the instructions on the screen to install Publish or Perish.

After successful installation, starting POP shows its main interface.


3.4 Search Tips

1. Always use "quotes" around the author’s name, e.g. "A Harzing"

2. PoP is not case sensitive: "A HARZING" gives the same result as "a harzing".

3. The order of search terms does not matter. "A Harzing" will give the same result as

"Harzing A".

4. Use an author’s initials rather than their full given name as not all journals publish

author names in full.

5. If an author has consistently published with only one initial, you can exclude

namesakes using 2nd and 3rd initials by using wildcards in the "exclude these

names" field, e.g. when searching for "G Sewell", you can exclude "G* Sewell"

"G** Sewell".

6. If an author has published under two different names (e.g. maiden name and married

name) use OR between search terms for a combined search.

7. If an author has mostly published with two initials, but has incidental publications

with one initial, a combined search with initials and full given name (e.g. "CT Kulik"

OR "Carol Kulik") will usually capture all of their publications.

8. Lookup: Looks up the current query, using the internal cache if possible. This means that if you have run the query before, the results will come from the cache and the search is not submitted to Google Scholar again.

9. Lookup Directly: Sends the query directly to Google Scholar and shows the result on screen, without consulting the cache.

3.5 Internal Search

3.5.1 Author Search

One can look up an author's publications indexed in Google Scholar, along with their citations, using a query like "Mike Thelwall".

3.5.2 Journal Search

One can look up publications in a particular journal indexed in Google Scholar, along with their citations.

3.5.3 General Citation Search

A General Citation Search makes it possible to combine multiple query conditions at a time. For example, one can search for articles on nanoparticles from the "Journal of Nanotechnology and Nanoscience" published during 2013-2015.

3.6 External Data

The main attraction of the software is that you can work with external data.

3.6.1 Scopus

Data can be exported or downloaded from Scopus in either .csv or .ris format, with the export restricted to "citation information only". The *.csv format is advisable, because with *.ris files the citation information sometimes failed to import.

Click on the drop-down field next to Export. You will get a pop-up menu. On that menu, select CSV export and save the file with a meaningful name. Do not change anything under "Choose the information to be exported". Doing so will make the file unreadable for Publish or Perish. This provides you with a *.csv file that you can import into Publish or Perish, simply by clicking on the New Import icon [see multi-query center] or by clicking File/Import.
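Before importing, it can be useful to check that an exported *.csv file really contains citation counts. The following is a minimal Python sketch under stated assumptions: the column header "Cited by" and the other headers in the sample are assumed to match the Scopus export, and the sample rows are invented placeholders.

```python
import csv
import io


def summarize_scopus_csv(handle, citation_column="Cited by"):
    """Return (number of papers, total citations) for an open CSV file.

    ASSUMPTION: citation counts are in a column named "Cited by";
    empty cells are treated as zero citations.
    """
    papers = 0
    citations = 0
    for row in csv.DictReader(handle):
        papers += 1
        cited = (row.get(citation_column) or "").strip()
        citations += int(cited) if cited else 0
    return papers, citations


# Invented placeholder rows, only to show the expected layout.
SAMPLE = (
    "Authors,Title,Year,Source title,Cited by\n"
    '"Author A., Author B.",Example title one,2015,Example Journal,7\n'
    '"Author C.",Example title two,2015,Example Journal,\n'
    '"Author A., Author D.",Example title three,2014,Another Journal,5\n'
)

if __name__ == "__main__":
    print(summarize_scopus_csv(io.StringIO(SAMPLE)))
```

For a real export, the file could be opened with open(path, newline="", encoding="utf-8-sig") so that a byte-order mark at the start of the file does not end up in the first column name.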

3.6.2 Web of Science

Data can also be downloaded from the Web of Science database. The data should be saved in plain text format.


You will then see the following screenshot. Click on OK and Publish or Perish will import

the Scopus data into the multi-query center. The results will appear in the folder that you are

in when you import the data.

The result is a neat list of publications that can then be sorted in any way you want. Statistics

and results can be exported for further analyses just like the results of Google Scholar

searches.

3.7 Export Results

Results can be exported to CSV format via File > Save as CSV.

Copy > Statistics with Excel header: the detailed statistics are copied.

Copy > Results with Excel header: the detailed results are copied.

3.8 Results / Descriptive Statistics

a) Result of the authorship analysis of the IIT Kharagpur data for 2015. The data were downloaded from Scopus, imported into POP and analysed.


60 paper(s) with 1 author(s)

530 paper(s) with 2 author(s)

495 paper(s) with 3 author(s)

330 paper(s) with 4 author(s)

170 paper(s) with 5 author(s)

96 paper(s) with 6 author(s)

52 paper(s) with 7 author(s)

44 paper(s) with 8 author(s)

16 paper(s) with 9 author(s)

9 paper(s) with 10 author(s)

12 paper(s) with 11 author(s)

1 paper(s) with 12 author(s)

4 paper(s) with 13 author(s)

1 paper(s) with 15 author(s)

2 paper(s) with 16 author(s)

1 paper(s) with 17 author(s)

2 paper(s) with 23 author(s)

1 paper(s) with 33 author(s)

1 paper(s) with 48 author(s)
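The distribution above can be summarised in a few figures. The following is a minimal Python sketch that takes the counts exactly as listed and derives the total number of papers, the mean number of authors per paper and the share of multi-authored papers. That share is often called the degree of collaboration in the bibliometrics literature; it is not a figure reported in this manual, and the variable names are illustrative.

```python
# Number of authors -> number of papers, copied from the list above.
AUTHORSHIP = {
    1: 60, 2: 530, 3: 495, 4: 330, 5: 170, 6: 96, 7: 52, 8: 44,
    9: 16, 10: 9, 11: 12, 12: 1, 13: 4, 15: 1, 16: 2, 17: 1,
    23: 2, 33: 1, 48: 1,
}

# Total papers in the data set.
total_papers = sum(AUTHORSHIP.values())

# Each paper contributes one authorship per listed author.
total_authorships = sum(n_authors * n_papers
                        for n_authors, n_papers in AUTHORSHIP.items())

mean_authors_per_paper = total_authorships / total_papers

# Share of papers with more than one author.
single_authored = AUTHORSHIP.get(1, 0)
degree_of_collaboration = (total_papers - single_authored) / total_papers

if __name__ == "__main__":
    print("papers:", total_papers)
    print("mean authors per paper: %.2f" % mean_authors_per_paper)
    print("degree of collaboration: %.3f" % degree_of_collaboration)
```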

3.9 Limitations of Publish or Perish (POP)

POP has some limitations, mostly concerning supported file formats and result limits:

a) Publish or Perish supports all file formats except BibTeX.

b) Google Scholar data: CSV, EndNote and RIS formats.

c) Scopus data: comma-separated (CSV) and RIS formats.

d) Web of Science data: the supported formats are Tab-delimited (Win), Tab-delimited (Win, UTF-8) and Plain Text.

e) Web of Science data: the Mac-based formats are not supported.

f) With Google Scholar, POP supports only 1000 papers.

g) With Microsoft Academic Search, POP supports 1888 papers.

Note: If you are using the Publish or Perish software in one of your research articles or

otherwise want to refer to it, please use the following format: Harzing, A.W.

(2007) Publish or Perish, available from http://www.harzing.com/pop.htm


4. SciMAT: Science Mapping Analysis Tool

4.1 Why SciMAT?

SciMAT, a science mapping analysis tool, is an open source software package written in the Java programming language and compatible with Linux, Windows and other operating systems. SciMAT generates a knowledge base with key parameters such as author, title, publisher, keywords, journal name and references. The knowledge base has sixteen entities, such as author, document and affiliation.

SciMAT has three modules:

1. A module dedicated to the management of the knowledge base;

2. A module responsible for carrying out the science mapping analysis;

3. A module for visualization of the generated results and maps.

4.2 SciMAT-1.1.03: Installation

Step-1: Download the software from http://sci2s.ugr.es/scimat/download.html

Step-2: Install Java version 6 or later [provided on a CD].

Step-3: To run SciMAT v1.1.03, unpack the zip file and execute the SciMAT jar file.

Step-4: The user guide can be downloaded from: http://sci2s.ugr.es/scimat/software/v1.01/SciMAT-v1.0-userGuide.pdf

Step-5: Install an SQLite viewer [free to download; this step is optional].

4.3 Main Components and sub-components in SciMAT

Fig-1: Main window of SciMAT


4.4 Components of SciMAT

4.4.1 File

New Project: You need to create a project for each data set and specify the path where the new project will be saved. After creating it there is no need to open the project, because it is already active.

Open Project: In later sessions, a project that was created earlier must be opened for further processing and analysis.

Close Project: Once the work is over, close the project.

Add Files: Adds files for analysis. It is better to download data from Scopus in *.ris format and add it here.

Export > Groups: The Export option works after the group work has been completed. The exported file is an XML file and can be opened in a text editor.


Import> Groups

Exit: Exit from the project

4.4.2 Edit

Global Replace

Undo

Redo

4.4.3 Knowledge Base and its Managers

Knowledge Base

SciMAT generates a knowledge base from a set of scientific documents, where the relations

of the different entities related with each document (authors, keywords, journal, references,

etc.) are stored. The knowledge base is composed of sixteen entities.

Knowledge Base Manager

The module to manage the knowledge base is responsible for building it, importing the data

from different bibliographical sources, and cleaning and fixing the possible errors in the

entities.

Step-1: The first step in this module is to build a new project or load an existing one. It

can be done through the menu File or using the buttons of the toolbar. If a new project is

selected, a new window will appear asking for the path where the knowledge base file will

be stored and the name of the file. We can give any extension for the file.

Step-2: Once you create a new project or open an existing one, the New Project and Open Project options become inactive, and the Add Files, Knowledge Base Manager, Group Manager, Export and Import options are activated.

Step-3: The Add Files option allows the user to add bibliographical information exported from bibliographical databases to the knowledge base. In particular, SciMAT can read bibliographical information exported in ISI Web of Knowledge format (ISI-CE) or RIS (Scopus) format. To add Scopus files, follow Add Files > In RIS (May 2004 format). SciMAT then asks whether you want to import the data with references; doing so slows down the import, but you can answer yes.

Step-4: The manager allows us to add a new Document (filling in each attribute manually) by clicking the Add button.


Supported File Formats

Scopus > RIS format

Web of Science > Text format

4.4.3.1 Author

Author > Author Manager: Gives the name of each author along with the number of documents.

Author > Author Affiliation Manager: Gives each author's affiliation, which allows department-wise output.

Author > Author Group Manager: Activated after the group work has been done.

4.4.3.2 Documents

Document > Document Manager: Gives article, author, year and citations.

4.4.3.3 Journal

Journal > Journal Manager: Gives each source along with the number of documents.

4.4.3.4 References

A reference represents part of the intellectual base of a document. A document has a set of references associated with it, and each reference can appear in different documents. References may be author-references and source-references.

Reference Manager: Gives the cited documents and the citing documents.

4.4.3.5 Periods

Period > Periods Manager: Gives groups of documents.

To add a period: Period > Periods Manager > Add > enter the period (e.g. 2000-2005) > click Add. Once the period has been added, go to the right panel > Add > Add document.


4.4.3.6 Publish Dates

Publish date>Publish date Manager: It provides year-wise record of documents

published under the query.

4.4.3.7 Subject Categories

A document can have one journal or conference and one publish date associated with it, but both entities (journal and publish date) can have a set of documents associated with them. These entities can have an associated subject category, which represents a global category. A journal can also be associated with many subject categories.

Subject Category Manager > click Add on the left panel > type the name of the subject category. More than one subject category can be added.

4.4.3.8 Words

Words Manager: Gives author keywords with the number of documents. The entity Word represents a descriptive term of a document. A set of Words can appear in different Documents and each Document can have a set of Words.

Words can be provided by the authors (author's words), provided by the database (source's words; for example, ISI WoS adds a set of keywords called ISI Keywords Plus to each document), or added in the pre-processing step (extracted words).

…………………………………………………………………………………………..


Working with Group

Working under Group: ‘Move To’

The move (or join) capability allows us to join a set of entities under another one. It is especially useful when working with groups. Once we have selected the set of entities to be joined, clicking ‘Move to’ opens a new dialog. In this dialog box, one selects the record under which the remaining documents (26 in the example shown) will be joined. These documents keep an association with the master entry.

Step-1: Go to Document Manager.

Step-2: Filter with a keyword such as ‘nanostructure’ and click on Filter. In this example the result shows only six documents out of 1298.

Step-3: As soon as you select these six documents, the ‘Move to’ button is activated.

Step-4: Click on the ‘Move to’ button; a new window will open.

Step-5: Select one main entry among the six documents, under which the remaining five documents will be joined. The main target entry maintains its associations with other entries.

Working with Group Set

Like the entity managers, the manual group set managers share a common structure: the left side shows a list of defined groups, and the right side shows the entities associated with the selected group (header table) and the entities without groups (foot table). The manual group set manager allows us to add a new group, delete a set of groups, join a set of groups under another one, and edit them.

4.4.4 Group Set

A group is a set of items that represent the same entity (e.g. E. Garfield and Eugene Garfield).

A group can be marked as a stop group, in which case it does not take part in the science mapping analysis.

4.4.5 Author

Author Group Manual Set > Here one can create a new author group by clicking the Add button. Then select the author group in the left panel and add authors of your choice from the right panel, using the up and down arrows in the middle.

Find Similar Authors by Distances


4.4.6 Author-References

Author-References Group Manual Set

Find Similar Author Reference by Distances

4.4.7 References

Reference Group Manual Set

Find Similar References by Distances

4.4.8 Source References

Source References Manual Set

Find Similar Source-References by Distances

4.4.9 Words

Word Group Manual Set

Find similar words by plurals

Find similar words by distance
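The "find similar ... by distance" and "by plurals" options suggest candidate items to merge into one group. The manual does not state which distance SciMAT uses; the following is a minimal Python sketch that assumes the Levenshtein (edit) distance and a naive "add an s" plural rule. The function names and sample names are illustrative.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance: minimum number of insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similar_pairs(names, max_distance=1):
    """Pairs of names whose lower-cased forms are within max_distance edits."""
    return [(x, y) for x, y in combinations(names, 2)
            if levenshtein(x.lower(), y.lower()) <= max_distance]


def plural_pairs(words):
    """Pairs (singular, plural) where the plural is the singular plus 's'."""
    present = set(words)
    return sorted((w, w + "s") for w in present if w + "s" in present)


if __name__ == "__main__":
    print(similar_pairs(["Garfield E", "Garfield E.", "Persson O"]))
    print(plural_pairs(["nanoparticle", "nanoparticles", "polymer"]))
```

Candidate pairs found this way still need a manual check before joining, since short names that differ by one character may belong to different people.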

…………………………………………………………………………………….

4.5 Statistics [based on Group Set]

Author Groups Statistics

References Groups Statistics

Words Groups Statistics > provides period-wise documents with mean, median, standard deviation and variance.

4.6 Analysis [based on Group Set]

Make Analysis

Load Analysis

4.6.1 Science mapping analysis wizard

Step-1: In the first step the user has to select the periods that he/she wants to analyze. Each

period will produce a map. These periods will be used in the longitudinal or temporal

analysis in order to study the structural evolution of the field.

Step-2: The second step is the selection of the unit of analysis. As the unit of analysis the

user can select any of the five groups existing in the knowledge base: Author Group,

Author-Reference Group, Source-Reference Group, Reference Group, or Word Group. Only

one of them can be selected. If the Word Group has been selected, the role of the word with

which the user wants to perform the analysis has to be chosen.


Step-3: The third step is the data reduction. SciMAT allows the data to be filtered using a minimum frequency threshold. For each selected period, a threshold must also be selected. That is, only the items that appear in at least n documents in a given period will be taken into account.

Step-4: The fourth step is the selection of the way in which the network will be built: co-occurrence or coupling. Using co-occurrence, co-author, co-word, co-citation (using the references), author co-citation (using the author-references) and journal co-citation (using the source-references) networks can be built.

b) Co-occurrence

c) Basic Coupling

d) Aggregated Coupling based on Author

e) Aggregated Coupling Based on Journal

Step-5: The fifth step is the network reduction. SciMAT allows the network to be filtered using a minimum edge value threshold. For each selected period, a threshold value must be set. That is, only the edges with a value greater than or equal to n in a given period will be taken into account.

Step-6: The sixth step is the selection of the similarity measure used to normalize the network. SciMAT allows the user to choose the similarity measures commonly used in the literature to normalize networks: Association Strength, Equivalence Index, Inclusion Index, Jaccard’s Index and Salton’s Cosine.

Step-7: The seventh step is the selection of the clustering algorithm used to get the map and

its associated clusters or subnetworks.

Clustering Analysis

Cluster analysis, or clustering, is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a collection of statistical methods whose objective is to find an optimum tree or set of clusters. Common approaches include hierarchical clustering and the k-means method.

The agglomerative algorithm for hierarchical clustering starts by placing each of the objects in the data set in an individual cluster and then gradually merges those individual clusters.

The divisive algorithm, however, starts with the whole data set as a single cluster and then breaks it down into smaller clusters. Single Link and Complete Link are two hierarchical agglomerative clustering procedures.

– Single Link Clustering Algorithm

– Complete Link Clustering Algorithm
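The difference between the two linkage rules is easiest to see on a tiny example. The following is a minimal Python sketch of agglomerative clustering, not SciMAT's implementation: single link measures the distance between two clusters by their closest pair of members, complete link by their farthest pair. The function name, the stopping threshold max_distance and the sample points are assumptions for illustration.

```python
def agglomerative(items, distance, linkage="single", max_distance=1.5):
    """Merge the two closest clusters until they are farther apart than max_distance.

    linkage="single"   -> cluster distance = smallest pairwise distance
    linkage="complete" -> cluster distance = largest pairwise distance
    """
    link = min if linkage == "single" else max
    clusters = [[item] for item in items]  # start with one cluster per object
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = link(distance(a, b)
                         for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:
            break  # remaining clusters are too far apart to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def gap(a, b):
    """Distance between two points on a line."""
    return abs(a - b)


POINTS = [1, 2, 3, 10, 11]  # made-up one-dimensional data

if __name__ == "__main__":
    print("single:  ", agglomerative(POINTS, gap, "single"))
    print("complete:", agglomerative(POINTS, gap, "complete"))
```

On this data single link chains 1, 2 and 3 into one cluster, while complete link keeps 3 apart because it is 2 units away from 1, which exceeds the threshold.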

Step-8: The eighth step is the selection of the documents mapper used in the performance

analysis. SciMAT incorporates five different document mappers for co-occurrence

networks:

Step-9: The ninth step is the selection of the performance and quality bibliometric measures. SciMAT adds by default the number of documents as performance measure. Moreover, the citations of a set of documents are used in order to assess the quality and impact of the clusters. In this sense, basic measures such as the sum, minimum, maximum and average citations, or complex measures such as the h-index, g-index, hg-index or q2-index can be selected.


Step-10: The tenth step is the selection of the similarity measure used to build the evolution

map and the overlapping map. SciMAT allows us to choose between: Association Strength,

Equivalence Index, Inclusion Index, Jaccard’s Index and Salton’s Cosine.

Step-11: Finally, the eleventh step performs the science mapping analysis. This process can be cancelled at any time. At the end, the analysis has to be saved (a save window opens when the process ends), and the results are then shown in the visualization module.
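Steps 3 to 6 of the wizard (frequency threshold, co-occurrence network, edge threshold, normalization) can be sketched on a toy keyword set. The following Python sketch is an illustration under stated assumptions, not SciMAT's code: the formulas are the usual textbook definitions of the five named measures, where c_ij is the number of documents containing both items and c_i, c_j are the individual document frequencies; the function names and sample keywords are invented.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def build_network(documents, min_frequency=2, min_edge=2):
    """Return (item frequencies, filtered co-occurrence edges)."""
    # Step 3: document frequency of every item, then the frequency threshold.
    freq = Counter(item for doc in documents for item in set(doc))
    kept = {item for item, n in freq.items() if n >= min_frequency}
    # Step 4: count how often two kept items appear in the same document.
    cooccurrence = Counter()
    for doc in documents:
        cooccurrence.update(combinations(sorted(set(doc) & kept), 2))
    # Step 5: keep only edges with a value greater than or equal to min_edge.
    edges = {pair: n for pair, n in cooccurrence.items() if n >= min_edge}
    return freq, edges


# Step 6: common similarity measures (textbook definitions).
SIMILARITY = {
    "association": lambda cij, ci, cj: cij / (ci * cj),
    "equivalence": lambda cij, ci, cj: cij ** 2 / (ci * cj),
    "inclusion": lambda cij, ci, cj: cij / min(ci, cj),
    "jaccard": lambda cij, ci, cj: cij / (ci + cj - cij),
    "salton": lambda cij, ci, cj: cij / sqrt(ci * cj),
}


def normalize(freq, edges, measure="equivalence"):
    """Replace raw co-occurrence counts by the chosen similarity value."""
    similarity = SIMILARITY[measure]
    return {(a, b): similarity(n, freq[a], freq[b])
            for (a, b), n in edges.items()}


# Invented keyword lists, one list per document.
DOCUMENTS = [
    ["graphene", "sensor"],
    ["graphene", "sensor", "polymer"],
    ["graphene", "polymer"],
    ["catalyst"],
]

if __name__ == "__main__":
    freq, edges = build_network(DOCUMENTS)
    for pair, weight in normalize(freq, edges, "equivalence").items():
        print(pair, round(weight, 3))
```

In this toy data "catalyst" is dropped by the frequency threshold and the polymer-sensor edge by the edge threshold, leaving two edges to normalize.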

Results


BibExcel Manual

Introduction

Bibexcel is a tool for citation analysis. It was designed by Olle Persson of Umeå University, Sweden, to assist a user in analysing bibliographic data.

Bibexcel: Features

• Co-citation, bibliographic coupling, mapping and clustering analysis;

• Interaction with other software such as Pajek, Excel and SPSS;

• Import of many different types of data besides Web of Science;

• Flexible data management and analysis.

Menu

• File

• Edit DOC file

• Edit OUT file

• Add data

• Classify

• Analyze

• Misc

• Mapping

• Help

Bibexcel


How does Bibexcel work?

• Download bibliographic data from either Scopus or Web of Science;

• Prepare the data so that it is suitable for use in Bibexcel;

• Analyse the data;

• Prepare reports.

What can Bibexcel do?

• Step-1: Download data from Web of Science in plain text format OR from Scopus in RIS format.

• Step-2: Restructure the downloaded data.

• Step-3: Create an OUT file.

• Step-4: Analyse the data.

• Step-5: Export files to Pajek for visualization.

How to restructure the data?

Step-1: Restructuring the data involves two operations:

– Insert carriage returns in the file

– Convert the bibliographic records to DIALOG format

Step-2: To insert carriage returns:

– Go to the Bibexcel menu: Edit doc file > Replace line feed with carriage return. <Result: a *.tx2 file will be created>

Step-3: To convert the bibliographic records to DIALOG format:

– Select the *.tx2 file and then choose either

– Misc > Convert to Dialog format > Convert from Web of Science, OR

– Misc > Convert to Dialog format > Convert from Scopus RIS format. <Result: a *.doc file will be created>

WoS Record Looks Like!

• PT- Journal|

• AU- Brown S; Blackmon K|

• TI- Aligning manufacturing strategy and business-level competitive strategy in new competitive environments: The case for strategic resonance|

• SO- JOURNAL OF MANAGEMENT STUDIES|

• NR- 190|

• CD- 1998, IND WEEK 1207, P22, V247; 1998, IND WEEK 1207, P24, V247; ADLER PS, 1990, P55, CALIFORNIA MANAG SPR; ANDERSON J, 1991, V1, P86, INT J PRODUCTION OPE; ZAJAC EJ, 2000, V21, P429, STRATEGIC MANAGE J; ZAJAC EJ, 1989, V10, P413, STRATEGIC MANAGE J|

• J9- J MANAGE STUD-OXFORD|

• JN- JOURNAL OF MANAGEMENT STUDIES, 2005, V42, N4, P793-815|

• UT- ISI:000229369000004 ER ||
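To make the record layout concrete, the following Python sketch reads text in this layout into dictionaries. It is only an illustration outside Bibexcel: it assumes that every field is written as a two-character tag followed by "- " and closed by "|", and that every record is closed by "||", as in the sample records of this manual. The function name parse_dialog_records is hypothetical.

```python
def parse_dialog_records(text):
    """Split Dialog-style text into a list of {tag: value} dictionaries.

    Assumption: each field is 'TAG- value' terminated by '|' and each
    record ends with '||', as in the sample records of this manual.
    If a tag is repeated within a record, the later value overwrites
    the earlier one (a simplification of this sketch).
    """
    records = []
    for chunk in text.split("||"):
        record = {}
        for field in chunk.split("|"):
            # Collapse line wraps and repeated spaces inside a field.
            field = " ".join(field.split())
            tag, sep, value = field.partition("- ")
            if sep and len(tag) == 2:
                record[tag] = value.strip()
        if record:  # skip the empty chunk after the last '||'
            records.append(record)
    return records


sample = (
    "TY- JOUR|\n"
    "T2- Thin Solid Films|\n"
    "PY- 2015|\n"
    "AU- Pal, D.; Neogi, S.; De, S.|\n"
    "ER- ||\n"
)
records = parse_dialog_records(sample)
# Multi-valued fields such as AU are separated by ';'.
authors = [name.strip() for name in records[0]["AU"].split(";")]
print(authors)
```

Splitting the AU value on ";" corresponds to the "Any ; separated field" option used later in the manual.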

Scopus Record Looks Like! [*.doc]

• TY- JOUR|

• TI- Surface modification of polyacrylonitrile co-polymer membranes using pulsed

direct current nitrogen plasma|

• T2- Thin Solid Films|

• VL- 597|

• SP- 171|

• EP- 182|

• PY- 2015|

• DO- 10.1016/j.tsf.2015.11.050|

• AU- Pal, D.; Neogi, S.; De, S.|

• N1- Export Date: 17 April 2016|

• M3- Article|

• DB- Scopus|

• UR- http://www.scopus.com/inward/record.url?eid=2-s2.0-84959475022&partnerID=40&md5=7f25e70a7940e813bd6cfcd7c1a3ac78|

• ER- ||
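The conversion from RIS to this layout is done by Bibexcel itself (Misc>Convert to Dialog format). The Python sketch below only approximates the idea so that the two layouts can be compared. It is not Bibexcel's converter. It assumes standard RIS lines of the form "TAG  - value", and it assumes that only repeated AU and KW lines are merged into one ";"-separated field, which is inferred from the sample records. The names ris_to_dialog and JOINED_TAGS are hypothetical.

```python
import re

# A RIS line: two-character tag, two spaces, hyphen, optional space, value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")

# Assumption of this sketch: only these repeated tags are merged with ';'.
JOINED_TAGS = {"AU", "KW"}


def ris_to_dialog(ris_text):
    """Rewrite RIS text in the 'TAG- value|' layout, ending records with '||'."""
    out = []
    fields = []  # [tag, value] pairs of the current record, in order
    for line in ris_text.splitlines():
        match = RIS_LINE.match(line.rstrip())
        if not match:
            continue  # sketch: blank and continuation lines are ignored
        tag, value = match.groups()
        if tag == "ER":
            out.extend(f"{t}- {v}|" for t, v in fields)
            out.append("ER- ||")
            fields = []
        elif tag in JOINED_TAGS and any(t == tag for t, _ in fields):
            for item in fields:
                if item[0] == tag:
                    item[1] += "; " + value
        else:
            fields.append([tag, value])
    return "\n".join(out)


ris = (
    "TY  - JOUR\n"
    "AU  - Pal, D.\n"
    "AU  - Neogi, S.\n"
    "T2  - Thin Solid Films\n"
    "PY  - 2015\n"
    "ER  - \n"
)
dialog = ris_to_dialog(ris)
print(dialog)
```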

How do you analyze the file in Bibexcel?

How do you create an OUT file?

An OUT file is a tab-delimited text file that can be imported into Excel. It can be created as follows:

Step-1: Select the *.doc file.

Step-2: Enter the field tag (e.g. AU for Author, TI for Title) in the box marked “Old Tag” in the Frequency Distribution panel.

Step-3: Go to “Select field to be analyzed” and choose the “Any ; separated field” option from the drop-down.

Step-4: Choose ‘Whole String’ in the Frequency Distribution panel.

Step-5: Click on “Prep”.

Step-6: After you answer the prompts, the OUT file <with extension *.OUT> will be generated.
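The idea behind the OUT file can be sketched in a few lines of Python: every unit of the chosen field (for example, each author in AU) is written on its own line together with the number of the document it came from. This is an illustration only. The column order (document number, tab, unit) and the function names make_out_rows and rows_to_text are assumptions of the sketch, not a specification of Bibexcel's file.

```python
def make_out_rows(records, tag, separator=";"):
    """Return (document number, unit) pairs for one field of each record.

    'records' is a list of {tag: value} dictionaries; documents are
    numbered from 1 in the order given (an assumption of this sketch).
    """
    rows = []
    for doc_id, record in enumerate(records, start=1):
        for unit in record.get(tag, "").split(separator):
            unit = unit.strip()
            if unit:
                rows.append((doc_id, unit))
    return rows


def rows_to_text(rows):
    """Format the pairs as tab-delimited lines, one unit per line."""
    return "\n".join(f"{doc_id}\t{unit}" for doc_id, unit in rows)


records = [
    {"AU": "Pal, D.; Neogi, S.; De, S.", "T2": "Thin Solid Films"},
    {"AU": "Thelwall, M.", "T2": "Scientometrics"},
]
rows = make_out_rows(records, "AU")
print(rows_to_text(rows))
```

Because the text is tab-delimited, it opens directly as two columns in a spreadsheet.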

Creating Various File Types in Bibexcel

Step-1: *.tx2 file [carriage return]

Step-2: *.doc file

Step-3: *.OUT file

Step-4: *.CIT file [output file under frequency dist.]

Step-5: *.OUX file

Step-6: *.COC file

Step-7: *.jn1, *.jn2, etc. files

Result-1: Author / title/ journal with Document Identification No

How to create the Author *.OUT file?

Step-1: Select the *.doc file.

Step-2: Select the field delimiter “Any ; separated field” [from the “Select field to be analyzed” box].

Step-3: Type AU in the box marked “Old Tag”.

Step-4: Press the “Prep” button.

Step-5: Click OK, OK, and Yes.

Step-6: The result will be generated in the [*.OUT file].

How to save the *.OUT file?

Step-7: The list of authors or titles can be saved in a new file as follows:

– Click on “View whole file”

– Type a file name in the box ‘Type New file name here’

– Click on “Start” under “Select documents”

Result-2: Author / title / journal with no. of articles

Step-1: Make an OUT file for the AU tag and select the *.OUT file.

Step-2: Select the field delimiter “Any ; separated field” [from the “Select field to be analyzed” box].

Step-3: Type AU in the box marked “Old Tag”.

Step-4: Choose “Whole String” [from Frequency Distribution], check “Sort descending”, and press the “Start” button.

Step-5: Click OK, OK, and Yes.

Step-6: The result will be generated in the [*.CIT file].

How to save the *.CIT file?

Step-7: The list of authors or titles can be saved in a new file as follows:

– Click on “View whole file”

– Type a file name in the box ‘Type New file name here’

– Click on “Start” under “Select documents”
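The frequency distribution that produces Result-2 can be illustrated with a short Python sketch: it counts how many times each unit (here, an author name) occurs and sorts the list in descending order, as the “Sort descending” option does. The input is assumed to be (document number, author) pairs, and the function name frequency_distribution is hypothetical. Ties are ordered alphabetically here, which is a choice of the sketch.

```python
from collections import Counter


def frequency_distribution(rows):
    """Count how often each unit occurs and sort by descending frequency.

    'rows' is a list of (document number, unit) pairs. Units with the
    same frequency are listed alphabetically (a choice of this sketch).
    """
    counts = Counter(unit for _, unit in rows)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


rows = [
    (1, "Pal, D."),
    (1, "Neogi, S."),
    (2, "Pal, D."),
    (3, "Pal, D."),
    (3, "De, S."),
]
for author, n_articles in frequency_distribution(rows):
    print(f"{n_articles}\t{author}")
```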

Result-3: List of Journals with Citations

Step-1: Select the *.doc file.

Step-2: Create the *.OUT file by selecting “Cited Journals with whole string” and “Any ; separated field”, then click on “Prep”. The result is an *.out file with document ID and journal name.

Step-3: Select the *.out file, enter the tag “TC” in the ‘Old Tag’ box, click “Add Field to unit”, and then answer “No”. The result is a *.jn1 file: the list of journals with document ID and citations.

Note: You can use the *.jn1 file to create a journal map via Mapping>Create Pajek map file.

Remove Duplicates

1. Select the *.doc file and view the whole file

2. Enter AU as the Old Tag

3. Check duplicates

4. Frequency Distribution: Whole String

5. Select Field: Any comma separated

6. Click on “Prep” to create the *.out file

7. Select the *.out file and click on ‘Start’ to remove duplicate authors. The result is the *.cit file

Sort by Number of Documents

1. Select the *.out file and view the whole file

2. Enter AU as the Old Tag

3. Check “Sort descending”

4. Frequency Distribution- Whole String

5. Select Field Any comma separated

6. Click on “Start”

Bibexcel: Frequency Distribution

• First of all, choose the OUT file.

• Fractionalize: If the Fractionalize box is checked, the fractional counting method is used: if a document is written by two authors, each author is credited with half an article.

• Whole Counts: If the Fractionalize box is unchecked, the Whole Counts method is used: each author is credited with one full article.

• Whole String: If we select “Whole string” from the drop-down list under ‘Select type of unit’, Bibexcel counts the whole author name.
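The difference between the two counting methods can be checked with a small Python sketch. It is an illustration of whole versus fractional counting in general, not of Bibexcel's internal code, and the function name count_articles is hypothetical. Exact fractions are used so that shares such as 1/3 do not pick up rounding errors.

```python
from collections import defaultdict
from fractions import Fraction


def count_articles(author_lists, fractionalize=False):
    """Credit authors with articles using whole or fractional counting.

    Whole counting: every author of a document gets 1.
    Fractional counting: the n authors of a document get 1/n each.
    """
    totals = defaultdict(Fraction)  # missing authors start at 0
    for authors in author_lists:
        if not authors:
            continue
        share = Fraction(1, len(authors)) if fractionalize else Fraction(1)
        for author in authors:
            totals[author] += share
    return dict(totals)


documents = [
    ["Pal, D.", "Neogi, S."],  # two authors: half an article each when fractionalized
    ["Pal, D."],               # single author: one full article either way
]
whole = count_articles(documents)
fractional = count_articles(documents, fractionalize=True)
print(whole)
print(fractional)
```

With fractional counting the credits always add up to the number of documents (here 2), whereas whole counting gives a total of 3 for the same two documents.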

Co-occurrence analysis [co-word analysis/ Co-author analysis]

Creating Pajek Files in Bibexcel

Step-1: Download data from Scopus in *.ris format.

Step-2: Go to the Bibexcel menu>Edit doc file>Replace line feed with carriage return. This creates a *.tx2 file.

Step-3: Go to the Bibexcel menu>Misc>Convert to Dialog format>Convert from Scopus RIS format. This creates a *.doc file.

……………………………………………………….

Sample Record

TY- INPR|

TI- Interpreting correlations between citation counts and other indicators|

T2- Scientometrics|

J2- Scientometrics|

SP- 1|

EP- 11|

PY- 2016|

DO- 10.1007/s11192-016-1973-7|

SN- 01389130 (ISSN)|

AU- Thelwall, M.|

KW- Altmetrics; Citation analysis; Correlation; Discretised lognormal; Indicators; Simulation|

PB- Springer Netherlands|

N1- Export Date: 28 May 2016|

M3- Article in Press|

DB- Scopus|

N1- Article in Press|

LA- English|

RP- Thelwall, M.; Statistical Cybermetrics Research Group, University of Wolverhampton,

Wulfruna Street, United Kingdom; email: [email protected]|

UR- https://www.scopus.com/inward/record.uri?eid=2-s2.0-84966667404&partnerID=40&md5=3fce45bd36c86dc44b08849c8be47787|

ER- ||

Step-4: Select the *.doc file, go to the “Frequency distribution” box, choose “Whole string” from the drop-down menu, check the checkbox labeled “Make new out-file”, and type AU (Author) in the Old tag field. Click Start. This creates a new file, an .oux file.

Step-5: Choose the .oux file and open it in The List by clicking on “View file”. Then, in the “Select field to be analysed…” box, choose “Any ; separated field” from the drop-down menu and press the Prep button. This creates a new .out file, in which all the authors are listed by document.

Step-6: In The List, mark the *.out file and choose Analyze>Co-occurrence>Make pairs via listbox. Answer No to the first question and OK to the second. This results in a .coc file.

Step-7: Use the .coc file to create a network file. Go to Mapping and choose “Create a .net file for Pajek”. Answer No and then Yes; the result will be an undirected graph. If we answer Yes and Yes, the result will be a directed graph.

Step-8: To create a VEC file, use the *.cit file: go to Mapping>Create VEC file. This results in a *.vec file.

Step-9: To partition the co-citation matrix, use the *.coc file. Go to Analyze>Co-occurrence>Cluster pair. This creates the files *.pe2, *.pe3, *.pe4, and *.pe5.

Step-10: Use the *.pe2 file: go to Mapping>Create clu file.

Step-11: Import the files into Pajek: *.net under Network, *.vec under Vectors, and *.clu under Partition.

Step-12: After opening these files in Pajek, choose the following option from the Pajek menu: Draw>Draw-Partition-Vector
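The pairing and network-file steps above can be illustrated outside Bibexcel with a Python sketch. It builds co-author pairs from (document number, author) rows and writes them in Pajek's plain-text network layout: a "*Vertices" section with numbered labels, followed by "*Edges" lines for an undirected graph ("*Arcs" would be used for a directed one). This is a simplified illustration, not a reproduction of Bibexcel's .coc or .net output. The function names cooccurrence_pairs and to_pajek_net are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_pairs(rows):
    """Count how often two units occur together in the same document."""
    by_doc = {}
    for doc_id, unit in rows:
        by_doc.setdefault(doc_id, set()).add(unit)
    pairs = Counter()
    for units in by_doc.values():
        # Sorting makes (a, b) and (b, a) the same undirected pair.
        for a, b in combinations(sorted(units), 2):
            pairs[(a, b)] += 1
    return pairs


def to_pajek_net(pairs):
    """Write the pairs as an undirected Pajek network (text of a .net file)."""
    labels = sorted({name for pair in pairs for name in pair})
    index = {name: i for i, name in enumerate(labels, start=1)}
    lines = [f"*Vertices {len(labels)}"]
    lines += [f'{i} "{name}"' for name, i in index.items()]
    lines.append("*Edges")
    lines += [f"{index[a]} {index[b]} {weight}"
              for (a, b), weight in sorted(pairs.items())]
    return "\n".join(lines)


rows = [
    (1, "Pal, D."), (1, "Neogi, S."), (1, "De, S."),
    (2, "Pal, D."), (2, "De, S."),
]
pairs = cooccurrence_pairs(rows)
net = to_pajek_net(pairs)
print(net)
```

The third number on each edge line is the line weight, here the number of documents the two authors wrote together.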

Conclusion

Publish or Perish (POP), SciMAT, and Bibexcel, covered in this manual, are some of the tools available for bibliometric and scientometric analysis based on the basic principles of bibliometric/scientometric techniques. For visualization of results, Pajek is a suitable and easy-to-use tool.

Acknowledgements

We acknowledge all the developers of the software and the organizations that are directly or indirectly involved in the software development or upgradation process. The manuals of the individual software packages were also very helpful in preparing this working manual for the workshop.