1
Social Network Extraction of Academic Researchers
Jie Tang, Duo Zhang, and Limin Yao
Tsinghua University
Oct. 29th 2007
2
Outline
• Motivation
• Related Work
• Problem Description
• Our Approach
• Experimental Results
• Summary
3
Motivation
• More and more online social networks are becoming available
– e.g., YouTube.com, Facebook.com, etc.
• However, these social networks are usually separate from one another
• A question arises: can we automatically build an integrated social network from the separate ones?
• As a case study: how can we automatically build a social network for the academic community?
– ArnetMiner.org
4
Ruud Bolle Office: 1S-D58 Letters: IBM T.J. Watson Research Center P.O. Box 704 Yorktown Heights, NY 10598 USA Packages: IBM T.J. Watson Research Center 19 Skyline Drive Hawthorne, NY 10532 USA Email: [email protected] Ruud M. Bolle was born in Voorburg, The Netherlands. He received the Bachelor's Degree in Analog Electronics in 1977 and the Master's Degree in Electrical Engineering in 1980, both from Delft University of Technology, Delft, The Netherlands. In 1983 he received the Master's Degree in Applied Mathematics and in 1984 the Ph.D. in Electrical Engineering from Brown University, Providence, Rhode Island. In 1984 he became a Research Staff Member at the IBM Thomas J. Watson Research Center in the Artificial Intelligence Department of the Computer Science Department. In 1988 he became manager of the newly formed Exploratory Computer Vision Group which is part of the Math Sciences Department.
Currently, his research interests are focused on video database indexing, video processing, visual human-computer interaction and biometrics applications.
Ruud M. Bolle is a Fellow of the IEEE and the AIPR. He is Area Editor of Computer Vision and Image Understanding and Associate Editor of Pattern Recognition. Ruud M. Bolle is a Member of the IBM Academy of Technology.
DBLP: Ruud Bolle
2006
50. Nalini K. Ratha, Jonathan Connell, Ruud M. Bolle, Sharat Chikkerur: Cancelable Biometrics: A Case Study in Fingerprints. ICPR (4) 2006: 370-373
49. Sharat Chikkerur, Sharath Pankanti, Alan Jea, Nalini K. Ratha, Ruud M. Bolle: Fingerprint Representation Using Localized Texture Features. ICPR (4) 2006: 521-524
48. Andrew Senior, Arun Hampapur, Ying-li Tian, Lisa Brown, Sharath Pankanti, Ruud M. Bolle: Appearance models for occlusion handling. Image Vision Comput. 24(11): 1233-1243 (2006)
2005
47. Ruud M. Bolle, Jonathan H. Connell, Sharath Pankanti, Nalini K. Ratha, Andrew W. Senior: The Relation between the ROC Curve and the CMC. AutoID 2005: 15-20
46. Sharat Chikkerur, Venu Govindaraju, Sharath Pankanti, Ruud M. Bolle, Nalini K. Ratha: Novel Approaches for Minutiae Verification in Fingerprint Images. WACV 2005: 111-116
...
Motivating Example
Regions of the homepage: Contact Information, Educational history, Academic services, Publications
Extracted profile of Ruud Bolle:
Name: Ruud Bolle
Homepage: http://researchweb.watson.ibm.com/ecvg/people/bolle.html
Photo
Position: Research Staff
Affiliation: IBM T.J. Watson Research Center
Address: IBM T.J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598 USA
Address: IBM T.J. Watson Research Center, 19 Skyline Drive, Hawthorne, NY 10532 USA
Phduniv: Brown University
Phdmajor: Electrical Engineering
Phddate: 1984
Msuniv: Delft University of Technology
Msmajor: Electrical Engineering
Msdate: 1980
Msmajor: Applied Mathematics
Bsuniv: Delft University of Technology
Bsmajor: Analog Electronics
Bsdate: 1977
Research_Interest: video database indexing, video processing, visual human-computer interaction, biometrics applications
Publication #1
Title: Cancelable Biometrics: A Case Study in Fingerprints
Venue: ICPR
Date: 2006
Start_page: 370
End_page: 373
Publication #2
Title: Fingerprint Representation Using Localized Texture Features
Venue: ICPR
Date: 2006
Start_page: 521
End_page: 524
. . .
Co-author relationships: Ruud Bolle is linked by coauthor edges (through Publication #3 and Publication #5) to co-authors, one of whom has affiliation UIUC and position Professor
5
Two key issues:
• How to accurately extract the researcher profile information from the Web?
• How to integrate the information from different sources?
6
Outline
• Motivation
• Related Work
• Problem Description
• Our Approach
• Experimental Results
• Summary
7
Related Work – Person Profiling
• Profile Information Extraction
– E.g., Yu et al. (2005), resume IE
– Alani et al. (2003), Artequakt system
• Contact Information Extraction
– E.g., Kristjansson et al. (2004), interactive extraction
– Balog and Rijke (2006), heuristic rules
• Information Extraction Methods
– E.g., HMM (Ghahramani, 1997)
– MEMM (McCallum, 2000)
– CRFs (Lafferty, 2001)
8
Related Work – Name Disambiguation
• Unsupervised Methods
– Hierarchical clustering, K-way spectral clustering, etc.
– E.g., Han (2005), Mann (2003), Tan (2006)
• Supervised Methods
– Support Vector Machines, Naïve Bayes, etc.
– E.g., Han (2004)
• Graph-based Approach
– Random Walk, etc.
– E.g., Bekkerman (2005), Malin (2005), Minkov (2006)
9
Outline
• Motivation
• Related Work
• Problem Description
• Our Approach
• Experimental Results
• Summary
10
Researcher Social Network Extraction
[Figure: schema of the researcher network]
Researcher: Name, Homepage, Person Photo, Position, Affiliation, Address, Phone, Fax, Research_Interest, Phduniv, Phdmajor, Phddate, Msuniv, Msmajor, Msdate, Bsuniv, Bsmajor, Bsdate
Publication: Title, Publication_venue, Start_page, End_page, Date
Relations: Authored (Researcher to Publication), Coauthor
Observations:
• 70.60% of the researchers have at least one homepage or an introducing page
• 85.6% are from universities, 14.4% are from companies
• 71.9% are homepages, 28.1% are introducing pages
• 60% are natural language text, 40% are in lists and tables
• 70% moved at least one time
• A large number of person names have the ambiguity problem
– Even three people named “Yi Li” graduated from the author’s lab
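The schema on this slide can be written down as a small data structure. The following is only a sketch: the class and field names are hypothetical Python spellings of the slide's attribute names, `email` is taken from the tag list on the later tagging slides rather than from this schema, and the sample values come from the motivating example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Publication:
    # Attributes of the Publication node in the schema.
    title: str
    publication_venue: Optional[str] = None
    date: Optional[str] = None
    start_page: Optional[int] = None
    end_page: Optional[int] = None
    coauthors: List[str] = field(default_factory=list)  # Coauthor relation


@dataclass
class Researcher:
    # Attributes of the Researcher node in the schema.
    name: str
    homepage: Optional[str] = None
    position: Optional[str] = None
    affiliation: Optional[str] = None
    addresses: List[str] = field(default_factory=list)
    phone: Optional[str] = None
    fax: Optional[str] = None
    email: Optional[str] = None  # assumption: not drawn in the schema figure
    phduniv: Optional[str] = None
    phdmajor: Optional[str] = None
    phddate: Optional[str] = None
    research_interests: List[str] = field(default_factory=list)
    publications: List[Publication] = field(default_factory=list)  # Authored


bolle = Researcher(
    name="Ruud Bolle",
    position="Research Staff",
    affiliation="IBM T.J. Watson Research Center",
    phduniv="Brown University",
    phdmajor="Electrical Engineering",
    phddate="1984",
    publications=[
        Publication(
            title="Cancelable Biometrics: A Case Study in Fingerprints",
            publication_venue="ICPR",
            date="2006",
            start_page=370,
            end_page=373,
        )
    ],
)
```

The Ms and Bs fields are omitted for brevity; they would follow the same pattern as the Phd fields.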
11
Outline
• Motivation
• Related Work
• Problem Description
• Our Approach
• Experimental Results
• Summary
12
Markov Random Field
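[Figure: an undirected graph over the random variables Y_a, Y_b, Y_c, Y_d, Y_e, Y_f]
Markov Property (Y_j ~ Y_i means that Y_j is a neighbor of Y_i in the graph):
$$ P(Y_i \mid Y_j, Y_j \neq Y_i) = P(Y_i \mid Y_j, Y_j \sim Y_i) $$
Special Cases:
– Conditional Random Fields
– Hidden Markov Random Fields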
13
CRFs
[Figure: linear-chain CRF lattice over the sentence "He is a Professor at UIUC"; each token has candidate tags such as OTH, POS, AFF, and ADR]
– Green nodes are hidden variables
– Purple nodes are observations
$$ p(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_{e \in E,\, j} \lambda_j t_j(e, y|_e, x) + \sum_{v \in V,\, k} \mu_k s_k(v, y|_v, x) \Big) $$
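To make the CRF formula concrete, here is a minimal sketch for a linear chain. The weights are made up for illustration (they are not learned and not from the slides), each feature is a simple indicator on a (previous tag, tag) edge or a (token, tag) node, and Z(x) is computed by brute-force enumeration, which is only feasible for a toy sentence like this one.

```python
import itertools
import math

TAGS = ["OTH", "POS", "AFF"]

# Hypothetical weights: edge features t_j fire on tag transitions,
# node features s_k fire on (token, tag) pairs.
EDGE_W = {("OTH", "OTH"): 0.5, ("OTH", "POS"): 0.5,
          ("POS", "OTH"): 0.5, ("OTH", "AFF"): 0.5}
NODE_W = {("He", "OTH"): 1.0, ("is", "OTH"): 1.0, ("a", "OTH"): 1.0,
          ("Professor", "POS"): 2.0, ("at", "OTH"): 1.0, ("UIUC", "AFF"): 2.0}


def score(tokens, tags):
    """Exponent of the CRF formula: summed weights of the active features."""
    total = 0.0
    for i, (token, tag) in enumerate(zip(tokens, tags)):
        total += NODE_W.get((token, tag), 0.0)
        if i > 0:
            total += EDGE_W.get((tags[i - 1], tag), 0.0)
    return total


def all_sequences(tokens):
    return itertools.product(TAGS, repeat=len(tokens))


def probability(tokens, tags):
    """p(y | x) = exp(score) / Z(x), with Z(x) summed over every tag sequence."""
    z = sum(math.exp(score(tokens, seq)) for seq in all_sequences(tokens))
    return math.exp(score(tokens, tags)) / z


tokens = ["He", "is", "a", "Professor", "at", "UIUC"]
best = max(all_sequences(tokens), key=lambda seq: score(tokens, seq))
```

A real tagger would replace the enumeration with the forward algorithm for Z(x) and Viterbi decoding for the best sequence.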
14
Processing Flow for Profiling
1. Preprocessing: determine the tokens in the inputted docs
– Standard word
– Special word
– Image token
– Term
– Punc. mark
2. Model Learning: label the data, apply the feature definitions, and learn a CRF model (a unified tagging model) from the labeled data
3. Tagging: assign tags to the tokens of the test documents with the learned model to obtain the tagging results
Example inputs:
– He obtained his BS in Computer Science in 1999...
– Ruud M. Bolle is a Fellow of the IEEE and the AIPR. He is Area Editor of Computer Vision and Image Understanding and Associate Editor of Pattern Recognition. Ruud M. Bolle is a Member of the IBM Academy of Technology...
[Figure: tagging lattices over the token sequences "Ruud Bolle is a Fellow of the IEEE" and "obtained his BS in Computer Science"; each token has candidate tags drawn from labels such as ALC, FUC, AMC, AUC, PRV, RPA, DEL, and PSB]
15
Token Definitions
Standard word: Words in natural language
Special word: Including several general ‘special words’, e.g. email address, IP address, URL, date, number, money, percentage, unnecessary tokens (e.g. ‘===’ and ‘###’), etc.
Image token: <IMAGE src="defaul3.jpg" alt=""/>
Term: base NP, like “Computer Science”
Punctuation marks: Including period, question mark, and exclamation mark
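The token types above can be approximated with a small classifier. This is a rough sketch rather than the authors' preprocessing: the regular expressions and the `terms` lookup set are assumptions, and a real system would need a base-NP chunker to find terms.

```python
import re

# Hypothetical patterns for the 'special word' category.
SPECIAL_PATTERNS = [
    re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$"),  # email address
    re.compile(r"^(https?://|www\.)\S+$"),        # URL
    re.compile(r"^\d{1,3}(\.\d{1,3}){3}$"),       # IP address
    re.compile(r"^[$]?\d+([.,]\d+)*%?$"),         # number, money, percentage
    re.compile(r"^(=+|#+)$"),                     # unnecessary tokens
]
IMAGE_PATTERN = re.compile(r"^<IMAGE\b[^>]*>$", re.IGNORECASE)
PUNCTUATION = {".", "?", "!"}


def token_type(token, terms=frozenset()):
    """Return one of: image, punctuation, special, term, standard."""
    if IMAGE_PATTERN.match(token):
        return "image"
    if token in PUNCTUATION:
        return "punctuation"
    if any(pattern.match(token) for pattern in SPECIAL_PATTERNS):
        return "special"
    if token in terms:  # assumption: terms are supplied by an NP chunker
        return "term"
    return "standard"
```

For example, `token_type("Computer Science", terms={"Computer Science"})` yields `"term"`, while the same string without the lookup set falls back to `"standard"`.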
16
Possible Tag Assignment
Token type: Possible tags
Standard word: All possible tags
Special word: Position, Affiliation, Address, Email, Phone, Fax, Phd/Ms/Bs-date
Image token: Photo, Email
Term token: Position, Affiliation, Address, Phd/Ms/Bs-univ, Phd/Ms/Bs-major
Punctuation marks: Position, Affiliation, Address, Email, Phone, Fax, Phd/Ms/Bs-date
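The table above restricts which tags each token type may receive. A minimal sketch of that restriction as a lookup table follows; the short type keys and the helper name are hypothetical, and "all possible tags" is assumed to be the 16 profile tags reported in the results table.

```python
DEGREES = ("Phd", "Ms", "Bs")
CONTACT = ["Position", "Affiliation", "Address", "Email", "Phone", "Fax"]
DATES = [f"{degree}date" for degree in DEGREES]
UNIVS = [f"{degree}univ" for degree in DEGREES]
MAJORS = [f"{degree}major" for degree in DEGREES]

# Assumption: the full tag set is the 16 tags listed in the results table.
ALL_TAGS = ["Photo"] + CONTACT + DATES + UNIVS + MAJORS

POSSIBLE_TAGS = {
    "standard": set(ALL_TAGS),
    "special": set(CONTACT + DATES),
    "image": {"Photo", "Email"},
    "term": {"Position", "Affiliation", "Address", *UNIVS, *MAJORS},
    "punctuation": set(CONTACT + DATES),
}


def allowed_tags(token_type, candidates):
    """Keep only the candidate tags permitted for this token type."""
    return [tag for tag in candidates if tag in POSSIBLE_TAGS[token_type]]
```

Pruning the candidates this way shrinks the label space the tagger has to search for each token.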
17
Feature Definition
• Content features
Standard word:
– Word features: Whether the current token is a word
– Morphological features: Whether the word is capitalized
Image token:
– Image size: The size of the image
– Image height/width ratio: The value of height/width. The value for a person photo is often larger than 1
– Image format: JPG or BMP
– Image color: The number of “unique colors” used in the image and the number of bits used per pixel, i.e. 32, 24, 16, 8, 1
– Face recognition: Whether the current image contains a person's face
– Image filename: Whether the filename contains (partially) the researcher name
– Image “ALT”: Whether the “alt” of the image contains (partially) the researcher name
– Image positive keywords: “myself”, “biology”
– Image negative keywords: “ads”, “banner”, “logo”
18
Feature Definition
• Pattern features
– Positive words: “Fax:” for Fax, “director” for Position
– Special tokens: Whether the current token is a special word
• Term features
– Term: Whether the current token is a term
– Dictionary: Whether the current token is included in a dictionary
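The word, pattern, and term features described on these two slides are simple binary tests on a token. Here is a minimal sketch of such a feature function; the function name, the feature keys, and the way the token type and dictionary are passed in are all assumptions, and only the two positive-word examples given on the slide are included.

```python
# Positive words per tag, limited to the examples given on the slide.
POSITIVE_WORDS = {"Fax": {"fax:"}, "Position": {"director"}}


def token_features(token, kind, dictionary):
    """Binary features for one token.

    kind is the token type (e.g. "standard", "special", "term");
    dictionary is a set of lower-cased entries (hypothetical resource).
    """
    lowered = token.lower()
    features = {
        "is_word": token.isalpha(),              # word feature
        "is_capitalized": token[:1].isupper(),   # morphological feature
        "is_special": kind == "special",         # pattern feature
        "is_term": kind == "term",               # term feature
        "in_dictionary": lowered in dictionary,  # term feature
    }
    for tag, words in POSITIVE_WORDS.items():
        features[f"positive_{tag}"] = lowered in words  # pattern feature
    return features
```

In a CRF these booleans would be paired with candidate tags to form the node feature functions.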
19
Our Method for Name Disambiguation
• A hidden Markov Random Field model
• Observable Variables X represent publications
• Hidden Variables Y represent the labels of publications
• Constraints define the dependencies over hidden variables
[Figure: hidden Markov random field over eleven publications x1 to x11. Hidden labels: y1 = y2 = y3 = y8 = 1; y4 = y5 = y6 = y7 = 2; y9 = y10 = y11 = 3. Edges between publications are labeled with constraint types: coauthor, τ-coauthor, cite, and co-conference]
20
Objective Function
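$$ P(Y \mid X) \propto P(Y)\, P(X \mid Y) $$
$$ P(X \mid Y) = \frac{1}{Z_2} \exp\Big( -\sum_{x_i \in X} D(x_i, y_i) \Big) $$
$$ P(Y) = \frac{1}{Z_1} \exp\big( -V(Y) \big) = \frac{1}{Z_1} \exp\Big( -\sum_{N_i \in N} V_{N_i}(Y) \Big) = \frac{1}{Z_1} \exp\Big( -\sum_{i,j} V(i, j) \Big) $$
$$ P(Y) = \frac{1}{Z_1} \exp\Big( -\sum_{i,j} \Big\{ D(x_i, x_j)\, I(y_i \neq y_j) \sum_{c_k \in C} [ w_k c_k(y_i, y_j) ] \Big\} \Big) $$
Maximizing P(Y | X) is equivalent to minimizing the objective function, with Z = Z_1 Z_2:
$$ f_{obj} = \sum_{i,j} \Big\{ D(x_i, x_j)\, I(y_i \neq y_j) \sum_{c_k \in C} [ w_k c_k(y_i, y_j) ] \Big\} + \sum_{x_i \in X} D(x_i, y_i) + \log Z $$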
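The objective can be evaluated directly for a candidate labeling. The sketch below makes several assumptions that are not on the slide: plain cosine distance stands in for D, each unordered pair (i, j) is counted once, constraints are given as sets of index pairs, cluster centroids are supplied explicitly, and the constant log Z term is dropped because it does not depend on the labeling being compared.

```python
import math


def cosine_distance(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)


def objective(points, labels, centroids, constraints, weights):
    """f_obj without log Z: constraint penalty plus distance to centroids."""
    penalty = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if labels[i] == labels[j]:
                continue  # I(y_i != y_j) is zero
            weight = sum(weights[name] for name, pairs in constraints.items()
                         if (i, j) in pairs)
            penalty += cosine_distance(points[i], points[j]) * weight
    fit = sum(cosine_distance(x, centroids[label])
              for x, label in zip(points, labels))
    return penalty + fit


points = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
centroids = {1: [1.0, 0.0], 2: [0.0, 1.0]}
constraints = {"CoAuthor": {(0, 1)}}  # papers 0 and 1 share a coauthor
weights = {"CoAuthor": 1.0}

respects = objective(points, [1, 1, 2], centroids, constraints, weights)
violates = objective(points, [1, 2, 2], centroids, constraints, weights)
```

In this toy setup the labeling that separates the two constrained papers pays both the constraint penalty and a larger centroid distance, so it scores worse.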
21
Constraint Definition
Example: four papers and their authors
p1: A, B, C
p2: A, B
p3: A, D
p4: C, D

Mp(0):   p1 p2 p3
    p1    1  0  0
    p2    0  1  0
    p3    0  0  1

Mp(1):   p1 p2 p3
    p1    1  1  0
    p2    1  1  0
    p3    0  0  1

Mp(2):   p1 p2 p3
    p1    1  1  1
    p2    1  1  0
    p3    1  0  1

Mp(3):   p1 p2 p3
    p1    1  1  1
    p2    1  1  1
    p3    1  1  1

C, W, Constraint Name: Description
c1, w1, CoOrg: a_i^(0).affiliation = a_j^(0).affiliation
c2, w2, CoAuthor: ∃ r, s > 0, a_i^(r) = a_j^(s)
c3, w3, Citation: p_i cites p_j or p_j cites p_i
c4, w4, CoEmail: a_i^(0).email = a_j^(0).email
c5, w5, Feedback: Constraints from user feedback
c6, w6, τ-CoAuthor: one common author in τ extension
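The four matrices in the example can be reproduced with a breadth-first search over the coauthor graph. The slide does not say how the matrices are built, so the construction here is an assumption: the ambiguous author A is removed, two authors are adjacent if they share any paper, and two of A's papers are linked at level τ when some coauthor of one is within τ - 1 hops of some coauthor of the other.

```python
from collections import deque


def tau_coauthor_matrices(papers, target, query, max_tau):
    """Return [M(0), ..., M(max_tau)] over the papers listed in query."""
    others = {p: [a for a in authors if a != target]
              for p, authors in papers.items()}
    graph = {}
    for authors in others.values():
        for a in authors:
            graph.setdefault(a, set()).update(b for b in authors if b != a)

    def hops_from(sources):
        # Multi-source BFS over the coauthor graph.
        dist = {a: 0 for a in sources}
        queue = deque(sources)
        while queue:
            a = queue.popleft()
            for b in graph.get(a, ()):
                if b not in dist:
                    dist[b] = dist[a] + 1
                    queue.append(b)
        return dist

    # needed[p][q]: smallest tau at which papers p and q become linked.
    needed = {}
    for p in query:
        dist = hops_from(others[p])
        needed[p] = {}
        for q in query:
            reach = [dist[a] for a in others[q] if a in dist]
            needed[p][q] = 0 if p == q else (min(reach) + 1 if reach else None)

    return [[[1 if needed[p][q] is not None and needed[p][q] <= tau else 0
              for q in query] for p in query]
            for tau in range(max_tau + 1)]


papers = {"p1": ["A", "B", "C"], "p2": ["A", "B"],
          "p3": ["A", "D"], "p4": ["C", "D"]}
matrices = tau_coauthor_matrices(papers, "A", ["p1", "p2", "p3"], 3)
```

With this reading, p1 and p2 are linked at τ = 1 through B, p1 and p3 at τ = 2 because C and D co-wrote p4, and p2 and p3 only at τ = 3.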
22
Parameterized Distance Function
We define our distance function as follows:
$$ D(x_i, x_j) = 1 - \frac{x_i^T A x_j}{\|x_i\|_A \, \|x_j\|_A} $$
where
$$ \|x_i\|_A = \sqrt{x_i^T A x_i} $$
We can see that A actually maps each vector x_i into another new space, i.e. A^{1/2} x_i
To simplify our question, we define A as a diagonal matrix
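With a diagonal A, the distance reduces to a weighted cosine distance. A minimal sketch follows; the function names are hypothetical, and A is represented by the list of its diagonal entries a_m, which are assumed to be positive.

```python
import math


def norm_a(x, a):
    """||x||_A = sqrt(x^T A x) for diagonal A with entries a."""
    return math.sqrt(sum(am * xm * xm for am, xm in zip(a, x)))


def distance(xi, xj, a):
    """D(x_i, x_j) = 1 - x_i^T A x_j / (||x_i||_A ||x_j||_A)."""
    dot = sum(am * u * v for am, u, v in zip(a, xi, xj))
    return 1.0 - dot / (norm_a(xi, a) * norm_a(xj, a))


xi, xj = [1.0, 1.0], [1.0, 0.0]
plain = distance(xi, xj, [1.0, 1.0])      # ordinary cosine distance
weighted = distance(xi, xj, [1.0, 0.25])  # second dimension down-weighted
```

Down-weighting the dimension on which the two vectors disagree brings them closer together, which is the effect that learning A is meant to exploit.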
23
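EM Framework
Here l_i is the current cluster label of x_i, y_h is the centroid of cluster h, and a_m is the m-th diagonal entry of A.
• Initialization
– Use constraints to generate the initial k clusters
• E-Step
– Assign each x_i to the cluster h that minimizes
$$ f_{obj}(x_i, y_h) = \sum_{j} \Big\{ D(x_i, x_j)\, I(h \neq l_j) \sum_{c_k \in C} [ w_k c_k(p_i, p_j) ] \Big\} + D(x_i, y_h) $$
• M-Step
– Update cluster centroid
$$ y_h = \frac{\sum_{i:\, l_i = h} x_i}{\big\| \sum_{i:\, l_i = h} x_i \big\|_A} $$
– Update parameter matrix A
$$ \frac{\partial f_{obj}}{\partial a_m} = \sum_{i,j} \Big\{ \frac{\partial D(x_i, x_j)}{\partial a_m}\, I(l_i \neq l_j) \sum_{c_k \in C} [ w_k c_k(p_i, p_j) ] \Big\} + \sum_{x_i \in X} \frac{\partial D(x_i, y_{l_i})}{\partial a_m} $$
$$ \frac{\partial D(x_i, x_j)}{\partial a_m} = -\,\frac{ x_{im} x_{jm} \|x_i\|_A \|x_j\|_A - x_i^T A x_j \, \dfrac{ x_{im}^2 \|x_j\|_A^2 + x_{jm}^2 \|x_i\|_A^2 }{ 2 \|x_i\|_A \|x_j\|_A } }{ \|x_i\|_A^2 \, \|x_j\|_A^2 } $$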
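Two pieces of the M-step can be checked numerically: the centroid update and the derivative of the distance with respect to one diagonal entry a_m. The sketch below assumes D(x_i, x_j) = 1 - x_i^T A x_j / (||x_i||_A ||x_j||_A) with diagonal A, so the analytic derivative carries a leading minus sign; all function names are hypothetical.

```python
import math


def norm_a(x, a):
    return math.sqrt(sum(am * xm * xm for am, xm in zip(a, x)))


def distance(xi, xj, a):
    dot = sum(am * u * v for am, u, v in zip(a, xi, xj))
    return 1.0 - dot / (norm_a(xi, a) * norm_a(xj, a))


def centroid(points, a):
    """y_h: the sum of the cluster's points, normalized to unit A-norm."""
    total = [sum(column) for column in zip(*points)]
    scale = norm_a(total, a)
    return [t / scale for t in total]


def grad_distance(xi, xj, a):
    """Analytic dD/da_m for every m."""
    ni, nj = norm_a(xi, a), norm_a(xj, a)
    dot = sum(am * u * v for am, u, v in zip(a, xi, xj))
    grad = []
    for u, v in zip(xi, xj):
        numerator = (u * v * ni * nj
                     - dot * (u * u * nj * nj + v * v * ni * ni)
                     / (2.0 * ni * nj))
        grad.append(-numerator / (ni * ni * nj * nj))
    return grad


def numeric_grad(xi, xj, a, eps=1e-6):
    """Central finite differences, used only to sanity-check the formula."""
    grad = []
    for m in range(len(a)):
        up, down = list(a), list(a)
        up[m] += eps
        down[m] -= eps
        grad.append((distance(xi, xj, up) - distance(xi, xj, down))
                    / (2.0 * eps))
    return grad
```

A gradient step on A would subtract a small multiple of the summed derivatives from each a_m; the step size and any clipping to keep a_m positive are left out here.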
24
Outline
• Motivation
• Related Work
• Problem Description
• Our Approach
• Experimental Results
• Summary
25
Profiling Experiments
• Dataset
– 1K researchers from ArnetMiner.org
• Baseline
– Amilcare
– Support Vector Machines
– Unified_NT (CRFs without transition features)
• Evaluation measures
– Precision, Recall, F1
26
Profiling Results—5-fold cross validation
Profiling Task  Unified  Unified_NT  SVM    Amilcare
Photo           89.11    88.64       88.86  31.62
Position        69.44    64.70       64.68  56.48
Affiliation     83.52    72.16       73.86  46.65
Phone           91.10    78.72       79.71  83.33
Fax             90.83    64.28       64.17  86.88
Email           80.35    75.47       79.37  78.70
Address         86.34    75.15       77.04  66.24
Bsuniv          67.38    57.56       59.54  47.17
Bsmajor         64.20    59.18       60.75  58.67
Bsdate          53.49    40.59       28.49  52.34
Msuniv          57.55    47.49       49.78  45.00
Msmajor         63.35    61.92       62.10  57.14
Msdate          48.96    41.27       30.07  56.00
Phduniv         63.73    53.11       57.01  59.42
Phdmajor        67.92    59.30       59.67  57.93
Phddate         57.75    42.49       41.44  61.19
Overall         83.37    72.09       73.57  62.30
27
Contribution of Features
[Figure: bar chart of F1-measure (axis from 68 to 84) for four feature sets: content, content+term, content+pattern, all]
28
Disambiguation Experiments
• Data Sets:

Abbreviated Name dataset
Name Set   Publications  Name Variations
C. Chang   402           97
G. Wu      152           46
K. Zhang   293           40
J. Li      551           102
B. Liang   55            14
M. Hong    108           30
X. Xie     136           36
P. Xu      39            5
H. Xu      182           60
W. Yang    263           82

Real Name dataset
Name: Affiliation, Publication
Jing Zhang (54/25):
– Shanghai Jiao Tong Univ., 6
– Dept. of Automation, Tsinghua Univ., 3
– Alabama Univ., 8
– Univ. of California, Davis, 4
– Carnegie Mellon University, 5
– State Univ. of New York at Albany, 4
Yi Li (42/22):
– National Univ. of Singapore, 6
– South China Univ. of Technology, 2
– George Mason Univ., 2
– Chinese Academy of Sciences, 5
– Univ. of Washington, 3
– Nanjing Normal Univ., 4
29
Experiment Setup
• Baseline Method
– Unsupervised Hierarchical Clustering Method
• Measurement
– Pairwise_Precision = #PairsCorrectlyPredictedToSameAuthor / #TotalPairsPredictedToSameAuthor
– Pairwise_Recall = #PairsCorrectlyPredictedToSameAuthor / #TotalPairsToSameAuthor
– Pairwise_F1-measure = 2 × Precision × Recall / (Precision + Recall)
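The pairwise measures can be computed by comparing every pair of publications under the predicted and the true author assignment. A minimal sketch, assuming both assignments are given as lists of labels aligned by publication index (the function name and the zero fallbacks for empty denominators are my own choices):

```python
from itertools import combinations


def pairwise_scores(predicted, truth):
    """Return (precision, recall, F1) over pairs of publications."""
    pairs = list(combinations(range(len(truth)), 2))
    predicted_same = {(i, j) for i, j in pairs if predicted[i] == predicted[j]}
    truly_same = {(i, j) for i, j in pairs if truth[i] == truth[j]}
    correct = len(predicted_same & truly_same)

    precision = correct / len(predicted_same) if predicted_same else 0.0
    recall = correct / len(truly_same) if truly_same else 0.0
    total = precision + recall
    f1 = 2.0 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Five publications: the true authors are {0, 1, 2} and {3, 4};
# the prediction moves publication 2 into the second cluster.
scores = pairwise_scores([1, 1, 2, 2, 2], [1, 1, 1, 2, 2])
```

Because the measures are defined on pairs, the cluster label values themselves do not need to match between the prediction and the ground truth.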
30
Disambiguation Results
Columns: Name, then Precision, Recall, F1 for Unsupervised Hierarchical Clustering, then Precision, Recall, F1 for the Constraint-based Probabilistic Framework
C. Chang 0.65 0.59 0.62 0.73 0.67 0.70
G. Wu 0.71 0.62 0.66 0.75 0.75 0.75
K. Zhang 0.75 0.60 0.67 0.79 0.71 0.75
J. Li 0.62 0.52 0.57 0.66 0.59 0.62
B. Liang 0.82 0.76 0.79 0.85 0.89 0.87
M. Hong 0.79 0.65 0.71 0.82 0.75 0.78
X. Xie 0.77 0.73 0.75 0.83 0.82 0.82
P. Xu 0.89 0.95 0.92 0.94 1.00 0.97
H. Xu 0.65 0.59 0.62 0.73 0.67 0.70
W. Yang 0.71 0.62 0.66 0.75 0.75 0.75
Average 0.75 0.60 0.67 0.79 0.71 0.75
31
Contribution of Different Constraints
[Figure: horizontal bar chart of pairwise F1-measure (axis from 0.000 to 1.000) for the name sets "Yi Li" and "Jing Zhang" under each constraint combination: All, CoAuthor, CoEmail, CoOrg, k-Author, Reference, No-Coauthor, No-CoEmail, No-CoOrg, No-k-Author, No-Reference, Baseline]
32
How Profiling and Disambiguation Help Expert Finding
• Expert finding by using a PageRank-based method
[Figure: bar chart (axis from 0 to 1) comparing EF, EF+RPE, and EF+RPE+ND on P@5, P@10, P@20, P@30, R-prec, MAP, bpref, and MRR]
33
Outline
• Motivation
• Related Work
• Problem Description
• Our Approach
• Experimental Results
• Summary
34
Summary
• Investigated the problem of researcher social network extraction
• Proposed a unified approach to perform profiling and a constraint-based probabilistic model for name disambiguation
• Experimental results show that our approaches outperform the baseline methods
• When applying them to expert finding, we obtain a significant improvement in performance
35
Thanks!
Q&A
HP: http://keg.cs.tsinghua.edu.cn/persons/tj/
Online Demo: http://arnetminer.org