Introduction to Information Retrieval CS 5604: Information Storage and Retrieval ProjCINETViz by Maksudul Alam, S M Arifuzzaman, and Md Hasanuzzaman Bhuiyan

Post on 10-Feb-2022

Page 1: CS 5604: Information Storage and Retrieval - VTechWorks

Introduction to Information Retrieval

CS 5604: Information Storage and Retrieval

ProjCINETViz

by

Maksudul Alam,

S M Arifuzzaman, and

Md Hasanuzzaman Bhuiyan

Overview

Recap

Features

Demonstration

Technical Challenges

Future work

Project Description

• Developed a visualization module

– Visualize graphs using Gephi

– Integrate this visualization module with CINET

• Supports large network graphs

Gephi

• Java-based visualization and exploration platform

• Interactive

• Visualize all kinds of networks

• Compatible with Windows, Linux and Mac OS X

• Open-source and free

How to use Gephi?

• Stand-alone desktop application

• Java-based Gephi Toolkit library

• We will use the Gephi Toolkit library

Network Representation

Network Visualization

• Typical steps to visualize a network:

1. Layout

– Random

– Force Atlas

– Yifan Hu

2. Feature-based organization

– Degree

– Betweenness centrality

– Closeness centrality

– Modularity

3. Visualization in a web browser

– Java applet

– JavaScript

– Flash

– WebGL

CINET

• Cyber-Infrastructure for NETwork Science

• Easy-to-use cyber-environment

• Provides a computational and analytic environment for network analysis

• Developed at the Network Dynamics and Simulation Science Laboratory (NDSSL)

• Funded by NSF

Integration of Visualization into CINET

[Figure: CINET architecture diagram. Components shown:]

• User interfaces: research interface, instructional interface, Blackboard

• APIs: Java API, WS API, batch API, DL API

• Brokers: GaLib broker, NetworkX broker, CINet broker, interface broker, execution broker, resource broker, digital library broker

• Digital library: results, networks

• Measures and analysis: GaLib, NetworkX, EpiFast (upcoming), Pajek (upcoming)

• Compute resources: HPC cluster, individual server, with batch interfaces and model wrappers

• Viz. interface with preprocessed viz. data

• Sample result plots: proportion symptomatic by day (0 to 120), county vs. state, for Autauga through Cleburne counties

Typical Visualization Workflow

[Figure: CINETViz workflow between the user and the CINET server]

• CINETViz components: GEXF Generation Core, Layout Core, Network Analysis Core, Visualization Core, Web Rendering Script

• Workflow steps:

1. User parameters

2. Generate GEXF from CINET graphs

3. Apply layout

4. Network analysis

5. Color, size, label

6. Process data for web browser

7. Store rendered graph

8. Display network in web browser
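The GEXF generation step in this workflow can be illustrated with a minimal sketch. It assumes the input graph is available as a plain list of (source, target) pairs; the actual CINET graph formats and the project's own generator are not shown in the slides. The function name `edges_to_gexf` is hypothetical, and only the Python standard library is used.

```python
import xml.etree.ElementTree as ET

# Namespace of the GEXF 1.2 draft format, which Gephi can read.
GEXF_NS = "http://www.gexf.net/1.2draft"


def edges_to_gexf(edges, directed=False):
    """Build a minimal GEXF document from (source, target) pairs.

    Hypothetical helper: the edge-list input is an assumption, not the
    CINET format. Node labels are simply the node ids as strings.
    """
    edges = list(edges)  # the pairs are traversed twice below
    root = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(root, "graph", {
        "mode": "static",
        "defaultedgetype": "directed" if directed else "undirected",
    })
    nodes_el = ET.SubElement(graph, "nodes")
    edges_el = ET.SubElement(graph, "edges")

    # Emit each node once, in order of first appearance.
    seen = set()
    for u, v in edges:
        for n in (u, v):
            if n not in seen:
                seen.add(n)
                ET.SubElement(nodes_el, "node",
                              {"id": str(n), "label": str(n)})

    # Emit one <edge> element per input pair.
    for i, (u, v) in enumerate(edges):
        ET.SubElement(edges_el, "edge",
                      {"id": str(i), "source": str(u), "target": str(v)})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(edges_to_gexf([(1, 2), (2, 3)]))
```

The resulting string could be written to a `.gexf` file and opened in the Gephi desktop application or loaded through the Gephi Toolkit.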

CINETViz – Features

• Mimic the core functionality of the Gephi desktop application in a web interface:

– Layout

– Ranking based on parameters

– Partitioning

• Dynamic range of visualization

– Users can choose how node color and size vary, and by how much

• Store rendered networks in an organized structure
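The "dynamic range" feature above can be pictured as a linear mapping from a per-node measure (for example, degree) to a user-chosen size range and color ramp. The following is a minimal sketch of that idea, not the project's implementation, which uses the Gephi Toolkit ranking facilities. The names `scale_sizes` and `ramp_color` and the default ranges are hypothetical.

```python
def scale_sizes(values, min_size=5.0, max_size=30.0):
    """Linearly map per-node values onto [min_size, max_size].

    `values` is a dict of node -> numeric measure (e.g. degree).
    The size bounds stand in for the user-selected range.
    """
    if not values:
        return {}
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    if span == 0:
        # All nodes have the same value: use the midpoint of the range.
        mid = (min_size + max_size) / 2
        return {node: mid for node in values}
    return {
        node: min_size + (v - lo) / span * (max_size - min_size)
        for node, v in values.items()
    }


def ramp_color(t, start=(0, 0, 255), end=(255, 0, 0)):
    """Interpolate an RGB color for t in [0, 1] between two end colors."""
    t = max(0.0, min(1.0, t))  # clamp out-of-range inputs
    return tuple(round(s + t * (e - s)) for s, e in zip(start, end))


if __name__ == "__main__":
    degrees = {"a": 1, "b": 3, "c": 5}
    print(scale_sizes(degrees, min_size=10, max_size=30))
    print(ramp_color(0.0), ramp_color(1.0))
```

Widening or narrowing the `min_size`/`max_size` interval corresponds to the user choosing "by how much" node size varies.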

CINETViz-DEMO

• Main Screen

• http://128.173.98.199:8082/granite

CINETViz-DEMO

• Visualization is integrated as a tab in the CINET interface

CINETViz-DEMO

• Users can view a pre-rendered network or submit a new network visualization.

CINETViz-DEMO

• To generate a new network visualization, users pick a network and select appropriate visualization parameters

Difficulties

• Graph format

– Diverse

– Conversion

• Data transfer from server to web app

– Latency, bandwidth, browser compatibility and support

• Integration with CINET

– Compatibility with existing architecture

– Issues with SmartGWT, etc.

CINETViz Implementation Challenges

• Study of CINET GRANITE framework

• Integration of visualization toolkit into web browser

– Communicate between GWT and the sigma.js visualization library using native JavaScript

• Communication between the web server and the high-performance cluster

• Implementation of visualization methods (coloring, sizing, layout) programmatically using the Gephi Toolkit

Visualizing Large Networks

• A network is considered large if |V| ≥ 10,000 or |E| ≥ 50,000

• Choose a root node

– Randomly

– User defined

• Using BFS, explore from the root up to:

– A pre-specified depth (e.g., 4 or 5)

– A pre-specified number of nodes (e.g., 200 nodes)
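The sampling strategy above can be sketched as a breadth-first search that stops at a depth limit or a node budget. This is a minimal illustration, not the project's code: it assumes an undirected graph stored as a dict of adjacency lists, and the names `is_large`, `bfs_sample`, and `induced_edges` are hypothetical. Applying both limits at once is also an assumption; passing `None` for either one disables it.

```python
import random
from collections import deque


def is_large(num_nodes, num_edges, node_limit=10_000, edge_limit=50_000):
    """Thresholds from the slide: |V| >= 10,000 or |E| >= 50,000."""
    return num_nodes >= node_limit or num_edges >= edge_limit


def bfs_sample(adj, root=None, max_depth=4, max_nodes=200, seed=None):
    """Return {node: depth} for nodes reached by BFS from `root`.

    `adj` maps each node to an iterable of neighbors. If `root` is None,
    a root is chosen at random (optionally seeded for repeatability).
    """
    if not adj:
        return {}
    if root is None:
        root = random.Random(seed).choice(list(adj))
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        # Do not expand nodes that already sit at the depth limit.
        if max_depth is not None and depth[node] >= max_depth:
            continue
        for nbr in adj.get(node, ()):
            if nbr in depth:
                continue
            # Stop as soon as the node budget is exhausted.
            if max_nodes is not None and len(depth) >= max_nodes:
                return depth
            depth[nbr] = depth[node] + 1
            queue.append(nbr)
    return depth


def induced_edges(adj, nodes):
    """Undirected edges of `adj` with both endpoints in `nodes`."""
    keep = set(nodes)
    return {
        frozenset((u, v))
        for u in keep
        for v in adj.get(u, ())
        if v in keep and u != v
    }


if __name__ == "__main__":
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    sample = bfs_sample(path, root=0, max_depth=2)
    print(sorted(sample), len(induced_edges(path, sample)))
```

The sampled nodes together with their induced edges form the smaller subgraph that would then be laid out and rendered in place of the full network.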

Future Work

• Workflow

– Visualizing the output

• Providing more information

– Showing node label, ID, edge weight, etc.

• Filtering

– Visualize a small part of the graph

• Graph organization by applying multiple algorithms

– For example, applying both PageRank and betweenness centrality

• Comparison of different visualizations

– Using different measures

Questions and Comments
