Graph-RAT Overview By Daniel McEnnis

What is Graph-RAT

Relational Analysis Toolkit

Database abstraction layer

Evaluation platform

Robustly evaluate all different ways of performing recommendation

Kinds of Analysis

Recommendation Systems

Relational Machine Learning

Data Mining

MIR document retrieval

Talk Outline

Base Components

Queries

Algorithms

Schedulers

Graph-RAT Language

Conclusion and Examples

Base Components

Graphs

Actors

Links

Properties

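[Figure: graphs of actors A to E and the links between them, with example properties Name (John), Age (22), and Hobbies ([Vector] Hiking, Biking); the diagram also carries the label Library]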

Properties

Variables of Graph-RAT

Can be arbitrary Java types

Can be attached to anything

Unique ID string for each object

Accessed only as sets, not as objects
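
The property model above can be sketched in Java roughly as follows. This is only an illustration: the `Property` class and its method names are hypothetical and are not taken from the Graph-RAT API. It assumes that a property has a unique ID string, is bound to one Java type, and hands its values back only as a collection.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch, not the Graph-RAT API: a typed, multi-valued property.
final class Property<T> {
    private final String id;       // unique ID string for this property
    private final Class<T> type;   // the Java type every value must have
    private final List<T> values = new ArrayList<>();

    Property(String id, Class<T> type) {
        this.id = id;
        this.type = type;
    }

    String getId() {
        return id;
    }

    Class<T> getType() {
        return type;
    }

    // Adds one value; the cast rejects values of the wrong runtime type.
    void add(T value) {
        values.add(type.cast(value));
    }

    // Values are read back only as a collection, never as a single object.
    List<T> getValues() {
        return Collections.unmodifiableList(values);
    }
}
```

Under these assumptions, a Hobbies property on an actor would hold "Hiking" and "Biking" as two values of one property rather than as two separate properties.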

Data View

Hyper-graph structure defined by the set of actors and links in a graph

Accessible from the enclosing graph

Can be cyclic

[Figure: example graphs of actors A to E and the links between them]

Metadata View

Not constructed by default

Implicit graph described by modes and the relations between them

Needed for relational machine learning

[Figure: metadata graph with the mode User and the relation Friend]

Query Language

Constructs sets retrieved from a graph

Functional structure

Similar to SQL

4 types

– Graph Queries
– Actor Queries
– Link Queries
– Property Queries

Query Structure

Cascading queries in a LISP style syntax

Each child query is of a different type

Restrictions can be added at runtime

Query Examples

LinkByActor(false,
    ActorByMode(false, "Target", ".*")
    ActorByMode(false, "Source", ".*")
    SetOperation.XOR)
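
One way to picture how such nested queries could compose is the Java sketch below. Everything in it is hypothetical and is not the Graph-RAT API: a query is modelled as a function from a graph to a set, and a link query combines two child actor queries with a set operation. The semantics of the set operation are an assumption (a link's source is tested against the first actor set and its destination against the second), and the leading boolean from the slide example is left out because its meaning is not stated.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Function;
import java.util.regex.Pattern;

// Hypothetical sketch, not the Graph-RAT API: a query is a function from a graph to a set.
final class QuerySketch {
    static final class Actor {
        final String mode;
        final String id;
        Actor(String mode, String id) { this.mode = mode; this.id = id; }
    }

    static final class Link {
        final Actor source;
        final Actor destination;
        Link(Actor source, Actor destination) { this.source = source; this.destination = destination; }
    }

    static final class Graph {
        final Set<Actor> actors = new LinkedHashSet<>();
        final Set<Link> links = new LinkedHashSet<>();
    }

    enum SetOperation { AND, OR, XOR }

    // Actor query: every actor of the given mode whose ID matches the regular expression.
    static Function<Graph, Set<Actor>> actorByMode(String mode, String idPattern) {
        Pattern pattern = Pattern.compile(idPattern);
        return graph -> {
            Set<Actor> result = new LinkedHashSet<>();
            for (Actor actor : graph.actors) {
                if (actor.mode.equals(mode) && pattern.matcher(actor.id).matches()) {
                    result.add(actor);
                }
            }
            return result;
        };
    }

    // Link query over two child actor queries (assumed semantics): test the link's source
    // against the first set and its destination against the second, then combine the tests.
    static Function<Graph, Set<Link>> linkByActor(Function<Graph, Set<Actor>> sources,
                                                  Function<Graph, Set<Actor>> destinations,
                                                  SetOperation op) {
        return graph -> {
            Set<Actor> s = sources.apply(graph);
            Set<Actor> d = destinations.apply(graph);
            Set<Link> result = new LinkedHashSet<>();
            for (Link link : graph.links) {
                boolean inS = s.contains(link.source);
                boolean inD = d.contains(link.destination);
                boolean keep;
                switch (op) {
                    case AND: keep = inS && inD; break;
                    case OR:  keep = inS || inD; break;
                    default:  keep = inS ^ inD;  break; // XOR
                }
                if (keep) {
                    result.add(link);
                }
            }
            return result;
        };
    }
}
```

Because each query is just a function, child queries of different types nest naturally, and a further restriction can be added at runtime by wrapping an existing query in another function.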

Query Comparisons

Similar to the Jena interface

Construction is similar to the JUNG system

Implements all SQL queries that do not require temporary tables

0.4.3 Query

Uses graph primitives instead of Queries

Algorithms use hard-coded GraphByID

Algorithms

Functions that execute over a given graph

Metadata is a part of the algorithm

Properties utilized or created are declared up front.

Excepting output algorithms, no side effects are permitted.

execute(Graph graph)

IODescriptor getInput()

IODescriptor getOutput()
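
A rough Java sketch of this contract follows. Only the three method names come from the slide; the toy `Graph` and `IODescriptor` types and the `NameLength` example are assumptions made for illustration and do not reproduce Graph-RAT's real classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, not the Graph-RAT API.
final class AlgorithmSketch {
    // Toy graph: actor ID -> (property ID -> value).
    static final class Graph {
        final Map<String, Map<String, Object>> actorProperties = new LinkedHashMap<>();
    }

    // Names one property that an algorithm reads or writes.
    static final class IODescriptor {
        final String propertyId;
        IODescriptor(String propertyId) { this.propertyId = propertyId; }
    }

    // Method names follow the slide; the types around them are assumed.
    interface Algorithm {
        void execute(Graph graph);
        IODescriptor getInput();
        IODescriptor getOutput();
    }

    // Example algorithm: reads "Name" and writes "NameLength" on every actor.
    // It touches nothing except the output property it declares up front.
    static final class NameLength implements Algorithm {
        public IODescriptor getInput() { return new IODescriptor("Name"); }

        public IODescriptor getOutput() { return new IODescriptor("NameLength"); }

        public void execute(Graph graph) {
            String in = getInput().propertyId;
            String out = getOutput().propertyId;
            for (Map<String, Object> properties : graph.actorProperties.values()) {
                Object name = properties.get(in);
                if (name instanceof String) {
                    properties.put(out, ((String) name).length());
                }
            }
        }
    }
}
```

Declaring inputs and outputs this way would let a caller check, before running anything, that every property an algorithm needs is produced by an earlier step.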

Propositional Algorithms

Utilizes aggregator function as a parameter

Crosses all ways of shifting data

– Aggregate By Link
– Aggregate By Link Property
– Aggregate On Graph
– Graph To Actor
– Link To Graph
– Graph To Graph

Aggregator Functions

Map one or more elements to an equal or smaller number of elements

Examples

– Statistical Moments
– Arithmetic Operations
– Null Aggregation
– Concatenation
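
As an illustration of the idea, the Java sketch below defines a hypothetical `Aggregator` interface with three instances. The names are not from Graph-RAT, and reading "null aggregation" as passing the values through unchanged is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the Graph-RAT API.
final class AggregatorSketch {
    // Maps one or more values to an equal or smaller number of values.
    interface Aggregator {
        List<Double> aggregate(List<Double> values);
    }

    // Statistical moment: the mean, reducing n values to one. Expects a non-empty list.
    static final Aggregator MEAN = values -> {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        List<Double> out = new ArrayList<>();
        out.add(sum / values.size());
        return out;
    };

    // Arithmetic operation: the sum, reducing n values to one.
    static final Aggregator SUM = values -> {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        List<Double> out = new ArrayList<>();
        out.add(sum);
        return out;
    };

    // Null aggregation (assumed meaning): n values in, the same n values out.
    static final Aggregator NULL = values -> new ArrayList<>(values);
}
```

Passing one of these objects as a parameter is what would let a single propositional algorithm, such as an aggregate-by-link step, be reused with different reductions.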

Social Network Analysis Algorithms

Prestige Algorithms

– Degree
– Betweenness
– Closeness
– PageRank
– HITS

Graph Triples

Classification Algorithms

Machine Learning Primitives

Uses Weka

Separate algorithms for training and classifying

Clustering Algorithms

Several graph-based algorithms

– Weak Component Clustering
– Strong Component Clustering
– Edge Betweenness Clustering
– Newman-Girvan Edge Betweenness

Also has primitives calling Weka on vector data

Similarity Algorithms

Comparisons between modes

Types of Similarity

– Similarity By Link
– Similarity By Property
– Graph Similarity

Distance Functions

– All Weka distance functions
– KLDistance
– Exponential Distance

Collaborative Filtering Algorithms

Traditional recommendation algorithms

– Item to Item
– User to User
– Associative Mining

Array-Based Algorithms

Transform To Array

Principal Component Analysis

Evaluation

All forms of evaluating results

– Set Based (precision and recall)
– Weighted Set (Correlations)
– Ordered Lists (Kendall Tau, Half Life)

Cross-Validation algorithms

– By Actor
– By Link
– By Graph
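
For the set-based case, precision and recall can be computed as in the Java sketch below. The class and method names are hypothetical, and returning 0.0 when a denominator would be zero is a convention chosen for this sketch, not something stated in the talk.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not the Graph-RAT API: set-based evaluation of recommendations.
final class SetEvaluation {
    // Precision: the fraction of recommended items that are relevant.
    static double precision(Set<String> recommended, Set<String> relevant) {
        if (recommended.isEmpty()) {
            return 0.0; // convention chosen for this sketch
        }
        return (double) overlap(recommended, relevant) / recommended.size();
    }

    // Recall: the fraction of relevant items that were recommended.
    static double recall(Set<String> recommended, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0; // convention chosen for this sketch
        }
        return (double) overlap(recommended, relevant) / relevant.size();
    }

    // Number of items present in both sets.
    private static int overlap(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }
}
```

With four recommended items of which two are among three relevant ones, precision is 2/4 and recall is 2/3.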

Data Acquisition

Components for acquiring source data

File Reader Types

– Reading different file formats

Web Crawling Types

– LiveJournal or LastFM

Connection Types

– Links different sets together

Web Crawler

Custom multi-threaded web crawler

Dynamic parsers

Properties passing between both crawls and parser execution

Stop and filter conditions are parameterized

Existing Parsers

Base HTML parsing

XML Parsing (SAX)

– LiveJournal FOAF
– LastFM REST services
– Graph-RAT documents
– Yahoo search queries

Comparisons

SQL

LINQ

Matlab

Other graph packages

Prolog?

Embedded Use

Dynamic Loading

AbstractFactory abstract superclass

Example - Retrieving links to YouTube videos from GData

Graph-RAT Language

Base Graph-RAT

– Data Acquisition components executed
– For each algorithm entry:
  – Graph Query selects a set of graphs
  – Algorithm is executed over each graph

Cross-Validation Graph-RAT

– Mode, relation, or graph chosen in advance
– Data Acquisition components run once
– Algorithm entries rerun for each fold

Statistical Graph-RAT

– List of cross-validation schedulers
– Statistical metrics of which performed better
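
The control flow of the base scheduler described above might look like the following Java sketch. All names are hypothetical and generic placeholders stand in for Graph-RAT's graph, query, and algorithm types; only the order of operations (acquisition first, then each algorithm entry over the graphs its query selects) follows the slide.

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch, not the Graph-RAT API: control flow of a basic scheduler.
final class BaseSchedulerSketch {
    // One algorithm entry: a graph query plus the algorithm to run on each selected graph.
    static final class Entry<G> {
        final Function<List<G>, List<G>> query;
        final Consumer<G> algorithm;

        Entry(Function<List<G>, List<G>> query, Consumer<G> algorithm) {
            this.query = query;
            this.algorithm = algorithm;
        }
    }

    // Runs every acquisition step once, then each entry in order over its selected graphs.
    static <G> void run(List<Runnable> acquisition, List<G> graphs, List<Entry<G>> entries) {
        for (Runnable step : acquisition) {
            step.run();
        }
        for (Entry<G> entry : entries) {
            for (G graph : entry.query.apply(graphs)) {
                entry.algorithm.accept(graph);
            }
        }
    }
}
```

A cross-validation scheduler could reuse the same inner loop, running the acquisition steps once and repeating the entry loop for each fold.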

User To User Collaborative Filtering Example

Aggregate By Link (Artist->User)

Similarity By Link (User->User)

Aggregate By Link (User->User)

Property to Link (User->Artist)
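
The four steps can be approximated outside Graph-RAT by the Java sketch below. It is an illustration under stated assumptions, not the toolkit's implementation: the first step is taken as already done (each user has a map from artist to play count), similarity is cosine similarity, neighbour scores are combined as a similarity-weighted sum, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the Graph-RAT API: user-to-user collaborative filtering.
final class UserToUserSketch {
    // Step 2: cosine similarity between two sparse artist-count vectors.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    // Steps 3 and 4: score artists the user has not seen by a similarity-weighted sum
    // over the other users, then return the k best-scoring artists as recommendations.
    static List<String> recommend(String user, Map<String, Map<String, Double>> profiles, int k) {
        Map<String, Double> own = profiles.get(user);
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, Map<String, Double>> other : profiles.entrySet()) {
            if (other.getKey().equals(user)) {
                continue;
            }
            double similarity = cosine(own, other.getValue());
            if (similarity <= 0.0) {
                continue;
            }
            for (Map.Entry<String, Double> e : other.getValue().entrySet()) {
                if (!own.containsKey(e.getKey())) {
                    scores.merge(e.getKey(), similarity * e.getValue(), Double::sum);
                }
            }
        }
        List<String> ranked = new ArrayList<>(scores.keySet());
        // Highest score first; ties are broken alphabetically so the order is deterministic.
        ranked.sort((x, y) -> {
            int byScore = Double.compare(scores.get(y), scores.get(x));
            return byScore != 0 ? byScore : x.compareTo(y);
        });
        return ranked.subList(0, Math.min(k, ranked.size()));
    }
}
```

In the toolkit each step is a separate algorithm whose output property feeds the next; here the intermediate similarities and scores are simply kept in local variables.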

Setup Example

<Scheduler class="BasicScheduler">
  <Graph>
    <MemGraph/>
  </Graph>
  …
</Scheduler>

Data Acquisition

<DataAcquisition>
  <Class>Crawl LastFM</Class>
  <Name>Crawl LastFM</Name>
  <MemGraph/>
  <Property>
    <Name>Proxy</Name>
    <Value>proxy.waikato.ac.nz</Value>
  </Property>
  …
</DataAcquisition>

Query Entry

<Algorithm>
  <Query>
    <GraphByID>
      <Pattern>.*</Pattern>
    </GraphByID>
  </Query>
</Algorithm>

Algorithm Entry

<Algorithm>
  <Query>…</Query>
  <Class>GraphTriples</Class>
  <Name>Graph Triples</Name>
  <Property>
    <Name>Relation</Name>
    <Value>Friends</Value>
  </Property>
  <Property>
    <Name>Destination</Name>
    <Value>TriplesVector</Value>
  </Property>
</Algorithm>

Future Work

Stabilization - 0.5.1 to beta

Statistical testing on result sets

Upgrading the GUI interface

Memory performance upgrades

Octave Integration

Questions?

http://graph-rat.sourceforge.net

Stable (beta) release is 0.4.3