Copyright © 2016, Oracle and/or its affiliates. All rights reserved.
ASGraph: A Mutable Multi-Versioned Graph Container with High Analytical Performance
Michael Haubenschild, Manuel Then, Sungpack Hong, Hassan Chafi
Oracle Labs
June 24, 2016
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
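How do we store graphs?
[Figure: example graph with nodes 0, 1, 2]
Adjacency matrix
Rows and columns A, B, C; row A has one entry, rows B and C have two entries each
Adjacency Lists
0: 1
1: 1 2
2: 0 2
Compressed Sparse Row
Edge array: 1 1 2 0 2
Node array: 0 1 2 ⊥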
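The step from adjacency lists to Compressed Sparse Row can be sketched as follows. This is a minimal illustration, not the layout used on the slide: it assumes the common CSR form in which the node array holds start offsets into the edge array, and the names `build_csr` and `neighbors` are hypothetical.

```python
from typing import Dict, List, Tuple


def build_csr(adj: Dict[int, List[int]], num_nodes: int) -> Tuple[List[int], List[int]]:
    """Flatten per-node adjacency lists into CSR (offsets, targets) arrays."""
    offsets = [0]
    targets: List[int] = []
    for node in range(num_nodes):
        targets.extend(adj.get(node, []))
        offsets.append(len(targets))  # end of this node's slice
    return offsets, targets


def neighbors(offsets: List[int], targets: List[int], node: int) -> List[int]:
    """Neighbors of `node` are one contiguous slice of the targets array."""
    return targets[offsets[node]:offsets[node + 1]]


# Adjacency lists from the slide: 0 -> 1, 1 -> 1 2, 2 -> 0 2
adjacency = {0: [1], 1: [1, 2], 2: [0, 2]}
offsets, targets = build_csr(adjacency, num_nodes=3)
print(targets)  # [1, 1, 2, 0, 2], the edge array shown on the slide
print(offsets)  # [0, 1, 3, 5]
```

The contiguous layout is what makes CSR fast to scan, and also what makes it hard to mutate: inserting one edge shifts every later entry.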
How can we store mutable, timestamp-versioned graphs?
ASGraph (=Analytical Snapshot Graph)
[Figure: node directory (0, 1, 2, 3, 4, …) whose entries point to buckets of timestamped operations, next to the graph they describe]
Operation Types
NINS Node Insert
NDEL Node Delete
EINS Edge Insert
EDEL Edge Delete
Operations appended at each step (target, timestamp, type)
t0: three buckets are created for nodes 0, 1, 2
    - t0 NINS, 1 t0 EINS, 2 t0 EINS
    - t0 NINS, 1 t0 EINS
    - t0 NINS, 0 t0 EINS, 1 t0 EINS
t1: - t1 NINS (new bucket; node 3 appears in the graph)
t2: 2 t2 EINS (appended to the bucket "- t0 NINS, 1 t0 EINS")
t3: 1 t3 EDEL (appended to the bucket "- t0 NINS, 0 t0 EINS, 1 t0 EINS")
t4: 1 t4 EINS (appended to the bucket "- t1 NINS")
t5: 1 t5 EDEL, 2 t5 EDEL, - t5 NDEL, 0 t5 EDEL (node 0 disappears from the graph)
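The append-only operation log from the slides can be sketched in a few lines. This is a hedged illustration, not the ASGraph implementation: the class `OpLogGraph`, its method names, and the integer timestamps (t0 as 0, t1 as 1, and so on) are assumptions, and the assignment of the t5 operations to the buckets of nodes 0 and 2 is inferred from the slide build order. The `neighbors_at` method replays a bucket and is the slow path that snapshots are meant to avoid.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

NINS, NDEL, EINS, EDEL = "NINS", "NDEL", "EINS", "EDEL"


@dataclass
class Op:
    to: Optional[int]  # target node id, None for node-level operations
    ts: int            # timestamp
    type: str          # one of NINS, NDEL, EINS, EDEL


class OpLogGraph:
    """Hypothetical container: one append-only operation bucket per node."""

    def __init__(self) -> None:
        self.buckets: Dict[int, List[Op]] = {}

    def insert_node(self, n: int, ts: int) -> None:
        self.buckets[n] = [Op(None, ts, NINS)]

    def delete_node(self, n: int, ts: int) -> None:
        self.buckets[n].append(Op(None, ts, NDEL))

    def insert_edge(self, src: int, dst: int, ts: int) -> None:
        self.buckets[src].append(Op(dst, ts, EINS))

    def delete_edge(self, src: int, dst: int, ts: int) -> None:
        self.buckets[src].append(Op(dst, ts, EDEL))

    def neighbors_at(self, n: int, ts: int) -> List[int]:
        """Replay the bucket of node n up to timestamp ts (inclusive)."""
        alive = False
        out: List[int] = []
        for op in self.buckets[n]:
            if op.ts > ts:
                continue  # operation is newer than the requested version
            if op.type == NINS:
                alive = True
            elif op.type == NDEL:
                alive = False
            elif op.type == EINS:
                out.append(op.to)
            elif op.type == EDEL and op.to in out:
                out.remove(op.to)
        return out if alive else []


# Replay of the slide sequence t0..t5 (bucket assignment is an assumption).
g = OpLogGraph()
for n in (0, 1, 2):
    g.insert_node(n, 0)
for src, dst in ((0, 1), (0, 2), (1, 1), (2, 0), (2, 1)):
    g.insert_edge(src, dst, 0)
g.insert_node(3, 1)
g.insert_edge(1, 2, 2)
g.delete_edge(2, 1, 3)
g.insert_edge(3, 1, 4)
g.delete_edge(0, 1, 5)
g.delete_edge(0, 2, 5)
g.delete_node(0, 5)
g.delete_edge(2, 0, 5)

print(g.neighbors_at(2, 2))  # [0, 1]
print(g.neighbors_at(2, 3))  # [0]
```

Every version stays readable because nothing is overwritten, but each read has to filter by type and timestamp.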
How can we do efficient neighbor iteration in this data structure?
createSnapshot(TS t)
  for n : nodes
    var CAND := List[operation]        //edge candidates
    var REST := List[operation]        //remaining operations
    var DEL := Map[nodeId->timestamp]  //edge deletions visible at t
    for op : n.operations
      if(op.type == EINS && op.timestamp <= t)
        CAND.append(op)
      else
        REST.append(op)
        if(op.type == EDEL && op.timestamp <= t)
          DEL[op.to] = op.timestamp
    for c : CAND
      if(DEL.contains(c.to) && c.timestamp <= DEL[c.to])
        //move c to REST
        var tmp = CAND.remove(c)
        REST.append(tmp)
    //OPTIONALLY: sort CAND at this point
    //Replace bucket content with reordered operations
    n.operations <- concat(CAND,[EOS],REST)
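A runnable sketch of the per-bucket step of createSnapshot, under stated assumptions: `Op`, `EOS`, and `snapshot_bucket` are hypothetical names, timestamps are plain integers, and the function returns a new list instead of rewriting the bucket in place. It also ignores edge deletions newer than t and builds the live list separately rather than removing from CAND while iterating over it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

EOS = "EOS"  # hypothetical end-of-snapshot marker


@dataclass(frozen=True)
class Op:
    to: Optional[int]  # target node id, None for node-level operations
    ts: int            # timestamp
    type: str          # "NINS", "NDEL", "EINS" or "EDEL"


def snapshot_bucket(ops: List[Op], t: int) -> List[Union[Op, str]]:
    """Reorder one bucket so that edges alive at time t come first."""
    cand: List[Op] = []           # edge inserts visible at t
    rest: List[Op] = []           # everything else
    deleted: Dict[int, int] = {}  # target -> timestamp of deletion visible at t
    for op in ops:
        if op.type == "EINS" and op.ts <= t:
            cand.append(op)
        else:
            rest.append(op)
            if op.type == "EDEL" and op.ts <= t:
                deleted[op.to] = op.ts
    live: List[Op] = []
    for c in cand:
        if c.to in deleted and c.ts <= deleted[c.to]:
            rest.append(c)  # inserted, then deleted before t
        else:
            live.append(c)
    return live + [EOS] + rest


# Bucket "- t1 NINS, 1 t4 EINS" from the slides, snapshot at t5.
bucket = [Op(None, 1, "NINS"), Op(1, 4, "EINS")]
print(snapshot_bucket(bucket, 5))
```

For this bucket the edge insert moves in front of the marker and the node insert ends up behind it, which matches the order "1 t4 EINS, - t1 NINS" on the createSnapshot(t5) slide.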
createSnapshot(t5)
[Figure: node directory (0, 1, 2, 3, 4, …) and operation buckets while createSnapshot(t5) runs. In each bucket the edge inserts that are still valid at t5 move to the front, followed by a marker and the remaining operations, e.g. "1 t4 EINS, [EOS], - t1 NINS". The resulting graph view contains nodes 1, 2, and 3.]
Marker Types
[EOS] End of Snapshot
[ND] Node Deleted
Advantages of this approach
• Keep the iteration logic simple
– Better code generation & branch prediction
• Prefetch next buckets explicitly
– Mitigate effects of cache misses
• Concurrent updates are possible
– We do not touch memory beyond the last entry at the time of snapshot creation
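The first and last advantage can be illustrated with a small sketch. It assumes a bucket that has already been reordered by createSnapshot, represented here as a plain list of (target, timestamp, type) tuples with a hypothetical `EOS` sentinel. Explicit prefetching is not shown because Python offers no direct way to express it.

```python
from typing import Iterator, List

EOS = object()  # hypothetical end-of-snapshot marker


def iter_neighbors(bucket: List[object]) -> Iterator[int]:
    """Yield edge targets until the marker; no type or timestamp checks."""
    for entry in bucket:
        if entry is EOS:
            return
        yield entry[0]


# Bucket "1 t4 EINS, [EOS], - t1 NINS" after createSnapshot(t5).
bucket = [(1, 4, "EINS"), EOS, (None, 1, "NINS")]
before = list(iter_neighbors(bucket))

# A later update is appended behind the marker and stays invisible
# to readers of the t5 snapshot.
bucket.append((2, 6, "EINS"))
after = list(iter_neighbors(bucket))

print(before, after)  # [1] [1]
```

The loop body has a single branch, the marker test, which is the property the slide credits for better code generation and branch prediction.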
Disadvantages
• Neighbors can be accessed only via forward iteration, no random access
– Common neighbor iteration profits heavily from random access
• High memory consumption
– Additional header data per bucket
– Fill factor of the buckets
– Timestamps and OpTypes
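Why common neighbor iteration profits from random access can be seen by comparing two ways of intersecting neighbor lists. This is a sketch under assumptions: both lists are sorted and free of duplicates, and the function names are hypothetical. With random access (as in a sorted CSR slice) each element of the short list is looked up by binary search. With forward-only iteration both lists have to be walked in step.

```python
from bisect import bisect_left
from typing import Iterable, List


def common_random_access(small: List[int], large: List[int]) -> List[int]:
    """Binary-search each element of `small` in `large`: O(|small| log |large|)."""
    out = []
    for v in small:
        i = bisect_left(large, v)
        if i < len(large) and large[i] == v:
            out.append(v)
    return out


def common_forward_only(a: Iterable[int], b: Iterable[int]) -> List[int]:
    """Merge-style intersection using only forward iteration: O(|a| + |b|)."""
    out = []
    it_a, it_b = iter(a), iter(b)
    x, y = next(it_a, None), next(it_b, None)
    while x is not None and y is not None:
        if x == y:
            out.append(x)
            x, y = next(it_a, None), next(it_b, None)
        elif x < y:
            x = next(it_a, None)
        else:
            y = next(it_b, None)
    return out


low_degree = [2, 5, 9]
high_degree = [1, 2, 3, 5, 8, 13]
print(common_random_access(low_degree, high_degree))  # [2, 5]
print(common_forward_only(low_degree, high_degree))   # [2, 5]
```

Both return the same result, but when one node has a few neighbors and the other has very many, the forward-only variant still has to scan most of the long list.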
Evaluation – Algorithm Comparison
[Figure: time [s] on a logarithmic axis (0.001 to 100) for ASGraph and CSR]
Evaluation – Complex Scenario
[Figure: runtime [sec] of ASGraph and CSR in five panels: loadInitialGraph, firstAnalytics, generateChanges, applyChanges, secondAnalytics]
Summary
• ASGraph has performance in the same order of magnitude as CSR
• Performance is independent of the number of snapshots
• Concurrent updates during analysis
• Outperforms CSR when the graph is updated between multiple analyses
Future Work
• Resizing of the node directory
• Concurrent access to multiple versions
• Re-mapping of vertex ids to a dense range
• Distributed version of ASGraph
• Reverse bucket lists for stable update performance
Questions?