Review: Scalable Semantic Web Data Management Using Vertical Partitioning

Abadi, Marcus, Madden, Hollenbach, VLDB 2007. Presented by: {Gui}llermo Cabrera, The University of Texas at Austin

Upload: guillermo-cabrera

Post on 09-Jul-2015

DESCRIPTION

Part of the Semantic Web, Ontologies and the Cloud class in the Computer Science department at The University of Texas at Austin, Spring 2010 term.

TRANSCRIPT

Page 1: Review: Scalable Semantic Web Data Management Using Vertical Partitioning

Abadi, Marcus, Madden, Hollenbach, VLDB 2007

Presented by: {Gui}llermo Cabrera, The University of Texas at Austin

Page 2

Problem

Storage Goal

RDBMS use

RDF Physical Organization

Column store vs. Row Store

Materialized Path Expressions

Experiment & Results

Discussion

Page 3

Performance: Self-joins

Many triples
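A minimal sketch of why a single triples table leads to self-joins, assuming a hypothetical three-column `triples` table in SQLite and made-up sample data (none of these names come from the slides). A query that touches three properties of the same subject has to reference the table three times:

```python
import sqlite3

# Hypothetical triple store: one table, three columns (subject, property, object).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subj TEXT, prop TEXT, obj TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("book1", "type", "Book"),
        ("book1", "title", "XYZ"),
        ("book1", "author", "Fox, Joe"),
        ("book2", "type", "Book"),
        ("book2", "title", "ABC"),
        ("cd1", "type", "CD"),
        ("cd1", "title", "DEF"),
    ],
)

# "Titles of all books by Fox, Joe": one alias of the table per property,
# so the same table is joined with itself twice.
rows = conn.execute(
    """
    SELECT t3.obj
    FROM triples AS t1
    JOIN triples AS t2 ON t1.subj = t2.subj
    JOIN triples AS t3 ON t1.subj = t3.subj
    WHERE t1.prop = 'type'   AND t1.obj = 'Book'
      AND t2.prop = 'author' AND t2.obj = 'Fox, Joe'
      AND t3.prop = 'title'
    """
).fetchall()
titles = [r[0] for r in rows]
print(titles)
```

Each extra property in the query adds another self-join over the full table, which is where the cost grows with many triples.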

Page 4

Achieve scalability & performance in triple storage

Survey approaches in RDBMS

Benefits of vertical partition and column store

Page 5

1 table with 3 indexed columns?

Multi-layer architecture

◦ Translate -> Optimize -> Execute

Mapping tables for long URIs and literals

Jena, Oracle, Sesame, 3store (Hyunjun), Hexastore (Donghyuk)
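The mapping-table idea for long URIs and literals can be sketched as dictionary encoding. This is an illustration only: the function name `encode_triples` and the `example.org` URIs are assumptions, not taken from any of the systems listed above.

```python
def encode_triples(triples):
    """Replace every string with a small integer ID.

    Returns the encoded triples and the mapping table (a list where
    index = ID and value = original string).
    """
    string_to_id = {}
    id_to_string = []

    def lookup(value):
        # Assign the next free ID the first time a string is seen.
        if value not in string_to_id:
            string_to_id[value] = len(id_to_string)
            id_to_string.append(value)
        return string_to_id[value]

    encoded = [(lookup(s), lookup(p), lookup(o)) for s, p, o in triples]
    return encoded, id_to_string


# Hypothetical sample data.
triples = [
    ("http://example.org/book/1", "http://example.org/prop/author",
     "http://example.org/person/FoxJoe"),
    ("http://example.org/person/FoxJoe", "http://example.org/prop/wasBorn",
     "1860"),
]
encoded, id_to_string = encode_triples(triples)
decoded = [tuple(id_to_string[i] for i in triple) for triple in encoded]
print(encoded)
```

The triples table then stores only fixed-width integers, and each long string is stored once in the mapping table.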

Page 6

Property tables

◦ Clustered property table

Denormalize RDF (wider tables)

Clustering algorithm

NULL values

Page 7

Page 8

Property tables

◦ Property-class tables

Exploit the type property

Properties may exist in multiple tables

Page 9

Page 10

Advantage:

◦ Fewer joins

Disadvantage:

◦ NULL values

◦ Multivalued attributes are complicated
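A sketch of the trade-off listed above, assuming a hypothetical `property_table` in SQLite with made-up columns and rows. A multi-property lookup needs no join, missing properties become NULLs, and a subject with two authors does not fit in one column, so those triples are kept in a separate leftover table here (one possible workaround, assumed for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical wide (denormalized) property table: one row per subject.
conn.execute(
    "CREATE TABLE property_table ("
    "subj TEXT PRIMARY KEY, type TEXT, title TEXT, copyright TEXT)"
)
conn.executemany(
    "INSERT INTO property_table VALUES (?, ?, ?, ?)",
    [
        ("ID1", "BookType", "XYZ", "2001"),
        ("ID2", "CDType", "ABC", "1985"),
        ("ID3", "BookType", "MNO", None),  # no copyright triple -> NULL
        ("ID4", "DVDType", "DEF", None),   # no copyright triple -> NULL
    ],
)

# Advantage: two properties of one subject are read without any join.
titles = [r[0] for r in conn.execute(
    "SELECT title FROM property_table "
    "WHERE type = 'BookType' AND copyright = '2001'"
)]

# Disadvantage: subjects lacking a property still occupy a (NULL) cell.
null_cells = conn.execute(
    "SELECT COUNT(*) FROM property_table WHERE copyright IS NULL"
).fetchone()[0]

# Disadvantage: ID1 has two authors, which one 'author' column cannot hold.
conn.execute("CREATE TABLE leftover_triples (subj TEXT, prop TEXT, obj TEXT)")
conn.executemany(
    "INSERT INTO leftover_triples VALUES (?, ?, ?)",
    [("ID1", "author", "Fox, Joe"), ("ID1", "author", "Green, Ann")],
)
authors = [r[0] for r in conn.execute(
    "SELECT l.obj FROM property_table AS p "
    "JOIN leftover_triples AS l ON p.subj = l.subj "
    "WHERE p.title = 'XYZ' AND l.prop = 'author' ORDER BY l.obj"
)]
print(titles, null_cells, authors)
```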

Page 11

Vertical Partition

◦ n two-column tables, n = # of unique properties

◦ Table sorted by subject

Merge join
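Because every two-column table is sorted by subject, two properties can be combined with a single linear merge join. The sketch below is a plain-Python illustration with hypothetical `title` and `author` tables; it is not the implementation used in the paper.

```python
def merge_join(left, right):
    """Join two (subject, object) lists that are both sorted by subject.

    Returns (subject, left_object, right_object) tuples. Runs of equal
    subjects (multivalued properties) are combined pairwise.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        ls, rs = left[i][0], right[j][0]
        if ls < rs:
            i += 1
        elif ls > rs:
            j += 1
        else:
            # Find the end of the run with this subject on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == ls:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == ls:
                j_end += 1
            for _, left_obj in left[i:i_end]:
                for _, right_obj in right[j:j_end]:
                    out.append((ls, left_obj, right_obj))
            i, j = i_end, j_end
    return out


# Hypothetical vertically partitioned tables, one per property.
title = [("ID1", "XYZ"), ("ID2", "ABC"), ("ID3", "MNO")]
author = [("ID1", "Fox, Joe"), ("ID1", "Green, Ann"), ("ID3", "Orr, Tim")]
result = merge_join(title, author)
print(result)
```

Each input is scanned once, and a subject with several values for a property simply appears on several rows.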

Page 12

Page 13

• Advantage

Multivalued attributes supported

No clustering algorithm needed (unlike property tables)

Only accessed properties are read

• Disadvantage

Queries that use multiple properties require table joins

Inserts expensive

Page 14

Triple Store

Property Table

Vertical Partition (Row Store)

Vertical Partition Store (Column Store)

Page 15

Why?

Projection is free

Tuple headers (metadata on row)

◦ 35 bytes in Postgres vs. 8 bytes in C-Store

Column-oriented compression

◦ Run-length encoding (ex. 1,1,1,2,2 -> 1x3, 2x2)

Optimized merge join

◦ Prefetching
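The run-length encoding example from this slide, written out as a small sketch. The function names are illustrative and this is not C-Store's actual encoder:

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, length in runs for _ in range(length)]


column = [1, 1, 1, 2, 2]
runs = rle_encode(column)
print(runs)  # [(1, 3), (2, 2)], i.e. 1x3, 2x2
```

This kind of compression pays off most on sorted columns, where equal values sit next to each other.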

Page 16

<BookID1, Author, http://preamble/FoxJoe>

<http://preamble/FoxJoe, wasBorn, “1860”>

Find all books whose authors were born in 1860
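A sketch of how this query runs over two vertically partitioned tables. `BookID2` and `http://preamble/DoeAnn` are made-up rows added for contrast. The object of `Author` has to be matched against the subject of `wasBorn`, which is a subject-object join:

```python
# One two-column (subject, object) table per property.
author = [
    ("BookID1", "http://preamble/FoxJoe"),
    ("BookID2", "http://preamble/DoeAnn"),   # hypothetical extra row
]
was_born = [
    ("http://preamble/DoeAnn", "1902"),      # hypothetical extra row
    ("http://preamble/FoxJoe", "1860"),
]

# Step 1: subjects of wasBorn whose object is "1860".
born_in_1860 = {person for person, year in was_born if year == "1860"}

# Step 2: subject-object join, keeping books whose author (the object
# of Author) is one of those people (a subject of wasBorn).
books = [book for book, person in author if person in born_in_1860]
print(books)
```

The `author` table is sorted by book, not by author, so this join cannot reuse the subject ordering that makes subject-subject merge joins cheap.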

Page 17

Page 18

Barton Libraries Dataset

Longwell Queries

◦ Calculating counts

◦ Filtering

◦ Inference

Page 19

8.3 GB – Triple Store (Postgres)

14 GB – Property Table (Postgres)

5.2 GB – Vertically Partitioned (Postgres)

2.7 GB – Vertically Partitioned (C-Store)

Sizes include indices and the mapping table

Page 20

Page 21

Page 22

Page 23

Replace

◦ subject-object joins -> subject-subject joins
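A sketch of the materialization step, assuming the path is precomputed into one extra two-column table. The helper `materialize_path`, the table name `author_was_born`, and the `BookID2`/`DoeAnn` rows are illustrative assumptions.

```python
def materialize_path(first, second):
    """Precompute the path first:second as (subject of first, object of second).

    Both inputs are lists of (subject, object) pairs. The result is sorted
    by subject like any other vertically partitioned table.
    """
    second_by_subject = {}
    for subj, obj in second:
        second_by_subject.setdefault(subj, []).append(obj)
    return sorted(
        (subj, final)
        for subj, middle in first
        for final in second_by_subject.get(middle, [])
    )


author = [
    ("BookID1", "http://preamble/FoxJoe"),
    ("BookID2", "http://preamble/DoeAnn"),   # hypothetical extra row
]
was_born = [
    ("http://preamble/DoeAnn", "1902"),      # hypothetical extra row
    ("http://preamble/FoxJoe", "1860"),
]

# Done once, ahead of query time.
author_was_born = materialize_path(author, was_born)

# At query time the subject-object join is gone: a filter on the new table.
books = [book for book, year in author_was_born if year == "1860"]
print(author_was_born, books)
```

Since the new table is keyed by book, combining it with other book properties is a subject-subject join on sorted tables.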

Page 24

Add 60 integer valued columns

7 GB increase in size

Page 25

Great for reads, writes not considered

What about load times?

Using another benchmark (ex. LUBM)?

Native XML databases for RDF/XML?

Test triple store in Sesame

Page 26

Page 27
