A survey on NoSQL database integration

Luiz Henrique Zambom Santana
Prof. Dr. Ronaldo dos Santos Mello
Profa. Dra. Carina Dorneles

Upload: luiz-henrique-zambom-santana

Post on 22-Jan-2018

Agenda

• Background
○ NoSQL
○ Global vs. Local

• Model

• Related Works

• Comparison

• Taxonomy

• Conclusions

Background: NoSQL

Sadalage and Fowler, 2012 (http://martinfowler.com/books/nosql.html)

Not only SQL

Nathan Marz, 2014 (http://www.slideshare.net/nathanmarz/runaway-complexity-in-big-data-and-a-plan-to-stop-it)

Relational databases will be a footnote in history

Background: Global-as-view vs. Local-as-view

● GAV
○ mapping from entities in the mediated schema to entities in the original sources
● LAV
○ mapping from entities in the original sources to the mediated schema
● The latter approach requires more sophisticated inferences to resolve a query on the mediated schema, but makes it easier to add new data sources to a (stable) mediated schema.
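
The difference between the two mappings can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two in-memory "stores", the `person` entity, and every function name below are hypothetical and do not come from any of the surveyed systems. Under GAV the mediated entity is defined as a view over the sources, so a query is answered by unfolding that view. Under LAV each source is described in terms of the mediated schema, so adding a source only means adding one description (the query rewriting that LAV needs is reduced here to picking the relevant sources).

```python
from typing import Callable, Dict, List, Set

Row = Dict[str, str]

# Two hypothetical sources with different local schemas.
KV_STORE = [{"k": "u1", "v": "Ana"}, {"k": "u2", "v": "Bruno"}]
DOC_STORE = [{"_id": "u3", "profile": {"name": "Carla"}}]


# GAV: the mediated entity "person" is defined as a view over the sources.
def person_view() -> List[Row]:
    rows = [{"id": r["k"], "name": r["v"]} for r in KV_STORE]
    rows += [{"id": d["_id"], "name": d["profile"]["name"]} for d in DOC_STORE]
    return rows


GAV_MAPPING: Dict[str, Callable[[], List[Row]]] = {"person": person_view}


def gav_query(entity: str, predicate: Callable[[Row], bool]) -> List[Row]:
    """Answer a query on the mediated schema by unfolding the view."""
    return [row for row in GAV_MAPPING[entity]() if predicate(row)]


# LAV: each source is described in terms of the mediated schema.
LAV_DESCRIPTIONS = [
    {"source": "kv_store", "entity": "person", "attributes": {"id", "name"}},
    {"source": "doc_store", "entity": "person", "attributes": {"id", "name"}},
]


def lav_relevant_sources(entity: str, needed: Set[str]) -> List[str]:
    """Pick the sources whose description covers the requested attributes."""
    return [
        d["source"]
        for d in LAV_DESCRIPTIONS
        if d["entity"] == entity and needed <= d["attributes"]
    ]


# Adding a source under LAV touches only the list of descriptions.
LAV_DESCRIPTIONS.append(
    {"source": "graph_store", "entity": "person", "attributes": {"id"}}
)
```

In this sketch, adding `graph_store` under GAV would require editing `person_view` itself, which is the trade-off the last bullet describes.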

Model

• Hypothesis:

Dey, Akon, Alan Fekete, and Uwe Röhm. "Scalable transactions across heterogeneous NoSQL key-value data stores." Proceedings of the VLDB Endowment 6.12 (2013): 1434-1439.

• VLDB Endowment
○ Qualis A1
○ Impact Factor 1.568
• Why is it important?
○ Seminal
○ Transactions
• “Weak” global-as-view

Zhang, Duo, Benjamin Rubinstein, and Jim Gemmell. "Principled graph matching algorithms for integrating multiple data sources." (2014).

• IEEE Transactions on Knowledge and Data Engineering (TKDE)
○ Qualis A1
○ Impact Factor 2.067
• Why is it important?
○ Graph matching algorithms
○ Entity resolution
○ Shows that integration is far more complicated in NoSQL applications
• Local-as-view
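
To make the idea of entity resolution as graph matching concrete, here is a minimal sketch. It is not the algorithm from Zhang et al.: it assumes only two sources, uses plain string similarity from Python's `difflib` as the edge weight, and resolves the bipartite graph with a simple greedy one-to-one matching. The record identifiers, titles, and the `threshold` parameter are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def greedy_match(
    left: Dict[str, str], right: Dict[str, str], threshold: float = 0.8
) -> List[Tuple[str, str]]:
    """Greedy one-to-one matching on a weighted bipartite graph.

    Every (left, right) pair is an edge weighted by similarity. Edges are
    visited from heaviest to lightest, and an edge is kept only if neither
    endpoint has been matched yet.
    """
    edges = sorted(
        (
            (similarity(lname, rname), lid, rid)
            for lid, lname in left.items()
            for rid, rname in right.items()
        ),
        reverse=True,
    )
    used_left, used_right = set(), set()
    matches: List[Tuple[str, str]] = []
    for score, lid, rid in edges:
        if score < threshold:
            break  # remaining edges are all too weak
        if lid in used_left or rid in used_right:
            continue
        matches.append((lid, rid))
        used_left.add(lid)
        used_right.add(rid)
    return matches


# Hypothetical records describing the same movies in two sources.
SOURCE_A = {"a1": "The Matrix", "a2": "Toy Story"}
SOURCE_B = {"b1": "Toy Story", "b2": "the matrix", "b3": "Casablanca"}
MATCHES = greedy_match(SOURCE_A, SOURCE_B)
```

A greedy pass is easy to follow but is not guaranteed to find the best overall matching, and it does not extend directly to more than two sources, which is part of why the slide calls the problem far more complicated.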

Da Silva, Daniel L., et al. "A Computational Framework for Integrating and Retrieving Biodiversity Data on a Large Scale." Big Data (BigData Congress), 2014 IEEE International Congress on. IEEE, 2014.

• IEEE International Congress on Big Data
○ No Qualis (yet)
○ Impact factor
• Why is it important?
○ Integrating and retrieving biodiversity data
• Global-as-view
• Resembles the Lambda Architecture

Kiran, V. K., and R. Vijayakumar. "Ontology based data integration of NoSQL datastores." Industrial and Information Systems (ICIIS), 2014 9th International Conference on. IEEE, 2014.

• 2014 9th International Conference on Industrial and Information Systems (ICIIS)
○ Qualis B1
• Why is it important?
○ Intermediate model
• Global-Local-as-view
• Information extraction may require sourcing data from multiple data sources, establishing relationships among them, and querying across these data sources together.
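
The role of an intermediate model can be illustrated with a small sketch. It assumes a very simple canonical form, subject/predicate/object triples in the spirit of an ontology, and two hypothetical stores (a column-oriented one holding customers and a document one holding orders). None of the names below come from the paper; the point is only that once both sources are lifted into the same model, a relationship across them can be followed in one query.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical column-oriented rows: row key -> {column: value}.
COLUMN_STORE: Dict[str, Dict[str, str]] = {
    "c1": {"name": "Ana"},
    "c2": {"name": "Bruno"},
}

# Hypothetical documents that reference customers by key.
DOCUMENT_STORE: List[Dict[str, object]] = [
    {"_id": "o1", "customer": "c1", "total": 30},
    {"_id": "o2", "customer": "c2", "total": 12},
]


def from_column_store(rows: Dict[str, Dict[str, str]]) -> List[Triple]:
    """Lift column-oriented rows into the canonical triple model."""
    return [(key, col, str(val)) for key, cols in rows.items() for col, val in cols.items()]


def from_document_store(docs: List[Dict[str, object]]) -> List[Triple]:
    """Lift documents into the canonical triple model."""
    return [
        (str(doc["_id"]), field, str(val))
        for doc in docs
        for field, val in doc.items()
        if field != "_id"
    ]


def objects(triples: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object stored for a (subject, predicate) pair."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def orders_with_customer_name(triples: List[Triple]) -> List[Tuple[str, str]]:
    """Follow the order -> customer -> name relationship across both sources."""
    result = []
    for s, p, o in triples:
        if p == "customer":
            for name in objects(triples, o, "name"):
                result.append((s, name))
    return result


CANONICAL = from_column_store(COLUMN_STORE) + from_document_store(DOCUMENT_STORE)
```

Calling `orders_with_customer_name(CANONICAL)` joins orders from the document store with names from the column-oriented store without either source knowing about the other.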

Kaur, Karamjit, and Rinkle Rani. "Managing Data in Healthcare Information Systems: Many Models, One Solution." Computer 3 (2015): 52-59.

• IEEE Computer
○ Qualis A1
○ Impact factor 1.443
• Global-as-view
• Why is it important?
○ Because healthcare data comes from multiple, vastly different sources, databases must adopt a range of models to process and store it. A polyglot-persistent framework combines relational, graph, and document data models to accommodate information variety.

Duggan, Jennie, et al. "The BigDAWG Polystore System." ACM SIGMOD Record 44.2 (2015): 11-16.

• SIGMOD Record
○ Qualis A1
○ Impact Factor 1.05
• Global-as-view
• A polystore architecture designed to unify querying over multiple data models.
○ “No one size fits all”

• Why is it important?
○ Twitter guys and Stonebraker
○ Deals with the entire complexity
○ Introduces the Island abstraction
○ Model cast between the DBMSs
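
The island and cast ideas can be sketched as follows. This is not BigDAWG's API: it assumes an island is simply a data model with its own operations, and a cast is a function that converts data from one model to another so that a query can continue on a different island. The sensor readings and all function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Table = List[Dict[str, float]]        # relational model: a list of rows
Array = Dict[Tuple[int, int], float]  # array model: (i, j) -> cell value


def relational_select(rows: Table, predicate: Callable[[Dict[str, float]], bool]) -> Table:
    """An operation of the (hypothetical) relational island: filter rows."""
    return [r for r in rows if predicate(r)]


def cast_table_to_array(rows: Table, i: str, j: str, value: str) -> Array:
    """Cast relational rows into a sparse 2-D array keyed by two columns."""
    return {(int(r[i]), int(r[j])): float(r[value]) for r in rows}


def array_row_sums(arr: Array) -> Dict[int, float]:
    """An operation of the (hypothetical) array island: sum along axis j."""
    sums: Dict[int, float] = {}
    for (i, _j), v in arr.items():
        sums[i] = sums.get(i, 0.0) + v
    return sums


# Hypothetical sensor readings stored relationally.
READINGS: Table = [
    {"sensor": 1, "hour": 0, "value": 2.0},
    {"sensor": 1, "hour": 1, "value": 3.0},
    {"sensor": 2, "hour": 0, "value": -1.0},
]

# Filter on the relational island, cast, then aggregate on the array island.
VALID = relational_select(READINGS, lambda r: r["value"] >= 0)
AS_ARRAY = cast_table_to_array(VALID, "sensor", "hour", "value")
TOTALS = array_row_sums(AS_ARRAY)
```

The design point the sketch tries to convey is that each step runs in the model best suited to it, with the cast as the explicit boundary between them.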

Taxonomy

Comparison

Year  Main author  Summary                             NoSQL                            Taxonomy
2013  Dey          Transactional access                Key/Value                        Schema unification > Polyglot
2014  Zhang        Graph matching                      Graph                            Schema unification > Unified language
2014  Da Silva     Biodiversity databases integration  Document                         Application integration > CAP
2014  Kiran        Ontology as canonical model         Column-oriented                  Schema unification > Unified language
2015  Kaur         Medical                             Virtually any kind of database   Application integration > CAP
2015  Duggan       BigDAWG                             Virtually any kind of database   Federation > Independent access

Conclusions

• The problem is real
○ Important for many fields
• Most of the solutions use Global-as-View
• Most of the solutions expose a REST API as unified access
• Many works also cite SQL and NoSQL integration
• Concerns
○ The solution has to be scalable
○ The solution cannot be difficult to set up
• BigDAWG is the most complete approach