introduction to data management cse 344 · – xml, json, protobuf • increasingly, some systems...

30
Introduction to Data Management CSE 344 Lecture 16: JSon and N1QL CSE 344 - Winter 2016 1

Upload: others

Post on 20-May-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Introduction to Data Management CSE 344

Lecture 16: JSon and N1QL

CSE 344 - Winter 2016 1

Page 2: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Announcements

•  No lecture on Monday (President’s Day)

•  HW5 due on Friday night

CSE 344 - Winter 2016 2

Page 3: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

3

JSon Semantics: a Tree !

person

Mary

name address

name address

street no city

Maple 345 Seattle

John Thai

phone

23456

{“person”: [ {“name”: “Mary”, “address”:

{“street”:“Maple”, “no”:345, “city”: “Seattle”}}, {“name”: “John”, “address”: “Thailand”, “phone”:2345678}} ]

}

Page 4: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

4

JSon Data

•  JSon is self-describing •  Schema elements become part of the data

–  Relational schema: person(name,phone) –  In Json “person”, “name”, “phone” are part of the

data, and are repeated many times •  Consequence: JSon is much more flexible •  JSon = semistructured data

CSE 344 - Winter 2016

Page 5: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Mapping Relational Data to JSon

CSE 344 - Winter 2016 5

name name name phone phone phone

“John” 3634 “Sue” “Dirk” 6343 6363 Person

person

name phone John 3634 Sue 6343 Dirk 6363

{“person”: [{“name”: “John”, “phone”:3634},

{“name”: “Sue”, ”phone”:6343}, {“name”: “Dirk”, ”phone”:6383} ]

}

Page 6: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Mapping Relational Data to JSon

6

Person name phone John 3634 Sue 6343

May inline foreign keys

Orders personName date product John 2002 Gizmo John 2004 Gadget Sue 2002 Gadget

{“Person”: [{“name”: “John”, “phone”:3646, “Orders”:[{“date”:2002, “product”:”Gizmo”}, {“date”:2004, “product”:”Gadget”} ] }, {“name”: “Sue”, “phone”:6343, “Orders”:[{“date”:2002, “product”:”Gadget”} ] } ]

}

Page 7: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

7

JSon=Semi-structured Data (1/3)

•  Missing attributes:

•  Could represent in a table with nulls

name phone John 1234 Joe -

CSE 344 - Winter 2016

{“person”: [{“name”:”John”, “phone”:1234}, {“name”:”Joe”}] }

no phone !

Page 8: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

8

JSon=Semi-structured Data (2/3)

•  Repeated attributes

•  Impossible in one table:

name phone Mary 2345 3456 ???

CSE 344 - Winter 2016

{“person”: [{“name”:”John”, “phone”:1234}, {“name”:”Mary”, “phone”:[1234,5678]}] }

Two phones !

Page 9: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

9

JSon=Semi-structured Data (3/3)

•  Attributes with different types in different objects

•  Nested collections •  Heterogeneous collections

CSE 344 - Winter 2016

{“person”: [{“name”:”Sue”, “phone”:3456}, {“name”:{“first”:”John”,”last”:”Smith”},”phone”:2345} ] }

Structured name !

Page 10: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Discussion

•  Data exchange formats –  Ideally suited for exchanging data between apps. –  XML, JSon, Protobuf

•  Increasingly, some systems use them as a data model: –  SQL Server supports for XML-valued relations –  CouchBase, Mongodb: JSon as data model –  Dremel (BigQuery): Protobuf as data model

CSE 344 - Winter 2016 10

Page 11: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Query Languages for SS Data

•  XML: XPath, XQuery (see textbook) –  Supported inside many relational engines (SQL Server, DB2,

Oracle) –  Several standalone XPath/XQuery engines

•  Protobuf: SQL-ish language (Dremel) used internally by google, and externally in BigQuery

•  JSon: –  CouchBase: N1QL (we’ll struggle with it in HW5), may be

replaced by AQL (better designed) –  MongoDB: has a pattern-based language –  JSONiq http://www.jsoniq.org/

Page 12: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

12

N1QL

•  Used in CouchDB only

•  SQL-ish notation with extensions: –  Nested collections –  Dependent joins

CSE 344 - Winter 2016

Page 13: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

N1QL Overview

CSE 344 - Winter 2016 13

SELECT … FROM bucket ... WHERE

Page 14: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Buckets in CouchDB

•  A bucket = a database

•  A CouchDB server can hold several buckets

•  Buckets stored in main memory: –  To load a new bucket need to make room by

deleting an old one

CSE 344 - Winter 2016 14

Page 15: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Mondial.json

mondial bucket {“mondial”: {“country”: [ country1, country2, …],

{“continent”: […]}, {“organization”: […]}, ... ...

}

May use different names

This is like several tables:

Country

code capital …

Continent

id name … …

Page 16: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Retrieve Everything

CSE 344 - Winter 2016 16

SELECT x FROM mondial x;

Bucket name

{“mondial”: {“country”: [ country1, country2, …],

{“continent”: […]}, {“organization”: […]}, ... ...

}

Answer: some metadata +

{“mondial”: {“country”: [ country1, country2, …],

{“continent”: […]}, {“organization”: […]}, ... ...

}

Page 17: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Retrieve Countries

CSE 344 - Winter 2016 17

SELECT x.mondial.country FROM mondial x;

Bucket name

[ country1, country2, …] Answer: some metadata +

{“mondial”: {“country”: [ country1, country2, …],

{“continent”: […]}, {“organization”: […]}, ... ...

}

Page 18: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Retrieve Countries (2)

CSE 344 - Winter 2016 18

select z from mondial x unnest x.mondial y unnest y.country z;

[ country1, country2, …] Answer: some metadata +

{“mondial”: {“country”: [ country1, country2, …],

{“continent”: […]}, {“organization”: […]}, ... ...

}

Page 19: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Retrieve Countries (2)

CSE 344 - Winter 2016 19

select z from mondial x unnest x.mondial y unnest y.country z;

[ country1, country2, …] Answer: some metadata +

{“mondial”: {“country”: [ country1, country2, …],

{“continent”: […]}, {“organization”: […]}, ... ...

}

Page 20: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Retrieve Countries (2)

CSE 344 - Winter 2016 20

select z from mondial x unnest x.mondial y unnest y.country z;

[ country1, country2, …] Answer: some metadata +

{“mondial”: {“country”: [ country1, country2, …],

{“continent”: […]}, {“organization”: […]}, ... ...

}

Page 21: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Retrieve Countries (2)

CSE 344 - Winter 2016 21

select z from mondial x unnest x.mondial y unnest y.country z;

[ country1, country2, …] Answer: some metadata +

{“mondial”: {“country”: [ country1, country2, …],

{“continent”: […]}, {“organization”: […]}, ... ...

}

Page 22: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Unnesting A Collection

•  Unnest( {{a,b,c}, {d,b},{g,a,f}} ) = {a,b,c,d,b,g,a,f}

CSE 344 - Winter 2016 22

Page 23: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Retrieve Names

CSE 344 - Winter 2016 23

select z.name from mondial x

unnest x.mondial y unnest y.country z;

[ {"name": "Albania”}, {"name": "Greece”}, {"name": "Macedonia”}, {"name": "Serbia”}, ...

]

Answer: some metadata +

{“mondial”: {“country”: [ country1, country2, …],

{“continent”: […]}, {“organization”: […]}, ... ...

}

Page 24: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Accessing Arrays

CSE 344 - Winter 2016 24

select z[3] from mondial x unnest x.mondial y unnest y.country z;

country3 Answer: some metadata +

{“mondial”: {“country”: [ country1, country2, …],

{“continent”: […]}, {“organization”: […]}, ... ...

}

Retrieve the 3’rd country

Page 25: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

WHERE Clause

CSE 344 - Winter 2016 25

select z from mondial x

unnest x.mondial y unnest y.country z

where z.name = “Greece”

country2 Answer: some metadata +

{“mondial”: {“country”: [ country1, country2, …],

{“continent”: […]}, {“organization”: […]}, ... ...

}

Page 26: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Non-standard Names

•  Normally, a JSon name like "population” is referenced like this:

x.population

•  Mondial.json has some nostandard names: “-car_code”, "-area”, "-capital”

•  Reference them like this: x.["-car_code"]

CSE 344 - Winter 2016 26

Page 27: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Joins

•  Pretty much like in SQL, but we need to unnest to get to the right collection(s) to join

CSE 344 - Winter 2016 27

Page 28: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

All Rivers in France {“mondial”:

{“country”: [ … {“-car_code”: “F”, “name”:”France”, ...} ...] ... {“river”: [ ... {“"-id": "river-Loire“, "-country": "F“, "name": "Loire",....} ... ]

}

key

Foreign Key

[{"name":"Loire"},{"name":"Saone"},{"name":"Isere"}, {"name":"Seine"},{"name":"Marne"}]

Answer

select u.name from mondial x unnest x.mondial y unnest y.country z unnest y.river u where z.name = “France” and z.[“-car_code”]=u.[“-country”]

Page 29: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Other Constructs

•  Group by, order by = as usual

•  ARRAY_LENGTH(collection) = an alternative to count(*)

•  TONUMBER(field) = converts from string to number

•  You can find more online (see HW5)

Page 30: Introduction to Data Management CSE 344 · – XML, JSon, Protobuf • Increasingly, some systems use them as a data model: – SQL Server supports for XML-valued relations – CouchBase,

Final Thoughts

•  Unclear what will become of N1QL in the futurer; however, its treatement of nested collections is similar to other languages: XQuery, Dremel, AQL, …

•  Querying non-relational data is painful –  E.g. find all rivers that pass through France, not

just those located entirely in France.

CSE 344 - Winter 2016 30