NoSQL and MySQL: News about JSON

NoSQL and SQL: The Best of Both Worlds. Mario Beck, MySQL Presales Manager EMEA, Mablomy.blogspot.de, 3rd November 2015. Copyright © 2015, Oracle and/or its affiliates. All rights reserved.

Upload: mario-beck

Post on 22-Jan-2018


TRANSCRIPT

Page 1: NoSQL and MySQL: News about JSON

NoSQL and SQL: The Best of Both Worlds

Mario Beck MySQL Presales Manager EMEA Mablomy.blogspot.de 3rd November, 2015

Copyright © 2015, Oracle and/or its affiliates. All rights reserved.

Page 2: NoSQL and MySQL: News about JSON

Safe Harbor Statement

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.


Page 3: NoSQL and MySQL: News about JSON

NoSQL

Simple access patterns

Compromise on consistency for performance

Ad-hoc data format

Simple operation

SQL

Complex queries with joins

ACID transactions

Well defined schemas

Rich set of tools

Still a role for SQL (RDBMS)?

Scalability

Performance

HA

Ease of use

SQL/Joins

ACID Transactions

26th March 2015 Copyright © 2015, Oracle and/or its affiliates. All rights reserved. 3

Page 4: NoSQL and MySQL: News about JSON


Page 5: NoSQL and MySQL: News about JSON

MySQL Cluster Overview

REAL-TIME

• In-Memory Optimization + Disk-Data

• Predictable Low-Latency, Bounded Access Time

HIGH SCALE, READS + WRITES

• Auto-Sharding, Multi-Master

• ACID Compliant, OLTP + Real-Time Analytics

99.999% AVAILABILITY

• Shared nothing, no Single Point of Failure

• Self Healing + On-Line Operations

SQL + NoSQL

• Key/Value + Complex, Relational Queries

• SQL + Memcached + JavaScript + Java + HTTP/REST & C++

LOW TCO

• Open Source + Commercial Editions

• Commodity hardware + Management, Monitoring Tools

Page 6: NoSQL and MySQL: News about JSON

MySQL Cluster Scaling

[Diagram labels: Clients, Application Layer, Data Layer (MySQL Cluster Data Nodes)]

Page 7: NoSQL and MySQL: News about JSON

NoSQL Access to MySQL Cluster data

[Diagram: applications reach the MySQL Cluster Data Nodes through several access paths, all layered on the NDB API (C++): PHP, Perl, Python, Ruby and JDBC via MySQL; JPA via Cluster JPA; Cluster J via JNI; JS via Node.JS; Apache via mod_ndb; Memcached via ndb_eng]

Page 8: NoSQL and MySQL: News about JSON

1.2 Billion UPDATEs per Minute

• Distributed Joins also possible

[Chart: Millions of UPDATEs per Second (y-axis, 0 to 25) versus MySQL Cluster Data Nodes (x-axis, 2 to 30)]

Scalability ✓

Performance ✓

HA ✓

Ease of use ✓

SQL/Joins ✓

ACID Transactions ✓

Page 9: NoSQL and MySQL: News about JSON

MySQL Cluster 7.4 NoSQL Performance: 200 Million NoSQL Reads/Second

• Memory optimized tables

– Durable

– Mix with disk-based tables

• Massively concurrent OLTP

• Distributed Joins for analytics

• Parallel table scans for non-indexed searches

• MySQL Cluster 7.4 FlexAsync – 200M NoSQL Reads/Second

[Chart: FlexAsync Reads. Reads per second (y-axis, 0 to 250,000,000) versus Data Nodes (x-axis, 2 to 32)]

Page 10: NoSQL and MySQL: News about JSON

Cluster & Memcached - Configured Schema

Application view

<town:maidenhead,SL6>   (prefix, key, value)

<town:maidenhead,SL6>   (key, value)

Config tables

Prefix   Table     Key-col   Val-col   policy
town:    map.zip   town      code      cluster

SQL view (table map.zip)

town         ...   code   ...
maidenhead   ...   SL6    ...
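The mapping on this slide can be read as a primary-key lookup: the memcached key town:maidenhead resolves to the row of map.zip whose town column is maidenhead, and the value returned is its code column. A minimal SQL sketch of that table and the equivalent query follows. The column types and lengths are assumptions, since the slide only names the columns.

```sql
-- Assumed definition of the table behind the "town:" prefix.
-- Types and lengths are illustrative; the slide only gives column names.
CREATE DATABASE IF NOT EXISTS map;

CREATE TABLE map.zip (
  town VARCHAR(64) NOT NULL PRIMARY KEY,  -- "Key-col" in the config table
  code VARCHAR(16)                        -- "Val-col" in the config table
) ENGINE=NDBCLUSTER;

INSERT INTO map.zip (town, code) VALUES ('maidenhead', 'SL6');

-- SQL-side equivalent of a memcached "get town:maidenhead".
SELECT code FROM map.zip WHERE town = 'maidenhead';
```

Because the same row is visible through both interfaces, data written over the key/value path can still be joined and reported on with ordinary SQL.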

Page 11: NoSQL and MySQL: News about JSON

Node.js NoSQL API

• Native JavaScript access to MySQL Cluster

– End-to-End JavaScript: browser to the app & DB

– Storing and retrieving JavaScript objects directly in MySQL Cluster

– Eliminate SQL transformation

• Implemented as a module for node.js

– Integrates Cluster API library within the web app

• Couple high performance, distributed apps, with high performance distributed database

• Optionally routes through MySQL Server

– Use with InnoDB

[Diagram labels: Clients, V8 JavaScript Engine, MySQL Cluster Node.js Module, MySQL Cluster Data Nodes]

Page 12: NoSQL and MySQL: News about JSON

NoSQL API for Node.js & FKs

FKs enforced on all APIs:

{ message: 'Error',
  sqlstate: '23000',
  ndb_error: null,
  cause:
   { message: 'Foreign key constraint violated: No parent row found [255]',
     sqlstate: '23000',
     ndb_error:
      { message: 'Foreign key constraint violated: No parent row found',
        code: 255,
        classification: 'ConstraintViolation',
        handler_error_code: 151,
        status: 'PermanentError' },
     cause: null } }

Page 13: NoSQL and MySQL: News about JSON

Choosing the right application API

SQL

• Industry standard

• Joins & Complex queries

• Relational model

Memcached

• Simple to use API

• Key/value

• Drivers for many languages

Mod-ndb

• REST

• HTML

• Plugin for Apache

ClusterJ

• Simple to Use Java API

• Web & telco

• Object Relational Mapping

• Native & fast access to data

ClusterJPA

• OpenJPA plugin

• Standards defined ORM

• Cross table Joins

JavaScript/Node.js

• Native JavaScript: client to DB

• Blazing fast asynchronous throughput

Page 14: NoSQL and MySQL: News about JSON

MySQL 5.6 Memcached with InnoDB

Up to 9x Higher “SET / INSERT” Throughput

[Chart: TPS (y-axis, 0 to 80,000) versus Client Connections (x-axis: 8, 32, 128, 512), Memcached API compared with SQL]

[Diagram: Clients and Applications reach the InnoDB Storage Engine inside the mysqld process over two paths: SQL (MySQL Server, Handler API) and the Memcached Protocol (Memcached Plug-in, innodb_memcached, optional local cache, InnoDB API)]

Page 15: NoSQL and MySQL: News about JSON


Page 16: NoSQL and MySQL: News about JSON

Core New JSON features in MySQL 5.7

• Native JSON datatype

• JSON Functions

• Generated Columns


Page 17: NoSQL and MySQL: News about JSON

The JSON Type

CREATE TABLE employees (data JSON);
INSERT INTO employees VALUES ('{"id": 1, "name": "Jane"}');
INSERT INTO employees VALUES ('{"id": 2, "name": "Joe"}');
SELECT * FROM employees;
+---------------------------+
| data                      |
+---------------------------+
| {"id": 1, "name": "Jane"} |
| {"id": 2, "name": "Joe"}  |
+---------------------------+
2 rows in set (0.00 sec)
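Once documents are stored in a JSON column, individual members can be read with a path expression. The sketch below runs against the employees table above and assumes MySQL 5.7.9 or later for the -> shortcut (see the note on the Naive Performance Comparison slide). It is an illustration, not part of the original deck.

```sql
-- Read one member from each document. The result is a JSON value,
-- so string members come back wrapped in double quotes.
SELECT data->"$.name" AS name FROM employees;

-- JSON_UNQUOTE() strips the quotes when a plain string is wanted.
-- The WHERE clause compares the extracted JSON number with an integer.
SELECT JSON_UNQUOTE(JSON_EXTRACT(data, '$.name')) AS name
FROM employees
WHERE JSON_EXTRACT(data, '$.id') = 1;
```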

Page 18: NoSQL and MySQL: News about JSON

JSON Type Tech Specs

• utf8mb4 character set

• Optimized for read intensive workload

– Parse and validation on insert only

• Dictionary

– Sorted objects' keys

– Fast access to array cells by index

Page 19: NoSQL and MySQL: News about JSON

JSON Type Tech Specs (cont.)

• Supports all native JSON types

– Numbers, strings, bool

– Objects, arrays

• Extended

– Date, time, datetime, timestamp

– Other

Page 20: NoSQL and MySQL: News about JSON

Advantages over TEXT/VARCHAR

1. Provides Document Validation:

INSERT INTO employees VALUES ('some random text');
ERROR 3130 (22032): Invalid JSON text: "Expect a value here." at position 0 in value (or column) 'some random text'.

2. Efficient Binary Format: allows quicker access to object members and array elements
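The same validation can be checked without attempting an insert. As a small sketch, JSON_VALID() (listed among the 5.7 JSON functions later in this deck) reports whether a string parses as JSON:

```sql
-- JSON_VALID() returns 1 for well-formed JSON text and 0 otherwise.
SELECT JSON_VALID('{"id": 1, "name": "Jane"}') AS ok,
       JSON_VALID('some random text')          AS not_ok;
-- ok = 1, not_ok = 0
```

This is useful when data still lives in a TEXT or VARCHAR column and needs to be screened before it is converted to the JSON type.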

Page 21: NoSQL and MySQL: News about JSON

JSON Functions

SET @document = '[10, 20, [30, 40]]';
SELECT JSON_EXTRACT(@document, '$[1]');
+---------------------------------+
| JSON_EXTRACT(@document, '$[1]') |
+---------------------------------+
| 20                              |
+---------------------------------+
1 row in set (0.01 sec)
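Paths can also reach into nested arrays. The following sketch reuses the same @document value; array positions are zero-based, so $[2] is the third element.

```sql
SET @document = '[10, 20, [30, 40]]';

SELECT JSON_EXTRACT(@document, '$[2]');             -- the nested array [30, 40]
SELECT JSON_EXTRACT(@document, '$[2][0]');          -- 30
SELECT JSON_LENGTH(@document);                      -- 3 top-level elements
SELECT JSON_TYPE(JSON_EXTRACT(@document, '$[2]'));  -- ARRAY
```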

Page 22: NoSQL and MySQL: News about JSON

JSON Array Creation

SELECT JSON_ARRAY(id, feature->"$.properties.STREET", feature->"$.type") AS json_array
FROM features ORDER BY RAND() LIMIT 3;
+-------------------------------+
| json_array                    |
+-------------------------------+
| [65298, "10TH", "Feature"]    |
| [122985, "08TH", "Feature"]   |
| [172884, "CURTIS", "Feature"] |
+-------------------------------+
3 rows in set (2.66 sec)

Page 23: NoSQL and MySQL: News about JSON

JSON Object Creation

SELECT JSON_OBJECT('id', id, 'street', feature->"$.properties.STREET", 'type', feature->"$.type") AS json_object
FROM features ORDER BY RAND() LIMIT 3;
+--------------------------------------------------------+
| json_object                                            |
+--------------------------------------------------------+
| {"id": 122976, "type": "Feature", "street": "RAUSCH"}  |
| {"id": 148698, "type": "Feature", "street": "WALLACE"} |
| {"id": 45214, "type": "Feature", "street": "HAIGHT"}   |
+--------------------------------------------------------+
3 rows in set (3.11 sec)

Page 24: NoSQL and MySQL: News about JSON

JSON Functions

• 5.7 supports functions to CREATE, SEARCH, MODIFY and RETURN JSON values:

JSON_ARRAY_APPEND()

JSON_ARRAY_INSERT()

JSON_ARRAY()

JSON_CONTAINS_PATH()

JSON_CONTAINS()

JSON_DEPTH()

JSON_EXTRACT()

JSON_INSERT()

JSON_KEYS()

JSON_LENGTH()

JSON_MERGE()

JSON_OBJECT()

JSON_QUOTE()

JSON_REMOVE()

JSON_REPLACE()

JSON_SEARCH()

JSON_SET()

JSON_TYPE()

JSON_UNQUOTE()

JSON_VALID()

https://dev.mysql.com/doc/refman/5.7/en/json-functions.html
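To make the MODIFY group concrete, here is a short sketch contrasting JSON_SET(), JSON_INSERT() and JSON_REPLACE() on the same document. The "dept" member is a made-up example and does not appear in the original deck.

```sql
SET @doc = '{"id": 1, "name": "Jane"}';

-- JSON_SET() replaces existing members and adds missing ones.
SELECT JSON_SET(@doc, '$.name', 'Joe', '$.dept', 'IT');

-- JSON_INSERT() only adds: the existing "name" is left untouched.
SELECT JSON_INSERT(@doc, '$.name', 'Joe', '$.dept', 'IT');

-- JSON_REPLACE() only replaces: "dept" is not added.
SELECT JSON_REPLACE(@doc, '$.name', 'Joe', '$.dept', 'IT');

-- JSON_REMOVE() drops a member.
SELECT JSON_REMOVE(@doc, '$.name');

-- JSON_KEYS() lists the top-level keys of an object.
SELECT JSON_KEYS(@doc);
```

All of these return a new JSON value and leave @doc unchanged. To persist a change in a table, use the function in an UPDATE ... SET statement.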

Page 25: NoSQL and MySQL: News about JSON

Tests Using Real Life Data

• Via SF OpenData

• 206K JSON objects representing subdivision parcels.

• Imported from https://github.com/zemirco/sf-city-lots-json + small tweaks

CREATE TABLE features (
  id INT NOT NULL auto_increment primary key,
  feature JSON NOT NULL
);

Page 26: NoSQL and MySQL: News about JSON

{
  "type": "Feature",
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [
        [-122.42200352825247, 37.80848009696725, 0],
        [-122.42207601332528, 37.808835019815085, 0],
        [-122.42110217434865, 37.808803534992904, 0],
        [-122.42106256906727, 37.80860105681814, 0],
        [-122.42200352825247, 37.80848009696725, 0]
      ]
    ]
  },
  "properties": {
    "TO_ST": "0",
    "BLKLOT": "0001001",
    "STREET": "UNKNOWN",
    "FROM_ST": "0",
    "LOT_NUM": "001",
    "ST_TYPE": null,
    "ODD_EVEN": "E",
    "BLOCK_NUM": "0001",
    "MAPBLKLOT": "0001001"
  }
}

Page 27: NoSQL and MySQL: News about JSON

Naive Performance Comparison

Unindexed traversal of 206K documents

# as JSON type
SELECT DISTINCT feature->"$.type" as json_extract FROM features;
+--------------+
| json_extract |
+--------------+
| "Feature"    |
+--------------+
1 row in set (1.25 sec)

# as TEXT type
SELECT DISTINCT feature->"$.type" as json_extract FROM features;
+--------------+
| json_extract |
+--------------+
| "Feature"    |
+--------------+
1 row in set (12.85 sec)

Explanation: Binary format of JSON type is very efficient at searching. Storing as TEXT performs over 10x worse at traversal.

Using shortcut for JSON_EXTRACT. Coming in 5.7.9.

Page 28: NoSQL and MySQL: News about JSON

Introducing Generated Columns

id   my_integer   my_integer_plus_one
1    10           11
2    20           21
3    30           31
4    40           41

CREATE TABLE t1 (
  id INT NOT NULL PRIMARY KEY auto_increment,
  my_integer INT,
  my_integer_plus_one INT AS (my_integer+1)
);

UPDATE t1 SET my_integer_plus_one = 10 WHERE id = 1;
ERROR 3105 (HY000): The value specified for generated column 'my_integer_plus_one' in table 't1' is not allowed.

Column automatically maintained based on your specification.

Read-only of course
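A brief sketch of how the generated column behaves in practice, assuming t1 has just been created as above and is still empty (so the auto_increment ids start at 1):

```sql
-- Only the base column is supplied; the generated column is computed.
INSERT INTO t1 (my_integer) VALUES (10), (20), (30), (40);

SELECT id, my_integer, my_integer_plus_one FROM t1;

-- Changing the base column changes the generated value with it.
UPDATE t1 SET my_integer = 15 WHERE id = 1;
SELECT my_integer_plus_one FROM t1 WHERE id = 1;  -- 16
```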

Page 29: NoSQL and MySQL: News about JSON

Generated Columns Support Indexes!

ALTER TABLE features ADD feature_type VARCHAR(30) AS (feature->"$.type");
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0

Metadata change only (FAST). Does not need to touch table.

ALTER TABLE features ADD INDEX (feature_type);
Query OK, 0 rows affected (0.73 sec)
Records: 0 Duplicates: 0 Warnings: 0

Creates index only. Does not modify table rows.

SELECT DISTINCT feature_type FROM features;
+--------------+
| feature_type |
+--------------+
| "Feature"    |
+--------------+
1 row in set (0.06 sec)

From table scan on 206K documents to index scan on 206K materialized values. Down from 1.25 sec to 0.06 sec.
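Note that the indexed value still carries its JSON double quotes ("Feature"). A possible variant, sketched here with a hypothetical column name feature_type_plain that is not in the original deck, wraps the extraction in JSON_UNQUOTE() so that ordinary string comparisons work, and uses EXPLAIN to check whether the index is chosen:

```sql
-- Hypothetical column: same extraction as feature_type, without the quotes.
ALTER TABLE features
  ADD feature_type_plain VARCHAR(30)
    AS (JSON_UNQUOTE(JSON_EXTRACT(feature, '$.type'))) VIRTUAL;

ALTER TABLE features ADD INDEX (feature_type_plain);

-- The plan should show whether the new index is used for the lookup.
EXPLAIN SELECT id FROM features WHERE feature_type_plain = 'Feature';
```

With only one distinct value in this data set the index is not very selective. The pattern pays off on members with many distinct values, such as $.properties.STREET.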

Page 30: NoSQL and MySQL: News about JSON

Generated Columns (cont.)

• Used for “functional index”

• Available as either VIRTUAL (default) or STORED:

ALTER TABLE features ADD feature_type varchar(30) AS (feature->"$.type") STORED;
Query OK, 206560 rows affected (4.70 sec)
Records: 206560 Duplicates: 0 Warnings: 0

• Both types of computed columns permit indexes to be added.

Page 31: NoSQL and MySQL: News about JSON

Indexing Options Available

STORED                   VIRTUAL
Primary and Secondary    Secondary Only
BTREE, Fulltext, GIS     BTREE Only
Mixed with fields        Mixed with fields
Requires table rebuild   No table rebuild
Not Online               INSTANT Alter
                         Faster Insert

Bottom Line: Unless you need a PRIMARY KEY, FULLTEXT or GIS index, VIRTUAL is probably better.

Page 32: NoSQL and MySQL: News about JSON

Virtual vs. Stored Performance

• Approximate worst case scenario via a table scan:

SELECT DISTINCT feature_type FROM features;
+--------------+
| feature_type |
+--------------+
| "Feature"    |
+--------------+

VIRTUAL-TEXT (9.89 sec)
STORED-TEXT (0.22 sec)
VIRTUAL-JSON (0.85 sec)
STORED-JSON (0.24 sec)

Clarification: Since indexes are materialized (stored) themselves, the real-life case for STORED is when generating the column is computationally expensive and you cannot use indexes effectively.

Page 33: NoSQL and MySQL: News about JSON

Road Map

• In-place partial update of JSON/BLOB (performance)

• Partial streaming of JSON/BLOB (replication)

• Full text and GIS index on virtual columns

• Currently works for "STORED"

• Improved performance through condition pushdown


Page 34: NoSQL and MySQL: News about JSON

Prefer the Relational Model - Storing as a Column

• Easier to apply a schema to your application

• Schema may make applications easier to maintain over time, as change is controlled;

• Do not have to expect as many permutations

• Allows some constraints over data


Page 35: NoSQL and MySQL: News about JSON

Prefer the Document Model - Storing as JSON

• More flexible way to represent data that is hard to model in schema;

• Easier denormalization; an optimization that is important in some specific situations

• No painful schema changes*

• Easier prototyping, Fewer types to consider

• No enforced schema, start storing values immediately

* MySQL 5.6 has Online DDL, so this is not as large an issue as it was historically.

Page 36: NoSQL and MySQL: News about JSON

Prefer the Hybrid Model – Just do it!

SSDs have capacity_in_gb, CPUs have a core_count. These attributes are not consistent across products.

CREATE TABLE pc_components (
  id INT NOT NULL PRIMARY KEY,
  description VARCHAR(60) NOT NULL,
  vendor VARCHAR(30) NOT NULL,
  serial_number VARCHAR(30) NOT NULL,
  attributes JSON NOT NULL
);
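A sketch of how the hybrid table might be used: the common attributes sit in regular columns, and the product-specific ones go into the JSON document. The rows, vendor names and serial numbers below are invented for illustration, and the -> shortcut assumes MySQL 5.7.9 or later.

```sql
-- Illustrative rows; all values are made up.
INSERT INTO pc_components VALUES
  (1, 'Example SSD', 'ExampleVendor', 'SN-0001', '{"capacity_in_gb": 512}'),
  (2, 'Example CPU', 'ExampleVendor', 'SN-0002', '{"core_count": 8}');

-- Filter on an attribute that only some products have.
-- Rows without the member yield NULL and drop out of the result.
SELECT id, description, attributes->"$.capacity_in_gb" AS capacity_in_gb
FROM pc_components
WHERE attributes->"$.capacity_in_gb" >= 256;

-- Find every component that defines a core_count at all.
SELECT id, description
FROM pc_components
WHERE JSON_CONTAINS_PATH(attributes, 'one', '$.core_count');
```

If one attribute turns out to be queried often, it can be promoted to an indexed generated column as shown on the earlier slides, without changing how the application writes the document.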

Page 37: NoSQL and MySQL: News about JSON

Prefer Simple Access Pattern – Using Key-Value

• Full access to relational data

– Value can be col1|col2|col3

– Value can be json

• Much higher throughput

• Only single row, primary key access

[Chart: TPS (y-axis, 0 to 80,000) versus Client Connections (x-axis: 8, 32, 128, 512), Memcached API compared with SQL]

Page 38: NoSQL and MySQL: News about JSON

Options for Dev – Simplicity for Ops

• Always the same tool to Backup (MySQL Enterprise Backup)

• Always the same tool to Monitor (MySQL Enterprise Monitor)

• Always the same tool to Audit (MySQL Enterprise Audit)

• Always the same tool to Protect (MySQL Enterprise Firewall)

• Always the same source of Support (Oracle MySQL Support)

• Always the same way to Deploy (Repos, Openstack, ...)


Polyglot Persistence with Operational Stability

Page 40: NoSQL and MySQL: News about JSON

Thank You!

Copyright © 2015, Oracle and/or its affiliates. All rights reserved.