NoSQL in MySQL 5.7
NoSQL in a SQL world
Airton Lastori [email protected] April 2016
Copyright © 2015 Oracle and/or its affiliates. All rights reserved. |
DBA, Dev, Management
Who?
Never used NoSQL
Uses NoSQL only in non-critical apps
Uses NoSQL in critical apps
Who?
Agenda
1. NoSQL?
2. Using a relational database as a non-relational one
3. NewSQL
NoSQL? A brief introduction
Oracle Confidential – Internal/Restricted/Highly Restricted
NoSQL = non-relational
Relational Model
• Edgar F. Codd
– 1970, IBM
– Turing Award, 1981
• Strong mathematical foundation
– set theory, predicate logic, etc.
• Implemented as SQL
The relational model for database management: version 2
SQL = implementation of the relational model
Relational Database Management Systems
• Software for managing data based on the relational model
• Logical structure of the data: the user's view
– Data is organized in tables made of rows and columns, with relationship rules between them (constraints)
– Known as structured data
– SQL lets you create, maintain, and query data in these structures
– Normalization and constraints avoid duplicated information, improving data consistency and quality
– ACID properties (transaction support)
• Physical structure of the data: the machine's view
– B-tree structures (B*Tree) = very fast lookups, O(log n)
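A minimal sketch of the logical-view points above: constraints reject inconsistent rows, and a transaction groups changes atomically. The customers and orders tables are hypothetical and do not come from the original slides.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE customers (
  id INT NOT NULL PRIMARY KEY,
  name VARCHAR(60) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE orders (
  id INT NOT NULL PRIMARY KEY,
  customer_id INT NOT NULL,
  total DECIMAL(10,2) NOT NULL,
  FOREIGN KEY (customer_id) REFERENCES customers (id)
) ENGINE=InnoDB;

-- Both inserts become visible together, or not at all.
START TRANSACTION;
INSERT INTO customers VALUES (1, 'Jane');
INSERT INTO orders VALUES (100, 1, 49.90);
COMMIT;

-- Rejected by the foreign key: there is no customer 2.
INSERT INTO orders VALUES (101, 2, 10.00);
```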
Many data models
• The relational model does not solve every problem well
• 1960s: navigational
– Hierarchical
– Network
• 1970s: SQL/relational
• 1990s: object-oriented
– partly absorbed by RDBMSs
• 2000s: NoSQL
– Will it be absorbed by RDBMSs? Just hype?
The Web
Massive data, Big Data problems
Very high scale
Always online
Strategy: commodity hardware in the cloud + free software
Relational model: hard to scale and to make highly available on a cloud of commodity hardware
www.leavcom.com/pdf/NoSQL.pdf
Standalone
Clustered
Different problems
Problems in clustered environments
• When data is normalized and distributed, it is hard to maintain performance
• When data is distributed, it is hard to maintain consistency and implement transactions
• When data is distributed across commodity hardware, it must be duplicated and synchronized for fault tolerance
• Etc.
How about giving up some parts of the relational model in favor of others?
Web = trigger for the emergence of NoSQL technologies
http://db-engines.com/en/ranking_categories
183 NoSQL systems
12 categories
Common characteristics
• High performance
– A NoSQL database is usually very fast because it has a simplified architecture
• Avoids JOIN operations
– by storing duplicated, denormalized data
• Designed to scale horizontally
– Remember cloud computing? Exactly...
• Features are usually given up in favor of ease of use, including at scale
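The trade-off behind "avoids JOINs" can be sketched in plain SQL. The tables below are hypothetical (not from the slides) and only contrast the two layouts; a real NoSQL store would expose a different API.

```sql
-- Normalized: reading a post with its author name needs a JOIN.
CREATE TABLE authors (
  id INT NOT NULL PRIMARY KEY,
  name VARCHAR(60) NOT NULL
);
CREATE TABLE posts (
  id INT NOT NULL PRIMARY KEY,
  author_id INT NOT NULL,
  title VARCHAR(100) NOT NULL
);
SELECT p.title, a.name
FROM posts p JOIN authors a ON a.id = p.author_id
WHERE p.id = 1;

-- Denormalized: the author name is copied into every post row,
-- so a single primary-key lookup answers the same question.
CREATE TABLE posts_denormalized (
  id INT NOT NULL PRIMARY KEY,
  author_name VARCHAR(60) NOT NULL,
  title VARCHAR(100) NOT NULL
);
SELECT title, author_name FROM posts_denormalized WHERE id = 1;
```

The cost of the second layout is the duplication itself: renaming an author means updating every copy of the name.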
Schemaless
• key-value
• document
• wide column
• graph
• Etc.
• Changes the logical model, the user's view
– Other access APIs = Not Only SQL
– In many cases it simplifies the developer's life (the DBA's, not so much...)
Computing made accessible through public clouds, free software, and ease of use make the NoSQL movement very relevant
http://db-engines.com/en/ranking_trend (Mar 2016)
Using a relational database as a non-relational one: success stories from the web
Major MySQL users
Web, Cloud, Distributed, and Embedded…
Many were startups just a few years ago; they started and grew with MySQL
Uber uses MySQL as NoSQL
eng.uber.com/schemaless-part-one
• Our new solution needed to be able to linearly add capacity by adding more servers
• We needed write availability – replacing Redis as the data pipeline, seeking read consistency without giving up write performance
• We needed secondary indexes – moving away from Postgres while keeping the same functionality
• We needed operation trust in the system, as it contains mission-critical trip data
• We needed a way of notifying downstream dependencies – multiple interdependent processes (billing, analytics) that must be isolated in order to scale, without data loss
“We had an unexpected loss of data on nearly every technology we used at one time or another, except MySQL.”
– Pinterest Engineering
NewSQL: the relational world embraces NoSQL
Support for the key-value model
• Memcached plug-in
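The Memcached plug-in exposes InnoDB tables through the memcached protocol. The sketch below shows only the equivalent key-value access pattern in SQL; the kv_store table and its columns are assumptions made for illustration, and the plug-in's own setup is not shown.

```sql
-- Hypothetical key-value table: one indexed key, one opaque value.
CREATE TABLE kv_store (
  k VARCHAR(64) NOT NULL PRIMARY KEY,
  v VARCHAR(1024)
) ENGINE=InnoDB;

-- "set": insert the key, or overwrite its value if it already exists.
INSERT INTO kv_store (k, v) VALUES ('session:42', 'jane')
ON DUPLICATE KEY UPDATE v = VALUES(v);

-- "get": a primary-key point select.
SELECT v FROM kv_store WHERE k = 'session:42';
```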
MySQL 5.7 Sysbench Benchmark: SQL Point Selects 3x Faster than MySQL 5.6
1,600,000 QPS
[Figure: MySQL 5.7: Sysbench OLTP Read Only (SQL Point Selects). Y axis: Queries per Second, 0 to 1,800,000. X axis: Connections, 8 to 1,024. Series: MySQL 5.7, MySQL 5.6, MySQL 5.5.]
Intel(R) Xeon(R) CPU E7-8890 v3, 4 sockets x 18 cores-HT (144 CPU threads), 2.5 GHz, 512 GB RAM, Linux kernel 3.16
Support for the document-oriented model in MySQL 5.7
1. Native JSON datatype
2. JSON Functions
3. Generated Columns
Native JSON type
CREATE TABLE employees (data JSON);
INSERT INTO employees VALUES ('{"id": 1, "name": "Jane"}');
INSERT INTO employees VALUES ('{"id": 2, "name": "Joe"}');
SELECT * FROM employees;
+---------------------------+
| data                      |
+---------------------------+
| {"id": 1, "name": "Jane"} |
| {"id": 2, "name": "Joe"}  |
+---------------------------+
2 rows in set (0.00 sec)
Advantages over TEXT/VARCHAR types
1. Document validation:
INSERT INTO employees VALUES ('some random text');
ERROR 3130 (22032): Invalid JSON text: "Expect a value here." at
position 0 in value (or column) 'some random text'.
2. Efficient physical storage: the optimized binary format allows quicker access to object members and array elements.
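Validity can also be tested without attempting an insert. A minimal sketch using JSON_VALID(), one of the 5.7 JSON functions; the column aliases are illustrative only.

```sql
-- JSON_VALID returns 1 for valid JSON text and 0 otherwise.
SELECT JSON_VALID('{"id": 1, "name": "Jane"}') AS valid_doc,    -- 1
       JSON_VALID('some random text')          AS random_text;  -- 0
```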
JSON Functions
SET @document = '[10, 20, [30, 40]]';
SELECT JSON_EXTRACT(@document, '$[1]');
+---------------------------------+
| JSON_EXTRACT(@document, '$[1]') |
+---------------------------------+
| 20                              |
+---------------------------------+
1 row in set (0.01 sec)
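Paths can also reach into nested arrays. A short sketch on the same @document value; the column aliases are illustrative.

```sql
SET @document = '[10, 20, [30, 40]]';
SELECT JSON_EXTRACT(@document, '$[2]')    AS nested_array,     -- [30, 40]
       JSON_EXTRACT(@document, '$[2][0]') AS nested_element,   -- 30
       JSON_EXTRACT(@document, '$[5]')    AS missing,          -- NULL (no such element)
       JSON_LENGTH(@document)             AS top_level_items;  -- 3
```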
Tests with real data
• Via SF OpenData
• 206K JSON objects representing subdivision parcels.
• Imported from https://github.com/zemirco/sf-city-lots-json + small tweaks
CREATE TABLE features (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  feature JSON NOT NULL
);
{
"type":"Feature",
"geometry":{
"type":"Polygon",
"coordinates":[
[
[-122.42200352825247,37.80848009696725,0],
[-122.42207601332528,37.808835019815085,0],
[-122.42110217434865,37.808803534992904,0],
[-122.42106256906727,37.80860105681814,0],
[-122.42200352825247,37.80848009696725,0]
]
]
},
"properties":{
"TO_ST":"0",
"BLKLOT":"0001001",
"STREET":"UNKNOWN",
"FROM_ST":"0",
"LOT_NUM":"001",
"ST_TYPE":null,
"ODD_EVEN":"E",
"BLOCK_NUM":"0001",
"MAPBLKLOT":"0001001"
}
}
Naive Performance Comparison
Unindexed traversal of 206K documents
# as JSON type
SELECT DISTINCT
  feature->"$.type" AS json_extract
FROM features;
+--------------+
| json_extract |
+--------------+
| "Feature"    |
+--------------+
1 row in set (1.25 sec)
# as TEXT type
SELECT DISTINCT
  feature->"$.type" AS json_extract
FROM features;
+--------------+
| json_extract |
+--------------+
| "Feature"    |
+--------------+
1 row in set (12.85 sec)
Explanation: The binary format of the JSON type is very efficient to search. Storing the documents as TEXT performs over 10x worse at traversal.
Note: -> is a shortcut for JSON_EXTRACT, available as of MySQL 5.7.9.
Generated Columns
CREATE TABLE t1 (
  id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
  my_integer INT,
  my_integer_plus_one INT AS (my_integer + 1)
);
id  my_integer  my_integer_plus_one
1   10          11
2   20          21
3   30          31
4   40          41
The column is maintained automatically, based on your specification.
It is read-only, of course:
UPDATE t1 SET my_integer_plus_one = 10 WHERE id = 1;
ERROR 3105 (HY000): The value specified for generated column
'my_integer_plus_one' in table 't1' is not allowed.
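A small sketch of the automatic maintenance described above. It uses a copy of the table named t1_demo (a hypothetical name, chosen to avoid clashing with t1) and shows the generated value following a change to the base column.

```sql
CREATE TABLE t1_demo (
  id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
  my_integer INT,
  my_integer_plus_one INT AS (my_integer + 1)
);

-- Only the base column is written; the generated column is computed.
INSERT INTO t1_demo (my_integer) VALUES (10), (20), (30), (40);

-- Changing the base column changes the generated value too.
UPDATE t1_demo SET my_integer = 15 WHERE id = 1;
SELECT id, my_integer, my_integer_plus_one
FROM t1_demo
WHERE id = 1;  -- 1, 15, 16
```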
Generated Columns Support Indexes!
ALTER TABLE features ADD feature_type VARCHAR(30) AS (feature->"$.type");
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Metadata change only (FAST). Does not need to touch the table.
ALTER TABLE features ADD INDEX (feature_type);
Query OK, 0 rows affected (0.73 sec)
Records: 0 Duplicates: 0 Warnings: 0
Creates the index only. Does not modify table rows.
SELECT DISTINCT feature_type FROM features;
+--------------+
| feature_type |
+--------------+
| "Feature"    |
+--------------+
1 row in set (0.06 sec)
From a table scan on 206K documents to an index scan on 206K materialized values: down from 1.25 sec to 0.06 sec.
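The same idea can be applied to a nested property and used in a WHERE clause. This is a sketch under assumptions: features_demo, the street column, its VARCHAR(64) width, and the index name idx_street are all hypothetical; JSON_UNQUOTE is used so the indexed value has no surrounding quotes.

```sql
-- Hypothetical demo table with an indexed virtual column over a JSON path.
CREATE TABLE features_demo (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  feature JSON NOT NULL,
  street VARCHAR(64) AS (JSON_UNQUOTE(feature->"$.properties.STREET")) VIRTUAL,
  INDEX idx_street (street)
);

INSERT INTO features_demo (feature) VALUES
  ('{"type": "Feature", "properties": {"STREET": "MARKET"}}'),
  ('{"type": "Feature", "properties": {"STREET": "HAIGHT"}}');

-- The filter targets the generated column, so the optimizer can use idx_street.
SELECT id FROM features_demo WHERE street = 'MARKET';

-- EXPLAIN shows which index, if any, the optimizer actually chose.
EXPLAIN SELECT id FROM features_demo WHERE street = 'MARKET';
```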
Generated Columns (cont.)
• Used to build a “functional index”
• Available as either VIRTUAL (default) or STORED:
ALTER TABLE features ADD feature_type VARCHAR(30) AS (feature->"$.type") STORED;
Query OK, 206560 rows affected (4.70 sec)
Records: 206560 Duplicates: 0 Warnings: 0
• Both types of generated columns allow indexes to be added.
Indexing Options Available
STORED                  VIRTUAL
Primary and Secondary   Secondary Only
BTREE, Fulltext, GIS    BTREE Only
Mixed with fields       Mixed with fields
Requires table rebuild  No table rebuild
Not Online              INSTANT Alter
                        Faster Insert
Bottom line: Unless you need a PRIMARY KEY, FULLTEXT, or GIS index, VIRTUAL is probably better.
Virtual vs. Stored Performance
• Approximate worst-case scenario via a table scan:
SELECT DISTINCT feature_type FROM features;
+--------------+
| feature_type |
+--------------+
| "Feature"    |
+--------------+
VIRTUAL-TEXT (9.89 sec)
STORED-TEXT (0.22 sec)
VIRTUAL-JSON (0.85 sec)
STORED-JSON (0.24 sec)
Clarification: Since indexes are themselves materialized (stored), the real-life case for STORED is when generating the column is computationally expensive and you cannot use indexes effectively.
Unquote JSON String
SELECT
  DISTINCT JSON_UNQUOTE(feature->"$.type")
  AS feature_type
FROM features;
+-----------------+
| feature_type    |
+-----------------+
| Feature         |
+-----------------+
1 row in set (1.22 sec)
JSON Path Search
• Provides an easy way for a novice to discover the path of a value, which can then be retrieved via: [[database.]table.]column->"$<path spec>"
SELECT JSON_SEARCH(feature,
  'one', 'MARKET') AS
  extract_path
FROM features
WHERE id = 121254;
+-----------------------+
| extract_path          |
+-----------------------+
| "$.properties.STREET" |
+-----------------------+
1 row in set (0.00 sec)
SELECT
  feature->"$.properties.STREET"
  AS property_street
FROM features
WHERE id = 121254;
+-----------------+
| property_street |
+-----------------+
| "MARKET"        |
+-----------------+
1 row in set (0.00 sec)
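JSON_SEARCH also accepts 'all' instead of 'one', and LIKE-style wildcards in the search string. A minimal sketch on a hand-written document (the @doc value is made up for illustration, loosely following the shape of the sample data):

```sql
SET @doc = '{"properties": {"STREET": "MARKET", "ST_TYPE": "ST"}}';

SELECT
  -- 'one' stops at the first matching path.
  JSON_SEARCH(@doc, 'one', 'MARKET')  AS first_match,
  -- 'all' collects every path whose string value matches the pattern.
  JSON_SEARCH(@doc, 'all', 'M%')      AS prefix_matches,
  -- No string value matches, so the result is NULL.
  JSON_SEARCH(@doc, 'one', 'MISSION') AS no_match;
```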
JSON Array Creation
SELECT JSON_ARRAY(id,
  feature->"$.properties.STREET",
  feature->"$.type") AS json_array
FROM features ORDER BY RAND() LIMIT 3;
+-------------------------------+
| json_array                    |
+-------------------------------+
| [65298, "10TH", "Feature"]    |
| [122985, "08TH", "Feature"]   |
| [172884, "CURTIS", "Feature"] |
+-------------------------------+
3 rows in set (2.66 sec)
JSON Object Creation
SELECT JSON_OBJECT('id', id,
  'street', feature->"$.properties.STREET",
  'type', feature->"$.type"
) AS json_object
FROM features ORDER BY RAND() LIMIT 3;
+--------------------------------------------------------+
| json_object                                            |
+--------------------------------------------------------+
| {"id": 122976, "type": "Feature", "street": "RAUSCH"}  |
| {"id": 148698, "type": "Feature", "street": "WALLACE"} |
| {"id": 45214, "type": "Feature", "street": "HAIGHT"}   |
+--------------------------------------------------------+
3 rows in set (3.11 sec)
JSON_REPLACE
SELECT JSON_REPLACE(feature, '$.type', JSON_ARRAY('feature', 'bug')) AS json_object
FROM features LIMIT 1;
+--------------------------------------------------------+
| json_object                                            |
+--------------------------------------------------------+
| {"type": ["feature", "bug"], "geometry": {"type": ..}} |
+--------------------------------------------------------+
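JSON_REPLACE has two siblings, JSON_INSERT and JSON_SET, that differ in how they treat paths that do or do not already exist. A hedged sketch on a hand-written document; the owner key is hypothetical, and the final UPDATE assumes the features table from the earlier slides.

```sql
SET @doc = '{"type": "Feature", "properties": {"STREET": "MARKET"}}';

SELECT
  -- Replaces existing paths only: type changes, owner is ignored.
  JSON_REPLACE(@doc, '$.type', 'Parcel', '$.owner', 'n/a') AS replaced,
  -- Adds missing paths only: type is kept, owner is added.
  JSON_INSERT(@doc, '$.type', 'Parcel', '$.owner', 'n/a')  AS inserted,
  -- Does both: type changes and owner is added.
  JSON_SET(@doc, '$.type', 'Parcel', '$.owner', 'n/a')     AS set_both;

-- A SELECT only returns a modified copy. To persist a change,
-- assign the result back to the column in an UPDATE.
UPDATE features
SET feature = JSON_SET(feature, '$.properties.STREET', 'MARKET')
WHERE id = 1;
```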
JSON Functions
• 5.7 supports functions to CREATE, SEARCH, MODIFY and RETURN JSON values:
JSON_ARRAY_APPEND()
JSON_ARRAY_INSERT()
JSON_ARRAY()
JSON_CONTAINS_PATH()
JSON_CONTAINS()
JSON_DEPTH()
JSON_EXTRACT()
JSON_INSERT()
JSON_KEYS()
JSON_LENGTH()
JSON_MERGE()
JSON_OBJECT()
JSON_QUOTE()
JSON_REMOVE()
JSON_REPLACE()
JSON_SEARCH()
JSON_SET()
JSON_TYPE()
JSON_UNQUOTE()
JSON_VALID()
https://dev.mysql.com/doc/refman/5.7/en/json-functions.html
JSON Comparator
SELECT CAST(1 AS JSON) = 1;
+---------------------+
| CAST(1 AS JSON) = 1 |
+---------------------+
|                   1 |
+---------------------+
1 row in set (0.01 sec)
JSON value of 1 equals 1
JSON or Column?
• You choose
• Both approaches have advantages
Storing as a Column
• Easier to apply a schema to your application
• Schema may make applications easier to maintain over time, as change is controlled
• Do not have to expect as many permutations
• Allows some constraints over data
Storing as JSON
• More flexible way to represent data that is hard to model in a schema
• Imagine you are a SaaS application serving many customers
• Strong use case: supporting custom fields
• Historically this may have used the entity–attribute–value (EAV) model, which does not always perform well
JSON (cont.)
• Easier denormalization; an optimization that is important in some specific situations
• No painful schema changes*
• Easier prototyping
• Fewer types to consider
• No enforced schema, start storing values immediately
* MySQL 5.6 and later have Online DDL, so this is not as large an issue as it was historically.
Schema + Schemaless
SSDs have capacity_in_gb, CPUs have a core_count. These attributes are not consistent across products.
CREATE TABLE pc_components (
  id INT NOT NULL PRIMARY KEY,
  description VARCHAR(60) NOT NULL,
  vendor VARCHAR(30) NOT NULL,
  serial_number VARCHAR(30) NOT NULL,
  attributes JSON NOT NULL
);
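A sketch of how such a mixed table might be used. The rows, vendor names, and serial numbers below are made up for illustration; only the table definition and the capacity_in_gb and core_count attribute names come from the slide.

```sql
CREATE TABLE IF NOT EXISTS pc_components (
  id INT NOT NULL PRIMARY KEY,
  description VARCHAR(60) NOT NULL,
  vendor VARCHAR(30) NOT NULL,
  serial_number VARCHAR(30) NOT NULL,
  attributes JSON NOT NULL
);

-- Fixed columns are shared by all products; attributes vary per product type.
INSERT INTO pc_components VALUES
  (1, 'Solid state drive', 'ExampleVendor', 'SN-0001', '{"capacity_in_gb": 512}'),
  (2, 'Processor',         'ExampleVendor', 'SN-0002', '{"core_count": 8}');

-- Rows without the attribute yield NULL for the path and are filtered out.
SELECT id, description, attributes->"$.capacity_in_gb" AS capacity_in_gb
FROM pc_components
WHERE attributes->"$.capacity_in_gb" >= 256;

-- Find products that define a given attribute at all.
SELECT id, description
FROM pc_components
WHERE JSON_CONTAINS_PATH(attributes, 'one', '$.core_count');
```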
Summary
1. The NoSQL movement is highly relevant, and the Web giants are its protagonists
2. NoSQL complements relational databases
3. NewSQL = combining the two worlds
4. MySQL remains very relevant on the Web
5. The Memcached plugin and JSON are examples in MySQL of how relational databases can embrace NoSQL
Thank you!
@MySQLBR meetup.com/MySQL-BR facebook.com/MySQLBR
pt.planet.mysql.com
Questions?
NoSQL in a SQL world. Contact: [email protected] twitter.com/mysqlbr facebook.com/mysqlbr