voltdb: shard it by v. torshyn

Post on 05-Jul-2015

DESCRIPTION

VoltDB Shard It: Java Morning slides

TRANSCRIPT

  • VoltDB: Shard it!

    By Vitalii Torshyn

  • VoltDB Designed for

    Network Activity Monitoring / Security
    Real-Time Analytics/Monitoring
    Telecom: billing, QoS, Policy Management
    Financial Services
    Gaming

  • Agenda

    #1: Academic issues: Shard, HP, VP
    #2: What is VoltDB? Why VoltDB?
    #3: Simple VoltDB application
    #4: Tips and Tweaks: you are not supposed to do that at all

  • Academic issues

    What is:
    Shard, Horizontal Partitioning
    ACID compliant DBMS
    In-memory and real time DB

  • Straightforward approach

  • What is Horizontal Partitioning

  • What is shard
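
As a rough illustration of horizontal partitioning, the sketch below routes each row to exactly one shard by hashing its key, so all rows with the same key land together. `ShardRouter`, the CRC32 hash, and the in-memory lists are assumptions made for this example; this is not VoltDB's actual partitioning code.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical helper (not a VoltDB class): hash-based horizontal partitioning.
final class ShardRouter {
    private final List<List<String>> shards = new ArrayList<>();

    ShardRouter(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        for (int i = 0; i < shardCount; i++) {
            shards.add(new ArrayList<>());
        }
    }

    // The same key always maps to the same shard index in [0, shardCount).
    int shardFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % shards.size());
    }

    // Stores the row on the single shard that owns its key.
    void insert(String key, String row) {
        shards.get(shardFor(key)).add(row);
    }

    List<String> rowsOnShard(int index) {
        return shards.get(index);
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(4);
        router.insert("alice", "alice: hello");
        router.insert("bob", "bob: hi");
        router.insert("alice", "alice: anyone here?");
        for (int i = 0; i < 4; i++) {
            System.out.println("shard " + i + ": " + router.rowsOnShard(i));
        }
    }
}
```

Because the shard is a pure function of the key, a lookup by key only has to touch one shard; a query that does not filter on the key has to visit all of them.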

  • Use over time for shard

  • ACID

    Atomicity
    Consistency
    Isolation
    Durability

  • In-memory/real time database

    Storage is optional, not required
    Key value? No!
    Data access is cheap (no I/O)
    Real time processing

  • db-engines.com: Ranking

  • What is VoltDB?

    ACID compliant DBMS
    In-memory database
    Real time
    Shared nothing architecture (SN)
    SQL support
    Java stored procedures

  • C++ Engine: Indexing, Lookup, Mem. Management ...

    Java over JNI: SP, Query Prepare, Transfer

  • Why VoltDB?

    Low cost of scaling
    Low latency
    Automatic cross-partition joins (no app. code)
    Multi-master replication (HA, K-Safety)
    No buffer management (in memory)
    Lockless
    Licensing (GPLv3, Enterprise)
    Client libraries: Java, Python, C++, C#...
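
One way to picture the "lockless" point: if every partition is owned by exactly one thread, work on that partition runs serially and needs no locks. The sketch below imitates that idea with one single-threaded executor per partition. `PartitionedCounter` and its methods are hypothetical names for this example; it is a simplified model, not VoltDB's engine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical model: each partition's state is touched by one thread only.
final class PartitionedCounter {
    private final ExecutorService[] workers;
    private final long[] counts; // counts[p] is written only by workers[p]

    PartitionedCounter(int partitions) {
        workers = new ExecutorService[partitions];
        counts = new long[partitions];
        for (int i = 0; i < partitions; i++) {
            workers[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Routes the increment to the partition that owns the key; no locks needed.
    Future<Long> increment(String key) {
        final int p = Math.floorMod(key.hashCode(), workers.length);
        Callable<Long> task = () -> {
            counts[p]++;
            return counts[p];
        };
        return workers[p].submit(task);
    }

    void shutdown() {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
    }

    // Submits `operations` increments, waits for all of them, returns the sum.
    static long runDemo(int partitions, int operations) {
        PartitionedCounter counter = new PartitionedCounter(partitions);
        List<Future<Long>> pending = new ArrayList<>();
        try {
            for (int i = 0; i < operations; i++) {
                pending.add(counter.increment("user-" + (i % 10)));
            }
            for (Future<Long> f : pending) {
                f.get(); // waiting on the future also makes the task's write visible
            }
            long total = 0;
            for (long c : counter.counts) {
                total += c;
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            counter.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000)); // prints 1000
    }
}
```

No increment is lost even though `counts` is a plain array, because two tasks for the same partition can never run at the same time.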

  • Requests execution

  • Access and Networking

    RESTful HTTP/JSON API
    Socket connections
    Java API
    JDBC

  • Is VoltDB fast enough?

    Sharding
    High speed
    Small transactions
    No journaling required
    Buffering is not required

  • Replication: K-Safety = 1

  • Replication: Active-Passive

  • K-Factor: performance
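
The performance cost of the K-factor can be seen with simple arithmetic: with K-safety K, every partition is kept in K + 1 copies, so the same hardware holds fewer unique partitions. The sketch below assumes the usual sizing rule, unique partitions = (hosts × sites per host) / (K + 1). `KSafetyMath` and the validation checks are assumptions of this example, not a VoltDB API.

```java
// Hypothetical helper: how many unique partitions a cluster can hold for a given K-factor.
final class KSafetyMath {
    static int uniquePartitions(int hosts, int sitesPerHost, int kFactor) {
        if (hosts <= 0 || sitesPerHost <= 0 || kFactor < 0) {
            throw new IllegalArgumentException("hosts and sitesPerHost must be > 0, kFactor >= 0");
        }
        int copies = kFactor + 1; // each partition exists K + 1 times
        if (copies > hosts) {
            throw new IllegalArgumentException("need at least K + 1 hosts to place the copies");
        }
        int totalSites = hosts * sitesPerHost;
        if (totalSites % copies != 0) {
            // Assumption of this sketch: total sites must split evenly into copies.
            throw new IllegalArgumentException("total sites must be divisible by K + 1");
        }
        return totalSites / copies;
    }

    public static void main(String[] args) {
        System.out.println(uniquePartitions(3, 4, 0)); // prints 12
        System.out.println(uniquePartitions(3, 4, 1)); // prints 6
    }
}
```

Going from K = 0 to K = 1 on the same three hosts halves the number of unique partitions, which is one reason throughput drops as the K-factor grows.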

  • VoltDB tools

    csvloader: loads data from a CSV file
    exporttofile: exports to a CSV/TSV file
    sqlcmd: mysql-like client
    voltadmin: administrative functions
    voltdb: catalog (DB) management and server

  • Questions?

  • VoltDB: Super Chat

  • Super Chat: flow

  • DB Schema file

    CREATE TABLE messages (
        uid  BIGINT        NOT NULL,
        nick VARCHAR(64)   NOT NULL,
        ip   VARCHAR(16)   NOT NULL,
        text VARCHAR(1024)
    );
    CREATE INDEX messages_idx ON messages (nick, ip);
    PARTITION TABLE messages ON COLUMN nick;
    CREATE PROCEDURE FROM CLASS vposter.procedures.AddMessage;
    PARTITION PROCEDURE AddMessage ON TABLE messages COLUMN nick;

  • Simple Java Procedure

  • Putting All Together

    sh$ javac -cp "$CLASS_PATH:/opt/voltdb-3.7/voltdb/*" java/vposter/procedures/AddMessage.java
    sh$ voltdb compile --classpath=./java/ -o voltdb-catalog.jar /path/to/schema/schema-ddl.sql
    # Finally, run the server
    sh$ voltdb create catalog voltdb-catalog.jar

  • Questions?

  • Tips and Tweaks

    Java stored procedures
    Schema file: restrictions and syntax
    Client code vs Server code
    TPS: Performance Testing
    Deployment configuration

  • Java Stored procedures

    Minimize heavy math calculations, i.e. let the client do what it needs
    Minimize the SQL queue, i.e. passing lists/arrays as parameters is a real optimization
    Do not return huge chunks of data
    To Throw or Not To Throw
    Use statuses (application and response)

  • Schema file: restrictions and syntax

    Use comments
    Renaming columns/tables in DDL
    Constraint LIMIT PARTITION ROWS
    Standard constraints
    Views as a mechanism of aggregation

  • Client code vs Server code

    Let the server aggregate data, let the client process data
    Table-Of-Tables
    Optimize insertions
    Why are SELECT * requests dangerous?

  • TPS: Performance Testing

    @STATISTICS usage
    @EXPLAIN[PROC] usage
    @SystemInformation
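
The system procedures on this slide report server-side figures. For a client-side cross-check, a test harness can record its own timings and reduce them to TPS and a latency percentile. The sketch below is only that arithmetic; `ThroughputReport`, its method names, and the nearest-rank percentile rule are assumptions of this example and are unrelated to the VoltDB client API.

```java
import java.util.Arrays;

// Hypothetical helper: reduces recorded client-side timings to TPS and a percentile.
final class ThroughputReport {
    // Completed transactions divided by elapsed wall-clock time in seconds.
    static double transactionsPerSecond(long completed, long elapsedMillis) {
        if (completed < 0 || elapsedMillis <= 0) {
            throw new IllegalArgumentException("completed must be >= 0, elapsedMillis > 0");
        }
        return completed * 1000.0 / elapsedMillis;
    }

    // Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it.
    static long percentile(long[] latenciesMicros, int pct) {
        if (latenciesMicros.length == 0 || pct < 1 || pct > 100) {
            throw new IllegalArgumentException("need samples and 1 <= pct <= 100");
        }
        long[] sorted = latenciesMicros.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) (((long) pct * sorted.length + 99) / 100); // integer ceiling
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        long[] latencies = {5, 1, 3, 2, 4};
        System.out.println(transactionsPerSecond(5000, 2000)); // prints 2500.0
        System.out.println(percentile(latencies, 50));         // prints 3
    }
}
```

Reporting a high percentile next to TPS is useful because an average hides the slow tail that users actually notice.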

  • Questions?

  • References

    http://odbms.org/download/VoltDBTechnicalOverview.pdf
    http://www.mysqlperformanceblog.com/2011/02/28/is-voltdb-really-as-scalable-as-they-claim/
    http://voltdb.com
    http://techledger.wordpress.com/2011/07/08/voltdb-faq/
    http://highscalability.com/blog/2010/6/28/voltdb-decapitates-six-sql-urban-myths-and-delivers-internet.html
    http://www.perfdynamics.com/Manifesto/USLscalability.html#tth_sEc1
    git@github.com:vtorshyn/voltdb-shardit-src.git