VoltDB: Shard it!
By Vitalii Torshyn
VoltDB Designed for
● Network Activity Monitoring / Security
● Real-Time Analytics/Monitoring
● Telecom: billing, QoS, Policy Management
● Financial Services
● Gaming
Agenda
#1: Academic issues: Shard, HP, VP
#2: What is VoltDB? Why?! VoltDB?
#3: Real VoltDB application
#4: Tips and Tweaks: you are not supposed to do that at all
Academic issues
What is:
● Shard, Horizontal Partitioning
● ACID compliant DBMS
● In-memory and real time DB
Straightforward approach
What is Horizontal Partitioning
What is shard
Use over time for shard
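The sharding idea can be sketched in a few lines of Java: every row is assigned to exactly one shard by hashing its partition key, so the same key always lands on the same shard. This is only an illustration. The class name ShardRouter, the CRC32 hash and the sample nicknames are assumptions made for the example, and VoltDB's own partitioning hash is not reproduced here.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

/** Illustrative only: routes rows to shards by hashing a partition key. */
final class ShardRouter {

    /** Maps a key to a shard index in the range [0, shardCount). */
    static int shardFor(String key, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        // getValue() is a non-negative long, so the remainder is non-negative too.
        return (int) (crc.getValue() % shardCount);
    }

    public static void main(String[] args) {
        int shardCount = 3; // hypothetical cluster size
        List<List<String>> shards = new ArrayList<>();
        for (int i = 0; i < shardCount; i++) {
            shards.add(new ArrayList<>());
        }
        // Rows with the same key ("alice") always end up on the same shard.
        for (String nick : new String[] {"alice", "bob", "carol", "alice"}) {
            shards.get(shardFor(nick, shardCount)).add(nick);
        }
        for (int i = 0; i < shardCount; i++) {
            System.out.println("shard " + i + ": " + shards.get(i));
        }
    }
}
```

A plain modulo keeps the sketch short, but changing shardCount then remaps most keys. Growing a shard set over time usually calls for a scheme that moves less data.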
ACID
● Atomicity
● Consistency
● Isolation
● Durability
In-memory/real time database
● Storage is an option, not a requirement
● Key-value? No!
● Data access is cheap (no I/O)
● Real-time processing
db-engines.com: Ranking
What is VoltDB?
● ACID compliant DBMS
● In-memory database
● Real time
● Shared nothing architecture (SN)
● SQL support
● Java stored procedures
C++ engine: indexing, lookup, memory management ...
Java over JNI: SP, query prepare, transfer
Why VoltDB?
● Low cost of scaling
● Low latency
● Automatic cross-partition joins (no application code)
● Multi-master replication (HA, K-Safety)
● No buffer management (in memory)
● Lockless
● Licensing (GPLv3, Enterprise)
● Client libraries: Java, Python, C++, C#...
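The "Lockless" point can be illustrated with a toy model. It assumes, as VoltDB's architecture is commonly described, that each partition is served by a single thread that runs its transactions one after another. The class SerialPartitions and its methods are hypothetical names for this sketch. It is not VoltDB code, only a demonstration of why per-partition state needs no locks when one thread owns it.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Illustrative only: one single-threaded worker per partition, no locks on partition data. */
final class SerialPartitions {
    private final ExecutorService[] workers;
    private final long[] counters; // counters[p] is only ever modified by workers[p]

    SerialPartitions(int partitionCount) {
        workers = new ExecutorService[partitionCount];
        counters = new long[partitionCount];
        for (int p = 0; p < partitionCount; p++) {
            workers[p] = Executors.newSingleThreadExecutor();
        }
    }

    /** Queues one tiny "transaction" on the partition that owns the key. */
    void increment(String key) {
        int p = Math.floorMod(key.hashCode(), workers.length);
        // Plain, unsynchronized increment: safe because only one thread touches slot p.
        workers[p].execute(() -> counters[p]++);
    }

    /** Stops accepting work, waits for queued work to finish, and returns the grand total. */
    long shutdownAndSum() {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
        try {
            for (ExecutorService w : workers) {
                if (!w.awaitTermination(10, TimeUnit.SECONDS)) {
                    throw new IllegalStateException("partition worker did not finish in time");
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        long total = 0;
        for (long c : counters) {
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        SerialPartitions db = new SerialPartitions(4);
        for (int i = 0; i < 1000; i++) {
            db.increment("user" + (i % 17));
        }
        System.out.println("total increments: " + db.shutdownAndSum());
    }
}
```

Work for different partitions runs in parallel, while work for one partition is strictly serial. A transaction that spans partitions would need coordination this sketch leaves out.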
Request execution
Access and Networking
● RESTful HTTP/JSON API
● Socket connections
● Java API
● JDBC
Is VoltDB fast enough?
● Sharding
● High-speed small transactions
● No journaling required
● Buffering is not required
Replication: K-Safety = 1
Replication: Active-Passive
K-Factor: performance
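The cost of K-safety can be estimated with simple arithmetic. The sketch assumes the rule from VoltDB's documentation that a K-factor of K keeps K+1 copies of every partition, so the number of unique partitions is (hosts × sites per host) / (K + 1). The class and parameter names are made up for the example.

```java
/** Illustrative only: how K-safety trades unique partitions for redundancy. */
final class KSafetyMath {

    /** Returns the number of unique partitions for a cluster with the given K-factor. */
    static int uniquePartitions(int hosts, int sitesPerHost, int kFactor) {
        if (hosts <= 0 || sitesPerHost <= 0 || kFactor < 0) {
            throw new IllegalArgumentException("hosts and sitesPerHost must be positive, kFactor non-negative");
        }
        if (hosts <= kFactor) {
            throw new IllegalArgumentException("need more hosts than the K-factor");
        }
        int totalSites = hosts * sitesPerHost;
        int copies = kFactor + 1; // every partition is stored K+1 times
        if (totalSites % copies != 0) {
            throw new IllegalArgumentException("total sites must be a multiple of K+1");
        }
        return totalSites / copies;
    }

    public static void main(String[] args) {
        // Hypothetical cluster: 3 hosts with 4 sites each.
        System.out.println("K=0: " + uniquePartitions(3, 4, 0) + " unique partitions");
        System.out.println("K=1: " + uniquePartitions(3, 4, 1) + " unique partitions");
    }
}
```

With 3 hosts and 4 sites per host, K=0 gives 12 unique partitions and K=1 gives 6. Half of the execution sites then do duplicate work in exchange for surviving the loss of one node.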
VoltDB tools
● csvloader - loads data from a CSV file
● exporttofile - exports to a CSV/TSV file
● sqlcmd - MySQL-like client
● voltadmin - administrative functions
● voltdb - catalog (DB) management and server
Questions?
Volt DB: Super Chat
Super Chat: flow
DB Schema file
CREATE TABLE messages (
    uid BIGINT NOT NULL,
    nick VARCHAR(64) NOT NULL,
    ip VARCHAR(16) NOT NULL,
    text VARCHAR(1024)
);
CREATE INDEX messages_idx ON messages (nick, ip);
PARTITION TABLE messages ON COLUMN nick;
CREATE PROCEDURE FROM CLASS vposter.procedures.AddMessage;
PARTITION PROCEDURE AddMessage ON TABLE messages COLUMN nick;
Simple Java Procedure
Putting It All Together
sh$ javac -cp "$CLASS_PATH:/opt/voltdb-3.7/voltdb/*" java/vposter/procedures/AddMessage.java
sh$ voltdb compile --classpath=./java/ -o voltdb-catalog.jar /path/to/schema/schema-ddl.sql
# Finally, run the server
sh$ voltdb create catalog voltdb-catalog.jar
Questions?
Tips and Tweaks
● Java stored procedures
● Schema file: restrictions and syntax
● Client code vs server code
● TPS: performance testing
● Deployment configuration
Java Stored procedures
● Minimize heavy math calculations, i.e. let the client do what it needs
● Minimize the SQL queue, i.e. passing lists/arrays as parameters is a real optimization
● Do not return huge chunks of data
● To Throw or Not To Throw
● Use statuses (application and response)
Schema file: restrictions and syntax
● Use comments
● Renaming columns/tables in DDL
● Constraint LIMIT PARTITION ROWS
● Standard constraints
● Views as a mechanism of aggregation
Client code vs Server code
● Let the server aggregate data, let the client process data
● Table-Of-Tables
● Optimize insertions
● Why are SELECT * requests dangerous?
TPS: Performance Testing
● @Statistics usage
● @Explain / @ExplainProc usage
● @SystemInformation
Questions?
References
● http://odbms.org/download/VoltDBTechnicalOverview.pdf
● http://www.mysqlperformanceblog.com/2011/02/28/is-voltdb-really-as-scalable-as-they-claim/
● http://voltdb.com
● http://techledger.wordpress.com/2011/07/08/voltdb-faq/
● http://highscalability.com/blog/2010/6/28/voltdb-decapitates-six-sql-urban-myths-and-delivers-internet.html
● http://www.perfdynamics.com/Manifesto/USLscalability.html#tth_sEc1
● [email protected]:vtorshyn/voltdb-shardit-src.git