hbase nosql
![Page 1: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/1.jpg)
HBase
Ryan Rawson, Sr Developer @ SU, HBase committer
June 11th, NOSQL
![Page 2: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/2.jpg)
Quick Backstory
• Needed large data store @ SU
• Started looking back in Jan '09
• Looked at the field of stores, tried:
  – Cassandra
  – Hypertable (fast)
  – HBase
• Ended up picking HBase
NOSQL Meetup
![Page 3: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/3.jpg)
Now
• Personally rewritten large portions of HBase for 0.20
  – Code easy to work with, understand, modify
• Recently voted to committer status (thanks!)
• Now giving presentations (hi!)
![Page 4: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/4.jpg)
Four Point Agenda
• What is HBase?
• Why HBase?
• HBase 0.20
• HBase at Stumbleupon
![Page 5: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/5.jpg)
What is HBase?
• Clone of Bigtable - http://labs.google.com/papers/bigtable.html
• Created originally at Powerset in 2007
• Hadoop subproject
  – The usual ASF things apply (license, JIRA, etc)
![Page 6: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/6.jpg)
What is HBase?
• Column-oriented semi-structured data store
• Distributed over many machines
  – Bigtable known to scale to >1000 nodes
• Tolerant of machine failure
• Layered over HDFS (& KFS)
• Strong consistency (important)
![Page 7: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/7.jpg)
Table & Regions
• Rows stored in byte-lexicographic sorted order
• Table dynamically split into "regions"
• Each region contains values [startKey, endKey)
• Regions hosted on a regionserver
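
To make the half-open range [startKey, endKey) concrete, here is a minimal sketch of mapping a row key to the region that holds it. It is a toy model, not HBase's own lookup code: the class `RegionLookup`, its methods, and the server names are hypothetical, and a plain `TreeMap` stands in for the region catalog.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.TreeMap;

// Illustrative only: a toy model of how a row key maps to a region.
class RegionLookup {
    // Region start keys, sorted in unsigned byte-lexicographic order.
    // Each region covers [startKey, nextStartKey); the first region starts at the empty key.
    private final TreeMap<byte[], String> regionsByStartKey =
            new TreeMap<byte[], String>((a, b) -> Arrays.compareUnsigned(a, b));

    void addRegion(String startKey, String server) {
        regionsByStartKey.put(bytes(startKey), server);
    }

    // The owning region is the one with the greatest start key <= rowKey.
    String serverFor(String rowKey) {
        return regionsByStartKey.floorEntry(bytes(rowKey)).getValue();
    }

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        RegionLookup table = new RegionLookup();
        table.addRegion("", "regionserver-1");   // ["", "g")
        table.addRegion("g", "regionserver-2");  // ["g", "p")
        table.addRegion("p", "regionserver-3");  // ["p", end of table)
        System.out.println(table.serverFor("apple"));
        System.out.println(table.serverFor("g"));
        System.out.println(table.serverFor("zebra"));
    }
}
```

Because the end key is exclusive, the row "g" belongs to the second region, not the first. Splitting a region only means adding one more start key to the map.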
![Page 8: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/8.jpg)
Table & Regions
![Page 9: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/9.jpg)
Column Storage
• In HBase, don’t think of a spreadsheet:
All columns same ‘size’ and present (as NULL)
![Page 10: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/10.jpg)
Column Storage
• Instead think of tags. Values any length, no predefined names or widths:
Column names carry info (just like tags)
![Page 11: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/11.jpg)
Column Families
• Table consists of 1+ "column families"
• Column family is unit of performance tuning
• Stored in separate set of files
• Column names scoped like so:
  – "family:qualifier"
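
The "tags, not spreadsheet" idea and the "family:qualifier" naming can be sketched with a sorted map per row. This is an illustration under stated assumptions, not how HBase stores data on disk: `SparseRow` and its methods are hypothetical names, and values are plain strings rather than byte arrays.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a sparse row whose columns are named "family:qualifier".
class SparseRow {
    // Only columns that were actually written exist; there are no NULL placeholders.
    private final TreeMap<String, String> cells = new TreeMap<>();

    void put(String family, String qualifier, String value) {
        cells.put(family + ":" + qualifier, value);
    }

    // All columns of one family, in sorted qualifier order.
    Map<String, String> family(String family) {
        // ';' is the ASCII character right after ':', so this half-open range
        // covers every name that starts with "family:".
        return cells.subMap(family + ":", family + ";");
    }

    public static void main(String[] args) {
        SparseRow row = new SparseRow();
        row.put("info", "name", "Ryan");
        row.put("tag", "nosql", "1");
        row.put("tag", "hbase", "1");
        // The qualifier itself carries information, like a tag.
        System.out.println(row.family("tag").keySet());
    }
}
```

A second row could carry completely different qualifiers in the `tag` family without any schema change, which is the point of the tag analogy.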
![Page 12: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/12.jpg)
Sorting
• Rows stored in byte-lexicographical order (row keys are raw bytes, not just strings)
• Furthermore, within a row, columns are stored in sorted order
• Fast, cheap, and easy to scan adjacent rows & columns
![Page 13: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/13.jpg)
Sorting (but there's more!)
• Not just scanning, but can do partial-key lookups
• When combined with compound keys, has the same properties as leading-left-edge indexes in standard RDBMS
  – (Except your index is distributed of course)
• Can use a second table to index a primary table.
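
A partial-key lookup over compound keys amounts to a range scan from the prefix up to the first key that no longer starts with it. The sketch below imitates that with a `TreeMap`; the `PrefixScan` class and the "userId/date" key layout are assumptions made for illustration, and the keys are assumed to be plain ASCII so that string order matches byte order.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative only: compound row keys of the form "<userId>/<date>" in a sorted map.
class PrefixScan {
    private final TreeMap<String, String> rows = new TreeMap<>();

    void put(String rowKey, String value) {
        rows.put(rowKey, value);
    }

    // Every row whose key starts with the given prefix, in key order.
    SortedMap<String, String> scanPrefix(String prefix) {
        // Stop key: the prefix followed by the largest char value, so all
        // ASCII keys that extend the prefix sort strictly before it.
        return rows.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    public static void main(String[] args) {
        PrefixScan table = new PrefixScan();
        table.put("user4/2009-01-01", "a");
        table.put("user42/2009-06-01", "b");
        table.put("user42/2009-06-11", "c");
        table.put("user43/2009-06-02", "d");
        // Leading part of the key only: all rows for user42.
        System.out.println(table.scanPrefix("user42/").keySet());
        // A longer prefix narrows the range further.
        System.out.println(table.scanPrefix("user42/2009-06-1").keySet());
    }
}
```

As with a left-edge index, only leading parts of the key can be used this way: finding all rows for a given date across users would need a second table keyed by date.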
![Page 14: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/14.jpg)
Values
• Row id, column name, value all byte[]
• Can store ASCII, any binary, or use serialization (eg: Thrift, protobuf)
• Atomic increments available
• Serialization good for structs that are always read in one unit (eg: address book entry)
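
As a sketch of the "serialize a struct into one value" approach, the example below packs an address book entry into a single byte array and reads it back. It uses the JDK's `DataOutputStream` as a stand-in for Thrift or protobuf; the `AddressEntry` class and its fields are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative only: a struct that is always read and written as one unit.
class AddressEntry {
    final String name;
    final String email;
    final int zip;

    AddressEntry(String name, String email, int zip) {
        this.name = name;
        this.email = email;
        this.zip = zip;
    }

    // Pack all fields into one byte[] that could be stored as a single cell value.
    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(name);
            out.writeUTF(email);
            out.writeInt(zip);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Read the fields back in the same order they were written.
    static AddressEntry fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            String name = in.readUTF();
            String email = in.readUTF();
            int zip = in.readInt();
            return new AddressEntry(name, email, zip);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] cell = new AddressEntry("Ada", "ada@example.com", 94107).toBytes();
        AddressEntry copy = AddressEntry.fromBytes(cell);
        System.out.println(copy.name + " " + copy.email + " " + copy.zip);
    }
}
```

The trade-off is that individual fields can no longer be read or updated on their own, which is why this suits structs that are always fetched whole.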
![Page 15: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/15.jpg)
Values & Versions
• Each row id + column – stored with timestamp
• HBase stores multiple versions
• Can be useful to recover data due to bugs!
• Use to detect write conflicts/collisions
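
The versioning idea can be pictured as a small timestamp-ordered map behind every cell. The following is a toy model only: `VersionedCell`, its methods, and the version limit passed to the constructor are hypothetical, and HBase's actual storage and retention rules are not reproduced here.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: one cell (row id + column) that keeps several timestamped versions.
class VersionedCell {
    private final int maxVersions;
    private final TreeMap<Long, String> versions = new TreeMap<>(); // timestamp -> value

    VersionedCell(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        // Keep only the newest maxVersions entries by dropping the oldest.
        while (versions.size() > maxVersions) {
            versions.pollFirstEntry();
        }
    }

    // Newest value, or null if nothing was written.
    String latest() {
        Map.Entry<Long, String> entry = versions.lastEntry();
        return entry == null ? null : entry.getValue();
    }

    // Value as it was at the given time, or null if no version is that old.
    String asOf(long timestamp) {
        Map.Entry<Long, String> entry = versions.floorEntry(timestamp);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        VersionedCell cell = new VersionedCell(3);
        cell.put(100L, "good value");
        cell.put(200L, "value written by a buggy job");
        System.out.println(cell.latest());
        // Recover what the cell held before the bad write.
        System.out.println(cell.asOf(150L));
    }
}
```

Reading "as of" an earlier timestamp is what makes recovery from a bad write possible, as long as the older version has not yet been trimmed.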
![Page 16: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/16.jpg)
API Example
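// Scan rows in [startRow, endRow), restricted to one column family.
Scan scan = new Scan(startRow, endRow).addFamily(Bytes.toBytes("default"));
ResultScanner scanner = table.getScanner(scan);
Result result;
while ((result = scanner.next()) != null) {
  Entity e = new Entity();
  // The family read here must be one that was added to the scan.
  dser.deserialize(e, result.getValue(Bytes.toBytes("default"), Bytes.toBytes("0")));
}
scanner.close();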
![Page 17: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/17.jpg)
Why HBase?
• Community is highly active, diverse, helpful
• User list email activity for May: 78 threads
• IRC channel #hbase highly active
• Helpful people in multiple timezones, email answered all hours of the day/night/weekend.
![Page 18: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/18.jpg)
Why HBase?
• Committer & contributor base broad:
  – PSet, Streamy, SU, Trend Micro, Openplaces, and more!
• No monopoly on experts – deep knowledge at these companies and more!
• (We’re really friendly… honest!)
![Page 19: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/19.jpg)
Why HBase?
• Used in production at many companies
• 12 companies listed on http://wiki.apache.org/hadoop/Hbase/PoweredBy
• Openplaces, Streamy, SU serve websites out of HBase
• Lots of experience to draw upon!
![Page 20: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/20.jpg)
Why HBase? (Features)
• Full web management/monitoring UI (master & regionservers)
• Push metrics to log files & Ganglia
• Rolling upgrades possible! (Including master!)
• Non-SQL shell – reinforces the non-SQL-ness of HBase
![Page 21: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/21.jpg)
HBase Features
• Easy integration with Hadoop MR – table input and output formats ship
• Cascading connectors for input and output
• Other ancillary open source activities around the edges (ORM, schema management, etc)
![Page 22: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/22.jpg)
Why HBase?
• But… HBase is slow!
• That metabrew/last.fm blog post said so!
  – (Also other people too…)
• “It’s much more than a KV store, but latency is too great to serve data to the website.”
• Answer: 0.20
![Page 23: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/23.jpg)
HBase 0.20
• Two major and exciting themes:
• #1: Performance
• #2: ZooKeeper integration, multiple masters
![Page 24: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/24.jpg)
HBase 0.20 vs 0.19

| | 0.19 | 0.20 |
|---|---|---|
| Master | Single master – if it fails, so does the cluster | Master election and membership via ZK |
| Compression | Not really | GZ, LZO |
| Memory usage | Small values cause big indexes and OOM | New file format limits index size (800kB for 10m entries) |
| Scan speed | 300-600ms per 500 rows | 20-30ms per 500 rows |
![Page 25: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/25.jpg)
Zookeeper?
• A highly available configuration storage system
• Set up in a 2N+1 quorum
• Hadoop subproject
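
The 2N+1 sizing follows from majority voting: an ensemble stays available only while more than half of its members are up. A minimal sketch of that arithmetic, with the hypothetical helper class `Quorum`:

```java
// Illustrative only: majority arithmetic behind the 2N+1 ensemble size.
class Quorum {
    // Smallest number of members that forms a strict majority.
    static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // How many members can fail while a majority is still reachable.
    static int tolerableFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    public static void main(String[] args) {
        for (int size = 3; size <= 6; size++) {
            System.out.println(size + " servers: majority " + majority(size)
                    + ", tolerates " + tolerableFailures(size) + " failure(s)");
        }
    }
}
```

An ensemble of 2N+1 servers tolerates N failures, and adding one more server to reach an even size does not raise that number, which is why odd sizes are the usual choice.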
![Page 26: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/26.jpg)
Master & Zookeeper
• Store membership info in ZK
• Detect dead servers (via ephemeral nodes)
• Master election and recovery
• Can kill master and cluster continues
• New master determines state and continues
![Page 27: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/27.jpg)
Performance
• Significant performance gains in 0.20
• New file format with zero-copy infrastructure
• Scan and get improvements
• LZO compression
• Block caching
• Speed increases as much as 30x!
![Page 28: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/28.jpg)
Performance
• 0.20 is not the final word on performance:
• Other RPC-related performance improvements
• Other Java-related improvements (G1?, 1.7?)
![Page 29: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/29.jpg)
Performance Numbers
• 1m rows, 1 column per row (~16 bytes)
  – Sequential insert: 24s, 0.024ms/row
  – Random read: 1.42ms/row (avg)
  – Full scan: 11s (117ms/10k rows)
• Performance under cache is very high:
  – 1ms to get single row
  – 20ms to read 550 rows
  – 75ms to get 5500 rows
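
The per-row figures above are simple divisions of total time by row count. The sketch below redoes that arithmetic; `PerRowCost` is a hypothetical helper, and the inputs are the rounded totals from the slide, so results can differ slightly from the quoted per-row values (a rounded 11s scan gives about 110ms per 10k rows, so the quoted 117ms presumably comes from an unrounded total).

```java
// Illustrative only: converting total run time into per-row cost.
class PerRowCost {
    // Average milliseconds spent per row.
    static double msPerRow(double totalSeconds, long rows) {
        return totalSeconds * 1000.0 / rows;
    }

    // Average milliseconds spent per batch of the given size.
    static double msPerBatch(double totalSeconds, long rows, long batchSize) {
        return msPerRow(totalSeconds, rows) * batchSize;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L;
        System.out.printf("sequential insert: %.3f ms/row%n", msPerRow(24, rows));
        System.out.printf("full scan: %.0f ms per 10k rows%n", msPerBatch(11, rows, 10_000));
        System.out.printf("cached scan: %.3f ms/row at 75 ms per 5500 rows%n", 75.0 / 5500);
    }
}
```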
![Page 30: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/30.jpg)
HBase at Stumbleupon
• Strong commitment to HBase @ SU
• Supports an HBase committer
• Looking to hire more HBase hackers
![Page 31: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/31.jpg)
Big accomplishments @ SU
• Over 9b small rows in single table
  – Sustained import performance
  – 3-4 days to import 9b rows (mysql limiting speed)
• 1.2m row reads/sec on 19 nodes (!!)
  – That is 60-100k reads/sec/node sustained, 2hrs
  – Scalable with more nodes
  – HBase has been improved since then
![Page 32: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/32.jpg)
Fast accomplishments @ SU
• Extremely high speed increments and writes
• Supports su.pr analytics
• Su.pr reads from HBase with no intervening caches
• Integrated with PHP
![Page 33: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/33.jpg)
HBase & PHP @ SU
• PHP access via Thrift gateway
• Easy (PHP) deployment with Thrift
• App developers like soft-schema, easy querying and writing
• Want to use HBase for more features and applications!
![Page 34: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/34.jpg)
HBase deployment trivia
• Nodes are 8x16 w/2TB (best price point)
  – Don't use RAID1. Use RAID0 or JBOD support
• Ganglia allows overall cluster performance monitoring
• Clusters won't span datacenters
  – We want fully duplicate data for DR anyways
• Update master with code & config
  – Rsync to other nodes (1 dir, very easy)
  – Controlled restart for rolling upgrade
![Page 35: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/35.jpg)
HBase deployment trivia
• HDFS – set xciever limit to 2048, Xmx2000m
  – Never get HDFS problems even under heavy load
• For 9b row import, randomized key insert order gives substantial speedup
• Give HBase enough RAM, you wouldn't starve mysql!
• Import speeds of 200k ops/sec on 19 machines possible!
  – Hard to provide a SQL-based source fast enough
  – 100k ops/sec typical for sustained
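
One plausible reason randomized insert order helps is that keys arriving in sorted order all land in the same region for long stretches, while shuffled keys spread each batch across many regions and therefore many servers. The sketch below counts regions touched per batch under that assumption; `ImportOrder`, the 10-region layout, and the batch size are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Illustrative only: how insert order affects the number of regions a batch touches.
class ImportOrder {
    // Toy layout: keys 0..999 split into 10 equal key ranges ("regions").
    static int regionOf(int key) {
        return key / 100;
    }

    // Number of distinct regions written to by the first batchSize keys.
    static int regionsTouched(List<Integer> keys, int batchSize) {
        Set<Integer> regions = new HashSet<>();
        for (int key : keys.subList(0, batchSize)) {
            regions.add(regionOf(key));
        }
        return regions.size();
    }

    static List<Integer> sortedKeys() {
        List<Integer> keys = new ArrayList<>();
        for (int key = 0; key < 1000; key++) {
            keys.add(key);
        }
        return keys;
    }

    // Same keys in a pseudo-random order; the seed only makes runs repeatable.
    static List<Integer> shuffledKeys(long seed) {
        List<Integer> keys = sortedKeys();
        Collections.shuffle(keys, new Random(seed));
        return keys;
    }

    public static void main(String[] args) {
        System.out.println("sorted:   " + regionsTouched(sortedKeys(), 100) + " region(s) in first batch");
        System.out.println("shuffled: " + regionsTouched(shuffledKeys(42L), 100) + " region(s) in first batch");
    }
}
```

In this toy layout a sorted import writes its whole first batch into one region, whereas the shuffled import spreads the same number of writes over several.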
![Page 36: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/36.jpg)
HBase deployment trivia
• Consider dual writes or logs to get HBase up to date but without actually moving your data
• Duplicate data in indexes (already done in mysql)
• Have to think about read patterns when designing table key order!
![Page 37: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/37.jpg)
HBase future @ SU
• Latency-sensitive cluster
• Batch/analytics cluster
• Use replication to keep latter up to date
• Allows batch jobs to go full throttle against reasonably up-to-date data without risking the website
![Page 38: Hbase Nosql](https://reader034.vdocuments.net/reader034/viewer/2022052215/547ec2d6b4af9fb9158b578a/html5/thumbnails/38.jpg)
Q&A
• Questions?
• Stumbleupon is hiring awesome HBase hackers!