Page 1: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Hive Evolution

Hadoop India Summit

February 2011

Namit Jain (Facebook)

Page 2: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Agenda

• Hive Overview
• Version 0.6 (released!)
• Version 0.7 (under development)
• Hive is now a TLP!
• Roadmaps

Page 3: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

What is Hive?

• A Hadoop-based system for querying and managing structured data
  – Uses Map/Reduce for execution
  – Uses Hadoop Distributed File System (HDFS) for storage

Page 4: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Hive Origins

• Data explosion at Facebook
• Traditional DBMS technology could not keep up with the growth
• Hadoop to the rescue!
• Incubation with ASF, then became a Hadoop sub-project
• Now a top-level ASF project

Page 5: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

SQL vs MapReduce

hive> select key, count(1) from kv1 where key > 100 group by key;

vs.

$ cat > /tmp/reducer.sh
uniq -c | awk '{print $2"\t"$1}'
$ cat > /tmp/map.sh
awk -F '\001' '{if($1 > 100) print $1}'
$ bin/hadoop jar contrib/hadoop-0.19.2-dev-streaming.jar -input /user/hive/warehouse/kv1 -mapper map.sh -file /tmp/reducer.sh -file /tmp/map.sh -reducer reducer.sh -output /tmp/largekey -numReduceTasks 1
$ bin/hadoop dfs -cat /tmp/largekey/part*

Page 6: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Hive Evolution

• Originally:
  – A way for Hadoop users to express queries in a high-level language without having to write map/reduce programs
• Now more and more:
  – A parallel SQL DBMS which happens to use Hadoop for its storage and execution architecture

Page 7: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Intended Usage

• Web-scale Big Data
  – 100’s of terabytes
• Large Hadoop cluster
  – 100’s of nodes (heterogeneous OK)
• Data has a schema
• Batch jobs
  – for both loads and queries

Page 8: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

So Don’t Use Hive If…

• Your data is measured in GB
• You don’t want to impose a schema
• You need responses in seconds
• A “conventional” analytic DBMS can already do the job
  – (and you can afford it)
• You don’t have a lot of time and smart people

Page 9: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Scaling Up

• Facebook warehouse, Jan 2011:
  – 2750 nodes
  – 30 petabytes disk space
• Data access per day:
  – ~40 terabytes added (compressed)
  – 25000 map/reduce jobs
• 300-400 users/month

Page 10: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Facebook Deployment

[Diagram; component labels: Web Servers, Scribe MidTier, Production Hive-Hadoop Cluster, Sharded MySQL, Scribe-Hadoop Clusters, Adhoc Hive-Hadoop Cluster, Hive Replication, Archival Hive-Hadoop Cluster]

Page 11: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

System Architecture

Page 12: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Data Model

Hive Entity        Sample Metastore Entity   Sample HDFS Location
Table              T                         /wh/T
Partition          date=d1                   /wh/T/date=d1
Bucketing column   userid                    /wh/T/date=d1/part-0000 … /wh/T/date=d1/part-1000 (hashed on userid)
External Table     extT                      /wh2/existing/dir (arbitrary location)
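
The bucketing entry in the data model can be read as a function from a row's userid to one file under the partition directory. The following is a minimal Python sketch of that idea, not Hive's implementation: it assumes the bucket is the non-negative hash of userid modulo the bucket count and that files are named part-NNNN. The function name `bucket_path` and the bucket count of 32 are illustrative assumptions.

```python
def bucket_path(table_dir: str, partition: str, userid: int, num_buckets: int) -> str:
    """Return the (hypothetical) file a row lands in for a table bucketed on userid."""
    # Assumption: integer keys hash to themselves; the mask keeps the value non-negative.
    bucket = (userid & 0x7FFFFFFF) % num_buckets
    # Assumption: bucket files are named part-NNNN inside the partition directory.
    return f"{table_dir}/{partition}/part-{bucket:04d}"


if __name__ == "__main__":
    for uid in (7, 39, 1000):
        print(uid, bucket_path("/wh/T", "date=d1", uid, 32))
```

Because every row with the same userid maps to the same file, a query that samples or joins on userid can read only the matching bucket files.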

Page 13: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Column Data Types

• Primitive Types
  • integer types, float, string, boolean
• Nest-able Collections
  • array<any-type>
  • map<primitive-type, any-type>
• User-defined types
  • structures with attributes which can be of any-type

Page 14: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Hive Query Language

• DDL
  – {create/alter/drop} {table/view/partition}
  – create table as select
• DML
  – Insert overwrite
• QL
  – Sub-queries in from clause
  – Equi-joins (including Outer joins)
  – Multi-table Insert
  – Sampling
  – Lateral Views
• Interfaces
  – JDBC/ODBC/Thrift

Page 15: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Query Translation Example

• SELECT url, count(*) FROM page_views GROUP BY url
• Map tasks compute partial counts for each URL in a hash table
  – “map side” pre-aggregation
  – map outputs are partitioned by URL and shipped to corresponding reducers
• Reduce tasks tally up partial counts to produce final results
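
The translation of the GROUP BY query can be imitated in a few lines of Python. This is only a single-process sketch of the three stages named on the slide (map-side pre-aggregation, partitioning by URL, reduce-side tallying); the function names and the choice of two reducers are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def map_task(urls):
    """Map side: pre-aggregate one input split into partial counts per URL."""
    return Counter(urls)


def shuffle(partials, num_reducers):
    """Route each (url, partial_count) pair to the reducer chosen by hashing the URL."""
    buckets = defaultdict(list)
    for partial in partials:
        for url, count in partial.items():
            buckets[hash(url) % num_reducers].append((url, count))
    return buckets


def reduce_task(pairs):
    """Reduce side: add up the partial counts received for each URL."""
    totals = Counter()
    for url, count in pairs:
        totals[url] += count
    return dict(totals)


def count_by_url(splits, num_reducers=2):
    """Run all three stages over a list of input splits."""
    partials = [map_task(split) for split in splits]
    result = {}
    for pairs in shuffle(partials, num_reducers).values():
        result.update(reduce_task(pairs))
    return result


if __name__ == "__main__":
    print(count_by_url([["/a", "/b", "/a"], ["/b", "/c"]]))
```

Pre-aggregating in the mapper means each mapper ships at most one pair per distinct URL instead of one pair per page view.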

Page 16: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

FROM (SELECT a.status, b.school, b.gender
      FROM status_updates a JOIN profiles b
      ON (a.userid = b.userid and a.ds='2009-03-20')) subq1
INSERT OVERWRITE TABLE gender_summary PARTITION(ds='2009-03-20')
  SELECT subq1.gender, COUNT(1) GROUP BY subq1.gender
INSERT OVERWRITE TABLE school_summary PARTITION(ds='2009-03-20')
  SELECT subq1.school, COUNT(1) GROUP BY subq1.school

Page 17: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

It Gets Quite Complicated!

Page 18: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Behavior Extensibility

• TRANSFORM scripts (any language)
  – Serialization+IPC overhead
• User defined functions (Java)
  – In-process, lazy object evaluation
• Pre/Post Hooks (Java)
  – Statement validation/execution
  – Example uses: auditing, replication, authorization, multiple clusters

Page 19: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Map/Reduce Scripts Examples

• add file page_url_to_id.py;
• add file my_python_session_cutter.py;
• FROM (SELECT TRANSFORM(user_id, page_url, unix_time)
        USING 'page_url_to_id.py'
        AS (user_id, page_id, unix_time)
        FROM mylog
        DISTRIBUTE BY user_id
        SORT BY user_id, unix_time) mylog2
  SELECT TRANSFORM(user_id, page_id, unix_time)
  USING 'my_python_session_cutter.py'
  AS (user_id, session_info);
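
A TRANSFORM script receives its input rows as tab-separated lines on standard input and writes tab-separated lines to standard output. The talk does not show the contents of `my_python_session_cutter.py`, so the following is only a guess at its shape: the 30-minute inactivity gap and the `start:end:page_count` encoding of session_info are invented for illustration. It relies on the rows arriving grouped by user and ordered by time, which is what DISTRIBUTE BY and SORT BY provide in the query.

```python
def cut_sessions(lines, gap_seconds=1800):
    """Yield 'user_id<TAB>session_info' rows from 'user_id<TAB>page_id<TAB>unix_time' lines.

    Input must be sorted by user_id, then unix_time.
    """
    current = None  # (user_id, start_time, last_time, page_count)
    for line in lines:
        user_id, _page_id, ts = line.rstrip("\n").split("\t")
        ts = int(ts)
        if current and current[0] == user_id and ts - current[2] <= gap_seconds:
            # Same user, still within the inactivity gap: extend the open session.
            current = (user_id, current[1], ts, current[3] + 1)
            continue
        if current:
            yield f"{current[0]}\t{current[1]}:{current[2]}:{current[3]}"
        current = (user_id, ts, ts, 1)
    if current:
        yield f"{current[0]}\t{current[1]}:{current[2]}:{current[3]}"


if __name__ == "__main__":
    sample = ["u1\t10\t100", "u1\t11\t200", "u1\t12\t5000", "u2\t10\t150"]
    for row in cut_sessions(sample):
        print(row)
```

When run under TRANSFORM, the loop at the bottom would iterate over `sys.stdin` instead of the sample list.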

Page 20: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

UDF vs UDAF vs UDTF

• User Defined Function
  • One-to-one row mapping
  • Concat(‘foo’, ‘bar’)
• User Defined Aggregate Function
  • Many-to-one row mapping
  • Sum(num_ads)
• User Defined Table Function
  • One-to-many row mapping
  • Explode([1,2,3])
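
The three function kinds differ only in how many rows go in and come out. A small Python analogy, using plain functions rather than Hive's Java UDF interfaces (the names below are illustrative, not Hive APIs):

```python
def udf_concat(a, b):
    """UDF shape: one input row produces one output value."""
    return a + b


def udaf_sum(values):
    """UDAF shape: many input rows produce one output value."""
    total = 0
    for value in values:
        total += value
    return total


def udtf_explode(array):
    """UDTF shape: one input row produces many output rows."""
    for element in array:
        yield (element,)


if __name__ == "__main__":
    print(udf_concat("foo", "bar"))
    print(udaf_sum([1, 2, 3]))
    print(list(udtf_explode([1, 2, 3])))
```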

Page 21: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

UDF Example

• add jar build/ql/test/test-udfs.jar;
• CREATE TEMPORARY FUNCTION testlength AS 'org.apache.hadoop.hive.ql.udf.UDFTestLength';
• SELECT testlength(src.value) FROM src;
• DROP TEMPORARY FUNCTION testlength;

• UDFTestLength.java:

package org.apache.hadoop.hive.ql.udf;

import org.apache.hadoop.hive.ql.exec.UDF;

public class UDFTestLength extends UDF {
  // Returns the length of the input string, or null for a null input.
  public Integer evaluate(String s) {
    if (s == null) {
      return null;
    }
    return s.length();
  }
}

Page 22: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Storage Extensibility

• Input/OutputFormat: file formats
  – SequenceFile, RCFile, TextFile, …
• SerDe: row formats
  – Thrift, JSON, ProtocolBuffer, …
• Storage Handlers (new in 0.6)
  – Integrate foreign metadata, e.g. HBase
• Indexing
  – Under development in 0.7

Page 23: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Release 0.6

• October 2010
  – Views
  – Multiple Databases
  – Dynamic Partitioning
  – Automatic Merge
  – New Join Strategies
  – Storage Handlers

Page 24: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Dynamic Partitions

Automatically create partitions based on distinct values in columns

INSERT OVERWRITE TABLE page_view PARTITION(dt='2008-06-08', country)
SELECT pvs.viewTime, pvs.userid, pvs.page_url, pvs.referrer_url, null, null, pvs.ip, pvs.country
FROM page_view_stg pvs
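
In this statement dt is a static partition value, while country is left open and is filled from the last column of the SELECT list (pvs.country). The following Python sketch shows the resulting routing of rows to partition directories. It is a simplified model under the assumption that each distinct country value becomes one `dt=.../country=...` directory; the function name and sample rows are hypothetical.

```python
from collections import defaultdict


def dynamic_partition(rows, static_dt):
    """Group rows into partition directories keyed by the trailing country column."""
    partitions = defaultdict(list)
    for row in rows:
        # The last column supplies the dynamic partition value and is not stored as data.
        *data, country = row
        partitions[f"dt={static_dt}/country={country}"].append(tuple(data))
    return dict(partitions)


if __name__ == "__main__":
    staged = [(1, "u1", "US"), (2, "u2", "IN"), (3, "u3", "US")]
    for directory, contents in dynamic_partition(staged, "2008-06-08").items():
        print(directory, contents)
```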

Page 25: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Automatic merge

• Jobs can produce many files
• Why is this bad?
  – Namenode pressure
  – Downstream jobs have to deal with file processing overhead
• So, clean up by merging results into a few large files (configurable)
  – Use conditional map-only task to do this
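
The "conditional" part of the merge step can be pictured as a check on the job's output files followed by a grouping pass. This Python sketch assumes a simple rule (merge only when the average output file is below a threshold, then pack files into groups up to a target size); the parameter names and the rule itself are illustrative assumptions, not Hive's actual settings or logic.

```python
def plan_merge(file_sizes, small_file_avg, target_size):
    """Return groups of file indexes to merge, or [] when no merge is needed."""
    if not file_sizes or sum(file_sizes) / len(file_sizes) >= small_file_avg:
        return []  # condition not met: skip the extra map-only task
    groups, current, current_size = [], [], 0
    for index, size in enumerate(file_sizes):
        # Close the current group once adding this file would exceed the target.
        if current and current_size + size > target_size:
            groups.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    groups.append(current)
    return groups


if __name__ == "__main__":
    print(plan_merge([10, 20, 30, 40], small_file_avg=50, target_size=60))
    print(plan_merge([100, 200], small_file_avg=50, target_size=60))
```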

Page 26: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Join Strategies

• Old Join Strategies
  – Map-reduce and Map Join
• Bucketed map-join
  – Allows “small” table to be much bigger
• Sort Merge Map Join
• Deal with skew in map/reduce join
  – Conditional plan step for skewed keys
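
A sort-merge join works when both inputs are already sorted on the join key: two cursors advance in step and neither side has to be loaded into a hash table. The sketch below is a generic single-process version of that technique in Python, not Hive's operator; the function name and the `(key, value)` row layout are assumptions.

```python
def sort_merge_join(left, right):
    """Inner-join two lists of (key, value) pairs that are already sorted by key."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of equal keys on the right, then pair it with each left row.
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            while i < len(left) and left[i][0] == left_key:
                out.extend((left_key, left[i][1], right[k][1]) for k in range(j, j_end))
                i += 1
            j = j_end
    return out


if __name__ == "__main__":
    users = [(1, "a"), (2, "b"), (2, "c"), (4, "d")]
    visits = [(2, "x"), (2, "y"), (3, "z"), (4, "w")]
    print(sort_merge_join(users, visits))
```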

Page 27: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Storage Handler Syntax

• HBase Example

CREATE TABLE users(userid int, name string, email string, notes string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = "small:name,small:email,large:notes")
TBLPROPERTIES (
  "hbase.table.name" = "user_list");

Page 28: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Release 0.7

• Deployed in Facebook
  – Stats Functions
  – Indexes
  – Local Mode
  – Automatic Map Join
  – Multiple DISTINCTs
  – Archiving
• In development
  – Concurrency Control
  – Stats Collection
  – J/ODBC Enhancements
  – Authorization
  – RCFile2
  – Partitioned Views
  – Security Enhancements

Page 29: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Statistical Functions

• Stats 101
  – Stddev, var, covar
  – Percentile_approx
• Data Mining
  – Ngrams, sentences (text analysis)
  – Histogram_numeric
• SELECT histogram_numeric(dob_year) FROM users GROUP BY relationshipstatus

Page 30: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Histogram query results

• “It’s complicated” peaks at 18-19, but lasts into late 40s!
• “In a relationship” peaks at 20
• “Engaged” peaks at 25
• Married peaks in early 30s
• More married than single at 28
• Only teenagers use widowed?

Page 31: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Pluggable Indexing

• Reference implementation
  – Index is stored in a normal Hive table
  – Compact: distinct block addresses
  – Partition-level rebuild
• Currently in R&D
  – Automatic use for WHERE, GROUP BY
  – New index types (e.g. bitmap, HBase)
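
One way to picture the compact index is as a table mapping each indexed value to the distinct blocks that contain it, so a filter on that value only needs to read those blocks. The Python sketch below models that idea with an in-memory dictionary. The `(value, file_name, block_offset)` row layout and the function names are assumptions for illustration, not the layout of Hive's index tables.

```python
from collections import defaultdict


def build_compact_index(rows):
    """Map each indexed value to the sorted, distinct (file, block offset) pairs holding it."""
    index = defaultdict(set)
    for value, file_name, block_offset in rows:
        index[value].add((file_name, block_offset))
    return {value: sorted(blocks) for value, blocks in index.items()}


def blocks_to_scan(index, value):
    """Return only the blocks a `WHERE col = value` filter would need to read."""
    return index.get(value, [])


if __name__ == "__main__":
    observed = [
        ("IN", "part-0000", 0), ("US", "part-0000", 0),
        ("IN", "part-0000", 0), ("IN", "part-0001", 4096),
    ]
    index = build_compact_index(observed)
    print(blocks_to_scan(index, "IN"))
```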

Page 32: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Local Mode Execution

• Avoids map/reduce cluster job latency
• Good for jobs which process small amounts of data
• Let Hive decide when to use it
  – set hive.exec.mode.local.auto=true;
• Or force its usage
  – set mapred.job.tracker=local;

Page 33: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Automatic Map Join

• Map-Join if small table fits in memory
  – If it can’t, fall back to reduce join
• Optimize hash table data structures
• Use distributed cache to push out pre-filtered lookup table
  – Avoid swamping HDFS with reads from thousands of mappers
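
The choice between the two join paths can be sketched as follows. This Python model uses a row-count limit as a stand-in for "fits in memory", which is an assumption; it says nothing about how Hive measures memory or performs the fallback. All function names are illustrative.

```python
from collections import defaultdict


def map_join(small_rows, big_rows):
    """Build an in-memory hash table from the small table and stream the big table past it."""
    lookup = defaultdict(list)
    for key, value in small_rows:
        lookup[key].append(value)
    return [(key, big, small) for key, big in big_rows for small in lookup.get(key, [])]


def reduce_join(small_rows, big_rows):
    """Stand-in for the shuffle join: group both sides by key, then pair rows per key."""
    groups = defaultdict(lambda: ([], []))
    for key, value in small_rows:
        groups[key][0].append(value)
    for key, value in big_rows:
        groups[key][1].append(value)
    return [(key, big, small)
            for key, (smalls, bigs) in groups.items()
            for big in bigs for small in smalls]


def auto_join(small_rows, big_rows, max_rows_in_memory):
    """Use the map join when the small side is within the limit, else fall back."""
    if len(small_rows) <= max_rows_in_memory:
        return "map-join", map_join(small_rows, big_rows)
    return "reduce-join", reduce_join(small_rows, big_rows)


if __name__ == "__main__":
    profiles = [(1, "a"), (2, "b")]
    events = [(1, "x"), (3, "y"), (1, "z")]
    print(auto_join(profiles, events, max_rows_in_memory=10))
    print(auto_join(profiles, events, max_rows_in_memory=1))
```

Both paths return the same rows; the map join simply avoids the shuffle when the lookup side is small.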

Page 34: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Multiple DISTINCT Aggs

• Example

SELECT
  view_date,
  COUNT(DISTINCT userid),
  COUNT(DISTINCT page_url)
FROM page_views
GROUP BY view_date

Page 35: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Archiving

• Use HAR (Hadoop archive format) to combine many files into a few
• Relieves namenode memory

ALTER TABLE page_views {ARCHIVE|UNARCHIVE} PARTITION (ds='2010-10-30')

Page 36: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Concurrency Control

• Pluggable distributed lock manager
  – Default is Zookeeper-based
• Simple read/write locking
• Table-level and partition-level
• Implicit locking (statement level)
  – Deadlock-free via lock ordering
• Explicit LOCK TABLE (global)
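
Lock ordering avoids deadlock because every statement acquires the locks it needs in the same global order, so no two statements can each hold a lock the other is waiting for. The Python sketch below shows the general technique with in-process exclusive locks. It is a simplification under stated assumptions: it does not model read/write modes or a ZooKeeper-based lock manager, and the names are hypothetical.

```python
import threading
from contextlib import contextmanager

_locks = {}
_registry_guard = threading.Lock()


def _lock_for(name):
    """Return the single lock object associated with a table or partition name."""
    with _registry_guard:
        return _locks.setdefault(name, threading.Lock())


def lock_order(names):
    """Global acquisition order: sorted, de-duplicated object names."""
    return sorted(set(names))


@contextmanager
def statement_locks(names):
    """Acquire every lock a statement needs in the global order, release in reverse."""
    ordered = [_lock_for(name) for name in lock_order(names)]
    for lock in ordered:
        lock.acquire()
    try:
        yield
    finally:
        for lock in reversed(ordered):
            lock.release()


if __name__ == "__main__":
    # Two statements that name the same objects in opposite order still lock them identically.
    print(lock_order(["t2/ds=1", "t1"]), lock_order(["t1", "t2/ds=1"]))
    with statement_locks(["t2/ds=1", "t1"]):
        print("locks held")
```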

Page 37: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Statistics Collection

• Implicit metastore update during load
  – Or explicit via ANALYZE TABLE
• Table/partition-level
  – Number of rows
  – Number of files
  – Size in bytes

Page 38: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Hive is now a TLP

• PMC
  – Namit Jain (chair)
  – John Sichi
  – Zheng Shao
  – Edward Capriolo
  – Raghotham Murthy
• Committers
  – Amareshwari Sriramadasu
  – Carl Steinbach
  – Paul Yang
  – He Yongqiang
  – Prasad Chakka
  – Joydeep Sen Sarma
  – Ashish Thusoo
  – Ning Zhang

Page 39: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Developer Diversity

• Recent Contributors
  – Facebook, Yahoo, Cloudera
  – Netflix, Amazon, Media6Degrees, Intuit, Persistent Systems
  – Numerous research projects
  – Many many more…
• Monthly San Francisco bay area contributor meetups
• India meetups ?

Page 40: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Roadmap: Heavy-Duty Tests

• Unit tests are insufficient
• What is needed:
  – Real-world schemas/queries
  – Non-toy data scales
  – Scripted setup; configuration matrix
  – Correctness/performance verification
  – Automatic reports: throughput, latency, profiles, coverage, perf counters…

Page 41: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Roadmap: Shared Test Site

• Nightly runs, regression alerting
• Performance trending
• Synthetic workload (e.g. TPC-H)
• Real-world workload (anonymized?)
• This is critical for
  – Non-subjective commit criteria
  – Release quality

Page 42: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Roadmap: New Features

• Hive Server Stability/Deployment
• File Concatenation
  – Reduce Number of Files
• Performance
  – Bloom Filters
  – Push Down Filters
• Cost Based Optimizer
  – Column Level Statistics
  – Plan should be based on Statistics

Page 43: Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain

Resources

• http://hive.apache.org
• user/[email protected][email protected]
• Questions?