Hoya for Code Review

DESCRIPTION
An iteration of the Hoya slides, put up as part of a review of the code with others writing YARN services. It looks at what Hoya offers, what we need from apps to be able to deploy them this way, and what we need from YARN.

TRANSCRIPT
![Page 1: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/1.jpg)
© Hortonworks Inc. 2013
Hoya: HBase on YARN
Steve Loughran & Devaraj Das
{stevel, ddas} at hortonworks.com
@steveloughran, @ddraj
August 2013
![Page 2: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/2.jpg)
Hadoop as Next-Gen Platform

HADOOP 1.0 – single-use system (batch apps):
• HDFS (redundant, reliable storage)
• MapReduce (cluster resource management & data processing)

HADOOP 2.0 – multi-purpose platform (batch, interactive, online, streaming, …):
• HDFS2 (redundant, reliable storage)
• YARN (cluster resource management)
• MapReduce and others (data processing)
![Page 3: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/3.jpg)
YARN: Taking Hadoop Beyond Batch

Applications run natively IN Hadoop:
• HDFS2 (redundant, reliable storage)
• YARN (cluster resource management)
• BATCH (MapReduce)
• INTERACTIVE (Tez)
• STREAMING (Storm, S4, Samza, …)
• GRAPH (Giraph)
• IN-MEMORY (Spark)
• HPC MPI (OpenMPI)
• OTHER (Search, Weave, …)

Store ALL DATA in one place…
Interact with that data in MULTIPLE WAYS
with predictable performance and quality of service
![Page 4: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/4.jpg)
And HBase?

The same stack, with HBase as another YARN application:
• HDFS2 (redundant, reliable storage)
• YARN (cluster resource management)
• BATCH (MapReduce), INTERACTIVE (Tez), STREAMING (Storm, S4, …), GRAPH (Giraph), IN-MEMORY (Spark), HPC MPI (OpenMPI), OTHER (Search, Weave, …)
• HBase
![Page 5: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/5.jpg)
![Page 6: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/6.jpg)
Hoya: On-demand HBase clusters
1. Small HBase cluster in large YARN cluster
2. Dynamic HBase clusters
3. Self-healing HBase Cluster
4. Elastic HBase clusters
5. Transient/intermittent clusters for workflows
6. Custom versions & configurations
7. More efficient utilization/sharing of cluster
![Page 7: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/7.jpg)
Goal: No code changes in HBase

• Today: none needed
  HBase 0.95.2: $ mvn install -Dhadoop.version=2.0
• But we'd like:
  – ZK reporting of web UI ports
  – Allocation of tables in a RegionServer to be block-location aware
  – A way to get from a failed RS to its YARN container (a configurable ID is enough)
![Page 8: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/8.jpg)
Hoya – the tool

• Hoya (HBase On YArn)
  – Java tool
  – Completely CLI driven
• Input: cluster description as JSON
  – Specification of cluster: node options, ZK params
  – Configuration generated
  – Entire state persisted
• Actions: create, freeze/thaw, flex, exists <cluster>
• Can change cluster state later
  – Add/remove nodes, started/stopped states
![Page 9: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/9.jpg)
YARN manages the cluster

[Diagram: a cluster of nodes, each running HDFS; most run a YARN Node Manager, one runs the YARN Resource Manager]

• Servers run YARN Node Managers
• NMs heartbeat to the Resource Manager
• RM schedules work over the cluster
• RM allocates containers to apps
• NMs start containers
• NMs report container health
![Page 10: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/10.jpg)
Hoya Client creates App Master

[Diagram: the Hoya Client asks the YARN Resource Manager to start the Hoya AM in a container on one of the Node Manager nodes]
![Page 11: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/11.jpg)
AM deploys HBase with YARN

[Diagram: the Hoya AM (co-located with the HBase Master) requests containers; HBase Region Servers start on the Node Manager nodes]
![Page 12: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/12.jpg)
HBase & clients bind via ZooKeeper

[Diagram: an HBase client binds to the HBase Master and Region Servers via ZooKeeper, not via YARN]
![Page 13: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/13.jpg)
YARN notifies AM of failures

[Diagram: a node running an HBase Region Server fails; YARN notifies the Hoya AM, which requests a replacement container and restarts the Region Server elsewhere]
![Page 14: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/14.jpg)
HOYA – cool bits

• Cluster specification stored as JSON in HDFS
• Conf dir cached, dynamically patched before being pushed up as local resources for master & region servers
• HBase .tar file stored in HDFS – clusters can use the same/different HBase versions
• Handling of cluster flexing is the same code as unplanned container loss
• No Hoya code on region servers
![Page 15: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/15.jpg)
HOYA – AM RPC API

```java
// shut down
public void stopCluster();

// change the number of worker nodes in the cluster
public boolean flexNodes(int workers);

// get a JSON description of the live cluster
public String getClusterStatus();
```
![Page 16: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/16.jpg)
Flexing/failure handling is the same code

```java
public boolean flexNodes(int workers) throws IOException {
  log.info("Flexing cluster count from {} to {}", numTotalContainers, workers);
  if (numTotalContainers == workers) {
    // no-op
    log.info("Flex is a no-op");
    return false;
  }
  // update the number of workers
  numTotalContainers = workers;
  // ask for more containers if needed
  reviewRequestAndReleaseNodes();
  return true;
}
```
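The pattern behind flexNodes() can be sketched standalone: a single review step reconciles desired and actual container counts, so a planned flex and an unplanned container loss converge through the same code path. This is an illustrative sketch under assumed names (ContainerTracker, review), not Hoya's actual classes.

```java
// Minimal sketch of the flex/failure-handling convergence pattern.
// ContainerTracker and its methods are hypothetical stand-ins, not Hoya code.
class ContainerTracker {
    private int desired;
    private int allocated;

    ContainerTracker(int desired) {
        this.desired = desired;
        review();
    }

    // flex: change the desired worker count, then converge
    boolean flex(int newDesired) {
        if (newDesired == desired) {
            return false; // no-op, as in flexNodes()
        }
        desired = newDesired;
        review();
        return true;
    }

    // an unplanned container loss takes the same convergence path
    void onContainerLost() {
        allocated--;
        review();
    }

    // reconcile actual vs desired (stand-in for reviewRequestAndReleaseNodes)
    private void review() {
        while (allocated < desired) allocated++; // request a container
        while (allocated > desired) allocated--; // release a container
    }

    int allocated() {
        return allocated;
    }
}
```

Because review() only looks at the current gap, no separate recovery logic is needed: losing a container simply widens the gap that the next review closes.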
![Page 17: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/17.jpg)
Cluster Specification: persistent & wire

```json
{
  "version" : "1.0",
  "name" : "cl1",
  "type" : "hbase",
  "state" : 1,
  "createTime" : 1377276630308,
  "originConfigurationPath" : "hdfs://ubuntu:9000/user/stevel/.hoya/cluster/cl1/original",
  "generatedConfigurationPath" : "hdfs://ubuntu:9000/user/stevel/.hoya/cluster/cl1/generated",
  "zkHosts" : "localhost",
  "zkPort" : 2181,
  "zkPath" : "/yarnapps_hoya_stevel_cl1",
  "hbaseDataPath" : "hdfs://ubuntu:9000/user/stevel/.hoya/cluster/cl1/hbase",
  "imagePath" : "hdfs://ubuntu:9000/hbase.tar",
  "options" : { "hoya.test" : "true" },
  ...
}
```
![Page 18: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/18.jpg)
Role Specifications

```json
"roles" : {
  "worker" : {
    "yarn.memory" : "256",
    "role.instances" : "5",
    "role.name" : "worker",
    "jvm.heapsize" : "256",
    "yarn.vcores" : "1",
    "app.infoport" : "0",
    "env.MALLOC_ARENA_MAX" : "4"
  },
  "master" : {
    "yarn.memory" : "128",
    "role.instances" : "1",
    "role.name" : "master",
    "jvm.heapsize" : "128",
    "yarn.vcores" : "1",
    "app.infoport" : "8585"
  }
}
```
![Page 19: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/19.jpg)
Current status

• Able to create & stop on-demand HBase clusters
  – RegionServer failures handled
• Able to specify a specific HBase configuration: hbase-home or .tar.gz
• Cluster stop, restart, flex
• Get the (dynamic) conf as XML or properties
![Page 20: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/20.jpg)
Ongoing

• Multiple roles: worker, master, monitor
    --role worker --roleopts worker yarn.vcores 2
• Multiple providers: HBase + others
  – client side: preflight, configuration patching
  – server side: starting roles, liveness
• Liveness probes: HTTP GET, RPC port, RPC op?
• What do we need in YARN for production?
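The simplest of the liveness probes listed above, the RPC-port check, can be sketched as a timed TCP connect: the role instance counts as live if something accepts a connection on its port. This is an illustrative sketch, not Hoya's implementation; PortProbe is a hypothetical name.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical sketch of an RPC-port liveness probe: the role instance is
// considered live if something accepts TCP connections on its port.
public class PortProbe {
    public static boolean isLive(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;  // connection accepted: the process is up
        } catch (IOException e) {
            return false; // refused or timed out: treat as dead
        }
    }
}
```

An HTTP GET probe would go a step further and check the status code from the role's web UI, catching processes that are up but wedged; an RPC-op probe would exercise the application protocol itself.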
![Page 21: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/21.jpg)
Requirements of an App: MUST

• Install from a tarball; run as a normal user
• Deploy/start without human intervention
• Pre-configurable, static instance config data
• Support dynamic discovery/binding of peers
• Co-exist with other app instances in the cluster/on the nodes
• Handle co-located role instances
• Persist data to HDFS
• Support 'kill' as a shutdown option
• Handle failed role instances
• Support role instances moving after failure
![Page 22: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/22.jpg)
Requirements of an App: SHOULD

• Be configurable by Hadoop XML files
• Publish dynamically assigned web UI & RPC ports
• Support cluster flexing up/down
• Support an API to determine role instance status
• Make it possible to determine the role instance ID from the app
• Support simple remote liveness probes
![Page 23: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/23.jpg)
YARN-896: long-lived services
1. Container reconnect on AM restart
2. Token renewal on long-lived apps
3. Containers: signalling, >1 process sequence
4. AM/RM managed gang scheduling
5. Anti-affinity hint in container requests
6. Service Registry - ZK?
7. Logging
All post Hadoop-2.1
![Page 24: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/24.jpg)
SLAs & co-existence with MapReduce

1. Make IO bandwidth/IOPS a resource used in scheduling & limits
2. Need to monitor what's going on w.r.t. IO & network load from containers, apps, and queues
3. Dynamic adaptation of cgroup HDD, net, and RAM limits
![Page 25: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/25.jpg)
Hoya needs a home!
https://github.com/hortonworks/hoya
![Page 26: Hoya for Code Review](https://reader035.vdocuments.net/reader035/viewer/2022062512/554a1e8db4c905825d8b5613/html5/thumbnails/26.jpg)
Questions?
hortonworks.com