hops - distributed metadata for hadoop

Hops – Distributed MetaData for Hadoop. Jim Dowling, Associate Prof @ KTH, Senior Researcher @ SICS, CEO @ Hops AB. BDOOP Meetup, Hadoop Summit, Dublin, 12th April 2016. www.hops.io @hopshadoop

Page 1: Hops - Distributed metadata for Hadoop

Hops – Distributed MetaData for Hadoop

Jim Dowling Associate Prof @ KTH

Senior Researcher @ SICS

CEO @ Hops AB

BDOOP Meetup, Hadoop Summit, Dublin, 12th April 2016

www.hops.io @hopshadoop

Page 2: Hops - Distributed metadata for Hadoop

MetaData Services in Hadoop

Page 3: Hops - Distributed metadata for Hadoop

Metadata Totem Poles in Hadoop

Eventual Consistency

Page 4: Hops - Distributed metadata for Hadoop

With Many Hadoop Clusters

[Diagram labels: Cluster 1 … Cluster N, each with a MetaDataService; MetaData Service (Aggregator)]

MetaData consistency protocols have O(N) operational complexity.

Page 5: Hops - Distributed metadata for Hadoop

Case Study: Access Control as a MetaData Service

Page 6: Hops - Distributed metadata for Hadoop

Access Control in Relational Databases

# Multi-tenancy for alice and bob on db1 and db2
grant all privileges on db1.* to 'alice'@'%';
grant all privileges on db2.* to 'bob'@'%';

# More fine-grained privileges
grant SELECT on db2.sensitiveTable to 'alice'@'192.168.1.2';

Databases ensure the consistency of security and policies using foreign keys.

“drop table db2.sensitiveTable” => delete associated privileges

Page 7: Hops - Distributed metadata for Hadoop

Access Control in Hadoop: Apache Sentry

How do you ensure the consistency of the policies and the data?

[Mujumdar’15]

Page 8: Hops - Distributed metadata for Hadoop

Policy Editor for Sentry

Administrators administer privileges for users

Page 9: Hops - Distributed metadata for Hadoop

Talk Overview

• Our Story of Distributed Metadata for Hadoop

• Metadata at work in Hops: Multi-tenancy

• Metadata at work in Hops: HopsWorks

Page 10: Hops - Distributed metadata for Hadoop

Bill Gates’ biggest product regret?*

Page 11: Hops - Distributed metadata for Hadoop

Windows Future Storage (WinFS*)

*http://www.zdnet.com/article/bill-gates-biggest-microsoft-product-regret-winfs/

Page 12: Hops - Distributed metadata for Hadoop

HDFS v2

[Diagram labels: HDFS Client, ActiveNameNode, StandbyNameNode, SnapshotNode, Journal Nodes, Zookeeper, DataNodes]

Asynchronous Replication of NN Log
Agreement on the Active NameNode
Faster Recovery - Cut the NN Log

Page 13: Hops - Distributed metadata for Hadoop

Max Pause times for NameNode Heap Sizes*

[Figure: Max Pause-Times (ms, axis ticks 10 to 10000) against JVM Heap Size (GB, 25 to 100); series: Unoptimized, Optimized]

*OpenJDK or Oracle JVM

Page 14: Hops - Distributed metadata for Hadoop

NameNode and Decreasing Memory Costs

[Figure: Size (GB, 0 to 1000) by Year (2015 to 2019); series: Projected Max NameNode JVM Heap Size, Size of RAM in a COTS $7,000 Rack Server]

Page 15: Hops - Distributed metadata for Hadoop

Externalizing the NameNode State

• Problem: NameNode not scaling up with lower RAM prices

• Solution: Move the metadata off the JVM Heap

• Move it where? An in-memory storage system that can be efficiently queried and managed. Preferably Open-Source.

• MySQL Cluster (NDB)

Page 16: Hops - Distributed metadata for Hadoop

HopsFS Architecture

[Diagram labels: HDFS Client, HopsFS Client, Load Balancer, NameNodes, Leader, NDB, DataNodes]

Page 17: Hops - Distributed metadata for Hadoop

Pluggable DBs: Data Abstraction Layer (DAL)

NameNode (Apache v2)

DAL API (Apache v2)

NDB-DAL-Impl (GPL v2)

Other DB (Other License)

hops-2.5.0.jar dal-ndb-2.5.0-7.4.7.jar
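The layering on this slide can be illustrated with a minimal sketch. Everything named here (`InodeStore`, `InMemoryInodeStore`, `mkdir`) is hypothetical and is not the Hops DAL API; the point is only that NameNode-side logic depends on an interface, so a database-specific implementation can be shipped separately.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional, Tuple


class InodeStore(ABC):
    """Hypothetical DAL interface: the only thing NameNode code depends on."""

    @abstractmethod
    def put_inode(self, parent_id: int, name: str, inode_id: int) -> None: ...

    @abstractmethod
    def get_inode(self, parent_id: int, name: str) -> Optional[int]: ...


class InMemoryInodeStore(InodeStore):
    """Stand-in backend; a real one would wrap a database client library."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[int, str], int] = {}

    def put_inode(self, parent_id: int, name: str, inode_id: int) -> None:
        self._rows[(parent_id, name)] = inode_id

    def get_inode(self, parent_id: int, name: str) -> Optional[int]:
        return self._rows.get((parent_id, name))


def mkdir(store: InodeStore, parent_id: int, name: str, new_id: int) -> bool:
    """File-system logic written purely against the interface."""
    if store.get_inode(parent_id, name) is not None:
        return False  # the name already exists under this parent
    store.put_inode(parent_id, name, new_id)
    return True
```

Swapping the backend then means passing a different `InodeStore` implementation to `mkdir`, with no change to the calling code.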

Page 18: Hops - Distributed metadata for Hadoop

The Global Lock in the NameNode

Page 19: Hops - Distributed metadata for Hadoop

Apache NameNode Internals

Client: mkdir, getblocklocations, createFile, …

[Diagram labels: Client; NameNode; Listener (NIO Thread); ConnectionList; Reader1 … ReaderN; Call Queue; Handler1 … HandlerM; Meta Data & In-Memory EditLog; FSNameSystem Lock; EditLog Buffer; flush; ackIds; Done RPCs; Responder (NIO Thread); Journal Nodes with EditLog1, EditLog2, EditLog3]

dfs.namenode.service.handler.count (default 10)

ipc.server.read.threadpool.size (default 1)

Page 20: Hops - Distributed metadata for Hadoop

HopsFS NameNode Internals

Client: mkdir, getblocklocations, createFile, …

[Diagram labels: Client; NameNode; Listener (NIO Thread); ConnectionList; Reader1 … ReaderN; Call Queue; Handler1 … HandlerM; DAL API; DAL-Impl; NDB tables inodes, block_infos, replicas, leases, …; ackIds; Done RPCs; Responder (NIO Thread); the label "HARD PART"]

dfs.namenode.service.handler.count (default 10)

ipc.server.read.threadpool.size (default 1)

Page 21: Hops - Distributed metadata for Hadoop

Consistency: Transactions & Implicit Locking

• Serializable FS ops using implicit locking of subtrees.

[Hakimzadeh, Peiro, Dowling, "Scaling HDFS with a Strongly Consistent Relational Model for Metadata", DAIS 2014]

Page 22: Hops - Distributed metadata for Hadoop

Preventing Deadlock and Starvation

• Acquire FS locks in agreed order using FS Hierarchy.
• Block-level operations follow the same agreed order.
• No cycles => Freedom from deadlock
• Pessimistic Concurrency Control ensures progress

[Diagram labels: /user/jim/myFile; operations mv, read, block_report; Client, Client, DataNode, NameNode]
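The agreed-order rule on this slide can be sketched in a few lines. This is an illustration under assumptions, not the HopsFS locking code: `lock_order` and `LockTable` are hypothetical names, locks are plain in-process mutexes rather than database row locks, and the order used is (path depth, path name) so that ancestors always sort before descendants.

```python
import threading
from typing import Callable, Dict, Iterable, List, TypeVar

T = TypeVar("T")


def lock_order(paths: Iterable[str]) -> List[str]:
    """One global order for all operations: shallower paths first, then by name."""
    return sorted(set(paths), key=lambda p: (p.count("/"), p))


class LockTable:
    """Hypothetical per-path lock table (in-process stand-in for row locks)."""

    def __init__(self) -> None:
        self._locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()

    def _lock_for(self, path: str) -> threading.Lock:
        with self._guard:
            return self._locks.setdefault(path, threading.Lock())

    def run_locked(self, paths: Iterable[str], operation: Callable[[], T]) -> T:
        """Take every lock in the agreed order, run the operation, release."""
        held: List[threading.Lock] = []
        try:
            for path in lock_order(paths):
                lock = self._lock_for(path)
                lock.acquire()
                held.append(lock)
            return operation()
        finally:
            for lock in reversed(held):
                lock.release()
```

Because every caller sorts its paths the same way, two operations that touch the same pair of paths can never each hold one lock while waiting for the other, which is the "no cycles" argument on the slide.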

Page 23: Hops - Distributed metadata for Hadoop

Per Transaction Cache

• Reusing the HDFS codebase resulted in too many roundtrips to the database per transaction.

• Cache intermediate transaction results at NameNodes.
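A rough sketch of the idea, assuming a key-value view of the database: `CountingDB` and `Transaction` are hypothetical stand-ins, not Hops classes. Reads and writes inside one transaction are served from a local cache, and changes reach the database in a single batch at commit.

```python
from typing import Any, Dict, Hashable


class CountingDB:
    """Stand-in database that counts round trips (hypothetical)."""

    def __init__(self, rows: Dict[Hashable, Any]) -> None:
        self.rows = dict(rows)
        self.round_trips = 0

    def read(self, key: Hashable) -> Any:
        self.round_trips += 1
        return self.rows.get(key)

    def write_batch(self, changes: Dict[Hashable, Any]) -> None:
        self.round_trips += 1
        self.rows.update(changes)


class Transaction:
    """Keeps rows read or written during one transaction in a local cache."""

    def __init__(self, db: CountingDB) -> None:
        self._db = db
        self._cache: Dict[Hashable, Any] = {}
        self._dirty: Dict[Hashable, Any] = {}

    def read(self, key: Hashable) -> Any:
        if key not in self._cache:           # only the first read hits the DB
            self._cache[key] = self._db.read(key)
        return self._cache[key]

    def write(self, key: Hashable, value: Any) -> None:
        self._cache[key] = value             # later reads see the new value
        self._dirty[key] = value

    def commit(self) -> None:
        if self._dirty:                      # one batched round trip
            self._db.write_batch(self._dirty)
        self._dirty = {}
```

The cache lives only as long as the transaction object, so it never has to be invalidated across transactions.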

Page 24: Hops - Distributed metadata for Hadoop

Sometimes, Transactions Just ain’t Enough

• Large Subtree Operations (delete, mv, set-quota) can’t always be executed in a single Transaction.

• 4-phase Protocol
• Isolation and Consistency
• Aggressive batching
• Transparent failure handling
• Failed ops retried on new NN.
• Lease timeout for failed clients.

Page 25: Hops - Distributed metadata for Hadoop

Leader Election using NDB

• Leader to coordinate replication/lease management
• NDB as shared memory for Leader Election of NN.

• No more Zookeeper, yay!

[Niazi, Berthou, Ismail, Dowling, "Leader Election in a NewSQL Database", DAIS 2015]
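To make "a database as shared memory for leader election" concrete, here is one generic heartbeat-counter scheme. It is an assumption for illustration only and is not claimed to be the protocol from the cited paper: `SharedTable` stands in for a transactional table, each node bumps its own counter every round, a peer whose counter has not moved since the previous round is treated as failed, and the lowest live id is the leader.

```python
from typing import Dict


class SharedTable:
    """Stand-in for a transactional table shared by all NameNodes (hypothetical)."""

    def __init__(self) -> None:
        self.counters: Dict[int, int] = {}


class Candidate:
    """One node taking part in the election."""

    def __init__(self, node_id: int, table: SharedTable) -> None:
        self.node_id = node_id
        self._table = table
        self._last_seen: Dict[int, int] = {}

    def heartbeat(self) -> int:
        """Run one round and return the id this node believes is the leader."""
        rows = self._table.counters
        rows[self.node_id] = rows.get(self.node_id, 0) + 1
        # A peer counts as alive if its counter changed since our last round,
        # or if we are seeing it for the first time.
        alive = [
            nid for nid, counter in rows.items()
            if nid == self.node_id or self._last_seen.get(nid) != counter
        ]
        self._last_seen = dict(rows)
        return min(alive)
```

In a real deployment each round would run inside a database transaction and the round interval would bound failure-detection time; both are left out of this sketch.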

Page 26: Hops - Distributed metadata for Hadoop

Path Component Caching

• Path of length N needs O(N) round-trips to resolve
• With our cache, O(1) round-trip for a cache hit

Without the cache, resolving /user/jim/myFile (NameNode to NDB):
getInode(0, “user”)
getInode(1, “jim”)
getInode(2, “myFile”)

With the cache, resolving /user/jim/myFile:
NameNode to Cache: getInodes(“/user/jim/myFile”)
NameNode to NDB: validateInodes([(0, “user”), (1, “jim”), (2, “myFile”)])
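The two resolution paths on this slide can be sketched as follows. `FakeNDB` and `PathResolver` are hypothetical; the keys are assumed to be (parent inode id, component name) pairs with the root at id 0, matching the tuples shown above, and the fake database simply counts round trips.

```python
from typing import Dict, List, Optional, Tuple

ROOT_ID = 0             # assumed inode id of "/"
Key = Tuple[int, str]   # (parent inode id, component name)


class FakeNDB:
    """Stand-in database that counts round trips."""

    def __init__(self, rows: Dict[Key, int]) -> None:
        self.rows = dict(rows)
        self.round_trips = 0

    def get_inode(self, parent_id: int, name: str) -> Optional[int]:
        self.round_trips += 1
        return self.rows.get((parent_id, name))

    def validate_inodes(self, keys: List[Key]) -> List[Optional[int]]:
        self.round_trips += 1   # one batched lookup for the whole path
        return [self.rows.get(key) for key in keys]


class PathResolver:
    def __init__(self, db: FakeNDB) -> None:
        self._db = db
        self._cache: Dict[Key, int] = {}

    def _cached_keys(self, names: List[str]) -> Optional[List[Key]]:
        keys, parent = [], ROOT_ID
        for name in names:
            inode = self._cache.get((parent, name))
            if inode is None:
                return None     # some component is not cached
            keys.append((parent, name))
            parent = inode
        return keys

    def resolve(self, path: str) -> Optional[int]:
        names = [part for part in path.split("/") if part]
        if not names:
            return ROOT_ID
        keys = self._cached_keys(names)
        if keys is not None:
            expected = [self._cache[key] for key in keys]
            if self._db.validate_inodes(keys) == expected:
                return expected[-1]          # cache hit: one round trip
            for key in keys:                 # stale entries: drop and fall back
                self._cache.pop(key, None)
        parent = ROOT_ID
        for name in names:                   # slow path: one round trip each
            inode = self._db.get_inode(parent, name)
            if inode is None:
                return None
            self._cache[(parent, name)] = inode
            parent = inode
        return parent
```

The cache only supplies a guess; the batched validation against the database is what keeps the result correct when another NameNode has moved or deleted a component.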

Page 27: Hops - Distributed metadata for Hadoop

Scalable Block Reporting

• On 100PB+ clusters, internal maintenance protocol traffic makes up much of the network traffic

• Block Reporting
- Leader Load Balances
- Work-steal when exiting safe-mode

[Diagram labels: DataNodes, NameNodes, Leader, NDB, Blocks, SafeBlocks, work steal]

Page 28: Hops - Distributed metadata for Hadoop

HopsFS Performance

Page 29: Hops - Distributed metadata for Hadoop

HopsFS Metadata Scaleout

Assuming 256MB Block Size, 100 GB JVM Heap for Apache Hadoop

Page 30: Hops - Distributed metadata for Hadoop

HopsFS Throughput (Spotify Workload)

Experiments performed on AWS EC2 with enhanced networking and c3.8xlarge instances

Page 31: Hops - Distributed metadata for Hadoop

Hops-YARN

Page 32: Hops - Distributed metadata for Hadoop

YARN Architecture

[Diagram labels: YARN Client, ResourceMgr, Standby ResourceMgr, Zookeeper Nodes, NodeManagers]

1. Master-Slave Replication of RM State
2. Agreement on the Active ResourceMgr

Page 33: Hops - Distributed metadata for Hadoop

ResourceManager – Monolithic but Modular

[Diagram labels: YARN Client, App Master, App Master, NodeManager, NodeManager; ResourceManager with ClientService, AdminService, ApplicationMasterService, ResourceTrackerService, Scheduler, Security, Cluster State; HopsResourceTracker with Cluster State; HopsScheduler; NDB]

Page 34: Hops - Distributed metadata for Hadoop

Hops-YARN Architecture

[Diagram labels: YARN Client, ResourceMgrs, Scheduler, Resource Trackers, NDB, NodeManagers]

Leader Election for Failed Scheduler

Page 35: Hops - Distributed metadata for Hadoop

What do we do with all this Metadata?

Page 36: Hops - Distributed metadata for Hadoop

Hops MetaData Tree

[Diagram labels: NDB, HopsFS, HopsYARN, Projects, DataSets, Hops Users, Provenance, Search, HistoryService, Ext-Metadata]

Page 37: Hops - Distributed metadata for Hadoop

Problem: Need Cluster per Sensitive DataSet

[Diagram: Alice has access to the NSA DataSet and has access to the User DataSet]

Alice can copy/cross-link between data sets.

Alice has only one Kerberos Identity. Dynamic Roles not supported in Hadoop.

Page 38: Hops - Distributed metadata for Hadoop

Solution: Project-Specific UserIDs

[Diagram: NSA__Alice is a member of Project NSA; Users__Alice is a member of Project Users]

HDFS enforces access control
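The naming scheme visible in the diagram (`NSA__Alice`, `Users__Alice`) can be sketched as below. The helper names and the rule that neither part may contain the separator are assumptions for illustration, and `can_access` is a deliberately simplified stand-in for the owner and group checks that HDFS itself performs.

```python
from typing import Tuple

SEPARATOR = "__"  # as seen in NSA__Alice and Users__Alice


def project_user(project: str, user: str) -> str:
    """Build the per-project identity for a user."""
    if not project or not user:
        raise ValueError("project and user must be non-empty")
    if SEPARATOR in project or SEPARATOR in user:
        raise ValueError("names must not contain the separator")
    return f"{project}{SEPARATOR}{user}"


def split_project_user(identity: str) -> Tuple[str, str]:
    """Recover (project, user) from a per-project identity."""
    project, sep, user = identity.partition(SEPARATOR)
    if not sep or not project or not user:
        raise ValueError(f"not a project-specific identity: {identity!r}")
    return project, user


def can_access(identity: str, dataset_project: str) -> bool:
    """Simplified check: an identity only reaches data of its own project."""
    project, _ = split_project_user(identity)
    return project == dataset_project
```

With one identity per (project, user) pair, Alice acting as `Users__Alice` has no rights on data owned by Project NSA, even though the same person also holds `NSA__Alice`.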

Page 39: Hops - Distributed metadata for Hadoop

Sharing DataSets with HopsWorks

[Diagram labels: Project NSA, Project Users, DataSet, NSA__Alice, Users__Alice; edge labels: Member of, Member of, owns]

Add members of Project NSA to the DataSet group

Page 40: Hops - Distributed metadata for Hadoop

HopsWorks enforces Dynamic Roles

[Diagram labels: [email protected], Authenticate, HopsWorks, Projects, Secure Impersonation, NSA__Alice, Users__Alice, HopsFS, HopsYARN]

Page 41: Hops - Distributed metadata for Hadoop

User

• Authentication Provider
- JDBC Realm
- 2-Factor Authentication
- LDAP

Page 42: Hops - Distributed metadata for Hadoop

Project

• Members
- Roles: Owner, Data Scientist

• DataSets
- Home project
- Can be shared

Page 43: Hops - Distributed metadata for Hadoop

Project Roles

• Data Owner Privileges
- Import/Export data
- Manage Membership
- Share DataSets

• Data Scientist Privileges
- Write code
- Run code
- Request access to DataSets

We delegate administration of privileges to users

Page 44: Hops - Distributed metadata for Hadoop

Sharing DataSets between Projects

The same as Sharing Folders in Dropbox

Page 45: Hops - Distributed metadata for Hadoop

Delegate Access Control to HDFS

• HDFS enforces access control
- UserID per Project
- GroupID per Project and DataSet

• Metadata Integrity using Foreign Keys
- Removing a project removes all users, groups, and (optionally) DataSets.
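The foreign-key behaviour described above can be tried out with a small schema. This sketch uses SQLite from the Python standard library as a stand-in for NDB, and the table names, column names, and the `NSA__DataSet` group are hypothetical rather than the HopsWorks schema.

```python
import sqlite3


def build_schema() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(
        """
        CREATE TABLE project (
            name TEXT PRIMARY KEY
        );
        CREATE TABLE project_user (
            username TEXT PRIMARY KEY,
            project  TEXT NOT NULL REFERENCES project(name) ON DELETE CASCADE
        );
        CREATE TABLE project_group (
            groupname TEXT PRIMARY KEY,
            project   TEXT NOT NULL REFERENCES project(name) ON DELETE CASCADE
        );
        """
    )
    return conn


def count(conn: sqlite3.Connection, table: str) -> int:
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = build_schema()
conn.execute("INSERT INTO project VALUES ('NSA')")
conn.execute("INSERT INTO project_user VALUES ('NSA__Alice', 'NSA')")
conn.execute("INSERT INTO project_group VALUES ('NSA__DataSet', 'NSA')")

# Removing the project removes its users and groups in the same statement.
conn.execute("DELETE FROM project WHERE name = 'NSA'")
remaining = (count(conn, "project_user"), count(conn, "project_group"))
```

The same constraints also reject a user or group row that points at a project that does not exist, so orphaned metadata cannot be created in the first place.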

Page 46: Hops - Distributed metadata for Hadoop

How ACME Inc. handles Free-Text Search

HDFS

In Theory

Unified Search and Update API

In Practice

Inconsistent Metadata

Page 47: Hops - Distributed metadata for Hadoop

Free Text Search with Consistent Metadata

[Diagram labels: Free-Text Search, Distributed Database, ElasticSearch, MetaData Designer, MetaData Entry]

The Distributed Database is the Single Source of Truth.
Foreign keys ensure the integrity of Metadata.

Page 48: Hops - Distributed metadata for Hadoop

Global Search: Projects and DataSets

Page 49: Hops - Distributed metadata for Hadoop

Project Search: Files, Directories

Page 50: Hops - Distributed metadata for Hadoop

Design your own Extended Metadata

Page 51: Hops - Distributed metadata for Hadoop

Analytics in HopsWorks

Page 52: Hops - Distributed metadata for Hadoop

Batch Job Analytics

Page 53: Hops - Distributed metadata for Hadoop

Interactive Analytics: Zeppelin

Page 54: Hops - Distributed metadata for Hadoop

Other Features

• Audit Logs

• Erasure Coding Replication

• Online upgrade of Hops (and NDB)

• Automated Installation with Karamel

• Tinker friendly – easy to extend metadata!

Page 55: Hops - Distributed metadata for Hadoop

Conclusions

• Hops is a next-generation distribution of Hadoop.

• HopsWorks is a frontend to Hops that supports true multi-tenancy, free-text search, interactive analytics with Zeppelin/Flink/Spark, and batch jobs.

• Looking for contributors/committers
- Pick-me-ups on GitHub

www.hops.io

Page 56: Hops - Distributed metadata for Hadoop

The Team

Active: Jim Dowling, Seif Haridi, Tor Björn Minde, Gautier Berthou, Salman Niazi, Mahmoud Ismail, Kamal Hakimzadeh, Ermias Gebremeskel, Theofilos Kakantousis, Johan Svedlund Nordström, Someya Sayeh, Vasileios Giannokostas, Antonios Kouzoupis, Misganu Dessalegn, Ahmad Al-Shishtawy, Ali Gholami.

Alumni: K. “Sri” Srijeyanthan, Steffen Grohsschmiedt, Alberto Lorente, Andre Moré, Stig Viaene, Hooman Peiro, Evangelos Savvidis, Jude D’Souza, Qi Qi, Gayana Chandrasekara, Nikolaos Stanogias, Daniel Bali, Ioannis Kerkinos, Peter Buechler, Pushparaj Motamari, Hamid Afzali, Wasif Malik, Lalith Suresh, Mariano Valles, Ying Lieu.

Page 57: Hops - Distributed metadata for Hadoop

Hops [Hadoop For Humans]

Join us! http://github.com/hopshadoop