the data center and hadoop jacob rapp, cisco [email protected]

Download The Data Center and Hadoop Jacob Rapp, Cisco jarapp@cisco.com

If you can't read please download the document

Upload: dana-baye

Post on 14-Dec-2015

225 views

Category:

Documents


2 download

TRANSCRIPT

  • Slide 1

The Data Center and Hadoop Jacob Rapp, Cisco [email protected] Slide 2 Hadoop Considerations Traffic Types, Job Patterns, Network Considerations, Compute Network Integration Co-exist with current Data Center infrastructure Open, Programmable and Application-Aware Networks Multi-tenancy Remove the Silo clusters 2 Slide 3 3 Slide 4 4 Analyze Extract Transform Load (ETL) Explode Reduce Ingress vs. Egress Data Set 1:0.3 Ingress vs. Egress Data Set 1:1 Ingress vs. Egress Data Set 1:2 The Time the reducers start is dependent on: mapred.reduce.slowstart.co mpleted.maps It doesnt change the amount of data sent to Reducers, but may change the timing to send that data Slide 5 5 Small Flows/Messaging (Admin Related, Heart-beats, Keep-alive, delay sensitive application messaging) Small Medium Incast (Hadoop Shuffle) Large Flows (HDFS Ingest) Large Incast (Hadoop Replication) Slide 6 6 Many-to-Many Traffic Pattern Map 1Map 2Map NMap 3 Reducer 1Reducer 2Reducer 3Reducer N HDFS Shuffle Output Replication NameNode JobTracker ZooKeeper Slide 7 Analyze Simulated with Shakespeare Wordcount Extract Transform Load (ETL) Simulated with Yahoo TeraSort Extract Transform Load (ETL) Simulated with Yahoo TeraSort with output replication Job Patterns have varying impact on network utilization Slide 8 8 Slide 9 9 Network Attributes Architecture Availability Capacity, Scale & Oversubscription Flexibility Management & Visibility Integration Considerations Slide 10 10 Single 1GE 100% Utilized Dual 1GE 75% Utilized 10GE 40% Utilized Generally 1G is being used largely due to the cost/performance trade-offs. Though 10GE can provide benefits depending on workload Slide 11 No single point of failure from network view point. No impact on job completion time NIC bonding configured at Linux with LACP mode of bonding Effective load-sharing of traffic flow on two NICs. Recommended to change the hashing to src-dst-ip-port (both network and NIC bonding in Linux) for optimal load-sharing 11 Slide 12 1GE vs. 10GE Buffer Usage 12 Moving from 1GE to 10GE actually lowers the buffer requirement at the switching layer. By moving to 10GE, the data node has a wider pipe to receive data lessening the need for buffers on the network as the total aggregate transfer rate and amount of data does not increase substantially. This is due, in part, to limits of I/O and Compute capabilities Slide 13 Goals Extensive Validation of Hadoop Workload Reference Architecture Make it easy for Enterprise Demystify Network for Hadoop Deployment Integration with Enterprise with efficient choices of network topology/devices Findings 10G and/or Dual attached server provides consistent job completion time & better buffer utilization 10G provide reduce burst at the access layer Dual Attached Sever is recommended design 1G or 10G. 10G for future proofing Rack failure has the biggest impact on job completion time Does not require non-blocking network Latency does not matter much in Hadoop workloads 13 http://www.slideshare.net/Hadoop_Summit/ref-arch-validated-and-tested-approach-to-define-a-network-design http://youtu.be/YJODsK0T67A More Details From Hadoop Summit 2012 at: Slide 14 14 Slide 15 2011 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 15 n3548-001# show interface brief -------------------------------------------------------------------------------- Ethernet VLAN Type Mode Status Reason Speed Port Interface Ch # -------------------------------------------------------------------------------- Eth1/1 1 eth access up none 10G(D) -- Eth1/2 1 eth access up none 10G(D) -- Eth1/3 1 eth access up none 10G(D) -- Eth1/4 1 eth access up none 10G(D) -- Eth1/5 1 eth access up none 10G(D) -. Eth1/33 1 eth access up none 10G(D) -- Eth1/34 1 eth access up none 10G(D) -- Eth1/35 1 eth access down SFP not inserted 10G(D) -- Eth1/36 1 eth access down SFP not inserted 10G(D) -- Eth1/37 1 eth access down Administratively down 10G(D) . Slide 16 2011 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 16 n3548-001# show mac address-table dynamic Legend: * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC age - seconds since first seen,+ - primary entry using vPC Peer- Link VLAN MAC Address Type age Secure NTFY Ports ---------+-----------------+--------+---------+------+----+---------------- -- * 1 e8b7.484d.a208 dynamic 60570 F F Eth1/31 * 1 e8b7.484d.a20a dynamic 60560 F F Eth1/31 * 1 e8b7.484d.a73e dynamic 60560 F F Eth1/34 * 1 e8b7.484d.a740 dynamic 60560 F F Eth1/34 * 1 e8b7.484d.ad15 dynamic 60560 F F Eth1/28 * 1 e8b7.484d.ad17 dynamic 60560 F F Eth1/28 * 1 e8b7.484d.b3e9 dynamic 60570 F F Eth1/25 * 1 e8b7.484d.b3eb dynamic 60560 F F Eth1/25. MAC Addresses of the connected devices and the port they are on Slide 17 2013 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 17 n3548-001# portServerMap ======================================= Port Server FQDN --------------------------------------- Eth1/1 c200-m2-10g2-001.cluster10g.com Eth1/2 c200-m2-10g2-002.cluster10g.com Eth1/3 c200-m2-10g2-003.cluster10g.com Eth1/4 c200-m2-10g2-004.cluster10g.com Eth1/5 c200-m2-10g2-005.cluster10g.com Eth1/6 c200-m2-10g2-006.cluster10g.com Eth1/7 c200-m2-10g2-031.cluster10g.com Eth1/8 c200-m2-10g2-008.cluster10g.com Eth1/9 c200-m2-10g2-009.cluster10g.com Eth1/11 c200-m2-10g2-011.cluster10g.com. Slide 18 2013 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 18 n3548-001# trackerList =========================================== Port Server Server Port ------------------------------------------- Eth1/2 c200-m2-10g2-002 50544 Eth1/3 c200-m2-10g2-003 41909 Eth1/4 c200-m2-10g2-004 36480 Eth1/5 c200-m2-10g2-005 38179 Eth1/6 c200-m2-10g2-006 51375 Eth1/7 c200-m2-10g2-031 41915 Eth1/8 c200-m2-10g2-008 50983 Eth1/9 c200-m2-10g2-009 37056 Eth1/11 c200-m2-10g2-011 35882 Eth1/12 c200-m2-10g2-012 44551. Slide 19 2011 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 19 n3548-001# bufferServerMap =================================================================== Port Server 1sec 5sec 60sec 5min 1hr ------------------------------------------------------------------- Eth1/1 c200-m2-10g2-001 0KB 0KB 0KB 0KB 0KB Eth1/2 c200-m2-10g2-002 384KB 384KB 1536KB 2304KB 2304KB Eth1/3 c200-m2-10g2-003 384KB 384KB 1152KB 1536KB 1536KB Eth1/4 c200-m2-10g2-004 384KB 384KB 2304KB 2304KB 2304KB Eth1/5 c200-m2-10g2-005 384KB 384KB 768KB 1536KB 1536KB Eth1/6 c200-m2-10g2-006 384KB 2304KB 2304KB 2304KB 2304KB Eth1/7 c200-m2-10g2-031 384KB 384KB 3456KB 3840KB 3840KB Eth1/8 c200-m2-10g2-008 768KB 768KB 2688KB 2688KB 2688KB Eth1/9 c200-m2-10g2-009 384KB 384KB 2304KB 2304KB 2304KB Eth1/11 c200-m2-10g2-011 384KB 384KB 1920KB 1920KB 1920KB. Eth1/1(c200-m2-10g2-001) has 0 buffer usage because its the name node Slide 20 2011 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 20 n3548-001# jobsBuffer Hadoop Job Info... =================================================================== 1 jobs currently running JobId RunTime(secs) User Priority job_201306131423_0009 120 hadoop NORMAL =================================================================== Buffer Info - Per Port Port Server 1sec 5sec 60sec 5min 1hr ------------------------------------------------------------------- Eth1/1 c200-m2-10g2-001 0KB 0KB 0KB 0KB 0KB Eth1/2 c200-m2-10g2-002 384KB 384KB 768KB 768KB 768KB Eth1/3 c200-m2-10g2-003 384KB 384KB 1152KB 1152KB 1152KB Eth1/4 c200-m2-10g2-004 384KB 1536KB 1536KB 1536KB 1536KB Eth1/5 c200-m2-10g2-005 384KB 768KB 1152KB 1152KB 1152KB. What jobs were running during peak buffer usage and for how long were they running Slide 21 2011 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 21 n3548-001(config)# jobsBuffer Hadoop Job Info... =================================================================== 0 jobs currently running JobId RunTime(secs) User Priority =================================================================== Buffer Info - Per Port Port Server 1sec 5sec 60sec 5min 1hr ------------------------------------------------------------------- Eth1/1 c200-m2-10g2-001 0KB 0KB 0KB 0KB 0KB Eth1/2 c200-m2-10g2-002 0KB 0KB 0KB 1920KB 1920KB Eth1/3 c200-m2-10g2-003 0KB 0KB 0KB 2304KB 2304KB Eth1/4 c200-m2-10g2-004 0KB 0KB 0KB 2688KB 2688KB Eth1/5 c200-m2-10g2-005 0KB 0KB 0KB 2304KB 2304KB Eth1/6 c200-m2-10g2-006 0KB 0KB 0KB 2304KB 2304KB Eth1/7 c200-m2-10g2-031 0KB 0KB 0KB 1920KB 2688KB. Historic look at the buffer usage Slide 22 2011 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 22 Slide 23 2011 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 23 Slide 24 2011 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 24 Slide 25 2011 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 25 Buffer Usage Shuffle Replication Reduce Map 0 60 120 180 240 300 360 420 480 540 600 660 720 780 Slide 26 2011 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 26 (Python Socket) Push Data Application Logs PTP Grandmaster (OPTIONAL) Analyze API to Application Info Synchronize Time github.com/datacenter Slide 27 27 Slide 28 28 Hadoop + HBASE Job Based Department Based Various Multitenant Environments Need to understand Traffic Patterns Scheduling Dependent Permissions and Scheduling Dependent Slide 29 29 Map 1Map 2Map NMap 3 Reducer 1 Reducer 2 Reducer 3 Reducer N HDFS Shuffle Output Replication Region Server Client Major Compaction Read Update Read Major Compaction Slide 30 30 Hbase During Major Compaction Read/Update Latency Comparison of Non- QoS vs. QoS Policy ~45% for Read Improvement Switch Buffer Usage With Network QoS Policy to prioritize Hbase Update/Read Operations Slide 31 Switch Buffer Usage With Network QoS Policy to prioritize Hbase Update/Read Operations Hbase + Hadoop Map Reduce Read/Update Latency Comparison of Non- QoS vs. QoS Policy ~60% for Read Improvement Slide 32 Cisco Unified Data Center UNIFIED FABRIC UNIFIED COMPUTING Highly Scalable, Secure Network Fabric Modular Stateless Computing Elements UNIFIED MANAGEMENT Automated Management THANK YOU FOR LISTENING www.cisco.com/go/ucswww.cisco.com/go/nexus http://www.cisco.com/go/wor kloadautomation Manages Enterprise Workloads Cisco.com Big Data www.cisco.com/go/bigdata Data Center Script Examples from Presentation: github.com/datacenter