Delivering a Flexible IT Infrastructure for Analytics
with HDP on IBM Power Systems
Nov 30, 2016
Agenda
• Hortonworks and IBM Power Systems Partnership
• Customer Analytics Journey
• Open Community Innovation
• Leading Time to Insights
Hortonworks, IBM Collaborate to Offer Open Source Distribution on Power Systems
Latest Hortonworks Data Platform (HDP) to provide IBM customers with more choice in open source Hadoop distribution for big data processing
Las Vegas, NV (IBM Edge) - 19 Sep 2016: IBM (NYSE: IBM) and Hortonworks (NASDAQ: HDP) today announced the planned availability of Hortonworks Data Platform (HDP®) for IBM Power Systems enabling POWER8 clients to support a broad range of new applications while enriching existing ones with additional data sources.
Scott Gnau, CTO, Hortonworks at Edge. Youtube: http://bit.ly/2dSOliW
James Wade, Director of Application Hosting, Florida Blue. Youtube: http://bit.ly/2dxVHIY
© 2016 IBM Corporation
Customer Story – Guidewell Health - Florida Blue
• Business Problem
– Transformational journey resulting in rapid expansion of business models
– Technology innovation required to keep up with the business expansion while improving client satisfaction, reducing costs and supporting the company’s green IT initiatives
o Existing x86 server sprawl not sustainable
• Solution with Hortonworks, IBM OpenPOWER servers and Sage Solutions Consulting
– Embraces the open software and hardware model adopted by Florida Blue
– Hortonworks supporting new fraud analytics initiative to reduce costs and client premiums
– OpenPOWER to enable smaller datacenter footprint with stronger reliability
See the full story in this Hortonworks Blog post.
© Hortonworks Inc. 2011 – 2016. All Rights Reserved
[Figure: use-case map]
Themes: INNOVATE, RENOVATE
Stages: EXPLORE, OPTIMIZE, TRANSFORM
Patterns: ACTIVE ARCHIVE, ETL ONBOARD, DATA ENRICHMENT, DATA DISCOVERY, SINGLE VIEW, PREDICTIVE ANALYTICS
Use cases: Payment Tracking, Due Diligence, Social Mapping, Product Design, M&A, Call Analysis, Machine Data, Defect Detecting, Factory Yields, Customer Support, Basket Analysis, Segments, Customer Retention, Sentiment Analysis, Optimize Inventories, Supply Chain, Cross-Sell, Vendor Scorecards, Ad Placement, Cyber Security, Disaster Mitigation, Investment Planning, Risk Modeling, Proactive Repair, Inventory Predictions, Next Product Recs, OPEX Reduction, Historical Records, Mainframe Offloads, Device Data Ingest, Rapid Reporting, Digital Protection, Data as a Service, Fraud Prevention, Public Data Capture, M&A Storage Blending, M&A Ingest, Integration
About Hortonworks
The Leader in Connected Data Platforms
• Publicly traded on NASDAQ: HDP
• Hortonworks DataFlow for data in motion
• Hortonworks Data Platform for data at rest
• Powering new modern data applications
Partnering for Customer Success
• Leader in open-source community, focused on innovation to meet enterprise needs
• Unrivaled support subscriptions
Founded in 2011
Original 24 Architects, Developers, Operators of Hadoop from Yahoo!
1000+ Employees
1500+ Ecosystem Partners
HDP is a 100% Open Source Connected Data Platform
• Eliminates Risk of vendor lock-in by delivering 100% Apache open source technology
• Maximizes Community Innovation with hundreds of developers across hundreds of companies
• Integrates Seamlessly through committed co-engineering partnerships with other leading technologies
Apache Hadoop Committers
• We Employ the Committers: one third of all committers to the Apache® Hadoop™ project, and a majority in other important projects
• Our Committers Innovate and expand Open Enterprise Hadoop
• We Influence the Hadoop Roadmap by communicating important requirements to the community through our leaders
Hortonworks Influences the Apache Community
Hortonworks Nourishes the Community and Ecosystem
Hortonworks Community Connection
Hortonworks Partnerworks
• Community Q/A Resources
• Articles & Code Repos!
• Community of (big data) developers
• Open Ecosystem of Big Data for vendors & end-users
• Advance Apache™ Hadoop®
• Enable more Big Data Apps
• World class partner program
• Network of partners providing best-in-class solutions
Hadoop & Big data ecosystem
Hortonworks Delivers Proactive Support
• Hortonworks SmartSense™ with machine learning and predictive analytics on your cluster
• Integrated Customer Portal with knowledge base and on-demand training
Client Value proposition for HDP on Power:
• 100% Open Source Hadoop running on OpenPOWER Hardware Platform
– OpenPOWER is an open-source hardware solution for open source SW
– Intel x86 is perceived as the default/commodity option, but x86 is not open
– Open = no vendor lock-in and flexibility
• Processor and Server Architecture optimized for big data processing
– 2x per core performance compared to Intel x86
o Fewer cores & servers needed: contain server sprawl
o Improved HW price/performance
– Higher server & data reliability, designed to run the enterprise
IBM is investing in the Linux ecosystems & open innovation
• 1999: 5 IBMers contributing to Linux and Apache Projects
• 2016: 50k+ IBMers contributing to 150+ open organizations (1)
• 270+ OpenPOWER-based innovations under way
• Blockchain (Hyperledger)
• Open Source Databases
1. Source: https://developer.ibm.com/start/
Accelerated innovation through collaboration of partners
Amplified capabilities driving industry performance leadership
Vibrant ecosystem through open development
Cloud Computing: Hyperscale & Large scale Datacenters
High Performance Computing & Analytics
Domestic IT Agendas
Industry adoption, Open choice
OpenPOWER Strategy
Moore’s law no longer satisfies performance gain
Numerous IT consumption models
Growing workload demands
Mature Open software ecosystem
Market Shifts
OpenPOWER is an open development community, using the POWER Architecture to serve the evolving needs of customers.
OpenPOWER, a catalyst for Open Innovation
Fueling an Open Development Community
• Chip / SOC
• Boards / Systems
• I/O / Storage / Acceleration
• System / Software / Integration
• Implementation / HPC / Research
The IBM Power Systems Linux Portfolio
Broad Linux portfolio delivers all your Linux deployment needs
• Scale-Out, Linux-Only: The LC Line
- Design and cost optimized for deployments of multiples (cloud and cluster)
- Broad number of optimal solutions
- Co-Designed with the OpenPOWER Ecosystem
• Scale-Out, Linux-Only: The L Line
- Enterprise level RAS for single system deployments
- Solutions for Big Data & Analytics
• Converged Infrastructure: PurePower
- Converged infrastructure offering
- Rapid time to value and simplicity of management
• Scale-Up: Enterprise & IFLs
- Enterprise level robustness and IFL capability
- Solution editions for in memory databases (HANA, DB2 BLU)
• IaaS
- Hosted cloud and hybrid cloud solutions
- Rapid deployments and POCs
Support options: IBM Support, Community / 3rd Party Support
POWER8 is designed for the Big Data era and delivers price-performance leadership to the Linux Market!
Introducing the IBM Power Systems LC Line
S822LC For High Performance Computing
• Incorporates the new POWER8 processor with NVIDIA NVLink
• Delivers 2.8X the bandwidth to GPU accelerators
• Up to 4 integrated NVIDIA “Pascal” GPUs
S822LC For Big Data
• Ideal for storage-centric and high data throughput workloads
• Brings 2 POWER8 sockets for Big Data workloads
• Big data acceleration with CAPI and GPUs
S812LC
• Storage rich single socket system for big data applications
• Memory Intensive workloads
S822LC
• 2X memory bandwidth of Intel x86 systems
• Memory Intensive workloads
S821LC
• 2 POWER8 sockets in a 1U form factor
• Ideal for environments requiring dense computing
Workload categories: Big Data, High Performance Computing, Compute Intensive
Systems marked NEW: Announce 9/8, GA 9/26 (one system); Announce and GA 9/8 (two systems)
OpenPOWER servers for cloud and cluster deployments that are different by design
Innovation Pervasive in the Design
Power Systems S822LC for Big Data
NVIDIA: Tesla K80 GPU Accelerator
Red Hat, Ubuntu, SUSE: Linux OS
Mellanox: InfiniBand/Ethernet Connectivity in and out of server
HGST: Optional NVMe Adapters
Alpha Data with Xilinx FPGA: Optional CAPI Accelerator
Broadcom: Optional PCIe Adapters
QLogic: Optional Fiber Channel PCIe
Samsung: SSDs & NVMe
Hynix, Samsung, Micron: DDR4
IBM: POWER8 CPU
Leading Operational DBMSs Available & Optimized for Linux on Power
Categories: In-Memory, NoSQL, Open Source
Databases: SAP HANA, MongoDB, Neo4j, EnterpriseDB, MariaDB, IBM DB2 BLU, Redis Labs, Cassandra, PostgreSQL
IBM is your single, trusted vendor to support and help you manage your Linux infrastructure
1. Based on IBM internal data. 2. Original equipment manufacturer
[Chart: Price/Performance, 2000 to 2020. Moore’s Law: Processor Technology. Full system and stack open innovation required: Firmware / OS, Accelerators, Software, Storage, Network. Marker: "You are here".]
[Chart: Data Growth, 2010 to 2020, structured and unstructured data, 44 zettabytes. Data holds competitive value.]
Today’s challenges demand innovation
• 4X Threads per core
• 4X Mem. Bandwidth (1)
• 6X More cache (2) @ Lower Latency
SMT=Simultaneous Multi-Threading OLTP = On-Line Transaction Processing
These design decisions result in best performance for data centric workloads like: Spark, Hadoop, Database, NoSQL, Big Data Analytics, OLTP
POWER8: Designed for data to deliver breakthrough performance
[Diagram: Parallel Processing, POWER8 SMT8 vs. x86 Hyperthread; Data flow, POWER8 pipe vs. x86 pipe; x86 vs. POWER8 vs. POWER8 + OpenPOWER]
1. Up to 4X depending on specific x86 and POWER8 servers being compared
2. Up to 6X more cache comparing Intel E7-8890 servers to 12 core POWER8 servers. See speaker notes for more details
SAP SD industry benchmark demonstrates that x86 technologies have NOT improved performance per core
Source: http://global.sap.com/solutions/benchmark/sd2tier.epx
Intel generations shown: Sandy Bridge EP, Ivy Bridge EP, Ivy Bridge EX, Haswell EP, Haswell EX, Broadwell EP, Broadwell EX
POWER8: 2.58x
• Certification # 2014034, Date: 10/3/2014: IBM Power E870, 80 Cores & 640 Threads, POWER8 at 4.19 GHz, 2 TB Memory, Users: 79,750, SAPS: 436,100, AIX 7.1, DB2 10.5
• Certification # 2013039, Date: 12/10/2013: Cisco UCS C420 M3, 32 Cores & 64 Threads, E5-4650 at 2.70 GHz, 256 GB Memory, Users: 13,010, SAPS: 71,170, Windows Server 2012, SQL Server 2012
• Certification # 2014017, Date: 05/05/2014: Dell PowerEdge R720, 24 Cores & 48 Threads, E5-2697 v2 at 2.70 GHz, 256 GB Memory, Users: 10,253, SAPS: 55,970, RHEL 6.5, SAP ASE 16
• Certification # 2014018, Date: 05/05/2014: Cisco UCS B260 M4, 30 Cores & 60 Threads, E7-4890 v2 at 2.80 GHz, 512 GB Memory, Users: 12,280, SAPS: 67,020, Windows Server 2012, SQL Server 2012
• Certification # 2014033, Date: 09/10/2014: Dell PowerEdge R730, 36 Cores & 72 Threads, E5-2699 v3 at 2.3 GHz, 256 GB Memory, Users: 16,500, SAPS: 90,120, RHEL 7.0, SAP ASE 16
• Certification # 2015012, Date: 05/05/2015: Dell PowerEdge R930, 72 Cores & 144 Threads, E7-8890 v3 at 2.5 GHz, 1 TB Memory, Users: 31,000, SAPS: 170,030, RHEL 7.1, SAP ASE 16
• Certification # 2016006, Date: 03/31/2016: Cisco UCS C240 M4, 44 Cores & 88 Threads, E5-2699 v4 at 2.20 GHz, 512 GB Memory, Users: 21,210, SAPS: 115,820, Windows Server 2012 R2, DB2 10.1
• Certification # 2016023, Date: 05/10/2016: Fujitsu PRIMEQUEST 2800E3, 192 Cores & 384 Threads, E7-8890 v4 at 2.20 GHz, 2048 GB Memory, Users: 74,000, SAPS: 404,200, Windows Server 2012 R2, SQL Server 2012
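The per-core comparison behind this slide can be checked from the certification figures themselves. The sketch below is not part of the original deck: it divides SAPS by core count for each listed result and compares each system with the POWER8 entry. The function and variable names are illustrative. The assumption that the slide's 2.58x figure compares POWER8 with the Broadwell EX result is ours; the computed ratio for that pair comes out near 2.59.

```python
# SAP SD two-tier results as listed on the slide: (system, cores, SAPS).
results = [
    ("IBM Power E870 (POWER8)", 80, 436_100),
    ("Cisco UCS C420 M3 (E5-4650)", 32, 71_170),
    ("Dell PowerEdge R720 (E5-2697 v2)", 24, 55_970),
    ("Cisco UCS B260 M4 (E7-4890 v2)", 30, 67_020),
    ("Dell PowerEdge R730 (E5-2699 v3)", 36, 90_120),
    ("Dell PowerEdge R930 (E7-8890 v3)", 72, 170_030),
    ("Cisco UCS C240 M4 (E5-2699 v4)", 44, 115_820),
    ("Fujitsu PRIMEQUEST 2800E3 (E7-8890 v4)", 192, 404_200),
]


def saps_per_core(cores: int, saps: int) -> float:
    """Throughput per core, the metric the slide argues about."""
    return saps / cores


per_core = {name: saps_per_core(cores, saps) for name, cores, saps in results}
power8 = per_core["IBM Power E870 (POWER8)"]

# Print SAPS per core and how many times higher the POWER8 figure is.
for name, value in per_core.items():
    print(f"{name:40s} {value:7.0f} SAPS/core   POWER8 is {power8 / value:.2f}x")
```

On these inputs the x86 results all fall between roughly 2,100 and 2,650 SAPS per core across the listed generations, which is the flat trend the slide title refers to.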
[Chart axis: Relative RPE2** Per Core, POWER6 = 1.0]
**Gartner RPE2 Details: http://www.gartner.com/technology/research/RPE2-methodology-details.jsp
RPE2** numbers are derived from the following six benchmark inputs: SAP SD Two-Tier, TPC-C, TPC-H, SPECjbb2006 and two SPEC CPU2006 components
The data on this chart is derived from RPE2 from Gartner, Inc.'s Competitive Profile tool. © 2014 Gartner, Inc. and/or its affiliates. All rights reserved.
This was a POWER8 Design Goal
• POWER6: 4.2 GHz, 16 cores
• POWER7: 3.55 GHz, 16 cores
• POWER7+: 4.2 GHz, 16 cores
• POWER8: 3.52 GHz, 24 cores
POWER Performance per core is increasing!
Increasing Performance
The servers shown are best in each category (sockets and number of cores)
Example: OpenPOWER acceleration with NVLink
Current CPU to GPU PCIe Attachment
• CPU connects to the GPU over PCIe at 32 GB/s (system bottleneck)
• GPU with graphics memory
New POWER8 with NVLink Processor Technology
• New POWER8 CPU connects to Tesla P100 GPUs over NVLink at 80 GB/s
• Additional bandwidth label in the diagram: 115 GB/s
• Each GPU with graphics memory
POWER8 with NVLink delivers 2.8X the bandwidth (PCIe Data Pipe vs. POWER8 NVLink Data Pipe)
YCSB running MongoDB on POWER8 delivers leadership performance and 2.2X better price-performance than Intel Xeon E5-2690 v3 Haswell
Systems compared: IBM Power S822LC (16-core, 256GB) vs. HP DL380 Gen9 (24-core, 256GB)
• Server web price* (3-year warranty): $16,295 vs. $24,615
• System Cost (Server + RHEL OS + MongoDB Annual Subscription): $29,584 ($16,295 + $1,299 + $11,990) vs. $37,904 ($24,615 + $1,299 + $11,990)
• MongoDB YCSB (total operations per second): 297.5 k ops vs. 169.5 k ops
• $ per thousand ops per second: $100 vs. $223 (2.2X better)
Highlights:
• 2.2X Better Price-Performance
• 33% Lower HW costs and maintenance
• 75% More Performance per Server
• Based on IBM internal testing of a single system and OS image running Yahoo! Cloud Serving Benchmark (YCSB) 0.6.0, 1M record workload at 50/50 read/write factor. Conducted under laboratory conditions; individual results can vary based on workload size, use of storage subsystems & other conditions.
• IBM Power System S822LC: 16 cores (2 x 8c chips) / 128 threads, POWER8, 3.3 GHz, 256 GB memory, MongoDB 3.3.4, RHEL 7.2.
• Competitive stack: HP Proliant DL380 Gen9: 24 cores (2 x 12c chips) / 48 threads, Intel E5-2690 v3, 2.6 GHz, 256 GB memory, MongoDB 3.3, RHEL 7.2.
• Both servers priced with 2 x 1TB SATA 7.2K rpm HDD, 1 Gb 2-port, 2 x 16gbps FCA. Configurations represent the highest processor frequency for that specific processor, running the MongoDB server on 1 socket & the YCSB application workload on the 2nd socket. RAM disk was used to focus testing on processor technology differences.
• Pricing is based on web pricing for S822LC http://www-03.ibm.com/systems/power/hardware/s822lc-commercial/buy.html and HP DL380 Gen9 https://h22174.www2.hp.com/SimplifiedConfig/Index MongoDB https://www.mongodb.com/compare/mongodb-oracle Page: 6
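The headline figures on this slide follow from simple arithmetic on the listed prices and YCSB throughput. The sketch below is an illustration and not part of the original deck. It recomputes the system cost, the cost per thousand operations per second, and the three highlighted ratios. The `Config` class and its field names are hypothetical, and throughput is taken in thousands of operations per second because that is the unit under which the slide's $100 and $223 figures work out.

```python
from dataclasses import dataclass


@dataclass
class Config:
    name: str
    server_price: int      # server web price, USD
    os_price: int          # RHEL OS, USD
    db_subscription: int   # MongoDB annual subscription, USD
    kops_per_sec: float    # YCSB throughput, thousands of operations per second

    @property
    def system_cost(self) -> int:
        return self.server_price + self.os_price + self.db_subscription

    @property
    def dollars_per_kops(self) -> float:
        return self.system_cost / self.kops_per_sec


power = Config("IBM Power S822LC", 16_295, 1_299, 11_990, 297.5)
x86 = Config("HP DL380 Gen9", 24_615, 1_299, 11_990, 169.5)

# Ratios behind the three highlighted claims.
price_perf_ratio = x86.dollars_per_kops / power.dollars_per_kops
perf_gain = power.kops_per_sec / x86.kops_per_sec - 1
hw_saving = 1 - power.server_price / x86.server_price

for cfg in (power, x86):
    print(f"{cfg.name}: ${cfg.system_cost:,} total, "
          f"${cfg.dollars_per_kops:.0f} per k ops/sec")
print(f"Price-performance advantage: {price_perf_ratio:.1f}x")
print(f"More performance per server: {perf_gain:.0%}")
print(f"Lower server price: {hw_saving:.0%}")
```

The same structure can be reused with a site's own quotes and measured throughput; the slide's numbers come from a RAM-disk lab test, so they are best read as a processor comparison and not as a sizing guide.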
POWER Advantages for Spark
• Streaming and SQL benefit from High Thread Density and Concurrency
• Machine Learning benefits from Large Caches and Memory Bandwidth
• Graph also benefits from Large Caches, Memory Bandwidth and Higher Computational Strength
[Chart: 2X Core-to-Core Advantage across Machine Learning, SQL and Graph workloads]
[Chart: 1.5X Price Performance Advantage across Machine Learning, SQL and Graph workloads]
7-Node S812LC 10-core vs. 7-Node E5-2690 v3 12-core
Future Proof Your Hadoop Infrastructure
• Total Cost of Ownership benefits of a Linux on Power decision
– Less infrastructure means reduced costs in many areas:
o Energy, cooling, server administration, floor space, SW licensing
– Position for future growth, avoid hitting the data center wall with cluster sprawl
– As your workloads evolve, POWER8 gives you options:
o Scale up each node by exploiting the memory bandwidth and multi-threading
o Add new workload optimized servers to the cluster (such as GPU with NVLink)
HDP and OpenPOWER – Better Together
• Leading Open Hadoop and Spark distribution on the leading Open Server platform
– Built for Big Data, driving continuous innovation in the community
• Industry Experience to Lead Clients on an Analytics Journey
– Discovery workshops to identify the key business use cases and the client progression
• Mission Critical Support
– Stable, trusted Hadoop platform on proven Power System with outstanding client support
• Performance optimized for enterprise scale
– Price performance leadership with POWER8
– Exploiting POWER8 advantages for key workloads such as SQL and Spark
– Future exploitation of accelerators to maintain leadership
Hortonworks (HDP) on Power Roadmap
• 3Q2016: Partnership Announce (Edge announcement, Sept 19)
• 4Q2016: Ramp Up Program
– HDP early code on Power
– Head start on your Analytics Journey
– IBM and Hortonworks assistance for non-prod POCs
• Q1 2017: General Availability, HDP on Power
– Proven Power Reference Configs
– Hortonworks and IBM production Support
• Later 2017: General Availability, HDP next on Power
– Simultaneous releases
Note: Timing and content subject to change.
How to Get Started with HDP on OpenPOWER Systems
• Join the Hortonworks Community: https://community.hortonworks.com/
• Learn more about the benefits of Hortonworks: http://hortonworks.com/training/
• Learn more about the benefits of IBM Power Systems and OpenPOWER
• If you are interested in discussing an HDP on Power Systems option or proposal, talk to your Hortonworks or Power sales reps