TRANSCRIPT
2016-06-29
Beyond TCO: Architecting Hadoop for adoption and data applications
Reid Levesque – Head, Solution Engineering
Introduction
Topics
Technology
Use cases
Deployment
Impact
Next steps
Technology – Let’s talk Hadoop
Every company is a technology company…
some just don’t know it yet.
Traditional systems under pressure
Conventional wisdom
• Put the code on an application server
• Move the data to/from the database
• Move the data to/from NAS

Reality check
• This works well for small amounts of data
• As data volumes increase, this design falls apart
Hadoop to the rescue
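Hadoop inverts the conventional design: instead of moving the data to the code, it ships the computation to the nodes that already hold the data blocks. The MapReduce model behind this can be sketched in plain Python (a toy word count that only simulates the map/shuffle/reduce phases; it is not Hadoop's actual Java API):

```python
from collections import defaultdict
from itertools import chain

# Toy illustration of the "ship compute to the data" model Hadoop popularized.
# In a real cluster each mapper runs on the node holding its HDFS block;
# here the blocks are just strings in a list.

def map_phase(block):
    """Mapper: emit a (word, 1) pair for every word in a block of text."""
    return [(word, 1) for word in block.split()]

def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reducer: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

blocks = ["big data big", "data pipelines"]   # stand-ins for HDFS blocks
counts = reduce_phase(shuffle(chain.from_iterable(map_phase(b) for b in blocks)))
# → {'big': 2, 'data': 2, 'pipelines': 1}
```

The key design point is that only the small intermediate (word, count) pairs cross the network during the shuffle; the bulk data never moves, which is what lets the design keep working as volumes grow.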
Enterprise
How do we get Hadoop into the organization?
How about these use cases?
File archive + Hadoop
• Data is online; no need for tape backup
• Cheaper than NAS / SAN
• Increased performance / scalability
• Metadata is easier to get; all the data is in one spot

Data-intensive grid compute analytics + Hadoop
• Improved performance
• Lower TCO
• Reduced dependence on proprietary software

Database replacement + Hadoop
• Reduce RDBMS licensing
• Reduced operational cost for analysis
• Improved functionality with stored XML

ETL off-load + Hadoop
• Lower TCO
• Additional analytic capability
• Better hardware utilization
• Lower platform management
Not so much
File archive + Hadoop
Data-intensive grid compute analytics + Hadoop
Database replacement + Hadoop
ETL off-load + Hadoop
TCO
Which use case did work?
The current batch run was taking 4 hours, which limited the way they did their job
Users wanted interactive response times to design and test their financial models
This was net new functionality that could only be achieved in Hadoop
Now TCO makes more sense
File archive + Hadoop
Data-intensive grid compute analytics + Hadoop
Database replacement + Hadoop
ETL off-load + Hadoop
With Hadoop TCO covered, previous use cases are now more compelling.
How do we deploy this?
Which distribution?
Pick one:
Time to pick the hardware
Is this true?
Commodity hardware + commodity networking = bad architecture
Before there was Hadoop, there were enterprise IT standards
To name a few conflicts during the rollout…
• Local account UID / names
• OS settings
• Root access
• File locations
• Standard mount sizes
• Enterprise Active Directory
• Monitoring systems
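One recurring conflict of this kind, mismatched local account names and UIDs across nodes, is cheap to catch before rollout. A minimal preflight sketch (the account names and UID values below are hypothetical placeholders; real values come from your enterprise standard, not from Hadoop itself):

```python
import pwd

# Hypothetical UID mapping an enterprise standard might mandate for the
# Hadoop service accounts; substitute your organization's actual values.
EXPECTED_UIDS = {"hdfs": 6001, "yarn": 6002, "mapred": 6003}

def check_local_uids(expected):
    """Return a list of (user, problem) tuples for accounts on this host
    that are missing or whose UID differs from the expected value."""
    problems = []
    for user, uid in expected.items():
        try:
            entry = pwd.getpwnam(user)   # look up the local account
        except KeyError:
            problems.append((user, "missing account"))
            continue
        if entry.pw_uid != uid:
            problems.append((user, f"uid {entry.pw_uid} != expected {uid}"))
    return problems
```

Running a check like this on every node before installation surfaces UID drift early, when it is a one-line fix rather than a permissions problem on files already written to HDFS data directories.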
Hadoop is NOT flexible on deployment requirements
Who does the work?
Single team including:
• Dedicated infrastructure team (compute, network, data center, operations)
• Dedicated Hadoop team (sysadmin/operations, engineering)
• Hardware vendor engineers
• Hadoop distribution engineers
Into production we go!
What was the impact?
Changing perceptions
Impact across the organization
Infrastructure
• Networking / data center designs
• Relationship with storage, cloud, virtualization capabilities
• Generating analytic use cases

Development
• Mega-attractor for talent
• Application consolidation
• Shifting from IT to business focus

Management
• Understanding (or accepting) new paradigm
• Cross-department architecture alignment
• Data focus rather than application focus

Business
• Continuously evolving understanding of capability / possibilities
• Next-generation IT w/ rapidly evolving ecosystem
• Self-service innovation for business users
Lessons Learned
Hadoop doesn’t remove hardware maintenance
Hadoop development is still development!
New paradigm – requires skilled developers
A whole new set of error messages to decode
There aren’t that many experts
Where do we go next?
Self-service tools
Selling Hadoop internally
• This journey has taught me a lot about Hadoop, more than most people at the organization know
• The biggest tasks are educating the organization and doing simple things as a first step
Thank You