4 hadoop for-the-disillusioned
TRANSCRIPT
@wattsteve
Hadoop in 2013
CC flickr lowfatbrains
Platform Layers Technologies
Computational Runtimes
YARN, GiRAPH, MapReduce, HBase, Phoenix, Spark/BDAS, Drill, Impala, Stinger & more
FileSystems Azure, CassandraFS, CephFS, CleverSafe, GlusterFS, GridGain, HDFS, LustreMapR FS, S3, SWIFT, Quantcast FS, Symantec VCFS & more
Infrastructures System on a Chip, x86, Virtualization and Cloud
Distributions Cloudera, Hortonworks, IBM, Intel, MapR, WanDisco
@wattsteve
Geoffrey Moore’s Technology Adoption Lifecycle
CHASM
Innovators EarlyAdopters
EarlyMajority
LateMajority
Laggards
Hadoop Customer, “Great, but now what?”
@wattsteveCC flickr birdwatcher63
Ask your domain experts and LOB folks what unanswered questions they have Where can you get the data you need to answer that question? (domain experts should know
where to get it) Some of this data may be outside your organization (Social Media, Sensor Data, Data
brokerages/Marketplaces, Web Pages) and some of it may be inside. If the data for the query doesn’t exist, figure out how to instrument or gather it. Pair your domain experts with your data engineers so they can work out how to obtain and
massage the data given the types of queries desired
@wattsteveCC flickr syume
• Building data products is a similar exercise except that it involves typical product planning, such as identifying a market.
• This is also a great way for an organization to explore what assets they have within their data