Copyright 2012 Freeform Dynamics Ltd
Big Data in Context
A practical, real-world view
BCS Event, Leeds
November 2012
Dale Vile
CEO & Research Director
Twitter: dale_vile
Blog: researchbeat.com
Freeform Dynamics Ltd
www.freeformdynamics.com
Dale’s take as an aging, sceptical RDBMS techie
The term ‘big data’ is currently being over-hyped by IT vendors in an unhelpful way
[Chart: survey responses for Sep 2012 and Sep 2011, rated from 5 (Totally agree) to 1 (Totally disagree), plus Unsure]
Some up front statements
  ‘Big Data’ is a bandwagon
  But some genuinely new and interesting stuff is going on behind the hype
  Maturity remains an issue, and lots of challenges exist
  The new doesn’t (usually) replace the old
  It’s important to keep things in context
Topics
  What’s the problem we are trying to solve?
  What, exactly, constitutes ‘big data’?
  Hadoop as an example of a big data solution
  How does big data change the way we think?
  Some common use cases
  The broader technology picture
  Frequently encountered challenges
  Looking to the future
The problem (and opportunity) in a nutshell
[Chart: survey ratings for two kinds of data on two questions]
  Structured data (e.g. tabular data in RDBMSs)
  Unstructured data (e.g. documents, messages, multi-media, etc)
  How much growth? Rated from 5 (Extremely high growth) to 1 (No growth)
  How well do you exploit? Rated from 5 (Fully exploit) to 1 (Very poorly exploit)
And that data just keeps on coming
In the words of survey respondents…
  Increased transaction rates
  Digital imagery, Webex logging, email.
  Desire for reporting over longer time periods, with higher levels of drill down.
  Greater use of ecommerce methods for supply management
  *&%!* SharePoint
  Cheap storage
  More affordable technology is available to store and analyse data
  CRM & social networking
  Increasing availability of external data which may or may not be highly relevant
  Better and more widespread sensors
  Audit stipulation
  Duplicate copies of data for BI and data mining.
  Vast number of emails with client presentations attached
  Ever more detailed (higher resolution) survey data.
  Everything is bigger, faster, cheaper
  Digital video archiving
  Storage costs drop and processing power increases; formerly impossible applications morph into expensive ones, which eventually become mainstream
  Smart meters
  Fear of 'throwing away'
  The shift from above the line advertising spend to direct marketing
  Increased use of digital cameras for data capture
  Increased signalling traffic in telecoms networks
  Predictive analytics
  Movement away from paper to electronic documents
  No desire from Business to archive data
  New business paradigms, especially the moving of revenue streams online
  Regulation & compliance
  Poorly designed systems with inefficient storage and no archive functions
  Same information stored in many places (mail, file server, SharePoint, ...)
  Stashing data that we used to archive to take advantage of future technologies
  High demand for immediate access to more and more data
  Automation, new working practices, new regulation
  Specific industry drivers
  Cheap storage
  Poor information management
  Business demand for better knowledge and insight
  External data and feeds
So what constitutes big data?
  The 3 V’s: Volume, Variety, Velocity
  More V’s: Voracity, Value, Value-Density
A practical view
[Diagram: data sources placed on two axes, HIGHLY STRUCTURED to HIGHLY UNSTRUCTURED and HIGH VALUE DENSITY to LOW VALUE DENSITY, with a region labelled BIG DATA]
  M2M feeds, web activity logs, ticker data, etc.
  ERP, CRM, SCM & other transaction data
  Social media, news feeds, harvested web content, etc.
  Document repositories, message stores, etc.
Need for a different architectural approach
  Parallel processing
  Principle of divide and conquer
  Distribute data into small chunks
  Execute lots of little tasks close to the data, then merge results
SCALE UP (e.g. high performance RDBMS cluster)
  Powerful CPUs
  Lots of cores
  Huge memory
  Expensive disk
  Expensive SW
SCALE OUT
  Distributed commodity hardware
  Open source software
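The divide and conquer principle can be illustrated with a minimal sketch. This is not how any particular product works: the function names are hypothetical, and a thread pool on one machine stands in, as an assumption, for the separate commodity servers of a real scale-out cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, chunk_size):
    """Distribute the data into small chunks."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def process_chunk(chunk):
    """A 'little task': summarise one chunk independently of the others."""
    return {"count": len(chunk), "total": sum(chunk)}


def merge_results(partials):
    """Merge the partial results into a single answer."""
    return {
        "count": sum(p["count"] for p in partials),
        "total": sum(p["total"] for p in partials),
    }


def scale_out_sum(records, chunk_size=1000, workers=4):
    chunks = split_into_chunks(records, chunk_size)
    # Assumption: threads stand in for the nodes of a real cluster.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return merge_results(partials)


if __name__ == "__main__":
    print(scale_out_sum(list(range(10_000))))
```

Because each chunk is processed without reference to any other, adding capacity means adding workers rather than buying a bigger machine.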
The elephant in the room
hadoop.apache.org
[Diagram: Hadoop components]
  HDFS
  MapReduce
  Hive
  HBase
  Cassandra
  Pig
  ZooKeeper
  Other tools
Breaks traditional conventions
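To give a feel for the MapReduce programming model, here is a minimal single-process sketch of its three phases applied to a word count. It uses plain Python only and makes no use of Hadoop's own APIs; the function names are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (key, value) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big hype"]))
```

In a real cluster the map and reduce calls would run on many machines close to the data; here they run in one process so the flow of key/value pairs is easy to follow.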
Comparison of approaches
Different way of thinking, different level of impact
TRADITIONAL APPROACH                | BIG DATA APPROACH
Schema based data model             | Key/value based (no schema)
Create model, then load data        | Load data, then create model
Only load what’s valuable           | Load data speculatively
Premeditated/prescriptive analysis  | Exploratory/iterative analysis
What’s the answer?                  | What’s the question?
Fastest time to result              | Generate the best insight
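The contrast between "create model, then load data" and "load data, then create model" can be sketched in a few lines. This is an illustration only: the schema, field names and helper functions are hypothetical, and an in-memory list stands in, as an assumption, for a real data store.

```python
import json

# Traditional approach: the model (schema) exists before any data is loaded.
SCHEMA = ("customer_id", "amount")


def load_with_schema(raw_records):
    """Keep only the fields the schema defines; reject records that miss one."""
    table = []
    for record in raw_records:
        if not all(field in record for field in SCHEMA):
            raise ValueError(f"record does not fit schema: {record}")
        table.append(tuple(record[field] for field in SCHEMA))
    return table


# Big data approach: store raw key/value records first, decide on a model later.
def load_speculatively(raw_lines):
    """Store every record as-is, whatever keys it happens to carry."""
    return [json.loads(line) for line in raw_lines]


def explore(store, key):
    """Apply a 'model' at read time: pull one key from the records that have it."""
    return [record[key] for record in store if key in record]


if __name__ == "__main__":
    lines = [
        '{"customer_id": 1, "amount": 9.5}',
        '{"customer_id": 2, "channel": "web"}',
    ]
    store = load_speculatively(lines)
    print(explore(store, "channel"))
```

The second record would be rejected by the schema-based loader, yet it is exactly the kind of speculatively loaded data that exploratory analysis might later find a question for.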
Some common big data use cases
  Social analytics (the ‘poster child’)
  Customer analytics in the broader sense
    Profiling and segmentation
    Advertising and promotion
    Retail optimisation (pricing, merchandising, etc)
    Customer services and support
  IT systems monitoring and management
  Security and associated forensics
  Business operations
    Supplier management, logistics, energy management
  Industry specific
    Financial services, public sector, telecoms
INPUTS
  More data
  Greater diversity
  Faster acquisition
  More sources
ANALYSIS
  More urgency
  Less predictability
  More granularity
  More history
  Smaller time-slices
But vanilla Hadoop seldom the answer
  Enterprise readiness of Hadoop
    Resilience, security, integration friendliness
  Apache tools relatively raw, so look out for other distributions
    Cloudera, Hortonworks, MapR Technologies, IBM InfoSphere BigInsights…
  Mainstream vendors substituting components and extending framework
    Hadoop becoming an engine that sits behind commercial frameworks and tools
    IBM, Microsoft, Oracle, SAP, SAS, EMC, Teradata, …
  And Hadoop doesn’t define the whole advanced data management and analytics opportunity anyway
    Enhanced RDBMS, next generation data warehousing, NoSQL, statistical modelling, predictive analytics, time-series analysis, in-memory databases, stream based processing engines, and more… it’s a pretty lively area
Use of traditional and emerging technologies
Legacy databases and file systems
General purpose RDBMS servers
High performance RDBMS configurations
OLAP multi-dimensional database systems
Write once read many (WORM) databases
Rule-based stream processing engines
In memory databases
Scale-out storage architectures
Distributed indexing and search
Distributed data analytics engines
[Chart: for each technology above, current level of use rated from 5 (Extensive use) to 1 (Not used at all), plus Unsure; and change over next 3 years, from less use to more use]
Taking a joined up approach
[Diagram labels]
  Operational systems
  Operational data
  External feeds
  Advanced analytics
  Data scientists
  Derivative structured data
  Traditional BI systems
  Business models and policies
  Business insights
  Actionable rules
  Business decision makers
  Front line staff
  Customers & suppliers
  £
Common challenges organisations face
  Culture of driving via the rear view mirror
    Too much focus on ‘lag’ rather than ‘lead’ indicators
    Emphasis on planning/score keeping rather than in-flight control
  Management and decision making issues
    Lack of business and political alignment between divisions
    Parochial approach to budgeting and investment in IT
  Fragmented and disjointed systems and information
    Different formats, different coding structures
    Different levels of accuracy, quality and completeness
  Governance and control
    Ownership of source data often ambiguous
    Security, privacy and compliance challenges of centralised big data repositories
  Business and IT staff don’t know what they don’t know
    Locked into historical perceptions and assumptions
    Knowledge and skills gap often not recognised
Looking to the future
  Blurring of the lines
    Big data and traditional BI
    Operational control and analytics
    Analysts and business people
    Managers and front-line staff
    On premise and cloud
    Mobile and office based
KEY QUESTIONS
  How many of those data stores can be combined?
  Layering of analytics tools over big data infrastructure
  Promise and potential of in memory solutions
  Role of deep space skills vs standard models and templates?
  How quickly will the cultural shifts take place?
How much do you agree or disagree with the following statements?
Developments in advanced storage, access and analytics can allow us to tackle problems today that were either too hard or too expensive to deal with in the past
Developments in advanced storage, access and analytics can allow us to take different and better approaches to tackling some key business requirements
Vendors and consulting firms are well geared up to providing us with the support and services we need to take advanced storage, access and analytics on board effectively
[Chart: responses to each statement, rated from 5 (Totally agree) to 1 (Totally disagree), plus Unsure]
Thank You
About Freeform Dynamics
Mission: To make emerging ideas and technologies more accessible to mainstream organisations
  Cut through vendor promises and hype
  Decipher aspirational marketing aimed at early adopters
  Pick the brains of early movers and learn from their experience
  Distil out critical success factors, tips, tricks and traps
  Provide advice to the broader community in plain English
Mechanics
  Briefings with IT vendors and service providers
  Primary research - face to face, telephone and online
  Use of press and social media to get stuff out there