the bigger they are the harder they fall
![Page 1: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/1.jpg)
Be Certain. Be Trillium Certain.
The Bigger They Are The Harder They Fall: Big Data & the Data Quality Imperative
Nigel Turner, VP Strategic Information Management
Tuesday 19th June 2012
![Page 2: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/2.jpg)
The bigger they are the harder they fall…
![Page 3: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/3.jpg)
But big can pay off…
![Page 4: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/4.jpg)
Big Data – what is it?
• Set of new concepts, practices & technologies to manage & exploit digital data
• OVUM defines it as:
– “A data computational problem that is large and varied enough to demand new approaches to traditional SQL & related practices”
• Key premise is that all data has potential value if it can be collected, analysed and used to generate actionable insight
![Page 5: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/5.jpg)
Big Data – its characteristics
The 3Vs
Volume
• Reflects exponential growth of data – predicted 40-60% per annum
• Today 2.5 quintillion bytes of data are created every day
• 90% of all digital data was created in the last two years
Variety
• Data generated more varied and complex than before:
– Text, Audio, Images, Machine Generated etc.
• Much of this data is semi-structured or unstructured
• Traditional IT techniques ill-equipped to process & analyse it
Velocity
• Data often generated in real time
• Analysis and response need to be rapid, often also real time
• Traditional BI / DW environments becoming obsolescent – new approaches are needed
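To put the predicted 40-60% annual growth in perspective, the short calculation below works out how long data volumes take to double at a constant compound rate. It is only an arithmetic illustration: the function name is hypothetical, and it assumes growth stays steady from year to year, which the slide does not claim.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a constant compound annual growth rate.

    annual_growth is a fraction, e.g. 0.40 for 40% per annum.
    """
    return math.log(2) / math.log(1 + annual_growth)


# The two ends of the range quoted on the slide.
for rate in (0.40, 0.60):
    years = doubling_time_years(rate)
    print(f"{rate:.0%} growth per annum: volume doubles in about {years:.2f} years")
```

At these rates the volume doubles roughly every one and a half to two years.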
![Page 6: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/6.jpg)
What’s different about Big Data?
• New technologies which enable distributed & highly scalable MPP (Massively Parallel Processing), e.g.
– Apache Hadoop
– MapReduce
– NoSQL databases
• Strong emphasis on analytical approaches
– Emergence of “data science”
– Predictive Analytics
– Data Mining
• The “democratisation” of data
– Data made available to all (cf Cloud Computing)
– Business-led rather than IT-led BI
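The MapReduce idea named on this slide can be sketched in a few lines. The example below is a single-process illustration of the map, shuffle and reduce steps using a word count; it is not the Hadoop API, and all function names are hypothetical. In a real MPP setting each phase would be spread across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record: str):
    """Map: emit a (key, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    """Run the three phases in sequence over an iterable of records."""
    mapped = chain.from_iterable(map_phase(record) for record in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


records = ["big data big problems", "big data quality"]
counts = map_reduce(records)
print(counts)
```

Because each record is mapped independently and each key is reduced independently, both phases can be parallelised, which is what makes the approach scale.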
![Page 7: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/7.jpg)
Where does Big Data come from?
Widely known sources
![Page 8: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/8.jpg)
Where does Big Data come from?
Social Media & Social Networks
![Page 9: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/9.jpg)
Where does Big Data come from?
Machine Generated data
![Page 10: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/10.jpg)
Big Data – some vertical applications
• Retail: using point of sale & social media data to supplement & enrich traditional CRM / Marketing data
• Insurance & Banking: fraud detection
• Health: holistic patient analysis
• Utilities: consumption peaks & troughs & capacity planning
• Telcos: call routing optimisation & customer churn
• Manufacturing: predictive fault identification & supply chain optimisation
• Research: particle analysis, genomics etc.
![Page 11: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/11.jpg)
Big Data in practice - Volvo
• Every Volvo vehicle has hundreds of microprocessors / sensors
• Data generated is used within the car itself but also captured for analysis by Volvo and its dealers
• All data is loaded into a centralised data analysis hub & integrated with CRM, dealership & product data
• Used to optimise design & manufacturing, enhance customer interaction & improve safety
![Page 12: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/12.jpg)
Big data in practice – fraud detection
![Page 13: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/13.jpg)
Big Data – why invest?
• Better understanding of customer & market behaviour
• Improved knowledge of product & service performance
• Aids innovation in products & services
• Fact based and more rapid decision making
• Enhances revenue
• Reduces costs
• Stimulates economic growth
![Page 14: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/14.jpg)
Big Data – the impact on individuals
• Employees
– Empower & devolve decision making
– Create new job & upskilling opportunities
• Consumers
– Better targeted offers
– Improved products & services that meet needs
![Page 15: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/15.jpg)
Big Data – the privacy concern
![Page 16: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/16.jpg)
Big Data – Foundations of Success
• Identifying the right data to solve the business problem or opportunity
• The ability to integrate & match varied data from multiple data sources
– structured, semi-structured, unstructured
• Building the right IT infrastructure to support Big Data applications
• Having the right capabilities & skills to exploit the data
![Page 17: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/17.jpg)
Big Data – the data integration challenge
[Diagram]
External data sources: social media, sensors, CS data, mobiles
Internal data sources: CRM, billing, ops, sales, prods
Analytics platforms 1, 2, 3 … n
Actionable insight & knowledge
![Page 18: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/18.jpg)
Big Data – Barriers & Pitfalls
• The sheer volume of data – what’s worth using?
• Data extraction challenges
• The ability to match data from disparate sources / formats / media
• The time taken to integrate new data sources
• The risks of mismatching and incorrect identification of individuals
• Legal & regulatory pitfalls
• Security concerns – corporate & individual
• Lack of skills & expertise
• Making the case for investment
![Page 19: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/19.jpg)
Big Data – the Data Quality Imperative (1)
• Need to profile external and internal data sources
• Need to classify data to define what data really matters
• Need to assure the quality of internal (and some external) data sources for accuracy, completeness, consistency
• Need to define & apply business rules & metadata management to how the data will be defined and used
• Need for a data governance framework to ensure consistency & control
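As an illustration of the profiling and business rule points above, here is a minimal sketch that measures completeness per field and checks one field against a simple rule. The sample records, field names and the email pattern are all hypothetical, and rule conformance is used here only as a rough stand-in for the slide's accuracy and consistency checks.

```python
import re

# Hypothetical customer records, invented for illustration only.
records = [
    {"customer_id": "C001", "email": "ann@example.com", "postcode": "SW1A 1AA"},
    {"customer_id": "C002", "email": "", "postcode": "sw1a1aa"},
    {"customer_id": "C003", "email": "bob@example", "postcode": None},
]

# A deliberately simple business rule: something@something.something
EMAIL_RULE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, field):
    """Fraction of rows in which the field is present and non-empty."""
    filled = sum(1 for row in rows if row.get(field))
    return filled / len(rows)


def rule_conformance(rows, field, rule):
    """Fraction of non-empty values of the field that satisfy the rule."""
    values = [row[field] for row in rows if row.get(field)]
    if not values:
        return 0.0
    return sum(1 for value in values if rule.match(value)) / len(values)


profile = {field: completeness(records, field)
           for field in ("customer_id", "email", "postcode")}
email_conformance = rule_conformance(records, "email", EMAIL_RULE)

for field, score in profile.items():
    print(f"{field}: {score:.0%} complete")
print(f"email: {email_conformance:.0%} of populated values pass the rule")
```

Scores like these give a baseline for deciding which sources need remediation before they feed an analytics platform.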
![Page 20: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/20.jpg)
Big Data – the Data Quality Imperative (2)
• Need processes & tools to enable:
– Source data profiling
– Data integration
– Data parsing
– Data standardisation
– Business rule creation & management
– Metadata management & a shared business / IT glossary
– Data de-duplication
– Data normalisation
– Data matching
– Data enrichment
– Data audit
• Many of these functions must be capable of being carried out in real time with zero lag
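Three of the functions listed above (standardisation, de-duplication and matching) can be sketched with the Python standard library. This is a toy illustration under stated assumptions, not how any particular data quality product works: the function names, the sample company names and the 0.85 similarity threshold are all hypothetical.

```python
import re
from difflib import SequenceMatcher


def standardise(name: str) -> str:
    """Lower-case, drop punctuation and collapse repeated whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(cleaned.split())


def deduplicate(names):
    """Keep the first occurrence of each name, comparing standardised forms."""
    seen = set()
    unique = []
    for name in names:
        key = standardise(name)
        if key not in seen:
            seen.add(key)
            unique.append(name)
    return unique


def is_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Fuzzy match: similarity of the standardised forms meets the threshold."""
    ratio = SequenceMatcher(None, standardise(a), standardise(b)).ratio()
    return ratio >= threshold


names = ["Acme Ltd.", "ACME  LTD", "Acme Limited", "Apex Ltd"]
unique = deduplicate(names)
print(unique)
```

The threshold is the sensitive design choice: set too low it merges different entities, set too high it misses true duplicates, which is the mismatching risk raised on the barriers slide.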
![Page 21: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/21.jpg)
Big Data – the key enabler
[Diagram]
External data sources
Internal data sources
Data quality platform: profile, parse, standardise, match, enrich
Analytics platforms 1, 2, 3 … n
Actionable insight & knowledge
![Page 22: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/22.jpg)
Big Data – some algorithms
1. BIG DATA + POOR DATA QUALITY = BIG PROBLEMS
2. DATA DEMOCRATISATION – DATA GOVERNANCE = ANARCHY
3. DATA MASH UPS – DATA QUALITY = DATA MESS
4. BIG DATA ANALYTICS + POOR DQ = WRONG RESULTS
5. BIG DATA – DATA ASSURANCE = JAIL
6. 3V + DATA QUALITY = 4V (VALIDITY)
![Page 23: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/23.jpg)
Big Data – the future
• To date Big Data has been overhyped but now a tipping point has come
• It is here and will grow in volume, velocity & variety
• Immature concept & market so hard to plan – but consolidation is happening
• Big data in a business context reflects emerging generation’s expectations & needs
• Data will increasingly be seen as an asset
• Data skills will become increasingly valued
![Page 24: The Bigger They Are The Harder They Fall](https://reader034.vdocuments.net/reader034/viewer/2022051514/54bb3fcf4a79595d118b458e/html5/thumbnails/24.jpg)
Big Data – how Trillium Software can help
• Current Trillium Software products & services can help you succeed in your Big Data journey:
• Real time & batch data capabilities in:
o Data profiling
o Parsing
o Standardisation
o De-duplication
o Matching
o Enrichment
o Audit
• Strategic consulting services to prepare for and realise Big Data opportunities