IBM Smarter Business 2012 - Big Data Analytics
DESCRIPTION
There is enormous potential in the analysis, refinement, modeling, and visualization of the vast volumes of data that pervade both business and society. Realizing this potential takes more than the capacity to store, transfer, and search data. It requires Big Data Analytics: large-scale analysis of enormous data sets. Here we present the research agenda for Big Data Analytics that SICS is developing together with IBM and other partners from industry and academia. Speakers: Daniel Gillblad, Research Group Leader, SICS; Anders Holst, Senior Research Scientist, SICS; and Flemming Bagger, IBM. Visit http://smarterbusiness.se for more information.
TRANSCRIPT
Big Data Analytics – Challenge and Opportunity
6,000,000 users on Twitter pushing out 300,000 tweets per day
500,000,000 users on Twitter pushing out 400,000,000 tweets per day
83x the users, 1,333x the tweets per day
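The two multipliers on this slide follow from the user and tweet counts above. A minimal Python sketch of the arithmetic; the variable and function names are chosen here for illustration and do not come from the slides:

```python
# Figures as stated on the slide; names are illustrative assumptions.
users_before, users_after = 6_000_000, 500_000_000
tweets_before, tweets_after = 300_000, 400_000_000


def growth_factor(before: int, after: int) -> int:
    """Return how many times larger `after` is than `before`, truncated to a whole number."""
    return after // before


user_growth = growth_factor(users_before, users_after)
tweet_growth = growth_factor(tweets_before, tweets_after)
print(f"{user_growth}x users, {tweet_growth}x tweets per day")
```

Tweet volume grew far faster than the user base, which is the point of the slide: data volume does not scale linearly with the number of sources.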
2+ billion people on the Web by end 2011
30 billion RFID tags today (1.3B in 2005)
4.6 billion camera phones worldwide
100s of millions of GPS-enabled devices sold annually
76 million smart meters in 2009… 200M by 2014
12+ TBs of tweet data every day
25+ TBs of log data every day
? TBs of data every day
Where is big data coming from?
The Characteristics of Big Data
Collectively analyzing the broadening Variety
Responding to the increasing Velocity
Cost efficiently processing the growing Volume
Establishing the Veracity of big data sources
By 2015, 80% of all available data will be uncertain
- The number of networked devices will be double the entire global population
- The total number of social media accounts exceeds the entire global population
2010 to 2020: 50x, 35 ZB
30 Billion RFID sensors and counting
80% of the world's data is unstructured
Data AVAILABLE to an organization
Data an organization can PROCESS
Big Data is a Hot Topic - Because it is Possible to Analyze ALL Available Data
• The percentage of available data an enterprise can analyze is decreasing in proportion to the data available to that enterprise
– Quite simply, this means that as enterprises, we are getting “more naive” about our business over time
• Just collecting and storing “Big Data” doesn’t drive a cent of value to an organization’s bottom line
• Cost-effectively manage and analyze ALL available data in its native form: unstructured, structured, real-time streaming… internal and external
Signals and Noise
Business-centric Big Data Platform
• “Big data” isn’t just a technology—it’s a business strategy for capitalizing on information resources
• Getting started is crucial
• Success at each entry point is accelerated by products within the Big Data platform
• Build the foundation for future requirements by expanding further into the big data platform
Database services that handle large volumes of transactions with high availability, scalability and integrity
Data Warehouse services for complex analytics and reporting on data up to petabyte scale - with minimal administration
Operational Warehouse services for continuous ingest of operational data, complex analytics, and a large volume of concurrent operational queries
Different data workloads have different characteristics
System for Transactions
System for Analytics
System for Operational Analytics
powered by Netezza technology
Big Data Analytics – A national research initiative
Daniel Gillblad
Research Group Leader, Senior Research Scientist
SICS, Swedish Institute of Computer Science
Background
• There is a very large potential, both societal and commercial, in the analysis, refinement, modeling, and visualization of these data sets
• Capacity to store, transfer, and search is not enough - analytics is critical
Additional business value of Analytics
• Predict and optimize business outcomes
• New services and applications, both for end-users and industry
• New value chains, where different actors can create and exchange new analysis services
A national Big Data Analytics initiative
① A strategic nation-wide research and innovation agenda
– Input from several sectors and application areas
– Both new businesses built on analytics applications and traditional industry
– Input from academia, both as developers and as users
② A national Big Data Analytics network
– Open to all interested parties
– Industry and academia with an active interest in Big Data Analytics
Focus areas
Analytics
Computation
Storage
Collection
Visualization
Control and planning
Current constellation
Research and development challenges
• Huge businesses are built on Big Data Analytics today, but a large number of issues must be resolved to fully realize the potential
• Three examples
Example 1: Large-scale physics experimentation
• Challenges: Scale (storage, computation), scalable analytics
Example 2: Social network mining
• Challenges: Unstructured data, biased data, data access
Example 3: Access network pattern mining
• Challenges: Integrity issues, distributed mining, service frameworks
Long term trends
• Currently dominating approach will continue to be successful, but will be complemented due to
– Too much data, unstructured data, noisy data
– Limited access – security, integrity, legal, and business
– Fast data generation, situation awareness
• The consequences are
– Analysis closer to data generation / collection
– No storage - catching information on the fly
– Distributed analysis with incomplete data
– Real time collection, real time analytics
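One way to picture "no storage, catching information on the fly" is a statistic that is updated per observation, so the raw values never need to be kept. The sketch below uses Welford's online algorithm for mean and variance. It is an illustration chosen here, not a method named in the talk, and the class name and sensor readings are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running mean and variance of a stream, updated one value at a time (Welford's method)."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the current mean

    def update(self, value: float) -> None:
        """Fold one new observation into the summary."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; 0.0 until at least two values have been seen."""
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


stats = RunningStats()
for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:  # hypothetical sensor readings
    stats.update(reading)  # the reading itself is not stored after this call
print(stats.count, stats.mean, stats.variance)
```

Memory use stays constant regardless of stream length, which is what makes this style of analysis feasible close to the point of data generation.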
Research challenges
• Research challenges on different levels:
– The sensor/collection level
– The algorithmic/analytical level
– The system level
– The organisational level
Technical challenges, examples
• Computational and storage framework development
• Analysis of unstructured data
• Distributed analysis
• Efficient analysis algorithms
• Stream mining
• Managing sample bias
• Managing uncertain and missing data
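As a small illustration of the stream mining item, reservoir sampling (Algorithm R) keeps a uniform random sample of fixed size from a stream whose length is not known in advance. It is one standard building block, picked here as an example rather than taken from the talk. The function name and parameters are assumptions made for this sketch:

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Keep a uniform random sample of up to k items from a stream of unknown length."""
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(range(100_000), k=5, seed=42)
print(sample)
```

Because every item ends up in the sample with equal probability, downstream estimates are not skewed toward early or late parts of the stream. That only covers bias from when an item arrived, not bias in how the data was collected in the first place.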
Platform and organisational challenges, examples
• Service and analytics frameworks, exchanging models and data
• APIs and standards
• Privacy, integrity, security, and legal
• Business models
Contacts
• If you are interested in the Swedish Big Data Analytics Network, feel free to contact
Daniel Gillblad, [email protected], +46 8 633 15 68
Anders Holst, [email protected], +46 8 633 15 93