A roadmap for big-data research and education at LTU olov.schelen@ltu.se

Post on 06-Jul-2020


Page 1: A roadmap for big-data research and education › cms_fs › 1.145312! › file › 2014-05-09_LTU_BigDa… · A roadmap for big-data research and education at LTU olov.schelen@ltu.se

A roadmap for big-data research and education at LTU

olov.schelen@ltu.se

Outline

•  What is big data? What’s behind the hype?
•  Industry and academia outlooks
•  Basic tools & frameworks
•  National/international research and innovation agendas
•  Roadmap opportunities:
    •  Mobility & cloud
    •  Internet of Things, cyber-physical systems
    •  Data analytics
    •  Datacenter automation and management, etc.
•  Strengths, weaknesses, opportunities, threats

What is “big data”?

•  Data with properties meeting the 3-4 Vs
    •  volume: from machines, networks, social media, etc.
    •  variety: often unstructured
    •  velocity: continuous flow, often real-time
    •  veracity: full of bias, noise, abnormality, irrelevance

How do we process it?

•  Similar objectives as with any data
    •  creation
    •  retrieval
    •  storage
    •  analysis
    •  presentation
    •  visualization, etc.

•  However, new scalable methods are needed to process the data effectively and efficiently

Origin 1: business analytics and corporate decision making in enterprises

A survey by BARC shows where data comes from

More on enterprise and business analytics

A survey by Jaspersoft shows how data is stored

Origin 2: The big four in cloud

Amazon, Google, Facebook, Yahoo (but now there are hundreds of followers)

•  It is worth studying how their systems are built under the hood.
•  Based on fundamentals in distributed systems research
•  New solutions that are adapted to specific requirements, which allow for trade-offs in order to increase speed
•  Addressing all 4 Vs

Research fields

•  Distributed and pervasive systems, grid systems
•  Computer architecture, virtualization
•  Networking
•  Data mining and big data analytics
•  Automation
•  Control theory

•  In combination with research in application areas (or deep understanding of user needs)!

Toolsets

•  The traditional tools used in the mentioned fields
•  Some relatively new ones specifically for big data processing
    •  Two example stacks are shown on the next pages
•  The potential set is huge and new inventions are added quickly
•  Having some common ground knowledge and a lab that supports those tools is a success factor!

BDAS (Berkeley Data Analytics Stack)

Stratosphere

Notes

•  BDAS and Stratosphere will be presented by their originators at the Cloudberry workshop in June!
•  Whatever toolsets we prefer, they should as far as possible be used in lab assignments at undergraduate and master's level

Arenas and agendas

•  Process IT Innovations
•  Cloudberry Datacenters
•  Centek and county municipality efforts in the region
•  The information driven society (Vinnova SIO)
•  EU arenas, Horizon 2020
•  Partnerships and cooperations

Potential roadmap items follow

•  Initial set, more can be added
•  Mostly focused on systems with experimental research and evaluation
•  Theoretical evaluations where applicable

Mobility and cloud computing

•  Personalized (group) clouds
    •  credentials, security
•  Light-weight distributed cloud architectures
•  Monitoring and profiling
•  Make mobility and cloud even smoother
    •  locality, caching

Distributed algorithms and data structures

•  Based on application-class-specific requirements and trade-offs
•  Many fundamentals were researched decades ago, but with new deltas in requirements, there are opportunities
•  Looking into dynamic scenarios and mobility
•  Not only fast lookups, but also fast re-build of data structures, locality challenges and opportunities, etc.
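
As one illustration of the "fast lookups, but also fast re-build" trade-off, the sketch below uses consistent hashing, a well-known technique that the slide itself does not name. All names here (`HashRing`, `moved_keys`, the `node-*` and `sensor-*` labels, the choice of 64 virtual nodes) are hypothetical and chosen only for the example. Lookups are a binary search on a sorted ring, and when a node joins, only the keys that land on the new node change owner.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a point on the ring (deterministic across runs)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes   # virtual points per node (assumed value)
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Insert the node's virtual points; existing points are untouched."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def lookup(self, key: str) -> str:
        """Return the owner of `key`: first ring point at or after its hash."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._points, _hash(key))
        return self._owner[self._points[idx % len(self._points)]]


def moved_keys(keys, before: HashRing, after: HashRing):
    """Keys whose owner differs between the two rings."""
    return [k for k in keys if before.lookup(k) != after.lookup(k)]


keys = [f"sensor-{i}" for i in range(1000)]
ring_before = HashRing(["node-a", "node-b", "node-c"])
ring_after = HashRing(["node-a", "node-b", "node-c"])
ring_after.add_node("node-d")

moved = moved_keys(keys, ring_before, ring_after)
print(len(moved), "of", len(keys), "keys changed owner")
```

With a plain `hash(key) % n` placement, changing `n` would typically reassign most keys; here the re-build cost is limited to the share taken over by the joining node, which is the kind of requirement-specific trade-off the slide points to.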

Machine learning

Covered in depth in Fredrik Sandin's report

Content distribution and named data networking

•  A major challenge given the growth in data-intensive applications (e.g., video)
•  Interesting in combination with sensor data and similar models where content is produced by billions of devices
    •  Addressing models
    •  Data aggregation

Internet of Things (IoT)

•  By definition connected to the Internet
•  Large number of devices
•  Crowd sensing
•  Aggregation and indexing architectures
•  Open data, or restricted data
•  Resource efficiency (power, bandwidth, storage, space, etc.)
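
To make the aggregation and resource-efficiency points concrete, here is a minimal sketch of a mergeable summary. It assumes a hypothetical setup, not described on the slide, in which edge gateways reduce raw sensor readings to a small fixed-size record and forward only that record upstream. The `Summary` class, its field names, and the sample temperature values are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    """Fixed-size digest of a batch of readings (hypothetical record layout)."""
    count: int
    total: float
    minimum: float
    maximum: float

    @classmethod
    def of(cls, readings):
        """Build a summary from a non-empty batch of raw readings."""
        readings = list(readings)
        if not readings:
            raise ValueError("need at least one reading")
        return cls(len(readings), sum(readings), min(readings), max(readings))

    def merge(self, other: "Summary") -> "Summary":
        """Combine two summaries without access to the raw readings."""
        return Summary(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )

    @property
    def mean(self) -> float:
        return self.total / self.count


# Two gateways summarize locally; an upstream node merges the digests.
gateway_a = Summary.of([20.5, 21.0, 19.5])
gateway_b = Summary.of([22.0, 23.0])
region = gateway_a.merge(gateway_b)
print(region.count, region.minimum, region.maximum, region.mean)
```

Because `merge` is associative and commutative, summaries can be combined in any order and at any level of a gateway hierarchy, so the upstream bandwidth depends on the number of gateways rather than on the number of devices. The price is that only the statistics chosen in advance remain available once the raw readings are discarded.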

Cyber physical systems (CPS)

•  Can encompass IoT technologies
•  But also embedded/closed systems
•  Process industry
•  Real-time systems
•  Availability, fail-over, redundancy

Data analytics

•  Novel analytics methods related to the data presented on previous slides
•  Application-specific data to analyse
•  Where are the gaps?

SWOT

Strengths: good systems knowledge, experimental research, strong industry cooperation.

Opportunities: the growth in datacenter industry, strong arenas, great industry interest, cross-functional projects (applications, software, infrastructure, IoT/M2M).

Weaknesses: late starter in big data, few researchers directly engaged in the topic, too few graduate students in the topic.

Threats: speed, ramp-up of research, lack of international cooperation, insufficient contribution/hype ratio.

So, let's kick off!

Discussions :)