www.cs.helsinki.fi
Internet of Things and Big Data for Smarter Systems
Professor Sasu Tarkoma, Department of Computer Science, University of Helsinki
Toward Internet of Things
[Figure: timeline from 1875 to 2025 showing three waves of connectivity: 1 billion places (global connectivity), 7 billion people (personal mobile), and hundreds of billions of things (digital society).]
www.decentlab.com
http://www.technologyreview.com/view/512166/growing-up-with-google-glass/
Internet of Things (IoT) Program
• National Digile SHOK research program 2012-2015
• Develops solutions for the key IoT challenges and development of ecosystems and business models
• Ericsson leads the consortium
• University of Helsinki coordinates the academic research
• More than 350 researchers from over 40 organizations involved
• Estimated program budget 50 million €
• More information: www.iot.fi
IoT Program Partners 2014
Big companies
• Corenet • Elektrobit • Ericsson • Falck • F-Secure • Intel • NSN • Pohjanmaan Verkkopalvelut • Softera • TeliaSonera
SMEs
• 4G-Service • Arch Red • Bluegiga • Cybercube • Finwe • FRUCT • iProtoXi • Jolla • Laturi • Mattersoft • Mikkelin Puhelin • Mobisoft • Mohinet • Nixu • Probot • Refecor • Vediafi
Research Organizations
• Aalto University • Laurea University of Applied Sciences • Tampere University of Technology • University of Helsinki • University of Jyväskylä • University of Oulu • University of Tampere • VTT Technical Research Centre of Finland
The Way from Silos to Platforms
Goal is to move from silos towards horizontal solutions. Within 4 years the foundations for new horizontal solutions shall exist!
[Figure: IoT ICT SHOK roadmap, 2012 to 2015. Vertical silos (Health, Transport, Security, Energy, ...), each with its own apps over fixed/wireless connectivity, give way to apps on a shared horizontal layer (interoperability, connectivity, access control, service discovery, privacy) over IoT optimized fixed/wireless connectivity.]
Program Results
Vision: By 2017 the Finnish ICT industry is a recognized leader in the IoT domain due to its expertise in standards, software, devices, and business models integrating various vertical industry segments
• Ecosystem seeds and demonstrators on industrial communications (M2M, chromium mine, factories, smart grid, HetNets), smart apartments, intelligent traffic, consumer devices, …
• Work on both IoT and regular protocol stack for IoT deployments
• Significant contributions to IETF CoAP and HOMENET, IEEE 802.11ah, 3GPP LTE, …
• The program has published or submitted over 100 scientific articles, including in IEEE Communications, IEEE Transactions on Mobile Computing, IEEE Network, IEEE PerCom, ACM SenSys, ACM MobiCom, ACM UbiComp, ACM ExtremeCom, …
Internet of Things and Big Data
• The billions of devices will generate massive amounts of data
• Data has immense value for optimizing existing systems and creating new services
• The Big Data challenge: how to store, process, and share the data and obtain the valuable insights while maintaining security and privacy
• Overall we have three parts: connecting things, analyzing the data, and reacting to the observations
IoT and Big Data Applications: Real-time situational awareness, condition and security monitoring, traffic management, smart cities, …
• Data gathering, processing, and control at the edge: streaming
• Data processing in the network (4G/5G Mobile Core): streaming, edge analytics (can be virtualized)
• Data processing in the computing cluster (cloud): streaming, batch processes, Big Data frameworks
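To make the edge tier concrete, here is a minimal sketch of streaming analytics at the edge. It assumes a simple sliding-window filter that keeps raw readings local and forwards only compact summaries and alerts upstream. The function name `edge_summaries`, the window size, and the threshold are illustrative assumptions, not part of the architecture described in the slides.

```python
from collections import deque
from statistics import mean


def edge_summaries(readings, window=5, threshold=10.0):
    """Yield (index, window_mean, is_alert) for each full sliding window.

    Hypothetical edge-side filter: raw readings stay at the edge and only
    compact summaries (with an alert flag) are forwarded upstream.
    """
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        recent.append(value)
        if len(recent) == window:
            avg = mean(recent)
            # Alert when the newest reading deviates strongly from the window mean.
            yield i, avg, abs(value - avg) > threshold


if __name__ == "__main__":
    stream = [20.0, 21.0, 19.5, 20.5, 20.0, 45.0, 20.0]
    for index, avg, alert in edge_summaries(stream):
        print(index, round(avg, 2), "ALERT" if alert else "ok")
```

A cloud-side batch job could later recompute thresholds from the forwarded summaries. That split between streaming at the edge and batch processing in the cluster is the point of the three tiers.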
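MapReduce paradigm
[Figure: MapReduce execution overview. (1) The user program forks a master and workers. (2) The master assigns map and reduce tasks to workers. (3) Map workers read the input splits (split 0 to split 4) and (4) write intermediate files to local disks. (5) Reduce workers read the intermediate files remotely and (6) write the output files (output file 0, output file 1). Stages: input files, map phase, intermediate files (on local disks), reduce phase, output files.]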
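The map, intermediate grouping, and reduce stages can be sketched in a few lines. This is a single-process illustration using the classic word-count example, not a distributed implementation. The function names are chosen for this example only, and a real framework would run the map and reduce tasks on separate workers.

```python
from collections import defaultdict


def map_phase(split):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle(mapped_outputs):
    """Group intermediate values by key, as the remote read before reduce does."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def map_reduce(splits):
    mapped = [map_phase(s) for s in splits]   # one map task per input split
    grouped = shuffle(mapped)                 # intermediate data, grouped by key
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(map_reduce(["big data", "data from things", "big things"]))
```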
Current Directions
• Smart devices and machines with Big Data analytics
• New levels of connectivity through Internet of Things and 5G
• Real-time sensing and sensing as a service
• Visions and views:
– Industrial Internet (General Electric)
– Internet of Everything (Cisco)
– Industry 4.0 in Germany
– Internet of Things
Collaborative Data Analysis for Smarter Energy Efficiency and Security
Motivation
Many heterogeneous, active devices and many users with different intents
– What kind of behavior is normal or typical? Battery lifetime? Risk level?
Introducing Carat
Carat is the first system to use the mobile device community to detect and correct energy problems
Our method for diagnosing energy anomalies uses the community to infer a specification (expected energy use), and we call deviation from that inferred specification an anomaly
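The idea of inferring expected energy use from the community and flagging deviations can be illustrated with a short sketch. It is a simplification under stated assumptions, not Carat's actual statistical method. It compares the mean battery drain rate of a subject against a community reference and flags an anomaly when the difference exceeds the combined standard error by a factor `z`. The function name, the `z` value, and the sample numbers are all made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def is_anomaly(with_subject, reference, z=1.96):
    """Return True if drain rates in `with_subject` exceed the community
    reference by more than the combined uncertainty of the two means.

    Both arguments are lists of battery drain rates (% per hour).
    """
    m_subj, m_ref = mean(with_subject), mean(reference)
    # Standard error of the difference between the two sample means.
    se = sqrt(stdev(with_subject) ** 2 / len(with_subject)
              + stdev(reference) ** 2 / len(reference))
    return m_subj - m_ref > z * se


community = [4.0, 5.0, 4.5, 5.5, 5.0, 4.8]     # rates when the app is not running
app_running = [8.0, 9.5, 8.7, 9.0, 8.4, 9.1]   # rates when the app is running
print(is_anomaly(app_running, community))
```

The same comparison can serve two purposes depending on the reference set. Comparing against community samples without the app approximates a hog check. Comparing one device's samples with the app against other devices running the same app approximates a bug check.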
Carat
● Originated in UC Berkeley, in collaboration with University of Helsinki
● Mobile app for Android and iOS
● Currently over 770 000 users
● >2TB of data, >100 million measurements
● Research project with many directions
● http://carat.cs.helsinki.fi
17/11/14
The Carat project: System
What is Carat?
● Users see Hogs, high energy use apps
● And Bugs that use energy faster on THEIR device than on others
● Users with these issues quickly see battery life benefits once they are taken care of
Group receiving recommendations improved battery life by 41%
Collaborative Data Gathering
● Each device collects battery life, timestamp, running apps, system settings
● The data is combined and results for your apps and your device sent back to you
● Collaborative aspect: We know trends in the community, as well as how your device is different
● This can be used for phones, sensors, houses, base stations, servers, laptops, … anything that generates measurements
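As a rough illustration of the data a device might report, the sketch below defines a sample record and derives battery drain rates from consecutive samples. The class `Sample`, its field names, and the rule for skipping charging intervals are assumptions made for this example and do not reflect the actual Carat data format.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """One measurement from a device (field names are hypothetical)."""
    device_id: str
    timestamp: float            # seconds since the epoch
    battery_level: float        # percent, 0 to 100
    running_apps: frozenset = field(default_factory=frozenset)


def drain_rates(samples):
    """Yield (apps, percent_per_hour) for consecutive discharging samples."""
    ordered = sorted(samples, key=lambda s: s.timestamp)
    for prev, cur in zip(ordered, ordered[1:]):
        hours = (cur.timestamp - prev.timestamp) / 3600.0
        drop = prev.battery_level - cur.battery_level
        if hours > 0 and drop > 0:          # skip charging or duplicate samples
            yield cur.running_apps, drop / hours


samples = [
    Sample("d1", 0.0, 100.0, frozenset({"mail"})),
    Sample("d1", 1800.0, 97.0, frozenset({"mail", "game"})),
    Sample("d1", 3600.0, 98.0, frozenset({"mail"})),   # charging, skipped
]
print(list(drain_rates(samples)))
```

Rates computed this way on many devices could then be pooled on a server, which is what makes it possible to compare one device against community trends.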
How Prevalent is Mobile Malware?
[Figure: widely differing published estimates of mobile malware prevalence: 0.0009% (NDSS 2013), 2.6%, and 4.3%.]
Detecting Malware
● Carat collected the public key used to sign applications
● 77K users and 460K apps
● We obtained thousands of application, signature, version records
● We compared them with blacklists from multiple anti-malware vendors and projects
– McAfee, Mobile Sandbox, MalGenome, ...
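The comparison against blacklists amounts to a set intersection over application records. The sketch below assumes each record is a tuple of package name, signing certificate hash, and version code. The function name `find_infected` and all records shown are made up for illustration and do not come from any real blacklist.

```python
def find_infected(device_apps, blacklists):
    """Map each device to the installed apps that appear on any blacklist.

    device_apps: {device_id: set of (package, cert_hash, version_code)}
    blacklists:  {source_name: set of the same kind of tuples}
    """
    known_malware = set().union(*blacklists.values())
    hits = {}
    for device, apps in device_apps.items():
        matched = apps & known_malware
        if matched:
            hits[device] = matched
    return hits


# Entirely made-up records for illustration.
blacklists = {
    "vendor_a": {("com.example.bad", "ab12", 3)},
    "project_b": {("com.example.worse", "cd34", 1)},
}
device_apps = {
    "d1": {("com.example.mail", "ee01", 7), ("com.example.bad", "ab12", 3)},
    "d2": {("com.example.mail", "ee01", 7)},
    "d3": {("com.example.bad", "ab12", 4)},   # same package, different version
}
infected = find_infected(device_apps, blacklists)
rate = len(infected) / len(device_apps)
print(infected, f"{rate:.1%}")
```

Matching on the full tuple is a conservative choice: a record with the same package name but a different version or signing key is not counted as infected.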
Malware Infection Rates
Changes in Behavior
[Figure: bar chart comparing beginners and advanced users (axis 0 to 40) across reported changes: stopped using applications and replaced them with similar ones; kill running applications more often; use hogs and bugs less; stopped using applications and did not replace functionality; restart applications more often; did not change behavior.]
[Figure: Top Games, in minutes (axis 0 to 250).]
Carat: Collaborative Energy and Malware Diagnosis
Eemil Lagerspetz, Ella Peltonen, Sasu Tarkoma
Department of Computer Science, University of Helsinki
Faculty of Science
Mobile Applications
Take samples and show personal reports
Android and iOS
J-Score lets users compare with others
Recommended Actions
Bugs
Hogs
Carat Data Analysis [1]
Scalable machine learning and data mining methods
Carat anomaly detection uses basic statistics and the community defines what is normal
We investigate distributed machine learning techniques to improve the accuracy and give more detailed recommendations
Mobile Malware Prognosis [2]
We detected infected devices in the Carat dataset using Android package name, developer certificate hash, and version code.
0.26% - 0.28% of Android devices are infected with known malware
Prediction technique that can identify vulnerable devices to be scanned with more expensive techniques
We can reduce the set of devices for deep scanning by a factor of 5
Carat Core
Receives data from 700,000 users
Collaborative energy anomaly detection
Computes personalized reports
Over 50M data items
240K Bugs, 16K Hogs
124 device models
500 GB of data
10-node Spark cluster in EC2
4 cores, 15-32 GB RAM
Energy Hogs
Use more energy than the average app
Defined by crowdsourced data
Users that stopped using hogs and bugs gained up to 41% more battery life
Hogs can be caused by an app's normal behavior, such as video and games
They can be caused by excessive use of network, screen, advertising, or programming errors (not releasing a lock)
Sizes of the three malware datasets and the extent of overlaps among them.
The MDoctor app shows infection status as an intuitive traffic signal. The app predicts infection and shows a list of risky apps.
Our infection estimate is higher than previous research, but lower than some AV vendors.
[1] Oliner, Iyer, Stoica, Lagerspetz, and Tarkoma. Carat: Collaborative Energy Diagnosis for Mobile Devices. ACM SenSys 2013.
[2] Truong, Lagerspetz, Nurmi, Oliner, Tarkoma, Asokan, and Bhattacharya. The Company You Keep: Mobile Malware Infection Rates and Inexpensive Risk Indicators. WWW 2014.
[3] Athukorala, Jylhä, Lagerspetz, von Kügelgen, Oliner, Tarkoma, and Jacucci. How Carat Affects User Behavior: Implications for Mobile Battery Awareness Applications. ACM CHI 2014.
Carat aims to diagnose energy anomalies and their root causes, such as OS version, connectivity type, and user mobility.
[Figure: factor tree for devices with SwiftKey (SK), with branches UG/DG, M/NM, and W/NW]
Upgrade OS (UG): +30 ± 2 min
Downgrade OS (DG): -14 ± 2 min
No movement (NM): +10 ± 3 min
Move around (M): -24 ± 4 min
Use WiFi (W): +30 ± 5 min
Disable WiFi (NW): -14 ± 4 min
Human Factors [3]
We conducted a survey of 1,000 Carat users
The results show that long-term Carat users
– save more battery
– charge their devices less often
– learn to manage their battery with less help from Carat
Malware infection rates are higher than conservative estimates (0.26% of devices)
– Google says 0.12% of manually installed packages are malware, not very far from this number
– Lookout Antivirus predicts >1%
An Early Warning System for Malware
– A lightweight technique for identifying devices at risk
– By looking at applications that occur with malware, it is possible to predict infection 5x better than choosing devices at random
– Useful for administrators, organisations (Bring Your Own Device scenario)
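One way to picture prediction from co-occurring applications is a lift calculation: find benign apps that appear disproportionately often on infected devices, then score other devices by how many of those apps they carry. The sketch below is a simplified illustration under that assumption and is not the method from the cited work. The function names, the `min_lift` threshold, and the toy data are hypothetical.

```python
from collections import Counter


def risky_apps(device_apps, infected, min_lift=1.5):
    """Return benign apps that appear disproportionately on infected devices.

    device_apps: {device_id: set of benign app names}
    infected:    set of device ids known to carry malware
    """
    base = len(infected) / len(device_apps)          # overall infection rate
    seen, seen_infected = Counter(), Counter()
    for device, apps in device_apps.items():
        for app in apps:
            seen[app] += 1
            if device in infected:
                seen_infected[app] += 1
    # Lift: infection rate among devices with the app, relative to the base rate.
    return {app for app in seen
            if base > 0 and (seen_infected[app] / seen[app]) / base >= min_lift}


def risk_score(apps, indicators):
    """Count how many indicator apps a device has installed."""
    return len(apps & indicators)


# Toy data, made up for illustration.
device_apps = {
    "d1": {"flashlight", "mail"},
    "d2": {"flashlight", "game"},
    "d3": {"mail", "game"},
    "d4": {"mail"},
}
indicators = risky_apps(device_apps, infected={"d1", "d2"})
print(indicators, risk_score({"flashlight", "maps"}, indicators))
```

Devices with the highest scores would be the first candidates for a more expensive deep scan, which is how such a ranking could shrink the set of devices to inspect.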
MDoctor: Increasing Awareness of Infection Vulnerability
MDoctor shows status of applications according to a malware dataset
Infection vulnerability can be seen from device health
Three metrics for application analysis: malware correlation, key rarity, and market vulnerability
Department of Computer Science / Eemil Lagerspetz / MDoctor / 06/27/14 / www.helsinki.fi/yliopisto
MDoctor: Increasing awareness of infection vulnerability
● MDoctor shows status of applications according to a malware dataset (User chooses)
● Infection vulnerability can be seen from device health
● We use three metrics: malware correlation, key rarity, and market vulnerability
● http://is.gd/mdoctor
● Will be on Google Play later
Towards Smarter Systems with IoT and Data Analytics
SMARTPHONE ENERGY CONSUMPTION
Modeling and Optimization
Sasu Tarkoma, Matti Siekkinen, Eemil Lagerspetz, Yu Xiao
ISBN 9781107042339
Cover illustration: smartphone and full battery © iStockphoto.com/fonikum and koya79. Cover designed by Zoe Naylor.
Sasu Tarkoma is Full Professor in the Department of Computer Science at the University of Helsinki, Finland. He has worked in the IT industry as a consultant and chief system architect as well as principal researcher and laboratory expert at Nokia Research Center. His interests include mobile computing, internet technologies, and middleware.
Matti Siekkinen is Teaching Research Scientist at Aalto University, Finland. He has co-authored over 40 scientific publications and his research interests include efficiency of mobile devices, network measurements, and protocols.
Eemil Lagerspetz is a doctoral student in the Department of Computer Science at the University of Helsinki, Finland. His research interests include mobile energy awareness, data analysis and cloud computing. He has published many scientific articles on mobile energy efficiency.
Yu Xiao is Postdoctoral Researcher in the Department of Computer Science and Engineering at Aalto University, Finland. Her research interests include energy-efficient wireless networking, mobile cloud computing, and mobile crowd-sensing.
With an ever-increasing number of applications available for mobile devices, battery life is becoming a critical factor in user satisfaction. This practical guide provides you with the key measurement, modeling, and analytical tools needed to optimize battery power by developing energy-aware and energy-efficient systems and applications.
As well as the necessary theoretical background and results of the field, this hands-on book also provides real-world examples, practical guidance on assessing and optimizing energy consumption, and details of prototypes and possible future trends. Uniquely, you will learn about energy optimization of both hardware and software in one book, enabling you to get the most from the available battery power.
Covering experimental system design and implementation, the book supports assignment-based courses with a laboratory component, making it an ideal textbook for graduate students. It is also a perfect guidebook for software engineers and systems architects working in industry.
Related Publications
• A. J. Oliner, A. P. Iyer, I. Stoica, E. Lagerspetz, S. Tarkoma. Carat: Collaborative Energy Diagnosis for Mobile Devices. In ACM SenSys 2013.
• A. J. Oliner, A. Iyer, E. Lagerspetz, S. Tarkoma, I. Stoica. Carat: Collaborative energy debugging for mobile devices. In HotDep 2012.
• A. J. Oliner, A. P. Iyer, E. Lagerspetz, I. Stoica, and S. Tarkoma. Carat: Collaborative Energy Bug Detection. Poster and demo at the proceedings of the 9th USENIX Symposium on Networked Systems Design and Implementation (NSDI '12), San Jose, California.
• K. Athukorala, E. Lagerspetz, M. von Kügelgen, A. Jylhä, A. J. Oliner, S. Tarkoma, G. Jacucci. How Carat Affects User Behavior: Implications for Mobile Battery Awareness Applications. ACM CHI 2014.
• H. T. T. Truong, E. Lagerspetz, P. Nurmi, A. J. Oliner, S. Tarkoma, N. Asokan, S. Bhattacharya. The Company You Keep: Measuring Mobile Malware Infection Rates and Identifying Inexpensive Predictors of Susceptibility to Infection. Proceedings of WWW 2014.
• E. Lagerspetz, H. Truong, S. Tarkoma, N. Asokan. MDoctor - A Mobile Malware Prognosis Application. DASec workshop in conjunction with ICDCS 2014.
• S. Tarkoma, M. Siekkinen, E. Lagerspetz, Y. Xiao. “Smartphone Energy Consumption: Modeling and Optimization”, August 2014, Cambridge University Press.