Copyright © 2014 Tata Consultancy Services Limited
ICDCN 2014, 6th Jan 2014
Harnessing the power of edge computing devices for Real-time Analytics of IoT data
Dr. Arpan Pal
Principal Scientist and Research Head
Innovation Lab, Kolkata, Tata Consultancy Services
With Arijit Mukherjee, Himadri Sekhar Paul, Swarnabha Dey, Pubali Datta and Batsyan Das
Innovation Lab, Kolkata
Outline
Analytics in Internet of Things
Computing Requirements
Solution Approach – a Framework using Distributed Computing on Edge Devices
Analytics in Internet-of-Things
Internet-of-Things - towards Intelligent Infrastructure
[Figure labels: Signal Processing; Sense, Extract, Analyze, Respond, Learn, Monitor]
Intelligent Infra: @Home, @Building, @Vehicle, @Utility, @Mobile, @Store, @Road
“Intelligent” (Cyber) “Infrastructure” (Physical)
Internet-of-Things (IoT) Framework
[Figure labels: Application Services, Back-end Platform, Internet, Gateway; Sense, Extract, Analyze, Respond; Communication, Computing]
IoT Platform from TCS
RIPSAC – Real-time Integrated Platform Services & Analytics for Cyberphysical Systems
[Figure labels: Sensors, Edge Gateway, Internet, Back-end on Cloud, Traditional Internet, End Users, Administrators]
[Figure labels (platform blocks): Device Integration & Management Services; Analytics Services; Application Services; Storage; Messaging & Event Distribution Services; Presentation Services; Application Support Services; Middleware]
• Service Delivery Platform & App Development Platform
• Security/Privacy Framework
• Lightweight M2M Protocols
• Analytics-as-a-Service
• Social Network Integration
• SDKs and APIs for App developer
• Grid Computing Components
Analytics Use Case - Home Energy Management
Source: IEE - Edison Institute, August 2013, http://blog.opower.com/2013/09/report-smart-meters-in-us-now-generating-more-than-1-billion-data-points-per-day/
“Smart meters in US now generating more than 1 billion data points per day”
Analytics Use Case - Remote Patient Monitoring
In 2012, worldwide digital healthcare data was estimated to be equal to 500 petabytes and is expected to reach 25,000 petabytes in 2020.
Hersh, W., et al. (2011). Health-care hit or miss? Nature, 470(7334), 327.
http://medcitynews.com/2013/03/the-body-in-bytes-medical-images-as-a-source-of-healthcare-big-data-infographic/
Analytics Use Case - 3D Reconstruction with 2D images from mobiles
• Low-cost solution for 3D reconstruction from multiple 2D images captured with a mobile device.
• Motion information derived from the inbuilt sensors of the mobile phone helps increase the accuracy of the 3D reconstruction.
Applications
• Agro-advisory Service
• Remote Diagnostics of Machines
• Remote Healthcare
Take pictures of a heterogeneous object from different angles using a mobile camera.
Extract the camera parameters from the captured images.
Reconstruct the object using the extracted camera parameters.
Dense reconstruction - approx. 0.5 million cloud points from 150 images (5 MP) - 8 minutes on a 16-core CPU
Computing Requirements
Grid Computing for IoT
Intelligent Systems - Intelligence comes from Analytics
Need to crunch huge amounts of sensor data and respond in real time
This needs a very large computing infrastructure in the cloud, with a dynamic load that varies from application to application
Another option is to distribute the computing load to edge devices such as mobile phones
The Grid in IoT is in the Edge - Fog Computing
Source: Flavio Bonomi et al., MCC 2012, Helsinki, Finland
• Need to have economies of scale compared to traditional cloud
At What Cost?
Advantages
• Edge devices' computing power remains unused most of the time
  o Free computing resource for the grid
  o Potentially millions of ~1 GHz processors on the grid, depending upon use case
• Energy cost at edge is typically at consumer rates << energy cost at cloud, which is at enterprise rates
  o Energy cost accounts for 50% of data center opex
Issues
• End-users incur cost for computing energy and data communication
• Security and privacy
• Battery depletion
• What is the incentive for the end-user?
Solution Approach – a Framework for Distributed Computing on Edge Devices
Using Condor based Job Scheduling and Data Partitioning
“Utilising Condor for Data Parallel Analytics in an IoT Context - an Experience Report”, Arijit Mukherjee et al., 9th IEEE International Conference on Wireless and Mobile Computing, Networking and Communications, Workshop on the Internet of Things Communications and Technologies (IoT 2013)
Data Partitioning - Static
[Figure labels: Huge Data Set; Data Parallel Analysis; Processing Infrastructure; Analytics Result]
How to partition the input data set when
• The computing nodes are heterogeneous (memory, CPU)
• They are not always available
R. Arasanal and D. Rumani, “Improving MapReduce performance through complexity and performance based data placement in heterogeneous Hadoop clusters”, In Intl Conf. on Distributed Computing and Internet Technology (ICDCIT), Feb 2013.
A. Banerjee, A. Mukherjee, H. S. Paul, S. Dey, “Offloading work to Mobile Devices: An availability-aware data partitioning approach”, In Proc. of Middleware for Cloud-enabled Sensing (MCS), Dec 2013.
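One way to read the partitioning question on this slide is as a proportional split: each node receives a share of the records in proportion to its compute capability, discounted by how likely it is to stay available. The sketch below is only an illustration under that assumption and is not the method of the cited papers; the `Node` fields, the weighting formula, and the example numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_score: float      # hypothetical relative speed (higher is faster)
    availability: float   # hypothetical probability the node stays up, 0..1


def partition_sizes(total_records: int, nodes: list[Node]) -> dict[str, int]:
    """Split total_records in proportion to cpu_score * availability."""
    weights = [n.cpu_score * n.availability for n in nodes]
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("no usable node")
    # Floor each share, then give the leftover records to the largest remainders
    raw = [total_records * w / total_weight for w in weights]
    sizes = [int(r) for r in raw]
    leftover = total_records - sum(sizes)
    by_remainder = sorted(range(len(nodes)), key=lambda i: raw[i] - sizes[i], reverse=True)
    for i in by_remainder[:leftover]:
        sizes[i] += 1
    return {n.name: s for n, s in zip(nodes, sizes)}


nodes = [Node("server", 4.0, 1.0), Node("phone-a", 1.0, 0.5), Node("phone-b", 1.0, 0.25)]
sizes = partition_sizes(1000, nodes)
print(sizes)
```

With these made-up weights the always-on server takes most of the data, while the two phones get shares that shrink with their availability. Every record is assigned to exactly one node.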
Using Edge Devices - Detailed Framework Architecture
Use edge devices like mobile phones as computing nodes especially when they are connected to chargers and are idle
Mustafa Arslan et al., “Computing While Charging: Building a Distributed Computing Infrastructure Using Smartphones”, In CoNEXT’12, December 10–13, 2012, Nice, France.
Felix Büsching et al., “DroidCluster: Towards Smartphone Cluster Computing - The Streets are Paved with Potential Computer Clusters”, 32nd International Conference on Distributed Computing Systems Workshops, 2012
Need agents on edge devices to find out their capability and availability
Need a generic execution framework on edge devices
Need dynamic data partitioning algorithms based on the sensed capability and availability of edge devices
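The agent requirement above can be sketched as a small probe that reports whether a device is currently usable and how much work it could take. This is a minimal sketch under assumptions: the `DeviceSnapshot` fields, the thresholds, and the capability score are hypothetical, and a real agent would read these values from the mobile OS.

```python
from dataclasses import dataclass


@dataclass
class DeviceSnapshot:
    charging: bool
    screen_on: bool
    cpu_idle_pct: float    # share of CPU time idle, 0..100
    free_memory_mb: int
    cpu_ghz: float


def is_available(s: DeviceSnapshot, min_idle_pct: float = 80.0, min_free_mb: int = 256) -> bool:
    """Eligible only while on a charger, screen off, CPU mostly idle, and memory to spare."""
    return (s.charging and not s.screen_on
            and s.cpu_idle_pct >= min_idle_pct
            and s.free_memory_mb >= min_free_mb)


def capability(s: DeviceSnapshot) -> float:
    """Crude capability score (assumption): clock speed scaled by the idle CPU fraction."""
    return s.cpu_ghz * s.cpu_idle_pct / 100.0


idle_phone = DeviceSnapshot(charging=True, screen_on=False, cpu_idle_pct=90.0,
                            free_memory_mb=512, cpu_ghz=1.0)
busy_phone = DeviceSnapshot(charging=False, screen_on=True, cpu_idle_pct=40.0,
                            free_memory_mb=512, cpu_ghz=1.0)
print(is_available(idle_phone), capability(idle_phone))
print(is_available(busy_phone), capability(busy_phone))
```

A partitioner could use `is_available` to filter devices and `capability` to weight the share of data each one receives.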
Solution Approach
The Execution Engine - BOINC
Source: “Tapping the Matrix: Harnessing distributed computing resources using Open Source Tools”, Carlos Justiniano, http://chessbrain.net/LFBOF2005/tappingthematrix.html
Anderson D. P. et al., “BOINC: a system for public-resource computing and storage”, Fifth IEEE/ACM International Workshop on Grid Computing, 2004.
Berkeley Open Infrastructure for Network Computing
Proposed solution on top of BOINC
Agent on Edge Devices, Dynamic Data Partitioner, Executable/Data/Result Transport Engine
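One simple way to make partitioning dynamic is to cut the data into small equal chunks and let each device pull the next chunk whenever it becomes free, so faster or more available devices naturally process more. The simulation below illustrates that idea only; it does not use BOINC and is not the presenters' partitioner, and the device names and relative speeds are hypothetical.

```python
import heapq


def simulate_pull(num_chunks: int, chunk_cost: float, speeds: dict[str, float]) -> dict[str, int]:
    """Simulate devices pulling equal-sized chunks as they become free.

    speeds maps a device name to a hypothetical relative speed; a chunk
    takes chunk_cost / speed time units on that device.
    """
    # Each heap entry is (time the device becomes free, device name)
    free_at = [(0.0, name) for name in sorted(speeds)]
    heapq.heapify(free_at)
    done = {name: 0 for name in speeds}
    for _ in range(num_chunks):
        t, name = heapq.heappop(free_at)   # earliest free device takes the next chunk
        done[name] += 1
        heapq.heappush(free_at, (t + chunk_cost / speeds[name], name))
    return done


counts = simulate_pull(num_chunks=30, chunk_cost=1.0, speeds={"fast": 2.0, "slow": 1.0})
print(counts)
```

In this run the device that is twice as fast ends up with twice as many chunks, without the scheduler needing to know the speeds in advance.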
Results – I/O Intensive Text Search
Results – Compute Intensive π Calculation
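The slide does not say which method was used for this benchmark. As an illustration of how a compute-intensive calculation of this kind splits into independent work units, the sketch below assumes a midpoint-rule integration of 4/(1+x²) over [0, 1], cut into contiguous slices that separate devices could each evaluate; the function names and the number of parts are hypothetical.

```python
import math


def partial_pi(start: int, stop: int, n: int) -> float:
    """Midpoint-rule sum of 4 / (1 + x^2) over steps start..stop-1 out of n."""
    h = 1.0 / n
    return h * sum(4.0 / (1.0 + ((i + 0.5) * h) ** 2) for i in range(start, stop))


def split_range(n: int, parts: int) -> list[tuple[int, int]]:
    """Cut range(n) into `parts` contiguous, near-equal slices."""
    bounds = [n * k // parts for k in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


n_steps = 100_000
work_units = split_range(n_steps, parts=8)   # one unit per hypothetical device
# Here the units run in one process; on a grid each would run on a different device
pi_estimate = sum(partial_pi(a, b, n_steps) for a, b in work_units)
print(pi_estimate, abs(pi_estimate - math.pi))
```

Each work unit needs only three integers as input and returns one float, which is why a workload like this stresses the processors rather than the network.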
Agent on Edge Devices - Exploiting unique usage pattern
Parameters for identifying relatively free time periods: Data Tx/Rx, Wi-Fi signal, Screen state, App Category, CPU Idle, Cell signal, Memory free
Apply mobile OS/architecture domain knowledge
[Figure: daily timelines showing A’s unique usage pattern and B’s unique usage pattern with idle slots; labels: 8:00am, 9:00am, 6:00pm, 7:00pm, 9:00pm, 11:00pm, To office by bus]
Log: Sun Oct 27 01:21:40 IST 2013 --> 331 999960 true 31.0 -57.0 1.0 com.android.chrome
CPU { Excellent, Good, Average, Fair }
Memory { High, Average, Low }
Signal { Excellent, Poor, Average }
Screen { On, Off }
App { High QOE, Background, Sporadic }
State S = { CPU × Memory × Signal × Screen × App }
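The state definition on this slide is a Cartesian product of five discrete parameters, so the full state space can be enumerated directly. The level names below come from the slide; the `is_free_slot` rule is a hypothetical example of how an agent might label states, not a rule given in the talk.

```python
from itertools import product

CPU = ["Excellent", "Good", "Average", "Fair"]
MEMORY = ["High", "Average", "Low"]
SIGNAL = ["Excellent", "Poor", "Average"]
SCREEN = ["On", "Off"]
APP = ["High QOE", "Background", "Sporadic"]

# S = CPU x Memory x Signal x Screen x App
STATES = list(product(CPU, MEMORY, SIGNAL, SCREEN, APP))
print(len(STATES))  # 4 * 3 * 3 * 2 * 3 = 216


def is_free_slot(state: tuple[str, str, str, str, str]) -> bool:
    """Hypothetical rule: screen off, no high-QoE app, and no parameter at its worst level."""
    cpu, memory, signal, screen, app = state
    return (screen == "Off" and app != "High QOE"
            and cpu != "Fair" and memory != "Low" and signal != "Poor")


free_states = [s for s in STATES if is_free_slot(s)]
print(len(free_states))  # 3 * 2 * 2 * 1 * 2 = 24
```

With only 216 states, an agent could keep a per-state count for each hour of the day and learn when a given user's device is usually free.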
Ongoing and Future Work
• Automated dynamic sensing of edge device capability and availability based on Edge Device Agent
  – Improved dynamic data partitioner
• Addressing Security and Privacy
  – Security issue of personal edge devices allowing foreign executables to run – sand-boxing feature in BOINC
  – Privacy issue of analytics on one user’s data happening on another’s edge device – need to build trust models
• Energy depletion of battery-powered devices
  – Compute-while-charging
• Network congestion due to data movement
  – Reduced-overhead lightweight communication
• Incentivization of people donating their edge devices to the grid
  – Bid-based approach
Thank You