Computer Measurement Group, India
www.cmgindia.org
Performance Modeling of IoT Applications
Dr. Subhasri Duttagupta, TCS
Contents
• Introduction to IoT System
• Performance Modeling of an IoT platform
• Performance Modeling of a sample IoT Application
• Performance Modeling of a real-life WSN system
• Summary
Motivating Examples
Scalability Analysis
End-to-End Delay Analysis
WNA Lab, Amrita University
Elements of a Typical IoT Platform
[Platform diagram]
• Devices: things with embedded sensors, mobile devices, and gateway devices, connecting over http(s), tcp, udp, mqtt, LWM2M and device protocols such as OPC-UA, Modbus and Continua
• Cloud services: device management, device agents, sensor data management, message routing & event processing, analytics, and RESTful APIs
• Consumers: apps, clients & portals used by developers
A high-performance, scalable platform for the Internet of Things
Questions that Performance Modelling can answer
• When does any of the subsystems become a bottleneck
  – as the sensor data rate increases?
  – as the number of queries injected by users increases?
• How many VMs does a particular subsystem need to handle a certain load?
• Will the SLA be met for a certain growth in the number of users?
• What kind of performance modelling is useful in a specific situation?
Challenges in Performance Modelling of IoT
• Diverse technologies
• Huge number of smart devices of various types
• Lack of suitable platforms and tools for testing the end-to-end system
• Difficulty in predicting the exact workload mix
• Frequent addition of new services and devices
• Changes in the deployment platform
Modeling Background
• Closed systems are characterized by the number of users and their think time.
• Open systems are characterized by the rate of arrival of requests.
• Service demand: the amount of CPU/disk time spent serving one unit of output (resource utilization / throughput).
These are the inputs to the model.
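The service-demand definition above is the Utilization Law rearranged; a minimal sketch (the function name and sample numbers are illustrative, not from the talk):

```python
def service_demand(utilization, throughput):
    """Utilization Law: U = X * D, so D = U / X.

    utilization: fraction of time the resource (CPU/disk) is busy, 0..1
    throughput:  completed requests per second (X)
    returns:     service demand D, in seconds of resource time per request
    """
    return utilization / throughput

# e.g. a CPU that is 60% busy while the system completes 50 req/s
# spends 0.6 / 50 = 0.012 s (12 ms) of CPU time per request
print(service_demand(0.60, 50.0))
```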
What Questions Modelling Helps to Answer
• Which component becomes the bottleneck when the system is handling the peak data rate from a number of sensors?
• How many VMs are required at each layer
  – for a certain number of sensors with a response-time SLA?
  – for a certain number of API clients with a response-time SLA?
• Can the system handle a certain growth in users accessing the IoT services?
• Can the platform support different types of APIs (random-access query, range query, sequential-scan query) simultaneously without affecting performance?
Steps in Modelling Exercise
1. Understand the architecture
   • Identify the components that are significant
2. Analyse the commonly used workloads and their parameters
3. Run performance tests, or analyse available data, to obtain service demands for each type of workload
4. Analyse the workload to find any variation in service demands under certain conditions
5. Decide the flow of requests within the model
   • Attach probabilities to the various alternate flows
Architecture Diagram of a Subsystem
[Architecture diagram: Sensor Observation Services (SOS). REST clients reach Tomcat NIO servers hosting the SOS (Spring MVC, Phoenix JDBC, Derby). The SOS instances share a Hazelcast distributed cache and pull queued IDs from an API ID generation service through a message exchange. Observations and audit trails are stored in an HBase cluster, with the Phoenix coprocessor running on the region servers.]
Workloads for Sensor Observation Services
Different APIs
• GetObs – latest, by sensor, by time range
• PostObs
• Get/Post Sensor
• GetFeature – get the features of sensors
• GetCapability – get the capabilities of the SOS
Find out whether an API's output depends on the parameters passed to it.
JMT: Powerful Java Modelling Tool
• Developed since 2002 by 10+ generations of PG and UG students at Politecnico di Milano and Imperial College London
• http://jmt.sourceforge.net/
• JMT is open source: GPL v2
– size: ~4,000 classes; 21MB code; ~200k lines
• Download the jar file and simply run: java -jar JMT.jar
• M. Bertoli, G. Casale, G. Serazzi. JMT: Performance Engineering Tools for System Modeling. ACM SIGMETRICS Performance Evaluation Review, 36(4), pp. 10–15, March 2009, ACM Press.
JMT – Java Modeling Tools
• JSIMgraph – queueing network models simulator with a graphical user interface
• JSIMwiz – queueing network models simulator with a wizard-based user interface
• JMVA – Mean Value Analysis and approximate solution algorithms for queueing network models
• JABA – asymptotic analysis and bottleneck identification of queueing network models
• JWAT – workload characterization from log data
• JMCH – Markov chain simulator
Analytical Modeling of SOS Model
[Queueing network models of the SOS, in an HBase version and a Postgres version, each built from delay stations and queueing stations.]
Modeling of SOS with PostGres
SOS Modules
• Predicting maximum throughput and response time for a specific deployment
• Predicting performance for a different deployment
• Use of performance-mimicking benchmarks [Duttagupta, IoT 2016]
[Diagram: a performance testing tool drives the SOS modules with a configured number of threads and think time, and measures utilization and throughput.]
Performance of a Single API on AWS (using Postgres)
• Throughput saturates at 256 users, at 123 trans/sec.
• Response time increases beyond 1 sec past 256 users.
• To scale to a higher number of users, we need to add more Tomcat VMs.
[Charts: PostObs on AWS – throughput (txn/sec) and response time (ms) vs. number of users, actual vs. predicted. Response time reaches ~2 s at the highest load.]
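The saturation behaviour described here can be reproduced with exact Mean Value Analysis for a closed system. A minimal sketch; the demand and think-time values below are illustrative assumptions, not the measured ones:

```python
def mva(demands, think_time, users):
    """Exact single-class MVA for a closed queueing network.

    demands:    per-visit service demand (s) at each queueing station
    think_time: user think time Z (s), modelled as a delay station
    users:      closed population N
    Returns (throughput X, response time R) at population N.
    """
    queue = [0.0] * len(demands)               # mean queue length per station
    X = R = 0.0
    for n in range(1, users + 1):
        resid = [d * (1 + q) for d, q in zip(demands, queue)]
        R = sum(resid)                         # total response time
        X = n / (R + think_time)               # throughput, by Little's law
        queue = [X * r for r in resid]
    return X, R

# hypothetical demands: 8 ms at Tomcat, 3 ms at the database; Z = 1 s
for n in (64, 256, 512):
    X, R = mva([0.008, 0.003], 1.0, n)
    print(n, round(X, 1), round(R * 1000, 1))
# throughput flattens near 1/0.008 = 125 txn/s once Tomcat saturates
```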
Mixed API for a Different Datastore (HBase)
• GetObsLatest on 10% of the threads and PostObs on 90% of the threads
• Modeling helps predict the performance of the mixed API given that of each single API
[Charts: GetObsLatest + PostObs – throughput and response time (ms) vs. number of users, actual vs. predicted. Service demands: PostObs SD = 12.5 ms, GetObsLatest SD = 5.2 ms.]
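Using the service demands above and the 90/10 mix, a first-cut bound on the mixed-API throughput comes from the mix-weighted demand; this is a single-resource simplification (a multi-class queueing model is more accurate):

```python
# (probability in the mix, Tomcat service demand in seconds)
mix = {"PostObs": (0.90, 0.0125), "GetObsLatest": (0.10, 0.0052)}

d_mix = sum(p * d for p, d in mix.values())   # mix-weighted service demand
x_max = 1.0 / d_mix                           # throughput bound if a single
                                              # server with this demand is
                                              # the bottleneck
print(round(d_mix * 1000, 2), "ms ->", round(x_max, 1), "req/s upper bound")
```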
Does API support Horizontal Scalability?
With 2 Tomcat VMs, the application scales up to 512 users.
- The model predicts the API to have linear scalability, and actual test results confirm that two VMs scale to twice the number of users without increasing the response time (~0.5 s).
[Charts: PostObs with 2 VMs on AWS – throughput (txn/sec) and response time (ms) vs. number of users, actual vs. predicted.]
Number of VMs Required for a Response-Time SLA
• The Tomcat layer scales horizontally
• The PostGres VM needs to be upgraded to a bigger VM beyond 1280 users
• Response time SLA = 1 sec
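Given the earlier observation that one Tomcat VM supports about 256 users within the SLA and that scaling is linear, the VM count for the scaling tier can be sketched as follows (the helper name is illustrative, and the per-VM capacity is a parameter):

```python
import math

def tomcat_vms_needed(users, users_per_vm=256):
    """VMs needed at a horizontally scaling tier, assuming the measured
    per-VM capacity (~256 users within the 1 s SLA) and linear scaling,
    as the 2-VM experiment suggested."""
    return math.ceil(users / users_per_vm)

print(tomcat_vms_needed(512))    # 2 (matches the 2-VM experiment)
print(tomcat_vms_needed(1280))   # 5
```

Note this only sizes the scaling tier; the non-scaling PostGres VM still saturates around 1280 users and must be upgraded instead.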
Modeling Challenge – Service demand variability
Service demand varies with a higher number of threads; it also depends on the API.
API             No of threads   Tomcat Service Demand
PostObs         64              11.2 ms
PostObs         768             8.5 ms
PostObs         1024            7.2 ms
GetObsLatest    64              8.7 ms
GetObsLatest    256             3.8 ms
GetObsLatest    512             2.4 ms
GetObsbySensor  32              44 ms
GetObsbySensor  64              50.8 ms
GetObsbySensor  128             68.4 ms
Modeling of a Rule Processing Engine
• Factors Impacting Performance
• Complexity of Rule applied on messages
• Payload of messages
• Inter-arrival delay of consecutive messages
• No of rules applied to an observation
• No of message producers and consumers
• Server/VM architecture
• We consider the effect of message rate and complexity of rules
Architecture of a Rule Processing Engine
• Messages first come to the tenant RabbitMQ, based on API keys
• The rule processing engine then routes them to various topic exchanges
• Multiple topic MQs exist for multiple tenants with the same topic
Model of a Rule Processing Engine
Performance of a Message Routing System
Message latency is very low until either the RabbitMQ VM or the MR VM saturates.
Once utilization exceeds 90%, latency can increase rapidly due to queue build-up at the server.
[Chart: latency (ms) in the MQ module vs. flow rate (msg/sec), actual vs. predicted.]
Flow rate   Latency       RabbitMQ CPU%
900/s       8 ms          80%
975/s       13 ms         87.4%
1050/s      1.9 – 8 sec   93.5%
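The sharp knee in the table is the classic open-system effect; an M/M/1 sketch shows why latency explodes as utilization approaches 1. The per-message service time below is derived from the table via the Utilization Law (0.8 / 900 ≈ 0.9 ms), but the model itself is an illustrative simplification:

```python
def mm1_response_time(service_time, utilization):
    """M/M/1 mean response time W = S / (1 - rho)."""
    if utilization >= 1.0:
        return float("inf")        # unstable: queue grows without bound
    return service_time / (1.0 - utilization)

s = 0.0009                         # ~0.9 ms per message (0.8 CPU / 900 msg/s)
for rho in (0.80, 0.90, 0.95, 0.99):
    print(rho, round(mm1_response_time(s, rho) * 1000, 1), "ms")
# steady-state latency grows roughly 5x between 80% and 96% utilization,
# and the transient queue build-up seen at 93.5% CPU is worse still
```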
Combined Model with SOS sending data to MQ
A PostObs request is forked after processing at the SOS and is then processed by the RabbitMQ + MR engine.
Modeled using a combination of open and closed requests [ACM IoT 2016].
Performance Analysis of a Sample Application on IoT Platform
We have seen: performance analysis of two subsystems of the IoT platform.
Next: performance analysis of a sample application on the IoT platform.
Sample App – Energy Monitoring System
Architecture – Data flows from Platform to backend
Modeling Problems
• How many more buildings can the current infrastructure support?
• How many online users can the dashboard support with the present deployment, for typical queries?
Q: How many more buildings can be supported
• Data comes from different sources
  – Occupancy data
  – Energy meter readings
• Find the distribution of the inter-arrival time of observations
  – The SOS log gives the arrival timestamp of each observation
  – Two metrics are calculated from the inter-arrival time samples: mean and standard deviation
Backend Access Log
• We extract the timestamps of successive observations being posted.
• From the timestamps, we calculate the inter-arrival time of every observation.
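The extraction step can be sketched as follows; the timestamp list is fabricated for illustration:

```python
from statistics import mean, stdev

def interarrival_stats(arrivals_ms):
    """Mean and standard deviation of inter-arrival times,
    given sorted arrival timestamps (ms) from the access log."""
    gaps = [b - a for a, b in zip(arrivals_ms, arrivals_ms[1:])]
    return mean(gaps), stdev(gaps)

ts = [0, 5, 10, 15, 20, 200]        # hypothetical arrival times (ms)
m, s = interarrival_stats(ts)
print(m, round(s, 1))               # mean 40, std ~78.3
# std > mean hints at a heavier-than-exponential tail,
# like the measured mean 37.2 ms vs std 45.6 ms on the next slide
```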
Distribution for Inter-arrival time
[Chart: PDF of inter-arrival time – probability vs. inter-arrival time (ms), in 10 ms buckets from 10 to 210 ms.]
• If this were an exponential distribution with mean 37.2, the standard deviation should also be 37.2, but it is 45.6.
• What trend does this distribution reflect?
• How do we compute the parameters of such a distribution?
What is a Hyper-Exponential Distribution?
• A probabilistic mixture of exponentials: with probability pᵢ, a sample is drawn from an exponential with rate λᵢ:
  f(x) = Σᵢ₌₁ⁿ pᵢ λᵢ e^(−λᵢ x),  with Σᵢ pᵢ = 1
• We need to find a set of exponential distributions and their probabilities that matches the distribution of the data.
Fitting a Distribution to the Data
• We can find µ1, µ2 and their probabilities so that the mixture matches the desired mean and standard deviation.
• With µ1 = 22, p1 = 0.6 and µ2 = 60.1, p2 = 0.4, we obtain mean = 37.2 and std = 45.6.
• Ideally, though, we should fit the mixture so that it also matches the shape of the distribution, not just the first two moments.
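A standard way to automate this two-moment fit is the "balanced means" two-phase hyperexponential. It is a different parameterization from the mixture quoted on the slide, but it reproduces the same mean and standard deviation:

```python
from math import sqrt

def fit_h2_balanced(mean, std):
    """Fit a 2-phase hyperexponential with balanced means to a target
    mean and std; valid when the coefficient of variation is > 1."""
    scv = (std / mean) ** 2                      # squared coeff. of variation
    p = 0.5 * (1 + sqrt((scv - 1) / (scv + 1)))  # branch probability
    lam1 = 2 * p / mean                          # rate of branch 1 (w.p. p)
    lam2 = 2 * (1 - p) / mean                    # rate of branch 2 (w.p. 1-p)
    return p, lam1, lam2

p, l1, l2 = fit_h2_balanced(37.2, 45.6)          # the measured moments
# sanity check: the fitted mixture reproduces the target mean
print(round(p / l1 + (1 - p) / l2, 1))           # 37.2
```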
Q: How many online users can be supported
• What if performance testing is not an option?
  – Rely on the utilization monitoring interface
• Find the trend of the utilization data to derive min, max, and average
• History is available for 1 day, the last 7 days, or 1 month
• Is the utilization due to one type of workload?
When No Dashboard Queries are running
• Utilization is due to data injection and alert queries
• Utilization rises every 25–30 min and stays high for about 15 min; CPU% varies between 11% and 20%
Deriving Service demands for Different Workloads
• We compute the service demand based on the mean throughput derived from the log
• The front-end client node handles traffic only from the dashboard
• The backend data node handles traffic from the dashboard as well as from the sensor backend
[Diagram: the ES backend receives dashboard queries, sensor data, and alert queries.]
Deriving Service demands for ES datanode
• Use of the least-squares technique:
  U_ES = X_Data × D_Data + X_Alert × D_Alert + X_DashQ × D_DashQ
  where X_Data is the throughput of data insertion and D_Data its service demand (and similarly for alert and dashboard queries).
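The regression above can be sketched with NumPy; the throughput and utilization samples below are synthetic, just to show the shape of the computation:

```python
import numpy as np

# rows = monitoring intervals; columns = throughputs (req/s) of
# data insertion, alert queries, dashboard queries (synthetic values)
X = np.array([[120.0, 2.0,  0.0],
              [118.0, 2.0,  5.0],
              [121.0, 8.0,  5.0],
              [119.0, 8.0, 12.0]])
true_D = np.array([0.003, 0.010, 0.008])   # per-request demands (s)
u = X @ true_D                             # synthetic measured utilization

# least-squares estimate of the three service demands from (X, u)
D, *_ = np.linalg.lstsq(X, u, rcond=None)
print(np.round(D, 4))                      # recovers [0.003 0.01 0.008]
```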
Model for Energy Monitoring Subsystem
Takeaways from Modelling the Sample App
• Throughput in an open system mostly remains unchanged
• iowait% arises from Debug/Info logging
• Utilization over a short duration can become very high even at low concurrency
• The model can be built from production logs and utilization information
What Other Modeling Techniques Can We Use?
• Markov chains can be used when the system can be modeled as a set of states
  – We need to know the states and their transition rates
• Example: a real-life landslide monitoring system
  – Several different types of sensors are used: rain gauge, pore pressure, humidity, movement
  – The system makes decisions based on the values of the sensor readings
Deriving Parameters for the Markov Chain
[State diagram: duty-cycle states OFF, Sr, S234, Smp, Srm and ON, each with an associated power drain. Transitions are driven by sensor readings, e.g. rain r(t) > Th1 moves the system out of OFF and r(t) < Th1 moves it back; r(t) > Th2 || m(t) > Thm and r(t) > Th3 || p(t) > Thp escalate to higher states, while r(t) < Th2 and r(t) < Th3 de-escalate.]
WNA Lab, Amrita University
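Once the states and transition rates are known, the steady-state probabilities of the chain give the long-run time spent in each state, and hence the expected power drain. A pure-Python sketch with a simplified 3-state chain and made-up rates (the real model has more states, as in the diagram):

```python
# Steady state of a simplified 3-state CTMC (rates are hypothetical)
# states: OFF (radios off), LOW (sparse sampling), FULL (all sensors on)
rates = {("OFF", "LOW"): 0.5,    # rain crosses the wake-up threshold
         ("LOW", "OFF"): 0.3,    # rain drops back below it
         ("LOW", "FULL"): 0.4,   # rain/pressure cross higher thresholds
         ("FULL", "LOW"): 0.6}   # readings fall back

# linear (birth-death) chain => detailed balance:
# pi_i * rate(i -> j) = pi_j * rate(j -> i)
pi = {"OFF": 1.0}
pi["LOW"] = pi["OFF"] * rates[("OFF", "LOW")] / rates[("LOW", "OFF")]
pi["FULL"] = pi["LOW"] * rates[("LOW", "FULL")] / rates[("FULL", "LOW")]
total = sum(pi.values())
pi = {s: v / total for s, v in pi.items()}   # normalize to probabilities
print({s: round(v, 3) for s, v in pi.items()})
# long-run time fractions; multiply by per-state power to estimate drain
```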
Summary – Things we covered
• Basics of performance modeling using queueing networks
• Given an architecture, how to build the model for the system
• Performance modeling of an IoT platform: closed system, open system, number of VMs required, scalability analysis
• Performance modeling of a sample application running on an IoT platform: gathering the inter-arrival time distribution, deriving service demands of workloads
• Outcome of a performance model
• Other modeling techniques: Markov chains
Important Resources
• M. Bertoli, G. Casale, G. Serazzi. User-Friendly Approach to Capacity Planning Studies with Java Modelling Tools. Int'l ICST Conf. on Simulation Tools and Techniques (SIMUTools 2009), Rome, Italy, 2009, ACM Press.
• S. Kounev and A. Buchmann. Performance Modeling and Evaluation of Large-Scale J2EE Applications. In Proceedings of the Computer Measurement Group's Conference, 2003.
• E. Lazowska, J. Zahorjan, G. Graham and K. Sevcik. Quantitative System Performance: Computer System Analysis Using Queueing Network Models. Prentice-Hall, 1984.
• M. Harchol-Balter. Performance Modeling and Design of Computer Systems: Queueing Theory in Action.
• W. Stallings. A Gentle Introduction to Some Basic Queueing Concepts.
• R. Mansharamani, S. Duttagupta, A. Nehete. Automatically Determining Load Test Duration Using Confidence Intervals. CMG India, Pune, 2014.
• S. Duttagupta, R. Mansharamani. Extrapolation Tool for Load Testing Results. Int'l Symposium on Performance Evaluation of Computer Systems and Telecommunication Systems, 2011.
• S. Duttagupta, M. Kumar, M. Nambiar. Performance Modeling of IoT Applications. 6th ACM Conference on Internet of Things (IoT 2016).
Open Issues
• How to account for the variability of technologies used by various sensors to connect to the IoT system
• How service demand varies with load or with a higher flow rate
• What the fundamental limits of a technology stack are for a certain kind of workload