cloud workload analysis and simulation
![Page 1: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/1.jpg)
Cloud Computing Project B: Cloud workload analysis and simulation
Group 3: Abinaya Shanmugaraj, Arunraja Srinivasan, Prabhakar Ganesamurthy, Priyanka Mehta
Instructor : Dr. I-Ling Yen
TA : Elham Rezvani
![Page 2: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/2.jpg)
Overview
• Dataset preprocessing
• Dataset Analysis and Observations
• Important attributes in dataset
• Categorization of users and tasks
• Time series analysis
• Workload prediction
• Looking Ahead
![Page 3: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/3.jpg)
Dataset pre-processing
![Page 4: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/4.jpg)
• Inconsistent and ambiguous data was cleaned before analysis.
• The task-usage table has many records for the same (job ID, task index) pair, because a task may be re-submitted or re-scheduled after a failure.
• Pre-processing was done to avoid reading multiple values for the same (job ID, task index) pair.
• Pre-processing: All records were grouped by job ID and task index, and only the last occurring record of each repeated task was kept as a single record.
• Time is in microseconds in the dataset.
• Pre-processing: Time was converted into days and hours for per-day analysis.
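The two pre-processing steps above can be sketched as follows. This is a minimal illustration, not the group's actual code: the field names `job_id` and `task_index`, the helper names, and the 1-based day numbering (chosen to match the day 1 to 30 axes on later slides) are all assumptions.

```python
MICROS_PER_HOUR = 3_600 * 1_000_000
MICROS_PER_DAY = 24 * MICROS_PER_HOUR


def keep_last_record(records):
    """Keep only the last record seen for each (job_id, task_index) pair.

    `records` is an iterable of dicts in file order (hypothetical layout).
    """
    latest = {}
    for rec in records:
        # A later record for the same pair overwrites the earlier one.
        latest[(rec["job_id"], rec["task_index"])] = rec
    return list(latest.values())


def to_day_and_hour(timestamp_us):
    """Convert a microsecond timestamp to (day, hour); day 1 is assumed first."""
    day = timestamp_us // MICROS_PER_DAY + 1
    hour = (timestamp_us % MICROS_PER_DAY) // MICROS_PER_HOUR
    return day, hour
```

For example, a timestamp one day and two hours into the trace maps to day 2, hour 2 under this numbering.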
![Page 5: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/5.jpg)
Dataset analysis and observation
![Page 6: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/6.jpg)
• The data in the tables was visualized.
• Attributes that were constant, or that stayed within a small range of values for most records, were not considered for analysis.
• The attributes that play a major part in shaping the user profile and task profile are considered important attributes.
• The main attributes of each table were analyzed and visualized, and observations were made from them.
![Page 7: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/7.jpg)
Ignored attribute (example): Memory accesses per instruction (MAI)
Memory accesses per instruction vs. tasks per job ID: except for a few tasks, MAI is almost the same for all tasks.
![Page 8: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/8.jpg)
Job Events table
Attributes considered: time, job ID, event type, user.
• These attributes were extracted from the CSV files using Java code.
• To find the number of jobs submitted per day and per user, only records with event type = 0 were considered, since 0 means that a job was submitted by the user.
• Time in microseconds was converted into days.
Visualizations: jobs submitted per day, jobs submitted per user.
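The counting described above could look roughly like this. The slides say the extraction was done in Java; this Python sketch only illustrates the logic. The tuple layout and the function name are assumptions, and days are numbered from 1.

```python
from collections import Counter

SUBMIT = 0  # event type 0 means "job submitted", as stated on the slide
MICROS_PER_DAY = 24 * 3_600 * 1_000_000


def jobs_per_day_and_user(events):
    """Count submitted jobs per day and per user.

    `events` is an iterable of (time_us, job_id, event_type, user) tuples
    (hypothetical layout for the columns listed on the slide).
    """
    per_day = Counter()
    per_user = Counter()
    for time_us, _job_id, event_type, user in events:
        if event_type != SUBMIT:
            continue  # ignore every event that is not a submission
        per_day[time_us // MICROS_PER_DAY + 1] += 1
        per_user[user] += 1
    return per_day, per_user
```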
![Page 9: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/9.jpg)
Task Events table
Attributes considered: time, job ID, task index, event type, user, CPU request, memory request, disk space request.
• Using records with event type = 0, the number of tasks per day and per user was visualized.
• Using the distinct count of users, the number of users per day was visualized.
Average tasks per day = 1,607,694
Average users per day = 398
Visualizations: number of tasks per day, number of tasks per user, number of users per day, user submission rate (total number of tasks submitted / 30), average memory requested per user, average CPU requested per user, average tasks per job per user.
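As a rough sketch of two of these metrics, the distinct users per day and the submission rate (total tasks submitted divided by the 30 days of the trace) can be computed as below. The reduced tuple layout `(time_us, user, event_type)` and the function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict

SUBMIT = 0
MICROS_PER_DAY = 24 * 3_600 * 1_000_000


def users_per_day(events):
    """Distinct count of users that submitted at least one task on each day."""
    users = defaultdict(set)
    for time_us, user, event_type in events:
        if event_type == SUBMIT:
            users[time_us // MICROS_PER_DAY + 1].add(user)
    return {day: len(names) for day, names in users.items()}


def submission_rates(events, trace_days=30):
    """Tasks submitted per user divided by the number of days in the trace."""
    tasks = Counter(user for _, user, event_type in events if event_type == SUBMIT)
    return {user: count / trace_days for user, count in tasks.items()}
```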
![Page 10: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/10.jpg)
Tasks per day vs. jobs per day
[Figure: count of task index (left axis, 0M to 5M) and distinct count of job ID (right axis, 0K to 40K) for each day, days 1 to 30.]
Observation: The visualization shows only a loose correlation between jobs per day and tasks per day (fewer jobs does not mean fewer tasks).
![Page 11: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/11.jpg)
Tasks per day vs. users per day
[Figure: count of task index (left axis, 0M to 5M) and distinct count of users (right axis, 0 to 500) for each day, days 1 to 30.]
Observation: The visualization shows only a loose correlation between tasks per day and users per day. There is a pattern in users per day: every seventh day has fewer users (possibly a weekend). The type of user matters more than the number of users per day for predicting the number of tasks per day.
![Page 12: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/12.jpg)
User Submission rate(Task/day)
Observation: A few users have a very high submission rate.
![Page 13: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/13.jpg)
Avg. Tasks/Job per user
Observation: Most jobs that a given user submits are similar, as they contain the same number of tasks.
![Page 14: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/14.jpg)
Machine Events
Attributes considered: time, machine ID, event type, CPU, memory.
• Records with event type = 0 give the machines that are added to the cluster and are available.
• Records with event type = 1 give the machines that are removed due to failure.
• Records with event type = 2 give the machines whose attributes are updated.
• This data is of less significance for our project.
![Page 15: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/15.jpg)
Task usage
Attributes considered: start time, end time, job ID, task index, CPU rate, canonical memory usage, assigned memory usage, local disk space usage.
• Using these attributes, task length (running time × CPU rate) was computed, with running time converted from microseconds to seconds.
• The user data was extracted from the task events table to get the average memory and CPU used per user.
Visualizations: average CPU used per user, average memory used per user.
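The task length computation above amounts to the following small function. The function and parameter names are illustrative assumptions; only the formula (running time in seconds multiplied by CPU rate) comes from the slide.

```python
def task_length(start_us, end_us, cpu_rate):
    """Task length = running time (seconds) * CPU rate.

    Timestamps are in microseconds, as in the dataset.
    """
    running_seconds = (end_us - start_us) / 1_000_000
    return running_seconds * cpu_rate
```

For instance, a task that runs for 300 seconds at a CPU rate of 0.5 has a length of 150.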
![Page 16: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/16.jpg)
CPU requested per user Vs CPU used per user
Observation: Most users overestimate the resources they need and use less than 5% of the requested resources. A few users underestimate the resources and use more than three times the requested amount.
![Page 17: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/17.jpg)
Memory requested per user Vs Memory used per user
Observation: Most users overestimate the resources they need and use less than 30% of the requested resources. Very few users underestimate the resources and use more than the requested amount, but tasks that use more memory than requested get killed.
![Page 18: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/18.jpg)
Important Attributes
• Important attributes are those that play an important part in identifying the shape of users and tasks.
• From the visualizations and observations made, the following are identified as important attributes:
• User: submission rate, CPU estimation ratio, memory estimation ratio
Estimation ratio = (requested resource - used resource) / requested resource
• Task: task length, CPU usage, memory usage
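The estimation ratio defined above can be written as a short function. The name `estimation_ratio` and the choice to return `None` for a zero request are assumptions added for this sketch; the slides do not say how zero requests were handled.

```python
def estimation_ratio(requested, used):
    """(requested - used) / requested, as defined on the slide.

    Close to 1: most of the request went unused (overestimation).
    Negative: more was used than requested (underestimation).
    Returns None when nothing was requested (assumed convention).
    """
    if requested == 0:
        return None
    return (requested - used) / requested
```

A user who requests 1.0 and uses 0.25 gets a ratio of 0.75; a user who requests 0.5 and uses 1.5 gets -2.0.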
![Page 19: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/19.jpg)
CPU Estimation ratio per User
Users with a negative (red) CPU estimation ratio have used more resources than they requested. Users with a CPU estimation ratio between 0.9 and 1 have left more than 90% of the requested resource unused.
![Page 20: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/20.jpg)
Memory Estimation ratio per User
Users with a negative (orange) memory estimation ratio have used more resources than they requested. Users with a memory estimation ratio between 0.9 and 1 have left more than 90% of the requested resource unused.
![Page 21: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/21.jpg)
Categorization of Users
Categorization of Tasks
![Page 22: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/22.jpg)
Dimensions for categorization
User: submission rate, CPU estimation ratio, memory estimation ratio
Task: task length, CPU usage, memory usage
We use the following clustering algorithms to identify the optimal number of clusters for users and tasks:
1. K-means
2. Expectation-Maximization (EM)
3. Cascade Simple K-means
4. X-means
• We categorize the users and tasks using these clustering algorithms with the above dimensions.
• We compare the results and choose the best clustering for users and tasks.
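To make the first of these algorithms concrete, here is a minimal pure-Python sketch of plain K-means (Lloyd's algorithm) on three-dimensional points such as (memory estimation ratio, submission rate, CPU estimation ratio). It is not the toolkit or configuration the group used, and it does not cover EM, Cascade Simple K-means, or X-means. The function name, the seeding, and the stopping rule are assumptions.

```python
import math
import random


def kmeans(points, k, max_iterations=100, seed=0):
    """Plain K-means on a list of equal-length numeric tuples.

    Returns (assignment, centroids), where assignment[i] is the cluster
    index of points[i]. Initial centroids are k randomly sampled points.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the result is stable
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignment, centroids
```

In practice the dimensions would likely need to be scaled to comparable ranges first, since a raw submission rate can be orders of magnitude larger than a ratio bounded by 1 and would otherwise dominate the distance.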
![Page 23: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/23.jpg)
User Categorization
![Page 24: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/24.jpg)
Users - K-means with 4 clusters
X : Avg. memory est. ratio Y: Submission rate Z: Avg. CPU est. ratio
![Page 25: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/25.jpg)
Tasks Categorization
![Page 26: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/26.jpg)
Tasks – Day 13 – Kmeans (3 clusters)
X: Memory usage Y: Length Z: CPU usage
![Page 27: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/27.jpg)
Tasks – Day 13 - Xmeans
X: Memory usage Y: Length Z: CPU usage
![Page 28: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/28.jpg)
Clustering Comparison:
Our clustering (X-means)
K-means clustering as done in the IEEE paper "An Approach for Characterizing Workloads in Google Cloud to Derive Realistic Resource Utilization Models"
![Page 29: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/29.jpg)
Selected User and Task clustering
Users - K-means with 4 clusters
X: Avg. memory est. ratio Y: Submission rate Z: Avg. CPU est. ratio
Tasks - X-means with 3 clusters
X: Memory usage Y: Length Z: CPU usage
![Page 30: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/30.jpg)
Time Series Analysis
![Page 31: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/31.jpg)
Selecting Target Users & Tasks
From the clustering results we observed:
• 97% of the users have estimation ratios ranging from 0.7 to 1.0.
• That is, 97% of the users leave at least 70% of the resources they request unused.
• We targeted user Cluster 0 and Cluster 3 (more than 90% unused).
We targeted tasks that were long enough for resource allocation to be performed efficiently:
• We performed clustering on the task lengths of these users to filter out short tasks.
![Page 32: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/32.jpg)
User workload analysis – Dynamic Time Warping
To identify a user's tasks with similar workloads, we ran the Dynamic Time Warping (DTW) algorithm on each task of the Cluster 0 and Cluster 3 users:
• Computed the DTW distance between each of the user's tasks and a reference curve.
• Extracted the tasks of a user that have the same DTW value.
• These tasks were identified as having similar workload curves.
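The steps above can be sketched with the textbook dynamic-programming form of DTW. The absolute difference as the local cost, the grouping helper, and the rounding used to decide that two DTW values are "the same" are assumptions; the slides do not give these details.

```python
from collections import defaultdict


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local cost (assumed metric)
            # Extend the cheapest of the three allowed warping moves.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def group_by_dtw(tasks, reference, decimals=2):
    """Group task ids whose DTW distance to `reference` matches after rounding.

    `tasks` maps a task id to its workload series (hypothetical layout).
    """
    groups = defaultdict(list)
    for task_id, series in tasks.items():
        groups[round(dtw_distance(series, reference), decimals)].append(task_id)
    return dict(groups)
```

Because DTW allows stretching along the time axis, the series `[1, 2, 2, 3]` has distance 0 to the reference `[1, 2, 3]` and lands in the same group as an exact copy of the reference.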
![Page 33: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/33.jpg)
Workload prediction
![Page 34: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/34.jpg)
Workload prediction
Resource allocation and de-allocation cannot be done dynamically because of:
• Huge overhead
• Delay in allocating resources
So resource allocation must happen once in every pre-determined interval of time.
Prediction:
• When a predictable user runs a task, its initial workload is compared with the reference curve associated with that user.
• Based on the slope of the predicted workload curve (the reference curve), a step-up or step-down in resource allocation is determined, taking the delay in resource allocation into account.
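One possible reading of the slope-based decision is sketched below. It covers only the step-up or step-down choice, not the comparison of the initial workload with the reference curve. Everything specific here is an assumption: the function name, measuring the slope between the current interval and the interval at which a new allocation would take effect, and the `threshold` parameter.

```python
def allocation_step(reference_curve, now, delay, threshold=0.0):
    """Return +1 (step up), -1 (step down) or 0 (hold).

    `reference_curve` holds the predicted workload per interval, `now` is the
    current interval index, and `delay` is the number of intervals that a
    change in allocation takes to become effective.
    """
    target = min(now + delay, len(reference_curve) - 1)
    if target == now:
        return 0  # nothing left to look ahead to
    slope = (reference_curve[target] - reference_curve[now]) / (target - now)
    if slope > threshold:
        return 1
    if slope < -threshold:
        return -1
    return 0
```

Looking ahead by the allocation delay means the decision is made for the workload expected when the new allocation actually arrives, rather than for the workload at the moment of the decision.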
![Page 35: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/35.jpg)
Looking ahead…
![Page 36: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/36.jpg)
• When the unhashed job name and user name are known, associations between a job name and its workload can be formed and used for better prediction.
• As observed in the user clustering, most users have poor estimation ratios, so better resource estimation processes can be used to help users achieve better estimation ratios.
• Further techniques such as regression analysis and curve-fitting algorithms can be used to get a more representative curve for a predictable user.
![Page 37: Cloud workload analysis and simulation](https://reader035.vdocuments.net/reader035/viewer/2022062513/5578af2ad8b42a4d4b8b4e4c/html5/thumbnails/37.jpg)
Thank you