
Analyzing Reliability in Hybrid Compute Units

Muhammad Candra, Hong-Linh Truong, Schahram Dustdar
Distributed Systems Group, TU Wien
IEEE International Conference on Collaboration and Internet Computing (IEEE CIC 2015), October 28 - October 30, 2015, Hangzhou, China

Outline
- Background
  - Introduction to Hybrid Computing System
  - Introduction to Reliability Analysis
  - Motivation
- Models
- Reliability Analysis Framework
- Implementation and Experiments
- Conclusions and Future Works

Hybrid Computing System [Background]
- Application: workflows with human tasks, crowdsourcing applications, IoT applications
- Software-based services: cloud-based services composition
- Human-based compute units: crowdsourcing platforms, social networks of experts, on-premise experts
- Hybrid compute units
- Quality metrics? Reliability

Reliability Analysis [Background]
- What is reliability? The ability of a system to function correctly over a specified period of time, mostly under predefined conditions.
- Why do we need it? For the designer, the resource provider, and the task owner: system improvements.
- How to measure it? Stochastic analysis.

Reliability Analysis in HCS [Background]
- Problems for reliability analysis in hybrid computing systems (HCS):
  - Non-continuous time space
  - More ad-hoc inter-dependency
  - Resources provisioning on the cloud
- Our goal: to provide a set of tools for modeling and analyzing reliability for hybrid computing systems.

Motivating Scenario [Background]
[Figure: motivating scenario. Labels: HCU, Collective, Human-Based Computing Platform, Infrastructure Maintenance Platform, Resources pool]

Reliability of Individual Units [Models]
$R(t) = 1 - \int_0^t f(\tau)\, d\tau$, where $f$ is the failure probability density of the unit.

Collective Dependencies [Models]
- Reliability analysis (RA) requires information on inter-dependencies between components.

System Overview [Reliability Analysis Framework]
[Figure: system overview. Labels: Assignment, Composition, Resources discovered suitable for fulfilling a role, Static sets of resources, Virtual Standby Units (VSU)]

Reliability Calculation (1) [Reliability Analysis Framework]
- Input:
  - The individual reliability profile for each unit
  - Collective dependency
- Outcome: the reliability for executing a set of K tasks.
- Steps:
  1. Obtain individual reliability on time t or on execution k
  2. Calculate the reliability for each role
  3. Calculate the reliability of the task executions

Reliability Calculation (2) [Reliability Analysis Framework]
Step 1: Obtain individual reliability, either continuous on time t (for machine-based units) or discrete on execution k (for human-based units).
- Domain-specific individual reliability model
- For example, for human units, a geometric distribution with per-execution failure probability p:
  $f(k) = (1 - p)^{k-1}\, p$
  $R(k) = (1 - p)^{k}$
- How to get p?

Reliability Calculation (3) [Reliability Analysis Framework]
Step 2: Calculate the reliability for each role.
- Reliability of static sets of units
  - Simplex
  - Parallel / serial structure
  - Static and dynamic redundancy
- Reliability of Virtual Standby Units (VSU)
  - Similar to M-of-N redundancy
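Steps 1 and 2 above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, units are assumed to fail independently, the exponential model for machine-based units is assumed here for illustration, and a VSU is approximated as plain M-of-N redundancy over identical units (the slides only say it is "similar to" M-of-N).

```python
import math
from math import comb


def human_failure_pmf(p: float, k: int) -> float:
    """f(k) = (1 - p)^(k-1) * p: probability that the first failure occurs at execution k."""
    return (1.0 - p) ** (k - 1) * p


def human_reliability(p: float, k: int) -> float:
    """R(k) = (1 - p)^k: probability that a human-based unit survives k executions."""
    return (1.0 - p) ** k


def machine_reliability(lam: float, t: float) -> float:
    """R(t) = exp(-lam * t): assumed constant-failure-rate model for a machine-based unit."""
    return math.exp(-lam * t)


def serial(reliabilities: list[float]) -> float:
    """Serial structure: every unit must work (independence assumed)."""
    return math.prod(reliabilities)


def parallel(reliabilities: list[float]) -> float:
    """Parallel structure: at least one unit must work (independence assumed)."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


def m_of_n(m: int, n: int, r: float) -> float:
    """M-of-N redundancy: at least m of n identical, independent units work."""
    return sum(comb(n, i) * r**i * (1.0 - r) ** (n - i) for i in range(m, n + 1))


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the functions.
    r_human = human_reliability(p=0.05, k=10)
    r_sensor = machine_reliability(lam=0.1, t=2.0)
    print("human unit after 10 executions:", r_human)
    print("sensor after t = 2:", r_sensor)
    print("role needing 2 of 4 such humans:", m_of_n(2, 4, r_human))
```

A simplex role is the single-unit case (`serial([r])`), and `m_of_n(1, n, r)` coincides with `parallel([r] * n)`, which is a quick sanity check on the redundancy formulas.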
Reliability Calculation (4) [Reliability Analysis Framework]
Step 3: Calculate the reliability of the task executions using Execution Spanning Trees (ESTs).
- Components: IMP, HCP, SAS, SN, VSU_Se, VSU_Cz,Coll, VSU_In,Coll, VSU_Cz,Asses, VSU_In,Asses
- ESTs:
  1. IMP, SAS, VSU_Se, SN
  2. IMP, HCP, VSU_Cz,Coll, VSU_Cz,Asses
  3. IMP, HCP, VSU_Cz,Coll, VSU_In,Asses
  4. IMP, HCP, VSU_In,Coll, VSU_Cz,Asses
  5. IMP, HCP, VSU_In,Coll, VSU_In,Asses

Reliability Calculation (5) [Reliability Analysis Framework]
Step 3 (continued): Given $S_t$ as a set of ESTs (e.g. the five ESTs listed above), the reliability of the task executions is calculated over $S_t$.

Prototype Implementation [Implementation & Experiments]
Runtime and Analytics for Hybrid Computing Systems (RAHYMS)
- Based on the GridSim toolkit
- Features:
  - Simulate a pool of resources (machine-based and human-based units)
  - Simulate task requests generation
  - Strategies for HCU formation
  - Reliability analysis tool

Experiment Setup [Implementation & Experiments]
- Focus on VSUs
  - Sensors: $R(t) = e^{-\lambda t}$
  - Humans (citizens and inspectors): $R(k) = (1 - p)^{k}$, with $t = k / 30$
- Assumed static:
  - Infrastructure Management Platform (IMP)
  - Human-based Computing Platform (HCP)
  - Sensors Network (SN)

Experiment 1 [Implementation & Experiments]
[Figure: results of Experiment 1]

Experiment 2 [Implementation & Experiments]
[Figure: results of Experiment 2]
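The EST-based step described above can be illustrated with a short Python sketch. It rests on assumptions that do not come from the slides: the task is taken to succeed when every component of at least one EST works, components are assumed independent, and the probability is computed by inclusion-exclusion over the ESTs, which is one standard technique and not necessarily the formula used in the talk. The function name and the reliability values are hypothetical; only the component names and the five ESTs come from the slides.

```python
from itertools import combinations
from math import prod

# The five ESTs from the slides, each written as the set of components it needs.
ESTS = [
    {"IMP", "SAS", "VSU_Se", "SN"},
    {"IMP", "HCP", "VSU_Cz_Coll", "VSU_Cz_Asses"},
    {"IMP", "HCP", "VSU_Cz_Coll", "VSU_In_Asses"},
    {"IMP", "HCP", "VSU_In_Coll", "VSU_Cz_Asses"},
    {"IMP", "HCP", "VSU_In_Coll", "VSU_In_Asses"},
]


def est_reliability(ests: list[set[str]], r: dict[str, float]) -> float:
    """P(at least one EST has all of its components working).

    Inclusion-exclusion over subsets of ESTs; components are assumed
    independent, so the probability that a group of ESTs all work is the
    product of the reliabilities of the union of their components.
    """
    total = 0.0
    for size in range(1, len(ests) + 1):
        sign = 1.0 if size % 2 == 1 else -1.0
        for subset in combinations(ests, size):
            needed = set().union(*subset)
            total += sign * prod(r[c] for c in needed)
    return total


if __name__ == "__main__":
    # Hypothetical component reliabilities at some fixed time or execution count.
    reliabilities = {
        "IMP": 0.99, "HCP": 0.99, "SAS": 0.98, "SN": 0.97,
        "VSU_Se": 0.95,
        "VSU_Cz_Coll": 0.90, "VSU_In_Coll": 0.92,
        "VSU_Cz_Asses": 0.88, "VSU_In_Asses": 0.93,
    }
    print("task execution reliability:", est_reliability(ESTS, reliabilities))
```

Inclusion-exclusion visits every non-empty subset of ESTs, so it is only practical for small sets such as the five above; the per-component values would in practice come from the individual and role-level models, evaluated at the time or execution count of interest.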
Experiment 3 [Implementation & Experiments]
[Figure: results of Experiment 3]

Conclusion [Conclusions & Future Works]
- Experiments show how reliability analysis (RA) can be used to obtain insights for system improvements.

Future Works [Conclusions & Future Works]
- Models
  - Individual reliability (continuous and discrete)
  - Collective dependency (collaboration for known structure)
- Framework
  - Tools for reliability analysis
- Dependable hybrid human-machine computing
- Dependability metrics: availability, performance, quality of results

Thank you

Acknowledgments
- The first author of this paper is financially supported by the Vienna PhD School of Informatics.
- The work mentioned in this paper is partially supported by the EU FP7 FET SmartSociety project.