Cloud Computing + Workflows
Anushri Khandekar
Cloud Computing
Delivering applications or services in an on-demand environment
Hundreds of thousands of users / applications
Systems should be fast, secure and available
Intelligent infrastructure:
  Transparency
  Scalability
  Monitoring
  Security
All services and associated data
Workflows
Operational aspect of a work procedure: how tasks are structured, who performs them, what their relative order is, how they are synchronized, how information flows to support the tasks and how tasks are being tracked.
Workflow Management
An activity is a discrete step in a business process (workflow).
Activities range from calling a remote service to perform a task (e.g. calculating taxes, performing currency conversions, looking up inventory) to custom-defined services.
Activities are orchestrated together in a workflow in BizTalk using XOML (eXtensible Object Markup Language).
Other languages: BPEL, ebXML, XPDL, etc.
Workflows in Cloud
Microsoft allows hosting of BizTalk activities in a cloud at BizTalk Labs.
Developers integrate those cloud-hosted activities into a BizTalk workflow (orchestration) by calling them as they would any other web-based service or hosted activity.
“Service orchestration”: the business process is modeled using workflows
  Invokes the Internet Service Bus and performs HTTP requests
  Language used: XOML
Main task: first create a workflow instance and start it
Transparency
“Actual implementation” of services obscured
Another version of virtualization
Transparent load-balancing and application delivery
Solution to be automated and integrated in the workflow process
Example:
  A service is running on a single server; as more users join, additional servers are required. Transparency allows them to be integrated without interrupting the running service or reconfiguring it.
Scalability
Scale up and build “mega data centers”
  Not transparent: needs configuration or re-architecting
  Potential for interrupting services is huge
Ability to transparently scale the service infrastructure and the solution
On-demand, real-time scaling
Control node: provides dynamic application scalability
Integration with a virtualization solution, or orchestration with a workflow process, to manage provisioning
Intelligent Monitoring
Control node: intelligent monitoring capabilities
Detects an overwhelmed server, or application performance affected by network conditions (behavior outside accepted norms)
More than knowing when a service is in trouble: knowing what action should be taken
Example: application responding slowly; adjust application requests, add more servers if required
Detect and participate in the provisioning of new instances
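The monitoring idea above (choose an action, not just raise an alarm) can be sketched as follows. This is a minimal illustration, not the control node's actual logic: the `Metrics` fields, the threshold values and the action names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    """One monitoring sample for a service (fields are illustrative)."""
    response_ms: float      # observed application response time
    cpu_utilization: float  # fraction of server capacity in use, 0.0 to 1.0


def choose_action(m: Metrics,
                  max_response_ms: float = 500.0,  # assumed accepted norm
                  max_cpu: float = 0.85) -> str:   # assumed accepted norm
    """Map a sample to an action instead of only reporting trouble."""
    slow = m.response_ms > max_response_ms
    overloaded = m.cpu_utilization > max_cpu
    if slow and overloaded:
        # The servers themselves are saturated: request a new instance.
        return "provision_instance"
    if slow:
        # Servers have headroom, so adjust how requests are delivered.
        return "adjust_requests"
    return "no_action"
```

A slow response on a saturated server leads to provisioning, while a slow response with spare capacity only adjusts request handling.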
Capacity Management
From buckets to rivers
  Constrained set of resources: predict peak usage and have an in-house data centre to manage them
  Unlimited computing power with cloud: how do IT departments properly manage this river?
Constraint in the new model
  Not the upper limit of computing power, but the speed at which new services can be provisioned and put into production
Scaling up means:
  Initiate new system, transfer data, connect existing system, test combined system, manage complete life cycle
Capacity Management
Traditional life cycle stages:
  Modeling, provisioning, monitoring, maintaining, and modifying.
Important here: “Maintaining” & “Modifying”
Elastic means provisioning and de-provisioning
Is it the right time to add an IT asset or get rid of an asset?
Economic benefits rely on knowing when to stop using an asset
Utilize the cloud for additional capacity when it is apparent your own data centre can't handle the load and it is cost-prohibitive to invest in additional servers and infrastructure to increase capacity
Problem Statement
Efficient management of workflows in a cloud environment to allow fast scaling up and scaling down
Storing scalability/compressibility options for every node in the workflow
Input events and output events of every node in the workflow
Mechanism to integrate the new scaled model of a web service into the original cloud workflow
Proposed Idea
Workflow Management
Workflow management is important: the heavy workflows of traditional waterfall approaches, specified down to the smallest detail, will slow down the use of cloud computing
Separate the main workflow from the details of the mechanism required to scale any activity node
Have efficient way of storing this information
Workflow Management
Workflow Main
  Has the cloud structure with each web service as an activity node
Workflow Shadow
  Has sub-workflows for other options for each activity node
Workflows: online or offline
  Online: running and executing at a particular time
  Offline: workflows in a passive state, waiting for an event to trigger them
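One possible way to picture the Main/Shadow split is a main list of activity nodes plus a lookup of offline sub-workflows that come online when their event fires. The class and field names below (`CloudWorkflow`, `SubWorkflow`, `on_event`) are hypothetical, chosen only for this sketch; it assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field


@dataclass
class SubWorkflow:
    """A shadow sub-workflow: offline until its trigger event arrives."""
    trigger_event: str
    steps: list[str]
    state: str = "offline"


@dataclass
class CloudWorkflow:
    # Main workflow: the activity nodes (web services), in order.
    main: list[str]
    # Shadow workflow: alternative sub-workflows stored per activity node.
    shadow: dict[str, list[SubWorkflow]] = field(default_factory=dict)

    def on_event(self, node: str, event: str) -> list[str]:
        """Bring matching shadow sub-workflows online; return their steps."""
        steps: list[str] = []
        for sub in self.shadow.get(node, []):
            if sub.trigger_event == event:
                sub.state = "online"
                steps.extend(sub.steps)
        return steps
```

Keeping the shadow entries outside `main` means the main workflow stays small, and scaling details are only looked up when an event actually occurs.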
Activity Node
Parameters               Description
Activity Node Name       /* Unique activity node name */
Description              /* Description of activity */
Type                     /* Service or application etc. */
State                    /* Online, Offline, or Needs change */
Constraints              /* Time, execution cost */
Input Event              /* Event to trigger the activity node */
Output Event             /* Event triggered by activity node */
Scalability Options      /* Scalability activities as a workflow */ (When and How)
Compressibility Options  /* Compressibility activities as a workflow */ (When and How)
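The activity node parameters could be held in a record such as the one below. This is only a sketch of one possible representation: the Python names, the types chosen for each field and the `ScalingOption` helper are assumptions, not part of the proposal itself (Python 3.9 or later).

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    ONLINE = "online"
    OFFLINE = "offline"
    NEEDS_CHANGE = "needs change"


@dataclass
class ScalingOption:
    """A stored option: when to apply it and how (as a workflow of steps)."""
    when: str
    how: list[str]


@dataclass
class ActivityNode:
    name: str                      # unique activity node name
    description: str
    node_type: str                 # service, application, etc.
    state: State
    constraints: dict[str, float]  # e.g. time, execution cost
    input_event: str               # event that triggers this node
    output_event: str              # event this node triggers
    scalability_options: list[ScalingOption] = field(default_factory=list)
    compressibility_options: list[ScalingOption] = field(default_factory=list)
```

For example, a web server node might carry one scalability option whose `when` is a load criterion and whose `how` lists the expansion steps.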
Scalability Options
Considering transparency, two ways to scale a workflow:
  Scale an activity node
  Addition of a new activity node
    More tricky, dynamic, according to environment
Scale an activity node
  When? Store criteria
    Example: for a web server, if load increases above a threshold, expand
  How? Again as a workflow
    Example: store all the steps to be done in order to expand, configure and connect the node back to the original workflow
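The "when" and "how" of scaling an activity node can be sketched as a stored criterion plus a stored, ordered list of steps. Everything concrete here is an assumption for illustration: the 0.8 threshold, the pool layout with `active` and `pending` lists, and the step functions (Python 3.9 or later).

```python
from typing import Callable

Pool = dict[str, list[str]]


def expand(pool: Pool) -> None:
    """Create a new server; it is not yet serving traffic."""
    count = len(pool["active"]) + len(pool["pending"])
    pool["pending"].append(f"web-{count + 1}")


def configure(pool: Pool) -> None:
    """Placeholder for configuring the pending servers."""
    pool["configured"] = list(pool["pending"])


def connect(pool: Pool) -> None:
    """Connect configured servers back to the original workflow's pool."""
    pool["active"].extend(pool.get("configured", []))
    pool["pending"].clear()
    pool["configured"] = []


# "How": the expansion stored as an ordered workflow of named steps.
EXPAND_STEPS: list[tuple[str, Callable[[Pool], None]]] = [
    ("expand", expand),
    ("configure", configure),
    ("connect", connect),
]


def scale_if_needed(load: float, pool: Pool, threshold: float = 0.8,
                    steps: list[tuple[str, Callable[[Pool], None]]] = EXPAND_STEPS
                    ) -> list[str]:
    """"When": run the stored workflow only if load exceeds the threshold."""
    if load <= threshold:
        return []
    executed = []
    for name, action in steps:
        action(pool)
        executed.append(name)
    return executed
```

Because the steps are data rather than hard-coded logic, a different node can store a different list without changing `scale_if_needed`.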
Cloudbursting vs Bursting the Cloud
Cloudbursting is to allow the cloud to act as overflow resources in the event your own infrastructure becomes overloaded
  Critical tasks (revenue generating) in own datacentre
Bursting the cloud is applied to resources such as servers, application servers, application delivery systems, and other infrastructure required to provide on-demand computing environments
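Cloudbursting as overflow can be illustrated with a simple placement rule: fill the in-house datacentre first, critical tasks ahead of the rest, and send whatever does not fit to the cloud. The function name, the task tuples and the slot-based capacity model are assumptions made for this sketch (Python 3.9 or later).

```python
def place_tasks(tasks: list[tuple[str, bool]],
                local_capacity: int) -> dict[str, list[str]]:
    """Place (name, is_critical) tasks; overflow goes to the cloud."""
    # Critical tasks claim local slots first. sorted() is stable, so the
    # original order is preserved within each group.
    ordered = sorted(tasks, key=lambda t: not t[1])
    placement: dict[str, list[str]] = {"datacentre": [], "cloud": []}
    for name, _critical in ordered:
        has_room = len(placement["datacentre"]) < local_capacity
        placement["datacentre" if has_room else "cloud"].append(name)
    return placement
```

Note that in this simplified model critical tasks also overflow to the cloud if there are more of them than local slots.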
Bursting the cloud
Automate the cloud's data centre
Requires more than simple workflow systems
  On-demand control and management over all devices in the delivery chain
  From the storage to the application and web servers to the load-balancers and acceleration offerings that deliver the applications to end-users
“Data centre orchestration”: many moving parts and pieces must be coordinated in order to perform a highly complex set of tasks
Hadoop As a Service
Automated installation and provisioning
Research Questions:
  How to support multi-tenancy with QoS differentiation
  How to optimize workflows across users with fluctuating capacity requirements
Key features:
  On-demand creation
  Dynamic resource flexing
Differentiated Hadoop services
Problem:
  More important jobs should preempt less important jobs
  Time critical jobs need to meet deadlines
  Test jobs need no stringent QoS guarantees
  How to get users to truthfully reveal their resource requirements?
Differentiated Hadoop services
Approach
  Market-based resource allocator, Tycoon
  Continuous bidding (of spending rates) for resource capacity
  Proportional allocation
  Allocation materialized as VM
  Users can evaluate and select providers based on cost/benefit metrics (best value for money)
  Gives incentive to users to be judicious about capacity requests and time to submit
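Proportional allocation in general gives each bidder a share of capacity equal to its bid divided by the sum of all bids. The sketch below shows that rule only; it is not Tycoon's implementation, and the function name and the interpretation of bids as spending rates per user are assumptions (Python 3.9 or later).

```python
def proportional_allocation(bids: dict[str, float],
                            capacity: float) -> dict[str, float]:
    """Split capacity among users in proportion to their bids."""
    total = sum(bids.values())
    if total <= 0:
        # Nobody is bidding, so nothing is allocated.
        return {user: 0.0 for user in bids}
    return {user: capacity * bid / total for user, bid in bids.items()}
```

With bids of 3.0 and 1.0 on 8 units of capacity, the users receive 6 and 2 units; raising a bid raises one's share but also one's spending, which is the incentive to request capacity carefully.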
Economic workflow optimization
Assumption:
  Not all subtasks need maximum capacity at all times
Approach:
  Automatically rescale the capacity as needed to optimize the cost/benefit ratio of the workflow as a whole
Opportunity:
  Application scalability profile not perfectly linear
Optimization strategies
Node Priority:
  P: Some nodes more performance critical than others
  S: Boost spending on critical nodes (e.g. master funding boost)
Workflow Priority:
  P: Some workflows more performance critical than others (although they look the same to the system)
  S: Declare relative priority of workflows and split budget accordingly
Job Priority:
  P: Some stages of a workflow are more I/O intensive, others more CPU intensive
  S: Boost resource spending during resource-intense stages of workflow
Bottleneck Mitigation:
  P: During map/reduce synch up some nodes may be bottlenecks
  S: Redistribute funds to active bottlenecks
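The bottleneck mitigation strategy (redistribute funds to active bottlenecks) could look roughly like this. The rule used here, moving a fixed fraction of every other node's funds and sharing it equally among the bottlenecks, is an assumption chosen for illustration; the slides do not specify how funds are redistributed (Python 3.9 or later).

```python
def redistribute_to_bottlenecks(funds: dict[str, float],
                                bottlenecks: set[str],
                                fraction: float = 0.5) -> dict[str, float]:
    """Move a fraction of non-bottleneck funds to the bottleneck nodes."""
    active = [node for node in funds if node in bottlenecks]
    if not active or len(active) == len(funds):
        # No bottleneck, or no one left to take funds from: leave as is.
        return dict(funds)
    new_funds: dict[str, float] = {}
    pooled = 0.0
    for node, amount in funds.items():
        if node in bottlenecks:
            new_funds[node] = amount
        else:
            moved = amount * fraction
            new_funds[node] = amount - moved
            pooled += moved
    bonus = pooled / len(active)  # equal share for each bottleneck
    for node in active:
        new_funds[node] += bonus
    return new_funds
```

The total budget is unchanged; only its distribution across nodes shifts toward the nodes that are holding up the synch-up.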
Optimization strategies
Best Response:
  P: When other users place competing bids, optimal configuration/allocation might change
  S: Find game-theoretical best-response bids continuously to maximize utility
Risk:
  P: Some users are more risk averse than others (can tolerate less fluctuation)
  S: Bid on nodes based on predicted guarantee to deliver a QoS level
Managing Resources
Includes clear policies on
  who to admit
  how to arbitrate among competing requests
  what resource capacity may be requested over what time frames
Isolated Datacentre
  Reset, reboot, power up, power down, get status
  Bias towards large and short experiments
  Site coordination required, e.g. accounting
XOML
Original Cloud Activities
  CloudHttpSend
  CloudHttpReceive
  CloudIfElse
  CloudSequence
  Activity node: details should be stored with this
  CloudServiceBusSend
  CloudDelay
  CloudWhile
Microsoft Azure
Citrix Cloud Centre
XenServer Cloud Edition: a complete, cloud-ready virtual infrastructure
NetScaler: to load balance, speed access to backend VMs and dynamically provision workloads.
"There's more to providing [cloud computing] than simply providing a flat virtual infrastructure. You want to have workflows, you want SLAs, you want to be able to automate and move things around, and that's essentially what Citrix is bringing to the table -- a full suite of tools to do all of that."
  James Staten
Citrix WANScaler and Citrix Workflow Studio
“Single Automated Cohesive system”
Conclusions
“Workflow management matters because much of the benefits of cloud computing comes from the speed and ease with which IT resources can be created and put into production.”
Thank you !!!
Questions ???