
  • Raouf Boutaba

    Research Challenges in Cloud Computing

    D. Cheriton School of Computer Science, University of Waterloo

    CS856 W17

  • Outline
    - Data Center Networks
    - Network Management
    - Resource and Performance Management
    - Energy Management
    - Pricing and Economics
    - Security and Enterprise Applications

  • Data Center Networks
    - Data center networks form the backbone of data centers
      - Connecting tens of thousands of servers that may host millions of applications
    - Characteristics
      - Very large scale
      - Single administrative domain
      - Bandwidth is often the performance bottleneck

    3 Research Issues and Current Trends

  • Conventional Architecture


    Source: VL2: A Scalable and Flexible Data Center Network, SIGCOMM 2009

  • Limitations of Conventional Architectures
    - High oversubscription ratio (i.e. a bandwidth bottleneck)
      - Typically 1:5, 1:80 or even 1:240 at the root
    - Poor reliability and utilization
    - Static network address assignment
      - Fragmentation of resources
      - Difficult to support VM migration due to address reconfiguration
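
To make the oversubscription figures concrete, the short sketch below computes the ratio for one switch layer and the bandwidth left per host in the worst case. The rack size and link speeds are hypothetical values chosen for illustration; they do not come from the slides.

```python
def oversubscription(num_hosts: int, host_gbps: float, uplink_gbps: float) -> float:
    """Return the factor N in an oversubscription ratio written as 1:N.

    N is the total downstream (host-facing) bandwidth divided by the
    upstream bandwidth available to carry it.
    """
    return (num_hosts * host_gbps) / uplink_gbps


def worst_case_host_mbps(host_gbps: float, factor: float) -> float:
    """Bandwidth per host (in Mbps) when every host sends through the bottleneck."""
    return host_gbps * 1000 / factor


# Hypothetical rack: 40 hosts at 1 Gbps sharing 8 Gbps of uplink capacity.
rack_factor = oversubscription(num_hosts=40, host_gbps=1, uplink_gbps=8)
print(f"rack uplink oversubscription: 1:{rack_factor:g}")

# At 1:240 a 1 Gbps host is left with only a few Mbps across the root.
print(f"worst case at 1:240: {worst_case_host_mbps(1, 240):.2f} Mbps")
```

The second figure shows why the root of a conventional tree becomes the bottleneck for traffic that crosses the data center.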


  • Design Objectives
    - Scalability
      - Scale to millions of servers without compromising performance
    - Economics
      - Built using commodity switches and servers
    - Performance
      - Low network diameter
      - Large bisection bandwidth
    - Reliability
      - Multiple forwarding paths for host-to-host communication
    - Application Support
      - Support address reconfiguration and VM migration

  • Architectural Proposals
    - Switch-Centric
      - Forwarding using only switches
      - E.g. PortLand, VL2
    - Server-Centric
      - Forwarding using both switches and servers
      - E.g. DCell, BCube, CamCube

  • PortLand
    - Uses a fat-tree topology for path diversity and large bisection bandwidth
    - Operates at Layer 2
      - Uses Pseudo-MAC (PMAC) addresses in the format pod.position.port.vmid for forwarding
      - Uses a centralized fabric manager to manage the actual-to-pseudo MAC mapping
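
The pod.position.port.vmid format can be illustrated with a small encoder and decoder. This is a minimal sketch, not PortLand's implementation: the field widths (16-bit pod, 8-bit position, 8-bit port, 16-bit vmid, 48 bits in total) are an assumption based on the PortLand paper, and `FabricManager` is a hypothetical stand-in for the centralized mapping service.

```python
# Assumed field widths in bits; together they fill a 48-bit MAC address.
FIELDS = (("pod", 16), ("position", 8), ("port", 8), ("vmid", 16))


def encode_pmac(pod: int, position: int, port: int, vmid: int) -> str:
    """Pack the four location fields into a colon-separated MAC string."""
    value = 0
    for (name, bits), field in zip(FIELDS, (pod, position, port, vmid)):
        if not 0 <= field < (1 << bits):
            raise ValueError(f"{name}={field} does not fit in {bits} bits")
        value = (value << bits) | field
    return ":".join(f"{b:02x}" for b in value.to_bytes(6, "big"))


def decode_pmac(pmac: str) -> tuple:
    """Recover (pod, position, port, vmid) from a PMAC string."""
    value = int(pmac.replace(":", ""), 16)
    return ((value >> 32) & 0xFFFF, (value >> 24) & 0xFF,
            (value >> 16) & 0xFF, value & 0xFFFF)


class FabricManager:
    """Hypothetical central table mapping actual MACs (AMAC) to PMACs."""

    def __init__(self):
        self.amac_to_pmac = {}

    def register(self, amac: str, pod: int, position: int, port: int, vmid: int) -> str:
        pmac = encode_pmac(pod, position, port, vmid)
        self.amac_to_pmac[amac] = pmac
        return pmac


manager = FabricManager()
pmac = manager.register("3c:52:82:11:22:33", pod=2, position=1, port=0, vmid=1)
print(pmac, decode_pmac(pmac))
```

Because the PMAC encodes the host's location, switches can forward on address prefixes (pod, then position, then port) instead of keeping one entry per host.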


  • BCube
    - Targets container-based data centers
    - Uses a generalized hypercube topology
    - Overlay routing at layer 2.5
    - Efficient support for communication patterns such as one-to-one, one-to-many, and many-to-many using source routing

    BCube0 = n servers + one mini-switch (n
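
The recursive structure can be sketched in a few lines. The model below is an assumption based on the BCube paper, not on the truncated slide text: a BCube_k built from n-port switches has n^(k+1) servers, each server is addressed by k+1 digits in base n, and two servers share a switch exactly when their addresses differ in one digit. The function names are illustrative.

```python
from itertools import product


def bcube_servers(n: int, k: int) -> list:
    """All server addresses in a BCube_k with n-port switches."""
    return list(product(range(n), repeat=k + 1))


def one_hop_neighbors(addr: tuple, n: int) -> list:
    """Servers reachable through a single switch: addresses differing in one digit."""
    neighbors = []
    for level in range(len(addr)):
        for digit in range(n):
            if digit != addr[level]:
                neighbors.append(addr[:level] + (digit,) + addr[level + 1:])
    return neighbors


def bcube_path(src: tuple, dst: tuple) -> list:
    """Correct one differing digit per hop, so hops = number of differing digits."""
    path, current = [src], list(src)
    for level in range(len(src)):
        if current[level] != dst[level]:
            current[level] = dst[level]
            path.append(tuple(current))
    return path


servers = bcube_servers(n=4, k=1)
print(len(servers), "servers")
print(len(one_hop_neighbors((0, 0), n=4)), "one-hop neighbors of (0, 0)")
print(bcube_path((0, 0), (3, 2)))
```

Correcting the digits in a different order yields alternative paths between the same pair, which is one source of the path diversity that source routing can exploit.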

  • Research Challenges
    - Understanding the trade-offs between different architectures
      - Switch-centric vs. server-centric
    - Comparison criteria
      - Network capacity
      - Robustness
      - Capital and operational cost
    - Managing and upgrading existing data center networks over time

  • Outline
    - Data Center Networks
    - Network Management
    - Resource and Performance Management
    - Energy Management
    - Pricing and Economics
    - Security Management
    - Migrating Enterprise Applications to the Cloud

  • Network Management Issues
    - Naming and addressing
      - Address configuration and management
    - Flow control and management
      - Congestion control
      - Flow scheduling

  • Address Configuration
    - ID/locator separation is a design principle of data center networks
      - E.g. PortLand maintains the mapping between physical MAC and hierarchical PMAC addresses
      - E.g. BCube assigns virtual addresses to individual hosts
    - Automatic address reconfiguration is a requirement
      - Manual configuration is costly and error-prone

  • Congestion Control Data center traffic typically consists of

    (>80%) Low latency short flows (i.e. user facing requests) (

  • Flow Scheduling
    - Given the path diversity provided by data center networks, route network flows to minimize congestion
    - Current approaches
      - Equal-Cost Multipath (ECMP)
        - Determines the path using a hash function (called flow hashing)
      - Valiant Load Balancing (VLB)
        - Bounces packets off random intermediary nodes (switches or servers)
    - Limitation: inefficient for non-uniform traffic patterns
      - Two heavyweight flows may collide, resulting in congestion
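
The two approaches and their shared limitation can be sketched as follows. This is an illustrative model only: real switches hash packet header fields in hardware, while this sketch hashes a 5-tuple with SHA-256 for determinism. The flow tuples, switch names and function names are hypothetical.

```python
import hashlib
import random


def ecmp_path(flow: tuple, num_paths: int) -> int:
    """Flow hashing: every packet of a flow maps to the same path index."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_paths


def vlb_intermediate(switches: list, rng: random.Random) -> str:
    """Valiant Load Balancing: bounce traffic off a randomly chosen intermediary."""
    return rng.choice(switches)


# Five hypothetical flows (src IP, dst IP, src port, dst port, protocol).
flows = [("10.0.0.1", "10.0.1.1", 40000 + i, 80, "tcp") for i in range(5)]
paths = [ecmp_path(flow, num_paths=4) for flow in flows]
print("ECMP path per flow:", paths)

# With five flows and four paths, at least two flows must share a path.
# The hash ignores flow size, so the colliding flows may both be large.
print("collision present:", len(set(paths)) < len(paths))

print("VLB intermediary:", vlb_intermediate(["core0", "core1", "core2"], random.Random(7)))
```

Neither scheme looks at flow size or current link load, which is why both behave well for many small flows and poorly when a few large flows dominate.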


  • Flow Scheduling (cont.)
    - Flow scheduling
      - Separate flows into large and small flows
      - For small flows, use ECMP or VLB
      - For large flows, use centralized scheduling
        - A variant of the NP-hard multi-commodity flow problem
    - Implementation
      - Monitor network flows
      - Dynamically insert forwarding entries for large flows
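
Because the exact problem is NP-hard, a centralized scheduler typically relies on a heuristic. The sketch below uses a greedy first-fit placement, chosen here purely for illustration; the slides do not name a specific algorithm. The threshold, demands, path names and capacities are hypothetical.

```python
def schedule_flows(flows: dict, path_capacity: dict, threshold: float):
    """Split flows by size and place the large ones greedily.

    flows: flow id -> measured demand (e.g. in Gbps)
    path_capacity: path id -> capacity in the same unit
    Returns (placement, hashed): placement maps each large flow to a path,
    hashed lists the flows left to ECMP/VLB.
    """
    residual = dict(path_capacity)
    placement, hashed = {}, []
    # Consider the largest flows first so they get the emptiest paths.
    for flow_id, demand in sorted(flows.items(), key=lambda kv: -kv[1]):
        if demand < threshold:
            hashed.append(flow_id)  # small flow: leave to ECMP or VLB
            continue
        for path, free in residual.items():
            if free >= demand:
                residual[path] = free - demand
                placement[flow_id] = path  # would become a forwarding entry
                break
        else:
            hashed.append(flow_id)  # no path has room: fall back to hashing
    return placement, hashed


placement, hashed = schedule_flows(
    flows={"backup": 6, "shuffle": 5, "rpc": 0.1},
    path_capacity={"p0": 10, "p1": 10},
    threshold=1,
)
print(placement, hashed)
```

In a deployment, `placement` would be translated into forwarding entries installed on the switches along each chosen path, and the loop would be rerun as flow measurements change.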


  • Research Directions
    - Configuration management
      - Reducing the complexity of management tasks such as address configuration
    - Traffic management
      - Supporting the various usage patterns of cloud applications
    - Leveraging new network management paradigms such as SDN


  • Resource and Performance Management
    - A cloud computing environment hosts a myriad of applications with diverse performance objectives
    - How to effectively allocate resources to applications to satisfy their performance objectives?
    - Sub-problems
      - Performance modeling and management for each individual application
      - Run-time resource management

  • Application Performance Management
    - An application owner needs to understand the performance model of the application and adjust resource requirements according to workload conditions
      - E.g. increase the number of web server replicas to mitigate a flash crowd

    [Figure: control loop with the blocks Demand Prediction, Controller, Application and Performance Model, labeled with Input and Output]

  • Application Performance Management (cont.)
    - Using probabilistic / statistical methods
      - Queuing models
      - Machine learning
    - Proactive vs. reactive control
      - Proactive control uses predicted demand to allocate resources before they are needed
      - Reactive control responds to immediate demand fluctuations when prediction is not available
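
A queuing model and the two control modes can be combined in a small sketch. It assumes each replica behaves as an independent M/M/1 queue with load split evenly across replicas, which is a simplification made here and not a model taken from the slides. The rates, the response-time target and the function names are hypothetical.

```python
import math
from typing import Optional


def replicas_needed(arrival_rate: float, service_rate: float, target_time: float) -> int:
    """Smallest replica count n meeting the mean response-time target.

    M/M/1 per replica: T = 1 / (mu - lambda / n) <= target
    which rearranges to n >= lambda / (mu - 1 / target).
    """
    headroom = service_rate - 1.0 / target_time
    if headroom <= 0:
        raise ValueError("target is unreachable even on an idle replica")
    return max(1, math.ceil(arrival_rate / headroom))


def control_step(observed_rate: float, predicted_rate: Optional[float],
                 service_rate: float, target_time: float) -> int:
    """Proactive when a prediction exists, reactive otherwise."""
    demand = predicted_rate if predicted_rate is not None else observed_rate
    return replicas_needed(demand, service_rate, target_time)


# Each replica serves 20 req/s; the target mean response time is 0.25 s.
print("reactive :", control_step(80, None, service_rate=20, target_time=0.25))
print("proactive:", control_step(80, 160, service_rate=20, target_time=0.25))
```

With a predicted flash crowd of 160 req/s, the proactive step provisions replicas before the load arrives, while the reactive step sizes only for the 80 req/s currently observed.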


  • Data Center Resource Management
    - Objectives
      - Mitigating performance bottlenecks (i.e. hotspots)
      - Improving application schedulability
      - Improving server utilization
      - Improving resource sharing among applications
      - Reducing energy cost
    - Current approach: using various virtualization techniques
      - Dynamically adjusting the resource allocation of applications
      - Virtual machine migration

  • Data Center Resource Management (cont.)
    - The optimal placement problem generalizes the multi-dimensional bin packing problem
      - NP-hard to solve
    - Additional factors
      - Job arrival process
      - Job duration
      - Reconfiguration procedure and cost
        - E.g. cost of migration
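
Since the problem is NP-hard, placement is usually approached with heuristics. The sketch below applies first-fit decreasing to two resource dimensions (CPU cores and memory). It is one common heuristic picked for illustration, it ignores the additional factors listed above such as arrivals, durations and migration cost, and the VM names and sizes are hypothetical.

```python
def first_fit_decreasing(vms: dict, capacity: tuple) -> dict:
    """Place VMs on identical servers; returns VM name -> server index.

    vms: name -> demand per dimension, e.g. (cpu_cores, mem_gb)
    capacity: per-server capacity in the same dimensions
    """
    def size(name):
        # Rank a VM by its most constrained dimension, as a fraction of a server.
        return max(d / c for d, c in zip(vms[name], capacity))

    residuals = []  # remaining capacity of each opened server
    placement = {}
    for name in sorted(vms, key=size, reverse=True):
        demand = vms[name]
        if any(d > c for d, c in zip(demand, capacity)):
            raise ValueError(f"{name} does not fit on an empty server")
        for index, free in enumerate(residuals):
            if all(d <= f for d, f in zip(demand, free)):
                break  # first opened server with room in every dimension
        else:
            residuals.append(list(capacity))  # open a new server
            index = len(residuals) - 1
        residuals[index] = [f - d for f, d in zip(residuals[index], demand)]
        placement[name] = index
    return placement


placement = first_fit_decreasing(
    vms={"web": (4, 8), "db": (8, 16), "cache": (2, 16), "batch": (6, 4)},
    capacity=(16, 32),
)
print(placement, "servers used:", len(set(placement.values())))
```

A VM fits only if every dimension fits, which is what distinguishes the multi-dimensional problem from packing on a single resource.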


  • Research Directions
    - Understanding application resource requirements
      - E.g. workload characterization, application performance analysis
    - Resource management framework for data-center-wide workloads
    - Multi-tenancy issues
      - Application owners and cloud owners may have conflicting objectives


  • Energy Management
    - Reducing energy consumption is a critical objective of cloud computing
    - Power and cooling costs constitute a large portion of data center expenditure
      - 25%-30% of total data center operational cost
    - Government regulations call for environmentally friendly (i.e. green) data centers

  • Cost of Consumption
    - Power and cooling cost millions of dollars monthly

    [Figure: Estimated monthly operational expenditure of a 50k-machine data center. Source: http://perspectives.mvdirona.com/]

  • Reducing Energy Cost
    - Server consolidation
      - Reducing the number of servers used by turning off unused servers
    - Energy-aware scheduling
      - Scheduling jobs to reduce power and cooling costs
    - Energy-efficient networks
      - Dynamically adjusting the active network elements to reduce power cost

  • Server Consolidation
    - Consolidating application workloads on a smaller number of servers to save server power cost
    - However, consolidation increases resource contention among applications, which may hurt their performance

    Challenges Understanding the energy and performance impac
