

Source: parlab.eecs.berkeley.edu/sites/all/parlab/files/juancol-poster-retreat-201005.pdf

Resource Management in Tessellation OS

Juan A. Colmenares¹, Sarah Bird¹, Henry Cook¹, Paul Pearce¹, John Shalf², Steven Hofmeyr², Krste Asanović¹ and John Kubiatowicz¹

¹ ParLab, CS Division, UC Berkeley | ² Lawrence Berkeley National Laboratory

2. Space-time Partitioning and Two-level Scheduling

A Spatial Partition (or Cell) comprises a group of processors acting within a hardware boundary.

Each cell receives a vector of basic resources:
– Some number of processors, a portion of physical memory, a portion of shared cache memory, and potentially a fraction of memory bandwidth

A cell may also receive:
– Exclusive access to other resources (e.g., certain hardware devices and raw storage partitions)
– Guaranteed fractional services (i.e., QoS guarantees) from other partitions (e.g., network service and file service)

Spatial partitioning may vary over time:
– Partitioning adapts to the needs of the system
– Some cells persist while others change with time
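The resource vector described above can be pictured as a small record type. The following is only a sketch: the class names, field names, units, and the example values are illustrative assumptions and do not come from Tessellation itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class ResourceVector:
    """Basic resources a cell receives (field names and units are assumed)."""
    processors: int
    memory_mb: int
    cache_mb: int
    memory_bandwidth_fraction: float = 0.0  # optional share of memory bandwidth


@dataclass
class Cell:
    name: str
    resources: ResourceVector
    # Resources held exclusively, e.g. a hardware device or raw storage partition.
    exclusive: Tuple[str, ...] = ()
    # Guaranteed fraction of a service offered by another partition (QoS).
    service_guarantees: Dict[str, float] = field(default_factory=dict)

    def resize(self, new_resources: ResourceVector) -> None:
        """Spatial partitioning may vary over time: swap in a new vector."""
        self.resources = new_resources


# Hypothetical cell holding 4 processors, one NIC, and two service guarantees.
app_cell = Cell(
    name="Application A",
    resources=ResourceVector(processors=4, memory_mb=2048, cache_mb=8,
                             memory_bandwidth_fraction=0.25),
    exclusive=("nic0",),
    service_guarantees={"network": 0.3, "file": 0.1},
)
# Later the system shrinks the cell.
app_cell.resize(ResourceVector(processors=2, memory_mb=1024, cache_mb=4))
print(app_cell.resources.processors)
```

Making the vector immutable and replacing it wholesale on a resize is one way to keep each allocation internally consistent; the poster does not say how resizing is actually represented.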

Parallel Computing Laboratory

[Figure: structure of a cell. Labels: Cell; Task; Address Space A; Address Space B; 2nd-level Scheduling; 2nd-level Memory Management; Tessellation Kernel (Partition Support)]

[Figure: cells in space and time. Axes: Time; Space. Labels: Cell 1, Application A; Cell 2, Application B; Service 1 (e.g., Network Service); HW Device; Cell 3]

A cell contains channel endpoints to other cells:
– Channels allow an application in a cell to access services and to interact with other applications residing in other cells
– Communication between cells is controlled for security and QoS enforcement

Channels enable efficient and non-blocking message passing.
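To make the channel idea concrete, here is a minimal sketch of a bounded, non-blocking channel with endpoint checks. The `Channel` class, its method names, and the use of a fixed capacity as a stand-in for QoS control are all assumptions made for illustration; the poster does not describe the actual channel interface.

```python
from collections import deque
from typing import Any, Optional


class Channel:
    """One-directional, bounded channel between two cells (hypothetical API)."""

    def __init__(self, sender: str, receiver: str, capacity: int) -> None:
        self.sender = sender
        self.receiver = receiver
        self.capacity = capacity  # bounded queue stands in for QoS control
        self._queue: deque = deque()

    def try_send(self, cell: str, message: Any) -> bool:
        """Enqueue without blocking; reject wrong senders and full queues."""
        if cell != self.sender:
            raise PermissionError(f"{cell} does not own the sending endpoint")
        if len(self._queue) >= self.capacity:
            return False  # caller may retry later instead of blocking
        self._queue.append(message)
        return True

    def try_receive(self, cell: str) -> Optional[Any]:
        """Dequeue without blocking; return None when nothing is pending."""
        if cell != self.receiver:
            raise PermissionError(f"{cell} does not own the receiving endpoint")
        return self._queue.popleft() if self._queue else None


chan = Channel(sender="Cell 1", receiver="Cell 3", capacity=2)
results = [chan.try_send("Cell 1", m) for m in ("a", "b", "c")]
first = chan.try_receive("Cell 3")
print(results, first)
```

Both operations return immediately, so a cell never stalls on a peer; the endpoint check models the controlled communication mentioned above in the simplest possible way.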

[Figure: partitionable hardware. Repeated six times: CPU, L1, L2 Bank, DRAM. Other labels: L1 Interconnect; DRAM & I/O Interconnect; (+) Fraction of memory bandwidth]

1. Some Basic Goals

– Offer better support to a simultaneous mix of high-throughput parallel, interactive, and real-time applications
– Enable improved application performance by exploiting many-core hardware
– Allow applications to achieve high responsiveness
– Facilitate the engineering of applications with real-time guarantees
– Allow applications to consistently deliver the above performance improvements
– Enable rapid adaptation to changes in the application mix, resource availability, and other operational conditions

Other Important Goals
– Scalability
– Better handling of power-performance tradeoffs
– Additional protection, fault-containment, and security capabilities

Scheduling at Level 1: Coarse-grained resource allocation and distribution at the cell level

Scheduling at Level 2: Fine-grained application-specific scheduling within a cell
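The split between the two levels can be sketched as follows. Level 1 only decides how many cores each cell gets; level 2 is a per-cell policy that decides which task runs on which of those cores. The function names, the in-order granting rule, and the two example policies are assumptions chosen for illustration, not the policies Tessellation uses.

```python
from typing import Dict, List, Tuple

Task = Tuple[str, int]  # (task name, estimated cost); shape is assumed


def level1_allocate(total_cores: int, requests: Dict[str, int]) -> Dict[str, int]:
    """Level 1: coarse-grained grant of cores to cells, in request order."""
    grants: Dict[str, int] = {}
    free = total_cores
    for cell, wanted in requests.items():
        grants[cell] = min(wanted, free)
        free -= grants[cell]
    return grants


def round_robin(tasks: List[Task], cores: int) -> List[List[str]]:
    """Level 2 policy A: deal tasks across the cell's cores in turn."""
    return [[name for name, _ in tasks[i::cores]] for i in range(cores)]


def longest_first(tasks: List[Task], cores: int) -> List[List[str]]:
    """Level 2 policy B: place costly tasks first on the least-loaded core."""
    plan: List[List[str]] = [[] for _ in range(cores)]
    load = [0] * cores
    for name, cost in sorted(tasks, key=lambda t: -t[1]):
        i = load.index(min(load))
        plan[i].append(name)
        load[i] += cost
    return plan


grants = level1_allocate(4, {"A": 2, "B": 3})
tasks = [("t1", 3), ("t2", 1), ("t3", 2)]
plan_a = round_robin(tasks, grants["A"])    # cell A picks its own policy
plan_b = longest_first(tasks, grants["B"])  # cell B picks a different one
print(grants, plan_a, plan_b)
```

The point of the sketch is that level 1 never looks at tasks and level 2 never looks at other cells, so each cell can choose a scheduler suited to its application. Both policies assume the cell was granted at least one core.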

Space-Time Resource Graph (STRG): contains the current distribution of system resources to cells.

[Figure: STRG. Labels: All system resources; Cell group with fraction of resources; Cell]
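One way to read the STRG is as a hierarchy whose root holds all system resources, whose inner nodes are cell groups holding a fraction of their parent's resources, and whose leaves are cells. The sketch below assumes exactly that tree shape and a single scalar fraction per node; the `STRGNode` type and `cell_shares` helper are hypothetical and ignore the time dimension.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class STRGNode:
    name: str
    fraction: float = 1.0  # share of the parent's resources (assumed scalar)
    children: List["STRGNode"] = field(default_factory=list)

    def is_cell(self) -> bool:
        """Leaves of the graph are cells; inner nodes are cell groups."""
        return not self.children


def cell_shares(node: STRGNode, inherited: float = 1.0) -> Dict[str, float]:
    """Resolve each cell's share of all system resources."""
    share = inherited * node.fraction
    if node.is_cell():
        return {node.name: share}
    shares: Dict[str, float] = {}
    for child in node.children:
        shares.update(cell_shares(child, share))
    return shares


root = STRGNode("All system resources", 1.0, [
    STRGNode("Cell group", 0.5, [
        STRGNode("Cell 1", 0.5),
        STRGNode("Cell 2", 0.5),
    ]),
    STRGNode("Cell 3", 0.5),
])
shares = cell_shares(root)
print(shares)
```

Multiplying fractions along the path from the root gives each cell's absolute share, which is the quantity a lower layer would need in order to implement the graph.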

3. Partition-management Layers

[Figure: partition-management layers. Axes: Time; Space]

Policy Service: Resource Allocation and Adaptation
– Allocates resources to cells based on global policies (i.e., reflects global goals)
– Adapts resource allocations in response to changing conditions
– Produces only implementable STRGs
– Performs admission control

Mapping & Time-Multiplexing Layer: Resource Distribution and Time-slicing
– Implements the STRG by performing a bin-packing-like operation
– Makes no policy decisions, but rather implements them
– Time-slices at a coarse granularity (when necessary)

Mechanism Layer: Implementation of hardware partitions and secure channels
– Device dependent: exploits hardware support for QoS and partitioning when available
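The bin-packing-like step with coarse time-slicing can be illustrated with first-fit decreasing: each bin is a time slice with as many slots as there are cores, and cells that do not fit together are placed in different slices. First-fit decreasing is a stand-in chosen for this sketch; the poster does not name the algorithm, and `pack_into_time_slices` is a hypothetical helper.

```python
from typing import Dict, List


def pack_into_time_slices(demands: Dict[str, int],
                          total_cores: int) -> List[List[str]]:
    """Group cells into time slices so each slice fits on the machine."""
    slices: List[List[str]] = []  # cells that run concurrently in a slice
    free: List[int] = []          # cores still unused in each slice
    # First-fit decreasing: place the largest demands first.
    for cell, cores in sorted(demands.items(), key=lambda kv: -kv[1]):
        if cores > total_cores:
            # The policy layer should only hand down implementable plans.
            raise ValueError(f"{cell} needs more cores than exist")
        for i, remaining in enumerate(free):
            if cores <= remaining:
                slices[i].append(cell)
                free[i] -= cores
                break
        else:
            # No existing slice has room: open a new time slice.
            slices.append([cell])
            free.append(total_cores - cores)
    return slices


schedule = pack_into_time_slices({"A": 3, "B": 2, "C": 2, "D": 1}, total_cores=4)
print(schedule)
```

When every cell fits at once the result is a single slice and no time-multiplexing is needed, which matches the "when necessary" qualifier above. The function makes no policy decision of its own; it only maps the demands it is given.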

4. Resource-management Software Architecture

Research supported by Microsoft (Award #024263) and Intel (Award #024894) funding and by matching funding from U.C. Discovery (Award #DIG0710227).

Microsoft/UPCRC

[Figure: resource-management software architecture. Labels, grouped by component:]

Policy Service
– Admission Control: Cell-Creation & Resizing Requests from Users; ACK / NACK
– Resource-Allocation and Adaptation Mechanism: Major Change Request; Minor Changes; ACK / NACK
– Global Policies / User Policies and Preferences
– Offline Models and Behavioral Parameters
– Online Performance Monitoring, Model Building, and Prediction; Performance Counters
– Space-Time Resource Graph (STRG) (Current Resources): All system resources; Cell group with fraction of resources; Cell

Tessellation Kernel (Trusted)
– STRG Validator & Resource Planner
– Partition Mapping and Multiplexing Layer: Resource Plan Executor
– Partition Mechanism Layer: QoS Enforcement; Partition Implementation; Channel Authenticator

Partitionable Hardware Resources
– Cores; Physical Memory; Memory Bandwidth; Cache/Local Store; Disks; NICs

Apps
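The request path in the architecture (cell-creation and resizing requests answered with ACK or NACK) can be sketched as a simple capacity check. This is only an illustration under strong assumptions: cores are the only resource considered, and the `AdmissionController` class and its `request` method are hypothetical names, not part of Tessellation.

```python
from typing import Dict


class AdmissionController:
    """Accepts a request only if the resulting allocation stays feasible."""

    def __init__(self, total_cores: int) -> None:
        self.total_cores = total_cores
        self.allocated: Dict[str, int] = {}  # cores currently held per cell

    def request(self, cell: str, cores: int) -> str:
        """Handle a cell-creation or resizing request; reply ACK or NACK."""
        # Cores held by every other cell; a resize replaces the old grant.
        in_use = sum(c for name, c in self.allocated.items() if name != cell)
        if cores < 0 or in_use + cores > self.total_cores:
            return "NACK"
        self.allocated[cell] = cores
        return "ACK"


ac = AdmissionController(total_cores=8)
replies = [
    ac.request("A", 4),  # create cell A
    ac.request("B", 6),  # would exceed 8 cores
    ac.request("B", 4),  # fits exactly
    ac.request("A", 2),  # resize A downward
]
print(replies, ac.allocated)
```

Rejecting infeasible requests up front is one way to ensure that only implementable allocations reach the layers below.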