LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure


Post on 22-Feb-2016


Page 1: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

Joshua Goehner and Dorian Arnold

Department of Computer Science, University of New Mexico

(In collaboration with LLNL and U. of Wisconsin)

Page 2: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

A Little Preaching to the Choir

Key statistics from Top500 (Nov. 2011)
◦ 149/500 (30%) have more than 8,192 cores
◦ In 2006, this number was 12/500 (2.4%)
◦ 7 systems: more than 64K cores
◦ 6 systems: between 64K and 128K cores
◦ 3 systems: more than 128K cores

Later this year: LLNL’s Sequoia w/ 1.6M cores

2018?: Exascale systems w/ 10^7 or 10^8 cores

Software systems must scale as well!

Page 3: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

Software Infrastructure-bootstrapping

Before bootstrapping:
◦ Program image on storage device
◦ Set of (allocated) computer nodes

After bootstrapping:
◦ Application processes started on computer nodes
◦ Application’s configuration complete, ready for primary operation

Given a node allocation, start the infrastructure's composite processes and propagate necessary startup information.

Page 4: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

“I gotta get my application up and running!”

Page 5: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

(1) “First, I start all the processes on the appropriate nodes”

Page 6: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

(2) “Next, I must disseminate some initialization information”

Page 7: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

Bootstrapping is complete when the infrastructure is ready for steady-state usage.

Page 8: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

Basic Bootstrapping

Sequential instantiation
◦ e.g. rsh or ssh-based

Point-to-point propagation

[Figure: instantiation time (seconds, 0 to 10) versus number of processes (0 to 300)]

70 seconds to start 2K processes!
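The 70-second figure implies a per-process cost of roughly 34 ms if launch time grows linearly. The sketch below is a back-of-the-envelope model only: it assumes a constant per-process cost and, for comparison, a hypothetical k-ary launch tree in which every parent starts its children one after another and all parents on a level work in parallel. The function names and the tree model are illustrative assumptions, not taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Sequential launch: one process after another, constant cost each. */
double sequential_launch_seconds(size_t nprocs, double per_proc_seconds) {
    return (double)nprocs * per_proc_seconds;
}

/* Number of levels below the root that a complete k-ary tree needs
 * in order to hold at least nprocs launched processes. */
size_t tree_levels(size_t nprocs, size_t fanout) {
    assert(fanout >= 2);
    size_t levels = 0, reached = 0, width = 1;
    while (reached < nprocs) {
        width *= fanout;   /* processes on the next level */
        reached += width;  /* processes launched so far */
        levels++;
    }
    return levels;
}

/* Tree launch: each level costs fanout sequential launches per parent. */
double tree_launch_seconds(size_t nprocs, size_t fanout,
                           double per_proc_seconds) {
    return (double)tree_levels(nprocs, fanout) * (double)fanout
           * per_proc_seconds;
}
```

Under this model, 2,048 processes at 70/2048 seconds each take 70 seconds sequentially, while a fanout of 16 needs three levels and about 1.6 seconds. These are model outputs, not measurements.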

Page 9: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

More Scalable Approaches

Infrastructure-specific, scalable mechanisms
◦ Still limited by sequential operations
◦ Not portable to other infrastructures

Using high-performance resource managers
◦ Myriad interfaces
◦ No communication facilities

Generic bootstrapping infrastructures
◦ LaunchMON: targets tools with wrapper for existing RMs
◦ ScELA: sequentially launch agents that create local processes

Page 10: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

MRNet Sequential-ish Bootstrapping

Parent creates children

Local fork()/exec()

Remote rsh-based mechanism

Integrated instantiation and information propagation

MRNet’s “standard”
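The local fork()/exec() step can be sketched with plain POSIX calls. This is a minimal illustration, not MRNet's code: the helper name `spawn_and_wait` is hypothetical, and it waits for the child only so the result can be checked, whereas a parent in a launch tree would normally keep running alongside its children.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Create one local child with fork()/exec() and wait for it.
 * Returns the child's exit status, or -1 on failure. */
int spawn_and_wait(char *const argv[]) {
    assert(argv != NULL && argv[0] != NULL);
    pid_t pid = fork();
    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0) {
        execvp(argv[0], argv);  /* replaces the child image on success */
        _exit(127);             /* reached only if exec failed */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A remote child would replace the direct `execvp` target with an rsh or ssh command line, which is where the per-connection cost of the sequential approach comes from.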

Page 11: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

MRNet XT (hybrid) process launch

Bulk-launch 1 process per node

Process launches collocated processes

Information propagated in similar fashion

Page 12: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

LIBI Approach

LIBI: Lightweight infrastructure-bootstrapping infrastructure
◦ Name is no longer a work in progress
◦ Generic service for scalable distributed software infrastructure bootstrapping

Process launch

Scalable, low-level collectives

[Figure: LIBI architecture. LIBI sits between Large Scale Distributed Software (Debuggers, System Monitors, Applications, Performance Analyzers, Overlay Networks) and the Job Launchers (rsh/ssh, SLURM, OpenRTE, ALPS, LaunchMON) and Communication Services (COBO, MPI).]

Page 13: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

LIBI API

session: set of processes (to be) deployed
◦ session master manages other members
◦ session front-end interacts with session master
◦ LIBI currently supports only master/member communication

host-distribution: where to create processes
◦ <hostname, num-processes>

process distribution: how/where to create processes
◦ <session-id, executable, arguments, host-distribution, environment>
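The two tuples above map naturally onto C structs. The sketch below is an assumption about how such records could look; the type and field names (`host_dist_t`, `proc_dist_t`, `total_processes`) are hypothetical and are not LIBI's actual declarations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mirror of <hostname, num-processes>. */
typedef struct {
    const char *hostname;
    unsigned int num_processes;
} host_dist_t;

/* Hypothetical mirror of
 * <session-id, executable, arguments, host-distribution, environment>. */
typedef struct {
    int session_id;
    const char *executable;
    char *const *arguments;    /* NULL-terminated argv, may be NULL */
    const host_dist_t *hosts;  /* host-distribution entries */
    size_t num_hosts;
    char *const *environment;  /* NULL-terminated envp, may be NULL */
} proc_dist_t;

/* Total number of processes one process distribution asks for. */
unsigned int total_processes(const proc_dist_t *pd) {
    assert(pd != NULL);
    unsigned int total = 0;
    for (size_t i = 0; i < pd->num_hosts; i++)
        total += pd->hosts[i].num_processes;
    return total;
}
```

Keeping the host list separate from the executable and arguments lets the same host-distribution be reused for several process distributions in one session.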

Page 14: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

LIBI API (cont’d)

launch(process-distribution-array)
◦ instantiate processes according to input distributions

[send|recv]UsrData(session-id, msg)
◦ communicate between front-end and session master

broadcast(), scatter(), gather(), barrier()
◦ communicate between session master and members
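To make the data movement of scatter() and gather() concrete, the sketch below simulates both inside a single process, with one buffer per session member. It only illustrates the semantics (member i receives the i-th chunk, and gather reverses that); the `sim_*` names are hypothetical and nothing here reflects how LIBI implements its collectives.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Scatter: member i receives the i-th chunk of the master's buffer. */
void sim_scatter(const char *master_buf, size_t chunk, size_t nmembers,
                 char *const *member_bufs) {
    assert(master_buf != NULL && member_bufs != NULL);
    for (size_t i = 0; i < nmembers; i++)
        memcpy(member_bufs[i], master_buf + i * chunk, chunk);
}

/* Gather: the master collects one chunk from every member, in rank order. */
void sim_gather(char *const *member_bufs, size_t chunk, size_t nmembers,
                char *master_buf) {
    assert(master_buf != NULL && member_bufs != NULL);
    for (size_t i = 0; i < nmembers; i++)
        memcpy(master_buf + i * chunk, member_bufs[i], chunk);
}

/* Broadcast: every member receives the same len bytes. */
void sim_broadcast(const char *msg, size_t len, size_t nmembers,
                   char *const *member_bufs) {
    assert(msg != NULL && member_bufs != NULL);
    for (size_t i = 0; i < nmembers; i++)
        memcpy(member_bufs[i], msg, len);
}
```

A scatter followed by a gather with the same chunk size returns the master's original buffer, which is a convenient correctness check for a bootstrapping layer.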

Page 15: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

Example LIBI Front-end

int front_end() {
    LIBI_fe_init();
    LIBI_fe_createSession(sess);

    // describe what to launch and where
    proc_dist_req_t pd;
    pd.sessionHandle = sess;
    pd.proc_path = get_ExePath();
    pd.proc_argv = get_ProgArgs();
    pd.hd = get_HostDistribution();

    LIBI_fe_launch(pd);

    // test broadcast and barrier: send to the session master, await its reply
    LIBI_fe_sendUsrData(sess, msg, len);
    LIBI_fe_recvUsrData(sess, msg, len);

    // test scatter and gather: same front-end/master exchange
    LIBI_fe_sendUsrData(sess, msg, len);
    LIBI_fe_recvUsrData(sess, msg, len);

    return 0;
}

Page 16: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

Example LIBI-launched Application

session_member() {
    LIBI_init();

    // test broadcast and barrier
    LIBI_recvUsrData(msg, msg_length);
    LIBI_broadcast(msg, msg_length);
    LIBI_barrier();
    LIBI_sendUsrData(msg, msg_length);

    // test scatter and gather
    LIBI_recvUsrData(msg, msg_length);
    LIBI_scatter(msg, sizeof(rcvmsg), rcvmsg);
    LIBI_gather(sndmsg, sizeof(sndmsg), msg);
    LIBI_sendUsrData(msg, msg_length);

    LIBI_finalize();
}

Page 17: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

LIBI Implementation Status

LaunchMON-based runtime
◦ Tested with SLURM or rsh launching
◦ COBO PMGR service

[Figure: LIBI architecture diagram, as on the LIBI Approach slide]

Page 18: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

LIBI Early Evaluation

Goal: demonstrate ease of integration and performance/scalability enhancements

Microbenchmark
◦ Time to instantiate a set of processes
◦ Time to broadcast 128 bytes followed by barrier
◦ Time to scatter and gather 128 bytes
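Each microbenchmark is a wall-clock measurement around one operation. A minimal timing harness could look like the sketch below; it assumes a POSIX system with `clock_gettime`, and the names `time_op_seconds` and `copy_128_bytes` are hypothetical. The 128-byte copy is only a local stand-in for a message operation, not a LIBI call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <time.h>

typedef void (*bench_op_t)(void *ctx);

/* Average wall-clock seconds per call of op over reps repetitions. */
double time_op_seconds(bench_op_t op, void *ctx, int reps) {
    assert(op != NULL && reps > 0);
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < reps; i++)
        op(ctx);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double elapsed = (double)(t1.tv_sec - t0.tv_sec)
                   + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return elapsed / reps;
}

/* Stand-in operation: copy a 128-byte message within a 256-byte buffer. */
void copy_128_bytes(void *ctx) {
    char *buf = ctx;
    memcpy(buf + 128, buf, 128);
}
```

In a real run the callback would wrap the operation under test, for example a broadcast followed by a barrier, and the measurement would be repeated at each process count.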

Page 19: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

LIBI Microbenchmark Results

[Figure: instantiation time (seconds, 0 to 8) versus number of processes (0 to 3000); legend: Projected, Sequential, LIBI]

Page 20: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

LIBI Microbenchmark Results

[Figure: communication time (seconds, 0 to 0.18) versus number of processes (0 to 3000); legend: LIBI Broadcast & Barrier, LIBI Scatter & Gather, Sequential Broadcast & Barrier, Sequential Scatter & Gather]

Page 21: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

MRNet/LIBI Integration

MRNet uses LIBI to launch all MRNet processes
◦ Parse topology file and set up/call LIBI_launch()

Session front-end gathers/scatters startup information
◦ Parent listening socket (IP/port)
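One part of the setup step is turning the hosts named in a topology into <hostname, num-processes> pairs. The sketch below assumes the topology has already been parsed into one hostname per process; it does not parse MRNet's topology file format, and `host_count_t` and `build_host_distribution` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *hostname;
    unsigned int num_processes;
} host_count_t;

/* Collapse a per-process host list into <hostname, num-processes> entries.
 * out must have room for nprocs entries; returns the number written. */
size_t build_host_distribution(const char *const *proc_hosts, size_t nprocs,
                               host_count_t *out) {
    assert(proc_hosts != NULL && out != NULL);
    size_t nhosts = 0;
    for (size_t i = 0; i < nprocs; i++) {
        size_t j = 0;
        /* linear search for a host already seen */
        while (j < nhosts && strcmp(out[j].hostname, proc_hosts[i]) != 0)
            j++;
        if (j == nhosts) {  /* first process on this host */
            out[nhosts].hostname = proc_hosts[i];
            out[nhosts].num_processes = 0;
            nhosts++;
        }
        out[j].num_processes++;
    }
    return nhosts;
}
```

The linear search keeps the sketch short; with thousands of hosts a hash table or a sorted list would be the more sensible choice.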

Page 22: LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

Some Future Work

STAT over MRNet over LIBI performance results

Basic, scalable rsh-based mechanism

Hybrid (XT-like) launch approach

Mechanisms to alleviate filesystem contention
◦ Like our scalable binary relocation service

More flexible process and host distributions
◦ Instantiating different images in same session
◦ Integrating allocating and launching

Integrated scalable communication infrastructure