
Page 1: Data View of TeraGrid Logical Site Model

SAN DIEGO SUPERCOMPUTER CENTER, UCSD

NATIONAL PARTNERSHIP FOR ADVANCED COMPUTATIONAL INFRASTRUCTURE

TeraGrid: Logical Site Model

Chaitan Baru, Data and Knowledge Systems

San Diego Supercomputer Center

Page 2: Data View of TeraGrid Logical Site Model

National Science Foundation TeraGrid

• Prototype for Cyberinfrastructure (the “lower” levels)

• High Performance Network: 40 Gb/s backbone, 30 Gb/s to each site

• National Reach: SDSC, NCSA, CIT, ANL, PSC

• Over 20 teraflops compute power

• Approx. 1 PB rotating storage

• Extending by 2–3 sites in Fall 2003

Page 3: Data View of TeraGrid Logical Site Model

Services/Software View of Cyberinfrastructure

[Diagram: layered stack of cyberinfrastructure, top to bottom]

• Applications: Environmental Science, High Energy Physics, Proteomics/Genomics, …

• Domain-specific Cybertools (software)

• Shared Cybertools (software): Grid Services & Middleware; Development Tools & Libraries

• Hardware: Distributed Resources (computation, communication, storage, etc.)

Page 4: Data View of TeraGrid Logical Site Model

SDSC Focus on Data: A Cyberinfrastructure “Killer App”

• Over the next decade, data will come from everywhere
  • Scientific instruments
  • Experiments
  • Sensors and sensornets
  • New devices (personal digital devices, computer-enabled clothing, cars, …)

• And will be used by everyone
  • Scientists
  • Consumers
  • Educators
  • General public

• The software environment will need to support unprecedented diversity, globalization, integration, scale, and use

[Diagram: data flowing in from sensors, simulations, instruments, and analysis]

Page 5: Data View of TeraGrid Logical Site Model

Prototype for Cyberinfrastructure

Page 6: Data View of TeraGrid Logical Site Model

SDSC Machine Room Data Architecture

• Enable SDSC to be the grid data engine

[Diagram: SDSC machine room data architecture]

• Compute and analysis platforms: Blue Horizon, Linux Cluster (4 TF), Sun F15K, Database Engine, Data Miner, Vis Engine, Power 4, Power 4 DB

• Interconnects: WAN (30 Gb/s), LAN (multiple GbE, TCP/IP), SAN (2 Gb/s, SCSI), SCSI/IP or FC/IP

• Storage: FC Disk Cache (400 TB), FC GPFS Disk (100 TB, 200 MB/s per controller), Local Disk (50 TB), DBMS disk (~10 TB), HPSS silos and tape (6 PB, 32 tape drives at 30 MB/s each, 1 GB/s disk to tape)

• 0.5 PB disk

• 6 PB archive

• 1 GB/s disk-to-tape

• Support for DB2 / Oracle

Page 7: Data View of TeraGrid Logical Site Model

The TeraGrid Logical Site View

• Ideally, applications / users would like to see:
  • One single computer
  • Global everything: file system, HSM, database system
  • With the highest possible performance

• We will get there in steps
  • Meanwhile, the TeraGrid Logical Site View provides a uniform view of sites
  • A common abstraction supported by every site

Page 8: Data View of TeraGrid Logical Site Model

Logical Site View

• The Logical Site View is currently provided simply as a set of environment variables
  • It can easily become a set of services

• This is the minimum required to enable a TG application to easily make use of TG storage resources

• However, for “power” users, we also anticipate the need to expose the mapping from logical to physical resources at each site
  • This enables applications to take advantage of site-specific configurations and obtain optimal performance (see the sketch below)
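
The slides leave open how that mapping would be exposed. Below is a minimal Python sketch under the assumption that each site publishes a table from logical resource names to physical descriptions; the paths, bandwidth figures, and function name are all hypothetical, not part of any TG specification.

    # Hypothetical sketch of a site's logical-to-physical resource mapping.
    # Paths and bandwidth figures are illustrative, not actual TG site data.
    SITE_RESOURCE_MAP = {
        "TG_CLUSTER_SCRATCH": {
            "physical_path": "/gpfs/scratch",    # assumed example path
            "filesystem": "GPFS",
            "peak_mb_per_s": 200,                # illustrative figure
        },
        "TG_GLOBAL_SCRATCH": {
            "physical_path": "/global/scratch",  # assumed example path
            "filesystem": "NFS",
            "peak_mb_per_s": 50,
        },
    }

    def physical_resource(logical_name):
        """Return the physical description behind a logical resource name."""
        if logical_name not in SITE_RESOURCE_MAP:
            raise KeyError("site does not advertise " + logical_name)
        return SITE_RESOURCE_MAP[logical_name]

    # A "power" application can pick the fastest advertised scratch space:
    fastest = max(SITE_RESOURCE_MAP,
                  key=lambda k: SITE_RESOURCE_MAP[k]["peak_mb_per_s"])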

Page 9: Data View of TeraGrid Logical Site Model

Basic Data Operations

• The Data WG has stated as a minimum requirement:
  a) The ability for a user to transfer data from any TG storage resource to memory on any TG compute resource, possibly via an intermediate storage resource
  b) The ability to transfer data between any two TG storage resources (see the sketch below)
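
As a concrete illustration of requirement (b): TG sites provided GridFTP, so a transfer between two storage resources could be driven through the standard globus-url-copy client. The sketch below assumes that client is on PATH and that valid grid credentials exist; the hostnames, paths, and wrapper function are placeholders.

    import subprocess

    def transfer(src_url, dst_url):
        """Copy data between two GridFTP endpoints via globus-url-copy."""
        # check=True raises CalledProcessError if the transfer fails.
        subprocess.run(["globus-url-copy", src_url, dst_url], check=True)

    if __name__ == "__main__":
        transfer(
            "gsiftp://gridftp.site-a.example.org/scratch/user/input.dat",
            "gsiftp://gridftp.site-b.example.org/staging/user/input.dat",
        )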

Page 10: Data View of TeraGrid Logical Site Model

[Diagram: Logical Site View — storage and compute resources within a TG site]

• Compute Clusters (with Scratch space and Staging Areas)

• HSM (with a Staging Area)

• DBMS / Collection Management (with a Staging Area)

• “Network” Staging Area

Page 11: Data View of TeraGrid Logical Site Model

Environment Variables

• TG_NODE_SCRATCH
• TG_CLUSTER_SCRATCH
• TG_GLOBAL_SCRATCH
• TG_SITE_SCRATCH …?
• TG_CLUSTER_HOME
• TG_GLOBAL_HOME
• TG_STAGING
• TG_PFS
  • TG_PFS_GPFS, TG_PFS_PVFS, TG_PFS_LUSTRE
• TG_SRB_STAGING
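
A minimal sketch of how a TG application might consume these variables, falling back from the most local to the most global scratch space. The variable names come from the list above; the fallback policy itself is an illustrative assumption, not a stated TG convention.

    import os

    def pick_scratch():
        """Return the first scratch path this site advertises."""
        for var in ("TG_NODE_SCRATCH", "TG_CLUSTER_SCRATCH", "TG_GLOBAL_SCRATCH"):
            path = os.environ.get(var)
            if path:
                return path
        raise RuntimeError("site advertises no TG scratch space")

    # Place job-local working data on whichever scratch space is available.
    workdir = os.path.join(pick_scratch(), "my_job")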

Page 12: Data View of TeraGrid Logical Site Model

Issues Under Consideration

• Suppose a user wants to run computation, C, on data, D

• The TG middleware should automatically figure out:
  • Whether C should move to where D is, or vice versa (a rough cost model is sketched below)
  • Whether the data, D, should be pre-fetched or “streamed”
  • Whether output data should be streamed to persistent storage or staged via intermediate storage
  • Whether prefetch/staging time ought to be “charged” to the user or not
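
For the first of these questions, a rough cost comparison is the natural starting point. The sketch below is purely illustrative: the slide leaves the policy open, and the bandwidth-only cost model and example numbers are assumptions.

    def should_move_data(data_size_gb, link_gbps, code_staging_s):
        """True if shipping data D to computation C looks cheaper than
        shipping C to D, under a naive bandwidth-only cost model."""
        transfer_s = (data_size_gb * 8.0) / link_gbps  # GB -> gigabits -> seconds
        return transfer_s < code_staging_s

    # e.g. 100 GB over a 30 Gb/s TG link takes ~27 s, so moving the data
    # wins whenever relocating the computation would take longer.
    print(should_move_data(100, 30.0, 60.0))  # -> True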