Grid Computing and Cloud Computing
“Cloud” Computing is 1+ yr old
Michael Sheehan’s GoGrid Blog, July 25, 2008 http://linux.sys-con.com/node/587717
Confused?
[Diagram: a cloud of overlapping terms with question marks between them – SaaS (Software as a Service), Utility Computing, Virtualization, Grid Computing, Cluster Computing, Cloud Computing, P2P]
One can categorize each component along a spectrum from usage model to infrastructure:
[Diagram: Cloud Computing, SaaS, Grid Computing, Cluster Computing, Utility Computing, Virtualization, and P2P arranged between the "Usage Model" and "Infrastructure" ends of the spectrum]
Grid Computing
What is a Grid?
Enable “coordinated resource sharing & problem solving in dynamic, multi-institutional virtual organizations.”
(Source: “The Anatomy of the Grid”)
Virtual Organizations
TeraGrid
What is the TeraGrid?
Technology + Support = Science
– NSF has invested $246 million
– In production since October 2004; it now integrates, over a high-performance network, 750 teraflops of computing power, 30 PB of storage, and database resources from more than 100 disciplines
TeraGrid’s 3-pronged strategy to further science:
• DEEP Science: Enabling Terascale Science – make science more productive through an integrated set of very-high-capability resources (ASTA projects)
• WIDE Impact: Empowering Communities – bring TeraGrid capabilities to the broad science community (Science Gateways)
• OPEN Infrastructure, OPEN Partnership – provide a coordinated, general-purpose, reliable set of services and resources (Grid interoperability working group)
TeraGrid Usage
TeraGrid PIs by Institution
[Map legend: Blue: 10 or more PIs; Red: 5–9 PIs; Yellow: 2–4 PIs; Green: 1 PI]
TeraGrid resources across eight sites (ANL/UC, IU, NCSA, ORNL, PSC, Purdue, SDSC, TACC):

Computational resources:
– ANL/UC: Itanium 2 (0.5 TF); IA-32 (0.5 TF)
– IU: Itanium 2 (0.2 TF); IA-32 (2.0 TF)
– NCSA: Itanium 2 (10.7 TF); SGI SMP (7.0 TF); Dell Xeon (17.2 TF); IBM p690 (2 TF); Condor Flock (1.1 TF)
– ORNL: IA-32 (0.3 TF)
– PSC: XT3 (10 TF); TCS (6 TF); Marvel SMP (0.3 TF)
– Purdue: heterogeneous (1.7 TF); IA-32 (11 TF), opportunistic
– SDSC: Itanium 2 (4.4 TF); Power4+ (15.6 TF); Blue Gene (5.7 TF)
– TACC: IA-32 (6.3 TF)

Online storage (per site, in the order above): 20 TB, 32 TB, 1140 TB, 1 TB, 300 TB, 26 TB, 1400 TB, 50 TB
Mass storage (where offered): 1.2 PB, 5 PB, 2.4 PB, 1.3 PB, 6 PB, 2 PB
Network (Gb/s, hub; per site, in the order above): 30 CHI, 10 CHI, 30 CHI, 10 ATL, 30 CHI, 10 CHI, 10 LA, 10 CHI

Data collections (count, approximate total size, access methods):
– 5 collections, >3.7 TB, URL/DB/GridFTP
– >30 collections, URL/SRB/DB/GridFTP
– 4 collections, 7 TB, SRB/Portal/OPeNDAP
– >70 collections, >1 PB, GFS/SRB/DB/GridFTP
– 4 collections, 2.35 TB, SRB/Web Services/URL

Instruments: proteomics, X-ray crystallography; SNS and HFIR facilities

Visualization resources (RI: remote interactive, RB: remote batch, RC: RI/collaborative):
– RI, RC, RB: IA-32, 96 GeForce 6600GT
– RB: SGI Prism, 32 graphics pipes; IA-32
– RI, RB: IA-32 + Quadro4 980 XGL
– RB: IA-32, 48 nodes
– RB; RI, RC, RB: UltraSPARC IV, 512 GB SMP, 16 graphics cards

In total: 100+ TF across 8 distinct architectures, 3 PB of online disk, and >100 data collections.
Science Gateways: a new initiative for the TeraGrid
• Increasing investment by communities in their own cyberinfrastructure, but heterogeneous: resources; users, from expert to K-12; software stacks and policies
• Science Gateways provide "TeraGrid Inside" capabilities and leverage community investment
• Three common forms:
– Web-based portals
– Application programs running on users' machines but accessing services in TeraGrid
– Coordinated access points enabling users to move seamlessly between TeraGrid and other grids
[Screenshot: Workflow Composer]
Gateways are growing in number
• 10 initial projects as part of the TG proposal; >20 gateway projects today
• No limit on how many gateways can use TG resources – prepare services and documentation so developers can work independently
Current projects: Open Science Grid (OSG); Special PRiority and Urgent Computing Environment (SPRUCE); National Virtual Observatory (NVO); Linked Environments for Atmospheric Discovery (LEAD); Computational Chemistry Grid (GridChem); Computational Science and Engineering Online (CSE-Online); GEON (GEOsciences Network); Network for Earthquake Engineering Simulation (NEES); SCEC Earthworks Project; Network for Computational Nanotechnology and nanoHUB; GIScience Gateway (GISolve); Biology and Biomedicine Science Gateway; Open Life Sciences Gateway; The Telescience Project; Grid Analysis Environment (GAE); Neutron Science Instrument Gateway; TeraGrid Visualization Gateway, ANL; BIRN; Gridblast Bioinformatics Gateway; Earth Systems Grid; Astrophysical Data Repository (Cornell)
• Many others interested: SID Grid, HASTAC
OSG (Open Science Grid)
Open Science Grid (OSG)
Origins: National Grid (iVDGL, GriPhyN, PPDG) and LHC Software & Computing Projects
Current compute resources:
– 61 Open Science Grid sites
– Connected via Internet2, NLR, … at speeds from 10 Gbps down to 622 Mbps
– Compute & storage elements
– All are Linux clusters; most are shared (campus grids, local non-grid users)
– More than 10,000 CPUs; a lot of opportunistic usage; total computing capacity is difficult to estimate, and the same goes for storage
OSG Snapshot
– 96 resources across production & integration infrastructures
– 20 Virtual Organizations + 6 operations; includes 25% non-physics
– ~20,000 CPUs (from 30 to 4,000 per site)
– ~6 PB of tape, ~4 PB of shared disk
Snapshot of jobs on OSG: sustaining 3,000–4,000 simultaneous jobs through OSG submissions; ~10K jobs/day; ~50K CPU-hours/day; peak test load of 15K jobs a day; using production & research networks
What is the Open Science Grid?
[Map of OSG sites: NERSC, BU, UNM, SDSC, UTA, OU, FNAL, ANL, WISC, BNL, Vanderbilt, PSU, UVA, Caltech, Iowa State, Purdue, IU, Buffalo, TTU, Cornell, Albany, UMich, Indiana/IUPUI, Stanford, UWM, UNL, UFL, KU, UNI, WSU, MSU, LTU, LSU, Clemson, McGill, UMiss, UIUC, UCR, UCLA, Lehigh, NSF, ORNL, Harvard, UIC, SMU, UChicago (+ Brazil, Mexico, Taiwan, UK)]
OSG Applications
– Genome sequence analysis
– STAR: 5 TB transfer (SRM, GridFTP)
– Sloan Digital Sky Survey
– Earth System Grid: O(100 TB) of online data
EGEE (Enabling Grids for E-sciencE)
European Grid Initiative
Application domains: Archeology, Astronomy, Astrophysics, Civil Protection, Computational Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences, …
Scale: >250 sites in 48 countries; >50,000 CPUs; >20 petabytes; >10,000 users; >150 VOs; >150,000 jobs/day
Users and resources distribution
EGEE workload in 2007
– CPU: 114 million hours
– Data: 25 PB stored; 11 PB transferred
[Charts: transfer, CPU, and storage; http://gridview.cern.ch/GRIDVIEW/same_index.php]
For scale, the same workload priced on Amazon's calculator (http://calculator.s3.amazonaws.com/calc5.html, 17/05/08) comes to $58,688,679.08.
LCG (LHC Computing Grid)
LHC – Large Hadron Collider
(From "GRID Tutorial – How to use LCG", Federico Calzolari)
4 experiments:
ATLAS, ALICE, CMS, LHCb
27 km long pipe; 7 + 7 TeV
LCG – LHC Computing Grid
It currently integrates 140 computing centers in 33 countries.
In 2008 it will run 100 million computing jobs.
Proxy certificate
Get your temporary proxy certificate (usually valid 24 h), depending on your VO:
grid-proxy-init
voms-proxy-init -voms <VO>:/<VO>/Role=<role> -valid 1000:00
Certificate
Install your certificate on the User Interface:
Log in to the User Interface, copy the file you exported there, and create a directory where your certificate + private key will be stored:
mkdir ~/.globus
Convert the PKCS12 file (.p12) into the supported .pem standard. This operation splits your mycert.p12 file into two files: the certificate (usercert.pem) and the private key (userkey.pem):
openssl pkcs12 -nocerts -in <mycert.p12> -out ~/.globus/userkey.pem
openssl pkcs12 -clcerts -nokeys -in <mycert.p12> -out ~/.globus/usercert.pem
chmod 0400 ~/.globus/userkey.pem
chmod 0600 ~/.globus/usercert.pem
At the end you should have something like:
[user@userinterface .globus]$ ls -al
-rw------- 1 user user 2008 Nov 13 16:50 usercert.pem
-r-------- 1 user user  963 Nov 13 16:50 userkey.pem
Register to a VO
For generic users: http://grid-it.cnaf.infn.it
JDL: Job Description Language
Job overview:
– JDL (job encapsulation), main script, executable program
– Lifecycle: creation → submission → status → retrieval
JDL test.jdl
Executable = "script.sh";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"script.sh", "exe.bin"};          # input files
OutputSandbox = {"std.out", "std.err", "out"};    # output files
VirtualOrganisation = "<VO>";
DataAccessProtocol = {"file", "gsiftp", "rfio", "dcap"};
InputData = {"lfn:/grid/<VO>/<FILE>"};
OutputSE = "<SE>";
Requirements = Member("<SITE>", other.GlueHostApplicationSoftwareRunTimeEnvironment) && other.GlueCEName == "<QUEUE>";
Main script script.sh
#!/bin/sh
# Environment
date >> out2
hostname >> out2
# Get data
lcg-cp [-v] --vo <VO> lfn:<file> file:///data.tgz
# Unpack input [data.tgz: src.cpp, ...]
tar -zxvf data.tgz
# Compile source
g++ src.cpp -o exe.bin
chmod u+x exe.bin
# Exec program
./exe.bin > out
# Pack output
tar -zcvf out.tgz out out2
Submit a Job
Submit a JOB:
edg-job-submit -o ID <JDL>    # save the JOBid in file ID

Selected Virtual Organisation name (from JDL): cms
Connecting to host rb119.cern.ch, port 7772    # Resource Broker
Logging to host rb119.cern.ch, port 9002
*** JOB SUBMIT OUTCOME ***
The job has been successfully submitted to the Network Server.
Use the edg-job-status command to check the current job status.
Your job identifier (edg_jobId) is:
- https://rb119.cern.ch:9000/tG3Xp2jT_58IUeXoY1GoZQ    # JOBid
Control JOB status:
edg-job-status <JOBid>    [https://rb119.cern.ch:9000/tG3Xp2jT_58IUeXoY1GoZQ]

*** BOOKKEEPING INFORMATION ***
Status info for the Job : https://rb119.cern.ch:9000/tG3Xp2jT_58IUeXoY1GoZQ
Current Status: Waiting / Scheduled / Running / Done (Success/Abort)
Status Reason: Job successfully submitted to Globus
Destination: ce0001.m45.ihep.su:2119/jobmanager-lcgpbs-cms
reached on: Sat Nov 17 22:38:34 2007
Get the output
JOB output retrieval:
edg-job-get-output <JOBid>    [https://rb119.cern.ch:9000/tG3Xp2jT_58IUeXoY1GoZQ]

Retrieving files from host: rb119.cern.ch
( for https://rb119.cern.ch:9000/tG3Xp2jT_58IUeXoY1GoZQ )
*** JOB GET OUTPUT OUTCOME ***
Output sandbox files for the job:
- https://rb119.cern.ch:9000/tG3Xp2jT_58IUeXoY1GoZQ
have been successfully retrieved and stored in the directory:
/tmp/jobOutput/<USER>_tG3Xp2jT_58IUeXoY1GoZQ

ls -al /tmp/jobOutput/calzolar_tG3Xp2jT_58IUeXoY1GoZQ
-rw-r--r-- 1 calzolar cms  11 Nov 17 23:59 out
-rw-r--r-- 1 calzolar cms 133 Nov 17 23:59 std.err
-rw-r--r-- 1 calzolar cms   8 Nov 17 23:59 std.out
Job Requirements (JDL Requirements)
Run everywhere: no Requirements
At Pisa:
Requirements = Member("INFN-PISA", other.GlueHostApplicationSoftwareRunTimeEnvironment);
On a queue at least 1 day long:
Requirements = (other.GlueCEPolicyMaxCPUTime > 60*24);
On a site with at least 20 free CPUs:
Requirements = (other.GlueCEStateFreeCPUs > 20);
On a site with at least 1 TB (unit: kB) of local disk available:
Requirements = anyMatch(other.storage.CloseSEs, target.GlueSAStateAvailableSpace > 1000000000);
On a site with a given piece of software locally installed:
Requirements = Member("VO-<VO>-TAG", other.GlueHostApplicationSoftwareRunTimeEnvironment);
Requirements TAGs from SINICA: http://goc.grid.sinica.edu.tw/gstat/<SITE>/
GlueHostOperatingSystemName: Scientific Linux CERN
GlueHostOperatingSystemRelease: 4.5
GlueHostOperatingSystemVersion: Beryllium
GlueSubClusterPhysicalCPUs: 0
GlueSubClusterLogicalCPUs: 0
GlueHostApplicationSoftwareRunTimeEnvironment:
LCG-2, LCG-2_1_0, LCG-2_1_1, LCG-2_2_0, LCG-2_3_0, LCG-2_3_1, LCG-2_4_0, LCG-2_5_0, LCG-2_6_0, LCG-2_7_0, GLITE-3_0_0, R-GMA, INFN-PISA, SI00MeanPerCPU_1800, SF00MeanPerCPU_2000, MPICH, MPI_HOME_NOTSHARED, AFS, VO-atlas-cloud-IT, VO-atlas-production-12.0.5, VO-atlas-production-12.0.6, VO-atlas-production-12.0.7 […]
Resources search
Query CPU / storage available per VO:
lcg-infosites --vo <VO> ce

#CPU  Free  Total Jobs  Running  Waiting  ComputingElement
----------------------------------------------------------
165   1     1     0    1    ce.phy.bg.ac.yu:2119/jobmanager-pbs-cms
120   11    0     0    0    fangorn.man.poznan.pl:2119/jobmanager-pbs-cms
192   110   0     0    0    gridce.atlantis.ugent.be:2119/jobmanager-pbs-cms
212   0     529   146  383  gridce.iihe.ac.be:2119/jobmanager-pbs-cms
227   5     312   222  90   ingrid.cism.ucl.ac.be:2119/jobmanager-lcgcondor-cms
15    15    0     0    0    ce002.ipp.acad.bg:2119/jobmanager-lcgpbs-cms
80    43    0     0    0    ce02.grid.acad.bg:2119/jobmanager-pbs-cms
24    13    0     0    0    ce001.grid.uni-sofia.bg:2119/jobmanager-lcgpbs-cms

lcg-infosites --vo <VO> se

Avail Space(Kb)  Used Space(Kb)  Type  SEs
----------------------------------------------------------
97470000         n.a             n.a   dpm.phy.bg.ac.yu
395467659        779205896       n.a   cmsse01.ihep.ac.cn
27664924         59878772        n.a   se001.grid.uni-sofia.bg
149180000        n.a             n.a   se.hpc.iit.bme.hu
1                1               n.a   dcsrm.usatlas.bnl.gov
190040000        208             n.a   lxdpm101.cern.ch
1000000000000    500000000000    n.a   castorgrid.cern.ch
1000000000000    500000000000    n.a   srm.cern.ch
Resources search
Query available sites for my job:
edg-job-list-match <JDL>

Selected Virtual Organisation name (from JDL): cms
Connecting to host rb119.cern.ch, port 7772
*** COMPUTING ELEMENT IDs LIST ***
The following CE(s) matching your job requirements have been found:
*CEId*
a01-004-128.gridka.de:2119/jobmanager-pbspro-cmsS
a01-004-128.gridka.de:2119/jobmanager-pbspro-cmsXS
ares02.cyf-kr.edu.pl:2119/jobmanager-pbs-cms
beagle14.ba.itb.cnr.it:2119/jobmanager-lcgpbs-cms
bogrid5.bo.infn.it:2119/jobmanager-lcgpbs-cms
ce-fzk.gridka.de:2119/jobmanager-pbspro-cmsL
ce-fzk.gridka.de:2119/jobmanager-pbspro-cmsS
ce-fzk.gridka.de:2119/jobmanager-pbspro-cmsXS
ce.bg.ktu.lt:2119/jobmanager-lcgpbs-cms
ce.cc.ncu.edu.tw:2119/jobmanager-lcgpbs-cms
[…]
gridce.ilc.cnr.it:2119/jobmanager-lcgpbs-cms
gridce2.pi.infn.it:2119/jobmanager-lcglsf-cms4
gridce.sns.it:2119/jobmanager-lcgpbs-cms
Grid Monitoring
[Screenshots: GOC (Sinica), GridICE (INFN)]
AOB
Cloud Computing
Definition: "Cloud computing is a concept of using the Internet to allow people to access technology-enabled services. It allows users to consume services without knowledge of, or control over, the technology infrastructure that supports them."
- Wikipedia
Enterprise IT spending challenge
[Chart: global annual IT spending, estimated US$B, 1996–2010 ($0B–$300B), split into new server spending, server management and administration costs, and power and cooling costs]
Source: IBM Corporate Strategy analysis of IDC data, Sept. 2007
Dream or Nightmare?
Seasonal Spikes
A Closer Look at Cloud Computing
[Diagram: enterprise cloud and public cloud serving end users / requestors – government/academics, industry (startups/SMB/enterprise), consumers – through innovative business models and simplified services]
• An "elastic" pool of high-performance virtualized compute resources
• Cloud applications enable the simplification of complex services
• A cloud computing platform combines modular components on a service-oriented architecture with flexible pricing
• New combinations of services form differentiating value propositions at lower cost and in shorter time
• Internet-protocol-based convergence of networks and devices
Source: Corporate Strategy
Examples of Different Types of Services
A cloud computing service catalog on datacenter infrastructure: virtual client, web application, compute, database, storage, content classification, storage backup/archive, job scheduling, and collaboration services
Google and Cloud Computing
User Centric
• Data stored in the "Cloud"
• Data follows you & your devices
• Data accessible anywhere
• Data can be shared with others
[Examples: music, preferences, maps, news, contacts, messages, mailing lists, photos, e-mails, calendar, phone numbers, investments]
Google's three key technologies: Google File System (GFS), BigTable, MapReduce
Google File System (GFS)
GFS Architecture
[Chart: search market share – Google 48%, Yahoo 33%, MSN 19%]
• Files broken into chunks (typically 64 MB)
• Master manages metadata
• Data transfers happen directly between clients and chunkservers
[Diagram: many clients request metadata from the replicated GFS masters; chunks C0, C1, C2, C3, C5, … are replicated across Chunkserver 1 … Chunkserver N, and chunk data moves directly between clients and chunkservers]
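The read path the diagram implies can be sketched in a few lines of Python. This is a rough illustration only, with hypothetical master.lookup and replica.read_chunk interfaces (the real GFS client library is internal to Google):

CHUNK_SIZE = 64 * 2**20  # files are broken into 64 MB chunks

def gfs_read(master, filename, offset, length):
    # 1. Ask the master for metadata only: which chunk handle, which servers.
    chunk_index = offset // CHUNK_SIZE
    chunk_handle, chunkservers = master.lookup(filename, chunk_index)
    # 2. Fetch the bytes directly from a chunkserver; file data never flows
    #    through the master, so it does not become a bandwidth bottleneck.
    replica = chunkservers[0]  # e.g. the closest replica
    return replica.read_chunk(chunk_handle, offset % CHUNK_SIZE, length)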
GFS Usage @ Google
• 200+ clusters
• Filesystem clusters of up to 5,000+ machines
• Pools of 10,000+ clients
• 5+ petabyte filesystems
• All in the presence of frequent hardware failure
Google's three key technologies: Google File System (GFS), BigTable, MapReduce
BigTable
• Data model: (row, column, timestamp) → cell contents
BigTable
• Distributed multi-level sparse map: fault-tolerant, persistent
• Scalable: thousands of servers; terabytes of in-memory data; petabytes of disk-based data
• Self-managing: servers can be added/removed dynamically; servers adjust to load imbalance
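A minimal single-machine sketch of that (row, column, timestamp) → cell-contents model, using the web-table row/column names from the Bigtable paper; purely illustrative, not Google's code (the real system is a distributed, persistent, sorted sparse map):

import time

table = {}  # sparse map: (row, column, timestamp) -> cell contents

def put(row, column, contents, ts=None):
    table[(row, column, ts if ts is not None else time.time())] = contents

def get_latest(row, column):
    # Return the newest version of a cell, or None if the cell is absent.
    versions = [(ts, v) for (r, c, ts), v in table.items()
                if (r, c) == (row, column)]
    return max(versions)[1] if versions else None

put("com.cnn.www", "contents:", "<html>...</html>")
put("com.cnn.www", "anchor:cnnsi.com", "CNN")
print(get_latest("com.cnn.www", "anchor:cnnsi.com"))  # -> CNN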
Why not just use a commercial DB?
• Scale is too large, or cost too high, for most commercial databases
• Low-level storage optimizations help performance significantly, and are much harder to do when running on top of a database layer (it is also fun and challenging to build large-scale systems)
BigTable Summary
• Data model applicable to a broad range of clients; actively deployed in many of Google's services
• Provides a high-performance storage system at large scale: self-managing; thousands of servers; millions of ops/second; multiple GB/s reading/writing
• The largest Bigtable cell manages ~3 PB of data spread over several thousand machines
Google's three key technologies: Google File System (GFS), BigTable, MapReduce
MapReduce
• A simple programming model that applies to many data-intensive computing problems
• Hides messy details in the MapReduce runtime library: automatic parallelization; load balancing; network and disk transfer optimization; handling of machine failures; robustness; easy to use
MapReduce Programming Model
• Borrowed from functional programming:
map(f, [x1,…,xm,…]) = [f(x1),…,f(xm),…]
reduce(f, x1, [x2, x3,…]) = reduce(f, f(x1, x2), [x3,…]) = … (continue until the list is exhausted)
• Users implement two functions:
map(in_key, in_value) → (key, value) list
reduce(key, [value1,…,valuem]) → f_value
[Diagram: map applies f to each list element independently; reduce folds f over the list from an initial value to the returned result]
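In ordinary Python the same two primitives look like this; a toy illustration of the functional reading, not the distributed system:

from functools import reduce

squares = list(map(lambda x: x * x, [1, 2, 3, 4]))  # map: [1, 4, 9, 16]
total = reduce(lambda acc, x: acc + x, squares)     # reduce: 30
print(squares, total)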
MapReduce – A New Model and System
Two phases of data processing:
– Map: (in_key, in_value) → {(key_j, value_j) | j = 1…k}
– Reduce: (key, [value_1,…,value_m]) → (key, f_value)
[Dataflow diagram: input key/value pairs from data stores 1…n feed parallel map tasks, each emitting (key, values…) pairs; a barrier aggregates intermediate values by output key; parallel reduce tasks then turn (key 1, intermediate values), (key 2, intermediate values), … into final key 1 values, final key 2 values, …]
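As a rough single-machine sketch of these two phases and the barrier between them (our illustration; map_reduce, mapper, and reducer are names introduced here, not Google's implementation):

from collections import defaultdict

def map_reduce(records, mapper, reducer):
    # records: list of (in_key, in_value) pairs
    intermediate = defaultdict(list)
    # Map phase: apply the user's mapper to every input record.
    for in_key, in_value in records:
        for key, value in mapper(in_key, in_value):
            intermediate[key].append(value)  # == Barrier ==: group by output key
    # Reduce phase: fold each key's value list into its final value.
    return {key: reducer(key, values) for key, values in intermediate.items()}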
MapReduce Version of Pseudo Code
Example – WordCount (1/2)
• Input is files with one document per record
• Specify a map function that takes a key/value pair: key = document URL, value = document contents
• Output of the map function is key/value pairs; in our case, output (w, "1") once per word in the document
Example – WordCount (2/2)
• The MapReduce library gathers together all pairs with the same key (shuffle/sort)
• The reduce function combines the values for a key; in our case, it computes the sum
• Output of reduce is paired with the key and saved
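With the toy map_reduce sketch above, WordCount comes down to two small functions (illustrative only):

def wc_map(url, contents):
    # Output (w, 1) once per word in the document.
    return [(word, 1) for word in contents.split()]

def wc_reduce(word, counts):
    # Combine the values for a key: compute the sum.
    return sum(counts)

docs = [("doc://a", "the quick brown fox"),
        ("doc://b", "the lazy dog and the fox")]
print(map_reduce(docs, wc_map, wc_reduce))
# {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1, 'and': 1}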
MapReduce Framework
• For certain classes of problems, the MapReduce framework provides:
– Automatic & efficient parallelization/distribution
– I/O scheduling: run the mapper close to its input data
– Fault tolerance: restart failed mapper or reducer tasks on the same or different nodes
– Robustness: tolerates even massive failures, e.g. large-scale network maintenance (once lost 1,800 out of 2,000 machines)
– Status/monitoring
Task Granularity And Pipelining
• Fine-granularity tasks: many more map tasks than machines
– Minimizes time for fault recovery
– Can pipeline shuffling with map execution
– Better dynamic load balancing
• Often uses 200,000 map / 500 reduce tasks with 2,000 machines
MapReduce: Uses at Google
• Typical configuration: 200,000 mappers, 500 reducers on 2,000 nodes
• Broad applicability has been a pleasant surprise: quality experiments, log analysis, machine translation, ad-hoc data processing
• Production indexing system: rewritten with MapReduce (~10 MapReductions, much simpler than the old code)
MapReduce Summary
• MapReduce has proven to be a useful abstraction
• Greatly simplifies large-scale computation at Google
• Fun to use: focus on the problem, let the library deal with the messy details
A Data Playground
• MapReduce + BigTable + GFS = data playground: a substantial fraction of the Internet available for processing; easy-to-use teraflops/petabytes with quick turn-around; cool problems, great colleagues
Amazon Web Services
Amazon Simple Storage Service (S3)
• Object-based storage
• 1 B – 5 GB per object
• Fast, reliable, scalable
• Redundant, dispersed
• 99.99% availability goal
• Private or public
• Per-object URLs & ACLs
• BitTorrent support
Pricing: $0.15 per GB per month of storage; $0.10–$0.18 per GB of data transfer; $0.01 per 1,000 to 10,000 requests
Amazon S3 Concepts
Objects: opaque data to be stored (1 byte … 5 gigabytes); authentication and access controls
Buckets: object container holding any number of objects; 100 buckets per account; buckets are "owned"
Keys: unique object identifier within a bucket; up to 1024 bytes long; flat object storage model
Standards-based interfaces: REST and SOAP; URL-addressability – every object has a URL
S3 SOAP/Query API
Service: ListAllMyBuckets
Buckets: CreateBucket, DeleteBucket, ListBucket, GetBucketAccessControlPolicy, SetBucketAccessControlPolicy, GetBucketLoggingStatus, SetBucketLoggingStatus
Objects: PutObject, PutObjectInline, GetObject, GetObjectExtended, DeleteObject, GetObjectAccessControlPolicy, SetObjectAccessControlPolicy
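For flavor, here is the PutObject/GetObject pair through a modern SDK; boto3 post-dates these slides, and the bucket and key names are made up:

import boto3

s3 = boto3.client("s3")
s3.create_bucket(Bucket="my-example-bucket")       # CreateBucket
s3.put_object(Bucket="my-example-bucket",          # PutObject
              Key="reports/2008/summary.txt",      # key: unique within the bucket
              Body=b"hello S3")
obj = s3.get_object(Bucket="my-example-bucket",    # GetObject
                    Key="reports/2008/summary.txt")
print(obj["Body"].read())                          # b'hello S3'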
Amazon Simple Queue Service (SQS)
• Scalable queuing
• Elastic capacity
• Reliable, simple, secure
Uses: inter-process messaging, data buffering, architecture component
Pricing: $0.10 per 1,000 messages; $0.10–$0.18 per GB of data transfer
Amazon SQS Concepts
Queues: named message container; persistent
Messages: up to 256 KB of data per message; peek/lock access model
Scalable: unlimited number of queues per account; unlimited number of messages per queue
SQS SOAP/Query API
Queues: ListQueues, DeleteQueue, SetVisibilityTimeout, GetVisibilityTimeout
Messages: SendMessage, ReceiveMessage, DeleteMessage, PeekMessage
Security: AddGrant, ListGrants, RemoveGrant
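Again for flavor, the SendMessage / ReceiveMessage / DeleteMessage cycle through boto3 (an SDK newer than these slides; the queue name is made up):

import boto3

sqs = boto3.client("sqs")
queue_url = sqs.create_queue(QueueName="work-items")["QueueUrl"]
sqs.send_message(QueueUrl=queue_url, MessageBody="job-42")   # SendMessage
resp = sqs.receive_message(QueueUrl=queue_url)               # ReceiveMessage
for msg in resp.get("Messages", []):
    print(msg["Body"])
    # Delete only after successful processing; until then the message is
    # merely invisible to other readers (the peek/lock model above).
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])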
Amazon Elastic Compute Cloud (EC2)
• Virtual compute cloud
• Elastic capacity
• 1.7 GHz x86, 1.7 GB RAM, 160 GB disk, 250 Mb/second network
• Network security model
Uses: time- or traffic-based scaling, load testing, simulation and analysis, rendering, software-as-a-service platform, hosting
Pricing: $0.10 per server hour; $0.10–$0.18 per GB of data transfer
Amazon EC2 Concepts
Amazon Machine Image (AMI): bootable root disk; pre-defined or user-built; catalog of user-built AMIs; OS: Fedora, CentOS, Gentoo, Debian, Ubuntu, Windows Server; app stacks: LAMP, mpiBLAST, Hadoop
Instance: running copy of an AMI; launches in less than 2 minutes; start/stop programmatically
Network security model: explicit access control; security groups
Inter-service bandwidth is free
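And the AMI-to-instance lifecycle through boto3 (same caveat as above; the AMI ID is a placeholder, not a real image):

import boto3

ec2 = boto3.client("ec2")
# RunInstances: boot one instance from an AMI.
resp = ec2.run_instances(ImageId="ami-xxxxxxxx", MinCount=1, MaxCount=1)
instance_id = resp["Instances"][0]["InstanceId"]
# DescribeInstances / TerminateInstances complete the lifecycle.
ec2.describe_instances(InstanceIds=[instance_id])
ec2.terminate_instances(InstanceIds=[instance_id])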
Amazon EC2 at Work
Startups: Cruxy (media transcoding), GigaVox Media (podcast management)
Fortune 500 clients: high-impact, short-term projects; development hosts
Science / research: Hadoop / MapReduce, mpiBLAST
Load-management and load-balancing tools: Pound, Weogeo, RightScale
EC2 SOAP/Query API
Images: RegisterImage, DescribeImages, DeregisterImage
Instances: RunInstances, DescribeInstances, TerminateInstances, GetConsoleOutput, RebootInstances
Keypairs: CreateKeyPair, DescribeKeyPairs, DeleteKeyPair
Image attributes: ModifyImageAttribute, DescribeImageAttribute, ResetImageAttribute
Security groups: CreateSecurityGroup, DescribeSecurityGroups, DeleteSecurityGroup, AuthorizeSecurityGroupIngress, RevokeSecurityGroupIngress
Web-Scale Architecture
GigaVox Economics
– Implemented Amazon S3, Amazon EC2, and Amazon SQS in November 2006
– Created a highly scalable infrastructure for less than $100; building the same infrastructure themselves would have cost thousands of dollars
– Reduced staffing requirements: far less responsibility for 24x7 operations
Analysis and Outlook
The explosive growth of networks: from 1986 to 2000, computers improved ×500 while networks improved ×340,000.
The inevitable consequence of network growth…
Comparing grid computing and cloud computing:
Grid: heterogeneous resources; multiple institutions; virtual organizations; mainly scientific computing; high-performance computers; tightly coupled problems; free; standardized; the scientific community
Cloud: homogeneous resources; a single institution; virtual machines; mainly data processing; servers/PCs; loosely coupled problems; pay-per-use; no standards yet; the commercial world
Cloud computing is a kind of grid in the broad sense
"The grid is a set of emerging technologies built on the Internet, integrating high-speed networks, high-performance computers, large databases, sensors, and remote equipment, to provide scientists and ordinary people alike with more resources, capabilities, and interactive services."
- Ian Foster, The Grid, 1998
Science in the next 10 years: Science 2.0 = grid computing
Business in the next 10 years: Business 2.0 = cloud computing
Grid Computing Books
http://www.chinagrid.net
http://www.china-cloud.net