Research Achievements
Kenji Kaneda
Agenda
  Research background and goal
  Overview of my research achievements
    Phoenix
    Virtual Private Grid
  Summary and recent activities
Research Background and Goal
Background
  Grid computing
    Parallel computing that harnesses many widely distributed resources
    E.g., an aggregate of PC clusters spread over multiple LANs
Traditional Parallel Computing vs. Grid Computing
  Traditional parallel computing
    Reliable processors
    Single-LAN resources
  Grid computing
    Unreliable processors
    Multi-LAN resources
Difficulty in Grid Computing
  Frequent machine/network failures
    E.g., one machine failure per day
  Restricted connectivity
    Administrative policies restrict communications between machines
    E.g., firewall, NAT, DHCP
[Figure labels: Gateway, TCP, Gateway, Firewall]
Research Goal
  Allow a user to harness a computational grid in the same way as traditional parallel computing
    Fault tolerance
    Transparent communication on WANs
My Research Achievements
  Design and implementation of middleware
    Phoenix
      Parallel programming library for accommodating dynamically joining/leaving resources
    Virtual Private Grid
      Command shell for utilizing hundreds of computers spread over multiple LANs
Phoenix
  Parallel programming library for accommodating dynamically joining/leaving resources
    Programming model for supporting migration of application states
    Transparent communication mechanism for WANs
Programming Model for Supporting Migration of Application States
  Subsumes a regular message passing model
  Provides a namespace that does not depend on physical machines
    The programmer uses a name from this namespace to specify a message destination
  The programmer can write a program without being aware of physical machines
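The idea of addressing messages by a machine-independent name can be sketched as follows. This is only an illustration of the concept, not Phoenix's API: the class `VirtualNamespace`, its methods, and the names `worker-7`, `hostA`, and `hostB` are all hypothetical.

```python
from collections import defaultdict


class VirtualNamespace:
    """Hypothetical sketch: maps virtual names to the host that currently owns them."""

    def __init__(self):
        self.owner = {}                  # virtual name -> physical host
        self.inbox = defaultdict(list)   # physical host -> delivered messages

    def assign(self, name, host):
        """Bind a virtual name to a host (initial placement or migration)."""
        self.owner[name] = host

    def send(self, name, payload):
        """Deliver a message addressed by virtual name; return the host that got it."""
        host = self.owner[name]
        self.inbox[host].append((name, payload))
        return host


ns = VirtualNamespace()
ns.assign("worker-7", "hostA")
first = ns.send("worker-7", "task 1")    # delivered to hostA

# hostA leaves; the application state behind "worker-7" migrates to hostB.
ns.assign("worker-7", "hostB")
second = ns.send("worker-7", "task 2")   # same name, now delivered to hostB
```

The sender's code is identical before and after the migration, which is the property the slide describes: only the name-to-host binding changes.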
Transparent Communication Mechanism for WANs
  Overlay network construction
  Application-level message routing
  Processes can communicate with one another
    even if networks are not fully connected
    even if connection topologies change dynamically
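Application-level routing over an overlay can be illustrated with a small sketch. The breadth-first search below is a stand-in chosen for brevity; the slides do not say which routing algorithm Phoenix uses. The node names (`a1`, `gw_a`, `gw_b`, `b1`) and the helpers `connect` and `find_route` are hypothetical.

```python
from collections import deque


def connect(links, a, b):
    """Record a bidirectional overlay link between two processes."""
    links.setdefault(a, set()).add(b)
    links.setdefault(b, set()).add(a)


def find_route(links, src, dst):
    """Breadth-first search over the overlay; returns a hop list or None."""
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(links.get(node, ())):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


links = {}
connect(links, "a1", "gw_a")     # a1 can only reach its own gateway
connect(links, "gw_a", "gw_b")   # the gateways can reach each other
connect(links, "gw_b", "b1")

# a1 has no direct link to b1, so the message is relayed through the gateways.
route = find_route(links, "a1", "b1")

# The topology changes: a direct link appears, and routing adapts to it.
connect(links, "a1", "b1")
shorter = find_route(links, "a1", "b1")
```

Here `route` is the four-hop path through both gateways and `shorter` is the direct two-node path, showing that senders need not know how the networks are connected.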
Demonstration
  Boot processes on 3 subnets
  Add processes dynamically
Experiments (1/3)
  Speedup with fixed resources
    POV-Ray: 78x speedup using 104 processors on 3 LANs
    LU: comparable to MPICH (on a single LAN)
[Figure: POV-Ray, speedup vs. number of CPUs; series: 1 LAN, 3 LANs]
[Figure: LU factorization, speedup vs. number of CPUs; series: MPICH, 1 LAN, 3 LANs]
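For reference, the POV-Ray figure quoted above (78x speedup on 104 processors) can be converted into parallel efficiency, defined in the usual way as speedup divided by processor count. The helper name `parallel_efficiency` is ours, and the calculation uses only the two numbers stated on the slide.

```python
def parallel_efficiency(speedup, processors):
    """Standard definition: fraction of ideal linear speedup actually achieved."""
    return speedup / processors


povray_efficiency = parallel_efficiency(78, 104)
print(f"POV-Ray efficiency on 3 LANs: {povray_efficiency:.0%}")  # prints 75%
```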
Experiments (2/3)
  Speedup with dynamic resources
    POV-Ray quickly takes advantage of dynamically added resources
[Figure: POV-Ray, relative performance (base: 1 CPU throughput) vs. time (sec); series: dynamic, fixed]
[Figure: LU factorization, relative performance (base: 1 CPU throughput) vs. time (sec); series: fixed, dynamic]
Experiments (3/3)
  Parallel shogi (Japanese chess) program on 720 laptop PCs
    7x to 8x speedup
Related Work
  Grid-enabled MPIs
    E.g., MPICH-G [G. Bosilca et al. SC’02]
    Based on a traditional message passing model
    Difficult to support dynamic changes of resources
  Communication libraries for Grids
    E.g., Ibis [A. Denis et al. HPDC’04]
    Static message routing
Summary ~ Phoenix ~
  Parallel programming library for dynamically changing resources
  Good speedup with a large number of machines on multiple LANs
Virtual Private Grid (VPG)
  Command shell for utilizing hundreds of computers spread over multiple LANs
Features (1/2)
  User can submit jobs without having to care about administrative restrictions
    E.g., cmd1@host1 | cmd2@host2 > file3@host3
[Figure: host1 executes cmd1, host2 executes cmd2, and host3 writes to file3; the hosts are separated by firewalls and NAT]
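The `command@host` notation in the example can be read as a pipeline whose stages each name the machine they run on. The sketch below parses that one example form only. It is not VPG's parser: the function `parse_job` and its output layout are hypothetical, and it assumes every stage and the output file carry an explicit `@host` suffix.

```python
def parse_job(line):
    """Split 'cmd@host | cmd@host > file@host' into stages and an optional sink."""
    pipeline, _, sink = line.partition(">")

    def split_target(text):
        # Assumption: each item is written as 'name@host'.
        name, at, host = text.strip().rpartition("@")
        if not at or not name or not host:
            raise ValueError(f"expected name@host, got {text.strip()!r}")
        return name, host

    stages = []
    for part in pipeline.split("|"):
        command, host = split_target(part)
        stages.append({"command": command, "host": host})

    output = None
    if sink.strip():
        path, host = split_target(sink)
        output = {"file": path, "host": host}
    return stages, output


stages, output = parse_job("cmd1@host1 | cmd2@host2 > file3@host3")
for stage in stages:
    print(f"run {stage['command']} on {stage['host']}")
print(f"write {output['file']} on {output['host']}")
```

Each stage and the output file resolve to a different machine, matching the slide's picture of cmd1 on host1 feeding cmd2 on host2, with the result written to file3 on host3.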
Features (2/2)
  Fault tolerance
    VPG can continue to run even if some machines are added or removed dynamically
    No central server is required
Demonstration
  Environment
    3 LANs
    CPU: Sparc, x86, MIPS, PowerPC
    OS: Solaris, Linux, IRIX
Related Work
  Grid job submission tools
    E.g., Globus, Condor-G
    Difficult to submit jobs to machines under administrative restrictions
Summary ~ Virtual Private Grid ~
  Command shell for utilizing hundreds of computers spread over multiple LANs
  Fast job submission to more than 100 machines
Summary and Recent Activities
Summary ~ My Research Achievements ~
  Middleware for Grid computing
    Phoenix
    Virtual Private Grid
Recent Activities (1/2)
  Virtual SMP
    Emulates a multiprocessor machine on loosely coupled computers
[Figure: a virtual dual-processor machine on two single-processor machines]
Recent Activities (2/2)
  Virtual SMP
    Easy utilization of distributed resources with a common OS (e.g., Windows, Linux)