Derek Wright
Computer Sciences Department
University of Wisconsin-Madison
wright@cs.wisc.edu
http://www.cs.wisc.edu/condor
Condor and MPI
Paradyn/Condor Week
Madison, WI 2001
Overview
› MPI and Condor: Why Now?
› Dedicated and Opportunistic Scheduling
› How Does it All Work?
› Specific MPI Implementations
› Future Work
What is MPI?
› MPI is the “Message Passing Interface”
› Basically, a library for writing parallel applications that use message passing for inter-process communication
› MPI is a standard with many different implementations
MPI and Condor: Why Haven’t We Supported it Until Now?
› MPI's model is a static world
› We always saw the world as dynamic, opportunistic, ever-changing
› We focused our parallel support on PVM which supported a dynamic environment
MPI With Condor: Why Now?
› More and more Condor pools are being formed from dedicated resources
› MPI's API is also starting to move towards supporting a dynamic world (e.g. LAM, MPI-2, etc.)
› Few schedulers (if any) handle both opportunistic and dedicated resources at the same time
Dedicated and Opportunistic Scheduling
› Resources can move between 'dedicated' and 'opportunistic' status
› Users submit jobs that are either dedicated (e.g. Universe = MPI) or opportunistic (e.g. Universe = standard)
Dedicated and Opportunistic (Cont'd)
› Condor leaves all resources as opportunistic unless it sees dedicated jobs to service
› The Dedicated Scheduler ('DS') claims opportunistic resources and turns them into dedicated ones to schedule into the future
Dedicated and Opportunistic (Cont'd)
› When the DS has no more jobs, it releases the resources which go back to serving opportunistic jobs
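The claim-and-release cycle described above can be sketched as a small state model. This is only an illustration under stated assumptions: the `Resource` and `DedicatedScheduler` classes, their methods, and the name `ds@example` are hypothetical and are not Condor's actual data structures or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    name: str
    claimed_by: Optional[str] = None  # None means the resource is opportunistic

    @property
    def status(self) -> str:
        return "dedicated" if self.claimed_by else "opportunistic"


class DedicatedScheduler:
    """Toy stand-in for the DS (hypothetical, not Condor's implementation)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.queue: List[int] = []          # machine counts of waiting dedicated jobs
        self.claimed: List[Resource] = []

    def submit(self, machines_needed: int) -> None:
        self.queue.append(machines_needed)

    def claim(self, pool: List[Resource]) -> None:
        """Claim just enough opportunistic resources for the largest queued job."""
        needed = max(self.queue, default=0) - len(self.claimed)
        for r in pool:
            if needed <= 0:
                break
            if r.claimed_by is None:
                r.claimed_by = self.name
                self.claimed.append(r)
                needed -= 1

    def job_finished(self) -> None:
        """Drop the job at the head of the queue; release everything once idle."""
        if self.queue:
            self.queue.pop(0)
        if not self.queue:
            for r in self.claimed:
                r.claimed_by = None
            self.claimed.clear()


pool = [Resource(f"node{i}") for i in range(4)]
ds = DedicatedScheduler("ds@example")

before = [r.status for r in pool]   # no dedicated jobs: everything is opportunistic
ds.submit(3)
ds.claim(pool)
during = [r.status for r in pool]   # three nodes are now dedicated
ds.job_finished()
after = [r.status for r in pool]    # queue is empty: all nodes are released
```

In this sketch the fourth node is never claimed, so it keeps serving opportunistic jobs throughout.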
Dedicated Scheduling and “Back-Filling”
› There will always be "holes" in the dedicated schedule, sets of resources that can't be filled with dedicated jobs for certain periods of time
› Traditional solution is “back-filling” the holes with smaller dedicated jobs
› However, these might not be preemptable
Back-Filling (Cont’d)
› Instead of back-filling with dedicated jobs, we give the resources to Condor’s opportunistic scheduler
› Condor runs preemptable opportunistic jobs until the DS decides it needs the resources again and reclaims them
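To make the idea of "holes" concrete, here is a minimal sketch that finds the gaps in a per-machine dedicated schedule. The interval representation, the `find_holes` helper, and the example schedule are all assumptions made for illustration; the slides do not describe how the DS represents its schedule. Reservations are assumed to lie inside the planning horizon.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in arbitrary time units


def find_holes(reservations: List[Interval], horizon: int) -> List[Interval]:
    """Return the gaps in [0, horizon) not covered by dedicated reservations."""
    holes: List[Interval] = []
    cursor = 0
    for start, end in sorted(reservations):
        if start > cursor:
            holes.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < horizon:
        holes.append((cursor, horizon))
    return holes


# Hypothetical dedicated schedule: machine name -> reserved intervals.
schedule: Dict[str, List[Interval]] = {
    "node0": [(0, 4), (6, 10)],
    "node1": [(2, 10)],
    "node2": [],
}

holes = {name: find_holes(res, horizon=10) for name, res in schedule.items()}
# holes == {"node0": [(4, 6)], "node1": [(0, 2)], "node2": [(0, 10)]}
```

Each hole would be handed to the opportunistic scheduler. Because the jobs it runs are preemptable, the DS can take the machine back at the end of the hole, or earlier, without the risk that a non-preemptable back-fill job delays the next dedicated job.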
Dedicated Resources are Opportunistic Resources
› Even “dedicated” resources are really opportunistic
  › Hardware failure, software failure, etc.
  › Condor handles these failures better than traditional dedicated schedulers, since our system already deals with them after years of opportunistic scheduling experience
How Does MPI Support in Condor Really Work?
› Changes to the resource agent (condor_startd)
› Changes to the job scheduling agent (condor_schedd)
› Changes to the rest of the Condor system
How Do You Make a Resource Dedicated in Condor?
› Just have to change a few config file settings; no new startd binary is required
› Add an attribute to the classad saying which scheduler, if any, this resource is willing to become dedicated to
Other Configuration Changes for the startd
› In addition, you must change the policy expressions:
  › Must always be willing to run jobs from the DS
  › While the resource is claimed by the DS, the startd should never suspend or preempt jobs
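The two policy rules above can be modelled as simple predicates. This is a sketch in Python rather than in Condor's configuration language, and every name in it (`Machine`, `dedicated_to`, `owner_active`, `want_start`, and so on) is hypothetical. The `owner_active` flag stands in for whatever opportunistic policy the machine already uses, and suspend and preempt are treated identically for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Machine:
    dedicated_to: Optional[str] = None  # scheduler this machine is willing to serve, if any
    claimed_by: Optional[str] = None    # who currently holds the claim
    owner_active: bool = False          # stand-in for the usual opportunistic policy input


def claimed_by_ds(m: Machine) -> bool:
    return m.dedicated_to is not None and m.claimed_by == m.dedicated_to


def want_start(m: Machine, requester: str) -> bool:
    """Always accept jobs from the DS; otherwise apply the opportunistic policy."""
    if requester == m.dedicated_to:
        return True
    return not m.owner_active


def want_suspend(m: Machine) -> bool:
    """Never suspend while claimed by the DS."""
    return False if claimed_by_ds(m) else m.owner_active


def want_preempt(m: Machine) -> bool:
    """Never preempt while claimed by the DS."""
    return False if claimed_by_ds(m) else m.owner_active


m = Machine(dedicated_to="ds@example", owner_active=True)
ds_may_start = want_start(m, "ds@example")     # True, even though the owner is active
other_may_start = want_start(m, "other@example")  # False under the opportunistic policy

m.claimed_by = "ds@example"
suspend_under_ds = want_suspend(m)             # False while the DS holds the claim
preempt_under_ds = want_preempt(m)             # False while the DS holds the claim
```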
Submitting Dedicated Jobs
› Requires a new "contrib" version of the condor_schedd
› Condor "wakes up" the dedicated scheduler logic inside the condor_schedd when MPI jobs are submitted
How Does Your Job Get Resources?
› The DS does a query to find all resources that are willing to become dedicated to it
› DS sends out "resource request" classads and negotiates for resources with the negotiator (the opportunistic scheduler)
How Does Your Job Get Resources? (Cont’d)
› DS then claims resources directly
› Once resources are available, the DS schedules and spawns jobs
› When jobs complete, if more MPI jobs can be serviced with the same resources, the DS holds onto them and uses them immediately
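The sequence above (query, claim, spawn, reuse) can be sketched as a short event log. This is a simplified illustration under stated assumptions: the ads are plain dictionaries, the `DedicatedTo` attribute name is made up for the example, and the negotiation step with the negotiator is skipped entirely, so every willing machine is assumed to be granted.

```python
from typing import Dict, List


def willing_resources(ads: List[Dict[str, str]], scheduler: str) -> List[str]:
    """Query step: machines whose ad names this scheduler (hypothetical attribute)."""
    return [ad["Name"] for ad in ads if ad.get("DedicatedTo") == scheduler]


def run_queue(ads: List[Dict[str, str]], scheduler: str, jobs: List[int]) -> List[str]:
    """Claim machines as needed, spawn each job, and reuse claims between jobs."""
    log: List[str] = []
    candidates = willing_resources(ads, scheduler)
    claimed: List[str] = []
    for i, need in enumerate(jobs):
        if need > len(candidates):
            log.append(f"skip job{i}: needs {need} machines")
            continue
        while len(claimed) < need:          # claim only what is not already held
            machine = candidates[len(claimed)]
            claimed.append(machine)
            log.append(f"claim {machine}")
        log.append(f"spawn job{i} on {', '.join(claimed[:need])}")
    for machine in claimed:                 # queue drained: give everything back
        log.append(f"release {machine}")
    return log


ads = [
    {"Name": "node0", "DedicatedTo": "ds@example"},
    {"Name": "node1"},                      # not willing to be dedicated
    {"Name": "node2", "DedicatedTo": "ds@example"},
]
log = run_queue(ads, "ds@example", jobs=[2, 1])
```

Note that the second job in this example runs on `node0` without a new claim, which mirrors the last bullet: resources are held and reused while more MPI jobs can be serviced.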
Changes to the rest of Condor?
› Very few other changes required
› Users can use all the same tools, interfaces, etc.
› Just need a new condor_starter to actually spawn MPI jobs (will also be offered as a contrib module)
Specific MPI Implementations
› MPICH
› LAM
› Others?
Condor and MPICH
› Currently we support MPICH on Unix
› Working on adding MPICH-NT support
  › NT’s MPICH has a different mechanism to spawn jobs than the Unix MPICH
Condor + LAM = “LAMdor”
› LAM's API is better suited for a dynamic environment, where hosts can come and go from your MPI universe
› Has a different mechanism for spawning jobs than MPICH
› Condor working to support their methods for spawning
LAMdor (Cont’d)
› LAM working to understand, expand, and fully implement the dynamic scheduling calls in their API
› LAM also considering using Condor’s libraries to support checkpointing of MPI computations
MPI-2 Standard
› The MPI-2 standard contains calls to handle dynamic resources
› Not yet fully implemented by anyone
› When it is, we'll support it
Other MPI implementations
› What are people using?
› Do you want to see Condor support any other MPI implementations?
› If so, send email to [email protected] and let us know
Future work
› Implementing more advanced dedicated scheduling algorithms
› Support for all sorts of MPI implementations (LAM, MPICH-NT, MPI-2, others)
More Future work
› Solving problems with MPI on the Grid
  › "Flocking" MPI jobs to remote pools, or even spanning pools with a single computation
  › Solving issues of resource ownership on the Grid (i.e. how do you handle multiple dedicated schedulers on the Grid wanting to control a given resource?)
More Future work
› Checkpointing entire MPI computations
› "MW" implementation on top of Condor-MPI
More Future work
› Support for other kinds of dedicated jobs
  › Generic dedicated jobs (we just gather and schedule the resources, then call your program, give it the list of machines, and let the program spawn itself)
  › LINDA
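The generic dedicated job idea can be sketched as a launcher that hands a machine list to a user program. How the list would actually be delivered is not specified in the slides; passing it as command-line arguments is purely an assumption here, and the `launch` helper and the inline user program are hypothetical.

```python
import subprocess
import sys
from typing import List


def launch(program: List[str], machines: List[str]) -> str:
    """Run the user's program with the machine list appended as arguments."""
    result = subprocess.run(
        program + machines, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# Hypothetical user program: it just echoes the machines it was given.
# A real one would use the list to start its own processes on those machines.
user_program = [
    sys.executable,
    "-c",
    "import sys; print(','.join(sys.argv[1:]))",
]

output = launch(user_program, ["node0", "node1", "node2"])
```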
How do I start using MPI with Condor?
› MPI support is still alpha, not quite ready for production use
› A beta release should be out soon as a contrib module
› Check the web site www.cs.wisc.edu/condor
Thanks for Listening!
› Questions?
› For more information: http://www.cs.wisc.edu/condor mailto:[email protected]