TRANSCRIPT
High Performance Computing Workshop
HPC 101
Dr. Charles J Antonelli
LSAIT ARS
February, 2014
Credits
Contributors:
Brock Palen (CAEN HPC)
Jeremy Hallum (MSIS)
Tony Markel (MSIS)
Bennet Fauber (CAEN HPC)
Mark Montague (LSAIT ARS)
Nancy Herlocher (LSAIT ARS)
LSAIT ARS
CAEN HPC
Roadmap
High Performance Computing
Flux Architecture
Flux Mechanics
Flux Batch Operations
Introduction to Scheduling
High Performance Computing
Cluster HPC
A computing cluster is a number of computing nodes connected together via special hardware and software that together can solve large problems.
A cluster is much less expensive than a single supercomputer (e.g., a mainframe)
Using clusters effectively requires support in scientific software applications (e.g., Matlab's Parallel Toolbox, or R's Snow library), or custom code
Programming Models
Two basic parallel programming models:

Message-passing
The application consists of several processes running on different nodes and communicating with each other over the network
Used when the data are too large to fit on a single node, and simple synchronization is adequate
“Coarse parallelism”
Implemented using MPI (Message Passing Interface) libraries

Multi-threaded
The application consists of a single process containing several parallel threads that communicate with each other using synchronization primitives
Used when the data can fit into a single process, and the communications overhead of the message-passing model is intolerable
“Fine-grained parallelism” or “shared-memory parallelism”
Implemented using OpenMP (Open Multi-Processing) compilers and libraries

Both models can be combined in a single application (see the hybrid sketch below)
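A minimal hybrid sketch (mine, not from the slides) showing both models at once: MPI ranks are separate processes that pass messages, while OpenMP threads share memory within each rank. The file name and compile line are illustrative; the exact flags depend on the toolchain.

    /* hybrid.c -- illustrative only; compile e.g. with
       mpicc -fopenmp hybrid.c -o hybrid   (GNU toolchain)
       and run with: mpirun -np 2 ./hybrid */
    #include <stdio.h>
    #include <mpi.h>
    #include <omp.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);                /* message-passing: one process per rank */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        #pragma omp parallel                   /* multi-threading: threads share this rank's memory */
        {
            printf("rank %d of %d, thread %d of %d\n",
                   rank, size, omp_get_thread_num(), omp_get_num_threads());
        }

        MPI_Finalize();
        return 0;
    }

Each rank prints one line per thread.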
Amdahl’s Law
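The slide’s chart is not reproduced in this transcript. The law it illustrates: if a fraction p of a program’s work can be parallelized across n cores (and the remaining 1 - p must run serially), the overall speedup is

    S(n) = \frac{1}{(1 - p) + p/n}, \qquad \lim_{n \to \infty} S(n) = \frac{1}{1 - p}

So, for example, a program whose work is 95% parallelizable can never run more than 20x faster than its serial version, no matter how many cores it is given.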
Flux Architecture
Flux
Flux is a university-wide shared computational discovery / high-performance computing service.
Provided by Advanced Research Computing at U-M
Operated by CAEN HPC
Procurement, licensing, billing by U-M ITS
Interdisciplinary since 2010
http://arc.research.umich.edu/resources-services/flux/
The Flux cluster
(Diagram: login nodes, compute nodes, storage, and a data transfer node)
A Flux node
12-16 Intel cores
48-64 GB RAM
Local disk
Ethernet and InfiniBand
A Large Memory Flux node
32-40 Intel cores
1 TB RAM
Local disk
Ethernet and InfiniBand
Coming soon: A Flux GPU node
16 Intel cores
64 GB RAM
Local disk
8 GPUs
Each GPU contains 2,688 GPU cores
Flux software
Licensed and open software:
Abacus, BLAST, BWA, bowtie, ANSYS, Java, Mason, Mathematica, Matlab, R, RSEM, STATA SE, …
See http://cac.engin.umich.edu/resources
C, C++, and Fortran compilers: Intel (default), PGI, and GNU toolchains
You can choose software using the module command
Flux network
All Flux nodes are interconnected via InfiniBand and a campus-wide private Ethernet network
The Flux login nodes are also connected to the campus backbone network
The Flux data transfer node is connected over a 10 Gbps connection to the campus backbone network
This means:
The Flux login nodes can access the Internet
The Flux compute nodes cannot
If InfiniBand is not available for a compute node, code on that node will fall back to Ethernet communications
Flux data
Lustre filesystem mounted on /scratch on all login, compute, and transfer nodes
640 TB of short-term storage for batch jobs
Large, fast, short-term
NFS filesystems mounted on /home and /home2 on all nodes
80 GB of storage per user for development & testing
Small, slow, long-term
Flux data
Flux does not provide large, long-term storage
Alternatives:
Value Storage (NFS)
$20.84 / TB / month (replicated, no backups)
$10.42 / TB / month (non-replicated, no backups)
LSA Large Scale Research Storage
2 TB free to researchers (replicated, no backups)
Faculty members, lecturers, postdocs, GSI/GSRA
Additional storage $30 / TB / year (replicated, no backups)
Departmental server
CAEN can mount your storage on the login nodes
Copying data
Three ways to copy data to/from Flux:
From Linux or Mac OS X, use scp:
scp localfile uniqname@flux-login.engin.umich.edu:remotefile
scp uniqname@flux-login.engin.umich.edu:remotefile localfile
scp -r localdir uniqname@flux-login.engin.umich.edu:remotedir
From Windows, use WinSCP: U-M Blue Disc, http://www.itcs.umich.edu/bluedisc/
Use Globus Connect
Globus Connect
Features:
High-speed data transfer, much faster than SCP or SFTP
Reliable & persistent
Minimal client software: Mac OS X, Linux, Windows
GridFTP endpoints
Gateways through which data flow
Exist for XSEDE, OSG, …
UMich: umich#flux, umich#nyx
Add your own client endpoint!
Add your own server endpoint: contact [email protected]
More information: http://cac.engin.umich.edu/resources/login-nodes/globus-gridftp
Flux Mechanics
Using Flux
Three basic requirements to use Flux:
1. A Flux account
2. A Flux allocation
3. An MToken (or a Software Token)
Using Flux
1. A Flux account
Allows login to the Flux login nodes
Develop, compile, and test code
Available to members of U-M community, free
Get an account by visiting https://www.engin.umich.edu/form/cacaccountapplication
Using Flux
2. A Flux allocation
Allows you to run jobs on the compute nodes
Some units cost-share Flux rates:
Regular Flux: $11.72/core/month; LSA, Engineering, Medical School: $6.60/month
Large Memory Flux: $23.82/core/month; LSA, Engineering, Medical School: $13.30/month
GPU Flux: $107.10/2 CPU cores and 1 GPU/month; LSA, Engineering, Medical School: $60/month
Flux Operating Environment: $113.25/node/month; LSA, Engineering, Medical School: $63.50/month
Flux pricing at http://arc.research.umich.edu/flux/hardware-services/
Rackham grants are available for graduate students; details at http://arc.research.umich.edu/resources-services/flux/flux-pricing/
To inquire about Flux allocations, please email [email protected]
Using Flux
3. An MToken (or a Software Token)
Required for access to the login nodes
Improves cluster security by requiring a second means of proving your identity
You can use either an MToken or an application for your mobile device (called a Software Token) for this
Information on obtaining and using these tokens at http://cac.engin.umich.edu/resources/login-nodes/tfa
Logging in to Flux
ssh flux-login.engin.umich.edu
MToken (or Software Token) required
You will be randomly connected to a Flux login node (currently flux-login1 or flux-login2)
Firewalls restrict access to flux-login. To connect successfully, either:
Physically connect your ssh client platform to the U-M campus wired or MWireless network, or
Use VPN software on your client platform, or
Use ssh to log in to an ITS login node (login.itd.umich.edu), and ssh to flux-login from there
Modules
The module command allows you to specify what versions of software you want to use:
module list -- Show loaded modules
module load name -- Load module name for use
module avail -- Show all available modules
module avail name -- Show versions of module name*
module unload name -- Unload module name
module -- List all options
Enter these commands at any time during your session.
A configuration file allows default module commands to be executed at login:
Put module commands in file ~/privatemodules/default
Don’t put module commands in your .bashrc / .bash_profile
Flux environment
The Flux login nodes have the standard GNU/Linux toolkit:
make, autoconf, awk, sed, perl, python, java, emacs, vi, nano, …
Watch out for source code or data files written on non-Linux systems
Use these tools to analyze and convert source files to Linux format:
file
dos2unix
Lab 1
Task: Invoke R interactively on the login node
module load R
module list
R
q()
Please run only very small computations on the Flux login nodes, e.g., for testing
Lab 2
Task: Run R in batch mode
module load R
Copy sample code to your login directory:
cd
cp ~cja/hpc-sample-code.tar.gz .
tar -zxvf hpc-sample-code.tar.gz
cd ./hpc-sample-code
Examine Rbatch.pbs and Rbatch.R
Edit Rbatch.pbs with your favorite Linux editor
Change the #PBS -M email address to your own
Lab 2
Task: Run R in batch mode
Submit your job to Flux:
qsub Rbatch.pbs
Watch the progress of your job:
qstat -u uniqname
where uniqname is your own uniqname
When complete, look at the job’s output:
less Rbatch.out
Copy your results to your local workstation (change uniqname to your own uniqname):
scp uniqname@flux-login.engin.umich.edu:hpc-sample-code/Rbatch.out Rbatch.out
Lab 3
Task: Use the multicore package
The multicore package allows you to use multiple cores on the same node
module load R
cd ~/hpc-sample-code
Examine Rmulti.pbs and Rmulti.R
Edit Rmulti.pbs with your favorite Linux editor
Change the #PBS -M email address to your own
Lab 3
Task: Use the multicore package
Submit your job to Flux:
qsub Rmulti.pbs
Watch the progress of your job:
qstat -u uniqname
where uniqname is your own uniqname
When complete, look at the job’s output:
less Rmulti.out
Copy your results to your local workstation (change uniqname to your own uniqname):
scp uniqname@flux-login.engin.umich.edu:hpc-sample-code/Rmulti.out Rmulti.out
Compiling Code
Assuming default module settings:
Use mpicc/mpiCC/mpif90 for MPI code
Use icc/icpc/ifort with -openmp for OpenMP code
Serial code, Fortran 90:
ifort -O3 -ipo -no-prec-div -xHost -o prog prog.f90
Serial code, C:
icc -O3 -ipo -no-prec-div -xHost -o prog prog.c
MPI parallel code:
mpicc -O3 -ipo -no-prec-div -xHost -o prog prog.c
mpirun -np 2 ./prog
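For the OpenMP side specifically, a minimal example (hypothetical, not part of the course sample code) of the kind of loop the OpenMP flag enables:

    /* omp_sum.c -- illustrative; compile e.g. icc -openmp omp_sum.c -o omp_sum
       (or gcc -fopenmp); set OMP_NUM_THREADS to control the thread count */
    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        const int N = 1000000;
        double sum = 0.0;
        int i;

        /* iterations are split across threads; reduction(+:sum)
           combines each thread's partial sum safely */
        #pragma omp parallel for reduction(+:sum)
        for (i = 0; i < N; i++)
            sum += 1.0 / (double)(i + 1);

        printf("harmonic sum = %f (max threads: %d)\n",
               sum, omp_get_max_threads());
        return 0;
    }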
Lab 4
Task: Compile and execute simple programs on the Flux login node
Copy sample code to your login directory:
cd
cp ~brockp/cac-intro-code.tar.gz .
tar -xvzf cac-intro-code.tar.gz
cd ./cac-intro-code
Examine, compile & execute helloworld.f90:
ifort -O3 -ipo -no-prec-div -xHost -o f90hello helloworld.f90
./f90hello
Examine, compile & execute helloworld.c:
icc -O3 -ipo -no-prec-div -xHost -o chello helloworld.c
./chello
Examine, compile & execute MPI parallel code:
mpicc -O3 -ipo -no-prec-div -xHost -o c_ex01 c_ex01.c
mpirun -np 2 ./c_ex01
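The slides don’t show the source of c_ex01; as a hypothetical stand-in, here is a two-rank MPI program of the kind this lab builds, with one explicit message passed between processes:

    /* mpi_token.c -- illustrative only; the real c_ex01 may differ.
       Build and run the same way:
       mpicc -O3 -o mpi_token mpi_token.c
       mpirun -np 2 ./mpi_token */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, token;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            token = 42;
            MPI_Send(&token, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);  /* rank 0 sends */
            printf("rank 0 sent %d\n", token);
        } else if (rank == 1) {
            MPI_Recv(&token, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);                         /* rank 1 receives */
            printf("rank 1 received %d\n", token);
        }

        MPI_Finalize();
        return 0;
    }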
Makefiles
The make command automates your code compilation process
Uses a makefile to specify dependencies between source and object files
The sample directory contains a sample makefile
To compile c_ex01:
make c_ex01
To compile all programs in the directory:
make
To remove all compiled programs:
make clean
To make all the programs using 8 compiles in parallel:
make -j8
Flux Batch Operations
Portable Batch System
All production runs are run on the compute nodes using the Portable Batch System (PBS)
PBS manages all aspects of cluster job execution except job scheduling
Flux uses the Torque implementation of PBS
Flux uses the Moab scheduler for job scheduling
Torque and Moab work together to control access to the compute nodes
PBS puts jobs into queues
Flux has a single queue, named flux
Cluster workflow
You create a batch script and submit it to PBS
PBS schedules your job, and it enters the flux queue
When its turn arrives, your job will execute the batch script
Your script has access to any applications or data stored on the Flux cluster
When your job completes, anything it sent to standard output and standard error is saved and returned to you
You can check on the status of your job at any time, or delete it if it’s not doing what you want
A short time after your job completes, it disappears
Basic batch commands
Once you have a script, submit it:
qsub scriptfile
$ qsub singlenode.pbs
6023521.nyx.engin.umich.edu
You can check on the job status:
qstat jobid
qstat -u user
$ qstat -u cja
nyx.engin.umich.edu:
                                                                Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ --- --- ------ ----- - -----
6023521.nyx.engi     cja      flux     hpc101i              --   1   1     -- 00:05 Q    --
To delete your job:
qdel jobid
$ qdel 6023521
$
Loosely-coupled batch script
#PBS -N yourjobname
#PBS -V
#PBS -A youralloc_flux
#PBS -l qos=flux
#PBS -q flux
#PBS -l procs=12,pmem=1gb,walltime=01:00:00
#PBS -M youremailaddress
#PBS -m abe
#PBS -j oe

# Your code goes below:
cd $PBS_O_WORKDIR
mpirun ./c_ex01
Tightly-coupled batch script
#PBS -N yourjobname
#PBS -V
#PBS -A youralloc_flux
#PBS -l qos=flux
#PBS -q flux
#PBS -l nodes=1:ppn=12,mem=47gb,walltime=02:00:00
#PBS -M youremailaddress
#PBS -m abe
#PBS -j oe

# Your code goes below:
cd $PBS_O_WORKDIR
matlab -nodisplay -r script
Lab 5
Task: Run an MPI job on 8 cores
Compile c_ex05:
cd ~/cac-intro-code
make c_ex05
Edit the file run with your favorite Linux editor:
Change the #PBS -M address to your own
I don’t want Brock to get your email!
Change the #PBS -A allocation to FluxTraining_flux, or to your own allocation, if desired
Change the #PBS -l qos to flux
Submit your job:
qsub run
PBS attributes
As always, man qsub is your friend
-N : sets the job name; can’t start with a number
-V : copy shell environment to compute node
-A youralloc_flux : sets the allocation you are using
-l qos=flux : sets the quality of service parameter
-q flux : sets the queue you are submitting to
-l : requests resources, like number of cores or nodes
-M : whom to email; can be multiple addresses
-m : when to email: a=job abort, b=job begin, e=job end
-j oe : join STDOUT and STDERR to a common file
-I : allow interactive use
-X : allow X GUI use
PBS resources (1)
A resource (-l) can specify:
Request wallclock (that is, running) time:
-l walltime=HH:MM:SS
Request C MB of memory per core:
-l pmem=Cmb
Request T MB of memory for the entire job:
-l mem=Tmb
Request M cores on arbitrary node(s):
-l procs=M
Request a token to use licensed software:
-l gres=stata:1
-l gres=matlab
-l gres=matlab%Communication_toolbox
PBS resources (2)
A resource (-l) can specify:
For multithreaded code:
Request M nodes with at least N cores per node:
-l nodes=M:ppn=N
Request M cores with exactly N cores per node (note the difference vis-à-vis ppn syntax and semantics!):
-l nodes=M,tpn=N
(you’ll only use this for specific algorithms)
Interactive jobs
You can submit jobs interactively:
qsub -I -X -V -l procs=2 -l walltime=15:00 -A youralloc_flux -l qos=flux -q flux
This queues a job as usual
Your terminal session will be blocked until the job runs
When your job runs, you'll get an interactive shell on one of your nodes
Invoked commands will have access to all of your nodes
When you exit the shell, your job is deleted
Interactive jobs allow you to:
Develop and test on cluster node(s)
Execute GUI tools on a cluster node
Utilize a parallel debugger interactively
Lab 6
Task: Run an interactive job
Enter this command (all on one line):
qsub -I -V -l procs=1 -l walltime=30:00 -A FluxTraining_flux -l qos=flux -q flux
When your job starts, you’ll get an interactive shell
Copy and paste the batch commands from the “run” file, one at a time, into this shell
Experiment with other commands
After thirty minutes, your interactive shell will be killed
Lab 7
Task: Run Matlab interactively
module load matlab
Start an interactive PBS session:
qsub -I -V -l procs=2 -l walltime=30:00 -A FluxTraining_flux -l qos=flux -q flux
Run Matlab in the interactive PBS session:
matlab -nodisplay
Introduction to Scheduling
The Scheduler (1/3)
Flux scheduling policies:
The job’s queue determines the set of nodes you run on
The job’s account and qos determine the allocation to be charged
If you specify an inactive allocation, your job will never run
The job’s resource requirements help determine when the job becomes eligible to run
If you ask for unavailable resources, your job will wait until they become free
There is no pre-emption
The Scheduler (2/3)
Flux scheduling policies:
If there is competition for resources among eligible jobs in the allocation or in the cluster, two things help determine when you run:
How long you have waited for the resource
How much of the resource you have used so far
This is called “fairshare”
The scheduler will reserve nodes for a job with sufficient priority
This is intended to prevent jobs with large resource requirements from being starved
The Scheduler (3/3)
Flux scheduling policies:
If there is room for shorter jobs in the gaps of the schedule, the scheduler will fit smaller jobs in those gaps
This is called “backfill”
(Diagram: jobs packed into a cores-versus-time schedule, with short jobs backfilled into gaps)
Gaining insight
There are several commands you can run to get some insight into the scheduler’s actions:
freenodes : shows the number of free nodes and cores currently available
mdiag -a youralloc_name : shows resources defined for your allocation and who can run against it
showq -w acct=yourallocname : shows jobs using your allocation (running/idle/blocked)
checkjob jobid : can show why your job might not be starting
showstart -e all jobid : gives you a coarse estimate of job start time; use the smallest value returned
More advanced scheduling
Job Arrays
Dependent Scheduling
Job Arrays
• Submit copies of identical jobs
• Invoked via qsub -t:
qsub -t array-spec pbsbatch.txt
Where array-spec can be:
m-n
a,b,c
m-n%slotlimit
e.g.
qsub -t 1-50%10
Fifty jobs, numbered 1 through 50; only ten can run simultaneously
• $PBS_ARRAYID records the array identifier
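A sketch of how an array task can use $PBS_ARRAYID to select its own work item (the file-name scheme here is hypothetical); any language that can read the environment works, C shown for consistency with the earlier examples:

    /* array_task.c -- each array task reads its identifier and picks
       its own input file, e.g. input-7.dat for task 7 */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        const char *id = getenv("PBS_ARRAYID");  /* set by Torque for qsub -t jobs */
        if (id == NULL) {
            fprintf(stderr, "not running inside a job array\n");
            return 1;
        }
        char infile[64];
        snprintf(infile, sizeof infile, "input-%s.dat", id);
        printf("array task %s will process %s\n", id, infile);
        return 0;
    }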
Dependent scheduling
• Submit jobs whose execution scheduling depends on other jobs
• Invoked via qsub -W:
qsub -W depend=type:jobid[:jobid]…
Where depend can be:
after : schedule after jobids have started
afterok : schedule after jobids have finished, only if no errors
afternotok : schedule after jobids have finished, only if errors
afterany : schedule after jobids have finished, regardless of status
before, beforeok, beforenotok, beforeany (see next slide)
Dependent scheduling
Where depend can be (cont’d):
before : when this job has started, jobids will be scheduled
beforeok : after this job completes without errors, jobids will be scheduled
beforenotok : after this job terminates with errors, jobids will be scheduled
beforeany : after this job completes, regardless of status, jobids will be scheduled
Some Flux Resources
http://arc.research.umich.edu/resources-services/flux/ : U-M Advanced Research Computing Flux pages
http://cac.engin.umich.edu/ : CAEN HPC Flux pages
http://www.youtube.com/user/UMCoECAC : CAEN HPC YouTube channel
For assistance: [email protected]
Read by a team of people including unit support staff
Cannot help with programming questions, but can help with operational Flux and basic usage questions
Summary
The Flux cluster is just a collection of similar Linux machines connected together to run your code, much faster than your desktop can
Command-line scripts are queued by a batch system and executed when resources become available
Some important commands are:
qsub
qstat -u username
qdel jobid
checkjob
Develop and test, then submit your jobs in bulk and let the scheduler optimize their execution
Any Questions?
Charles J. Antonelli
LSAIT Advocacy and Research Support
[email protected]
http://www.umich.edu/~cja
734 763 0607
References
1. http://arc.research.umich.edu/resources-services/flux/
2. http://arc.research.umich.edu/flux/hardware-services/
3. http://cac.engin.umich.edu/resources/software/R.html
4. http://cac.engin.umich.edu/resources/software/matlab.html
5. CAC supported Flux software, http://cac.engin.umich.edu/resources/software/flux-software (accessed August 2013)
6. J. L. Gustafson, “Reevaluating Amdahl’s Law,” chapter for the book Supercomputers and Artificial Intelligence, edited by Kai Hwang, 1988. http://www.scl.ameslab.gov/Publications/Gus/AmdahlsLaw/Amdahls.html (accessed November 2011)
7. Mark D. Hill and Michael R. Marty, “Amdahl’s Law in the Multicore Era,” IEEE Computer, vol. 41, no. 7, pp. 33-38, July 2008. http://research.cs.wisc.edu/multifacet/papers/ieeecomputer08_amdahl_multicore.pdf (accessed November 2011)
8. InfiniBand, http://en.wikipedia.org/wiki/InfiniBand (accessed August 2011)
9. Intel C and C++ Compiler 11.1 User and Reference Guide, http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/lin/compiler_c/index.htm (accessed August 2011)
10. Intel Fortran Compiler 11.1 User and Reference Guide, http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/fortran/lin/compiler_f/index.htm (accessed August 2011)
11. Lustre file system, http://wiki.lustre.org/index.php/Main_Page (accessed August 2011)
12. Torque User’s Manual, http://www.clusterresources.com/torquedocs21/usersmanual.shtml (accessed August 2011)
13. Jurg van Vliet & Flavia Paganelli, Programming Amazon EC2, O’Reilly Media, 2011. ISBN 978-1-449-39368-7