introduction to high performance computingrcg.group.shef.ac.uk/courses/hpcintro/downloads/hpc... ·...

INTRODUCTION TO HIGH PERFORMANCE COMPUTING

Post on 24-Sep-2020

Page 1: INTRODUCTION TO HIGH PERFORMANCE COMPUTINGrcg.group.shef.ac.uk/courses/hpcintro/downloads/HPC... · 2020. 5. 26. · INTRODUCTION. The supercomputer is a computer with a high level

INTRODUCTION TO HIGH PERFORMANCE COMPUTING

Course material:

http://rcg.group.shef.ac.uk/courses/hpcintro/

GETTING STARTED

Getting an Account

Before you can start using Bessemer you need to register for an account.

Students can also have an account on Bessemer with the permission of their supervisors.

Accounts are available by emailing [email protected]

Connecting to Bessemer
Windows - PuTTY / MobaXterm

Download and install PuTTY, MobaXterm or another SSH client.

Hostname: <username>@bessemer.shef.ac.uk

Connecting to Bessemer
Linux / macOS

Linux and macOS both have a terminal emulator pre-installed.

Once you have a terminal open, run one of the following commands (-X enables X11 forwarding, -Y enables trusted X11 forwarding):

ssh -X <username>@bessemer.shef.ac.uk
ssh -Y <username>@bessemer.shef.ac.uk

where you replace <username> with your CICS username.

Connecting to the Training Cluster

Host: traininghpc
Username: Muse username
Password: Muse password

Connecting to ShARC
Platform independent

Open a browser and type: https://myapps.shef.ac.uk

Log in with your university account.

Click on Connect via myAPPs Portal and log in.

INTRODUCTION

A supercomputer is a computer with a much higher level of computing performance than a general-purpose computer.

Bessemer specifications:
• CPU cores: 1040
• Memory: 5184 GiB
• GPUs: 4
• Storage: 460 TiB

General CPU node specifications
25 nodes are publicly available

Machine
• Dell PowerEdge C6420

Central Processing Units:
• 2 x Intel Xeon Gold 6138
• 2.00 GHz

Memory:
• 192 GB
• 2666 MHz
• DDR4

Operating System
• CentOS 7.x
• Interactive and batch job scheduling software: Slurm
• Many applications, compilers, libraries and parallel processing tools.

[Figure: cluster layout showing login nodes #1 and #2, worker nodes #1 to #25, and shared user file storage]

The two Bessemer head nodes (login nodes) are gateways to the cluster of worker nodes.

Their main purpose is to give access to the worker nodes, NOT to run CPU-intensive programs.

All CPU-intensive computations must be performed on the worker nodes. To do this, start an interactive session on a worker node with:

srun --pty bash -i
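
Before launching anything heavy, it can help to confirm that the shell really is inside a Slurm allocation rather than on a login node. A minimal sketch, assuming only that Slurm exports the SLURM_JOB_ID environment variable inside a job; the function name on_worker_node is hypothetical:

```shell
# Succeeds (exit status 0) only when the shell runs inside a Slurm job,
# i.e. when Slurm has exported SLURM_JOB_ID into the environment.
# on_worker_node is a hypothetical helper name, not a Slurm command.
on_worker_node() {
    [ -n "${SLURM_JOB_ID:-}" ]
}

if on_worker_node; then
    echo "Inside Slurm job ${SLURM_JOB_ID} on $(hostname)"
else
    echo "Not inside a Slurm job: start one with 'srun --pty bash -i' first"
fi
```

The same test can be placed at the top of a long-running script so that it refuses to start on a login node.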

RUNNING SIMPLE PROGRAMS

Setting up your software development environment

You can set up your software environment for a job with the command

module

All the available software environments can be listed by using

module avail

You can then select the ones you wish to use by using

module add
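
The module commands above can be wrapped in a small helper that also works as a sanity check outside the cluster. A minimal sketch, assuming the module command is provided by the cluster's login environment; the function name load_env is hypothetical, and the module name in the usage line is the R module used later in this course:

```shell
# Load one module and then list what is loaded. If the `module` command
# does not exist in this shell (for example on a laptop), report that
# instead of failing with "command not found".
# load_env is a hypothetical helper name.
load_env() {
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not available (not on the cluster?)" >&2
        return 1
    fi
    module load "$1" && module list
}

# Usage on the cluster:
#   load_env apps/R/3.6.1/binary
```

Here module load is used; module add, shown above, is an equivalent spelling.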

Demonstration

Using modules
• List modules
• Available modules
• Load module

Write a simple “Hello World!” application and run it!

PRACTICE SESSION

On the login node, extract the course examples:

tar -xvf /usr/local/courses/hpc_intro_long.tgz

Start an interactive session on the training machine using

srun --pty bash -i

We are studying inflammation in patients who have been given a new treatment for arthritis, and we need to analyse the first set of inflammation data. The data sets are held in comma-separated values (CSV) format. Each row holds the observations for one patient. Each column holds the inflammation measured on one day.

For this practice session we will run the R application. The latest version of R can be loaded with

module load apps/R/3.6.1/binary

Change directory to the ‘hpc_intro_long/data’ directory and run R with

$ R

From the R session you can run a series of commands to plot the inflammation data:

dat <- read.csv(file = "inflammation-01.csv", header = FALSE)
avg_day_inflammation <- apply(dat, 2, mean)
plot(avg_day_inflammation)

To exit R, type

q()
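
The same per-day averages can be cross-checked from the shell without starting R. A minimal sketch using awk, assuming a headerless CSV file with numeric fields; the function name column_means is hypothetical:

```shell
# Print the mean of every column of a headerless CSV file as one
# comma-separated line (the shell analogue of apply(dat, 2, mean)).
# column_means is a hypothetical helper name.
column_means() {
    awk -F, '
        NF == 0 { next }                 # ignore empty lines
        {
            rows++
            if (NF > ncol) ncol = NF
            for (i = 1; i <= NF; i++) sum[i] += $i
        }
        END {
            for (i = 1; i <= ncol; i++)
                printf "%s%g", (i > 1 ? "," : ""), sum[i] / rows
            printf "\n"
        }
    ' "$1"
}

# Usage in the data directory:
#   column_means inflammation-01.csv
```

Each value in the output corresponds to one day, in the same order as the points plotted by R.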

MANAGING YOUR JOBS

Slurm is the resource management, job scheduling and batch control system. It:

• Starts up interactive jobs on available workers

• Schedules all batch-oriented (i.e. non-interactive) jobs

• Attempts to create a fair-share environment

• Optimises resource utilisation

Difference between interactive and non-interactive jobs

Until now, you have used interactive jobs. However, they have two limitations that cannot be ignored:

• The maximum time limit for interactive jobs is 8 hours.

• You must keep your connection alive!

This makes it inconvenient or impossible to solve time-consuming problems interactively.

Solution? Non-interactive jobs

NON-INTERACTIVE JOBS

1) Write a job-submission shell script

You can submit your job using a shell script. A general job-submission shell script starts with the “bang-line” (shebang) on its first line:

#!/bin/bash

2) Next you may specify some options, such as a memory limit.

#SBATCH --"OPTION"="VALUE"

3) Load the appropriate modules if necessary.

module load "MODULE NAME"

4) Run your program by using the Slurm “srun” command.

srun "PROGRAM"
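
Put together, the four steps give a script like the sketch below. The option values and the program name analysis.R are placeholders chosen for illustration, not recommendations; the R module name is the one used in the practice session. The snippet writes the script to submission.sh and only checks its syntax, so it can be tried without submitting anything:

```shell
# Write a small job-submission script. The quoted 'EOF' stops the shell
# from expanding anything inside the here-document.
cat > submission.sh <<'EOF'
#!/bin/bash
# Illustrative resource requests (placeholder values).
#SBATCH --mem=4000
#SBATCH --time=00:10:00

# Step 3: load the software the job needs.
module load apps/R/3.6.1/binary

# Step 4: run the program (analysis.R is a hypothetical file name).
srun Rscript analysis.R
EOF

# Parse the script without running it to catch shell syntax errors.
bash -n submission.sh && echo "submission.sh: syntax OK"
```

The #SBATCH lines must come before the first executable command in the script, which is why they sit directly under the first line.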

JOB SUBMISSION

Save the script (“submission.sh”) and submit it with

sbatch submission.sh

Note the job submission number. For example:

Submitted batch job 1226

Check your output file when the job is finished. Unless you set --output, Slurm writes it to slurm-<job ID>.out:

cat slurm-1226.out
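
In scripts it is often useful to capture the job number instead of reading it off the screen. A minimal sketch that extracts it from the "Submitted batch job" message; the function name job_id_from_message is hypothetical. sbatch also has a --parsable option that prints the job ID without the surrounding text.

```shell
# Extract the numeric job ID from sbatch's "Submitted batch job N" message.
# job_id_from_message is a hypothetical helper name.
job_id_from_message() {
    awk '/^Submitted batch job [0-9]+/ { print $4 }'
}

# Offline demonstration with the example message.
echo "Submitted batch job 1226" | job_id_from_message

# On the cluster:
#   jobid=$(sbatch submission.sh | job_id_from_message)
```

The captured value can then be passed to commands such as scancel.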

Managing Jobs: monitoring and controlling your jobs

Jobs typically pass through several states in the course of their execution. The typical states are PENDING, RUNNING, SUSPENDED, COMPLETING, and COMPLETED.

Display the job queue:

squeue

Show job details:

sacct -v

Delete a job from the queue:

scancel "JOB_ID"
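
With many jobs in the queue, a per-state summary is easier to read than the full listing. A minimal sketch, assuming one state name per input line, which is what squeue prints with -h (no header) and -o "%T" (state only); the function name count_states is hypothetical:

```shell
# Count how many jobs are in each state. Reads one state per line on
# stdin and prints "STATE COUNT" lines in alphabetical order.
# count_states is a hypothetical helper name.
count_states() {
    awk 'NF { count[$1]++ } END { for (s in count) print s, count[s] }' | sort
}

# Offline demonstration with made-up input.
printf 'RUNNING\nPENDING\nRUNNING\n' | count_states

# On the cluster, for your own jobs:
#   squeue -u "$USER" -h -o "%T" | count_states
```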

Additional options for job submission

Name your submission:

#SBATCH --job-name=test_job

Specify nodes and tasks for MPI jobs:

#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16

Memory allocation (in megabytes by default):

#SBATCH --mem=16000

Additional options for job submission

Specify the output file name:

#SBATCH --output=output.%j.test.out

Request time:

#SBATCH --time=00:30:00

Email notification:

#SBATCH [email protected]

For the full list of the available options please visit the Slurm manual webpage at https://slurm.schedmd.com/pdfs/summary.pdf.
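
In the --output option, Slurm replaces %j with the job ID, so each job gets its own file. A small sketch that mimics this substitution locally to preview the resulting name; it handles only %j, and the function name preview_output_name is hypothetical:

```shell
# Preview the file name Slurm would derive from an --output pattern by
# replacing every %j with the given job ID (other patterns are ignored).
# preview_output_name is a hypothetical helper name.
preview_output_name() {
    local pattern=$1 jobid=$2
    printf '%s\n' "$pattern" | sed "s/%j/${jobid}/g"
}

# Pattern from the slide above with the example job number 1226.
preview_output_name "output.%j.test.out" 1226
```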

EXAMPLE

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40
#SBATCH --mem=64000
#SBATCH [email protected]

module load OpenMPI/3.1.3-GCC-8.2.0-2.31.1

srun program

A maximum of 40 cores can be requested per node in the general use queues.
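
Because a node offers at most 40 cores in the general use queues, a job with more tasks than that has to span several nodes. A minimal sketch of the arithmetic, rounding up with integer division; the function name nodes_needed and the variable CORES_PER_NODE are hypothetical, and 40 is the per-node limit stated above:

```shell
# Per-node core limit taken from the slide; adjust if your queue differs.
CORES_PER_NODE=40

# Print the smallest number of nodes that can hold the given number of
# tasks, i.e. ceil(tasks / CORES_PER_NODE) using integer arithmetic.
# nodes_needed is a hypothetical helper name.
nodes_needed() {
    local tasks=$1
    echo $(( (tasks + CORES_PER_NODE - 1) / CORES_PER_NODE ))
}

nodes_needed 40     # fits on a single node
nodes_needed 100    # needs more than two nodes
```

The result is the value to give to --nodes when --ntasks-per-node is set to 40.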

DEMONSTRATION

Write a single script!

#!/bin/bash
module add apps/python/3.6/binary
srun python hello.py

You simply type:

sbatch myjob.sh

PRACTICE SESSION

Change directory to the r folder of the course examples.

Inspect the script file rslurm.sh and check that it will execute the R job for computing the means of the inflammation data sets.

Submit your R job using the command

sbatch rslurm.sh

JOB ARRAY

Job arrays offer a mechanism for submitting and managing collections of similar jobs quickly and easily.

All jobs in an array must have the same initial options (e.g. size, time limit, etc.).

JOB ARRAY

Job arrays are only supported for batch jobs, and the array index values are specified using the --array or -a option of the sbatch command.

The option argument can be specific array index values, a range of index values, and an optional step size. For example:

#SBATCH --array=0-4
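
To see which task indices a given --array argument produces, the three forms described above (a list, a range, and a range with a step) can be expanded locally. A minimal sketch in bash, assuming non-negative decimal indices without leading zeros; it does not handle other Slurm extensions such as the % limit on simultaneous tasks, and the function name expand_array_spec is hypothetical:

```shell
# Expand an --array style specification such as "0-4", "1,3,5" or
# "1-7:2" into the space-separated list of task indices it denotes.
# expand_array_spec is a hypothetical helper name.
expand_array_spec() {
    local spec=$1 part range step first last i
    local -a parts indices=()
    IFS=',' read -ra parts <<< "$spec"
    for part in "${parts[@]}"; do
        range=${part%%:*}          # text before an optional ":step"
        step=1
        if [[ $part == *:* ]]; then
            step=${part##*:}
        fi
        first=${range%%-*}
        last=${range##*-}          # equals first when there is no "-"
        for (( i = first; i <= last; i += step )); do
            indices+=("$i")
        done
    done
    echo "${indices[*]}"
}

expand_array_spec "0-4"
expand_array_spec "1-7:2"
```

Each index in the output corresponds to one task of the array job.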

Job ID and Environment Variables

Job arrays will have two additional environment variables set.

SLURM_ARRAY_JOB_ID will be set to the first job ID of the array.
SLURM_ARRAY_TASK_ID will be set to the job array index value.

srun ./fish < fish${SLURM_ARRAY_TASK_ID}.in > fish${SLURM_ARRAY_TASK_ID}.out

Submitting a Job Array

Job submission script (named submit.sh):

#!/bin/bash

#SBATCH --array=0-4

srun ./fish < fish${SLURM_ARRAY_TASK_ID}.in > fish${SLURM_ARRAY_TASK_ID}.out

Job submission:

sbatch submit.sh
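
Before submitting, it can be worth previewing which input and output files each array task will use. A minimal dry-run sketch that sets SLURM_ARRAY_TASK_ID by hand for each index and prints the command instead of running it; the function name preview_array_commands is hypothetical, and ./fish and the file names are those of the script above:

```shell
# Print the command each array task would run for indices first..last.
# Nothing is executed; SLURM_ARRAY_TASK_ID is set locally to imitate
# the value Slurm provides to each task.
# preview_array_commands is a hypothetical helper name.
preview_array_commands() {
    local first=$1 last=$2 id SLURM_ARRAY_TASK_ID
    for (( id = first; id <= last; id++ )); do
        SLURM_ARRAY_TASK_ID=$id
        echo "./fish < fish${SLURM_ARRAY_TASK_ID}.in > fish${SLURM_ARRAY_TASK_ID}.out"
    done
}

# Matches "#SBATCH --array=0-4": one line per task.
preview_array_commands 0 4
```

If any of the listed input files is missing, the corresponding task would fail when the shell tries to open it.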

Getting help

• Web site
- http://www.shef.ac.uk/cics/research

• Iceberg Documentation
- http://www.sheffield.ac.uk/cics/research/hpc/iceberg

• Training (also uses the learning management system)
- http://www.shef.ac.uk/cics/research/training

• Discussion Group (based on Google Groups)
- https://groups.google.com/a/sheffield.ac.uk/forum/?hl=en-GB#!forum/hpc

• E-mail the group [email protected]

• Help on Google Groups
- http://www.sheffield.ac.uk/cics/groups

• Contacts
- [email protected]