Linux (for HPC) Basics

Bill Renaud
Oak Ridge Leadership Computing Facility

ORNL is managed by UT-Battelle for the US Department of Energy



Overview

• Part I
  –  General concepts and terminology
  –  Shells
  –  Working with Files and Directories
• Part II
  –  Process Management
  –  Archiving
  –  File Permissions
  –  String processing
  –  Graphical Tools
  –  Batch Systems

Credits

• Some ideas, organization, etc. from http://www.doc.ic.ac.uk/~wjk/UnixIntro
• Others from a previous HPC Seminar Series lecture: http://www.nics.tennessee.edu/files/pdf/hpcss13_14/10_24_UnixIntro.pdf
• http://www.gnu.org: information on core utilities, especially file permissions and signals

Part I

First Things First

• This assumes you are familiar with connecting to/logging in to a system
  –  If not, let me know and we can cover that
  –  You’ll almost always want to use ssh to connect
• In this lecture I’ll mention many commands (they’ll be in this font in the slides)
  –  I’ll give a brief discussion
  –  They’re your license to learn…once you have the basic idea, use a more complete reference to learn them

“Normal” Linux/Unix vs. HPC Linux/Unix

• In terms of commands/interaction, not much differs
• The basic OS is the same
• HPC will have larger filesystems, batch queues, etc.
• HPC is almost always a client/server arrangement
  –  In HPC, you’ll seldom be at the system’s main terminal. On a desktop installation, you will.
  –  You’ll almost always use ssh to connect to HPC systems
  –  You may need help with X11 (more later)

“Normal” Linux/Unix vs. HPC Linux/Unix

• Many different connection utilities exist; many systems are trending to Secure Shell (ssh)
  –  Older connection protocols include telnet, rlogin, etc.
  –  You must have client software for the appropriate protocol on your system
     •  Most Unix-like systems will have ssh
     •  Windows users can use PuTTY (it’s free), which provides multiple protocols (including ssh)

General Concepts

• “Linux” technically refers only to the kernel
  –  The central process that manages the system
  –  There’s much more to an operating system…
• Software from the GNU Project is added to make a fully capable OS
  –  Most commands are from the GNU project
  –  Thus, it’s more properly called GNU/Linux
• A distribution (or distro) is a bundling of a kernel, OS software (GNU) & other software
  –  What most people mean when calling Linux an OS

General Concepts

• Some common distros:
  –  Ubuntu (including Xubuntu, Kubuntu, Edubuntu)
  –  Red Hat (RHEL, Fedora)
  –  Debian
  –  SuSE
  –  and many, many more…

General Concepts

• Linux and other Unix-like operating systems are fairly common
  –  Embedded devices
  –  Smartphones
  –  Laptops/Desktops
  –  Servers
  –  Supercomputers
• Most systems provide both command-line and GUI access
  –  We’ll focus on the command line
  –  If you know the command line, the GUIs are intuitive

General Concepts

• Users
  –  Individuals with access to a system
  –  User accounts can also be set up for automated tasks
• Groups
  –  Collections of users
  –  Often based on some outside organizational structure

General Concepts

• The system knows users and groups by numbers (called the uid and gid)
• Since numbers aren’t user-friendly, it maps these to names for our use
  –  e.g. I might be user renaudb with uid 501
• In fact, the system handles most things as numbers but maps them over when interacting with humans
  –  Common in computing…we use domain names instead of IP addresses

General Concepts

• The shell
  –  Basic program through which you interact with the system
  –  Several available: bash, ksh, csh, tcsh, zsh
  –  Largely guided by user preference
     •  Users are very passionate about their shell of choice
  –  Different shells provide similar features in different ways
• Shell script
  –  A small file containing several commands
  –  Essentially a very basic program

General Concepts

• Commands
  –  Basic programs through which we interact with the system
  –  Typically very short (why type unnecessary letters?!?)
     •  Often an abbreviation of the action we’re requesting
  –  Very flexible…take many options
     •  Options are typically preceded by one or two dashes
     •  May be a single letter or a word
     •  May or may not take arguments
  –  Typically single-purpose
     •  Combine them for complex operations. Philosophy similar to RISC.
• General form: command [options] [args…]

General Concepts

[Figure: system diagram; only the label “Hardware” survives in this transcript]

General Concepts

• Getting help
  –  Ask someone familiar with the OS
  –  Ask Google
  –  Once you’re more comfortable, ask the system
     •  Most commands take some form of “help” option
     •  System manual pages (the man command: man pwd)
     •  The info command (similar to man pages but newer)

General Concepts

• Linux operates on the philosophy of Least User Privilege (LUP)
  –  Let the user do what’s necessary for their job, but don’t let them do things that could compromise the system
  –  MacOS and recent Windows versions also use LUP
• Everything on the system is associated with a user
  –  Some are real people
  –  Some are “system” accounts
• A special user account, root, is the system administrator (its uid is always 0)

General Concepts

• Linux assumes you meant to type that
  –  Some commands can be destructive; Linux will happily do them without asking “Do you really want to do that?”
  –  Exercise extreme caution when using wildcards (we’ll discuss those in a few slides) & recursive options
• When doing things as root…THINK! (& then think some more)
  –  As root, you can destroy a system very quickly
  –  NEVER use root unless you absolutely have to
  –  NEVER use root for day-to-day access
  –  Try to avoid that “Why is that taking so long?” feeling

“Normal” Computing vs. HPC

• HPC is almost always a client/server arrangement
• You use your laptop/desktop to connect to a “remote” supercomputer
• Many different connection utilities exist; many systems are trending to Secure Shell (ssh)
  –  Older connection protocols include telnet, rlogin, etc.
  –  You must have client software for the appropriate protocol on your system
     •  Most Unix-like systems will have ssh
     •  Windows users can use PuTTY (it’s free), which provides multiple protocols (including ssh)

Working with Shells

Shells

• Shells are the way through which you interact with the system
• Shells provide control structures (if/then/else, while, etc.) to permit some degree of programming
  –  Syntax differs from shell to shell
• Most modern shells provide “tab completion”
  –  Type part of a command/filename, press tab
  –  If you’ve typed enough to identify the command/file, the shell completes it for you

Lazy programmer’s note: Tab completion can save you lots of keystrokes. It also fills in what the OS needs to handle special characters (e.g. spaces) in filenames.

Shells

• All users have a default shell
• When you log in, your default shell is started
• Once the login process finishes, you are given a command prompt
  –  Default one will differ from system to system
  –  In these slides, we’ll simply use $
• Prompts can be customized to provide helpful info
  –  I typically use a two-line prompt
     user@hostname $

Shell Variables

• Shells provide variables
  –  These are useful in commands and scripts
  –  These can be read by programs
• Different shells use different syntax to set variables
• Two types of variables
  –  Shell variable: only exists in the local shell
  –  Environment variable: exists in the local shell and any “subshells” or scripts

Shell & Environment Variables

• Shell variables
  –  bash, etc.: MYVAR="somevalue"
  –  csh/tcsh: set MYVAR = "somevalue"
• Environment variables
  –  bash, etc.: export MYVAR="somevalue"
  –  csh/tcsh: setenv MYVAR "somevalue"
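The difference between the two kinds of variable can be seen by starting a child shell and asking what it can see. This is a minimal sketch in sh/bash syntax; MYSHELLVAR and MYENVVAR are hypothetical names chosen only for this illustration, and it assumes neither is already set in your environment.

```shell
# MYSHELLVAR and MYENVVAR are hypothetical names chosen for this sketch.
MYSHELLVAR="local only"        # shell variable: stays in this shell
export MYENVVAR="inherited"    # environment variable: copied into child processes

# Start a child shell and ask it what it can see. The single quotes stop
# the parent shell from expanding the variables before the child runs.
child_view=$(sh -c 'echo "shell=[$MYSHELLVAR] env=[$MYENVVAR]"')
echo "$child_view"
```

The child sees an empty value for the shell variable and the exported value for the environment variable.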

Handy Environment Variables

• $PATH: The directory list that the system searches to find commands
  –  Be careful when “updating” this. Don’t overwrite what’s there.
     export PATH=/some/new/dir:${PATH}
  –  If you need to use a command that’s not in your $PATH, you have to tell the system where to find it
     •  Might involve typing the name of the file starting at / and including all directories, like:
        /home/someuser/dir1/dir1a/my_command.x
• $USER: Your username. Helpful in scripts.
• $HOME: Your home directory. More on this later.
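To see why prepending is safer than overwriting, the sketch below (sh/bash syntax) adds the slide’s placeholder directory /some/new/dir to the front of $PATH and checks that the old entries are still there. The directory does not need to exist for the demonstration; old_path and first_entry are names introduced here for illustration.

```shell
old_path="$PATH"                       # remember the starting value
export PATH="/some/new/dir:${PATH}"    # prepend; the old entries are kept

first_entry="${PATH%%:*}"              # text before the first colon
echo "searched first: $first_entry"
echo "ls is found at: $(command -v ls)"
```

Because the original entries are still in the list, everyday commands such as ls are still found.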

Quoting

• Often we deal with strings. These are delimited with quotes.
• Different types of quotes have different meanings
  –  Single quotes (') identify a literal string (don’t expand variables)
  –  Double quotes (") identify a string whose variables should be expanded
  –  Backslash (\) removes the special meaning of the next character (usually used in double-quoted strings)
  –  Backquotes (`) are used to capture the output of a command

NOTE: In bash, `command` is deprecated in favor of $(command)

Your First Batch of Commands

• echo – print something to the screen
• hostname – set or display the system’s hostname
• whoami – display my username
• sleep n – do nothing for n seconds (n is a number)
• : – No-op. Do nothing (often useful in scripting)
• ; – Command separator
  –  Allows you to type multiple commands at one prompt
  –  Trust me, it’s useful
• # – Begins a comment (useful in scripting)

Quoting Examples

$ export MYHOST=$( hostname )
$ echo $MYHOST
titan
$ export MYVAR1='single quotes: $MYHOST'
$ export MYVAR2="double quotes: $MYHOST"
$ echo $MYVAR1
single quotes: $MYHOST
$ echo $MYVAR2
double quotes: titan
$ export MYVAR3="double quotes w/backslash \$MYHOST"
$ echo $MYVAR3
double quotes w/backslash $MYHOST

Wildcards

• Users often run commands that deal with files
• Sometimes multiple files need to be listed
• Tedious to type each name…wildcards help with this
  –  ? – matches exactly one character
  –  * – matches 0 or more characters
  –  [ ] – matches any one character listed in the brackets
• Wildcards are related to a larger topic called regular expressions, though shell wildcards use a simpler syntax
  –  We won’t go into a great deal of depth with regular expressions…they could easily be a week-long class

Wildcards

• Assume we have the strings/files:
  file1.dat   file2.dat   file3.dat
  file1a.dat  file2a.dat  file3a.dat
  file1b.dat  file2b.dat  file3b.dat
• Then some wildcard matches might be:

  String          Matches
  fi*dat          All 9 strings
  file?.dat       file1.dat, file2.dat and file3.dat
  file[23]a.dat   file2a.dat, file3a.dat
  file??.dat      The 6 strings with two characters before .dat (file1a.dat through file3b.dat)
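The matches in the table can be checked directly in a scratch directory. This is a small sketch in sh/bash syntax; workdir and the counter variable names are introduced here for illustration, and mktemp -d is assumed to be available (it is on typical Linux systems).

```shell
# Work in a scratch directory so the nine example files do not clutter anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch file1.dat file2.dat file3.dat \
      file1a.dat file2a.dat file3a.dat \
      file1b.dat file2b.dat file3b.dat

# "set --" stores the expanded names as positional parameters; $# counts them.
set -- fi*dat;        n_star=$#
set -- file?.dat;     one_char="$*"
set -- file[23]a.dat; bracket="$*"
set -- file??.dat;    n_two=$#

echo "fi*dat        -> $n_star files"
echo "file?.dat     -> $one_char"
echo "file[23]a.dat -> $bracket"
echo "file??.dat    -> $n_two files"
```

Note that the shell expands the wildcard before the command runs, so the command itself only ever sees the resulting list of names.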

Redirecting Output

• Sometimes you may want to capture a command’s output into a file
  –  For you to browse later
  –  As input to a later script
  –  To send to tech support
• Again, different shells use slightly different syntax
• There are two main output streams to consider
  –  Standard output (stdout)
  –  Standard error (stderr)

Redirecting Output (bash/ksh/similar)

  Redirection                            Command
  Stdout; overwrite contents of ‘out’    ./run.sh >out
  Stdout; append to ‘out’                ./run.sh >>out
  Stderr; overwrite contents of ‘err’    ./run.sh 2>err
  Stderr; append to ‘err’                ./run.sh 2>>err
  Both (into different files)            ./run.sh >>out 2>>err
  Both (into the same file)              ./run.sh >>out 2>&1
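The rows of the table can be tried with a stand-in for ./run.sh. In this sketch (sh/bash syntax) emit is a hypothetical function that writes one line to stdout and one to stderr, and workdir is a scratch directory created for the demonstration.

```shell
workdir=$(mktemp -d)

# emit is a hypothetical stand-in for ./run.sh: one line to each output stream.
emit() {
    echo "normal output"
    echo "error output" >&2
}

emit >"$workdir/out" 2>"$workdir/err"      # overwrite, separate files
emit >>"$workdir/out" 2>>"$workdir/err"    # append, separate files
emit >"$workdir/both" 2>&1                 # both streams into the same file

echo "--- out ---";  cat "$workdir/out"
echo "--- err ---";  cat "$workdir/err"
echo "--- both ---"; cat "$workdir/both"
```

Order matters in the last form: 2>&1 means “send stderr wherever stdout points right now”, so it has to come after the stdout redirection.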

Redirecting Output (csh/tcsh)

  Redirection                   Command
  Stdout                        ./run.csh > out (same as bash)
  Stdout; overwrite             ./run.csh >! out
  Stdout; append                ./run.csh >>out (same as bash)
  Stderr; overwrite             (./run.csh > /dev/null) >& err
  Stderr; append                (./run.csh > /dev/null) >>& err
  Both (into different files)   (./run.csh > out) >& err
  Both (into the same file)     ./run.csh >>& errout

/dev/null is a special file. It discards anything redirected into it. It is occasionally referred to as the “bit bucket”.

Redirecting Input

• You can redirect input, too
  –  Say, a list of data for the script to process
• This involves the standard input (stdin) stream
• Normal redirection
  –  Done with a single “less than” sign
     $ ./run.sh <myfile.in
  –  Every line of myfile.in will be sent to the stdin stream of run.sh

Redirecting Input

• “Here” document
  –  Instead of putting input into a file, type it “here”
  –  Done via the double “less than” sign and a delimiter
  –  Continues until the delimiter appears by itself on a line
     •  The delimiter is NOT part of the here document/stdin stream

$ ./run.sh <<EOF
line1
line2
EOF
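A quick way to confirm that the delimiter is not part of the input is to count the lines a command receives. In this sketch (sh/bash syntax), count_lines is a hypothetical stand-in for ./run.sh that simply counts lines on stdin.

```shell
# count_lines is a hypothetical stand-in for ./run.sh: it counts the lines on stdin.
count_lines() {
    wc -l | tr -d ' '
}

# Everything between the two EOF markers becomes stdin; the markers do not.
n=$(count_lines <<EOF
line1
line2
EOF
)
echo "lines received: $n"
```

EOF is only a convention; any word can serve as the delimiter as long as the same word closes the block.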

Working with Files and Directories

The Filesystem

• Linux uses a hierarchical filesystem
  –  The “top” is named / and called the root directory
• Common (useful) directories
  –  etc/ (configuration files)
  –  home/ (user home directories)
  –  bin/ (commands)
  –  tmp/ (temporary files)
  –  usr/bin/ (more commands)
  –  usr/lib/ (libraries)
  –  usr/local/ (even more commands and libraries)
  –  var/ (log files)
• There are many, many more

The Filesystem

• Linux uses multiple physical devices (hard drives, DVDs, etc.), but arranges them in a unified structure under /
  –  If you insert a DVD or flash drive, it’s probably placed under /media or something similar
  –  Most GUIs make it easy to find
• Windows, on the other hand, places each device as the root of its own filesystem (C:\, D:\, etc.)

A Note on Terminology

• Unfortunately, “filesystem” can be used in many contexts on Unix-like systems
  1.  It can be what we just described…the full directory “tree” that begins with /
  2.  It can be one of the many physical devices that are put together to make up that tree
  3.  It can mean the way files/directories are physically stored on a device (examples are ext4, fat32, hfs, etc.)
• In this presentation, from this point forward, “filesystem” refers to meaning #2 above (unless otherwise noted)

A Very Simple Directory Tree

$ ls -1F /
bin/
dev/
etc/
home/
sbin/
tmp/
usr/
var/
$

Working with Files and Directories

• For any given file, we have two ways in which we can refer to it
  –  Relative path
  –  Absolute path
• An absolute path tells the system the precise location of the file, starting with / (the root directory)
  –  /home/someuser/mydir1/thefile
• A relative path tells the system the location of the file in terms of our working directory
  –  Assuming we’re in /home/someuser, a relative path to the same file would be mydir1/thefile

Working with Files and Directories

• pwd – Print Working (current) Directory
• cd – Change Directory
• mv – MoVe a file (includes renaming)
• ls – LiSt directory contents
• mkdir – MaKe DIRectory (It’s not md?!?)
• rm – ReMove a file
  –  There is no “recycle bin”. Once you remove it, it’s gone.
• rmdir – ReMove a DIRectory (the directory must be empty)
  –  rm won’t remove a directory unless you are doing a recursive remove, so we have the rmdir command.

Working with Files and Directories

• ln – Create a LiNk
• file – tell me what type of file this is (text, binary)
• strings – show me printable strings from this file
• less, more, cat – display a (text) file
• head – display the first few lines of a file
• tail – display the last few lines of a file

Working With Files and Directories

$ pwd
/home/someuser

$ cd mydir

$ pwd
/home/someuser/mydir

$ ls
myfile

$ mv myfile mynewfile

$ ls
mynewfile

$ cd /home/someotheruser

$ pwd
/home/someotheruser

Working with Files and Directories

• quota – show my filesystem quota
• df – show how much free space the disk has
• du – show Disk Usage of a given file/directory

Working with Files and Directories

$ quota
Disk quotas for user someusr (uid 12321):
     Filesystem    blocks     quota     limit  grace    files       quota       limit  grace
nccsfiler4.ccs.ornl.gov:/vol/home2
                 16182172  52428800  52428800          245827  4294967295  4294967295
nccsfiler3.ccs.ornl.gov:/vol/home1
                        8  10485760  10485760               2  4294967295  4294967295

Working with Files and Directories

$ df
Filesystem             1K-blocks        Used  Available  Use%  Mounted on
/dev/mapper/vg0-root    20314748     4252564   15013608   23%  /
/dev/mapper/vg0-opt     20314748      209760   19056412    2%  /opt
/dev/mapper/vg0-var     20314748     1350264   17915908    8%  /var
/dev/cciss/c0d0p1         295561       31662     248639   12%  /boot
tmpfs                    4088124           0    4088124    0%  /dev/shm
nccsfiler3.ccs.ornl.gov:/vol/home
                          102400        1248     101152    2%  /autofs/na3_home
nccsfiler4.ccs.ornl.gov:/vol/proj
                      5033246720  4870948832  162297888   97%  /autofs/na4_proj
nccsfiler4.ccs.ornl.gov:/vol/sw
                      1356437920  1201421696  155016224   89%  /autofs/na4_sw

$ du -sk .
16279240 .

Remember the multiple meanings of “filesystem”? The df command shows us where the individual physical devices (our second meaning of filesystem) are mapped into the larger directory tree (our first meaning of filesystem).

Working With Files and Directories

• The ls command takes tons of options. Important ones are
  –  -l: show a detailed (long) listing
  –  -a: show all files, even hidden ones
     •  Hidden files have a period as the first character of their name
  –  -R: be recursive (show contents of subdirectories)

Working with Files and Directories

• The rm command also takes some useful (and potentially dangerous) options
  –  -i: ask me before deleting each file
  –  -r: delete recursively (VERY DANGEROUS)

Working With Files and Directories

• Why did I tell you about the file command?
  –  Linux assumes “you meant that”, and trying to print a non-text file is…um…“interesting”.
• How do I display a text file to the screen?
  –  cat: shows the file all at once (conCATenate file)
  –  more: shows the file a screenful at a time
     •  Some versions don’t let you scroll back, so…
  –  less: less is more. No, really. It does what more does, but lets you scroll using the arrow keys.

Working with Files and Directories

• Systems have the concept of users and groups
  –  User: an individual that has access to the system
  –  Group: a collection of one or more users
• Users and groups are used to control access to files, directories, etc.
• Files are associated with a user and a group
• Every user has a “home” directory
  –  You’re placed here when you log in
  –  It’s where your personal files go

Working with Files and Directories

• We have some special characters to refer to directories:
  –  . – the current directory
  –  .. – the current directory’s parent
  –  ~ – refers to home directories
     •  ~ alone is the current user’s home directory
     •  ~john is the home directory for user “john”

Working with Files and Directories

• Why do we need ~, . and ..?
• They make life easier by giving us shortcuts for directory names

  cd ..                 Change to the current directory’s parent (without knowing the full path)
  ./somecommand         Run somecommand, which may not be in your $PATH
  cat ~/somefile        Display a file in my home directory
  cat ~john/somefile    Display a file in john’s home directory
  cd ~                  Change to my home directory. Equivalent to cd $HOME

Lazy programmer’s note: Just typing cd and pressing enter will change to your home directory.

Working with Files and Directories

• If a filename begins with . (period), it’s a “hidden” file
  –  Normal ls won’t list it. You need ls -a.
• Other than a leading period and the special directories . and .., a period has no special meaning in filenames
  –  You don’t need to do *.* to represent all files…* is sufficient.

Part II

Working with Files and Directories (continued)

File Permissions

• Every file on a system has three basic permissions: read, write, and execute
• For normal files:
  –  Read = the ability to browse/print the contents of the file
  –  Write = the ability to update the file
  –  Execute = the ability to run the file as a program
• For directories:
  –  Read = the ability to see what files are in a directory
  –  Write = the ability to add/remove files in a directory
  –  Execute = the ability to cd into the directory

File Permissions

• Typically viewed with ls -l
• Permissions are the left-most 10-character string
  –  First character indicates the file type (- for normal files, d for directories, l for symbolic links)
  –  Remaining 9 characters indicate permissions for the file owner (user), group, and other (everyone else on the system)

$ ls -l
total 4
-rw-r----- 1 someuser somegrp 1032 Oct 30 12:28 myfile.dat

File Permissions

• Permission groups are three characters
  –  In order: read permission, write permission, execute permission
  –  The letter (r, w, or x) indicates the permission is granted
  –  A hyphen means it is denied

Changing Permissions

• A file’s permission setting is also called its “mode”
• Thus, permissions are changed with the chmod (CHange MODe) command
• Two ways to use it
  –  Symbolic (easier, but less flexible)
  –  Octal (harder to understand, very powerful)

Changing Permissions-Symbolic

• Symbolic: chmod [entity] [+-=] [permission] [file(s)]
  –  entity is “u”, “g” or “o”
  –  + adds permission, - removes it, = sets it explicitly
  –  permission is “r”, “w” or “x”
  –  Multiple entity/operation/permission groups can be specified (if so, they’re comma delimited)
  –  One or more files can be specified
• Examples
  chmod u+rwx myfile.dat
  chmod o-wx myfile.dat
  chmod u=rwx,g=rx,o=r myfile.dat

Changing Permissions-Octal

• Sets (potentially different) user, group, and other permissions all at once
  –  The three numbers represent user, group, and other
• To determine the different numbers, determine what permissions you want for each entity
  –  Start with 0 and add 4 for read, 2 for write, and 1 for execute
• Let’s say we want read, write, execute for user, read and execute for group, and execute only for other

Changing Permissions-Octal

  User    Read (4)    Write (2)    Execute (1)   = 7
  Group   Read (4)    (No Write)   Execute (1)   = 5
  Other   (No Read)   (No Write)   Execute (1)   = 1

  chmod 751 myfile.dat

Changing Permissions-Octal

• We use 4, 2, and 1 because we’re essentially converting each permissions group into a 3-bit binary number
  –  If a permission is present, we use a 1. If not, use a 0.
  –  So…

  2^2 (= 4)   2^1 (= 2)   2^0 (= 1)
  Read (r)    Write (w)   Execute (x)

  Symbolic Representation   Binary Representation   Calculation   Value
  r--                       100                     4 + 0 + 0     4
  r-x                       101                     4 + 0 + 1     5
  rw-                       110                     4 + 2 + 0     6
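The arithmetic can be tried on a scratch file. This sketch (sh/bash syntax) builds the octal mode for rwx / r-x / --x by adding 4, 2, and 1, applies it with chmod, and reads the result back from ls -l. The names workdir, user, group, other, mode, and perms are introduced here for illustration.

```shell
workdir=$(mktemp -d)
touch "$workdir/myfile.dat"

user=$((4 + 2 + 1))     # read + write + execute
group=$((4 + 0 + 1))    # read + execute
other=$((0 + 0 + 1))    # execute only
mode="${user}${group}${other}"

chmod "$mode" "$workdir/myfile.dat"
# Keep only the first 10 characters of the long listing: file type + 9 permission bits.
perms=$(ls -l "$workdir/myfile.dat" | cut -c1-10)
echo "chmod $mode gives $perms"
```

The symbolic form chmod u=rwx,g=rx,o=x sets exactly the same bits.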

Understanding Permissions

Given the permission string

  -rwxr-x---

• File type is “-”, meaning this is a normal file
• User has read, write, and execute permission (rwx): 4 + 2 + 1 = 7
• Group has read and execute permission, but not write permission (r-x): 4 + 0 + 1 = 5
• Other has no permissions (---): 0 + 0 + 0 = 0

The octal mode is therefore 750.

Changing Permissions

• Symbolic method
  –  Usually easier to grasp
  –  Can be explicit (=), however, usually not explicit (+/-)
     •  Need to know current permissions
     •  May require several iterations to get what you want
• Octal method
  –  Can be more difficult to grasp
  –  Always explicitly sets permissions
  –  It’s the winner in terms of brevity

  chmod u=rwx,g=rx,o=x myfile.dat
  chmod 751 myfile.dat

File Permissions

• For any given user on a system, only one set of permissions on a given file applies
  –  If you’re the owner, the “user” permissions apply
  –  If you’re not the owner but in the group, “group” permissions apply
  –  Otherwise, “other” permissions apply
• If a file owner sets permissions of 077, everyone on the system except them has full access to the file
  –  But they can chmod it to something more useful
• File permissions don’t apply to root
  –  root can access mode 000 files

File Permissions

• Extended permissions
  –  You may see other letters (s, S, t, or T)
  –  These relate to extended permissions
     •  Set User ID (setuid), Set Group ID (setgid), “sticky bit”
  –  These are a bit more advanced, but still not very difficult to understand

Extended Permissions

• Set User ID (setuid)
  –  Typically meaningless for directories
  –  Executables run under the uid of their owner instead of the person running the program
• Set Group ID (setgid)
  –  For directories, files/directories created in the directory inherit its group; subdirectories also inherit the setgid bit
  –  For executables, similar to Set User ID
• “Sticky Bit” or restricted deletion flag
  –  For directories, only owners can delete files, even in world-writable directories
  –  For files, essentially meaningless

Extended Permissions

• These show up in place of the execute permission
  –  Setuid in place of user’s execute permission
  –  Setgid in place of group’s execute permission
  –  Sticky bit in place of other’s execute permission
• Files/directories need execute permission set for these special permissions to make sense
  –  Because setuid and setgid have an impact when a program/script is executed, and because directories need execute permission for you to cd into them
  –  If execute permission is set, these are represented by a lowercase s (or t). If not, it’s an uppercase S (or T)

Changing Extended Permissions

• Octal mode
  –  Use 4 numbers instead of three
  –  First sets special permissions
     •  4 = setuid
     •  2 = setgid
     •  1 = sticky bit
  –  Second, third, and fourth numbers are user, group, and other permissions (respectively)
• Symbolic mode
  –  u+s for setuid
  –  g+s for setgid
  –  o+t for sticky bit

Changing Extended Permissions

$ mkdir mydir; chmod 7767 mydir; ls -l
total 4
drwsrwSrwt 2 user12 group34 4096 Nov 12 13:28 mydir

$ chmod g+x mydir; ls -l
total 4
drwsrwsrwt 2 user12 group34 4096 Nov 12 13:28 mydir

• The initial permissions are nonsensical, but demonstrate extended permissions (and the effect of the absence of execute permission)
• In the second command, we added group execute permission. That makes things look much more normal.

Umask

• Users have a “umask” which controls default permissions for newly created files
  –  It represents permissions that are “blocked” or subtracted for newly created files
  –  It is NOT a default permission setting
• For example, a umask setting of 027 means:
  –  New files will not have group write permission or any “other” permissions
  –  Nothing else is known…we don’t know the state of any user permission or group read/execute permission
• Settings of 077, 027, and 007 are fairly common

Umask

• The umask does not have a permanent effect on a file’s permissions
  –  It only affects a file’s initial permissions
  –  They can be changed with chmod

Umask

• To better visualize how umask works, think of the permissions and umask in binary
• A umask bit of 1 is a wall…it won’t let the permission bit through.
• A umask bit of 0 is a space that will let it through.
• Or, if you think in terms of logical operations, the resulting permissions are the bitwise AND of the default permissions and the negated umask
  –  In other words, A & ~B (A AND NOT B)
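The bitwise rule can be checked with shell arithmetic and then observed on real files. This is a sketch in sh/bash syntax; the 775 starting value is the one used in the slides for illustration, while a typical Linux system starts from 666 for new files and 777 for new directories (consistent with the umask 0000 listing later in this section). The names result, workdir, file_perms, and dir_perms are introduced here.

```shell
# A leading 0 makes these octal constants in shell arithmetic.
result=$(printf '%03o' $(( 0775 & ~0027 )))
echo "775 masked by 027 gives $result"

# The same effect on real files, run in a subshell so the caller's umask is untouched.
workdir=$(mktemp -d)
( umask 027; touch "$workdir/afile"; mkdir "$workdir/adir" )
file_perms=$(ls -ld "$workdir/afile" | cut -c1-10)
dir_perms=$(ls -ld "$workdir/adir" | cut -c1-10)
echo "$file_perms afile"
echo "$dir_perms adir"
```

The file ends up without execute permission even though the umask does not block user execute, which is a reminder that the umask only removes bits and never adds them.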

Visualizing Umask

  Description             Octal Value   Visualization (binary)
  Default Permissions     775           1 1 1  1 1 1  1 0 1
  Umask                   027           0 0 0  0 1 0  1 1 1
  Resultant Permissions   750           1 1 1  1 0 1  0 0 0

Umask Examples

• The umask command is used to display your current setting or to set a new mask

$ umask
0027

$ umask 077

$ umask
0077

Umask Examples

• Umask can be difficult to master. Play around a bit.

$ umask
0000

$ touch afile; mkdir mydir

$ ls -l
total 4
-rw-rw-rw- 1 someusr somegrp    0 Oct 31 10:54 afile
drwxrwxrwx 2 someusr somegrp 4096 Oct 31 10:54 mydir

$ umask 027

$ touch afile2; mkdir mydir2; ls -l
total 8
-rw-rw-rw- 1 someusr somegrp    0 Oct 31 10:54 afile
-rw-r----- 1 someusr somegrp    0 Oct 31 10:54 afile2
drwxrwxrwx 2 someusr somegrp 4096 Oct 31 10:54 mydir
drwxr-x--- 2 someusr somegrp 4096 Oct 31 10:54 mydir2

See? I told you ; was useful

Links

• Links are ways to refer to a file by multiple names
  –  Similar to shortcuts in Windows and aliases in MacOS
• Two types
  –  Hard
  –  Symbolic
     •  Sometimes called soft links, but usually shortened to “symlinks”

Links

• A bit of technical detail…
  –  Think of the file on disk as simply a bunch of bytes at a particular “address”
  –  The filename is a “pointer” to that address
  –  The address is called an inode
• A hard link creates an additional pointer to an address/inode
• A symbolic link creates a pointer to a filename
  –  A pointer to the pointer to the address

Links

[Figure: inode 12345 holds the file’s data along with its uid/gid/timestamps/etc. FilenameA and FilenameB are hard links that point directly to the inode; SymlinkA is a symbolic link that points to a filename rather than to the inode.]

Links

• Create a link with ln [-s] filename linkname
• The -s is optional
  –  If it’s there, it’s a symbolic link
  –  If not, it’s a hard link
• WARNING: Don’t reverse the arguments!
  –  You can trash your file!
  –  Modern-day OSes will usually refuse, but don’t rely on this
  –  My mnemonic for ln [-s] X Y: “Create a link to X named Y”

Links

• Hard links must be within a single filesystem
  –  Why? Because inodes are unique to a filesystem
• A hard link’s target can’t be a directory (this is permissible for symlinks)
• Symlinks can span filesystems
  –  Why? Because they point to file names, not inodes
• What about permissions?
  –  Hard links look just like regular files
  –  Symlinks have unique permissions: lrwxrwxrwx
     •  But they use the target’s permissions

Links

$ ls -il
total 8
37930848 -rw-r--r-- 1 someusr somegrp 7 Nov 4 15:59 afile

$ ln afile alink

$ ls -il
total 16
37930848 -rw-r--r-- 2 someusr somegrp 7 Nov 4 15:59 afile
37930848 -rw-r--r-- 2 someusr somegrp 7 Nov 4 15:59 alink

$ ln -s afile alink2

$ ls -il
total 24
37930848 -rw-r--r-- 2 someusr somegrp 7 Nov 4 15:59 afile
37930848 -rw-r--r-- 2 someusr somegrp 7 Nov 4 15:59 alink
37931526 lrwxr-xr-x 1 someusr somegrp 6 Nov 4 16:02 alink2 -> afile

Links

• Once a hard link is created, you cannot tell the difference between the link and the original file
  –  …because there is no difference
  –  You can delete either, and the other remains
  –  If you edit one, you’ll see those edits in the other
  –  A file isn’t truly “deleted” until all hard links are removed
     •  Technically, it’s not even deleted then, but recovery is very difficult
     •  Thus, rm technically removes a hard link, not a file
     •  The ‘delete’ operation in some languages is even called ‘unlink’
• If you delete the target for a symbolic link, it becomes a “hanging” symlink (it points to nothing)
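These behaviors are easy to reproduce in a scratch directory. The sketch below (sh/bash syntax) reuses the afile, alink, and alink2 names from the earlier listing; workdir and the result variables are introduced here for illustration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "hello" > afile
ln afile alink          # hard link: a second name for the same inode
ln -s afile alink2      # symbolic link: a pointer to the name "afile"

echo "more" >> alink    # edit through the hard link...
via_original=$(cat afile)   # ...and the change is visible through the original name

rm afile                # removes one name; the data survives under the other
via_hard=$(cat alink)

# -L is true for any symlink; -e follows the link and fails if the target is gone.
if [ -L alink2 ] && [ ! -e alink2 ]; then
    symlink_state="hanging"
else
    symlink_state="ok"
fi
echo "hard link still reads: $via_hard"
echo "symlink state: $symlink_state"
```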

Combining Commands

• Sometimes the output of one command is needed as the input to another
  –  You could do this with file output/input redirection, but that’s inefficient and confusing
• This is done with the | operator
  –  Vertical bar; typically same key as \
• | is sometimes called “pipe”, so we are said to “pipe one command into another”
  –  Get the plumbing reference? The pipe takes the data from one point to another.

Combining Commands

• Let’s say you want to list a huge directory
  –  A normal ls -l would send everything by too quickly
  –  If only we had a command to display a screenful of data at a time…(hint: more)
• So, ls -l | more will display the long directory listing a screenful at a time.
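Pipes can be chained, with each command reading the previous command’s output. A small non-interactive sketch in sh/bash syntax (the fruit names and variable names are made up for the example):

```shell
# Three-stage pipeline: produce lines, sort them, keep only the first.
first=$(printf 'pear\napple\nfig\n' | sort | head -n 1)
echo "first after sorting: $first"

# Count the entries in / by piping a one-per-line listing into wc.
entries=$(ls -1 / | wc -l | tr -d ' ')
echo "entries in /: $entries"
```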

The über command: find

• One of the most useful/powerful commands is find
• Ostensibly, it searches for files, but it is extremely flexible in the search criteria it can use
• An extremely useful feature is its ability to take (and run) commands
  –  Often provides a safer alternative than recursion
• Example: find all files in /data/dir1 and /data/dir2 owned by phil and give them to sue
  find /data/dir[12] -user phil -exec chown sue {} \;
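Changing ownership normally requires root, so the sketch below (sh/bash syntax) uses the same find … -exec pattern with chmod instead of chown, on a scratch tree created for the purpose. The directory and file names are made up for the example; -type f and -name are standard find tests that select regular files by name pattern.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/dir1" "$workdir/dir2" "$workdir/dir3"
touch "$workdir/dir1/a.dat" "$workdir/dir2/b.dat" \
      "$workdir/dir2/notes.txt" "$workdir/dir3/c.dat"

# Search only dir1 and dir2 for regular files ending in .dat and
# run chmod 600 on each one; {} is replaced by the file found.
find "$workdir"/dir[12] -type f -name '*.dat' -exec chmod 600 {} \;

n_matches=$(find "$workdir"/dir[12] -type f -name '*.dat' | grep -c .)
a_perms=$(ls -l "$workdir/dir1/a.dat" | cut -c1-10)
echo "$n_matches files matched; a.dat is now $a_perms"
```

Running the find without -exec first, as in the n_matches line, is a cheap way to preview what a command would touch.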

94

Searching Within Files

• To search for text within files, use the grep command (Global Regular Expression Print)
• As its name implies, grep uses regular expressions
  – There are more advanced versions, such as egrep
  – Very involved topic…an “exercise left to the student”
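As a starting point for that exercise, here is a minimal sketch of two grep searches. The log file and its contents are made up; -c (count matching lines) and -E (extended regular expressions, the behavior egrep provides) are standard grep options.

```shell
workdir=$(mktemp -d)
printf 'error: disk full\nok: job 12 done\nerror: timeout\nok: job 7 done\n' \
    > "$workdir/run.log"

# ^ anchors the pattern to the start of the line; -c counts matching lines.
error_count=$(grep -c '^error' "$workdir/run.log")

# -E enables extended regular expressions: [0-9]{2} means exactly two digits.
two_digit_jobs=$(grep -E 'job [0-9]{2} ' "$workdir/run.log")

echo "error lines: $error_count"
echo "two-digit job lines: $two_digit_jobs"
```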

Process Management

96

Process Management

• Everything running on a Linux/Unix system is a process
  – The system views it by its process id or pid
  – We can interact with it via its pid
• Sometimes we need to interact with processes
  – Move them to higher/lower priority
  – Stop (kill) them

97

Process Management

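• Foreground processes
  – “Block” until they finish (the shell waits before giving you the prompt back)
  – High priority
• Background processes
  – Lower priority (this depends on the shell; some shells lower the priority of background jobs, bash by default does not)
  – Run when they find available cycles
  – We can run other commands while background processes are still running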

98

Process Management

• Most commands are foreground processes
• To make it background:
  – Append & when starting the command (./sometask &)
  – If it’s already running, press Ctrl-Z to pause it, then type bg to move it to the background
• To move a background process to the foreground, type fg
  – You can only foreground 1 process. If multiple are backgrounded you need to tell the system which one (e.g. fg %2 for job number 2)
• To view backgrounded processes, type jobs
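The jobs, fg, and bg commands are meant for interactive use, but the basic idea of backgrounding can be sketched in a script. Here sleep stands in for a long-running task, and the variable names are illustrative only.

```shell
# The trailing & starts the command in the background; the shell does not wait.
sleep 1 &
bg_pid=$!    # $! holds the pid of the most recently started background process

# Signal 0 sends nothing; kill -0 only checks that the process exists.
if kill -0 "$bg_pid" 2>/dev/null; then
    still_running=yes
else
    still_running=no
fi
echo "background pid $bg_pid running: $still_running (the shell is free meanwhile)"

# wait blocks until the background process ends and returns its exit status.
wait "$bg_pid"
bg_status=$?
echo "background task finished with status $bg_status"
```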

99

Process Management

• All processes are started by other processes
  – The process that started it is called the parent process
• Processes can also be viewed with the ps command
  – Shows process ID (pid) & the parent process id (ppid)
• The top-level process (pid 1) is traditionally called init; on many current Linux distributions this role is filled by systemd

100

Process Management

• Common options to ps on most Linux systems are:
  – f: show various details including pid and ppid
  – e: show Everyone’s processes
  – u: show processes belonging to a given user
• Show all processes (can be lots of output)
  $ ps -ef
• Show processes belonging to user “john”
  $ ps -fu john

101

Process Management-Signals

• We can communicate with processes (and they can communicate with each other) via signals
• Some signals happen automatically (e.g. floating-point exceptions)
• Some are explicitly done by users via the kill command
  – Somewhat a misnomer…kill is but one of many signals
  – Syntax: kill [-signal] pid
  – Special cases: pid <= 0 (pid 0 signals every process in the current process group; a negative pid signals a whole process group, with -1 meaning every process you are allowed to signal)

102

Process Management-Signals

• Your scripts/programs can interact with some signals
  – We say they “trap” the signal
  – Lets you do things like perform orderly shutdown, etc.
• Not all signals can be trapped (SIGKILL and SIGSTOP cannot)
• You can often list signals with kill -l
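A minimal sketch of trapping a signal in a shell script, using the shell's built-in trap command. The child script is passed inline to sh -c so it can safely signal itself; the message text is invented for this example.

```shell
# Inside the child shell, $$ is the child's own pid.
result=$(sh -c '
    # When SIGTERM arrives, run this handler instead of dying immediately.
    trap "echo caught TERM, cleaning up; exit 0" TERM

    kill -TERM $$          # send SIGTERM to ourselves
    echo "not reached"     # the handler exits before this line runs
')

echo "child said: $result"
```

A real script would put its cleanup steps (removing scratch files, writing a checkpoint, and so on) in the handler.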

103

Process Management-Signals

$ kill -l
 1) SIGHUP       2) SIGINT       3) SIGQUIT      4) SIGILL
 5) SIGTRAP      6) SIGABRT      7) SIGBUS       8) SIGFPE
 9) SIGKILL     10) SIGUSR1     11) SIGSEGV     12) SIGUSR2
13) SIGPIPE     14) SIGALRM     15) SIGTERM     16) SIGSTKFLT
17) SIGCHLD     18) SIGCONT     19) SIGSTOP     20) SIGTSTP
21) SIGTTIN     22) SIGTTOU     23) SIGURG      24) SIGXCPU
25) SIGXFSZ     26) SIGVTALRM   27) SIGPROF     28) SIGWINCH
29) SIGIO       30) SIGPWR      31) SIGSYS      34) SIGRTMIN
35) SIGRTMIN+1  36) SIGRTMIN+2  37) SIGRTMIN+3  38) SIGRTMIN+4
39) SIGRTMIN+5  40) SIGRTMIN+6  41) SIGRTMIN+7  42) SIGRTMIN+8
43) SIGRTMIN+9  44) SIGRTMIN+10 45) SIGRTMIN+11 46) SIGRTMIN+12
47) SIGRTMIN+13 48) SIGRTMIN+14 49) SIGRTMIN+15 50) SIGRTMAX-14
51) SIGRTMAX-13 52) SIGRTMAX-12 53) SIGRTMAX-11 54) SIGRTMAX-10
55) SIGRTMAX-9  56) SIGRTMAX-8  57) SIGRTMAX-7  58) SIGRTMAX-6
59) SIGRTMAX-5  60) SIGRTMAX-4  61) SIGRTMAX-3  62) SIGRTMAX-2
63) SIGRTMAX-1  64) SIGRTMAX

104

Process Management-Common Signals

Description               Name      #    Meaning
Hangup                    SIGHUP    1    Tell process that user terminal went away
Interrupt                 SIGINT    2    Tell process that user typed Ctrl-C
Quit                      SIGQUIT   3    Similar to SIGINT, but dump a core file
Abort                     SIGABRT   6    Program detected error and is aborting
Kill                      SIGKILL   9    Go away, period. (Can’t be trapped)
Terminate                 SIGTERM   15   Ask nicely for program to stop.
Illegal instruction       SIGILL    ?    CPU encountered an unknown instruction
Floating-point exception  SIGFPE    ?    Math error, like divide by 0
Segmentation fault        SIGSEGV   ?    Illegal memory access
User-defined              SIGUSR1   ?    User-defined signals
User-defined              SIGUSR2   ?    User-defined signals

105

Process Management-Signals

• SIGKILL, SIGTERM, and SIGINT all essentially tell a job to end, but in varying ways
  – SIGKILL ends it no matter what
  – SIGTERM and SIGINT are more “friendly”
  – SIGTERM is the default signal for the kill command
• Ctrl-C is handy to stop a command from running (e.g. when you realize you didn’t mean to type it)
  – This sends SIGINT
• When suspending a process prior to backgrounding it, type Ctrl-Z
  – Sends the interactive stop signal, SIGTSTP
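A short sketch of the difference between the default SIGTERM and SIGKILL, again using sleep as a stand-in task. When a process dies from a signal, the shell reports an exit status of 128 plus the signal number, which is how the two cases can be told apart afterwards.

```shell
# Default signal: kill with no signal named sends SIGTERM (15).
sleep 30 &
pid=$!
kill "$pid"
term_status=0
wait "$pid" || term_status=$?    # 128 + 15

# Explicit SIGKILL (9): cannot be trapped or ignored by the target.
sleep 30 &
pid=$!
kill -KILL "$pid"
kill_status=0
wait "$pid" || kill_status=$?    # 128 + 9

echo "after SIGTERM: $term_status, after SIGKILL: $kill_status"
```

In practice, try SIGTERM first so the program has a chance to shut down cleanly, and reach for SIGKILL only if it does not respond.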

106

File Compression

• Sometimes we want to compress files to save storage space.
• Several utilities are provided
  – gzip/gunzip – GNU ZIP
  – zcat – like gunzip, but sends the uncompressed output to stdout (the compressed file is left as-is)
  – bzip2/bunzip2 – Another compression tool
• These tools only compress; they don’t combine multiple files into one (à la zip on Windows)
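A minimal sketch of a gzip round trip on a scratch file. The file name is invented; gzip -dc is used here to write the uncompressed data to stdout, which is the job zcat does on Linux systems.

```shell
workdir=$(mktemp -d)
seq 1 1000 > "$workdir/numbers.txt"
original_size=$(wc -c < "$workdir/numbers.txt" | tr -d ' ')

# gzip replaces numbers.txt with numbers.txt.gz
gzip "$workdir/numbers.txt"
compressed_size=$(wc -c < "$workdir/numbers.txt.gz" | tr -d ' ')

# Read the contents without uncompressing the file on disk.
line_count=$(gzip -dc "$workdir/numbers.txt.gz" | wc -l | tr -d ' ')

# gunzip restores numbers.txt and removes the .gz file
gunzip "$workdir/numbers.txt.gz"

echo "original: $original_size bytes, compressed: $compressed_size bytes"
echo "lines seen through the compressed file: $line_count"
```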

107

Archiving Files

• Sometimes you may want to archive a directory (a code build, etc.) as a single file

• This can be done with the tar command (Tape ARchiver)

• Combines all files into a single one (or extracts files from an existing .tar file)

• Supports compression (e.g. the z option for gzip), or you can run tar and then gzip/bzip2 on the result.
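A sketch of archiving a small directory tree and restoring it somewhere else. The "build" directory and its contents are invented for illustration; c, t, x, z, f, and -C are standard tar options.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/build/src"
echo 'int main(void){return 0;}' > "$workdir/build/src/main.c"
echo 'all: main' > "$workdir/build/Makefile"

# c = create, z = compress with gzip, f = archive file name
# -C changes into the given directory first, so paths in the archive start at build/
tar -czf "$workdir/build.tar.gz" -C "$workdir" build

# t = list the contents without extracting
listing=$(tar -tzf "$workdir/build.tar.gz")

# x = extract, here into a separate directory
mkdir "$workdir/restore"
tar -xzf "$workdir/build.tar.gz" -C "$workdir/restore"
restored=$(cat "$workdir/restore/build/Makefile")

echo "$listing"
echo "restored Makefile: $restored"
```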

108

File Ownership

• All files have owner and group attributes
  – Traditionally it’s only one group. Some advanced filesystems (context #3 from the earlier slide) may support multiple ones.
• Group can be changed with the chgrp (CHange GRouP) command
  $ chgrp orstaff somefile.dat
• Owner can be changed with chown (but only root can do this)
  $ chown user234 somefile.dat

109

String Processing

• A very common task (both in system administration and day-to-day use) is string processing
  – Many things are stored in configuration files
  – Usually we only need a part of the data
• Often we need to manipulate this data
  – Only display a part of it
  – Make edits/combine fields/change case
• We have several utilities that can do this
• We can also do some by combining basic commands

110

String Processing

• Many tasks require processing text files/searching strings/etc.

• There are several utilities to do this
  – sed – Stream EDitor
  – awk – (first letter of each author’s last name)
  – perl
  – And many more
• Each of these could be a class in itself
  – They all involve regular expressions (remember that term?)
  – The web and reference books are your friends

111

String Processing Example

• Assume we have a file full of address information. It contains lines of the form:
  FirstName:LastName:Address:City:State:ZIP:PhoneNumber

• We have been asked to provide the various ZIP codes from that file

112

String Processing Example

• The cut command breaks a string into fields based on a given delimiter (given via the -d option) and returns only the fields you want
• Typically used in conjunction with another command like cat
  – In our example, the ZIP Code is the 6th field:
  cat addresses.lst | cut -d':' -f6

113

String Processing Example

• That’s great, but it’d be really nice if the ZIPs were in order…

• The sort command can do this
  – Can do either numeric (-n option) or alphabetic (default)
  – Can take a delimiter and a “key” field (i.e. the one on which to sort)
    • Thus, you’re sorting on the key, not the whole string
  cat addresses.lst | cut -d':' -f6 | sort -n

114

String Processing Example

• Even better, but I really only need each code once (right now I see lots of duplicates)

• uniq to the rescue
  – Given a stream of data, returns unique lines
  – It only collapses duplicates that are next to each other, which is why it comes after sort
  cat addresses.lst | cut -d':' -f6 | sort -n | uniq

115

Combining Commands

• Remember how I said Linux has “single-purpose” commands? And that they can be combined to form more complex commands?

• We just took 4 basic commands and created a much more complex one
  – cat to display a file
  – cut to remove unnecessary data
  – sort to put it in a more user-friendly order
  – uniq to remove duplicates
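The whole pipeline can be tried on a small sample file. The names, addresses, and phone numbers below are made up for this sketch; only the colon-separated layout comes from the slides.

```shell
workdir=$(mktemp -d)
cat > "$workdir/addresses.lst" <<'EOF'
Ann:Lee:1 Main St:Oak Ridge:TN:37830:865-555-0101
Bob:Ray:9 Elm Ave:Knoxville:TN:37996:865-555-0102
Cy:Dow:4 Pine Rd:Oak Ridge:TN:37830:865-555-0103
Di:Fox:7 Bay Ct:Atlanta:GA:30301:404-555-0104
EOF

# The pipeline from the slides: display, keep field 6, sort numerically, drop duplicates.
zips=$(cat "$workdir/addresses.lst" | cut -d':' -f6 | sort -n | uniq)

# A shorter equivalent: cut can read the file itself, and sort -u drops duplicates.
zips_short=$(cut -d':' -f6 "$workdir/addresses.lst" | sort -n -u)

echo "$zips"
```

The duplicate 37830 appears only once in the result, and both forms of the pipeline agree.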

116

Graphical Tools

• On desktop systems, you use many graphical tools:
  – Window managers
  – File managers
  – GUIs for various system functions
  – Web/Email
• The underlying graphical system is called the X Window System, commonly called X11
• 10 years ago, setting this up was a nightmare
• Now everything basically works out of the box

117

Graphical Tools

• You’ll also use GUIs on HPC systems, albeit to a much lesser extent
  – Debuggers
  – Profiling tools
• X11 traffic can be automatically forwarded over ssh (e.g. with ssh -X)
  – Easy to configure
  – Secure, but…
  – S…..L…..O…..W

118

Graphical Tools

• Tools exist to mitigate slow X11 traffic
  – FreeNX (One implementation is http://www.nomachine.com)
  – VNC
• These do require support on the server-side and some configuration on the client
  – Contact your friendly administrator or user support team for help

119

Batch Systems

• On a desktop system, you’re usually the only user; on an HPC system, you’re sharing resources

• Compute resources have to be ‘scheduled’ to some extent

• This is done through a Batch Queue System
• There are many Batch Queue Systems in use
  – MOAB/Torque
  – PBS
  – LoadLeveler
  – LSF

120

Batch Systems

• Batch systems provide controlled access to resources

• Several types of commands are provided
  – Submit a job
  – Cancel a job
  – Check status of the queue or of a specific job

• Commands differ between systems, but the basics are the same

121

Batch Systems

• When you want to run a job, you create a text file called a batch script.

• A batch script is a shell script that essentially lists all the commands you need to run as part of your job
  – Copying files into/out of the run directory
  – The command to launch the job
• They also contain metadata about the job
  – How long it will run
  – How many processors
  – What account/project to charge
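As a rough sketch only, this is what a batch script could look like on a PBS/Torque-style system. The #PBS lines are that family's way of carrying the metadata listed above; other batch systems use different directives. The job name, project ID (PROJ123), program name (my_app), and process count are all hypothetical, and the launch command is only printed here rather than run.

```shell
#!/bin/bash
# Metadata for the scheduler (PBS/Torque style; plain comments to the shell):
#PBS -N demo_job
#PBS -A PROJ123
#PBS -l walltime=00:30:00
#PBS -l nodes=2

# PBS/Torque sets PBS_O_WORKDIR to the directory the job was submitted from.
# Fall back to the current directory so this sketch also runs outside a batch system.
run_dir="${PBS_O_WORKDIR:-$PWD}"
cd "$run_dir" || exit 1

# The command that would launch the job; your system's launcher may differ.
launch_cmd="mpirun -n 32 ./my_app input.dat"
echo "would launch from $run_dir: $launch_cmd"
```

On such a system the script would be handed to the scheduler with its submit command; consult your site's user guide for the exact directives and launcher.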

122

Batch Script

• You “submit” your batch script to the system
• The Batch Queue System contains a scheduler that determines when your job will run based on…
  – Some priority scheme
  – When your job can “fit” among other jobs
  – Etc.
• Accurate estimates of runtime and processor needs are important to make the scheduler efficient
• Further discussion gets very system specific…consult user guides for those specific systems

123

Resources

• http://www.doc.ic.ac.uk/~wjk/UnixIntro
• Linux in a Nutshell (O’Reilly Media)
  – Current version is old, perhaps a new one is coming soon

124

More Advanced Topics (for your own investigation)

• Editors (nano, emacs, vi, and more)
• Special files
  – /dev/null
  – /dev/random
  – /dev/zero
• Email/web tools
• Scheduling tasks with at and cron
• Regular expressions
• System administration
• Advanced scripting
• Package managers (rpm, dpkg, yum, etc.)
• More advanced batch systems

125

Any Final Questions?