workshop on command line tools - day 1
I Workshop on command-line tools
(day 1)
Center for Applied Genomics
Children's Hospital of Philadelphia
February 12-13, 2015
Arguments
Come after the name of the program
Example:
cat file.txt (1 argument)
cut -f2 file.txt (2 arguments)
The number of spaces between arguments doesn't matter:
cut     -f2      file.txt
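A minimal sketch of the point about spaces, assuming a small tab-separated file.txt that is created only for this demonstration (its contents are invented):

```shell
# Work in a throwaway directory so nothing real is touched
workdir=$(mktemp -d)
cd "$workdir"
printf 'chr1\t100\nchr2\t200\n' > file.txt   # toy data, 2 columns

# The shell splits the line on whitespace, so both calls hand cut
# exactly the same two arguments: -f2 and file.txt
one_space=$(cut -f2 file.txt)
many_spaces=$(cut     -f2        file.txt)

echo "$one_space"
echo "$many_spaces"
```

Both echo commands should print the same second column.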
some tips (i)
Use <Tab> to auto-complete your commands or file/directory names
To search old commands, you can use the ↑ and ↓ arrows on your keyboard
some tips (ii)
The command "history" prints a list of your most recent commands
Use ! to run the last command starting with…
Example:
!grep
This will run the last command starting with grep
Special characters (i)
^ : beginning of line
$ : end of line or beginning of variable name
? : any character (with one occurrence)
* : any character (with 0 or more occurrences)
# : start comments
[ ] : define sets of characters
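Special characters (ii)
" " : define strings (variables inside are expanded)
' ' : define strings (everything inside is taken literally)
- : start a parameter
` ` : run a command and substitute its output
; : separate commands
| : "pipe" commands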
Special characters (iii)
~ : home directory
/ : separate internal directories
\ : escape character
\n : new line (Linux and macOS)
\r : carriage return (new line on classic Mac OS; Windows uses \r\n)
\t : tab
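A short sketch of a few of these special characters in action. The file names and the variable name are made up for the example, and everything runs in a temporary directory:

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch bla.txt ble.txt blah.txt notes.md   # empty toy files

# ? matches exactly one character: bla.txt and ble.txt, not blah.txt
one_char=$(echo bl?.txt)

# * matches zero or more characters: all three .txt files
any_chars=$(echo *.txt)

# Double quotes expand variables; single quotes keep the text literal
name="CAG"
double="Hello, $name"
single='Hello, $name'

# ; separates two commands, | feeds the first command's output to the next
count=$(echo "$any_chars" | wc -w); echo "$count files end in .txt"

echo "$one_char"
echo "$double"
echo "$single"
```

Note that ? and * are used here as file name patterns (globs); inside tools such as grep the same characters follow regular expression rules instead.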
First steps
pwd # where am I?
whoami # who am I?
id <your_username> # what can I do?
date # what time/day is it?
cat - concatenate and print text files
cat file1.txt file2.txt > output.txt
cat *.bed > all.bed
cat -n : shows line numbers
cat -e : shows non-printing characters
echo - write to the standard output
echo Hello, CAG!
echo -e : interprets escape sequences
echo -e "C\tA\tG"
echo -e "C\nA\nG"
echo -n : prints without going to a new line
echo -n "CAG"; echo "123"
echo "CAG"; echo "123"
Redirect output or errors (i)
echo "bla" > bla.txt
echo "ble" > ble.txt
cat bla.txt ble.txt > BLs.txt
echo "bli" >> BLs.txt
echo "blo" > blo.txt
cat blo.txt >> BLs.txt
Redirect output or errors (ii)
cat -n BLs.txt
cat blu.txt >> BLs.txt 2> error.txt
cat error.txt
cat blublu.txt >> BLs.txt 2>> error.txt
cat error.txt
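The redirection operators used in these two slides can be summarized in one small script. This is only a sketch: other.txt, missing.txt and also_missing.txt are hypothetical names chosen for the demo, and the last two are deliberately never created.

```shell
workdir=$(mktemp -d)
cd "$workdir"

echo "bla" > BLs.txt       # > creates the file (or overwrites it)
echo "ble" >> BLs.txt      # >> appends to the end of the file
echo "new" > other.txt
echo "newer" > other.txt   # "new" is gone: > overwrote the file

# missing.txt does not exist, so cat writes a message to standard error.
# 2> sends that message to error.txt instead of the screen.
cat missing.txt >> BLs.txt 2> error.txt || echo "cat failed, as expected"
# 2>> appends a second message instead of replacing the first one
cat also_missing.txt >> BLs.txt 2>> error.txt || echo "cat failed again"

cat -n BLs.txt
cat error.txt
```

After running it, BLs.txt should still hold only its two lines, and error.txt should hold one error message per failed cat.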
ls - list files in directories (i)
ls : list files of current directory
ls workshop : list files in directory workshop
ls -l : in long format
ls -t : list files sorted by time modified
ls -1 : force output to be one entry per line
ls -S : list files sorted by size
ls - list files in directories (ii)
ls -r : reverse the sorting
ls -a : list hidden files (which begin with a dot)
ls -h : show file size human-readable
ls -G : colors output (on Mac; on Linux use ls --color)
We can combine options:
ls -lhrt
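A small sketch of combining options, using two toy files of different sizes (small.txt and large.txt are names invented for this example):

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf 'x' > small.txt                      # 1 byte
printf 'xxxxxxxxxxxxxxxxxxxx' > large.txt   # 20 bytes

# -1 prints one entry per line, -S sorts by size (largest first)
largest_first=$(ls -1S)
# adding -r reverses the sorting (smallest first)
smallest_first=$(ls -1Sr)

echo "$largest_first"
echo "$smallest_first"
ls -lh    # long format with human-readable sizes
```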
ssh - secure shell (access remote servers) (i)
ssh <user>@<server>
ssh -t : forces a terminal on the remote server (needed for interactive programs such as top); the session ends when the command exits
ssh [email protected]
ssh [email protected] -t top
ssh [email protected] -t ls -lh
ssh [email protected] -t ls -lh > my_home_on_respub.txt
ssh - secure shell (access remote servers) (ii)
ssh -p <port> : access a specific port on server
ssh -X : open session with graphic/display options (if you need to open a graphic program in a remote server; e.g. IGV).
alias - "shortcut" for commands
alias <alias> : show what a specific alias stands for
alias ll # ll is not a real command. =)
alias resp='ssh [email protected]'
resp
mkdir - make directory
mkdir bioinfo_files
mkdir workshop_text_files
mkdir workshop123
mkdir -p 2015/February/12
# Suggestion:
# Create names that make sense
cd - change working directory
cd bioinfo_files
cd .. # go to directory above
cd ~ # go to home directory
cd - # go to previous directory
mv - move files and directories
mv bl?.txt workshop_text_files
mv BLs.txt old_file.txt
mv workshop_text_files workshop_files
cp - copy files and directories
cp old_file.txt workshop_files
cp error.txt error_copy.txt
# To copy directories with their contents,
# use -r (recursive)
cp -r workshop_files bioinfo_files/
# Now, try...
cp -r workshop_files/ bioinfo_files/
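The mkdir, cd, mv and cp steps can be tried safely in a scratch directory. The sketch below reuses the file and directory names from the slides, but creates its own toy files, and copy.txt is a name invented for the example:

```shell
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p 2015/February/12            # -p also creates the parent directories
mkdir workshop_files bioinfo_files
echo "bla" > bla.txt
echo "ble" > ble.txt

mv bl?.txt workshop_files            # move both files into a directory
mv workshop_files/bla.txt workshop_files/old_file.txt   # rename a file
cp workshop_files/old_file.txt copy.txt                 # copy a file
cp -r workshop_files bioinfo_files/                     # copy a whole directory

cd 2015/February/12
cd - > /dev/null                     # go back to the previous directory
ls -R                                # list everything, recursively
```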
scp - secure copy files and directories between servers
# Similar to "cp" (in this case, we're uploading)
scp *.txt [email protected]:~/
# To copy directories with their contents,
# use -r (recursive)
scp -r w* [email protected]:~/
# Downloading
scp [email protected]:~/*.txt .
rm - remove files and directories
rm old_file.txt error_copy.txt
# Use -r (recursive) to remove
# directories and their contents
rm -r bioinfo_files/workshop_files/
rm -r 2015
ln - make links (pointers) of files
(it's good to avoid multiple copies)
# hard links keep working if the original
# files are removed
ln workshop_files/old_file.txt hard.txt
# symbolic links break if the original
# files are removed
ln -s workshop_files/old_file.txt symbolic.txt
testing links
echo "hard" >> hard.txt
echo "symbolic" >> symbolic.txt
head hard.txt symbolic.txt
head workshop_files/old_file.txt
rm workshop_files/old_file.txt
head hard.txt symbolic.txt
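The same experiment can be written as a script that checks the outcome instead of reading it off the screen. This is a sketch with a toy old_file.txt created on the spot; the variable names are invented for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo "original" > old_file.txt

ln old_file.txt hard.txt            # hard link: a second name for the same data
ln -s old_file.txt symbolic.txt     # symbolic link: a pointer to the file name

rm old_file.txt

# The hard link still reaches the data; the symbolic link now points nowhere
hard_content=$(cat hard.txt)
if [ -e symbolic.txt ]; then
    symbolic_state="ok"
else
    symbolic_state="broken"
fi

echo "hard.txt contains: $hard_content"
echo "symbolic.txt is $symbolic_state"
```

The test -e follows the link, so it reports a dangling symbolic link as missing even though the link itself is still listed by ls.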
wget - network downloader
wget www.ime.usp.br/~llima/XHMM_results.tar.bz2
wget -c : continue (for incomplete downloads)
wget http://bio.ime.usp.br/llima/GWAS.tar.gz
# after 10%, press Ctrl+C
wget -c http://bio.ime.usp.br/llima/GWAS.tar.gz
tar - archiving
Create an archive:
tar -cvf newfile.tar file1 file2 dir1 dir2
tar -cvf BLs.tar bla.txt ble.txt blo.txt
tar -cvzf BLs.tar.gz bla.txt ble.txt blo.txt
Parameters: c (create), v (verbose), z (gzip), f (file)
tar - archiving
Extract from an archive:
tar -xvzf GWAS.tar.gz
tar -xvjf XHMM_results.tar.bz2
Parameters: x (extract), v (verbose), f (file), z (gzip), j (bzip2)
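A round trip (create, inspect, extract) on toy files may help tie the parameters together. Two options below are not covered in the slides and are added here as assumptions about what is useful: t lists an archive's contents, and -C chooses the directory to extract into.

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo "bla" > bla.txt
echo "ble" > ble.txt

tar -czf BLs.tar.gz bla.txt ble.txt   # c: create, z: gzip, f: archive name
listing=$(tar -tzf BLs.tar.gz)        # t: list the contents without extracting
echo "$listing"

mkdir extracted
tar -xzf BLs.tar.gz -C extracted      # x: extract, -C: into this directory
cat extracted/bla.txt
```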
head - first lines
# first 20 lines
head -n 20 DATA.xcnv
# all lines, excluding last 2
# (on Linux, not Mac)
head -n -2 DATA.xcnv
cut - get specific columns of file
# fields 1 to 3 and 6
cut -f 1-3,6 DATA.xcnv
# other examples
cut -f1 adhd.ped
cut -f1 -d' ' adhd.ped # delimiter = space
# other delimiters: comma, tab, etc.
cut -d, -f1-2 …
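# tab is cut's default delimiter; to name it explicitly in bash, use $'\t'
# (cut rejects '\t' because it is two characters: a backslash and a t)
cut -d$'\t' -f5,7,9 …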
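A quick sketch of the delimiter options on one-line toy inputs (the data is invented). To stay portable across shells, the tab character is produced with printf rather than bash's $'\t':

```shell
tab=$(printf '\t')   # a literal tab character

# Tab is cut's default delimiter, so -d is optional for tab-separated data
default_tab=$(printf 'a\tb\tc\n' | cut -f2)
explicit_tab=$(printf 'a\tb\tc\n' | cut -d "$tab" -f2)

# Comma- and space-separated data need -d
comma=$(printf 'a,b,c\n' | cut -d, -f1-2)
space=$(printf 'a b c\n' | cut -d' ' -f1,3)

echo "$default_tab"
echo "$explicit_tab"
echo "$comma"
echo "$space"
```

The selected fields are printed joined by the same delimiter that was used to split them.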
Using "|" (pipe) to join commands
cut -f 1-3,6 DATA.xcnv | head -n 1
cut -f 1-3,6 DATA.xcnv | less
zcat adhd.ped.gz | less
# Compare (same result? same time?)
zcat adhd.ped.gz | cut -f1 -d' ' | head
zcat adhd.ped.gz | head | cut -f1 -d' '
column - columnate lists
# using white spaces to separate
# and fill columns
column -t DATA.xcnv
column -s # choose separator
sort - sort lines of text files
sort DATA.xcnv
sort -k : choose specific field
sort -n : numeric-sort
sort -r : reverse
# Exercise: show the top 10 CNVs with
# the most targets (column 8)
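Why -n matters is easiest to see on a tiny table. The sketch below uses an invented two-column file, toy.tsv, not DATA.xcnv, so it illustrates the options rather than solving the exercise:

```shell
workdir=$(mktemp -d)
cd "$workdir"
# Toy table (hypothetical values): name <tab> count
printf 'cnvA\t9\ncnvB\t100\ncnvC\t25\ncnvD\t3\n' > toy.tsv

# Without -n, column 2 is compared as text, so "100" sorts before "25"
text_order=$(sort -k2,2 toy.tsv | cut -f1 | paste -sd ' ' -)

# -n compares the values as numbers; -r puts the largest first
numeric_desc=$(sort -k2,2 -n -r toy.tsv | cut -f1 | paste -sd ' ' -)

# head keeps only the top rows
top2=$(sort -k2,2 -n -r toy.tsv | head -n 2 | cut -f1 | paste -sd ' ' -)

echo "text order:         $text_order"
echo "numeric, reversed:  $numeric_desc"
echo "top 2:              $top2"
```

Writing -k2,2 rather than -k2 restricts the sort key to column 2 alone; -k2 would use everything from column 2 to the end of the line.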
uniq - report or filter out repeated lines in a file
cut -f1 DATA.xcnv | sort | uniq
# reporting counts of each line
cut -f5 DATA.xcnv | sort | uniq -c
wc - word, line, character and byte count
wc -l : number of lines
wc -w : number of words
wc -m : number of characters
cut -f5 DATA.xcnv | sort | uniq | wc -l
head -n1 DATA.xcnv | cut -f1 | wc -m
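The sort | uniq | wc pattern is worth trying on a list small enough to check by eye. The sample names below are invented for the sketch:

```shell
# Toy column of sample names, one per line (hypothetical values)
samples=$(printf 's1\ns2\ns1\ns3\ns1\ns2\n')

# uniq only collapses repeats that are adjacent, so without sort
# nothing is removed from this list
unsorted=$(echo "$samples" | uniq | wc -l)

# After sorting, repeats sit next to each other and uniq removes them
distinct=$(echo "$samples" | sort | uniq | wc -l)

# uniq -c prefixes each line with its count; a second, numeric and
# reversed sort ranks the values from most to least frequent
most_common=$(echo "$samples" | sort | uniq -c | sort -n -r | head -n 1)

echo "lines after uniq alone: $unsorted"
echo "distinct samples:       $distinct"
echo "most common:            $most_common"
```

The final pipeline (count, then sort the counts numerically in reverse, then head) is a common way to build "top N" rankings.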
More exercises
1. What are the top 10 samples with the most CNVs?
2. What are the top 5 largest CNVs?
3. What are the top 15 directories using the most space?
vi/vim (text editor) (i)
vi text_file.txt (open "text_file.txt")
i - start editing mode (remember "insert")
ESC - stop editing mode
:w - save file ("write")
:q - quit
:x - save (write) and quit
vi/vim (text editor) (ii)
u - undo
:30 - go to line number 30
:syntax on - syntax highlighting
^ - go to beginning of line
$ - go to end of line
vi/vim (text editor) (iii)
dd - delete current line
d2↓ - delete current line and 2 lines below
yy - copy current line
y3↓ - copy current line and 3 lines below
p - paste lines below current line
grep - finds words/patterns in a file (i)
grep word file.txt
Options:
grep -w : find the whole word
grep -c : returns the number of lines found
grep -f : specifies a file with a list of words
grep -o : returns only the match
grep - finds words/patterns in a file (ii)
grep -A 2 : also show 2 lines after
grep -B 3 : also show 3 lines before
grep -v : shows lines without pattern
grep --color : colors the match
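A few of these options side by side, on a four-line toy file (words.txt and its contents are invented for the sketch):

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf 'gene\ngenes\ngenome\nprotein\n' > words.txt

# Plain grep matches substrings: "gene" and "genes" both contain gene
substring=$(grep -c gene words.txt)

# -w only accepts the pattern as a whole word: just the line "gene"
whole_word=$(grep -c -w gene words.txt)

# -v inverts the match: lines that do NOT contain "gen"
without=$(grep -v gen words.txt)

# -o prints only the matching part of each line, one match per line
only_match=$(grep -o 'gen[eo]' words.txt | sort | uniq -c)

echo "lines containing gene:     $substring"
echo "lines with the word gene:  $whole_word"
echo "lines without gen:         $without"
echo "$only_match"
```

Here [eo] is one of the character sets from the special characters slides: it matches a single e or o.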