jip pipeline system introduction

JIP - PIPELINE SYSTEM
ACCESSIBLE HIGH THROUGHPUT COMPUTING

WHY? SERIOUSLY

• Job Management

• Implementation

• Batch job handling

• Reusable and…

• … documented tools

LOCATIONS

PLEASE TAKE A LOOK

• Documentation http://pyjip.rtfd.org

• Source Code https://github.com/thasso/pyjip

• Examples https://github.com/thasso/pyjip/tree/master/examples

CLI OR API

• Commands to run and submit jobs

• List and query jobs

• Manipulate jobs (delete, archive, cancel, edit,…)

• Cleanup jobs and list profiles and tools

• Start your own server

Commands
========
    run        Locally run a jip script
    submit     submit a jip script to a remote cluster
    bash       Run or submit a bash command

List and query jobs
===================
    jobs       list and update jobs from the job database

Manipulate jobs
===============
    delete     delete the selected jobs
    archive    archive the selected jobs
    cancel     cancel selected and running jobs
    hold       put selected jobs on hold
    restart    restart selected jobs
    logs       show log files of jobs
    edit       edit job commands for a given job
    show       show job options and command for jobs

Miscellaneous
=============
    tools      list all tools available through the search paths
    profiles   list all available profiles
    clean      remove job logs
    check      check job status
    server     start the jip grid server

HELLO WORLD

Let's get started

    #!/usr/bin/env jip
    # Prints hello world

    echo "Hello world"

    #!/usr/bin/env jip
    # Prints hello world using perl

    #%begin command perl
    print "Hello world\n";
    #%end

    #!/usr/bin/env jip

    #%begin command python
    print "Hello world"
    #%end

    @pytool()
    def hello_world():
        """Prints hello world"""
        print "Hello python"

#%begin command [perl|RScript|…]

• command block to run scripts

• specify an interpreter (default bash)

• use templates to access options and variables
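The idea of a command block (render the template, then hand the text to an interpreter) can be sketched in plain Python. This is only an illustration: `run_command_block` is a hypothetical helper, `string.Template` stands in for JIP's template engine, and filters such as `${input|name}` are not handled here.

```python
import subprocess
import sys
from string import Template


def run_command_block(body, options, interpreter=("bash",)):
    """Render ${name} placeholders, then pipe the script to an interpreter.

    Hypothetical helper for illustration; not part of JIP.
    """
    script = Template(body).substitute(options)
    result = subprocess.run(
        list(interpreter),
        input=script,          # the interpreter reads the script from stdin
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Python stands in for the interpreter so the demo does not depend on bash.
output = run_command_block(
    'print("Hello ${who}")',
    {"who": "world"},
    interpreter=(sys.executable,),
)
print(output)
```

Leaving `interpreter` at its default mirrors the "default bash" bullet above, assuming bash is installed.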

OPTIONS AND DOCUMENTATION

• Options are specified in your documentation

• Specify Inputs, Outputs, and other Options

• Options are available as ${variables}

    #!/usr/bin/env jip
    #
    # BWA/Samtools pileup
    #
    # Usage:
    #   pileup.jip -i <input> -r <reference> -o <output>
    #
    # Inputs:
    #   -i, --input <input>          The input file
    #   -r, --reference <reference>  The genomic reference
    #
    # Outputs:
    #   -o, --output <output>        The .bcf output file
    #
    # Options:
    #   --fast                       Enable fast mode
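As a rough illustration of options living in the documentation header, the sketch below pulls option names out of a comment block shaped like the pileup header. The regular expressions and the `parse_options` helper are assumptions made for this example; they are not JIP's own parser.

```python
import re

HEADER = """\
# Inputs:
#   -i, --input <input>          The input file
#   -r, --reference <reference>  The genomic reference
#
# Outputs:
#   -o, --output <output>        The .bcf output file
#
# Options:
#   --fast                       Enable fast mode
"""

# "# Inputs:" style section titles
SECTION = re.compile(r"^#\s*(\w+):\s*$")
# optional short flag, long flag, optional <value> placeholder
OPTION = re.compile(r"^#\s+(?:-(\w),\s+)?--([\w-]+)(?:\s+<(\w+)>)?")


def parse_options(header):
    """Map each long option name to its short flag, section, and kind."""
    options = {}
    section = None
    for line in header.splitlines():
        match = SECTION.match(line)
        if match:
            section = match.group(1).lower()
            continue
        match = OPTION.match(line)
        if match:
            short, long_name, value = match.groups()
            options[long_name] = {
                "short": short,
                "section": section,
                "flag": value is None,  # no <value> means a boolean switch
            }
    return options


options = parse_options(HEADER)
```

Each parsed name is what a template would then reference as `${input}`, `${reference}`, and so on.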

TEMPLATES AND VARIABLES

• Access variables and options ${variable}

• Apply filters:

  • arg: ${bool|arg} ${file|arg(">")}

  • pre / suf: ${input|suf(".txt")}

  • name, ext, and abs: ${input|name|ext}
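To make the filter chain concrete, here is a small Python sketch of what `name`, `ext`, `pre`, and `suf` are assumed to do, based on how the slides use them (for example `${graph|ext}.bw`). The function bodies are guesses written for illustration and are not taken from JIP.

```python
import os


def name(path):
    # Assumed behaviour: keep only the final path component.
    return os.path.basename(path)


def ext(path):
    # Assumed behaviour: strip the last file extension.
    return os.path.splitext(path)[0]


def pre(value, prefix):
    # Assumed behaviour: prepend the prefix only when a value is set.
    return f"{prefix}{value}" if value else ""


def suf(value, suffix):
    # Assumed behaviour: append the suffix only when a value is set.
    return f"{value}{suffix}" if value else ""


def apply(value, *filters):
    """Chain filters left to right, like ${value|f1|f2}."""
    for single_filter in filters:
        value = single_filter(value)
    return value


# Mirrors ${input|name|ext} for a hypothetical input path.
stem = apply("/data/run1/sample.map", name, ext)
print(stem)
```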

SINGLE TOOLS

• Inputs, Outputs, Options

• Phases:

  • init: initialise the tool and its options

  • setup: perform setup using the option values

  • validate: check input files and options

  • execute: execute through the interpreter
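The phase order can be sketched as a tiny lifecycle runner. `Tool` and `run_phases` are hypothetical names for this example, and the point at which option values are assigned (between init and setup) is an assumption read off the bullet text rather than JIP's code.

```python
class Tool:
    """Hypothetical tool that records which phases ran."""

    def __init__(self):
        self.log = []
        self.options = {}

    def init(self):
        # Declare the options the tool knows about.
        self.options["input"] = None
        self.log.append("init")

    def setup(self):
        # Option values are available from here on.
        self.log.append("setup")

    def validate(self):
        if not self.options.get("input"):
            raise ValueError("input is required")
        self.log.append("validate")

    def execute(self):
        self.log.append("execute")


def run_phases(tool, **values):
    """Run the four phases in order; a failed validate stops execution."""
    tool.init()
    tool.options.update(values)
    tool.setup()
    tool.validate()
    tool.execute()
    return tool.log


phases = run_phases(Tool(), input="reads.map")
print(phases)
```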

EXECUTION

• Check all inputs (dependency aware)

• Update the DB and run the command block

• On success:

  • Update DB

• On failure:

  • Remove output

  • Update DB
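The execution flow above can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions: `execute` is a hypothetical function, a plain dict stands in for the job database, the state names are invented, and dependency-aware input checking is reduced to a simple existence test.

```python
import os
import subprocess
import sys
import tempfile


def execute(job_id, command, inputs, output, db):
    """Check inputs, run the command, then record success or failure."""
    missing = [path for path in inputs if not os.path.exists(path)]
    if missing:
        raise FileNotFoundError(f"missing inputs: {missing}")

    db[job_id] = "running"                      # update the DB before running
    returncode = subprocess.run(command).returncode

    if returncode == 0:
        db[job_id] = "done"                     # success: update DB
    else:
        if os.path.exists(output):
            os.remove(output)                   # failure: remove partial output
        db[job_id] = "failed"                   # failure: update DB
    return db[job_id]


# Demo: a command that writes its output and then exits with an error code.
WRITER = "import sys; open(sys.argv[1], 'w').write('x'); sys.exit(int(sys.argv[2]))"
with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "out.txt")
    state = execute("demo", [sys.executable, "-c", WRITER, target, "1"], [], target, {})
    leftover = os.path.exists(target)
print(state, leftover)
```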

GEM TO BED

    #!/usr/bin/env jip
    # Delegates to gem-2-bed to create BED graphs from .map files
    #
    # Usage:
    #   gem2bed -i <input> -I <index>
    #
    # Inputs:
    #   -i, --input <input>  The .map input file (can be compressed)
    #   -I, --index <index>  The .gem index

    #%begin init
    add_output('graph', '${input|name|re("\.map(.gz)?", ".bg")}')
    add_output('sizes', '${input|name|re("\.map(.gz)?", ".sizes")}')
    #%end

    zcat -f ${input} | \
        ${__file__|parent}/gem-2-bed blocks-coverage -I ${index} \
        -o ${graph|ext} -T $JIP_THREADS

The script has three parts: documentation, initialisation, and execution.

BED 2 BIGWIG

    #!/usr/bin/env jip
    # Delegates to bedGraphToBigWig to convert BED graphs to bigWig files
    #
    # Usage:
    #   bed2wig -g <graph> -s <sizes> [-o <output>]
    #
    # Inputs:
    #   -g, --graph <graph>  The graph file generated with gem-2-bed
    #   -s, --sizes <sizes>  The sizes file generated with gem-2-wig
    #
    # Outputs:
    #   -o, --output <output>  The output file name
    #                          [default: ${graph|ext}.bw]

    #%begin init
    add_output('output', '${graph|name|ext}.bw')
    #%end

    #%begin setup
    profile.threads = 1
    #%end

    ${__file__|parent}/bedGraphToBigWig ${graph} ${sizes} ${output}

PIPELINES

• Inputs, Outputs, Options

• Phases

• init, setup, validate

• create pipeline

GEM 2 BIGWIG

    #!/usr/bin/env jip
    # Creates a bed graph from a .map file and converts it to wig
    #
    # Usage:
    #   gem2wig -i <input> -I <index>
    #
    # Inputs:
    #   -i, --input <input>  The .map input file (can be compressed)
    #   -I, --index <index>  The .gem index

    #%begin pipeline
    bed = job(temp=True).run('gem2bed', input=input, index=index)
    run('bed2wig', graph=bed.graph, sizes=bed.sizes)

The script has two parts: documentation and pipeline.

    #%begin pipeline
    bed = job(temp=True).run('gem2bed', input=input, index=index)
    run('bed2wig', graph=bed.graph, sizes=bed.sizes)
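Passing `bed.graph` and `bed.sizes` into the second step is what ties the two jobs together. The sketch below shows one way such wiring could yield an execution order; `Job`, `OutputRef`, and `execution_order` are hypothetical stand-ins written for this example, not JIP classes, and `graphlib` requires Python 3.9 or newer.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


class Job:
    """Hypothetical job: a tool name plus its input values."""

    def __init__(self, tool, **inputs):
        self.tool = tool
        self.inputs = inputs

    def output(self, option):
        # A reference to one of this job's outputs, to be consumed downstream.
        return OutputRef(self, option)


@dataclass(frozen=True)
class OutputRef:
    job: Job
    option: str


def execution_order(jobs):
    """Order jobs so that producers run before their consumers."""
    graph = {}
    for job in jobs:
        graph[job.tool] = {
            value.job.tool
            for value in job.inputs.values()
            if isinstance(value, OutputRef)
        }
    return list(TopologicalSorter(graph).static_order())


bed = Job("gem2bed", input="reads.map", index="genome.gem")
wig = Job("bed2wig", graph=bed.output("graph"), sizes=bed.output("sizes"))

# The listing order does not matter; the references decide the order.
order = execution_order([wig, bed])
print(order)
```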

DEMO

MULTIPLEXING AND STREAMS

    echo "Hello World" | \
        (tee producer_out.txt | (tee >(wc -w) | wc -l))

    bash('echo "Hello World"', output='producer_out.txt') \
        | (bash('wc -l') + bash('wc -w'))

    producer = bash('echo "Hello World"', output='producer_out.txt')
    word_count = bash("wc -w", input=producer)
    line_count = bash("wc -l", input=producer)
    producer | (word_count + line_count)
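One way the `|` and `+` syntax could be expressed in Python is operator overloading. The sketch below is an illustration only: `Node` and `Group` are hypothetical classes, and nothing here runs a command or reflects how JIP implements its operators.

```python
class Group:
    """Hypothetical set of nodes that receive the same stream."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def __add__(self, other):
        return Group(self.nodes + [other])


class Node:
    """Hypothetical pipeline node identified by its command string."""

    def __init__(self, command):
        self.command = command
        self.children = []

    def __or__(self, other):
        # producer | consumer: stream this node's output into the consumer(s).
        targets = other.nodes if isinstance(other, Group) else [other]
        self.children.extend(targets)
        return other

    def __add__(self, other):
        # a + b: both nodes should receive the same input stream.
        return Group([self, other])


producer = Node('echo "Hello World"')
word_count = Node("wc -w")
line_count = Node("wc -l")

producer | (word_count + line_count)
consumers = [child.command for child in producer.children]
print(consumers)
```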

BASH

Common Questions

SUBMIT SINGLE COMMANDS

• The jip bash command wraps single executions

• You can run or submit

• Dry runs and multiplexing are supported

DEMO

SUBMIT FOR MULTIPLE FILES

• Fan-Out operations work for all tools

• Define a single input option

• Specify multiple values

• Works also for the jip bash command
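The fan-out idea (one job per value of a single input option) can be sketched as follows. `fan_out` is a hypothetical helper and the dictionaries are placeholders for real job objects; file names are made up for the example.

```python
def fan_out(tool, option, values, **shared):
    """Create one job description per value of a single option.

    Hypothetical helper; options in `shared` are copied to every job.
    """
    if isinstance(values, str):
        values = [values]  # a single value yields a single job
    return [{"tool": tool, option: value, **shared} for value in values]


jobs = fan_out(
    "gem2bed",
    "input",
    ["a.map", "b.map.gz"],   # made-up input files
    index="genome.gem",
)
for job in jobs:
    print(job)
```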

DEMO

WHAT WAS THE COMMAND

• jip show shows job properties and the command

• jip edit loads the job command in an editor

DEMO

RESTARTING AND MOVING

• jip restart resubmits jobs after failure

• jip restart can also move jobs and pipelines to other queues/partitions

DEMO

CUSTOMISE LOG FILES

• The job profile covers stdout and stderr log files

• jip logs finds and shows log files for jobs

DEMO

QUESTIONS?

Thank You
