T-Coffee tutorial
ACGT Retreat 2012
Jean-François Taly, Ionas Erb and Cedrik Magis
What is T-Coffee?
Tree based Consistency based Objective Function For AlignmEnt Evaluation
Progressive Alignment
Consistency

Progressive Alignment
Dynamic Programming Using A Substitution Matrix

Progressive Alignment
Depends on the CHOICE of the sequences.
Depends on the ORDER of the sequences (Tree).
Depends on the PARAMETERS:
    Substitution Matrix.
    Penalties (Gop, Gep).
    Sequence Weight.
    Tree making Algorithm.

T-Coffee and Consistency
J. Mol. Biol. (2000) 302, 205-217
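The parameter dependence listed on the progressive-alignment slides can be made concrete with a small sketch of the pairwise dynamic-programming step. This is not T-Coffee code: the function names are hypothetical, the substitution "matrix" is a toy match/mismatch score, and a single linear gap penalty stands in for the separate gap-opening (Gop) and gap-extension (Gep) penalties.

```python
from typing import Callable


def toy_score(x: str, y: str) -> int:
    """Toy substitution matrix (an assumption): +2 for a match, -1 for a mismatch."""
    return 2 if x == y else -1


def global_alignment_score(a: str, b: str,
                           score: Callable[[str, str], int],
                           gap: int) -> int:
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    `gap` is the (negative) cost of one gap position.
    """
    # prev[j] holds the best score of aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cur[j] = max(prev[j - 1] + score(a[i - 1], b[j - 1]),  # align two residues
                         prev[j] + gap,                            # gap in b
                         cur[j - 1] + gap)                         # gap in a
        prev = cur
    return prev[len(b)]


# Same sequences, same matrix, two gap penalties.
print(global_alignment_score("ACGTT", "CGTTA", toy_score, gap=-1))
print(global_alignment_score("ACGTT", "CGTTA", toy_score, gap=-6))
```

With the cheap gap the optimum shifts one sequence by a residue to line up CGTT; with the expensive gap the ungapped alignment wins. Same sequences, different parameters, different alignment.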
M-Coffee: T-Coffee and other aligners
Primary libraries can be computed from any third party aligners (pairwise or MSA):
    clustalw2
    mafft
    muscle
    probcons
    pcma
    and many more: type t_coffee for a full list

Template Based Alignment
Very useful in case of weak sequence similarity
Wrong libraries will lead to wrong MSAs
Replace the sequence with something more informative:
    Profile: PSI-Coffee
    PDB Structure: Expresso
    RNA Structure: R-Coffee

PSI-Coffee: Homology extension
Simple scoring schemes result in alignment ambiguities
[Figure: L, L, L ?]

PSI-Coffee: Use conservation across the protein family
[Figure: Profile 1 and Profile 2, columns of L, I and V residues]

EXPRESSO: Finding automatically the right template structure
[Figure: Sources; BLAST against PDB; Templates; Structural Alignment (SAP); Structural Template Alignment; Source & Template Alignment; Remove Templates; Library]

R-Coffee: Embedding RNA Structures Within The T-Coffee Libraries
[Figure: bases C, C, G, G; TC Library; G G: Score X; C C: Score Y]

The R-extension can be added on top of any existing method:
    Mafft / Muscle / ProbCons
    Consan: aligns the RNA sequences and predicts secondary structure at the same time
        Better libraries but very slow
RNA secondary structures:
    Predicted: RNAplfold
    Real ones
[Figure: RNA Sequences; RNAplfold; Secondary Structures; Consan or Mafft / Muscle / ProbCons; Primary Library; R-Coffee Extension; R-Coffee Extended Primary Library; R-Score; Progressive Alignment Using The R-Score]

Soon! SARA-Coffee: Like Expresso but with RNA structures extracted from the PDB
Carsten Kemena, Giovanni Bussotti

Pro-Coffee
gives you a global alignment of homologous regulatory sequences (promoters, enhancers).
uses a dinucleotide substitution matrix derived from TRANSFAC binding site alignments
was optimized on an ortholog finding task with promoter sequences and validated with multi-species ChIP-seq data

Validation Pro-Coffee
Which alignment is better?
The 2nd one? But can we trust these binding site predictions?
The 2nd one! The green sites are confirmed by ChIP-seq.
Magis et al., JMB 2010
MSA define equivalences
T-RMSD computes Intramolecular distances
One column = One matrix
One matrix = one tree
Nb columns = support
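The T-RMSD outline above (alignment columns define equivalent residues, each column yields one distance matrix) can be sketched as follows. This is a minimal illustration under stated assumptions, not the T-RMSD implementation: the input format (one 3D coordinate per alignment column, None for a gap), the function names, and the exact distance formula (root mean square difference of intramolecular distances) are all choices made for this example. Building one tree per matrix and counting how many columns support each split is left out.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float, float]
# Hypothetical input: structure name -> one coordinate per alignment column (None = gap).
AlignedCoords = Dict[str, List[Optional[Point]]]


def column_matrix(coords: AlignedCoords, col: int) -> Dict[Tuple[str, str], float]:
    """Distance matrix between structures, seen from one alignment column.

    For each pair of structures, compare the intramolecular distances from the
    residue in `col` to the residues in every other column that is ungapped in both.
    """
    names = sorted(coords)
    ncols = len(coords[names[0]])
    matrix: Dict[Tuple[str, str], float] = {}
    for i, s in enumerate(names):
        for t in names[i + 1:]:
            if coords[s][col] is None or coords[t][col] is None:
                continue  # this column says nothing about the pair (s, t)
            squared = []
            for k in range(ncols):
                if k == col or coords[s][k] is None or coords[t][k] is None:
                    continue
                d_s = math.dist(coords[s][col], coords[s][k])  # distance inside s
                d_t = math.dist(coords[t][col], coords[t][k])  # distance inside t
                squared.append((d_s - d_t) ** 2)
            if squared:
                matrix[(s, t)] = math.sqrt(sum(squared) / len(squared))
    return matrix


def all_column_matrices(coords: AlignedCoords) -> List[Dict[Tuple[str, str], float]]:
    """One matrix per alignment column."""
    ncols = len(next(iter(coords.values())))
    return [column_matrix(coords, c) for c in range(ncols)]


example: AlignedCoords = {
    "A": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    "B": [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0), (7.0, 5.0, 5.0)],  # A translated
    "C": [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)],  # A stretched
}
for column, matrix in enumerate(all_column_matrices(example)):
    print(column, matrix)
```

Because only intramolecular distances are compared, no superposition of the structures is needed: the translated copy B is at distance zero from A in every column, while the stretched copy C is not.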
Using 3D structure for structural clustering
Structural Tree / PFAM / 3D-Coffee
From structural clustering to phylogenetic inference
Glenney & Wiens, Journal of Immunology 2007
Magis et al., TIBS (2012, submitted)

Which Flavor?
Fast Alignments: M-Coffee with Fast Aligners (mafft, muscle, kalign)
Difficult Protein Alignments: PSI-Coffee, Expresso
Structural clustering: T-RMSD
RNA Alignments: R-Coffee
Promoter Alignments: Pro-Coffee

Server: tcoffee.crg.cat
Paolo Di Tommaso
Command line structure
t_coffee -in input_file_name -method kalign_msa,muscle_msa,mafft_msa
Give the list of methods you want for the computation of the primary libraries
On line documentation: http://www.tcoffee.org/Documentation/t_coffee/t_coffee_tutorial.htm

Command line structure
t_coffee -in input_file_name -mode fmcoffee
T-Coffee special modes:
    mcoffee
    psicoffee
    expresso
    rcoffee
    procoffee

Input/output format
t_coffee -in input_file_name -mode expresso -output output_format
    clustal_aln (default)
    fasta_aln
    phylip_aln
    saga_aln
    msf_aln
    pir_aln
    compressed_aln

T-Coffee other programs
t_coffee -other_pg
    seq_reformat
    aln_compare
    strike
    irmsd
    trmsd
    extract_from_pdb
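The command line slides above follow one pattern: -in, then either -method or -mode, then optionally -output. When many runs have to be scripted, that pattern can be assembled programmatically. The helper below is a hypothetical convenience wrapper written for this example; it only uses the flags and values shown on the slides and only builds the argument list, it does not run T-Coffee.

```python
import shlex
from typing import List, Optional, Sequence


def tcoffee_command(input_file: str,
                    methods: Optional[Sequence[str]] = None,
                    mode: Optional[str] = None,
                    output: Optional[str] = None) -> List[str]:
    """Build a t_coffee argument list from the options shown on the slides."""
    cmd = ["t_coffee", "-in", input_file]
    if methods:
        # Methods are passed as one comma-separated value.
        cmd += ["-method", ",".join(methods)]
    if mode:
        cmd += ["-mode", mode]
    if output:
        cmd += ["-output", output]
    return cmd


# Print the commands instead of running them.
print(shlex.join(tcoffee_command(
    "input_file_name", methods=["kalign_msa", "muscle_msa", "mafft_msa"])))
print(shlex.join(tcoffee_command(
    "input_file_name", mode="expresso", output="fasta_aln")))
```

To actually launch a run, the returned list can be handed to subprocess.run(cmd, check=True), assuming t_coffee is installed and on the PATH.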
seq_reformat: T-Coffee alignment editing tool
t_coffee -other_pg seq_reformat -in input_file_name -output output_format -action +trim _seq_%%90_
t_coffee -other_pg seq_reformat -help

T-Coffee & the cache
T-Coffee keeps data in: ~/.t_coffee/cache/
Warning! The cache will accumulate your data and may become very big
Several options:
    -cache update
    -cache ignore
    -cache path
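Since the cache warning above gives no way to see how big the directory has become, here is a small check. It is a generic directory-size sketch, not part of T-Coffee; the function name is hypothetical and the only T-Coffee-specific detail is the cache path quoted on the slide.

```python
import os


def directory_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files below `path` (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


cache_dir = os.path.expanduser("~/.t_coffee/cache/")
if os.path.isdir(cache_dir):
    print(f"{cache_dir}: {directory_size_bytes(cache_dir) / 1e6:.1f} MB")
else:
    print(f"{cache_dir} not found")
```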
Tutorial web site
https://sites.google.com/site/tcoffeetutorials
Installation
Where to Trust Your Alignments
Most Methods Agree
Most Methods Disagree
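The agree/disagree idea can be illustrated by asking, for each column of one alignment, what fraction of its aligned residue pairs is also found in the alignments produced by other methods. This is a simplified sketch under assumptions, not the score T-Coffee computes: the input format (sequence name mapped to a gapped row) and all function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Alignment = Dict[str, str]            # sequence name -> gapped row, all rows the same length
Residue = Tuple[str, int]             # (sequence name, index in the ungapped sequence)
Pair = Tuple[Residue, Residue]


def columns_as_pairs(aln: Alignment) -> List[Set[Pair]]:
    """For every column, the set of residue pairs that the column aligns."""
    names = sorted(aln)               # fixed order so pairs compare across alignments
    length = len(aln[names[0]])
    position = {n: 0 for n in names}  # next ungapped index per sequence
    columns: List[Set[Pair]] = []
    for c in range(length):
        residues = []
        for n in names:
            if aln[n][c] != "-":
                residues.append((n, position[n]))
                position[n] += 1
        columns.append(set(combinations(residues, 2)))
    return columns


def column_agreement(reference: Alignment, others: List[Alignment]) -> List[float]:
    """Per reference column: fraction of its pairs supported by the other alignments.

    Columns with fewer than two residues carry no pair and score 0.0.
    """
    other_pairs = [set().union(*columns_as_pairs(o)) for o in others]
    scores = []
    for pairs in columns_as_pairs(reference):
        if not pairs or not other_pairs:
            scores.append(0.0)
            continue
        supported = sum(len(pairs & found) for found in other_pairs)
        scores.append(supported / (len(pairs) * len(other_pairs)))
    return scores


method_1 = {"s1": "ACG-T", "s2": "A-GCT"}
method_2 = {"s1": "ACG-T", "s2": "AG-CT"}
print(column_agreement(method_1, [method_2]))
```

Columns scoring close to 1 are those where most methods agree; low-scoring columns are the ones to treat with caution.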