WSSP Chapter 8 BLASTX Translated DNA vs Protein searches

Upload: anabel-beasley

Post on 03-Jan-2016


Page 1

WSSP Chapter 8 BLASTX Translated DNA vs Protein searches

atttaccgtg ttggattgaa attatcttgc atgagccagc tgatgagtat gatacagttt tccgtattaa taacgaacgg ccggaaatag gatcccgatc atgattgctt caatattttc acttcaatga ttggttctaa gcattcgaat gcgtacccgt ttgattaata tttccatttc tgtcccagtt tttaattttc atttcttttg gttaaaaaat tcccagtctc ttgaatgctt ttctaaaatc tttaattcaa ttatttatta gaatcttctg ttttgagaac tttgtaatgt aattaaataa tttgatgaaa tgattatgaa tgcgaataaa ttattaattt accgtgctga ttggattgaa attatcttgc atgagccagc tgatgagtat gatacagttt tccgtattaa taacgaacgg ccggaaatag gatcccgatc atgattgctt caatattttc acttcaatga ttggttctaa gcattcgaat gcgtacccgt ttgattaata tttccatttc tgtcccagtt tttaattttc atttcttttg gttaaaaaat tcccagtctc ttgaatgctt ttctaaaatc tttaattcaa ttatttatta gaatcttctg ttttgagaac tttgtaatgt aattaaataa tttgatgaaa tgattatgaa tgcgaataaa ttattaattt accgtgttgg attgaaggta attatcttgc atgagccagc tgatgagtat gatacagttt

Page 2

p 8-3 © 2014 WSSP

Page 3

Query AGG TCG TTA CTA TCG AGG AGT AGA
      | | | |
Sbjct CGT AGC CTT TTG AGT CGA TCG CGG
16% Identity

BLASTN Match

Why do a BLASTX if we have done a BLASTN?

      R   S   L   L   S   R   S   R
Query AGG TCG TTA CTA TCG AGG AGT AGA
      | | | |
Sbjct CGT AGC CTT TTG AGT CGA TCG CGG
      R   S   L   L   S   R   S   R

R S L L S R S R
| | | | | | | |
R S L L S R S R
100% Identity
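The comparison on this slide can be reproduced with a few lines of Python. This is a minimal sketch, assuming the standard genetic code; the helper names `translate` and `identity` are made up for illustration. It translates both DNA sequences and compares them base by base and residue by residue, which shows why two sequences with low nucleotide identity can still encode the same peptide.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string (spaces allowed) in its first reading frame."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def identity(a, b):
    """Fraction of aligned positions that are identical (ungapped, equal length)."""
    a, b = a.replace(" ", ""), b.replace(" ", "")
    return sum(x == y for x, y in zip(a, b)) / len(a)


query = "AGG TCG TTA CTA TCG AGG AGT AGA"
sbjct = "CGT AGC CTT TTG AGT CGA TCG CGG"

print("Query protein:", translate(query))
print("Sbjct protein:", translate(sbjct))
print(f"Nucleotide identity: {identity(query, sbjct):.0%}")
print(f"Protein identity:    {identity(translate(query), translate(sbjct)):.0%}")
```

Both sequences translate to R S L L S R S R, so the protein identity is 100% even though only a small fraction of the bases match.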

p 8-1

Page 4

Clicker Question #1: How many different DNA sequences can code for the peptide sequence Met-Leu-Cys-Ala?

Codons for each amino acid (written with U, as in mRNA; read U as T for DNA)

Name            3-Letter  1-Letter  Codons
Alanine         Ala       A         GCA, GCC, GCG, GCU
Cysteine        Cys       C         UGC, UGU
Histidine       His       H         CAC, CAU
Isoleucine      Ile       I         AUA, AUC, AUU
Lysine          Lys       K         AAA, AAG
Leucine         Leu       L         UUA, UUG, CUA, CUC, CUG, CUU
Methionine      Met       M         AUG
Asparagine      Asn       N         AAC, AAU
Proline         Pro       P         CCA, CCC, CCG, CCU
Glutamine       Gln       Q         CAA, CAG
Arginine        Arg       R         CGA, CGC, CGG, CGU, AGA, AGG
Serine          Ser       S         UCA, UCC, UCG, UCU, AGC, AGU
Threonine       Thr       T         ACA, ACC, ACG, ACU
Valine          Val       V         GUA, GUC, GUG, GUU
Tryptophan      Trp       W         UGG
Tyrosine        Tyr       Y         UAC, UAU
Stop Codons               .         UAA, UAG, UGA

A) 1

B) 12

C) 36

D) 48

E) 54
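One way to check an answer to this question is to multiply the number of codons available for each residue. The sketch below assumes the standard genetic code; `count_coding_sequences` and the small three-letter lookup are hypothetical names used only for this example.

```python
from collections import Counter
from math import prod

# Amino acids of the standard genetic code in T, C, A, G codon order;
# counting each letter gives the number of codons per amino acid.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_PER_RESIDUE = Counter(AMINO_ACIDS)

# Only the residues needed for this question are mapped here.
THREE_TO_ONE = {"Met": "M", "Leu": "L", "Cys": "C", "Ala": "A"}


def count_coding_sequences(peptide):
    """Number of distinct DNA sequences that encode a hyphen-separated peptide."""
    return prod(CODONS_PER_RESIDUE[THREE_TO_ONE[res]] for res in peptide.split("-"))


print(count_coding_sequences("Met-Leu-Cys-Ala"))  # 1 * 6 * 2 * 4 = 48
```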

Page 5

3422242264422412263622666262232446622244246216
ITKNYPYYRTADKGWQNSIRHNLSLNRYFIKVPRSQEEPGKGSFWR

Number of codons for each amino acid in the conserved region of the protein

Codons for each amino acid (written with U, as in mRNA; read U as T for DNA)

Name            3-Letter  1-Letter  Codons
Alanine         Ala       A         GCA, GCC, GCG, GCU
Cysteine        Cys       C         UGC, UGU
Aspartic Acid   Asp       D         GAC, GAU
Glutamic Acid   Glu       E         GAA, GAG
Phenylalanine   Phe       F         UUC, UUU
Glycine         Gly       G         GGA, GGC, GGG, GGU
Histidine       His       H         CAC, CAU
Isoleucine      Ile       I         AUA, AUC, AUU
Lysine          Lys       K         AAA, AAG
Leucine         Leu       L         UUA, UUG, CUA, CUC, CUG, CUU
Methionine      Met       M         AUG
Asparagine      Asn       N         AAC, AAU
Proline         Pro       P         CCA, CCC, CCG, CCU
Glutamine       Gln       Q         CAA, CAG
Arginine        Arg       R         CGA, CGC, CGG, CGU, AGA, AGG
Serine          Ser       S         UCA, UCC, UCG, UCU, AGC, AGU
Threonine       Thr       T         ACA, ACC, ACG, ACU
Valine          Val       V         GUA, GUC, GUG, GUU
Tryptophan      Trp       W         UGG
Tyrosine        Tyr       Y         UAC, UAU
Stop Codons               .         UAA, UAG, UGA

7.5 x 10^19
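The same multiplication can be carried out for the whole conserved region. The following is a sketch under the assumption that the standard genetic code applies and that the region is exactly the 46 residues transcribed on this slide. The total depends on which residues are counted: for these 46 residues the product comes out on the order of 10^22, so the figure quoted on the slide may have been computed for a slightly different stretch.

```python
from collections import Counter
from math import prod

# Amino acids of the standard genetic code in T, C, A, G codon order;
# counting each letter gives the number of codons per amino acid.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_PER_RESIDUE = Counter(AMINO_ACIDS)

REGION = "ITKNYPYYRTADKGWQNSIRHNLSLNRYFIKVPRSQEEPGKGSFWR"

# One codon count per residue, in order.
counts = [CODONS_PER_RESIDUE[aa] for aa in REGION]
total = prod(counts)

print("".join(str(n) for n in counts))
print(REGION)
print(f"Possible DNA sequences: {total:.1e}")
```

The point of the slide survives any difference in the exact number: an enormous set of DNA sequences can encode one conserved protein region, which is why a protein-level search can find matches that a nucleotide search misses.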

Page 6

p 8-2

Page 7

p. 8-2

Page 8

p. 7-2

DSAP BLASTx Page

Cropped DNA sequence

NCBI BLASTx page

Page 9

p 8-3

BLASTX Dialog Box

Page 10

BLASTX of EX1.14

p 8-3

Page 11

BLASTn and BLASTx of another Landoltia sequence

BLASTn

BLASTx

p 8-4

Page 12

List of EX1.14 BLASTx matches

p 8-4

Page 13

Best BLASTx alignment for EX1.14

p 8-5

Page 14

Low Sequence Complexity Filter

With Filter

>gi|223542822|gb|EEF44358.1| conserved hypothetical protein [Ricinus communis]
Score = 69.7 bits (169), Expect = 3e-10
Identities = 54/174 (31%), Positives = 85/174 (48%), Gaps = 4/174 (2%)

Query  40   LTCLLILQAPSSHAFYLWppfffpspvpDVITVLNQANQFTTLVQLLTETGVATAVNAIS  219
            LT L++L +  + A     P   PS   +V  +L++  QFTT ++LLT T VAT +
Sbjct  9    LTALILLLSLQAQAQNPAAPAPAPSGPLNVTGILDKNGQFTTFIRLLTSTQVATQLEN-Q  67

Query  220  TNGAGPGITLFAPTDAAFAKIPAANLSALNVTQRTSILTLHALTRFYTFAELFVANAALP  399
             N    G T+FAPTD AF  + A  L+ L+  Q+  ++  H   +FYT + L +    +
Sbjct  68   LNSTTEGFTVFAPTDNAFNNLKAGTLNDLSTQQQVQLVLAHITPKFYTLSNLLLVPNPVR  127

Query  400  TLNT---GrsltfstsvtrvttitsPGGRVTTLNFLLYRRFPLTIFPIADVLLP  552
            T  T   G     + +        S G   T +N  + ++FPL ++ +  VLLP
Sbjct  128  TQATGQDGGVFGLNFTGQANQVNVSTGIVETQINNAIRQQFPLALYQVDKVLLP  181

Without Filter

>gi|223542822|gb|EEF44358.1| conserved hypothetical protein [Ricinus communis]
Score = 82.4 bits (202), Expect = 4e-14
Identities = 57/176 (32%), Positives = 89/176 (50%), Gaps = 8/176 (4%)
Frame = +1

Query  40   LTCLLILQAPSSHAFYLWPPFFFPSPVPDVITVLNQANQFTTLVQLLTETGVATAVNAIS  219
            LT L++L +  + A     P   PS   +V  +L++  QFTT ++LLT T VAT +
Sbjct  9    LTALILLLSLQAQAQNPAAPAPAPSGPLNVTGILDKNGQFTTFIRLLTSTQVATQLEN-Q  67

Query  220  TNGAGPGITLFAPTDAAFAKIPAANLSALNVTQRTSILTLHALTRFYTFAELFVANAALP  399
             N    G T+FAPTD AF  + A  L+ L+  Q+  ++  H   +FYT + L +    +
Sbjct  68   LNSTTEGFTVFAPTDNAFNNLKAGTLNDLSTQQQVQLVLAHITPKFYTLSNLLLVPNPVR  127

Query  400  TLNTGR-----SLTFSTSVTRVTTITSPGGRVTTLNFLLYRRFPLTIFPIADVLLP  552
            T  TG+      L F+    +V    S G   T +N  + ++FPL ++ +  VLLP
Sbjct  128  TQATGQDGGVFGLNFTGQANQVN--VSTGIVETQINNAIRQQFPLALYQVDKVLLP  181
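The summary line of a BLASTX alignment can be rechecked directly from the aligned sequences. The sketch below counts identical columns and gap columns for the unfiltered alignment shown on this slide (the one reporting Identities = 57/176). The function name `alignment_stats` is made up for this example, and whole-number percentages are obtained by integer division, which is an assumption that happens to reproduce the values in the report. Positives are not recomputed because that would need a full substitution matrix.

```python
# Aligned query and subject rows of the unfiltered alignment, joined end to end.
QUERY = (
    "LTCLLILQAPSSHAFYLWPPFFFPSPVPDVITVLNQANQFTTLVQLLTETGVATAVNAIS"
    "TNGAGPGITLFAPTDAAFAKIPAANLSALNVTQRTSILTLHALTRFYTFAELFVANAALP"
    "TLNTGR-----SLTFSTSVTRVTTITSPGGRVTTLNFLLYRRFPLTIFPIADVLLP"
)
SBJCT = (
    "LTALILLLSLQAQAQNPAAPAPAPSGPLNVTGILDKNGQFTTFIRLLTSTQVATQLEN-Q"
    "LNSTTEGFTVFAPTDNAFNNLKAGTLNDLSTQQQVQLVLAHITPKFYTLSNLLLVPNPVR"
    "TQATGQDGGVFGLNFTGQANQVN--VSTGIVETQINNAIRQQFPLALYQVDKVLLP"
)


def alignment_stats(query, sbjct):
    """Return (identities, gaps, length) for two aligned rows of equal length."""
    if len(query) != len(sbjct):
        raise ValueError("aligned rows must have the same length")
    pairs = list(zip(query, sbjct))
    identities = sum(q == s and q != "-" for q, s in pairs)
    gaps = sum(q == "-" or s == "-" for q, s in pairs)
    return identities, gaps, len(pairs)


identities, gaps, length = alignment_stats(QUERY, SBJCT)
print(f"Identities = {identities}/{length} ({100 * identities // length}%)")
print(f"Gaps = {gaps}/{length} ({100 * gaps // length}%)")
```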

Page 15

p 8-6

Answer questions in DSAP

Page 16

Question: Which of these alignments has a greater biological significance?

A)

B)

Page 17

What can you conclude about this BLASTX result?

A) It is too short to be significant

B) It does not match anything

C) There is a frame shift in the DNA sequence

D) Your DNA has an exact match

Page 18

Where is the frameshift most likely to be found?

A) bp 181

B) bp 75

C) bp 227

D) bp 381

E) Cannot tell from the data
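A small experiment shows what a frame shift looks like to BLASTX. This is an illustrative sketch: the 30-base coding sequence and the inserted base are invented for the example, and `translate` and `frame_of` are hypothetical helper names. One extra base is inserted in the middle of the sequence; upstream of the insertion the correct protein is read in frame +1, while downstream it reappears in frame +2. The position where the matching frame changes is where the extra or missing base lies.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate starting at a 0-based offset (0, 1, 2 for frames +1, +2, +3)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3))


def frame_of(query_start):
    """Forward reading frame (+1, +2 or +3) implied by a 1-based query start."""
    return (query_start - 1) % 3 + 1


# Invented coding sequence for the peptide M K T A Y I A K Q R.
original = "ATGAAAACCGCTTATATTGCCAAACAGCGT"
# Simulate an error: one extra G inserted after the fifth codon.
shifted = original[:15] + "G" + original[15:]

print("Original, frame +1:", translate(original))
for offset in range(3):
    print(f"Shifted,  frame +{offset + 1}:", translate(shifted, offset))

# An alignment whose query coordinates start at bp 40 is read in frame +1.
print("Frame for a hit starting at bp 40:", frame_of(40))
```

In the output, frame +1 of the shifted sequence begins with MKTAY and then turns to unrelated residues, while frame +2 ends with IAKQR.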

Page 19

Points at which an error can be introduced into the DNA sequence of the clone

[Figure: DNA, RNA (poly-A tail), cDNA, DS-cDNA, Cloning, Replication & Purification, Sequencing]

Page 20

Is the frame shift at bp 227 caused by a DNA sequencing error?

A) Yes

B) No

C) Cannot tell from the data

Page 21

Does this have a frame shift?

Where?

Page 22

What does this BLASTX report indicate?

A) There are matches to different proteins at the end of the sequence

B) There are matches in one frame to the entire sequence

C) There is a frame shift in the DNA sequence

D) The protein has two different domains

E) Cannot conclude anything

Page 23

Where is the frame shift?

A) bp 149

B) bp 160

C) bp 458

D) bp 469

E) bp 493

Page 24

Does this indicate that there is a frame shift in the sequence?

A) Yes

B) No

C) Cannot tell from the data

+1 +1 +3

+1 +3 Intron

Page 25

What is the most likely explanation for this result?

A) There is nothing wrong with the alignment.

B) There is an extra or missing base causing a frame shift.

C) There is an unspliced intron in the cDNA.

D) The query has an extra protein region.

E) Answers C or D
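The reasoning behind these last questions can be written down as a rough rule of thumb. The sketch below is a heuristic for illustration only, not part of BLASTX: the `Hit` record, the `interpret` function, the 30 bp threshold and all coordinates are hypothetical. It compares two neighbouring hits to the same protein. If they abut in the query but sit in different frames, an extra or missing base is the likely cause. If a long stretch of query lies between them with no matching subject residues, an unspliced intron or an extra protein region is more likely.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    frame: int        # reading frame of the hit (+1, +2 or +3)
    query_start: int  # query coordinates in bp
    query_end: int
    sbjct_start: int  # subject coordinates in amino acids
    sbjct_end: int


def interpret(first, second, max_join_bp=30):
    """Classify two consecutive hits; max_join_bp is an arbitrary illustrative cutoff."""
    query_gap_bp = second.query_start - first.query_end - 1
    sbjct_gap_aa = second.sbjct_start - first.sbjct_end - 1
    # Query bases left over after accounting for any unaligned subject residues.
    extra_bp = query_gap_bp - 3 * max(sbjct_gap_aa, 0)
    if extra_bp > max_join_bp:
        return "insert (unspliced intron or extra protein region)"
    if first.frame != second.frame:
        return "frame shift (extra or missing base)"
    return "continuous"


# Hypothetical hits: adjacent in the query, but the frame changes from +1 to +3.
print(interpret(Hit(1, 40, 225, 9, 70), Hit(3, 228, 551, 71, 178)))

# Hypothetical hits: same subject positions, separated by about 190 bp of query.
print(interpret(Hit(1, 40, 225, 9, 70), Hit(3, 420, 743, 71, 178)))
```

A real decision would also look at the alignment itself, for example whether the unmatched stretch has splice-site signals, so the output is a starting hypothesis and not a conclusion.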