
Post on 03-Jan-2016


WSSP Chapter 8: BLASTX Translated DNA vs Protein Searches

atttaccgtg ttggattgaa attatcttgc atgagccagc tgatgagtat gatacagttt tccgtattaa taacgaacgg ccggaaatag gatcccgatc atgattgctt caatattttc acttcaatga ttggttctaa gcattcgaat gcgtacccgt ttgattaata tttccatttc tgtcccagtt tttaattttc atttcttttg gttaaaaaat tcccagtctc ttgaatgctt ttctaaaatc tttaattcaa ttatttatta gaatcttctg ttttgagaac tttgtaatgt aattaaataa tttgatgaaa tgattatgaa tgcgaataaa ttattaattt accgtgctga ttggattgaa attatcttgc atgagccagc tgatgagtat gatacagttt tccgtattaa taacgaacgg ccggaaatag gatcccgatc atgattgctt caatattttc acttcaatga ttggttctaa gcattcgaat gcgtacccgt ttgattaata tttccatttc tgtcccagtt tttaattttc atttcttttg gttaaaaaat tcccagtctc ttgaatgctt ttctaaaatc tttaattcaa ttatttatta gaatcttctg ttttgagaac tttgtaatgt aattaaataa tttgatgaaa tgattatgaa tgcgaataaa ttattaattt accgtgttgg attgaaggta attatcttgc atgagccagc tgatgagtat gatacagttt

© 2014 WSSP

Why do a BLASTX if we have done a BLASTN?

BLASTN Match

Query  AGG TCG TTA CTA TCG AGG AGT AGA
       | | | |
Sbjct  CGT AGC CTT TTG AGT CGA TCG CGG
16% Identity

        R   S   L   L   S   R   S   R
Query  AGG TCG TTA CTA TCG AGG AGT AGA
Sbjct  CGT AGC CTT TTG AGT CGA TCG CGG
        R   S   L   L   S   R   S   R

Query  R S L L S R S R
       | | | | | | | |
Sbjct  R S L L S R S R
100% Identity
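The comparison above can be reproduced with a short script. This is a minimal sketch, assuming the standard genetic code and an ungapped, position-by-position comparison; the helper names `translate` and `fraction_identical` are illustrative and are not part of BLAST.

```python
from itertools import product

# Standard genetic code: one amino acid letter per codon, with codons
# enumerated in T, C, A, G order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame +1, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def fraction_identical(a, b):
    """Fraction of positions that match in two equal-length, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# The two sequences from the slide, written without spaces.
query = "AGGTCGTTACTATCGAGGAGTAGA"
sbjct = "CGTAGCCTTTTGAGTCGATCGCGG"

print(translate(query), translate(sbjct))
print("nucleotide identity:", fraction_identical(query, sbjct))
print("protein identity:", fraction_identical(translate(query), translate(sbjct)))
```

The nucleotide sequences share few bases, yet both translate to the same peptide, which is why a translated search can find a match that a nucleotide search misses.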


Clicker Question #1: How many different DNA sequences can code for the peptide sequence Met-Leu-Cys-Ala?

Amino Acid Name   3-Letter   1-Letter   Codons (shown with U in place of T)
Alanine           Ala        A          GCA, GCC, GCG, GCU
Cysteine          Cys        C          UGC, UGU
Histidine         His        H          CAC, CAU
Isoleucine        Ile        I          AUA, AUC, AUU
Lysine            Lys        K          AAA, AAG
Leucine           Leu        L          UUA, UUG, CUA, CUC, CUG, CUU
Methionine        Met        M          AUG
Asparagine        Asn        N          AAC, AAU
Proline           Pro        P          CCA, CCC, CCG, CCU
Glutamine         Gln        Q          CAA, CAG
Arginine          Arg        R          CGA, CGC, CGG, CGU, AGA, AGG
Serine            Ser        S          UCA, UCC, UCG, UCU, AGC, AGU
Threonine         Thr        T          ACA, ACC, ACG, ACU
Valine            Val        V          GUA, GUC, GUG, GUU
Tryptophan        Trp        W          UGG
Tyrosine          Tyr        Y          UAC, UAU
Stop Codons                  .          UAA, UAG, UGA

A) 1

B) 12

C) 36

D) 48

E) 54
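One way to check a question like this is to multiply the number of codons available for each amino acid in the peptide. The sketch below assumes the standard genetic code; `count_encodings` is a hypothetical helper name, and peptides are given in one-letter codes (Met-Leu-Cys-Ala is "MLCA").

```python
from collections import Counter
from math import prod

# Amino acid encoded by each of the 64 codons of the standard genetic code
# (codons in T, C, A, G order; '*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Number of codons per amino acid, e.g. 'L' -> 6 and 'M' -> 1.
CODONS_PER_AA = Counter(AMINO_ACIDS)


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)


# Met (1 codon) x Leu (6) x Cys (2) x Ala (4)
print(count_encodings("MLCA"))
```

Because the counts multiply, the number of possible coding sequences grows very quickly with peptide length.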

Number of codons for the conserved region of the protein

3422242262422412263622666262232446622244246216
ITKNYPYYRTADKGWQNSIRHNLSLNRYFIKVPRSQEEPGKGSFWR

Amino Acid Name   3-Letter   1-Letter   Codons (shown with U in place of T)
Alanine           Ala        A          GCA, GCC, GCG, GCU
Cysteine          Cys        C          UGC, UGU
Aspartic Acid     Asp        D          GAC, GAU
Glutamic Acid     Glu        E          GAA, GAG
Phenylalanine     Phe        F          UUC, UUU
Glycine           Gly        G          GGA, GGC, GGG, GGU
Histidine         His        H          CAC, CAU
Isoleucine        Ile        I          AUA, AUC, AUU
Lysine            Lys        K          AAA, AAG
Leucine           Leu        L          UUA, UUG, CUA, CUC, CUG, CUU
Methionine        Met        M          AUG
Asparagine        Asn        N          AAC, AAU
Proline           Pro        P          CCA, CCC, CCG, CCU
Glutamine         Gln        Q          CAA, CAG
Arginine          Arg        R          CGA, CGC, CGG, CGU, AGA, AGG
Serine            Ser        S          UCA, UCC, UCG, UCU, AGC, AGU
Threonine         Thr        T          ACA, ACC, ACG, ACU
Valine            Val        V          GUA, GUC, GUG, GUU
Tryptophan        Trp        W          UGG
Tyrosine          Tyr        Y          UAC, UAU
Stop Codons                  .          UAA, UAG, UGA

7.5 x 10^19


DSAP BLASTx Page

Cropped DNA sequence

NCBI BLASTx page


BLASTX Dialog Box


BLASTX of EX1.14


BLASTn and BLASTx of another Landoltia sequence

BLASTn

BLASTx


List of EX1.14 BLASTx matches


Best BLASTx alignment for EX1.14


>gi|223542822|gb|EEF44358.1| conserved hypothetical protein [Ricinus communis]
Score = 69.7 bits (169), Expect = 3e-10
Identities = 54/174 (31%), Positives = 85/174 (48%), Gaps = 4/174 (2%)

Query  40   LTCLLILQAPSSHAFYLWppfffpspvpDVITVLNQANQFTTLVQLLTETGVATAVNAIS  219
            LT L++L +  + A     P   PS   +V  +L++  QFTT ++LLT T VAT +
Sbjct  9    LTALILLLSLQAQAQNPAAPAPAPSGPLNVTGILDKNGQFTTFIRLLTSTQVATQLEN-Q  67

Query  220  TNGAGPGITLFAPTDAAFAKIPAANLSALNVTQRTSILTLHALTRFYTFAELFVANAALP  399
             N    G T+FAPTD AF  + A  L+ L+  Q+  ++  H   +FYT + L +    +
Sbjct  68   LNSTTEGFTVFAPTDNAFNNLKAGTLNDLSTQQQVQLVLAHITPKFYTLSNLLLVPNPVR  127

Query  400  TLNT---GrsltfstsvtrvttitsPGGRVTTLNFLLYRRFPLTIFPIADVLLP  552
            T  T   G     + +        S G   T +N  + ++FPL ++ +  VLLP
Sbjct  128  TQATGQDGGVFGLNFTGQANQVNVSTGIVETQINNAIRQQFPLALYQVDKVLLP  181

>gi|223542822|gb|EEF44358.1| conserved hypothetical protein [Ricinus communis]
Score = 82.4 bits (202), Expect = 4e-14
Identities = 57/176 (32%), Positives = 89/176 (50%), Gaps = 8/176 (4%)
Frame = +1

Query  40   LTCLLILQAPSSHAFYLWPPFFFPSPVPDVITVLNQANQFTTLVQLLTETGVATAVNAIS  219
            LT L++L +  + A     P   PS   +V  +L++  QFTT ++LLT T VAT +
Sbjct  9    LTALILLLSLQAQAQNPAAPAPAPSGPLNVTGILDKNGQFTTFIRLLTSTQVATQLEN-Q  67

Query  220  TNGAGPGITLFAPTDAAFAKIPAANLSALNVTQRTSILTLHALTRFYTFAELFVANAALP  399
             N    G T+FAPTD AF  + A  L+ L+  Q+  ++  H   +FYT + L +    +
Sbjct  68   LNSTTEGFTVFAPTDNAFNNLKAGTLNDLSTQQQVQLVLAHITPKFYTLSNLLLVPNPVR  127

Query  400  TLNTGR-----SLTFSTSVTRVTTITSPGGRVTTLNFLLYRRFPLTIFPIADVLLP  552
            T  TG+      L F+    +V    S G   T +N  + ++FPL ++ +  VLLP
Sbjct  128  TQATGQDGGVFGLNFTGQANQVN--VSTGIVETQINNAIRQQFPLALYQVDKVLLP  181
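The header fields of a BLASTX alignment can be cross-checked against the query coordinates. The sketch below covers forward-strand hits only and assumes the usual numbering in which frame +1 starts at query base 1; `forward_frame` and `percent` are hypothetical helper names. Truncating the percentages with integer division reproduces the values in the report above, though whether BLAST always truncates is an assumption here.

```python
def forward_frame(query_start):
    """Reading frame (+1, +2 or +3) of a forward-strand hit, from its 1-based start."""
    if query_start < 1:
        raise ValueError("query_start is 1-based")
    return (query_start - 1) % 3 + 1


def percent(part, whole):
    """Percentage truncated to a whole number."""
    return 100 * part // whole


# Values taken from the alignment above: query bases 40 to 552, Frame = +1.
query_start, query_end = 40, 552

print("frame:", forward_frame(query_start))
print("translated residues:", (query_end - query_start + 1) // 3)
print("identities %:", percent(57, 176))
print("positives %:", percent(89, 176))
print("gaps %:", percent(8, 176))
```

A hit that starts one base later, at position 41, would fall in frame +2, and one that starts at position 42 would fall in frame +3.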

Low Sequence Complexity Filter

With Filter

Without Filter
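The idea behind a low-complexity filter can be sketched with a sliding window that measures how varied the residues are. This is a simplified illustration and not the algorithm BLAST itself uses; the window size, the threshold and the function names are all assumptions chosen for the example.

```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy (bits) of the residue composition of a window."""
    n = len(window)
    counts = Counter(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mask_low_complexity(seq, window=12, threshold=2.2):
    """Lowercase every residue that lies in a window whose entropy is below threshold."""
    seq = seq.upper()
    masked = [False] * len(seq)
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) < threshold:
            for i in range(start, start + window):
                masked[i] = True
    return "".join(c.lower() if m else c for c, m in zip(seq, masked))


# A run of one residue is masked; a varied stretch is left alone.
print(mask_low_complexity("PPPPPPPPPPPPPPPP"))
print(mask_low_complexity("ACDEFGHIKLMNPQRSTVWY"))

# Part of the query from the alignment above. The output depends on the
# window and threshold chosen here and need not match the BLAST report.
print(mask_low_complexity("LTCLLILQAPSSHAFYLWPPFFFPSPVPDVITVLNQANQFTTLVQLL"))
```

Masked residues are shown in lowercase, the same convention as the filtered alignment, where the proline- and phenylalanine-rich stretch appears as lowercase letters.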


Answer questions in DSAP


Question: Which of these alignments has a greater biological significance?

A)

B)


What can you conclude about this BLASTX result?

A) It is too short to be significant

B) It does not match anything

C) There is a frame shift in the DNA sequence

D) Your DNA has an exact match


Where is the frameshift most likely to be found?

A) bp 181

B) bp 75

C) bp 227

D) bp 381

E) Cannot tell from the data
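To see why a frame shift splits a BLASTX match between two frames, it helps to insert a single base into a sequence and translate it in each forward frame. The sketch below reuses the short example sequence from the start of the chapter; the inserted base and its position are made up for illustration, and `translate` is a hypothetical helper that assumes the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=1):
    """Translate in forward frame 1, 2 or 3, ignoring a trailing partial codon."""
    seq = dna[frame - 1:]
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


original = "AGGTCGTTACTATCGAGGAGTAGA"              # translates to RSLLSRSR in frame +1
shifted = original[:12] + "A" + original[12:]      # hypothetical extra base after bp 12

for frame in (1, 2, 3):
    print(f"+{frame}", translate(original, frame), translate(shifted, frame))
```

After the extra base, the first half of the peptide (RSLL) is still read in frame +1, but the second half (SRSR) now appears in frame +2. In a BLASTX report this shows up as two matches to the same protein in different frames, meeting near the position of the error.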

Points at which an error can be introduced into the DNA sequence of the clone

DNA

RNA (poly-A tail)

cDNA

DS-cDNA

Cloning

Replication & Purification

Sequencing


Is the frame shift at bp 227 caused by a DNA sequencing error?

A) Yes

B) No

C) Cannot tell from the data


Does this have a frame shift?

Where?


What does this BLASTX report indicate?

A) There are matches to different proteins at the end of the sequence

B) There are matches in one frame to the entire sequence

C) There is a frame shift in the DNA sequence

D) The protein has two different domains

E) Cannot conclude anything

Where is the frame shift?

A) bp 149

B) bp 160

C) bp 458

D) bp 469

E) bp 493


Does this indicate that there is a frame shift in the sequence?

A) Yes

B) No

C) Cannot tell from the data

+1 +1 +3

+1 +3 Intron


What is the most likely explanation for this result?

A) There is nothing wrong with the alignment.

B) There is an extra or missing base causing a frame shift.

C) There is an unspliced intron in the cDNA.

D) The query has an extra protein region.

E) Answers C or D
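The reasoning behind questions like this can be written as a rough rule of thumb: compare how far apart two matches are in the query with how far apart they are in the protein. The sketch below is a heuristic under stated assumptions, not a method from the course. The function name, the dictionary keys, the 15-base cutoff and all coordinates are hypothetical.

```python
def interpret_hsp_pair(first, second, max_small_gap_nt=15):
    """Suggest an explanation for two forward-strand matches to the same protein.

    Each match is a dict with q_start and q_end (query bases), s_start and
    s_end (protein residues) and frame (1, 2 or 3).
    """
    query_gap = second["q_start"] - first["q_end"] - 1   # query bases between matches
    sbjct_gap = second["s_start"] - first["s_end"] - 1   # protein residues between matches
    extra_nt = query_gap - 3 * sbjct_gap                 # query bases with no protein counterpart
    frame_changed = first["frame"] != second["frame"]

    if abs(extra_nt) <= max_small_gap_nt:
        if frame_changed:
            return "possible frame shift near the junction"
        return "no sign of a problem"
    if extra_nt > 0:
        return "possible unspliced intron or extra region in the query"
    return "unclear"


# Hypothetical matches: the protein is covered continuously in both cases.
upstream = {"q_start": 1, "q_end": 150, "s_start": 1, "s_end": 50, "frame": 1}
adjacent = {"q_start": 153, "q_end": 302, "s_start": 51, "s_end": 100, "frame": 3}
distant = {"q_start": 453, "q_end": 602, "s_start": 51, "s_end": 100, "frame": 3}

print(interpret_hsp_pair(upstream, adjacent))
print(interpret_hsp_pair(upstream, distant))
```

When the two matches are adjacent in both the query and the protein but sit in different frames, a missing or extra base is the simpler explanation. When the protein is continuous but the query has a long stretch with no counterpart, an intron or an extra region fits better, and the frame may or may not change depending on the length of the intervening sequence.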
