discovery of possible regulatory motifs

Upload: nirmala-last

Post on 06-May-2015

Category:

Technology


TRANSCRIPT

1.
Analysis: Discovery of possible regulatory motifs
What follows is a simulation of the proposed graphical interface. As you go through the simulation, please consider what capabilities you would want to serve your research and annotation interests. A narrative to help you go through the simulation appears in a red-bordered box, such as the one below.
To begin:
  1. Click on Slide Show (on the upper toolbar)
  2. Click View Show
  3. Click the Continue button
Scenario 5

2.
You've decided you want to know what regulates the expression of nif genes, which encode the machinery for nitrogen fixation. Here's your strategy:

  • Collect nif genes from Anabaena PCC 7120 into set
  • Include in set orthologs of the Anabaena genes
  • Extract 5' sequences from all genes in set
  • Analyze set of 5' sequences for motifs
  • (Search for other genes with same motifs)
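
The sequence-extraction step in the strategy above can be pictured as slicing the region just upstream of each gene's start, and reading it from the opposite strand when the gene lies on the complementary strand. What follows is only a minimal sketch. It assumes the genome is held as a plain Python string, that coordinates are 1-based and inclusive, and that a fixed-length window is taken; the proposed interface's own rule for delimiting an upstream region is not stated in the simulation. The function names and the toy sequence are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def upstream_region(genome: str, start: int, end: int, strand: str, window: int) -> str:
    """Return up to `window` bases immediately 5' of a gene.

    Assumptions: coordinates are 1-based and inclusive with start <= end;
    strand is "d" (direct) or "c" (complementary), mirroring the suffixes
    used for sequence coordinates in the simulation.
    """
    if strand == "d":
        # On the direct strand the upstream bases sit just before `start`.
        low = max(0, start - 1 - window)
        return genome[low:start - 1]
    if strand == "c":
        # On the complementary strand they sit just after `end`,
        # and are reported as read from that strand.
        return reverse_complement(genome[end:end + window])
    raise ValueError("strand must be 'd' or 'c'")


genome = "AAACCCGGGTTTACGTACGT"  # toy sequence, not real data
print(upstream_region(genome, 7, 12, "d", 3))  # CCC
print(upstream_region(genome, 7, 12, "c", 3))  # CGT
```

A fixed window is the simplest choice for a sketch; a rule that stops at the neighboring gene would need the coordinates of that neighbor as well.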

3.
Menu bar: Build set | Display set | Modify set | Set operation
Click on Build set to begin finding orfs with the desired specifications.

4.
Query: All items in
Choose set type: All open reading frames of | All amino acid sequences of | All intergenic regions of | Human-annotated orfs of | Private set | Public set
The first goal is to find all open reading frames within Anabaena PCC 7120 annotated as nif genes, so click on All open reading frames of.

5.
Query: All items in All open reading frames of
Choose database: Arthrobacter platensis | Gloeobacter violaceus | Microcystis aeruginosa | Nostoc punctiforme | Nostoc PCC 7120 | Prochlorococcus MED4 | Prochlorococcus MIT9313 | Prochlorococcus S120 | Synechococcus PCC6301 | Synechococcus PCC7942 | Synechococcus WH | Synechocystis PCC 6803 | Thermosynechococcus | Trichodesmium | Unicellular | Filamentous | All | Anabaena PCC 7120
Click on Anabaena PCC 7120.

6.
Query: All items in All open reading frames of Anabaena PCC 7120 such that:
Specification tools: Variable | Data | Operation | Function | Done
You want to compare the description of each orf with nif. To get a tool to extract the description, click on Function.

7.
Choose function: Closest ortholog of | Protein product of | Upstream region of | Downstream region of | Description of | Category of | Annotation level of
Click on Description of.

8.
Query: All items in All open reading frames of Anabaena PCC 7120 such that: Description of (item)
Op: = | includes | excludes
You want to find orfs whose description includes the word nif. Click on includes.
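
The query assembled in these steps amounts to a filter over orf records: keep each item whose description includes a search term. Here is a minimal sketch of that idea, assuming orfs are held as simple Python records and that "includes" is a case-insensitive substring test (the simulation does not state the matching rule). The `Orf` record and the `description_includes` function are hypothetical names. The first three sample records are copied from the result list shown in the simulation; the last one is invented to show a non-match.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orf:
    orf_id: str
    description: str


def description_includes(orfs, term):
    """Keep orfs whose description contains `term`, ignoring case (an assumption)."""
    needle = term.lower()
    return [orf for orf in orfs if needle in orf.description.lower()]


orfs = [
    Orf("Anab7120:all0688", "hupS [NiFe] uptake hydrogenase small subunit"),
    Orf("Anab7120:alr0874", "nifH2 dinitrogenase reductase"),
    Orf("Anab7120:alr1407", "nifV1 homocitrate synthase"),
    Orf("example:orf0001", "hypothetical protein"),  # invented record
]

hits = description_includes(orfs, "nif")
print([orf.orf_id for orf in hits])
```

Under this assumption the [NiFe] hydrogenase record matches along with the two nif genes, which illustrates how a plain substring search can pull in unintended hits.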
9.
Query: All items in All open reading frames of Anabaena PCC 7120 such that: Description of (item) includes
Type description term(s): nif
You can type in any characters to search for. For this simulation, the term nif is provided. Press the Enter key.

10.
Query: All items in All open reading frames of Anabaena PCC 7120 such that: Description of (item) includes nif
No more specifications. Press the Done button.

11.
Save options: Save results and script | Save only results
If this were a complicated search, you might want to save the specifications as a script. In this case, just save the results by clicking on Save only results.

12.
Type name of set: 7120 nif genes
All orfs of Anabaena whose descriptions include nif will be collected into a set. You can name the set anything you want. For this simulation, a name is provided. Press the Enter key.

13.
Set: 7120 nif genes
Anab7120:all0687 hupL [NiFe] uptake hydrogenase large subunit, C terminus
Anab7120:all0687 hupL [NiFe] uptake hydrogenase large subunit, N terminus
Anab7120:all0688 hupS [NiFe] uptake hydrogenase small subunit
Anab7120:alr0692 similar to nifU
Anab7120:alr0874 nifH2 dinitrogenase reductase
Anab7120:asr1309 similar to nifU
Anab7120:alr1407 nifV1 homocitrate synthase
Anab7120:asr1408 nifZ iron-sulfur cofactor synthesis
Anab7120:asr1409 nifT
>
This is the result of the search. The set is displayed both as a list of orfs and as a graphical representation of the genetic neighborhood of each orf. You can find out more about an orf by clicking its name or its arrow. For now, just press Continue.

14.
This search, like most, is only a beginning. It brought up some unintended hits (nif found NiFe). More seriously, it brought up many genes that are probably in the middle of operons and unlikely to be preceded by regulatory motifs. The genetic neighborhood gives clues as to operon structure. Select the two orfs most likely to begin operons by clicking on the circles next to alr0874 and alr1407.

15.
Let's suppose you proceed in a like fashion through the rest of the list. Press Done.

16.
Set: 7120 nif genes
Anab7120:alr0874 nifH2 dinitrogenase reductase
Anab7120:alr1407 nifV1 homocitrate synthase
Anab7120:all1438 nifE nitrogenase Fe/Mo cofactor
Anab7120:all1455 nifH dinitrogenase reductase
Anab7120:all1517 nifB nitrogen fixation protein
Anab7120:alr2968 nifV2 homocitrate synthase
Display set: Show orf ID | Show gene name | Show description | Show coordinates | Show graphic | Show neighbors: +/- 1 | Show map
The set now consists of the six Anabaena nif genes that you judged most likely to be preceded by transcriptional signals. It might be interesting to see where this set is located on the genome. To do this, click Display set, then make some room by clicking on Show graphic.

17.
Replace the space-consuming description with coordinates by clicking on Show description, and then click Show coordinates and finally Show map.

18.
Set: 7120 nif genes
Anab7120:alr0874 nifH2
Anab7120:alr1407 nifV1
Anab7120:all1438 nifE
Anab7120:all1455 nifH
Anab7120:all1517 nifB
Anab7120:alr2968 nifV2
Replace the space-consuming description with coordinates by clicking on Show description, and then click Show coordinates and finally Show map.

19.
Set: 7120 nif genes
Anab7120:alr0874 nifH2 1008496->1009389
Anab7120:alr1407 nifV1 1671878->1673011
Anab7120:all1438 nifE 1696389
The resulting set consists of sequences, not orfs, and so the elements are defined by coordinates. Clicking on a coordinate brings up the sequence display (see Scenario 6). Clicking on a graph of an orf brings up the orf's annotation page. Click Continue.

30.
Set: all nif genes 5'
Anab7120.C:1006982-1008496d
Anab7120.C:1671462-1671878d
Anab7120.C:1697832-1698138c
Anab7120.C:1713264-1713395c
Anab7120.C:1778098-1779034c
Anab7120.C:3609273-3609624d
NostPunc.637:37288-37376d
NostPunc.510:15955-16325d
NostPunc.651:60311-60584c
NostPunc.510:5239-6338c
>
The final step in this procedure is to analyze the set of upstream sequences of nif genes, hoping to find a common motif. Click on Set operation, then Analysis tools. Tools based on Position-Specific Scoring Matrices (PSSMs) are most often used for the task. Click on one of these: Meme.
Set operation: Maintenance | Set operations | Analysis tools | Discovery tools | Transformations
Analysis tools: Align | PSSM: Gibbs sampler | PSSM: Meme | Make HMM

31.
PSSM: Meme of (
Choose set type: Public set | Private set
Click Private set and then all nif genes 5' to give Meme the set of 5' sequences.

32.
PSSM: Meme of (
Choose set: 7120 IS895 seqs | 7120 nif genes | 7120 STTR7 regions | all nif genes | all nif genes 5' | Npun STTR7 regions
Click Private set and then all nif genes 5' to give Meme the set of 5' sequences.

33.
PSSM: Meme of (all nif genes 5')
Type name of results: PSSM:all nif 5'
Give the results a name, press Enter, and the task is accomplished.

34.
Analysis: Discovery of possible regulatory motifs
Summary

  • The interface facilitates operations on sets of genes and sequences
  • The interface puts at your disposal powerful tools (that already exist), without the need to figure out a different computer environment
  • Taken together, these capabilities let those not particularly adept at computer programming focus on the function of noncoding sequences

But don't be fooled: the interface does not yet exist. That's the point of the proposal!