![Page 1: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/1.jpg)
Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences
BioQUEST/HHMI/CaseNet Summer Workshop
June 13, 2015
Kam D. Dahlquist, Ph.D.
Department of Biology
Loyola Marymount University
![Page 2: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/2.jpg)
Outline
• An open science ecosystem enhances student learning
• Quick example: XMLPipeDB project in a Biological Databases course
• Longer example: GRNmap project in Biomathematical Modeling course
• Potential research projects for BioQUEST participants
• Challenges are also opportunities
– Computer literacy
– Data literacy
– Information literacy
![Page 3: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/3.jpg)
Open Science Ecosystem
• Open Science (open process)
• Citizen Science
• Open Source Code
• Open Access (creative commons)
• Reproducible Research
• Research Integrity
• Open Data
• Open Pedagogy
With thanks to John Jungck
![Page 4: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/4.jpg)
Open Science Pedagogy Adds Open Source Values and Tools to Problem Spaces
• Students solve an authentic research problem.
• They investigate large, publicly available datasets.
• They return the products of their research to the scholarly community.
Image: http://www.bioquest.org/bedrock/problem_spaces/
![Page 5: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/5.jpg)
Official Open Source Definition (http://opensource.org)
Free redistribution
Source code
Derived works
Integrity of the author’s source code
No discrimination against persons or groups
No discrimination against fields of endeavor
Distribution of license
License must not be specific to a product
License must not restrict other software
License must be technology-neutral
![Page 6: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/6.jpg)
Open Source Values Mirror STEM Curricular Reform

| Open Source Values | Active Learning Pedagogy | Open Source Practices & Tools |
| --- | --- | --- |
| Source code is available, modifiable, and long-lived | Authentic problem to solve with realistic complexity | Central code repository; version control; provenance of code |
| Accountability to a developer and user community | Participatory and collaborative work; peer review | Task and bug trackers; continuous integration; test-driven workflows |
| Responsibilities accompany rights | Responsibility and ownership of the learning process | Documentation: in-line, user manual, web site, wiki |
![Page 7: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/7.jpg)
Pedagogy Implemented on Course Wikis
• Team-taught and cross-listed
− BIOL/CMSI 367: Biological Databases
https://xmlpipedb.cs.lmu.edu/biodb/fall2013/index.php/Main_Page
− BIOL/MATH 388: Biomathematical Modeling
http://www.openwetware.org/wiki/BIOL398-04/S15
• Single instructor
− BIOL 368: Bioinformatics Laboratory
http://www.openwetware.org/wiki/BIOL368/F14
− BIOL 478: Molecular Biology of the Genome (wet lab, mostly offline)
data analysis: http://www.openwetware.org/wiki/BIOL478/S15:Microarray_Data_Analysis
• Weekly assignments leading up to final research project
• All projects involve exploration of DNA microarray data
![Page 8: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/8.jpg)
![Page 9: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/9.jpg)
Biological Databases Team Final Project: create a gene database for a bacterial species
GenMAPP-compatible Gene Database
Visualize data
PostgreSQL Intermediate Database
Microarray data
http://xmlpipedb.cs.lmu.edu/
![Page 10: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/10.jpg)
Each Student on the Team is Assigned a Specific Role
Coder
Quality Control
Data Analysis
Project Manager
![Page 11: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/11.jpg)
Student Products Are Shared with the Scientific Community
http://sourceforge.net/projects/xmlpipedb/
![Page 12: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/12.jpg)
![Page 13: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/13.jpg)
Systems Biology Workflow
DNA microarray data: wet lab-generated or published
Statistical analysis, clustering, Gene Ontology, term enrichment
Generate gene regulatory network
Modeling dynamics of the network
Visualizing the results
New experimental questions
![Page 14: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/14.jpg)
![Page 15: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/15.jpg)
Central Dogma of Molecular Biology (simplified)
DNA → (Transcription) → mRNA → (Translation) → Protein
Freeman (2003)
![Page 16: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/16.jpg)
And Now in the “omics” Era…
Genome → (Transcription) → Transcriptome → (Translation) → Proteome
Freeman (2002)
![Page 17: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/17.jpg)
![Page 18: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/18.jpg)
![Page 19: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/19.jpg)
![Page 20: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/20.jpg)
![Page 21: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/21.jpg)
![Page 22: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/22.jpg)
![Page 23: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/23.jpg)
![Page 24: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/24.jpg)
Budding Yeast, Saccharomyces cerevisiae, is an Ideal Model Organism for Systems Biology
• Small genome of ~6000 genes
• Extensive genome-wide datasets readily accessible
• Molecular genetic tools available
Alberts et al. (2004)
![Page 25: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/25.jpg)
Environmental Changes and Stresses
• All organisms must respond to changes in the environment
– pH
– oxygen availability
– pressure
– osmotic stress
– temperature (heat and cold)
• Some changes in the environment cause cellular damage and trigger a “stress response”
– damage from reactive oxygen species
– damage from UV radiation
– sudden and/or large change in temperature (increase or decrease)
![Page 26: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/26.jpg)
Cold Shock Is an Environmental Stress that Is Not Well-Studied
• Increases in temperature (heat shock)
– response very well-characterized
– proteins denature due to heat
– induction of heat shock proteins (chaperonins) that assist in protein folding
– conserved in all organisms (prokaryotes, eukaryotes)
• Decreases in temperature (cold shock)
– response less well-characterized
– decrease fluidity of membranes
– stabilize DNA and RNA secondary structures
– impair ribosome function and protein synthesis
– decrease enzymatic activities
– no equivalent set of cold shock proteins that are conserved in all organisms
![Page 27: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/27.jpg)
Yeast Respond to Cold Shock by Changing Gene Expression
• Cold shock temperature range for yeast is 10-18°C
• Previous studies indicate that the cold shock response can be divided into:
• Late response genes – 12 to 60 hours
– General environmental stress response genes (ESR) are induced
– Regulated by the Msn2/Msn4 transcription factors
• Early response genes – 15 minutes to 2 hours
– Genes unique to cold shock are induced, such as genes involved in ribosome biogenesis and membrane fluidity
– Which transcription factors regulate this response is unknown
![Page 28: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/28.jpg)
Transcription Factors Control Gene Expression by Binding to Regulatory DNA Sequences
• Activators increase gene expression
• Repressors decrease gene expression
• Transcription factors are themselves proteins that are encoded by genes
![Page 29: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/29.jpg)
Experimental Design and Methods
![Page 30: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/30.jpg)
Yeast Cells Were Harvested for Microarrays Before, During, and After a Cold Shock and During Recovery
![Page 31: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/31.jpg)
Mixture of labeled cDNA from two samples
• 4 replicates of each experiment with dye swaps
• wt and transcription factor deletion strains
![Page 32: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/32.jpg)
DNA Microarray
• One spot = one gene
• Green = decreased relative to control
• Red = increased
• Yellow = no change in gene expression
Freeman (2002)
![Page 33: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/33.jpg)
Gene Expression Changes Due to Cold Shock Return to Pre-shock Levels During Recovery
Panels: t30/t0 (cold shock), t60/t0 (cold shock), t90/t0 (recovery), t120/t0 (recovery)
• Four sets of biological replicates were performed
• Dye orientation was swapped for two sets of replicates
![Page 34: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/34.jpg)
Steps Used to Analyze DNA Microarray Data
1. Quantitate the fluorescence signal in each spot
2. Calculate the ratio of red/green fluorescence
3. Log2 transform the ratios
4. Normalize the ratios on each microarray slide
5. Normalize the ratios for a set of slides in an experiment
6. Perform statistical analysis on the ratios
7. Compare individual genes with known data
8. Pattern finding algorithms/clustering
9. Modeling the dynamics of the gene regulatory network
10. Visualizing the results
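Steps 2 through 4 can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the course: the function names and spot intensities are hypothetical, and mean-centering is only one assumed way to normalize a slide.

```python
import math
from statistics import mean


def log2_ratios(red, green):
    """Steps 2-3: red/green ratio for each spot, then log2 transform."""
    return [math.log2(r / g) for r, g in zip(red, green)]


def center(log_ratios):
    """Step 4 (assumed method): subtract the slide mean so the ratios average zero."""
    slide_mean = mean(log_ratios)
    return [value - slide_mean for value in log_ratios]


# Hypothetical fluorescence intensities for four spots on one slide.
red = [1200.0, 300.0, 800.0, 50.0]
green = [600.0, 600.0, 800.0, 200.0]

ratios = log2_ratios(red, green)
normalized = center(ratios)
print(ratios)      # [1.0, -1.0, 0.0, -2.0]
print(normalized)  # [1.5, -0.5, 0.5, -1.5]
```

On the log2 scale a value of 1 is a two-fold increase and a value of -1 is a two-fold decrease, so increases and decreases of the same size are symmetric around zero.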
![Page 35: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/35.jpg)
Systems Biology Workflow
DNA microarray data: wet lab-generated or published
Statistical analysis, clustering, Gene Ontology, term enrichment
Generate gene regulatory network
Modeling dynamics of the network
Visualizing the results
New experimental questions
Excel, stem
![Page 36: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/36.jpg)
And so on…
![Page 37: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/37.jpg)
Within-strain ANOVA Reveals How Many Genes Had Significant Changes in Expression at Any Timepoint

| ANOVA | wt | Δgln3 |
| --- | --- | --- |
| p < 0.05 | 2378/6189 (38.42%) | 1864/6189 (30.11%) |
| p < 0.01 | 1527/6189 (24.67%) | 1008/6189 (16.29%) |
| p < 0.001 | 860/6189 (13.90%) | 404/6189 (6.53%) |
| p < 0.0001 | 460/6189 (7.43%) | 126/6189 (2.04%) |
| B-H p < 0.05 | 1656/6189 (26.76%) | 913/6189 (14.75%) |
| Bonferroni p < 0.05 | 228/6189 (3.68%) | 26/6189 (0.42%) |
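The B-H (Benjamini-Hochberg) and Bonferroni rows are corrections for testing thousands of genes at once. The slides do not show how they were computed, so the following is only a sketch of the standard definitions of the two corrections, applied to a hypothetical list of five p values.

```python
def bonferroni(pvalues):
    """Multiply each p value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Adjusted p values that control the false discovery rate."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that the
    # adjusted values never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_significant(pvalues, cutoff=0.05):
    return sum(1 for p in pvalues if p < cutoff)


# Hypothetical p values, one per gene.
pvals = [0.001, 0.008, 0.02, 0.041, 0.6]
print(count_significant(pvals))                      # 4
print(count_significant(benjamini_hochberg(pvals)))  # 3
print(count_significant(bonferroni(pvals)))          # 2
```

The ordering matches the table: the uncorrected cutoff passes the most genes, Benjamini-Hochberg fewer, and Bonferroni the fewest.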
![Page 38: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/38.jpg)
A Modified T Test Was Used to Determine Significant Changes in Gene Expression at Each Timepoint

Number of Genes whose Expression Changes (wild type)

| | t15 (cold shock) | t30 (cold shock) | t60 (cold shock) | t90 (recovery) | t120 (recovery) |
| --- | --- | --- | --- | --- | --- |
| Increased, p < 0.05 | 439 (7%) | 668 (11%) | 609 (10%) | 398 (6%) | 191 (3%) |
| Decreased, p < 0.05 | 331 (5%) | 517 (8%) | 411 (7%) | 249 (4%) | 59 (1%) |
| Total, p < 0.05 | 770 (12%) | 1185 (19%) | 1020 (17%) | 647 (10%) | 250 (4%) |
![Page 39: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/39.jpg)
Short Time Series Expression Miner (stem) Software Clusters Genes with Similar Profiles
Figure: cluster profiles; y-axis: Expression (log2 fold change); x-axis: Time (minutes)
![Page 40: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/40.jpg)
![Page 41: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/41.jpg)
Short Time Series Expression Miner (stem) Software Clusters Genes with Similar Profiles
Figure: cluster profiles; y-axis: Expression (log2 fold change); x-axis: Time (minutes)
Gene Ontology categories assigned to clusters:
• Ribosome biogenesis
• Zinc ion homeostasis
• Hexose transport
• Endomembrane system
• Protein and vesicle transport
• Negative regulation of nitrogen compound process
![Page 42: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/42.jpg)
The Transcription Factor Gln3 Regulates Genes Involved in Nitrogen Metabolism
• Yeast differentiate between preferred and non-preferred nitrogen sources.
• When the nitrogen source is poor, Gln3 localizes to the nucleus and activates genes required to utilize the poor nitrogen source.
• The Δgln3 strain is impaired for growth at cold temperatures:
− Doubling time at 13°C is 15 hours vs. 8.3 hours for wild type.
• A microarray experiment was performed on the Δgln3 strain.
![Page 43: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/43.jpg)
Gln3 Target Genes Were Extracted from the YEASTRACT Database
37 out of 164 (23%) have significantly different expression profiles in the wild type versus the Δgln3 strain
![Page 44: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/44.jpg)
Systems Biology Workflow
DNA microarray data: wet lab-generated or published
Statistical analysis, clustering, Gene Ontology, term enrichment
Generate gene regulatory network
Modeling dynamics of the network
Visualizing the results
New experimental questions
YEASTRACT, Excel
![Page 45: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/45.jpg)
Genome-wide Location Analysis has Determined the Relationships between Transcription Factors and their Target Genes in Yeast
• Does not show whether activation or repression occurs
• Shows topology, but not the behavior of the network over time
• Data found in YEASTRACT database
Lee et al. (2002)
![Page 46: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/46.jpg)
A Transcriptional Network Controlling the Cold Shock Response
Assumptions made in our model:
• Each node represents one gene encoding a transcription factor.
• When a gene is transcribed it is immediately translated into protein; a node represents both the gene and the protein it encodes.
• An edge drawn between two nodes represents a regulation relationship, either activation or repression, depending on the sign of the weight.
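The edge convention in these assumptions, where the sign of a weight decides between activation and repression, can be sketched as a small lookup. The gene names are taken from the talk, but the weight values, the dictionary layout, and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical weights for illustration only; keys are (regulator, target).
weights = {
    ("CIN5", "PHD1"): 0.8,
    ("SWI4", "PHD1"): -0.3,
    ("PHD1", "PHD1"): 0.1,  # a node may regulate itself
}


def regulation(weight):
    """Classify an edge by the sign of its weight."""
    if weight > 0:
        return "activation"
    if weight < 0:
        return "repression"
    return "no regulation"


def regulators_of(target, weights):
    """Map each regulator of `target` to the kind of regulation it exerts."""
    return {
        regulator: regulation(w)
        for (regulator, tgt), w in weights.items()
        if tgt == target
    }


print(regulators_of("PHD1", weights))
# {'CIN5': 'activation', 'SWI4': 'repression', 'PHD1': 'activation'}
```

Because each node stands for both a gene and its protein, a single weight per edge is enough to describe how one transcription factor affects the expression of another.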
![Page 47: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/47.jpg)
Systems Biology Workflow
DNA microarray data: wet lab-generated or published
Statistical analysis, clustering, Gene Ontology, term enrichment
Generate gene regulatory network
Modeling dynamics of the network
Visualizing the results
New experimental questions
GRNmap (Windows-only)
![Page 48: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/48.jpg)
GRNmap: Gene Regulatory Network Modeling and Parameter Estimation
• Parameters are estimated from DNA microarray data from wild type and transcription factor deletion strains subjected to cold shock conditions.
• Weight parameter, w, gives the direction (activation or repression) and magnitude of regulatory relationship.
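Figure: sigmoidal production functions for activation and repression; y-axis from 0 to 1; x-axis annotated with 1/w

$$\frac{dx_i(t)}{dt} = \frac{P_i}{1 + \exp\left(-\left(\sum_j w_{ij}\,x_j(t) - b_i\right)\right)} - d_i\,x_i(t)$$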
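To see how a rate equation of this kind behaves, it can be integrated numerically. The sketch below is not GRNmap; it is a plain forward Euler loop in Python, assuming the sigmoidal form dx_i/dt = P_i / (1 + exp(-(Σ_j w_ij x_j - b_i))) - d_i x_i, with P as production rates, b as thresholds, and d as degradation rates. The two-gene network and every parameter value are hypothetical.

```python
import math


def production(i, x, w, b, P):
    """Sigmoidal production term for gene i: P_i / (1 + exp(-(sum_j w_ij * x_j - b_i)))."""
    signal = sum(w[i][j] * x[j] for j in range(len(x)))
    return P[i] / (1.0 + math.exp(-(signal - b[i])))


def simulate(x0, w, b, P, d, dt=0.01, steps=6000):
    """Forward Euler integration of dx_i/dt = production_i - d_i * x_i."""
    x = list(x0)
    for _ in range(steps):
        rates = [production(i, x, w, b, P) - d[i] * x[i] for i in range(len(x))]
        x = [xi + dt * ri for xi, ri in zip(x, rates)]
    return x


# Hypothetical two-gene network: gene 0 is unregulated, gene 1 is regulated by gene 0.
P = [1.0, 1.0]   # production rates
d = [0.5, 1.0]   # degradation rates
b = [0.0, 0.0]   # thresholds
x0 = [1.0, 1.0]  # initial expression levels

activated = simulate(x0, [[0.0, 0.0], [2.0, 0.0]], b, P, d)   # positive weight
repressed = simulate(x0, [[0.0, 0.0], [-2.0, 0.0]], b, P, d)  # negative weight
print(activated[1] > 0.5 > repressed[1])  # True
```

With a positive weight the regulated gene settles above half of its maximum production level, and with a negative weight of the same size it settles below, which is the activation versus repression behavior the weight parameter is meant to capture.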
![Page 49: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/49.jpg)
The “Worst” Rate Equation is:
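$$\frac{d(\mathrm{PHD1})}{dt} = \frac{P_{\mathrm{PHD1}}}{1 + \exp\left(-\left(w_{5}(\mathrm{CIN5}) + w_{10}(\mathrm{FHL1}) + w_{23}(\mathrm{PHD1}) + w_{30}(\mathrm{SKN7}) + w_{35}(\mathrm{SKO1}) + w_{41}(\mathrm{SWI4}) + w_{43}(\mathrm{SWI6}) - b_{14}\right)\right)} - d_{\mathrm{PHD1}}(\mathrm{PHD1})$$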
![Page 50: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/50.jpg)
Optimization of the 92 Parameters Requires the Use of a Regularization (Penalty) Term
• Plotting the least squares error function showed that not all the graphs had clear minima.
• We added a penalty term so that MATLAB’s optimization algorithm would be able to minimize the function.
• θ is the combined production rate, weight, and threshold parameters.
• α is determined empirically from the “elbow” of the L-curve.

$$E = \frac{1}{Q}\sum_{r=1}^{Q}\left[z_d(t_r) - z_c(t_r)\right]^2 + \alpha\,\lVert\theta\rVert^2$$

Figure: L-curve; axes: Parameter Penalty Magnitude and Least Squares Residual
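A penalized least squares objective of this kind can be written as a short function. The sketch below makes two assumptions that the slide text does not confirm: the penalty is the sum of squared parameters, and its weight is called alpha. The function name and all values are hypothetical, and the talk's own optimization was run in MATLAB, not Python.

```python
def penalized_error(z_data, z_model, theta, alpha):
    """Mean squared difference between data and model, plus a parameter penalty."""
    q = len(z_data)
    least_squares = sum((zd - zc) ** 2 for zd, zc in zip(z_data, z_model)) / q
    penalty = alpha * sum(t * t for t in theta)  # assumed penalty form
    return least_squares + penalty


# Hypothetical values: two data points, two parameters.
z_data = [1.0, 2.0]
z_model = [1.0, 0.0]
theta = [1.0, 2.0]

print(penalized_error(z_data, z_model, theta, alpha=0.0))  # 2.0
print(penalized_error(z_data, z_model, theta, alpha=0.5))  # 4.5
```

Raising alpha makes large parameter values more costly, which gives the optimizer a clear minimum to find even when the least squares term alone is nearly flat.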
![Page 51: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/51.jpg)
Forward Simulation of the Model Fits the Microarray Data
![Page 52: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/52.jpg)
Systems Biology Workflow
DNA microarray data: wet lab-generated or published
Statistical analysis, clustering, Gene Ontology, term enrichment
Generate gene regulatory network
Modeling dynamics of the network
Visualizing the results
New experimental questions
GRNsight
![Page 53: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/53.jpg)
GRNsight Rapidly Generates GRN graphs Using Our Customizations to the Open Source D3 Library
GRNsight: 10 milliseconds to generate, 5 minutes to arrange
Adobe Illustrator: several hours to create
GRNsight: colored edges for weights reveal patterns in data
![Page 54: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/54.jpg)
The First Round of Modeling Has Suggested Future Experiments
![Page 55: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/55.jpg)
Systems Biology Workflow
DNA microarray data: wet lab-generated or published
Statistical analysis, clustering, Gene Ontology, term enrichment
Generate gene regulatory network
Modeling dynamics of the network
Visualizing the results
New experimental questions
http://www.openwetware.org/wiki/Dahlquist:BioQUEST_Summer_Workshop_2015
![Page 56: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/56.jpg)
95% of Bioinformatics is Getting Your Data into the Correct File Format
• Exposes deficiencies in computer literacy skills in so-called “digital natives”
• When you leave your comfort zone, it is, by definition, uncomfortable
• Emphasis on research process
− Teamwork
− Electronic lab notebook
− Keeping track of files and code
− Trouble-shooting problems that arise in the research process: bugs, data issues, etc.
![Page 57: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/57.jpg)
Summary
• An open science ecosystem enhances student learning
• Quick example: XMLPipeDB project in a Biological Databases course
• Longer example: GRNmap project in Biomathematical Modeling course
• Potential research projects for BioQUEST participants
• Challenges are also opportunities
– Computer literacy
– Data literacy
– Information literacy
![Page 58: Open Science, Open Data, Open Source Projects for Undergraduate Research Experiences BioQUEST/HHMI/CaseNet Summer Workshop June 13, 2015 Kam D. Dahlquist,](https://reader031.vdocuments.net/reader031/viewer/2022013101/56649ee45503460f94bf2ad7/html5/thumbnails/58.jpg)
Acknowledgments
Ben G. Fitzpatrick, LMU Math
John David N. Dionisio, LMU Computer Science
Juan Carrillo, Natalie Williams, K. Grace Johnson, Kevin Wyllie, Kevin McGee, Monica Hong, Nicole Anguiano, Anindita Varshneya, Trixie Roque, (Tessa Morris)
Special thanks to John Jungck & Sam Donovan