2012 project research grant - statistics in the empirical...



Page 1: 2012 Project Research Grant - Statistics in the Empirical ...users.du.se/~lrn/obscure_tmp/5918.pdf · 2012 Project Research Grant - Statistics in the Empirical Sciences Area of science

Office notes (Kansliets noteringar)

Code (Kod): 2012-40084-97965-23
Registration number (Dnr):

2012 Project Research Grant - Statistics in the Empirical Sciences

Area of science: The Swedish Research Council
Announced grants: Thematic grants VR April 24 2012

Total amount for which applied (kSEK)

2013   2014   2015   2016   2017
1670   1701   1754   1808    929

Vetenskapsrådet, Box 1035, SE-101 38 Stockholm, tel. +46 (0)8 546 44 000, [email protected]

APPLICANT

Name (Last name, First name): Axelson-Fisk, Marina
Date of birth: 720504-5946
Gender: Female
Email address: [email protected]
Academic title: Associate professor
Position: Docent
Phone: 031-7724996
Doctoral degree awarded (yyyy-mm-dd): 1999-05-28

WORKING ADDRESS

University/corresponding, Department, Section/Unit, Address, etc.:
Chalmers tekniska högskola
Matematiska vetenskaper
Matematisk statistik
41296 Göteborg

ADMINISTERING ORGANISATION

Administering Organisation: Chalmers tekniska högskola

DESCRIPTIVE DATA

Project title, Swedish (max 200 char)

Statistisk Signifikans hos Biologiska Sekvenser

Project title, English (max 200 char)

Statistical Significance of Biological Sequences

Abstract (max 1500 char)

Life originated in the sea, and the species variety that we observe today has its roots in marine biology. Besides possessing the most diverse and unique genomes among all living things, marine organisms serve as illustrative indicators of climate change.

This project proposal focuses on the statistical treatment of biological sequence data in genomic pipelines. The foundation of the proposal is a larger project, which involves building an infrastructure for marine genomics research, in which our role is the development and management of a bioinformatics pipeline for high-throughput genome sequence analysis. The statistical modeling of biological sequence motifs, or sequence words, runs as a common thread through such a pipeline: in sequence assembly, separating sequences of different origin; in genome characterization, identifying functional elements in the sequence; in comparative analyses, characterizing the evolutionary relationships between species; in functional analyses, comparing the gene predictions to known gene and protein families; and in gene expression analysis, computing the relative abundance of various transcripts in various situations.

All these analyses rest upon the robust characterization and statistical modeling of biological sequence "words". Such word models will then be used for clustering genomic signatures, hypothesis testing between different gene sets, and comparative gene finding over large evolutionary distances.
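To make the notion of a sequence "word" concrete, the following is a minimal Python sketch, not the method proposed in this application. It assumes that a word is an overlapping substring of fixed length k over the alphabet A, C, G, T, that a genomic signature is the vector of relative word frequencies, and that two signatures are compared by Euclidean distance. The function names `word_frequencies` and `signature_distance`, the default k = 2, and the choice of distance are all illustrative assumptions.

```python
from collections import Counter
from itertools import product
from math import sqrt


def word_frequencies(seq, k=2):
    """Relative frequency of every DNA word of length k (overlapping windows).

    Illustrative assumption: windows containing anything other than
    A, C, G or T (for example N) are simply ignored.
    """
    seq = seq.upper()
    # All 4**k possible words, in a fixed lexicographic order.
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)
    return [counts[w] / total if total else 0.0 for w in words]


def signature_distance(seq_a, seq_b, k=2):
    """Euclidean distance between the word-frequency vectors of two sequences."""
    fa = word_frequencies(seq_a, k)
    fb = word_frequencies(seq_b, k)
    return sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))
```

A distance of this kind could serve as input to a clustering step; a statistical word model would additionally attach significance to the observed counts, which this sketch does not attempt.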

2012-5918


Abstract language: English

Keywords: statistical significance, bioinformatics, genomic signatures, cluster analysis

Research areas: Mathematics & Engineering Mathematics

Review panel: NT-R, NT-K

Classification codes (SCB) in order of priority: 10106, 10610, 10203

Aspects:

Continuation grant

Application concerns: New grant
Registration Number:

Application is also submitted to

similar to: identical to:

ANIMAL STUDIES

Animal studies: No animal experiments

OTHER CO-WORKER

Name (Last name, First name): Alm Rosenblad, Magnus
University/corresponding, Department, Section/Unit, Address, etc.: Sahlgrenska universitetssjukhuset, Institutionen för medicinsk och fysiologisk kemi
Date of birth: 570427-0072
Gender: Male
Academic title: PhD
Doctoral degree awarded (yyyy-mm-dd): 2005-03-15



ENCLOSED APPENDICES

A, B, B, C, C, N, S

APPLIED FUNDING: THIS APPLICATION

Funding period (planned start and end date): 2013-01-01 -- 2017-12-31

Staff/salaries (kSEK)

Main applicant          % of full time in the project     2013   2014   2015   2016   2017
Marina Axelson-Fisk     50                                 677    699    722    746    771

Other staff
Doctoral student        100                                733    758    783    808

Total, salaries (kSEK):                                   1410   1457   1505   1554    771

Other costs (kSEK)

                                                          2013   2014   2015   2016   2017
Other equipment                                             20
Conference travel                                          100    100    100    100     80
Literature                                                   5      5      5      5      5
Premises                                                   125    129    134    138     68
IT costs                                                    10     10     10     11      5

Total, other costs (kSEK):                                 260    244    249    254    158

Total amount for which applied (kSEK)

                                                          2013   2014   2015   2016   2017
                                                          1670   1701   1754   1808    929
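The yearly totals above can be cross-checked with a few lines of Python. This is only an arithmetic sketch using the figures stated in the form. It assumes that blank cells are zero and that the single equipment amount of 20 kSEK falls in 2013, which is the placement implied by the 2013 other-costs total of 260. The variable and function names are illustrative.

```python
# All amounts in kSEK, one value per year 2013..2017, as stated in the form.
salaries = {
    "main applicant": [677, 699, 722, 746, 771],
    "doctoral student": [733, 758, 783, 808, 0],  # assumption: blank 2017 cell = 0
}
other_costs = {
    "other equipment": [20, 0, 0, 0, 0],  # assumption: the 20 kSEK belongs to 2013
    "conference travel": [100, 100, 100, 100, 80],
    "literature": [5, 5, 5, 5, 5],
    "premises": [125, 129, 134, 138, 68],
    "IT costs": [10, 10, 10, 11, 5],
}


def column_sums(rows):
    """Sum each year's column across all rows of a cost table."""
    return [sum(column) for column in zip(*rows.values())]


salary_totals = column_sums(salaries)
other_totals = column_sums(other_costs)
applied = [s + o for s, o in zip(salary_totals, other_totals)]
grand_total = sum(applied)
```

Under these assumptions the computed yearly sums reproduce the totals stated in the form, and the five yearly amounts add up to 7862 kSEK.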

ALL FUNDING

Other VR-projects (granted and applied) by the applicant and co-workers, if applic. (kSEK)

Proj.no.(M) or reg.nr. / Funded 2012 / Funded 2013 / Applied 2013
2011-3996 900 900

Project title: Next Generation Comparative Genomics
Applicant: Marina Axelson-Fisk

Funds received by the applicant from other funding sources, incl ALF-grant (kSEK)

POPULAR SCIENCE DESCRIPTION

Popular science heading and description (max 4500 char)

The human body is made up of many billions of small cells, which are specialized in a range of different ways. Each cell contains a complete copy of our genetic code, our genome, which both constitutes the hereditary material passed on to our offspring and is responsible for all functions in our body. The hereditary material is carried by genes, which sit as subsequences on our chromosomes, which are themselves long, winding DNA sequences. The genes code for proteins, small molecules that perform and control virtually all activity in the body.

DNA sequencing means determining the exact order of the molecules (or bases) that make up a chromosome. In humans, the total DNA sequence in a single cell comprises over three billion such bases. The new sequencing technology, also called Next Generation Sequencing (NGS), has meant a revolution for molecular biologists and bioinformaticians. With NGS one can now produce enormous amounts of sequence data in a short time and at a cost that is only a fraction of that of earlier methods. What took the Human Genome Project over a decade, and enormous resources worldwide, can today be carried out in a few months. Suddenly it is within reach to sequence thousands of different humans, or all species within a given genus, or why not all the trillions of microbes found in the human body. One can begin to explore parts of the evolutionary tree of life that were previously entirely untouched, and characterize the genomes of completely unknown species with unknown relationships to one another. But with the new technology comes the difficulty of creating order and finding the meaning in all the data. The need for mathematical and computational tools that can handle the volume of data has suddenly become a bottleneck, and the challenges are enormous.

It has also become increasingly clear that knowledge of an organism's genome and all its gene products is not sufficient to understand all functions in the body. A striking example is the sequencing and analysis of the giant panda genome. Although the giant panda is classified among the carnivores, it is a herbivore with bamboo as its main food. The genetic analysis indeed shows that the panda has all the genes needed for a carnivorous diet, but that, surprisingly, it lacks the components necessary to break down cellulose. A paradox, one might think. The explanation is believed to lie in the flora of microorganisms living in the body. Only when we have sequenced these do we begin to get all the pieces of the puzzle of how the body works.

Metagenomics is a new field that is a direct product of the new sequencing technology. Instead of mapping a single organism after time-consuming cultivation in the lab, one takes samples directly from some environment of interest and sequences all organisms present in the sample, for example the contents of a bucket of sea water, or the composition of the human gastrointestinal tract. In this way one can form a picture of which microorganisms are present in a given environment at a given time, and gain insight into their interplay with the surroundings. The greatest challenge here is to identify which organism each sequence belongs to, especially as such a sample may contain tens of thousands of species of unknown origin.

The goal of this project is to develop statistical models that can be used to compare DNA sequences of different origin and under different conditions. By building robust models with known properties, one can discern patterns in the sequences that can be used, for example, to separate species in a metagenomics project, to identify distinguishing characteristics between species, to identify co-regulated groups of genes, or to locate functional components in the DNA sequence.


Appendix A: Research programme



Appendix B: Curriculum vitae


Magnus Alm Rosenblad, 570427-0072 Appendix C

Appendix C: Publications – Magnus Alm Rosenblad

1 Peer-reviewed articles

Number of citations from Web of Science as of 20 April 2012.

1. Metaxa: a software tool for automated detection and discrimination among ribosomal small subunit (12S/16S/18S) sequences of archaea, bacteria, eukaryotes, mitochondria, and chloroplasts in metagenomes and environmental sequencing datasets
   Bengtsson J., K. M. Eriksson, M. Hartmann, Z. Wang, B. D. Shenoy, G-A. Grelet, K. Abarenkov, A. Petri, M. Alm Rosenblad and R. Henrik Nilsson.
   Antonie van Leeuwenhoek international journal of general and molecular microbiology 2011 Oct; 100(3); 471-475.
   Number of citations: 1

2. Evolutionary loss of 8-oxo-G repair components among eukaryotes
   Jansson, K., Blomberg, A., Sunnerhagen, P. and Alm Rosenblad, M.
   Genome Integrity 2010 Sep 1;1(1):12
   Number of citations: 0

3. Octopamine receptors from the barnacle Balanus improvisus exhibit cAMP-mediated signaling by the antifouling substance medetomidine
   U. Lind, M. Alm Rosenblad, L. Hasselberg Frank, S. Falkbring, L. Brive, J. M. Laurila, K. Pohjanoksa, A. Vuorenpää, J. P. Kukkonen, L. Gunnarsson, M. Schenin, L.G.E. Mårtensson Lindblad, A. Blomberg
   Molecular Pharmacology 2010 Aug; 78(2):237-248.
   Number of citations: 6

4. Kinship in the SRP RNA family
   M. Alm Rosenblad, Larsen, N, Samuelsson T., Zwieb, C.
   RNA Biology 2009 Nov 7;6(5).
   Number of citations: 11

5. Conserved and variable domains of RNase MRP RNA
   M. Dávila López, M. Alm Rosenblad, T. Samuelsson
   RNA Biology July/August 2009;3(6)
   Number of citations: 18

6. Computational screen for spliceosomal RNA genes aids in defining the phylogenetic distribution of major and minor spliceosomal components
   M. Dávila López, M. Alm Rosenblad, T. Samuelsson
   Nucleic Acids Research 2008 May;36(9).
   Number of citations: 26

7. Inventory and phylogenetic analysis of the protein subunits of the ribonucleases P and MRP
   M. Alm Rosenblad, M. Dávila López, P. Piccinelli, and T. Samuelsson
   Nucleic Acids Research 2006 Sep 22; 34(18):5145-56.
   Number of citations: 24

8. The tmRDB and SRPDB resources
   Andersen,E.S., Rosenblad,M.A., Larsen,N., Westergaard,J.C., Burks,J., Wower,I.K., Wower,J., Gorodkin,J., Samuelsson,T., Zwieb,C.
   Nucleic Acids Research 2006 Jan 1;34.
   Number of citations: 36

9. Identification and analysis of ribonuclease P and MRP RNA in a broad range of eukaryotes
   Piccinelli,P., Rosenblad,M.A. and Samuelsson,T.
   Nucleic Acids Research 2005, 33(14).
   Number of citations: 54

10. A nomenclature for all signal recognition particle RNAs
    Zwieb,C., van Nues,R., Rosenblad,M.A., Brown,J., and Samuelsson,T.
    RNA, 2005.
    Number of citations: 22

11. Identification of chloroplast signal recognition particle RNA genes
    Rosenblad,M.A. and Samuelsson,T.
    Plant and Cell Physiology 2004 45: 1633-1639.
    Number of citations: 16

12. Identification and comparative analysis of components from the signal recognition particle in protozoa and fungi
    Rosenblad,M.A., Zwieb,C. and Samuelsson,T.
    BMC Genomics, 2004, 5:5.
    Number of citations: 29


13. SRPDB: Signal Recognition Particle Database
    Rosenblad,M.A., Gorodkin,J., Knudsen,B., Zwieb,C. and Samuelsson,T.
    Nucleic Acids Research, Vol. 31, Nr. 1, 2003.
    Number of citations: 72

14. Prediction of signal recognition particle RNA genes
    Regalia, M., Rosenblad, M.A., Samuelsson, T.
    Nucleic Acids Research, 30(15) 2002.
    Number of citations: 44


Marina Axelson-Fisk, 720504-5946 Appendix N

Appendix N: Budget and research resources

N.1: Budget motivation

The project application covers five years, 2013-2017.

Salaries: We apply for 50% support for Marina Axelson-Fisk for five years, and 100% for one doctoral student for four years.

Equipment: We apply for 20 kSEK for the purchase of suitable computer equipment for the doctoral student.

Travel grants: We apply for (80 + 20) kSEK per year for travel to important conferences, such as ISMB, ECCB, Recomb, IMS and JMS.

Literature: We apply for 5 kSEK per year for the purchase of publication-related materials.

N.2: Total research resources

This application is intended to cover the costs of Marina Axelson-Fisk's participation in the project, together with a doctoral student or a postdoc. Magnus Alm Rosenblad is covered by his salary from Göteborgs Universitet. No other grants are currently available in the project.

Type of grant: Project grant
Status: Applied
Funder: VR
Holder/project leader: Marina Axelson-Fisk
Grant period: 2013-2017
Total amount (kSEK): 7862
