ercb-nephromine-eurenomics database · 2015-12-04 · ercb-nephromine-eurenomics database maja t....



ERCB-Nephromine-EURenOmics Database

Maja T. Lindenmeyer, Korbinian Grote, Sebastian Martini, Thomas Werner, Matthias Kretzler, Clemens D. Cohen

Introduction

The main goal of WP7 is to develop a multiscalar view of rare renal diseases by integrating diverse large-scale data sources. Data generated in WP2-6, data from cooperating consortia, and publicly available datasets will be integrated into the EURenOmics platform.

[Figures: ERCB – study design; ERCB – disease distribution (>2600 biopsies)]

European Renal cDNA Bank (ERCB)

Histological information obtained by renal biopsies is a cornerstone of the current management of kidney disease. However, histology-based analysis yields mainly descriptive diagnostic categories and gives limited prognostic information; therefore additional sources for defining renal disease on a molecular level are desirable. For this purpose a comprehensive European Renal cDNA Bank (ERCB) of kidney biopsies was established in an interdisciplinary European collaboration of renal research centres.

Disease                    Total # of biopsies   # of arrays (Glom)   # of arrays (Tub)
Minimal Change (MCD)       66                    15                   14
FSGS                       83                    29                   17
Membranous GN              130                   21                   18
Diabetic nephropathy       55                    11 (Glom or Tub; the other count is missing in the source)
Hypertensive nephropathy   115                   15                   20
IgA nephropathy            251                   27                   25
Lupus nephropathy          106                   32                   32
Living Donors              50                    38                   46

ERCB – Arrays (included in EURenOmics Database)

Procedure: Immediately after renal biopsy, a minimum of 10% of the biopsy specimen is separated and stored in RNAlater solution. Under a stereomicroscope, glomeruli and tubulointerstitial compartments are manually microdissected. Gene expression analysis is performed by real-time RT-PCR or microarrays. Clinical characteristics of the patients are collected in parallel to enable integrative analysis of the gene expression data.

Nephromine

Nephromine (http://nephromine.org) is a web-based, kidney-specific systems biology search engine that provides data from 20 publicly available gene expression datasets comprising 1757 samples (murine and human), incorporates clinical data, and allows various analyses.

Nephromine - Combined database and systems biology search engine for the renal research community

Coexpression Analysis: Nephromine identifies sets of genes that are coexpressed across a panel of tissue samples. Coexpression suggests shared function.
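
As an illustration of the coexpression idea only (the poster does not describe how Nephromine computes it), the sketch below scores gene pairs by the Pearson correlation of their expression across samples. The gene names, expression values, and the 0.9 threshold are hypothetical assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpressed_pairs(expr, threshold=0.9):
    """Return (gene1, gene2, r) for pairs correlated at or above the threshold."""
    pairs = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if r >= threshold:
            pairs.append((g1, g2, round(r, 3)))
    return pairs


# Hypothetical log-expression values for three genes across five samples.
expr = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 4.0, 6.2, 7.9, 10.0],
    "GENE_C": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(coexpressed_pairs(expr))
```

Here only GENE_A and GENE_B rise and fall together across the samples, so they form the single reported pair.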

Differential analysis: Nephromine precomputes differential expression profiles using Student’s t-test for two-class differential expression analyses and standard correlation methods for multiclass ordinal analyses (e.g. types or subtypes of renal disease).
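
For the two-class case, a minimal sketch of the Student's t statistic with pooled variance is shown below. It is a textbook formulation written for illustration, not Nephromine's code; the function name and the expression values are hypothetical, and the p-value lookup against the t distribution is left out.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes equal variances in both groups.
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical log-expression of one gene in disease vs. living-donor samples.
disease = [8.1, 7.9, 8.4, 8.0]
control = [6.9, 7.2, 7.0, 7.1]
t_stat, dof = student_t(disease, control)
print(f"t = {t_stat:.2f}, df = {dof}")
```

A positive t indicates higher mean expression in the first group; in practice the statistic would be computed per gene and corrected for multiple testing.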

Outlier Analysis: Outliers merit separate analysis because they can reveal information that group-level statistics miss. Some subgroups of samples may not fit the overall pattern without reaching statistical significance in a group comparison. It is nonetheless important to recognize this variation within the samples.
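
The poster does not say which outlier method Nephromine uses. As one possible sketch, samples can be flagged with a modified z-score based on the median and the median absolute deviation (MAD), which is less affected by the outliers themselves than a mean-based score. The function name, the 3.5 cutoff, and the values are assumptions for illustration.

```python
from statistics import median


def mad_outliers(values, cutoff=3.5):
    """Return indices of samples whose modified z-score exceeds the cutoff."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread around the median; nothing can be flagged
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > cutoff
    ]


# Hypothetical expression of one gene across six samples; the last one deviates.
samples = [5.0, 5.2, 4.9, 5.1, 5.0, 9.8]
print(mad_outliers(samples))
```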

Concept Filter:
Knowledge-based concepts: Nephromine captures lists of genes associated with specific conceptual annotations from known data sources.
Nephromine-derived concepts: concepts created from differentially expressed genes in datasets that involve at least two groups.

The database consists of three layers: data input, data analysis, and data visualization. The data input layer has two components: the gene expression data pipeline and the annotation data warehouse. Data sources include the public domain (GEO at NCBI and ArrayExpress at EBI), studies generated in the framework of the O’Brien Center, and datasets obtained by request from ongoing or published studies.

Nephromine – Architecture

The data analysis layer consists of sample facts standardization and automated statistical analysis. The sample facts standardization uses the NCI Thesaurus and manual annotation. The automated statistical analysis component is implemented in Perl and R. A series of scripts monitors the database for new data and sample parameters and automatically performs differential expression analysis, cluster analysis, and concept analysis when needed.
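
The monitoring step can be pictured roughly as follows. This is a sketch in Python for illustration only; the actual component is written in Perl and R and its interfaces are not described on the poster. The function names, the job bookkeeping, and the dataset identifiers are hypothetical.

```python
# The three automated analyses named in the text.
ANALYSES = ("differential_expression", "cluster", "concept")


def find_pending(datasets, completed):
    """List (dataset_id, analysis) jobs that have not been run yet."""
    return [(ds, a) for ds in datasets for a in ANALYSES
            if (ds, a) not in completed]


def run_pending(datasets, completed, runner):
    """Run every pending job once and record it as completed."""
    for job in find_pending(datasets, completed):
        runner(*job)
        completed.add(job)
    return completed


# Hypothetical state: one analysis of dataset "ds1" is already done.
done = {("ds1", "cluster")}
run_pending(["ds1", "ds2"], done,
            lambda ds, analysis: print(f"running {analysis} on {ds}"))
```

Calling `run_pending` repeatedly is harmless, because jobs already recorded in `completed` are skipped; that is the property a periodic monitoring script needs.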

EURenOmics data repository & interactive analysis platform

To allow smooth upload, download, and sharing of data among partners, a web-based functional data repository has been set up. The interactive platform provides standardized and harmonized datasets and uses existing architectures from Genomatix’s GePS and ElDorado databases. It provides consortium members with tools for NGS and gene regulatory analysis as well as pathway & network mining. Data from the ERCB have already been integrated, and further EURenOmics data will be added as they become available.

EURenOmics – Web-based functional data repository EURenOmics – interactive platform for further data analysis

https://eurenomics-ftp.genomatix.de/Login

ftp://eurenomics-ftp.genomatix.de

The data repository is generally accessible to every EURenOmics partner. However, to protect intellectual property there will be general access restrictions and privileged group rights, depending on lock-up times for publications and patent filings. This requirement is fulfilled by dividing the data repository into three parts: private (accessible only by the group), public (open to all partners of EURenOmics), and WPs (open to all members of the respective WP).
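
The three-part access rule can be expressed as a small check. The sketch below is illustrative only and does not reflect the repository's actual implementation; the `User` type, the area names, and the example group and WP labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    group: str                 # the partner group the user belongs to
    work_packages: frozenset   # WPs the user is a member of, e.g. {"WP7"}


def can_access(user, area, owner=None):
    """Decide access to one part of the repository.

    area is "public", "private" (owner = group name) or "wp" (owner = WP name).
    """
    if area == "public":
        return True  # open to all EURenOmics partners
    if area == "private":
        return user.group == owner
    if area == "wp":
        return owner in user.work_packages
    raise ValueError(f"unknown repository area: {area!r}")


member = User("example_user", "group_a", frozenset({"WP7"}))
print(can_access(member, "private", "group_b"))  # another group's private area
print(can_access(member, "wp", "WP7"))           # the user's own work package
```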

[Figure: Data repository architecture, with screenshots of Login and File Editing (Download, Delete, Rename)]

Acknowledgment

The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 2012-305608 (EURenOmics).

We thank all participating centers of the European Renal cDNA Bank-Kroener-Fresenius biopsy bank (ERCB KFB) and their patients for their cooperation. Active members at the time of the study: Clemens David Cohen, Holger Schmid, Michael Fischereder, Lutz Weber, Matthias Kretzler, Detlef Schlöndorff, Munich/Zurich/Ann Arbor/New York; Jean Daniel Sraer, Pierre Ronco, Paris; Maria Pia Rastaldi, Giuseppe D'Amico, Milano; Peter Doran, Hugh Brady, Dublin; Detlev Mönks, Christoph Wanner, Würzburg; Andrew Rees, Aberdeen; Frank Strutz, Gerhard Anton Müller, Göttingen; Peter Mertens, Jürgen Floege, Aachen; Norbert Braun, Teut Risler, Tübingen; Loreto Gesualdo, Francesco Paolo Schena, Bari; Jens Gerth, Gunter Wolf, Jena; Rainer Oberbauer, Dontscho Kerjaschki, Vienna; Bernhard Banas, Bernhard Krämer, Regensburg; Moin Saleem, Bristol; Rudolf Wüthrich, Zurich; Walter Samtleben, Munich; Harm Peters, Hans-Hellmut Neumayer, Berlin; Mohamed Daha, Leiden; Katrin Ivens, Bernd Grabensee, Düsseldorf; Francisco Mampaso(†), Madrid; Jun Oh, Franz Schaefer, Martin Zeier, Hermann-Joseph Gröne, Heidelberg; Peter Gross, Dresden; Giancarlo Tonolo, Sassari; Vladimir Tesar, Prague; Harald Rupprecht, Bayreuth; Hermann Pavenstädt, Münster; Hans-Peter Marti, Bern.

Conclusion

The EURenOmics Database and Analysis Pipeline will provide a central bioinformatics platform for addressing mechanistic, diagnostic, and therapeutic challenges at the systems biology level.

Example: GePS – Pathway analysis

EURenOmics-Genomatix-Data Analysis Pipeline

The pipeline is designed with a large degree of flexibility to allow, in a WP-specific manner, multiple entry and exit points for large-scale datasets, integration into biological context, and subsequent presentation of dependencies in an intuitive graphical manner for iterative data analysis. It supports, for example, NGS downstream analysis (peak finding, RNA-Seq, and CNV analysis), fully automatic analysis of ChIP-Seq data (ChIP-Seq Workflow), de novo pattern definition and detection of known and new patterns (GEMS Launcher), and enrichment analyses (GePS).