Assessment of cardiac microRNA high throughput sequencing data sets generated from RNA of varying...


Upload: australian-bioinformatics-network

Post on 15-Aug-2015


TRANSCRIPT

Page 1: Assessment of cardiac microRNA high throughput sequencing data sets generated from RNA of varying quality - David T Humphreys

hsa-miR-196a-5p hsa-miR-196a-3p-2

• A bioinformatic solution to analyse miRNA-seq data sets.
• An HTML document that encodes a complete miRNA-seq data set. No internet connectivity required.
• Has inbuilt research tools that provide detailed analysis of miRNA processing features.
• Perfect solution for supplementary information in publications.

miRspring: Visualisation & Data analysis

miRspring document (miRNA sequence profiling)

Figure 4: (A) Software pipeline for creating a miRspring document. Three Perl scripts are required, as well as data files accessible from the miRBase website (mature.fa, hairpin.fa and species.gff). (B) The miRspring document is significantly smaller than the BAM files, yet contains all the sequence information.

Global visualisation mode Focused visualisation mode

Web site: http://miRspring.victorchang.edu.au
Enquiries: [email protected]
Reference: Humphreys and Suter (2013), Nucleic Acids Research 41(15):e147

Figure 5: Opening the miRspring document (in any internet browser) loads (A) the “Global” visualisation mode, which provides access to the raw counts of the whole data set in tabular or graphic format. Selecting a point on the graph or a table entry loads (B) the focused view, which displays the sequencing information for that miRNA. (C) The focused visualisation mode has a unique format to display sequences. (D) QR code linking to a video screencast on miRspring features.

Left ventricular assist device (LVAD)

Dilated Cardiomyopathy (DCM)

Familial Congenital Idiopathic Post Partum Viral Chemotherapy Alcohol

Hypertrophic Cardiomyopathy (HCM)

2cm (normal 1.2 – 1.5cm)

Ischemia (Isch)

Ideal scenario

• Four classes of NYHA (New York Heart Association) heart failure:
1) Cardiac disease but no symptoms and no limitation in ordinary physical activity (shortness of breath when walking, climbing stairs etc.)
2) Mild symptoms (shortness of breath/angina) with exertion
3) Symptoms with exertion
4) Symptoms at rest

• Implanted with the intention to bridge the patient to heart transplantation.
• Offered to individuals with severe end-stage heart failure in order to improve survival & quality of life.
• A mechanical circulatory device that replaces the function of a failing heart.

- Impeller part of the device is magnetically levitated and hydro-dynamically suspended
- Delivers up to 10 L of blood per minute at 100 mmHg
- Placed with the inflow cannula draining the left ventricular cardiac apex and the outflow cannula inserted into the ascending aorta

Figure 1a. HeartWare Figure 1b. HeartMate II

• Estimated prevalence of heart failure in Australia is 325,000 patients, costing >$1 billion per annum (R. A. Clark, S. McLennan, A. Dawson, D. Wilkinson, S. Stewart, Heart Lung Circ 13, 266 (Sep, 2004)).

• Heart failure has a number of aetiologies.


AIMS

(i) To implement quality control measures
(ii) To measure differential miRNA expression in hearts, before and after LVAD implantation

Tissue is taken at LVAD implantation (core section) and at explant (i.e. when the patient receives a heart transplant).

Heart Failure

Assessment of cardiac microRNA high throughput sequencing data sets generated from RNA of varying quality.

David T Humphreys 1,2, Kavitha Muthiah 1,3, Liza Thomas 4, Peter Macdonald 1,2,3, Chris Hayward 1,2,3

1 Victor Chang Cardiac Research Institute, Sydney, Australia, 2 St Vincent’s Clinical School, UNSW, Australia, 3 St Vincent Hospital, Darlinghurst, Sydney, Australia, 4 Liverpool Hospital, NSW, Australia

Figure 2: (A) miRNAs are processed from longer transcripts (pri-miR, pre-miR) by RNase III-class enzymes (Drosha, Dicer) into a 22 bp miRNA duplex. One strand is preferentially loaded into the RNA induced silencing complex (RISC) and guides it to semi-complementary sequences. miRNA can be degraded 5’ to 3’ by exonuclease activity. (B) miRBase (www.mirbase.org), the central repository for miRNA sequences, stores two types of sequences: (i) the stem loop, containing the precursor miRNA sequence (e.g. brown text) and (ii) mature miRNAs (e.g. highlighted blue box). miRBase defines the naming nomenclature.

Ischemia (miR-195, miR-320)

Heart failure (miR-195, miR-23, Dicer)

miRNA-Seq

Figure 1: A) Many aspects of miRNA processing can be assessed from miRNA-Seq. These include (i) 5’ isomiRs, (ii) 3’ isomiRs, (iii) non-canonical processing, (iv) arm bias, (v) miRNA length, (vi) RNA editing, (vii) miRNA clusters. (B) Examples of the types of isomiRs that can be detected in miRNA-seq data sets. Note that both 3’ isomiRs and seed isomiRs have the same seed sequence. Bioinformatic analysis of miRNA-Seq data requires a good understanding of isomiRs, miRNA clusters and “multi-loci” miRNA.
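The 5’ and 3’ isomiR categories in this caption can be illustrated with a short sketch. The Python below is not miRspring code; it only shows one plausible way to label a read by comparing its position on a precursor with the position of the canonical mature sequence. The names `classify_read`, `IsomiR` and `TOY_PRECURSOR` are hypothetical, and the flanking bases in `TOY_PRECURSOR` are invented for illustration (they are not the real let-7a hairpin). The let-7a sequences are the ones shown in panel B of the poster.

```python
from typing import NamedTuple


class IsomiR(NamedTuple):
    five_prime_offset: int   # read start minus canonical start
    three_prime_offset: int  # read end minus canonical end
    label: str


def classify_read(precursor: str, canonical: str, read: str) -> IsomiR:
    """Label a read relative to the canonical mature miRNA.

    Assumes exact matches and uses the first occurrence of each
    sequence on the precursor (no mismatches, no multi-mapping).
    """
    canon_start = precursor.find(canonical)
    read_start = precursor.find(read)
    if canon_start < 0 or read_start < 0:
        raise ValueError("sequence not found on precursor")

    d5 = read_start - canon_start
    d3 = (read_start + len(read)) - (canon_start + len(canonical))

    if d5 == 0 and d3 == 0:
        label = "canonical"
    elif d3 == 0:
        label = "5' isomiR"
    elif d5 == 0:
        label = "3' isomiR"
    else:
        label = "5' and 3' isomiR"
    return IsomiR(d5, d3, label)


LET7A = "UGAGGUAGUAGGUUGUAUAGUU"            # canonical let-7a from the poster
TOY_PRECURSOR = "GGGA" + LET7A + "AGGG"     # invented flanks, illustration only

if __name__ == "__main__":
    for read in (LET7A, "GAGGUAGUAGGUUGUAUAGUU", "UGAGGUAGUAGGUUGUAUAGU"):
        print(read, classify_read(TOY_PRECURSOR, LET7A, read))
```

Positive offsets mean the read end lies further 3’ than the canonical end; negative offsets mean it lies further 5’. A real pipeline would take the offsets from alignment coordinates rather than from string search.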

mmu-mir-196a-1 hsa-miR-196a-5p hsa-miR-196a-3p-1 5’ 3’ mmu-mir-196a-2

Naming nomenclature
Precursor miRNAs: <species> – <mir> – <family ID> – <multi loci ID>
Mature miRNAs: <species> – <miR> – <family ID> – <hairpin arm> – <multi loci ID>
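As a rough illustration of this nomenclature, the Python sketch below splits a name into the fields listed above. It is an assumption-laden example, not part of miRspring or miRBase: `parse_mir_name`, `MirName` and the regular expression are hypothetical, and the pattern only covers names of the `mir`/`miR` form shown on the poster (it would reject, for example, let-7 family names).

```python
import re
from typing import NamedTuple, Optional


class MirName(NamedTuple):
    species: str
    kind: str               # "precursor" (mir) or "mature" (miR)
    family: str
    arm: Optional[str]      # "5p" / "3p", mature names only
    locus: Optional[str]    # multi loci ID, if present


# Hypothetical pattern: <species>-<mir|miR>-<family>[-<arm>][-<locus>]
_NAME = re.compile(
    r"^(?P<species>[a-z]{3,4})-(?P<tag>mir|miR)-(?P<family>\d+[a-z]*)"
    r"(?:-(?P<arm>5p|3p))?(?:-(?P<locus>\d+))?$"
)


def parse_mir_name(name: str) -> MirName:
    m = _NAME.match(name)
    if m is None:
        raise ValueError(f"unrecognised miRNA name: {name!r}")
    kind = "mature" if m["tag"] == "miR" else "precursor"
    if kind == "precursor" and m["arm"] is not None:
        raise ValueError("precursor names do not carry a hairpin arm")
    return MirName(m["species"], kind, m["family"], m["arm"], m["locus"])


if __name__ == "__main__":
    for n in ("mmu-mir-196a-1", "hsa-miR-196a-5p", "hsa-miR-196a-3p-2"):
        print(n, parse_mir_name(n))
```

The capitalisation of `mir` versus `miR` is the only thing that distinguishes precursor from mature names here, which is why the pattern is case-sensitive.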

miRNA biogenesis

Figure 2 labels: pri-miRNA, Drosha, pre-miRNA, Exportin 5, DICER, miRNA duplex (miRNA, miRNA*), asymmetric RISC assembly, RISC, translational repression. Panels A and B.

heart transplant

PROTOCOL

Tissue
• Tissue collected within 5 minutes of coring/explant
• Tissue placed into RNAlater (Ambion)
• Snap frozen in liquid nitrogen. Stored at -80 °C

RNA extraction
• Tissues homogenized
• RNA purified using Qiagen miRNeasy kit
• QC analysis on Agilent Bioanalyzer

Library prep
• 300-1000 ng total RNA input (NEB library prep kit)
• 12 cycles of PCR amplification
• QC analysis on Agilent Bioanalyzer

Sequencing
• Next Generation Sequencing (SOLiD 5500xl)

Data analysis
• BAM files produced from Lifescope small RNA pipeline
• miRNA profiling statistics calculated with miRspring software

Fibrosis (miR-21, miR-29)

Myocardial infarction (miR-29)

Ageing heart (miR-34a)

Figure 3: miRNAs are important in heart development and homeostasis. The miRNAs listed have been identified as having functional roles in key aspects of development/homeostasis. This is not a comprehensive list; the primary research articles implicating these miRNAs are referenced in the following reviews:

Porrello ER (2013), Clin Sci 125(4):151-66
Small and Olson (2011), Nature 469:336-342
Callis and Wang (2008), Trends Mol Med 14(6):254-60

Figure is modified from Callis and Wang, 2008.

miRNA in the heart

miR-499

Hypothesis: There is miRNA remodelling in hearts with LVAD support

RESULTS: miRNA-seq QC analysis

To date we have collected and profiled miRNAs in 39 samples. For some preparations we recovered significantly degraded RNA. Degradation was measured by calculating RNA integrity number (RIN) scores with the Agilent Bioanalyzer. We examined whether the miRNA population was compromised in samples with low RIN scores.

We see negligible trends in miRNA processing across RNA of varying RIN scores. This suggests that there is no significant:
- 3’ to 5’ miRNA degradation (3’ isomiRs and miRNA length)
- 5’ to 3’ miRNA degradation (5’ isomiRs)
- degradation of the precursor transcript (non-canonical processing)
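A trend between RIN and a processing metric can be quantified with a coefficient of determination (the poster's plots report an R² value for one panel). The sketch below is a minimal, standard-library way to compute R² for a simple linear relationship; it is not the analysis used for the poster, which was done in R. The function name `r_squared` is hypothetical and the numbers in the usage section are invented placeholders, not study data.

```python
from typing import Sequence


def r_squared(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Squared Pearson correlation between xs and ys.

    For a simple linear regression of ys on xs this equals R^2.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when one variable is constant")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Invented placeholder values: RIN score vs % 3' isomiRs per sample.
    rin = [2.1, 3.4, 5.0, 6.8, 8.9]
    pct_3p_isomirs = [41.0, 39.5, 42.2, 40.1, 40.8]
    print(f"R^2 = {r_squared(rin, pct_3p_isomirs):.3f}")
```

An R² near zero for a metric such as % 3’ isomiRs would be consistent with the "negligible trend" reading, whereas a value near one would indicate that the metric tracks RNA integrity closely.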

RESULTS: Assessment of public data sets

We converted public data sets into miRspring documents and assessed which processing parameters were typical of a miRNA-seq data set. In total, ~895 million sequence tags aligned to miRBase v19 precursors; these were distributed across 73 miRspring documents requiring less than 56 megabytes of disk space.
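To put these figures in perspective, the arithmetic can be worked through as below. This is a back-of-envelope sketch that assumes the 56 megabyte figure is the combined size of all 73 documents and that 1 megabyte is 10^6 bytes; because the source gives "~895 million" and "less than 56 megabytes", the results are approximate upper bounds. The function name `summarise` is hypothetical.

```python
TOTAL_TAGS = 895_000_000   # "~895 million" aligned sequence tags
N_DOCUMENTS = 73           # miRspring documents
TOTAL_SIZE_MB = 56         # upper bound, assumed to cover all documents


def summarise(total_tags: int, n_documents: int, total_size_mb: float) -> dict:
    """Average tags and storage per document, and storage per tag."""
    return {
        "tags_per_document": total_tags / n_documents,
        "mb_per_document": total_size_mb / n_documents,
        "bytes_per_tag": total_size_mb * 1_000_000 / total_tags,
    }


if __name__ == "__main__":
    for key, value in summarise(TOTAL_TAGS, N_DOCUMENTS, TOTAL_SIZE_MB).items():
        print(f"{key}: {value:,.3f}")
```

Under these assumptions each document holds roughly 12 million tags in well under a megabyte, which works out to a small fraction of a byte per aligned tag.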

References for data sets:
AGO IP: Burroughs, A.M. et al. RNA Biol 8, 158-177 (2011)
TISSUE ATLAS: Cloonan, N. et al. Genome Biol 12, R126 (2011)
ENCODE: Fejes-Toth, K. et al. Nature 457, 1028-1032 (2009)

Figure panels: RIN vs miRNA length (nt), RIN vs % 5’ isomiRs, RIN vs % 3’ isomiRs, RIN vs % non-canonical processing. Axis label: RIN score.

Key/Legend: Left Ventricle implants, Left Ventricle explants, Right Ventricle implants, Right Ventricle explants

Conclusions

• The hundred most abundant miRNAs are most likely to match the entries found in miRBase.

• Less abundant miRNAs are more likely to be shorter and non-canonically processed.

Tissue Atlas: Heart, Kidney, Liver, Lung, Ovary, Spleen, Testes, Thymus, Brain, Placenta

AGO IP: THP-1

ENCODE: HeLa S3, A549, Ag04450, Bj, Gm1287, H1hesc, HepG2, Huvec, K562, MCF7, NheK, Sknshra

Sampling bias!

Individual data sets (as visualised in miRspring)

ALL data sets

Distribution of abundant miRNAs Key processing features

5’ isomiRs
let-7a: UGAGGUAGUAGGUUGUAUAGUU
let-7a +1 5’ isomiR: GAGGUAGUAGGUUGUAUAGUU

3’ isomiRs
let-7a: UGAGGUAGUAGGUUGUAUAGUU
let-7a -1 3’ isomiR: UGAGGUAGUAGGUUGUAUAGU

seed isomiRs
miR-196: UAGGUAGUUUCCUGUUGUUGGG
let-7a +1 5’ isomiR: GAGGUAGUAGGUUGUAUAGUU

Acknowledgments

Thanks to Djordje Djordjevic for help getting up and running with the statistical package R, which was used to generate these results.


SCAN to load example miRspring document

Figure panels: RIN vs % miRNA mapping, RIN vs most abundant miR (%). Axis label: RIN score.

R² = 0.56

What is a RIN score?

Conclusions