Functional Non-Coding DNA Part I Non-coding genes and non-coding elements of coding genes BNFO 602/691 Biological Sequence Analysis Mark Reimers, VIPBG

Upload: vicky

Post on 23-Feb-2016

Page 1: Functional Non-Coding DNA Part I Non-coding genes and non-coding elements of coding genes

Functional Non-Coding DNA Part I

Non-coding genes and non-coding elements of coding genes

BNFO 602/691 Biological Sequence Analysis

Mark Reimers, VIPBG

Page 2

What Does ‘Functional Non-Coding DNA’ Mean?

• DNA whose sequence affects transcripts made from DNA in some way
• Could affect transcription levels, splicing or sequestering of RNA
• Three main ways to identify functional non-coding elements
  – Sequence characteristics – favored bases
  – Genomic conservation
  – Epigenetic marks and open chromatin, especially outside of genes

Page 3

Types of Non-Coding Elements

• Non-coding RNAs
  – miRNAs, lncRNAs, etc.
• Non-coding gene elements
  – UTRs, poly-adenylation sites, splice sites and regulating elements, RNA-binding sites
• DNA elements outside genes – our main focus
  – Promoters
  – Enhancers/Silencers
  – Insulators

Page 4

Types of Non-Coding RNA

• microRNAs
• Small interfering RNAs (siRNAs)
• Small nuclear/nucleolar RNAs
• Piwi-Interacting RNAs
• Long Non-Coding RNAs
• Circular RNAs
• Still other RNAs???
• Comprehensive database at www.ncrna.org

Page 5

Micro-RNAs

• Micro-RNAs are small non-coding RNA molecules, about 21–25 nucleotides in length
• They are processed from much longer genes, or from introns within pre-mRNA, by several molecular pathways
• Micro-RNAs base-pair with complementary sequences within mRNA molecules, often in the 3’ or 5’ UTR
• miRNA binding usually results in gene repression, either via translational stalling or by triggering mRNA degradation

Image by Charles Mallery, U of Miami
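The base-pairing rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a target-prediction method: it assumes that a candidate site is a perfect Watson-Crick match to the miRNA "seed" (nucleotides 2 to 8), a common simplification that the slide itself does not state. The function names and the toy UTR are hypothetical; the miRNA is the human let-7a-5p sequence.

```python
# Sketch: find perfect seed matches for a miRNA in a UTR (RNA alphabet).
# Assumption (not from the slides): a "site" is the reverse complement
# of miRNA nucleotides 2-8, matched exactly.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_sites(mirna, utr):
    """Return 0-based start positions in `utr` that pair with the miRNA seed."""
    seed = mirna[1:8]                  # nucleotides 2-8 of the miRNA
    site = reverse_complement(seed)    # sequence expected on the mRNA
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)  # step by 1 so overlapping sites count
    return hits


LET7A = "UGAGGUAGUAGGUUGUAUAGUU"       # hsa-let-7a-5p
TOY_UTR = "AAGCUACCUCAGGGACUACCUCUU"   # hypothetical UTR fragment

print(seed_sites(LET7A, TOY_UTR))      # [3, 15]
```

Real target prediction also weighs site conservation, flanking sequence and site type, so a seed match by itself is only a candidate.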

Page 6

Micro-RNAs

• The human genome encodes over 1500 miRNAs, which are believed to affect more than half of human genes
• miRNAs are abundant in many cell types
  – Thousands of copies per cell of some miRNAs
  – Those within gene introns share regulation with the host gene
• miRNAs are well-conserved across vertebrates
  – No orthologs between plant and animal miRNAs
  – miRBase is the comprehensive repository of micro-RNAs

Page 7

Other Short RNAs: siRNA

• Small interfering RNAs are double-stranded with an overhang

• They are processed by some of the same machinery as miRNAs and have some of the same effects

Page 8

Other Short RNAs: piRNA

• Piwi-Interacting RNAs are longer, 26–31 base, single-stranded RNAs
  – Named for the PIWI (P-element Induced Wimpy Testis) proteins they bind
• Over 50,000 sequences known in mouse
  – They are the largest class of ncRNA
• They seem to play an ancient role in defense against retroviruses and transposons

Page 9

Other Short RNAs: snRNAs & snoRNAs

• Small nuclear RNAs (snRNAs) are typically ~150 bases long, and associate with proteins
  – Many conserved copies of each snRNA gene
  – U1-U6 snRNAs are key parts of the splicing machinery
• Small nucleolar RNAs (snoRNAs)
  – Guide chemical modifications of other RNAs
  – Prader-Willi syndrome results from deletion of a region containing 29 copies of SNORD116 on chr 15q11

U6 snRNA

Page 10

Long Non-Coding RNAs

• Many long (>200bp) stretches of the genome are transcribed and have epigenetic marks like those of protein-coding genes
• Most of these are spliced RNAs with two (or more) exons
• GENCODE v15 has 13.5K lncRNAs
• See also
  – Derrien et al, Genome Research 2012
  – Lee, Science 2012

From Derrien et al Genome Res 2012

Page 11

Many lncRNAs Induce Silencing

• Coat nearby gene(s) and silence them
• Xist binds to gene clusters first
• Xist binds disparate parts of the chromosome
• Many lncRNAs are antisense to genes
• Some lncRNAs maintain pluripotency of stem cells

From Jeannie Lee lab (Harvard) website

Page 12

Long Non-Coding RNAs - 2

• Most lncRNAs are expressed in only a few tissues
• Most human lncRNAs are specific to the primate lineage

From Derrien et al Genome Res 2012

Page 13

Circular RNAs

• Several thousand non-coding RNAs apparently form circular structures
• Many form complexes with AGO and seem to absorb attached miRNAs, blocking processing
• CDR1as (the circular RNA antisense to CDR1) has about 70 conserved binding sites for miR-7

Page 14

Functional Pseudo-Genes

• Pseudo-genes are copies of genes that are decaying and rarely (if ever) make proteins
• Some pseudo-genes act to absorb negative regulators of the original gene, e.g. SRGAP2B

Page 15

How to Identify Non-Coding RNAs?

• Short (and long) RNA transcriptomes
• Promoter chromatin marks for independent (non-embedded) miRNAs and lncRNAs
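As a rough illustration of a first pass over a short-RNA transcriptome, reads can be binned by length using the size ranges quoted earlier in these slides (about 21–25 nt for miRNAs, 26–31 nt for piRNAs). This is only a sketch under that assumption: length alone does not identify an RNA class, and the labels, function names and toy reads below are hypothetical.

```python
from collections import Counter

# Length ranges taken from the slides; labels are "-like" because
# length is only a hint, not a classification.
LENGTH_CLASSES = {
    "miRNA-like": range(21, 26),   # 21-25 nt
    "piRNA-like": range(26, 32),   # 26-31 nt
}


def classify_read(read):
    """Assign a read to a size class by its length alone."""
    n = len(read)
    for label, lengths in LENGTH_CLASSES.items():
        if n in lengths:
            return label
    return "other"


def length_profile(reads):
    """Count how many reads fall in each size class."""
    return Counter(classify_read(read) for read in reads)


# Toy reads: only their lengths matter here.
toy_reads = ["A" * 22, "C" * 23, "G" * 28, "U" * 150]
print(length_profile(toy_reads))
```

In practice the reads would next be aligned to the genome and compared with annotation such as miRBase before any class is assigned.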

Page 16

DEMO: Display HOTAIR & XIST Tracks in UCSC Browser

Page 17

Non-Coding Elements of Genes

• TSS
• 5' UTRs
• Introns
• Splicing regulation sites
• 3' UTRs
• Termination/Poly-adenylation sites

Page 18

Transcription Start Sites

• Transcription of most genes may initiate at several distinct clusters of locations with distinct promoters for each TSS

• Two major types of metazoan TSS: CG-rich broad TSS, and narrow (often tissue-specific) TSS
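Whether a promoter region is "CG-rich" can be quantified with two simple statistics: the G+C fraction and the observed/expected CpG ratio, (number of CG dinucleotides × length) / (number of C × number of G). The sketch below is a minimal illustration; the function name and the toy sequences are hypothetical, and no particular threshold is implied by the slides.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")   # CG cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (c * g) / n.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


# Toy examples (hypothetical sequences, far shorter than a real promoter)
print(cpg_stats("CGCGCGCG"))   # (1.0, 2.0)
print(cpg_stats("ATATATAT"))   # (0.0, 0.0)
```

On real data these statistics are usually computed in sliding windows of a few hundred bases around each annotated TSS.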

Page 19

Transcription Start Sites

Transcription often starts at CG within promoter

Page 20

5’ Untranslated Regions

• The first exon often contains dozens to thousands of bases before the start codon (median 150)
• It sometimes contains regulatory sequences, e.g. binding sites for RNA binding proteins and translation initiators

Page 21

Splice Regulatory Sites

• Splicing is achieved through binding of the spliceosome to recognition sequences on the nascent RNA molecule
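The best-known recognition sequences are the dinucleotides at the intron boundaries: most introns begin with GT and end with AG (GU...AG in the RNA). The sketch below checks that rule for a given intron interval. It is a minimal illustration only: the function name and the toy gene are hypothetical, and real splice-site models also score the surrounding bases, the branch point and the polypyrimidine tract.

```python
def is_canonical_intron(seq, start, end):
    """True if seq[start:end] (0-based, end-exclusive) follows the GT-AG rule."""
    intron = seq[start:end].upper()
    return (
        len(intron) >= 4
        and intron.startswith("GT")   # donor (5') splice site
        and intron.endswith("AG")     # acceptor (3') splice site
    )


# Toy gene built from hypothetical exon and intron pieces (DNA, sense strand)
exon1 = "ATGGCC"
intron = "GTAAGTTTTTTTTCAG"
exon2 = "GGATAA"
gene = exon1 + intron + exon2

intron_start = len(exon1)
intron_end = intron_start + len(intron)

print(is_canonical_intron(gene, intron_start, intron_end))       # True
print(is_canonical_intron(gene, intron_start + 1, intron_end))   # False
```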

Page 22

Splice Regulatory Sites

• Tissue-specific splice regulatory sites are highly conserved

From Merkin et al Science 2012

Page 23

Splicing Patterns Evolve in All Tissues Except Brain

From Merkin et al Science 2012

Page 24

Non-Coding Elements in Coding Exons

• Many regulatory sites occur within coding exons, esp. toward the 5’ end
• These constrain some codons as much as protein sequence does
• Many human SNPs break transcription factor binding sites (TFBS) but have little effect on protein (as far as we know)

From Stergachis et al Science 2013

Page 25

3’ Untranslated Regions

• Longest exon is usually the 3’ UTR (>1000 nt)
• Typically 1/3 – 1/2 of a gene is in 5’ & 3’ UTRs
• The 3’ UTR has binding sites for miRNAs and RNA binding proteins
• AU-rich elements (AREs) regulate mRNA stability, most often by promoting decay
• Proteins recognize complex secondary structure

GRIK4 3’UTR secondary structure is conserved

Page 26

RNA Binding-Protein Sites

• mRNAs are usually further processed (e.g. transported or sequestered)
• RNA binding proteins recognize specific motifs within secondary structure of the 3’ or 5’ UTR
• These sites are often highly conserved

From Ray et al Nature 2013

Page 27

Poly-adenylation/Termination Sites

• Transcripts can be terminated and poly-adenylated at sites with specific sequences
• Most genes have alternate poly-adenylation sites
• Median lengths of 3’UTR are 250 & 1773 bp (mouse)
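One of the "specific sequences" is the poly(A) signal hexamer, canonically AAUAAA with AUUAAA as a common variant, found a short distance upstream of the cleavage site. The sketch below lists every occurrence of those two hexamers in a 3’ UTR, which is one simple way to see how a gene can carry several alternate poly-adenylation sites. The function name and toy sequence are hypothetical, and scanning for two hexamers is an assumption made here for illustration, not a complete site model.

```python
import re

# Canonical signal and its most common variant, in DNA alphabet.
POLYA_SIGNALS = ("AATAAA", "ATTAAA")


def find_polya_signals(utr):
    """Return sorted (position, hexamer) pairs for poly(A) signals in `utr`.

    Accepts DNA or RNA; positions are 0-based starts of each hexamer.
    """
    dna = utr.upper().replace("U", "T")
    hits = []
    for signal in POLYA_SIGNALS:
        # Lookahead so that overlapping occurrences are all reported.
        for match in re.finditer(f"(?={signal})", dna):
            hits.append((match.start(), signal))
    return sorted(hits)


TOY_UTR = "GCCAATAAAGCTTGCATTAAACCG"   # hypothetical 3' UTR fragment
print(find_polya_signals(TOY_UTR))     # [(3, 'AATAAA'), (15, 'ATTAAA')]
```

A transcript cleaved after the first signal would have a shorter 3’ UTR than one cleaved after the second, which is the basis of the short and long isoforms discussed on these slides.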

Page 28

Poly-adenylation/Termination Sites

• Rapidly proliferating cells express gene isoforms with short 3’ UTRs
• Neurons typically have longer 3’ UTRs

Types of alternate poly-adenylation (from Elkon et al, NRG 2013)

Page 29

DEMO: GAPDH and GABRA1 in UCSC Browser