![Page 1: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/1.jpg)
ICG-11: Genomic Data Projects around the World - How to find data for your research
Fiona Nielsen – November 4th 2016
![Page 2: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/2.jpg)
We are always looking for data
Genetics, cancer, and rare disease research
We need access to the right data at the right time
DNA interpretation requires lots of data
![Page 3: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/3.jpg)
How much data do you need to publish a paper?
2001: 1 human genome
2012: 1000 Genomes (1092 genomes, since increased to ~2500)
2015: UK10K; Icelandic population (2,636 + 100k imputed); The Cancer Genome Atlas ~11,000 genomes
2016: ExAC consortium 65,000 exomes; gnomAD ~126,000 exomes
2020: ?
![Page 4: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/4.jpg)
Data is not easy to find and access
FRAGMENTED: Poor visibility of available genomic data
ADMIN BURDEN: Huge overhead to manage data access
BAD CULTURE: Lack of data sharing habits in research culture
![Page 5: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/5.jpg)
Finding and accessing data can take months
[Chart: Time spent data scouting per project. Categories: < 1 week, 1-3 months, +6 months. Values as listed on the slide: 40%, 48%, 11%.]
![Page 6: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/6.jpg)
Why the barrier?
Barriers
• Difficult to find data, let alone find the RIGHT data
• Time-consuming and difficult to apply for access to data
• Complicated and laborious to submit data to public repositories
http://blog.repositive.io/tag/data-access/
http://blog.repositive.io/tag/data-sharing/
![Page 7: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/7.jpg)
But where in the world is the data?
?
![Page 8: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/8.jpg)
DATA is fragmented
![Page 9: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/9.jpg)
How to make data easy to discover?
![Page 10: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/10.jpg)
We have identified hundreds of data sources
Universities – or repositories affiliated with a university.
Projects/Consortia – have a specific purpose or aim, often focussed on a specific research question or disease.
Public repositories – allow download and upload of data from multiple institutions.
Companies – for-profit organisations making data available for free or as a service.
Biobanks – many hold sequence data for their biological samples.
Researchers know on average 4-5 data sources.
More data sources appear every day; to date we have identified 350+.
![Page 11: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/11.jpg)
Simpler workflow for data access
And indexed them on the Repositive platform
Discover and access
Efficient Search, see related results
Find colleagues & their data interests
Co-annotate data & community feedback
Free to use: http://discover.repositive.io
![Page 12: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/12.jpg)
Platform launched in Sept 2016
1 Million+ Human genomic datasets indexed
![Page 13: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/13.jpg)
177k whole exomes
213k whole genomes
2,400 23andMe samples
![Page 14: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/14.jpg)
61+ countries
426+ research organisations using Repositive
PDX Consortium with AstraZeneca
![Page 15: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/15.jpg)
Data sources across the globe
Geolocation of 278 data sources analysed, found by tracking the IP address of each source.
These include:
Public Repositories
Universities
Companies
Biobanks
Research consortia
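
The slide says only that sources were located by their IP address, so the following is a minimal sketch of how such a tally could be made, not the method actually used. The function names and the `EXAMPLE_RANGES` table are hypothetical; the table uses reserved documentation address ranges with made-up country codes, where a real analysis would need a full IP geolocation database.

```python
import ipaddress
import socket
from collections import Counter
from urllib.parse import urlparse

# Hypothetical lookup table mapping network ranges to country codes.
# These are reserved documentation ranges, not real allocations, and the
# country codes are invented purely for illustration.
EXAMPLE_RANGES = {
    ipaddress.ip_network("192.0.2.0/24"): "GB",
    ipaddress.ip_network("198.51.100.0/24"): "US",
    ipaddress.ip_network("203.0.113.0/24"): "CN",
}


def hostname_of(url):
    """Extract the host part of a data source URL (None if there is none)."""
    return urlparse(url).hostname


def country_of(ip, ranges=EXAMPLE_RANGES):
    """Return the country code of the first range containing ip, or None."""
    address = ipaddress.ip_address(ip)
    for network, country in ranges.items():
        if address in network:
            return country
    return None


def count_by_country(urls, resolve=socket.gethostbyname, ranges=EXAMPLE_RANGES):
    """Resolve each source URL to an IP address and tally the countries."""
    counts = Counter()
    for url in urls:
        host = hostname_of(url)
        if host is None:
            counts["unresolved"] += 1
            continue
        try:
            ip = resolve(host)
        except OSError:  # includes DNS failures (socket.gaierror)
            counts["unresolved"] += 1
            continue
        counts[country_of(ip, ranges) or "unknown"] += 1
    return counts
```

The `resolve` argument can be swapped for a stub to try the tally without network access. One caveat with this approach: an IP address shows where a server is hosted, which is not necessarily where the organisation behind the data source is based.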
![Page 16: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/16.jpg)
Data source content
Assay Types
Dedicated to…
![Page 17: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/17.jpg)
Sequenced ethnicities
Aboriginals
African Americans
Africans
Australians
Chinese
Malays
Indians
Danish
Dutch
Estonian
Russian
European Ancestry
Finnish
Icelandic
Japanese
Korean
Latin Americans
Saudi
Swedish
![Page 18: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/18.jpg)
Machines & Data sources
[Map: machines and data sources by region; one label reads "23 International"]
Interesting site to look at: http://omicsmaps.com/stats
![Page 19: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/19.jpg)
• Repositive is supporting the whole research workflow
• Faster, more efficient data discovery
• Streamlining data access applications
• Developing technology for efficient data access
• Setting up pre-competitive data sharing agreements
• Running workshops and training programmes
More efficient data access
Read about our pre-competitive PDX data resource, in collaboration with AstraZeneca: http://repositive.io/pdx
![Page 20: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/20.jpg)
Building upon best practices
MAKE DATA DISCOVERABLE
SIMPLIFY WORKFLOWS
CONTRIBUTE TO COMMUNITY
First 30 data sources listed here:
DNAdigest and Repositive – Connecting the world of genomic data
http://www.tinyurl.com/plos-biology-repositive
![Page 21: ICG-11 - genomic data projects around the world - nov 5 2016](https://reader035.vdocuments.net/reader035/viewer/2022070509/589ee2ec1a28ab39498b73df/html5/thumbnails/21.jpg)
Connecting the world of genomic data
Visit us at: http://repositive.io
Or tweet us @repositiveio
Free to use: http://discover.repositive.io
Fiona Nielsen, CEO
Email us: [email protected]