big data in bioinformatics - bist · big data is bioinformatics •heterogeneous data •numerical...
![Page 1: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/1.jpg)
BIG DATA IN
BIOINFORMATICS
![Page 2: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/2.jpg)
BIG DATA IS
BIOINFORMATICS
![Page 3: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/3.jpg)
Big data is bioinformatics
• Heterogeneous data
• Numerical
• Non-numerical
• Structures at different levels, from molecules to organisms (images)
• Sequences
• Longitudinal/dynamic (movies)
• Multi-dimensional
• Collected at multiple sites
• Produced by everyone from individual small labs to large international consortia
• Shared through the internet
• Real-time access
• Need for integrative analysis
![Page 4: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/4.jpg)
bioinformatics 24,500,000
chemoinformatics 275,000
astroinformatics 27,800
neuroinformatics 331,000
socioinformatics 14,100
geoinformatics 548,000
meteoinformatics 146
econoinformatics 2,010
ecoinformatics 92,800
physicoinformatics 5,390
Google search results for X-informatics terms (June 4, 2015)
![Page 5: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/5.jpg)
[Figure: Number of scientific expeditions, by decade from 1760 to 1890; axis label: # of commissioned years (scale 0 to 45). Source: Cedric Notredame, CRG]
![Page 6: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/6.jpg)
Cedric Notredame, CRG
![Page 7: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/7.jpg)
Thomas Heinis, EPFL
![Page 8: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/8.jpg)
Stephens ZD et al. PLOS Biology, 2015
![Page 9: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/9.jpg)
Big Data: Astronomical or Genomical?
Table 1. Four domains of Big Data in 2025
![Page 10: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/10.jpg)
We are the Big Data
![Page 11: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/11.jpg)
Wearable medical devices
![Page 12: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/12.jpg)
Implantable wearable devices
![Page 13: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/13.jpg)
Nanowearables
![Page 14: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/14.jpg)
Stephens ZD et al. PLOS Biology, 2015
Moore’s Law
![Page 15: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/15.jpg)
![Page 16: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/16.jpg)
2 PB per 1 g DNA
![Page 17: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/17.jpg)
“Goldman prediction”
• 2 PB per 1 g of DNA ($2 \times 10^{15}$ bytes)
• Total world information (2013): 3 ZB ($3 \times 10^{21}$ bytes)
• Approximately $1.5 \times 10^{6}$ g (1.5 tonnes) of DNA to store all information
• Information doubling time: 2 years
• Mass of the Earth: $6 \times 10^{27}$ g (source: Google)
• $1.5 \times 10^{6} \times 2^{x/2} \approx 6 \times 10^{27}$, so $x \approx 140$ years
• The mass of all the world's information stored in DNA would exceed the mass of the Earth in the year 2157
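The arithmetic behind this estimate can be checked with a short script. This is only a sketch using the figures quoted on the slide (storage density, 2013 world information total, doubling time, Earth mass). The function and constant names are illustrative assumptions, not part of the original talk.

```python
import math

# Figures quoted on the slide; the names are hypothetical, chosen for illustration.
DENSITY_BYTES_PER_GRAM = 2e15   # 2 PB per gram of DNA
WORLD_INFO_2013_BYTES = 3e21    # 3 ZB of information in 2013
DOUBLING_TIME_YEARS = 2.0       # information doubles every 2 years
EARTH_MASS_GRAMS = 6e27         # mass of the Earth
START_YEAR = 2013


def dna_mass_grams(total_bytes, bytes_per_gram):
    """Mass of DNA needed to hold total_bytes at the given density."""
    return total_bytes / bytes_per_gram


def years_to_reach(start_mass, target_mass, doubling_years):
    """Solve start_mass * 2**(x / doubling_years) = target_mass for x."""
    return doubling_years * math.log2(target_mass / start_mass)


mass_2013 = dna_mass_grams(WORLD_INFO_2013_BYTES, DENSITY_BYTES_PER_GRAM)
years = years_to_reach(mass_2013, EARTH_MASS_GRAMS, DOUBLING_TIME_YEARS)
crossover_year = math.ceil(START_YEAR + years)

print(f"DNA mass needed in 2013: {mass_2013:.3g} g")
print(f"Years until it matches the Earth's mass: {years:.1f}")
print(f"Crossover year: {crossover_year}")
```

With these inputs the starting mass is 1.5 tonnes and the crossover comes out at roughly 143.5 years after 2013, which is consistent with the slide's rounded figure of about 140 years and its year 2157. Changing `DENSITY_BYTES_PER_GRAM` shows how sensitive the result is to the assumed storage density.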
![Page 18: BIG DATA IN BIOINFORMATICS - BIST · BiG data is bioinformatics •Heterogeneous data •numerical •non-numerical ... bioinformatics 24,500,000 chemoinformatics 275,000 astroinformatics](https://reader035.vdocuments.net/reader035/viewer/2022071011/5fc9d6b30487c725ec11facf/html5/thumbnails/18.jpg)
215 PB per 1 g DNA