Chunlei Wu
@chunleiwu
Associate Professor of Molecular Medicine
Dept. of Molecular Experimental Medicine
The Scripps Research Institute
La Jolla, CA, USA
02/23/2016 HeartBD2K PI meeting at EBI
Integrated Annotation Services for "BioThings"
From MyGene.info and MyVariant.info towards BioThings API
Annotations centered around bio-entities
Gene
Variant
Pathway
Metabolite
Disease
So many variant annotation resources
dbNSFP
The Exome Aggregation Consortium (ExAC)
BioGPS.org – a gene portal for biologists
Schematic view of MyVariant.info architecture
Each data source is updated individually. Colors indicate their different updating schedules.
High-performance web service APIs
MyVariant.info for the end users:
http://MyVariant.info (currently v1 API, two endpoints)
http://MyVariant.info/v1/query?q=<query>
  input: any query term(s)
  output: matching variant hits
http://MyVariant.info/v1/variant/<variantid>
  input: HGVS ID(s)
  output: matching variant object(s)
Both endpoints support batch mode via POST.
Simple API. No sign-up. No API key.
Try our live API and documentation.
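As a minimal sketch of the two endpoints, the Python below only builds the request URLs from the patterns shown on the slide. The helper names `query_url` and `variant_url` are hypothetical, and the example rsID and HGVS ID are borrowed from the ClinVar example later in this deck. Nothing is sent over the network.

```python
from urllib.parse import quote, urlencode

# Base URL of the v1 API, as written on the slide.
BASE_URL = "http://MyVariant.info/v1"


def query_url(query):
    """Query endpoint: any query term(s) in, matching variant hits out."""
    return f"{BASE_URL}/query?{urlencode({'q': query})}"


def variant_url(hgvs_id):
    """Annotation endpoint: one HGVS ID in, one variant object out."""
    # HGVS IDs contain ':' and '>', so percent-encode the whole path segment.
    return f"{BASE_URL}/variant/{quote(hgvs_id, safe='')}"


print(query_url("rs1800562"))
print(variant_url("chr6:g.26093141G>A"))
```

Either URL could then be fetched with any HTTP client, for example `urllib.request.urlopen`; no API key is involved.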
MyGene.info for the end users:
http://MyGene.info (currently v2 API, two endpoints)
http://MyGene.info/v2/query?q=<query>
  input: any query term(s)
  output: matching gene hits
http://MyGene.info/v2/gene/<geneid>
  input: gene ID(s)
  output: matching gene object(s)
Both endpoints support batch mode via POST.
Simple API. No sign-up. No API key.
Try our live API and documentation.
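Batch mode via POST might look like the sketch below, which builds a request for several gene IDs without sending it. The slide does not show the POST body, so the comma-separated `ids` form field is an assumption to check against the live API documentation. The helper name `batch_gene_request` is hypothetical and the second gene ID is an arbitrary example.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Base URL of the v2 API, as written on the slide.
BASE_URL = "http://MyGene.info/v2"


def batch_gene_request(gene_ids):
    """Build (but do not send) one POST request covering several gene IDs."""
    # Assumption: the IDs travel as a comma-separated "ids" form field.
    body = urlencode({"ids": ",".join(gene_ids)}).encode("ascii")
    return Request(f"{BASE_URL}/gene", data=body, method="POST")


req = batch_gene_request(["3077", "1017"])
print(req.get_method(), req.full_url, req.data)
```

Passing `req` to `urllib.request.urlopen` would send it. One POST replaces many GET calls, which is the point of batch mode.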
MyGene.info usage updates
Monthly hits: 2M last year, 3M this year
Increased client adoption
Requests by MyGene.info clients, 10/07/2015-01/05/2016 (pie chart; slices of 35%, 30%, 26%, and 9%)
Highlights:
• mygene Python client usage now surpasses BioGPS usage
• mygene R client usage has increased to 9% from <1%
mygene Python client hosted in PyPI
mygene R client hosted in Bioconductor
MyVariant.info updates
Over 334 million annotated variants in total
The Exome Aggregation Consortium (ExAC)
New additions:
dbNSFP
Updated:
MyVariant.info updates
Requests, 10/07/2015-01/05/2016 (pie chart; slices of 68%, 30%, and 2%)
1 million requests in 3 months
MyVariant.info official Python/R Clients
myvariant Python client hosted in PyPI (initial release in Aug 2015)
myvariant R client hosted in Bioconductor (initial release in Oct 2015)
From our users' love
Next?
MyVariant.info
MyGene.info
Make our APIs serve Linked Data via JSON-LD
JSON + context = JSON-LD
{
  "@context": {
    "clinvar": "http://schema.myvariant.info/datasource/clinvar",
    "rcv": "http://schema.myvariant.info/datanode/rcv",
    "gene": "http://schema.myvariant.info/datanode/gene",
    "_id": "@id"
  },
  "_id": "chr6:g.26093141G>A",
  "clinvar": {
    "@context": {
      "uniprot": "http://identifiers.org/uniprot/",
      "omim": "http://identifiers.org/omim/"
    },
    "chrom": "6",
    "alt": "A",
    "ref": "G",
    "allele_id": 15048,
    "rsid": "rs1800562",
    "rcv": {
      "@context": {
        "accession": "http://identifiers.org/clinvar"
      },
      "accession": "RCV000000020",
      "origin": "germline",
      "clinical_significance": "risk factor"
    },
    "gene": {
      "@context": {
        "symbol": "http://identifiers.org/hgnc.symbol/"
      },
      "id": "3077",
      "symbol": "HFE"
    },
    "omim": "613609.0001",
    "variant_id": 9
  }
}
Processed JSON-LD
JSON-LD N-Quads output:
<chr6:g.26093141G>A> <http://schema.myvariant.info/datasource/clinvar> _:b0 .
_:b0 <http://identifiers.org/omim/> "613609.0001" .
_:b0 <http://schema.myvariant.info/datanode/gene> _:b1 .
_:b0 <http://schema.myvariant.info/datanode/rcv> _:b2 .
_:b1 <http://identifiers.org/hgnc.symbol/> "HFE" .
_:b2 <http://identifiers.org/clinvar> "RCV000000020" .
JSON-LD compacted output:
{
  "@id": "chr6:g.26093141G>A",
  "http://schema.myvariant.info/datasource/clinvar": {
    "http://identifiers.org/omim/": "613609.0001",
    "http://schema.myvariant.info/datanode/gene": {
      "http://identifiers.org/hgnc.symbol/": "HFE"
    },
    "http://schema.myvariant.info/datanode/rcv": {
      "http://identifiers.org/clinvar": "RCV000000020"
    }
  }
}
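To illustrate how a context turns plain JSON keys into IRIs, here is a toy resolver in standard-library Python. It is not a JSON-LD processor and does not follow the full specification (a real one would be a library such as PyLD). The function name `expand_keys` is hypothetical, and the input is a trimmed copy of the ClinVar example above.

```python
import json


def expand_keys(node, inherited=None):
    """Rename keys to IRIs using the nearest @context; drop unmapped keys."""
    if not isinstance(node, dict):
        return node  # plain values pass through unchanged
    # A nested @context extends (and may override) the inherited one.
    context = dict(inherited or {})
    context.update(node.get("@context", {}))
    out = {}
    for key, value in node.items():
        if key == "@context":
            continue
        target = context.get(key)
        if target is None:
            continue  # no term definition for this key, so it is dropped
        out[target] = expand_keys(value, context)
    return out


doc = {
    "@context": {
        "clinvar": "http://schema.myvariant.info/datasource/clinvar",
        "gene": "http://schema.myvariant.info/datanode/gene",
        "_id": "@id",
    },
    "_id": "chr6:g.26093141G>A",
    "clinvar": {
        "@context": {"omim": "http://identifiers.org/omim/"},
        "rsid": "rs1800562",
        "omim": "613609.0001",
        "gene": {
            "@context": {"symbol": "http://identifiers.org/hgnc.symbol/"},
            "id": "3077",
            "symbol": "HFE",
        },
    },
}

linked = expand_keys(doc)
print(json.dumps(linked, indent=2))
```

Keys with no entry in any enclosing context (`rsid` and `id` here) disappear from the result, which matches how the processed output on the slide omits fields such as `chrom`, `alt`, and `ref`.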
Linked Data for data aggregation
MyVariant.info:
{
  "_id": "chr1:g.196659237C>T",
  "cosmic": {
    "tumor_site": "breast",
    "mut_freq": 0.49
  },
  "clinvar": {…},
  "dbsnp": {…},
  …
}
Another Variant API:
{
  "pop": "GWD",
  "nobs": 226,
  "freq": 0.371681415929,
  …
}
Aggregated:
{
  "_id": "chr1:g.196659237C>T",
  "cosmic": {
    "tumor_site": "breast",
    "mut_freq": 0.49
  },
  "clinvar": {…},
  "dbsnp": {…},
  "new_src": {
    "pop": "GWD",
    "nobs": 226,
    "freq": 0.371681415929
  },
  …
}
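The aggregation step can be sketched as nesting the other API's record under its own source key in the variant object. This is only an illustration of the idea, assuming both records are already known to describe the same variant. The function name `add_source` is hypothetical, and the elided fields from the slide are left out.

```python
import copy


def add_source(variant_doc, source_name, source_doc):
    """Return a copy of variant_doc with source_doc nested under source_name."""
    if source_name in variant_doc:
        raise KeyError(f"source {source_name!r} is already present")
    merged = copy.deepcopy(variant_doc)  # leave the caller's object untouched
    merged[source_name] = copy.deepcopy(source_doc)
    return merged


variant = {
    "_id": "chr1:g.196659237C>T",
    "cosmic": {"tumor_site": "breast", "mut_freq": 0.49},
}
other_api_record = {"pop": "GWD", "nobs": 226, "freq": 0.371681415929}

merged = add_source(variant, "new_src", other_api_record)
print(sorted(merged))
```

Keeping each source under its own key means a new source never collides with fields from existing ones such as `cosmic`.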
BioThings API
MyVariant.info MyGene.info
JSON data aggregation mechanism
High-performance query engine
Well-designed REST API pattern
JSON-LD enabled Linked Data
Data-updating scheduler
Python/R clients
…
BioThings API
API specs, API SDK, API registry
BD2K API working group
Acknowledgement
Funding and Support: U54GM114833, U01HG008473
Washington U: Ben Ainscough, Obi Griffith
TSRI: Andrew Su, Jiwen Xin, Cyrus Afrasiabi, Ginger Tsueng, Adam Mark, Greg Stupp, Tim Putman
STSI: Eric Topol, Ali Torkamani, Galina Erikson
U. Washington: Sean Mooney, Moritz Juchler, Nikhil Gopal
OICR: Robin Haw
UC Berkeley: Chris Mungall
UCSD: Trish Whetzel
MyVariant.info
MyGene.info