ChemConnect: SMARTCATS presentation

ChemConnect: A use case example using cloud services

Upload: edward-blurock

Post on 23-Jan-2018

Category: Science


TRANSCRIPT

Page 1: ChemConnect: SMARTCATS presentation

ChemConnect: A use case example using cloud services

Page 2: ChemConnect: SMARTCATS presentation

• Data is the backbone of modern scientific research

• Exchange of data is paramount to successful interaction between research groups

Motivation

Publications and conferences

Data exchanged between researchers (email, etc)

Virtual Research Environment

paper

Data files

Clouds (infrastructures)

Page 3: ChemConnect: SMARTCATS presentation

Towards Virtual Research Environment

Cloud based Database

ChemConnect

Repository and connected data (first step towards an electronic scientific notebook backed by searchable data)

http://www.chemicalkinetics.info

Page 4: ChemConnect: SMARTCATS presentation

Make the immense amount of data in the combustion community not only available but searchable

ChemConnect: Current Phase

Start with data set (in an accepted format)

Recognize interdependencies between data through connected relationships (semantic web concepts)

Parse the data set and produce fine-grained pieces of data
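
As a rough illustration of what "fine-grained pieces of data" could look like, the sketch below splits one reaction string into subject/relationship/object tuples. The function name `reaction_to_triples` is hypothetical and not part of ChemConnect, and only the simple `a+b = c` form is handled; real CHEMKIN files also carry rate parameters, third bodies and other syntax that is ignored here.

```python
from typing import List, Tuple

# (subject, relationship, object); a hypothetical representation
Triple = Tuple[str, str, str]


def reaction_to_triples(mechanism: str, reaction: str) -> List[Triple]:
    """Break one reaction of the form 'a+b = c' into fine-grained triples."""
    left, right = (side.strip() for side in reaction.split("=", 1))
    reactants = [s.strip() for s in left.split("+") if s.strip()]
    products = [s.strip() for s in right.split("+") if s.strip()]
    label = f"{'+'.join(reactants)} = {'+'.join(products)}"

    triples: List[Triple] = [(mechanism, "hasReaction", label)]
    # Each species is recorded once per mechanism.
    for species in reactants + products:
        candidate = (mechanism, "hasSpecies", species)
        if candidate not in triples:
            triples.append(candidate)
    triples += [(label, "hasReactant", s) for s in reactants]
    triples += [(label, "hasProduct", s) for s in products]
    return triples


triples = reaction_to_triples("Mech-butane-2011", "c2h5+o2 = c2h5o2")
for t in triples:
    print(t)
```

One reaction line yields several small statements, each of which can be indexed and searched on its own.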

Page 5: ChemConnect: SMARTCATS presentation

ChemConnect: client-server Structure

User interface on browser, tablet or phone (adjustable for each)

Generates interface

ChemConnect

Computing and responses

SERVER CLIENT

http://www.chemicalkinetics.info

Page 6: ChemConnect: SMARTCATS presentation

Current (Prototype) status

• Number of data sets (primarily CHEMKIN): 29 mechanisms

• Data sources (public domain from web):

• LLNL, Galway, San Diego, Stanford, Lund, CNRS-Nancy

• Size of database: 1.5 GB

• Number of data objects: 800k (objects: 240 MB, indices: 1.2 GB)

• Number of relationships (fine grain semantics): 600k

• In the next phase, these numbers will increase dramatically

• More mechanisms

• Different types of data (theoretical, experimental, more 2D-graphical)

Page 7: ChemConnect: SMARTCATS presentation

ChemConnect database components

Repository of data sets

Description

References

Data in accepted format

Individual data objects

Build relationships between data objects

Page 8: ChemConnect: SMARTCATS presentation

RDF: Resource Description Framework

Subject: The subject of the description

Predicate: The description of the relationship between subject and object

Object: The object of the description

Subject → Predicate → Object

Concept of ontologies from the semantic web

Page 9: ChemConnect: SMARTCATS presentation

Relationships (example from CHEMKIN mechanism)

Object Relationship Object

Mech-butane-2011 hasReaction c2h5+o2 = c2h5o2

Mech-butane-2011 hasSpecies c2h5

c2h5o2 = c2h4o2h hasReactant c2h5o2

c2h5o2 = c2h4o2h hasProduct c2h4o2h

c2h4o2h isIsomer c2h5o2

c2h4o2h hasStandardEnthalpy -276.51 kJ/mol

c2h5 hasProduct c2h5o2

c2h5 hasProduct c2h4o2h

c2h5o2 = c2h4o2h subMechanism C2

c2h5o2 = c2h4o2h subMechanism C2H5O2

c2h5+o2 = c2h5o2 followedBy c2h5o2 = c2h4o2h
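
The relationships listed above can be held as plain subject/predicate/object records and filtered by any of the three fields. The following is only a minimal in-memory sketch: the `Triple` class and the `match` helper are hypothetical names, and it does not reflect how ChemConnect actually stores its data.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# A few of the relationships from the slide, copied as plain strings.
TRIPLES = [
    Triple("Mech-butane-2011", "hasReaction", "c2h5+o2 = c2h5o2"),
    Triple("Mech-butane-2011", "hasSpecies", "c2h5"),
    Triple("c2h5o2 = c2h4o2h", "hasProduct", "c2h4o2h"),
    Triple("c2h4o2h", "isIsomer", "c2h5o2"),
    Triple("c2h4o2h", "hasStandardEnthalpy", "-276.51 kJ/mol"),
    Triple("c2h5o2 = c2h4o2h", "subMechanism", "C2"),
]


def match(triples: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return the triples that agree with every field that is not None."""
    return [
        t for t in triples
        if subject in (None, t.subject)
        and predicate in (None, t.predicate)
        and obj in (None, t.obj)
    ]


about_mechanism = match(TRIPLES, subject="Mech-butane-2011")
mentions_c2h4o2h = match(TRIPLES, obj="c2h4o2h")
print(about_mechanism)
print(mentions_c2h4o2h)
```

Leaving a field as `None` acts as a wildcard, so the same helper answers "everything about this mechanism" and "everything that points at this species".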

Page 10: ChemConnect: SMARTCATS presentation

Google Cloud Platform: datastore

Page 11: ChemConnect: SMARTCATS presentation

ChemConnect example

Page 12: ChemConnect: SMARTCATS presentation

keyword: repository (data set input)

Page 13: ChemConnect: SMARTCATS presentation

Data set information

Page 14: ChemConnect: SMARTCATS presentation

Data input

Direct link from website

Drag and drop a file

Text field

Page 15: ChemConnect: SMARTCATS presentation

Searching through the connected data

Keyword

Searching with keywords can be viewed as moving through the connected data

Subject

Object
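
One way to picture "moving through the connected data" is a breadth-first walk that starts at the keyword and follows relationships in either direction. The sketch below is an assumption about how such a walk might look, not ChemConnect's search code; the function name `connected` and the `max_hops` parameter are made up for illustration.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relationship, object)


def connected(triples: Iterable[Triple], keyword: str, max_hops: int = 2) -> Dict[str, int]:
    """Map every node reachable from `keyword` within `max_hops` to its hop count."""
    neighbours: Dict[str, Set[str]] = {}
    for subject, _, obj in triples:
        # A relationship can be followed from either end.
        neighbours.setdefault(subject, set()).add(obj)
        neighbours.setdefault(obj, set()).add(subject)

    hops = {keyword: 0}
    queue = deque([keyword])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours.get(node, ()):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops


triples = [
    ("Mech-butane-2011", "hasSpecies", "c2h5"),
    ("c2h5", "hasProduct", "c2h5o2"),
    ("c2h4o2h", "isIsomer", "c2h5o2"),
    ("c2h4o2h", "hasStandardEnthalpy", "-276.51 kJ/mol"),
]
result = connected(triples, "c2h5", max_hops=2)
print(result)
```

With a limit of two hops, the keyword `c2h5` reaches the mechanism, its product and that product's isomer, while the enthalpy value (three hops away) stays out of the result.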

Page 16: ChemConnect: SMARTCATS presentation

keyword search: fine grained info

Page 17: ChemConnect: SMARTCATS presentation

reaction information

Page 18: ChemConnect: SMARTCATS presentation

find a reaction

Page 19: ChemConnect: SMARTCATS presentation

keyword search

Page 20: ChemConnect: SMARTCATS presentation

Connecting ’unrelated’ data

Passive Connection:

Don’t need to know which structures you want to connect to

If they share an RDF subject or an RDF object

Then they are connected!!

Keyword: Passive

(independent datasets had no knowledge of other datasets)
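
The passive connection idea can be sketched as a set intersection: two datasets loaded separately become linked wherever the same name appears as a subject or an object. The split into a "mechanism" and a "thermo" dataset below is an assumption made for illustration, as are the helper names `nodes` and `passive_links`; the sketch also assumes both sources use the identical string for a species.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relationship, object)


def nodes(triples: Iterable[Triple]) -> Set[str]:
    """Collect every name used as a subject or as an object."""
    found: Set[str] = set()
    for subject, _, obj in triples:
        found.add(subject)
        found.add(obj)
    return found


def passive_links(a: Iterable[Triple], b: Iterable[Triple]) -> Set[str]:
    """Names shared by two datasets that were loaded independently."""
    return nodes(a) & nodes(b)


# Hypothetical dataset 1: triples extracted from a mechanism file.
mechanism_data = [
    ("Mech-butane-2011", "hasSpecies", "c2h5"),
    ("c2h5", "hasProduct", "c2h4o2h"),
]
# Hypothetical dataset 2: triples from a separate thermodynamics source.
thermo_data = [
    ("c2h4o2h", "hasStandardEnthalpy", "-276.51 kJ/mol"),
    ("c2h4o2h", "isIsomer", "c2h5o2"),
]

shared = passive_links(mechanism_data, thermo_data)
print(shared)
```

Neither dataset refers to the other; the link exists only because both mention `c2h4o2h`.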

Page 21: ChemConnect: SMARTCATS presentation

Linking data/models

[Diagram: one species linked across independent data sources]

Data sources: Chemkin Model I, Chemkin Model II, Automatically Generated CHEMKIN Model, 2-D Structure, Computational Chemistry Calculations

Names for the species: 1-Butyl-3-hydroperoxide, C4H11O2, ch2ch2ch(ooh)ch3, 1-c4hh8-3-ooh

Relationships shown: hasSpecies, isIsomer, hasThermo (to Thermo data)

Page 22: ChemConnect: SMARTCATS presentation

Future directions

Data presentation

• Toward a tool for comparison and mechanism building (e.g. shopping cart of data)

• Data object visualisation

• Presentation of search tree results

Increase number and source of data:

• Data sets with 2-D (sub)structures (connecting substructures to species/reactions)

• Experimental data from source groups (see discussion after this session)

• Supplementary data from journals

Query

• More complex searches: multiple keywords, interpretation/preprocessing of keyword expression before search

• Ordering and filtering results (passive and with check boxes)

Data Input

• Researchers can enter their own data

• Further develop concept of private, group shared and public data

Page 23: ChemConnect: SMARTCATS presentation

Thank you… Any questions?

I encourage you to try ChemConnect and give feedback

www.chemicalkinetics.info