Contributions to the World of eScience from the Royal Society of Chemistry

Post on 06-May-2015


DESCRIPTION

Our access to scientific information has changed in ways that were hardly imagined even by the early pioneers of the internet. The immense quantities of data and the array of tools available to search and analyze online content continue to expand, and the pace of change does not appear to be slowing. ChemSpider is one of the chemistry community’s primary online public compound databases. Containing tens of millions of chemical compounds and their associated data, ChemSpider serves data to tens of thousands of chemists every day. It is also the foundation for many important international projects that integrate chemistry and biology data, facilitate drug discovery efforts and help to identify new chemicals from under the ocean. This presentation will provide an overview of the expanding reach of the ChemSpider platform and the nature of the solutions that it helps to enable. We will also discuss the possibilities it offers in the domain of crowdsourcing and open data sharing. The future of scientific information and communication will be underpinned by these efforts and influenced by increasing participation from the scientific community, which should facilitate collaboration and ultimately accelerate scientific progress.

TRANSCRIPT

Contributions to the World of eScience from the Royal Society of Chemistry

Antony Williams

University of North Florida

November 14th 2013

We Have …Too Much Data!!!

The World of Online Chemistry

• Property databases
• Compound aggregators
• Screening assay results
• Scientific publications
• Encyclopedic articles (Wikipedia)
• Metabolic pathway databases
• ADME/Tox data – eTOX for example
• Blogs/Wikis and Open Notebook Science

e-Science and Primary Data

• How much data generated in a lab, that COULD go public, is lost forever?

• Public Domain reference databases of value?
• Syntheses
• Properties
• Spectra
• CIFs
• Images

• Much of chemistry is chemical structure-based – where and how could we host these data?

RSC’s ChemSpider

ChemSpider

• >29 million unique chemicals from >500 data sources

• Focus on improving data quality, enhancing functionality, integrating and enabling

Crowdsourced “Annotations”

• Users can add:
• Descriptions/Syntheses/Commentaries
• Links to PubMed articles
• Links to articles via DOIs
• Add spectral data
• Add Crystallographic Information Files
• Add photos
• Add MP3 files
• Add Videos

Spectra

Chemistry Data online are messy

• We have inherited errors
• All public compound databases have errors
• “Incorrect” structures – assertions, timelines etc
• “Incorrect” names associated with structures
• Properties
• Links
• Publications
• ENORMOUS CHALLENGE

Crowdsourced Curation

• Crowd-sourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate

Search “Vitamin H”

“Curate” Identifiers


Validated Name-Structure Dictionaries

• Chemical name dictionaries are used for:

• Text-mining (publications, patents)
• Used to index PubMed and link to Google Patents

• Linking to other databases – think Biology!
• When structures are not available, drug names link

• Searching the web
• Names link to structures, which link to InChIs
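The dictionary-based text-mining idea above can be sketched in a few lines of Python. This is only an illustration, not how ChemSpider or any RSC pipeline is implemented: the dictionary `NAME_TO_INCHIKEY` and the functions `tag_chemical_names` and `group_by_structure` are hypothetical names, and the three entries are a toy sample standing in for a curated name-structure dictionary.

```python
import re

# Hypothetical mini-dictionary: each validated name points at one structure,
# represented here by its InChIKey. A real dictionary would hold millions of names.
NAME_TO_INCHIKEY = {
    "aspirin": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
    "acetylsalicylic acid": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
    "caffeine": "RYYVLZVUVIJVGH-UHFFFAOYSA-N",
}


def _build_pattern(names):
    # Try longer names first so multi-word names are not shadowed by shorter ones.
    ordered = sorted(names, key=len, reverse=True)
    alternatives = "|".join(re.escape(name) for name in ordered)
    return re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)


_PATTERN = _build_pattern(NAME_TO_INCHIKEY)


def tag_chemical_names(text):
    """Return (matched text, InChIKey) pairs for every dictionary name found."""
    return [
        (match.group(1), NAME_TO_INCHIKEY[match.group(1).lower()])
        for match in _PATTERN.finditer(text)
    ]


def group_by_structure(hits):
    """Collect the different names that resolve to the same structure."""
    grouped = {}
    for name, key in hits:
        grouped.setdefault(key, set()).add(name.lower())
    return grouped


if __name__ == "__main__":
    sentence = "Aspirin (acetylsalicylic acid) was compared with caffeine."
    for name, key in tag_chemical_names(sentence):
        print(name, "->", key)
```

Because two synonyms resolve to the same key, grouping the hits by key shows how a name found in a patent or article can be linked back to a single structure record.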

I want to know about “Vincristine”

Vincristine: Identifiers and Properties

Vincristine: Vendors and Sources Linked by Structure

Vincristine: Patents Linked by Name

Vincristine: Articles Linked by Name

Semantic Mark-up of Articles

Linking Names to Structures

The InChI Identifier

InChI Strings Hash to InChIKeys
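An InChIKey is a fixed-length, 27-character hashed form of the InChI string, which makes it convenient for web search. The sketch below does not compute the hash (that is the job of the InChI software); it only splits an existing key into its documented blocks. The names `InChIKeyParts` and `split_inchikey` are hypothetical, chosen for this illustration.

```python
import re
from typing import NamedTuple

# Layout: 14 letters (connectivity hash), 8 letters (hash of the remaining
# layers such as stereochemistry), a standard/non-standard flag (S or N),
# a version letter, and a final protonation letter.
INCHIKEY_RE = re.compile(r"^([A-Z]{14})-([A-Z]{8})([SN])([A-Z])-([A-Z])$")


class InChIKeyParts(NamedTuple):
    skeleton: str       # first block: molecular connectivity
    details: str        # stereochemistry, isotopes and other layers
    is_standard: bool   # True when the key comes from a standard InChI
    version: str        # InChI version letter
    protonation: str    # protonation indicator, "N" for neutral


def split_inchikey(key: str) -> InChIKeyParts:
    """Split an InChIKey into its blocks, raising ValueError if malformed."""
    match = INCHIKEY_RE.match(key.strip())
    if match is None:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    skeleton, details, flag, version, protonation = match.groups()
    return InChIKeyParts(skeleton, details, flag == "S", version, protonation)


if __name__ == "__main__":
    # InChIKey of aspirin, used here only as a familiar example.
    print(split_inchikey("BSYNRYMUTXBXSQ-UHFFFAOYSA-N"))
```

The first block depends only on connectivity, so all stereoisomers of a compound share it while differing in the second block.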

Vancomycin – Search the Internet

Vancomycin

Search Molecular SKELETON

Search Full Molecule

Full Skeleton Search: 104 Hits

Full Molecule Search: 4 Hits
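One plausible way to reproduce the difference between the two searches is to match on the first InChIKey block (skeleton) versus the whole key (full molecule). The sketch below assumes that reading; `RECORDS`, `search_skeleton` and `search_full_molecule` are hypothetical names, and the four keys are small illustrative entries (believed to be the keys for the alanine stereoisomers and glycine), not ChemSpider data.

```python
# Illustrative (name, InChIKey) records standing in for a compound database.
RECORDS = [
    ("L-alanine", "QNAYBMKLOCPYGJ-REOHSFPOSA-N"),
    ("D-alanine", "QNAYBMKLOCPYGJ-UWTATZPHSA-N"),
    ("alanine (stereo unspecified)", "QNAYBMKLOCPYGJ-UHFFFAOYSA-N"),
    ("glycine", "DHMQDGOQFOQNFH-UHFFFAOYSA-N"),
]


def search_full_molecule(records, query_key):
    """Exact match on the whole InChIKey: same skeleton AND same stereo/isotopes."""
    return [name for name, key in records if key == query_key]


def search_skeleton(records, query_key):
    """Match on the first block only: same connectivity, any stereochemistry."""
    skeleton = query_key.split("-")[0]
    return [name for name, key in records if key.split("-")[0] == skeleton]


if __name__ == "__main__":
    query = "QNAYBMKLOCPYGJ-REOHSFPOSA-N"
    print("skeleton hits:", search_skeleton(RECORDS, query))
    print("full-molecule hits:", search_full_molecule(RECORDS, query))
```

As on the slides, the skeleton search returns more hits than the full-molecule search because it ignores the stereochemistry block.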

Publications - a summary of work

• Scientific publications are a summary of work
• Is all work reported?
• How much science is lost to pruning?
• What of value sits in notebooks and is lost?

• How much data is lost?
• How many compounds never reported?
• How many syntheses fail or succeed?
• How many characterization measurements?

What if we could capture it all?

Digitally Enhancing the RSC Archive

Start with data in publications

Turn “Figures” Into Data

ChemSpider Reactions

• Starting with data from CSSP, MOS and CCR
• Will cover reactions extracted from:
• Patents
• RSC journal articles and ESI

About Me…as a Chemist

• I’ve performed a few dozen chemical syntheses

• I’ve run thousands of analytical spectra
• I’ve generated thousands of NMR assignments
• I’ve probably published <5% of all work
• Most of it has been lost
• But things can be different today…
• But it still needs to be associated with me…

Micropublishing Syntheses

ChemSpider SyntheticPages

Visibility Means Discoverability

• Does a Social Profile matter?

• You are visible when you share your skills, experience and research activities by:
• Establishing a public profile
• Getting on the record
• Collaborative Science
• Demonstrating a skill set
• Measured using “alternative metrics”
• Contributing to the public peer review process

Scientists are “Quantified”

• Scientists are quantified

• Stats are gathered and analyzed

• Employers can find them, tenure will depend on them, and all of this already happens without your participation

• Scientists’ impact factors, H-index and many other variants
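As a reminder of how one of these numbers is derived: the H-index is the largest h such that h of a scientist's papers have each been cited at least h times. A minimal sketch follows; the function name `h_index` and the citation counts are made up for illustration.

```python
def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank      # the top `rank` papers all have >= rank citations
        else:
            break         # counts only decrease from here, so stop
    return h


if __name__ == "__main__":
    # Made-up citation counts for five papers.
    print(h_index([10, 8, 5, 4, 3]))
```

Note that a single highly cited paper barely moves the value, which is one reason the slides mention "many other variants".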

How you can be Quantified…

ResearchGate

The Alt-Metrics Manifesto

• http://altmetrics.org/manifesto/

AltMetrics via Plum Analytics

Usage, Citations, Social Media, Etc

Detailed Usage Statistics

Your Profile as a Scientist

• If you are an active scientist – i.e. already published, an active researcher, a generator of data, whether early, mid- or late career – there is lots to do!

• If you are a junior scientist, investing time now will provide a strong foundation for your future!

• So what do I do??

Branding: I am ChemConnector

Enabled by

• Persistent unique digital identifier

• Integrates to workflows such as manuscript and grant submission

• Supports automated linkages with your professional activities

An Online Profile

• Methods of sharing science online include:
• Wikis or blogs
• Slideshare for presentations
• YouTube for videos
• Flickr, Wikimedia etc. for images
• ChemSpider for chemistry
• GoogleDocs for data
• Google Scholar Citations for citations
• Microsoft Academic Search for papers

LinkedIn

http://www.linkedin.com/in/AntonyWilliams

My Career Captured…

And “Endorsements”

Are you sharing your slides online?

• Slideshare to host, expose and share your presentations, publications, posters and videos (subject to copyright you might have transferred!)

http://www.slideshare.net/

• Register for an account and retain your branding! Keep your online brand consistent

Upload and Add Details

• Edit title, add tags, add “abstract”, choose category

• Select checkbox for allow/disallow file download

SlideShare

Social Media Tools Feed Each Other

• Plugins and connectors integrate your activities across the social media platforms
• Expose your Tweeting and your Slideshare presentations directly on LinkedIn
• Plug-ins allow your tweets and presentations to be automagically displayed on LinkedIn

From Slideshare Into the Network

Add Applications to LinkedIn

Places to Share Videos

• There are other sites for you to share your videos online as a scientist

• YouTube
• SciVee
• Vimeo
• Slideshare

Share/Manage Your Publications

• Where do you “manage your publications”?
• Share your “activities” with the community
• My publications/slides/videos are my CV on:
• My Blog
• On LinkedIn
• On SlideShare
• On Researchgate
• On Academia.edu

Academia.edu


And Mendeley

http://www.mendeley.com/profiles/antony-williams/

My Google Scholar Profile

My Co-author Graph on MAS..

Share Science!!! Not Just Yourself

• Become a community contributor to science
• Share your expertise in the new world of openness
• Share your code
• Share your data and your model
• Share your Figures
• Contribute to Wikis – Wikipedia and others
• Become an Open Notebook Scientist

The Power of Blogs & Social Media


And into the AltMetrics World


Social Networking for Scientists

• The representation of YOU on the web is going to become increasingly important…

• Engagement and participation is a choice…

• Consider the value of contributing, both to you and to your community
• Open Data, Curations, Annotations etc.

Conclusions

• Online chemistry has exploded…

• Each of you has the opportunity to contribute

• Contributions will ultimately be credited to you and your scientific career

• Imagine starting to build your online presence early and how it can benefit you

• It is never too early to start actively building your profile and reputation

Thank you

Email: williamsa@rsc.org
Twitter: @ChemConnector
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams
