Recent improvements in Marvin v6 reaction atom mapping and its application to reaction validation in...

ChemAxon UGM, San Diego, USA, 25th September 2013. Recent improvements in Marvin v6: Reaction Atom Mapping and its Application to Reaction Validation in Pharmaceutical ELNs. Daniel Lowe and Roger Sayle, NextMove Software, Cambridge, UK.

Post on 19-Oct-2014


DESCRIPTION

Automatic atom mapping attempts to determine the correspondence between the atoms of the reactants and products of a chemical reaction. Such mappings are useful for allowing greater specificity in queries of reaction databases. Recently there has been increased interest in their use to assist in the validation and standardisation of reactions in pharmaceutical ELNs (electronic lab notebooks). Atom mappings can, for example, detect if a reactant is missing or if a reactant does not contribute atoms to the product and hence may be better stored as an agent. We have evaluated the performance of the new atom mapping algorithm introduced with Marvin v6 compared to the prior version on a publicly available dataset extracted from the patent literature and on reactions from multiple pharmaceutical ELNs. Dramatic improvements are observed in all cases, both in the percentage of reactions that can be successfully atom-mapped and in the quality of the mappings produced. Finally, we examine the difficulties that remain in validating reactions for which a complete atom mapping is not possible, such as for “routine” reactions where the reactant that was added is missing.

TRANSCRIPT

Page 1: Recent improvements in Marvin v6: Reaction Atom Mapping and its Application to Reaction Validation in Pharmaceutical ELNs

ChemAxon UGM, San Diego, USA 25th September 2013

Recent improvements in Marvin v6: Reaction Atom Mapping and its Application to Reaction Validation in Pharmaceutical ELNs

Daniel Lowe and Roger Sayle

NextMove Software

Cambridge, UK

Page 2:

What is Atom-Mapping?

Mapping algorithm

Page 3:

Why Perform Atom-Mapping?

• Assigning roles to reagents

• Normalization of reactions for registration
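Role assignment from an atom mapping can be sketched in a few lines. The example below is only an illustration, not the method used in Marvin: it assumes the reaction is already available as a mapped reaction SMILES, and it treats any left-hand-side molecule that shares no map number with the product as an agent. The function names and the example reaction are hypothetical.

```python
import re

# Captures the atom-map number inside a bracket atom, e.g. the 3 in "[CH2:3]".
MAP_NUMBER = re.compile(r":(\d+)\]")


def map_numbers(smiles):
    """Return the non-zero atom-map numbers that appear in a SMILES string."""
    return {int(n) for n in MAP_NUMBER.findall(smiles)} - {0}


def assign_roles(mapped_reaction):
    """Sort left-hand-side molecules into reactants and agents.

    A molecule is kept as a reactant only if it shares at least one map
    number with the product side; otherwise it is treated as an agent.
    """
    left, middle, right = mapped_reaction.split(">")
    product_maps = map_numbers(right)
    roles = {"reactants": [], "agents": [m for m in middle.split(".") if m]}
    for molecule in filter(None, left.split(".")):
        if map_numbers(molecule) & product_maps:
            roles["reactants"].append(molecule)
        else:
            roles["agents"].append(molecule)
    return roles


# Hypothetical mapped reaction: an acid chloride and an amine give an amide,
# with unmapped triethylamine drawn on the reactant side.
example = (
    "[CH3:1][C:2](=[O:3])Cl.[NH2:4][CH3:5].CCN(CC)CC"
    ">>[CH3:1][C:2](=[O:3])[NH:4][CH3:5]"
)
roles = assign_roles(example)
print(roles)
```

Here the triethylamine contributes no atoms to the product, so it is moved to the agent list, which is the kind of normalization a registration step might apply.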

Page 4:

Why Perform Atom-Mapping?

• More precise database searches

– Solvents/catalysts can be distinguished from reactants

– Allows the relationship between the reactant atoms and product atoms to be made explicit

Page 5:

Example

• I want to find reactions converting an alkene to a cyclopropane so I search for C=C>>C1CC1
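One reading of this example is that the unmapped query C=C>>C1CC1 also hits reactions that merely contain an alkene on one side and a cyclopropane on the other, whereas a mapped query such as [C:1]=[C:2]>>[C:1]1[C:2]C1 requires the ring to contain the former alkene carbons. The sketch below illustrates that check on hand-written data. It assumes the double bonds and three-membered rings have already been extracted as tuples of atom-map numbers (0 for an unmapped atom); the function name and both example hits are hypothetical.

```python
def alkene_becomes_cyclopropane(reactant_double_bonds, product_three_rings):
    """True if both atoms of a reactant C=C sit in one product cyclopropane.

    Atoms are identified by atom-map number; 0 stands for an unmapped atom,
    so bonds that involve an unmapped atom can never satisfy the check.
    """
    return any(
        0 not in bond and set(bond) <= set(ring)
        for bond in reactant_double_bonds
        for ring in product_three_rings
    )


# Hypothetical hit 1: the alkene carbons 1 and 2 end up in the new ring.
cyclopropanation = alkene_becomes_cyclopropane([(1, 2)], [(1, 2, 3)])

# Hypothetical hit 2: the product ring (atoms 5, 6, 7) is unrelated to the
# alkene, so only an unmapped query would return this reaction.
bystander = alkene_becomes_cyclopropane([(1, 2)], [(5, 6, 7)])

print(cyclopropanation, bystander)
```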

Page 6:

Why Perform Atom-Mapping?

• Identifying suspect reactions:

Page 7:

ChemAxon atom mapping

Page 8:

ChemAxon atom mapping

Page 9:

Atom mapping modes

• Complete

• Changing

• Matching
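The slide lists the modes without defining them. As a rough illustration only, and assuming that "changing" means keeping map numbers just on atoms whose bonds differ between the two sides, that subset can be derived from a complete mapping as sketched below. The bond dictionaries, the `bond` helper and the example reaction are hypothetical and are not part of the Marvin API.

```python
def bond(a, b):
    """Key for an undirected bond between two atom-map numbers."""
    return frozenset((a, b))


def changing_atoms(reactant_bonds, product_bonds):
    """Map numbers of atoms that gain, lose or change the order of a bond.

    Both arguments map bond(a, b) keys to bond orders, with atoms named by
    the map numbers of a complete mapping.
    """
    changed = set()
    for pair in reactant_bonds.keys() | product_bonds.keys():
        if reactant_bonds.get(pair) != product_bonds.get(pair):
            changed |= pair
    return changed


# Hypothetical example: ethyl bromide + hydroxide -> ethanol + bromide.
# Atom 1 = CH2, atom 2 = Br, atom 3 = O, atom 4 = CH3.
reactant_bonds = {bond(4, 1): 1, bond(1, 2): 1}
product_bonds = {bond(4, 1): 1, bond(1, 3): 1}

changed = changing_atoms(reactant_bonds, product_bonds)
print(sorted(changed))
```

Under this assumption the methyl carbon (atom 4) would lose its map number in "changing" mode, because its only bond is the same on both sides.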

Page 10:

Methodology

Test set                                                       Reactions
Pharmaceutical ELN subset                                      18,244
ChemReact68 database                                           67,926
SPRESI database subset                                         5,230
Reactions extracted from 2008-2011 USPTO patent applications*  562,872

* Lowe, D. M. Automated Extraction of Reactions from the Patent Literature. 243rd ACS National Meeting & Exposition, San Diego, CA, March 27, 2012.

Page 11:

Metrics used

• Were all product atoms mapped?

– Measures recall

• How many C-C bonds were broken?

– Measures precision
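Both metrics can be approximated with a small amount of code. The sketch below is not the evaluation code behind the slides. It assumes mapped reaction SMILES as input for the first metric, and for the second it assumes the bonds on each side are given as pairs of atom-map numbers; every name and example is hypothetical. The first function uses a deliberately simple tokenizer that treats any atom written outside brackets as unmapped.

```python
import re

# One SMILES atom: a bracket atom, or an organic-subset atom outside brackets.
ATOM = re.compile(r"\[[^\]]*\]|Br|Cl|[BCNOPSFI]|[bcnops]")
# A bracket atom that carries a non-zero atom-map number, e.g. "[CH3:1]".
MAPPED = re.compile(r":[1-9]\d*\]$")


def all_product_atoms_mapped(mapped_reaction):
    """Recall-style check: does every product atom carry a map number?"""
    atoms = ATOM.findall(mapped_reaction.split(">")[2])
    return bool(atoms) and all(MAPPED.search(atom) for atom in atoms)


def cc_bonds_broken(elements, reactant_bonds, product_bonds):
    """Precision-style check: count C-C bonds lost between the two sides.

    elements maps each atom-map number to an element symbol; the bond
    arguments are sets of frozenset pairs of map numbers.
    """
    return sum(
        1
        for pair in reactant_bonds - product_bonds
        if all(elements[atom] == "C" for atom in pair)
    )


complete = "[CH3:1][CH2:2][Br:3].[OH2:4]>>[CH3:1][CH2:2][OH:4]"
incomplete = "[CH3:1][CH2:2][Br:3]>>[CH3:1][CH2:2]O"

# 1-bromopropane -> propan-1-ol under two hypothetical mappings.
elements = {1: "C", 2: "C", 3: "C", 4: "Br", 5: "O"}
reactant = {frozenset(p) for p in [(1, 2), (2, 3), (3, 4)]}
sensible = {frozenset(p) for p in [(1, 2), (2, 3), (3, 5)]}
scrambled = {frozenset(p) for p in [(2, 1), (1, 3), (3, 5)]}

print(all_product_atoms_mapped(complete), all_product_atoms_mapped(incomplete))
print(cc_bonds_broken(elements, reactant, sensible),
      cc_bonds_broken(elements, reactant, scrambled))
```

The scrambled mapping relabels the carbon chain, so it appears to break a C-C bond that the chemistry leaves untouched; counting such bonds is one way to penalize implausible mappings.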

Page 12:

Ability to map all product atoms

[Bar chart: percent of reactions with all product atoms mapped (y-axis, 0 to 80) for the PharmaELN, ChemReact68, SPRESI and USPTO test sets; series: Marvin 5.10, Marvin 6.0, ChemDraw 12]

Page 13:

C-C bonds broken

[Bar chart: average number of C-C bonds broken per mapping, lower is better (y-axis, 0.0 to 1.2), for the PharmaELN, ChemReact68, SPRESI and USPTO test sets; series: Marvin 5.10, Marvin 6.0, ChemDraw 12]

Page 14:

[Figure panels: Marvin 5.10, ChemDraw 12, Marvin 6.0]

Page 15:

Speed Comparison

[Bar chart: reactions mapped per second (y-axis, 0 to 350) for Marvin 5.12, Marvin 6.0 and Marvin 6.0 (multithreaded)]

* Comparison performed on the PharmaELN dataset on an i7-2600

Page 16:

Difficult cases

ΔT

Page 17:

Areas for improvement: implicit stoichiometry

Page 18:

Areas for improvement: many choices for reactant atom mapping

Page 19:

Consensus Methods

[Bar chart: percent of reactions with all product atoms mapped (y-axis, 0 to 100) on the PharmaELN test set; series: Marvin 6.0, ChemDraw 12, Marvin6 + ChemDraw12, Consensus Result*]

* Marvin 6.0 + ChemDraw12 + 2 variants of GGA’s Indigo toolkit + InfoChem ICMap + Pipeline Pilot + MDL Cheshire
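The slides do not say how the consensus result was formed. One plausible scheme, shown here purely as an assumption, is a vote: each tool either fails or returns a mapping, expressed as a set of (reactant atom index, product atom index) pairs so that arbitrary map-number labels do not matter, and the most common mapping is accepted if enough tools agree. The tool names, the `min_votes` threshold and the example data are hypothetical.

```python
from collections import Counter


def consensus_mapping(tool_results, min_votes=2):
    """Return the mapping proposed by the most tools, or None.

    tool_results maps a tool name to a frozenset of
    (reactant_atom_index, product_atom_index) pairs, or to None when the
    tool produced no mapping. Ties go to the mapping encountered first.
    """
    votes = Counter(m for m in tool_results.values() if m is not None)
    if not votes:
        return None
    mapping, count = votes.most_common(1)[0]
    return mapping if count >= min_votes else None


mapping_a = frozenset({(0, 0), (1, 1), (2, 2)})
mapping_b = frozenset({(0, 1), (1, 0), (2, 2)})

results = {
    "tool_a": mapping_a,
    "tool_b": mapping_a,
    "tool_c": mapping_b,
    "tool_d": None,  # this tool failed to map the reaction
}

print(consensus_mapping(results) == mapping_a)
print(consensus_mapping(results, min_votes=3))
```

Raising `min_votes` trades coverage for confidence: fewer reactions receive a consensus mapping, but the ones that do have more tools behind them.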

Page 20:

Beyond atom mapping

• Missing reactants (often for routine reactions)

Page 21:

Beyond atom mapping

• Change of stereoisomer or chiral resolution

(E)-3-{8-[2-(4-Isopropyl-1,3-thiazol-2-yl)ethyl]-2-methoxy-4-oxo-4H-pyrido[1,2-a]pyrimidin-3-yl}-2-propenoic acid (1 mg) was dissolved in CDCl3 (0.5 ml) and irradiated with light from a fluorescent lamp for 19 hours. The solvent was evaporated to obtain the title compound (1 mg).

Page 22:

Atom mapping + classification

[Bar chart: percent of reactions with all product atoms mapped (y-axis, 0 to 100), comparing "Atom mapping algorithms alone" with "Combined with NameRXN"; series: Marvin 6.0, ChemDraw 12, Consensus Result; annotation: Verified / Recognised by NameRXN (71%)]

Page 23:

Conclusions

• Marvin v6’s atom mapping algorithm provides large improvements in recall, precision and speed over v5

• Atom mapping in some cases isn’t as simple as finding a maximum common subgraph mapping

• Classification algorithms can be useful for the validation of some reactions

Page 24:

Acknowledgements

• Zsolt Mohacsi and Istvan Rabel, ChemAxon

• Ed Griffen and Nick Tomkinson, AstraZeneca

• Andrew Wooster, GSK

• Hans Kraut, InfoChem

• Thank you for your time.