InChI vs IUPAC nomenclature: Aspects to be aware of when using Standard InChI
DESCRIPTION
Features of IUPAC nomenclature that cannot be represented in Standard InChI are examined to draw attention to cases where using Standard InChI (and in some cases even non-standard InChI) may result in a loss of information. These areas include the representation of tautomers and mixtures of stereoisomers.
TRANSCRIPT
Daniel Lowe
Unilever Centre for Molecular Science Informatics
University of Cambridge
InChI vs IUPAC nomenclature: Aspects to be aware of when using Standard InChI
• InChI is used for checking the correctness of the results of name to structure
– 172,249 name and InChI pairs used for routine regression testing
• Failures arising from stereochemistry can be distinguished from constitutional failures
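One way a regression check could separate the two kinds of failure is to compare the InChIs a second time with the stereochemical layers removed. The sketch below is an assumption about how this might be done, not the test harness used by the author; `strip_stereo` and `classify` are hypothetical helper names, and the comparison is plain string handling on the /b, /t, /m and /s layer prefixes.

```python
# Hypothetical helpers: classify a name-to-structure mismatch by comparing
# InChIs with and without their stereochemical layers.
STEREO_PREFIXES = ("b", "t", "m", "s")  # double bond, tetrahedral, mirror flag, stereo type


def strip_stereo(inchi: str) -> str:
    """Return the InChI with every /b, /t, /m and /s layer removed."""
    head, *layers = inchi.split("/")
    kept = [layer for layer in layers if not layer.startswith(STEREO_PREFIXES)]
    return "/".join([head, *kept])


def classify(expected: str, actual: str) -> str:
    """Return 'match', 'stereo' or 'constitutional' for a pair of InChIs."""
    if expected == actual:
        return "match"
    if strip_stereo(expected) == strip_stereo(actual):
        return "stereo"
    return "constitutional"


alanine = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)"
alanine_with_stereo = alanine + "/t2-/m0/s1"
tetrazole = "InChI=1S/CH2N4/c1-2-4-5-3-1/h1H,(H,2,3,4,5)"

print(classify(alanine_with_stereo, alanine))    # stereo
print(classify(alanine_with_stereo, tetrazole))  # constitutional
```

With this approach the two plausible interpretations of "alanine" would be reported as a stereochemical disagreement rather than as a wrong structure.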
Plausible interpretations of “alanine”:
InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)
InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1
1H-tetrazole 2H-tetrazole 3H-tetrazole 4H-tetrazole
InChI=1S/CH2N4/c1-2-4-5-3-1/h1H,(H,2,3,4,5)
Not all tautomers are equally important
A fixed hydrogen layer can always be removed but cannot be losslessly re-added
• Conditions can lead to a particular tautomer being far more representative of a compound than another
• Not all tautomers readily interconvert
• A particular tautomer could be the reactive species
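The one-way nature of the fixed hydrogen layer can be illustrated with simple string handling. This is a minimal sketch, assuming that textually dropping the /f layer and its sub-layers gives the same result as generating the InChI without the fixed hydrogen option; `drop_fixed_h` is a hypothetical helper, and the input is the non-standard InChI shown elsewhere in these slides for (RS)-2-(4-(2-methylpropyl)phenyl)propanoic acid.

```python
def drop_fixed_h(inchi: str) -> str:
    """Remove the /f layer and the sub-layers that belong to it.

    Layers are skipped from "/f" up to a reconnected "/r" layer (if any)
    or the end of the string.
    """
    head, *layers = inchi.split("/")
    kept = []
    in_fixed = False
    for layer in layers:
        if layer.startswith("f"):
            in_fixed = True
        elif layer.startswith("r"):
            in_fixed = False
        if not in_fixed:
            kept.append(layer)
    return "/".join([head, *kept])


fixed_h_inchi = (
    "InChI=1/C13H18O2/c1-9(2)8-11-4-6-12(7-5-11)10(3)13(14)15"
    "/h4-7,9-10H,8H2,1-3H3,(H,14,15)/t10-/s3/f/h14H"
)
mobile_h_inchi = drop_fixed_h(fixed_h_inchi)
print(mobile_h_inchi)
```

The result keeps only the mobile hydrogen group (H,14,15), so nothing in it records which oxygen carried the hydrogen; that is why the removal cannot be reversed from the string alone. The output is still a non-standard InChI, because the "1" version prefix and the /s3 flag are left unchanged.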
cyclo-tris(tetracarbonylosmium) (3 Os—Os)
InChI=1S/12CO.3Os/c12*1-2;;;
InChI=1/12CO.3Os/c12*1-2;;;/rC12O12Os3/c13-1-25(2-14,3-15,4-16)26(5-17,6-18,7-19,8-20)27(25,9-21,10-22,11-23)12-24
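The relationship between the two strings above can be checked mechanically: everything before the /r layer of the non-standard InChI describes the disconnected components, and everything after it describes the reconnected structure with its metal bonds. The following is a small sketch under that assumption; `split_reconnected` and `layers_after_version` are hypothetical helper names.

```python
def split_reconnected(inchi: str) -> tuple[str, str]:
    """Split an InChI into its disconnected part and its reconnected (/r) layers."""
    disconnected, _, reconnected = inchi.partition("/r")
    return disconnected, reconnected


def layers_after_version(inchi: str) -> str:
    """Return the layers that follow the "InChI=1" or "InChI=1S" prefix."""
    return inchi.split("/", 1)[1]


standard = "InChI=1S/12CO.3Os/c12*1-2;;;"
reconnected_inchi = (
    "InChI=1/12CO.3Os/c12*1-2;;;"
    "/rC12O12Os3/c13-1-25(2-14,3-15,4-16)26(5-17,6-18,7-19,8-20)"
    "27(25,9-21,10-22,11-23)12-24"
)

disconnected, reconnected = split_reconnected(reconnected_inchi)
# The disconnected part carries the same layers as the standard InChI.
print(layers_after_version(disconnected) == layers_after_version(standard))  # True
```

Only the reconnected part records the Os—Os and Os—CO bonds, so the standard InChI by itself describes twelve separate CO units and three osmium atoms.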
(RS)-2-(4-(2-methylpropyl)phenyl)propanoic acid
InChI=1/C13H18O2/c1-9(2)8-11-4-6-12(7-5-11)10(3)13(14)15/h4-7,9-10H,8H2,1-3H3,(H,14,15)/t10-/s3/f/h14H
/s1 = absolute stereochemistry
/s2 = relative stereochemistry
/s3 = racemic stereochemistry
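The stereo type flag can be read straight from the string. The sketch below assumes the first /s layer encountered is the one of interest and uses the three meanings listed above; `stereo_kind` is a hypothetical helper name.

```python
from typing import Optional

STEREO_KIND = {"1": "absolute", "2": "relative", "3": "racemic"}


def stereo_kind(inchi: str) -> Optional[str]:
    """Return the meaning of the first /s layer, or None if there is none."""
    head, *layers = inchi.split("/")
    for layer in layers:
        if layer.startswith("s"):
            return STEREO_KIND.get(layer[1:])
    return None


racemic_acid = (
    "InChI=1/C13H18O2/c1-9(2)8-11-4-6-12(7-5-11)10(3)13(14)15"
    "/h4-7,9-10H,8H2,1-3H3,(H,14,15)/t10-/s3/f/h14H"
)
alanine_with_stereo = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
alanine_no_stereo = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)"

print(stereo_kind(racemic_acid))         # racemic
print(stereo_kind(alanine_with_stereo))  # absolute
print(stereo_kind(alanine_no_stereo))    # None
```

Because the flag applies to the whole InChI, a single string has no way to mark one group of stereocentres as absolute and another as relative.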
[Structure diagram: stereocentres labelled with stereo groups &1, &1 and &2]
A mixture of relative and absolute stereochemistry, and systems with multiple groups of relative stereochemistry are not yet supported
beta-cypermethrin (4 exact structures)
Helical stereochemistry
hexahelicene
(M)-hexahelicene (P)-hexahelicene
Helical stereochemistry
(Sa)-6,6′-dinitrobiphenyl-2,2′-dicarboxylic acid
Axial stereochemistry
(Ra)-6,6′-dinitrobiphenyl-2,2′-dicarboxylic acid
Conclusions
• Where useful, greater specificity than Standard InChI can be achieved using extra layers
• InChI does not yet support all corner cases of stereochemistry
Any Questions?
Email: [email protected]