Java Solutions for Cheminformatics
ChemAxon JChem Oracle Cartridge and related toolkits
April 2007
ChemAxon thumbnail
Company
• Founded in 1998, based in Budapest, Hungary, with representation in the US, UK and Japan
• Wide cheminformatics expertise (>30 staff): 9 PhD, 11 MSc
• Wide industry expertise: >200 corporate clients + >800 academic users
Products
• Cheminformatics tools - structure drawing, visualization, search, transformation, library profiling and property prediction
• Enterprise chemistry database and cartridge technology
Technology
• Powerful/Flexible – Enterprise API toolkits
• Solutions – Desktop applications
• Java based + .NET – Platform & database independent + Web ready
• Dedicated Oracle Cartridge and Oracle integration from applications
Mantra
• Do what they want, respond quickly
Toolkit and applications overview
Available from Oracle SQL & integrated with Oracle
Product development philosophy
Client driven development
Fast and reliable support
>1000 active clients
Sophisticated technology
High performance (speed, accuracy, features)
Rounded, industry relevant functionality
Customizable
Extendable
Long term relevance
Comprehensive API
Platform independence (Java)
JChem Cartridge
Generating pKa and logD values by an SQL statement
Features of JChem Cartridge
• Adds chemistry knowledge into the SQL language of Oracle (SELECT, INSERT, UPDATE, ...)
• Substructure, superstructure, exact structure, structure and reaction similarity searching
• Fast: typically 1K hits in 3M structures within a second, fast index creation, row insertion & deletion
• Complex chemical expressions using the Chemical Terms language (>100 functions available)
• Standardization (canonicalization) during registration and querying
• Structure format conversions (MRV, Molfile, SDfile, RDfile, SMILES, CML, etc.)
• Dynamic 2D, 3D image generation
• Structure enumeration using reaction rules yielding ‘synthetically feasible’ structures
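Similarity searching is commonly scored with the Tanimoto coefficient over fingerprint bits. The slides do not say which metric the cartridge applies, so the Python sketch below only illustrates the general idea. The function name and the toy 8-bit fingerprints are made up for the example.

```python
def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto coefficient of two fingerprints stored as integer bit masks."""
    common = bin(fp_a & fp_b).count("1")   # bits set in both fingerprints
    total = bin(fp_a | fp_b).count("1")    # bits set in at least one of them
    # Convention chosen for this sketch: two empty fingerprints score 0.0.
    return common / total if total else 0.0


# Toy 8-bit fingerprints (hypothetical values, not real structure fingerprints).
query_fp = 0b10110100
target_fp = 0b10100110

score = tanimoto(query_fp, target_fp)
is_hit = score >= 0.9  # plays the role of a similarity threshold such as 0.9
print(score, is_hit)
```

Real fingerprints are much longer than eight bits, but the arithmetic is the same: shared bits divided by bits present in either structure.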
Operators and functions – highlights I
“Swiss-army-knife” search operator:
jc_compare(<target-structure-column>, <query-structure>, <options>)
Chemical Terms: Over 100 built-in functions + user-defined functions.
SELECT count(*) FROM nci_3m WHERE jc_compare(structure, 'O=C1ONC(N1c2ccccc2)-c3ccccc3','sep=! t:s!ctFilter:(mass() <= 500) && (logP() <= 5) && (donorCount() <= 5) && (acceptorCount() <= 10)') = 1
Example: Lipinski's rule expressed in Chemical Terms
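When such statements are generated from application code, the option string can be assembled from a list of limits rather than typed by hand. The following Python sketch reproduces the option string of the Lipinski example. The helper names (ct_filter, search_options) are hypothetical and not part of any ChemAxon API; only the string layout is taken from the slide.

```python
# Upper limits of Lipinski's rule, keyed by the Chemical Terms
# function names that appear in the example statement.
LIPINSKI_LIMITS = [
    ("mass()", 500),
    ("logP()", 5),
    ("donorCount()", 5),
    ("acceptorCount()", 10),
]


def ct_filter(limits):
    """Join '(function <= limit)' clauses with the && operator."""
    return " && ".join(f"({func} <= {limit})" for func, limit in limits)


def search_options(search_type, filter_expr, sep="!"):
    """Assemble an option string in the 'sep=! t:s!ctFilter:...' layout."""
    return f"sep={sep} t:{search_type}{sep}ctFilter:{filter_expr}"


options = search_options("s", ct_filter(LIPINSKI_LIMITS))
print(options)
```

Keeping the limits in one list makes it easy to tighten or relax a single criterion without editing the SQL text.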
Operators and functions – highlights II
Chemical Terms and query prefiltering:
SELECT id, purchase_date FROM compounds_instock WHERE jc_compare(structure, 'C(=S)([N][N])[S]', 'sep=! t:t!simThreshold:0.9!ctFilter:logp()>1!filterQuery:SELECT rowid FROM compounds_instock WHERE purchase_date > DATE ''2002-01-01''') = 1
Prefiltering restricts the search to a subset of rows, so it executes more efficiently.
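Because the filter query is itself embedded in a quoted option string, its own single quotes have to be doubled, which is why the date literal appears as ''2002-01-01''. The Python sketch below rebuilds the statement above from its parts to make that quoting step explicit. The helper sql_string_literal is a hypothetical name introduced for this example.

```python
def sql_string_literal(text: str) -> str:
    """Wrap text in single quotes, doubling any embedded single quote."""
    return "'" + text.replace("'", "''") + "'"


# Inner query that selects the rows to be searched.
filter_query = (
    "SELECT rowid FROM compounds_instock "
    "WHERE purchase_date > DATE '2002-01-01'"
)

# Options are joined with the separator declared by 'sep=!'.
options = "!".join([
    "sep=! t:t",
    "simThreshold:0.9",
    "ctFilter:logp()>1",
    "filterQuery:" + filter_query,
])

sql = (
    "SELECT id, purchase_date FROM compounds_instock "
    "WHERE jc_compare(structure, 'C(=S)([N][N])[S]', "
    + sql_string_literal(options) + ") = 1"
)
print(sql)
```

In application code it is usually safer to pass the whole option string as a bind variable, which avoids the manual quoting altogether.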
Dynamic generation of static images:
SELECT jc_molconvertb(structure, 'png -2') FROM nci where id = :1
Available image formats: png, jpeg, svg, pdf, emf ...
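On the client side the generated image arrives as binary data. A minimal Python sketch for checking that such a value really is a PNG and reading its dimensions is shown below. It relies only on the standard PNG header layout; the sample bytes are built by hand as a stand-in for a fetched BLOB, so no database connection is assumed.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(blob: bytes):
    """Return (width, height) from a PNG header, or raise ValueError."""
    if len(blob) < 24 or blob[:8] != PNG_SIGNATURE or blob[12:16] != b"IHDR":
        raise ValueError("not a PNG image")
    # Width and height are big-endian 32-bit integers at the start of IHDR.
    return struct.unpack(">II", blob[16:24])


# Stand-in for a fetched BLOB: only the first 24 bytes of a 200 x 150 PNG.
sample = (
    PNG_SIGNATURE
    + struct.pack(">I", 13)          # IHDR data length
    + b"IHDR"
    + struct.pack(">II", 200, 150)   # width, height
)
print(png_size(sample))
```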
Indexes and column types
Index parameters can affect:
• Fingerprint attributes
• Standardizer configuration
• Table space and storage options of the index table
CREATE INDEX jcxnci ON nci(structure) INDEXTYPE IS jc_idxtype PARAMETERS('STD_CONFIG=dehydrogenize:optional..aromatize:d')
Supported column types
• VARCHAR2
• CLOB
• BLOB
Performance
• Index creation (regular structure tables): 5,801 sec
• Import w/o duplicate filtering (JChem structure tables): 13,104 sec
• Substructure search results:
Query Structure | Hit Count | Time (ms)
Clc1ccccc1 | 274,356 | 60,459
c1ncc2ncnc2n1 | 49,848 | 15,873
[#7]C1=CC=NC2=C1C=CC(Cl)=C2 | 4,632 | 1,987
C(Sc1ncnc2ncnc12)c3ccccc3 | 1,752 | 980
[#8]-c1c(N=N)c(cc2cc(ccc12)S([#8])(=O)=O)S([#8])(=O)=O | 1,188 | 1,017
O=C1ONC(N1c2ccccc2)c3ccccc3 | 204 | 456
C1CN1c2cnnc3c(cncc23)C4=CSC=C4 | 0 | 364
Test platform: table containing 3,003,012 structures in VARCHAR2 columns, on a 4-year-old 3 GHz dual Xeon (NetBurst architecture) with 2 GB system memory
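The index-creation and import times quoted above can be turned into rough throughput figures. The Python sketch below does the arithmetic under one assumption that the slide does not state explicitly: that both timings were measured on the 3,003,012-structure test table.

```python
STRUCTURES = 3_003_012     # table size quoted for the test platform
INDEX_SECONDS = 5_801      # index creation, regular structure tables
IMPORT_SECONDS = 13_104    # import without duplicate filtering


def per_second(count: int, seconds: int) -> float:
    """Average number of structures processed per second."""
    return count / seconds


index_rate = per_second(STRUCTURES, INDEX_SECONDS)
import_rate = per_second(STRUCTURES, IMPORT_SECONDS)
print(f"indexing: {index_rate:.0f} structures/s")
print(f"import:   {import_rate:.0f} structures/s")
```

If the assumption holds, this works out to roughly 520 structures per second for indexing and about 230 per second for import.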
Canonicalization with Standardizer
• Aromatize/dearomatize (ChemAxon and Daylight type rules)
• Add/remove explicit hydrogens
• Convert mesomers / tautomers / functional groups
• Remove solvents, counterions by list, smallest fragment; retain largest fragment
• Set/Remove chiral flag, remove stereofeatures
• Ungroup S groups
• Enumerate by stoichiometry values
• 2D, 3D coordinate generation (cleaning)
• Template based cleaning
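To give a feel for what one of these canonicalization steps does, the following Python sketch mimics "retain largest fragment" on SMILES strings. It is a deliberately crude illustration and not the Standardizer implementation: it counts atoms with a regular expression covering bracket atoms and the organic subset, and it ignores the rare case where ring-closure digits bond two dot-separated parts.

```python
import re

# Crude atom matcher: bracket atoms, two-letter halogens, then one-letter
# organic-subset atoms (aliphatic and aromatic).
ATOM = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]")


def atom_count(fragment: str) -> int:
    """Number of atoms matched in one SMILES component."""
    return len(ATOM.findall(fragment))


def largest_fragment(smiles: str) -> str:
    """Keep the dot-separated SMILES component with the most atoms."""
    return max(smiles.split("."), key=atom_count)


print(largest_fragment("CC(=O)[O-].[Na+]"))          # sodium acetate
print(largest_fragment("Cl.CN1CCC[C@H]1c2cccnc2"))   # a hydrochloride salt
```

Stripping counterions this way before registration means that a salt and its parent compound map to the same stored structure.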
Virtual Synthesis with Reactor
Effective
– focused, combinatorial and diverse libraries
– combinatorial, random and exhaustive dispatching
– high throughput
Flexible
– memory, file and database operations
– sequential or combinatorial mode
– compound or reaction output type
– reverse direction
Compatible
– reactions: MRV, RXN, RDF, SMARTS/SMIRKS
– compounds: MRV, MOL, SDF, SMILES
– mapping: ChemAxon, Daylight, automapper
Smart
– chemo-, regio- and stereospecific
– customizable
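The difference between sequential and combinatorial mode can be sketched in a few lines of Python. This assumes that combinatorial mode pairs every reactant of one list with every reactant of the other, while sequential mode pairs them by position. The function and the reactant lists are hypothetical and no reaction is actually applied.

```python
from itertools import product

acids = ["CC(O)=O", "OC(=O)c1ccccc1"]     # acetic acid, benzoic acid
amines = ["NC", "NCC", "Nc1ccccc1"]       # methylamine, ethylamine, aniline


def reactant_sets(first, second, mode):
    """Pair reactants combinatorially (all pairs) or sequentially (by position)."""
    if mode == "combinatorial":
        return list(product(first, second))
    if mode == "sequential":
        return list(zip(first, second))   # stops at the shorter list
    raise ValueError(f"unknown mode: {mode}")


print(len(reactant_sets(acids, amines, "combinatorial")))
print(len(reactant_sets(acids, amines, "sequential")))
```

Combinatorial pairing grows with the product of the list sizes, which is why it suits library enumeration, whereas positional pairing stays linear.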
Instant JChem: http://www.chemaxon.com/conf/Instant_JChem.ppt
Instant JChem
Desktop application for local and remote chemical database management, search and structure based prediction
• Simple connection to external Oracle and MySQL databases
• Powerful search functionalities
• Scalable – explore millions of live structures
• Dynamically predict properties using Calculator Plugins
• Apply canonicalization rules for import and viewing
• Form builder
• Relational data support
• Wide import / export options
• Library overlap analysis
• Very active development – what do you want to do?
JChem Cartridge – current developments
• Pharmacophore similarity search, custom descriptors (e.g. BCUT, ~ scalar) and metrics for similarity search
• Markush functionality
• Coordinated bond search (ferrocene)
• Query tables
• JChem Server, JChem cluster
• Enhanced cartridge administration GUI (web-based)
Find out more
• Product descriptions & links – www.chemaxon.com/products.html
• Forum - open access (no secrets) & very active – www.chemaxon.com/forum
• Presentations and posters – www.chemaxon.com/conf
• Download – www.jchem.com/licensefrset.html
Contacts US
Douglas Drake, San Diego, CA, 858 254 [email protected]
Alex Allardyce, Boston, MA, 857 544 [email protected]
David Hatch, Okemos, MI, 517 381 [email protected]