Chemical Database Management with JChem Base and Cartridge
Szabolcs Csepregi
ChemAxon European UGM Visegrad 2008
Outline
• ChemAxon chemical database products
• Architecture
• Features
• Example interfaces: JSP, ASP examples
• Integration with other CXN tools
• The coming Registration System API
• What is coming in JCB/Cartridge 5.1
ChemAxon chemical database products
JChem Base
• A library for adding chemical structures to relational database systems.
• Available in Java, JSP and .NET.
• An open-source web application example is available.
JChem Cartridge for Oracle
• Extends Oracle SQL with chemical operators and an index.
• SQL interface for ChemAxon functionality.
Instant JChem
• An all-in-one desktop chemical database application.
JChem Base application architectures
Web application
[Diagram: the client's web browser sends a query structure over the Internet / intranet to the server, where custom servlets or JSP scripts use the JChem class library to query the relational database (Oracle, MySQL, MS SQL Server, DB2, etc.) with SQL through a JDBC driver; hits are returned to the browser as structures + data.]
JChem Base application architectures
Rich client application
[Diagram: a rich client application uses the JChem class library on the client and connects over the Internet / intranet through a JDBC driver to the relational database server (Oracle, MySQL, MS SQL Server, DB2, etc.); the query structure is sent as SQL and hits are returned.]
JChem Cartridge architecture
The JChem computation engine can be on a dedicated server to balance workload.
[Diagram: the client application / application server sends SQL over the Internet / intranet to the Oracle server, where JChem Cartridge (PL/SQL, Java stored procedures) communicates over RMI with the JChem Server (JChem Cartridge Adapter and JChem Base: search, update, JChem core, cache), which accesses the database through JDBC.]
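Because the Cartridge exposes chemical searching as SQL, a client only has to send an ordinary statement to Oracle. The sketch below builds such a statement in Python. It is an illustration under assumptions, not the Cartridge's documented interface: the operator name jc_compare and the 't:s' option string are not given in these slides and should be checked against the Cartridge documentation, and the table and column names are hypothetical.

```python
import re

# Conservative pattern for unquoted Oracle identifiers.
_IDENTIFIER = re.compile(r"^[A-Za-z][A-Za-z0-9_$#]*$")


def build_substructure_sql(table: str, structure_column: str) -> str:
    """Return a SELECT that takes the query structure as bind variable :query."""
    for name in (table, structure_column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe SQL identifier: {name!r}")
    # ASSUMPTION: jc_compare(column, query, options) = 1 marks a hit and
    # 't:s' requests a substructure search; verify against the Cartridge docs.
    return (
        f"SELECT cd_id FROM {table} "
        f"WHERE jc_compare({structure_column}, :query, 't:s') = 1"
    )
```

The query structure itself (for example a SMILES string) would be passed as the :query bind value by whatever Oracle driver the client uses, so it never has to be spliced into the SQL text.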
Compatibility and integration
Supported chemical file formats:
• SMILES
• MDL MOL/RXN/SDF/RDF (v2000 and v3000)
• CML, MRV
• etc.
Database engines:
• Oracle, MySQL, MS SQL Server, MS Access, PostgreSQL, IBM DB2, Derby, etc.
All operating systems through:
• Java API (JChem Base)
• .NET API (JChem Base + JNBridge), for Windows
• SQL (Cartridge)
Structure searching: features
Search types: substructure, similarity, exact, exact fragment, etc.
• Wide range of query atoms
• Query properties
• R-group queries
• Full SMARTS support
• Coordination compounds
• Link nodes
• Pseudo atoms, lone pairs
• Relative stereo
• Reaction search features
• Hit coloring
• ...
www.chemaxon.com/conf/Structural_Search.ppt
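As an illustration of the similarity search type, the sketch below ranks database entries by the Tanimoto coefficient between binary fingerprints. This is a generic textbook formulation, not necessarily the metric or fingerprint that JChem uses; the fingerprints are plain Python integers used as bit sets, and the function names are hypothetical.

```python
def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto coefficient of two fingerprints stored as integer bit sets."""
    common = bin(fp_a & fp_b).count("1")  # bits set in both fingerprints
    total = bin(fp_a | fp_b).count("1")   # bits set in at least one
    return common / total if total else 1.0


def similarity_search(query_fp: int, table: dict, threshold: float) -> list:
    """Return (cd_id, score) pairs with score >= threshold, best first."""
    scored = [(cd_id, tanimoto(query_fp, fp)) for cd_id, fp in table.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

With a query of 0b1111 and stored fingerprints 0b1111, 0b0111 and 0b1000, the scores are 1.0, 0.75 and 0.25, so a threshold of 0.5 keeps the first two entries.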
Structure searching: options
Some of the structure search options:
• Chemical Terms filter constraint
• Tautomer search
• Stereo on/off
• Ignore charge/isotope/radical/valence/mixture brackets
• Vague bond matching modes: "or aromatic"; ignore bond types
• Inverse hit list
• Maximum search time / number of hits
• SQL SELECT statement for pre-filtering
• Ordering of results
• etc.
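Two of these options, the inverse hit list and the maximum number of hits, can be sketched independently of any chemistry. The function below is a hypothetical illustration, not JChem's API: is_match stands in for the structure comparison, and candidate_ids for the rows left after any SQL pre-filtering.

```python
from typing import Callable, Iterable, List, Optional


def filter_hits(
    candidate_ids: Iterable[int],
    is_match: Callable[[int], bool],
    inverse: bool = False,
    max_hits: Optional[int] = None,
) -> List[int]:
    """Collect matching ids, or non-matching ids when inverse is True."""
    if max_hits is not None and max_hits <= 0:
        return []
    hits: List[int] = []
    for cd_id in candidate_ids:
        # With inverse=True a row is reported exactly when it does NOT match.
        if is_match(cd_id) != inverse:
            hits.append(cd_id)
            if max_hits is not None and len(hits) >= max_hits:
                break  # stop early once the hit limit is reached
    return hits
```

Stopping as soon as the limit is reached is what makes a hit cap useful for response time: the remaining candidates are never compared.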
Structure search: performance
Compound registration (elapsed time):

Number of compounds    Duplicates not checked    Duplicates checked
10,000                 22 s                      35 s
100,000                2 min 33 s                4 min 16 s
200,000                4 min 53 s                8 min 19 s

Substructure search in a table of 3 million compounds:

Query                  Number of hits            Search time
[structure drawing]    12                        0.219 s
[structure drawing]    936                       0.375 s
[structure drawing]    4,608                     0.734 s
[structure drawing]    65,208                    5.594 s

JChem Base 5.0, Athlon X2 2.6GHz, 4GB RAM; Oracle 9.2.0.8.0
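The registration timings on this slide are easier to compare as throughput. The short calculation below converts them to compounds per second and to the relative cost of duplicate checking; it uses only the numbers from the slide, and the helper names are illustrative.

```python
# Registration timings from the slide, converted to seconds:
# (number of compounds, duplicates not checked, duplicates checked)
REGISTRATION = [
    (10_000, 22, 35),
    (100_000, 2 * 60 + 33, 4 * 60 + 16),
    (200_000, 4 * 60 + 53, 8 * 60 + 19),
]


def throughput(compounds: int, seconds: int) -> float:
    """Compounds registered per second."""
    return compounds / seconds


def duplicate_check_overhead(unchecked_s: int, checked_s: int) -> float:
    """Elapsed time with duplicate checking relative to without."""
    return checked_s / unchecked_s


for n, unchecked, checked in REGISTRATION:
    print(
        f"{n:>7,} compounds: "
        f"{throughput(n, unchecked):6.1f}/s unchecked, "
        f"{throughput(n, checked):6.1f}/s checked, "
        f"overhead x{duplicate_check_overhead(unchecked, checked):.2f}"
    )
```

On these figures, duplicate checking adds roughly 60 to 70 percent to the elapsed time across the three table sizes.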
Table types
The table type controls which chemical structures are allowed and which operations are available:
• Molecule
• Reaction
• Combinatorial Markush
• Query
• Any structure
Example interfaces: JSP, ASP
Example web applications: open source JSP, ASP examples
Marvin applets are used for query drawing and structure visualization.
Demo
Integration
Integration with other ChemAxon tools:
• Custom, uniform chemical representation (Standardizer; see the separate presentation today)
• Automatically calculated properties through Chemical Terms calculated columns (Calculator plugins)
• Additional similarity calculations (Screen; JChem Base only)
• Tautomer handling:
    • Tautomer search
    • Tautomer duplicate filtering on import
    • Custom tautomer transforms or canonical tautomer using Standardizer
• Query drawing and structure visualization (Marvin)
Provides the most consistent interface and back-end.
Integration
Additional Cartridge functionality:
• JChem index (for non-JChem tables)
• Communication with the Oracle optimizer
• Reaction-based enumeration (Reactor)
• Format conversions, including image generation
• Markush enumeration (Calculator plugins)
• Property predictions through Chemical Terms (Calculator plugins)
Registration system
A new registration system component will be introduced in summer 2008 (API only).
Main features:
• Customizable business logic
• Multilevel duplication control
• Customizable corporate registration ID
• Handling of salts, batches, lots, samples, and mixtures
    • Identification, split and registration of salt and solvent structures
• Storage of input structures in original format
• Mock registration (dry run)
• Pre-registration through a transitory area
• Basic, customizable implementation examples
    • Separate examples for chemists and registrars
Web and Instant JChem interfaces will follow later.
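Duplicate control, corporate registration IDs and mock registration fit together in a simple way. The sketch below is a toy model, not the announced API, which these slides do not describe: the class, the "CXN" ID prefix and the ID format are hypothetical, and structure_key is assumed to be a canonical representation (for example a standardized canonical SMILES) computed elsewhere.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Registry:
    """Toy in-memory registry keyed on a canonical structure string."""

    prefix: str = "CXN"  # hypothetical corporate ID prefix
    _by_key: Dict[str, str] = field(default_factory=dict)

    def register(self, structure_key: str, dry_run: bool = False) -> Tuple[str, bool]:
        """Return (registration_id, is_new).

        A duplicate returns the existing ID. With dry_run=True the ID that
        would be assigned is reported, but nothing is stored.
        """
        existing = self._by_key.get(structure_key)
        if existing is not None:
            return existing, False
        new_id = f"{self.prefix}-{len(self._by_key) + 1:06d}"
        if not dry_run:
            self._by_key[structure_key] = new_id
        return new_id, True
```

A multilevel duplicate check could repeat the same lookup with progressively looser keys (for instance, a tautomer-normalized key after the exact one); that extension is left out here to keep the sketch short.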
What is coming in JChem 5.1
Installation/upgrade improvements (include JDBC drivers, Cartridge configuration)
Structure searching:
• Position variation in Markush structures and queries
• Diastereomer search option (same tetrahedral stereo centers, but possibly different configurations)
• Check sp-hybridization search option (substructure)
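The diastereomer option can be read as a relaxed stereo comparison: the same tetrahedral centers must be present, but their configurations may differ. The sketch below illustrates that reading only. It assumes the atoms of query and target have already been mapped onto each other and that each molecule's stereo centers are given as a hypothetical dictionary from atom index to an "R" or "S" label; it is not how JChem implements the option.

```python
from typing import Dict


def exact_stereo_match(query: Dict[int, str], target: Dict[int, str]) -> bool:
    """Same stereo centers with identical configurations."""
    return query == target


def diastereomer_match(query: Dict[int, str], target: Dict[int, str]) -> bool:
    """Same set of stereo centers; configurations are allowed to differ."""
    return set(query) == set(target)
```

For a query with centers {2: "R", 5: "S"}, a target with {2: "S", 5: "S"} fails the exact comparison but passes the relaxed one, while a target that lacks the center at atom 5 fails both.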
What is coming in JChem 5.1
Web Services interface for JChem Base
Compound registration system API
Chemical Terms calculated columns for JChem index (Cartridge – from 5.0.1)
jc_insert returns existing cd_id for duplicate structures (Cartridge – from 5.0.3)
Under development
Further improvements of Markush handling (towards patents)
Flexible 3D pharmacophore searching
Integration of further ChemAxon functionality in the Cartridge:
• R-group decomposition
• Custom descriptors & similarity
JChem for Excel
Summary
JChem Base, JChem Cartridge and Instant JChem offer comprehensive and efficient chemical database solutions.
They are integrated with many other ChemAxon products and are accessible from various interfaces.
Registration system, JChem for Excel and patent Markush handling are coming.
Thank you for your attention!
For more information please visit www.chemaxon.com