Oracle Life Sciences Platform and 10g Overview
June 23 & 24, Reston, VA
Charlie Berger
Sr. Director of Product Management, Life Sciences and Data [email protected] Corporation
Administrative
Thank you to our ISV Partners for sponsoring OLSUG:
  InforSense Ltd
  InnaPhase
  Waters Corp
  MDL
  Applied Biosystems
  Accelrys
  Tom Sawyer Software, Inc.
  World Fusion Corp
  Spotfire
  LION bioscience, Inc.
  Tripos
Administrative
  Auditorium: seats 120
  2nd Floor: seats 70
  Classroom: seats 24; first come, first seating
OLSUG Agenda
Administrative
  OLSUG Membership and Plans: Thurs at 8:30 am
  Open Discussion: Thurs at 4:15
Please turn in your FEEDBACK FORMS!!
Oracle’s Solutions for Life Sciences
[Diagram: Database and Application Server underlie Discovery, Development & Clinical, Manufacture/Supply Chain Management, Sales & Marketing, Healthcare Transactions, Finance, HR, Projects, and Maintenance. Manage all your data. Run all your applications. Collaborate securely.]
Life Science Challenge: Typical Research Environment
[Diagram labels: Public Databases, Private/Service Databases, Local Copies, Local Databases, Industrial Research, Lab, Partner or Collaborator; data domains: Genomics, Proteomics, Pathways, Cheminformatics, Clinical]
Oracle’s Platform for Life Sciences
1. Access distributed data
2. Integrate a variety of data types
3. Manage vast quantities of data
4. Collaborate securely
5. Find patterns and insights
Oracle Life Science Platform
1. Access distributed data
  Gateways, External Tables, SQL*Loader, Streams, Oracle Gateway to LION SRS, etc.
2. Integrate a variety of data types
  XML DB, interMedia, Text, etc.
3. Manage vast quantities of data
  RAC, Partitioning, Grid, etc.
4. Collaborate securely
  Collaboration Suite, iFS (Oracle FilesOnline), Portal, Security, etc.
5. Find patterns and insights
  Data Mining, BLAST, Statistics, Text, etc.
[Diagram labels: Genomics, Proteomics, Pathways, Cheminformatics, Clinical]
Access Distributed Data
1. Access Distributed Data
[Diagram labels: data sources: Flat files, SRS, MySQL, DB2, External Sites; access paths: External Table, Transparent Gateway, Generic Connectivity, DBlinks, Distributed query, UltraSearch]
  Merge Statement
  Oracle Streams
  Heterogeneous Transportable Tablespaces
  SQL*Loader
  Migration Toolkits
  High Speed Import/Export
  Oracle Scheduler
SQL*Loader
  High-speed data loading utility
  Loads data from external files into tables in an Oracle database
  Accepts input data in a variety of formats
  Performs filtering
  Loads into multiple tables during the same load session
  Three methods for loading data:
    Conventional Path Load
    Direct Path Load
    External Table Load
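The external table load method listed above can be sketched in Oracle SQL as follows. This is a minimal illustration, not taken from the slides: the directory path, file name, and the table and column names (seq_dir, sequences.csv, seq_ext, seq_human) are all hypothetical.

```sql
-- Hypothetical directory object pointing at the folder that holds the flat file.
CREATE OR REPLACE DIRECTORY seq_dir AS '/data/incoming';

-- External table: the flat file is queried in place; nothing is loaded yet.
CREATE TABLE seq_ext (
  seq_id    VARCHAR2(20),
  organism  VARCHAR2(100),
  sequence  VARCHAR2(4000)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY seq_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    MISSING FIELD VALUES ARE NULL
    (seq_id CHAR(20), organism CHAR(100), sequence CHAR(4000))
  )
  LOCATION ('sequences.csv')
)
REJECT LIMIT UNLIMITED;

-- Filter while loading into a regular table (hypothetical target table).
CREATE TABLE seq_human AS
  SELECT seq_id, sequence
  FROM   seq_ext
  WHERE  organism = 'Homo sapiens';
```

The explicit field list gives each field its own maximum length; without it the access driver assumes a much shorter default for character fields, and long sequences would be rejected.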
Merge Statement
  Fast insert, update or conditional update/insert of records
MERGE INTO table
USING table/view/subquery
ON ( condition )
WHEN MATCHED THEN update clause
WHEN NOT MATCHED THEN insert clause
SKIP WHEN ( condition )
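A minimal sketch of the MERGE statement outlined above, assuming two hypothetical tables: assay_results (the target) and assay_staging (a newly arrived batch). The conditional part is written with WHERE clauses on each branch, which is the form documented for the released Oracle Database 10g.

```sql
-- Hypothetical tables: assay_results (target) and assay_staging (new batch).
MERGE INTO assay_results r
USING assay_staging s
ON (r.compound_id = s.compound_id AND r.assay_id = s.assay_id)
WHEN MATCHED THEN
  UPDATE SET r.ic50_nm     = s.ic50_nm,
             r.measured_on = s.measured_on
  WHERE s.measured_on > r.measured_on      -- update only with newer measurements
WHEN NOT MATCHED THEN
  INSERT (compound_id, assay_id, ic50_nm, measured_on)
  VALUES (s.compound_id, s.assay_id, s.ic50_nm, s.measured_on)
  WHERE s.ic50_nm IS NOT NULL;             -- skip staging rows without a value
```

One statement replaces the usual "try update, then insert" loop. The columns used in the ON condition are deliberately not updated, since Oracle does not allow that.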
Transportable Tablespaces
  Mechanism to quickly move a tablespace between Oracle databases
  Most efficient means to move bulk data between databases
  Enhanced to support different hardware platforms & operating systems
[Diagram: source database to target database]
Oracle Data Pump
  High speed bulk data and metadata movement (Import/Export) between Oracle databases
  Uses high speed direct-path load/unload
  Automatically scales using parallel execution
  Accessible via
    expdp and impdp utilities
    PL/SQL API
    Enterprise Manager
Oracle Scheduler
  Runs PL/SQL, Java, 3GL, OS Scripts, internal utilities (RMAN)
  Job classes, priorities, workload windows
  Integrated with Resource Manager & RAC service framework
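As an illustration of the Scheduler, the PL/SQL block below creates a recurring job with DBMS_SCHEDULER.CREATE_JOB. The job name and the stored procedure LAB.LOAD_NEW_SEQUENCES are hypothetical placeholders, and the schedule is only an example.

```sql
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'NIGHTLY_SEQ_LOAD',         -- hypothetical job name
    job_type        => 'STORED_PROCEDURE',
    job_action      => 'LAB.LOAD_NEW_SEQUENCES',   -- hypothetical procedure
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY; BYHOUR=2; BYMINUTE=0; BYSECOND=0',
    enabled         => TRUE,
    comments        => 'Load new sequence files every night at 02:00');
END;
/
```

Other job types, such as 'PLSQL_BLOCK' and 'EXECUTABLE', cover anonymous PL/SQL and operating system scripts.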
Integrate Platform’s JobScheduler with Oracle database
  Single interface for job scheduling
  Platform’s JobScheduler can create & schedule Oracle database jobs
  Database jobs can be incorporated into larger job flows
  Schedule & use resources efficiently for combined database & computational tasks
Distributed Query Optimization
  Optimized queries within and across databases
  Enhanced cost based optimizer
  Capture complete statistics for remote tables
  Consider network bandwidth & latency in deciding what parts of query plan should be remotely mapped
  Support different execution cost at different nodes (e.g. based on node ownership)
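A distributed query of the kind the optimizer handles might look like the sketch below. The database link, credentials, TNS alias, and table names are hypothetical; only the syntax is standard Oracle SQL.

```sql
-- Hypothetical link to a partner database; user, password and alias are placeholders.
CREATE DATABASE LINK partner_db
  CONNECT TO lab_reader IDENTIFIED BY change_me
  USING 'PARTNER_TNS';

-- Join a local table with a remote one; the optimizer decides
-- which parts of the plan are shipped to the remote database.
SELECT c.compound_id, c.name, a.ic50_nm
FROM   compounds c
JOIN   assay_results@partner_db a
       ON a.compound_id = c.compound_id
WHERE  a.ic50_nm < 200;

-- Optional hint: request that the join be executed at the remote site.
SELECT /*+ DRIVING_SITE(a) */ c.compound_id, a.ic50_nm
FROM   compounds c, assay_results@partner_db a
WHERE  a.compound_id = c.compound_id;
```

The hint is shown only to make the "where does the join run" decision visible; normally that choice is left to the cost-based optimizer.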
Oracle Streams
  Enables rule-based information sharing among multiple systems
  Captures and manages events
  Shares events with other databases and applications
  Routes published information to subscribed destinations
  Integrated with new job scheduler
[Diagram: Capture, Staging, Consumption]
SRS Transparent Gateway for Oracle
  Data supported by the SRS Gateway behave as if they were in Oracle
  Oracle rewrites the user’s SQL query into syntax understood by SRS, using the Gateway’s capability table & index
  The query is executed in the SRS system
  If the entire query cannot be mapped to SRS syntax, Oracle fetches the data and performs the remaining functions/joins locally
Integrate a Variety of Data Types
2. Integrate a Variety of Data Types
  XML DB: Unite XML content & SQL/relational data
  LOBs: Manage unstructured data (BFILEs, BLOBs, CLOBs, URIs)
  Files (Oracle9iFS): Central repository for structured & unstructured data
  Text: Index & fast query of text content
  interMedia: Manage audio, video & image data
  Extensible indexing: Manage & index complex scientific data
XML Support
  Oracle Database supports XML data model
    XMLType, XMLSchema, DOM Fidelity, XPath, …
  Query Language: SQL/XML and XML Query
  Transparent storage optimizations
  A new XML Content Repository
    Hierarchical organization of the data
    WebDAV compliant with indexing for fast access
  Copy-based Schema Evolution for XMLType
  SQLX standards compliance
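To make the XMLType and XPath support concrete, here is a small sketch that stores an XML document in a table and queries it with SQL. The table, the document structure, and the accession value are invented for illustration. It uses the extractValue and existsNode functions of that era; later Oracle releases deprecate them in favor of XMLQuery, XMLTable and XMLExists.

```sql
-- Hypothetical table holding one XML document per protein record.
CREATE TABLE protein_docs (
  id   NUMBER PRIMARY KEY,
  doc  XMLTYPE
);

INSERT INTO protein_docs VALUES (
  1,
  XMLTYPE('<Protein><Id>P12345</Id><Organism>Homo sapiens</Organism></Protein>')
);

-- Relational columns and XML content addressed in the same SQL statement.
SELECT p.id,
       extractValue(p.doc, '/Protein/Id') AS accession
FROM   protein_docs p
WHERE  existsNode(p.doc, '/Protein[Organism="Homo sapiens"]') = 1;
```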
XDK Advances XML APIs
  XDK unifies XML APIs in/outside Database
    Simplifies XML application development in the Database, Mid-tier & Clients
    Eliminates multi-step processing by operating directly on XMLType
    Improves application performance in Java, C, and C++
  XSLT performance increase up to 100%
  Additional XML Standards Support
    DOM 3, XSLT 2, XPath 2
    XML Pipeline, XPointer, JAXB
Reed Elsevier
  Largest technical publishing conglomerate
  $8B annual revenue
  More than 1700 scientific, technical & medical peer-reviewed journals
  Over 59 million abstracts
  Over two million full-text scientific journal articles, plus another one million full-text articles via CrossRef (http://www.crossref.org/) to other publishers' platforms
  Oracle XML DB chosen as Repository Database
Oracle Text
  Search text documents
  Simplify text application development via JDeveloper Wizards
  Classification & clustering using data mining algorithms
    Theme & classification visualization
  Improved Filtering
    Multi-part MIME for email
    Encrypted PDF
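A minimal sketch of text search with Oracle Text, assuming a hypothetical abstracts table. The index type CTXSYS.CONTEXT and the CONTAINS and SCORE operators are standard Oracle Text; everything else is illustrative.

```sql
-- Hypothetical table of literature abstracts.
CREATE TABLE abstracts (
  id        NUMBER PRIMARY KEY,
  title     VARCHAR2(400),
  abstract  CLOB
);

-- Full-text index on the CLOB column.
CREATE INDEX abstracts_txt_idx ON abstracts (abstract)
  INDEXTYPE IS CTXSYS.CONTEXT;

-- Ranked search: documents mentioning both terms, best score first.
SELECT id, title, SCORE(1) AS relevance
FROM   abstracts
WHERE  CONTAINS(abstract, 'lymphoma AND kinase', 1) > 0
ORDER  BY SCORE(1) DESC;
```

Rows added after the index is created become searchable once the index has been synchronized, for example with CTX_DDL.SYNC_INDEX.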
Oracle Text
European Bioinformatics Institute
  Manages major public databases (e.g. SwissProt, EMBL Nucleotide Sequence Database, Medline) in Oracle (total: > 5 TB)
  Uses Oracle XML DB and Oracle Text for Medline (in development)
    Size: 11 million records, 200 GB
  Uses Oracle Database and Application Server
interMedia
  Ability to store wide range of image types
  Processing functionality
    Rotate/flip, brighten/darken using gamma processing, adjust contrast, change bit depth
  Access through SQL, Java & Web interfaces
  Restrict access via security roles
  Conform to SQL/MM still image standard
  Store images as columns
    Tight integration with annotations
Network Data Model
  Model, store, manage & analyze generic connectivity relationships in the DB, i.e. represent data as nodes & links
  Can model hierarchies, logical or spatial information, directionality
  Network analysis at client or application level, e.g. shortest-path, tracing, within-distance analysis, minimum cost spanning tree, nearest neighbor
  Network management, e.g. add, delete, modify, load
Network Data Model Reference
"Oracle 10g's Network Data Model feature is great for building a semantic work infrastructure. Oracle 10g's graphical representation is an excellent tool for planning our Y2H protein interaction data storage needs and for building a signaling network from our Nature-AfCS Molecule Pages Database." - Joshua Li, Sr. Computational Scientist, San Diego Supercomputer Center / UCSD
"Beyond Genomics, Inc., as a leading systems biology company, believes that Oracle 10g's network data model will significantly advance the integration of metabolomic, proteomic, transcriptomic, and clinical data sets and the applications that derive value from these data." - Eric Neumann, Vice President Strategic Informatics, Beyond Genomics, Inc.
Extensibility Framework
[Diagram labels: Oracle8i Server, Service Interfaces, Data Cartridge, Extensibility Interfaces, Type System, Query Processing, Data Indexing, Server Execution, ..., Database Extensibility Services, Oracle10g Server]
Data Cartridges
  Manage complex scientific data
Chemical Searching
  Chemistry searching requires special techniques
    Chemical name is not unique: “Viagra®”, “sildenafil citrate”
    Chemists think graphically
  The solution:
    A graphical user interface
    Specialized operators such as substructure search (“sss”) = a chemical “contains”
[Figure: a drawn substructure query finds the matching chemical structures]
Manage Vast Quantities of Data
3. Manage Vast Quantities of Data
  Partitioning: Divide and conquer
  Oracle 10g Application Server: Provide scalability for middle tier
  Oracle Data Guard: Protect data from human or system failures
  Real Application Clusters (RAC): Provides high availability, performance and ease of scalability
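The "divide and conquer" idea behind partitioning in the list above can be sketched with a range-partitioned table. The table name, columns, and partition boundaries are hypothetical.

```sql
-- Hypothetical table of mass spectra, split by acquisition date.
CREATE TABLE spectra (
  spectrum_id  NUMBER,
  acquired_on  DATE,
  sample_id    NUMBER,
  peak_count   NUMBER
)
PARTITION BY RANGE (acquired_on) (
  PARTITION p2003 VALUES LESS THAN (TO_DATE('2004-01-01', 'YYYY-MM-DD')),
  PARTITION p2004 VALUES LESS THAN (TO_DATE('2005-01-01', 'YYYY-MM-DD')),
  PARTITION pmax  VALUES LESS THAN (MAXVALUE)
);

-- A predicate on the partition key lets the optimizer skip the other partitions.
SELECT COUNT(*)
FROM   spectra
WHERE  acquired_on >= TO_DATE('2004-01-01', 'YYYY-MM-DD')
AND    acquired_on <  TO_DATE('2004-07-01', 'YYYY-MM-DD');
```

Each partition can also be loaded, backed up, or dropped on its own, which is what keeps very large tables manageable.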
Real Application Clusters (RAC)
  Works with ALL applications
  Fail-over transparent to users
  Easy to administer
  Start with one server, one database and grow as you grow
  Linear scalability out of the box
  Save on Hardware and Storage costs
[Diagram labels: High-speed interconnect, Data Loads, Sample/Lab, Proteomics Portal, A-Z]
Enterprise Grid Computing
  Grid concepts provided with:
    Distributed queries, External Tables, Security, RAC, etc.
  Grid Access to Oracle Utilities through Globus Resource Allocation Manager (GRAM)
    Export, Import, SQL*Plus
  Grid Access to Oracle Database
    Invoke PL/SQL routines specified in Globus Resource Specification Language
  Grid Resource Information Service (GRIS) for Oracle Database
    Discover & monitor Oracle databases
Performance and Scalability
  Double the performance for key OLTP workloads
  Over 30 internal performance optimizations
  Improved IA64 bit support
  Fast Interconnect support for DB/DB & DB/iAS connectivity
  Fiber support on Windows
Ultra Large Databases
  Database Size Limits Raised
    Millions of Terabytes (Exabytes)
  Unlimited size LOB columns
  Ultra Large data files
  Multiple TEMP Tablespaces
  High Speed Bulk Data Movement
    Data Pump, Transportable Tablespaces
Caprion
  Discover & develop innovative products for the diagnosis & treatment of diseases
  Scalability for a multi-TB system
  Integration of all components with existing computing environment
  Security & protection of data integrity
  Key Advantages of Oracle
    Easy access & management of integrated information
    Rapid deployment of new ad hoc queries
    Scalability necessary to accommodate growth
  Oracle Environment
    Oracle Database
    Oracle9i Application Server
    Oracle9i Developer Suite
    Oracle9i AS Discoverer
    Oracle Warehouse Builder
“The Oracle Data Warehouse is a key component of our IT platform for proteomics analysis. The massive amount of information we produce every day requires a system with proven performance to effectively capture our biological data.” - Bernard Gagnon, IT Director
Manageability
  Simplified Installation, Configuration and Upgrade
  Improved Manageability Infrastructure
  Automated Application and SQL Tuning
  Adaptive Instance Tuning
  RAC Automated Workload Management
  Automated Storage Management
  Automated Backup and Recovery
  Oracle Enterprise Manager Improvements
High Availability
  Support for Rolling Upgrades
  Online Application Upgrades
  Flashback Any Error
  Improved Data Guard Infrastructure
  Improved Backup and Recovery
Dragon Genomics Center of Takara Bio Inc.
  High-Level Project Goals
    Manage data throughout every step of a complicated process
    Create a laboratory information management system (LIMS) enabling large scale sequencing
    Provide reliable backup and recovery of vast amounts of data
  Key Benefits
    Provided easy access and management for vast amounts of data
    Ensured scalability needed to accommodate future growth
  Oracle Environment
    Oracle Database Enterprise Edition
    Oracle9iAS Enterprise Edition
"We trust Oracle in its ability to run terabyte-class databases in clustered environments with high availability. And we're pleased to say that Oracle has not disappointed us." - Toru Suzuki, Project Manager, Dragon Genomics Center, Takara Bio Inc.
Genentech, Inc.
  Leading biotech company
  Over 2 TBs of data in Oracle
  Oracle serves as a centralized information resource for gene searching and database cross-referencing.
  Oracle used for the entire pipeline from research to clinical data to manufacturing and sales applications.
  Key Advantages of Oracle
    Improved performance
    Greater reliability
    Genentech's corporate goal is 99.999% availability in a 24x7 environment
  Oracle Environment
    Oracle 9i database
    Real Application Clusters
“Oracle9i Real Application Clusters provide the foundation for the scalable and highly available database infrastructure we require to meet our growing data demands in all areas of our business.” - Scooter Morris, Genentech, Inc.
San Diego Supercomputer Center
“In the beginning, we considered using MySQL, Oracle, and another database. But when we evaluated our project needs over the next ten years and realized that our database could grow to terabytes, we decided we needed a scalable database and one that was reliable. We didn’t want to be forced to change databases in the middle of the project. … We do not need a lot of DBAs to maintain the database.” - Joshua Li, Senior Computational Scientist, San Diego Supercomputer Center, University of California, San Diego
  Systemwide, SDSC relies on only three DBAs to run over 40 Oracle databases.
Bioinformatics Center, Institute for Chemical Research, Kyoto University
  The Bioinformatics Center, Institute for Chemical Research, Kyoto University is leading biotechnology research thanks to its comprehensive studies in various areas, including the life sciences, information sciences, chemistry and physics.
“In order to manage this massive amount of genetic information and to operate efficiently, it is essential to have a platform with paramount stability. Our web site receives accesses from all over the world continuously, 24 hours a day. In order to offer the latest information under such circumstances, performance is also an issue. In this sense, the Oracle Database was the most appropriate since it can handle this enormous amount of data in a fast and stable manner, 24 hours a day.” - Professor and Director Minoru Kanehisa, Bioinformatics Center, Institute for Chemical Research, Kyoto University
Collaborate Securely
4. Collaborate Securely
  Oracle 10gAS Portal: Build personalized portals
  Oracle Workflow: Automate laboratory and business processes
  Oracle 10gAS Files: Enable content management and collaboration (revision control, check-in/check-out, access control)
  Virtual Private Database: Different users have unique access privileges
  Auditing: Create audit trail to facilitate FDA compliance
  Oracle 10gAS Web Services: Standard way to collaborate through the Web
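The Virtual Private Database item in the list above works by attaching a policy function to a table; the function returns a predicate that the database appends to every statement against that table. The sketch below assumes a hypothetical LAB.EXPERIMENTS table with an owner column; DBMS_RLS.ADD_POLICY and SYS_CONTEXT are standard.

```sql
-- Policy function: returns the WHERE predicate added to each statement.
-- Hypothetical rule: users see only rows whose owner column matches their login.
CREATE OR REPLACE FUNCTION own_rows_only (
  p_schema IN VARCHAR2,
  p_object IN VARCHAR2
) RETURN VARCHAR2
IS
BEGIN
  RETURN 'owner = SYS_CONTEXT(''USERENV'', ''SESSION_USER'')';
END;
/

-- Attach the policy to the (hypothetical) LAB.EXPERIMENTS table.
BEGIN
  DBMS_RLS.ADD_POLICY(
    object_schema   => 'LAB',
    object_name     => 'EXPERIMENTS',
    policy_name     => 'EXPERIMENTS_OWN_ROWS',
    function_schema => 'LAB',
    policy_function => 'OWN_ROWS_ONLY',
    statement_types => 'SELECT, UPDATE, DELETE');
END;
/
```

Because the predicate is enforced inside the database, every application and ad hoc tool sees the same restricted view without extra code.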
Oracle Portal
  Rich, declarative environment
    Create Web interfaces, publish and manage information, access dynamic data, and customize
  Extensible J2EE framework
  Connect researchers and collaborators with the information they need and the flexibility to create views tailored to each community
Oracle Collaboration Suite
  Integrated communications
  Single enterprise search across all repositories
  Flexible access
Integrated Data and Web Services Platform
[Diagram: a SOAP service requestor, with UDDI and WSDL, reaches over SOAP the Oracle Database data services (PL/SQL, Java, Relational, Text, Binary, XDB, Streams/AQ, DBMS Jobs, System Admin), the iAS application services (J2EE, Portal, BI, Wireless, ...), and eBusiness & Collaboration Services. SOAP = SOAP or ebXML over HTTP, JMS, SMTP, FTP.]
  Complete data protection
  Manage user access
  Detect data misuse with Auditing
  Facilitate regulatory compliance (HIPAA, 21 CFR Part 11)
Security Evaluations
                                    Oracle   Microsoft   IBM
US TCSEC, Level B1                  1        -           -
US TCSEC, Level C2                  1        1           -
UK ITSEC, Levels E3/F-C2            3        -           -
UK ITSEC, Levels E3/F-B1            3        -           -
ISO Common Criteria, EAL-4          4        -           -
Russian Criteria, Levels III, IV    2        -           -
US FIPS 140-1, Level 2              1        Failed      -
TOTAL                               15       1           0
Oracle10g Unbreakable Security
Taratec e-Compliance™
  Built specifically to support FDA 21 CFR Part 11 compliance
  Designed for Life Sciences Data & File Management
  Features
    Versioning, Advanced Searching, Check-in/Check-Out
    Integrated storage of files from any source
    Universal access through Web browser
    Complete Audit Trail of File Operations
“With Oracle as the foundation, we were able to develop a solution that can secure a vast array of file-based data with vault-like security.” - Bill Gargano, President and COO, Taratec Development Corporation
University of California San Diego School of Medicine
The Patient Centered Access to Secure Systems Online (PCASSO)
  178,000 Medical Records
  Provides trusted access to a patient’s health information from healthcare providers over the Internet
  Oracle Label Security & Virtual Private Database
    The security is locked to the data and therefore can’t be subverted.
    No application coding needed to implement security.
Find Patterns and Insights
5. Discover Patterns and Insights
  Oracle Data Mining: Find relationships and clusters
    Naïve Bayes, Adaptive Bayes Networks, Attribute Importance, Association Rules, K-Means, O-Cluster, SVM, NMF algorithms
  Oracle Discoverer & Oracle OLAP: Interactive query & drill-down
  Statistics: Perform statistics in Oracle
    E.g. summary statistics, hypothesis testing, cross-tab statistics, distribution testing, correlations, linear regression
  Oracle Text: Search, index, classify and cluster documents
  Table Functions: Implement complex algorithms within the database
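The Table Functions item in the list above can be illustrated with a small pipelined function: an algorithm written in PL/SQL whose output is queried like a table. The types and the function (kmer_row, kmer_tab, kmers) are hypothetical and unrelated to Oracle's own BLAST implementation; the sketch only shows the mechanism.

```sql
-- Row and collection types returned by the function (hypothetical names).
CREATE TYPE kmer_row AS OBJECT (pos NUMBER, kmer VARCHAR2(32));
/
CREATE TYPE kmer_tab AS TABLE OF kmer_row;
/

-- Pipelined table function: emits every substring of length p_k from p_seq.
CREATE OR REPLACE FUNCTION kmers (
  p_seq IN VARCHAR2,
  p_k   IN PLS_INTEGER
) RETURN kmer_tab PIPELINED
IS
BEGIN
  -- The loop body is skipped when the sequence is shorter than p_k.
  FOR i IN 1 .. LENGTH(p_seq) - p_k + 1 LOOP
    PIPE ROW (kmer_row(i, SUBSTR(p_seq, i, p_k)));
  END LOOP;
  RETURN;
END;
/

-- The function is used in the FROM clause like any other row source.
SELECT kmer, COUNT(*) AS occurrences
FROM   TABLE(kmers('ATCGCGTT', 3))
GROUP  BY kmer
ORDER  BY kmer;
```

Because the result is an ordinary row source, it can be joined, filtered, and aggregated together with relational data in the same query.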
5. Discover Patterns and Insights
  Deductive Analysis: Answer complex questions about the relationships in genomic, clinical and pharmacological data
  Inductive Analysis: Finding relationships for classification, class discovery and prediction
[Diagram: Life Sciences data: Pharmacological databases, Proteomics Database, Clinical Databases, Functional Genomic Databases]
BLAST
  Implemented using a table function interface
  BLAST search functions can be placed in SQL queries
  Different functions for match & align
  SQL queries can be used to pre-filter database of sequences & post-process the search results
  Combination of SQL queries & BLAST is very powerful & flexible
Sample BLAST Query
  On the output of a BLAST search, find out how many matches belong to each functional class (SwissProt keyword): cell cycle, DNA repair, etc.
select function, COUNT(seq_id) f_count
from (select t.seq_id, t.score, t.expect, g.function
      from SwissProt_DB g,
           Table(BLASTP_MATCH('AEQAERYDDMAAAMKRY',
                              cursor (select seq_id, sequence from SwissProt_DB),
                              5)) t  /* expect_value */
      where t.seq_id = g.seq_id)
group by function  /* swissprot kw */
order by f_count
[Query plan diagram: BLASTP_MATCH takes query_sequence and parameters over SwissProt_DB and returns seq_id, score, expect; the result is joined to SwissProt_DB (seq_id, function) on t.seq_id = g.seq_id; GROUP BY produces function, f_count]
Sample BLAST Query
  For the query sequence “ATCGCGTT”, find the top 3 matches above a similarity threshold from each organism
select seq_id, organism, score, expect
from (select t.seq_id, t.score, t.expect, g.organism,
             RANK() OVER (PARTITION BY organism ORDER BY score DESC) as o_rank
      from SwissProt_DB g,
           Table(SYS_BLASTP_MATCH('ATCGCGTT',
                                  cursor (select seq_id, sequence from SwissProt_DB),
                                  5)) t  /* expect_value */
      where t.seq_id = g.seq_id)
where o_rank <= 3
[Query plan diagram: SYS_BLASTP_MATCH takes query_sequence and parameters over SwissProt_DB and returns seq_id, score, expect; the result is joined to SwissProt_DB on t.seq_id = g.seq_id; RANK is computed per organism; the selection o_rank <= 3 produces seq_id, organism, score, expect]
BLAST “Delighters”
  Queries performed in the database
  Ability to perform combinatorial queries, e.g. sequence similarity AND annotation contains “Lymphoma”
Example BLAST Application
BLAST Quote
"Oracle 10g's new BLAST feature will enable us to easily integrate multiple types of genomic and proteomic data for complicated queries used in the mining of our proprietary protein-protein interaction and cDNA sequence datasets." - Jake Chen, Principal Bioinformatics Scientist, Myriad Proteomics
Regular Expression Searches
  A powerful method of describing both simple & complex patterns for searching & manipulating
  Multilingual regular expression support for SQL & PL/SQL string types
  Follows POSIX style Regexp syntax
  Supports standard Regexp operators
  Includes common extensions such as case-insensitive matching, sub-expression back-references, etc.
  Compatible with popular Regexp implementations like GNU, Perl, Awk
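A few hedged examples of the regular expression functions introduced in 10g (REGEXP_LIKE, REGEXP_INSTR, REGEXP_SUBSTR, REGEXP_REPLACE). The proteins table and its columns are hypothetical; the motif is the commonly cited N-glycosylation pattern and is used here only as sample input.

```sql
-- Hypothetical table: proteins(seq_id, sequence, description).

-- Sequences containing the motif N, not P, S or T, not P.
SELECT seq_id
FROM   proteins
WHERE  REGEXP_LIKE(sequence, 'N[^P][ST][^P]');

-- Position and text of the first occurrence of the motif.
SELECT seq_id,
       REGEXP_INSTR(sequence, 'N[^P][ST][^P]')  AS motif_pos,
       REGEXP_SUBSTR(sequence, 'N[^P][ST][^P]') AS motif
FROM   proteins;

-- Case-insensitive matching via the 'i' match parameter.
SELECT seq_id
FROM   proteins
WHERE  REGEXP_LIKE(description, 'kinase', 'i');

-- Back-references in the replacement string: 'Last, First' becomes 'First Last'.
SELECT REGEXP_REPLACE('Kanehisa, Minoru', '^([^,]+), *(.+)$', '\2 \1') AS full_name
FROM   dual;
```

The last query returns 'Minoru Kanehisa'.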
Regular Expression Searches Quote
"Thanks to Oracle 10g's Regular Expressions (RE) query support, it's no longer necessary to export data from the database, process it with a RE enabled tool and then import the data back into the database. Now, RE processing can be handled with a single query." - Marcel Davidson, Head of Database Administration, Myriad Proteomics
Quotes
“Support for regular expressions in SQL and PL/SQL is one of the most exciting features of Oracle Database 10G. Oracle has long supported the ANSI-standard LIKE predicate for rudimentary pattern matching, but regular expressions take pattern matching to a new level. They provide a powerful way to select data that matches a pattern, as well as to manipulate, rearrange, and change that data.”
Oracle Regular Expressions Pocket Reference, O’Reilly Sept. 2003
Statistics
  Descriptive Statistics
    mode, summary statistics
  Hypothesis Testing
    Student t-test, F-test, Binomial test, Wilcoxon Signed Ranks test, Chi-square, Mann Whitney test, Kolmogorov-Smirnov test, One-way ANOVA
  Cross Tabs
    Enhanced with % statistics
  Distribution fitting
    Kolmogorov-Smirnov Test, Anderson-Darling Test, Chi-Squared Test, NORMAL_DIST_FIT, UNIFORM_DIST_FIT, WEIBULL_DIST_FIT, EXPONENTIAL_DIST_FIT
  Correlations
    Spearman's rho coefficient, Kendall's tau-b coefficient
  Pareto Analysis
    80:20 rule, cumulative results table
10g Statistics & SQL Analytics
  Ranking functions
    rank, dense_rank, cume_dist, percent_rank, ntile
  Window Aggregate functions (moving and cumulative)
    avg, sum, min, max, count, variance, stddev, first_value, last_value
  LAG/LEAD functions
    Direct inter-row reference using offsets
  Reporting Aggregate functions
    sum, avg, min, max, variance, stddev, count, ratio_to_report
  Statistical Aggregates
    Correlation, linear regression family, covariance
  Linear regression
    Fitting of an ordinary-least-squares regression line to a set of number pairs. Frequently combined with the COVAR_POP, COVAR_SAMP, and CORR functions.
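The linear regression family and the window functions listed above can be sketched as follows, assuming a hypothetical dose_response table. In the REGR_ functions the dependent variable comes first and the independent variable second.

```sql
-- Hypothetical table: dose_response(compound_id, dose, response).

-- Ordinary-least-squares fit of response on dose, one line per compound.
SELECT compound_id,
       REGR_SLOPE(response, dose)     AS slope,
       REGR_INTERCEPT(response, dose) AS intercept,
       REGR_R2(response, dose)        AS r_squared,
       REGR_COUNT(response, dose)     AS n_pairs,
       CORR(response, dose)           AS pearson_r,
       COVAR_SAMP(response, dose)     AS covariance
FROM   dose_response
GROUP  BY compound_id;

-- Inter-row reference (LAG) and a moving average over the last three doses.
SELECT compound_id, dose, response,
       response - LAG(response) OVER (PARTITION BY compound_id ORDER BY dose) AS delta,
       AVG(response) OVER (PARTITION BY compound_id ORDER BY dose
                           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)          AS moving_avg
FROM   dose_response;
```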
  Descriptive Statistics
    average, standard deviation, variance, min, max, median (via percentile_cont), mode, group-by & roll-up
    DBMS_STAT_FUNCS: summarizes numerical columns of a table and returns count, min, max, range, mean, stats_mode, variance, standard deviation, median, quantile values, +/- 3 sigma values, top/bottom 5 values
  Correlations
    Pearson’s correlation coefficients, Spearman's and Kendall's (both nonparametric)
  Cross Tabs
    Enhanced with % statistics: chi squared, phi coefficient, Cramer's V, contingency coefficient, Cohen's kappa
  Hypothesis Testing
    Student t-test, F-test, Binomial test, Wilcoxon Signed Ranks test, Chi-square, Mann Whitney test, Kolmogorov-Smirnov test, One-way ANOVA
  Distribution Fitting
    Kolmogorov-Smirnov Test, Anderson-Darling Test, Chi-Squared Test, Normal, Uniform, Weibull, Exponential
  Pareto Analysis (documented)
    80:20 rule, cumulative results table
Statistics
  Enables analytic pipelines without removing data to statistical packages for simple analyses (e.g. hypothesis testing)
OracleAS Discoverer
  Ad-hoc query & reporting
  Web publishing
  Discoverer is included with Oracle Application Server Enterprise Edition
IEEE Floating Point
  Support for industry standard treatment of numbers & precision
  Critical for compute intensive operations
  Faster performance
Multi-dimensional Extensions to SQL
  Highly scalable spreadsheet-like array computation in SQL
  Columns are classified into Dimensions & Measures
  Measures can be treated as cells in an n-dimensional array
  Models can be built & stored in the database
  Supports recursive problem solving
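The spreadsheet-like computation described above corresponds to the SQL MODEL clause. The sketch below assumes a hypothetical expression_levels table in which each gene has at most one row per timepoint; the rules are arbitrary and only show how cells are addressed.

```sql
-- Hypothetical table: expression_levels(gene, timepoint, expr_level).
SELECT gene, timepoint, expr_level
FROM   expression_levels
MODEL
  PARTITION BY (gene)          -- one independent "sheet" per gene
  DIMENSION BY (timepoint)     -- cell address
  MEASURES (expr_level)        -- cell value
  RULES (
    -- New cell: timepoint 3 as the mean of timepoints 1 and 2.
    expr_level[3] = (expr_level[1] + expr_level[2]) / 2,
    -- A rule may reference a cell computed by an earlier rule.
    expr_level[4] = expr_level[3] * 2
  )
ORDER  BY gene, timepoint;
```

Cells that do not exist yet (timepoints 3 and 4 here) are added to the result, much as a formula fills a new spreadsheet cell.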
Oracle Data Mining
  Platform for data mining applications
    PL/SQL API
    Java API
    Oracle Data Miner (GUI)
  Wide range of algorithms
    Classification: Support Vector Machines, Naïve Bayes, Adaptive Bayes Networks
    Attribute Importance
    Association Rules
    Clustering: Enhanced K-Means, Orthogonal Clustering
    Nonnegative Matrix Factorization (feature extraction)
    BLAST (sequence similarity search & alignment)
Data Mining Quote
“Using InforSense discovery workflows built upon the world leading Oracle data mining, text mining and R&D Database functionality, researchers and organizations can now automate large scale and complex knowledge discovery and management activities with performance and reliability.” - Yike Guo, CEO InforSense
Bioinformatics Analytical Pipelines
[Diagram: Biological/Clinical Experiments and Instruments feed Data Pre-Processing, Analytical Algorithms, and Interpretation of Results, leading to a New Paper, New Drug, New Treatment, or New DB Entries]
Life Science Discovery Phases:
• Exploratory/Prototype Analysis
• Application Development
• Production System
[Diagram labels: Files, Perl Scripts, Algorithms, DB; Oracle Life Sciences Platform]
Oracle’s Contribution to Life Sciences
  Find me any compound that looks like my current structure, that has been tested on any assay in my company where the IC50 > 200 nM, where I know that I have a unique patent position, and that hasn't been published in any journal.
select c.id, p.structure
from compound c, protein p, assay a
where a.compound_id = c.id
and a.protein_id = p.id
and a.company = 'BIO_SYS'
and a.IC50 > 200  /* nM */
and similar_to(p.id, 'protein kinase')
and not_published(p.id, 'Medline')
and extract_value(value(p.id), 'Dgene/Protein/Id') = p.id
[Diagram: Oracle9i manages Relational, Message, XML, Text, and Image data]
“At the end of such testimonials, it was very difficult to see whether Oracle has a serious rival in the realm of databases for high-throughput drug discovery. With a well-known 70 percent market share, Oracle is starting to penetrate smaller labs in academia and nonprofit research institutes.”- Mark D. Uehling, Bio-IT World (online) 09/12/03
“All are among the features that make Database 10g much more than a large-scale data repository. Old 1960s labels such as "electronic brain" come to mind—Database 10g doesn't just know stuff, it also thinks about it.”- Peter Coffee, eWeek (online) 05/31/04
IDC Analysts
“Even IBM's own partners say that DB2 and DiscoveryLink have failed to gain much ground in the life sciences despite IBM's giveaways. According to Hall, Oracle, the "de facto standard," still holds a commanding 75 percent to 80 percent market share in this vertical.” - Mark Hall, Director of Life Sciences, IDC, quoted in InfoWeek 12/12/2002
Oracle Life Sciences Platform Summary
  Oracle 10g enables you to:
    Access distributed data
    Integrate a variety of data types
    Manage vast quantities of data
    Collaborate securely
    Find patterns and insights
  Oracle 10g is an ideal platform for life sciences
Oracle Consulting Services
  Data Mining Services
    Richard Solari [email protected]
    Patrick Hoffman [email protected]
  Life Sciences Services
    Dev [email protected]
  Life Sciences Consulting experience
    JSP Struts/BLAST application
    Gene expression analysis
    Sequence analysis (BLAST exon/intron prediction)
    Clinical/Medical data analysis
    QSAR/Cheminformatics
      Isis, Molconz, Predictive Tox
    Animal studies
    Protein analysis (arrays, mass spec)
    Ontologies and text mining
Life Sciences DM Workshop
  A one-day onsite technical session educating organizations on how to leverage one of their most valuable assets to provide insight into the operations of their business, the behavioral patterns of their customers and hidden relationships found deep within corporate data that can have a direct impact on the bottom line.
Life Sciences DM Blueprint
  A documented technical roadmap providing the organization with the strategy to integrate and deploy Life Sciences technology. This includes recommendations based on feedback from the Life Sciences workshop, focusing on source data preparation, mining methodologies and supporting architecture.
Life Sciences DM Insight
  A five-day onsite engagement focused on providing a detailed analysis of the business problem, data preparation, model build and analysis, and knowledge deployment. It extends the analysis of the Life Sciences workshop and culminates in a technical roadmap with a strategy to integrate and deploy Life Sciences technology.
Life Sciences DM Quickstart
  A thirty-day engagement focused on taking a business problem and transforming it into a Life Sciences solution. This includes transforming the business problem, preparing the data, creating the mining model and knowledge deployment. Upon completion, results will be delivered mapped to the initial business problem.
Life Sciences DM Services
  A series of custom services focused on delivering Life Sciences methodologies and solutions to provide insight into the operations of their business, the behavioral patterns of their customers and hidden relationships found deep within corporate data that can have a direct impact on the bottom line.
Oracle Global Customer Program
  S. Anand, Oracle, 9:45 Thursday, 2nd Floor
[Diagram: customer lifecycle across Buy, Implement, Run: Define the Need & Budget; Evaluate Solution & Target Benefits; Select & Procure; Implement & Track Benefits; Operate & Learn; Measure Results; Build on Relationship]
Unify the “Voice of the Customer” – and Oracle’s Response
[Diagram labels, program areas: Drive Customer Business Value, Business Consulting, Customer Insights, Customer Forums, Customer Communications, Customer Quality Management, Evaluate Customer Success & Resolve Issues, Build References, Benefits Measurement & References]
[Diagram labels, activities: Business Case Analysis, Performance Advisory Services, Strategic Solution Planning, Executive Alignment, Surveys & Scorecards, User Groups, Research & Benchmarks, Executive Forums, Customer Response, Field Referencing, Marketing Referencing, Benefits Realization & ROI, Success Stories, Outreach, Issue Resolution, Executive Sponsorship Program]
Questions and Answers