predictive analytics with oracle data mining
Post on 03-Jan-2016
85 Views
Preview:
DESCRIPTION
TRANSCRIPT
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted
Predictive Analytics with Oracle Data Mining
Vinay DeshmukhSenior Director Oracle Applications Labsvinay.deshmukh@oracle.com
Bryan HodgeGlobal Leader Customer IntelligenceCustomer Support Servicesbryan.hodge@oracle.com
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Safe Harbor StatementThe following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
Oracle Confidential – Internal/Restricted/Highly Restricted 2
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
We run the applications that run OracleWe drive enhancements based on our experienceWe share best practices with our customers
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal 4
Value Chain Opportunities and Risks
Large and Diverse Customer Base
Transition to the cloud
Complex Global Hardware Value Chain
600+ global spares warehouses
Opportunity & RiskAssessment
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal 5
Opportunity and Risk assessment using ODM
Discover hidden/subtle data patterns
Augment Value Chain Planning –both forward and
reverse
Identify inter-relationshipsamong data elements
Quantify likelihood of opportunity/risk
Rewind the clock and compare model accuracy
against actuals.
Oracle Data Mining
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Advanced Analytics Evolution
Analytical SQL in the Database
1998 1999 2002 2005 20082004 2011 2015
• 7 Data Mining “Partners”
• Oracle acquires Thinking Machine Corp’s dev. team + “Darwin” data mining software
• Oracle Data Mining 10g & 10gR2 introduces SQL dm functions, 7 new SQL dm algorithms and new Oracle Data Miner “Classic” wizards driven GUI
• New algorithms (EM, PCA, SVD
• SQLDEV/Oracle Data Miner 4.0 “work flow” GUI launched with SQL script generation and SQL Query node (R integration)
• OAA/ORE 1.3 + 1.4 launched adding several new scalable R algorithms
• Oracle Adv. Analytics for Hadoop Connector launched with scalable BDA algorithms
• Oracle Data Mining 9.2i launched – 2 algorithms (NB and AR) via Java API
• ODM 11g & 11gR2 adds AutoDataPrep (ADP), text mining, perf. improvements
• SQLDEV/Oracle Data Miner 3.2 “work flow” GUI launched
• Integration with “R” and introduction/addition of Oracle R Enterprise
• Product renamed “Oracle Advanced Analytics (ODM + ORE)
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Data remains in the Database Scalable, parallel Data Mining algorithms in SQL kernel Fast parallelized native SQL data mining functions, SQL data preparation and efficient
execution of R open-source packages High-performance parallel scoring of SQL data mining functions and R open-source
models
Fastest way to deliver enterprise-wide predictive analytics Integrated GUI for Predictive Analytics Database scoring engine
Lowest TCO Eliminate data duplication Eliminate separate analytical servers Leverage investment in Oracle IT
Oracle Advanced AnalyticsPerformance and Scalability with Low Total Cost of Ownership
avings
Model “Scoring”Embedded Data Prep
Data Preparation
Model Building
Oracle Advanced Analytics
Secs, Mins or Hours
Traditional Analytics
Hours, Days or Weeks
Data Extraction
Data Prep & Transformation
Data Mining Model Building
Data MiningModel “Scoring”
Data Prep. &Transformation
Data Import
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Insert Picture Here
• Easy to Use– Oracle Data Miner GUI for data analysts
– “Work flow” paradigm
• Powerful– Multiple algorithms & data transformations
– Runs 100% in-DB
– Build, evaluate and apply models
• Automate and Deploy– Save and share analytical workflows
– Generate SQL scripts for deployment
SQL Developer 4.0 Extension Free OTN Download
Oracle Data Miner
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
More Data Variety—Better Predictive Models
• Increasing sources of relevant data can boost model accuracy
Naïve Guess or Random
100%
0% Population Size
Resp
onde
rs
Model with 20 variables
Model with 75 variables
Model with 250 variables
Model with “Big Data” and hundreds -- thousands of input variables including:• Demographic data• Purchase POS transactional
data• “Unstructured data”, text &
comments• Spatial location data• Long term vs. recent historical
behavior• Web visits• Sensor data• etc.
100%
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Function Algorithms Applicability
ClassificationLogistic Regression (GLM)Decision TreesNaïve Bayes Support Vector Machines (SVM)
Classical statistical techniquePopular / Rules / transparencyEmbedded appWide / narrow data / text
Regression Linear Regression (GLM)Support Vector Machine (SVM)
Classical statistical techniqueWide / narrow data / text
Anomaly Detection One Class SVM Unknown fraud cases or anomalies
Attribute Importance
Minimum Description Length (MDL)Principal Components Analysis (PCA) Attribute reduction, Reduce data noise
Association Rules Apriori Market basket analysis / Next Best Offer
ClusteringHierarchical k-MeansHierarchical O-ClusterExpectation-Maximization Clustering (EM)
Product grouping / Text miningGene and protein analysis
Feature Extraction
Nonnegative Matrix Factorization (NMF)Singular Value Decomposition (SVD) Text analysis / Feature reduction
Oracle Advanced Analytics Algorithms
A1 A2 A3 A4 A5 A6 A7
F1 F2 F3 F4
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
In-Database Advanced Analytics
• Query compares the mean of AMOUNT_SOLD between MEN and WOMEN Grouped By CUST_INCOME_LEVEL ranges
• Returns observed t value and its related two-sided significance (<.05 = significant)
Independent Samples T-Test
SELECT substr(cust_income_level,1,22) income_level,avg(decode(cust_gender,'M',amount_sold,null)) sold_to_men,avg(decode(cust_gender,'F',amount_sold,null)) sold_to_women,stats_t_test_indep(cust_gender, amount_sold, 'STATISTIC','F') t_observed,stats_t_test_indep(cust_gender, amount_sold) two_sided_p_value
FROM sh.customers c, sh.sales sWHERE c.cust_id=s.cust_idGROUP BY rollup(cust_income_level)
ORDER BY 1;
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 12
Case Study: Support Cancellation Early WarningBryan Hodge
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 13
Case Study: Support Cancellation Early Warning
Premier Support Revenue$21B Contracts to be Renewed
8M Product Lines
Very diverse customer base Broad range of products
550K
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 14
Challenge:
Predict the small percentage of contracts/lines that are at risk in order to focus resources appropriately , and minimize losses
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 15
Business Solution
• Developed a cancellation early warning system
• Embedded system generated risk assessment into Forecasting Tool
• Sales Rep uses to help forecast & engage management
• Manager uses to inform forecast judgement
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 16
Two Phase Approach
Tribal Knowledge• Used sales rep experience to
identify risk attributes– E.g. Age of product, size of deal
• Profiled contract base • Established thresholds for Low,
Medium & High risk per attribute• Algorithm to balance attributes into
overall risk assessment
Oracle Data Miner• Analyzed one year of outcomes to
Train decision tree model– Cancelled or Renewed
• ‘Wound the clock back’ on six months more data
• Scored the six months data to generate predictions
• Assessed Accuracy at 85%= (True Positive + True Negatives ) /
Number of Observations
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal 17
Oracle Data Miner - Details
Warehouse star schema with enhanced attributes
Attribute Importance Analysis
Trained decision tree model & assessed accuracy
Saved results in warehouse fact for use in Forecasting Tool
ODM Analysis
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 18
Business Benefits• Identified hidden relationships
across many attributes• Improved quality of risk assessment• Early intervention for customer sat• Reduced cancellation rates
– Bottom line improvements
Challenges• Ensuring statistically significant data
volumes in tree branches• Preparation of data to ‘wind the
clock back’• Avoiding bias during data prep.• Handling partially populated
attributes
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 19
Case Study: Predicting Spare Parts@riskVinay Deshmukh vinay.deshmukh@oracle.comSenior Director Oracle Applications Lab
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 20
Case Study: Identify Spare parts @risk of short supply
Hardware Service Revenue$2.3B Global Spares Warehouses
Large deployment of Value Chain Planning . Augment VCP capabilities with Oracle Data Mining
Very diverse customer base
Broad range of products
1.5 million part-location pairs
supersessions & substitutions
600+
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Problem Statement
Ensure a high level of service to our hardware customers by identifying the parts at risk of short supply at the warehouses closest to them and take proactive steps to remedy the shortage risk . Augment current Value Chain Planning Capabilities to provide risk assessment of parts@risk.
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Demand SignalManagement
DemandManagement and
AdvancedForecasting
CollaborativePlanning and VMI
ProductionScheduling
Global OrderPromising
ServiceParts
Planning
Trade PromotionPlanning andOptimization
Supply and DistributionPlanning and Event-driven
Simulation
Network Designand
Risk Management
Sales andOperationsPlanning
PlanningAnalytics
• Single source of truth• Integrated with ERP
Oracle Value Chain Planning SolutionTransformational Tools
Deployed
Deployed
Deployed
Deployed
Deployed
DeployedDeployed
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Solution Approach
1. Augment Value Chain Planning using ODM Model2.Exception Reporting 3.Customer Report
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal 24
Model Attributes
Forecast Accuracy• MAPE
• Volatility
• Intermittency
Demand• Safety stock mean/std dev
• Forecast mean/std dev
• Shipments
• Backorders
Supply• On hand
• External repair orders
• Projected available balance
• Days of supply
Item Attributes• Cost
• PLC Code
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Prototype Assumptions for the Model: 1 of 2
• 4 Models based on – AMER (US), AMER (Non US), EMEA and APAC
• Data used for training the model was Feb , Mar and Apr 2014 with May 2014 as the target
• Input data used in the model - Apr, May, Jun 2014 with Jul 2014 as the target
• Average and Std Dev used for time phased inputs to the Model – Forecast and Safety Stock. Latest value for projected available balance used.
• Current Backorders and Onhand considered
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Prototype Assumptions for the Model: 2 of 2• For remaining parameters, 3 month average value used
• Item is marked at RISK if
• (backorder > 0 OR pab_qty < 0 OR ss_qty > oh_qty)
• Item is marked as ‘Not at RISK’ if
• (pab_qty between 0 and 0.25 )OR (oh_qty - ss_qty) between 0 and 0.25
• Remaining records were deleted. This tolerance logic was applied to restrict the count of ‘NO’ records in the training data
• The final output shows the items at risk along with the orgs where planning exceptions are generated in the latest run of the corresponding Value Chain Plan for spares
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal 27
Accuracy Analysis
APACAMEREMEA
Total Cases: 2097False YES: 0 (0%)False NO: 204 (10%)Accuracy: 90%
Latin America
Total Cases: 1902False YES: 0 (0%)False NO: 127 (7%)Accuracy: 93%
Total Cases: 2097False YES: 1 (0%)False NO: 433 (21%)Accuracy: 79%
Total Cases: 2358False YES: 1 (0%)False NO: 443 (19%)Accuracy: 81%
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Exception Reporting
• OBIEE Report to show priority exceptions generated for the parts-at-risk predicted by the ODM Model (built on Oracle Advanced Planning Command Center , Oracle Value Chain Planning Suite)
• ODM Output stored in Value Chain Planning data model by specifying the region and organization
• VCP (Advanced Planning Command Center) reports latest exceptions for the respective plan for the parts-at-risk predicted
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Detailed Solution – APCC Report
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Customer Report
• This report shows the contract, base-model and party impacted by the parts-at-risk predicted by ODM Model
• Based on the part-at-risk, the model is fetched utilizing Demantra Data.
• 'EXPIRED', 'CANCELLED', 'TERMINATED‘ contracts are filtered out
• Premier Customers impacted by the parts @ risk are identified
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 31
Case Study: Predicting Hardware OpportunitiesVinay Deshmukh vinay.deshmukh@oracle.comSenior Director Oracle Applications Lab
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Problem Statement
• Predict the outcome of (non-Cloud) Hardware Opportunities whose expected revenue is greater than $1 million–Includes both Direct and Indirect sales channel
• For the opportunities predicted to be won , provide early visibility to suppliers and contract manufacturers by leveraging the capabilities of Value Chain Planning Suite .
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Solution Approach• Train the ODM model with the historical data
– Opportunities from 31-Mar-2011 to 31-Aug-2013– Trained the model with opportunities that were Won or Lost between 31-Mar-2011 and
31-Aug-2013– Additional computed attributes used - product weight, customer weight, partner weight
• Predict the likely outcome of opportunities open as of 1-Sep-2013 using the model
• Test the prediction by comparing against actual wins and losses for predicted opportunities
• Future: Use the predicted opportunities in Value Chain Planning as causal factors to improve forecast accuracy
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal 34
Model Attributes
Industry• Top level Industry
Product• Product Line
• Primary Competitor
• Product Group
• Product Weight
Customer• Account Weight
• Annual Revenue Category
• Number of Employees
Opportunity• Cycle Time
• Opportunity Status
• Expected Revenue
• Opened Date
Partner• Channel Type
• Partner Type
• Partner Weight
Geography• LOB Code
• Region
• Country
• Sales Method
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Calculating weights using Bayesian approach• ((Customer 'x' Won Opty / Total Won Opty) * (Customer 'x' Total Opty / Total Opty)) /
(Σ(Customer 'x' Won Opty / Total Won Opty) * (Customer 'x' Total Opty / Total Opty))
• ((Product 'y' Won Opty / Total Won Opty) * (Product 'y' Total Opty / Total Opty) ) /
(Σ(Product 'y' Won Opty / Total Won Opty) * (Product 'y' Total Opty / Total Opty))
• ((Partner 'z' Won Opty / Total Won Opty) * (Partner 'z' Total Opty / Total Opty)) /
(Σ(Partner 'z' Won Opty / Total Won Opty) * (Partner 'z' Total Opty / Total Opty))
Where x = number of customers, y = Number of products, z = Number of partners
• Note: Direct and Indirect Partner weights are calculated separately. For customer weights = 0 they are replaced by the median of the customer weights
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
The Metrics
• False Positives Model predicted that an event will occur but the event did not occur over the risk horizon
• False Negatives Model predicted than an event will not occur over the risk horizon but the event did occur
Model accuracy = 1 - [(false positives + false negatives)/ total observations]
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Accuracy Achieved (Direct + Indirect channels)
• Average Accuracy– 73.0 %
• Overall Accuracy– 78.2 %
• Accuracy of winning the deal – 84.0 %
Lost Won Total Correct %
Lost 1813 1114 2927 61.9406
Won 1309 6882 8191 84.0191
Total 3122 7996 11118
Actual
Predicted
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal 38
Conclusion
Challenge
• Predict Contract Lines@risk• Predict Spare Parts@risk • Predict H/W opportunity Wins
Benefits
Solution
• Oracle Data Mining for predictive analytics• Augment Oracle Value Chain Planning capabilities provided by Oracle
Demantra and Oracle Advanced Planning Command Center• OBIEE
• Reduced Cancellation Rates• Improve Service Delivery Performance to hardware spares customers• Early demand visibility to suppliers
Predictive Analytics with Oracle Data Mining
Oracle Applications Lab
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Confidential – Internal/Restricted/Highly Restricted 40
top related