Finding Hidden Intelligence with Predictive Analysis of Data Mining

Rafal Lukawiecki, Strategic Consultant, Project Botticelli Ltd (rafal@projectbotticelli.com)

Post on 22-Dec-2015

TRANSCRIPT

  • Slide 1
  • Finding Hidden Intelligence with Predictive Analysis of Data Mining
      • Rafal Lukawiecki, Strategic Consultant, Project Botticelli Ltd, rafal@projectbotticelli.com
  • Slide 2
  • Objectives
      • Show use of Microsoft SQL Server 2008 Analysis Services Data Mining
      • Tantalise you with the power of DM
      • This seminar is based on a number of sources, including a few dozen Microsoft-owned presentations, used with permission. Thank you to Marin Bezic, Kathy Sabourin, Aydin Gencler, Bryan Bredehoeft, and Chris Dial for all the support. Thank you to Maciej Pilecki for assistance with demos.
      • The information herein is for informational purposes only and represents the opinions and views of Project Botticelli and/or Rafal Lukawiecki. The material presented is not certain and may vary based on several factors. Microsoft makes no warranties, express, implied or statutory, as to the information in this presentation.
      • Portions © 2009 Project Botticelli Ltd & entire material © 2009 Microsoft Corp. Some slides contain quotations from copyrighted materials by other authors, as individually attributed or as already covered by Microsoft Copyright ownerships. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
      • The information herein is for informational purposes only and represents the current view of Project Botticelli Ltd as of the date of this presentation. Because Project Botticelli & Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft and Project Botticelli cannot guarantee the accuracy of any information provided after the date of this presentation. Project Botticelli makes no warranties, express, implied or statutory, as to the information in this presentation. E&OE.
  • Slide 3
  • Agenda
      • Data Mining and Predictive Analytics
      • Server and Process Considerations
      • Scenarios & Demos
  • Slide 4
  • What Does Data Mining Do?
      • Explores your data
      • Finds patterns
      • Performs predictions
  • Slide 5
  • Typical Uses of Data Mining
      • Seek profitable customers
      • Understand customer needs
      • Anticipate customer churn
      • Predict sales & inventory
      • Build effective marketing campaigns
      • Detect and prevent fraud
      • Correct data during ETL
  • Slide 6
  • Server Mining Architecture
      • [Diagram] Analysis Services Server: Mining Model, Data Mining Algorithm, Data Source
      • Clients: Excel/Visio/SSRS/Your App
      • Interfaces: OLE DB/ADOMD/XMLA
      • Deploy: BIDS, Excel, Visio, SSMS
      • App Data
  • Slide 7
  • Mining Process
      • [Diagram] Training data, DM Engine, Mining Model
      • [Diagram] Data to be predicted, Mining Model, With predictions
  • Slide 8
  • SCENARIO: CUSTOMER CLASSIFICATION & SEGMENTATION
      • Who are our customers?
      • Are there any relationships between their demographics and their buying power?
  • Slide 9
  • Microsoft Decision Trees
      • Use for:
          • Classification: churn and risk analysis
          • Regression: predict profit or income
          • Association analysis based on multiple predictable variables
      • Builds one tree for each predictable attribute
      • Fast
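
The slide does not say how a tree chooses its splits. As a minimal sketch, assuming an entropy-based information-gain score (one common choice, and not necessarily the scoring method Analysis Services applies), the following picks the attribute that best separates buyers from non-buyers. The customer records, attribute names, and function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def best_split(rows, attributes, target):
    """Attribute with the highest information gain (the root of the tree)."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical training cases: demographics plus the predictable attribute.
customers = [
    {"age_band": "young", "home_owner": "no", "buyer": "no"},
    {"age_band": "young", "home_owner": "yes", "buyer": "yes"},
    {"age_band": "middle", "home_owner": "no", "buyer": "no"},
    {"age_band": "middle", "home_owner": "yes", "buyer": "yes"},
    {"age_band": "senior", "home_owner": "yes", "buyer": "yes"},
    {"age_band": "senior", "home_owner": "no", "buyer": "no"},
]

root = best_split(customers, ["age_band", "home_owner"], "buyer")
```

A full tree would repeat the same choice recursively inside each group produced by the split, and "one tree for each predictable attribute" would mean calling this once per target column.
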
  • Slide 10
  • Decision Trees for Classification of Customers' Buying Potential
  • Slide 11
  • SCENARIO: PROFITABILITY AND RISK
      • Who are our most profitable customers?
      • Can I predict profit of a future customer based on demographics?
      • Are they creditworthy?
      • How much should I charge them to give a good loan and protect against losses?
  • Slide 12
  • Profitability and Risk
      • Finding what makes a customer profitable is also classification or regression
      • Typically solved with: Decision Trees (Regression), Linear Regression, and Neural Networks or Logistic Regression
      • Often used for prediction
          • Important to predict the probability of the predicted outcome, or the expected profit
      • Risk scoring
          • Logistic Regression and Neural Networks
  • Slide 13
  • Neural Network & Logistic Regression
      • Applied to: Classification, Regression
      • Great for finding complicated relationships among attributes
      • Difficult to interpret results
      • Gradient Descent method
      • LR is NNet with no hidden layers
      • [Diagram] Input Layer (Age, Education, Sex, Income), Hidden Layers, Output Layer (Loyalty)
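
To illustrate "LR is NNet with no hidden layers" and the gradient descent method, here is a minimal sketch of logistic regression as a single sigmoid output unit trained by batch gradient descent on log-loss. The feature values (two inputs scaled to the 0..1 range), the loyalty labels, the learning rate, and the epoch count are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Logistic activation of the single output unit."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability of the positive class: inputs feed the output directly."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def train_logistic(xs, ys, learning_rate=0.5, epochs=2000):
    """Batch gradient descent on log-loss; xs are feature lists, ys are 0 or 1."""
    weights = [0.0] * len(xs[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = predict_proba(weights, bias, x) - y  # d(loss)/d(z)
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - learning_rate * g / len(xs) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(xs)
    return weights, bias


# Hypothetical cases: [scaled age, scaled income] and a 0/1 loyalty label.
features = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 0.8], [0.8, 0.9], [0.7, 0.7]]
loyal = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(features, loyal)
low = predict_proba(weights, bias, [0.15, 0.15])
high = predict_proba(weights, bias, [0.85, 0.85])
```

Adding hidden layers between the inputs and the output unit turns the same training idea into a neural network, at the cost of weights that are harder to interpret.
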
  • Slide 14
  • 1. Neural Networks for Profitability Analysis
  • 2. Predicting Lending Risk with Neural Networks
  • Slide 15
  • SCENARIO: CUSTOMER NEEDS ANALYSIS
      • How do they behave?
      • What are they likely to do once they bought that really expensive car?
      • Should I intervene?
  • Slide 16
  • Sequence Clustering
      • Analysis of:
          • Customer behaviour
          • Transaction patterns
          • Click stream
      • Customer segmentation
      • Sequence prediction
      • Mix of clustering and sequence technologies
      • Groups individuals based on their profiles including sequence data
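
As a sketch of the sequence half only, assuming a first-order Markov model (the next step depends only on the current one), the following estimates transition probabilities from click streams and predicts the most likely next step. The clustering half is omitted, and the page names and function names are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate P(next state | current state) from observed sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1
    return {
        state: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state, c in counts.items()
    }


def most_likely_next(probabilities, state):
    """Most probable next state, or None if the state was never followed."""
    options = probabilities.get(state)
    if not options:
        return None
    return max(options, key=options.get)


# Hypothetical click streams, one list per visitor.
clickstreams = [
    ["home", "cars", "finance", "checkout"],
    ["home", "cars", "accessories"],
    ["home", "cars", "finance"],
    ["home", "service"],
]

probs = transition_probabilities(clickstreams)
next_after_cars = most_likely_next(probs, "cars")
```

Grouping visitors would then amount to fitting a separate transition table per cluster and assigning each sequence to the table that explains it best.
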
  • Slide 17
  • Analysing Customer Behaviour with Sequence Clustering
  • Slide 18
  • SCENARIO: FORECASTING
      • What are my sales going to be like in the next few months?
      • Will I have credit problems?
      • Will my server need an upgrade in the next 3 months?
  • Slide 19
  • Time Series
      • Uses:
          • Forecast sales
          • Inventory prediction
          • Web hits prediction
          • Stock value estimation
      • Regression trees with extras
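
A minimal sketch of the autoregressive idea behind time series forecasting, assuming the simplest case: one linear equation on the previous value (an AR(1) model fitted by least squares) in place of a tree of regressions. The sales figures and function names are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = intercept + slope * x[t-1]."""
    prev, curr = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    slope = cov / var
    intercept = mean_curr - slope * mean_prev
    return intercept, slope


def forecast(series, steps):
    """Roll the fitted equation forward, feeding each forecast back in."""
    intercept, slope = fit_ar1(series)
    history = list(series)
    for _ in range(steps):
        history.append(intercept + slope * history[-1])
    return history[len(series):]


# Hypothetical monthly sales that follow x[t] = 20 + 0.8 * x[t-1] exactly.
monthly_sales = [50.0, 60.0, 68.0, 74.4, 79.52, 83.616]

intercept, slope = fit_ar1(monthly_sales)
next_three_months = forecast(monthly_sales, 3)
```

Because each forecast is fed back as input, errors compound over longer horizons, which is one reason short-term and long-term predictions are often handled by different methods.
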
  • Slide 20
  • Forecasting Using Time Series
  • Slide 21
  • Summary of Techniques (Algorithm: Description)
      • Decision Trees: Finds the odds of an outcome based on values in a training set
      • Association Rules: Identifies relationships between cases
      • Clustering: Classifies cases into distinctive groups based on any attribute sets
      • Naïve Bayes: Clearly shows the differences in a particular variable for various data elements
      • Sequence Clustering: Groups or clusters data based on a sequence of previous events
      • Time Series: Analyzes and forecasts time-based data combining the power of ARTXP (developed by Microsoft Research) for short-term predictions with ARIMA (in SQL 2008) for long-term accuracy
      • Neural Nets: Seeks to uncover non-intuitive relationships in data
      • Linear Regression: Determines the relationship between columns in order to predict an outcome
      • Logistic Regression: Determines the relationship between columns in order to evaluate the probability that a column will contain a specific state
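
Of the techniques summarised above, association rules are commonly scored with two measures, support and confidence. The following is a minimal sketch of both, not a description of how Analysis Services computes them; the shopping baskets and function names are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    wanted = set(items)
    return sum(wanted <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never applies in this data
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical baskets, one set of purchased items per case.
baskets = [
    {"bike", "helmet"},
    {"bike", "helmet", "lock"},
    {"bike"},
    {"helmet"},
]

bike_and_helmet = support(baskets, {"bike", "helmet"})
bike_implies_helmet = confidence(baskets, {"bike"}, {"helmet"})
```

A rule is usually kept only when both measures pass chosen thresholds, so that it is neither rare nor unreliable.
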
  • Slide 22
  • [Table: algorithms by task]
      • Algorithms: Time Series, Sequence Clustering, Neural Nets, Naïve Bayes, Logistic Regression, Linear Regression, Decision Trees, Clustering, Association Rules
      • Tasks: Classification, Estimation, Segmentation, Association, Forecasting, Text Analysis, Advanced Data Exploration
  • Slide 23
  • Summary
      • Data Mining is a powerful, predictive technology
      • Turns data into valuable, decision-making knowledge
      • SQL Server 2008 Analysis Services supports Predictive Analytics
      • Mine your mountains of data for gems of intelligence today!
  • Slide 24
  • Summary and Q&A
      • Rafal Lukawiecki, Strategic Consultant, Project Botticelli Ltd, rafal@projectbotticelli.com
  • Slide 25
  • BI & PM in an Enterprise
      • [Diagram] Data Sources, Staging Area, Manual Cleansing, Data Marts, Data Warehouse, Client Access
      • 1: Clients need access to data
      • 2: Clients may access data sources directly
      • 3: Data sources can be mirrored/replicated to reduce contention
      • 4: The data warehouse manages data for analyzing and reporting
      • 5: Data warehouse is periodically populated from data sources
      • 6: Staging areas may simplify the data warehouse population
      • 7: Manual cleansing may be required to cleanse dirty data
      • 8: Clients use various tools to query the data warehouse
      • 9: Delivering BI enables a process of continuous business improvement
  • Slide 26
  • Want Powerful BI Applications?
      • You need a well designed Data Warehouse!
      • Want BI Apps quickly with self-service abilities?
      • Ensure good dimensional design:
          • Easy to understand for a knowledge worker
          • Flexible
          • Correct and aligned
  • Slide 27
  • Three Contexts of BI Use
      • 1. Personal BI: built by me, for me, used only by me
      • 2. Team BI: built by someone on the team, for the team's use
      • 3. Organizational BI: built and maintained by IT, for use across the company
  • Slide 28
  • Integrated BI Platform
  • Slide 29
  • Resources
      • Project Botticelli at your service! Training, mentoring, do-it-with-you on-the-job assistance with all BI and SQL needs
      • Email me at rafal@projectbotticelli.com
      • Home: www.microsoft.com/bi
      • Demos on www.sqlserveranalysisservices.com, www.sqlserverdatamining.com, www.codeplex.com
      • More demos and sessions at www.microsoft.com/technetspotlight
  • Slide 30
  • Q&A
  • Slide 31
  • Thank You!
      • Please email your comments or requests to rafal@projectbotticelli.com
