mine craft:

Mine Craft: Why you should be mining your data and how to actually do it.

Upload: mark-tabladillo

Post on 27-May-2015


Category: Data & Analytics



DESCRIPTION

Why you should be mining your data and how to actually do it. Every company needs a rock star. We want it to be you. This session will give real world examples of data mining successes as well as walk you through how to get started down the path of data enlightenment, so that you too can say "I Am A Data Miner℠".

TRANSCRIPT

Page 1: Mine craft:

Mine Craft: Why you should be mining your data and how to actually do it.

Page 2: Mine craft:

Mark Tabladillo – MVP

• Mark provides enterprise data science analytics advice and solutions. He uses Microsoft Azure Machine Learning, Microsoft SQL Server Data Mining, SAS, SPSS, R, and Hadoop (among other tools). He works with Microsoft Business Intelligence (SSAS, SSIS, SSRS, SharePoint, Power BI, .NET). He is a consultant for SolidQ.

• Mark has been a national leader in analytics and data science (data mining and machine learning) through conference speaking and instructional leadership since 1998: Microsoft TechEd, PASS Business Analytics Conference, Predictive Analytics World, SAS Global Forum, PASS Summit. He connects with people on LinkedIn and on Twitter (@marktabnet).

Page 3: Mine craft:

David McFarland - MSNCSVUP

• Sr. Mgr. Business Intelligence – Rentpath • 2007 to present

• CTO – AdventureWorks Cycles • 2005 to present

• CTO – Northwind Traders • 2000 to 2005

• CTO – Lucerne Publishing • 1990 to 2000

Page 4: Mine craft:
Page 5: Mine craft:

How could data mining apply? Let’s look at three companies

Page 6: Mine craft:

Telecommunications

Page 7: Mine craft:

Oil and Gas

Page 8: Mine craft:

Volkswagen Group

Page 9: Mine craft:

Data Science Hypothesis

Hyper-

Hypo-

Page 10: Mine craft:

What | Why | How

Relational Data Warehouse | Flexible query | Data from disparate sources; tables, schema, keys, relationships, index

Hadoop & HDInsight | Flexible storage and schema, massive parallel processing | Multiple nodes and distributed computing, commodity hardware, Java; MapReduce and YARN

Tabular | Fast query and calculations, easy to understand | In-memory, columnstore indexes

Multidimensional OLAP | Fast query; ad-hoc analysis | Pre-aggregations, calculations

Data Mining & Machine Learning | Discovery of knowledge, find outliers, find similarities, make predictions | Estimations, creation of models
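
To make the "find outliers" entry in the Data Mining & Machine Learning row concrete, here is a minimal Python sketch. It is not the method used in the session (which demonstrates SQL Server Data Mining); it assumes a simple z-score rule, and the function name, the threshold of 2.0, and the sample figures are all hypothetical.

```python
from statistics import mean, stdev


def find_outliers(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds `threshold`.

    Assumption: a plain z-score rule stands in for a real mining model.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical monthly call minutes for one telecom customer.
monthly_minutes = [310, 295, 305, 300, 290, 315, 1200, 298]
print(find_outliers(monthly_minutes))  # prints [1200]
```

A z-score rule is only a starting point: it treats every column independently and is itself distorted by extreme values, which is one reason to move on to a proper mining model.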

Page 11: Mine craft:
Page 12: Mine craft:
Page 13: Mine craft:
Page 14: Mine craft:
Page 15: Mine craft:
Page 16: Mine craft:

In the beginning, there was…

Margaret*

*Her real name, as I don’t think that she is THAT innocent.

Page 17: Mine craft:
Page 18: Mine craft:

Demonstration

Page 19: Mine craft:

Data platform: SQL Server 2014

Database Services
• SQL Server*
• SQL Azure*
• Replication
• SQL Azure Data Sync*
• Full Text & Semantic Search*

Data Integration Services
• Integration Services*
• Master Data Services*
• Data Quality Services*
• StreamInsight*
• Project “Austin”*

Analytical Services
• Analysis Services*
• Data Mining
• PowerPivot*

Reporting Services
• Reporting Services*
• SQL Azure Reporting*
• Report Builder
• Power View*

Page 20: Mine craft:

3 things to tell yourself

• I will do the data mining exercise!
• I will find a way to apply data mining at work!
• I will be a rock star!

Page 21: Mine craft:

Appendix: Steps toward Mine Craft

Page 22: Mine craft:

Major Websites

SQL Server Data Mining
• http://technet.microsoft.com/en-us/sqlserver/cc510301.aspx
• http://www.sqlserverdatamining.com/

Microsoft Azure Machine Learning (currently in preview)
• http://azure.microsoft.com/en-us/services/machine-learning/

Page 23: Mine craft:

Software

• DreamSpark (students); BizSpark (businesses)
• SQL Server 2014 Enterprise (includes database engine, Analysis Services, SSMS and SSDT): http://www.microsoft.com/en-us/server-cloud/products/sql-server/default.aspx
• Microsoft Office: http://office.microsoft.com/en-us/
• Primer on Power BI -- MarkTab: http://blogs.msdn.com/b/mvpawardprogram/archive/2014/08/04/primer-on-power-bi-business-intelligence.aspx

Page 24: Mine craft:

Preparing for Microsoft SQL Server Data Mining

Last updated: October 28, 2014

SQL Server

• You will need SQL Server 2008 or higher; please include “Database Engine”, “Integration Services”, and “Analysis Services”. For SQL Server 2012 or 2014, you need the “Multidimensional and Data Mining Mode” for Analysis Services. (You may optionally install semantic search in SQL Server 2012 or 2014; print out the following directions before installing: http://msdn.microsoft.com/en-us/library/gg509085 )

o SQL Server 2008 or 2008 R2 – Enterprise Edition (or Developer Edition). The requirements for SQL Server 2008 are at http://msdn.microsoft.com/en-us/library/ms143506(v=SQL.100).aspx . All client tools should be installed, including SQL Server Management Studio (SSMS) and Business Intelligence Development Studio (BIDS). Directions for installation are at http://msdn.microsoft.com/en-us/library/ms143219(v=SQL.100).aspx

o SQL Server 2012 or 2014 – Business Intelligence Edition or Enterprise Edition (or Developer Edition). The requirements for SQL Server 2014 are at http://msdn.microsoft.com/en-us/library/ms143506 and include .NET 3.5 SP1 (.NET 4.0 is also required, but it is installed during setup). All client tools should be installed, including SQL Server Management Studio (SSMS) and SQL Server Data Tools (SSDT). Directions for installation are at http://technet.microsoft.com/en-us/library/ms143219(v=sql.120).aspx

• Click http://www.microsoft.com/sqlserver/en/us/get-sql-server/try-it.aspx for a 180-day trial version of SQL Server 2014

• Make sure you run Windows Update to have all the latest service packs and security updates applied
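
The SQL Server requirements above can be turned into a small checklist. The Python sketch below only encodes the rules as stated on this slide; it does not query a real installation, and the function name, the idea of identifying a release by its year, and the feature strings passed in are assumptions made for illustration.

```python
# Features the slide asks for on every supported release.
REQUIRED_FEATURES = {"Database Engine", "Integration Services", "Analysis Services"}
# Analysis Services mode the slide asks for on SQL Server 2012 and 2014.
MODE_FEATURE = "Multidimensional and Data Mining Mode"


def missing_prerequisites(version_year, installed_features):
    """Return a list of problems with a (hypothetical) SQL Server setup.

    `version_year` is the release year in the product name, e.g. 2014.
    `installed_features` is any iterable of feature names.
    """
    problems = []
    if version_year < 2008:
        problems.append("SQL Server 2008 or higher is required")
    installed = set(installed_features)
    for feature in sorted(REQUIRED_FEATURES - installed):
        problems.append(f"missing feature: {feature}")
    if version_year in (2012, 2014) and MODE_FEATURE not in installed:
        problems.append(f"missing Analysis Services mode: {MODE_FEATURE}")
    return problems


# Example: a 2014 install that skipped the mode selection.
for problem in missing_prerequisites(
    2014, ["Database Engine", "Integration Services", "Analysis Services"]
):
    print(problem)
```

An empty list means the setup matches the slide's checklist; edition requirements and client tools (SSMS, BIDS or SSDT) are deliberately left out of this sketch.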

Page 25: Mine craft:

Microsoft Office

• (Data mining does not integrate with the browser-based Office 365, which is otherwise a nice product.)

• You will need Office 2007 or higher (with Excel) along with the free data mining add-in:

o For Office 2007: The 32-bit data mining add-in works with SQL Server 2008 or 2008 R2: http://www.microsoft.com/en-us/download/details.aspx?id=7294

o For Office 2010: The 32- or 64-bit data mining add-in works with SQL Server 2012 or earlier: http://www.microsoft.com/en-us/download/details.aspx?id=35578

o For Office 2013: The 32- or 64-bit data mining add-in works with SQL Server 2012 or earlier: http://www.microsoft.com/en-us/download/details.aspx?id=35578

• Install the add-in, and select all of its components (sometimes not all of them are checked by default).

• After installation, run the “Server Configuration Utility” (from the Windows menu) to make sure you can connect from Excel to Analysis Services. Please also open the “Sample Excel Data” (Excel Workbook) to confirm that you can see the Data Mining tab and connect to Analysis Services. If you need help, there is a separate “Help and Documentation” link (which comes up either from Excel or from the Windows menu).

o You will need your own instance of an Analysis Services database on which you have administrative privileges (allowing both read and write access); if you have any questions on this point, please talk with a professional in your Information Technology group.

• Click http://technet.microsoft.com/en-us/evalcenter/jj192782.aspx for a 60-day trial version of Office Professional Plus 2013

• Make sure you run Windows Update to have all the latest service packs and security updates applied
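
The add-in choices listed above for Office 2007, 2010, and 2013 amount to a small lookup table. The Python sketch below restates that list so the right download can be picked from the Office version and bitness; the dictionary layout and the function name are hypothetical, and the URLs and compatibility notes are copied from this slide rather than checked against current downloads.

```python
# Office version -> add-in details, as listed on the slide.
ADDINS = {
    2007: {
        "bitness": (32,),
        "sql_server": "2008 or 2008 R2",
        "url": "http://www.microsoft.com/en-us/download/details.aspx?id=7294",
    },
    2010: {
        "bitness": (32, 64),
        "sql_server": "2012 or earlier",
        "url": "http://www.microsoft.com/en-us/download/details.aspx?id=35578",
    },
    2013: {
        "bitness": (32, 64),
        "sql_server": "2012 or earlier",
        "url": "http://www.microsoft.com/en-us/download/details.aspx?id=35578",
    },
}


def addin_url(office_year, bits):
    """Return the add-in download URL for an Office version and bitness."""
    entry = ADDINS.get(office_year)
    if entry is None:
        raise ValueError(f"no add-in listed for Office {office_year}")
    if bits not in entry["bitness"]:
        raise ValueError(f"no {bits}-bit add-in listed for Office {office_year}")
    return entry["url"]


print(addin_url(2013, 64))
```

Raising an error for an unlisted combination (for example, 64-bit Office 2007) keeps the sketch from guessing beyond what the slide states.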