1 Copyright © 2017, SAS Institute Inc. All rights reserved. EVOLUTION OF ANALYTICS AND SAS: EMBRACING THE CHANGE AND CHALLENGES Data Science Talks, Budapest Data Forum, 13 June 2017 Tuba Islam, SAS Global Technology Practice, Analytics


Page 1

EVOLUTION OF ANALYTICS AND SAS: EMBRACING THE CHANGE AND CHALLENGES

Data Science Talks, Budapest Data Forum, 13 June 2017

Tuba Islam, SAS Global Technology Practice, Analytics

Page 2

New trends and challenges in the analytics market (15 mins)

A healthcare use case (2 mins)

Demo of new developments at SAS (10 mins)


Page 4

Company Confidential – For Internal Use Only

Machine Learning is NOT new

Machine Learning

PROC DISCRIM (K-nearest-neighbor discriminant analysis)

– James Goodnight, SAS founder and CEO, 1979

Neural Networks and Statistical Models, SAS Institute, 1994

SAS Data Mining Primer Course, SAS Institute, 1998

Page 5

SAS Machine Learning

• Neural networks

• Decision trees

• Random forests

• Associations and sequence discovery

• Gradient boosting and bagging

• Support vector machines

• Nearest-neighbor mapping

• k-means clustering

• Self-organizing maps

• Local search optimization techniques such as genetic algorithms

• Regression

• Expectation maximization

• Multivariate adaptive regression splines

• Bayesian networks

• Factorization Machines

• Kernel density estimation

• Principal components analysis

• Singular value decomposition

• Gaussian mixture models

• Sequential covering rule building

• Model Ensembles

• And more…

ALGORITHMS

Page 6

Why Now?

These concepts are not new, but they are gaining fresh momentum.

Page 7

Machine Learning

Artificial Intelligence

Deep Learning

Cognitive Computing

Page 8

How Did We Get to “Machine” in Analytics?

The definition has been refined over the years:

Tool: A tool contains one or more parts that use energy to perform an intended action.

Computer: A computer is a type of machine, and its programs are sets of instructions for completing specific actions.

[Machine] Learning Algorithm: Algorithms that learn are now defined as ‘machines’, that is, computers that have the ability to learn without being explicitly programmed (Arthur Samuel, 1959).

Page 9

1. Traditional Predictive Analysis

[Diagram. Model building: Training data, Champion Algorithm (tournament implied), Hypothesis (score code). Scoring: New input, Hypothesis (score code), Estimated output.]

• Model building is an “off-line” process

• Scoring is “on-line” or “in-line”
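The split between off-line model building and on-line scoring can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SAS code: the two candidate models, the synthetic data and every function name (`fit_mean`, `fit_line`, `build_champion`) are made up for the example. The "tournament" is read here as picking the candidate with the lowest validation error.

```python
import random
from statistics import mean

def fit_mean(rows):
    """Candidate 1: always predict the average training target."""
    avg = mean(y for _, y in rows)
    return lambda x: avg

def fit_line(rows):
    """Candidate 2: one-variable least-squares line y = a + b * x."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in rows) / sxx
    a = y_bar - b * x_bar
    return lambda x: a + b * x

def mse(model, rows):
    """Mean squared error of a fitted model on (x, y) rows."""
    return mean((model(x) - y) ** 2 for x, y in rows)

def build_champion(train, valid):
    """Off-line step: fit every candidate, keep the one with the lowest validation error."""
    candidates = {"mean": fit_mean(train), "line": fit_line(train)}
    name = min(candidates, key=lambda n: mse(candidates[n], valid))
    return name, candidates[name]

# Synthetic training data (an assumption for the example): y is roughly 2x + 1.
rng = random.Random(0)
data = [(float(x), 2.0 * x + 1.0 + rng.gauss(0, 0.1)) for x in range(20)]
train, valid = data[:15], data[15:]

champion_name, score = build_champion(train, valid)

# On-line step: the frozen score code is applied to each new input.
estimate = score(25.0)
```

The model is built once and never changes while scoring, which is what distinguishes this first stage from the retraining loops on the following slides.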

Page 10

2. Machine Learning

• This is the definition of an analytic “machine”!

• This process repeats until no more improvement is possible (i.e. the model reaches ‘convergence’)

[Diagram labels: Training data, Champion Algorithm, Hypothesis (score code), Retrain, New input, Estimated output]
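A minimal sketch of "repeat until no more improvement is possible", assuming a one-parameter model fitted by gradient descent. The model, the learning rate `lr`, the tolerance `tol` and the function name are all choices made for this example; the slide does not prescribe any of them.

```python
def train_until_converged(rows, lr=0.01, tol=1e-9, max_rounds=10_000):
    """Fit y = w * x by gradient descent, stopping when the loss stops improving."""
    w = 0.0
    prev_loss = float("inf")
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in rows) / len(rows)
        w -= lr * grad
        loss = sum((w * x - y) ** 2 for x, y in rows) / len(rows)
        # Convergence test: the last round improved the loss by less than tol.
        if prev_loss - loss < tol:
            break
        prev_loss = loss
    return w, rounds

# Synthetic data (an assumption): y is exactly 3x.
rows = [(float(x), 3.0 * x) for x in range(1, 6)]
w, rounds = train_until_converged(rows)
```

The loop ends well before `max_rounds` here because each round shrinks the error by a constant factor; on real data the stopping rule and tolerance would need more care.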

Page 11

3. The Next-gen of Machine Learning

• The technique “learns” from new data

[Diagram labels: Training data, Champion Algorithm, Hypothesis (score code), Retrain, New input, Estimated output]
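One way to read "learns from new data" is on-line learning: each new observation is scored first and then used to update the model. The sketch below assumes a small linear model updated by one stochastic gradient step per observation. The class name, the learning rate and the synthetic stream are illustrative assumptions, not a description of any SAS procedure.

```python
class OnlineLinearModel:
    """y is approximated by w * x + b, updated by one SGD step per labelled observation."""

    def __init__(self, lr=0.05):
        self.w = 0.0
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return self.w * x + self.b

    def learn(self, x, y):
        # Move the parameters a small step against the prediction error.
        err = self.predict(x) - y
        self.w -= self.lr * err * x
        self.b -= self.lr * err

model = OnlineLinearModel()

# Synthetic stream (an assumption): inputs in [0, 0.9] with y = 2x + 1, replayed 500 times.
stream = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(10)] * 500

for x, y in stream:
    estimate = model.predict(x)  # score the new input first
    model.learn(x, y)            # then learn from it once the outcome is known
```

Unlike the off-line case, there is no separate retraining job: the score code itself drifts toward the data it sees.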

Page 12

4. True Artificial Intelligence

[Diagram labels: Training data; Champion 1, Champion 2, Champion 3, … Champion n; (score code); Hypothesis 1, Hypothesis 2, Hypothesis 3, … Hypothesis n; Decision logic; New input; Decision / Work / Action]

Still a long way off…
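The data flow in this diagram (several hypotheses score the same input, and decision logic turns the scores into an action) can be sketched as below. This shows only the plumbing and is in no sense "true AI". The two score functions, their thresholds and the action names are hypothetical.

```python
def decide(new_input, hypotheses, decision_logic):
    """Score one input with every hypothesis, then let decision logic pick an action."""
    scores = {name: h(new_input) for name, h in hypotheses.items()}
    return decision_logic(scores), scores

# Hypothetical score codes standing in for Hypothesis 1 … Hypothesis n.
hypotheses = {
    "fraud_risk": lambda tx: 0.9 if tx["amount"] > 1000 else 0.1,
    "customer_value": lambda tx: 0.8 if tx["tenure_years"] >= 5 else 0.3,
}

def decision_logic(scores):
    """Hypothetical rules combining the individual scores into one action."""
    if scores["fraud_risk"] > 0.5:
        return "review" if scores["customer_value"] > 0.5 else "block"
    return "approve"

action, scores = decide({"amount": 2500, "tenure_years": 7}, hypotheses, decision_logic)
```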

Page 13

Input Learning

Customer Targeting

Crop Yields

Fraud

Credit Risk

Smart Cities

Medical Images

THE PROCESS OF MACHINE LEARNING IN BUSINESS

Page 14

APPLICATIONS OF MACHINE LEARNING

• Predictive Asset Maintenance

• Fraud

• Credit Scoring

• Next Best Offers

• Customer Segmentation

• Targeted Acquisition / Retention / Attrition

• Real-time Ad Placements

• Natural Language Processing

• Network Intrusion Detection

• Online Recommendations

• Customer Lifetime Value

Page 15

MARKET CHALLENGES

Critical decision-making information gets lost in big data

Customers and markets are more demanding than ever, requiring quicker and more accurate responses

Analytical talent for data-driven decision making can be hard to find

Page 16

Success Story: Geneia Healthcare

Page 17

HYBRIDISATION: CHANGE IS COMING...

Embrace Diversity

• Hybrid Solutions: Proprietary AND open tools; cloud AND on-premise

• Hybrid Data: Use of structured data AND voice/text/video/images

• Hybrid Models: Accuracy AND interpretability

• Hybrid Tasks: Data management, machine learning, deployment

• Hybrid Personas: Data scientists AND power users; technology AND business expertise

Page 19

Benefits of SAS® Viya™ for Machine Learning

What Does This Mean to You?

• BETTER RESULTS: Improved accuracy

  • In-memory analytics platform

  • More sophisticated machine learning algorithms

  • Auto-tuning functionality

• DIVERSITY: More freedom with openness and collaboration

  • Open integration / use of APIs

  • Democratisation of data and analytics

  • Unified actions available across different interfaces for different personas

• CREATIVITY: New business applications

  • Images are now native data sources for SAS!

  • Speed in data prep and model build process to cultivate innovation

Page 20

Visual Interfaces

Programming Interfaces

API Interfaces

MULTIPLE INTERFACES, SINGLE CODE BASE

Page 21

[Diagram labels: Controller, Workers, APIs, CAS Action]

SAS:

proc print data = hmeq (obs = 10);
run;

Python:

df = s.CASTable('hmeq')
df.head(10)

R:

df <- defCasTable(s, 'hmeq')
head(df, 10)

CAS Action:

[table.fetch]
table.name = "hmeq"
from = 1 to = 10
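The idea that several front ends all resolve to one action can be mimicked with a toy example in plain Python. Nothing below is the SAS or CAS API: `TABLES`, `table_fetch`, `Table` and `handle_request` are hypothetical names, and the in-memory dictionary merely stands in for a server-side table.

```python
# Hypothetical in-memory "server": one table and one action implementation.
TABLES = {"hmeq": [{"row": i} for i in range(1, 26)]}

def table_fetch(name, start=1, to=10):
    """The single shared action: return rows start..to (1-based, inclusive)."""
    return TABLES[name][start - 1:to]

class Table:
    """Programming-style front end: head(n) delegates to the shared action."""

    def __init__(self, name):
        self.name = name

    def head(self, n=5):
        return table_fetch(self.name, 1, n)

def handle_request(payload):
    """API-style front end: a dictionary payload is mapped onto the same action."""
    params = payload["params"]
    return table_fetch(params["table"], params.get("from", 1), params.get("to", 10))

via_object = Table("hmeq").head(10)
via_api = handle_request(
    {"action": "table.fetch", "params": {"table": "hmeq", "from": 1, "to": 10}}
)
```

Because both front ends are thin wrappers over one function, they cannot drift apart in behaviour, which is the property the four snippets on this slide are meant to show.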

Page 22

3D CHIP PRINTING

1 in a billion failure rate for droplets

50 million droplets per second

Potential for an error every 20 seconds

Classify wafer defects into different categories

Rule-based classification was used, with 80% accuracy

Semiconductor Manufacturing Industry

Classification which types
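The droplet figures on this slide follow from simple arithmetic, shown here as a short check. The only assumption is that "1 in a billion" is a per-droplet failure rate, so the result is an expected interval rather than a guarantee.

```python
droplets_per_failure = 1_000_000_000   # 1 in a billion failure rate
droplets_per_second = 50_000_000       # 50 million droplets per second

# Expected time between failures: droplets needed for one failure / droplets per second.
seconds_between_failures = droplets_per_failure / droplets_per_second

# The same rate expressed per day (86,400 seconds).
expected_failures_per_day = 86_400 / seconds_between_failures
```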

Page 23

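# Assumes s is an existing swat.CAS connection with the image action set loaded,
# and that vl is the parameter helper from the swat package.

# Load ordinary image files into a CAS table.
s.image.loadImages(path='/folder/myfolder/img',
                   casOut=vl(caslib='casuser', name='ordinary', replace=True),
                   decode=True)

# Load DICOM medical images, recursing into subfolders.
s.image.loadImages(path='/folder/myfolder/DICOM',
                   casOut=vl(caslib='casuser', name='medical', replace=True),
                   recurse=True, series=vl(dicom=True), decode=True)

Medical Imaging Industry

Diagnosis efficiency / Enhanced Research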

Page 24

SAS Image Processing for SciSports

Use Deep Learning to Recognize Back Numbers

The data is from SciSports, and the task is to recognize the numbers on athletes' shirts. It contains 6,631 images with numbers between 1 and 99. I (aka XQ) split the data into training (90%, ~6,000) and test (10%, ~630) sets. The classes are highly imbalanced: some numbers appear fewer than 10 times.
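A 90/10 split and a quick check for rare classes could look like the sketch below. The labels are synthetic stand-ins, since the SciSports images are not available here, and a plain random shuffle is assumed. How the author actually performed the split is not stated on the slide.

```python
import random
from collections import Counter

rng = random.Random(42)

# Synthetic stand-in labels: 6,631 shirt numbers in 1..99, skewed toward low numbers.
labels = [min(99, 1 + int(rng.expovariate(1 / 12))) for _ in range(6631)]

# Shuffle the image indices and cut at 90% for training, 10% for testing.
indices = list(range(len(labels)))
rng.shuffle(indices)
cut = int(0.9 * len(indices))
train_idx, test_idx = indices[:cut], indices[cut:]

# Count training examples per class to expose the imbalance.
train_counts = Counter(labels[i] for i in train_idx)
rare_classes = sorted(c for c in range(1, 100) if train_counts[c] < 10)
```

With classes this imbalanced, a stratified split or class weighting may be worth considering so that the rare numbers are represented in both sets.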

Page 26

SAS Projects on GitHub

Page 27

SAS VIYA DEMO

ACCESSING MACHINE LEARNING ACTIONS FROM DIFFERENT INTERFACES

Page 28

Different Personas

Different Interfaces

Same Actions

Page 29

SOME THINGS NEVER CHANGE… INGREDIENTS FOR SUCCESS

Data: Relevant and Accurate

Discovery: Domain and Analytics Expertise

Deployment: Automation and Monitoring

Creativity, Collaboration, Communication

Page 30

Thank you! (Köszönöm!)

[email protected]

uk.linkedin.com/in/tubaislam