© Copyright 2000-2016 TIBCO Software Inc.
Mike Alperin
Kai Waehner
August, 2016
Machine Learning in Manufacturing:
Data Mining to Real-time Processing
Agenda
• Introduction to Machine Learning
• Machine Learning in Manufacturing
• Methods and Architecture
• Data Mining - Demo
• Real-time Processing of Streaming Data - Demo
• Q&A
Introduction to Machine Learning
Machine Learning
Machine learning is a method of data analysis that automates analytical
model building. Using algorithms that iteratively learn from data, machine
learning allows computers to find hidden insights without being explicitly
programmed where to look.
http://www.sas.com
Real World Examples of Machine Learning
• Spam Detection
• Search Results + Product Recommendation
• Picture Detection (Friends, Locations, Products)
Machine Learning is already present in your daily life…
Now, every enterprise is beginning to leverage it!
Types of Machine Learning
• Supervised – Solve known problems
  • What factors are driving manufacturing defects?
  • Decision Trees, Random Forest, Gradient Boosting Machine
• Unsupervised – Identify new patterns, Detect anomalies
  • Are there new failure modes emerging?
  • Clustering, Principal Components, Neural Networks, Support Vector Machines
• Optimization – Support Decision-making
  • What is the optimum scheduling of operators or equipment maintenance?
  • Genetic Algorithm
Decision Tree – Titanic Survival Rate
[Figure: decision tree for Titanic passenger survival, including a split on family size. Source: Wikipedia]
Classical Statistics – Fit parameters to a well-defined model
Decision Tree – Product Pass / Fail by Process & Equipment
Automobile Paint Process
[Figure: decision tree separating Good Product from Bad Product (Peeling Clearcoat). Split labels: Clearcoat Bake Temperature (>= 132 C vs. < 132 C), Sanding Station (1, 2, 4 vs. 3), Basecoat Thickness, …]
Decision Tree – Training and Test Data Sets
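The idea behind separate training and test sets can be sketched in a few lines: the model is fitted on one part of the data and judged on rows it has never seen. This is only an illustration in plain Python; the `train_test_split` helper, the 30% test fraction, and the sample records are assumptions, not part of the original demo.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical records: (bake_temperature_C, passed)
parts = [(125 + i, i % 3 != 0) for i in range(20)]
train, test = train_test_split(parts)
print(len(train), "training rows,", len(test), "test rows")
```

A tree that scores well on `train` but poorly on `test` is overfitting, which is the reason the held-out set exists.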
Ensemble Tree Algorithms
• Random Forest, Gradient Boosting Machine (GBM)
• Method – Average many simple trees
  • Sample the data and fit a simple tree
  • Re-sample the data, up-weighting the observations that weren’t fitted well in the previous model
  • Continue adding trees until the fit is good
  • Save all the trees and average them
• Better fit + prediction than single trees
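The method above can be sketched for one common variant: gradient boosting with squared-error loss, using one-split trees ("stumps"). In this variant, "up-weighting the observations that weren't fitted well" takes the form of fitting each new stump to the current residuals, and the trees are combined as a scaled sum rather than a plain average. All names, the learning rate, and the toy data are assumptions for illustration; this is not the implementation used in the demo.

```python
def fit_stump(xs, targets):
    """Find the single-threshold split of xs that minimises squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((t - left_mean) ** 2 for t in left)
                 + sum((t - right_mean) ** 2 for t in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: fall back to a constant
        mean = sum(targets) / len(targets)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_boosted_stumps(xs, ys, n_trees=50, learning_rate=0.1):
    """Fit stumps one after another, each to the residuals of the ensemble so far."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x)
                       for x, p in zip(xs, predictions)]

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    return predict


# Toy data (hypothetical): the response jumps from 0 to 1 at x = 5
xs = list(range(10))
ys = [1.0 if x >= 5 else 0.0 for x in xs]
predict = fit_boosted_stumps(xs, ys)
print(round(predict(2), 3), round(predict(7), 3))
```

Each stump alone is a weak model; the small learning rate means many of them are needed, which is what makes the combined fit smooth and robust.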
Gradient Boosting Machine (GBM)
• Machine Learning
  • Machine learning algorithms + Big Data sets can produce models that accurately fit complex data patterns.
• GBM: Better results than classic statistical methods
  • Ideal for data-mining / predictions for complex processes & products
  • Performs well for variable reduction
  • Can fit complex nonlinear relationships & interactions
  • Scales to Big Data
• Easier to use
  • No need to specify the data model
  • Accommodates continuous and categorical predictors & responses
  • Handles missing data and outliers well
• Simple user interface
  • Model complexity can be hidden from user
  • Results presented with easy-to-understand visualizations
  • Interface easily customized for your use case and users
Advanced Analytics and Big Data Tools
Many more ….
Machine Learning in Manufacturing
Correlate Product or Equipment Results to Process & Supplier Data
• Supplier - Incoming Materials and Components
  • measured electrical, chemical, physical characteristics
  • batch-id, lot_id
• Manufacturing Process
  • Physical, chemical or electrical measurements
  • WIP / MES: track-in / track-out date, process equipment id, recipe, operator, …
  • Process equipment sensor data
  • Equipment Maintenance logs
  • Defect Inspections
  • Cost of labor, materials, machines and facilities
• Product Quality and Reliability Test
  • Measured product functional and performance characteristics
  • Accelerated life test results
• Product Field Returns
  • Failure mode, unit / batch / lot ID
  • Failure analysis root cause results
  • Warranty / Repair claim, call center and cost – structured & unstructured
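Correlating these sources usually starts with a join on a shared key such as the lot ID. The sketch below joins supplier measurements to final-test results and computes a fail rate per lot. The field names (`lot_id`, `resistivity_ohm_cm`, `passed`) and all values are hypothetical and chosen only to illustrate the join.

```python
from collections import defaultdict

# Hypothetical supplier measurements, keyed by lot_id
supplier_lots = {
    "LOT-A": {"resistivity_ohm_cm": 1.02},
    "LOT-B": {"resistivity_ohm_cm": 1.31},
}

# Hypothetical final-test results, one row per unit
test_results = [
    {"unit_id": "U1", "lot_id": "LOT-A", "passed": True},
    {"unit_id": "U2", "lot_id": "LOT-A", "passed": True},
    {"unit_id": "U3", "lot_id": "LOT-B", "passed": False},
    {"unit_id": "U4", "lot_id": "LOT-B", "passed": True},
]


def fail_rate_by_lot(results, lots):
    """Join test results to supplier lots and attach a fail rate to each lot."""
    counts = defaultdict(lambda: [0, 0])  # lot_id -> [fails, total]
    for row in results:
        if row["lot_id"] not in lots:
            continue  # skip units whose lot has no supplier record
        entry = counts[row["lot_id"]]
        entry[0] += 0 if row["passed"] else 1
        entry[1] += 1
    return {
        lot: {**lots[lot], "fail_rate": fails / total}
        for lot, (fails, total) in counts.items()
    }


joined = fail_rate_by_lot(test_results, supplier_lots)
for lot, row in sorted(joined.items()):
    print(lot, row)
```

The joined rows then have the shape a model needs: supplier and process measurements as predictors, the fail rate (or per-unit pass/fail) as the response.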
Machine Learning to Predict Equipment or Product Fails
• Problem
  • Product & Equipment problems difficult to accurately diagnose for complex manufacturing processes
  • Big Data problem – millions of units, hundreds / thousands of predictors
  • Response: Product, Process or Equipment Fail data
  • Predictors: in-process equipment, process and product measurements or attributes
• Value
  • Being used by customers to find previously undetected problems. Reduces time-to-market and increases profit.
• Method
  • GBM analysis template to identify significant predictors, interactions and nonlinearities
  • For large datasets, hybrid data access used to perform variable reduction step in-DB
  • Simple interface – easy for business analyst to run and interpret results
[Figure: GBM results for semiconductor yield as a function of in-process equipment & product measurements]
Method and Architecture
Insight – Action Loop
[Figure: loop between INSIGHT and ACTION]
Fast Data Reference Architecture
[Architecture diagram; recoverable labels:]
• Data sources: SENSOR DATA, TRANSACTIONS, MESSAGE BUS, MACHINE DATA, SOCIAL DATA
• Streaming Analytics: Stream Processing, Complex Event Processing; Aggregate, Rules, Analytics, Correlate; Action
• Operational Analytics: Operations, Live UI, Live Monitoring, Continuous query processing, Alerts, Manual action, escalation
• HISTORICAL ANALYSIS: Data Scientists, Data Discovery, BI, Data Sheets, Machine Learning, Big Data
• Data Storage: Cleansed Data, History
• Internal Data: Integration Bus, Enterprise Service Bus, API, SOA, ERP, MDM, DB, WMS
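As a rough illustration of the Aggregate and Rules steps in the streaming layer, the sketch below keeps a sliding window over incoming sensor readings and raises an alert when the windowed average crosses a threshold. It is plain Python, not StreamBase code; the class name, window size, threshold, and readings are all assumptions.

```python
from collections import deque


class SlidingAverageRule:
    """Aggregate the last N readings and apply a simple threshold rule."""

    def __init__(self, window_size, threshold):
        self.window = deque(maxlen=window_size)  # old readings drop out automatically
        self.threshold = threshold

    def on_event(self, value):
        """Process one reading; return (alert, current windowed average)."""
        self.window.append(value)
        average = sum(self.window) / len(self.window)
        return average > self.threshold, average


# Hypothetical temperature readings arriving one at a time
rule = SlidingAverageRule(window_size=3, threshold=132.0)
readings = [130.0, 131.0, 133.0, 135.0, 136.0]
alerts = [rule.on_event(reading)[0] for reading in readings]
print(alerts)
```

Averaging over a window before applying the rule keeps a single noisy reading from triggering an action on its own.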
Demo: Data Mining
Demo Outline
• Model Set up – Parameter Selection
• Model Configuration
• Run Model
• Evaluate model: ROC Curve, AUC
• Visualize Model Results
  • Variable Importance
  • Interactions
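For readers unfamiliar with the evaluation step, AUC can be read as the probability that a randomly chosen failing unit receives a higher model score than a randomly chosen passing unit. The sketch below computes it directly from that definition (ties count as one half). The labels and scores are made up for illustration and are not results from the demo.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = fail, 0 = pass, with the model's predicted fail score
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(roc_auc(labels, scores), 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of rows, so a rank-based formula is preferable for large data sets.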
Demo Screenshot – Model Configuration and Evaluation
Demo Screenshot – Model Results
Real-time Processing of Streaming Data
Predictive Analytics for Manufacturing
Goal: Scrap parts as early as possible to reduce costs in a manufacturing process.
Question: When to scrap a part in Station 1 instead of sending it to Station 2?
[Diagram: Cost Before: 9€; Station 1: 7€; Station 2: 13€; Total Cost: 29€ (or more); a "Scrap?" decision after each station]
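One way to frame the question is as an expected-cost comparison. The station costs come from the slide; the value of a good finished part and the predicted failure probability are assumptions introduced here for illustration. After Station 1, the 9€ + 7€ already spent are sunk either way, so sending the part on is only worthwhile if the value it is expected to recover exceeds the 13€ cost of Station 2.

```python
COST_BEFORE = 9.0      # € spent before Station 1 (from the slide)
COST_STATION_1 = 7.0   # € spent in Station 1 (from the slide)
COST_STATION_2 = 13.0  # € spent in Station 2 (from the slide)


def should_scrap_after_station_1(p_fail, good_part_value):
    """Return True if scrapping now has the lower expected loss.

    p_fail: model-predicted probability that the part ends up as scrap.
    good_part_value: hypothetical value (€) of a finished good part.
    """
    expected_recovered = (1.0 - p_fail) * good_part_value
    return expected_recovered < COST_STATION_2


total_cost = COST_BEFORE + COST_STATION_1 + COST_STATION_2
print("total cost if the part runs through both stations:", total_cost)

# Hypothetical value of 40 € for a good part
for p_fail in (0.5, 0.8):
    print(p_fail, should_scrap_after_station_1(p_fail, good_part_value=40.0))
```

Under these assumptions the break-even point is p_fail = 1 - 13/40 = 0.675: parts predicted to fail with higher probability are cheaper to scrap at Station 1.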
Fast Data Architecture for Predictive Maintenance
[Architecture diagram; recoverable labels:]
• Internal Data: CSV Batch, JSON Real Time, XML Real Time
• TIBCO Fast Data Platform: StreamBase, Live Datamart
• Streaming Analytics: Aggregate, Rules, Analytics, Correlate; Action
• Operational Analytics: Operations, Live UI, Continuous query processing, Alerts, Manual action, escalation
• HISTORICAL ANALYSIS: Data Scientists, Spotfire, R / TERR, H2O, PMML
• Storage: Hadoop (Cloudera), Flume, HDFS, Oracle RDBMS, Avro, Parquet, …
TIBCO Spotfire with H2O Integration
Data Discovery / Data Mining (“Are parts that repeat a station more likely scrap parts?”)
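The discovery question on this slide amounts to a grouped comparison of scrap rates. The sketch below shows the shape of that calculation in plain Python; the record layout (`stations`, `scrapped`) and the five sample parts are invented for illustration and say nothing about what the actual demo data showed.

```python
def scrap_rate_by_repeat(parts):
    """Return {repeated_a_station: scrap_rate} for parts with a station history."""
    totals = {True: 0, False: 0}
    scrapped = {True: 0, False: 0}
    for part in parts:
        # A part "repeats" if any station appears more than once in its route
        repeated = len(part["stations"]) != len(set(part["stations"]))
        totals[repeated] += 1
        scrapped[repeated] += 1 if part["scrapped"] else 0
    return {key: scrapped[key] / totals[key] for key in totals if totals[key] > 0}


# Hypothetical part histories
parts = [
    {"part_id": "P1", "stations": ["S1", "S2"], "scrapped": False},
    {"part_id": "P2", "stations": ["S1", "S1", "S2"], "scrapped": True},
    {"part_id": "P3", "stations": ["S1", "S2", "S2"], "scrapped": False},
    {"part_id": "P4", "stations": ["S1", "S2"], "scrapped": False},
    {"part_id": "P5", "stations": ["S1", "S1", "S2"], "scrapped": True},
]

rates = scrap_rate_by_repeat(parts)
print("repeated a station:", rates[True])
print("no repeat:", rates[False])
```

If the two rates differ clearly on real data, "repeated a station" becomes a candidate predictor for the scrap model.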
TIBCO Spotfire with H2O Integration
Advanced Analytics (“Scrap parts as early as possible!”)
TIBCO StreamBase + R / TERR
TIBCO StreamBase + H2O
TIBCO StreamBase + PMML
TIBCO Live Datamart
Operational Intelligence (“Monitor the manufacturing process and change rules in real time!”)
Live Datamart Desktop Client
TIBCO Live Datamart
Operational Intelligence (“Monitor the manufacturing process and change rules in real time!”)
Live Datamart Web API
Demo: Real Time Processing (Predictive Scrapping of Parts in an Assembly Line)
Learn & Do More: Machine Learning on the TIBCO Community
• Wiki page: https://community.tibco.com/wiki/machine-learning-tibco-spotfire-and-streambase
• Component Exchange: https://community.tibco.com/exchange
  • Data functions
  • Accelerators
  • Templates
Learn & Do More: Accelerators on the TIBCO Community
• Wiki page: https://community.tibco.com/wiki/accelerators
• Component Exchange Accelerators: https://community.tibco.com/exchange
  • Apache Spark
  • Intelligent Equipment
  • Connected Vehicles
Q & A