Cost Effective Machine Learning Technologies

Machine Learning & AI Upstream Onshore Oil & Gas 2018 Congress

Ricardo Vilalta

Senior Data Scientist, Adaptive Analytics LLC

August 2018


New Emerging Trends in Machine Learning

•  New Trends in Machine Learning

•  Transfer Learning

•  Active Learning

•  Deep Learning

•  Hardware for Machine Learning

•  Summary

Machine Learning

[Diagram: Artificial Intelligence comprises Search, Planning, Knowledge Representation, Machine Learning, and Robotics. Machine Learning in turn comprises Clustering, Classification, Genetic Algorithms, and Reinforcement Learning.]

Classification or Supervised Learning

Supervised Learning:

Training set X = {x1, x2, …, xN} (historic data)

Class or target vector y = {y1, y2, …, yN} (true labels, one per training example)

Find a function f(x) that takes a feature vector x and outputs a class y.

[Diagram: training pairs {(x, y)} → learned function f(x)]
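As a minimal sketch of what "find a function f(x)" can look like in practice, the snippet below fits a nearest-centroid classifier. The classifier choice, the function names (`fit_centroids`, `predict`), and the sensor readings are all assumptions made for illustration; the slides do not name a specific algorithm.

```python
from statistics import mean

def fit_centroids(X, y):
    """Learn one centroid (mean feature vector) per class from labeled data."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, x):
    """f(x): return the class whose centroid is closest to x."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], x))

# Hypothetical historic data: two sensor readings per example.
X_train = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8],
           [3.0, 3.2], [3.1, 2.9], [2.9, 3.1]]
y_train = ["normal", "normal", "normal",
           "abnormal", "abnormal", "abnormal"]

model = fit_centroids(X_train, y_train)
print(predict(model, [1.0, 1.0]))  # normal
print(predict(model, [3.0, 3.0]))  # abnormal
```

The training pairs {(x, y)} go in, and the returned `model` plus `predict` together play the role of f(x).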

Classification or Supervised Learning

[Figure: examples labeled as Normal Operation or Abnormal Operation]


Transfer Learning

•  The goal is to transfer knowledge gathered from previous experience.

•  Also called Inductive Transfer or Learning to Learn.

•  Example: Invariant transformations across tasks.

[Diagram labels: Learn, Predictive Model, Transfer Experience, Adapt Model, New Predictive Model]

Example

Stuck Pipe: a problem that occurs when the drill string is no longer free to move (i.e., to rotate or move vertically).

Importance

Motivation:

Stuck pipe accounts for several billion dollars in losses in capital equipment and non-productive time. Predicting this event in real time has become a high priority for the drilling industry, and is now possible thanks to modern sensor techniques and advanced data analysis tools.

Machine Learning Approach

Strategy: Use machine learning to analyze historical data and produce a predictive model.

Predictive Model

Model may fail on different wells

Reasons:

•  Different geological formations

•  Hook load profile varies with depth

•  Unexpected environmental conditions

Transfer Learning

Scenario 1: Labeling in a new domain is costly.

[Diagram: Classification of Salt Deposits, with DB1 (labeled) and DB2 (unlabeled)]

Transfer Learning

Scenario 2: Data is outdated. The model was created from one survey, but a new survey is now available.

[Diagram: Survey 1 → Learning System; Survey 2 → ?]

Traditional Approach to Classification

[Diagram: DB1, DB2, …, DBn, each with its own Learning System]

Transfer Learning

[Diagram: source domain (DB1, DB2, each with a Learning System) → Knowledge → Learning System for the target domain (DB new)]

Knowledge of Parameters

Assume a prior distribution over the model parameters.

Source domain: learn the parameters and adjust the prior distribution.

Target domain: learn the parameters using the prior distribution from the source domain.
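One way to make the prior-based idea concrete is a conjugate Gaussian update, sketched below under strong simplifying assumptions: the "model" is a single unknown mean, the measurement noise variance is known, and all numbers are invented. The function name `update_gaussian_prior` is hypothetical and is not taken from the talk.

```python
def update_gaussian_prior(prior_mean, prior_var, data, noise_var):
    """Conjugate update of a Gaussian prior on a mean, given noisy observations."""
    prior_precision = 1.0 / prior_var
    data_precision = len(data) / noise_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_mean * prior_precision + sum(data) / noise_var)
    return post_mean, post_var

NOISE_VAR = 1.0  # assumed known measurement noise

# Source domain: plenty of data, starting from a vague prior.
source_data = [10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 10.0]
src_mean, src_var = update_gaussian_prior(0.0, 100.0, source_data, NOISE_VAR)

# Target domain: only two observations. Reuse the source posterior as the prior.
target_data = [11.0, 11.4]
transfer_mean, transfer_var = update_gaussian_prior(
    src_mean, src_var, target_data, NOISE_VAR)

# Baseline: learn the target parameter from scratch with the vague prior.
scratch_mean, scratch_var = update_gaussian_prior(
    0.0, 100.0, target_data, NOISE_VAR)

print(transfer_var < scratch_var)  # True: the transferred prior reduces uncertainty
```

The transferred estimate lands between the source estimate and the from-scratch target estimate. In a real application the source posterior would usually be widened before reuse, so that the target data can still override it when the two domains differ.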

Feature Transfer

Identify features common to all tasks.

Example Weighting

[Figure series: Source Class 1 data, Source Class 2 data, and Target data (Target Class 1, Target Class 2), with the Source Model and the Target Model]

Data Projection

When source instances cannot represent the target distribution at all in the parameter space, we can project the source and target datasets onto a common feature space (i.e., we can align both datasets).
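The slides do not say which projection is used. As a deliberately simple stand-in, the sketch below aligns two datasets by standardizing every feature within its own domain, so that both end up with zero mean and unit variance per feature. The `standardize` helper and the data are assumptions for illustration only.

```python
from statistics import mean, pstdev

def standardize(dataset):
    """Rescale every feature column to zero mean and unit variance."""
    columns = list(zip(*dataset))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)]
            for row in dataset]

# Hypothetical source and target measurements on very different scales.
source = [[100.0, 1.0], [110.0, 2.0], [120.0, 3.0]]
target = [[200.0, 10.0], [220.0, 20.0], [240.0, 30.0]]

source_aligned = standardize(source)
target_aligned = standardize(target)

# After alignment the two datasets occupy the same region of feature space.
for s_row, t_row in zip(source_aligned, target_aligned):
    print([round(v, 3) for v in s_row], [round(v, 3) for v in t_row])
```

Per-domain standardization only corrects shifts in location and scale. Richer projections (for example, ones that also align correlations between features) follow the same pattern of mapping both datasets into one shared space before training.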


Classification is Costly: Labeling

A representative subset of objects is labeled as one of the following six classes:

•  Plain

•  Crater Floor

•  Convex Crater Walls

•  Concave Crater Walls

•  Convex Ridges

•  Concave Ridges

517 labeled segments.

Pool-Based Sampling

Assume a small set of labeled examples and a large set of unlabeled examples. We evaluate and rank every example in the unlabeled set, then choose the one or more most “important” (most informative) examples to be labeled.
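The ranking step can be sketched as follows, assuming a binary classifier that outputs a probability for each unlabeled example and using entropy as the measure of "importance". The function names and the probabilities are hypothetical. Other ranking criteria exist, and the slides do not commit to one here.

```python
import math

def binary_entropy(p):
    """Uncertainty (in bits) of a predicted probability p for the positive class."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def select_queries(pool_probs, k=1):
    """Rank the whole unlabeled pool by uncertainty and return the top-k indices."""
    ranked = sorted(range(len(pool_probs)),
                    key=lambda i: binary_entropy(pool_probs[i]),
                    reverse=True)
    return ranked[:k]

# Hypothetical model outputs P(class = 1) for five unlabeled examples.
pool_probs = [0.95, 0.50, 0.10, 0.62, 0.99]

to_label = select_queries(pool_probs, k=2)
print(to_label)  # [1, 3]: the two predictions closest to 0.5
```

The selected examples would then be sent to an expert for labeling, added to the labeled set, and the model retrained before the next round of ranking.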

Active Learning

[Figure: uncertainty scores of unlabeled examples: 1.0, 0.5, 1.0]

Sampling Based on Uncertainty

[Figure panels: 70% accuracy; 90% accuracy]

Figure taken from “Active Learning” by Burr Settles, Morgan & Claypool, 2012.

Results with Active Learning


Deep Learning

The idea is to disentangle factors of variation and to attain high-level representations.

[Diagram: levels of representation, from lowest to highest: Pixel Information; Edges and Contours; Small Object Parts; Engine, Main Fuselage; Commercial Planes, Military Planes]

Deep Learning

•  We want to capture compact, high-level representations in an efficient and iterative manner.

Learning takes place at several levels of representation.

Think about a hierarchy of concepts of increasing complexity.

Low-level concepts are the foundation for high-level concepts.
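The layered structure can be illustrated with a tiny feed-forward pass in which each layer builds its representation only from the one below it. This is a sketch, not a trained network: the weights are hand-picked placeholders, and the `dense` and `forward` helpers are hypothetical names.

```python
import math

def dense(inputs, weights, biases):
    """One layer: weighted sums of the inputs followed by a tanh non-linearity."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(x, layers):
    """Pass x through every layer, keeping each intermediate representation."""
    representations = [x]
    for weights, biases in layers:
        representations.append(dense(representations[-1], weights, biases))
    return representations

# Placeholder weights: 4 raw inputs -> 3 features -> 2 features -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1, 0.0],
      [0.3, 0.8, -0.5, 0.2],
      [-0.4, 0.1, 0.9, 0.3]], [0.0, 0.1, -0.1]),
    ([[1.0, -1.0, 0.5],
      [0.2, 0.4, -0.6]], [0.0, 0.0]),
    ([[0.7, -0.3]], [0.05]),
]

reps = forward([1.0, 0.5, -0.5, 2.0], layers)
print([len(r) for r in reps])  # [4, 3, 2, 1]
```

Each entry of `reps` is one level of the hierarchy: the raw input first, then progressively more compact representations. In a real system the weights are learned from data rather than fixed by hand.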

An Example in Deep Learning

Learn a “concept” (sedimentary rocks) from many images until a high-level representation is achieved.

An Example in Deep Learning

Learn a hierarchy of abstract concepts using deep learning.

Local properties

Global properties

Deep Learning

Methodology

[Diagram: Cube of seismic data + Expert Labels → New training dataset → Learning Algorithm (Deep Learning)]

Deep Learning on Seismic Data

Challenges:

Single attributes bear incomplete information about the class.

Supervised Learning of Geological Bodies

Challenges:

Deep learning can capture “global” features that detect entire geological bodies as the result of the non-linear combination of many local models.

Supervised Learning of Geological Bodies

Decompose the seismic cube into small cubes to create a large number of examples.

Deep Learning on Seismic Data

Each cube is an example that we can feed into a deep learning architecture.

Deep Learning on Seismic Data
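A minimal sketch of the cube decomposition, assuming the seismic volume is held as a nested list indexed as `cube[z][y][x]` and that overlapping windows are acceptable. The window size, the stride, and the synthetic amplitudes are illustrative choices, not values from the talk.

```python
def extract_subcubes(cube, size, stride):
    """Slide a size^3 window over a 3-D volume and return (origin, sub-cube) pairs."""
    nz, ny, nx = len(cube), len(cube[0]), len(cube[0][0])
    examples = []
    for z in range(0, nz - size + 1, stride):
        for y in range(0, ny - size + 1, stride):
            for x in range(0, nx - size + 1, stride):
                sub = [[row[x:x + size] for row in plane[y:y + size]]
                       for plane in cube[z:z + size]]
                examples.append(((z, y, x), sub))
    return examples

# Synthetic 8x8x8 volume whose "amplitude" simply encodes its position.
cube = [[[z * 100 + y * 10 + x for x in range(8)]
         for y in range(8)]
        for z in range(8)]

examples = extract_subcubes(cube, size=4, stride=2)
print(len(examples))  # 27 sub-cubes from a single 8x8x8 volume
```

Each sub-cube can then be paired with the expert label at its location to form one training example. A smaller stride yields more (and more heavily overlapping) examples from the same volume.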


Hardware for Machine Learning

Most machine learning applications require fast processing speeds and large amounts of memory and disk space. These applications are “computationally expensive”. Example: deep learning.

Many applications in machine learning need matrix multiplications. Each calculation is simple, but there are a great many of them.
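To see why the count matters, the sketch below multiplies two matrices with plain loops and tallies the scalar multiplications. The `matmul` helper is written for illustration only. Real workloads rely on optimized libraries rather than Python loops.

```python
def matmul(a, b):
    """Multiply an (n x m) matrix by an (m x p) matrix, counting multiplications."""
    n, m, p = len(a), len(b), len(b[0])
    product = [[0.0] * p for _ in range(n)]
    multiplications = 0
    for i in range(n):
        for j in range(p):
            for k in range(m):
                product[i][j] += a[i][k] * b[k][j]
                multiplications += 1
    return product, multiplications

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]

product, count = matmul(a, b)
print(product)  # [[19.0, 22.0], [43.0, 50.0]]
print(count)    # 8, i.e. n * m * p = 2 * 2 * 2
```

The count grows as n × m × p, so multiplying two 1000 × 1000 matrices already takes 10^9 scalar multiplications. Every one of them is trivial, and they are independent enough to be done in parallel.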

Hardware for Machine Learning

One solution when CPUs are overwhelmed is to use GPUs. A GPU can execute many operations in parallel at very high speed. Disadvantage: 4x more expensive than CPUs, and sometimes not really necessary.

Hardware for Machine Learning

Suggested minimum requirements:

•  Memory: 16 GB (ideally 32 GB)

•  Disk Space: 2 TB

•  Processor: Intel 7th generation or better, or AMD Ryzen 2nd generation

•  Very important: GPU

If working remotely, it is better to use a simple device (tablet) and send information to a central server for analysis.

Hardware for Machine Learning

Many manufacturers are already producing specialized chips that do deep learning at the hardware level:

•  TPUs (Tensor Processing Units) by Google

•  AMD’s new GPU

It is too soon to know how well they will perform for future applications.


Summary

•  When we have similar classification tasks but there are indications that the distributions have changed → Transfer Learning

•  When we have few training examples and labeling is expensive → Active Learning

•  When we need more abstract features → Deep Learning

•  Hardware for deep learning → look for large memory, disk space, and top processors, and do not forget the GPU.