Deep Learning NLP

Post on 14-Jan-2017

Category: Education

TRANSCRIPT

  • A not-so-short introduction to Deep Learning NLP

    Francesco Gadaleta, PhD

    1

    worldofpiggy.com

  • What we do today: NLP introduction

    The Goals of NLP

    Analysis of (free) text

    Extract knowledge/abstract concepts from textual data (text understanding)

    Generative models (chat bot, AI assistants, ...)

    Word/Paragraph similarity/classification

    Sentiment analysis

    3

    Traditional ML and NLP

    4

    Traditional NLP word representation

    One-hot encoding of words: binary vectors of vocabulary-size dimensions, with a single 1 at the word's index.

    book     0 0 0 0 1 0 0 0 0 0
    chapter  0 0 0 0 0 0 0 0 1 0
    paper    0 1 0 0 0 0 0 0 0 0

    book AND chapter = 0: the AND (dot product) of any two distinct one-hot vectors is 0, so the encoding carries no notion of similarity.

    5
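    The encoding above can be sketched in a few lines of Python (the toy vocabulary is illustrative):

```python
import numpy as np

vocab = ["a", "book", "chapter", "the", "paper"]  # toy vocabulary
index = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    """Binary vector with a single 1 at the word's index."""
    v = np.zeros(len(vocab), dtype=int)
    v[index[word]] = 1
    return v

# Any two distinct words are orthogonal: their AND (dot product) is 0,
# so one-hot vectors encode no similarity between words.
book, chapter = one_hot("book"), one_hot("chapter")
print(book @ chapter)  # 0
print(book @ book)     # 1
```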

    Traditional soft-clustering word representation

    Soft-clustering models learn, for each cluster/topic, a distribution over words: how likely each word is under that cluster.

    Latent Semantic Analysis (LSA/LSI), Random projections

    Latent Dirichlet Allocation (LDA), HMM clustering

    6

    LSA - Latent Semantic Analysis

    Words that are close in meaning occur in similar pieces of text.

    Good for not-so-large text data.

    Start from the words x paragraphs matrix X of word counts per paragraph: huge, sparse, noisy. SVD reduces the dimensionality of the word space while preserving similarity among paragraphs: X = U * S * V^T, truncated to the top singular values (low-rank approximation).

    Similarity = cosine(vec(w1), vec(w2))

    Limitations: low rank, no polysemy, poor synonymy, bag-of-words (no word order).

    7
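    A minimal LSA sketch with NumPy, assuming a toy words x paragraphs count matrix (the words and counts are made up for illustration):

```python
import numpy as np

# Toy words x paragraphs count matrix X (rows: words, columns: paragraphs).
words = ["book", "chapter", "paper", "icecream"]
X = np.array([
    [2, 1, 0],   # book
    [1, 2, 0],   # chapter
    [1, 1, 0],   # paper
    [0, 0, 3],   # icecream
], dtype=float)

# Truncated SVD: X ~ U_k S_k V_k^T; rows of U_k * S_k are low-rank word vectors.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
word_vecs = U[:, :k] * s[:k]

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

vec = dict(zip(words, word_vecs))
print(cosine(vec["book"], vec["chapter"]))   # high: similar contexts
print(cosine(vec["book"], vec["icecream"]))  # near zero: disjoint contexts
```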

    Traditional ML and Deep Learning

    8

    The past and the present

    Human-designed representations (e.g. handcrafted sound features, hand-built text features) are fed into an ML model that produces predictions.

    ML models: Regression, Clustering, Random Forest, SVM, KNN, ...

    Weight optimization happens only in the model; the representation itself is designed by hand.
    9

    The future

    Representation Learning: automatically learn good features or representations.

    Deep Learning: learn multiple levels of representation, with increasing complexity and abstraction.

    10

    The promises of AI (1969-2016)

    11

    Brief history of AI

    1958 Rosenblatt's perceptron

    1974 Backpropagation

    1995 Kernel methods (SVM)

    1998 ConvNets

    2006 Restricted Boltzmann Machine

    2012 Google Brain Project

    AI winter -> AI spring -> AI summer

    12

    Why is this happening?

    Big Data

    GPU Power

    Algorithmic Progress

    13

  • Geoffrey Hinton

    Cognitive psychologist AND Professor at the University of Toronto, AND one of the first to demonstrate the use of generalized backpropagation to train multi-layer networks.

    Known for Backpropagation OR the Boltzmann machine, AND great-great-grandson of logician George Boole.

    14

  • Yann LeCun Postdoc at Hinton's lab. Developed the DjVu format.

    Father of Convolutional Neural Networks and Optical Character Recognition (OCR).

    Proposed bio-inspired ML methods like Optimal Brain Damage, a regularization method.

    LeNet-5 became a landmark architecture in artificial vision.

    15

  • Yoshua Bengio

    Professor at the University of Montreal. Many contributions to Deep Learning.

    Known for Gradient-based learning, word representations and representation learning for NLP.

    16

    Some reasons to apply Deep Learning (non-exhaustive list)

    17

    No. 1 Automatic Representation Learning

    1. Who wants to manually prepare features?

    2. Features are often over-specified or incomplete (or both)

    3. Done? Cool! Now do it again and again...

    Pipeline: Input Data -> Feature Engineering (time consuming) -> ML algorithm

    For every new domain (#1, #2, #3) the loop repeats: feature engineering -> ML algorithm -> validation.

    18

    No. 2 Learning from unlabeled data

    Traditional NLP requires labeled training data

    Guess what? Almost all data is unlabeled

    Learning how the data is generated is essential to understanding it. [Demo]

    19

    No. 3 Metric Learning

    Similarity

    Dissimilarity

    Distance matrix

    Kernel

    Define please!

    20
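    A minimal sketch of the objects on this slide, assuming toy 2-D word vectors: a Euclidean distance matrix and an RBF kernel built from it (the vectors and the kernel choice are illustrative):

```python
import numpy as np

# Toy 2-D vectors standing in for learned word representations.
vecs = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
])

# Distance matrix: Euclidean distance between every pair of vectors.
diff = vecs[:, None, :] - vecs[None, :, :]
D = np.sqrt((diff ** 2).sum(-1))

# Kernel (similarity) matrix, e.g. an RBF kernel built from the distances:
# close vectors get similarity near 1, distant ones near 0.
K = np.exp(-D ** 2)

print(D.round(2))
print(K.round(2))
```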

    No. 4 Human language is recursive

    "People that don't know me think I'm shy. People that do know me wish I were."

    Recursion: the same operator applied to different components (RNN).

    21

    Some examples

    22

    LeNet (proposed in 1998 by Yann LeCun)

    Convolutional Neural Network for reading bank checks

    All units of a feature map share the same set of weights, so the same feature is detected at all possible locations of the input: robust to shifts and distortions.

    23
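    The weight-sharing idea can be illustrated with a plain NumPy 2D convolution, assuming a toy image and a tiny hand-made edge filter:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution: the SAME kernel (shared weights) is applied
    at every location of the input."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (image[i:i+kh, j:j+kw] * kernel).sum()
    return out

# A vertical-edge filter responds wherever the edge appears, no matter
# where it sits in the image (shift robustness).
image = np.zeros((5, 5))
image[:, 2] = 1.0                # a vertical line in column 2
edge = np.array([[-1.0, 1.0]])   # tiny horizontal-difference filter
out = conv2d(image, edge)
print(out)
```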

    GoogLeNet (proposed in 2014 by Szegedy et al.)

    Specs: 22 layers, 12x fewer parameters than the winning network of the ILSVRC 2012 challenge. Introduced the Inception module (filters similar to the primate visual cortex) to find out how a local sparse structure can be approximated by readily available dense components. Too deep => gradient propagation problems => auxiliary classifiers added in the middle of the network :)

    Object recognition

    Captioning

    Classification

    Scene description (*)

    (*) with semantically valid phrases.

    24

    A not-so-classic example

    Kid eating ice cream

    25

    Neural Image Captioning

    26

    Sentiment analysis

    Task: Socher et al. use RNNs for sentiment prediction

    Demo http://nlp.stanford.edu/sentiment

    27

    Neural Generative Model Character-based RNN

    Text: Alice in Wonderland
    Corpus length: 167546
    Unique chars: 85
    # sequences: 55842
    Context chars: 20
    Epochs: 280
    CPU: Intel i7
    GPU: NVIDIA 560M
    RAM: 16 GB

    INPUT / OUTPUT example: a context window slides over the text "neural networks are fun"; the input is the current window of characters, the output is the next character.

    28
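    The sliding-window preparation behind the "# sequences" and "Context chars" numbers can be sketched as follows (a toy text and a context of 5 stand in for the Alice corpus and its context of 20):

```python
# Training-data preparation for a character-based RNN: slide a fixed-length
# context window over the text; the target is the next character.
text = "neural networks are fun"
context = 5  # the slides use a context of 20 characters

chars = sorted(set(text))
char_to_ix = {c: i for i, c in enumerate(chars)}

sequences, targets = [], []
for i in range(len(text) - context):
    sequences.append(text[i:i + context])
    targets.append(text[i + context])

print(len(chars), "unique chars,", len(sequences), "sequences")
print(sequences[0], "->", targets[0])  # 'neura' -> 'l'
```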

    demo

    29

    Neural Network Architectures

    image -> class, image -> caption, sentence -> class, sentence -> sentence, sequence -> sequence

    30

  • How many neural networks for speech recognition and NLP tasks?

    31

  • Just one (*)

    Layers

    Output: predict supervised target

    Hidden: learn abstract representations

    Input: raw sensory inputs.

    (*) Provided you don't fall for exotic stuff

    32

    NN architecture: Single Neuron

    n inputs (here 3: x1, x2, x3), 1 output, parameters W, b; the bias unit is fixed at +1.

    Logistic activation function: h_{W,b}(x) = f(W^T x + b), with f(z) = 1 / (1 + e^{-z})

    33
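    A minimal NumPy sketch of this neuron, with illustrative parameter values:

```python
import numpy as np

def logistic(z):
    """Logistic activation f(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, W, b):
    """Single neuron: h_{W,b}(x) = f(W^T x + b)."""
    return logistic(W @ x + b)

# Illustrative parameters for the 3-input neuron of the slide.
x = np.array([1.0, 0.5, -1.0])
W = np.array([0.2, -0.4, 0.1])
b = 0.3
y = neuron(x, W, b)
print(y)  # a value in (0, 1)
```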

    Many Single Neurons make a Network

    Input Layer -> Layer 1 -> Layer 2 -> Layer 3 -> Output Layer (inputs x1, x2, x3 plus a bias unit b = +1)

    Learning: many logistic regressions at the same time.

    Hidden layers: the neurons have no meaning for humans.

    The output to be predicted stays the same.

    34
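    The "many logistic regressions at the same time" view can be sketched as a layer-by-layer forward pass; the weights are random and the hidden sizes are illustrative, only the 3 inputs and 1 output match the slide:

```python
import numpy as np

rng = np.random.default_rng(0)

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Each layer is many logistic regressions at once: a = f(W a + b)."""
    a = x
    for W, b in layers:
        a = logistic(W @ a + b)
    return a

# 3 inputs -> two hidden layers of 4 units -> 1 output.
sizes = [3, 4, 4, 1]
layers = [(rng.standard_normal((m, n)), rng.standard_normal(m))
          for n, m in zip(sizes[:-1], sizes[1:])]

x = np.array([1.0, 0.5, -1.0])
y = forward(x, layers)
print(y)  # single output in (0, 1)
```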

    Neural Networks in a (not-so-small) nutshell

    *** DISCLAIMER ***

    After this section, the charming and fascinating halo surrounding Neural Networks and Deep Learning will be gone.

    35

    The core of a Neural Network

    Inputs x1, x2, x3 plus a bias unit b = +1.

    36

    The core of a Neural Network

    Inputs x1, x2, x3 plus a bias unit b = +1; two stacked logistic regressions, with weights W1, W2 and biases b1, b2.

    37
