python deep learning - jordi torres.ai

1

Python Deep Learning

Introducción práctica con Keras y TensorFlow 2

Jordi TORRES.AI

PATC Courses | Barcelona - February 2020

2

https://torres.ai/python-deep-learning/

Slides for teaching with the book #PythonDL

3

About these slides:

● Version: 0.8 (Barcelona, 31/01/2020)

○ Current draft of the slides for the book «Python Deep Learning».

○ Some slides contain text in English. Over time we will keep polishing the slides (but we believe they can already be used as they are, which is why we are sharing them now)

4

Course content

PART 1: INTRODUCTION

1. What is Deep Learning?

2. Work environment

3. Python and its libraries

PART 2: FUNDAMENTALS OF DEEP LEARNING

4. Densely connected neural networks

5. Neural networks in Keras

6. How a neural network is trained

7. Parameters and hyperparameters in neural networks

8. Convolutional neural networks

PART 3: DEEP LEARNING TECHNIQUES

9. Stages of a Deep Learning project

10. Data to train neural networks

11. Data Augmentation and Transfer Learning

12. Advanced neural network architectures

PART 4: GENERATIVE DEEP LEARNING

13. Recurrent neural networks

14. Generative Adversarial Networks

5

Book resources

● Book web page:

https://JordiTorres.ai/python-deep-learning

● Book GitHub repository:

https://github.com/JordiTorresBCN/python-deep-learning

● Additional book material for download:

https://marketing.marcombo.com + the book's promotional code

6

PART 3: DEEP LEARNING TECHNIQUES









7










8

Case study: fuel efficiency of late-1970s and early-1980s automobiles

• Goal: build a model to predict the fuel efficiency of late-1970s and early-1980s automobiles.

• To do this, we'll provide the model with a description of many automobiles from that time period.

• This description includes attributes like cylinders, displacement, horsepower, and weight.

9

Dataset: Auto MPG

• Dataset source: https://www.kaggle.com/uciml/autompg-dataset

• Number of instances: 398

• Attribute information:

  • miles per gallon (mpg): continuous
  • cylinders: multi-valued discrete
  • displacement: continuous
  • horsepower: continuous
  • weight: continuous
  • acceleration: continuous
  • model year: multi-valued discrete
  • origin: multi-valued discrete

10

Preparation

11

Loading the data
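A minimal sketch of the loading step, assuming pandas and a space-separated text layout in which "?" marks an unknown value. The column names and the four rows are illustrative placeholders (not the real file), chosen to match the attribute list of the dataset slide.

```python
import io

import pandas as pd

# Column names follow the attribute list of the Auto MPG dataset (assumed spelling).
column_names = ["MPG", "Cylinders", "Displacement", "Horsepower",
                "Weight", "Acceleration", "Model Year", "Origin"]

# Illustrative rows only; "?" stands for an unknown horsepower value.
raw_text = """\
18.0 8 307.0 130.0 3504.0 12.0 70 1
25.0 4 98.0 ? 2046.0 19.0 71 1
24.0 4 113.0 95.0 2372.0 15.0 70 3
26.0 4 97.0 46.0 1835.0 20.5 70 2
"""

# In a real session the first argument would be the path or URL of the data file.
dataset = pd.read_csv(io.StringIO(raw_text), names=column_names,
                      na_values="?", sep=" ", skipinitialspace=True)

print(dataset.shape)
print(dataset.tail())
```

Declaring `na_values="?"` at load time lets pandas treat the placeholder as a missing value, which is what the cleaning step relies on.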

13

«Cleaning» the data

● Data is almost never clean!

○ So we need to make sure that all our data holds valid values

○ Are there unknown values?

○ DataFrame.isna(): detects missing values.

○ To keep this initial tutorial simple, drop those rows using DataFrame.dropna()
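The two calls named above can be tried on a tiny frame. This is a sketch with made-up values; only `DataFrame.isna()` and `DataFrame.dropna()` come from the slide.

```python
import numpy as np
import pandas as pd

# Toy frame with one unknown horsepower value (illustrative numbers).
dataset = pd.DataFrame({
    "MPG": [18.0, 25.0, 24.0],
    "Horsepower": [130.0, np.nan, 95.0],
})

# Count the missing values in each column.
missing_per_column = dataset.isna().sum()
print(missing_per_column)

# Drop every row that contains at least one missing value.
clean = dataset.dropna()
print(len(dataset), "->", len(clean))
```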

15

Handling categorical data

● The Origin column is not numeric but categorical: 1 means «USA», 2 «Europe» and 3 «Japan»
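One common way to handle such a column is one-hot encoding. The sketch below assumes pandas and uses `pd.get_dummies`; the three-row frame is illustrative, and the slide itself does not say which encoding the course code uses.

```python
import pandas as pd

# Toy frame: Origin holds the numeric codes 1, 2 and 3.
dataset = pd.DataFrame({"MPG": [18.0, 26.0, 24.0], "Origin": [1, 2, 3]})

# Replace each code with its category name.
origin_names = {1: "USA", 2: "Europe", 3: "Japan"}
dataset["Origin"] = dataset["Origin"].map(origin_names)

# One-hot encode: one 0/1 column per category, so the model does not
# read the codes as ordered quantities.
dataset = pd.get_dummies(dataset, columns=["Origin"],
                         prefix="", prefix_sep="", dtype=float)
print(dataset)
```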

17

Model evaluation: data

• Training set → For training

• Validation set → For hyperparameter tuning

• Test set → For testing model performance

18

80% Training

19

● Separate the values we want to predict
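A sketch of the split and of separating the target, assuming pandas, an 80% training fraction and `MPG` as the column to predict. The random data is synthetic and only serves to make the example runnable.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the cleaned dataset (100 rows).
rng = np.random.default_rng(0)
dataset = pd.DataFrame({
    "MPG": rng.uniform(10, 40, size=100),
    "Weight": rng.uniform(1500, 5000, size=100),
})

# 80% of the rows go to training; the remaining rows form the test set.
train_dataset = dataset.sample(frac=0.8, random_state=0)
test_dataset = dataset.drop(train_dataset.index)

# Separate the values we want to predict (the labels) from the features.
train_labels = train_dataset.pop("MPG")
test_labels = test_dataset.pop("MPG")

print(train_dataset.shape, test_dataset.shape)
```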

20

Normalizing the input data

● Inspect the data: overall statistics

22

Normalizing the input data
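A sketch of both steps, assuming pandas: `describe()` gives the overall statistics, and the helper `norm` (a name chosen here for illustration) standardizes each feature with the mean and standard deviation of the training set only, so no test information leaks into training.

```python
import pandas as pd

# Toy feature frames (illustrative values).
train_dataset = pd.DataFrame({"Weight": [2000.0, 3000.0, 4000.0],
                              "Horsepower": [70.0, 100.0, 160.0]})
test_dataset = pd.DataFrame({"Weight": [2500.0], "Horsepower": [90.0]})

# Overall statistics, one row per feature.
train_stats = train_dataset.describe().transpose()
print(train_stats[["mean", "std"]])

def norm(x):
    """Standardize features using the training-set statistics."""
    return (x - train_stats["mean"]) / train_stats["std"]

normed_train = norm(train_dataset)
normed_test = norm(test_dataset)  # the test set reuses the training statistics
print(normed_test)
```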

24

Model definition

25

Inspecting the model
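A possible model for this regression task, sketched with `tf.keras`. The layer sizes (two hidden layers of 64 units) and the 9 input features are assumptions made for illustration; the slide does not state the architecture.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(num_features):
    """Small densely connected network with one linear output for regression."""
    return keras.Sequential([
        keras.Input(shape=(num_features,)),
        layers.Dense(64, activation="relu"),
        layers.Dense(64, activation="relu"),
        layers.Dense(1),  # no activation: the output is a real-valued prediction
    ])

# Assumed feature count (numeric attributes plus one-hot origin columns).
model = build_model(num_features=9)

# summary() prints each layer with its output shape and parameter count.
model.summary()
```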

26

Model configuration

● Loss function (options):

27

Loss functions: MAE or MSE?

● Evaluation metrics used for regression differ from those used for classification

● Both mean squared error (MSE) and mean absolute error (MAE) are used in regression modeling

○ MAE is more robust to outliers since it does not square the errors (because of the square, large errors have relatively greater influence on MSE than smaller errors do)

○ MSE is more useful if we are concerned about large errors whose consequences are much bigger than those of equivalent smaller ones.

● Up to you! ;-)
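The difference can be checked numerically. In this sketch (NumPy, made-up values) both prediction sets are compared against the same targets: one has four errors of size 1, the other a single error of size 10.

```python
import numpy as np

y_true = np.array([20.0, 25.0, 30.0, 35.0])
y_pred_small = np.array([21.0, 24.0, 31.0, 34.0])    # four errors of size 1
y_pred_outlier = np.array([20.0, 25.0, 30.0, 45.0])  # one error of size 10

def mae(y, p):
    """Mean absolute error."""
    return float(np.mean(np.abs(y - p)))

def mse(y, p):
    """Mean squared error."""
    return float(np.mean((y - p) ** 2))

print("small errors:", mae(y_true, y_pred_small), mse(y_true, y_pred_small))
print("one outlier: ", mae(y_true, y_pred_outlier), mse(y_true, y_pred_outlier))
```

The single large error multiplies MAE by 2.5 (1.0 to 2.5) but MSE by 25 (1.0 to 25.0), which is the sensitivity to outliers described above.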

28

Model configuration

● Optimizer

29

Training the model

20% of the data for validation
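A sketch of configuration and training together, assuming `tf.keras`. The MSE loss, the RMSprop optimizer with learning rate 0.001, the 5 epochs and the synthetic data are all illustrative choices; `validation_split=0.2` is what holds out 20% of the training data for validation.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic regression data standing in for the normalized features and labels.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(200, 9)).astype("float32")
y_train = (x_train @ rng.normal(size=(9,)) + 23.0).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(9,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(1),
])

# Configuration: loss function, optimizer, and metrics to report.
model.compile(loss="mse",
              optimizer=keras.optimizers.RMSprop(learning_rate=0.001),
              metrics=["mae", "mse"])

# Training: the last 20% of the samples are held out for validation.
history = model.fit(x_train, y_train, epochs=5,
                    validation_split=0.2, verbose=0)

# history.history maps each metric name to a list with one value per epoch.
print(sorted(history.history.keys()))
```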

30

Evaluating the training process

● Visualize the model's training progress using the stats stored in the history object

33

Plot of loss on the training and validation datasets over training epochs

34

Overfitting

● The gap between the training and validation loss indicates overfitting

● Our graph shows little improvement, or even degradation, in the validation error after about 100 epochs → Overfitting?
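The same reading can be done numerically instead of by eye. The sketch below uses hypothetical loss curves (not results from the course) shaped like the pattern described: validation loss stops improving around epoch 100 while training loss keeps falling.

```python
import numpy as np

# Hypothetical curves: training loss keeps decreasing, validation loss
# bottoms out at epoch 100 and then slowly degrades.
epochs = np.arange(1, 201)
train_loss = 10.0 / np.sqrt(epochs)
val_loss = (10.0 / np.sqrt(np.minimum(epochs, 100))
            + 0.005 * np.maximum(epochs - 100, 0))

# Epoch with the lowest validation loss.
best_epoch = int(epochs[np.argmin(val_loss)])

# Gap between validation and training loss at the end of training.
gap_at_end = float(val_loss[-1] - train_loss[-1])

print("best epoch:", best_epoch)
print("final gap:", round(gap_at_end, 3))
```

With a real run, the two arrays would come from `history.history["loss"]` and `history.history["val_loss"]`.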

35

EarlyStopping
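A sketch of the callback in `tf.keras`, assuming we monitor the validation loss. The patience of 5 epochs, the 300-epoch ceiling and the noise-only labels are illustrative; with labels that carry no signal, the validation loss cannot keep improving, so training is expected to stop early.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic data whose labels are pure noise (illustrative only).
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 9)).astype("float32")
y = rng.normal(size=(200,)).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(9,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(1),
])
model.compile(loss="mse", optimizer="rmsprop")

# Stop when val_loss has not improved for `patience` consecutive epochs.
early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=5)

history = model.fit(x, y, epochs=300, validation_split=0.2,
                    verbose=0, callbacks=[early_stop])

epochs_run = len(history.history["loss"])
print("epochs actually run:", epochs_run)
```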

37

Model evaluation

38

Model evaluation: good?

● Mean absolute error (MAE)

A more «understandable» value?

○ 5 miles per gallon → about 2.13 kilometers per liter of error (1 US mpg ≈ 0.425 km/L).

○ Converting this error to liters per 100 km, the measure we usually use when talking about car fuel consumption, does not give a single figure: L/100 km = 235.2 / mpg, so the equivalent error depends on the car's actual consumption.
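The unit conversions can be worked through in a few lines. This sketch assumes US gallons (the dataset describes US-market cars); the function names and the 25 mpg reference car are hypothetical, chosen only to show that a liters-per-100-km error depends on the baseline.

```python
KM_PER_MILE = 1.609344
LITERS_PER_US_GALLON = 3.785411784

def mpg_to_km_per_liter(mpg):
    """Linear conversion: miles per gallon to kilometers per liter."""
    return mpg * KM_PER_MILE / LITERS_PER_US_GALLON

def mpg_to_liters_per_100km(mpg):
    """Reciprocal conversion: miles per gallon to liters per 100 km."""
    return 100.0 * LITERS_PER_US_GALLON / (mpg * KM_PER_MILE)

# An error of 5 mpg expressed in km/L (valid because the conversion is linear).
mae_km_per_liter = mpg_to_km_per_liter(5.0)
print(round(mae_km_per_liter, 2))

# In L/100 km the error depends on the car: hypothetical car at 25 mpg
# that the model predicts as 30 mpg.
true_mpg, predicted_mpg = 25.0, 30.0
error_l_per_100km = (mpg_to_liters_per_100km(true_mpg)
                     - mpg_to_liters_per_100km(predicted_mpg))
print(round(error_l_per_100km, 2))
```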

39

It is time to get your hands dirty!

40










41

Where to find data?

● Public datasets for training neural networks

● Preloaded datasets

● Kaggle datasets

42

Case study: «Dogs vs. Cats»

43

(*) Colab loads the data into the /content directory

44

!wget --no-check-certificate \
    https://www.dropbox.com/s/sshnskxxolkrq9h/cats_and_dogs_small.zip?dl=0 \
    -O /tmp/cats_and_dogs_small.zip

45

Data for training, validation and testing

48

Checking the data

51

Model

52

Reminder

● Convolutional layers operate on 3D tensors, called feature maps, with two spatial axes (height and width) plus a channel axis (channels), also called depth.

● For an RGB color image, the depth axis has dimension 3, because the image has three channels: red, green and blue.
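The shapes can be inspected directly in `tf.keras`. The 150×150 input size and the 32 filters are assumptions made for illustration; what the sketch shows is how the height, width and depth of the feature map change through a convolution and a pooling layer.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    # One RGB image: height 150, width 150, depth 3 (assumed size).
    keras.Input(shape=(150, 150, 3)),
    # A 3x3 convolution without padding trims 2 pixels per spatial axis;
    # the depth becomes the number of filters.
    layers.Conv2D(32, (3, 3), activation="relu"),
    # 2x2 max pooling halves both spatial axes and keeps the depth.
    layers.MaxPooling2D((2, 2)),
])

for layer in model.layers:
    print(layer.name, layer.output.shape)
```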

54

Training configuration

55

Preprocessing real data with ImageDataGenerator

56

Data generators

62

Overfitting?

63

Testing the model with predict()

64

Techniques to prevent overfitting

● Reduce the size of the model

● Add Dropout

● Add L1 and/or L2 regularization

→ Data Augmentation

→ Transfer Learning
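Two of the listed techniques, Dropout and L2 regularization, can be sketched on a small convnet in `tf.keras`. The 64×64 input, the dropout rate of 0.5 and the L2 factor of 0.001 are illustrative values, not the course's settings.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(64, 64, 3)),  # assumed input size
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    # Dropout: randomly zeroes half of the activations during training only.
    layers.Dropout(0.5),
    # L2 regularization: adds a penalty proportional to the squared weights.
    layers.Dense(64, activation="relu",
                 kernel_regularizer=keras.regularizers.l2(0.001)),
    layers.Dense(1, activation="sigmoid"),
])

model.summary()
```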

65


66










67

Data Augmentation

68

Configuring ImageDataGenerator
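A sketch of a typical augmentation configuration, assuming the `ImageDataGenerator` class from `tf.keras` (marked as deprecated in recent TensorFlow releases, but matching the TensorFlow 2 version of these slides). All ranges are illustrative values.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Training generator: rescale pixels to [0, 1] and apply random transformations.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=40,       # random rotations, in degrees
    width_shift_range=0.2,   # horizontal shift, fraction of the width
    height_shift_range=0.2,  # vertical shift, fraction of the height
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode="nearest",     # how to fill pixels created by a transformation
)

# Validation and test data are only rescaled, never augmented.
test_datagen = ImageDataGenerator(rescale=1.0 / 255)
```

Augmentation is applied only to the training images so that validation and test accuracy are measured on unmodified data.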

69

(*) same model as before

70

NOTE: reduce during class

73

(*) Previous accuracy: 0.739

74

Pre-trained model

● A saved network that was previously trained on a large dataset

○ The intuition behind transfer learning is that if a model is trained on a large and general enough dataset, it will effectively serve as a generic model of the visual world. We can leverage these learned feature maps without having to train a large model on a large dataset, by using these models as the basis of our own model, specific to our task.

● There are 2 scenarios of transfer learning using a pretrained model:

○ Feature Extraction

○ Fine-Tuning

75

Feature Extraction

● Use the representations learned by a previous network to extract meaningful features from new samples.

● We simply add a new classifier, which will be trained from scratch, on top of the pretrained model so that we can repurpose the feature maps learned previously for our dataset.

Weights and biases initialized with trained values

76

Feature Extraction

● Do we use the entire pretrained model or just the convolutional base?

○ We use the feature extraction portion of these pretrained convnets (the convolutional base), since it is likely to hold generic features and learned concepts about images.

○ However, the classification part of the pretrained model is often specific to the original classification task, and therefore specific to the set of classes on which the model was trained.

77

Fine-Tuning

1. Unfreeze a few of the top layers of the frozen model base used for feature extraction,

2. and jointly train both the newly added classifier layers and those unfrozen last layers of the base model.

This allows us to "fine tune" the higher-order feature representations, in addition to our final classifier, in order to make them more relevant for the specific task involved.

Not trainable | Trainable

78

Feature Extraction

81

● In Keras, a layer is frozen by setting its trainable attribute to False

● In Keras, models can be treated as layers:
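Both points can be sketched together in `tf.keras`: a VGG16 convolutional base is frozen through its `trainable` attribute and then used as if it were a layer of a new model. Here `weights=None` is used only to keep the sketch free of downloads (in practice one would pass `weights="imagenet"`); the 64×64 input and the classifier sizes are assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Convolutional base only (include_top=False drops the original classifier).
# weights=None avoids a download here; use weights="imagenet" for real transfer learning.
conv_base = keras.applications.VGG16(weights=None, include_top=False,
                                     input_shape=(64, 64, 3))

# Freeze the base: its weights will not be updated during training.
conv_base.trainable = False

# The whole base model is used as a single layer of the new model.
model = keras.Sequential([
    keras.Input(shape=(64, 64, 3)),
    conv_base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# Only the kernel and bias of the two new Dense layers remain trainable.
print("trainable weight tensors:", len(model.trainable_weights))
```

The freeze should be set before compiling the model, since the set of trainable weights is taken into account at compile time.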

88

Fine-Tuning
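A sketch of fine-tuning in `tf.keras`, assuming a VGG16 base in which only the last convolutional block (the layers whose names start with `block5`) is unfrozen. `weights=None` avoids a download in this sketch; in practice the base would carry ImageNet weights and the classifier would already have been trained with the base frozen. The learning rate of 1e-5 is an illustrative low value.

```python
from tensorflow import keras
from tensorflow.keras import layers

conv_base = keras.applications.VGG16(weights=None, include_top=False,
                                     input_shape=(64, 64, 3))

# Unfreeze only the top block of the base; keep the earlier blocks frozen.
conv_base.trainable = True
for layer in conv_base.layers:
    layer.trainable = layer.name.startswith("block5")

model = keras.Sequential([
    keras.Input(shape=(64, 64, 3)),
    conv_base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# A low learning rate limits how far the pretrained weights can move.
model.compile(loss="binary_crossentropy",
              optimizer=keras.optimizers.RMSprop(learning_rate=1e-5),
              metrics=["accuracy"])

trainable_layer_names = [layer.name for layer in conv_base.layers
                         if layer.trainable]
print(trainable_layer_names)
```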

96


97










98

Keras functional API

● Same example:
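Which example the slide repeats is not recoverable from the text, so the sketch below assumes a small densely connected regression network and writes it with the functional API: each layer is called on a tensor and returns a tensor, and the model is defined by its inputs and outputs.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Assumed example: 9 input features, two hidden layers, one linear output.
inputs = keras.Input(shape=(9,))
x = layers.Dense(64, activation="relu")(inputs)
x = layers.Dense(64, activation="relu")(x)
outputs = layers.Dense(1)(x)

model = keras.Model(inputs=inputs, outputs=outputs)
model.summary()
```

Unlike `Sequential`, this style also allows models with several inputs, several outputs or branches, because the graph of layers is built explicitly.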

104

Complex models

107

Neural networks with their own names

● AlexNet

● GoogLeNet

● VGG

● ResNet

● …

108

Accessing pretrained networks

● Example: VGG16

110

Using pretrained networks with Keras

● CIFAR-10 dataset
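A sketch of how a named architecture from `keras.applications` can be instantiated for CIFAR-10-sized inputs (32×32 RGB images, 10 classes). With `weights=None` the network starts from random weights and no download is needed; the optimizer and loss are illustrative choices, and the dataset itself is not loaded here.

```python
from tensorflow import keras

# ResNet50 architecture with a fresh 10-class classifier, trained from scratch.
# Pretrained ImageNet weights would instead require include_top=False plus a
# new classifier, because the original top layer has 1000 classes.
model = keras.applications.ResNet50(weights=None, include_top=True,
                                    input_shape=(32, 32, 3), classes=10)

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

print(model.output_shape)
```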

112

ResNet50

Test Acc 0.5725

115

ResNet50 pretrained on ImageNet

116

Test Acc 0.7357

118

VGG19

121

