Simone Scardapane {[email protected]}
Big Data and Deep Learning
Toward a new generation of
intelligent programs (maybe)
What is data good for?
https://datafloq.com/read/understanding-sources-big-data-infographic/338
Google Flu Trends
The big data hope
“…we can accurately estimate the current level of weekly influenza activity in
each region of the United States, with a reporting lag of about one day.”
Ginsberg, Jeremy, et al. "Detecting influenza epidemics using search engine query data."
Nature 457.7232 (2009): 1012-1014.
Initially, the researchers expected to achieve this with an essentially
linear model relating search queries Q to physician visits P:
logit(P) = β0 + β1 × logit(Q) + ε
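As a rough illustration of how such a model turns a query fraction into a predicted visit fraction, here is a minimal Java sketch. The class name, the method names and the coefficient values are hypothetical and chosen only for illustration; they are not the values fitted by Ginsberg et al., and the noise term ε is ignored.

```java
public class FluLogitSketch {

    // logit(p) = ln(p / (1 - p)), defined only for 0 < p < 1.
    public static double logit(double p) {
        if (p <= 0.0 || p >= 1.0) {
            throw new IllegalArgumentException("p must be in (0, 1)");
        }
        return Math.log(p / (1.0 - p));
    }

    // Inverse of logit: maps any real number back into (0, 1).
    public static double inverseLogit(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // Applies logit(P) = beta0 + beta1 * logit(Q) and solves for P.
    public static double predictVisitFraction(double beta0, double beta1, double q) {
        return inverseLogit(beta0 + beta1 * logit(q));
    }

    public static void main(String[] args) {
        // Hypothetical coefficients, NOT the ones estimated in the paper.
        double beta0 = -0.5;
        double beta1 = 1.1;
        double queryFraction = 0.02; // assumed fraction of flu-related queries
        System.out.println(predictVisitFraction(beta0, beta1, queryFraction));
    }
}
```

The model is linear only in logit space: with beta1 > 0, a higher query fraction always yields a higher predicted visit fraction, and the prediction always stays between 0 and 1.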
The big data hubris
“GFT […] missed by a very large margin in the 2011–2012 flu season and
has missed high for 100 out of 108 weeks starting with August 2011.”
"“Big data hubris” is the often implicit assumption that big data are a
substitute for, rather than a supplement to, traditional data collection and
analysis."
Lazer, David, et al. "The parable of Google Flu: traps in big data analysis."
Science 343.6176 (2014): 1203-1205.
Predictive policing
“PredPol, [is a] “predictive policing” software program that shovels historical
crime data through a proprietary algorithm and spits out the 10 to 20 spots
most likely to see crime over the next shift.”
"Santa Cruz saw burglaries drop by 11% and robberies by 27% in the first
year of using the software."
Server And Protect: Predictive Policing Firm PredPol Promises To Map Crime Before It Happens
(Forbes, 2015)
What can it be used for?
Can we use this data to predict what users will type?
Swiftkey Releases Predictive Keyboard Built On A Neural Network
http://digitalcallout.com/how-much-data-do-we-generate-every-day/
Image recognition
Microsoft, Google Beat Humans at Image Recognition
OK, but how?
Machine learning
This is a duck:
https://it.wikipedia.org/wiki/Anas_platyrhynchos#/media/File:Anas_platyrhynchos_male.jpg
This is NOT a duck:
https://it.wikipedia.org/wiki/Quercus#/media/File:Quercus_pubescens_Tuscany.jpg
How do we get the computer
to understand that?
Artificial neural networks
http://neuralnetworksanddeeplearning.com/chap1.html
A biological inspiration
A key ingredient:
multiple layers of
processing
Urbanski, M., Coubard, O. A., & Bourlon, C. (2014). Visualizing the blind brain: brain imaging of visual field defects from
early recovery to rehabilitation techniques. Frontiers in integrative neuroscience, 8.
A (very) brief history of neural networks
• 1957: Frank Rosenblatt introduces the perceptron
• 1970s: "AI Winter"
• 1980s: the first "renaissance" of neural networks
• Partial abandonment until 2000
• Since 2006: deep networks, the second "renaissance"
Triggering factors
1. New algorithms for training networks with several hidden layers
(unsupervised initialization, etc.).
2. Training sets with several million examples ("big data").
3. Large computational capacity: clusters, GPUs, etc.
And Google?
Google is one of the leading players in the field:
• 2012: trains a neural network with over 1 billion parameters on
frames extracted from YouTube
• 2014: acquires DeepMind for an estimated $500 million
• 2015: releases its distributed machine learning framework
TensorFlow as open source
The "cat neuron"
Le, Q. V. (2013, May). Building high-level features using large scale unsupervised learning. In 2013 IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP), (pp. 8595-8598). IEEE.
"We built a cat detector!"
Layers of representation
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep Learning. Nature, 521, 436-444.
What does a deep network see?
Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. In Computer Vision–ECCV
2014 (pp. 818-833). Springer International Publishing.
Deep dreams
The best images from Google's Deep Dream software
Fooling a neural network
Nguyen, A., Yosinski, J., & Clune, J. (2015). Deep Neural Networks are Easily Fooled: High Confidence
Predictions for Unrecognizable Images. In Computer Vision and Pattern Recognition (CVPR ’15). IEEE.
And us?
Structuring the data
https://cloud.google.com/prediction/docs/developer-guide#trainingtheapi
Prediction API
Unique ID assigned to the model
Path of the training file in Cloud Storage
Requesting a prediction
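// Build the Prediction API client; httpTransport, requestInitializer
// and jsonFactory are assumed to be created elsewhere.
Prediction prediction = new Prediction(httpTransport,
        requestInitializer, jsonFactory);
// Wrap the feature values (params) in the request payload.
Input input = new Input();
InputInput inputInput = new InputInput();
inputInput.setCsvInstance(params);
input.setInput(inputInput);
// Send the instance to the trained model identified by modelId.
Output output = prediction.trainedmodels().predict(modelId,
        input).execute();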
The future (maybe)
Self-driving cars?
https://www.google.com/selfdrivingcar/
Machines that "talk"?
http://googleresearch.blogspot.it/2014/11/a-picture-is-worth-thousand-coherent.html
Vinyals, O., Toshev, A., Bengio, S., & Erhan, D. (2014). Show and tell: A neural image caption generator.
arXiv preprint arXiv:1411.4555.
Big data and deep learning
Simone Scardapane
GDG L-ab Member
PhD Student @ La Sapienza
< Thank you for your attention! >