Semantic Segmentation of Driving Behavior Data: Double Articulation Analyzer and its Application
Tadahiro Taniguchi (@tanichu)
College of Information Science & Engineering, Ritsumeikan University
Invited talk at The 4th Workshop on Naturalistic Driving Data Analytics, IEEE IV2017, Los Angeles, 11th June, 2017
Symbolization approach for driving behavior data: machine learning methods for unlabeled naturalistic driving data
Tadahiro Taniguchi (@tanichu)
• Professor, Emergent System Laboratory, College of Information Science and Engineering, Ritsumeikan University, Japan
– 2003-2006: PhD student, Kyoto University
– 2005-2008: JSPS research fellow, Kyoto University
– 2008: Assistant professor, Ritsumeikan University
– 2010: Associate professor, Ritsumeikan University
– 2015-2016: Visiting Associate Professor, Imperial College London
– 2017: Professor, Ritsumeikan University
– 2017: Visiting General Chief Scientist, AI Solution Center, Panasonic Corporation (20% C.A.)
• Research topics
– Machine learning, intelligent robotics & vehicles, symbol emergence in robotics, language acquisition
Contents
1. Overview of Double Articulation Analysis of Driving Behavior
2. Applications: segmentation and topic modeling, prediction of driving behavior, large-scale data
3. Deep learning for driving behavior feature extraction
Towards Machine learning-based Naturalistic driving behavior data analysis
E.g., Nagoya database [Takeda+]
Because of the massive size and huge diversity of NDD, hand-crafted and rule-based analyses are not scalable.
How can we segment driving behavior data and find semantic units from NDD?
Machine learning-based Semantic Segmentation
Cloud storage/ Internet
without labeling
Driving behavior data as unlabeled multi-dimensional time-series data
Preprocessed time series data
Time-series data from each sensor
Research goal: Extraction of latent states from unlabeled naturalistic driving data
How can we construct this kind of latent finite state machine (FSM) from unlabeled naturalistic driving data?
We can regard naturalistic driving data as having latent (in)finite states.
Perception
Prediction
How are the recorded naturalistic driving behavior data generated?
Vehicle dynamics and behavior
Perception, Decision, Maneuver
Environment
Intention
Driving behavior data include velocity, brake pressure, steering angle, and so on. (Note that we are excluding front-view camera images.)
They are influenced by driver’s intention and environmental conditions.
Latent variable
Driving data conditionally depend on driver’s intention and environmental conditions
Discovering latent dynamics of driver’s intention from observed driving behavior data
We have been focusing on the analysis of driving behavior data obtained from the CAN (Controller Area Network) bus to estimate the latent dynamics of the driver's intention and the environment.
CAN information implicitly carries information related to the environment and the driver's intention.
Working hypothesis: double articulation structure on naturalistic driving behavior data
[Figure: driving behavior data (CAN information) are mapped to a sequence of driving letters, which is chunked into driving words (dw 4, dw 10, dw 1); the latent variable sequence z_{t-1}, z_t, z_{t+1} represents intention.]
Double articulation structure in semiotic data
• Semiotic time-series data often have a double articulation structure.
– A speech signal is a continuous, high-dimensional time series.
– A spoken sentence is considered a sequence of phonemes.
– The phonemes are grouped into words, and people give them meanings.
Word (semantic, meaningful): [h a u] [m ʌ́ tʃ] [i z] [ð í s] = "How much is this?"
Phoneme (meaningless): h a u m ʌ́ tʃ i z ð í s
Speech signal (unsegmented)
Does the human brain have a special capability to analyze double articulation structures embedded in time-series data?
Example: the unsegmented letter sequence W H A T I S T H I S T H I S I S A P E N is chunked into [WHAT] [IS] [THIS] [THIS] [IS] [A] [PEN].
Working hypothesis: double articulation structure in human behavior (speech, motion, driving)
Basic assumption: "Driving behavior data have a two-layered hierarchical structure."
Example: "turning right at an intersection" is not a simple "rotating the steering wheel" maneuver, but a complex sequence of maneuvers.
Double articulation structure
Chunk (a sequence of segments): semantically consistent driving behavior unit (driving word)
Segment: physically consistent driving behavior unit (driving letter)
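The two layers can be illustrated with a minimal sketch. It assumes that every time frame already carries a driving-letter label and that a dictionary of driving words is given; both assumptions are for illustration only, since the DAA has to infer letters and words from data. The function names and the toy dictionary are hypothetical.

```python
from itertools import groupby

def frames_to_segments(frame_labels):
    """Collapse per-frame labels into segments: (letter, duration in frames)."""
    return [(label, len(list(run))) for label, run in groupby(frame_labels)]

def letters_to_chunks(letters, dictionary):
    """Greedy longest-match chunking of a letter sequence.

    The dictionary is given here for illustration only; the DAA has to
    estimate it from data. Unknown letters become single-letter chunks.
    """
    chunks, i = [], 0
    max_len = max(len(word) for word in dictionary)
    while i < len(letters):
        for n in range(min(max_len, len(letters) - i), 0, -1):
            candidate = tuple(letters[i:i + n])
            if candidate in dictionary or n == 1:
                chunks.append(candidate)
                i += n
                break
    return chunks

frame_labels = [5, 5, 5, 2, 2, 1, 1, 1, 1, 6, 6, 9]   # toy frame-level letters
segments = frames_to_segments(frame_labels)
letters = [letter for letter, _ in segments]
toy_dictionary = {(5, 2), (1, 6, 9)}                   # hypothetical driving words
chunks = letters_to_chunks(letters, toy_dictionary)
```

Here `segments` is the lower (physical) layer and `chunks` is the upper (semantic) layer built on top of it.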
Analogy between speech signal and driving behavior data
Speech data: speech signal → features → word sequence (e.g., "Today was fun."), obtained by speech recognition. The semantic unit is the word, composed of phonemes (e.g., "How much is" from h a u m ʌ tʃ i z).
Driving behavior data (D.B. data): driving behavior → features → driving word sequence, obtained by the DAA. The semantic unit is the driving word (D. word, e.g., dw 4, dw 10, dw 1), composed of driving letters (D. letters).
Extracting driving words as high-level semantic representations that concisely represent the driver's intention and situation
Nonparametric Bayesian approach towards finding driving letters and words
• Challenges
– How can a system find the number of driving words and letters from data?
– How can a system estimate the features and characteristics (emission distributions) of driving letters?
– How can a system find a list of driving words (i.e., a dictionary)?
– How can a system determine the number of letters contained in each driving word?
Nonparametric Bayesian approach in a data-driven manner?
Bayesian nonparametrics (nonparametric Bayesian approach)
By assuming an infinite-dimensional categorical distribution (i.e., an infinite number of clusters), we can develop a clustering method that automatically estimates the number of clusters.
It can easily deal with "unseen" possible events (driving behaviors).
It is useful for modeling data-driven concept formation, motion segmentation, and word segmentation.
(Ordinary) Bayesian model: K-mixture model (GMM, etc.); finite (fixed) number of clusters; Dirichlet distribution
Nonparametric Bayesian model: stick-breaking process (SBP) model (DPGMM, etc.); infinite (flexible) number of clusters; Dirichlet process
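The stick-breaking construction behind the SBP model can be sketched as follows. This is a minimal illustration using only the Python standard library; the truncation level, the concentration value, and the function name are assumptions made for the example, not settings from the talk.

```python
import random

def stick_breaking(alpha, truncation, rng):
    """Draw truncated stick-breaking weights with concentration alpha.

    Each step breaks off a Beta(1, alpha) fraction of the remaining stick.
    The infinite sequence is cut at `truncation`; the leftover mass is
    returned separately.
    """
    weights, remaining = [], 1.0
    for _ in range(truncation):
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining

rng = random.Random(0)
weights, leftover = stick_breaking(alpha=2.0, truncation=20, rng=rng)
```

Because the weights decay quickly, only a few clusters receive noticeable mass for any finite data set, which is how the number of clusters can be left open instead of being fixed to K in advance.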
Fully unsupervised machine learning method
Bayesian double articulation analyzer [Taniguchi '11]: double articulation analysis (segmentation)
Time-series data → driving letters (segments), found by the sticky HDP-HMM [Fox '08]
Driving letters → driving words (chunks), found by unsupervised word segmentation with a language model (NPYLM) [Mochihashi '09]
Tadahiro Taniguchi, Shogo Nagasaka, Double Articulation Analyzer for Unsegmented Human Motion using Pitman-Yor Language model and Infinite Hidden Markov Model, IEEE/SICE International Symposium on System Integration, pp. 250-255 (2011)
Finding driving letters from NDD: sticky hierarchical Dirichlet process hidden Markov model (sticky HDP-HMM)
The HDP-HMM is an HMM that has an infinite number of hidden states. It can automatically segment continuous time-series data and find the number of clusters (emission distributions) simultaneously [Teh+ '06, Fox+ '08].
The (sticky) HDP-HMM is shown to be a subclass of the HDP-HSMM [Johnson+ '13].
[Figure: graphical model of the sticky HDP-HMM with hyperparameters γ, α, κ, λ, global weights β, transition distributions π_k and emission parameters θ_k (k = 1, ..., ∞), hidden states z_1, ..., z_T, and observations y_1, ..., y_T.]
Code: https://github.com/mattjj/pyhsmm
Fox, Emily B., et al. "An HDP-HMM for systems with state persistence." Proceedings of the 25th International Conference on Machine Learning. ACM, 2008.
Johnson, Matthew J., and Alan S. Willsky. "Bayesian nonparametric hidden semi-Markov models." Journal of Machine Learning Research 14.Feb (2013): 673-701.
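The effect of the "sticky" parameter κ can be seen in the prior mean of the transition matrix. In the sticky HDP-HMM of [Fox+ '08], row j of the transition matrix is drawn from a Dirichlet process with base measure (αβ + κδ_j)/(α + κ). The sketch below computes that mean for a truncated β; the toy values and the function name are assumptions for illustration, and it does not perform any inference.

```python
def sticky_mean_transitions(beta, alpha, kappa):
    """Expected transition matrix under a truncated sticky HDP-HMM prior.

    Row j has mean (alpha * beta + kappa * delta_j) / (alpha + kappa),
    so kappa adds extra mass to the self-transition.
    """
    size = len(beta)
    return [
        [(alpha * beta[k] + (kappa if k == j else 0.0)) / (alpha + kappa)
         for k in range(size)]
        for j in range(size)
    ]

beta = [0.5, 0.3, 0.2]   # truncated global state weights (toy values)
plain = sticky_mean_transitions(beta, alpha=1.0, kappa=0.0)
sticky = sticky_mean_transitions(beta, alpha=1.0, kappa=4.0)
```

With κ = 0 every row equals β (the plain HDP-HMM); with κ > 0 the diagonal grows, which discourages rapid switching between driving letters.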
Unsupervised word segmentation
Supervised learning-based word segmentation
• Morphological analysis methods in NLP, e.g., "これはりんごです." -> "これ|は|りんご|です." (Kore wa ringo desu: "This is an apple.")
Unsupervised word segmentation
• No preexisting dictionaries are used.
• A nonparametric Bayesian framework for word segmentation [Goldwater+ 09]
• Unsupervised word segmentation method based on the nested Pitman–Yor language model (NPYLM) [Mochihashi+ 09]
S. Goldwater, T. L. Griffiths, and M. Johnson, "A Bayesian framework for word segmentation: exploring the effects of context," Cognition, vol. 112, no. 1, pp. 21–54, 2009.
Daichi Mochihashi, Takeshi Yamada, Naonori Ueda, "Bayesian Unsupervised Word Segmentation with Nested Pitman-Yor Language Modeling," ACL-IJCNLP 2009, pp. 100-108, 2009.
[Figure: the method alternates between word segmentation and updating the language model (vocabulary). From Mochihashi's presentation slides: http://chasen.org/~daiti-m/paper/jfssa2009segment.pdf]
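The segmentation step alone can be sketched with dynamic programming. This is a deliberately simplified stand-in: it assumes a fixed unigram word model with hand-picked probabilities, whereas the NPYLM learns a nested Pitman–Yor language model and resamples segmentations jointly. The vocabulary, probabilities, and function name are hypothetical.

```python
import math

def segment(text, word_prob, max_len=4):
    """Most probable split of `text` under a fixed unigram word model."""
    n = len(text)
    best = [0.0] + [-math.inf] * n   # best[i]: best log-probability of text[:i]
    back = [0] * (n + 1)             # back[i]: start index of the last word
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in word_prob and best[start] > -math.inf:
                score = best[start] + math.log(word_prob[word])
                if score > best[end]:
                    best[end], back[end] = score, start
    if best[n] == -math.inf:
        return None                  # no segmentation with the given vocabulary
    words, end = [], n
    while end > 0:
        words.append(text[back[end]:end])
        end = back[end]
    return words[::-1]

toy_model = {"this": 0.3, "is": 0.3, "a": 0.2, "pen": 0.1, "his": 0.05, "t": 0.05}
result = segment("thisisapen", toy_model)
```

In the driving setting the characters would be driving letters and the recovered words would be driving words.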
Fully unsupervised (data-driven) word segmentation based on Bayesian nonparametrics
Double Articulation Analyzer (DAA) (conventional DAA)
Procedure of the Double Articulation Analyzer [Taniguchi '11]
• Unsupervised learning
– Estimating the language model, the emission distributions, and the segments and chunks
• Conditions
– Unknown number of words and letters
– Unknown emission distribution parameters
– Hence a nonparametric Bayesian approach
• Inference
– Approximate inference
Observation → driving letters (sticky HDP-HMM [Fox '08]) → driving words (NPYLM [Mochihashi '09])
Tadahiro Taniguchi, Shogo Nagasaka, Double Articulation Analyzer for Unsegmented Human Motion using Pitman-Yor Language model and Infinite Hidden Markov Model, 2011 IEEE/SICE SII (2011)
Contextual Scene Segmentation of Driving Behavior based on Double Articulation Analyzer [Takenaka ‘12]
Kazuhito Takenaka, Takashi Bando, Shogo Nagasaka, Tadahiro Taniguchi, Kentarou Hitomi, Contextual Scene Segmentation of Driving Behavior based on Double Articulation Analyzer, IEEE/RSJ International Conference on Intelligent Robots and Systems 2012 (IROS 2012), pp. 4847-4852 (2012)
We applied the DAA to driving behavior data and showed that it could determine the change points of driving context recognized by humans.
Drive Video Summarization based on Double Articulation Structure of Driving Behavior [Takenaka '12]
Kazuhito Takenaka, Takashi Bando, Shogo Nagasaka, Tadahiro Taniguchi, "Drive Video Summarization based on Double Articulation Structure of Driving Behavior", ACM Multimedia 2012
Video: http://www.youtube.com/watch?v=knwiO6dVbnY
We developed a drive video summarization method using the DAA and showed that it can summarize drive video in a way that looks natural to viewers.
Unsupervised drive topic finding from driving behavioral data [Bando ‘13]
The DAA could segment driving-behavior data into favorably organized chunks from the viewpoint of topic modeling.
Takashi Bando, Kazuhito Takenaka, Shogo Nagasaka, Tadahiro Taniguchi, Unsupervised drive topic finding from driving behavioral data, 2013 IEEE Intelligent Vehicles Symposium (2013). IEEE IV'13 Best Poster Paper Award, 1st prize
Generating Contextual Description from Driving Behavioral Data [Bando+ ‘13, ‘14]
We developed a method that can automatically generate annotations for driving behavior data [Bando '13b].
We developed a method that can generate a contextual description of a whole trip using the DAA, a drive topic model, and the Google Maps API [Bando '14].
Takashi Bando, Kazuhito Takenaka, Shogo Nagasaka, Tadahiro Taniguchi, Drive annotation via multimodal latent topic model, IEEE/RSJ International Conference on Intelligent Robots and Systems (2013)
Takashi Bando, Kazuhito Takenaka, Shogo Nagasaka, Tadahiro Taniguchi, Generating Contextual Description from Driving Behavioral Data, 2014 IEEE Intelligent Vehicles Symposium (IV'14) (2014)
[Figure: pipeline from the DAA through the topic model to annotation generation.]
Automatic Generation of Summarized DrivingVideo with Music and Captions [Takenaka+ ‘15 ]
Kazuhito Takenaka, Takashi Bando, Tadahiro Taniguchi, Automatic Generation of Summarized Driving Video with Music and Captions, 41st Annual Conference of the IEEE Industrial Electronics Society (IECON) (2015)
Driver assistance system that understands a driver's intention
"What kind of driving situation is this? What will the driver do next, and when?"
Semantic prediction is required for the long-term prediction of driving behavior.
Example: "Are you going to park the car soon? Shall I start the self-parking system?"
"What is the driver doing now?" "When will the current driving behavior finish?" "What will the driver do next?"
Semiotic Prediction of Driving Behavior using Unsupervised Double Articulation Analyzer
We developed a method that predicts successive sequences of driving letters by extending the DAA.
It exploits knowledge of driving words to predict future driving behavior.
Tadahiro Taniguchi, Shogo Nagasaka, Kentarou Hitomi, Naiwala P. Chandrasiri, and Takashi Bando, Semiotic Prediction of Driving Behavior using Unsupervised Double Articulation Analyzer, 2012 IEEE Intelligent Vehicles Symposium (IV2012), 849 - 854 .(2012)
Tadahiro Taniguchi, Shogo Nagasaka, Kentarou Hitomi, Naiwala P. Chandrasiri, Takashi Bando and Kazuhito Takenaka, Sequence Prediction of Driving Behaviour Using Double Articulation Analyzer, IEEE Transactions on Systems, Man, and Cybernetics: Systems, Vol.46 (9), 1300-1313,(2015) doi:10.1109/TSMC.2015.2465933
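The idea of using word knowledge for prediction can be illustrated with a toy lookup. This sketch assumes a table of driving-word counts is already available and simply completes the current prefix with the most frequent matching word; the published method works probabilistically with the learned language model, so this is only a simplified illustration with hypothetical names and counts.

```python
def predict_remaining_letters(prefix, word_counts):
    """Return the remaining letters of the most frequent word starting with prefix."""
    prefix = tuple(prefix)
    candidates = {
        word: count for word, count in word_counts.items()
        if word[:len(prefix)] == prefix and len(word) > len(prefix)
    }
    if not candidates:
        return None   # no known driving word continues this prefix
    best_word = max(candidates, key=candidates.get)
    return list(best_word[len(prefix):])

# Hypothetical counts of driving words (tuples of driving letters)
toy_counts = {(5, 2, 1): 12, (5, 2, 7): 3, (6, 9): 8}
prediction = predict_remaining_letters([5, 2], toy_counts)
```

Once the letters observed so far match the beginning of a known driving word, the rest of that word serves as a prediction that extends several letters into the future.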
[Figure: histogram of the averaged number of correct predictions.]
Hypothetical scenario of target application
Next Contextual Changing Point (NCCP)
Prediction of Next Contextual Changing Point of Driving Behavior Using Unsupervised Bayesian Double Articulation Analyzer [Nagasaka '14, Taniguchi '14]
We proposed a method that predicts the next contextual change point, i.e., the termination time of the current driving word, within a fully nonparametric Bayesian framework based on the HDP-HSMM and NPYLM.
When will the driver change his or her behavior?
S. Nagasaka, T. Taniguchi, K. Hitomi, K. Takenaka and T. Bando, “Prediction of Next Contextual Changing Point of Driving Behavior Using Unsupervised Bayesian Double Articulation Analyzer”, IEEE Intelligent Vehicles Symposium. (2014) (oral).
Tadahiro Taniguchi, Shogo Nagasaka, Kentaro Hitomi, Kazuhito Takenaka, and Takashi Bando, Unsupervised Hierarchical Modeling of Driving Behavior and Prediction of Contextual Changing Points, IEEE Transactions on Intelligent Transportation Systems, Vol.16 (4), 1746-1760 .(2014)
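How a duration distribution over the next change point can arise is sketched below. The example assumes that the remaining letters of the current driving word are known and that each letter has an independent, discrete duration distribution, as in a semi-Markov model; it ignores uncertainty about which word is in progress and the time already spent in the current letter. The toy distributions and function names are hypothetical.

```python
def convolve(pmf_a, pmf_b):
    """Distribution of the sum of two independent discrete durations."""
    out = {}
    for da, pa in pmf_a.items():
        for db, pb in pmf_b.items():
            out[da + db] = out.get(da + db, 0.0) + pa * pb
    return out

def time_to_change_point(remaining_letters, duration_pmfs):
    """Distribution over frames until the current driving word ends."""
    total = {0: 1.0}
    for letter in remaining_letters:
        total = convolve(total, duration_pmfs[letter])
    return total

toy_durations = {
    1: {2: 0.5, 3: 0.5},     # letter 1 lasts 2 or 3 frames
    6: {1: 0.25, 4: 0.75},   # letter 6 lasts 1 or 4 frames
}
nccp_pmf = time_to_change_point([1, 6], toy_durations)
```

The result is a full distribution rather than a point estimate, and it changes shape depending on which letters remain in the current word.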
[Figure: predicted probability distribution of the duration until the next contextual change point, plotted over observation time alongside the front camera view, for the proposed method, linear regression, and an RNN.]
• Proposed method: the predicted duration distribution changed dynamically on the basis of the driving context.
• Compared methods (linear regression, RNN): an almost constant duration distribution was output.
[Figure: timeline of the predicted NCCP for the proposed method, linear regression, and the RNN, compared with the true termination time of chunks.]
The proposed method predicted the NCCP of driving behavioral data in a real environment more accurately than the compared (supervised learning) methods.
Determining Utterance Timing of a Driving Agent with Double Articulation Analyzer [Taniguchi '15]
Is "avoiding the contextual change points" a good strategy for determining the utterance timing of a driving agent? -> Supported
Tadahiro Taniguchi, Kai Furusawa, Hailong Liu, Yusuke Tanaka, Kazuhito Takenaka, and Takashi Bando, Determining Utterance Timing of a Driving Agent with Double Articulation Analyzer, IEEE Transactions on Intelligent Transportation Systems (2015)
Application to a large-scale driving corpus
The DAA was applied to NUDrive (Nagoya University database [Takeda+]).
We explored the possibility of applying the DAA to a large-scale driving corpus (NDD).
Takashi BANDO, Kazuhito TAKENAKA, Masataka MORI, Tadahiro TANIGUCHI, Chiyomi MIYAJIMA, and Kazuya TAKEDA, Symbolization approach for large-scale driving corpus, IBIS symposium (in Japanese) (2014)
[Figure: bi-clustering result using IRM; axes: driver and environment ID, driving word. Further plots: the number of driving words and the frequency of driving words.]
Lane Change Extraction
Masataka Mori, Kazuhito Takenaka, Takashi Bando, Tadahiro Taniguchi, Chiyomi Miyajima, and Kazuya Takeda, Automatic Lane Change Extraction based on Temporal Patterns of Symbolized Driving Behavioral Data, 2015 IEEE Intelligent Vehicles Symposium (IV'15), .(2015)
Using segment and topic information
Integrating driving behavior and traffic context through signal symbolization [Yamazaki+ 16]
Risk level of lane change is predicted by using driving words and driving topics.
Co-occurrence chunks are newly introduced.
Yamazaki, Suguru, et al. "Integrating driving behavior and traffic context through signal symbolization." Intelligent Vehicles Symposium (IV), (2016).
Driving Word2vec: Distributed Semantic Vector Representation for Symbolized Naturalistic Driving Data [Fuchida+ 16]
Natural language:
• Natural language has a certain syntactic structure.
• Word2Vec can extract semantic similarity and relationships between words from a large-scale corpus.
Driving behavior:
• Does driving behavior have a certain syntactic structure?
• Can Word2Vec extract semantic similarity between words from a large-scale NDD corpus?
Mikolov, T., Corrado, G., Chen, K., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. Proceedings of the International Conference on Learning Representations (ICLR 2013)
Yusuke Fuchida, Tadahiro Taniguchi, Toshiaki Takano, Takuma Mori, Kazuhito Takenaka, Takashi Bando, Driving Word2vec: Distributed Semantic Vector Representation for Symbolized Naturalistic Driving Data, IEEE Intelligent Vehicles Symposium (IV) (2016)
We evaluated the similarity between DW2V and Drive Topics.
[Figure: similarity between DW2V and Drive Topics; compared conditions: random sampling and temporal neighbors.]
The driving word most similar to a given driving word in the DW2V space was also close to it in the Drive Topic space.
A significant correlation was found between the distances between pairs of data points in DW2V and in Drive Topic.
We concluded that the hypothesis was supported by the experimental result of DW2V.
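Two ingredients of this analysis can be sketched with the standard library alone: the (center, context) pairs that a skip-gram model is trained on, and the cosine similarity used to compare the resulting vectors. The window size, the toy trip, and the function names are assumptions for illustration; training the embedding itself is not shown.

```python
import math

def skipgram_pairs(sequence, window=2):
    """(center, context) pairs as used to train a skip-gram model."""
    pairs = []
    for i, center in enumerate(sequence):
        lo, hi = max(0, i - window), min(len(sequence), i + window + 1)
        pairs.extend((center, sequence[j]) for j in range(lo, hi) if j != i)
    return pairs

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

trip = ["dw4", "dw10", "dw1", "dw4"]   # toy driving-word sequence
pairs = skipgram_pairs(trip, window=1)
```

Driving words that tend to appear in similar neighborhoods receive similar vectors, which is the sense in which DW2V distances can be compared with Drive Topic distances.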
Future applications and research topics
NDD → double articulation analysis → symbolized NDD → DW2V, drive topics
• Video summarization
• Scene retrieval
• Prediction of driving behavior
• Anomaly detection
• Automatic description generation
• Statistical analysis of a variety of drivers
Update in the algorithm for the double articulation analyzer: Nonparametric Bayesian DAA (NPB-DAA) [Taniguchi+ 16]
(1) Conventional DAA
(2) Nonparametric Bayesian DAA (NPB-DAA)
Tadahiro Taniguchi, Shogo Nagasaka, Ryo Nakashima, Nonparametric Bayesian Double Articulation Analyzer for Direct Language Acquisition from Continuous Speech Signals, IEEE Transactions on Cognitive and Developmental Systems.(2016)
Hierarchical Dirichlet process hidden language model (HDP-HLM) [Taniguchi+ 16]
[Figure: graphical model of the HDP-HLM. A language model (word bigram; γ_LM, α_LM, β_LM, π_LM,i for i = 1, ..., ∞) generates the latent words z_1, ..., z_S (super-state sequence). A word model (letter bigram; γ_WM, α_WM, β_WM, π_WM,j for j = 1, ..., ∞) generates the latent letters l_s1, ..., l_sL of each word w_i. Each latent letter has a duration D_s1, ..., D_sL, and the acoustic model generates the observations y_1, ..., y_T. Other symbols in the figure: ω_j, θ_j, G, H, x_1, ..., x_T.]
○ Performance of DAA is better.
× Computationally very expensive.
Open challenges
• What is the ground truth of chunks (segments) of driving behavior data?
• How can we deal with drivers' and vehicles' characteristics in a data-driven manner?
• How can we transfer knowledge learned from one driver, vehicle, and environment to another (i.e., transfer learning)?
• Inventing an effective application of symbolized NDD.
• Reducing the computational cost of NPB-DAA and applying it to NDD.
• Integrating front camera images and GPS information into symbolization.
Towards efficient utilization of NDD
Feature extraction from naturalistic driving behavior data
• Even a fully unsupervised learning method, like the DAA, depends on (hand-crafted) feature vectors.
• Driving behavior data recorded in different cars differ.
• Which sensor information should we feed into an analysis method (a machine learning method)?
Automatic feature extraction method
Using deep sparse autoencoder for extracting feature representation from driving behavior data [Liu+ 14-16]
[Figure: a deep sparse autoencoder maps driving behavior data to a low-dimensional feature representation.]
HaiLong Liu, Tadahiro Taniguchi, Toshiaki Takano, Yusuke Tanaka, Kazuhito Takenaka and Takashi Bando, Visualization of Driving Behavior Using Deep Sparse Autoencoder, 2014 IEEE Intelligent Vehicles Symposium (IV'14). (2014)
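What makes an autoencoder "sparse" can be illustrated with one common sparsity term: a KL-divergence penalty that pushes the mean activation of each hidden unit towards a small target value. Whether the cited work uses exactly this penalty is not stated in these slides, so the formulation, the target value, and the function name below are assumptions.

```python
import math

def kl_sparsity_penalty(mean_activations, target=0.05):
    """Sum of KL(target || rho_hat) over hidden units, for activations in (0, 1)."""
    penalty = 0.0
    for rho_hat in mean_activations:
        penalty += target * math.log(target / rho_hat)
        penalty += (1 - target) * math.log((1 - target) / (1 - rho_hat))
    return penalty

on_target = kl_sparsity_penalty([0.05, 0.05])   # units already at the target
too_active = kl_sparsity_penalty([0.5, 0.4])    # units that fire too often
```

During training such a term is added to the reconstruction error, so that only a few hidden units respond strongly to any given input.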
Visualization of Driving Behavior with Color Representation Using Deep Sparse Autoencoder
[Figure: driving color map; the extracted features are mapped into the RGB color space.]
Changes in driving behavior caused by environmental differences were represented by differences in color.
HaiLong Liu, Tadahiro Taniguchi, Toshiaki Takano, Yusuke Tanaka, Kazuhito Takenaka and Takashi Bando, Visualization of Driving Behavior Using Deep Sparse Autoencoder, 2014 IEEE Intelligent Vehicles Symposium (IV'14). (2014)
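The mapping from features to colors can be sketched as follows. The example assumes that the extracted representation has exactly three dimensions and that each dimension is min-max scaled to one 8-bit color channel; the scaling rule, the toy values, and the function name are assumptions for illustration.

```python
def features_to_rgb(features):
    """Min-max scale each of three feature dimensions to an 8-bit color channel."""
    columns = list(zip(*features))
    lows = [min(col) for col in columns]
    # A constant column would give a zero span; fall back to 1.0 to avoid division by zero.
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        tuple(round(255 * (value - low) / span)
              for value, low, span in zip(row, lows, spans))
        for row in features
    ]

# Three time steps of hypothetical 3-D hidden features
toy_features = [(0.0, 1.0, -1.0), (0.5, 1.0, 0.0), (1.0, 3.0, 1.0)]
colors = features_to_rgb(toy_features)
```

Plotting one color per time step along the driven route would then give a color map in which similar driving behavior appears in similar colors.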
Essential Feature Extraction of Driving Behavior
We applied a deep sparse autoencoder (DSAE) to driving behavior data to extract "essential" features [Liu '15].
[Figure: the distances calculated using CCA.]
It was shown that the DSAE can filter out redundant information and extract consistent information.
HaiLong Liu, Tadahiro Taniguchi, Yusuke Tanaka, Kazuhito Takenaka and Takashi Bando, Essential Feature Extraction of Driving Behavior Using a Deep Learning Method, 2015 IEEE Intelligent Vehicles Symposium (IV'15) (2015). Best student paper
Repairing defective driving behavior data [Liu+ 16]
HaiLong Liu, Tadahiro Taniguchi, Kazuhito Takenaka, Yuusuke Tanaka, and Takashi Bando, Reducing the Negative Effect of Defective Data on Driving Behavior Segmentation Via a Deep Sparse Autoencoder, IEEE 5th Global Conference on Consumer Electronics, .(2016) IEEE GCCE 2016 Outstanding Paper Award
Conclusion (wrap-up)
• We introduced our fully unsupervised learning-based approach to the segmentation and symbolization of NDD.
• Double articulation analysis for driving behavior data was explained.
• Applications of the DAA, e.g., segmentation, video summarization, prediction, and information retrieval, were introduced.
• Deep learning for feature extraction was explained.
To analyze naturalistic driving behavior data in a fully data-driven manner, there is still much room to explore!
Information
email: [email protected]
Visit http://www.tanichu.com/
Facebook, Twitter: @tanichu
[GitHub] NPB-DAA: https://github.com/EmergentSystemLabStudent/NPB_DAA
We are looking for collaborators! We can share our code if you want. We are calling for a postdoc and PhD candidates.
Acknowledgement
Special thanks
• Ritsumeikan University: S. Nagasaka, H. Liu, K. Furusawa, Y. Fuchida, R. Nakashima, T. Sugihara
• DENSO Co.: K. Takenaka, K. Hitomi, Y. Tanaka, H. Misawa, M. Mori
• DENSO International America: T. Bando
• Nagoya University: K. Takeda, C. Miyajima
• Okayama Pref. University: N. Iwahashi