learning to control
![Page 1: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/1.jpg)
Learning to control
D.A. Forsyth, UIUC
![Page 2: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/2.jpg)
Topics
• Scamper through basic reinforcement learning ideas
• Imitation learning
  • and its variants and problems
  • as structure learning
![Page 3: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/3.jpg)
First learned steering controller
![Page 4: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/4.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 5: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/5.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 6: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/6.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 7: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/7.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 8: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/8.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 9: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/9.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 10: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/10.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 11: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/11.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 12: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/12.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 13: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/13.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 14: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/14.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 15: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/15.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 16: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/16.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 17: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/17.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 18: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/18.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 19: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/19.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 20: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/20.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 21: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/21.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 22: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/22.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 23: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/23.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 24: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/24.jpg)
Fei-Fei+Johnson+Yeung 17
![Page 25: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/25.jpg)
Levine, ND
![Page 26: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/26.jpg)
Fragkiadaki, ND
![Page 27: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/27.jpg)
Fragkiadaki, ND
![Page 28: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/28.jpg)
Fragkiadaki, ND
![Page 29: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/29.jpg)
Fragkiadaki, ND
![Page 30: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/30.jpg)
Fragkiadaki, ND
![Page 31: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/31.jpg)
Fragkiadaki, ND
![Page 32: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/32.jpg)
Fragkiadaki, ND
![Page 33: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/33.jpg)
Fragkiadaki, ND
![Page 34: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/34.jpg)
Fragkiadaki, ND
![Page 35: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/35.jpg)
As you get further off the path, the probability of making an error grows, because the classifier
thinks this state is rare (it saw few such states in training)
Fragkiadaki, ND
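A rough worked bound makes the compounding concrete. This is a sketch under stated assumptions, not a result taken from the slides: suppose the learned policy errs with probability at most $\epsilon$ per step while it is still on the expert's path, and (worst case) every step after the first error costs 1. The symbols $\epsilon$, $T$ and $C$ are introduced here for illustration.

```latex
% T = horizon, \epsilon = per-step error rate on expert states,
% C = total cost of the learned policy over one episode.
\begin{align*}
\mathbb{E}[C]
  &\le \sum_{t=1}^{T} P(\text{first error at step } t)\,(T - t + 1) \\
  &\le \sum_{t=1}^{T} \epsilon\,(T - t + 1)
   = \epsilon\,\frac{T(T+1)}{2}
\end{align*}
```

So under these assumptions the expected cost can grow quadratically in the horizon $T$, rather than linearly as it would if errors did not push the learner into unfamiliar states.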
![Page 36: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/36.jpg)
Fragkiadaki, ND
![Page 37: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/37.jpg)
Fragkiadaki, ND
![Page 38: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/38.jpg)
Fragkiadaki, ND
![Page 39: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/39.jpg)
Fragkiadaki, ND
![Page 40: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/40.jpg)
Fragkiadaki, ND
![Page 41: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/41.jpg)
Fragkiadaki, ND
![Page 42: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/42.jpg)
Fragkiadaki, ND
![Page 43: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/43.jpg)
Notice you might not actually need a human here - if your states are
discretized, and you have enough data, you might get this by matching
Fragkiadaki, ND
![Page 44: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/44.jpg)
Fragkiadaki, ND
![Page 45: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/45.jpg)
Fragkiadaki, ND
![Page 46: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/46.jpg)
Traditional strategy
• Construct a parametric cost function $H(X, Y; \theta)$
• So that, for training $X^*$, $\arg\max_Y H(X^*, Y; \theta)$
  • is close to correct $Y^*$
• (see movies for some details on construction)
Fragkiadaki, ND
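One way to make the strategy above concrete is a structured perceptron, sketched below. It is not necessarily the construction the slides have in mind: the feature map `phi`, the toy sizes, the toy data, and the brute-force argmax are all assumptions chosen for illustration. Here $H(X, Y; \theta) = \theta \cdot \phi(X, Y)$, and $\theta$ is nudged whenever the argmax on a training $X^*$ differs from the correct $Y^*$.

```python
import itertools
import numpy as np

N_OBS, N_LAB = 3, 2  # toy sizes (assumptions for illustration)

def phi(x, y):
    """Hypothetical joint features: label/observation counts and label-pair counts."""
    unary = np.zeros((N_LAB, N_OBS))
    pair = np.zeros((N_LAB, N_LAB))
    for t, (xt, yt) in enumerate(zip(x, y)):
        unary[yt, xt] += 1
        if t > 0:
            pair[y[t - 1], yt] += 1
    return np.concatenate([unary.ravel(), pair.ravel()])

def H(x, y, theta):
    """Parametric score H(X, Y; theta), linear in theta."""
    return float(theta @ phi(x, y))

def predict(x, theta):
    """argmax_Y H(X, Y; theta) by brute-force enumeration (fine for tiny problems only)."""
    candidates = itertools.product(range(N_LAB), repeat=len(x))
    return max(candidates, key=lambda y: H(x, y, theta))

def train(data, epochs=10):
    """Structured perceptron: move theta toward phi(X*, Y*) and away from the wrong argmax."""
    theta = np.zeros(N_LAB * N_OBS + N_LAB * N_LAB)
    for _ in range(epochs):
        for x, y_star in data:
            y_hat = predict(x, theta)
            if y_hat != y_star:
                theta += phi(x, y_star) - phi(x, y_hat)
    return theta

# Toy training pairs (X*, Y*), made up for this sketch.
DATA = [((0, 1, 2), (0, 1, 1)), ((2, 0), (1, 0)), ((1, 1, 0), (1, 1, 0))]
theta = train(DATA)
```

For real sequence problems the brute-force `predict` would be replaced by dynamic programming over the label sequence; enumeration is used here only to keep the sketch short.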
![Page 47: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/47.jpg)
Fragkiadaki, ND
![Page 48: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/48.jpg)
HMM: Making scribal Latin searchable
• Goal: make the ink in a handwritten text searchable
• Issue: not a good idea to transcribe
• Strategy:
  • compute log P(ink|known sequence)
    • for a line
    • known sequence can be a regular expression
      • eg (character)^* mihi (character)^*
      • ex: check you can do this w/ DP
  • rank lines by this, report
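The DP in the exercise above can be sketched as follows. This is a simplified illustration under loud assumptions, not the system from the slides: each character is assumed to produce exactly one discrete observation, the "ink" is a string over a toy alphabet, the emission table `LOG_EMIT` is invented, and the score is the best log-likelihood over all letter strings matching (character)^* word (character)^*.

```python
import math

ALPHABET = "abhim"  # toy alphabet (assumption)

def make_log_emit(p_match=0.8):
    """Hypothetical emission model: log P(observation | letter)."""
    p_other = (1 - p_match) / (len(ALPHABET) - 1)
    return {c: {o: math.log(p_match if c == o else p_other) for o in ALPHABET}
            for c in ALPHABET}

LOG_EMIT = make_log_emit()

def best_log_likelihood(obs, word, log_emit, alphabet):
    """Max over letter strings matching (character)* word (character)* of
    log P(obs | letters), with one observation per letter. Assumes len(word) >= 1."""
    m = len(word)
    neg = -math.inf
    # score[j]: best value so far with j letters of `word` matched.
    # j == 0 is the leading wildcard, j == m means the word is complete.
    score = [neg] * (m + 1)
    score[0] = 0.0
    for o in obs:
        wild = max(log_emit[c][o] for c in alphabet)  # best unconstrained character
        new = [neg] * (m + 1)
        new[0] = score[0] + wild                      # stay in the leading wildcard
        for j in range(1, m + 1):
            new[j] = score[j - 1] + log_emit[word[j - 1]][o]  # emit word[j-1]
        new[m] = max(new[m], score[m] + wild)         # stay in the trailing wildcard
        score = new
    return score[m]

# Rank toy "lines" by their score for the query word, best first.
lines = ["abababab", "mihaabab", "abmihiab"]
ranked = sorted(lines, reverse=True,
                key=lambda s: best_log_likelihood(s, "mihi", LOG_EMIT, ALPHABET))
```

A real system would let each character span a variable number of ink columns and would add letter-transition probabilities; the automaton-state DP keeps the same shape.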
![Page 49: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/49.jpg)
HMM: Making scribal Latin searchable
• Goal: make the ink in a handwritten text searchable
• Issue: few examples of glyphs
  • hard to label
• Strategy:
  • doesn’t really matter
    • like a substitution cypher - letter frequencies are what’s important
  • AND you can grow the pool of examples:
    • when you see “interrogave?unt” you know it’s “interrogaverunt”
    • so you can get another glyph
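The pool-growing step above can be sketched as a lexicon lookup. The function name `resolve_unknown`, the `?` marker for an unread glyph, and the tiny lexicon are assumptions for illustration; the slides do not say how the lookup is actually done.

```python
import re

def resolve_unknown(partial, lexicon, unknown="?"):
    """If exactly one lexicon word fits the partial reading, return the
    (position, letter) pairs for the unread glyphs; otherwise return None."""
    pattern = re.compile(
        "".join("." if ch == unknown else re.escape(ch) for ch in partial)
    )
    matches = [w for w in lexicon if pattern.fullmatch(w)]
    if len(matches) != 1:
        return None  # ambiguous or no match: do not label the glyph
    word = matches[0]
    return [(i, word[i]) for i, ch in enumerate(partial) if ch == unknown]

# Hypothetical lexicon; a real one would be a large Latin word list.
LEXICON = ["interrogaverunt", "interrogavit", "mihi"]
new_labels = resolve_unknown("interrogave?unt", LEXICON)
```

Each returned pair gives a newly labelled glyph image (the ink at that position) that can be added to the training examples for that letter.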
![Page 50: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/50.jpg)
![Page 51: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/51.jpg)
![Page 52: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/52.jpg)
![Page 53: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/53.jpg)
Fragkiadaki, ND
![Page 54: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/54.jpg)
Fragkiadaki, ND
![Page 55: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/55.jpg)
Fragkiadaki, ND
![Page 56: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/56.jpg)
Fragkiadaki, ND
![Page 57: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/57.jpg)
Fragkiadaki, ND
![Page 58: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/58.jpg)
Fragkiadaki, ND
![Page 59: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/59.jpg)
Fragkiadaki, ND
![Page 60: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/60.jpg)
Fragkiadaki, ND
![Page 61: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/61.jpg)
Fragkiadaki, ND
![Page 62: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/62.jpg)
Fragkiadaki, ND
![Page 63: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/63.jpg)
Fragkiadaki, ND
![Page 64: Learning to control](https://reader033.vdocuments.net/reader033/viewer/2022061607/62a0096d49b11b046973e298/html5/thumbnails/64.jpg)
Fragkiadaki, ND