![Page 1: Visual Scene Understanding (CS 598) Derek Hoiem Course Number: 46411 Instructor: Derek Hoiem Room: Siebel Center 1109 Class Time: Tuesday and Thursday](https://reader035.vdocuments.net/reader035/viewer/2022062423/56649ed85503460f94be6d2a/html5/thumbnails/1.jpg)
Visual Scene Understanding (CS 598)
Derek Hoiem
Course Number: 46411
Instructor: Derek Hoiem
Room: Siebel Center 1109
Class Time: Tuesday and Thursday 11:00am – 12:15pm
Office Hours: Tuesday and Thursday 12:15-1pm; by appointment
Contact: [email protected], Siebel 3312
Today
• Introductions
• Overview of logistics
• Overview of class material
Vision: What is it good for?
Biological (Humans)
(numbered list 1–10, left blank)
Technological (Computers)
(numbered list 1–10, left blank)
Note: Unfortunately, these got erased when my computer crashed
Course Logistics
Class Content Overview
• Tutorials and Perspectives
• Paper reading
  I) Spatial Inference
  II) Objects
  III) Actions
  IV) Context and Integration
Visual Scene Understanding
Visual scene understanding is the ability to infer general principles and current situations from imagery in a way that helps achieve goals.
I. Spatial Inference
Getting Around
Spatial Inference: applications
• Household Robots
• Automated Vehicles
• Graphics Applications
• Predict object size/position
Spatial Inference: open questions
• How do we represent space?
– Surface orientations, depth maps, voxels?
• How do we infer it from available sensory data (image, stereo, motion, laser range finder)?
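The representations named above are related: a depth map can be back-projected into 3D points, which can then be binned into voxels. The following is a minimal sketch of that conversion, not a method from the course. It assumes an ideal pinhole camera; the function names and the parameters `focal`, `cx`, `cy`, and `voxel_size` are hypothetical and chosen for illustration.

```python
import math
from typing import List, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def depth_to_points(depth: List[List[float]], focal: float,
                    cx: float, cy: float) -> List[Point]:
    """Back-project a depth map into 3D points (assumed pinhole camera).

    depth[v][u] is the distance along the optical axis at pixel (u, v).
    focal is the focal length in pixels; (cx, cy) is the principal point.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                # Assumption: non-positive depth marks a missing measurement.
                continue
            x = (u - cx) * z / focal
            y = (v - cy) * z / focal
            points.append((x, y, z))
    return points


def points_to_voxels(points: List[Point], voxel_size: float) -> Set[Voxel]:
    """Return the set of occupied voxel indices on a regular grid."""
    return {
        (math.floor(x / voxel_size),
         math.floor(y / voxel_size),
         math.floor(z / voxel_size))
        for x, y, z in points
    }


if __name__ == "__main__":
    toy_depth = [[2.0, 2.0], [0.0, 4.0]]  # one missing pixel
    pts = depth_to_points(toy_depth, focal=2.0, cx=0.5, cy=0.5)
    print(pts)
    print(points_to_voxels(pts, voxel_size=1.0))
```

The depth map is compact but tied to one viewpoint, while the voxel set is viewpoint-independent at the cost of resolution set by `voxel_size`. That trade-off is one reason the choice of representation remains an open question.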
II. Objects
Finding Things and Observing Them
• Image classification: Are there any dogs? (Photo credit: iansand – flickr.com)
• Object localization: Where are the dog(s)?
• Verification: Is this a dog?
• Description: Furry, small, nice, side view
• Identification: My friend Sally?
Recognizing Stuff
SKY
SAND
WATER
Object Recognition: applications
• Photo Search
• Security
• Robots
Object Recognition: open questions
• How many examples does it take to learn one category well?
• How many examples does it take to learn 100 categories well?
• How do these answers depend on the level of supervision?
• Can recognition be solved with simple methods and massive amounts of data?
• How can we quickly recognize an object?
• How can we scale up to deal with thousands of categories?
III. Actions
Taking Action
[Saxena et al. 2008]
Recognizing Actions
KTH Dataset
Figure from Laptev et al. 2008
Recognizing Actions
Figure from Laptev et al. 2008
Reading Emotions
Photo credit: Comstock
Actions: applications
• Security
• Video Search
Actions: open questions
• How are actions defined?
• Does it make sense to categorize them?
– If not, how do we recognize them?
• What are good visual representations for inferring actions?
• How can we recognize activities?
IV. Context and Integration
[Hoiem et al. 2008]
Context and Integration
[Hoiem et al. 2008]
• Objects + scene categories → better detection
• Movement + objects → action/activity recognition
• Space + objects → navigation
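One simple way to picture how scene categories could improve detection is to treat the scene as a prior over object presence. The sketch below is an illustrative assumption, not the method of Hoiem et al. 2008: it takes a detector probability that implicitly assumes a dataset-wide prior and swaps in a scene-specific prior by working in odds. The function name and all three parameters are hypothetical.

```python
def rescore_with_scene(detector_prob: float,
                       prior_in_scene: float,
                       prior_overall: float) -> float:
    """Re-weight a detector probability using a scene-specific object prior.

    Assumption: detector_prob was calibrated under prior_overall, and the
    scene changes only the prior, not the appearance likelihood.
    All inputs must lie strictly between 0 and 1.
    """
    for p in (detector_prob, prior_in_scene, prior_overall):
        if not 0.0 < p < 1.0:
            raise ValueError("probabilities must be strictly between 0 and 1")

    def odds(p: float) -> float:
        return p / (1.0 - p)

    # Divide out the overall prior odds, then multiply in the scene prior odds.
    new_odds = odds(detector_prob) * odds(prior_in_scene) / odds(prior_overall)
    return new_odds / (1.0 + new_odds)


if __name__ == "__main__":
    # Hypothetical numbers: an ambiguous "car" detection, seen in a street
    # scene (cars common) versus a kitchen scene (cars rare).
    print(rescore_with_scene(0.5, prior_in_scene=0.5, prior_overall=0.1))
    print(rescore_with_scene(0.5, prior_in_scene=0.01, prior_overall=0.1))
```

Under these assumptions the same ambiguous detection becomes more credible in a scene where the object is common and less credible where it is rare.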
Context and Integration: applications
Everything that vision is good for
Context and Integration: open questions
• Should context be explicit (e.g., “cars drive on the road”) or implicit (feature-based)?
• How do we model and learn the interactions between different processes and scene characteristics?
• How do we deal with the growing complexity as more and more pieces are put together?
General Problems in Computer Vision
• Better understanding of limitations and their sources
– Need new experimental paradigms
• Improve generalization
– Aim to generalize across datasets, categories, and tasks
– Work on knowledge sharing and transfer
• Vision as a way of learning about the world
– Integration into AI
– Systems that acquire knowledge over time
Successes of Computer Vision
• Point matching (e.g. 2d3)
– Tracking
– Structure from motion
– Stitching
• Product inspection
• Multiview 3d reconstruction
• Face recognition and modeling
• Object recognition on pre-2000 datasets
• Interactive segmentation (ongoing)
To Do
• Register on bulletin board
• Post comments on Thursday's reading (due tomorrow)
• Look over schedule and decide which days to present (due next Tues)
• Start thinking about projects
– Let me know if you want a specific pairing (due Tues)
Questions?
Goals
• Make you a better researcher (esp. in vision)
– More knowledge
– Better critical thinking skills
– Improved communication skills
– Improved research skills
Grades
• Participation: 25%
– Posting
– Class discussion
• Presentation: 25%
• Projects: 50%
– Proposal, progress report, final paper, and oral
Policies
• Attendance required (see syllabus)
• Give credit where due
• No formal prerequisites
• Everything needs to be on time
Reading
• Read well
• Post comments to bulletin board at least 24 hours before class
Presentations
• Presenter
– Everyone does two
– Good quality coverage of topic (40 min)
– See syllabus for guidelines
– Sign up by next Tuesday (at latest)
– TBAs are your choice (decide at least 4 weeks in advance)
• Demonstrator
– If all days are taken, pair up
– One person’s job will be to demonstrate some aspect of the algorithm (e.g., where it succeeds and fails) by running it on many examples
– May require implementation
• Note taker
Projects
• Timeline
– Proposal: Feb 12 (3 ½ weeks!)
– Progress report: Mar 19
– Presentation: paper May 5, oral later
• Progress report
• Presentation
– Paper
– Oral
• In pairs
– Can choose partner or be randomly paired
• Suggestions on web
• Potentially will lead to publication (e.g. NIPS)