Human Activity Analysis By: Ryan Wendel

Post on 23-Jan-2016

Page 1: Human Activity Analysis

Human Activity Analysis

By: Ryan Wendel

Page 2: Human Activity Analysis

It is an ongoing area of research in which videos are analyzed frame by frame.

Most of the video recognition draws on 3-D graphics engines.

What is Human Activity Analysis?

Page 3: Human Activity Analysis

“HAA” stands for Human Activity Analysis. Its applications include:
◦ Surveillance systems
◦ Patient monitoring systems
◦ Human-computer interfaces

What HAA covers

Page 4: Human Activity Analysis

We are going to take a look at methodologies that have been developed for simple human actions and for high-level activities.

What we will cover

Page 5: Human Activity Analysis

Gestures
Actions
Interactions
Group activities

Basic Human Activities

Page 6: Human Activity Analysis

Basic movements of a person's body parts. For example:
◦ Raising an arm
◦ Lifting a leg

Gestures

Page 7: Human Activity Analysis

A single person's activities, which may consist of multiple gestures.

For example:
◦ Walking
◦ Waving
◦ Shaking the body

Actions

Page 8: Human Activity Analysis

Activities that involve two or more people or objects.

For example:
◦ Two people fighting

Interactions

Page 9: Human Activity Analysis

Activities performed by multiple people. For example:
◦ A group running
◦ A group walking
◦ A group fighting

Group Activities

Page 10: Human Activity Analysis

Can be separated into two categories:
◦ Single-layered approaches: recognize human activities directly from a video feed, frame by frame.
◦ Hierarchical approaches: describe high-level activities in terms of simpler activities.

Activity Recognition Methodologies

Page 11: Human Activity Analysis

The main objective is to analyze simple sequences of human movements.

Can be divided into two categories:
◦ Space-time approach: treats an input video as a 3-D volume.
◦ Sequential approach: interprets an input video as a sequence of observations.

Single-layered approaches
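
The two views of an input video described on this slide can be sketched in a few lines of Python. This is only an illustration, not a method from the slides: the tiny 2x2 frames, the helper names `as_volume` and `as_observations`, and the choice of mean brightness as the per-frame observation are all assumptions.

```python
# Hypothetical toy video: 3 frames, each a 2x2 grid of pixel intensities.
video = [
    [[0, 0], [0, 0]],
    [[0, 9], [0, 0]],
    [[0, 0], [0, 9]],
]

def as_volume(frames):
    """Space-time view: index the whole video as one 3-D volume V[(t, y, x)]."""
    return {(t, y, x): value
            for t, frame in enumerate(frames)
            for y, row in enumerate(frame)
            for x, value in enumerate(row)}

def as_observations(frames):
    """Sequential view: reduce each frame to one observation (mean intensity here)."""
    return [sum(sum(row) for row in frame) / sum(len(row) for row in frame)
            for frame in frames]

volume = as_volume(video)
observations = as_observations(video)
print(len(volume))       # number of voxels in the volume
print(observations)      # one value per frame, in time order
```

The space-time view keeps every pixel addressable by time and position, while the sequential view keeps only an ordered list of per-frame summaries.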

Page 12: Human Activity Analysis

Divided into three subsections based on the features used:
◦ Space-time volumes
◦ Space-time trajectories
◦ Space-time features

Space-time approach

Page 13: Human Activity Analysis

Captures human activities by analyzing volumes of a video, built up frame by frame.

Recognition is performed by measuring the similarity between two space-time volumes.

Space-Time Volume
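
One simple way to measure the similarity between two space-time volumes is voxel overlap. The sketch below assumes each volume is a stack of binary silhouette frames of identical size. The overlap measure (intersection over union) and the name `volume_overlap` are illustrative choices, not the specific measure used in the surveyed work.

```python
def volume_overlap(vol_a, vol_b):
    """Voxel overlap (intersection over union) of two binary space-time volumes.

    Each volume is a list of frames; each frame is a list of rows of 0/1 values.
    Both volumes are assumed to have identical dimensions.
    """
    inter = union = 0
    for frame_a, frame_b in zip(vol_a, vol_b):
        for row_a, row_b in zip(frame_a, frame_b):
            for a, b in zip(row_a, row_b):
                inter += a & b
                union += a | b
    return inter / union if union else 1.0

# Hypothetical silhouettes: a blob moving right; the clips differ in the last frame.
clip_1 = [[[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]]]
clip_2 = [[[1, 0, 0]], [[0, 1, 0]], [[0, 1, 0]]]

print(volume_overlap(clip_1, clip_1))
print(volume_overlap(clip_1, clip_2))
```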

Page 14: Human Activity Analysis
Page 15: Human Activity Analysis
Page 16: Human Activity Analysis

Uses stick-figure modeling to extract the joint positions of a person in each frame.

Space-Time Trajectories
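
Once joint positions are available for each frame, they can be regrouped into one trajectory per joint. The sketch below assumes the stick-figure model has already produced 2-D joint coordinates. The joint names, the coordinates, and the helpers `joint_trajectories` and `path_length` are hypothetical.

```python
import math

def joint_trajectories(frames):
    """Regroup per-frame joint positions into one trajectory (list of points) per joint."""
    trajectories = {}
    for joints in frames:
        for name, position in joints.items():
            trajectories.setdefault(name, []).append(position)
    return trajectories

def path_length(trajectory):
    """Total distance travelled along a trajectory."""
    return sum(math.dist(p, q) for p, q in zip(trajectory, trajectory[1:]))

# Hypothetical stick-figure output for three frames: the hand moves, the hip does not.
frames = [
    {"hand": (0.0, 0.0), "hip": (0.0, -1.0)},
    {"hand": (0.0, 3.0), "hip": (0.0, -1.0)},
    {"hand": (4.0, 3.0), "hip": (0.0, -1.0)},
]

tracks = joint_trajectories(frames)
for joint, trajectory in tracks.items():
    print(joint, path_length(trajectory))
```

A moving hand with a stationary hip is the kind of cue that separates a gesture such as raising an arm from whole-body motion such as walking.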

Page 17: Human Activity Analysis
Page 18: Human Activity Analysis

Does not extract features frame by frame.

Extracts features only where there is an appearance or shape change in the 3-D space-time volume.

Space-Time features
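
The idea of extracting features only where something changes can be illustrated with plain frame differencing. This is a deliberately simplified stand-in for real space-time interest-point detectors. The function name `interest_points`, the threshold, and the toy frames are assumptions.

```python
def interest_points(frames, threshold):
    """Return (t, y, x) locations where intensity changes by more than
    `threshold` between frame t-1 and frame t."""
    points = []
    for t in range(1, len(frames)):
        for y, (prev_row, row) in enumerate(zip(frames[t - 1], frames[t])):
            for x, (before, after) in enumerate(zip(prev_row, row)):
                if abs(after - before) > threshold:
                    points.append((t, y, x))
    return points

# Hypothetical 2x2 video: nothing changes until one pixel lights up in the last frame.
video = [
    [[0, 0], [0, 0]],
    [[0, 0], [0, 0]],
    [[0, 8], [0, 0]],
]

points = interest_points(video, threshold=5)
print(points)
```

Only the changed location is reported, so static frames contribute no features at all.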

Page 19: Human Activity Analysis
Page 20: Human Activity Analysis

Space-time volume
◦ Hard to differentiate between multiple people in the same scene.

Space-time trajectories
◦ 3-D body-part detection and tracking is still an unsolved problem, and it requires a strong low-level component that can estimate 3-D joint locations.

Space-time features
◦ Not suitable for modeling complex activities.

Disadvantages of Space-time approach

Page 21: Human Activity Analysis

Divided into two subsections:
◦ Exemplar-based
◦ State model-based

Sequential approach

Page 22: Human Activity Analysis

Review
◦ Sequential approach: interprets an input video as a sequence of observations.

Exemplar-based
◦ Represents human activities with a set of sample sequences of action executions.

Exemplar-based
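
Comparing an observed sequence against stored sample sequences needs a distance that tolerates differences in speed. Dynamic time warping (DTW) is one common choice and is used here as an assumption, since the slides do not name a matching method. The exemplar labels and the one-number-per-frame observations (for example, hand height) are hypothetical.

```python
def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two sequences of numbers."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(seq_a[i - 1] - seq_b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # seq_a advances alone
                                    cost[i][j - 1],      # seq_b advances alone
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def classify(observed, exemplars):
    """Return the label of the exemplar sequence closest to `observed`."""
    return min(exemplars, key=lambda label: dtw_distance(observed, exemplars[label]))

# Hypothetical exemplars: one sample sequence per activity.
exemplars = {
    "wave": [0, 1, 0, 1, 0],
    "raise_arm": [0, 1, 2, 3, 3],
}
observed = [0, 0, 1, 2, 2, 3]   # a slower execution of "raise_arm"

print(classify(observed, exemplars))
```

The observed sequence has a different length from the exemplar, yet it still matches, because DTW lets either sequence pause while the other advances.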

Page 23: Human Activity Analysis
Page 24: Human Activity Analysis

A sequential approach that represents a human activity as a model composed of a set of states.

State Model-Based
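
A hidden Markov model (HMM) is a typical state model. The sketch below uses the forward algorithm to compute how likely an observation sequence is under one activity's model. The two states, the observation symbols, and all probabilities are made-up values for illustration.

```python
def forward_likelihood(observations, start, transition, emission):
    """Probability of an observation sequence under a hidden Markov model,
    computed with the forward algorithm."""
    states = list(start)
    alpha = {s: start[s] * emission[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: emission[s][obs] * sum(alpha[p] * transition[p][s] for p in states)
            for s in states
        }
    return sum(alpha.values())

# Hypothetical model of "raising an arm": the arm starts down and may move up.
start = {"down": 1.0, "up": 0.0}
transition = {
    "down": {"down": 0.5, "up": 0.5},
    "up": {"down": 0.0, "up": 1.0},
}
emission = {
    "down": {"low": 1.0, "high": 0.0},
    "up": {"low": 0.0, "high": 1.0},
}

print(forward_likelihood(["low", "high"], start, transition, emission))
print(forward_likelihood(["high", "low"], start, transition, emission))
```

In a recognizer, one such model would be kept per activity, and an input sequence would be assigned to the model that gives it the highest likelihood.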

Page 25: Human Activity Analysis
Page 26: Human Activity Analysis

The exemplar-based approach is more flexible when comparing multiple sample sequences, whereas the state model-based approach handles probabilistic analysis of an activity better.

Exemplar vs State Model

Page 27: Human Activity Analysis

The sequential approach is able to handle and detect more complex activities, whereas the space-time approach handles simpler activities.

Both approaches operate on a sequence of images.

Space-time vs Sequential approach

Page 28: Human Activity Analysis

Allows the recognition of high-level activities based on the recognition results of other, simpler activities.

Advantages of the hierarchical approach:
◦ Can recognize high-level activities with a more in-depth structure
◦ Requires significantly less data to recognize an activity than the single-layered approach
◦ Makes it easier to incorporate human knowledge

Hierarchical Approaches

Page 29: Human Activity Analysis

Statistical approach
Syntactic approach
Description-based approach

Three main subgroups of Hierarchical approach

Page 30: Human Activity Analysis

Statistical approaches use state-based models to recognize activities.

By stacking multiple layers of state-based models, the separate models can be used to recognize activities with sequential structures.

Statistical approach

Page 31: Human Activity Analysis
Page 32: Human Activity Analysis

Human activities are modeled as a string of symbols, where each symbol is an atomic action.

Human activities are represented as a set of production rules that generate a string of atomic actions.

Syntactic approach
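
Production rules over action symbols can be checked with a small grammar recognizer. The sketch below is a minimal top-down recognizer for grammars without left recursion. The grammar itself (a "Fight" made of an approach followed by one or more punches) and every symbol name are invented for illustration.

```python
# Hypothetical grammar: keys are nonterminals; any other symbol is an atomic action.
GRAMMAR = {
    "Fight": [("Approach", "Blows")],
    "Approach": [("walk", "stop")],
    "Blows": [("punch", "Blows"), ("punch",)],
}

def parse_ends(symbol, tokens, start):
    """Return every index at which `symbol` can finish when it starts at `start`."""
    if symbol not in GRAMMAR:  # terminal: must match exactly one token
        matches = start < len(tokens) and tokens[start] == symbol
        return {start + 1} if matches else set()
    ends = set()
    for production in GRAMMAR[symbol]:
        positions = {start}
        for part in production:
            positions = {e for p in positions for e in parse_ends(part, tokens, p)}
        ends |= positions
    return ends

def recognizes(symbol, tokens):
    """True if the whole token string can be generated from `symbol`."""
    return len(tokens) in parse_ends(symbol, tokens, 0)

print(recognizes("Fight", ["walk", "stop", "punch", "punch"]))
print(recognizes("Fight", ["walk", "punch"]))
```

The tokens would come from a lower-level recognizer that labels each atomic action. The grammar then decides whether the string of labels forms the high-level activity.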

Page 33: Human Activity Analysis
Page 34: Human Activity Analysis

Recognizes human activities that have complex spatio-temporal structures
◦ Spatio-temporal structures are used to detect and recognize human actions

Uses context-free grammars (CFGs) to represent activities
◦ CFGs are used to recognize high-level activities
◦ The detection extracts space-time points and local periodic motions to obtain a sparse distribution of interest points in a video

Description-based approach

Page 35: Human Activity Analysis
Page 36: Human Activity Analysis

Probability theory
Fuzzy logic
Bayesian network:
◦ Used for recognition of an activity, based on a representation of the activity's temporal structure
◦ Uses a large network with over 10,000 nodes

Image Understanding (IU)

Page 37: Human Activity Analysis

A group of people marching
◦ The images are recognized as the overall motion of the entire group.

A group of people fighting
◦ Multiple videos are used to recognize the activity “a group is fighting”.

Group Activities
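
Treating a group as one overall motion can be sketched by averaging the per-frame displacement of everyone in it. The sketch assumes each person has already been tracked. The person ids, the positions, and the function name `group_motion` are hypothetical.

```python
def group_motion(tracks):
    """Average displacement per frame over all people in the group.

    `tracks` maps a person id to a list of (x, y) positions, one per frame.
    """
    dx = dy = steps = 0
    for positions in tracks.values():
        for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
            dx += x1 - x0
            dy += y1 - y0
            steps += 1
    return (dx / steps, dy / steps) if steps else (0.0, 0.0)

# Hypothetical tracks: two marchers moving right in step.
marchers = {
    "a": [(0, 0), (1, 0), (2, 0)],
    "b": [(0, 1), (1, 1), (2, 1)],
}

print(group_motion(marchers))
```

A marching group gives a strong, consistent average motion, whereas a fighting group would tend to give motions that cancel out, so this single vector is only a first cue.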

Page 38: Human Activity Analysis

Recognizing interactions between humans and objects requires multiple components.

Much human-object interaction work ignores the interplay between object recognition and motion estimation.

Object dependencies, motions, and human activities can also be factored in to determine the activity taking place.

Interactions between humans and Objects

Page 39: Human Activity Analysis
Page 40: Human Activity Analysis


Page 41: Human Activity Analysis

Questions?