Event processing under uncertainty
Tutorial
Alexander Artikis [1], Opher Etzion [2], Zohar Feldman [2], & Fabiana Fournier [2] [3]
DEBS 2012
July 16, 2012
[1] NCSR Demokritos, Greece
[2] IBM Haifa Research Lab, Israel
[3] We acknowledge the help of Jason Filippou and Anastasios Skarlatidis
Event uncertainty from a petroleum producer point of view
Outline
Speaker | Time | Topic
Opher Etzion | 30 minutes | Part I: Introduction and illustrative example
Opher Etzion | 15 minutes | Part II: Uncertainty representation
Opher Etzion | 45 minutes | Part III: Extending event processing to handle uncertainty
Break
Alexander Artikis | 80 minutes | Part IV: AI-based models for event recognition under uncertainty
Opher Etzion | 10 minutes | Part V: Open issues and summary
Part I: Introduction and illustrative example
Event processing under uncertainty tutorial
By 2015, 80% of all available data will be uncertain
[Chart: Global Data Volume in Exabytes and Aggregate Uncertainty %, 2005-2015. Multiple sources: IDC, Cisco]
Data quality solutions exist for enterprise data like customer, product, and address data, but
this is only a fraction of the total enterprise data.
By 2015 the number of networked devices will be double the entire global population. All
sensor data has uncertainty.
The total number of social media accounts exceeds the entire global
population. This data is highly uncertain in both its expression and content.
The big data perspective
Big data is one of the three technology trends at the leading edge a CEO cannot afford to overlook in 2012 [Gartner #228708, January 2012]
Dimensions of big data
Velocity
(data in motion)
Volume
(data processed in RT)
Variety
(data in many forms )
Veracity
(data in doubt)
The big data perspective in focus
Big data is one of the three technology trends at the leading edge a CEO cannot afford to overlook in 2012 [Gartner #228708, January 2012]
Dimensions of big data
Velocity
(data in motion)
Volume
(data processed in RT)
Variety
(data in many forms)
Veracity
(data in doubt)
Uncertainty in event processing
State-of-the-art systems assume that data satisfies the “closed world assumption”, being complete and precise as a result of a cleansing process before the data is utilized.
This tutorial presents common types of uncertainty and different ways of handling uncertainty in event processing
The problem
Processing data is deterministic
Online data is not leveraged for immediate operational decisions
Mishandling of uncertainty may result in undesirable outcomes
In real applications events may be uncertain or have imprecise content for various reasons (missing data, inaccurate/noisy input; e.g. data from sensors or social media)
Often, in real time applications cleansing the data is not feasible due to time constraints
Representative sources of uncertainty
Uncertain input data / events
Sources of uncertainty: source malfunction, human error, malicious source, source inaccuracy, sampling or approximation, projection of temporal anomalies, propagation of uncertainty
Examples: a faulty thermometer, a fake tweet, a sensor disrupter, a wrong hourly sales summary, visual data, a rumor, a wrong trend, inference based on an uncertain value
Event processing
[Diagram: event sources emit events into a runtime engine; at build time, an authoring tool is used to define rules/patterns; at run time, the engine performs situation detection over the events, and the detected situations trigger actions.]
Types of uncertainty in event processing
[The same event processing diagram, annotated with where uncertainty enters:]
incomplete event streams
insufficient event dictionary
erroneous event recognition
inconsistent event annotation
imprecise event patterns
Uncertainty can reside in the event input, in the composite event pattern, or in both.
Illustrative example – Surveillance system for crime detection
A visual surveillance system provides data to an event-driven application.
The goal of the event processing application is to detect and alert in real time possible occurrences of crimes (e.g. a drug deal) based on the surveillance system and citizens' reports.
Real-time alerts are posted to human security officers:
1. Whenever the same person is observed participating in a potential crime scene more than 5 times in a certain day
2. Identification of locations in which more than 10 suspicious detections were observed in three consecutive days
3. Report of the three most suspicious locations in the last month
Some terminology
Event producer / consumer
Event type
Event stream
Event Processing Agent (EPA)
Event Channel (EC)
Event Processing Network (EPN)
Context (temporal, spatial and segmentation)
Illustrative example – Event Processing Network (EPN)
[EPN diagram]
Event producers: Surveillance video, Citizens; reference data: Criminals data store
EPAs: Crime observation, Crime report matching, Mega suspect detection, Split mega suspect, Suspicious location detection, Most frequent locations
Derived events (channels): Suspicious observation, Matched crime, Mega suspect, Known criminal, Discovered criminal, Suspicious location, Top locations
Consumers and actions: Police, Special security forces; Alert police, Alert special security forces, Alert patrol forces, Report stat. to police dept.
Legend: EPA - Event Processing Agent; Producer/consumer; Channel
Crime observation (Observation filter)
Assertion: Crime indication = 'true'
Context: Always
Input: Observation (Crime indication, ...)
Output: Suspicious observation (Crime indication, Identity, ...)
Uncertainty: wrong identification, missing events, noisy pictures
Mega Suspect detection
Pattern: Threshold
Assertion: Count(Suspicious observation) > 5
Context: Daily per suspect
Input: Suspicious observation (Crime indication, Identity, ...)
Output: Mega suspect (Suspect, ...)
Uncertainty: wrong identification, missing events, wrong threshold
Split Mega Suspect
Pattern: Split
Emit Known criminal when exists(criminal-record); emit Discovered criminal otherwise
Context: Always
Input: Mega suspect (Suspect, ...)
Outputs: Known criminal (Picture, ...), Discovered criminal (Picture, ...)
Uncertainty: wrong identification, missing events, wrong threshold
Crime report matching
Pattern: Sequence (Suspicious observation [override], Crime report)
Context: Crime type per location
Inputs: Suspicious observation (Crime indication, Identity, ...), Crime report (Crime indication, Identity, ...)
Output: Matched crime (Crime indication, Identity, ...)
Uncertainty: missed events, wrong sequence, inaccurate location, inaccurate crime type
Suspicious location detection
Pattern: Threshold
Assertion: Count(Suspicious observation) > 10
Context: Location, sliding 3-day interval
Input: Suspicious observation (Crime indication, Identity, ...)
Output: Suspicious location (Crime, Identity, location, ...)
Uncertainty: missed events, wrong timing, inaccurate location
Most frequent locations
Pattern: Top-K
Order by: count(Suspicious location)
Context: Month
Input: Suspicious location (Crime, Identity, location, ...)
Output: Top locations (Top locations list, ...)
Uncertainty: derived from uncertainty in Suspicious location events
Part II: Uncertainty representation
Event processing under uncertainty tutorial
Occurrence uncertainty: Uncertainty whether an event has occurred
The event header has an attribute with label "certainty" with values in the range [0,1]
In raw events: given by the source (e.g. sensor accuracy, level of confidence)
In derived events: calculated as part of the event derivation
ExampleEvent type: Crime observation
Occurrence time: 7/7/12, 12:23
Certainty: 0.73
…
Temporal uncertainty: Uncertainty when an event has occurred
The event header has an attribute with the label "occurrence time" which may designate a timestamp, an interval, or a distribution over an interval
In raw events: given by the source
In derived events: calculated as part of the event derivation
ExampleEvent type: Crime observation
Occurrence time: U(7/7/12, 12:00, 7/7/12 13:00)
Certainty: 0.9
…
The interval semantics
Occurrence time type: interval
Occurrence time type: timestamp, instance type: interval
Occurrence time type: timestamp, instance type: interval + distribution
Example: U(7/7/12, 12:00, 7/7/12 13:00)
1. The event happens during the entire interval
2. The event happens at some unknown time-point within the interval
3. The event happens at some time-point within an interval and the distribution of probabilities is known
Spatial uncertainty: uncertainty where an event has occurred
The event header has an attribute with the label "location" which may designate a point, a poly-line, an area, or a distribution of points within a poly-line or area. Locations can be specified either in coordinates or in labels (e.g. addresses)
In raw events: given by the source
In derived events: calculated as part of the event derivation
ExampleEvent type: Crime observation
Occurrence time: U(7/7/12, 12:00, 7/7/12 13:00)
Certainty: 0.9
Location: near 30 Market Street, Philadelphia
…
The spatial semantics
A poly-line may designate a road, e.g. M2 in UK
An area may designate a bounded geographical area, e.g. the city of Princeton.
Like temporal interval – the designation of a poly-line or area may be interpreted as: the event happens in the entire area, the event happens at unknown point within an area, the event happens within the area with some distribution of points
Examples:
Location type: area, Location := Princeton
Location type: poly-line, Location := M2
Location type: point, Location := U(center = city hall Princeton, radius = 2 km)
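These occurrence, temporal and spatial attributes can be carried together in the event header. A minimal Python sketch of such a header; the class and field names are illustrative only, not part of any particular event processing engine:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class UncertainEventHeader:
    """Illustrative event header carrying the uncertainty attributes above."""
    event_type: str
    certainty: float                      # occurrence uncertainty, value in [0, 1]
    # Temporal uncertainty: a (start, end) interval; equal endpoints mean a timestamp.
    occurrence_time: Tuple[float, float]
    time_distribution: str = "uniform"    # distribution over the interval, e.g. uniform
    # Spatial uncertainty: a point plus an uncertainty radius (0 = exact point).
    location: Optional[Tuple[float, float]] = None
    location_radius_km: float = 0.0

# Example mirroring the 'Crime observation' event of the slides
obs = UncertainEventHeader(
    event_type="Crime observation",
    certainty=0.9,
    occurrence_time=(12.0, 13.0),        # 7/7/12, 12:00-13:00, as hours of day
    location=(39.9526, -75.1652),        # near 30 Market Street, Philadelphia
    location_radius_km=0.5,
)
print(obs)
```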
Uncertainty about the event content – attribute values (meta data)
Meta-data definition:attribute values may beprobabilistic
Uncertainty about the event content – attribute value example
Instance:“Crime indication” has explicit distribution
Uncertainty in situation detection
Threshold
Assertion: Count (Suspicious observation)> 5
Context: Daily per suspect
Recall the “Mega suspect detection”
The "Mega suspect detection" can have some probability; furthermore, there can be variations (due to possibly missing events) when count = 5, count = 4, etc. Thus it may become a collection of assertions with probabilities:
Threshold
Assertion: Count(Suspicious observation) > 5 with probability 0.9; = 5 with probability 0.45; = 4 with probability 0.2
Context: Daily per suspect
Part III: Extending event processing to handle uncertainty
Event processing under uncertainty tutorial
Uncertainty handling
Two main handling methods:
Uncertainty propagation: the uncertainty of input events is propagated to the derived events
Uncertainty flattening: uncertain values are replaced with deterministic equivalents; events may be ignored.
Traditional event processing needs to be enhanced to account for uncertain events
Topics covered:
1. Expressions*
2. Filtering
3. Split
4. Context assignment
5. Pattern matching examples:
a. Top-K
b. Sequence
* Note that transformations and aggregations are covered by expressions.
Expressions (1/4)
Expressions are used extensively in event-processing in
Derivation: assigning values to derived events
Assertions: filter and pattern detection
Expressions comprise a combination of
Operators
Arithmetic functions, e.g. sum, min
Logical operators, e.g. >, =
Textual operators, e.g. concatenation
Operands (terms) referencing
particular event attributes
general constants
Expressions (2/4)
Example: Summing the number of suspicious observations in two locations
location1.num-observations + location2.num-observations
Deterministic case: 12 + 6 = 18
Stochastic case: the two operands are distributions, and the result is the distribution of their sum [histograms in the original slide]
Expressions (3/4)
The 'uncertainty propagation' approach: operator overloading
Explicit, e.g. Normal(1,1) + Normal(3,1) = Normal(4,2)
Numerical convolution (general distributions, probabilistic dependencies)
Monte-Carlo methods: generate samples, evaluate the result and aggregate
The 'uncertainty flattening' approach: replacing the original expression using a predefined policy
Some examples:
Expectation: X+Y → expectation(X+Y)
Confidence value: X+Y → percentile(X+Y, 0.9)
X > 5 → 'true' if Prob{X > 5} > 0.9; 'false' otherwise
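A small Python sketch of these two approaches on a toy example; the discrete distributions below are made up for illustration:

```python
import random
random.seed(0)

# Illustrative distributions of the number of suspicious observations
# at two locations, given as value -> probability maps.
loc1 = {11: 0.2, 12: 0.5, 13: 0.3}
loc2 = {4: 0.1, 5: 0.3, 6: 0.4, 7: 0.2}

def sample(dist):
    """Draw one value from a discrete distribution."""
    r, acc = random.random(), 0.0
    for value, p in dist.items():
        acc += p
        if r <= acc:
            return value
    return value  # guard against floating point rounding

# Uncertainty propagation via Monte Carlo: the sum is itself a distribution.
N = 100_000
samples = [sample(loc1) + sample(loc2) for _ in range(N)]
sum_dist = {v: samples.count(v) / N for v in sorted(set(samples))}
print("propagated distribution of the sum:", sum_dist)

# Uncertainty flattening: replace the stochastic expression by a single number.
expectation = (sum(v * p for v, p in loc1.items())
               + sum(v * p for v, p in loc2.items()))
print("flattened (expectation):", expectation)
```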
Expressions (4/4)
Probabilistic operators:
expectation(X) - X a stochastic term; the expectation of the stochastic term X
variance(X) - X a stochastic term; the variance of the stochastic term X
pdf(X, x) - X a stochastic term, x a value; the probability of X taking the value x
cdf(X, x) - X a stochastic term, x a numeric value; the probability that X takes a value not larger than x
percentile(X, α) - X a stochastic term, α a numeric value; the smallest number z such that cdf(X, z) ≥ α
frequent(X) - X a stochastic term; a value x that maximizes pdf(X, ·)
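A sketch of these operators for discrete distributions represented as value-to-probability dictionaries; this is an illustration only, not any engine's actual API:

```python
def expectation(dist):
    return sum(v * p for v, p in dist.items())

def variance(dist):
    mu = expectation(dist)
    return sum(p * (v - mu) ** 2 for v, p in dist.items())

def pdf(dist, x):
    return dist.get(x, 0.0)

def cdf(dist, x):
    return sum(p for v, p in dist.items() if v <= x)

def percentile(dist, alpha):
    # smallest z with cdf(dist, z) >= alpha
    acc = 0.0
    for v in sorted(dist):
        acc += dist[v]
        if acc >= alpha:
            return v
    return max(dist)

def frequent(dist):
    # a value maximizing the pdf (the mode)
    return max(dist, key=dist.get)

crime_indication = {'true': 0.7, 'false': 0.3}
count = {4: 0.2, 5: 0.45, 6: 0.35}
print(expectation(count), percentile(count, 0.9), frequent(crime_indication))
```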
Filtering (1/3)
Should the ‘Observation’ event instance pass the filter?
Crime observation
Assertion: crime-indication = ‘true’Context: Always
Certainty 0.8 Crime-Indication [‘true’, 0.7;
‘false’, 0.3]…........
Observation
Filtering (2/3)
The ‘uncertainty propagation’ approach
Crime observation
Assertion: crime-indication = ‘true’Context: Always
Certainty 0.8 Crime-Indication [‘true’, 0.7;
‘false’, 0.3]…........
Observation
Certainty 0.8x0.7
…........
Suspicious observation
observation.certainty * Prob{assertion}
Filtering (3/3)
The ‘uncertainty flattening’ approach
Crime observation
Assertion: crime-indication = ‘true’Context: Always
Certainty 0.8 Crime-Indication [‘true’, 0.7;
‘false’, 0.3]…........
Observation
Certainty 1
…........
Suspicious observation
observation.certainty * Prob{assertion} > 0.5
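A small sketch contrasting the two filter policies on this example; the function names are illustrative:

```python
def filter_propagate(event_certainty, assertion_prob):
    """Uncertainty propagation: the derived event carries the combined certainty."""
    return {"passes": True, "certainty": event_certainty * assertion_prob}

def filter_flatten(event_certainty, assertion_prob, threshold=0.5):
    """Uncertainty flattening: emit a certain event only if the combined
    certainty exceeds a predefined threshold; otherwise drop the event."""
    if event_certainty * assertion_prob > threshold:
        return {"passes": True, "certainty": 1.0}
    return {"passes": False}

# Observation: certainty 0.8, Prob{crime-indication = 'true'} = 0.7
print(filter_propagate(0.8, 0.7))   # Suspicious observation with certainty 0.56
print(filter_flatten(0.8, 0.7))     # 0.56 > 0.5, so it passes with certainty 1
```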
Split (1/3)
Split mega-suspect
Pattern: Split; Context: Always
Input Mega suspect: Certainty 0.75; Id ['John Doe1', 0.3; NA, 0.7]
Output Known criminal: when Id is not null (a criminal record exists)
Output Discovered criminal: when Id is null
Split (2/3)
The ‘uncertainty propagation’ approach
Split mega-suspect
Pattern: SplitContext: Always
......Certainty 0.75…..Id [‘John Doe1’, 0.3;
NA, 0.7]
Mega suspect
......Certainty 0.75*0.3…..Id ‘John Doe1’
Known criminal
......Certainty 0.75*0.7
…..Id NA
Discovered criminal
Split (3/3)
The ‘uncertainty flattening’ approach
Split mega-suspect
Pattern: SplitContext: Always
......Certainty 0.75…..Id [‘John Doe1’, 0.3;
NA, 0.7]
Mega suspect
......Certainty 0.75
…..Id NA
Discovered criminal
Id → frequent(Id)
Context assignment – temporal, Example 1 (1/3)
Should the ‘Suspicious observation’ event be assigned to the context segment ‘July 16, 2012, 10AM-11AM’ ?
Mega suspect detection
Context segment: ‘John Doe’, ‘July 16, 2012, 10AM-11AM’
Certainty 0.8Occurrence time Uni(9:45AM,10:05AM)Id ‘John Doe’
…........
Suspicious Observation
Context assignment – temporal, Example 1 (2/3)
The 'uncertainty propagation' approach
Mega suspect detection
Context segment: ‘John Doe’, ‘July 16, 2012, 10AM-11AM’
Suspicious Observation assigned to the segment with Certainty 0.2; Occurrence time Uni(9:45AM,10:05AM); Id 'John Doe'
Certainty = Prob{occurrence time ∈ [10AM,11AM]} × observation.certainty
Context assignment – temporal, Example 1 (3/3)
The ‘uncertainty flattening’ approach
Mega suspect detection
Context segment: ‘John Doe’, ‘July 16, 2012, 10AM-11AM’
Certainty 0.2Occurrence time Uni(9:45AM,10:05AM)Id ‘John Doe’
…........
Suspicious Observation
Prob{occurrence time ∊ [10AM,11AM]} > 0.5
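For a uniform occurrence-time interval, the probability used above can be computed directly; a small sketch (times in minutes since midnight):

```python
def prob_in_segment(uni_start, uni_end, seg_start, seg_end):
    """Probability that a Uniform(uni_start, uni_end) occurrence time
    falls inside the context segment [seg_start, seg_end]."""
    overlap = max(0.0, min(uni_end, seg_end) - max(uni_start, seg_start))
    return overlap / (uni_end - uni_start)

# Occurrence time ~ Uni(9:45AM, 10:05AM), context segment 10AM-11AM
p = prob_in_segment(9 * 60 + 45, 10 * 60 + 5, 10 * 60, 11 * 60)
print(p)          # 0.25
print(p * 0.8)    # propagated certainty: 0.2, as on the slide
```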
Context assignment – temporal, Example 2
Should the uncertain 'Suspicious observation' event initiate a new context segment? Participate in an existing context segment?
Suspicious observation detection
Context: 'John Doe'; State: 1 instance
Suspicious Observation instance: Certainty 0.3; Occurrence time Uni(9:45AM,10:05AM); Id 'John Doe'
Suspicious observation detection
Context: 'John Doe'
Context assignment – Segmentation-oriented (1/3)
Which of the context segmentations should the ‘Suspicious observation’ event be assigned to?
Mega suspect detection
Context segment: ‘John Doe1’, ‘July 16, 2012, 10AM-11AM’
Certainty 0.8Occurrence time Uni(9:45AM,10:05AM)Id [‘John Doe1’, 0.3;
‘John Doe2’, 0.7]
Suspicious Observation
Mega suspect detection
Context segment: ‘John Doe2’, ‘July 16, 2012, 10AM-11AM’
Context assignment – Segmentation-oriented (2/3)
The ‘uncertainty propagation’ approach
Mega suspect detection
Context segment: ‘John Doe1’, ‘July 16, 2012, 10AM-11AM’
Certainty 0.24Occurrence time Uni(9:45AM,10:05AM)Id [‘John Doe1’, 0.3;
‘John Doe2’, 0.7]
Suspicious Observation
Mega suspect detection
Context segment: ‘John Doe2’, ‘July 16, 2012, 10AM-11AM’
Certainty 0.56Occurrence time Uni(9:45AM,10:05AM)Id [‘John Doe1’, 0.3;
‘John Doe2’, 0.7]
Suspicious Observation
observation.certainty * Prob{‘John Doe1’}
observation.certainty * Prob{‘John Doe2’}
Context assignment – Segmentation-oriented (3/3)
The ‘uncertainty flattening’ approach
Mega suspect detection
Context segment: ‘John Doe1’, ‘July 16, 2012, 10AM-11AM’
Certainty 0.8Occurrence time Uni(9:45AM,10:05AM)Id [‘John Doe1’, 0.3;
‘John Doe2’, 0.7]
Suspicious Observation
Mega suspect detection
Context segment: ‘John Doe2’, ‘July 16, 2012, 10AM-11AM’
Associate event to context segmentation frequent(Id)
Pattern matching: Top-K (1/3)
What is the location with the most crime indications?
Most frequent locations
Pattern: Top-K (K=1); Order by: count(Suspicious location); Context: Month
......Crime…..IdLocation ‘A’
Suspicious location
......Crime…..IdLocation ‘A’
Suspicious location
......Certainty 0.3…..IdLocation ‘downtown’
Suspicious location
......Crime…..IdLocation ‘A’
Suspicious location
......Crime…..IdLocation ‘A’
Suspicious location
......Certainty 0.75…..IdLocation ‘suburbs’
Suspicious location
......Certainty 0.75…..IdLocation ‘suburbs’
Suspicious location
Count('downtown') and Count('suburbs') are distributions [histograms in the original slide], so the resulting order is itself uncertain.
(Pattern matching is also known as event recognition.)
Pattern matching: Top-K (2/3)
The ‘uncertainty propagation’ approach
Most frequent locations
Pattern: Top-K (K=1); Order by: count(Suspicious location); Context: Month
......Crime…..IdLocation ‘A’
Suspicious location
......Crime…..IdLocation ‘A’
Suspicious location
......Certainty 0.3…..IdLocation ‘downtown’
Suspicious location
......Crime…..IdLocation ‘A’
Suspicious location
......Crime…..IdLocation ‘A’
Suspicious location
......Certainty 0.75…..IdLocation ‘suburbs’
Suspicious location
......Certainty 0.75…..IdLocation ‘suburbs’
Suspicious location
Count1 and Count2 are distributions [histograms in the original slide].
Order = ('downtown', 'suburbs') w.p. Prob{count1 > count2}
Order = ('suburbs', 'downtown') w.p. Prob{count2 > count1}
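Prob{count1 > count2} can be estimated, for example, by Monte Carlo sampling from the two count distributions; a sketch with made-up distributions:

```python
import random
random.seed(1)

# Illustrative count distributions for the two locations
count_downtown = {11: 0.3, 12: 0.4, 13: 0.3}
count_suburbs  = {10: 0.2, 11: 0.3, 12: 0.3, 14: 0.2}

def sample(dist, n):
    """Draw n values from a discrete value -> probability distribution."""
    return random.choices(list(dist.keys()), weights=list(dist.values()), k=n)

N = 100_000
wins = sum(d > s for d, s in zip(sample(count_downtown, N), sample(count_suburbs, N)))
print("Prob{count_downtown > count_suburbs} ~", wins / N)
```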
Pattern matching: Top-K (3/3)
The ‘uncertainty flattening’ approach
Most frequent locations
Pattern: Top-K (K=1); Order by: f(count(Suspicious location)); Context: Month
......Crime…..IdLocation ‘A’
Suspicious location
......Crime…..IdLocation ‘downtown’
Suspicious location
......Certainty 0.3…..Idlist ‘downtown’
Top suspicious location list
......Crime…..IdLocation ‘A’
Suspicious location
......Crime…..IdLocation ‘A’
Suspicious location
......Certainty 0.75…..IdLocation ‘suburbs’
Suspicious location
......Certainty 0.75…..IdLocation ‘suburbs’
Suspicious location
The count distributions [histograms in the original slide] are flattened by f before ordering, where f is e.g. expectation, cdf, ...
Pattern matching: Sequence (1/3)
Should the following event instances be matched as a sequence?
Crime report matching
Pattern: Sequence [Suspicious observation, Crime report]
Context: Location, Crime type
Certainty 0.8Occurrence time Uni(9:45AM,10:05AM)Id ‘John Doe’
…........
Suspicious Observation
Certainty 0.9Occurrence time 10:02AMId NA
…........
Crime report
Pattern matching: Sequence (2/3)
The 'uncertainty propagation' approach
Crime report matching
Pattern: Sequence; Context: Location, Crime type
Certainty 0.8Occurrence time Uni(9:45AM,10:05AM)Id ‘John Doe’
…........
Suspicious Observation
Certainty 0.9Occurrence time 10:02AMId NA
…........
Crime report
Certainty 0.612Occurrence time Uni(9:45AM,10:02AM)Id ‘John Doe’
…........
Matched crime
obs.certainty * crime.certainty * Prob{obs.time<crime.time}
obs.time | obs.time<crime.time
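For a uniform observation time and a fixed report time, the sequence probability and the propagated certainty can be computed directly; a small sketch (times in minutes since midnight):

```python
def prob_before(uni_start, uni_end, fixed_time):
    """Prob{obs.time < crime.time} when obs.time ~ Uniform(uni_start, uni_end)
    and crime.time is a fixed report timestamp."""
    if fixed_time <= uni_start:
        return 0.0
    if fixed_time >= uni_end:
        return 1.0
    return (fixed_time - uni_start) / (uni_end - uni_start)

obs_certainty, crime_certainty = 0.8, 0.9
p_seq = prob_before(9 * 60 + 45, 10 * 60 + 5, 10 * 60 + 2)   # = 0.85
print(p_seq)
print(obs_certainty * crime_certainty * p_seq)                # 0.612, as on the slide
```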
Pattern matching: Sequence (3/3)
The 'uncertainty flattening' approach
Crime report matching
Pattern: SequenceContext: Location, Crime type
Certainty 0.8Occurrence time Uni(9:45AM,10:05AM)Id ‘John Doe’
…........
Suspicious Observation
Certainty 0.9Occurrence time 10:02AMId NA
…........
Crime report
Certainty 0.72Occurrence time 9:55AMId ‘John Doe’
…........
Matched crime
Occurrence time → percentile(occurrence time, 0.5)
Part IV: Event Recognition under Uncertainty in Artificial Intelligence
Event processing under uncertainty tutorial
Event Recognition under Uncertainty in Artificial Intelligence
AI-based systems already deal with various types of uncertainty:
I Erroneous input data detection.
I Imperfect composite event definitions.
Overview:
I Logic programming.
I Markov logic networks.
Human Recognition from Video: Bilattice-based Reasoning
Aim:
I Recognise humans given input data from part-based classifiers operating on video.
Uncertainty:
I Erroneous input data detection.
I Imperfect composite event definitions.
Approach:
I Extend logic programming by incorporating a bilattice:
I Consider both supportive and contradicting information about a given human hypothesis.
I For every human hypothesis, provide a list of justifications (proofs) that support or contradict the hypothesis.
Human Recognition from Video
Human Recognition from Video
I Evidence FOR hypothesis "Entity 1 is a human":
I Head visible.
I Torso visible.
I Scene is consistent at 1's position.
Human Recognition from Video
I Evidence AGAINST hypothesis "Entity 1 is a human":
I Legs occluded.
I However, the missing legs can be explained by the occlusion at the image boundary.
I If occluded body parts can be explained by a rule, the hypothesis is further supported.
Human Recognition from Video
I Evidence FOR hypothesis "Entity A is a human":
I A is on the ground plane.
I Scene is consistent at A's position.
Human Recognition from Video
I Evidence AGAINST hypothesis "Entity A is a human":
I No head detected.
I No torso detected.
I No legs detected.
I Evidence against the hypothesis is overwhelming, so the system infers that A is unlikely to be a human.
Human Recognition from Video
I Recognising humans given three different sources of information, supporting or contradicting human hypotheses:
human(X, Y, S) ← head(X, Y, S)
human(X, Y, S) ← torso(X, Y, S)
¬human(X, Y, S) ← ¬scene consistent(X, Y, S)
I Supporting information sources are provided by head and torso detectors.
I Contradicting information is provided by the violation of geometric constraints (scene consistent).
Human Recognition from Video:Uncertainty Representation
I We are able to encode information for and against:
I Input data (output of part-based detectors)
I Rules (typically they are less than perfect)
I Hypotheses (inferrable atoms)
I We do this by defining a truth assignment function:
ϕ : L → B
where L is a declarative language and B is a bilattice.
I Some points in the bilattice:
I <0, 0> agnostic
I <1, 0> absolute certainty about truth
I <0, 1> absolute certainty about falsity
I <1, 1> complete contradiction
Human Recognition from Video
I A perfect part-based detector would detect body parts and geometric constraints with absolute certainty.
head(25, 95, 0.9)
torso(25, 95, 0.9)
¬scene consistent(25, 95, 0.9)
Human Recognition from Video: Input Data Uncertainty
I However, in realistic applications, every piece of low-level information is detected with a degree of certainty, usually normalized to a probability.
ϕ(head(25, 95, 0.9)) = <0.90, 0.10>
ϕ(torso(25, 95, 0.9)) = <0.70, 0.30>
ϕ(¬scene consistent(25, 95, 0.9)) = <0.80, 0.20>
I For simple detectors, if the probability of the detection is p, the following holds:
ϕ(detection) = <p, 1−p>
Human Recognition from Video
I Traditional logic programming rules are binary:
human(X, Y, S) ← head(X, Y, S)
human(X, Y, S) ← torso(X, Y, S)
¬human(X, Y, S) ← ¬scene consistent(X, Y, S)
Human Recognition from Video: Rule Uncertainty
I But in a realistic setting, we'd like to have a measure of their reliability:
ϕ(human(X, Y, S) ← head(X, Y, S)) = <0.40, 0.60>
ϕ(human(X, Y, S) ← torso(X, Y, S)) = <0.30, 0.70>
ϕ(¬human(X, Y, S) ← ¬scene consistent(X, Y, S)) = <0.90, 0.10>
I Rules are specified manually.
I There exists an elementary weight-learning technique: the value x in <x, y> is the fraction of the times that the head of a rule is true when the body is true.
Human Recognition from Video: Uncertainty Handling
I It is now possible to compute the belief for and against a given human hypothesis, by aggregating over all the possible rules and input data.
I This is done by computing the closure over ϕ of multiple sources of information.
I E.g. if we have a knowledge base S with only one rule and one fact that together entail human
S = { human(X, Y, S) ← head(X, Y, S),  head(25, 95, 0.9) }
the degree of belief for and against the human hypothesis would be:
cl(ϕ)(human(X, Y, S)) = ∧_{p ∈ S} cl(ϕ)(p)
= cl(ϕ)(human(X, Y, S) ← head(X, Y, S)) ∧ cl(ϕ)(head(25, 95, 0.9))
= ... = <x_h, y_h>
Human Recognition from Video: Uncertainty Handling
I Because we have a variety of information sources from which a hypothesis (or its negation) can be entailed, the inference procedure needs to take into account all possible information sources.
I The operator ⊕ is used for this.
I The final form of the equation that computes the closure over the truth assignment of a hypothesis q is the following:
cl(ϕ)(q) = [ ⊕_{S ⊨ q} ∧_{p ∈ S} cl(ϕ)(p) ]  ⊕  ¬[ ⊕_{S ⊨ ¬q} ∧_{p ∈ S} cl(ϕ)(p) ]
(the first term aggregates supporting information; the second negates contradicting information)
Human Recognition from Video: Uncertainty Handling
I Given this uncertain input
ϕ(head(25, 95, 0.9)) = <0.90, 0.10>
ϕ(torso(25, 95, 0.9)) = <0.70, 0.30>
ϕ(¬scene consistent(25, 95, 0.9)) = <0.80, 0.20>
I and these uncertain rules
ϕ(human(X, Y, S) ← head(X, Y, S)) = <0.40, 0.60>
ϕ(human(X, Y, S) ← torso(X, Y, S)) = <0.30, 0.70>
ϕ(¬human(X, Y, S) ← ¬scene consistent(X, Y, S)) = <0.90, 0.10>
I we would like to calculate our degrees of belief for and against the hypothesis human(25, 95, 0.9).
Human Recognition from Video: Uncertainty Handling
cl(ϕ)(human(25, 95, 0.9)) =
[<0.4, 0.6> ∧ <0.9, 0.1>] ⊕ [<0.3, 0.7> ∧ <0.7, 0.3>] ⊕ ¬[<0.9, 0.1> ∧ <0.8, 0.2>]
= <0.36, 0> ⊕ <0.21, 0> ⊕ ¬<0.72, 0>
= <0.4944, 0> ⊕ <0, 0.72>
= <0.4944, 0.72>
I Evidence against the hypothesis exceeds evidence for the hypothesis.
I Justifications for the hypothesis:
I ϕ(head(25, 95, 0.9)) = <0.90, 0.10>
I ϕ(torso(25, 95, 0.9)) = <0.70, 0.30>
I Justifications against the hypothesis:
I ϕ(¬scene consistent(25, 95, 0.9)) = <0.80, 0.20>
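A minimal Python sketch reproducing the numbers above, assuming, as the worked example suggests, that ∧ multiplies the truth components within a proof and that ⊕ combines proofs with a probabilistic sum; this is a simplification of the full bilattice operators:

```python
def conj(values):
    """∧ over one proof: multiply the truth components of the rule and its facts."""
    t = 1.0
    for v in values:
        t *= v
    return t

def osum(a, b):
    """⊕ on one component: probabilistic sum of two degrees of belief."""
    return a + b - a * b

# Truth components of the uncertain rules and facts from the slides
for_proofs = [
    conj([0.4, 0.9]),   # human <- head,  head detected with <0.90, 0.10>
    conj([0.3, 0.7]),   # human <- torso, torso detected with <0.70, 0.30>
]
against_proofs = [
    conj([0.9, 0.8]),   # ¬human <- ¬scene consistent, detected with <0.80, 0.20>
]

belief_for = 0.0
for p in for_proofs:
    belief_for = osum(belief_for, p)

belief_against = 0.0
for p in against_proofs:
    belief_against = osum(belief_against, p)

print(round(belief_for, 4), round(belief_against, 4))   # 0.4944 0.72
```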
Human Recognition from Video: Summary
Uncertainty:
I Erroneous input data detection.
I Imperfect composite event definitions.
Features:
I Consider both supportive and contradicting information about a given hypothesis.
I For every hypothesis, provide a list of justifications (proofs) that support or contradict the hypothesis.
Note:
I Human detection is not a typical event recognition problem.
Public Space Surveillance: VidMAP
Aim:
I Continuously monitor an area and report suspicious activity.
Uncertainty:
I Erroneous input data detection.
Approach:
I Multi-threaded layered system combining computer vision and logic programming.
VidMAP: Architecture
High-Level Module: standard logic programming reasoning
Mid-Level Module: uncertainty elimination
Low-Level Module: background subtraction, tracking and appearance matching on video content
VidMAP: Architecture
VidMAP: Uncertainty Elimination
I Filter out noisy observations (false alarms) by first determining whether the observation corresponds to people or objects that have been persistently tracked.
I A tracked object is considered to be a human if:
I It is 'tall'.
I It has exhibited some movement in the past.
I A tracked object is considered to be a 'package' if:
I It does not move on its own.
I At some point in time, it was attached to a human.
VidMAP: Event Recognition
The following rule defines theft:
theft(H, Obj, T) ←
  human(H), package(Obj),
  possess(H, Obj, T),
  not belongs(Obj, H, T)
I A human possesses an object if he carries it.
I An object belongs to a human if he was seen possessing it before anyone else.
VidMAP: Event Recognition
The following rule defines entry violation:
entry violation(H) ←
  human(H),
  appear(H, scene, T1),
  enter(H, building door, T2),
  not priviliged to enter(H, T1, T2)
I 'building door' and 'scene' correspond to video areas that can be hard-coded by the user at system start-up.
I An individual is privileged to enter a building if he swipes his ID card in the 'building door' area of 'scene', or if he is escorted into the building by a friend who swipes his ID card.
VidMAP: Summary
Uncertainty:
I Erroneous input data detection.
Features:
I Uncertainty elimination.
I Intuitive composite event definitions that are easy for domain experts to understand.
Note:
I Crude temporal representation.
Probabilistic Logic Programming: Event Calculus
Aim:
I General-purpose event recognition system.
Uncertainty:
I Erroneous input data detection.
Approach:
I Express the Event Calculus in ProbLog.
The Event Calculus (EC)
I A Logic Programming language for representing and reasoning about events and their effects.
I Key components:
I event (typically instantaneous)
I fluent: a property that may have different values at different points in time.
I Built-in representation of inertia:
I F = V holds at a particular time-point if F = V has been initiated by an event at some earlier time-point, and not terminated by another event in the meantime.
ProbLog
I A Probabilistic Logic Programming language.
I Allows for independent 'probabilistic facts': prob::fact.
I prob indicates the probability that fact is part of a possible world.
I Rules are written as in classic Prolog.
I The probability of a query q imposed on a ProbLog database (the success probability) is computed by the following formula:
Ps(q) = P( ∨_{e ∈ Proofs(q)} ∧_{f_i ∈ e} f_i )
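A toy Python sketch of this success probability for a query with two proofs over independent probabilistic facts, computed by enumerating possible worlds; the facts and proofs are made up for illustration:

```python
from itertools import product

# Independent probabilistic facts: name -> probability of being true
facts = {"head": 0.9, "torso": 0.7}

# Two proofs of a query q (e.g. human): {head} or {torso}
proofs = [{"head"}, {"torso"}]

def success_probability(facts, proofs):
    """P( OR over proofs of ( AND over the facts used by that proof ) )."""
    names = list(facts)
    total = 0.0
    for world in product([True, False], repeat=len(names)):
        truth = dict(zip(names, world))
        # probability of this possible world under the independence assumption
        p_world = 1.0
        for name, is_true in truth.items():
            p_world *= facts[name] if is_true else 1 - facts[name]
        # the query succeeds in this world if some proof has all its facts true
        if any(all(truth[f] for f in proof) for proof in proofs):
            total += p_world
    return total

print(success_probability(facts, proofs))   # 1 - 0.1 * 0.3 = 0.97
```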
ProbLog-EC: Input & Output
Input (time-point 340):
0.45 :: inactive(id0)
0.80 :: p(id0) = (20.88, −11.90)
0.55 :: appear(id0)
0.15 :: walking(id2)
0.80 :: p(id2) = (25.88, −19.80)
0.25 :: active(id1)
0.66 :: p(id1) = (20.88, −11.90)
0.70 :: walking(id3)
0.46 :: p(id3) = (24.78, −18.77)
...
Output (time-point 340):
0.41 :: leaving object(id1, id0)
0.55 :: moving(id2, id3)
ProbLog-EC: Event Recognition
[Figure: the probability of a composite event (CE) over time; successive initiations raise it (e.g. 0.2 → 0.5 → 1.0) and successive terminations lower it.]
Uncertainty Elimination vs Uncertainty Propagation
Crisp-EC (uncertainty elimination) vs ProbLog-EC (uncertainty propagation):
I Crisp-EC: input data with probability < 0.5 are discarded.
I ProbLog-EC: all input data are kept with their probabilities.
I ProbLog-EC: accept as recognised the composite events thathave probability > 0.5.
Crisp-EC: Uncertainty Elimination
[Figure: the number of SDE occurrences vs noise (Gamma distribution mean).]
Uncertainty Elimination vs Uncertainty Propagation
[Figure: F-measure vs Gamma distribution mean for the 'Meeting' composite event, comparing Crisp-EC and ProbLog-EC.]
meeting(Id1, Id2) = true initiated if
  active(Id1) happens,
  close(Id1, Id2) = true holds,
  person(Id2) = true holds,
  not running(Id2) happens
Uncertainty Elimination vs Uncertainty Propagation
[Figure: F-measure vs Gamma distribution mean for the 'Moving' composite event.]
moving(Id1, Id2) = true initiated iff
  walking(Id1) happens,
  walking(Id2) happens,
  close(Id1, Id2) = true holds,
  orientation(Id1) = O1 holds,
  orientation(Id2) = O2 holds,
  |O1 − O2| < threshold
Uncertainty Elimination vs Uncertainty Propagation
ProbLog-EC clearly outperforms Crisp-EC when:
I the environment is highly noisy (input data probability < 0.5), a realistic assumption in many domains,
I there are successive initiations that allow the composite event's probability to increase and eventually exceed the specified (0.5) threshold, and
I the number of probabilistic conjuncts in an initiation condition is limited.
ProbLog-EC: Summary
Uncertainty:
I Erroneous input data detection.
Features:
I Built-in rules for complex temporal representation.
Note:
I The independence assumption on input data is not always desirable.
Probabilistic Graphical Models
I E.g. Hidden Markov Models, Dynamic Bayesian Networks and Conditional Random Fields.
I Can naturally handle uncertainty:
I Erroneous input data detection.
I Imperfect composite event definitions.
I Limited representation capabilities.
I Composite events with complex relations create models with a prohibitively large and complex structure.
I Various extensions to reduce the complexity.
I Lack of a formal representation language.
I Difficult to incorporate background knowledge.
Markov Logic Networks (MLN)
I Unify first-order logic with graphical models.
I Compactly represent event relations.
I Handle uncertainty.
I Syntactically: weighted first-order logic formulas (Fi, wi).
I Semantically: (Fi, wi) represents a probability distribution over possible worlds.
P(world) ∝ exp( Σ (weights of formulas it satisfies) )
A world violating formulas becomes less probable, but not impossible!
Event Recognition using MLN
[Diagram: the CE knowledge base is grounded into a Markov Logic Network; given SDE input as evidence, probabilistic inference over the resulting Markov network yields the recognised CE, i.e. P(CE = True | SDE).]
Markov Logic: Representation
w1  suspect(Id) ⇐ criminal record(Id) ∨ citizen report(Id) ∨ holds Dangerous Object(Id)
w2  suspicious transaction(Id1) ⇐ suspect(Id1) ∧ mega suspect(Id2) ∧ deliver(Id1, Id2)
Markov Logic: Representation
I Weight: a real-valued number.
I Higher weight → stronger constraint.
I Hard constraints:
I Must be satisfied in all possible worlds.
I Infinite weight values.
I Background knowledge.
I Soft constraints:
I May not be satisfied in some possible worlds.
I Strong weight values: almost always true.
I Weak weight values: describe exceptions.
Markov Logic: Input Data Uncertainty
Sensors detect events with:
I Absolute certainty
I A degree of confidence
Example:
I holds Dangerous Object(Id) detected with probability 0.8.
I It can be represented with an auxiliary formula.
I The value of the weight of the formula is the log-odds of the probability.
1.38629  obs holds Dangerous Object(Id) ⇒ holds Dangerous Object(Id)
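For example, a detection probability of 0.8 corresponds to the weight ln(0.8 / (1 − 0.8)) = ln 4 ≈ 1.38629, the value used above.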
Markov Logic: Network Construction
I Formulas are translated into clausal form.
I Weights are divided equally among the clauses of a formula.
I Given a set of constants from the input data, ground all clauses.
I Ground predicates are Boolean nodes in the network.
I Each ground clause:
I Forms a clique in the network,
I and is associated with wi and a Boolean feature.
P(X = x) = (1/Z) exp( Σ_i wi ni(x) ),  where  Z = Σ_{x′} exp( Σ_i wi ni(x′) )
Markov Logic: Network Construction
w1  suspect(Id) ⇐ criminal record(Id) ∨ citizen report(Id) ∨ holds Dangerous Object(Id)
w2  suspicious transaction(Id1) ⇐ suspect(Id1) ∧ mega suspect(Id2) ∧ deliver(Id1, Id2)
(1/3)w1  ¬criminal record(Id) ∨ suspect(Id)
(1/3)w1  ¬citizen report(Id) ∨ suspect(Id)
(1/3)w1  ¬holds Dangerous Object(Id) ∨ suspect(Id)
w2  ¬suspect(Id1) ∨ ¬mega suspect(Id2) ∨ ¬deliver(Id1, Id2) ∨ suspicious transaction(Id1)
Markov Logic: Network Construction
For example, the clause:
w2  ¬suspect(Id1) ∨ ¬mega suspect(Id2) ∨ ¬deliver(Id1, Id2) ∨ suspicious transaction(Id1)
produces the following groundings:
w2  ¬suspect(alex) ∨ ¬mega suspect(alex) ∨ ¬deliver(alex, alex) ∨ suspicious transaction(alex)
w2  ¬suspect(alex) ∨ ¬mega suspect(nick) ∨ ¬deliver(alex, nick) ∨ suspicious transaction(alex)
w2  ¬suspect(nick) ∨ ¬mega suspect(alex) ∨ ¬deliver(nick, alex) ∨ suspicious transaction(nick)
w2  ¬suspect(nick) ∨ ¬mega suspect(nick) ∨ ¬deliver(nick, nick) ∨ suspicious transaction(nick)
Markov Logic: Network Construction
[Figure: the ground Markov network over the constants alex and nick, with Boolean nodes for the ground atoms of deliver, suspect, mega_suspect, citizen_report, holds_Dangerous_Object, criminal_record and suspicious_transaction.]
Markov Logic: World state discrimination
[The same ground Markov network, with two possible world states highlighted.]
State 1: P(X = x1) = (1/Z) exp( (1/3)w1·2 + (1/3)w1·2 + (1/3)w1·2 + w2·4 ) = (1/Z) e^(2w1 + 4w2)
State 2: P(X = x2) = (1/Z) exp( (1/3)w1·2 + (1/3)w1·2 + (1/3)w1·2 + w2·3 ) = (1/Z) e^(2w1 + 3w2)
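A quick numeric check of how the weights discriminate the two states; the weight values below are made up for illustration, and the normalisation is over these two states only:

```python
import math

w1, w2 = 1.0, 2.0                     # illustrative weight values

# Unnormalised weights of the two world states shown above
state1 = math.exp(2 * w1 + 4 * w2)    # the w2 clause is satisfied by all 4 groundings
state2 = math.exp(2 * w1 + 3 * w2)    # the w2 clause is satisfied by 3 groundings

Z = state1 + state2                   # normalising over just these two states
print(state1 / Z, state2 / Z)
print(state1 / state2, math.exp(w2))  # state 1 is e^{w2} times more probable
```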
Markov Logic: Inference
I Event recognition: querying about composite events.
I Large network with complex structure.
I Exact inference is infeasible.
I MLN combine logical and probabilistic inference methods.
I Given some evidence E = e, there are two types of inference:
1. Most Probable Explanation:
I argmax_q P(Q = q | E = e)
I e.g. Maximum Satisfiability solving.
2. Marginal:
I P(Q = true | E = e)
I e.g. Markov Chain Monte Carlo sampling.
Markov Logic: Machine Learning
Training dataset: input data annotated with composite events.
Input Data (time 101): happensAt(active(id1), 101), happensAt(walking(id2), 101), orientationMove(id1, id2, 101), ¬close(id1, id2, 24, 101)
Composite Events (time 101): holdsAt(meeting(id1, id2), 101), holdsAt(meeting(id2, id1), 101)
Input Data (time 200): happensAt(walking(id1), 200), happensAt(running(id2), 200), ¬orientationMove(id1, id2, 200), ¬close(id1, id2, 24, 200)
Composite Events (time 200): ¬holdsAt(meeting(id1, id2), 200), ¬holdsAt(meeting(id2, id1), 200)
...
Markov Logic: Machine Learning
I Structure learning:
I First-order logic formulas.
I Based on Inductive Logic Programming techniques.
I Weight estimation:
I Structure is known.
I Find weight values that maximise the likelihood function.
I Likelihood function: how well our model fits the data.
I Generative learning.
I Discriminative learning.
Markov Logic: Discriminative Weight Learning
In event recognition we know a priori:
I Evidence variables: input data, e.g. simple events
I Query variables: composite events
I Recognise composite events given the input data.
Conditional log-likelihood function:
log Pw(Q = q | E = e) = Σ_i wi ni(e, q) − log Ze
I Joint distribution of the query variables Q given the evidence variables E.
I Conditioning on evidence reduces the likely states.
I Inference takes place on a simpler model.
I Can exploit information from long-range dependencies.
MLN-based Approaches
Most of the approaches:
I Knowledge base of domain-dependent composite event definitions.
I Weighted definitions.
I Input uncertainty expressed using auxiliary formulas.
I Temporal constraints over successive time-points.
More expressive methods:
I Based on Allen’s interval logic.
I Event Calculus.
Event Calculus in MLN
Aim:
I General-purpose event recognition system.
Uncertainty:
I Imperfect composite event definitions.
Approach:
I Express the Event Calculus in Markov logic.
The Event Calculus
I A logic programming language for representing and reasoning about events and their effects.
I Translation to Markov Logic is therefore straightforward.
I Key components:
I event (typically instantaneous)
I fluent: a property that may have different values at different points in time.
I Built-in representation of inertia:
I F = V holds at a particular time-point if F = V has been initiated by an event at some earlier time-point, and not terminated by another event in the meantime.
Event Calculus
[Figure: probability of the composite event over time (0-20), with two initiations and one termination.]
Event Calculus in MLN
[The same probability timeline.]
Hard-constrained inertia rules:
2.3  CE initiatedAt T if [Conditions]
     ¬(CE holdsAt T) iff ¬(CE holdsAt T−1), ¬(CE initiatedAt T−1)
2.5  CE terminatedAt T if [Conditions]
     CE holdsAt T iff CE holdsAt T−1, ¬(CE terminatedAt T−1)
Event Calculus in MLN
[The same probability timeline.]
Soft-constrained initiation inertia rules:
2.3  CE initiatedAt T if [Conditions]
2.8  ¬(CE holdsAt T) iff ¬(CE holdsAt T−1), ¬(CE initiatedAt T−1)
2.5  CE terminatedAt T if [Conditions]
     CE holdsAt T iff CE holdsAt T−1, ¬(CE terminatedAt T−1)
Event Calculus in MLN
[The same probability timeline.]
Soft-constrained termination inertia rules:
2.3  CE initiatedAt T if [Conditions]
     ¬(CE holdsAt T) iff ¬(CE holdsAt T−1), ¬(CE initiatedAt T−1)
2.5  CE terminatedAt T if [Conditions]
2.8  CE holdsAt T iff CE holdsAt T−1, ¬(CE terminatedAt T−1)
Event Calculus in MLN
[The same probability timeline.]
Soft-constrained termination inertia rules:
2.3  CE initiatedAt T if [Conditions]
     ¬(CE holdsAt T) iff ¬(CE holdsAt T−1), ¬(CE initiatedAt T−1)
2.5  CE terminatedAt T if [Conditions]
0.8  CE holdsAt T iff CE holdsAt T−1, ¬(CE terminatedAt T−1)
Uncertainty Elimination vs Uncertainty Propagation
I Crisp-EC (uncertainty elimination) is compared to various MLN-EC dialects (uncertainty propagation):
I MLN-EC_HI: hard-constrained inertia.
I MLN-EC_SI(h): soft-constrained termination inertia rules.
I MLN-EC_SI: soft-constrained inertia.
Uncertainty Elimination vs Uncertainty Propagation
[Figure: two plots of F-measure vs recognition threshold (0-1), comparing EC_crisp, MLN-EC_HI, MLN-EC_SI(h) and MLN-EC_SI.]
Event Calculus in MLN
Uncertainty:
I Imperfect composite event definitions.
Features:
I Customisable inertia behaviour to meet the requirements of the application under consideration.
I Weight learning is effective.
Note:
I Lack of explicit representation of intervals.
Part V: Open issues and summary
Event processing under uncertainty tutorial
Open issues
Being a relatively new area, there are many open issues related to event uncertainty, both research and practical.
In this tutorial we highlighted a sample of such open issues.
Open issue: Assignment
How is the assignment of uncertainty values done?
Can people assign probabilities? Maybe fuzzy values?
Can probabilities be inferred statistically?
Open issue: real-time processing
Back to the "big data" applications: uncertainty is one dimension; scalability and real-time processing are other dimensions.
Handling uncertain events at scale and in real time requires novel algorithmic approaches.
Open issue: Forecasting
Applications employing derived events that are forecast to occur in the (relatively near) future have inherent uncertainty associated with this forecasting.
The forecasting model should support the various types of uncertainty (occurrence uncertainty, temporal uncertainty, location uncertainty) built into it.
This tutorial dealt with various aspects of event uncertainty: motivation, representation, AI-based techniques, and event processing handling techniques.
Uncertainty will gradually become part of event-based applications.