Post on 22-Dec-2015
CSc411 Artificial Intelligence
Chapter 5
STOCHASTIC METHODS
Contents
• The Elements of Counting
• Elements of Probability Theory
• Applications of the Stochastic Methodology
• Bayes’ Theorem
Application Areas

Diagnostic reasoning. In medical diagnosis, for example, there is not always an obvious cause/effect relationship between the set of symptoms presented by the patient and the causes of these symptoms. In fact, the same sets of symptoms often suggest multiple possible causes.

Natural language understanding. If a computer is to understand and use a human language, that computer must be able to characterize how humans themselves use that language. Words, expressions, and metaphors are learned, but also change and evolve as they are used over time.

Planning and scheduling. When an agent forms a plan, for example, a vacation trip by automobile, it is often the case that no deterministic sequence of operations is guaranteed to succeed. What happens if the car breaks down, if the car ferry is cancelled on a specific day, or if a hotel is fully booked, even though a reservation was made?

Learning. The three previous areas mentioned for stochastic technology can also be seen as domains for automated learning. An important component of many stochastic systems is that they have the ability to sample situations and learn over time.
Set Operations
Let A and B be two sets, and U the universe:
– Cardinality |A|: the number of elements in A
– Complement Ā: all elements in U that are not in A
– Subset: A ⊆ B
– Empty set: ∅
– Union: A ∪ B
– Intersection: A ∩ B
– Difference: A − B
Addition Rules
The addition rule for combining two sets:
|A ∪ B| = |A| + |B| − |A ∩ B|
The addition rule for combining three sets:
|A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C|
This addition rule (the inclusion–exclusion principle) may be generalized to any finite number of sets.
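The two-set addition rule is easy to sanity-check in Python; the sets below are arbitrary illustrative values:

```python
# Inclusion–exclusion for two sets: |A ∪ B| = |A| + |B| − |A ∩ B|
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

lhs = len(A | B)                    # |A ∪ B|
rhs = len(A) + len(B) - len(A & B)  # |A| + |B| − |A ∩ B|
print(lhs, rhs)  # 6 6
```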
Multiplication Rules
• The Cartesian product of two sets A and B:
A × B = {(a, b) | a ∈ A and b ∈ B}
• The multiplication principle of counting, for two sets:
|A × B| = |A| × |B|
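The Cartesian product and the multiplication principle can be illustrated with Python's standard library; the two example sets are arbitrary:

```python
from itertools import product

# Cartesian product A × B, and the multiplication principle |A × B| = |A| × |B|
A = {"a", "b"}
B = {1, 2, 3}

AxB = list(product(A, B))           # all pairs (a, b) with a in A, b in B
print(len(AxB), len(A) * len(B))    # 6 6
```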
Permutations and Combinations
• The permutations of a set of n elements taken r at a time:
P(n, r) = n! / (n − r)!
• The combinations of a set of n elements taken r at a time:
C(n, r) = n! / (r! (n − r)!)
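Python's standard library computes both quantities directly (math.perm and math.comb, available since Python 3.8); n = 5, r = 2 here is an arbitrary example:

```python
from math import comb, perm

# P(n, r) = n! / (n − r)!   and   C(n, r) = n! / (r! (n − r)!)
n, r = 5, 2
print(perm(n, r))  # 20
print(comb(n, r))  # 10
```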
Events and Probability
Probability Properties
• The probability of any event E from the sample space S is: 0 ≤ p(E) ≤ 1; for equally likely outcomes, p(E) = |E| / |S|
• The sum of the probabilities of all possible outcomes is 1
• The probability of the complement of an event is p(Ē) = 1 − p(E)
• The probability of the contradictory, or false, outcome of an event is p({}) = 0
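These properties can be demonstrated with a fair die, where all six outcomes are equally likely (the die example itself is an illustration, not from the slides):

```python
from fractions import Fraction

# Sample space for one roll of a fair die; outcomes equally likely,
# so p(E) = |E| / |S|.
S = {1, 2, 3, 4, 5, 6}
E = {2, 4, 6}                       # event: an even number is rolled

p_E = Fraction(len(E), len(S))
p_not_E = 1 - p_E                   # complement rule: p(Ē) = 1 − p(E)
print(p_E, p_not_E)  # 1/2 1/2
```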
The Kolmogorov Axioms
Three Kolmogorov axioms:
1. For any event E in the sample space S, 0 ≤ p(E) ≤ 1
2. p(S) = 1
3. p(E1 ∪ E2) = p(E1) + p(E2), for mutually exclusive events E1 and E2
From these three Kolmogorov axioms, all of probability theory can be constructed.
Traffic Example
Problem description: a driver notices a gradual slowdown and searches for possible explanations by means of a car-based download system.
– Road construction?
– An accident?
Three Boolean parameters:
– S: whether there is a slowdown
– A: whether there is an accident
– C: whether there is road construction
Download data – next page
• Download data:
The joint probability distribution for the traffic slowdown (S), accident (A), and construction (C) variables of the example.
A Venn diagram representation of the probability distribution, where S is traffic slowdown, A is accident, and C is construction.
Prior and Posterior Probability
Conditional Probability
The conditional probability of an event d given an event s (with p(s) ≠ 0) is:
p(d|s) = p(d ∩ s) / p(s)
A Venn diagram illustrates the calculation of p(d|s) as a function of p(s|d).
Chain Rules
The chain rule for two sets:
p(A ∩ B) = p(A|B) p(B)
The generalization of the chain rule to multiple sets:
p(A1 ∩ A2 ∩ … ∩ An) = p(A1) p(A2|A1) p(A3|A1 ∩ A2) … p(An|A1 ∩ … ∩ An−1)
We make an inductive argument to prove the chain rule; consider the nth case:
p(A1 ∩ A2 ∩ … ∩ An)
We apply the intersection rule for two sets to get:
p(A1 ∩ … ∩ An) = p(A1 ∩ … ∩ An−1) p(An|A1 ∩ … ∩ An−1)
and then reduce again, considering that A1 ∩ … ∩ An−1 is itself the intersection of n − 1 sets, until the base case, which we have already demonstrated, is reached.
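The chain rule can be verified numerically on a small joint distribution; the probability values below are made up for illustration (they sum to 1):

```python
from itertools import product

# A toy joint distribution over three Boolean variables A, B, C,
# used to verify the chain rule p(A ∩ B ∩ C) = p(A) p(B|A) p(C|A ∩ B).
probs = [0.10, 0.15, 0.05, 0.20, 0.10, 0.10, 0.05, 0.25]
joint = dict(zip(product((True, False), repeat=3), probs))

p_A   = sum(p for (a, b, c), p in joint.items() if a)
p_AB  = sum(p for (a, b, c), p in joint.items() if a and b)
p_ABC = joint[(True, True, True)]

# p(A) · p(B|A) · p(C|A ∩ B) should reproduce the joint entry
chain = p_A * (p_AB / p_A) * (p_ABC / p_AB)
print(abs(chain - p_ABC) < 1e-12)  # True
```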
Independent Events
Two events A and B are independent if and only if p(A ∩ B) = p(A) p(B); equivalently, when p(B) ≠ 0, p(A|B) = p(A).
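Independence can be checked directly against a joint distribution; the numbers below are made up, chosen so that the two events happen to be independent:

```python
# Toy joint distribution over two coin-like variables A and B.
p = {("h", "h"): 0.06, ("h", "t"): 0.24,
     ("t", "h"): 0.14, ("t", "t"): 0.56}

p_A = p[("h", "h")] + p[("h", "t")]   # marginal p(A = h)
p_B = p[("h", "h")] + p[("t", "h")]   # marginal p(B = h)

# A = h and B = h are independent iff p(A ∩ B) = p(A) p(B)
print(abs(p[("h", "h")] - p_A * p_B) < 1e-12)  # True
```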
Probabilistic Finite State Acceptor
A probabilistic finite state acceptor for the pronunciation of “tomato”.
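A minimal sketch of the idea in Python: each state emits a phone drawn according to a probability distribution. The phone choices and probabilities below are hypothetical placeholders, not the textbook's actual values:

```python
import random

# Sketch of a probabilistic finite state acceptor in the spirit of the
# "tomato" example; where pronunciation varies, a phone is chosen at
# random according to (assumed, illustrative) transition probabilities.
transitions = [
    [("t", 1.0)],
    [("ow", 0.5), ("aa", 0.5)],   # first vowel varies (assumed split)
    [("m", 1.0)],
    [("ey", 0.5), ("aa", 0.5)],   # second vowel varies (assumed split)
    [("t", 1.0)],
    [("ow", 1.0)],
]

def pronounce(rng=random):
    phones = []
    for options in transitions:
        names, weights = zip(*options)
        phones.append(rng.choices(names, weights=weights)[0])
    return " ".join(phones)

print(pronounce())  # e.g. "t ow m ey t ow"
```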
The “ni” words, with their frequencies and probabilities, from the Brown and Switchboard corpora of 2.5M words.
The “ni” phone/word probabilities from the Brown and Switchboard corpora.
Bayes’ Rules
• Given a set of evidence E and a set of hypotheses H = {hi}, the conditional probability of hi given E is:
p(hi|E) = p(E|hi) p(hi) / p(E)
• Maximum a posteriori hypothesis (the most probable hypothesis), since p(E) is a constant for all hypotheses:
arg max(hi) p(E|hi) p(hi)
• E is partitioned by all hypotheses, thus:
p(E) = Σi p(E|hi) p(hi)
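The MAP computation is a few lines of Python; the hypotheses and probability values below are made up for illustration:

```python
# MAP hypothesis: arg max over hi of p(E|hi) p(hi), with p(E) recovered
# as the normalizing sum over all hypotheses.
priors = {"h1": 0.3, "h2": 0.5, "h3": 0.2}        # p(hi)
likelihoods = {"h1": 0.6, "h2": 0.1, "h3": 0.8}   # p(E|hi)

p_E = sum(likelihoods[h] * priors[h] for h in priors)   # p(E) = Σi p(E|hi) p(hi)
posteriors = {h: likelihoods[h] * priors[h] / p_E for h in priors}

h_map = max(posteriors, key=posteriors.get)
print(h_map)  # h1  (0.18 beats 0.05 and 0.16 before normalizing)
```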
General Form of Bayes’ Theorem
The general form of Bayes’ theorem, where we assume the set of hypotheses H partitions the evidence set E:
p(hi|E) = p(E|hi) p(hi) / Σk p(E|hk) p(hk)
Applications of Bayes’ Theorem
• Used in PROSPECTOR
• A simple example: suppose you want to purchase an automobile; the probabilities of going to each dealer and of purchasing model a1 there are:
Dealer   Go-to probability   Purchase-a1 probability
1        d1 = 0.2            p1 = 0.2
2        d2 = 0.4            p2 = 0.4
3        d3 = 0.4            p3 = 0.3
• The application of Bayes’ rule to the car purchase problem:
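Using the table's values, Bayes' rule gives the posterior probability that a purchase of a1 happened at each dealer:

```python
# Bayes' rule for the car purchase example: given that a1 was purchased,
# what is the probability it was purchased at dealer i?
d = [0.2, 0.4, 0.4]   # go-to probabilities d1..d3
p = [0.2, 0.4, 0.3]   # purchase-a1 probabilities p1..p3

p_a1 = sum(di * pi for di, pi in zip(d, p))            # total probability of purchasing a1
posterior = [di * pi / p_a1 for di, pi in zip(d, p)]   # p(dealer i | a1)

print(round(p_a1, 2))                    # 0.32
print([round(x, 3) for x in posterior])  # [0.125, 0.5, 0.375]
```

Dealer 2 is the most likely source of the purchase, since it has both a high go-to probability and a high purchase probability.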
Bayes Classifier
• Naïve Bayes, or the Bayes classifier, uses the partition assumption even when it is not justified:
arg max(hi) p(E|hi) p(hi)
• Assume all pieces of evidence are independent, given a particular hypothesis:
p(E|hi) = Πk p(ek|hi)
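A minimal naïve Bayes sketch: score each hypothesis by its prior times the product of the per-evidence likelihoods. The hypotheses, evidence names, and numbers are illustrative, not from the slides:

```python
from math import prod

# Naïve Bayes: score(hi) = p(hi) · Πk p(ek|hi), pick the arg max.
priors = {"flu": 0.1, "cold": 0.9}
likelihood = {"flu":  {"fever": 0.8, "cough": 0.6},
              "cold": {"fever": 0.2, "cough": 0.7}}

evidence = ["fever", "cough"]
scores = {h: priors[h] * prod(likelihood[h][e] for e in evidence)
          for h in priors}

print(max(scores, key=scores.get))  # cold  (0.126 vs 0.048)
```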
The Traffic Problem
The Bayesian representation of the traffic problem, with potential explanations.
The joint probability distribution for the traffic and construction variables.
Given bad traffic, what is the probability of road construction?
p(C|T) = p(C=t, T=t) / (p(C=t, T=t) + p(C=f, T=t)) = 0.3 / (0.3 + 0.1) = 0.75
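The same calculation from the joint distribution in Python; only the two T = true entries given on the slide are needed:

```python
# p(C|T) from the joint distribution: condition on T = t by
# normalizing over the T = t column.
joint = {("t", "t"): 0.3,   # p(C=t, T=t)
         ("f", "t"): 0.1}   # p(C=f, T=t)

p_T = joint[("t", "t")] + joint[("f", "t")]   # marginal p(T=t)
p_C_given_T = joint[("t", "t")] / p_T
print(round(p_C_given_T, 2))  # 0.75
```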