Forgetting Counts: Constant Memory Inference for a Dependent Hierarchical Pitman-Yor Process
Forgetting Counts: Constant Memory Inference for a Dependent Hierarchical Pitman-Yor Process
Nicholas Bartlett, David Pfau, Frank Wood
Presented by Yingjian Wang
Nov. 17, 2010
Outline
• Background
• The sequential memoizer
• Forgetting
• The dependent HPY
• Experiment results
Background
2006, Teh, ‘A hierarchical Bayesian language model based on Pitman-Yor processes’
N-gram Markov chain language model with the HPY prior.
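In the Chinese-restaurant representation, the HPY predictive probability for the next symbol interpolates between discounted counts and the parent distribution. A minimal sketch (not the paper's code; `base` stands in for the parent restaurant's predictive distribution, which in the hierarchy is queried recursively):

```python
def py_predictive(counts, tables, d, c, base, symbol):
    """Pitman-Yor CRP predictive probability of `symbol`.
    counts[s] = customers eating dish s, tables[s] = tables serving s,
    d = discount, c = concentration, base(s) = parent probability of s."""
    n = sum(counts.values())   # total customers in this restaurant
    t = sum(tables.values())   # total tables in this restaurant
    if n == 0:
        return base(symbol)    # empty restaurant: fall back to the parent
    seated = counts.get(symbol, 0) - d * tables.get(symbol, 0)
    return (seated + (c + d * t) * base(symbol)) / (n + c)
```

Setting d = 0 recovers the familiar Dirichlet-process CRP rule; the probabilities over the vocabulary sum to one by construction.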
2009, Wood, ‘A Stochastic Memoizer for Sequence Data’
The Sequence Memoizer (SM) with a linear space/time inference scheme (lossless).
2010, Gasthaus, ‘Lossless compression based on the Sequence Memoizer’
Combines the SM with an arithmetic coder to develop a compressor (PLUMP/dePLUMP); see www.deplump.com.
2010, Bartlett, ‘Forgetting Counts: Constant Memory Inference for a Dependent HPY’
Develops constant-memory inference for the SM by using a dependent HPY (lossy).
SM - two concepts
• Memoizer (Donald Michie, 1968): a device that returns previously computed results for the same input instead of recomputing them, in order to save time.
• Stochastic Memoizer (Wood, 2009): the returned results can change, since the predictive probability is based on a stochastic process.
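Michie's idea in its deterministic form is a few lines; the stochastic memoizer replaces the cached value with a draw from a stochastic process, so repeated queries need not agree:

```python
from functools import wraps

def memoize(f):
    """Classic memoizer: cache results so repeated calls with the same
    input return the stored answer instead of recomputing it."""
    cache = {}
    @wraps(f)
    def wrapped(x):
        if x not in cache:
            cache[x] = f(x)
        return cache[x]
    return wrapped
```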
SM - model and trie
• Model: each symbol is drawn from a Pitman-Yor process indexed by its preceding context.
• The prefix trie: one restaurant per n-gram context.
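The trie of context restaurants can be sketched as follows (a simplified illustration; the class and function names are mine, not the paper's):

```python
class Restaurant:
    """One node of the context trie: a restaurant plus child contexts."""
    def __init__(self):
        self.children = {}   # one-symbol-longer context -> child Restaurant
        self.counts = {}     # per-symbol customer counts at this context

def context_restaurant(root, context):
    """Walk from the empty context, extending by one earlier symbol per
    level and creating restaurants along the way, so context 'ab'
    lives under root -> 'b' -> 'a'."""
    node = root
    for sym in reversed(context):
        node = node.children.setdefault(sym, Restaurant())
    return node
```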
SM - the NSP (1)
• The Normalized Stable Process (Perman, 1990):
Pitman-Yor process: G ~ PY(d, c, H)
Normalized Stable Process (concentration parameter c = 0): G ~ PY(d, 0, H)
Dirichlet process (discount parameter d = 0): G ~ PY(0, c, H)
SM - the NSP (2)
• Collapse the middle restaurants.
Theorem: if G_1 | G_0 ~ PY(d_1, 0, G_0) and G_2 | G_1 ~ PY(d_2, 0, G_1), then marginally G_2 | G_0 ~ PY(d_1 d_2, 0, G_0).
• Prefix tree: restaurants (Weiner, 1973; Ukkonen, 1995).
SM - linear space inference
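The transcript omits this slide's figures; the key idea is that non-branching chains in the prefix trie are collapsed using the NSP theorem above, leaving a tree whose node count is linear in the sequence length. A toy path-compression sketch (the node layout is mine; real SM nodes also carry the skipped context symbols as edge labels):

```python
def compress(node):
    """Collapse non-branching chains of restaurants: a child with exactly
    one grandchild and no counts of its own is merged with that
    grandchild, multiplying the discounts (the NSP collapse).
    Nodes are dicts {'d': discount, 'counts': {...}, 'children': {...}}."""
    for sym, child in list(node['children'].items()):
        compress(child)
        while len(child['children']) == 1 and not child['counts']:
            (_, gchild), = child['children'].items()
            child = {'d': child['d'] * gchild['d'],   # collapsed discount
                     'counts': gchild['counts'],
                     'children': gchild['children']}
            node['children'][sym] = child
    return node
```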
Forgetting
• Motivation: achieve constant-memory inference on top of the SM. How?
• Method: forget (delete) restaurants, the basic memory units in the context tree.
• How to delete? Two deletion schemes: random deletion; greedy deletion.
• A restaurant u stores customer counts c_u and table counts t_u over the vocabulary V, so size(c_u, t_u) = 2|V|.
Deletion schemes
• Random deletion: uniformly delete one leaf restaurant.
• Greedy deletion: delete the leaf restaurant whose removal least negatively impacts the estimated likelihood of the observed sequence.
Leaf restaurants
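The two schemes can be sketched as follows, with a hypothetical per-leaf score `ll_drop(u)` standing in for the estimated drop in sequence log-likelihood when leaf u is removed (the scores would come from the model; here they are given):

```python
import random

def random_delete(leaves, rng=random):
    """Random scheme: pick the leaf restaurant to delete uniformly."""
    return rng.choice(list(leaves))

def greedy_delete(leaves, ll_drop):
    """Greedy scheme: delete the leaf whose removal least decreases
    the estimated log-likelihood of the observed sequence."""
    return min(leaves, key=ll_drop)
```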
The SMC (sequential Monte Carlo) algorithm
The dependent HPY
• But wait: what do we get after deletion and re-addition? Will the processes remain independent? No, since the seating arrangement in the parent restaurant has been changed.
Experiment results