Predictive state representation

<h2 id="definition">Definition</h2>
Consider a dynamic system based on a discrete set 
 
 
 
 
 
 A
 
 
 
 
 {\displaystyle {\mathcal {A}}}
 
 of actions and a discrete set 
 
 
 
 
 
 O
 
 
 
 
 {\displaystyle {\mathcal {O}}}
 
 of observations.<a class="footnote-ref" id="fnref:3" href="#fn:3">3</a> A history 
 
 
 
 h
 
 
 {\displaystyle h}
 
 is a sequence 
 
 
 
 
 a
 
 1
 
 
 
 o
 
 1
 
 
 …
 
 a
 
 ℓ
 
 
 
 o
 
 ℓ
 
 
 
 
 {\displaystyle a_{1}o_{1}\dots a_{\ell }o_{\ell }}
 
 where 
 
 
 
 
 a
 
 1
 
 
 ,
 …
 ,
 
 a
 
 ℓ
 
 
 
 
 {\displaystyle a_{1},\dots ,a_{\ell }}
 
 are the actions taken by the agent in that order and 
 
 
 
 
 o
 
 1
 
 
 ,
 …
 ,
 
 o
 
 ℓ
 
 
 
 
 {\displaystyle o_{1},\dots ,o_{\ell }}
 
 are the observations made up by the agent. Let us write 
 
 
 
 P
 (
 
 a
 
 1
 
 
 
 o
 
 1
 
 
 …
 
 a
 
 ℓ
 
 
 
 o
 
 ℓ
 
 
 )
 
 
 {\displaystyle P(a_{1}o_{1}\dots a_{\ell }o_{\ell })}
 
 to be the conditional probability of observing 
 
 
 
 
 o
 
 1
 
 
 ,
 …
 ,
 
 o
 
 ℓ
 
 
 
 
 {\displaystyle o_{1},\dots ,o_{\ell }}
 
 given that the actions taken are 
 
 
 
 
 a
 
 1
 
 
 ,
 …
 ,
 
 a
 
 ℓ
 
 
 
 
 {\displaystyle a_{1},\dots ,a_{\ell }}
 
.
We now want to characterize a given hidden state reached after some history 
 
 
 
 h
 
 
 {\displaystyle h}
 
. To do that, we introduce the notion of test. A test 
 
 
 
 t
 
 
 {\displaystyle t}
 
 is of the same type that a history: it is a sequence of action-observation pairs. The idea is now to consider a set of tests 
 
 
 
 {
 
 t
 
 1
 
 
 ,
 …
 ,
 
 t
 
 n
 
 
 }
 
 
 {\displaystyle \{t_{1},\dots ,t_{n}\}}
 
 to full characterize a hidden state. To do that, we first define the probability of a test 
 
 
 
 t
 
 
 {\displaystyle t}
 
 conditional on a history 
 
 
 
 h
 
 
 {\displaystyle h}
 
: 
 
 
 
 P
 (
 t
 ∣
 h
 )
 :=
 
 
 
 P
 (
 h
 t
 )
 
 
 P
 (
 h
 )
 
 
 
 
 
 {\displaystyle P(t\mid h):={\frac {P(ht)}{P(h)}}}
 
.
We now define the prediction vector 
 
 
 
 p
 (
 h
 )
 =
 [
 P
 (
 
 t
 
 1
 
 
 ∣
 h
 )
 ,
 …
 ,
 P
 (
 
 t
 
 n
 
 
 ∣
 h
 )
 ]
 
 
 {\displaystyle p(h)=[P(t_{1}\mid h),\dots ,P(t_{n}\mid h)]}
 
. We say that 
 
 
 
 p
 (
 h
 )
 
 
 {\displaystyle p(h)}
 
 is a predictive state representation (PSR) if and only if it forms a sufficient statistic for the system. In other words, 
 
 
 
 p
 (
 h
 )
 
 
 {\displaystyle p(h)}
 
 is a predictive state representation (PSR) if and only if for all possible tests 
 
 
 
 t
 
 
 {\displaystyle t}
 
, there exists a function 
 
 
 
 
 f
 
 t
 
 
 
 
 {\displaystyle f_{t}}
 
 such that for all histories 
 
 
 
 h
 
 
 {\displaystyle h}
 
, 
 
 
 
 P
 (
 t
 ∣
 h
 )
 =
 
 f
 
 t
 
 
 (
 p
 (
 h
 )
 )
 
 
 {\displaystyle P(t\mid h)=f_{t}(p(h))}
 
.
The functions 
 
 
 
 
 f
 
 t
 
 
 
 
 {\displaystyle f_{t}}
 
 are called projection functions. We say that the PSR is linear when the function 
 
 
 
 
 f
 
 t
 
 
 
 
 {\displaystyle f_{t}}
 
 is linear for all possible tests 
 
 
 
 t
 
 
 {\displaystyle t}
 
. The main theorem proved in <a class="footnote-ref" id="fnref:4" href="#fn:4">4</a> is stated as follows.
Theorem. Consider a finite POMDP with 
 
 
 
 k
 
 
 {\displaystyle k}
 
 states. Then there exists a linear PSR with a number of tests 
 
 
 
 n
 
 
 {\displaystyle n}
 
 smaller that 
 
 
 
 k
 
 
 {\displaystyle k}
 
.

<ul><li><a href="/facts/Michael_L._Littman/phmmBaUM">Littman, Michael L.</a>; <a href="/facts/Richard_S._Sutton/anEmfcaa">Richard S. Sutton</a>; Satinder Singh (2002). <a href="http://www.eecs.umich.edu/~baveja/Papers/psr.pdf">"Predictive Representations of State"</a> (PDF). Advances in Neural Information Processing Systems 14 (NIPS). pp. 1555–1561.</li></ul>
<ul><li>Singh, Satinder; Michael R. James; Matthew R. Rudary (2004). <a href="http://www.eecs.umich.edu/~baveja/Papers/uai2004psr.pdf">"Predictive State Representations: A New Theory for Modeling Dynamical Systems"</a> (PDF). Uncertainty in Artificial Intelligence: Proceedings of the Twentieth Conference (UAI). pp. 512–519.</li></ul>
<ul><li>Wiewiora, Eric Walter (2008), <a href="http://cseweb.ucsd.edu/~ewiewior/dissertation.pdf">Modeling Probability Distributions with Predictive State Representations</a> (PDF)</li></ul>

<h2 id="references">References</h2>

<ol>
<li id="fn:1">James, Michael R.; Singh, Satinder (2004). "Learning and discovery of predictive state representations in dynamical systems with reset". Twenty-first international conference on Machine learning - ICML '04. p. 53. CiteSeerX 10.1.1.67.5179. doi:10.1145/1015330.1015359. ISBN 978-1-58113-838-2. S2CID 9111832. <a href="978-1-58113-838-2" target="_blank">978-1-58113-838-2</a> <a href="#fnref:1" class="footnote-back-ref">↩</a></li>
<li id="fn:2">Izadi, Masoumeh T.; Precup, Doina (9 August 2003). "A planning algorithm for predictive state representations". Proceedings of the 18th International Joint Conference on Artificial Intelligence. Ijcai'03: 1520–1521. <a href="https://dl.acm.org/doi/abs/10.5555/1630659.1630919" target="_blank">https://dl.acm.org/doi/abs/10.5555/1630659.1630919</a> <a href="#fnref:2" class="footnote-back-ref">↩</a></li>
<li id="fn:3">Littman, Michael; Sutton, Richard S (2001). "Predictive Representations of State". Advances in Neural Information Processing Systems. 14. MIT Press. <a href="https://papers.nips.cc/paper_files/paper/2001/hash/1e4d36177d71bbb3558e43af9577d70e-Abstract.html" target="_blank">https://papers.nips.cc/paper_files/paper/2001/hash/1e4d36177d71bbb3558e43af9577d70e-Abstract.html</a> <a href="#fnref:3" class="footnote-back-ref">↩</a></li>
<li id="fn:4">Littman, Michael; Sutton, Richard S (2001). "Predictive Representations of State". Advances in Neural Information Processing Systems. 14. MIT Press. <a href="https://papers.nips.cc/paper_files/paper/2001/hash/1e4d36177d71bbb3558e43af9577d70e-Abstract.html" target="_blank">https://papers.nips.cc/paper_files/paper/2001/hash/1e4d36177d71bbb3558e43af9577d70e-Abstract.html</a> <a href="#fnref:4" class="footnote-back-ref">↩</a></li>
</ol>

Predictive state representation open-in-new

Predictive state representation