The Kolmogorov backward equation (KBE) and its adjoint, the Kolmogorov forward equation, are partial differential equations (PDEs) that arise in the theory of continuous-time, continuous-state Markov processes (diffusions). Both were published by Andrey Kolmogorov in 1931. It was later realized that the forward equation was already known to physicists as the Fokker–Planck equation; the KBE, on the other hand, was new.
Informally, the Kolmogorov forward equation addresses the following problem. We have information about the state $x$ of the system at time $t$, namely a probability distribution $p_t(x)$, and we want to know the probability distribution of the state at a later time $s > t$. The adjective 'forward' refers to the fact that $p_t(x)$ serves as the initial condition and the PDE is integrated forward in time. In the common case where the initial state is known exactly, $p_t(x)$ is a Dirac delta function centered on the known initial state.
The Kolmogorov backward equation, on the other hand, is useful when we are interested at time $t$ in whether the system will be in a given subset of states $B$, sometimes called the target set, at a future time $s$. The target is described by a given function $u_s(x)$ that equals 1 if state $x$ is in the target set at time $s$ and 0 otherwise. In other words, $u_s(x) = \mathbf{1}_B(x)$, the indicator function of the set $B$. We want to know, for every state $x$ at time $t$ (with $t < s$), the probability of ending up in the target set at time $s$ (sometimes called the hit probability). In this case $u_s(x)$ serves as the final condition of the PDE, which is integrated backward in time, from $s$ to $t$.
Kolmogorov Backward Equation
Let $\{X_t\}_{0 \le t \le T}$ be the solution of the stochastic differential equation
$$dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dW_t, \quad 0 \le t \le T,$$
where $W_t$ is a (possibly multi-dimensional) Brownian motion, $\mu$ is the drift coefficient, and $\sigma$ is the diffusion coefficient. Define the transition density (or fundamental solution) $p(t, x;\, T, y)$ by
$$p(t, x;\, T, y) = \frac{\mathbb{P}[\,X_T \in dy \mid X_t = x\,]}{dy}, \quad t < T.$$
Then the usual Kolmogorov backward equation for $p$ is
$$\frac{\partial p}{\partial t}(t, x;\, T, y) + A\,p(t, x;\, T, y) = 0, \quad \lim_{t \to T} p(t, x;\, T, y) = \delta_y(x),$$
where $\delta_y(x)$ is the Dirac delta in $x$ centered at $y$, and $A$ is the infinitesimal generator of the diffusion, acting on the $x$ variable (the time argument of $\mu$ and $\sigma$ is suppressed):
$$A f(x) = \sum_i \mu_i(x)\,\frac{\partial f}{\partial x_i}(x) + \frac{1}{2} \sum_{i,j} \bigl[\sigma(x)\,\sigma(x)^{\mathsf{T}}\bigr]_{ij}\,\frac{\partial^2 f}{\partial x_i\,\partial x_j}(x).$$
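As a worked illustration (an assumed example, not taken from the text above), consider the one-dimensional Ornstein–Uhlenbeck process $dX_t = -\theta X_t\,dt + \sigma\,dW_t$ with hypothetical constants $\theta > 0$ and $\sigma > 0$. Here $\mu(x) = -\theta x$ and $\sigma\sigma^{\mathsf{T}} = \sigma^2$, so the generator and the corresponding backward equation would read
$$A f(x) = -\theta x\,\frac{\partial f}{\partial x}(x) + \frac{1}{2}\sigma^2\,\frac{\partial^2 f}{\partial x^2}(x), \qquad \frac{\partial p}{\partial t} - \theta x\,\frac{\partial p}{\partial x} + \frac{1}{2}\sigma^2\,\frac{\partial^2 p}{\partial x^2} = 0.$$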
Feynman–Kac formula
Assume that the function $F$ solves the boundary value problem
$$\frac{\partial F}{\partial t}(t, x) + \mu(t, x)\,\frac{\partial F}{\partial x}(t, x) + \frac{1}{2}\,\sigma^2(t, x)\,\frac{\partial^2 F}{\partial x^2}(t, x) = 0, \quad 0 \le t \le T, \quad F(T, x) = \Phi(x).$$
Let $\{X_t\}_{0 \le t \le T}$ be the solution of
$$dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dW_t, \quad 0 \le t \le T,$$
where $\{W_t\}_{t \ge 0}$ is standard Brownian motion under the measure $\mathbb{P}$. If
$$\int_0^T \mathbb{E}\!\left[\left(\sigma(t, X_t)\,\frac{\partial F}{\partial x}(t, X_t)\right)^{2}\right] dt < \infty,$$
then
$$F(t, x) = \mathbb{E}\bigl[\Phi(X_T) \,\big|\, X_t = x\bigr].$$
Proof. Apply Itô's formula to $F(s, X_s)$ for $t \le s \le T$:
$$F(T, X_T) = F(t, X_t) + \int_t^T \left\{ \frac{\partial F}{\partial s}(s, X_s) + \mu(s, X_s)\,\frac{\partial F}{\partial x}(s, X_s) + \frac{1}{2}\,\sigma^2(s, X_s)\,\frac{\partial^2 F}{\partial x^2}(s, X_s) \right\} ds + \int_t^T \sigma(s, X_s)\,\frac{\partial F}{\partial x}(s, X_s)\,dW_s.$$
Because $F$ solves the PDE, the first integral is zero. The integrability condition above ensures that the Itô integral is a martingale with zero conditional mean, so taking the conditional expectation given $X_t = x$ gives
$$\mathbb{E}\bigl[F(T, X_T) \,\big|\, X_t = x\bigr] = F(t, x).$$
Substitute $F(T, X_T) = \Phi(X_T)$ to conclude
$$F(t, x) = \mathbb{E}\bigl[\Phi(X_T) \,\big|\, X_t = x\bigr].$$
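The representation above can be checked numerically. The following is a minimal sketch, assuming an Ornstein–Uhlenbeck drift $\mu(t,x) = -\theta x$, a constant diffusion coefficient, and $\Phi(x) = x^2$; none of these choices come from the text, and the function name `feynman_kac_mc` and all parameter values are hypothetical. It estimates $\mathbb{E}[\Phi(X_T) \mid X_t = x]$ with an Euler–Maruyama simulation and compares it with the closed-form solution of the PDE for this particular case.

```python
import numpy as np

def feynman_kac_mc(x0, t, T, mu, sigma, phi,
                   n_paths=100_000, n_steps=200, seed=0):
    """Estimate E[phi(X_T) | X_t = x0] by Euler-Maruyama simulation."""
    rng = np.random.default_rng(seed)
    dt = (T - t) / n_steps
    x = np.full(n_paths, float(x0))
    s = t
    for _ in range(n_steps):
        # Brownian increments over one time step: N(0, dt)
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        x = x + mu(s, x) * dt + sigma(s, x) * dw
        s += dt
    return phi(x).mean()

# Hypothetical example: OU drift, constant volatility, Phi(x) = x^2.
theta, vol = 1.0, 0.5
x0, t0, T_end = 1.0, 0.0, 1.0

estimate = feynman_kac_mc(
    x0, t0, T_end,
    mu=lambda s, x: -theta * x,
    sigma=lambda s, x: vol,
    phi=lambda x: x ** 2,
)

# Closed form for this case: X_T | X_t = x is Gaussian, so
# F(t, x) = (conditional mean)^2 + conditional variance.
tau = T_end - t0
exact = (x0 * np.exp(-theta * tau)) ** 2 \
    + vol ** 2 * (1.0 - np.exp(-2.0 * theta * tau)) / (2.0 * theta)

print(f"Monte Carlo: {estimate:.4f}   closed form: {exact:.4f}")
```

The two numbers should agree up to Monte Carlo noise and the time-discretization bias of the Euler–Maruyama scheme, both of which shrink as `n_paths` and `n_steps` grow.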
Derivation of the Backward Kolmogorov Equation
We use the Feynman–Kac representation to find the PDE solved by the transition densities of solutions to SDEs. Suppose
$$dX_t = \mu(t, X_t)\,dt + \sigma(t, X_t)\,dW_t.$$
For any set $B$, define
$$p_B(t, x;\, T) \triangleq \mathbb{P}\bigl[X_T \in B \mid X_t = x\bigr] = \mathbb{E}\bigl[\mathbf{1}_B(X_T) \,\big|\, X_t = x\bigr].$$
By Feynman–Kac (under integrability conditions), if we take $\Phi = \mathbf{1}_B$, then
$$\frac{\partial p_B}{\partial t}(t, x;\, T) + A\,p_B(t, x;\, T) = 0, \quad p_B(T, x;\, T) = \mathbf{1}_B(x),$$
where
$$A f(t, x) = \mu(t, x)\,\frac{\partial f}{\partial x}(t, x) + \frac{1}{2}\,\sigma^2(t, x)\,\frac{\partial^2 f}{\partial x^2}(t, x).$$
Taking Lebesgue measure as the reference measure, write $|B|$ for the measure of $B$. The transition density $p(t, x;\, T, y)$ is the limit, as $B$ shrinks to the point $y$, of
$$p(t, x;\, T, y) \triangleq \lim_{B \to y} \frac{1}{|B|}\,\mathbb{P}\bigl[X_T \in B \mid X_t = x\bigr].$$
Dividing the equation for $p_B$ by $|B|$ and passing to the limit then gives
$$\frac{\partial p}{\partial t}(t, x;\, T, y) + A\,p(t, x;\, T, y) = 0, \quad p(t, x;\, T, y) \to \delta_y(x) \quad \text{as } t \to T.$$
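To make the backward integration concrete, here is a minimal sketch that solves the equation for $p_B$ with an explicit finite-difference scheme, stepping from the final time $T$ back to $t = 0$. It assumes standard Brownian motion ($\mu = 0$, $\sigma = 1$) and the target set $B = [0, \infty)$, for which the hit probability is known in closed form; the function name `hit_probability_fd`, the grid sizes, and the truncation of the state space to $[-L, L]$ are illustrative assumptions, not part of the derivation above.

```python
import math
import numpy as np

def hit_probability_fd(mu=0.0, sigma=1.0, T=1.0, L=6.0, n_half=120, dt=1e-3):
    """Explicit finite differences for u_t + mu*u_x + 0.5*sigma^2*u_xx = 0,
    with final condition u(T, x) = 1_B(x), B = [0, inf), stepped back to t = 0."""
    x = np.linspace(-L, L, 2 * n_half + 1)
    dx = x[1] - x[0]
    # Final condition: indicator of B; 0.5 at the jump reduces grid bias.
    u = np.where(x > 0.0, 1.0, 0.0)
    u[np.isclose(x, 0.0)] = 0.5
    n_steps = int(round(T / dt))
    for _ in range(n_steps):
        ux = (u[2:] - u[:-2]) / (2.0 * dx)
        uxx = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx ** 2
        # One step backward in time: u(t - dt) = u(t) + dt * (A u)(t)
        u[1:-1] = u[1:-1] + dt * (mu * ux + 0.5 * sigma ** 2 * uxx)
        # Far-field values: almost surely miss / hit the target set.
        u[0], u[-1] = 0.0, 1.0
    return x, u

x, u = hit_probability_fd()

# Closed form for Brownian motion started at x at time 0:
# P[X_T >= 0 | X_0 = x] = Phi(x / sqrt(T)), Phi the standard normal CDF.
erf = np.vectorize(math.erf)
exact = 0.5 * (1.0 + erf(x / math.sqrt(2.0 * 1.0)))

print("max abs error:", np.max(np.abs(u - exact)))
```

The explicit scheme is only stable when `sigma**2 * dt / dx**2` is at most 1, which the default values satisfy; an implicit scheme would remove that restriction at the cost of a linear solve per step.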
Derivation of the Forward Kolmogorov Equation
The Kolmogorov forward equation is
$$\frac{\partial}{\partial T}\,p(t, x;\, T, y) = A^{*}\bigl[p(t, x;\, T, y)\bigr], \quad \lim_{T \to t} p(t, x;\, T, y) = \delta_y(x).$$
For $T > r > t$, the Markov property implies
$$p(t, x;\, T, y) = \int_{-\infty}^{\infty} p(t, x;\, r, z)\,p(r, z;\, T, y)\,dz.$$
Differentiate both sides with respect to $r$; the left-hand side does not depend on $r$, so
$$0 = \int_{-\infty}^{\infty} \left[ \frac{\partial}{\partial r}\,p(t, x;\, r, z) \cdot p(r, z;\, T, y) + p(t, x;\, r, z) \cdot \frac{\partial}{\partial r}\,p(r, z;\, T, y) \right] dz.$$
From the backward Kolmogorov equation:
$$\frac{\partial}{\partial r}\,p(r, z;\, T, y) = -A\,p(r, z;\, T, y).$$
Substitute into the integral:
$$0 = \int_{-\infty}^{\infty} \left[ \frac{\partial}{\partial r}\,p(t, x;\, r, z) \cdot p(r, z;\, T, y) - p(t, x;\, r, z) \cdot A\,p(r, z;\, T, y) \right] dz.$$
By the definition of the adjoint operator $A^{*}$ (equivalently, integrating by parts in $z$ and assuming the boundary terms vanish), the operator $A$ acting on $p(r, z;\, T, y)$ can be moved onto $p(t, x;\, r, z)$:
$$\int_{-\infty}^{\infty} \left[ \frac{\partial}{\partial r}\,p(t, x;\, r, z) - A^{*}\,p(t, x;\, r, z) \right] p(r, z;\, T, y)\,dz = 0.$$
This holds for every $T > r$ and every $y$, and $p(r, z;\, T, y)$ tends to $\delta_y(z)$ as $T \to r$, so the bracket must vanish:
$$\frac{\partial}{\partial r}\,p(t, x;\, r, z) = A^{*}\bigl[p(t, x;\, r, z)\bigr].$$
Relabel $r \to T$ and $z \to y$, yielding the forward Kolmogorov equation:
$$\frac{\partial}{\partial T}\,p(t, x;\, T, y) = A^{*}\bigl[p(t, x;\, T, y)\bigr], \quad \lim_{T \to t} p(t, x;\, T, y) = \delta_y(x).$$
Finally, the adjoint operator is given explicitly by
$$A^{*} g(x) = -\sum_i \frac{\partial}{\partial x_i}\bigl[\mu_i(x)\,g(x)\bigr] + \frac{1}{2} \sum_{i,j} \frac{\partial^2}{\partial x_i\,\partial x_j}\Bigl[\bigl(\sigma(x)\,\sigma(x)^{\mathsf{T}}\bigr)_{ij}\,g(x)\Bigr].$$
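The two ingredients of this derivation can be checked on a case with a known transition density. The sketch below assumes the one-dimensional Ornstein–Uhlenbeck process $dX_t = -\theta X_t\,dt + \sigma\,dW_t$, whose transition density is Gaussian; the process, the constants `THETA` and `SIGMA`, and the helper names are illustrative assumptions rather than anything stated above. It verifies numerically that the density satisfies the Markov (Chapman–Kolmogorov) identity and that the forward-equation residual $\partial_T p - A^{*} p$ is close to zero when the derivatives are approximated by central differences.

```python
import numpy as np

THETA, SIGMA = 1.0, 0.5  # hypothetical OU parameters

def ou_density(t, x, T, y):
    """Transition density p(t, x; T, y) of dX = -THETA*X dt + SIGMA dW."""
    tau = T - t
    mean = x * np.exp(-THETA * tau)
    var = SIGMA ** 2 * (1.0 - np.exp(-2.0 * THETA * tau)) / (2.0 * THETA)
    return np.exp(-(y - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def forward_residual(t, x, T, y, h=1e-3):
    """Central-difference estimate of dp/dT - A* p in the (T, y) variables.

    With mu(y) = -THETA*y:  A* g = d/dy[THETA*y*g] + 0.5*SIGMA^2 * g''.
    """
    dp_dT = (ou_density(t, x, T + h, y) - ou_density(t, x, T - h, y)) / (2 * h)

    def drift_flux(yy):
        # -mu(yy) * p = THETA * yy * p
        return THETA * yy * ou_density(t, x, T, yy)

    d_flux = (drift_flux(y + h) - drift_flux(y - h)) / (2 * h)
    d2p = (ou_density(t, x, T, y + h) - 2 * ou_density(t, x, T, y)
           + ou_density(t, x, T, y - h)) / h ** 2
    return dp_dT - (d_flux + 0.5 * SIGMA ** 2 * d2p)

t, x, r, T, y = 0.0, 1.0, 0.4, 1.0, 0.3

# Markov property: integrate over the intermediate state z at time r.
z = np.linspace(-5.0, 5.0, 4001)
dz = z[1] - z[0]
ck_integral = np.sum(ou_density(t, x, r, z) * ou_density(r, z, T, y)) * dz
direct = ou_density(t, x, T, y)

residual = forward_residual(t, x, T, y)

print("Chapman-Kolmogorov gap:", abs(ck_integral - direct))
print("forward residual:      ", abs(residual))
```

Both printed quantities should be small: the first is limited by the quadrature on the truncated $z$ grid, the second by the $O(h^2)$ error of the central differences.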
See also
- Etheridge, A. (2002). A Course in Financial Calculus. Cambridge University Press.
References
Andrei Kolmogorov, "Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung" (On Analytical Methods in the Theory of Probability), 1931. http://eudml.org/doc/159476