Universal nonlinear filtering using Feynman path integrals II: the continuous-continuous model with additive noise

Balaji, Bhashyam

doi:10.1186/1754-0410-3-2

Research article
Open access
Published: 10 February 2009

Bhashyam Balaji¹

PMC Physics A volume 3, Article number: 2 (2009)

Abstract

In this paper, the Feynman path integral formulation of the continuous-continuous filtering problem, a fundamental problem of applied science, is investigated for the case when the noise in the signal and measurement model is Gaussian and additive. It is shown that it leads to an independent and self-contained analysis and solution of the problem. A consequence of this analysis is the configuration space Feynman path integral formula for the conditional probability density that manifests the underlying physics of the problem. A corollary of the path integral formula is the Yau algorithm that has been shown to have excellent numerical properties. The Feynman path integral formulation is shown to lead to practical and implementable algorithms. In particular, the solution of the Yau partial differential equation is reduced to one of function computation and integration.

PACS Codes: 02.50.Ey, 02.50.Fz, 05.10.Gg, 89.90.+n, 93E10, 93E11

1 Introduction

1.1 Motivation

The fundamental dynamical laws of physics, both classical and quantum mechanical, are described in terms of variables continuous in time. The continuous nature of the dynamical variables has been verified at all length scales probed so far, even though the relevant dynamical variables, and the fundamental laws of physics, are very different in the microscopic and macroscopic realms. In practical situations, one often deals with macroscopic objects whose state is phenomenologically well-described by classical deterministic laws modified by external disturbances that can be modelled as random noise, or Langevin equations. Even when there is no underlying fundamental dynamical law, the Langevin equation provides an effective description of the state variables in many applications. It is therefore natural to consider the problem of the evolution of a state of a signal of interest described by a Langevin equation called the state process.

When the state model noise is Gaussian (or more generally multiplicatively Gaussian) the state process is a Markov process. Since the process is stochastic, the state process is completely characterized by a probability density function. The Fokker-Planck-Kolmogorov forward equation (FPKfe) describes the evolution of this probability density function (or equivalently, the transition probability density function) and is the complete solution of the state evolution problem.

However, in many applications the signal, or state variables, cannot be directly observed. Instead, what is measured is a nonlinearly related stochastic process called the measurement process. The measurement process can often be modelled by yet another continuous stochastic dynamical system called the measurement model. In other words, the observations, or measurements, are discrete-time samples drawn from a different Langevin equation called the measurement process.

The conditional probability density function of the state variables, given the observations, is the complete solution of the filtering problem. This is because it contains all the probabilistic information about the state process that is in the measurements and in the initial condition [1]. This is the Bayesian approach, i.e., the a priori initial data about the signal process contained in the initial probability distribution of the state is incorporated into the solution. Given the conditional probability density, optimality may be defined under various criteria. Usually, the conditional mean, which is the least mean-squares estimate, is studied due to its richness in results and mathematical elegance. The solution of the optimal nonlinear filtering problem is termed universal if the initial distribution can be arbitrary.

1.2 Fundamental Stochastic Filtering Results

When the state and measurement processes are linear, the linear filtering problem was solved by Kalman and Bucy [2, 3]. The celebrated Kalman filter has been successfully applied to a large number of problems in many different areas.

Nevertheless, the Kalman filter suffers from some major limitations. The Kalman filter is not optimal even for the linear filtering case if the initial distribution is not Gaussian. It may still be optimal for a linear system under certain criteria, such as minimum mean square error, but not a general criterion. In other words, the Kalman filter is not a universal optimal filter, even when the filtering problem is linear. Secondly, the Kalman filter cannot be an optimal solution for the general nonlinear filtering problem since it assumes that the signal and measurement models are linear. The extended Kalman filter (EKF), obtained by applying the Kalman filter to a linearized model, cannot be a reliable solution, in general. Thirdly, even when the EKF estimates the state well in some cases, it gives no reliable indication of the accuracy of the state estimate, i.e., the conditional variance is unreliable. Finally, the Kalman filter assumes that the conditional probability distribution is Gaussian, which is a very restrictive assumption; for instance, it rules out the possibility of a multi-modal conditional probability distribution.

The continuous-continuous nonlinear filtering problem (i.e., continuous-time state and measurement stochastic processes) was studied in [4–6] and [7]. This led to a stochastic differential equation, called the Kushner equation, for the conditional probability density in the continuous-continuous filtering problem. It was noted in [8, 9], and [10] that the stochastic differential equation satisfied by the unnormalized conditional probability density, called the Duncan-Mortensen-Zakai (DMZ) equation, is linear and hence considerably simpler than the Kushner equation. The robust DMZ equation, a partial differential equation (PDE) that follows from the DMZ equation via a gauge transformation, was derived in [11] and [12].

A disadvantage of the robust DMZ equation is that the coefficients depend on the measurements. Thus, one does not know the PDE to solve prior to the measurements. As a result, real-time solution is impossible. A fundamental advance was made in tackling the general nonlinear filtering problem by S-T. Yau and Stephen Yau. In [13], it was proved that the robust DMZ equation is equivalent to a partial differential equation that is independent of the measurements, which is referred to as the Yau Equation (YYe) in this paper. Specifically, the measurements only enter as initial condition at each measurement step. Thus, no on-line solution of a PDE is needed; all PDE computations can be done off-line.

However, numerical solution of partial differential equations presents several challenges. A naïve discretization may not be convergent, i.e., the approximation error may not vanish as the grid size is reduced. Alternatively, when the discretization spacing is decreased, the scheme may tend to a different equation, i.e., be inconsistent. Furthermore, the numerical method may be unstable. Finally, since the solution of the YYe is a probability density, it must be positive, which may not be guaranteed by the discretization.

A different approach to solving the PDE was taken in [14] and [15]. An explicit expression for the fundamental solution of the YYe as an ordinary integral was derived. It was shown that the formal solution to the YYe may be written down as an ordinary, but somewhat complicated, multi-dimensional integral, with an infinite series as the integrand. In addition, an estimate of the time needed for the solution to converge to the true solution was presented.

1.3 Outline of the Paper

In this paper, the (Euclidean) Feynman path integral (FPI) formulation is employed to tackle the continuous-continuous nonlinear filtering problem. Specifically, phrasing the stochastic filtering problem in a language common in physics, the solution of the stochastic filtering problem is presented. In particular, no other result in filtering theory (such as the DMZ equation, the robust DMZ equation, etc.) is used. The path integral formulation leads to a path integral formula for the transition probability density for the general additive noise case. A corollary of the FPI formalism is the path integral formula for the fundamental solution of the YYe and the Yau algorithm – a fundamental result of nonlinear filtering theory. It is noted that this paper provides a detailed derivation of results that were used in [16].

The following point needs to be emphasized to readers familiar with the discussion of standard filtering theory: the FPI is different from the Feynman-Kac path integral. In filtering theory literature, it is the Feynman-Kac formalism that is often used. The Feynman-Kac formulation is a rigorous formulation and has led to several rigorous results in filtering theory. However, in spite of considerable effort it has not been proven to be directly useful in the development of reliable practical algorithms with desirable numerical properties. It also obscures the physics of the problem.

In contrast, it is shown that the FPI leads to formulas that are eminently suitable for numerical implementation. It also provides a simple and clear physical picture. Many path integral manipulations have no counterpart in the Feynman-Kac approach. Finally, the theoretical insights provided by the FPI are highly valuable, as evidenced by numerous examples in modern theoretical physics (see, for instance, [17]), and shall be employed in subsequent papers.

The outline of this paper is as follows. In the following section, the filtering problem is reformulated in a language common in physics. In Section 3, the path integral formula for the transition probability density is derived for the general additive noise case. The Yau algorithm is then derived in the following section. In Sections 5 and 6 some conceptual remarks and numerical examples are presented. The conclusions are presented in Section 7. In Appendix 1, aspects of continuous-continuous filtering are reviewed.

For more details on the path integral methods, see any modern text on quantum field theory, such as [17], and especially [18] which discusses application of FPI to the study of stochastic processes.

2 The continuous filtering problem: a physical reformulation

In this section, the filtering problem is stated in a language commonly used in theoretical physics.

2.1 Langevin Equation

Consider an ensemble of systems with state variables described by the Langevin equation:

\begin{matrix} \dot{x} (t) = f (x (t), t) + e (x (t), t) ν (t), & x (0) = x_{0}, & Q (t) \in ℝ^{p \times p} \end{matrix} .

(1)

Here, x(t) ∈ ℝⁿ, the drift f(x(t), t) ∈ ℝⁿ, the diffusion vielbein e(x(t), t) ∈ ℝ^{n×p}, and ν(t) is δ-correlated with covariance matrix Q(t) ∈ ℝ^{p×p}. When the diffusion vielbein is independent of the state x(t), the noise is termed additive. It is the additive noise case that is studied here, since it enables the use of functional methods common in quantum field theory.
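To make the ensemble picture concrete, the following is a minimal sketch of how sample paths of Equation 1 could be generated in the additive-noise case with the Euler-Maruyama scheme. It is not part of the paper: the function name `simulate_langevin`, the restriction to constant e and Q, and the requirement that `f` accepts an array of states are all assumptions made for illustration.

```python
import numpy as np

def simulate_langevin(f, e, Q, x0, t_grid, n_paths, rng):
    """Euler-Maruyama sample paths of dx = f(x, t) dt + e dW, with dW ~ N(0, Q dt).

    Assumptions (illustrative only): e (n x p) and Q (p x p) are constant,
    Q is positive definite, and f(x, t) maps an (n_paths, n) array to an
    (n_paths, n) array.
    """
    x0 = np.asarray(x0, dtype=float)
    n, p = x0.size, Q.shape[0]
    chol = np.linalg.cholesky(Q)                 # Q = chol @ chol.T
    paths = np.empty((n_paths, len(t_grid), n))
    paths[:, 0, :] = x0
    for k in range(len(t_grid) - 1):
        dt = t_grid[k + 1] - t_grid[k]
        x = paths[:, k, :]
        # Noise increments with covariance Q * dt, one row per ensemble member.
        dW = (rng.standard_normal((n_paths, p)) @ chol.T) * np.sqrt(dt)
        paths[:, k + 1, :] = x + f(x, t_grid[k]) * dt + dW @ e.T
    return paths
```

Averaging any function of `paths[:, k, :]` over the first axis then approximates the ensemble average ⟨·⟩ at time `t_grid[k]`.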

Due to the random noise, each system leads to a different vector x(t) that depends on time. Although only one realization of the stochastic process is ever observed, it is meaningful to speak about an ensemble average. For fixed times t = t_i, i = 1, 2, ..., r, the probability density of finding the random vector x(t) in the (n-dimensional) interval x_i ≤ x(t_i) ≤ x_i + dx_i (1 ≤ i ≤ r) is given by

W_{r} (t_{r}, x_{r}; \dots; t_{1}, x_{1}) = 〈 \prod_{i = 1}^{r} δ^{n} (x (t_{i}) - x_{i}) 〉,

(2)

where x_i is an n-dimensional column vector and ⟨·⟩ denotes averaging with respect to the signal model noise ν(t). The complete information on the random vector x(t) is contained in the infinite hierarchy of such probability densities. The quantity of interest here is the conditional probability density

\begin{array}{rcl} p (t_{r}, x_{r} | t_{r - 1}, x_{r - 1}; \dots; t_{1}, x_{1}) & = & 〈 δ^{n} (x (t_{r}) - x_{r}) 〉 |_{x (t_{r - 1}) = x_{r - 1}, \dots, x (t_{1}) = x_{1}} \\ & = & \frac{W_{r} (t_{r}, x_{r}; \dots; t_{1}, x_{1})}{\int W_{r} (t_{r}, x_{r}; \dots; t_{1}, x_{1}) {d^{n} x_{r}}} . \end{array}

(3)

Now the process described by the Langevin equation with δ-correlated Langevin force is a Markov process, i.e., the conditional probability density depends only on the value at the immediate previous time:

p (t_{r}, x_{r} | t_{r - 1}, x_{r - 1}; \dots; t_{1}, x_{1}) = p (t_{r}, x_{r} | t_{r - 1}, x_{r - 1}) .

(4)

It can be shown that the transition probability satisfies the Fokker-Planck-Kolmogorov forward equation (FPKfe) (see, for instance, [19])

\begin{array}{rcl} \frac{\partial p}{\partial t} (t, x) & = & - \sum_{i = 1}^{n} \frac{\partial}{\partial x_{i}} [f_{i} (x (t), t) p (t, x)] + \frac{1}{2} \sum_{i, j = 1}^{n} \sum_{a = 1}^{p} \frac{\partial^{2}}{\partial x_{i} \partial x_{j}} [{(e (x (t), t) Q (t))}_{i a} e_{a j}^{T} (x (t), t) p (t, x)] \\ & \equiv & \mathcal{L} p (t, x) . \end{array}

(5)

Finally, the Gaussian noise process ν(t) can be represented by the following path integral measure

\begin{matrix} [d ρ (ν (t))] = [D ν (t)] \exp [- \frac{1}{2} \sum_{a, b = 1}^{p} \int ν_{a} (t) {[Q^{- 1} (t)]}_{a b} ν_{b} (t) d t], & ν \in ℝ^{p} . \end{matrix}

(6)

where ν(t) is a real vector for each t. This leads to a configuration space FPI formula for the fundamental solution of the FPKfe (see, for instance, [18]). The path integral formula for the fundamental solution of the FPKfe is applied to the continuous-discrete filtering problem with additive (state model) noise in [20, 21].

2.2 The Continuous-Continuous Filtering Problem

Similarly, the continuous-continuous model can be written as follows:

\begin{cases} \dot{x} (t) = f (x (t), t) + e (x (t), t) ν (t), & x (0) = x_{0}, \quad Q (t) \in ℝ^{p \times p}, \\ \dot{y} (t) = h (x (t), t) + n (x (t), t) μ (t), & y (0) = y_{0}, \quad W (t) \in ℝ^{q \times q} . \end{cases}

(7)

Here, y(t) ∈ ℝ^m, the measurement model drift h(x(t), t) ∈ ℝ^m, the diffusion vielbein n(x(t), t) ∈ ℝ^{m×q}, and μ(t) is δ-correlated with covariance matrix W(t) ∈ ℝ^{q×q}.

Thus, in continuous-continuous filtering, the continuous-time measurement stochastic process needs to be incorporated as well. Consider another ensemble of systems with state variables whose time evolution is governed by the measurement process. The measurement noise means that each system in the ensemble leads to a different time-dependent vector y(t). Thus, even though only one realization of the measurement stochastic process is observed, it is still meaningful to talk about an ensemble average of the measurement process (in addition to one over the state process). Thus, the quantity of interest in continuous-continuous filtering is the conditional probability density

P (t_{2}, x_{2}; y_{2} | t_{1}, x_{1}; y_{1}) = {〈 〈 δ^{n} (x (t_{2}) - x_{2}) δ^{m} (y (t_{2}) - y_{2}) 〉 〉}_{μ} |_{x (t_{1}) = x_{1}, y (t_{1}) = y_{1}},

(8)

where ⟨·⟩_μ denotes averaging with respect to the measurement noise μ(t). A crucial difference between the state and measurement stochastic process is that, unlike the state, the measurement samples are known.

Note that the conditional transition probability density is the complete solution to the continuous-continuous filtering problem, since if the initial distribution is u(t_{i-1}, x'|Y_{i-1}), where Y_{i-1} is the set of all measurements prior to t_{i-1}, then the evolved conditional probability distribution is

u (t, x | Y_{i}) = \int P (t, x; y_{i} | t_{i - 1}, x'; y_{i - 1}) u (t_{i - 1}, x' | Y_{i - 1}) {d^{n} x'} .

(9)

In the following sections, the path integral formulas for P(t₂, x₂; y₂|t₁, x₁; y₁) are derived. In Section 4 it shall be shown that it leads to the Yau algorithm. It shall be shown that the YYe plays the same role here that the FPKfe does in continuous-discrete filtering.

3 Path integral formula for the conditional transition probability density

The path integral formula for the conditional transition probability density shall now be derived using functional methods. Note that implicit in the use of these formal functional methods is the use of the Feynman convention, or symmetric discretization for the drift.

As noted in Section 2, the transition probability density is computed by averaging over the signal and measurement ensembles, i.e.,

\begin{array}{l} P (t, x; y | t_{0}, x_{0}; y_{0}) & = & \int [d ρ (ν (t, t_{0}))] [d ρ (μ (t, t_{0}))] \\ \times [D \dot{x} (t)] J δ^{n} [\dot{x} (t) - f (x (t), t) - e (t) ν (t)] \\ \times [D \dot{y} (t)] J_{y} δ^{m} [\dot{y} (t) - h (x (t), t) - n (t) μ (t)] \\ \times δ^{n} [x (t) - x] |_{x (t_{0}) = x_{0}} δ^{m} [y (t) - y] |_{y (t_{0}) = y_{0}} . \end{array}

(10)

From the assumptions of the signal and measurement noise processes, it is evident that

[d ρ (ν (t, t_{0}))] = [D ν (t, t_{0})] \exp (- \frac{1}{2} \sum_{a, b = 1}^{p} \int_{t_{0}}^{t} ν_{a} (t) {[Q^{- 1} (t)]}_{a b} ν_{b} (t) d t),

(11)

[d ρ (μ (t, t_{0}))] = [D μ (t, t_{0})] \exp (- \frac{1}{2} \sum_{c, d = 1}^{q} \int_{t_{0}}^{t} μ_{c} (t) {[W^{- 1} (t)]}_{c d} μ_{d} (t) d t) .

(12)

The Jacobian J follows from the functional derivative of the Langevin equation:

\begin{array}{l} \frac{δ}{δ x_{j} (t^{'})} [{\dot{x}}_{i} (t) - f_{i} (x (t), t) - \sum_{a = 1}^{p} e_{i a} (t) ν_{a} (t)] & = & [δ_{i j} \frac{d}{d t} - \frac{\partial f_{i}}{\partial x_{j}} (x (t), t)] δ (t - t^{'}), \\ = & - \frac{d}{d t^{'}} [δ_{i j} δ (t - t^{'}) - θ (t - t^{'}) \frac{\partial f_{i}}{\partial x_{j}} (x (t), t)] . \end{array}

(13)

Hence,

\begin{matrix} J = \det (- \frac{d}{d t^{'}}) \det (δ_{i j} δ (t - t^{'}) - θ (t - t^{'}) \frac{\partial f_{i}}{\partial x_{j}} (x (t), t)), \\ = N \det (δ_{i j} δ (t - t^{'}) - θ (t - t^{'}) \frac{\partial f_{i}}{\partial x_{j}} (x (t), t)), \end{matrix}

(14)

where $N$ is an irrelevant constant, or,

\begin{matrix} \ln J = \ln \det [δ_{i j} δ (t - t^{'}) - θ (t - t^{'}) \frac{\partial f_{i}}{\partial x_{j}} (x (t), t)], \\ = - \frac{1}{2} \sum_{i = 1}^{n} \int \frac{\partial f_{i}}{\partial x_{i}} (x (t), t) d t . \end{matrix}

(15)
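The step from the determinant to the trace in Equation 15 can be unpacked as follows. This is a sketch of the standard argument, assuming the symmetric (midpoint) convention in which θ(0) = 1/2; the kernel name K is introduced here only for illustration.

```latex
\begin{aligned}
K_{ij}(t, t') &\equiv \theta(t - t')\, \frac{\partial f_i}{\partial x_j}(x(t), t), \\
\ln\det(1 - K) &= \operatorname{Tr}\ln(1 - K)
  = -\operatorname{Tr} K - \tfrac{1}{2}\operatorname{Tr} K^{2} - \cdots, \\
\operatorname{Tr} K &= \theta(0) \sum_{i=1}^{n} \int \frac{\partial f_i}{\partial x_i}(x(t), t)\, dt
  = \frac{1}{2} \sum_{i=1}^{n} \int \frac{\partial f_i}{\partial x_i}(x(t), t)\, dt, \\
\operatorname{Tr} K^{r} &\propto \int \theta(t_1 - t_2)\,\theta(t_2 - t_3) \cdots \theta(t_r - t_1)\,
  dt_1 \cdots dt_r = 0 \qquad (r \ge 2).
\end{aligned}
```

The higher traces vanish because the product of step functions is nonzero only on a set of measure zero, so only the first-order term survives. As a hypothetical scalar example, the linear drift f(x) = −a x gives ln J = a (t − t₀)/2 over the interval [t₀, t].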

The Jacobian J_y is trivial (as the measurement model drift is y-independent) and can be absorbed into the measure.

It is noteworthy that J is not trivial. In quantum field theory, nontrivial Jacobians usually imply that there is an anomaly, as in the case of chiral anomalies in gauge theories. However, there is no reason for an anomaly here; after all, this is not even a quantum field theoretical system. The puzzle is resolved by noting that path integral anomalies in quantum field theory arise from the "multiplicative" part in the change of variables (i.e., ψ (x) → U(x) ψ (x)). In contrast, the nontrivial Jacobian term here arises from the additive term; the multiplicative term does not contribute to the Jacobian, in accordance with expectations.

Thus, so far,

\begin{array}{l} P (t, x; y | t_{0}, x_{0}; y_{0}) & = & \int_{y (t_{0}) = y_{0}}^{y (t) = y} \int_{x (t_{0}) = x_{0}}^{x (t) = x} [D x (t)] [D y (t)] [d ρ (ν (t, t_{0}))] [d ρ (μ (t, t_{0}))] \\ \times δ^{n} ({\dot{x}}_{i} (t) - f_{i} (x (t), t) - \sum_{a = 1}^{p} e_{i a} (t) ν_{a} (t)) \exp (- \frac{1}{2} \sum_{i = 1}^{n} \int_{t_{0}}^{t} \frac{\partial f_{i}}{\partial x_{i}} (x (t), t) d t) \\ \times δ^{m} ({\dot{y}}_{k} (t) - h_{k} (x (t), t) - \sum_{c = 1}^{q} n_{k c} (t) μ_{c} (t)) . \end{array}

(16)

Using the Fourier integral version of the delta function

\begin{array}{rcl} P (t, x; y | t_{0}, x_{0}; y_{0}) & = & \int_{y (t_{0}) = y_{0}}^{y (t) = y} \int_{x (t_{0}) = x_{0}}^{x (t) = x} [D x (t)] [D y (t)] [D ν (t, t_{0})] [D μ (t, t_{0})] \\ & & \times [D λ (t, t_{0})] \exp (i \sum_{i = 1}^{n} \int_{t_{0}}^{t} λ_{i} (t) [{\dot{x}}_{i} (t) - f_{i} (x (t), t) - \sum_{a = 1}^{p} e_{i a} (t) ν_{a} (t)] d t) \\ & & \times [D κ (t, t_{0})] \exp (i \sum_{k = 1}^{m} \int_{t_{0}}^{t} κ_{k} (t) [{\dot{y}}_{k} (t) - h_{k} (x (t), t) - \sum_{c = 1}^{q} n_{k c} (t) μ_{c} (t)] d t) \\ & & \times \exp (- \frac{1}{2} \sum_{a, b = 1}^{p} \int_{t_{0}}^{t} ν_{a} (t) {(Q^{- 1} (t))}_{a b} ν_{b} (t) d t) \times \exp (- \frac{1}{2} \sum_{i = 1}^{n} \int_{t_{0}}^{t} \frac{\partial f_{i}}{\partial x_{i}} (x (t), t) d t) \\ & & \times \exp (- \frac{1}{2} \sum_{c, d = 1}^{q} \int_{t_{0}}^{t} μ_{c} (t) {(W^{- 1} (t))}_{c d} μ_{d} (t) d t) . \end{array}

(17)

Integrating over ν(t, t₀) and μ(t, t₀) leads to

\begin{matrix} P (t, x; y | t_{0}, x_{0}; y_{0}) = \int_{y (t_{0}) = y_{0}}^{y (t) = y} \int_{x (t_{0}) = x_{0}}^{x (t) = x} [D x (t)] [D y (t)] [D λ (t, t_{0})] [D κ (t, t_{0})] \exp (- \frac{1}{2} \sum_{i = 1}^{n} \int_{t_{0}}^{t} \frac{\partial f_{i}}{\partial x_{i}} (x (t), t) d t) \\ \times \exp (- \int_{t_{0}}^{t} [\frac{1}{2} \sum_{i, j = 1}^{n} \sum_{a, b = 1}^{p} λ_{i} (t) e_{i a} (t) Q_{a b} (t) e_{b j}^{T} (t) λ_{j} (t) - i \sum_{i = 1}^{n} λ_{i} (t) ({\dot{x}}_{i} (t) - f_{i} (x (t), t))] d t) \\ \times \exp (- \int_{t_{0}}^{t} [\frac{1}{2} \sum_{k, l = 1}^{m} \sum_{c, d = 1}^{q} κ_{k} (t) n_{k c} (t) W_{c d} (t) n_{d l}^{T} (t) κ_{l} (t) - i \sum_{k = 1}^{m} κ_{k} (t) ({\dot{y}}_{k} (t) - h_{k} (x (t), t))] d t) . \end{matrix}

(18)

Integrating over λ(t, t₀) and κ(t, t₀), it is clear that

P (t, x; y | t_{0}, x_{0}; y_{0}) = \int_{y (t_{0}) = y_{0}}^{y (t) = y} \int_{x (t_{0}) = x_{0}}^{x (t) = x} [D x (t)] [D y (t)] \exp [- S (t_{0}, t)],

(19)

where the action is given by

\begin{array}{l} S (t_{0}, t) = \frac{1}{2} \sum_{i, j = 1}^{n} \sum_{a, b = 1}^{p} \int_{t_{0}}^{t} [{\dot{x}}_{i} (t) - f_{i} (x (t), t)] {(e_{i a} (t) Q_{a b} (t) e_{b j}^{T} (t))}^{- 1} [{\dot{x}}_{j} (t) - f_{j} (x (t), t)] d t \\ + \frac{1}{2} \int_{t_{0}}^{t} [\sum_{k, l = 1}^{m} \sum_{c, d = 1}^{q} [{\dot{y}}_{k} (t) - h_{k} (x (t), t)] {(n_{k c} (t) W_{c d} (t) n_{d l}^{T} (t))}^{- 1} [{\dot{y}}_{l} (t) - h_{l} (x (t), t)] + \sum_{i = 1}^{n} \frac{\partial f_{i}}{\partial x_{i}} (x (t), t)] d t . \end{array}

(20)
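The integrations over the auxiliary fields λ and κ rest on the standard Gaussian Fourier integral, applied at each time slice. In the sketch below, A and b are shorthand introduced here for illustration, with A assumed symmetric and positive definite:

```latex
\int d^{n}\lambda \, \exp\left( -\tfrac{1}{2}\, \lambda^{T} A \lambda + i\, \lambda^{T} b \right)
  = \frac{(2\pi)^{n/2}}{\sqrt{\det A}}\, \exp\left( -\tfrac{1}{2}\, b^{T} A^{-1} b \right).
```

Taking A = e Q eᵀ and b = ẋ − f reproduces the state term of the action in Equation 20, and taking A = n W nᵀ and b = ẏ − h reproduces the measurement term. The state-independent prefactors are absorbed into the path integral measure.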

4 Derivation of the Yau algorithm

Observe that the path integral formula derived in the previous section is over both the state and measurement variables. In this section, we shall show that in some cases it is possible to reduce the result to a path integral over the state variables only. This has the advantage of being pre-computable since it is independent of measurements. It shall be shown to lead to the Yau algorithm.

4.1 Sampled Continuous Measurements

Suppose measurements are available at times t_{i-1} and t_i, and that the conditional transition probability density at time t_i is desired. Further, assume that there are no measurements available between t_{i-1} and t_i. The general path integral formula (Equation 19) cannot be simplified unless some additional assumptions are made.

First, consider the case where e(t)Q(t)e^T(t) and n(t)W(t)n^T(t) are ħ_ν I_{n×n} and ħ_μ I_{m×m}, respectively, and h(x(t)) is not explicitly time dependent. Then, the contribution of the measurement process to the exponent, −S, is

- \frac{1}{2 ℏ_{μ}} \sum_{k = 1}^{m} \int_{t_{i - 1}}^{t_{i}} d t {[{\dot{y}}_{k} (t) - h_{k} (x (t))]}^{2} = - \frac{1}{2 ℏ_{μ}} \sum_{k = 1}^{m} \int_{t_{i - 1}}^{t_{i}} d t [{\dot{y}}_{k}^{2} (t) + h_{k}^{2} (x (t)) - 2 h_{k} (x (t)) {\dot{y}}_{k} (t)] .

(21)

The quantity of interest is the state and we seek to integrate over the measurement variables.

Now the first term is independent of the state variables. The second term can be added to the action term that is independent of y(t). It remains to investigate the contribution of the third term:

\frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} \int_{t_{i - 1}}^{t_{i}} h_{k} (x (t)) {\dot{y}}_{k} (t) d t .

(22)

There are two issues in this evaluation. Firstly, this can be evaluated via the usual integration by parts, but it is important to note that it is valid only for symmetric discretization. Secondly, since the measurements are sampled, there are two possible interpretations when t_i − t_{i-1} = ϵ, where ϵ is an infinitesimal:

\frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} \int_{t_{i - 1}}^{t_{i}} h_{k} (x (t)) {\dot{y}}_{k} (t) d t = \begin{cases} \frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} \int_{t_{i - 1}}^{t_{i}} h_{k} (x (t)) {\dot{y}}_{k} (t) d t, \\ \frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} \int_{t_{i - 1} - ϵ}^{t_{i} - ϵ} h_{k} (x (t)) {\dot{y}}_{k} (t) d t . \end{cases}

(23)

This leads to two possibilities:

\frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} \int_{t_{i - 1}}^{t_{i}} h_{k} (x (t)) {\dot{y}}_{k} (t) d t = \begin{cases} \frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} h_{k} (\frac{1}{2} [x (t_{i}) + x (t_{i - 1})]) [y_{k} (t_{i}) - y_{k} (t_{i - 1})], \\ \frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} h_{k} (\frac{1}{2} [x (t_{i - 1}) + x (t_{i - 2})]) [y_{k} (t_{i - 1}) - y_{k} (t_{i - 2})] . \end{cases}

(24)

Finally, the residual Gaussian path integral over y(t) can be ignored as it is independent of the state. Therefore, the path integral formula simplifies to

\begin{array}{rcl} P (t_{i}, x_{i}; y_{i} | t_{i - 1}, x_{i - 1}; y_{i - 1}) & = & \tilde{P} (t_{i}, x_{i} | t_{i - 1}, x_{i - 1}) \\ & & \times \begin{cases} \exp (\frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} h_{k} (\frac{1}{2} [x (t_{i}) + x (t_{i - 1})]) [y_{k} (t_{i}) - y_{k} (t_{i - 1})]), \\ \exp (\frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} h_{k} (\frac{1}{2} [x (t_{i - 1}) + x (t_{i - 2})]) [y_{k} (t_{i - 1}) - y_{k} (t_{i - 2})]), \end{cases} \end{array}

(25)

where

\tilde{P} (t_{i}, x_{i} | t_{i - 1}, x_{i - 1}) = \int_{x (t_{i - 1}) = x_{i - 1}}^{x (t_{i}) = x_{i}} [D x (t)] \exp (- S (t_{i - 1}, t_{i})),

(26)

and

S (t_{i - 1}, t_{i}) = \frac{1}{2} \int_{t_{i - 1}}^{t_{i}} d t [\frac{1}{ℏ_{ν}} \sum_{i = 1}^{n} {[{\dot{x}}_{i} (t) - f_{i} (x (t))]}^{2} + \sum_{i = 1}^{n} \frac{\partial f_{i}}{\partial x_{i}} (x (t)) + \frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} h_{k}^{2} (x (t))] .

(27)
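As an illustration of how Equation 27 might be evaluated numerically, the sketch below computes a midpoint-rule (symmetric) discretization of the action for a scalar state (n = m = 1). The function name `discretized_action`, the uniform time grid, and the vectorized callables `f`, `df` and `h` are assumptions made for this example, not part of the paper.

```python
import numpy as np

def discretized_action(path, dt, f, df, h, hbar_nu=1.0, hbar_mu=1.0):
    """Midpoint-rule approximation of the action in Equation 27 for a scalar state.

    path : values x(t_k) on a uniform time grid with spacing dt
    f    : drift f(x); df : its derivative df/dx; h : measurement drift h(x)
    All three callables are assumed to accept and return NumPy arrays.
    """
    path = np.asarray(path, dtype=float)
    x_mid = 0.5 * (path[1:] + path[:-1])   # symmetric (Feynman) discretization
    x_dot = np.diff(path) / dt             # finite-difference velocity per step
    lagrangian = ((x_dot - f(x_mid)) ** 2 / hbar_nu
                  + df(x_mid)
                  + h(x_mid) ** 2 / hbar_mu)
    return 0.5 * dt * np.sum(lagrangian)
```

A path with a lower action carries a larger weight exp(−S) in Equation 26, so the function can be used to compare candidate state trajectories between two measurement times.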

Secondly, if the conditions above are relaxed to allow explicit time dependence of the drift term in the state model, i.e., f(x(t), t), and a general C = n(t)W(t)n^T(t), then the path integral formula becomes

\begin{array}{rcl} P (t_{i}, x_{i}; y_{i} | t_{i - 1}, x_{i - 1}; y_{i - 1}) & = & \tilde{P} (t_{i}, x_{i} | t_{i - 1}, x_{i - 1}) \\ & & \times \begin{cases} \exp (\sum_{k, l = 1}^{m} h_{k} (\frac{1}{2} [x (t_{i}) + x (t_{i - 1})]) {(C^{- 1})}_{k l} [y_{l} (t_{i}) - y_{l} (t_{i - 1})]), \\ \exp (\sum_{k, l = 1}^{m} h_{k} (\frac{1}{2} [x (t_{i - 1}) + x (t_{i - 2})]) {(C^{- 1})}_{k l} [y_{l} (t_{i - 1}) - y_{l} (t_{i - 2})]), \end{cases} \end{array}

(28)

where

\tilde{P} (t_{i}, x_{i} | t_{i - 1}, x_{i - 1}) = \int_{x (t_{i - 1}) = x_{i - 1}}^{x (t_{i}) = x_{i}} [D x (t)] \exp (- S (t_{i - 1}, t_{i})),

(29)

and the action is given by

S (t_{i - 1}, t_{i}) = \frac{1}{2} \int_{t_{i - 1}}^{t_{i}} d t [\frac{1}{ℏ_{ν}} \sum_{i = 1}^{n} {[{\dot{x}}_{i} (t) - f_{i} (x (t), t)]}^{2} + \sum_{i = 1}^{n} \frac{\partial f_{i}}{\partial x_{i}} (x (t), t) + \sum_{k, l = 1}^{m} h_{k} (x (t)) {(C^{- 1})}_{k l} h_{l} (x (t))] .

(30)

Finally, consider the case when e(t)Q(t)e^T(t) is time-independent and invertible, but otherwise arbitrary, and C = n(t)W(t)n^T(t). Note that C is a constant, symmetric matrix and is assumed to be invertible. The path integral formula is given by Equations 28 and 29, with the action S(t_{i-1}, t_i) given by

\begin{matrix} S (t_{i - 1}, t_{i}) = \frac{1}{2} \int_{t_{i - 1}}^{t_{i}} d t \sum_{i, j = 1}^{n} \sum_{a, b = 1}^{p} [{\dot{x}}_{i} (t) - f_{i} (x (t), t)] {[e_{i a} (t) Q_{a b} (t) e_{b j}^{T} (t)]}^{- 1} [{\dot{x}}_{j} (t) - f_{j} (x (t), t)] \\ + \frac{1}{2} \int_{t_{i - 1}}^{t_{i}} d t [\sum_{k, l = 1}^{m} h_{k} (x (t)) {(C^{- 1})}_{k l} h_{l} (x (t)) + \sum_{i = 1}^{n} \frac{\partial f_{i}}{\partial x_{i}} (x (t), t)] . \end{matrix}

(31)

4.2 The Yau Algorithm

In Section 2 it was noted that if v_{i-1}(t_{i-1}, x_{i-1}) is the conditional probability density at time t_{i-1}, then the conditional probability density at time t_i is given by

v_{i} (t_{i}, x_{i}) = \int P (t_{i}, x_{i}; y_{i} | t_{i - 1}, x_{i - 1}; y_{i - 1}) v_{i - 1} (t_{i - 1}, x_{i - 1}) {d^{n} x_{i - 1}} .

(32)

For simplicity, let us first consider the case where e(t)Q(t)e^T(t) and n(t)W(t)n^T(t) are ħ_ν I_{n×n} and ħ_μ I_{m×m}, respectively, and h(x(t)) is not explicitly time dependent. Then

v_{i} (t_{i}, x_{i}) = \begin{cases} \int \tilde{P} (t_{i}, x_{i} | t_{i - 1}, x_{i - 1}) \exp (\frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} h_{k} (\frac{1}{2} [x (t_{i}) + x (t_{i - 1})]) [y_{k} (t_{i}) - y_{k} (t_{i - 1})]) v_{i - 1} (t_{i - 1}, x_{i - 1}) {d^{n} x_{i - 1}}, \\ \int \tilde{P} (t_{i}, x_{i} | t_{i - 1}, x_{i - 1}) \exp (\frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} h_{k} (\frac{1}{2} [x (t_{i - 1}) + x (t_{i - 2})]) [y_{k} (t_{i - 1}) - y_{k} (t_{i - 2})]) v_{i - 1} (t_{i - 1}, x_{i - 1}) {d^{n} x_{i - 1}} . \end{cases}

(33)

When t_i − t_{i-1} = ϵ is small,

h_{k} (\frac{1}{2} [x (t_{i}) + x (t_{i - 1})]) \approx h_{k} (x (t_{i - 1})) + O (\sqrt{ϵ}),

(34)

and Equation 33 becomes

v_{i} (t_{i}, x_{i}) = \begin{cases} \int \tilde{P} (t_{i}, x_{i} | t_{i - 1}, x_{i - 1}) \exp (\frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} h_{k} (x (t_{i - 1})) [y_{k} (t_{i}) - y_{k} (t_{i - 1})]) v_{i - 1} (t_{i - 1}, x_{i - 1}) {d^{n} x_{i - 1}}, \\ \int \tilde{P} (t_{i}, x_{i} | t_{i - 1}, x_{i - 1}) \exp (\frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} h_{k} (x (t_{i - 1})) [y_{k} (t_{i - 1}) - y_{k} (t_{i - 2})]) v_{i - 1} (t_{i - 1}, x_{i - 1}) {d^{n} x_{i - 1}} . \end{cases}

(35)

Following the method originally used by Feynman, the partial differential equation satisfied by $\tilde{P}$(t, x|t₀, x₀) may be derived (see, for instance, [20]). In particular, $\tilde{P}$(t_i, x_i|t_{i-1}, x_{i-1}) is the fundamental solution of the Yau Equation (YYe):

\begin{cases} \frac{\partial \tilde{P}}{\partial t} (t, x | t_{0}, x_{0}) = \frac{ℏ_{ν}}{2} \sum_{i = 1}^{n} \frac{\partial^{2} \tilde{P}}{\partial x_{i}^{2}} (t, x | t_{0}, x_{0}) - \sum_{i = 1}^{n} \frac{\partial}{\partial x_{i}} [f_{i} (x) \tilde{P} (t, x | t_{0}, x_{0})] - \frac{1}{2 ℏ_{μ}} \sum_{k = 1}^{m} h_{k}^{2} (x) \tilde{P} (t, x | t_{0}, x_{0}), \\ \tilde{P} (t_{0}, x | t_{0}, x_{0}) = δ^{n} (x - x_{0}) . \end{cases}

(36)

This implies that, setting ħ_ν = ħ_μ = 1, v_i(t_i, x) is the solution at t_i of

\begin{cases} \frac{\partial v_{i}}{\partial t} (t, x) = \frac{1}{2} \sum_{l = 1}^{n} \frac{\partial^{2} v_{i}}{\partial x_{l}^{2}} (t, x) - \sum_{l = 1}^{n} f_{l} (x) \frac{\partial v_{i}}{\partial x_{l}} (t, x) - (\sum_{l = 1}^{n} \frac{\partial f_{l}}{\partial x_{l}} (x) + \frac{1}{2} \sum_{l = 1}^{m} h_{l}^{2} (x)) v_{i} (t, x), \\ v_{i} (t_{i - 1}, x) = v_{i - 1} (t_{i - 1}, x) \begin{cases} \exp (\sum_{j = 1}^{m} h_{j} (x) [y_{j} (t_{i}) - y_{j} (t_{i - 1})]), \\ \exp (\sum_{j = 1}^{m} h_{j} (x) [y_{j} (t_{i - 1}) - y_{j} (t_{i - 2})]) . \end{cases} \end{cases}

(37)

This is precisely the Yau algorithm.
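The two-part structure of Equation 37 can be sketched for a scalar state with ħ_ν = ħ_μ = 1. In this sketch the fundamental solution P̃ is replaced by a single-step, midpoint-discretized exponential of the action in Equation 27 (a short-time approximation in the spirit of what the paper calls the Dirac-Feynman approximation), which is an assumption that is only reasonable when t_i − t_{i-1} is small. The names `precompute_kernel` and `yau_step`, the uniform grid, and the final renormalization are likewise illustrative choices, and the first interpretation of the measurement increment, y(t_i) − y(t_{i-1}), is used.

```python
import numpy as np

def precompute_kernel(x, dt, f, df, h):
    """Short-time approximation of the kernel in Equation 26 on a uniform 1-D grid.

    Depends only on the model (f, df = df/dx, h) and the time step, not on the
    measurements, so it can be computed off-line. Assumes hbar_nu = hbar_mu = 1.
    """
    xi, xj = np.meshgrid(x, x, indexing="ij")   # xi: new state, xj: previous state
    xm = 0.5 * (xi + xj)                        # midpoint (symmetric) discretization
    action = 0.5 * dt * (((xi - xj) / dt - f(xm)) ** 2 + df(xm) + h(xm) ** 2)
    return np.exp(-action) / np.sqrt(2.0 * np.pi * dt)

def yau_step(v_prev, x, dy, kernel, h):
    """One measurement step: multiply by exp(h * dy), then propagate with the kernel."""
    dx = x[1] - x[0]
    v0 = v_prev * np.exp(h(x) * dy)   # measurements enter only via the initial condition
    v = (kernel @ v0) * dx            # Riemann-sum quadrature over the previous state
    return v / (np.sum(v) * dx)       # renormalize to a probability density
```

Because `precompute_kernel` is called once and `yau_step` only needs a pointwise multiplication and a matrix-vector product per measurement, the on-line cost in this sketch is independent of how the kernel was obtained.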

Likewise, it is straightforward to see that the Yau algorithm extends to the general case studied in Section 4.1 as follows: v_i(t_i, x_i) is the solution at t_i of the PDE

\begin{cases} \frac{\partial v_{i}}{\partial t} (t, x) = \frac{1}{2} \sum_{l, j = 1}^{n} \sum_{a, b = 1}^{p} (e_{l a} (t) Q_{a b} (t) e_{b j}^{T} (t)) \frac{\partial^{2} v_{i}}{\partial x_{l} \partial x_{j}} (t, x) - \sum_{l = 1}^{n} \frac{\partial}{\partial x_{l}} [f_{l} (x, t) v_{i} (t, x)] - \frac{1}{2} \sum_{k, l = 1}^{m} h_{k} (x) {(C^{- 1})}_{k l} h_{l} (x) v_{i} (t, x), \\ v_{i} (t_{i - 1}, x) = v_{i - 1} (t_{i - 1}, x) \begin{cases} \exp (\sum_{j, k = 1}^{m} h_{j} (x) {(C^{- 1})}_{j k} [y_{k} (t_{i}) - y_{k} (t_{i - 1})]), \\ \exp (\sum_{j, k = 1}^{m} h_{j} (x) {(C^{- 1})}_{j k} [y_{k} (t_{i - 1}) - y_{k} (t_{i - 2})]) . \end{cases} \end{cases}

(38)

This is a straightforward generalization of the YYe to the state model with explicit time dependence.

5 Some remarks

Following are some remarks on the FPI solution of the filtering problem:

Note that the FPI formulation has given a complete and self-contained solution of the continuous-continuous filtering problem. For instance, the DMZ equation or its variants were not used as an input. On the contrary, the FPI formula naturally led us to the YYe and the Yau algorithm.
Note also that the DMZ equation (and its variants) cannot be solved reliably in real time, because the various approximations assume that the drift is bounded and require the solution of a stochastic PDE that depends on the measurements. In contrast, the general FPI formula presented in Section 3 can potentially be used for an efficient and reliable real-time solution. This is because the measurement time interval is usually small, so that the simplest approximation of the path integral (termed the Dirac-Feynman approximation, see Section 6) is adequate. In contrast, in quantum mechanics and quantum field theory one is interested in the large time case.
Unlike the Yau algorithm, the PI formula is valid even for the general time-dependent case with a large measurement time interval. In other words, one can compute the conditional transition probability density using the conventional methods (see, for instance, [22]).
The YYe can be viewed as a local expression of the path integral formula. That is, a path integral is a global object, while the PDE is a local one.
In this paper, the signal noise and measurement noise are assumed to be additive. This is a stronger condition than the orthogonality of the diffusion vielbein assumed in [13]. The solution for the general case has been presented in [16].
It is noted that other algorithms can also be implemented using the FPI formulas with obvious changes. Usually, they require the solution of the FPKfe, which corresponds to the case h(x) = 0. Note that the FPKfe arises naturally in the solution of the continuous-discrete filtering problem [20]. Of course, as noted in Appendix 1, this would be unnecessary since the Yau algorithm has the best numerical properties. What is interesting to note is that the path integral formula naturally leads to the algorithm with the best numerical properties.
The Yau algorithm also has the form of a "prediction" part and a "correction" part, as in continuous-discrete filtering [20]. Specifically, the prediction part is the solution of the YYe, whereas the correction part is the multiplicative factor in the initial condition. However, it is crucial to note that the prediction part contains the measurement model drift. In contrast, the measurement model plays no role in the prediction part in continuous-discrete filtering.

6 Examples

6.1 The Dirac-Feynman Approximation

From the discussion in the previous sections, it is evident that computing the path integral requires computing the Lagrangian, L, defined by S = ∫ dt L(x, $\dot{x}$, t). The simplest (and crudest) approximation (for the case e(t)Q(t)e^T(t) = ħ_ν I_{n×n} and n(t)W(t)n^T(t) = ħ_μ I_{m×m}) is the following, which is valid for an infinitesimal time step:

\begin{array}{l} \tilde{P} (t^{″}, x^{″} | t^{'}, x^{'}) = \\ \exp \left( - \frac{t^{″} - t^{'}}{2} \left[ \frac{1}{ℏ_{ν}} \sum_{i = 1}^{n} {[\frac{{x^{″}}_{i} - {x^{'}}_{i}}{t^{″} - t^{'}} - f_{i} (\frac{x^{″} + x^{'}}{2})]}^{2} + \sum_{i = 1}^{n} \frac{\partial f_{i}}{\partial x_{i}} (\frac{x^{″} + x^{'}}{2}) + \frac{1}{ℏ_{μ}} \sum_{k = 1}^{m} h_{k}^{2} (\frac{x^{″} + x^{'}}{2}) \right] \right) . \end{array}

(39)

This is the Dirac-Feynman (DF) approximation. The algorithm that follows from applying the DF approximation to the Yau algorithm is the Dirac-Feynman-Yau (DFY) algorithm.
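As an illustration, Equation 39 can be evaluated directly for a scalar model (n = m = 1). The following Python sketch is not taken from the paper: the function name `df_kernel` and its argument names are hypothetical, and the caller is assumed to supply the drift f, its derivative, and the measurement function h as ordinary Python callables.

```python
import math


def df_kernel(x_new, x_old, dt, f, dfdx, h, hbar_nu=1.0, hbar_mu=1.0):
    """Unnormalized Dirac-Feynman weight of Equation 39 for n = m = 1."""
    x_mid = 0.5 * (x_new + x_old)      # midpoint at which f, df/dx and h are evaluated
    velocity = (x_new - x_old) / dt    # finite-difference estimate of dx/dt
    lagrangian = (
        (velocity - f(x_mid)) ** 2 / (2.0 * hbar_nu)
        + 0.5 * dfdx(x_mid)
        + h(x_mid) ** 2 / (2.0 * hbar_mu)
    )
    return math.exp(-dt * lagrangian)


# Drift-free, measurement-free case: the weight is a Gaussian in (x_new - x_old).
zero = lambda x: 0.0
print(df_kernel(0.1, 0.0, 0.01, zero, zero, zero))
```

With f = 0 and h = 0 the weight reduces to exp(−(x″ − x′)²/(2ħ_ν Δt)), the unnormalized Brownian kernel, which is a quick sanity check on the signs and factors of one half.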

6.2 Example 1

As an example, consider the following continuous-continuous filtering model that has been studied in [23] (and [24])

\begin{array}{l} \dot{x} (t) = a \cos (b x (t)) + σ_{x} ν (t), \\ \dot{y} (t) = \tan^{- 1} x (t) + σ_{y} μ (t) . \end{array}

(40)

The Lagrangian for this model is given by

\frac{1}{2 σ_{x}^{2}} {(\dot{x} (t) - a \cos (b x (t)))}^{2} - \frac{a b}{2} \sin (b x (t)) + \frac{1}{2 σ_{y}^{2}} {(\tan^{- 1} x (t))}^{2} .

(41)
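The terms of Equation 41 can be checked against the exponent of Equation 39. A short worked substitution, assuming the identifications ħ_ν = σ_x² and ħ_μ = σ_y² for this scalar model (n = m = 1):

```latex
\begin{aligned}
L &= \frac{1}{2\hbar_\nu}\left(\dot{x} - f(x)\right)^2
     + \frac{1}{2}\frac{\partial f}{\partial x}(x)
     + \frac{1}{2\hbar_\mu}\, h^2(x), \\
f(x) &= a\cos(bx), \qquad
\frac{\partial f}{\partial x}(x) = -ab\sin(bx), \qquad
h(x) = \tan^{-1} x, \\
L &= \frac{1}{2\sigma_x^2}\left(\dot{x} - a\cos(bx)\right)^2
     - \frac{ab}{2}\sin(bx)
     + \frac{1}{2\sigma_y^2}\left(\tan^{-1} x\right)^2 .
\end{aligned}
```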

The parameters chosen were as in the reference. Specifically, a = 1.2, b = 3, σ_x = 0.3, spatial grid spacing Δx = 0.01 and extent [-1.5, 1.5], temporal grid spacing Δt = 0.01 with 200 time steps.

In the first set, the measurement noise was set as σ_y = 0.05. Figure 1 shows a sample of state and measurement processes. The conditional mean, computed using the DFY algorithm, is plotted in Figure 2. Since there was negligible difference in performance between the pre-measurement and post-measurement forms, only the former was employed. Also plotted are 2σ limits. The conditional mean and variance were computed from the computed conditional probability density. The fact that the target was mostly within the 2σ limits of the conditional mean shows that the tracking performance of this algorithm is good.

In the next set, the measurement noise was set as σ_y = 0.0125. For this "small noise" case, most of the algorithms studied in [23] failed. Figure 3 shows a sample of signal and measurement processes. The conditional mean and the 2σ limits computed using the DFY algorithm for this instance are plotted in Figure 4.

It is seen that good tracking performance is maintained for the small noise case even when the crudest path integral approximation is used in the Yau algorithm.
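The procedure described for Example 1 can be sketched in a few lines of Python. This is an illustrative reconstruction, not the author's code. It makes several assumptions: the model is simulated with an Euler-Maruyama step, the prior is flat on the grid, the post-measurement form is used for brevity, the correction factor is exp(h(x)Δy/σ_y²) (Equation 38 with C = σ_y²), and the density is renormalized at every step so that the missing constant prefactor of the kernel does not matter. The function name and all argument names are hypothetical.

```python
import math
import random


def dfy_filter_example1(a=1.2, b=3.0, sigma_x=0.3, sigma_y=0.05,
                        dx=0.01, x_min=-1.5, x_max=1.5,
                        dt=0.01, n_steps=200, x0=0.5, seed=0):
    """Simulate Equation 40 and filter it with a single-step DF kernel."""
    rng = random.Random(seed)
    n_grid = int(round((x_max - x_min) / dx)) + 1
    grid = [x_min + i * dx for i in range(n_grid)]
    h_grid = [math.atan(x) for x in grid]

    def lagrangian(x_new, x_old):
        # Equation 41 at the midpoint, with a finite-difference velocity.
        x_mid = 0.5 * (x_new + x_old)
        velocity = (x_new - x_old) / dt
        return ((velocity - a * math.cos(b * x_mid)) ** 2 / (2.0 * sigma_x ** 2)
                - 0.5 * a * b * math.sin(b * x_mid)
                + math.atan(x_mid) ** 2 / (2.0 * sigma_y ** 2))

    # kernel[i][j] approximates the transition weight from grid[j] to grid[i].
    kernel = [[math.exp(-dt * lagrangian(x_new, x_old)) for x_old in grid]
              for x_new in grid]

    density = [1.0 / (n_grid * dx)] * n_grid  # flat prior (an assumption)
    x_true = x0
    states, means, variances = [], [], []
    for _ in range(n_steps):
        # Euler-Maruyama step of the measurement and state processes.
        dy = math.atan(x_true) * dt + sigma_y * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x_true += (a * math.cos(b * x_true) * dt
                   + sigma_x * math.sqrt(dt) * rng.gauss(0.0, 1.0))

        # Correction: multiply by exp(h(x) dy / sigma_y^2).
        corrected = [u * math.exp(h * dy / sigma_y ** 2)
                     for u, h in zip(density, h_grid)]
        # Prediction: integrate the DF kernel against the corrected density.
        density = [sum(k * u for k, u in zip(row, corrected)) * dx
                   for row in kernel]

        norm = sum(density) * dx
        density = [u / norm for u in density]
        mean = sum(x * u for x, u in zip(grid, density)) * dx
        var = sum((x - mean) ** 2 * u for x, u in zip(grid, density)) * dx
        states.append(x_true)
        means.append(mean)
        variances.append(var)
    return states, means, variances
```

Calling `dfy_filter_example1()` uses the grid and time step quoted above; passing `sigma_y=0.0125` gives the small-noise setting. The pure-Python loops are chosen for self-containment, and an array library would be the natural choice for finer grids or higher dimensions.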

6.3 Example 2: Cubic Sensor Problem

The cubic sensor problem is defined by the following signal and measurement model:

\begin{array}{l} \dot{x} (t) = σ_{x} ν (t), \\ \dot{y} (t) = x^{3} (t) + σ_{y} μ (t) . \end{array}

(42)

It is a well-studied nonlinear filtering problem because it is one of the simplest examples of a filtering problem that is not finite dimensional (see, for instance, [25] and references therein).

For the simulation of the cubic sensor problem, the following model parameters were chosen (as in [25])

\begin{matrix} Δ t = 0.001, \\ Δ x = 0.01, \\ x_{0} = N (0, 0.01), \\ σ_{x} = 1, \\ σ_{y} = 1, \\ T = [0, 20] . \end{matrix}

(43)

The EKF is a sub-optimal filter which approximates the conditional probability by a Gaussian. For the cubic sensor problem the EKF is given by

\begin{matrix} d \hat{x} = 3 {\hat{x}}^{2} P (d y - {\hat{x}}^{3} d t), \\ d P = (1 - 9 {\hat{x}}^{4} P^{2}) d t . \end{matrix}

(44)
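For reference, Equation 44 can be stepped forward with an explicit Euler discretization. The sketch below is an illustration under that assumed discretization (the paper does not state how the EKF was integrated); the function name `ekf_cubic_step` is hypothetical. It also shows one simple observation, which is not the explanation given in [25]: at x̂ = 0 the gain 3x̂²P vanishes, so an estimate started at the prior mean of Equation 43 does not respond to the measurement increment.

```python
def ekf_cubic_step(x_hat, p, dy, dt):
    """One explicit Euler step of Equation 44 (sigma_x = sigma_y = 1)."""
    gain = 3.0 * x_hat ** 2 * p
    x_next = x_hat + gain * (dy - x_hat ** 3 * dt)
    p_next = p + (1.0 - 9.0 * x_hat ** 4 * p ** 2) * dt
    return x_next, p_next


# Starting from x_hat = 0 the gain is zero: the estimate stays at 0 while
# the variance grows by dt, whatever the measurement increment dy is.
x_hat, p = ekf_cubic_step(0.0, 0.01, dy=0.5, dt=0.001)
print(x_hat, p)
```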

The Yau Lagrangian for the cubic sensor problem is

L = \frac{1}{2 σ_{x}^{2}} ({\dot{x}}^{2}) + \frac{1}{2 σ_{y}^{2}} x^{6} .

(45)

The DF approximation for the Yau Lagrangian is

\tilde{P} (t^{″}, x^{″} | t^{'}, x^{'}) = \exp (- \frac{(t^{″} - t^{'})}{2 σ_{x}^{2}} {(\frac{x^{″} - x^{'}}{t^{″} - t^{'}})}^{2} - (\frac{t^{″} - t^{'}}{2 σ_{y}^{2}}) {(\frac{x^{″} + x^{'}}{2})}^{6}) .

(46)

Figure 5 shows the performance of the DFY algorithm. Specifically, the conditional mean along with the two standard deviation bounds computed using the computed conditional probability density is plotted. Observe that the EKF fails completely in this case. As noted in [25], this is because the EKF considers only the first two moments (which vanish here); it is the fourth central moment that plays a crucial role in this example (for the chosen initial condition). Also, note that the state is within the 2σ region for most of the time. This shows that, unlike the EKF, the path integral filter has a reliable error analysis.

After an initial period, the performance of the path integral approximation is seen to be excellent and comparable to that obtained using PDE methods in [25]. However, the crucial point is that the path integral solution is equally simple for the higher-dimensional case with more complicated models (e.g., colored noise), whereas a PDE solution would be significantly harder, if not impossible, to implement in real-time.

6.4 Comments

It is remarkable to note that very good performance is obtained using the crudest approximation. Of course, when the time step is large, it will fail (unlike the path integral formula itself). However, the practical situation is that the time steps are often small. Therefore, the DF approximation may be adequate in most cases.

The implementation of this method is trivial. The contrast with other methods, such as those studied in [23], is striking. For instance, many of those methods require off-line computation of complicated partial differential equations with uncertain numerical properties.

The results obtained in this paper used a single time step. More accuracy can be obtained quite simply using multiple time steps. Also, the computation of the transition probability density can be done off-line, but the on-line computation was not an onerous burden for the examples studied here.

It is important to note also that the transition probability density matrix (or tensor in the general case) is sparse (sparsity determined by ħ_μ, ħ_ν). This is of great importance in higher dimensional filtering problems because

Sparse matrix storage requirements are small,
The relevant transition probability density matrix elements can be computed based on the conditional density in the previous step, and
Sparse matrix computations are very fast.
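The sparsity can be made concrete by thresholding the DF weights. The sketch below is an illustration only: the tolerance `tol`, the dictionary-per-row storage and the function names are assumptions, and the drift-free, measurement-free Lagrangian is used as a simple test case with the grid and noise level of Example 1.

```python
import math


def sparse_df_rows(grid, dt, lagrangian, tol=1e-12):
    """Keep only DF weights of at least `tol`, one {column: weight} dict per row."""
    rows = []
    for x_new in grid:
        row = {}
        for j, x_old in enumerate(grid):
            w = math.exp(-dt * lagrangian(x_new, x_old, dt))
            if w >= tol:
                row[j] = w
        rows.append(row)
    return rows


def free_lagrangian(x_new, x_old, dt, sigma_x=0.3):
    """Lagrangian with f = 0 and h = 0, used here only as a test case."""
    velocity = (x_new - x_old) / dt
    return velocity ** 2 / (2.0 * sigma_x ** 2)


grid = [-1.5 + 0.01 * i for i in range(301)]
rows = sparse_df_rows(grid, dt=0.01, lagrangian=free_lagrangian)
stored = sum(len(row) for row in rows)
print(stored, "of", len(grid) ** 2, "entries stored")
```

This version still evaluates every pair of grid points in order to stay short; an implementation aimed at higher dimensions would loop only over the band whose width is set by the noise level and the time step.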

Note that unlike some other approximation techniques studied in [23], the conditional probability density is obviously always positive (provided, of course, that the initial distribution is positive).

Finally, a comment on measures of performance. Note that a good tracker is one that furnishes not only a good estimate of a state but also provides a reliable measure of the quality of the estimate. For the linear, Gaussian case, the conditional probability density is Gaussian, so the conditional mean and variance provide a complete description, and the Kalman estimates are optimal. However, for the general nonlinear case, such a concise description is not possible. For instance, the conditional probability density may be highly skewed. It may be multi-modal, in which case the conditional mean is not a very meaningful quantity. A more general measure is to indicate domains of "significant" probability mass; a good filtering solution is then one that guarantees that the state is in the region of significant probability mass with a very high confidence. For the purposes of the paper, the conditional variance was chosen.
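One way to make the idea of a region of "significant" probability mass concrete on a grid is to collect the highest-density cells until a chosen fraction of the mass is covered. The sketch below is an illustration of that idea, not a procedure from the paper; the function name, the `level` parameter and the toy bimodal density are hypothetical.

```python
def summarize_density(grid, density, dx, level=0.95):
    """Return the mean, variance and highest-density cell indices of a grid density."""
    norm = sum(density) * dx
    p = [u / norm for u in density]
    mean = sum(x * u for x, u in zip(grid, p)) * dx
    var = sum((x - mean) ** 2 * u for x, u in zip(grid, p)) * dx

    # Visit cells from most to least probable until `level` of the mass is covered.
    order = sorted(range(len(grid)), key=lambda k: -p[k])
    region, mass = [], 0.0
    for k in order:
        region.append(k)
        mass += p[k] * dx
        if mass >= level:
            break
    return mean, var, sorted(region)


# Toy bimodal density: the mean falls between the modes, outside the region.
grid = [0.0, 1.0, 2.0, 3.0, 4.0]
density = [0.4, 0.05, 0.1, 0.05, 0.4]
mean, var, region = summarize_density(grid, density, dx=1.0, level=0.75)
print(mean, var, region)
```

In the toy example the conditional mean sits at the centre cell while the reported region consists of the two outer cells, which is the situation described above in which the mean alone is not a meaningful summary.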

7 Conclusion

In this paper, the formal path integral solution to the continuous-continuous nonlinear filtering problem has been presented. The solution is universal, i.e., the initial distribution may be arbitrary. Since the path integral measure is manifestly positive, positivity is maintained if the initial distribution is positive.

A path integral formulation has several advantages. It is well known that Feynman path integrals have led to theoretical insights in other areas including quantum mechanics, quantum field theory and even pure mathematics. It is demonstrated in this paper that it is possible to express the fundamental solution of the YYe in terms of Feynman path integrals. Finally, Feynman path integrals are very suitable for numerical implementation. Practical path integral filtering techniques, especially for solving large dimensional problems, will be presented in subsequent papers.

Appendix 1 continuous-continuous filtering and the Yau equation

In this section, the main results of (continuous-continuous) nonlinear filtering theory are summarized. For the general case (e.g., not the finite-dimensional filter case) and from a practical point of view the most important results are the YYe and the Yau algorithm.

A.1 The Continuous-Continuous Model

The signal and observation model in continuous-continuous filtering is the following:

\left\{ \begin{array}{ll} d x (t) = f (x (t), t) d t + e (x (t), t) d v (t), & x (0) = x_{0}, \\ d y (t) = h (x (t), t) d t + n (t) d w (t), & y (0) = 0. \end{array} \right.

(47)

Here x, v, y, and w are ℝⁿ, ℝ^p, ℝ^m and ℝ^q valued stochastic processes, respectively, and e(x(t), t) ∈ ℝ^{n×p} and n(t) ∈ ℝ^{m×q}. These are defined in the Itô sense. The Brownian processes v and w are assumed to be independent with p × p and q × q covariance matrices Q(t) and W(t), respectively. We denote n(t)W(t)n^T(t) by R(t), an m × m matrix. Also, f is referred to as the drift, e as the diffusion vielbein, and eQe^T as the diffusion matrix.

In this section, some of the relevant work on continuous-continuous filtering is summarized. Hence, it is assumed that n = p, m = q, f and h are vector-valued C^∞ smooth functions, e(x, t) is an orthogonal matrix-valued C^∞ smooth function, Q(t) is an n × n identity matrix, and n(t) and R(t) are m × m identity matrices. No explicit time dependence is assumed in the model.

A.2 The DMZ Stochastic Differential Equation

The unnormalized conditional probability density, σ (t, x) of the state given the observations {Y (s): 0 ≤ s ≤ t} satisfies the DMZ equation:

\begin{matrix} d σ (t, x) = L_{0} σ (t, x) d t + \sum_{i = 1}^{m} L_{i} σ (t, x) d y_{i} (t), & σ (0, x) = σ_{0} . \end{matrix}

(48)

Here

\begin{matrix} L_{0} = \frac{1}{2} \sum_{i = 1}^{n} \frac{\partial^{2}}{\partial x_{i}^{2}} - \sum_{i = 1}^{n} f_{i} (x) \frac{\partial}{\partial x_{i}} - \sum_{i = 1}^{n} \frac{\partial f_{i}}{\partial x_{i}} (x) - \frac{1}{2} \sum_{i = 1}^{m} h_{i}^{2} (x), \\ = - \sum_{i = 1}^{n} \frac{\partial}{\partial x_{i}} (f_{i} (x) \cdot) + \frac{1}{2} \sum_{i = 1}^{n} \frac{\partial^{2}}{\partial x_{i}^{2}} - \frac{1}{2} \sum_{i = 1}^{m} h_{i}^{2} (x), \\ = \frac{1}{2} (\sum_{i = 1}^{n} D_{i}^{2} - η (x)), \end{matrix}

(49)

where $L_{i}$ is the zero-degree differential operator of multiplication by h_i(x), i = 1, ..., m, σ₀ is the probability density at the initial time t₀, and

\begin{matrix} D_{i} \equiv \frac{\partial}{\partial x_{i}} - f_{i} (x), \\ η (x) \equiv \sum_{i = 1}^{n} f_{i}^{2} (x) + \sum_{i = 1}^{n} \frac{\partial f_{i}}{\partial x_{i}} (x) + \sum_{i = 1}^{m} h_{i}^{2} (x) . \end{matrix}

(50)
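The equality of the first and last lines of Equation 49 can be verified by acting with D_i² on a smooth test function φ (an auxiliary symbol introduced here only for this check):

```latex
\begin{aligned}
D_i^2 \varphi
 &= \left(\frac{\partial}{\partial x_i} - f_i\right)
    \left(\frac{\partial \varphi}{\partial x_i} - f_i \varphi\right)
  = \frac{\partial^2 \varphi}{\partial x_i^2}
    - 2 f_i \frac{\partial \varphi}{\partial x_i}
    - \frac{\partial f_i}{\partial x_i}\,\varphi
    + f_i^2 \varphi, \\
\frac{1}{2}\left(\sum_{i=1}^{n} D_i^2 - \eta\right)\varphi
 &= \frac{1}{2}\sum_{i=1}^{n}\frac{\partial^2 \varphi}{\partial x_i^2}
    - \sum_{i=1}^{n} f_i \frac{\partial \varphi}{\partial x_i}
    - \sum_{i=1}^{n}\frac{\partial f_i}{\partial x_i}\,\varphi
    - \frac{1}{2}\sum_{i=1}^{m} h_i^2\, \varphi .
\end{aligned}
```

The f_i² terms from D_i² cancel against those in η, and the two half-weight divergence terms add up to the full divergence term in L₀.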

The DMZ equation is to be interpreted in the Stratonovich sense. Note that

\begin{matrix} D_{i} = \frac{\partial}{\partial x_{i}} - f_{i} (x), \end{matrix}

(51)

\begin{matrix} = e^{F_{i}} \frac{\partial}{\partial x_{i}} e^{- F_{i}}, & where & F_{i} = F_{i} (x) = \int_{0}^{x} f_{i} (t) d t . \end{matrix}

(52)

Hence,

D_{i}^{2} = e^{F_{i}} \frac{\partial^{2}}{\partial x_{i}^{2}} e^{- F_{i}},

(53)

and

L_{0} = \frac{1}{2} (\sum_{i = 1}^{n} e^{F_{i}} \frac{\partial^{2}}{\partial x_{i}^{2}} e^{- F_{i}} - η (x)) .

(54)
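For a single variable (n = 1, with F′(x) = f(x)), the factorization in Equations 52 and 53 follows from the product rule. A short check on a test function φ (introduced here only for illustration):

```latex
\begin{aligned}
e^{F}\frac{\partial}{\partial x}\left(e^{-F}\varphi\right)
 &= e^{F}\left(-f\,e^{-F}\varphi + e^{-F}\frac{\partial \varphi}{\partial x}\right)
  = \frac{\partial \varphi}{\partial x} - f\varphi
  = D\varphi, \\
D^2 \varphi
 &= e^{F}\frac{\partial}{\partial x}\,e^{-F}e^{F}\,
    \frac{\partial}{\partial x}\left(e^{-F}\varphi\right)
  = e^{F}\frac{\partial^2}{\partial x^2}\left(e^{-F}\varphi\right).
\end{aligned}
```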

A.3 The Robust DMZ Partial Differential Equation

Following [11] and [12], introduce a new unnormalized density

u (t, x) = \exp (- \sum_{i = 1}^{m} h_{i} (x) y_{i} (t)) σ (t, x) .

(55)

Under this transformation, the DMZ SDE is transformed into the following time-varying PDE

\left\{ \begin{array}{l} \frac{\partial u}{\partial t} (t, x) = \frac{1}{2} \sum_{i = 1}^{n} \frac{\partial^{2} u}{\partial x_{i}^{2}} (t, x) + \sum_{i = 1}^{n} (- f_{i} (x) + \sum_{j = 1}^{m} y_{j} (t) \frac{\partial h_{j}}{\partial x_{i}} (x)) \frac{\partial u}{\partial x_{i}} (t, x) \\ - \Big( \sum_{i = 1}^{n} \frac{\partial f_{i}}{\partial x_{i}} (x) + \frac{1}{2} \sum_{i = 1}^{m} h_{i}^{2} (x) - \frac{1}{2} \sum_{i = 1}^{m} y_{i} (t) Δ h_{i} (x) + \sum_{i = 1}^{m} \sum_{j = 1}^{n} y_{i} (t) f_{j} (x) \frac{\partial h_{i}}{\partial x_{j}} (x) \\ - \frac{1}{2} \sum_{i, j = 1}^{m} \sum_{k = 1}^{n} y_{i} (t) y_{j} (t) \frac{\partial h_{i}}{\partial x_{k}} (x) \frac{\partial h_{j}}{\partial x_{k}} (x) \Big) u (t, x), \\ u (0, x) = σ_{0} (x) . \end{array} \right.

(56)

This is called the robust DMZ equation. Here Δ is the Laplacian. The solution of a PDE when the initial condition is a delta function is called the fundamental solution.

A.4 The Yau Equation

Recently, it was proved that the real-time solution of the general nonlinear filtering problem can be obtained reliably [13, 26]. Let $P_{k}$ = {τ₀ < τ₁ < ⋯ < τ_k = τ} be a partition of the time interval [τ₀, τ], and let the norm of the partition $P_{k}$ be defined as | $P_{k}$ | = sup_{1≤i≤k} |τ_i - τ_{i-1}|. If u_l(t, x) satisfies the equation

\left\{ \begin{array}{l} \frac{\partial u_{l}}{\partial t} (t, x) = \frac{1}{2} \sum_{i = 1}^{n} \frac{\partial^{2} u_{l}}{\partial x_{i}^{2}} (t, x) + \sum_{i = 1}^{n} (- f_{i} (x) + \sum_{j = 1}^{m} y_{j} (τ_{l}) \frac{\partial h_{j}}{\partial x_{i}} (x)) \frac{\partial u_{l}}{\partial x_{i}} (t, x) \\ - \Big( \sum_{i = 1}^{n} \frac{\partial f_{i}}{\partial x_{i}} (x) + \frac{1}{2} \sum_{i = 1}^{m} h_{i}^{2} (x) - \frac{1}{2} \sum_{i = 1}^{m} y_{i} (τ_{l}) Δ h_{i} (x) + \sum_{i = 1}^{m} \sum_{j = 1}^{n} y_{i} (τ_{l}) f_{j} (x) \frac{\partial h_{i}}{\partial x_{j}} (x) \\ - \frac{1}{2} \sum_{i, j = 1}^{m} \sum_{k = 1}^{n} y_{i} (τ_{l}) y_{j} (τ_{l}) \frac{\partial h_{i}}{\partial x_{k}} (x) \frac{\partial h_{j}}{\partial x_{k}} (x) \Big) u_{l} (t, x), \\ u_{l} (τ_{l - 1}, x) = u_{l - 1} (τ_{l - 1}, x), \end{array} \right.

(57)

in the time interval τ_{l-1} ≤ t ≤ τ_l, then the function ${\tilde{u}}_{l}$ (t, x) defined as

{\tilde{u}}_{l} (t, x) = \exp (\sum_{i = 1}^{m} y_{i} (τ_{l}) h_{i} (x)) u_{l} (t, x)

(58)

satisfies the parabolic partial differential equation

\begin{matrix} \frac{\partial {\tilde{u}}_{l}}{\partial t} (t, x) = \frac{1}{2} \sum_{i = 1}^{n} \frac{\partial^{2} {\tilde{u}}_{l}}{\partial x_{i}^{2}} (t, x) - \sum_{i = 1}^{n} f_{i} (x) \frac{\partial {\tilde{u}}_{l}}{\partial x_{i}} (t, x) - (\sum_{i = 1}^{n} \frac{\partial f_{i}}{\partial x_{i}} (x) + \frac{1}{2} \sum_{i = 1}^{m} h_{i}^{2} (x)) {\tilde{u}}_{l} (t, x), \\ = - \sum_{i = 1}^{n} \frac{\partial}{\partial x_{i}} [f_{i} (x) {\tilde{u}}_{l} (t, x)] + \frac{1}{2} \sum_{i = 1}^{n} \frac{\partial^{2} {\tilde{u}}_{l}}{\partial x_{i}^{2}} (t, x) - \frac{1}{2} \sum_{i = 1}^{m} h_{i}^{2} (x) {\tilde{u}}_{l} (t, x), \end{matrix}

(59)

in the same time interval. The converse of the statement is also true. In [27], it was also noted that it is sufficient to use the previous observation, i.e., u_l(t, x) satisfies Equation 57 if and only if ${\tilde{u}}_{l}$ (t, x) defined as

{\tilde{u}}_{l} (t, x) = \exp (\sum_{i = 1}^{m} y_{i} (τ_{l - 1}) h_{i} (x)) u_{l} (t, x)

(60)

satisfies Equation 59 in the time interval τ_{l-1} ≤ t ≤ τ_l. We refer to Equations 58 (60) and Equation 59 as the post-measurement (pre-measurement) forms of the YYe.

Observe that Equation 57 is obtained by setting y(t) to y(τ_l) in Equation 56. It was proved that the solution of Equation 57 approximates very well the solution of the robust DMZ equation (Equation 56), i.e., it converges to u(t, x) in both pointwise sense and L² sense. Thus, solving Equation 56 is equivalent to solving Equation 59. Finally,

σ (τ, x) = \lim_{| P_{k} | \to 0} {\tilde{u}}_{k} (τ_{k}, x) .

(61)

Thus, the solution of the YYe (as | $P_{k}$ | → 0) is the desired unnormalized conditional probability density.

Observe that when h(x) = 0, it is simply the FPKfe. However, unlike the FPKfe, the YYe does not satisfy the current conservation condition, i.e., the right-hand term is not a total divergence. This means that it does not conserve probability. This fundamental difference is traced to the fact that the FPKfe evolves the normalized probability density (and preserves the normalization), while the YYe evolves the unnormalized conditional probability density. Therefore, this distinction is made between the two equations in this paper.
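The loss of normalization can be made explicit by integrating the second form of Equation 59 over ℝⁿ. The following is a sketch, assuming that ũ_l is non-negative and that it and its probability flux vanish at infinity, so that the divergence and Laplacian terms integrate to zero:

```latex
\frac{d}{dt}\int_{\mathbb{R}^n} \tilde{u}_l(t,x)\,dx
 = -\frac{1}{2}\int_{\mathbb{R}^n} \sum_{i=1}^{m} h_i^2(x)\,\tilde{u}_l(t,x)\,dx
 \;\le\; 0 .
```

For h = 0 the right-hand side vanishes and the total probability is conserved, as for the FPKfe.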

A.5 The Yau Algorithm

We may summarize the real-time algorithm, based on both the pre- and post-measurement forms of the YYe, of Yau as follows. Suppose measurements are available at times

⋯ < τ₀ < τ₁ < τ₂ < ⋯ < τ_k = τ.

We seek the solution u_i(t, x), which is the solution of the robust DMZ equation. Let the initial distribution be u(τ₀, x) = σ₀ (x). Then, evolve the initial distribution to the first measurement instant, τ₁, using the YYe:

\left\{ \begin{array}{l} \frac{\partial {\tilde{u}}_{1}}{\partial t} (t, x) = \frac{1}{2} \sum_{i = 1}^{n} \frac{\partial^{2} {\tilde{u}}_{1}}{\partial x_{i}^{2}} (t, x) - \sum_{i = 1}^{n} f_{i} (x) \frac{\partial {\tilde{u}}_{1}}{\partial x_{i}} (t, x) - (\sum_{i = 1}^{n} \frac{\partial f_{i}}{\partial x_{i}} (x) + \frac{1}{2} \sum_{i = 1}^{m} h_{i}^{2} (x)) {\tilde{u}}_{1} (t, x), \\ {\tilde{u}}_{1} (τ_{0}, x) = σ_{0} (x) \left\{ \begin{array}{ll} \exp (\sum_{j = 1}^{m} [y_{j} (τ_{0}) - y_{j} (τ_{- 1})] h_{j} (x)) & \text{(Pre-Measurement)} \\ \exp (\sum_{j = 1}^{m} [y_{j} (τ_{1}) - y_{j} (τ_{0})] h_{j} (x)) & \text{(Post-Measurement)} . \end{array} \right. \end{array} \right.

(63)

The solution of Equation 63 at time τ₁ is ${\tilde{u}}_{1}$ (τ₁, x). Note that u₁ (τ₁, x) is given by

u_{1} (τ_{1}, x) = {\tilde{u}}_{1} (τ_{1}, x) \left\{ \begin{array}{ll} \exp (- \sum_{i = 1}^{m} y_{i} (τ_{0}) h_{i} (x)) & \text{(Pre-Measurement)} \\ \exp (- \sum_{i = 1}^{m} y_{i} (τ_{1}) h_{i} (x)) & \text{(Post-Measurement)} . \end{array} \right.

(64)

Next, solve the YYe to the next measurement instant τ₂ with initial condition ${\tilde{u}}_{2}$ (τ₁, x), i.e.,

\left\{ \begin{array}{l} \frac{\partial {\tilde{u}}_{2}}{\partial t} (t, x) = \frac{1}{2} \sum_{i = 1}^{n} \frac{\partial^{2} {\tilde{u}}_{2}}{\partial x_{i}^{2}} (t, x) - \sum_{i = 1}^{n} f_{i} (x) \frac{\partial {\tilde{u}}_{2}}{\partial x_{i}} (t, x) - (\sum_{i = 1}^{n} \frac{\partial f_{i}}{\partial x_{i}} (x) + \frac{1}{2} \sum_{i = 1}^{m} h_{i}^{2} (x)) {\tilde{u}}_{2} (t, x), \\ {\tilde{u}}_{2} (τ_{1}, x) = {\tilde{u}}_{1} (τ_{1}, x) \left\{ \begin{array}{ll} \exp (\sum_{j = 1}^{m} (y_{j} (τ_{1}) - y_{j} (τ_{0})) h_{j} (x)) & \text{(Pre-Measurement)} \\ \exp (\sum_{j = 1}^{m} (y_{j} (τ_{2}) - y_{j} (τ_{1})) h_{j} (x)) & \text{(Post-Measurement)}, \end{array} \right. \end{array} \right.

(65)

to obtain ${\tilde{u}}_{2}$ (τ₂, x). In fact, for i ≥ 2, u_i(τ_i, x) can be computed from ${\tilde{u}}_{i}$ (τ_i, x), where ${\tilde{u}}_{i}$ (t, x) satisfies the equation

\left\{ \begin{array}{l} \frac{\partial {\tilde{u}}_{i}}{\partial t} (t, x) = \frac{1}{2} \sum_{j = 1}^{n} \frac{\partial^{2} {\tilde{u}}_{i}}{\partial x_{j}^{2}} (t, x) - \sum_{j = 1}^{n} f_{j} (x) \frac{\partial {\tilde{u}}_{i}}{\partial x_{j}} (t, x) - (\sum_{j = 1}^{n} \frac{\partial f_{j}}{\partial x_{j}} (x) + \frac{1}{2} \sum_{j = 1}^{m} h_{j}^{2} (x)) {\tilde{u}}_{i} (t, x), \\ {\tilde{u}}_{i} (τ_{i - 1}, x) = {\tilde{u}}_{i - 1} (τ_{i - 1}, x) \left\{ \begin{array}{ll} \exp (\sum_{j = 1}^{m} (y_{j} (τ_{i - 1}) - y_{j} (τ_{i - 2})) h_{j} (x)) & \text{(Pre-Measurement)} \\ \exp (\sum_{j = 1}^{m} (y_{j} (τ_{i}) - y_{j} (τ_{i - 1})) h_{j} (x)) & \text{(Post-Measurement)} . \end{array} \right. \end{array} \right.

(66)

The initial condition in Equation 66 follows from noting that (since u_i(τ_{i-1}, x) = u_{i-1}(τ_{i-1}, x))

\begin{matrix} {\tilde{u}}_{i} (τ_{i - 1}, x) = u_{i} (τ_{i - 1}, x) \exp (\sum_{j = 1}^{m} y_{j} (τ_{i - 1}) h_{j} (x)), \\ \begin{matrix} = \exp (- \sum_{j = 1}^{m} y_{j} (τ_{i - 2}) h_{j} (x)) {\tilde{u}}_{i - 1} (τ_{i - 1}, x) \exp (\sum_{j = 1}^{m} y_{j} (τ_{i - 1}) h_{j} (x)), & \text{(Pre-Measurement)} \end{matrix} \end{matrix}

(67)

\begin{matrix} {\tilde{u}}_{i} (τ_{i - 1}, x) = u_{i} (τ_{i - 1}, x) \exp (\sum_{j = 1}^{m} y_{j} (τ_{i}) h_{j} (x)), \\ \begin{matrix} = \exp (- \sum_{j = 1}^{m} y_{j} (τ_{i - 1}) h_{j} (x)) {\tilde{u}}_{i - 1} (τ_{i - 1}, x) \exp (\sum_{j = 1}^{m} y_{j} (τ_{i}) h_{j} (x)), & \text{(Post-Measurement)} . \end{matrix} \end{matrix}

(68)

Thus the solution of the robust DMZ equation at the ith step is given by

u_{i} (τ_{i}, x) = {\tilde{u}}_{i} (τ_{i}, x) \left\{ \begin{array}{ll} \exp (- \sum_{j = 1}^{m} y_{j} (τ_{i - 1}) h_{j} (x)) & \text{(Pre-Measurement)} \\ \exp (- \sum_{j = 1}^{m} y_{j} (τ_{i}) h_{j} (x)) & \text{(Post-Measurement)} . \end{array} \right.

(69)

Note that the Yau algorithm is a recursive algorithm, as each step uses only the most recent measurement increment and does not need any earlier observation data. Furthermore, the YYe is independent of data and so can be computed off-line, and the YYe is much simpler than the robust DMZ equation. Finally, note that the output of the Yau algorithm is the desired unnormalized conditional probability density.
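The recursion of Equations 66 and 69 can be summarized as a single step function. The sketch below is an illustration for a scalar measurement (m = 1) on a grid; the function name `yau_step`, its arguments, and the `propagate` callable (standing in for any solver of the YYe over one measurement interval) are hypothetical and not part of the paper.

```python
import math


def yau_step(u_tilde_prev, propagate, h_grid, y_prev2, y_prev, y_curr, form="post"):
    """One recursion of Equations 66 and 69 for a scalar measurement (m = 1).

    u_tilde_prev : values of u~_{i-1}(tau_{i-1}, x) on the grid
    propagate    : callable that evolves a grid function over [tau_{i-1}, tau_i]
                   with the YYe (any solver, e.g. a DF kernel)
    h_grid       : values of h(x) on the grid
    y_prev2, y_prev, y_curr : y(tau_{i-2}), y(tau_{i-1}), y(tau_i)
    """
    if form == "pre":
        dy, y_ref = y_prev - y_prev2, y_prev
    elif form == "post":
        dy, y_ref = y_curr - y_prev, y_curr
    else:
        raise ValueError("form must be 'pre' or 'post'")

    # Initial condition of Equation 66: multiply by exp(dy * h(x)).
    start = [u * math.exp(dy * h) for u, h in zip(u_tilde_prev, h_grid)]
    # Evolve with the YYe to the next measurement instant.
    u_tilde = propagate(start)
    # Equation 69: recover the solution of the robust DMZ equation.
    u = [ut * math.exp(-y_ref * h) for ut, h in zip(u_tilde, h_grid)]
    return u_tilde, u


# Toy usage with an identity propagator on a two-point grid.
identity = lambda values: list(values)
u_tilde, u = yau_step([1.0, 1.0], identity, [0.0, 1.0], 0.0, 0.5, 1.5, form="post")
print(u_tilde, u)
```

Keeping the YYe solver behind `propagate` reflects the point made above that the YYe itself does not depend on the data: only the multiplicative factors before and after the propagation involve the measurements.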

Note that while there are several other possible real-time solutions to the continuous-continuous nonlinear filtering problem, all of them assume that the signal and measurement model drifts are bounded (see remarks in [26] and [28]). As a result, those algorithms cannot provide a reliable solution to many real-world problems.

References

Jazwinski AH: Stochastic Processes and Filtering Theory. 2007, Dover Publications
Kalman RE: Trans ASME J Basic Eng. 1960, 82D: 35-45.
Kalman RE, Bucy RS: Trans ASME J Basic Eng. 1961, 83: 95-108.
Kushner HJ: SIAM Journal of Control. 1964, 2: 106-119.
Kushner HJ: SIAM Journal of Control. 1964, 2: 106-119.
Stratonovich RL: Theor Prob Appl. 1960, 5: 156-178. 10.1137/1105015.
Bucy RS: IEEE Transactions on Automatic Control. 1965, 10 (2): 198. 10.1109/TAC.1965.1098109.
Duncan TE: Probability densities for diffusion processes with applications to nonlinear filtering theory. PhD thesis. 1967, Stanford University
Mortensen RE: Optimal control of continuous time stochastic systems. PhD thesis. 1966, University of California, Berkeley
Zakai M: Z Wahrsch Verw Geb. 1969, 11: 230-243. 10.1007/BF00536382.
Davis MHA: Z Wahrsch Verw Gebiete. 1980, 54 (2): 125-139. 10.1007/BF00531444.
Rozovskii BL: Uspekhi Mat Nauk. 1972, 27: 213-214.
Yau ST, Yau SST: Mathematical Research Letters. 2000, 7: 671-693. [http://www.mrlonline.org/mrl/2000-007-006/2000-007-006-002.pdf]
Yau ST, Yau SST: Applied Mathematics and Optimization. 1996, 34 (3): 231-266. 10.1007/s002459900028.
Yau ST, Yau SST: Advances in Mathematics. 1998, 140: 156-189. 10.1006/aima.1998.1767.
Balaji B: Journal of Statistical Mechanics: Theory and Experiment. 2008, 2008 (01): P01014. 10.1088/1742-5468/2008/01/P01014. (17pp).
Siegel W: 1999, http://www.citebase.org/abstract?id=oai:arXiv.org:hep-th/9912205, [http://www.arxiv.org/abs/hep-th/9912205]
Zinn-Justin J: Quantum Field Theory and Critical Phenomena. 2002, International Series in Monographs on Physics, Oxford University Press
Risken H: The Fokker-Planck Equation: Methods of Solution and Applications. 1999, Springer-Verlag, 2nd edition
Balaji B: submitted to IEEE Transactions on Aerospace and Electronic Systems. 2006, [http://www.citebase.org/abstract?id=oai:arXiv.org:0708.0354]
Balaji B: Continuous-Discrete Filtering using the Dirac-Feynman Algorithm. IEEE Radar Conference, Rome, Italy. 2008
Lepage PG: 2005, [http://arxiv.org/abs/hep-lat/0506036]
Fung CPA: New Numerical algorithms for Nonlinear Filtering. PhD thesis. 1995, University of Southern California
Lototsky S, Mikulevicius R, Rozovskii BL: SIAM Journal of Control and Optimization. 1997, 35 (2): 435-461. 10.1137/S0363012993248918. [http://citeseer.ist.psu.edu/lototsky97nonlinear.html]
Yan C, Yau SST: A New suboptimal filter and numerical solutions for the cubic sensor problem. Proceedings of the 2006 IEEE International Conference on Networking, Sensing and Control. 2006
Yau ST, Yau SST: SIAM Journal on Control and Optimization. 2008, 47: 163-195. 10.1137/050648353. [http://link.aip.org/link/?SJC/47/163/1]
Yau SST, Yau ST: Real Time Algorithm for nonlinear filtering problem. Proceedings of the 40th IEEE Conference on Decision and Control. 2001, Orlando, FL, USA
Yau SST, Yau ST: SIAM Journal of Control and Optimization. 2005, 44 (3): 1019-1039. 10.1137/S0363012902411970.

Acknowledgements

The author thanks the Editor and three referees for constructive comments and suggestions that helped in significantly improving the paper. This work was supported by Defence Research and Development Canada under a Technology Investment Fund.

Author information

Authors and Affiliations

Radar Systems Section, Defence Research and Development Canada, Ottawa, 3701 Carling Avenue, Ottawa, ON, K1A 0Z4, Canada
Bhashyam Balaji

Corresponding author

Correspondence to Bhashyam Balaji.

Rights and permissions

Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

About this article

Cite this article

Balaji, B. Universal nonlinear filtering using Feynman path integrals II: the continuous-continuous model with additive noise. PMC Phys A 3, 2 (2009). https://doi.org/10.1186/1754-0410-3-2

Received: 24 April 2008
Accepted: 10 February 2009
Published: 10 February 2009
DOI: https://doi.org/10.1186/1754-0410-3-2
