Pose Estimation and Visual Servoing: State Estimation Using Linear Kalman Filter (KF)

I. Introduction to Kalman Filter

The object of the estimation is a random process, which is not necessarily stationary. Its characterization is given by a linear differential (or difference) equation with known coefficients - the state equation of a dynamic system.
Random disturbances affect the evolution of the process (process noise). The process is observed in the presence of measurement errors (measurement noise). The solution obtained (in the time domain) so as to minimize a mean square error is the Kalman-Bucy or Kalman filter.
The (discrete time) Kalman filter (KF) computes the best estimate of the current state (in the minimum mean square error sense) based on the measurements up to the current time.
The filter gain reflects the relative accuracy of the predicted state vs. the new observation. The new (updated) state estimate is the optimal combination of

the entire past data (with the predicted state being the sufficient statistic) and
the latest measurement

KF estimates/predicts the state of a linear system with zero-mean white noise disturbances - it reduces the uncertainties due to such disturbances. While this seems very restrictive, through a number of extensions, this technique can be used in numerous problems where the disturbances do not satisfy the above conditions. The essence of the extensions is to map the problem at hand into one that meets (or nearly meets) the above constraints.

II. The Procedure of Linear Kalman Filter

The algorithm of a Linear Kalman Filter is as below:

Index:

X : State

Q : Process noise covariance matrix

Z : Measurement

R : Measurement noise covariance matrix

_prd : Prediction

_hat : Estimation

P : Estimation covariance matrix

The algorithm:

Predicting next state from current updated state (last estimated state). If there is not an estimate available we must initialize the filter as described in section III.
Predicting next measurement from next predicted state (obtained from step 1).
Obtaining the measurement from sensors if we have a real-time application or generating the measurement if we are in a simulation.
Calculating the residual of measurement from differencing real or simulated measurement and predicted measurement ( this is innovation term or correction term).
Predicting the covariance of estimation.
Predicting the covariance of measurement.
Calculating the filter gain.
Updating the covariance of estimation.
Updating next state by summing the predicted state with product of filter gain and residual (residual).

The seudo code:

Step 1 : Predicting next state from previous updated estimate

X_prd_(k+1) = F × X_hat_(k);

Step 2 : Predicting next measurement

Z_prd_(k+1) = H × X_prd_(k+1);

Step 3 : Obtaining measurement

Z_(k+1) = H × X_(k) + w_(k);

Step 4 : Calculating measurement residual

V_(k+1)= Z_(k+1) - z_prd_(k+1);

Step 5 : Predicting state covariance matrix

P_prd_(k+1) = F × P_(k) × F' + Q ;

Step 6 : Calculating predicted measurement covariance matrix

S_(k+1) = H × P_prd_(k+1) × H' + R ;

Step 7 : Calculating the filter gain

W_(k+1) = P_prd_(k+1) × H' × inv(S_(k+1)) ;

Step 8 : Updating the state covariance matrix

P_(k+1) = P_prd_(k+1) - W_(k+1) × S_(k+1) × W_(k+1)' ;

Step 9 : Updating the state estimation

X_hat_(k+1) = X_prd_(k+1) + W_(k+1) × V_(k+1) ;

III. Example

2D Motion with constant velocity in presence of acceleration noise.

The Model

Consider the system with state

which evolves according to

which is of the form

X(k + 1) = F X(k) + G v(k) (III.1)

where F is the system matrix and G is the weighting matrix of the process noises and the initial condition of the system is

This represents a two-dimensional motion with position [x y]' and velocity [ẋ ẏ]' sampled at intervals T, which will be assumed as unity in the sequel.
The process noises, a vector, which model the acceleration, are zero-mean white sequences with variance

E[v₁(k)²] = q₁ & E[v₂(k)²] = q₂

The measurements consist of the state's position component corrupted by additive noise

which is of the form

z(k) = H X(k) + J w(k) (III.2)

where the measurement noises are zero-mean white sequence with variance

E[w₁(k)²] = r₁ = 1 & E[w₂(k)²] = r₂ = 1.

All noise sequences are mutually independent. The process and measurement covariance matrix are

The process and measurement noises covariance Q and R are constant during time under linear Gaussian assumption and error in them is not acceptable and leads to large covariance in estimation (increasing uncertainty) while covariance of estimation P, will change during time and converges into its steady state. But a correct P will lead to a quick convergences into the steady state.

IV. Initialization of the State Estimator

A state estimation filter is called consistent if its estimation errors are "commensurate" or "compatible" with the filter-calculated covariances. At initialization, it is just as important that the covariance associated with the initial estimate reflects realistically its accuracy.
According to the Bayesian model, the true initial state is a random variable, assumed to be normally distributed with a known mean - the initial estimate - and a given covariance matrix

X(0) ~ N[(00) , P(00)] (IV.1)

The norm ("chi-square") test for the initial estimation error is

X(00)' P(00)^-1 X(00) ≤ C² (IV.2)

where C² is the upper limit of the, say, 95% confidence region from the chi-square distribution with the corresponding number of degrees of freedom.
Sometimes one has to choose the initial covariance - the "choice" has to be such that (IV.2) is satisfied. A large error in the initial estimate, if the latter is deemed highly accurate, will persist a long time because it leads to a low filter gain and thus the new information from the measurements receives weighting that is too low.

Filter Initialization – Summary

The error in the initial state estimate has to be consistent with the initial state covariance. When an initial covariance is "chosen," it should be such that the error is at most 2 times the corresponding standard deviation.
According to the Bayesian assumptions in the Kalman filter,

the true initial state is a random variable and
the initial estimate is a fixed known quantity.

In simulations one can reverse this point of view by fixing the initial state (the scenario) and generating the initial estimate with a random number generator as follows:
The initial estimate is generated with mean equal to the true initial state and with covariance equal to the initial state covariance.
In practice, the initialization can be done from two consecutive position measurements by

using the latest measurement as initial position estimate,
differencing them to obtain the velocity estimate,
calculating the corresponding covariance matrix.

This amounts to a first-order polynomial fit. Higher-order polynomial fits can also be used for initializing the estimation of states that have higher derivatives.
This initialization method should also be followed in Monte Carlo simulations where, with noises independent from run to run, one will then obtain initial errors also independent from run to run.
In some circumstances, one-point initialization is appropriate. There is a general procedure to initialize filters, to be discussed later.

Practical Implementation in Tracking

Backing to our example for tracking, the practical implementation of the initialization for example above can be done as follows. Consider 4 state components, positions and velocities in a given coordinate. Furthermore we assume that we have 2 sensors which measure the position. So the measurements

are available, then for the true values x(k) and y(k), k = 0, 1, we generate the corresponding measurement noises, say
w₁(k) ~ N[0 , r₁] & w₂(k) ~ N[0 , r₂]. (IV.4)

Then, denoting by T the sampling interval, one has

And finally the corresponding 4×4 initial covariance matrix after calculating

P(11) = E[ (X - ) (X - )' ]

will be

This method, called two-point differencing, guarantees consistency of the initialization of the filter, which starts updating the state at k = 2. (Note that because MATLAB do not accepts zero for index, so in the code the indices start with 1 instead of zero)
If several (Monte Carlo) runs are made, then the same initialization procedure has to be followed with new (independent) noises generated in every run according to (IV.4). "Reuse" of the same initial conditions in Monte Carlo runs will lead to biased estimates.

V. Simulation and results

The figures below show the results of a single run simulation of the example for different values of q₁ and q₂ (The variances of process noises). The variances of measurement noises are considered as unity in all runs (r₁=r₂=1).

q₁=q₂=0 and r₁=r₂=1

Figure V.1- Simulation for q₁=q₂=0 and r₁=r₂=1.
The simulation shows that the velocities approximately converge to the true values after step 20.

Figure V.2- Position and velocity variances for q₁=q₂=0 and r₁=r₂=1.In the case where the process noises are absent the variance of states will remain large for a long time before converging to their steady states.

Figure V.3- Position and velocity gains for q₁=q₂=0 and r₁=r₂=1.

Note that the gain values for x and y are coincide because of the similarity of these two modes in view of their state transition matrix and their noise models.

q₁=q₂=1 and r₁=r₂=1

Figure V.4- Simulation for q₁= q₂=1 and r₁=r₂=1.

Figure V.5- Position and velocity variances for q₁=q₂=1 and r₁=r₂=1.

Figure V.6- Position and velocity gains for q₁= q₂=1 and r₁=r₂=1.

Note that the values for x and y are coincide in this case.

q₁=1, q₂=9 and r₁=r₂=1

Figure V.4- Simulation for q₁=1, q₂=9 and r₁=r₂=1.

Figure V.5- Position and velocity variances for q₁= 1, q₂=9 and r₁=r₂=1.

Figure V.6- Position and velocity gains for q₁=1, q₂=9 and r₁=r₂=1.

q₁=9, q₂=1 and r₁=r₂=1

Figure V.4- Simulation for q₁=9, q₂=1 and r₁=r₂=1.

Figure V.5- Position and velocity variances for q₁= 9, q₂=1 and r₁=r₂=1.

Figure V.6- Position and velocity gains for q₁=9, q₂=1 and r₁=r₂=1.

Note that the values for x and y are coincide in this case.

q₁=q₂=9 and r₁=r₂=1

Figure V.4- Simulation for q₁=q₂=9 and r₁=r₂=1.

Figure V.5- Position and velocity variances for q₁=q₂=9 and r₁=r₂=1.

Figure V.6- Position and velocity gains for q₁= q₂=9 and r₁=r₂=1.

VI. Testing the consistency of the filter

Consistency of the filter (or covariance matching in some literatures) is an important property which must be checked after designing and tunning each estimator.

Definition and the Statistical Tests for Filter Consistency

A state estimator (filter) is called consistent if its state estimation errors satisfy

E[ X(k) - (kk) ] E[ (kk) ] = 0 (VI.1)

E[[ X(k) - (kk) ][ X(k) - (kk) ]'] E[ (kk) (kk)' ] = P(klk) (VI.2)

This is a finite-sample consistency property, that is, the estimation errors based on a finite number of samples (measurements) should be consistent with their theoretical statistical properties:

Have mean zero (i.e., the estimates are unabiased).
Have covariance matrix as calculated by the filter.

In contradistinction, the parameter estimator consistency is an asymptotic (infinite size sample) property.
The consistency criteria of a filter are as follows:

(a) The state errors should be acceptable as zero mean and have magnitude commensurate with the state covariance as yielded by the filter.

(b) The innovations should also have the same property.

(c) The innovations should be acceptable as white.

The last two criteria are the only ones that can be tested in applications with real data. The first criterion, which is really the most important one, can be tested only in simulations.

Using the notation

X_hat(kk) = X(k) - X_hat(kk) (VI.3)

define the normalized (state) estimation error squared (NEES) as

𝜖(k) = (kk)' P(kk)^-1 (kk) (VI.4)

The test to be presented next is based on the above quadratic form, and it can verify simultaneously both properties 1 and 2.

Under hypothesis that the filter is consistent and the LG assumption, 𝜖(k) is chi-square distributed with n_X, degrees of freedom, where n_X, is the dimension of X. Then

E[𝜖(k)] = n_x (VI.5)

and the test is whether (VI.5) can be accepted.

Figures V.1 to V.3 show, for a single run, the behavior of the state's NEES (VI.4) for various values of the process noise variance 4 for filters that are perfectly matched. Out of 100 points, 2 to 6 are found outside the 95% probability region.

In this case a two-sided region was considered. The lower limit of this probability region is 0.48 and the upper bound is 11.14 since, for a n_x = 4 degrees of freedom chi-square random variable, the 2.5% tail points are

[ 𝜒²₄(0.025) , 𝜒²₄(0.975) ] = [ 0.48 , 11.14 ] (VI.6)

Note that in this case the two-sided 95% region is [ 0.48 , 11.14 ]. Since the lower limit is practically zero, we can consider only the upper limit which will be for the 5% tail rather than for the 2.5% tail

𝜒²₄(0.95) = 9.49 (VI.7)

It should be noted that taking a 5% or a 2.5% (or a 1%) tail is rather arbitrary.

Figure VI.1- The 95% confidence region of estimation error for q₁=q₂=0 and r₁=r₂=1.

There are no points out of the confidence region which is acceptable.

Figure VI.2- The 95% confidence region of estimation error for q₁=q₂=1 and r₁=r₂=1.

There are only 4 points out of the confidence region which is acceptable.

Figure VI.3- The 95% confidence region of estimation error for q₁=1, q₂=9 and r₁=r₂=1.

There are 27 points out of the confidence region in this case which is not acceptable. Does the covariance of process noise affects the consistecy of the filter ?

Figure VI.4- The 95% confidence region of estimation error for q₁=9, q₂=1 and r₁=r₂=1.

There are 31 points out of the confidence region in this case which is not acceptable. Does the covariance of process noise affects the consistecy of the filter ?

Figure VI.5- The 95% confidence region of estimation error for q₁=q₂=9 and r₁=r₂=1.

There are 57 points out of the confidence region in this case which is not acceptable. Does the covariance of process noise affects the consistecy of the filter ?

Pose Estimation and Visual Servoing

Sunday, July 25, 2010

State Estimation Using Linear Kalman Filter (KF)

No comments: