Monte Carlo Methods
- Three methods to compute an integral/expectation:
- Integrate analytically, resulting in an exact solution.
- Numerical algorithm (deterministic), resulting in an approximate solution.
- Simulation-based method (random algorithm, Monte Carlo methods), resulting in an approximate solution.
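- A tiny numeric illustration of the three approaches (my own toy example, not from the notes, assuming the target E[X^2] for X ~ N(0, 1) and NumPy):
```python
# Toy comparison of the three approaches for E[X^2] with X ~ N(0, 1);
# the analytic answer is exactly 1 (the variance of a standard normal).
import numpy as np

# 1) Analytic solution (exact).
exact = 1.0

# 2) Deterministic numerical integration (trapezoidal rule on a fixed grid).
x = np.linspace(-10.0, 10.0, 20001)
integrand = x**2 * np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
numeric = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x))

# 3) Monte Carlo: average x_i^2 over random draws from N(0, 1).
rng = np.random.default_rng(0)
mc = np.mean(rng.standard_normal(100_000) ** 2)

print(exact, numeric, mc)   # 1.0, ~1.000000, ~1.00 (the MC value has random error)
```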
- Sampling in Monte Carlo methods means generating a realization of a random variable that has a specified PDF/PMF. In other words, sampling means simulating a r.v.
- Sampling = observation.
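- A minimal sketch of what simulating a r.v. looks like (my own example: inverse-CDF sampling of an Exponential(rate) variable; the rate value is arbitrary):
```python
# "Simulating a r.v." via the inverse CDF: draw U ~ Uniform(0, 1) and map it
# through F^{-1}.  For Exponential(rate), F^{-1}(u) = -ln(1 - u) / rate.
import numpy as np

def sample_exponential(rate, size, rng):
    u = rng.random(size)             # U ~ Uniform(0, 1)
    return -np.log1p(-u) / rate      # X = F^{-1}(U) ~ Exponential(rate)

rng = np.random.default_rng(1)
x = sample_exponential(rate=2.0, size=100_000, rng=rng)
print(x.mean())                      # should be close to 1/rate = 0.5
```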
- Missing data model, censored data model.
- Population with distribution P = sample space X with P, i.e., the probability space (X, sigma-field, P).
- The sampling PDF characterizes the channel model as in communication theory; it is the same as the transition probability, i.e., the likelihood.
- Gibbs sampler:
- Used to simulate an N-dimensional random vector (with N-dimensional PDF f(x_1, x_2, ..., x_N)) by an iterative procedure.
- The basic idea is to reduce the simulation of an N-dimensional random vector to simulations of univariate full-conditional PDFs, i.e., reduce a high-dimensional problem to one-dimensional problems.
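- A minimal Gibbs-sampler sketch (my own example: a zero-mean bivariate normal with correlation rho, whose full conditionals are univariate normals):
```python
# Gibbs sampler for a zero-mean bivariate normal with correlation rho.
# Each full conditional x_i | x_j is univariate: N(rho * x_j, 1 - rho^2).
import numpy as np

def gibbs_bivariate_normal(rho, n_iter, rng):
    x1, x2 = 0.0, 0.0                      # arbitrary starting point
    std = np.sqrt(1.0 - rho**2)
    samples = np.empty((n_iter, 2))
    for t in range(n_iter):
        x1 = rng.normal(rho * x2, std)     # draw x1 | x2
        x2 = rng.normal(rho * x1, std)     # draw x2 | x1
        samples[t] = (x1, x2)
    return samples

rng = np.random.default_rng(2)
s = gibbs_bivariate_normal(rho=0.8, n_iter=50_000, rng=rng)
print(np.corrcoef(s[5000:].T)[0, 1])       # ~0.8 after discarding burn-in
```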
Markov chain Monte Carlo (MCMC) techniques:
- In generating a random variate of high dimension (the pdf of which is assumed to be P(x)), MCMC methods are more computationally efficient than the inverse-CDF method, the acceptance-rejection method, and importance sampling. Intuitively, MCMC methods are like stirring an urn that contains many numbered lottery balls; each stir results in a configuration of the balls in the urn; given the n-th stir, the (n-1)-th stir and the (n+1)-th stir are conditionally independent; hence, the configurations, with marginal pdf $P_n(x)$, form a Markov chain. As n approaches infinity, $P_n(x)$ converges to the target distribution P(x); the stirring is a way to explore the distribution P(x).
- The key in MCMC is to choose the transition pdf (called the transition kernel) and to design efficient algorithms.
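- A minimal random-walk Metropolis-Hastings sketch (my own example; the Gaussian proposal step size plays the role of the transition-kernel design choice, and the two-component Gaussian-mixture target is arbitrary):
```python
# Random-walk Metropolis-Hastings: the Gaussian proposal N(x, step^2) plus
# the accept/reject rule together define the transition kernel of the chain.
import numpy as np

def target_pdf(x):
    # Example target P(x): mixture of two Gaussians (unnormalized is fine).
    return 0.5 * np.exp(-0.5 * (x + 2)**2) + 0.5 * np.exp(-0.5 * (x - 2)**2)

def metropolis_hastings(n_iter, step, rng):
    x = 0.0
    chain = np.empty(n_iter)
    for t in range(n_iter):
        proposal = x + step * rng.standard_normal()
        # Symmetric proposal, so the acceptance ratio is just P(x') / P(x).
        if rng.random() < target_pdf(proposal) / target_pdf(x):
            x = proposal
        chain[t] = x
    return chain

rng = np.random.default_rng(3)
chain = metropolis_hastings(n_iter=100_000, step=1.0, rng=rng)
print(chain.mean(), chain.var())   # mean ~0, variance ~5 for this mixture
```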
- Particle filters (PF)
- Particle filter methods are a kind of sequential/recursive Bayesian estimation/filtering; they provide computational algorithms to approximate the posterior PDF; they are not optimal but sub-optimal (approximation algorithms).
- The basic idea is sequential importance sampling (SIS) + resampling (to reduce degeneracy, where after a few iterations all but one particle have negligible weight); a minimal sketch appears below, after the list of PF variants.
- The merit of particle filters is that they
are general enough to handle any nonlinear system model (state equation) and
non-Gaussian noise model.
- The drawback is that, being so general, they do not exploit the specific structure of the nonlinear system model, which could potentially improve performance. For example, if linearization is a good approximation of the nonlinearity of the system dynamics and/or measurement equations, it may be better to use an extended Kalman filter.
- Formal Bayesian inference treats any variable being inferred from the data as random, while the maximum likelihood (ML) approach may take it as a fixed but unknown parameter. This has little to do with whether mother nature actually produces such a variable as deterministic or random; it is simply a logical framework for statistical inference. A limitation of the Bayesian philosophy is that the result can rely too heavily on the prior. For state estimation, it is natural to think of the problem in a Bayesian fashion (MMSE), and in this sense PF follows the Bayesian methodology. However, the interpretation of the estimate produced by a PF often deviates from the Bayesian justification, since the sample mean and sample variance of the PF have little to do with Bayesian inference. It often suffices to report a highest posterior density (HPD) region or something similar, since one only cares about the mode, mean, and/or median, rather than the full PDF. Note that the MMSE, MMAE, and MAP estimates are the mean, median, and mode of the posterior PDF, respectively (for a Gaussian posterior these coincide).
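- A small numeric illustration of the mean/median/mode point (my own example, on an arbitrary discretized posterior):
```python
# Mean / median / mode of a (here discretized) posterior give the
# MMSE / MMAE / MAP estimates, respectively.
import numpy as np

grid = np.linspace(-5, 5, 2001)
# Example posterior: a skewed two-component mixture, normalized on the grid.
post = 0.7 * np.exp(-0.5 * (grid - 1.0)**2) + 0.3 * np.exp(-0.5 * ((grid + 2.0) / 0.5)**2)
post /= post.sum()

mmse = np.sum(grid * post)                 # posterior mean
cdf = np.cumsum(post)
mmae = grid[np.searchsorted(cdf, 0.5)]     # posterior median
map_est = grid[np.argmax(post)]            # posterior mode

print(mmse, mmae, map_est)
```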
- Recursive importance sampling propagates HPD.
- Other kinds of PF
- Sampling importance resampling (SIR) filter
- Auxiliary sampling importance resampling (ASIR)
filter
- Regularized particle filter (RPF)
- Mixture Kalman filter
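- A minimal bootstrap particle filter sketch, as referenced above (my own example: SIS with the state transition as the importance density plus resampling, on an arbitrary scalar linear-Gaussian model):
```python
# Bootstrap particle filter: sequential importance sampling with the state
# transition as proposal, plus resampling to fight weight degeneracy.
import numpy as np

rng = np.random.default_rng(4)

# Toy state-space model (chosen only for illustration):
#   x_t = 0.9 * x_{t-1} + process noise,   y_t = x_t + measurement noise
T, N = 100, 1000                 # time steps, number of particles
q, r = 1.0, 1.0                  # process / measurement noise std
x_true = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x_true[t] = 0.9 * x_true[t - 1] + q * rng.standard_normal()
    y[t] = x_true[t] + r * rng.standard_normal()

particles = rng.standard_normal(N)       # initial particle cloud
weights = np.full(N, 1.0 / N)
estimates = np.zeros(T)
for t in range(1, T):
    # 1) Propagate particles through the state equation (importance sampling
    #    with the transition prior as the proposal).
    particles = 0.9 * particles + q * rng.standard_normal(N)
    # 2) Reweight by the likelihood of the new measurement.
    weights *= np.exp(-0.5 * ((y[t] - particles) / r)**2)
    weights /= weights.sum()
    # 3) MMSE-style estimate: weighted mean of the particles.
    estimates[t] = np.sum(weights * particles)
    # 4) Resample when the effective sample size gets small (degeneracy).
    if 1.0 / np.sum(weights**2) < N / 2:
        idx = rng.choice(N, size=N, p=weights)
        particles = particles[idx]
        weights = np.full(N, 1.0 / N)

print(np.sqrt(np.mean((estimates - x_true)**2)))   # RMSE of the filter
```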
- Extended Kalman filter: 1) use a linear model (linearization about the current estimate) to approximate the nonlinear dynamic system; 2) apply the Kalman filter to the approximated linear model.
- Kalman filters propagate the first- and second-order statistics (mean & variance). Particle filters propagate particles (a PMF or histogram approximation of the posterior), and thus have higher computation/storage complexity.
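- A minimal scalar Kalman filter sketch (my own example with arbitrary model parameters), showing that only a mean and a variance are propagated:
```python
# Scalar Kalman filter: only a mean and a variance are propagated,
# in contrast to the particle cloud a particle filter carries.
import numpy as np

rng = np.random.default_rng(5)
a, q, r = 0.9, 1.0, 1.0          # state coefficient, process / meas. noise variances
T = 100
x = np.zeros(T); y = np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + np.sqrt(q) * rng.standard_normal()
    y[t] = x[t] + np.sqrt(r) * rng.standard_normal()

m, P = 0.0, 1.0                  # posterior mean and variance at t = 0
est = np.zeros(T)
for t in range(1, T):
    # Predict: propagate mean and variance through the state equation.
    m_pred = a * m
    P_pred = a * a * P + q
    # Update: fold in the new measurement via the Kalman gain.
    K = P_pred / (P_pred + r)
    m = m_pred + K * (y[t] - m_pred)
    P = (1 - K) * P_pred
    est[t] = m

print(np.sqrt(np.mean((est - x)**2)))   # RMSE; P is the filter's own variance
```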
- Bayesian belief: when we do not know the prior PMF of a discrete random variable/vector, we assume a certain prior PMF, called a belief, so that we can proceed with Bayesian inference.
- For linear estimation, we typically use either the Kalman filter or the Wiener filter (the Wiener filter is rarely used in practice). The Wiener filter is restricted to stationary processes, but the Kalman filter can deal with non-stationary processes (e.g., with time-varying mean and auto-correlation). The Kalman filter can also deal with nonlinear systems via the extended Kalman filter. For nonlinear systems, a general estimation method is the particle filter.
EM vs. BCJR
- The BCJR algorithm computes posterior probabilities by a forward-backward recursion; it plays the role of the E-step in an EM (Baum-Welch) algorithm.
Belief propagation algorithm.
- Sum-product algorithm, factor graph.
- All these algorithms propagate posterior probabilities; the output of one processing module (posterior probabilities) is used as prior probabilities in another processing module.
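- A minimal sum-product / forward-backward sketch on a toy 2-state chain (my own numbers), where forward and backward messages combine into posterior marginals:
```python
# Sum-product on a chain (forward-backward, BCJR-style): messages are
# combined into posterior marginals P(state_t | all observations).
import numpy as np

# Toy 2-state HMM (numbers are arbitrary, for illustration only).
A = np.array([[0.9, 0.1],        # transition matrix A[i, j] = P(j | i)
              [0.2, 0.8]])
B = np.array([[0.7, 0.3],        # emission matrix B[i, k] = P(obs k | state i)
              [0.1, 0.9]])
pi = np.array([0.5, 0.5])        # prior on the initial state
obs = [0, 0, 1, 1, 1, 0]

T, S = len(obs), 2
alpha = np.zeros((T, S))         # forward messages
beta = np.ones((T, S))           # backward messages
alpha[0] = pi * B[:, obs[0]]
alpha[0] /= alpha[0].sum()       # normalize for numerical stability
for t in range(1, T):
    alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    alpha[t] /= alpha[t].sum()
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    beta[t] /= beta[t].sum()

posterior = alpha * beta
posterior /= posterior.sum(axis=1, keepdims=True)
print(posterior)                 # P(state_t | obs_1..T) for each t
```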