It is also called "information
capacity". It takes units of "bits per
transmission", or "bits per second", or
"bits/s/Hz".
It is different from another capacity
called connection-carrying capacity, used in connection admission
control.
Capacity of a discrete channel, in unit
of "bits per transmission".
C = max I(X;Y), where X and Y are
input/output discrete random variables; I(X;Y) is mutual
information; and the maximization is over the Probability Mass
Function (PMF) of the input X.
Binary Symmetric Channel (BSC):
C = 1 - H(p)
Binary Erasure Channel (BEC):
C = 1 - p
Capacity of a continuous channel, in unit
of "bits per transmission".
C = max I(X;Y), where X and Y are
input/output continuous random variables, and the maximization is
over the Probability Density Function (PDF) of the input X.
Additive White Gaussian Noise (AWGN)
channel (continuous-valued discrete-time channel)
C = 1/2 *log(1+SNR) for
real-valued input
C = log(1+SNR) for complex-valued
input
Capacity of a band-limited AWGN channel
(continuous-valued continuous-time channel or waveform channel), in unit
of "bits per second".
Shannon's information capacity
theorem (for real-valued waveform channel with AWGN):
C = B*log(1+SNR), where SNR=P/(B*N0).
Capacity of a discrete-time
discrete-valued channel with memory (refer to Gallager's book, pp. 97),
or Finite state channel (FSC).
Capacity of a fading channel
Ergodic capacity for a discrete-time
i.i.d. fading channel, in unit of "bits per
transmission".
C_erg = E[max I(X;Y|a)]
where $a$ is the channel state (or
channel gain); the maximization is over the conditional input PDF
f(x|a); the expectation is over the PDF of the channel state $a$.
The above definition is applicable
for both discrete-valued and continuous-valued X, Y and $a$.
But the time has to be discrete.
Outage capacity
Inequalities:
Jensen Inequality:
For convex function f(x), f(E(x))<=E(f(x)).
For concave function f(x), f(E(x))>=E(f(x)).
Almost all inequalities related to
expectation (e.g., Holder inequality, Cauchy-Schwarz inequality) can
be proved by Jensen Inequality.
Holder inequality:
Cauchy-Schwarz inequality:
Mincowski inequality:
Lyapounov inequality:
Kraft inequality:
Intuition: the sum of probabilities of
disjoint sets in a sample space should not be greater than 1.
\sum_{x \in X} D^{-l(x)}<=1, where x is a letter, X is an alphabet, D
denotes the number of bits in a coded symbol, l(x) is the length of the
code corresponding to x.
Fano inequality:
exp(x)>1+x, for x>0. (Use Taylor
expansion to prove.)
exp(-x)>1-x, for x<0.
ln(1+x)<x, for x>0. (Let f(x)=ln(1+x)-x.
Take the derivative and get the proof.) Draw a figure to see
this. f(x)=x increases faster than g(x)=ln(1+x).
ln(1+x)<x, for -1<x<0. (Note the
sign changes when taking the derivative.)
ln(1-x)<-x, for 0<x<1.
ln(1+K)<= \sum_{k=1}^K 1/k <=
1+ln(K).
MDL theory for Universal coding: looking at
the mathematical property (stochastic complexity, parametric complexity,
optimal performance bound, etc.), instead of actual encoding algorithm.
Entropy has three properties:
Non-negativity
Monotonic increase with the number of
equally-distributed elementary events
Additivity
The essense of the additivity of entropy is
that partitioning increases entropy. Partitioning is like branching in a
tree.
Key Techniques:
Asymptotic Equipartition Property (AEP): Basically
it stems from Weak Law of Large Numbers. Use
epsilon-N argument. Use the concept of typical sequence.
Random coding argument: Generate a code book
randomly. Due to the symmetric property of codewords, there is no need to
compute the symbol error probability for each codeword. Instead, only
considering one codeword is sufficient to obtain the symbol error probability.
Capacity for channels with additive white Gaussian
noise:
c=log(1+SNR), which is a concave
function. For Low SNR, c is approximately equal to SNR.
The approximation error gets larger as SNR gets larger. Note c<SNR
for any SNR.
Instantaneous channel capacity: c(t)=log(1+SNR(t)),
where SNR(t) is the instantaneous SNR at time t, for flat fading channel.
Instantaneous channel capacity density: c(t,f)=log(1+SNR(t,f)),
where SNR(t,f) is the SNR for frequency f at time t, for frequency selective
fading channel.