Mutual information
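In probability theory, the mutual information between two random variables X and Y is given by
$$I(X,Y) = \sum_{x,y} P(x,y) \log \frac{P(x,y)}{P(x) P(y)}$$
where P(x,y) is the joint distribution of X and Y, and P(x) and P(y) are the marginal distributions.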
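As a minimal sketch, the following Python computes I(X,Y) for a small joint probability table by summing P(x,y) log[P(x,y) / (P(x) P(y))] over all outcomes. The function name `mutual_information`, the example tables, and the choice of natural logarithms (so the result is in nats) are assumptions made for illustration; they do not come from the text.

```python
import math

def mutual_information(joint):
    """Mutual information (in nats) of a table joint[i][j] = P(X=i, Y=j)."""
    px = [sum(row) for row in joint]           # marginal P(x): row sums
    py = [sum(col) for col in zip(*joint)]     # marginal P(y): column sums
    total = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0.0:                      # terms with P(x,y) = 0 contribute 0
                total += pxy * math.log(pxy / (px[i] * py[j]))
    return total

# Hypothetical example tables (illustrative values only).
dependent = [[0.3, 0.1, 0.1],
             [0.1, 0.1, 0.3]]

# Built as an outer product of marginals, so X and Y are independent.
independent = [[0.5 * 0.3, 0.5 * 0.7],
               [0.5 * 0.3, 0.5 * 0.7]]

# Transposing the table swaps the roles of X and Y.
transposed = [list(col) for col in zip(*dependent)]

print("dependent:  ", mutual_information(dependent))
print("independent:", mutual_information(independent))
print("transposed: ", mutual_information(transposed))
```

Up to floating-point rounding, the independent table should give a value of zero, and the transposed table should give the same value as the original, since the summand treats X and Y interchangeably.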
Properties of mutual information
If X and Y are independent,
then I(X,Y) = 0,
since P(X,Y) = P(X) P(Y) in that case.
Mutual information is symmetric: I(X,Y) = I(Y,X).
Mutual information is nonnegative: I(X,Y) ≥ 0.
Relation to other quantities
The mutual information can be equivalently expressed as
$$I(X,Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)$$
where H(X) and H(X|Y) are the unconditional and conditional entropy of X,
and likewise H(Y) and H(Y|X) are the unconditional and conditional entropy of Y,
with
$$H(X) = -\sum_{x} P(x) \log P(x)$$
and
$$H(X|Y) = -\sum_{y} P(y) \sum_{x} P(x|y) \log P(x|y)$$
Since conditioning never increases entropy, H(X) ≥ H(X|Y),
which proves the nonnegativity property stated above.
Mutual information can also be expressed in terms of the Kullback-Leibler divergence.
Note that
$$I(X,Y) = \sum_{y} P(y) \sum_{x} P(x|y) \log \frac{P(x|y)}{P(x)} = \sum_{y} P(y) \, D_{KL}\big(P(X|Y=y) \,\|\, P(X)\big)$$
Thus mutual information can be understood as a weighted Kullback-Leibler divergence, with weights P(y):
the more the conditional distributions P(X|Y) differ from P(X),
the greater the information gain.
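The two relations discussed in this section can be checked numerically: I(X,Y) = H(X) - H(X|Y), and I(X,Y) as the sum over y of P(y) times the Kullback-Leibler divergence of P(X|Y=y) from P(X). The sketch below does this for one hypothetical joint table. The helper names `entropy` and `kl_divergence`, the table values, and the use of natural logarithms are assumptions for illustration, and the code assumes every P(y) is strictly positive.

```python
import math

# Hypothetical joint table joint[i][j] = P(X=i, Y=j); illustrative values only.
joint = [[0.3, 0.1, 0.1],
         [0.1, 0.1, 0.3]]

px = [sum(row) for row in joint]           # marginal P(x)
py = [sum(col) for col in zip(*joint)]     # marginal P(y), assumed all > 0

def entropy(p):
    """H(p) = -sum p log p in nats, skipping zero-probability outcomes."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def kl_divergence(p, q):
    """D_KL(p || q) = sum p log(p / q), skipping terms where p is zero."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0.0)

# Direct definition: sum of P(x,y) log[P(x,y) / (P(x) P(y))].
mi_direct = sum(
    pxy * math.log(pxy / (px[i] * py[j]))
    for i, row in enumerate(joint)
    for j, pxy in enumerate(row)
    if pxy > 0.0
)

# Conditional distributions P(X | Y=j), one list per value of Y.
cond_x_given_y = [
    [joint[i][j] / py[j] for i in range(len(px))]
    for j in range(len(py))
]

# H(X|Y) = sum_y P(y) H(X | Y=y), then I = H(X) - H(X|Y).
h_x_given_y = sum(py[j] * entropy(cond_x_given_y[j]) for j in range(len(py)))
mi_entropy = entropy(px) - h_x_given_y

# I = sum_y P(y) D_KL(P(X | Y=y) || P(X)).
mi_kl = sum(py[j] * kl_divergence(cond_x_given_y[j], px) for j in range(len(py)))

print("direct definition:", mi_direct)
print("H(X) - H(X|Y):    ", mi_entropy)
print("weighted KL:      ", mi_kl)
```

The three printed values should agree up to floating-point rounding, and `entropy(px)` should be no smaller than `h_x_given_y`, in line with the nonnegativity argument.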