When the occurrence of one event does not affect the likelihood of another
Independence is a fundamental notion in probability theory, as in statistics and the theory of stochastic processes. Two events are independent, statistically independent, or stochastically independent[1] if, informally speaking, the occurrence of one does not affect the probability of occurrence of the other or, equivalently, does not affect the odds. Similarly, two random variables are independent if the realization of one does not affect the probability distribution of the other.
When dealing with collections of more than two events, two notions of independence need to be distinguished. The events are called pairwise independent if any two events in the collection are independent of each other, while mutual independence (or collective independence) of events means, informally speaking, that each event is independent of any combination of other events in the collection. A similar notion exists for collections of random variables. Mutual independence implies pairwise independence, but not the other way around. In the standard literature of probability theory, statistics, and stochastic processes, independence without further qualification usually refers to mutual independence.
Definition
For events
Two events
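Two events $A$ and $B$ are independent (often written as $A \perp B$ or $A \perp\!\!\!\perp B$, where the latter symbol often is also used for conditional independence) if and only if their joint probability equals the product of their probabilities:[2]: p. 29 [3]: p. 10

$$\mathrm{P}(A \cap B) = \mathrm{P}(A)\,\mathrm{P}(B) \qquad \text{(Eq.1)}$$

When $\mathrm{P}(A)$ and $\mathrm{P}(B)$ are both positive, Eq.1 gives $\mathrm{P}(A \cap B) > 0$ and hence $A \cap B \neq \emptyset$: two independent events $A$ and $B$ of positive probability have common elements in their sample space, so they are not mutually exclusive (mutually exclusive iff $A \cap B = \emptyset$). Why Eq.1 defines independence is made clear by rewriting it with the conditional probability $\mathrm{P}(A \mid B) = \mathrm{P}(A \cap B)/\mathrm{P}(B)$, the probability with which the event $A$ occurs provided that the event $B$ has or is assumed to have occurred. For $\mathrm{P}(B) \neq 0$,

$$\mathrm{P}(A \cap B) = \mathrm{P}(A)\,\mathrm{P}(B) \iff \mathrm{P}(A \mid B) = \frac{\mathrm{P}(A \cap B)}{\mathrm{P}(B)} = \mathrm{P}(A),$$

and similarly, for $\mathrm{P}(A) \neq 0$,

$$\mathrm{P}(A \cap B) = \mathrm{P}(A)\,\mathrm{P}(B) \iff \mathrm{P}(B \mid A) = \frac{\mathrm{P}(A \cap B)}{\mathrm{P}(A)} = \mathrm{P}(B).$$

Thus, the occurrence of $B$ does not affect the probability of $A$, and vice versa. In other words, $A$ and $B$ are independent of each other. Although the derived expressions may seem more intuitive, they are not the preferred definition, as the conditional probabilities are undefined if $\mathrm{P}(A)$ or $\mathrm{P}(B)$ is 0. Furthermore, the preferred definition makes clear by symmetry that when $A$ is independent of $B$, $B$ is also independent of $A$.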
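The product rule in Eq.1 can be checked mechanically on a small finite sample space. The following is a minimal sketch, assuming a uniform measure on the outcomes of two fair coin flips; the names `omega`, `prob` and `independent` are illustrative and do not come from any library.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two fair coin flips, all equally likely.
omega = set(product("HT", repeat=2))

def prob(event):
    """Probability of an event (a subset of omega) under the uniform measure."""
    return Fraction(len(event & omega), len(omega))

def independent(a, b):
    """Check Eq.1: P(A and B) == P(A) * P(B), using exact rational arithmetic."""
    return prob(a & b) == prob(a) * prob(b)

first_heads = {w for w in omega if w[0] == "H"}
second_heads = {w for w in omega if w[1] == "H"}
both_heads = first_heads & second_heads

print(independent(first_heads, second_heads))  # True
print(independent(first_heads, both_heads))    # False
```

Exact fractions are used instead of floats so that the equality test in Eq.1 is not disturbed by rounding.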
Odds
Stated in terms of odds, two events are independent if and only if the odds ratio of $A$ and $B$ is unity (1). Analogously with probability, this is equivalent to the conditional odds being equal to the unconditional odds:

$$O(A \mid B) = O(A) \quad \text{and} \quad O(B \mid A) = O(B),$$

or to the odds of one event, given the other event, being the same as the odds of the event, given the other event not occurring:

$$O(A \mid B) = O(A \mid \neg B) \quad \text{and} \quad O(B \mid A) = O(B \mid \neg A).$$

The odds ratio can be defined as

$$O(A \mid B) : O(A \mid \neg B),$$

or symmetrically for odds of $B$ given $A$, and thus is 1 if and only if the events are independent.
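As a small numerical illustration of the odds formulation, the sketch below computes the odds ratio $O(A \mid B) : O(A \mid \neg B)$ from $\mathrm{P}(A \cap B)$, $\mathrm{P}(A)$ and $\mathrm{P}(B)$. The function names and the example probabilities are assumptions chosen for illustration only.

```python
from fractions import Fraction

def odds(p):
    """Odds p / (1 - p) corresponding to a probability p (requires p != 1)."""
    return p / (1 - p)

def odds_ratio(p_ab, p_a, p_b):
    """Odds ratio O(A|B) : O(A|not B) from P(A and B), P(A), P(B).

    Assumes 0 < P(B) < 1 and that both conditional probabilities lie
    strictly between 0 and 1, so that every division is defined.
    """
    p_a_given_b = p_ab / p_b
    p_a_given_not_b = (p_a - p_ab) / (1 - p_b)
    return odds(p_a_given_b) / odds(p_a_given_not_b)

# Independent case: P(A and B) = P(A) * P(B) = 1/2 * 1/3.
print(odds_ratio(Fraction(1, 6), Fraction(1, 2), Fraction(1, 3)))  # 1

# Dependent case: same marginals, but a larger joint probability.
print(odds_ratio(Fraction(1, 4), Fraction(1, 2), Fraction(1, 3)))  # 5
```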
More than two events
A finite set of events $\{A_i\}_{i=1}^{n}$ is pairwise independent if every pair of events is independent[4], that is, if and only if for all distinct pairs of indices $m, k$,

$$\mathrm{P}(A_m \cap A_k) = \mathrm{P}(A_m)\,\mathrm{P}(A_k) \qquad \text{(Eq.2)}$$

A finite set of events is mutually independent if every event is independent of any intersection of the other events[4][3]: p. 11, that is, if and only if for every $k \leq n$ and for every $k$ indices $1 \leq i_1 < \dots < i_k \leq n$,

$$\mathrm{P}\left(\bigcap_{j=1}^{k} A_{i_j}\right) = \prod_{j=1}^{k} \mathrm{P}(A_{i_j}) \qquad \text{(Eq.3)}$$

This is called the multiplication rule for independent events. It is not a single condition involving only the product of all the probabilities of all single events; it must hold true for all subsets of events.
For more than two events, a mutually independent set of events is (by definition) pairwise independent; but the converse is not necessarily true.[2]: p. 30
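The difference between Eq.2 and Eq.3 can be made concrete by checking every subset of events explicitly. The sketch below assumes a uniform sample space of two fair bits and uses the familiar "exclusive or" construction as an illustrative example of events that are pairwise but not mutually independent; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

# Sample space: two fair bits, all four outcomes equally likely.
omega = set(product((0, 1), repeat=2))

def prob(event):
    """Probability of a subset of omega under the uniform measure."""
    return Fraction(len(event), len(omega))

def pairwise_independent(events):
    """Eq.2: the product rule holds for every pair of events."""
    return all(prob(a & b) == prob(a) * prob(b)
               for a, b in combinations(events, 2))

def mutually_independent(events):
    """Eq.3: the product rule holds for every sub-collection of size >= 2."""
    for k in range(2, len(events) + 1):
        for subset in combinations(events, k):
            joint = set.intersection(*subset)
            if prob(joint) != prod(prob(e) for e in subset):
                return False
    return True

a = {w for w in omega if w[0] == 1}      # first bit is 1
b = {w for w in omega if w[1] == 1}      # second bit is 1
c = {w for w in omega if w[0] != w[1]}   # the two bits differ

print(pairwise_independent([a, b, c]))  # True
print(mutually_independent([a, b, c]))  # False
print(mutually_independent([a, b]))     # True
```

The number of sub-collections grows exponentially with the number of events, so this brute-force check is only practical for small collections.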
Log probability and information content
Stated in terms of log probability, two events are independent if and only if the log probability of the joint event is the sum of the log probability of the individual events:

$$\log \mathrm{P}(A \cap B) = \log \mathrm{P}(A) + \log \mathrm{P}(B)$$

In information theory, negative log probability is interpreted as information content, and thus two events are independent if and only if the information content of the combined event equals the sum of information content of the individual events:

$$\mathrm{I}(A \cap B) = \mathrm{I}(A) + \mathrm{I}(B)$$

See Information content § Additivity of independent events for details.
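The additivity of information content can be illustrated numerically. This is a minimal sketch with made-up probabilities, measuring information in bits (base-2 logarithm); the function name `information_content` is an assumption for the example.

```python
import math

def information_content(p):
    """Self-information in bits, -log2(p), for a probability 0 < p <= 1."""
    return -math.log2(p)

p_a, p_b = 0.5, 0.25

# Independent case: the joint probability is the product of the marginals.
p_ab_independent = p_a * p_b
# Dependent case (illustrative): B is contained in A, so P(A and B) = P(B).
p_ab_dependent = p_b

additive_when_independent = math.isclose(
    information_content(p_ab_independent),
    information_content(p_a) + information_content(p_b),
)
additive_when_dependent = math.isclose(
    information_content(p_ab_dependent),
    information_content(p_a) + information_content(p_b),
)

print(additive_when_independent)  # True
print(additive_when_dependent)    # False
```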
For real valued random variables
Two random variables
Two random variables $X$ and $Y$ are independent if and only if (iff) the elements of the π-system generated by them are independent; that is to say, for every $x$ and $y$, the events $\{X \leq x\}$ and $\{Y \leq y\}$ are independent events (as defined above in Eq.1). That is, $X$ and $Y$ with cumulative distribution functions $F_X(x)$ and $F_Y(y)$ are independent iff the combined random variable $(X, Y)$ has a joint cumulative distribution function[3]: p. 15

$$F_{X,Y}(x, y) = F_X(x)\,F_Y(y) \quad \text{for all } x, y \qquad \text{(Eq.4)}$$

or equivalently, if the probability densities $f_X(x)$ and $f_Y(y)$ and the joint probability density $f_{X,Y}(x, y)$ exist,

$$f_{X,Y}(x, y) = f_X(x)\,f_Y(y) \quad \text{for all } x, y.$$
More than two random variables
A finite set of $n$ random variables $\{X_1, \ldots, X_n\}$ is pairwise independent if and only if every pair of random variables is independent. Even if the set of random variables is pairwise independent, it is not necessarily mutually independent as defined next.
A finite set of $n$ random variables $\{X_1, \ldots, X_n\}$ is mutually independent if and only if for any sequence of numbers $\{x_1, \ldots, x_n\}$, the events $\{X_1 \leq x_1\}, \ldots, \{X_n \leq x_n\}$ are mutually independent events (as defined above in Eq.3). This is equivalent to the following condition on the joint cumulative distribution function $F_{X_1, \ldots, X_n}(x_1, \ldots, x_n)$. A finite set of $n$ random variables $\{X_1, \ldots, X_n\}$ is mutually independent if and only if[3]: p. 16

$$F_{X_1, \ldots, X_n}(x_1, \ldots, x_n) = F_{X_1}(x_1) \cdot \ldots \cdot F_{X_n}(x_n) \quad \text{for all } x_1, \ldots, x_n \qquad \text{(Eq.5)}$$

It is not necessary here to require that the probability distribution factorizes for all possible $k$-element subsets as in the case for $n$ events. This is not required because, for example, $F_{X_1,X_2,X_3}(x_1, x_2, x_3) = F_{X_1}(x_1)\,F_{X_2}(x_2)\,F_{X_3}(x_3)$ implies $F_{X_1,X_3}(x_1, x_3) = F_{X_1}(x_1)\,F_{X_3}(x_3)$, as can be seen by letting $x_2 \to \infty$.
The measure-theoretically inclined may prefer to substitute events $\{X \in A\}$ for events $\{X \leq x\}$ in the above definition, where $A$ is any Borel set. That definition is exactly equivalent to the one above when the values of the random variables are real numbers. It has the advantage of working also for complex-valued random variables or for random variables taking values in any measurable space (which includes topological spaces endowed with appropriate σ-algebras).
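For random variables with finitely many values, the factorization of the joint cumulative distribution function (Eq.4 and Eq.5) can be verified directly. The sketch below is one possible check, not a standard library routine: the joint distribution is assumed to be given as a dictionary `joint_pmf` from value tuples to probabilities, and it is enough to test the factorization at grid points of the supports because all the distribution functions involved are step functions that only jump there.

```python
from fractions import Fraction
from itertools import product
from math import prod

def cdf_factorizes(joint_pmf):
    """Check Eq.5 for a finite joint distribution.

    joint_pmf maps tuples (x_1, ..., x_n) to their probabilities.
    """
    n = len(next(iter(joint_pmf)))
    supports = [sorted({w[i] for w in joint_pmf}) for i in range(n)]

    def joint_cdf(point):
        return sum(p for w, p in joint_pmf.items()
                   if all(wi <= xi for wi, xi in zip(w, point)))

    def marginal_cdf(i, x):
        return sum(p for w, p in joint_pmf.items() if w[i] <= x)

    return all(
        joint_cdf(point) == prod(marginal_cdf(i, x) for i, x in enumerate(point))
        for point in product(*supports)
    )

# Three independent fair bits: uniform on all 8 triples.
independent_bits = {w: Fraction(1, 8) for w in product((0, 1), repeat=3)}

# Two fair bits and their exclusive or: pairwise but not mutually independent.
xor_bits = {(x, y, x ^ y): Fraction(1, 4) for x, y in product((0, 1), repeat=2)}

# Any two coordinates of the XOR triple, taken alone, are independent.
xor_pair = {(x, x ^ y): Fraction(1, 4) for x, y in product((0, 1), repeat=2)}

print(cdf_factorizes(independent_bits))  # True
print(cdf_factorizes(xor_bits))          # False
print(cdf_factorizes(xor_pair))          # True
```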
For real valued random vectors
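Two random vectors $\mathbf{X} = (X_1, \ldots, X_m)^{\mathrm{T}}$ and $\mathbf{Y} = (Y_1, \ldots, Y_n)^{\mathrm{T}}$ are called independent if[5]: p. 187

$$F_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y}) = F_{\mathbf{X}}(\mathbf{x}) \cdot F_{\mathbf{Y}}(\mathbf{y}) \quad \text{for all } \mathbf{x}, \mathbf{y} \qquad \text{(Eq.6)}$$

where $F_{\mathbf{X}}(\mathbf{x})$ and $F_{\mathbf{Y}}(\mathbf{y})$ denote the cumulative distribution functions of $\mathbf{X}$ and $\mathbf{Y}$ and $F_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y})$ denotes their joint cumulative distribution function. Independence of $\mathbf{X}$ and $\mathbf{Y}$ is often denoted by $\mathbf{X} \perp\!\!\!\perp \mathbf{Y}$.
Written component-wise, $\mathbf{X}$ and $\mathbf{Y}$ are called independent if

$$F_{X_1, \ldots, X_m, Y_1, \ldots, Y_n}(x_1, \ldots, x_m, y_1, \ldots, y_n) = F_{X_1, \ldots, X_m}(x_1, \ldots, x_m) \cdot F_{Y_1, \ldots, Y_n}(y_1, \ldots, y_n) \quad \text{for all } x_1, \ldots, x_m, y_1, \ldots, y_n.$$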
For stochastic processes
For one stochastic process
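The definition of independence may be extended from random vectors to a stochastic process. For an independent stochastic process, it is required that the random variables obtained by sampling the process at any $n$ distinct times $t_1, \ldots, t_n$ are independent random variables for any $n$.[6]: p. 163
Formally, a stochastic process $\{X_t\}_{t \in \mathcal{T}}$ is called independent if and only if for all $n \in \mathbb{N}$ and for all distinct $t_1, \ldots, t_n \in \mathcal{T}$

$$F_{X_{t_1}, \ldots, X_{t_n}}(x_1, \ldots, x_n) = F_{X_{t_1}}(x_1) \cdot \ldots \cdot F_{X_{t_n}}(x_n) \quad \text{for all } x_1, \ldots, x_n \qquad \text{(Eq.7)}$$

where $F_{X_{t_1}, \ldots, X_{t_n}}(x_1, \ldots, x_n) = \mathrm{P}(X_{t_1} \leq x_1, \ldots, X_{t_n} \leq x_n)$.
Independence of a stochastic process is a property within a stochastic process, not between two stochastic processes.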
For two stochastic processes
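Independence of two stochastic processes is a property between two stochastic processes $\{X_t\}_{t \in \mathcal{T}}$ and $\{Y_t\}_{t \in \mathcal{T}}$ that are defined on the same probability space $(\Omega, \mathcal{F}, \mathrm{P})$. Formally, two stochastic processes $\{X_t\}_{t \in \mathcal{T}}$ and $\{Y_t\}_{t \in \mathcal{T}}$ are said to be independent if for all $n \in \mathbb{N}$ and for all $t_1, \ldots, t_n \in \mathcal{T}$, the random vectors $(X_{t_1}, \ldots, X_{t_n})$ and $(Y_{t_1}, \ldots, Y_{t_n})$ are independent,[7]: p. 515 i.e. if

$$F_{X_{t_1}, \ldots, X_{t_n}, Y_{t_1}, \ldots, Y_{t_n}}(x_1, \ldots, x_n, y_1, \ldots, y_n) = F_{X_{t_1}, \ldots, X_{t_n}}(x_1, \ldots, x_n) \cdot F_{Y_{t_1}, \ldots, Y_{t_n}}(y_1, \ldots, y_n) \quad \text{for all } x_1, \ldots, x_n, y_1, \ldots, y_n \qquad \text{(Eq.8)}$$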
Independent σ-algebras
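The definitions above (Eq.1 and Eq.2) are both generalized by the following definition of independence for σ-algebras. Let $(\Omega, \Sigma, \mathrm{P})$ be a probability space and let $\mathcal{A}$ and $\mathcal{B}$ be two sub-σ-algebras of $\Sigma$. $\mathcal{A}$ and $\mathcal{B}$ are said to be independent if, whenever $A \in \mathcal{A}$ and $B \in \mathcal{B}$,

$$\mathrm{P}(A \cap B) = \mathrm{P}(A)\,\mathrm{P}(B).$$

Likewise, a finite family of σ-algebras $(\tau_i)_{i \in I}$, where $I$ is an index set, is said to be independent if and only if

$$\forall (A_i)_{i \in I} \in \prod_{i \in I} \tau_i : \quad \mathrm{P}\left(\bigcap_{i \in I} A_i\right) = \prod_{i \in I} \mathrm{P}(A_i),$$

and an infinite family of σ-algebras is said to be independent if all its finite subfamilies are independent.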
The new definition relates to the previous ones very directly:
- Two events are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by an event $E \in \Sigma$ is, by definition, $\sigma(\{E\}) = \{\emptyset, E, \Omega \setminus E, \Omega\}$.
- Two random variables $X$ and $Y$ defined over $\Omega$ are independent (in the old sense) if and only if the σ-algebras that they generate are independent (in the new sense). The σ-algebra generated by a random variable $X$ taking values in some measurable space $S$ consists, by definition, of all subsets of $\Omega$ of the form $X^{-1}(U)$, where $U$ is any measurable subset of $S$.
Using this definition, it is easy to show that if $X$ and $Y$ are random variables and $Y$ is constant, then $X$ and $Y$ are independent, since the σ-algebra generated by a constant random variable is the trivial σ-algebra $\{\emptyset, \Omega\}$. Probability zero events cannot affect independence, so independence also holds if $Y$ is only $\mathrm{P}$-almost surely constant.
Properties
Self-independence
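Note that an event $A$ is independent of itself if and only if

$$\mathrm{P}(A) = \mathrm{P}(A \cap A) = \mathrm{P}(A) \cdot \mathrm{P}(A) \iff \mathrm{P}(A) = 0 \text{ or } \mathrm{P}(A) = 1.$$

Thus an event is independent of itself if and only if it almost surely occurs or its complement almost surely occurs; this fact is useful when proving zero–one laws.[8]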
Expectation and covariance
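If $X$ and $Y$ are statistically independent random variables, then the expectation operator $\operatorname{E}$ has the property (provided the expectations exist)

$$\operatorname{E}[XY] = \operatorname{E}[X]\,\operatorname{E}[Y],$$

[9]: p. 10 and the covariance $\operatorname{cov}[X, Y]$ is zero, as follows from

$$\operatorname{cov}[X, Y] = \operatorname{E}[XY] - \operatorname{E}[X]\,\operatorname{E}[Y].$$

The converse does not hold: two random variables with a covariance of 0 may still fail to be independent.
Similarly for two stochastic processes $\{X_t\}_{t \in \mathcal{T}}$ and $\{Y_t\}_{t \in \mathcal{T}}$: if they are independent, then they are uncorrelated.[10]: p. 151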
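The failure of the converse for random variables can be seen in a small worked example. The sketch below assumes $X$ uniform on $\{-1, 0, 1\}$ and $Y = X^2$, a common textbook illustration chosen here for convenience; the helper `expect` is hypothetical.

```python
from fractions import Fraction

# Joint distribution of (X, Y) with X uniform on {-1, 0, 1} and Y = X**2.
pmf = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def expect(g):
    """Expectation of g(X, Y) under the joint distribution above."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

# Covariance via cov[X, Y] = E[XY] - E[X] E[Y].
cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Compare P(X = 0, Y = 0) with P(X = 0) * P(Y = 0).
p_joint = pmf.get((0, 0), Fraction(0))
p_x0 = sum(p for (x, _), p in pmf.items() if x == 0)
p_y0 = sum(p for (_, y), p in pmf.items() if y == 0)

print(cov)                    # 0
print(p_joint == p_x0 * p_y0) # False: X and Y are not independent
```

Here $Y$ is a deterministic function of $X$, so the two variables are as dependent as possible, yet their covariance vanishes because the dependence is not linear.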
Characteristic function
Two random variables $X$ and $Y$ are independent if and only if the characteristic function of the random vector $(X, Y)$ satisfies

$$\varphi_{(X,Y)}(t, s) = \varphi_X(t) \cdot \varphi_Y(s).$$

In particular the characteristic function of their sum is the product of their marginal characteristic functions:

$$\varphi_{X+Y}(t) = \varphi_X(t) \cdot \varphi_Y(t),$$

though the reverse implication is not true. Random variables that satisfy the latter condition are called subindependent.
Examples
Rolling dice
The event of getting a 6 the first time a die is rolled and the event of getting a 6 the second time are independent. By contrast, the event of getting a 6 the first time a die is rolled and the event that the sum of the numbers seen on the first and second trial is 8 are not independent.
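Both claims about the dice can be confirmed by enumerating the 36 equally likely outcomes of two rolls. This is a minimal sketch assuming fair dice; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of two rolls of a fair six-sided die.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of a subset of omega under the uniform measure."""
    return Fraction(len(event), len(omega))

first_six = {w for w in omega if w[0] == 6}
second_six = {w for w in omega if w[1] == 6}
sum_eight = {w for w in omega if sum(w) == 8}

# 6 on the first roll vs. 6 on the second roll: the product rule holds.
print(prob(first_six & second_six) == prob(first_six) * prob(second_six))  # True

# 6 on the first roll vs. sum equal to 8: 1/36 on the left, 5/216 on the right.
print(prob(first_six & sum_eight), prob(first_six) * prob(sum_eight))  # 1/36 5/216
```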
Drawing cards
If two cards are drawn with replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are independent. By contrast, if two cards are drawn without replacement from a deck of cards, the event of drawing a red card on the first trial and that of drawing a red card on the second trial are not independent, because a deck that has had a red card removed has proportionately fewer red cards.
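The card example can be worked out exactly by enumerating ordered pairs of draws. The sketch below assumes a standard 52-card deck with 26 red and 26 black cards; cards are represented only by colour and an index, and the function `red_probabilities` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import permutations, product

# A 52-card deck, reduced to colour plus an index to keep cards distinct.
deck = [("red", i) for i in range(26)] + [("black", i) for i in range(26)]

def red_probabilities(draws):
    """Return (P(first red), P(second red), P(both red)) for equally likely draws."""
    draws = list(draws)
    n = len(draws)
    p_first = Fraction(sum(c1[0] == "red" for c1, _ in draws), n)
    p_second = Fraction(sum(c2[0] == "red" for _, c2 in draws), n)
    p_both = Fraction(
        sum(c1[0] == "red" and c2[0] == "red" for c1, c2 in draws), n
    )
    return p_first, p_second, p_both

# With replacement: every ordered pair of cards, repeats allowed.
p1, p2, p12 = red_probabilities(product(deck, repeat=2))
print(p12 == p1 * p2)  # True

# Without replacement: ordered pairs of two different cards.
q1, q2, q12 = red_probabilities(permutations(deck, 2))
print(q12, q1 * q2)    # 25/102 1/4
```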
Pairwise and mutual independence
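Consider the two probability spaces shown. In both cases, $\mathrm{P}(A) = \mathrm{P}(B) = 1/2$ and $\mathrm{P}(C) = 1/4$. The events in the first space are pairwise independent because $\mathrm{P}(A \mid B) = \mathrm{P}(A \mid C) = 1/2 = \mathrm{P}(A)$, $\mathrm{P}(B \mid A) = \mathrm{P}(B \mid C) = 1/2 = \mathrm{P}(B)$, and $\mathrm{P}(C \mid A) = \mathrm{P}(C \mid B) = 1/4 = \mathrm{P}(C)$; but the three events are not mutually independent. The events in the second space are both pairwise independent and mutually independent. To illustrate the difference, consider conditioning on two events. In the pairwise independent case, although any one event is independent of each of the other two individually, it is not independent of the intersection of the other two:

$$\mathrm{P}(A \mid B \cap C) = \frac{4/40}{4/40 + 1/40} = \tfrac{4}{5} \neq \mathrm{P}(A)$$
$$\mathrm{P}(B \mid A \cap C) = \frac{4/40}{4/40 + 1/40} = \tfrac{4}{5} \neq \mathrm{P}(B)$$
$$\mathrm{P}(C \mid A \cap B) = \frac{4/40}{4/40 + 6/40} = \tfrac{2}{5} \neq \mathrm{P}(C)$$

In the mutually independent case, however,

$$\mathrm{P}(A \mid B \cap C) = \frac{1/16}{1/16 + 1/16} = \tfrac{1}{2} = \mathrm{P}(A)$$
$$\mathrm{P}(B \mid A \cap C) = \frac{1/16}{1/16 + 1/16} = \tfrac{1}{2} = \mathrm{P}(B)$$
$$\mathrm{P}(C \mid A \cap B) = \frac{1/16}{1/16 + 3/16} = \tfrac{1}{4} = \mathrm{P}(C)$$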
Triple-independence but no pairwise-independence
It is possible to create a three-event example in which

$$\mathrm{P}(A \cap B \cap C) = \mathrm{P}(A)\,\mathrm{P}(B)\,\mathrm{P}(C),$$

and yet no two of the three events are pairwise independent (and hence the set of events is not mutually independent).[11] This example shows that mutual independence involves requirements on the products of probabilities of all combinations of events, not just on the product over all single events as in this example.
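One way to build such an example is sketched below. The construction (eight equally likely outcomes and three hand-picked events) is an illustration chosen for this sketch and is not necessarily the example given in the cited reference.

```python
from fractions import Fraction
from itertools import combinations

# Eight equally likely outcomes.
omega = set(range(1, 9))

def prob(event):
    """Probability of a subset of omega under the uniform measure."""
    return Fraction(len(event), len(omega))

a = {1, 2, 3, 4}
b = {1, 2, 3, 5}
c = {1, 6, 7, 8}

# The triple product rule holds: 1/8 == 1/2 * 1/2 * 1/2.
triple_rule = prob(a & b & c) == prob(a) * prob(b) * prob(c)

# ... yet no pair satisfies the pairwise product rule.
pairs_independent = [
    prob(x & y) == prob(x) * prob(y) for x, y in combinations((a, b, c), 2)
]

print(triple_rule)        # True
print(pairs_independent)  # [False, False, False]
```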
Conditional independence
For events
The events $A$ and $B$ are conditionally independent given an event $C$ when

$$\mathrm{P}(A \cap B \mid C) = \mathrm{P}(A \mid C) \cdot \mathrm{P}(B \mid C).$$
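Conditional independence does not imply unconditional independence, which a small mixture model makes visible. In the sketch below a coin is picked at random from two differently biased coins and flipped twice; the coin labels, the biases and the helper functions are all assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import product

# Two coins with different heads probabilities; one is chosen with probability 1/2.
heads_prob = {"c1": Fraction(3, 4), "c2": Fraction(1, 4)}

# Joint distribution over (coin, first flip, second flip).
pmf = {}
for coin, f1, f2 in product(heads_prob, "HT", "HT"):
    p = Fraction(1, 2)
    for flip in (f1, f2):
        p *= heads_prob[coin] if flip == "H" else 1 - heads_prob[coin]
    pmf[(coin, f1, f2)] = p

def prob(pred):
    """Probability of the event described by the predicate pred."""
    return sum(p for w, p in pmf.items() if pred(w))

def cond(pred, given):
    """Conditional probability P(pred | given); requires P(given) > 0."""
    return prob(lambda w: pred(w) and given(w)) / prob(given)

A = lambda w: w[1] == "H"   # first flip is heads
B = lambda w: w[2] == "H"   # second flip is heads
C = lambda w: w[0] == "c1"  # coin c1 was chosen

# Given the coin, the flips are independent ...
print(cond(lambda w: A(w) and B(w), C) == cond(A, C) * cond(B, C))  # True
# ... but without knowing the coin they are not: 5/16 versus 1/4.
print(prob(lambda w: A(w) and B(w)), prob(A) * prob(B))  # 5/16 1/4
```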
For random variables
Intuitively, two random variables $X$ and $Y$ are conditionally independent given $Z$ if, once $Z$ is known, the value of $Y$ does not add any additional information about $X$. For instance, two measurements $X$ and $Y$ of the same underlying quantity $Z$ are not independent, but they are conditionally independent given $Z$ (unless the errors in the two measurements are somehow connected).
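The formal definition of conditional independence is based on the idea of conditional distributions. If $X$, $Y$, and $Z$ are discrete random variables, then we define $X$ and $Y$ to be conditionally independent given $Z$ if

$$\mathrm{P}(X \leq x, Y \leq y \mid Z = z) = \mathrm{P}(X \leq x \mid Z = z) \cdot \mathrm{P}(Y \leq y \mid Z = z)$$

for all $x$, $y$ and $z$ such that $\mathrm{P}(Z = z) > 0$. On the other hand, if the random variables are continuous and have a joint probability density function $f_{XYZ}(x, y, z)$, then $X$ and $Y$ are conditionally independent given $Z$ if

$$f_{XY \mid Z}(x, y \mid z) = f_{X \mid Z}(x \mid z) \cdot f_{Y \mid Z}(y \mid z)$$

for all real numbers $x$, $y$ and $z$ such that $f_Z(z) > 0$.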
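If discrete $X$ and $Y$ are conditionally independent given $Z$, then

$$\mathrm{P}(X = x \mid Y = y, Z = z) = \mathrm{P}(X = x \mid Z = z)$$

for any $x$, $y$ and $z$ with $\mathrm{P}(Y = y, Z = z) > 0$. That is, the conditional distribution for $X$ given $Y$ and $Z$ is the same as that given $Z$ alone. A similar equation holds for the conditional probability density functions in the continuous case.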
Independence can be seen as a special kind of conditional independence, since probability can be seen as a kind of conditional probability given no events.
History
Before 1933, independence, in probability theory, was defined in a verbal manner. For example, de Moivre gave the following definition: “Two events are independent, when they have no connexion one with the other, and that the happening of one neither forwards nor obstructs the happening of the other”.[12] If there are n independent events, the probability of the event that all of them happen was computed as the product of the probabilities of these n events. Apparently, there was the conviction that this formula was a consequence of the above definition. (Sometimes this was called the Multiplication Theorem.) Of course, a proof of this assertion cannot work without further, more formal, tacit assumptions.
The definition of independence given in this article became the standard definition (now used in all books) after it appeared in 1933 as part of Kolmogorov's axiomatization of probability.[13] Kolmogorov credited it to S.N. Bernstein, and quoted a publication which had appeared in Russian in 1927.[14]
Unfortunately, neither Bernstein nor Kolmogorov was aware of the work of Georg Bohlmann. Bohlmann had given the same definition for two events in 1901[15] and for n events in 1908.[16] In the latter paper, he studied his notion in detail. For example, he gave the first example showing that pairwise independence does not imply mutual independence. Even today, Bohlmann is rarely quoted. More about his work can be found in On the contributions of Georg Bohlmann to probability theory by Ulrich Krengel.[17]
References
- [1] Russell, Stuart; Norvig, Peter (2002). Artificial Intelligence: A Modern Approach. Prentice Hall. p. 478. ISBN 0-13-790395-2.
- [2] Florescu, Ionut (2014). Probability and Stochastic Processes. Wiley. ISBN 978-0-470-62455-5.
- [3] Gallager, Robert G. (2013). Stochastic Processes Theory for Applications. Cambridge University Press. ISBN 978-1-107-03975-9.
- [4] Feller, W. (1971). "Stochastic Independence". An Introduction to Probability Theory and Its Applications. Wiley.
- [5] Papoulis, Athanasios (1991). Probability, Random Variables and Stochastic Processes. McGraw-Hill. ISBN 0-07-048477-5.
- [6] Hsu, Hwei P. (1997). Theory and Problems of Probability, Random Variables, and Random Processes. McGraw-Hill. ISBN 0-07-030644-3.
- [7] Lapidoth, Amos (8 February 2017). A Foundation in Digital Communication. Cambridge University Press. ISBN 978-1-107-17732-1.
- [8] Durrett, Richard (1996). Probability: Theory and Examples (Second ed.). p. 62.
- [9] Jakeman, E. Modeling Fluctuations in Scattered Waves. ISBN 978-0-7503-1005-5.
- [10] Park, Kun Il (2018). Fundamentals of Probability and Stochastic Processes with Applications to Communications. Springer. ISBN 978-3-319-68074-3.
- [11] George, Glyn (November 2004). "Testing for the independence of three events". Mathematical Gazette. 88: 568.
- [12] Cited according to: Grinstead and Snell's Introduction to Probability. In: The CHANCE Project. Version of July 4, 2006.
- [13] Kolmogorov, Andrey (1933). Grundbegriffe der Wahrscheinlichkeitsrechnung (in German). Berlin: Julius Springer. Translation: Kolmogorov, Andrey (1956). Foundations of the Theory of Probability (2nd ed.). New York: Chelsea. ISBN 978-0-8284-0023-7.
- [14] Bernstein, S.N. (1927). Probability Theory (in Russian). Moscow. (4 editions, latest 1946.)
- [15] Bohlmann, Georg (1901). "Lebensversicherungsmathematik". Encyklopädie der mathematischen Wissenschaften, Bd. I, Teil 2, Artikel I D 4b, 852–917.
- [16] Bohlmann, Georg (1908). "Die Grundbegriffe der Wahrscheinlichkeitsrechnung in ihrer Anwendung auf die Lebensversicherung". Atti del IV. Congr. Int. dei Matem. Rom, Bd. III, 244–278.
- [17] Krengel, Ulrich (2011). "On the contributions of Georg Bohlmann to probability theory". Electronic Journal for History of Probability and Statistics.