probability
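given a sample space \(S\) and an associated sigma-algebra \(\mathcal{B}\), a probability function is a function \(P\) with domain \(\mathcal{B}\) that satisfies
- \(P(A) \ge 0\) for all \(A \in \mathcal{B}\).
- \(P(S) = 1\).
- if \(A_1, A_2, \ldots \in \mathcal{B}\) are pairwise disjoint, then \(P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)\).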
we need general methods of defining probability functions that we know will always satisfy Kolmogorov's Axioms. we do not want to have to check the axioms for each new probability function. the following gives a common method of defining a legitimate probability function.
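let \(S = \{s_1, \ldots, s_n\}\) be a finite set. let \(\mathcal{B}\) be any sigma algebra of subsets of \(S\). let \(p_1, \ldots, p_n\) be nonnegative numbers that sum to 1. for any \(A \in \mathcal{B}\), define \(P(A)\) by
\[ P(A) = \sum_{\{i \,:\, s_i \in A\}} p_i . \]
(the sum over an empty set is defined to be 0.) then \(P\) is a probability function on \(\mathcal{B}\). this remains true if \(S = \{s_1, s_2, \ldots\}\) is a countable set.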
[cite:;taken from @berger_inference_2002 chapter 1 basics of probability theory; theorem 1.2.6]
[cite:;refer to @berger_inference_2002 chapter 1 basics of probability theory; example 1.2.7]
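the construction in the theorem can be checked numerically on a small case. the following is a minimal sketch, assuming a hypothetical fair six-sided die (the sample space `S`, the weights `p`, and the helper names `P` and `power_set` are illustrative and not from the source). it builds \(P\) from the point masses and verifies the three axioms on the power set, with finite additivity standing in for countable additivity since the set is finite.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical example: a fair six-sided die, S = {1, ..., 6}, p_i = 1/6.
S = [1, 2, 3, 4, 5, 6]
p = {s: Fraction(1, 6) for s in S}

def P(A):
    """P(A) = sum of p_i over the outcomes s_i in A (0 for the empty set)."""
    return sum((p[s] for s in A), Fraction(0))

def power_set(items):
    """All subsets of a finite set, used here as the sigma-algebra B."""
    items = list(items)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(c) for c in subsets]

B = power_set(S)

# Axiom 1: nonnegativity.
assert all(P(A) >= 0 for A in B)
# Axiom 2: the whole sample space has probability 1.
assert P(S) == 1
# Axiom 3 (finite case): additivity over every disjoint pair in B.
assert all(P(A | C) == P(A) + P(C) for A in B for C in B if not (A & C))

print(P({2, 4, 6}))  # 1/2
```

exact fractions are used instead of floats so that the equality checks in the axioms are not disturbed by rounding.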
before we leave the axiomatic development of probability, there is one further point to consider. axiom 3 in the definition of a probability function, which is commonly known as the Axiom of Countable Additivity, is not universally accepted among statisticians. indeed, it can be argued that axioms should be simple, self-evident statements. comparing axiom 3 to the other axioms, which are simple and self-evident, may lead us to doubt whether it is reasonable to assume the truth of axiom 3. the Axiom of Countable Additivity is rejected by a school of statisticians led by de Finetti (1972), who chooses to replace this axiom with the Axiom of Finite Additivity: if \(A \in \mathcal{B}\) and \(B \in \mathcal{B}\) are disjoint, then \(P(A \cup B) = P(A) + P(B)\).
while this axiom may not be entirely self-evident, it is certainly simpler than the Axiom of Countable Additivity (and is implied by it).
assuming only finite additivity, while perhaps more plausible, can lead to unexpected complications in statistical theory - complications that, at this level, do not necessarily enhance understanding of the subject. we therefore proceed under the assumption that the Axiom of Countable Additivity holds.
[cite:;taken from @berger_inference_2002 chapter 1 basics of probability theory]
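if \(P\) is a probability function and \(A\) is any set in \(\mathcal{B}\), then
- \(P(\emptyset) = 0\), where \(\emptyset\) is the empty set;
- \(P(A) \le 1\);
- \(P(A^c) = 1 - P(A)\).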
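if \(P\) is a probability function and \(A\) and \(B\) are any sets in \(\mathcal{B}\), then
- \(P(B \cap A^c) = P(B) - P(A \cap B)\);
- \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\);
- if \(A \subset B\), then \(P(A) \le P(B)\).
the following theorem gives some useful results for dealing with a collection of sets.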
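if \(P\) is a probability function, then
- \(P(A) = \sum_{i=1}^{\infty} P(A \cap C_i)\) for any partition \(C_1, C_2, \ldots\);
- \(P\left(\bigcup_{i=1}^{\infty} A_i\right) \le \sum_{i=1}^{\infty} P(A_i)\) for any sets \(A_1, A_2, \ldots\) (Boole's inequality).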
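both results can be illustrated on a finite example. the sketch below assumes a hypothetical uniform probability on six equally likely outcomes (the sets `A`, `partition`, and `events` are made up for illustration). it checks that summing \(P(A \cap C_i)\) over a partition recovers \(P(A)\), and that the probability of a union of overlapping events does not exceed the sum of their probabilities.

```python
from fractions import Fraction

# Hypothetical sample space: six equally likely outcomes.
S = frozenset(range(1, 7))

def P(A):
    """Uniform probability: |A| / |S|."""
    return Fraction(len(A), len(S))

A = frozenset({1, 2, 3, 4})
partition = [frozenset({1, 2}), frozenset({3, 4, 5}), frozenset({6})]

# Partition result: P(A) equals the sum of P(A & C_i) over the partition.
partition_sum = sum(P(A & C) for C in partition)
assert partition_sum == P(A)

# Boole's inequality: overlapping events are double counted in the sum,
# so the sum is an upper bound on the probability of the union.
events = [frozenset({1, 2, 3}), frozenset({3, 4}), frozenset({4, 5})]
union = frozenset().union(*events)
boole_bound = sum(P(E) for E in events)
assert P(union) <= boole_bound

print(P(union), boole_bound)  # 5/6 7/6
```

the bound 7/6 exceeds 1 here, which shows that Boole's inequality can be uninformative when the events overlap heavily or are individually likely.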
the second identity of the theorem for two sets above, \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\), gives a useful inequality for the probability of an intersection. since \(P(A \cup B) \le 1\), we have from this identity, after some rearranging,
\[ P(A \cap B) \ge P(A) + P(B) - 1 . \]
this inequality is a special case of what is known as Bonferroni's inequality. Bonferroni's inequality allows us to bound the probability of a simultaneous event (the intersection) in terms of the probabilities of the individual events.
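there is a similarity between Boole's inequality and Bonferroni's inequality. in fact, they are essentially the same thing. we could have used Boole's inequality to derive the two-set bound \(P(A \cap B) \ge P(A) + P(B) - 1\). if we apply Boole's inequality to the complements \(A_1^c, \ldots, A_n^c\), we have
\[ P\left(\bigcup_{i=1}^{n} A_i^c\right) \le \sum_{i=1}^{n} P(A_i^c), \]
and using the facts that \(\bigcup_{i=1}^{n} A_i^c = \left(\bigcap_{i=1}^{n} A_i\right)^c\) and \(P(A_i^c) = 1 - P(A_i)\), we obtain
\[ 1 - P\left(\bigcap_{i=1}^{n} A_i\right) \le n - \sum_{i=1}^{n} P(A_i). \]
this becomes, on rearranging terms,
\[ P\left(\bigcap_{i=1}^{n} A_i\right) \ge \sum_{i=1}^{n} P(A_i) - (n - 1), \]
which is a more general version of the two-set Bonferroni inequality.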
[cite:;taken from @berger_inference_2002]
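the general Bonferroni bound \(P\left(\bigcap_{i=1}^{n} A_i\right) \ge \sum_{i=1}^{n} P(A_i) - (n - 1)\) can be tried out on a small case. this is a rough sketch under assumed data: a hypothetical uniform probability on ten outcomes and three made-up events, with `bonferroni_lower_bound` an illustrative helper name.

```python
from fractions import Fraction

# Hypothetical sample space: ten equally likely outcomes.
S = frozenset(range(1, 11))

def P(A):
    """Uniform probability: |A| / |S|."""
    return Fraction(len(A), len(S))

def bonferroni_lower_bound(events):
    """sum_i P(A_i) - (n - 1), a lower bound on P(intersection of the A_i)."""
    return sum(P(E) for E in events) - (len(events) - 1)

events = [
    frozenset(range(1, 10)),  # P = 9/10
    frozenset(range(2, 11)),  # P = 9/10
    frozenset(range(1, 9)),   # P = 8/10
]

intersection = frozenset.intersection(*events)
bound = bonferroni_lower_bound(events)
assert P(intersection) >= bound

print(P(intersection), bound)  # 7/10 3/5
```

the bound is only useful when the individual probabilities are large: if the sum of the \(P(A_i)\) is at most \(n - 1\), the right-hand side is nonpositive and the inequality says nothing beyond \(P \ge 0\).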