Basics of Probability

These notes are a simple reference summarizing the terminology, formulas and examples covered in class for basic probability. Specific examples and techniques for computation will be covered elsewhere. The notes are not comprehensive and are meant to supplement the text.

Set theory

Set theory lays the foundation for probability. Below is a review of the definitions of sets and the associated notation for operations and relations between sets.

Definitions and notation

Set: An unordered collection of elements.
Element: Given an element $a$, $a \in A$ means that "$a$ is an element of $A$". Likewise, $a\notin A$ means "$a$ is not an element of $A$".
Universal set: The set of all possible elements is usually denoted by $S$ (sometimes also $\Omega$).
Empty set: The set containing no elements, denoted by $\emptyset$.
Subset: We say that a set $A$ is a subset of another set $B$ if all of the elements of $A$ are also elements of $B$. This is denoted by $A \subset B$.
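Union: The union of $A$ and $B$, denoted by $A\cup B$, is the set of all elements that belong to $A$ or to $B$ (or to both).
Intersection: The intersection of $A$ and $B$, denoted by $A\cap B$, is the set of all elements that belong to both $A$ and $B$.
Disjoint: Two sets $A$ and $B$ are disjoint if they share no elements, that is, $A\cap B = \emptyset$.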
Complement: The complement of $A$, denoted by $\overline{A}$ (sometimes also $A^c$) is the set of all elements that don't belong to $A$.
Relative complement: The relative complement of $A$ in $B$, denoted by $B\backslash A$, is the set of all elements in $B$ that don't belong to $A$. This is sometimes called the set-theoretic difference and is sometimes denoted by $B - A$.
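The operations above map directly onto Python's built-in `set` type. The following is a minimal sketch; the universal set `S` and the sets `A` and `B` are illustrative choices, not taken from the notes.

```python
# Illustrative universal set and two subsets (assumed values for demonstration).
S = set(range(1, 11))        # universal set {1, ..., 10}
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

union = A | B                # union of A and B
intersection = A & B         # intersection of A and B
complement_A = S - A         # complement of A, taken relative to S
difference = B - A           # relative complement of A in B

print(3 in A, 7 in A)        # membership tests: True False
print(A <= S)                # subset test: True
print(A.isdisjoint({7, 8}))  # disjointness test: True
print(union, intersection, complement_A, difference)
```

Note that Python has no built-in universal set, so the complement has to be computed as a difference from an explicitly chosen `S`.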

Venn Diagrams

Venn Diagrams are a useful tool to visualize set operations and relations as two-dimensional disks (or other shapes). Below are some examples. The colored region is labeled below the diagram.

$A\cup B$

$A\cap B$

$A\backslash B$

$\bar{A}$

Compositions and algebraic relations

Sets equipped with the operations of union, intersection and complement form what is known as a Boolean algebra. This algebra satisfies the distributive property.

Distributive property

Let $A,B,C \subset S$, then \[ (A\cup B)\cap C = (A\cap C)\cup(B\cap C) \] and \[ (A\cap B)\cup C = (A\cup C) \cap(B\cup C). \]
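Both identities can be checked by brute force on a small universal set. The sketch below assumes $S = \{1,2,3\}$ and tests every triple of subsets; the helper `subsets` is a hypothetical name introduced for the example. A finite check like this is a sanity test, not a proof.

```python
from itertools import combinations, product

S = {1, 2, 3}  # small illustrative universal set

def subsets(universe):
    """Return every subset of `universe` as a list of sets."""
    items = sorted(universe)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

all_subsets = subsets(S)  # 2**3 = 8 subsets

# Test both distributive identities for every triple (A, B, C).
distributive_holds = all(
    (A | B) & C == (A & C) | (B & C) and (A & B) | C == (A | C) & (B | C)
    for A, B, C in product(all_subsets, repeat=3)
)
print(distributive_holds)  # True
```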

It will be useful later to understand how to decompose sets into other sets. For instance, it follows immediately from the definition of the complement that for any set $A\subset S$ \[ S = A \cup \bar{A} \quad \text{and} \quad A\cap \bar{A} = \emptyset. \]

Additionally, we can always decompose $A\cup B$ into two disjoint sets: either $B$ and $A\backslash B$, or $A$ and $B\backslash A$. The first decomposition can be visualized via the following Venn diagrams.

\[ {\color{salmon}{A\cup B}} = {\color{yellowgreen}{(A\backslash B)}} \cup {\color{teal}{B}} \]

Question

If $A$ and $B$ are disjoint, what are $A\backslash B$ and $B\backslash A$?

DeMorgan's laws

We can "distribute" a complement through unions and intersections by turning unions into intersections and vice versa: \[ \begin{aligned} \overline{(A\cup B)} &= \overline{A}\cap \overline{B}\\ \overline{(A\cap B)} &= \overline{A}\cup \overline{B}. \end{aligned} \]

We can verify DeMorgan's laws using Venn diagrams. For instance, $\overline{(A\cup B)} = \overline{A}\cap \overline{B}$ can be seen as

\[ {\color{salmon}{\overline{(A\cup B)}}} = {\color{yellowgreen}{\overline{A}}} \cap {\color{teal}{\overline{B}}} \]

while the relation $\overline{(A\cap B)} = \overline{A}\cup \overline{B}$ is given by

\[ {\color{salmon}{\overline{(A\cap B)}}} = {\color{yellowgreen}{\overline{A}}} \cup {\color{teal}{\overline{B}}} \]
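A Venn diagram gives a visual check; an element-wise argument turns it into a proof. As a sketch for the first law, take any $x \in S$: \[ \begin{aligned} x \in \overline{(A\cup B)} &\iff x \notin A\cup B\\ &\iff x\notin A \text{ and } x\notin B\\ &\iff x\in\overline{A} \text{ and } x \in \overline{B}\\ &\iff x \in \overline{A}\cap\overline{B}. \end{aligned} \] The second law follows in the same way, or by applying the first law to $\overline{A}$ and $\overline{B}$ and then taking complements of both sides.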

Probability

The theory of probability is built on set theory, although with slightly different terminology.

Terminology

We will commonly use the following terminology:

Experiment: A process by which an observation is made, with well-defined possible outcomes.
Sample space: The set $S$ of all possible outcomes of an experiment. This is the same as the universal set.
Event: A set of possible outcomes, that is, a subset of the sample space. Events are usually denoted by capital letters.
Simple event: An event with only one possible outcome. Simple events correspond to the elements of the sample space $S$ and are also known as sample points.
Mutually exclusive: Two events $A$ and $B$ are mutually exclusive if they don't share any possible outcomes. In set-theoretic notation, they are disjoint: $A\cap B = \emptyset$.
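To make the terminology concrete, the sketch below models one roll of a six-sided die. The experiment and the event names are illustrative choices, not taken from the notes.

```python
# Experiment: roll one six-sided die (an illustrative choice).
S = {1, 2, 3, 4, 5, 6}     # sample space: all possible outcomes
even = {2, 4, 6}           # event "the roll is even"
odd = {1, 3, 5}            # event "the roll is odd"
at_least_five = {5, 6}     # event "the roll is 5 or 6"
three = {3}                # simple event: exactly one outcome

print(even <= S)                       # True: an event is a subset of S
print(even.isdisjoint(odd))            # True: mutually exclusive
print(even.isdisjoint(at_least_five))  # False: both contain the outcome 6
print(len(three) == 1)                 # True: a simple event
```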

Probability measure

A probability measure $P$ is a function that assigns a probability to each event (a number between $0$ and $1$ signifying how likely the event is). Any probability measure must satisfy the axioms of probability.

Axioms of probability

(1) Non-negativity: $P(A) \geq 0$ for each $A\subset S$.
(2) Normality: $P(S) = 1$.
(3) Countable additivity: Let $A_1,A_2,\ldots$ be a (potentially infinite) sequence of events that are pairwise mutually exclusive, \[ A_i\cap A_j = \emptyset ,\quad i \neq j.\] Then \[ P(A_1\cup A_2 \cup \ldots ) = \sum_{i=1}^\infty P(A_i). \]

It is easy to see that countable additivity implies finite additivity in the sense that for any finite sequence $A_1, A_2, \ldots, A_n$ of pairwise mutually exclusive events one has \[ P(A_1\cup A_2 \cup \ldots \cup A_n) = \sum_{i=1}^n P(A_i). \]
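On a finite sample space the axioms can be checked exhaustively. The sketch below assumes a fair die with equally likely outcomes, so that $P(A) = |A|/|S|$; this choice of measure is an assumption made for the example. Because $S$ is finite, checking additivity for every pair of disjoint events is enough, since additivity for longer finite sequences then follows by induction.

```python
from fractions import Fraction
from itertools import combinations

S = frozenset(range(1, 7))  # fair die; equally likely outcomes are assumed

def P(event):
    """Assumed measure: P(A) = |A| / |S|."""
    return Fraction(len(event), len(S))

# All 2**6 = 64 events (subsets of S).
events = [frozenset(c) for r in range(len(S) + 1)
          for c in combinations(sorted(S), r)]

non_negative = all(P(A) >= 0 for A in events)         # axiom 1
normalized = P(S) == 1                                # axiom 2
additive = all(P(A | B) == P(A) + P(B)                # additivity for pairs
               for A in events for B in events if A.isdisjoint(B))

print(non_negative, normalized, additive)  # True True True
```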

Note

Finite additivity does not imply countable additivity, and therefore axiom 3 cannot in general be replaced by a statement about finite additivity. This is especially important when dealing with infinite sample spaces, where it is natural to decompose events as countable unions.
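As an illustration of a countable decomposition (the sample space and probabilities here are chosen for the example, not taken from the notes), let $S = \{1,2,3,\ldots\}$ with $P(\{k\}) = 2^{-k}$. Writing $S$ as the countable union of its simple events, countable additivity and the geometric series give \[ P(S) = \sum_{k=1}^\infty 2^{-k} = 1, \] consistent with normality. The same reasoning gives the probability of the event $E = \{2,4,6,\ldots\}$: \[ P(E) = \sum_{j=1}^\infty 2^{-2j} = \frac{1/4}{1-1/4} = \frac{1}{3}. \] Neither computation is available from finite additivity alone, since both events are infinite unions of simple events.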

Discrete vs continuous probability

Probability theory is often split into two distinct flavors depending on whether the sample space $S$ is discrete or continuous.

Discrete sample space: A discrete sample space is either finite or countably infinite (meaning the simple events can be arranged in a list and counted).
Continuous sample space: A sample space which is not discrete.

Examples: The sets $\{1,2\}$, $\{1,2,3,\ldots,n\}$, $\{1,2,3,\ldots\}$ and $\{2,4,6,8,\ldots\}$ are all discrete.

Examples: The sets $[0,1]$, $[0,1]^2$, $\mathbb{R}$ and $\mathbb{R}^2$ are all continuous.

In the case of discrete sample spaces, a probability measure is completely determined by its action on the simple events: \[ P(A) = \sum_{a \in A} P(\{a\}). \] In the case of continuous sample spaces, one has to be more careful. For instance, when $S = \mathbb{R}$, we will often make use of a probability density function $p(x)$ to define the measure via integration: \[ P(A) = \int_A p(x)\,dx. \]
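Both formulas can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the point masses describe a hypothetical loaded die, the density is $p(x) = 2x$ on $[0,1]$, and the integral is approximated by a midpoint Riemann sum rather than computed exactly.

```python
from fractions import Fraction

# Discrete case: hypothetical point masses P({a}) for a loaded six-sided die.
point_mass = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
              4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)}

def prob_discrete(event):
    """P(A) as the sum of P({a}) over the outcomes a in A."""
    return sum((point_mass[a] for a in event), Fraction(0))

# Continuous case: assumed density p(x) = 2x on [0, 1], zero elsewhere.
def density(x):
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def prob_interval(a, b, steps=10_000):
    """Approximate P([a, b]) by a midpoint Riemann sum of the density."""
    width = (b - a) / steps
    return sum(density(a + (i + 0.5) * width) for i in range(steps)) * width

print(prob_discrete({2, 4, 6}))          # 7/10
print(prob_discrete(point_mass.keys()))  # 1, as normality requires
print(prob_interval(0.5, 1.0))           # approximately 0.75
```

For this density the exact value is $P([a,b]) = b^2 - a^2$ for $0 \le a \le b \le 1$, which the numerical result can be compared against.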

Compositions and laws

The ability to decompose sets into two or more disjoint sets has a direct relation to a probability measure through additivity.

Law of complement

Let $A\subset S$, then \[ P(\bar{A})= 1 - P(A). \]

Proof: To show this we write $S = A \cup \bar{A}$. Since $A\cap \bar{A} = \emptyset$, we can apply additivity and normality to conclude \[ 1 = P(S) = P(A) + P(\bar{A}). \] Solving for $P(\bar{A})$ gives the law. QED

Note

The law of complement immediately implies that $P(\emptyset) = 0$.
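The law of complement is most useful when $\bar{A}$ is easier to count than $A$. As a sketch, consider rolling two dice and asking for the probability of at least one six; the experiment and the assumption of 36 equally likely outcomes are illustrative choices.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs of two die rolls, assumed equally likely.
S = set(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(len(event), len(S))

A = {roll for roll in S if 6 in roll}  # at least one six
A_bar = S - A                          # complement: no six at all

print(prob(A_bar))      # 25/36, since each die has 5 non-six faces
print(1 - prob(A_bar))  # 11/36 by the law of complement
print(prob(A))          # 11/36 by direct enumeration
```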

Law of differences

For any two sets $A, B\subset S$, we have \[\tag{1}\label{1} P(B\backslash A) = P(A\cup B) - P(A), \] and \[\tag{2}\label{2} P(B\backslash A) = P(B) - P(A\cap B). \]

Proof: Identity \eqref{1} follows by writing $A\cup B = (B\backslash A)\cup A$ (convince yourself this is true) and noting that $(B\backslash A)$ and $A$ are disjoint. Therefore, by additivity, \[ P(A\cup B) = P(B\backslash A) + P(A). \] Solving for $P(B\backslash A)$ gives \eqref{1}. Likewise, we can write $B = (B\backslash A)\cup (A\cap B)$. Since $(B\backslash A)$ and $(A\cap B)$ are disjoint, we find \[ P(B) = P(B\backslash A) + P(A\cap B). \] Solving for $P(B\backslash A)$ gives \eqref{2}.
QED
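The law of differences does not require equally likely outcomes. The sketch below checks both identities on a hypothetical loaded die; the point masses and the events `A` and `B` are assumptions made for the example.

```python
from fractions import Fraction

# Hypothetical loaded die: outcome 6 has probability 1/2, the rest 1/10 each.
point_mass = {1: Fraction(1, 10), 2: Fraction(1, 10), 3: Fraction(1, 10),
              4: Fraction(1, 10), 5: Fraction(1, 10), 6: Fraction(1, 2)}

def prob(event):
    return sum((point_mass[a] for a in event), Fraction(0))

A = {2, 4, 6}  # the roll is even
B = {4, 5, 6}  # the roll is at least 4

direct = prob(B - A)               # P(B \ A) computed directly
via_union = prob(A | B) - prob(A)  # right-hand side of identity (1)
via_inter = prob(B) - prob(A & B)  # right-hand side of identity (2)

print(direct, via_union, via_inter)  # 1/10 1/10 1/10
```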

Law of addition

Let $A,B \subset S$ be two events (not necessarily mutually exclusive), then \[ P(A\cup B) = P(A) + P(B) - P(A\cap B). \]

Proof: Equate \eqref{1} and \eqref{2} in the law of differences and solve for $P(A\cup B)$. More intuitively, in the sum $P(A) + P(B)$ the probability of the overlap $A\cap B$ gets counted twice, so it must be subtracted once.
QED
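As a worked instance of the law of addition, consider drawing one card from a standard 52-card deck and asking for the probability that it is a heart or a face card. The deck model and the assumption that every card is equally likely are illustrative choices.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))  # 52 (rank, suit) pairs, assumed equally likely

def prob(event):
    return Fraction(len(event), len(deck))

hearts = {card for card in deck if card[1] == "hearts"}       # 13 cards
faces = {card for card in deck if card[0] in {"J", "Q", "K"}}  # 12 cards

lhs = prob(hearts | faces)
rhs = prob(hearts) + prob(faces) - prob(hearts & faces)  # overlap: 3 cards

print(lhs, rhs)  # 11/26 11/26
```

The three face cards that are also hearts would be counted twice in `prob(hearts) + prob(faces)`, which is why the overlap is subtracted.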

Question

Can you show that \[ \begin{aligned} P(A\cup B\cup C) &= P(A) + P(B) + P(C)\\ &- P(A\cap B) - P(A\cap C)\\ &- P(B\cap C)\\ &+ P(A\cap B \cap C)? \end{aligned} \] (Hint: Apply the law of addition three times.)
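Before attempting the proof, it can help to see the identity hold on a concrete case. The sketch below is a numerical check only, not a derivation; it assumes a number drawn with equal probability from $\{1,\ldots,30\}$ and takes $A$, $B$, $C$ to be the multiples of 2, 3 and 5.

```python
from fractions import Fraction

S = set(range(1, 31))  # outcomes 1..30, assumed equally likely

def prob(event):
    return Fraction(len(event), len(S))

A = {n for n in S if n % 2 == 0}  # multiples of 2
B = {n for n in S if n % 3 == 0}  # multiples of 3
C = {n for n in S if n % 5 == 0}  # multiples of 5

lhs = prob(A | B | C)
rhs = (prob(A) + prob(B) + prob(C)
       - prob(A & B) - prob(A & C) - prob(B & C)
       + prob(A & B & C))

print(lhs, rhs)  # 11/15 11/15
```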

Notes

APMA 1650 - Spring 2021
