These notes are intended as a simple reference that summarizes the terminology, formulas and examples covered in class for basic probability. Specific examples and computational techniques will be covered elsewhere. The notes are not comprehensive and are meant to supplement the text.
Set theory lays the foundation for probability. Below is a review of the definitions of sets and the associated notation for operations on and relations between sets.
Venn Diagrams are a useful tool to visualize set operations and relations as two-dimensional disks (or other shapes). Below are some examples. The colored region is labeled below the diagram.
Sets equipped with the operations of union, intersection and complement form what is known as a Boolean algebra. This algebra satisfies the distributive property.
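As a quick sanity check, the two standard distributive laws, $A\cap(B\cup C) = (A\cap B)\cup(A\cap C)$ and $A\cup(B\cap C) = (A\cup B)\cap(A\cup C)$, can be tested on small sets. The sketch below uses Python's built-in `set` type; the particular sets `S`, `A`, `B` and `C` are made up for illustration and are not taken from the notes.

```python
# Hypothetical example sets inside a small universe S (chosen for illustration).
S = set(range(1, 11))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {4, 6, 8, 10}

# Intersection distributes over union: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
lhs_1 = A & (B | C)
rhs_1 = (A & B) | (A & C)

# Union distributes over intersection: A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
lhs_2 = A | (B & C)
rhs_2 = (A | B) & (A | C)

print(lhs_1 == rhs_1, lhs_2 == rhs_2)
```

A check on one choice of sets is of course not a proof, but it is a useful way to catch a misremembered identity.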
It will be useful later to understand how to decompose sets into other sets. For instance, it follows immediately that for any set $A\subset S$ \[ S = A \cup \bar{A} \quad \text{and} \quad A\cap \bar{A} = \emptyset \]
Additionally, we can always decompose $A\cup B$ into two disjoint sets: either $B$ and $A\backslash B$, or $A$ and $B\backslash A$. This can be visualized via the following Venn diagrams.
\[ {\color{salmon}{A\cup B}} = {\color{yellowgreen}{(A\backslash B)}} \cup {\color{teal}{B}} \]
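The same decomposition can be checked on concrete sets. The following is a minimal sketch using Python sets, where the example sets `A` and `B` are assumptions chosen for illustration; `A - B` plays the role of $A\backslash B$.

```python
# Hypothetical example sets (not from the notes).
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

a_minus_b = A - B  # A \ B: elements of A that are not in B
b_minus_a = B - A  # B \ A: elements of B that are not in A

# First decomposition: A ∪ B = (A \ B) ∪ B, with the two pieces disjoint.
first_ok = (a_minus_b | B == A | B) and a_minus_b.isdisjoint(B)

# Second decomposition: A ∪ B = (B \ A) ∪ A, again with disjoint pieces.
second_ok = (b_minus_a | A == A | B) and b_minus_a.isdisjoint(A)

print(first_ok, second_ok)
```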
We can verify De Morgan's laws using Venn diagrams. For instance, $\overline{(A\cup B)} = \overline{A}\cap \overline{B}$ can be seen as
\[ {\color{salmon}{\overline{(A\cup B)}}} = {\color{yellowgreen}{\overline{A}}} \cap {\color{teal}{\overline{B}}} \]
while the relation $\overline{(A\cap B)} = \overline{A}\cup \overline{B}$ is given by
\[ {\color{salmon}{\overline{(A\cap B)}}} = {\color{yellowgreen}{\overline{A}}} \cup {\color{teal}{\overline{B}}} \]
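Both of De Morgan's laws can also be confirmed on a small example. The sketch below assumes a made-up universe `S` and takes complements relative to it; the helper `complement` is a hypothetical name introduced only for this illustration.

```python
# Hypothetical universe and events (chosen for illustration).
S = set(range(1, 11))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

def complement(X, universe=S):
    """Complement of X relative to the universe (the sample space S)."""
    return universe - X

# First law: complement of a union is the intersection of the complements.
law_1 = complement(A | B) == complement(A) & complement(B)

# Second law: complement of an intersection is the union of the complements.
law_2 = complement(A & B) == complement(A) | complement(B)

print(law_1, law_2)
```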
The theory of probability is built on set theory, although with slightly different terminology.
We will commonly use the following terminology:
A probability measure $P$ is a function that assigns to each event a probability (a number between $0$ and $1$ signifying how likely the event is). Any probability measure must satisfy the axioms of probability.
It is easy to see that countable additivity implies finite additivity in the sense that for any finite sequence $A_1, A_2, \ldots, A_n$ of pairwise mutually exclusive events one has \[ P(A_1\cup A_2 \cup \ldots \cup A_n) = \sum_{i=1}^n P(A_i). \]
Probability theory is often split into two distinct flavors depending on whether the sample space $S$ is discrete or continuous.
Examples: The sets $\{1,2\}$, $\{1,2,4,\ldots,n\}$, $\{1,2,3,\ldots\}$ and $\{2,4,6,8,\ldots\}$ are all discrete.
Examples: The sets $[0,1]$, $[0,1]^2$, $\mathbb{R}$ and $\mathbb{R}^2$ are all continuous.
In the case of discrete sample spaces, a probability measure is completely determined by its values on the single-outcome events $\{a\}$: \[ P(A) = \sum_{a \in A} P(\{a\}). \] In the case of continuous sample spaces, one has to be more careful. For instance, when $S = \mathbb{R}$, we will often make use of a probability density function $p(x)$ to define the measure via integration \[ P(A) = \int_A p(x)\,dx. \]
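To make the two formulas concrete, here is a small sketch in Python. Both examples are assumptions chosen for illustration: a fair six-sided die for the discrete case, and the uniform density on $[0,1]$ for the continuous case, integrated with a simple midpoint rule. The names `point_mass`, `prob`, `density` and `prob_interval` are hypothetical helpers, not notation from the notes.

```python
from fractions import Fraction

# Discrete case: a fair six-sided die (assumed example), P({a}) = 1/6.
point_mass = {outcome: Fraction(1, 6) for outcome in range(1, 7)}

def prob(event):
    """P(A) as the sum of P({a}) over the outcomes a in A."""
    return sum((point_mass[a] for a in event), Fraction(0))

even = {2, 4, 6}
print(prob(even))  # 1/2

# Continuous case: uniform density on [0, 1] (assumed example).
def density(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def prob_interval(a, b, n=10_000):
    """Approximate the integral of the density over [a, b] by a midpoint rule."""
    width = (b - a) / n
    return sum(density(a + (i + 0.5) * width) for i in range(n)) * width

print(prob_interval(0.25, 0.75))  # approximately 0.5
```

Exact fractions are used in the discrete part so that the sums can be compared without rounding error; the continuous part is only a numerical approximation.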
The ability to decompose sets into two or more disjoint sets has a direct relation to a probability measure through additivity.
Proof: To show this we write $S = A \cup \bar{A}$. Since $A\cap \bar{A} = \emptyset$, we can apply additivity and normality to conclude \[ 1 = P(S) = P(A) + P(\bar{A}). \] Solving for $P(\bar{A})$ gives the law. QED
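The complement law $P(\bar{A}) = 1 - P(A)$ is often the easiest route to a probability. As a hedged illustration, assume two fair dice are rolled (an example not taken from the notes) and let the event of interest be "at least one six"; its complement, "no six", is easier to count. The helper `prob` below assumes all outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: ordered pairs from two fair six-sided dice.
S = set(product(range(1, 7), repeat=2))

def prob(event):
    """Probability under the assumption that all outcomes are equally likely."""
    return Fraction(len(event), len(S))

no_six = {(i, j) for (i, j) in S if i != 6 and j != 6}
at_least_one_six = S - no_six  # complement of no_six within S

# Complement law: P(at least one six) = 1 - P(no six).
print(1 - prob(no_six))        # 11/36
print(prob(at_least_one_six))  # 11/36, counted directly
```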
Proof: This follows by writing $A\cup B = (B\backslash A)\cup A$ (convince yourself this is true) and realizing that $(B\backslash A)$ and $A$ are disjoint. Therefore, by additivity,
\[
P(A\cup B) = P(B\backslash A) + P(A).
\]
Solving for $P(B\backslash A)$ gives \eqref{1}. Likewise we can also write $B = (B\backslash A)\cup (A\cap B)$. Since $(B\backslash A)$ and $(A\cap B)$ are disjoint, we find
\[
P(B) = P(B\backslash A) + P(A\cap B).
\]
Solving for $P(B\backslash A)$ gives \eqref{2}.
QED
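The two expressions for $P(B\backslash A)$ obtained in the proof, $P(A\cup B) - P(A)$ and $P(B) - P(A\cap B)$, can be compared on a small example. This sketch assumes a single fair die with $A$ the even outcomes and $B$ the outcomes greater than three; these events and the helper `prob` are made up for illustration.

```python
from fractions import Fraction

# Assumed sample space: one fair six-sided die.
S = set(range(1, 7))

def prob(event):
    """Probability under the assumption that all outcomes are equally likely."""
    return Fraction(len(event), len(S))

A = {2, 4, 6}  # even outcomes
B = {4, 5, 6}  # outcomes greater than three

direct = prob(B - A)                  # P(B \ A) counted directly
via_union = prob(A | B) - prob(A)     # P(A ∪ B) - P(A)
via_overlap = prob(B) - prob(A & B)   # P(B) - P(A ∩ B)

print(direct, via_union, via_overlap)  # 1/6 1/6 1/6
```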
Proof: Equate \eqref{1} and \eqref{2} in the law of differences and solve for $P(A\cup B)$. More intuitively, we can see that in the sum $P(A) + P(B)$ the probability of the overlap, $P(A\cap B)$, is counted twice, so it must be subtracted once.
QED
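Solving as described in the proof gives $P(A\cup B) = P(A) + P(B) - P(A\cap B)$. As a worked check, assume a standard 52-card deck with one card drawn uniformly at random (an example chosen here, not from the notes), with $A$ the hearts and $B$ the face cards. The names `deck`, `hearts` and `faces` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed sample space: a standard 52-card deck, one card drawn at random.
ranks = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))

def prob(event):
    """Probability under the assumption that every card is equally likely."""
    return Fraction(len(event), len(deck))

hearts = {card for card in deck if card[1] == "hearts"}
faces = {card for card in deck if card[0] in {"J", "Q", "K"}}

# The overlap (face cards that are hearts) is subtracted once.
by_formula = prob(hearts) + prob(faces) - prob(hearts & faces)
by_counting = prob(hearts | faces)

print(by_formula, by_counting)  # 11/26 11/26
```

The three heart face cards appear in both `hearts` and `faces`, which is exactly the double counting that the subtraction corrects.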