1. Joint Probability Function
Joint probability functions can be defined for both discrete and continuous random variables. In this post I want to deal only with discrete random variables and their joint probability function.
Let Y_1, Y_2 be discrete random variables. The joint probability distribution for Y_1, Y_2 is given by
p(y_1, y_2)=P(Y_1=y_1, Y_2=y_2), \,\,\, -\infty< y_1 < \infty, -\infty< y_2 < \infty
The function p(y_1, y_2) will be referred to as the joint probability function. [Reference: Mathematical Statistics with Applications]
The joint probability function satisfies the following two properties.
- 1. p(y_1, y_2) \geq 0 \,\,\, for \, all \, y_1, y_2
- 2. \sum_{y_1, y_2} p(y_1, y_2) = 1
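The two properties above can be checked mechanically for any finite table. The following is only a minimal sketch: the helper name `is_valid_joint_pmf` and the two-coin table are assumptions made for illustration, not something taken from the textbook. Exact fractions are used so that the sum can be compared with 1 without rounding issues.

```python
from fractions import Fraction


def is_valid_joint_pmf(p):
    """Check the two properties for a dict mapping (y1, y2) -> probability."""
    nonnegative = all(prob >= 0 for prob in p.values())  # property 1
    sums_to_one = sum(p.values()) == 1                   # property 2
    return nonnegative and sums_to_one


# Hypothetical joint pmf: two fair coin flips, each outcome has probability 1/4.
coins = {(a, b): Fraction(1, 4) for a in "HT" for b in "HT"}

print(is_valid_joint_pmf(coins))
```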
Let's see an example. Let the sample space be S = {third-grade middle school students in Seoul}, let Y1 represent a student's favorite idol, and let Y2 represent the area where the student lives. Y1 takes three values {'BTS', 'EXO', 'OTHERS'}, and Y2 takes two values {'NORTH', 'SOUTH'}. Then we can assign a joint probability to each outcome (Y1, Y2).
You can then answer the questions below.
- P(Y_1='BTS') =?
- P(Y_2='SOUTH') =?
- P(Y_1='BTS', Y_2='SOUTH') =?
The answer to the last question is 3/64. Please verify it yourself.
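The three questions can be answered by reading one cell of the joint table and by summing over the other variable. Here is a rough sketch of that computation. Note that the counts below are hypothetical: only the ('BTS', 'SOUTH') cell of 3/64 comes from the text, and the other five values are made up so that the table sums to 1. The helper `marginal` is also an assumed name.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical counts out of 64; only ('BTS', 'SOUTH') = 3 is taken from the text.
counts = {
    ("BTS", "NORTH"): 21, ("BTS", "SOUTH"): 3,
    ("EXO", "NORTH"): 10, ("EXO", "SOUTH"): 14,
    ("OTHERS", "NORTH"): 9, ("OTHERS", "SOUTH"): 7,
}
joint = {outcome: Fraction(c, 64) for outcome, c in counts.items()}


def marginal(joint, axis):
    """Sum the joint pmf over the other variable (axis 0 -> Y1, axis 1 -> Y2)."""
    out = defaultdict(Fraction)  # Fraction() is 0
    for outcome, prob in joint.items():
        out[outcome[axis]] += prob
    return dict(out)


p_y1 = marginal(joint, 0)
p_y2 = marginal(joint, 1)

print(p_y1["BTS"])               # P(Y_1='BTS') under the made-up table
print(p_y2["SOUTH"])             # P(Y_2='SOUTH') under the made-up table
print(joint[("BTS", "SOUTH")])   # P(Y_1='BTS', Y_2='SOUTH')
```

With a different table the two marginal answers change, but the procedure stays the same: a joint probability is a single cell, and a marginal probability is a row or column sum.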
2. Chain Rule
To derive the chain rule, the joint probability function can be thought of as a sequence of consecutive trials.
P(x, y) = P(x \, and \, y)
We can think of this as x followed by y, or vice versa.
I will not introduce the concept of a joint probability density function here. Instead, I want to explain the chain rule by treating a joint probability as a sequence of 'consecutive events'.
We can describe consecutive events as follows.
P(x \, and \, y) = P(x) \cdot P(y|x)
i.e.
probability \, of \, x \rightarrow y \, : \, P(x) \cdot P(y|x)
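A small sanity check of the 'consecutive events' reading, as a sketch only: the urn with three red and two blue balls is a made-up example, not part of the post. Here x is 'the first draw is red' and y is 'the second draw is red', drawn without replacement. The joint probability is computed twice, once by enumerating every ordered pair of draws and once as P(x) \cdot P(y|x).

```python
from fractions import Fraction
from itertools import permutations

urn = ["R", "R", "R", "B", "B"]  # hypothetical urn: 3 red, 2 blue

# Direct route: every ordered pair of distinct balls is equally likely.
pairs = list(permutations(range(len(urn)), 2))
both_red = sum(1 for i, j in pairs if urn[i] == "R" and urn[j] == "R")
p_joint = Fraction(both_red, len(pairs))

# Consecutive route: P(x), then P(y|x) with one red ball already removed.
p_x = Fraction(urn.count("R"), len(urn))
p_y_given_x = Fraction(urn.count("R") - 1, len(urn) - 1)

print(p_joint, p_x * p_y_given_x)
```

Both routes give the same value, which is what the product rule claims.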
What about the case x \rightarrow y \rightarrow z?
It can be decomposed into two steps.
1. x \rightarrow y, z
2. x, y \rightarrow z
At the first step,
probability \, of \, x \rightarrow y,z \, : \, P(x) \cdot P(y,z|x)
At the second step, we apply the same rule to P(y,z|x) while keeping x given, so that P(y,z|x) = P(y|x) \cdot P(z|x,y):
probability \, of \, x,y \rightarrow z \, : \, P(x) \cdot P(z|x,y) \cdot P(y|x)
In the same way, we can derive the joint probability of n events x_1, \, x_2, \cdots , x_n:
P(x_1,x_2, \cdots , x_n) = P(x_n|x_{n-1}, \cdots, x_1) \cdots P(x_2|x_1) \cdot P(x_1)
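The general formula can be checked numerically on a small table. The sketch below assumes a made-up joint pmf over three binary variables (weights 1 to 8 out of 36), and the helper names `prefix_prob` and `chain_rule_product` are hypothetical. Each conditional P(x_k|x_{k-1}, \cdots, x_1) is computed as a ratio of two marginals, and the product of all of them is compared with the joint probability.

```python
from fractions import Fraction
from itertools import product


def prefix_prob(joint, prefix):
    """P(x_1..x_k = prefix): sum the joint over the remaining variables."""
    k = len(prefix)
    return sum((p for outcome, p in joint.items() if outcome[:k] == prefix),
               Fraction(0))


def chain_rule_product(joint, outcome):
    """P(x_1) * P(x_2|x_1) * ... * P(x_n|x_{n-1},...,x_1) for one outcome."""
    result = Fraction(1)
    for k in range(1, len(outcome) + 1):
        numer = prefix_prob(joint, outcome[:k])      # P(x_1..x_k)
        denom = prefix_prob(joint, outcome[:k - 1])  # P(x_1..x_{k-1}); 1 when k = 1
        result *= numer / denom                      # P(x_k | x_1..x_{k-1})
    return result


# Hypothetical joint pmf over three binary variables; all weights are positive,
# so no conditional divides by zero.
outcomes = list(product([0, 1], repeat=3))
weights = [1, 2, 3, 4, 5, 6, 7, 8]
joint = {o: Fraction(w, sum(weights)) for o, w in zip(outcomes, weights)}

print(all(chain_rule_product(joint, o) == joint[o] for o in outcomes))
```

The check works for any number of variables because the product telescopes: each denominator cancels the previous numerator, leaving only P(x_1, \cdots, x_n).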