**There were some mistakes in this blog. I am very sorry about that. I have corrected them, so please let me know if you find anything wrong in later posts.**

In the previous *post* I explained independent trials, using the sampling of two red balls as the example.

Suppose there are seven black balls and three red balls inside a box, and consider drawing two red balls in a row. We can do that in two scenarios: 1) sample the balls without replacement, 2) sample the balls with replacement.

There are 7 black balls and 3 red balls inside a box.

1. Two consecutive red balls with replacement : \dfrac{3}{10} \times \dfrac{3}{10} = \dfrac{9}{100}

2. Two consecutive red balls without replacement : \dfrac{3}{10} \times \dfrac{2}{9} = \dfrac{6}{90} = \dfrac{1}{15}
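The two products above can be checked with exact fractions. This is only a minimal sketch; the variable names (`BLACK`, `RED`, `p_with`, `p_without`) are my own and simply mirror the box in the example.

```python
from fractions import Fraction

BLACK, RED = 7, 3          # box contents from the example
TOTAL = BLACK + RED

# With replacement: the box is unchanged between the two draws.
p_with = Fraction(RED, TOTAL) * Fraction(RED, TOTAL)

# Without replacement: one red ball is missing at the second draw.
p_without = Fraction(RED, TOTAL) * Fraction(RED - 1, TOTAL - 1)

print(p_with)     # 9/100
print(p_without)  # 1/15
```

`Fraction` reduces automatically, which is why 6/90 is printed as 1/15.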

Before talking about the independence of events, we should first define the events.

How should we define event A and event B?

At first glance, we could define “event A = red ball popped” and “event B = black ball popped”. Then how would we describe the case where the first ball is red and the second is also red?

It would have to be written as P(A) \times P(A|A). Isn’t that weird?

OK. Let’s drop this first model quickly!

In another approach, we can define “red ball popped in the first trial = event A”, “red ball popped in the second trial = event B”

Then the probability is P(A) \times P(B|A), which is a more plausible result.

But P(B) is still confusing. With replacement the probability is 3/10 * 3/10, where the second factor (3/10) is P(B|A). Without replacement the probability is 3/10 * 2/9, where the second factor (2/9) is again P(B|A). Neither product tells us P(B) itself.

Is event B, “red ball popped in the second trial”, really affected by event A or not? And how can we calculate P(B)?

To avoid this confusion, we should redefine our sample space so that each outcome records the results of both draws together, i.e. {BR, BB, RB, RR}, where B = Black and R = Red. (Inside an outcome, the letter B means a black ball, not event B.)

Therefore in the sampling with replacement

event A = {RB, RR}, P(A) = 3/10*7/10 + 3/10*3/10 = 3/10

event B = {BR, RR}, P(B) = 7/10*3/10 + 3/10*3/10 = 3/10

while in the sampling without replacement

event A = {RB, RR}, P(A) = 3/10*7/9 + 3/10*2/9 = 3/10

event B = {BR, RR}, P(B) = 7/10*3/9 + 3/10*2/9 = 3/10
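The four sums above can be reproduced by listing the sample space {RR, RB, BR, BB} and adding up the outcomes that belong to each event. The sketch below is one way to do it; `outcome_probs` and `event_prob` are hypothetical helper names that I made up for this illustration.

```python
from fractions import Fraction
from itertools import product

def outcome_probs(red, black, replace):
    """Return {outcome: probability} for two draws, e.g. 'RB' = red then black."""
    total = red + black
    counts = {"R": red, "B": black}
    probs = {}
    for first, second in product("RB", repeat=2):
        p_first = Fraction(counts[first], total)
        remaining = dict(counts)
        remaining_total = total
        if not replace:
            # The first ball stays out of the box for the second draw.
            remaining[first] -= 1
            remaining_total -= 1
        p_second = Fraction(remaining[second], remaining_total)
        probs[first + second] = p_first * p_second
    return probs

def event_prob(probs, outcomes):
    """Probability of an event given as a list of outcomes."""
    return sum(probs[o] for o in outcomes)

for replace in (True, False):
    probs = outcome_probs(red=3, black=7, replace=replace)
    p_a = event_prob(probs, ["RB", "RR"])  # event A: first ball is red
    p_b = event_prob(probs, ["BR", "RR"])  # event B: second ball is red
    print("with" if replace else "without", "replacement:", p_a, p_b)
```

Both lines print 3/10 for P(A) and for P(B), matching the hand calculation.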

Two events A and B are independent exactly when P(B|A) = P(B), so that is what we need to check.

By definition, P(B|A) = \dfrac {P(B \cap A)}{P(A)}

B \cap A is the event that both A and B occur, i.e. {RR}, so that

1. With replacement : P(B \cap A) = P({RR}) = 3/10*3/10

2. Without replacement : P(B \cap A) = P({RR}) = 3/10*2/9

therefore

1. With replacement : P(B|A) = \dfrac {P(B \cap A)}{P(A)}= \dfrac { \dfrac {9}{100}}{ \dfrac {3}{10}} = \dfrac {3}{10} = P(B)

2. Without replacement : P(B|A) = \dfrac {P(B \cap A)}{P(A)}= \dfrac { \dfrac {6}{90}}{ \dfrac {3}{10}} = \dfrac {2}{9} ≠ \dfrac {3}{10} = P(B)
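The comparison of P(B|A) with P(B) can also be written as a small check. This is a sketch under the same assumptions as the example (3 red balls out of 10); `second_red_given` and `check_independence` are names I invented here, and P(B) is computed by splitting on whether the first ball was red.

```python
from fractions import Fraction

def second_red_given(first_is_red, red, total, replace):
    """P(second ball is red | colour of the first ball)."""
    if replace:
        return Fraction(red, total)
    # Without replacement the first ball is missing from the box.
    return Fraction(red - (1 if first_is_red else 0), total - 1)

def check_independence(red, total, replace):
    """Return (P(B|A), P(B), whether they are equal)."""
    p_a = Fraction(red, total)
    p_b_given_a = second_red_given(True, red, total, replace)
    p_b_given_not_a = second_red_given(False, red, total, replace)
    p_b = p_a * p_b_given_a + (1 - p_a) * p_b_given_not_a
    return p_b_given_a, p_b, p_b_given_a == p_b

print(check_independence(3, 10, replace=True))
print(check_independence(3, 10, replace=False))
```

With replacement the two values agree (3/10 and 3/10), so A and B are independent. Without replacement they are 2/9 and 3/10, so A and B are dependent.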

Without the preceding explanation, one could be quite confused about why P(B) = 3/10 and why P(B|A) = P(B) holds only with replacement. So this kind of red ball example is not well suited to introducing the independence of events. (Such a long explanation is needed.)

You may have noticed the surprising fact that the probability of a red ball popped in the second trial (event B) is 3/10 in the sampling without replacement as well as in the sampling with replacement. (The probability of “event A” was also the same in both cases.) I found the following explanation in a lecture *note* while googling.

Sampling without replacement

Suppose that we sample from N objects without replacement, and that m of the objects are red. Let A be ‘first object is red’ and B be ‘second object is red’. Some people think that P(B) is obviously the same as P(A), while others are very suspicious about it. Here is the argument, using the Theorem of Total Probability (*)

P(A) = \dfrac {m}{N} \,\,\,\,\, P(A^C) = \dfrac {N-m}{N}

P(B|A) = \dfrac {m-1}{N-1} \,\,\,\,\, P(B|A^C) = \dfrac {m}{N-1}

So,

P(B)= P(A) \times P(B|A) + P(A^C) \times P(B|A^C)=\dfrac {m}{N} \times \dfrac {m-1}{N-1}+ \dfrac {N-m}{N} \times \dfrac {m}{N-1}

= \dfrac {m}{N(N-1)} [m-1+N-m]

= \dfrac {m(N-1)}{N(N-1)}= \dfrac {m}{N}

(*) P(A)=P(A \cap B)+P(A \cap B^C)
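As a sanity check on the algebra, the total probability formula can be evaluated with exact fractions for many values of N and m and compared with m/N. This is only a numerical check over a finite range, not a proof; `p_second_red` is a hypothetical helper name, and the range N = 2..30 is an arbitrary choice of mine.

```python
from fractions import Fraction

def p_second_red(n_objects, n_red):
    """P(B) for sampling without replacement, via total probability."""
    p_a = Fraction(n_red, n_objects)                  # P(A)
    p_not_a = 1 - p_a                                 # P(A^C)
    p_b_given_a = Fraction(n_red - 1, n_objects - 1)  # P(B|A)
    p_b_given_not_a = Fraction(n_red, n_objects - 1)  # P(B|A^C)
    return p_a * p_b_given_a + p_not_a * p_b_given_not_a

# Collect every (N, m) pair where the formula disagrees with m/N.
mismatches = [
    (n, m)
    for n in range(2, 31)
    for m in range(1, n + 1)
    if p_second_red(n, m) != Fraction(m, n)
]
print(mismatches)  # []
```

The empty list means P(B) = m/N for every pair tried, including the box in this post (N = 10, m = 3).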