
Applying law of total probability to conditional probability


I was solving problems based on Bayes' theorem from the book "A First Course in Probability" by Sheldon Ross. The problem reads as follows:

An insurance company believes that there are two types of people: accident prone and not accident prone. Company statistics show that an accident-prone person has an accident in any given year with probability $0.4$, whereas this probability is $0.2$ for a person who is not accident prone. If we assume that $30\%$ of the population is accident prone, what is the conditional probability that a new policyholder will have an accident in his or her second year of policy ownership, given that the policyholder has had an accident in the first year?

The solution given is as follows:

Book Solution
$$ \begin{align} P(A)=0.3 & & (given)\\ \therefore P(A^c)=1-P(A)=0.7 & & \\ P(A_1|A)=P(A_2|AA_1)=0.4 & &(given)\\ P(A_1|A^c)=P(A_2|A^cA_1)=0.2 & & (given) \end{align} $$ $$ P(A_1)=P(A_1|A)P(A)+P(A_1|A^c)P(A^c) =(.4)(.3)+(.2)(.7)=.26 \\ P(A|A_1)=\frac{(.4)(.3)}{.26}=\frac{6}{13} \\ P(A^c|A_1)=1-P(A|A_1)=\frac{7}{13} $$ $$ \begin{align} P(A_2|A_1)& =P(A_2|AA_1)P(A|A_1)+P(A_2|A^cA_1)P(A^c|A_1) &&...(I)\\ &=(.4)\frac{6}{13}+(.2)\frac{7}{13}\approx .29\\ \end{align} $$
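As a quick numeric check (my own sketch, not part of the book), the book's arithmetic can be reproduced in a few lines of Python:

```python
# A = policyholder is accident prone; A1, A2 = accident in year 1, year 2.
p_A = 0.3                       # P(A), given
p_Ac = 1 - p_A                  # P(A^c) = 0.7
p_acc_A = 0.4                   # P(A1|A) = P(A2|A A1), given
p_acc_Ac = 0.2                  # P(A1|A^c) = P(A2|A^c A1), given

# Law of total probability for the first year:
p_A1 = p_acc_A * p_A + p_acc_Ac * p_Ac           # 0.26

# Bayes' theorem:
p_A_given_A1 = p_acc_A * p_A / p_A1              # 6/13
p_Ac_given_A1 = 1 - p_A_given_A1                 # 7/13

# Statement (I): the conditional law of total probability
p_A2_given_A1 = p_acc_A * p_A_given_A1 + p_acc_Ac * p_Ac_given_A1
print(round(p_A2_given_A1, 4))  # 0.2923
```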

I don't understand statement $(I)$.

My Solution
Shouldn't it be like this: $$P(A_2|A_1)=P(A_2|AA_1)P(AA_1)+P(A_2|A^cA_1)P(A^cA_1)$$ Continuing further:
$$ \begin{align} P(A_2|A_1)&=P(A_2|AA_1)P(A_1|A)P(A)+P(A_2|A^cA_1)P(A_1|A^c)P(A^c)\\ &=(.4)(.4)(.3)+(.2)(.2)(.7)=0.076 \end{align} $$

Am I wrong? If yes, where did I go wrong?

Added Later

After going through the comments and thinking more, it seems that I am struggling to apply the law of total probability (and my solution above may very well be wrong). The basic form of the law of total probability that I have come across so far is: $$P(A)=P(A|\color{red}{B})P(\color{red}{B})+P(A|\color{magenta}{B^c})P(\color{magenta}{B^c})$$ This is the first time I am facing an application of this law to conditional probability, as done in the book solution: $$P(A_2|A_1)=P(A_2|AA_1)P(A|A_1)+P(A_2|A^cA_1)P(A^c|A_1)$$ since it involves three events ($A,A_1,A_2$). The book did not explain this. Although in the current problem it looks "somewhat" intuitive,

  1. can someone generalize it, so as to make my understanding more clear? Say for $n$ events?

  2. Also, in $P(A_2|A_1)=P(A_2|\color{red}{AA_1})P(\color{red}{A|A_1})+P(A_2|\color{magenta}{A^cA_1})P(\color{magenta}{A^c|A_1})$, I feel the red colored parts should be the same and the magenta colored parts should be the same, as in the simple form of the law of total probability.

  3. I felt it should be $P(A_2|\color{red}{(A_1|A)})P(\color{red}{A_1|A})+P(A_2|\color{magenta}{(A_1|A^c)})P(\color{magenta}{A_1|A^c})$. Am I absolutely stupid here?

  4. For a moment I felt it's related to: $P(E_1E_2E_3\ldots E_n)=P(E_1)P(E_2|E_1)P(E_3|E_1E_2)\ldots P(E_n|E_1\ldots E_{n-1})$. Is it so?

I now seriously doubt my ability to apply the law of total probability. Please enlighten me.


2 Answers

  1. can someone generalize it, so as to make my understanding more clear? Say for $n$ events?

If $(B_k)_n$ is a sequence of $n$ events that partition the sample space (or if at least $(B_k\cap A_1)_n$ partitions $A_1$) then, $\mathsf P(A_2\mid A_1) = \sum_{k=1}^n \mathsf P(A_2\mid A_1\cap B_k)\mathsf P(B_k\mid A_1)$
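To make this concrete, here is a small self-contained check (my own sketch, using a hypothetical three-type population rather than the two types in the problem): the partition formula $\sum_{k} \mathsf P(A_2\mid A_1\cap B_k)\,\mathsf P(B_k\mid A_1)$ agrees with computing $\mathsf P(A_2\mid A_1)$ directly from the joint distribution.

```python
# Hypothetical 3-type population; the two years are independent given type B_k.
p_B = [0.5, 0.3, 0.2]     # P(B_k): the B_k partition the sample space
p_acc = [0.1, 0.3, 0.6]   # P(accident in any one year | B_k)

# Joint probability P(B_k, A1 = a1, A2 = a2)
def joint(k, a1, a2):
    p1 = p_acc[k] if a1 else 1 - p_acc[k]
    p2 = p_acc[k] if a2 else 1 - p_acc[k]
    return p_B[k] * p1 * p2

p_A1 = sum(joint(k, True, a2) for k in range(3) for a2 in (True, False))
p_A1_A2 = sum(joint(k, True, True) for k in range(3))
direct = p_A1_A2 / p_A1                     # P(A2|A1) by definition

# Partition formula: sum_k P(A2 | A1, B_k) * P(B_k | A1).
# Here P(A2 | A1, B_k) = p_acc[k], because the years are independent given type.
p_Bk_given_A1 = [p_B[k] * p_acc[k] / p_A1 for k in range(3)]
formula = sum(p_acc[k] * p_Bk_given_A1[k] for k in range(3))

assert abs(direct - formula) < 1e-12
```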

  2. Also, in $P(A_2|A_1)=P(A_2|\color{red}{AA_1})P(\color{red}{A|A_1})+P(A_2|\color{magenta}{A^cA_1})P(\color{magenta}{A^c|A_1})$, I feel the red colored parts should be the same and the magenta colored parts should be the same, as in the simple form of the law of total probability.

They are not the same in the case of the simple form. So why should they be?

Where $\Omega$ is the entire sample space, then:

$$\begin{align}\mathsf P(A_2) &= \mathsf P(A_2\mid \Omega)\\ &=\mathsf P(A_2\mid \color{red}{A}, \Omega)\,\mathsf P(\color{red}{A}\mid \Omega)+\mathsf P(A_2\mid \color{magenta}{A^c}, \Omega)\,\mathsf P(\color{magenta}{A^c}\mid \Omega)\\ &=\mathsf P(A_2\mid \color{red}{A})\,\mathsf P(\color{red}{A})+\mathsf P(A_2\mid \color{magenta}{A^c})\,\mathsf P(\color{magenta}{A^c})\end{align}$$

  3. I felt it should be $P(A_2|\color{red}{(A_1|A)})P(\color{red}{A_1|A})+P(A_2|\color{magenta}{(A_1|A^c)})P(\color{magenta}{A_1|A^c})$. Am I absolutely stupid here?

:) Well, I would not say absolutely. But seriously, it is a rather common misunderstanding.

The conditioning bar is not a set operation. It separates the event from the condition that the probability is being measured over. There can be only one conditioning bar inside any probability expression; they do not nest.

  4. For a moment I felt it's related to: $P(E_1E_2E_3\ldots E_n)=P(E_1)P(E_2|E_1)P(E_3|E_1E_2)\ldots P(E_n|E_1\ldots E_{n-1})$. Is it so?

Yes, this is so.   Specifically $\mathsf P(A_2,A,A_1)=\mathsf P(A_2\mid A,A_1)\mathsf P(A\mid A_1)\mathsf P(A_1)\\ \mathsf P(A_2,A^\mathsf c,A_1)=\mathsf P(A_2\mid A^\mathsf c,A_1)\mathsf P(A^\mathsf c\mid A_1)\mathsf P(A_1)$

$$\begin{align}\mathsf P(A_2\mid A_1) ~ & = \mathsf P((A\cup A^\mathsf c){\cap} A_2\mid A_1) && \text{Union of Complements} \\[1ex] & = \mathsf P((A{\cap}A_2)\cup(A^\mathsf c{\cap}A_2)\mid A_1) && \text{Distributive Law} \\[1ex] & = \mathsf P(A{\cap}A_2\mid A_1) + \mathsf P(A^\mathsf c{\cap}A_2\mid A_1) && \text{Additive Rule for Union of Exclusive Events} \\[1ex] & = \dfrac{\mathsf P(A{\cap}A_1{\cap}A_2)+\mathsf P(A^\mathsf c{\cap}A_1{\cap}A_2)}{\mathsf P(A_1)} && \text{by Definition} \\[1ex] & = \dfrac{\mathsf P(A_2\mid A{\cap}A_1)\,\mathsf P(A{\cap}A_1)+\mathsf P(A_2\mid A^\mathsf c{\cap}A_1)\,\mathsf P(A^\mathsf c{\cap}A_1)}{\mathsf P(A_1)} && \text{by Definition} \\[1ex] & = {\mathsf P(A_2\mid A{\cap}A_1)\,\mathsf P(A\mid A_1)+\mathsf P(A_2\mid A^\mathsf c{\cap}A_1)\,\mathsf P(A^\mathsf c\mid A_1)} && \text{by Definition of Conditional Probability} \end{align}$$
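As an independent sanity check (my own sketch, not part of the derivation), a quick Monte Carlo simulation of the original insurance problem agrees with the derived value $\mathsf P(A_2\mid A_1)=\frac{3.8}{13}\approx 0.2923$:

```python
import random

random.seed(0)
trials = 500_000
n_a1 = n_a1_a2 = 0
for _ in range(trials):
    prone = random.random() < 0.3      # P(A) = 0.3
    p = 0.4 if prone else 0.2          # per-year accident probability, given type
    a1 = random.random() < p           # accident in year 1?
    a2 = random.random() < p           # accident in year 2?
    if a1:
        n_a1 += 1
        n_a1_a2 += a2

print(n_a1_a2 / n_a1)   # close to 3.8/13 ≈ 0.2923
```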


Another rationale for the answer is to take the cue from the given statements:

1) The probability that an accident-prone driver will have an accident in a given year is 0.4.

2) The probability that a non-accident-prone driver will have an accident in a given year is 0.2.

3) The probability that a person is accident prone is 0.3.

These are all given.

From these we can derive $P(\text{Having an accident} \cap \text{Accident Prone}) = .4\times .3$

$P(\text{Having an accident} \cap \text{Not Accident Prone}) = .2\times .7$

Now use Bayes' theorem to find $P(\text{Accident Prone} \mid \text{has had an accident}) = \dfrac{.4\times .3}{.4\times .3+.2\times .7}$

$P(\text{Not Accident Prone} \mid \text{has had an accident}) = \dfrac{.2\times .7}{.4\times .3+.2\times .7}$

Now the new person has had an accident in the first year (given).

We need to find the probabilities that he should be classified as accident prone or not accident prone, which is what you found in the last two steps.

Having found those, now use the total probability rule with the first two facts to find the probability that he will have an accident in the second year (which, given the person's type, does not depend on the first year).

$P(\text{Accident in the second year} \mid \text{Accident in the first year}) = P(\text{accident in a given year}\mid\text{Accident Prone})\cdot P(\text{Accident Prone}\mid\text{has had an accident}) + P(\text{accident in a given year}\mid\text{Not Accident Prone})\cdot P(\text{Not Accident Prone}\mid\text{has had an accident})$
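The chain of steps above can be carried out with exact fractions (a sketch of mine, using Python's `fractions` module) to avoid any rounding along the way:

```python
from fractions import Fraction

p_prone = Fraction(3, 10)        # P(Accident Prone)
p_acc_prone = Fraction(4, 10)    # P(accident in a given year | Accident Prone)
p_acc_not = Fraction(2, 10)      # P(accident in a given year | Not Accident Prone)

# Bayes' theorem after a first-year accident:
den = p_acc_prone * p_prone + p_acc_not * (1 - p_prone)   # 13/50
p_prone_given_a1 = p_acc_prone * p_prone / den            # 6/13
p_not_given_a1 = 1 - p_prone_given_a1                     # 7/13

# Total probability rule for the second year:
p_a2 = p_acc_prone * p_prone_given_a1 + p_acc_not * p_not_given_a1
print(p_a2)   # 19/65, i.e. about 0.2923
```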

