Exercises
Problem 1
You have an event you're interested in studying, \(A\). What are the lower and upper bounds for the probability that \(A\) occurs? I.e., what are the lower and upper bounds of \(P(A)\)?
Show solutions
By the definition of the probability of an event we know that it's bounded both from above and below. The upper bound corresponds to \(A\) being guaranteed, and the lower bound to it being impossible. We represent this on the interval \([0,1]\), i.e. we know that the probability of event \(A\) occurring is between \(0\) and \(1\). More compactly, \(P(A)\in[0,1]\).
Problem 2
One day you walk outside and overhear a person talking about the probability of an event. You hear little of what they have to say, but you do hear this person say “…this won’t be enough, we will have to redo this experiment many more times to be certain I am approaching the right probability.” Amazingly, some time later you hear another person mumbling to themselves about probability. “Wow, now this is a surprise. This result did not coincide with my previous beliefs at all. I will have to rethink the probability of such an event occurring”, they say.
You get the feeling that these two people have very different ways of interpreting probability. Identify who the frequentist and who the Bayesian statistician is.
Show solutions
Here, person 1 is using frequentist methods. Person 1’s focus is entirely on getting in as many independent trials as possible to get a clear picture of the probability of an event. Person 2, on the other hand, has a completely different view. Person 2 clearly had an initial belief about the probability of an event, but after seeing a surprising result they were determined to update their beliefs about the probability of this event.
Problem 3
Vegard is very interested in the quality of watermelons. He has decided that he wants to find out the probability of watermelons being overripe at REMA 1000. He feels he has two feasible tests he can do to find this probability, one is Bayesian in nature and one is frequentist. Vegard can either go in with an initial belief, buy a single watermelon, then update his beliefs. He can keep doing this until he is quite certain of the probability of the population. On the other hand, Vegard can buy 40 watermelons at once, check them all, and then argue for that sample being representative of the population.
Which method is Bayesian and which is frequentist?
Show solutions
The first option here is Bayesian. Vegard has an initial belief on the probability of the watermelons being overripe, and then he iterates and tries to make that more exact by checking one watermelon at a time.
The second option is a very typical frequentist test. From the whole population, a decent chunk is tested at once, in hope that it’s a representative sample. In the representative sample overripe watermelons should have the same relative frequency as in the entire population.
Problem 4
You have a sample space, \(S\), and in that sample space you have two identifiable events, \(A\) and \(B\). Can you determine a case where the probability of the intersection of \(A\) and \(B\) (\(P(A\cap B)\)) occurring, equals the probability of \(A\) (\(P(A)\)) occurring? Draw a Venn diagram!
Show solutions
Recall that the intersection of two events is when they both occur. I.e., for us to be in the intersection in a Venn diagram, both shapes have to cover the same area. There are infinitely many variations here, but the important part is that \(A\) must be contained within \(B\). When this is the case, all of \(A\) will be intersected by \(B\) (we say that \(A\) is a subset of \(B\), and the notation is \(A\subseteq B\)). In this case we get: \[P(A\cap B)=P(A)\]
Problem 5
You have a sample space, \(S\), and in that sample space you have two identifiable events, \(A\) and \(B\). Can you determine a case where the probability of the union of \(A\) and \(B\) (\(P(A\cup B)\)) occurring, equals the probability of \(B\) (\(P(B)\)) occurring? Draw a Venn diagram!
Show solutions
Recall that the union of two events is when either one or both events occur. This means that all area covered by either \(A\) or \(B\) will be part of the union. We now want the probability of the union occurring to equal the probability of \(B\) occurring. Since all area covered by either \(A\) or \(B\) is part of the union, we can have no excess area beyond \(B\), i.e. \(A\) should once again be a subset of \(B\) (\(A\subseteq B\)). If this is the case we get: \[P(A\cup B)=P(B).\]
Problem 6
Consider a situation where you have \(0<P(A)<P(B)<1\).
What are the lower and upper bounds of \(P(A\cap B)\)?
What are the lower and upper bounds of \(P(A\cup B)\)?
Show solutions
The intersection tells us that both \(A\) and \(B\) have to occur, and as such, the smaller of the two probabilities, \(P(A)\), gives us an upper bound, as an event can’t intersect with more than the cases where it itself occurs (draw a Venn diagram to confirm). We know nothing as to whether or not \(A\) and \(B\) are disjoint, and since that is still a possibility, we may not have an intersection at all; this makes \(0\) our lower bound. I.e. \[P(A\cap B)\in[0, P(A)]\]
The union tells us that either \(A\) or \(B\) or both occur. Since the union contains every outcome in either event, its lower bound is given by the larger of the two probabilities, \(P(B)\). We could also be in a situation where \(A\cup B\) covers the whole sample space \(S\), in which case we would have an upper bound of 1. I.e.: \[P(A\cup B)\in[P(B),1]\]
Problem 7
What is \(A\cap B\) when events \(A\) and \(B\) are disjoint?
Show solutions
When the two events \(A\) and \(B\) are disjoint, there is no possible outcome that would fit into the intersection \(A \cap B\). In set notation we would write: \[A\cap B=\emptyset\] where \(\emptyset\) is called the empty set and denotes a set with no elements.
Problem 8
You throw a fair six-sided die. Compute the probabilities of the following events.
You roll 1
You roll 3 or 4
You roll an odd number
You roll a number that’s greater than 5
You roll a number that’s less than or equal to 10
Show solutions
A fair six-sided die gives us the sample space \(S=\{1,2,3,4,5,6\}\), with each outcome having the same probability of being tossed. I.e., if \(A\) is an event that denotes any one number being tossed, then: \[ P(A)=\frac{1}{6}\]
- Rolling a 1 is a single element in the sample space, i.e. \(P(1)=\frac{1}{6}\)
- 3 and 4 constitute 2 possible disjoint outcomes, i.e. \[P(3\cup 4)=P(3)+P(4)=\frac{1}{6}+\frac{1}{6}=\frac{2}{6}=\frac{1}{3}\]
- There are 3 different odd-numbered outcomes, i.e.
\[P(odd)=3\cdot\frac{1}{6}=\frac{1}{2}\]
- There is only one possible outcome greater than 5, namely 6. I.e.
\[ P(X>5)=P(6)=\frac{1}{6}\]
- Our sample space tells us we can only roll integers from 1 to 6, all of which are less than 10.
\[ P(X\leq10)=P(1\cup2\cup3\cup4\cup5\cup6)=\sum_{i=1}^6\frac{1}{6}=1\]
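As a quick sanity check, all of these probabilities can be reproduced by enumerating the sample space. A minimal Python sketch (the `prob` helper and event definitions are ours, purely for illustration):

```python
from fractions import Fraction

S = range(1, 7)  # sample space of a fair six-sided die

def prob(event):
    """P(event) = favourable outcomes / total outcomes."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

print(prob(lambda s: s == 1))       # 1/6
print(prob(lambda s: s in (3, 4)))  # 1/3
print(prob(lambda s: s % 2 == 1))   # 1/2
print(prob(lambda s: s > 5))        # 1/6
print(prob(lambda s: s <= 10))      # 1
```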
Problem 9
You flip a coin 5 times as an experiment.
Let event \(A\) be getting all heads. What is \(P(A)\)?
Let event \(B\) be the complement of \(A\), i.e. \(B=A^c\). What is the probability of getting \(B\)?
What is \(P(A\cap B)\)?
Compute \(P(A \cup B)\), how do you interpret this result?
Show solutions
- We treat each coin flip as independent of all the others; we also let the probability of getting heads be \(P(H)=\frac{1}{2}\) for each coin toss. Since each toss is independent we can simply multiply them all together to find our final probability:
\[ P(A)=P(HHHHH)=\prod_{i=1}^5\frac{1}{2}=\left(\frac{1}{2}\right)^5=\frac{1}{2^5}=\frac{1}{32} \]
- \(B\) being the complement of \(A\) means that it will occur whenever \(A\) doesn’t. It’s also simple to compute as we can use a nifty formula.
\[\begin{align}P(A^c)=1-P(A) \\ \Rightarrow P(B)=P(A^c)=1-P(A)=1-\frac{1}{32}=\frac{31}{32} \end{align}\]
- Since \(A\) and \(B\) are complements, they are by definition disjoint, as at no point will both occur (no intersection!). As such, we get a clear implication that \(A\cap B=\emptyset\). Following from this, \(P(A\cap B)=0\).
- Finally we can compute the union. Recall that we have a formula for this.
\[P(A \cup B)=P(A)+P(B)-P(A\cap B) \]
We know all of the relevant probabilities needed for the computation.
\[ P(A \cup B)=\frac{1}{32}+\frac{31}{32}-0=1 \]
The probability of the union being 1 means that either \(A\), \(B\) or \(A\cap B\) will occur. We already know, however, that the events are disjoint, and as such there is no intersection. I.e., it’s either \(A\) or \(B\) that must occur. Interpreting this, we end up at the quite banal fact that when we do this experiment, we will *either* end up with 5 heads, or we will not (any other permutation). In fact this situation illustrates a powerful concept concerning complements of events: if an event does not occur, then its complement will occur.
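For the sceptical reader, enumerating all \(2^5=32\) flip sequences confirms these numbers. A minimal Python sketch (variable names are ours, purely for illustration):

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=5))               # all 2^5 = 32 sequences
A = [o for o in outcomes if all(c == "H" for c in o)]  # event A: all heads
B = [o for o in outcomes if o not in A]                # event B = A^c

P_A = Fraction(len(A), len(outcomes))
P_B = Fraction(len(B), len(outcomes))
print(P_A, P_B, P_A + P_B)  # 1/32 31/32 1
```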
Problem 10
Let \(S\) be a sample space. What is the probability of the complement of \(S\) (\(P(S^c)\))? What does this imply?
Show solutions
We know from the axioms that \(P(S)=1\), i.e. we can easily compute the complement.
\[ P(S^c)=1-P(S)=1-1=0 \]
This implies that the sample space covers all possible events, i.e. there is no chance of any event occurring that is not in the sample space.
Problem 11
You roll two six-sided dice, one blue and one red. Consider now, what would be your sample space if:
you considered the value on the red die and blue die separately?
you consider the sum of the two dice?
Show solutions
- For every toss, we here have to consider two separate values, and we denote them as \((x,y)\). Each of the two dice can roll from 1 to 6. Putting it all together we can construct the sample space
\[ S=\left\{ \begin{array}{cccccc} (1,1),& (1,2),& (1,3),& (1,4),& (1,5),& (1,6), \\ (2,1),& (2,2),& (2,3),& (2,4),& (2,5),& (2,6), \\ (3,1),& (3,2),& (3,3),& (3,4),& (3,5),& (3,6), \\ (4,1),& (4,2),& (4,3),& (4,4),& (4,5),& (4,6), \\ (5,1),& (5,2),& (5,3),& (5,4),& (5,5),& (5,6), \\ (6,1),& (6,2),& (6,3),& (6,4),& (6,5),& (6,6) \end{array} \right\} \]
- We now consider the sum of the two dice. The lowest value occurs when both dice roll 1, and the highest when both roll 6. Every integer value between these two sums is achievable. We end up with:
\[ S_{sum}=\{2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12\} \]
Problem 12
You throw two dice.
In how many outcomes does their sum equal 6?
What is the probability of getting a sum total of 2?
Show solutions
- If die 1 rolls a 6, then the other die will make the sum at least 7, i.e. no candidates. If die 1 rolls a 5, then we get a sum of 6 if die 2 rolls a 1, i.e. 1 candidate. Following similar logic, if die 1 rolls 1-4, each roll produces 1 more candidate.
In total we end up with 5 outcomes that give us a sum of 6.
- For us to have a sum of 2, we need both dice to roll 1, i.e. there is only one outcome that nets this event. There are 36 possible outcomes in total, i.e. \(P(sum \ 2)=\frac{1}{36}\)
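If you would rather not count by hand, a short enumeration over the 36 outcomes reproduces both answers. A Python sketch:

```python
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # the 36 outcomes (die 1, die 2)

sum_six = [r for r in rolls if sum(r) == 6]
print(len(sum_six))  # 5 outcomes with sum 6

p_sum_two = sum(1 for r in rolls if sum(r) == 2) / len(rolls)
print(p_sum_two)     # 1/36 ≈ 0.0278
```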
Problem 13
Let \(A=\{1,2,3,4,6\}\), \(B=\{2,4,6,8\}\), and \(C=\{2,3,4,5\}\).
Find \(A\cap B\cap C\)
Find \(A\cup B\cup C\)
Show solutions
- Recall first that
\[ A\cap (B\cap C)=A\cap B\cap C \]
So we can find the intersection simply by looking for the elements that appear in all of the sets. We end up with.
\[ A\cap B\cap C = \{2,4\} \]
- Recall first that
\[ A\cup (B\cup C)=A\cup B\cup C \] The union will now consist of any element that falls within any of the sets, and as such we get
\[A\cup B\cup C =\{1,2,3,4,5,6,8\} \]
Problem 14
You know that the probabilities of some intersections are given by \(P(A\cap B)=\frac{1}{2}\) and \(P(A\cap B^c)=\frac{1}{3}\). Compute \(P(A)\).
Show solutions
Two things are very clear when working with complements: they are disjoint events, and the probabilities of one or the other occurring sum to exactly 1. Directly following from this is that we can always make use of the law of total probability when we know the probabilities of the intersections between an event, \(A\), and both an event and its complement. In this case we have exactly the right information to make use of the law of total probability. Recall the formula. \[ P(A)=\sum_{i=1}^kP(A\cap B_i)=P(A\cap B_1)+\cdots+P(A\cap B_k) \] Let’s use this to compute the probability.
\[ P(A)=P(A\cap B)+P(A\cap B^c)=\frac{1}{2}+\frac{1}{3}=\frac{3+2}{6}=\frac{5}{6} \]
Problem 15
We have an event, \(A\), and we are interested in figuring out the probability of it occurring. We know the probabilities of \(B_1,...,B_n\) and we know that \(\sum_{i=1}^nP(B_i)=1\). Why can’t we use the law of total probability here? What could go wrong? (It might help to draw a Venn diagram if you’re struggling.)
Show solutions
The problem here is that we don’t know if \(B_1,...,B_n\) are disjoint events, and since there may be areas where they intersect with each other, we run the risk of counting some space several times if we were to sum all intersections between \(A\) and all \(B_i\). We also can’t be certain that the entire sample space is covered by the \(B_i\) if they are not all disjoint (see problem 16 for implications). We can illustrate algebraically (to simplify we reduce some generality, but this argument does extend). Let \[ n=2, \ P(B_1\cap B_2)>0 \] We can now let \(A\cap (B_1\cap B_2)\neq\emptyset\), i.e. \(A\) also occurs in the intersection. We can now compute \(P(A)\):
\[\begin{align*} P(A)&=P(A\cap B_1)+P(A \cap B_2)\\ &=P(A\cap((B_1\cap B_2)\cup(B_1\cap B_2^c)))+P(A\cap((B_1\cap B_2)\cup(B_1^c\cap B_2))) \\ &=P(A\cap(B_1\cap B_2)) + P(A\cap(B_1\cap B_2^c))+ P(A\cap(B_1\cap B_2)) + P(A\cap(B_1^c\cap B_2)) \\ &=2P(A\cap(B_1\cap B_2))+ P(A\cap(B_1\cap B_2^c))+ P(A\cap(B_1^c\cap B_2)) \end{align*}\]
As we can see here, the area where \(A\) intersects \(B_1\cap B_2\) is counted twice, and as such, this effect will have us overestimate the probability of \(A\) occurring. The equation above is therefore necessarily incorrect.
Problem 16
We have an event, \(A\), and we are interested in figuring out the probability of it occurring. We know the probabilities of \(B_1,...,B_n\) and we know that all \(B_i\) are pairwise disjoint (\(B_i\cap B_j=\emptyset\) for all \(i\neq j\)). Why can’t we use the law of total probability here? What could go wrong? (It might help to draw a Venn diagram if you’re struggling.)
Show solutions
The problem here is that the sum of all probabilities \(P(B_i)\) may not equal 1. If the sum of the probabilities \(P(B_i)\) is less than 1, then there are cases where no event \(B_i\) occurs. If \(A\) intersects this region, the law of total probability will lead us to underestimating \(P(A)\), as there is no \(B_i\) that intersects this part of \(A\). Let’s try to show this algebraically (we reduce generality here too, but the argument holds). \[ P((B_1\cup...\cup B_n)^c)>0, \ (B_1\cup...\cup B_n)^c\subset A, \ n=2 \] If we try to compute using the law of total probability we then get
\[ P(A)=P(A\cap B_1)+P(A\cap B_2) \ (*)\]
Note that \((E_1\cup E_2)^c=E_1^c\cap E_2^c\)
From this we know that \(B_1^c \cap B_2^c\subset A\).
Because they are disjoint we also know that \(B_1\subset B_2^c\) and \(B_2\subset B_1^c\), but \(B_1\) and \(B_2\) are both disjoint from \(B_1^c \cap B_2^c\), as any event is disjoint with its own complement.
This tells us that we don’t account for \(B_1^c \cap B_2^c\subset A\) in \((*)\), and thus we must be underestimating the probability, and the equation cannot hold.
Problem 17
We are considering the probability of an event, \(A\). We know that \(P(A\cap B)=0.2\) and that \(P(A\cap B^c)=0.05\). Compute \(P(A)\).
Show solutions
By definition, the events \(B\) and \(B^c\) are disjoint and \(P(B)+P(B^c)=1\), and as such, we can make use of the law of total probability. \[ P(A)=\sum_{i=1}^nP(A\cap B_i)=P(A\cap B)+P(A\cap B^c)=0.2+0.05=0.25=\frac{1}{4} \]
Problem 18
We get that \(P(A)=0.7\), \(P(B_1)=0.3\), \(P(B_2)=0.2\), \(P(B_3)=0.5\). All \(B_i\) are disjoint events. We also get that \(P(A | B_1)=0.3\), \(P(A | B_2)=0.9\). Find \(P(A | B_3)\) using the law of total probability.
Show solutions
We need to rewrite the law of total probability to compute the conditional probability we want. Let’s use the definition of conditional probability for this. \[ P(A|B)=\frac{P(A\cap B)}{P(B)} \Rightarrow P(A\cap B)=P(A | B)P(B) \ (*) \]
Recall the law of total probability. We can rewrite it by substituting in \((*)\), and use this altered equation to solve for \(P(A|B_3)\).
\[ \begin{align*} P(A)&=\sum_{i=1}^nP(A\cap B_i)=\sum^n_{i=1}P(A|B_i)P(B_i) \\ \Rightarrow P(A)&=P(A|B_1)P(B_1)+P(A|B_2)P(B_2)+P(A|B_3)P(B_3) \\ \Rightarrow P(A|B_3)&=\frac{P(A)-P(A|B_1)P(B_1)-P(A|B_2)P(B_2)}{P(B_3)} \end{align*}\]
Now we can easily compute \(P(A|B_3)\)
\[ P(A|B_3)=\frac{0.7-0.3*0.3-0.2*0.9}{0.5}=\frac{0.7-0.09-0.18}{0.5}=0.86 \]
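As a quick numeric check of this rearrangement, a Python sketch (variable names are ours):

```python
# Solve P(A) = sum_i P(A|B_i) P(B_i) for the unknown P(A|B3).
P_A = 0.7
P_B = [0.3, 0.2, 0.5]   # P(B1), P(B2), P(B3)
P_A_given = [0.3, 0.9]  # P(A|B1), P(A|B2)

known = sum(pa * pb for pa, pb in zip(P_A_given, P_B))  # pairs with P(B1), P(B2)
print((P_A - known) / P_B[2])  # 0.86
```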
Problem 19
Consider two events, \(A\) and \(B\). You get that \(P(A)=0.1\), \(P(B)=0.2\) and \(P(A\cap B)=0.05\). Compute \(P(A\cup B)\). Argue for why \(P(A\cup B)=P(A)+P(B)-P(A\cap B)\) makes sense.
Show solution
We already know the formula used for the computation.
\[ P(A\cup B)=0.1+0.2-0.05=0.25 \]
To argue for why the formula makes sense, let’s first divide the union of \(A\) and \(B\) into disjoint parts. We know that the union includes any element within either \(A\), \(B\) or both. We will have one part which is \(A\) and not \(B\), and one which is \(B\) and not \(A\); these can be represented as intersections in and of themselves, i.e. \(A\cap B^c\) and \(A^c \cap B\). The final part of the union is when both \(A\) and \(B\) occur, meaning our intersection \(A\cap B\). With this we can find a formula for the union \[ P(A\cup B)=P(A\cap B^c)+P(A^c\cap B)+P(A\cap B) \quad (*)\] Now, let’s consider \(P(A)\) and \(P(B)\). Any set \(A\) can be constructed as the union of its intersection with another set \(B\) and its intersection with that set’s complement \(B^c\) on the sample space \(S\). This is very technical, but the idea is that every element (possible outcome) within \(A\) will either also be in \(B\), or not, hence in \(B^c\). This idea lets us rewrite \(P(A)+P(B)\), and we will represent them as sums similar to what we have in \((*)\).
\[\begin{align*} P(A)+P(B)&=P(A\cap B)+P(A\cap B^c)+P(A\cap B)+P(A^c\cap B)\\ &=2P(A\cap B)+P(A\cap B^c)+P(A^c\cap B) \quad (**)\end{align*}\]
It’s plain to see that \((**)-P(A\cap B)=(*)\), and therefore
\[P(A\cup B)=P(A)+P(B)-P(A\cap B) \]
Problem 20
You get that \(P(B)=0.5\), \(P(A\cap B)=0.4\). Compute \(P(A|B)\).
Show solutions
We compute the conditional probability using the definition.
\[ P(A|B)=\frac{P(A\cap B)}{P(B)}=\frac{0.4}{0.5}=0.8 \]
Problem 21
You get that \(P(B)=0.2\), \(P(A | B)=0.9\). Compute \(P(A\cap B)\).
Show solution
We make use of the definition here to compute the probability.
\[ \begin{align*} P(A|B)&=\frac{P(A\cap B)}{P(B)} \Rightarrow P(A\cap B)=P(A|B)P(B) \\ P(A\cap B)&=0.9*0.2=0.18 \end{align*} \]
Problem 22
You find that \(P(A)=0.6, \ P(B)=0.3\) and \(P(A|B)+P(B|A)=0.9\). Find \(P(A\cap B)\).
Show solutions
Let’s try to rewrite \(P(A|B)+P(B|A)=0.9\) and solve for \(P(A\cap B)\).
\[\begin{align*} P(A|B)+P(B|A)=0.9 \Leftrightarrow & \frac{P(A\cap B)}{P(B)}+\frac{P(A\cap B)}{P(A)}=\frac{9}{10} \\ \Leftrightarrow &\frac{10}{3}P(A\cap B)+\frac{10}{6}P(A\cap B)=\frac{9}{10} \\ \Leftrightarrow &\frac{30}{6}P(A\cap B)=\frac{9}{10} \Leftrightarrow 5P(A\cap B)=\frac{9}{10} \\ \therefore P(A\cap B)&=\frac{9}{50}=0.18 \end{align*}\]
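If you prefer to let a computer do the rearranging, the same equation can be handed to a symbolic solver. A sketch assuming `sympy` is available; the symbol `x` stands for \(P(A\cap B)\):

```python
from sympy import Eq, Rational, solve, symbols

x = symbols("x")  # x = P(A ∩ B)

# P(A|B) + P(B|A) = x/P(B) + x/P(A) = 9/10
eq = Eq(x / Rational(3, 10) + x / Rational(6, 10), Rational(9, 10))
print(solve(eq, x))  # [9/50], i.e. 0.18
```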
Problem 23
You get that \(P((A\cup B)^c)=\frac{1}{2}\) and that \(P(A\cap B)=\frac{3}{20}\). \(C\) is the event that either \(A\) or \(B\) occurs, but not both. What is \(P(C)\)?
Show solution
First we should find a way to express \(P(C)\). The probability of \(C\) occurring is obviously related to the probabilities of \(A\) and \(B\) occurring, but we have to remove any cases where both of them occur, i.e. the intersection. Recall from problem 19 that we can construct an event \(A\) from all the outcomes shared with \(B\) (\(P(A\cap B)\)) and all the outcomes which aren’t shared, or shared with not \(B\) (the complement, \(P(A\cap B^c)\)), in the same sample space. I.e.
\[ P(A)=P(A\cap B)+P(A\cap B^c) \]
The same can be done for \(B\). Now, to construct \(P(C)\) we can remove the respective \(P(A\cap B)\) and sum the remaining probabilities.
\[P(C)=P(A\cap B^c)+P(A^c\cap B)=P(A\cup B)-P(A\cap B)\] Now we can finally start relating this to the probabilities given in the problem. We need to recall the complement rule before we can compute, however.
\[ P(A)=1-P(A^c) \Rightarrow P(A\cup B)=1-P((A\cup B)^c)\]
We can substitute this into the expression for \(P(C)\)
\[ P(C)=1-P((A\cup B)^c)-P(A\cap B)=1-\frac{1}{2}-\frac{3}{20}=\frac{7}{20}=0.35 \]
Problem 24
Show that for any event \(A\) that is independent of another event \(B\),
\[P(A|B)=P(A)\]
Show solution
Consider first the definition of independence.
\[ P(A\cap B)=P(A)P(B) \ (*)\]
Now let’s also consider the definition of conditional probability.
\[ P(A|B)=\frac{P(A\cap B)}{P(B)} \]
Finally let’s substitute in the definition given in \((*)\) and reduce the fraction.
\[ P(A|B)=\frac{P(A)P(B)}{P(B)}=P(A) \quad q.e.d.\]
Problem 25
You find that \(P(A)=\frac{3}{10}\), \(P(B)=\frac{1}{2}\) and \(P(A\cap B)=\frac{3}{10}\)
Are events \(A\) and \(B\) independent?
Show solution
We check if the probabilities fulfill the conditions given by the definition.
\[ P(A)P(B)=\frac{3}{10}\frac{1}{2}=\frac{3}{20}\neq\frac{3}{10}=P(A\cap B) \] I.e. A and B are not independent.
Problem 26
You find that \(P(A)=\frac{5}{8}\), \(P(B)=\frac{1}{5}\) and \(P(A\cap B)=\frac{1}{8}\)
Are events \(A\) and \(B\) independent?
Show solution
We check if the probabilities fulfill the conditions given by the definition.
\[ P(A)P(B)=\frac{5}{8}\frac{1}{5}=\frac{1}{8}=P(A\cap B) \] I.e. A and B are independent.
Problem 27
You find that \(P(A)=0.8\), \(P(B)=0.2\) and \(P(A\cap B)=0.125\)
Are events \(A\) and \(B\) independent?
Show solution
We check if the probabilities fulfill the conditions given by the definition.
\[ P(A)P(B)=0.8*0.2=0.16\neq 0.125=P(A\cap B) \] I.e. A and B are not independent.
Problem 28
You find that \(P(A)=\frac{3}{4}\) and \(P(A\cap B)=\frac{1}{10}\)
What must \(P(B)\) be for the events \(A\) and \(B\) to be independent?
Show solution
Our criterion follows from the definition.
\[P(A\cap B)=P(A)P(B) \Rightarrow P(B)=\frac{P(A\cap B)}{P(A)} \] Now we can compute the probability.
\[ P(B)=\frac{\frac{1}{10}}{\frac{3}{4}}=\frac{1}{10}\frac{4}{3}=\frac{4}{30}=\frac{2}{15} \]
So \(P(B)\) should be \(\frac{2}{15}\approx0.133\)
Problem 29
You have \(P(A|B)=0.3\), \(P(A)=0.4\), \(P(B)=0.1\). Compute the probability \(P(B|A)\).
Show solution
Here we can simply compute using Bayes’ rule!
\[ P(B|A)=\frac{P(A|B)P(B)}{P(A)} \] Let’s compute:
\[ P(B|A)=\frac{0.3*0.1}{0.4}=\frac{0.03}{0.4}=2.5*0.03=0.075\]
So we conclude the probability of \(B\) given \(A\) is 7.5%.
Problem 30
A certain disease affects 1% of a population. A test for the disease is 90% accurate for those who have the disease (i.e., it correctly identifies 90% of sick people) but also has a 5% false positive rate (i.e., it incorrectly identifies 5% of healthy people as having the disease).
If a randomly chosen person tests positive, what is the probability that they actually have the disease?
Following this, what would be the probability that someone testing positive is not sick?
Would you consider this to be a good test?
Show solution
Let \(S\) be the event that a random person is suffering from the unnamed disease. Let \(T\) be the event that a random person gets a positive test. With this in mind, let’s try to identify what is given in the text.
\[ \begin{align*} P(S)&=0.01 \Rightarrow P(S^c)=0.99 \\ P(T|S)&=0.90 \\ P(T|S^c)&=0.05 \end{align*}\]
We are now looking for the probability that a person who tests positive is sick, i.e. \(P(S|T)\).
\[\begin{align*} P(S|T)&=\frac{P(S\cap T)}{P(T)}=\frac{P(T|S)P(S)}{P(T)} \end{align*}\]
We already know the values of \(P(T|S)\) and \(P(S)\). It remains to find \(P(T)\).
We can attempt to use the law of total probability to find \(P(T)\).
\[\begin{align*} P(T)&=P(T|S)P(S)+P(T|S^c)P(S^c)\\ &=0.90*0.01+0.05*0.99\\ &=0.009+0.0495=0.0585\end{align*} \]
Finally, we can compute \(P(S|T)\)
\[P(S|T)=\frac{0.90*0.01}{0.0585}\approx0.154\]
Let’s now find the complement, i.e. \(P(S^c|T)=1-P(S|T)\).
\[P(S^c|T)\approx1-0.154=0.846\]
This should in no way be considered a good test. When only about 15% of positive tests are true positives, that’s a bad sign. The false positives, which constitute about 85% of the positive tests, are what we would call Type I errors. There is no one answer to what a good level of Type I errors is when doing a test, but the most common benchmark is 5% (commonly denoted as \(\alpha=0.05\), and usually called the significance level of a test).
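The whole calculation fits in a few lines if you want to experiment with other prevalences or error rates. A minimal Python sketch (variable names are ours):

```python
# Bayes' rule for the disease-testing example above.
P_S = 0.01           # prevalence: P(S)
P_T_given_S = 0.90   # sensitivity: P(T|S)
P_T_given_Sc = 0.05  # false positive rate: P(T|S^c)

# Law of total probability for P(T), then Bayes' rule for P(S|T).
P_T = P_T_given_S * P_S + P_T_given_Sc * (1 - P_S)
P_S_given_T = P_T_given_S * P_S / P_T

print(P_T)              # 0.0585
print(P_S_given_T)      # ≈ 0.154
print(1 - P_S_given_T)  # ≈ 0.846
```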
Problem 31
Let \(A\) and \(B\) be disjoint events. What is \(P(A|B)\)? Feel free to draw a Venn diagram to visualise this concept.
Show solution
When \(A\) and \(B\) are disjoint, we get that
\[A\cap B=\emptyset \Rightarrow P(A\cap B)=0\]
Let’s plug this into the definition of a conditional probability.
\[ P(A|B)=\frac{P(A\cap B)}{P(B)}=\frac{0}{P(B)}=0 \]
Problem 32
In a random draw you can end up with the respective values {3,7,9,11}, each with a respective probability of 0.25. Is this a probability distribution? Why or why not?
Show solution
This is indeed a probability distribution: each discrete possible value is assigned a positive probability of occurring, and these probabilities add up to 1, which are exactly the requirements for a discrete probability distribution.
Problem 33
Consider now a fair die with six faces.
Do the results of a throw constitute a probability distribution?
Find the expected value of the die toss
Find the variance of the die toss
Show solutions
- This is indeed a probability distribution: each discrete possible value is assigned a positive probability of occurring, and these probabilities add up to 1, which are exactly the requirements for a discrete probability distribution.
- To find the expected value of a randomly distributed variable we use the formula given by
\[ E[X]=\sum_{i=1}^nx_ip(x_i) \]
Here, \(X\) denotes the random variable that has the distribution, \(x_i\) denotes the ith value \(X\) can take, and \(p(x_i)\) denotes the probability that \(X\) will take the value \(x_i\). Note that the sum of all \(p(x_i)\) should always add up to 1 exactly.
For a fair die with 6 faces we get possible values {1,2,3,4,5,6} and respective probabilities \(p=\frac{1}{6}\) for each possible outcome. With this we can easily find the expected value.
\[ E[X]=\sum_{i=1}^6x_ip(x_i)=\sum_{i=1}^6x_i\frac{1}{6}=\frac{1}{6}(1+2+3+4+5+6)=\frac{21}{6}=3.5 \]
- We have a general formula we can use to determine the variance in a distribution.
\[Var[X]=E[X^2]-(E[X])^2\]
From b) we already know \(E[X]\), which can simply be squared to find the second term in the equation; however, the first term, \(E[X^2]\), still remains to be found. This term is found by squaring the values our random variable can take, i.e.
\[ E[X^2]=\sum_{i=1}^nx_i^2p(x_i) \] Let’s now compute this
\[ E[X^2]=\sum_{i=1}^6x_i^2p(x_i)=\sum_{i=1}^6x_i^2\frac{1}{6}=\frac{1}{6}(1+4+9+16+25+36)=\frac{91}{6}\approx15.167 \]
Now remember to finish the equation.
\[Var[X]=\frac{91}{6}-(3.5)^2\approx2.9167 \]
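The same expectation and variance can be verified exactly using fractions. A minimal Python sketch:

```python
from fractions import Fraction

values = range(1, 7)  # faces of the die
p = Fraction(1, 6)    # probability of each face

E_X = sum(x * p for x in values)      # 7/2 = 3.5
E_X2 = sum(x**2 * p for x in values)  # 91/6
var = E_X2 - E_X**2                   # 35/12 ≈ 2.9167
print(E_X, E_X2, var)
```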
Problem 34
A random variable, \(X\) has a binomial distribution if the three following requirements are fulfilled.
There are one or more draws, each of which ends in one of two states: success or failure.
For each draw, there is the same probability for success.
Each draw is mutually independent from all other draws.
We denote the total number of draws as \(n\), and the probability for success on each draw as \(p\). For a binomial distribution we have special formulas for the expected value and variance.
\[ E[X]=np, \ Var[X]=np(1-p) \]
For a Binomial distribution, the probability distribution is given by
\[ P(X=x)=\binom{n}{x}p^x(1-p)^{n-x} \] where \(x\) denotes the number of successes drawn.
A lightbulb manufacturing company claims that 95% of their lightbulbs pass quality control. A quality inspector randomly selects 20 lightbulbs from a day’s production line for testing. Let the amount of lightbulbs that pass quality control be given by the random variable, \(X\), and let \(X\) have a binomial distribution with \(n=20\) and \(p=0.95\).
What is the probability that exactly 18 lightbulbs pass the quality check?
What is the probability that at least 19 lightbulbs pass the quality check?
What is the expected number of lightbulbs that pass the quality check?
What is the variance and standard deviation (\(\text{SD}(X)\)) of the number of passing lightbulbs?
Show solutions
To find the probability that exactly 18 bulbs pass the check we can simply plug our given information into the probability function. \[ \begin{align*} P(X=18)&=\binom{20}{18}0.95^{18}(1-0.95)^{20-18}\\&=\frac{20!}{2!18!}0.95^{18}0.05^2\\&=\frac{20*19}{2}0.95^{18}0.05^2\\&= 190*0.95^{18}0.05^2\approx0.189 \end{align*}\]
To find the probability \(P(X\geq19)\), note that there are only two cases where this occurs, \(X=19\) and \(X=20\), as we only check 20 bulbs, so this is a simple task. We need only compute these two separate probabilities and sum them together.
We again make use of the probability function.
\[ P(X=19)=\binom{20}{19}0.95^{19}(1-0.95)^{20-19}=20*0.95^{19}*0.05\approx0.377 \]
\[ P(X=20)=\binom{20}{20}0.95^{20}0.05^{0}=0.95^{20}\approx0.358 \]
Now we need only sum these probabilities.
\[ P(X\geq19)=P(X=19)+P(X=20)\approx0.377+0.358=0.735 \]
- The expected number of lightbulbs that pass the quality check is given by the distinct formula
\[ E[X]=np=20*0.95=19 \]
- The variance is given by the formula
\[ Var[X]=np(1-p)=20*0.95*0.05=0.95 \]
The standard deviation is always given as the positive square root of the variance.
\[\text{SD}(X)=\sqrt{Var[X]}=\sqrt{0.95}\approx0.975 \]
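All four answers can be checked against `scipy.stats.binom`, assuming `scipy` is available. A sketch:

```python
from scipy.stats import binom

n, p = 20, 0.95
print(binom.pmf(18, n, p))  # P(X = 18) ≈ 0.189
print(binom.sf(18, n, p))   # P(X > 18) = P(X >= 19) ≈ 0.736 (0.377 + 0.358, up to rounding)
print(binom.mean(n, p))     # E[X] = 19.0
print(binom.var(n, p))      # Var[X] = 0.95
print(binom.std(n, p))      # SD(X) ≈ 0.975
```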
Problem 35
A random variable, \(X\), has a Poisson distribution if the following conditions are fulfilled:
Events occur randomly and independently over time or space.
The events occur at a known average rate \(\lambda\).
Two events cannot occur at exactly the same time.
We denote the average rate of occurrence by \(\lambda\).
For a Poisson distribution we have the following formulas for the expected value and variance:
\[ E[X] = \lambda, \quad Var[X] = \lambda \]
For a Poisson distribution, the probability distribution is given by:
\[ P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!} \]
A call center receives an average of 3 customer calls per hour. Let the number of calls received in a given hour be modeled by the random variable \(X\).
What is the probability that exactly 2 calls are received in an hour?
What is the probability that less than 2 calls are received in an hour?
What is the expected number of calls received per hour?
What is the variance and standard deviation (\(\text{SD}(X)\)) of the number of calls per hour?
Show solutions
In this case we can simply compute the probability with the help of our distribution function. \[ P(X = 2) = \frac{e^{-3} \cdot 3^2}{2!} = \frac{9e^{-3}}{2} \approx 0.224 \]
Here we need to realise that less than 2 calls is the same as 0 or 1 calls. As such, we would like to compute both the probability that no calls are received in an hour and the probability that exactly one call is received. Then finally we can sum these two probabilities.
\[ P(X=0)=\frac{e^{-3}\cdot3^0}{0!}=e^{-3}\approx 0.050 \] \[ P(X=1)=\frac{e^{-3}\cdot3^1}{1!}=3e^{-3}\approx0.149 \]
\[ P(X < 2) = P(X=0)+P(X=1)\approx0.050+0.149=0.199 \]
For a Poisson distribution the expected value is simply given by the parameter \(\lambda\) which here represents average callers per hour. \[ E[X] = \lambda = 3 \]
For a Poisson distribution in particular the variance and the expected value is the same \[ Var[X] = \lambda = 3, \quad \text{SD}(X) = \sqrt{3} \approx 1.732 \]
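These values can be checked against `scipy.stats.poisson`, assuming `scipy` is available:

```python
from scipy.stats import poisson

lam = 3                     # average calls per hour
print(poisson.pmf(2, lam))  # P(X = 2) ≈ 0.224
print(poisson.cdf(1, lam))  # P(X < 2) = P(X <= 1) ≈ 0.199
print(poisson.mean(lam))    # E[X] = 3.0
print(poisson.var(lam))     # Var[X] = 3.0
print(poisson.std(lam))     # SD(X) ≈ 1.732
```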
Problem 36
A random variable, \(X\), has a geometric distribution if the three following requirements are fulfilled:
Each trial results in either success or failure.
The probability of success is the same for every trial.
Each trial is mutually independent from all other trials.
The geometric distribution models the number of trials up to and including the first success.
We denote the probability of success as \(p\). The geometric distribution is thus closely related to the binomial distribution: whereas with the binomial we have decided upon a certain number of trials and then check how many successes we’ve found, with a geometric we keep doing trials until we find a success, and then stop.
For a geometric distribution we have special formulas for the expected value and variance:
\[ E[X] = \frac{1}{p}, \quad Var[X] = \frac{1 - p}{p^2} \]
The probability distribution is given by:
\[ P(X = x) = (1 - p)^{x - 1}p \]
A machine produces small parts for watches. Each part has a 2% chance of being defective. A quality inspector examines parts one by one until the first defective part is found.
Let the number of parts examined be given by the random variable \(X\), and let \(X\) have a geometric distribution with \(p = 0.02\).
What is the probability that the first defective part is the 10th one tested?
What is the probability that more than 20 parts are tested before finding a defective one? (Hint: \(P(X\leq x)=1-(1-p)^x\); this is pretty simple to show, but could be unintuitive.)
What is the expected number of parts tested before finding a defective one?
What is the variance and standard deviation (\(\text{SD}(X)\)) of the number of parts tested?
Show solutions
This is a simple computation with the probability function. \[ P(X = 10) = (1 - 0.02)^9 \cdot 0.02 = 0.98^9 \cdot 0.02 \approx 0.0167 \]
Let’s be a bit clever when computing this. By the rule of complements, we have that
\[ P(X>x)=1-P(X\leq x) \]
Let’s now substitute in the hint we got.
\[ P(X>x)=1-(1-(1-p)^x)=1-1+(1-p)^x=(1-p)^x \] Finally we can substitute in for \(p\) and \(x\).
\[ P(X > 20) = (1 - 0.02)^{20} = 0.98^{20} \approx 0.668 \]
Let’s compute the expected value with the formula! \[ E[X] = \frac{1}{0.02} = 50 \]
Let’s now do the same for the variance and standard deviation! \[ Var[X] = \frac{1 - 0.02}{0.02^2} = \frac{0.98}{0.0004} = 2450 \]
\[ \text{SD}(X) = \sqrt{2450} \approx 49.5 \]
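For a numeric check, `scipy.stats.geom` uses the same convention as this problem (number of trials up to and including the first success), assuming `scipy` is available:

```python
from scipy.stats import geom

p = 0.02
print(geom.pmf(10, p))  # P(X = 10) = 0.98^9 * 0.02 ≈ 0.0167
print(geom.sf(20, p))   # P(X > 20) = 0.98^20 ≈ 0.668
print(geom.mean(p))     # E[X] = 50.0
print(geom.var(p))      # Var[X] = 2450.0
print(geom.std(p))      # SD(X) ≈ 49.5
```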
Problem 37
A random variable, \(X\), has a negative binomial distribution if the three following requirements are fulfilled:
Each trial results in either success or failure.
The probability of success is the same for each trial.
Each trial is mutually independent from all other trials.
The negative binomial distribution models the number of trials needed to get exactly \(r\) successes.
We denote the number of successes as \(r\) and the probability of success on each trial as \(p\). So it’s in a way the inverse of the binomial distribution.
For a negative binomial distribution we have special formulas for the expected value and variance:
\[ E[X] = \frac{r}{p}, \quad Var[X] = \frac{r(1 - p)}{p^2} \]
The probability distribution is given by:
\[ P(X = x) = \binom{x - 1}{r - 1} p^r (1 - p)^{x - r} \]
A salesperson has a 30% chance of closing a sale on each call. They continue calling until they have made 3 sales.
Let the number of calls made be given by the random variable \(X\), and let \(X\) have a negative binomial distribution with \(r = 3\) and \(p = 0.3\).
What is the probability that the third sale occurs on the 5th call?
What is the probability that more than 7 calls are needed?
What is the expected number of calls made before making 3 sales?
What is the variance and standard deviation (\(\text{SD}(X)\)) of the number of calls made?
Show solutions
Here we can simply compute using the distribution function. \[ P(X = 5) = \binom{4}{2} \cdot (0.3)^3 \cdot (0.7)^2 = 6 \cdot 0.027 \cdot 0.49 \approx 0.079 \]
Note here that since we need 3 sales, there is zero chance that the salesperson is done before they have made three calls. Thus we can write \[P(X\leq7)=\sum^7_{x=1}P(X=x)=0+0+\sum_{x=3}^7P(X=x)=\sum_{x=3}^7P(X=x) \] Now, by computing \(P(X=3), ..., P(X=7)\) (same procedure as in a)) and using the rule for complements, we can find our desired probability. \[ P(X > 7) = 1 - P(X \leq 7) = 1 - \sum_{x=3}^{7} P(X = x) \approx 1 - 0.353= 0.647 \]
Here we simply use the formula \[ E[X] = \frac{3}{0.3} = 10 \]
Here we simply use the formula \[ Var[X] = \frac{3(1 - 0.3)}{0.3^2} = \frac{2.1}{0.09} = 23.\overline{3} \]
\[ \text{SD}(X) = \sqrt{23.\overline{3}} \approx 4.83 \]
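A check against `scipy.stats.nbinom` is possible, but note that scipy counts the number of *failures* before the \(r\)-th success rather than the total number of trials, so everything must be shifted by \(r\). A sketch, assuming `scipy` is available:

```python
from scipy.stats import nbinom

r, p = 3, 0.3
# scipy's X counts failures, so "5 calls in total" means 5 - r = 2 failures.
print(nbinom.pmf(5 - r, r, p))  # P(5 calls) ≈ 0.079
print(nbinom.sf(7 - r, r, p))   # P(more than 7 calls) ≈ 0.647
print(nbinom.mean(r, p) + r)    # E[calls] = r(1-p)/p + r = r/p = 10.0
print(nbinom.var(r, p))         # variance is unaffected by the shift: ≈ 23.33
print(nbinom.std(r, p))         # ≈ 4.83
```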
Problem 38
A random variable, \(X\), has a hypergeometric distribution if the three following requirements are fulfilled:
We select a sample from a finite population without replacement.
The population contains a known number of “successes” and “failures”.
Each draw changes the composition of the population.
We denote:
- \(N\) = total population size
- \(K\) = number of successes in the population
- \(n\) = number of draws (sample size)
For a hypergeometric distribution we have special formulas for the expected value and variance:
\[ E[X] = n \cdot \frac{K}{N}, \quad Var[X] = n \cdot \frac{K}{N} \cdot \frac{N - K}{N} \cdot \frac{N - n}{N - 1} \]
The probability distribution is given by:
\[ P(X = x) = \frac{\binom{K}{x} \binom{N - K}{n - x}}{\binom{N}{n}} \]
In a school of 100 students, 40 are members of the chess club. A random sample of 10 students is selected to take part in a survey.
Let the number of chess club members in the sample be given by the random variable \(X\), and let \(X\) have a hypergeometric distribution with \(N = 100\), \(K = 40\), and \(n = 10\).
What is the probability that exactly 4 chess club members are selected?
What is the probability that at least 2 chess club members are selected?
What is the expected number of chess club members selected?
What is the variance and standard deviation (\(\text{SD}(X)\)) of the number of chess club members selected?
Show solutions
\[ P(X = 4) = \frac{\binom{40}{4} \binom{60}{6}}{\binom{100}{10}}=\frac{\frac{40!}{36!4!}\frac{60!}{54!6!}}{\frac{100!}{90!10!}} \approx 0.264 \]
We compute: \[ P(X \geq 2) = 1 - P(X = 0) - P(X = 1) \] Where: \[ P(X = 0) = \frac{\binom{40}{0} \binom{60}{10}}{\binom{100}{10}}\approx 0.004, \quad P(X = 1) = \frac{\binom{40}{1} \binom{60}{9}}{\binom{100}{10}} \approx 0.034 \] Thus: \[ P(X \geq 2) \approx 1 - (0.004 + 0.034) = 0.962 \]
Here we can just compute the expectation with our formula \[ E[X] = 10 \cdot \frac{40}{100} = 4 \]
The variance and standard deviation are just as easy to compute \[ Var[X] = 10 \cdot \frac{40}{100} \cdot \frac{60}{100} \cdot \frac{90}{99} \approx 2.18 \] \[ \text{SD}(X) = \sqrt{2.18} \approx 1.48 \]
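Finally, these values can be checked against `scipy.stats.hypergeom`, assuming `scipy` is available (mind scipy's argument order):

```python
from scipy.stats import hypergeom

N, K, n = 100, 40, 10
# scipy's order is (k, M, n, N) = (successes drawn, population size,
# successes in the population, sample size).
print(hypergeom.pmf(4, N, K, n))  # P(X = 4) ≈ 0.264
print(hypergeom.sf(1, N, K, n))   # P(X >= 2) = P(X > 1) ≈ 0.962
print(hypergeom.mean(N, K, n))    # E[X] = 4.0
print(hypergeom.var(N, K, n))     # Var[X] ≈ 2.18
print(hypergeom.std(N, K, n))     # SD(X) ≈ 1.48
```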