A multinomial example

The multinomial theorem is a useful way to count. The counting problems discussed here are generalizations of counting problems that are solved by using binomial techniques (see this previous post for an example).

The best way to start is the example discussed in the previous post:

Seven dice are rolled. Find the probability that at least 4 of the dice show the same face.

The solution in the previous post uses the binomial distribution. A binomial solution is possible because in rolling 7 dice, at most one face can appear 4 or more times. So in this example, there are just two categories to keep track of in rolling a die: is it the value x or a value other than x? Use the binomial distribution to count these possibilities for a fixed x. Then multiply by 6 to get the answer.

If we roll 8 dice instead of 7 dice, this method cannot be used, because there can be more than two categories to keep track of. For example, in rolling 8 dice, it is possible for two faces to show up 4 times each (e.g. face 1 showing up 4 times and face 2 showing up 4 times). So this is a multinomial counting problem instead of a binomial counting problem. We demonstrate how multinomial coefficients are used to do the counting.

Before we do so, we observe that the more dice are rolled, the higher the probability of having at least 4 of the dice showing the same face. As an extreme example, if we roll 100 dice, it is certain that at least 4 of the dice will show the same face. In fact, in rolling 100 dice there is a 100% chance that at least 17 of the dice show the same face, since six faces appearing at most 16 times each would account for only 96 dice.

Working Toward a Multinomial Solution

Here’s the problem. Eight fair dice are rolled. Find the probability that at least 4 of the dice show the same face.

Let’s focus on one specific outcome in rolling 8 dice.

(4, 2, 2, 0, 0, 0)

The above outcome means that 4 dice show the value of 1, 2 dice show the value of 2 and 2 dice show the value of 3. The number of ways this can happen is a multinomial coefficient.

(1)……$\displaystyle \frac{8!}{4! \ 2! \ 2!}=420$

The outcome (4, 2, 2, 0, 0, 0) is one example of 4 dice showing one value, 2 dice showing another value and 2 dice showing a third value. The above multinomial coefficient says that there are 420 ways the outcome (4, 2, 2, 0, 0, 0) can happen when 8 dice are rolled. The outcome (0, 0, 0, 2, 2, 4), in which 4 dice show the value of 6, 2 dice show the value of 5 and 2 dice show the value of 4, can likewise happen in 420 ways.

The two outcomes (4, 2, 2, 0, 0, 0) and (0, 0, 0, 2, 2, 4) share the same pattern: 1 value of the die appears four times, 2 values of the die appear two times each, and the remaining 3 values of the die do not appear at all. How many ways can the 6 values of the die be assigned to this pattern? The answer is another multinomial coefficient.

(2)……$\displaystyle \frac{6!}{1! \ 2! \ 3!}=60$

Multiplying the two multinomial coefficients together gives the number of ways, in rolling 8 dice, that the result “4 dice show one value, 2 dice show another value and 2 dice show a third value” can happen.

(3)……$\displaystyle \frac{8!}{4! \ 2! \ 2!} \times \frac{6!}{1! \ 2! \ 3!}=420 \times 60=25200$

Let’s summarize what we have done so far. We start with the outcome (4, 2, 2, 0, 0, 0), which is shorthand for 4 of the dice showing the value of 1, 2 of the dice showing the value of 2 and 2 of the dice showing the value of 3. The number of ways this can happen is 420, which is the multinomial coefficient calculated in (1). The multinomial coefficient calculated in (2) is the number of ways the 6 values of the die can be assigned according to the criterion: one of the values appears four times, two of the values appear two times each and three of the values do not appear at all. The product of (1) and (2) is the number of ways 4 dice show one value, 2 dice show another value and 2 dice show a third value when rolling 8 dice.

The multinomial counting process discussed here is a double application of multinomial coefficients, with the first one on the rolls of the dice and the second one on the 6 values of the die.
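The double application can be sketched in a few lines of Python. This is only an illustration: the helper names `multinomial` and `ways_for_pattern` are our own, not from any library. The first coefficient splits the dice among the faces, and the second groups the six faces by how often each count occurs (for (4, 2, 2, 0, 0, 0): one count of 4, two counts of 2, three counts of 0).

```python
from collections import Counter
from math import factorial


def multinomial(n, parts):
    """Return n! / (k1! k2! ... km!) for parts k1, ..., km that sum to n."""
    parts = list(parts)
    assert sum(parts) == n
    result = factorial(n)
    for k in parts:
        result //= factorial(k)  # each division is exact
    return result


def ways_for_pattern(pattern, faces=6):
    """Return both multinomial coefficients for a pattern of face counts."""
    dice = sum(pattern)
    counts = list(pattern) + [0] * (faces - len(pattern))
    # First coefficient: ways to assign the dice to faces with these counts.
    on_dice = multinomial(dice, counts)
    # Second coefficient: ways to assign the faces to the counts, where
    # faces sharing the same count are interchangeable.
    on_faces = multinomial(faces, Counter(counts).values())
    return on_dice, on_faces


first, second = ways_for_pattern((4, 2, 2, 0, 0, 0))
print(first, second, first * second)  # 420 60 25200
```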

A Multinomial Solution

The key in solving the full problem is to list out all the different outcomes in addition to (4, 2, 2, 0, 0, 0). The following is the listing of all the possibilities.

Outcome
A (4, 3, 1, 0, 0, 0)
B (4, 2, 2, 0, 0, 0)
C (4, 2, 1, 1, 0, 0)
D (4, 1, 1, 1, 1, 0)
E (5, 3, 0, 0, 0, 0)
F (5, 2, 1, 0, 0, 0)
G (5, 1, 1, 1, 0, 0)
H (6, 2, 0, 0, 0, 0)
I (6, 1, 1, 0, 0, 0)
J (7, 1, 0, 0, 0, 0)
K (8, 0, 0, 0, 0, 0)
L (4, 4, 0, 0, 0, 0)

The table shows 12 possible outcomes. Note that the outcome (4, 2, 2, 0, 0, 0) discussed in the preceding section is outcome B in the table. The first 11 items in the table are outcomes that have only one face showing 4 or more times. The last item is the outcome that has two faces showing 4 times each. As shown in the preceding section, we calculate two multinomial coefficients and multiply them together. The first multinomial coefficient, as demonstrated in (1), is the number of ways the particular outcome can happen. The second multinomial coefficient, as demonstrated in (2), is the number of ways the 6 values of the die can be assigned to the pattern of that specific outcome. The following table gives the calculated results.

Outcome Count
A (4, 3, 1, 0, 0, 0) 33600
B (4, 2, 2, 0, 0, 0) 25200
C (4, 2, 1, 1, 0, 0) 151200
D (4, 1, 1, 1, 1, 0) 50400
E (5, 3, 0, 0, 0, 0) 1680
F (5, 2, 1, 0, 0, 0) 20160
G (5, 1, 1, 1, 0, 0) 20160
H (6, 2, 0, 0, 0, 0) 840
I (6, 1, 1, 0, 0, 0) 3360
J (7, 1, 0, 0, 0, 0) 240
K (8, 0, 0, 0, 0, 0) 6
L (4, 4, 0, 0, 0, 0) 1050
Total 307896

In rolling 8 dice, how many possible outcomes are there? The answer is $6^8=1679616$. This is due to the fact that each roll of a die has 6 outcomes. According to the multiplication principle, rolling 8 dice has a total of $6^8$ outcomes. Out of 1,679,616 outcomes, 307,896 of them are such that at least 4 of the dice show the same face. Thus in rolling 8 fair dice, the probability that at least 4 of the dice show the same face is:

$\displaystyle \frac{307896}{1679616}=\frac{12829}{69984}=0.1833$

When rolling 8 fair dice, there is roughly an 18.33% chance that at least 4 of the dice show the same face.
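The table can be reproduced mechanically. The sketch below is our own illustration and the function names are hypothetical. It lists every way to split the dice into face counts (largest count first), keeps the patterns whose largest count is at least 4, and adds up the product of the two multinomial coefficients for each pattern.

```python
from collections import Counter
from math import factorial


def partitions(total, max_parts, largest=None):
    """Yield the ways to write `total` as at most `max_parts` non-increasing positive parts."""
    if largest is None:
        largest = total
    if total == 0:
        yield ()
        return
    if max_parts == 0:
        return
    for first in range(min(total, largest), 0, -1):
        for rest in partitions(total - first, max_parts - 1, first):
            yield (first,) + rest


def multinomial(n, parts):
    """Return n! divided by the product of the factorials of the parts."""
    result = factorial(n)
    for k in parts:
        result //= factorial(k)
    return result


def count_at_least(dice, threshold, faces=6):
    """Count rolls of `dice` dice in which some face appears at least `threshold` times."""
    total = 0
    for pattern in partitions(dice, faces):
        if pattern[0] < threshold:
            continue  # the largest count is too small
        counts = pattern + (0,) * (faces - len(pattern))
        on_dice = multinomial(dice, counts)
        on_faces = multinomial(faces, Counter(counts).values())
        total += on_dice * on_faces
    return total


favorable = count_at_least(8, 4)
print(favorable, favorable / 6**8)  # 307896 and roughly 0.1833
```

Passing 7 instead of 8 gives 29616, which agrees with the binomial count $6 \times (35 \cdot 125+21 \cdot 25+7 \cdot 5+1)$ for seven dice.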

Practice Problem

To reinforce the concept of using a double application of multinomial coefficients, it is a good idea to work a practice problem. Naturally, we can just up the dice count by one. Here’s the problem: Nine fair dice are rolled. Find the probability that at least 4 of the dice show the same face.


Answer to Practice Problem

The desired probability is:

$\displaystyle \frac{2862096}{6^9}=\frac{2862096}{10077696}=\frac{59627}{209952}=0.2840$

The answer is based on a total of 18 outcomes.

Outcome Count
A (4, 3, 2, 0, 0, 0) 151200
B (4, 2, 2, 1, 0, 0) 680400
C (4, 2, 1, 1, 1, 0) 907200
D (4, 1, 1, 1, 1, 1) 90720
E (5, 3, 1, 0, 0, 0) 60480
F (5, 2, 2, 0, 0, 0) 45360
G (5, 1, 1, 1, 1, 0) 90720
H (6, 3, 0, 0, 0, 0) 2520
I (6, 2, 1, 0, 0, 0) 30240
J (6, 1, 1, 1, 0, 0) 30240
K (7, 2, 0, 0, 0, 0) 1080
L (7, 1, 1, 0, 0, 0) 4320
M (8, 1, 0, 0, 0, 0) 270
N (9, 0, 0, 0, 0, 0) 6
O (5, 4, 0, 0, 0, 0) 3780
P (4, 4, 1, 0, 0, 0) 37800
Q (4, 3, 1, 1, 0, 0) 453600
R (5, 2, 1, 1, 0, 0) 272160
Total 2862096
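A count like this can be checked independently by counting the complement, i.e. the rolls in which every face appears at most 3 times. The sketch below is our own illustration and the function names are hypothetical. It uses a standard generating function fact: the number of such rolls of $n$ dice is $n!$ times the coefficient of $x^n$ in $(1+x+x^2/2!+x^3/3!)^6$. The result for 9 dice can be compared with the count used in the answer above.

```python
from fractions import Fraction
from math import factorial


def count_all_below(dice, threshold, faces=6):
    """Count rolls in which every face appears fewer than `threshold` times."""
    # Coefficients of 1 + x + x^2/2! + ... + x^(threshold-1)/(threshold-1)!
    base = [Fraction(1, factorial(k)) for k in range(threshold)]
    poly = [Fraction(1)]
    for _ in range(faces):
        product = [Fraction(0)] * (len(poly) + len(base) - 1)
        for i, a in enumerate(poly):
            for j, b in enumerate(base):
                product[i + j] += a * b
        poly = product
    coefficient = poly[dice] if dice < len(poly) else Fraction(0)
    return int(coefficient * factorial(dice))


def count_at_least(dice, threshold, faces=6):
    """Count rolls in which some face appears at least `threshold` times."""
    return faces**dice - count_all_below(dice, threshold, faces)


print(count_at_least(8, 4))  # 307896, the total for 8 dice
nine = count_at_least(9, 4)
print(nine, nine / 6**9)     # count and probability for 9 dice
```

With 100 dice the polynomial has no $x^{100}$ term, so the complement count is 0, which matches the earlier remark that a repeated face is certain in that case.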


$\copyright$ 2019 – Dan Ma

Practice Problem Set 7 – a discrete joint distribution

The practice problems presented here deal with a discrete joint distribution that is defined by multiplying a marginal distribution and a conditional distribution – similar to the joint distribution found here and here. Thus this post provides additional practice opportunities.

Practice Problems

Let $X$ be the value of a roll of a fair die. For $X=x$, suppose that $Y \lvert X=x$ has a binomial distribution with $n=4$ and $p=x / 10$.

Practice Problem 7-A
Compute the conditional binomial distributions $Y \lvert X=x$ where $x=1,2,3,4,5,6$.

Practice Problem 7-B
Calculate the joint probability function $P[X=x,Y=y]$ for $x=1,2,3,4,5,6$ and $y=0,1,2,3,4$.

Practice Problem 7-C
Determine the probability function for the marginal distribution of $Y$. Calculate the mean and variance of $Y$.

Practice Problem 7-D
Calculate the backward conditional probabilities $P[X=x \lvert Y=y]$ for all applicable $x$ and $y$.

Problems 7-A to 7-D are similar to the ones in this previous post.

Practice Problem 7-E
Calculate the mean and variance of $X$.

Practice Problem 7-F
Calculate the mean and variance of $Y$ (use the methods discussed here).

Practice Problem 7-G
Calculate the covariance $\text{Cov}(X,Y)$ and the correlation coefficient $\rho$.

Problems 7-E to 7-G are similar to the ones in this previous post.

Answers

Practice Problem 7-A

\displaystyle \begin{aligned} &P[Y=0 \lvert X=1]=0.6561 \\&P[Y=1 \lvert X=1]=0.2916 \\&P[Y=2 \lvert X=1]=0.0486 \\&P[Y=3 \lvert X=1]=0.0036 \\&P[Y=4 \lvert X=1]=0.0001 \end{aligned}

\displaystyle \begin{aligned} &P[Y=0 \lvert X=2]=0.4096 \\&P[Y=1 \lvert X=2]=0.4096 \\&P[Y=2 \lvert X=2]=0.1536 \\&P[Y=3 \lvert X=2]=0.0256 \\&P[Y=4 \lvert X=2]=0.0016 \end{aligned}

\displaystyle \begin{aligned} &P[Y=0 \lvert X=3]=0.2401 \\&P[Y=1 \lvert X=3]=0.4116 \\&P[Y=2 \lvert X=3]=0.2646 \\&P[Y=3 \lvert X=3]=0.0756 \\&P[Y=4 \lvert X=3]=0.0081 \end{aligned}

\displaystyle \begin{aligned} &P[Y=0 \lvert X=4]=0.1296 \\&P[Y=1 \lvert X=4]=0.3456 \\&P[Y=2 \lvert X=4]=0.3456 \\&P[Y=3 \lvert X=4]=0.1536 \\&P[Y=4 \lvert X=4]=0.0256 \end{aligned}

\displaystyle \begin{aligned} &P[Y=0 \lvert X=5]=0.0625 \\&P[Y=1 \lvert X=5]=0.25 \\&P[Y=2 \lvert X=5]=0.375 \\&P[Y=3 \lvert X=5]=0.25 \\&P[Y=4 \lvert X=5]=0.0625 \end{aligned}

\displaystyle \begin{aligned} &P[Y=0 \lvert X=6]=0.0256 \\&P[Y=1 \lvert X=6]=0.1536 \\&P[Y=2 \lvert X=6]=0.3456 \\&P[Y=3 \lvert X=6]=0.3456 \\&P[Y=4 \lvert X=6]=0.1296 \end{aligned}

Practice Problem 7-B

\displaystyle \begin{aligned} &P[Y=4,X=1]=\frac{0.0001}{6} \\&P[Y=4,X=2]=\frac{0.0016}{6} \\&P[Y=4,X=3]=\frac{0.0081}{6} \\&P[Y=4,X=4]=\frac{0.0256}{6} \\&P[Y=4,X=5]=\frac{0.0625}{6} \\&P[Y=4,X=6]=\frac{0.1296}{6} \end{aligned}

\displaystyle \begin{aligned} &P[Y=3,X=1]=\frac{0.0036}{6} \\&P[Y=3,X=2]=\frac{0.0256}{6} \\&P[Y=3,X=3]=\frac{0.0756}{6} \\&P[Y=3,X=4]=\frac{0.1536}{6} \\&P[Y=3,X=5]=\frac{0.25}{6} \\&P[Y=3,X=6]=\frac{0.3456}{6} \end{aligned}

\displaystyle \begin{aligned} &P[Y=2,X=1]=\frac{0.0486}{6} \\&P[Y=2,X=2]=\frac{0.1536}{6} \\&P[Y=2,X=3]=\frac{0.2646}{6} \\&P[Y=2,X=4]=\frac{0.3456}{6} \\&P[Y=2,X=5]=\frac{0.375}{6} \\&P[Y=2,X=6]=\frac{0.3456}{6} \end{aligned}

\displaystyle \begin{aligned} &P[Y=1,X=1]=\frac{0.2916}{6} \\&P[Y=1,X=2]=\frac{0.4096}{6} \\&P[Y=1,X=3]=\frac{0.4116}{6} \\&P[Y=1,X=4]=\frac{0.3456}{6} \\&P[Y=1,X=5]=\frac{0.25}{6} \\&P[Y=1,X=6]=\frac{0.1536}{6} \end{aligned}

\displaystyle \begin{aligned} &P[Y=0,X=1]=\frac{0.6561}{6} \\&P[Y=0,X=2]=\frac{0.4096}{6} \\&P[Y=0,X=3]=\frac{0.2401}{6} \\&P[Y=0,X=4]=\frac{0.1296}{6} \\&P[Y=0,X=5]=\frac{0.0625}{6} \\&P[Y=0,X=6]=\frac{0.0256}{6} \end{aligned}

Practice Problem 7-C

\displaystyle \begin{aligned} &P[Y=4]=\frac{0.2275}{6} \\&P[Y=3]=\frac{0.854}{6} \\&P[Y=2]=\frac{1.533}{6} \\&P[Y=1]=\frac{1.862}{6} \\&P[Y=0]=\frac{1.5235}{6} \end{aligned}

$\displaystyle E[Y]=1.4$

$\displaystyle E[Y^2]=3.22$

$\displaystyle Var[Y]=1.26$

Practice Problem 7-D

\displaystyle \begin{aligned} &P[X=1 \lvert Y=0]=\frac{0.6561}{1.5235}=0.4307 \\&P[X=2 \lvert Y=0]=\frac{0.4096}{1.5235}=0.2689 \\&P[X=3 \lvert Y=0]=\frac{0.2401}{1.5235}=0.1576 \\&P[X=4 \lvert Y=0]=\frac{0.1296}{1.5235}=0.0851 \\&P[X=5 \lvert Y=0]=\frac{0.0625}{1.5235}=0.0410 \\&P[X=6 \lvert Y=0]=\frac{0.0256}{1.5235}=0.0168 \end{aligned}

\displaystyle \begin{aligned} &P[X=1 \lvert Y=1]=\frac{0.2916}{1.862}=0.1566 \\&P[X=2 \lvert Y=1]=\frac{0.4096}{1.862}=0.2200 \\&P[X=3 \lvert Y=1]=\frac{0.4116}{1.862}=0.2211 \\&P[X=4 \lvert Y=1]=\frac{0.3456}{1.862}=0.1856 \\&P[X=5 \lvert Y=1]=\frac{0.25}{1.862}=0.1343 \\&P[X=6 \lvert Y=1]=\frac{0.1536}{1.862}=0.0825 \end{aligned}

\displaystyle \begin{aligned} &P[X=1 \lvert Y=2]=\frac{0.0486}{1.533}=0.0317 \\&P[X=2 \lvert Y=2]=\frac{0.1536}{1.533}=0.1002 \\&P[X=3 \lvert Y=2]=\frac{0.2646}{1.533}=0.1726 \\&P[X=4 \lvert Y=2]=\frac{0.3456}{1.533}=0.2254 \\&P[X=5 \lvert Y=2]=\frac{0.375}{1.533}=0.2446 \\&P[X=6 \lvert Y=2]=\frac{0.3456}{1.533}=0.2254 \end{aligned}

\displaystyle \begin{aligned} &P[X=1 \lvert Y=3]=\frac{0.0036}{0.854}=0.0042 \\&P[X=2 \lvert Y=3]=\frac{0.0256}{0.854}=0.0300 \\&P[X=3 \lvert Y=3]=\frac{0.0756}{0.854}=0.0885 \\&P[X=4 \lvert Y=3]=\frac{0.1536}{0.854}=0.1799 \\&P[X=5 \lvert Y=3]=\frac{0.25}{0.854}=0.2927 \\&P[X=6 \lvert Y=3]=\frac{0.3456}{0.854}=0.4047 \end{aligned}

\displaystyle \begin{aligned} &P[X=1 \lvert Y=4]=\frac{0.0001}{0.2275}=0.0004 \\&P[X=2 \lvert Y=4]=\frac{0.0016}{0.2275}=0.0070 \\&P[X=3 \lvert Y=4]=\frac{0.0081}{0.2275}=0.0356 \\&P[X=4 \lvert Y=4]=\frac{0.0256}{0.2275}=0.1125 \\&P[X=5 \lvert Y=4]=\frac{0.0625}{0.2275}=0.2747 \\&P[X=6 \lvert Y=4]=\frac{0.1296}{0.2275}=0.5697 \end{aligned}

Practice Problem 7-E

$\displaystyle E[X]=\frac{7}{2}=3.5$

$\displaystyle E[X^2]=\frac{91}{6}$

$\displaystyle Var[X]=\frac{35}{12}$

Practice Problem 7-F

$\displaystyle E[Y]=1.4$

$\displaystyle E[Y^2]=3.22$

$\displaystyle Var[Y]=1.26$

Practice Problem 7-G

$\displaystyle \text{Cov}(X,Y)=\frac{7}{6}$

$\displaystyle \rho=\frac{7}{6 \sqrt{3.675}}=0.60858$
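The answers to Problems 7-C and 7-E to 7-G can be checked with exact arithmetic. The following is a minimal sketch that builds the joint probability function as the marginal of $X$ times the conditional binomial. The names `conditional_pmf`, `joint` and `expect` are our own.

```python
from fractions import Fraction
from math import comb, sqrt

X_VALUES = range(1, 7)  # faces of the die
Y_VALUES = range(0, 5)  # binomial counts with n = 4


def conditional_pmf(y, x):
    """P[Y = y | X = x] for a binomial distribution with n = 4 and p = x / 10."""
    p = Fraction(x, 10)
    return comb(4, y) * p**y * (1 - p) ** (4 - y)


# Joint pmf: marginal of X (1/6 for each face) times the conditional pmf.
joint = {(x, y): Fraction(1, 6) * conditional_pmf(y, x)
         for x in X_VALUES for y in Y_VALUES}

# Marginal pmf of Y, obtained by summing the joint pmf over x.
marginal_y = {y: sum(joint[x, y] for x in X_VALUES) for y in Y_VALUES}


def expect(h):
    """Return E[h(X, Y)] under the joint pmf."""
    return sum(h(x, y) * prob for (x, y), prob in joint.items())


mean_x, mean_y = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: x * x) - mean_x**2
var_y = expect(lambda x, y: y * y) - mean_y**2
cov_xy = expect(lambda x, y: x * y) - mean_x * mean_y
rho = float(cov_xy) / sqrt(float(var_x * var_y))

print(mean_y, var_y, cov_xy, round(rho, 5))  # 7/5 63/50 7/6 0.60858
```

The backward conditional probabilities in 7-D follow from the same tables, e.g. `joint[1, 0] / marginal_y[0]` for $P[X=1 \lvert Y=0]$.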



Practice Problem Set 6 – transformation of univariate random variables

This post provides practice problems to reinforce the concept discussed in this post on transformation of univariate distributions.

In each problem in this post, a pdf for the random variable $X$ is given and a transformation $Y=g(X)$ is given where $g(x)$ is a one-to-one function. The problem is to obtain the pdf of the transformed variable $Y$ as well as to calculate probabilities regarding $Y$.

.

 Practice Problem 6-A Let $Y=2 X+5$ where the pdf of the random variable $X$ is given by: $\displaystyle f(x) = \left\{ \begin{array}{ll} \displaystyle \frac{1}{50} \ (10-x) &\ \ \ \ \ \ 0 < x < 10 \\ \text{ } & \text{ } \\ \displaystyle 0 &\ \ \ \ \ \ \text{otherwise} \\ \end{array} \right.$ Find the pdf of $Y$. Determine the mean and variance of $Y$. Compute the probability $P[10 Determine the 75th percentile of $Y$.

.

 Practice Problem 6-B Let $Y=\sqrt{X}$ where the pdf of the random variable $X$ is given by: $\displaystyle f(x) = \left\{ \begin{array}{ll} \displaystyle \frac{1}{1024} \ (-63+61 x+3 x^2-x^3) &\ \ \ \ \ \ 1 < x < 9 \\ \text{ } & \text{ } \\ \displaystyle 0 &\ \ \ \ \ \ \text{otherwise} \\ \end{array} \right.$ Find the pdf of $Y$. Compute the probability $P[1

$\text{ }$

 Practice Problem 6-C Suppose that the random variable $X$ follows an exponential distribution with the following pdf. $\displaystyle f(x) = \left\{ \begin{array}{ll} \displaystyle \frac{1}{4} \ e^{-x / 4} &\ \ \ \ \ \ 0 < x < \infty \\ \text{ } & \text{ } \\ \displaystyle 0 &\ \ \ \ \ \ \text{otherwise} \\ \end{array} \right.$ Let $Y=3 X+2$. Find the pdf of $Y$. Determine the mean and variance of $Y$. Compute the probability $P[6 Determine the 90th percentile of $Y$.

$\text{ }$

 Practice Problem 6-D The random variable $X$ has a pdf that is given below. $\displaystyle f(x) = \left\{ \begin{array}{ll} \displaystyle \frac{1}{4} \ x (4-x^2) &\ \ \ \ \ \ 0 < x < 2 \\ \text{ } & \text{ } \\ \displaystyle 0 &\ \ \ \ \ \ \text{otherwise} \\ \end{array} \right.$ Let $Y=5-\frac{X}{2}$. Find the pdf of $Y$. Determine the mean and variance of $Y$.

$\text{ }$

Practice Problem 6-E Suppose that the random variable $X$ follows an exponential distribution with the following pdf. $\displaystyle f(x) = \left\{ \begin{array}{ll} \displaystyle \frac{1}{4} \ e^{-x / 4} &\ \ \ \ \ \ 0 < x < \infty \\ \text{ } & \text{ } \\ \displaystyle 0 &\ \ \ \ \ \ \text{otherwise} \\ \end{array} \right.$ Let $Y=X^2$. Determine the pdf of $Y$. Determine the CDF of $Y$. Calculate the probabilities $P(Y > y)$ where $y=4,8,16,32,64,128,256$.

$\text{ }$

Practice Problem 6-F Suppose that the random variable $X$ follows an exponential distribution with the following pdf. $\displaystyle f(x) = \left\{ \begin{array}{ll} \displaystyle \frac{1}{4} \ e^{-x / 4} &\ \ \ \ \ \ 0 < x < \infty \\ \text{ } & \text{ } \\ \displaystyle 0 &\ \ \ \ \ \ \text{otherwise} \\ \end{array} \right.$ Let $Y=\sqrt{X}$. Determine the pdf of $Y$. Determine the CDF of $Y$. Calculate the probabilities $P(Y > y)$ where $y=4,8,16,32$.

$\text{ }$

 Practice Problem 6-G Suppose that $X$ follows a uniform distribution whose pdf is given by the following. $\displaystyle f(x) = \left\{ \begin{array}{ll} \displaystyle 1 &\ \ \ \ \ \ 0 < x < 1 \\ \text{ } & \text{ } \\ \displaystyle 0 &\ \ \ \ \ \ \text{otherwise} \\ \end{array} \right.$ Let $Y=-\theta \ln(X)$ where $\theta$ is a positive constant. Determine the pdf of $Y$. What is this distribution?

$\text{ }$

 Practice Problem 6-H Consider the random variable $X$ whose pdf is given below. $\displaystyle f(x) = \left\{ \begin{array}{ll} \displaystyle \frac{3}{125} x^2 &\ \ \ \ \ \ 0 < x < 5 \\ \text{ } & \text{ } \\ \displaystyle 0 &\ \ \ \ \ \ \text{otherwise} \\ \end{array} \right.$ Let $Y=X^3$. Determine the pdf of $Y$. What is this distribution?

$\text{ }$

 Practice Problem 6-I Suppose that $X$ has the following density function. $\displaystyle f(x) = \left\{ \begin{array}{ll} \displaystyle 6 \ x \ e^{-3 \ x^2} &\ \ \ \ \ \ 0 < x < \infty \\ \text{ } & \text{ } \\ \displaystyle 0 &\ \ \ \ \ \ \text{otherwise} \\ \end{array} \right.$ Let $Y=X^2$. Determine the pdf of $Y$. What is this distribution?

$\text{ }$

 Practice Problem 6-J Suppose that the pdf of the random variable $X$ is given below. $\displaystyle f(x) = \left\{ \begin{array}{ll} \displaystyle \frac{3}{x^4} &\ \ \ \ \ \ 1 < x < \infty \\ \text{ } & \text{ } \\ \displaystyle 0 &\ \ \ \ \ \ \text{otherwise} \\ \end{array} \right.$ Let $Y=\ln(X)$. Determine the pdf of $Y$. What is this distribution?

Answers

6-A
• $\displaystyle f_Y(y)=\frac{1}{200} (25-y) \ \ \ \ \ \ 5<y<25$
• $\displaystyle E[Y]=\frac{35}{3}$, $\displaystyle Var[Y]=\frac{200}{9}$
• $\displaystyle P[10
• 75th percentile = 15
6-B
• $\displaystyle f_Y(y)=\frac{1}{512} (-63y+61y^3+3y^5-y^7) \ \ \ \ \ \ 1<y<3$
• $\displaystyle P[1
6-C
• $\displaystyle f_Y(y)=\frac{1}{12} e^{-\frac{1}{12} (y-2) } \ \ \ \ \ \ 2<y<\infty$
• $\displaystyle E[Y]=14$, $\displaystyle Var[Y]=144$
• $\displaystyle P[16
• 90th percentile = 29.63102112
6-D
• $\displaystyle f_Y(y)=4 (-120+74y-15y^2+y^3) \ \ \ \ \ \ 4<y<5$
• $\displaystyle E[Y]=\frac{67}{15}$, $\displaystyle Var[Y]=\frac{11}{225}$
6-E
• $\displaystyle f_Y(y)=\frac{1}{8} \ \frac{1}{\sqrt{y}} \ e^{-(y/16)^{0.5}} \ \ \ \ \ \ 0<y<\infty$
• $\displaystyle F_Y(y)=1- e^{-(y/16)^{0.5}} \ \ \ \ \ \ 0<y<\infty$
• $\displaystyle 1-F_Y(4)=e^{-0.5}=0.6065306597$
• $\displaystyle 1-F_Y(8)=e^{-1/ \sqrt{2}}=0.4930686914$
• $\displaystyle 1-F_Y(16)=e^{-1}=0.3678794412$
• $\displaystyle 1-F_Y(32)=e^{- \sqrt{2}}=0.2431167344$
• $\displaystyle 1-F_Y(64)=e^{-2}=0.1353352832$
• $\displaystyle 1-F_Y(128)=e^{- \sqrt{8}}=0.0591057466$
• $\displaystyle 1-F_Y(256)=e^{-4}=0.0183156389$
6-F
• $\displaystyle f_Y(y)=\frac{1}{2} \ y \ e^{-(y/2)^{2}} \ \ \ \ \ \ 0<y<\infty$
• $\displaystyle F_Y(y)=1- e^{-(y/2)^{2}} \ \ \ \ \ \ 0<y<\infty$
• $\displaystyle 1-F_Y(4)=e^{-4}=0.0183156389$
• $\displaystyle 1-F_Y(8)=e^{-16}=1.12535 \cdot 10^{-7}$
• $\displaystyle 1-F_Y(16)=e^{-64}=1.6038 \cdot 10^{-28}$
• $\displaystyle 1-F_Y(32)=e^{-256}=6.6163 \cdot 10^{-112}$
6-G
• $\displaystyle f_Y(y)=\frac{1}{\theta} \ e^{-y / \theta} \ \ \ \ \ \ 0<y<\infty$
• Exponential distribution
6-H
• $\displaystyle f_Y(y)=\frac{1}{125} \ \ \ \ \ \ 0<y<125$
• Uniform distribution
6-I
• $\displaystyle f_Y(y)=3 \ e^{-3 y} \ \ \ \ \ \ 0<y<\infty$
• Exponential distribution with mean 1/3
6-J
• $\displaystyle f_Y(y)=3 \ e^{-3 y} \ \ \ \ \ \ 0<y<\infty$
• Exponential distribution with mean 1/3
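Several of the numerical answers for the exponential problems (6-C, 6-E and 6-F) can be verified with a few lines of Python. This is only a sketch under the problem statements above, and the function names are our own. Each function uses the fact that $X$ is exponential with mean 4, so that $P(X>x)=e^{-x/4}$.

```python
from math import exp, log, sqrt


def percentile_6c(q):
    """6-C: Y = 3X + 2, so Y - 2 is exponential with mean 12."""
    return 2 - 12 * log(1 - q)


def survival_6e(y):
    """6-E: Y = X^2, so P(Y > y) = P(X > sqrt(y)) = exp(-sqrt(y) / 4)."""
    return exp(-sqrt(y) / 4)


def survival_6f(y):
    """6-F: Y = sqrt(X), so P(Y > y) = P(X > y^2) = exp(-y^2 / 4)."""
    return exp(-y * y / 4)


print(percentile_6c(0.90))                           # about 29.631
print([round(survival_6e(y), 4) for y in (4, 16, 64)])  # [0.6065, 0.3679, 0.1353]
print(survival_6f(8))                                # about 1.125e-07
```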



Transformation of univariate random variables

Creating new probability distributions from old ones (or existing ones) is a familiar theme in the study of probability. This post shows how to generate a distribution under a transformation. The process is illustrated with examples.

Practice problems are found in the next post.

Examples

The starting point is the random variable $X$ whose probability density function (pdf) is given by the following.

$\displaystyle f(x) = \left\{ \begin{array}{ll} \displaystyle \frac{1}{8} \ (4-x) &\ \ \ \ \ \ 0 < x < 4 \\ \text{ } & \text{ } \\ \displaystyle 0 &\ \ \ \ \ \ \text{otherwise} \\ \end{array} \right.$

The given $X$ is transformed in four different ways as follows:

$Y_1=X^2$

$Y_2=16-X^2$

$Y_3=\sqrt{X}$

$Y_4=\sqrt{4-X}$

We demonstrate how to derive the pdfs of these four new random variables based on the pdf given at the beginning. Note that the support of $X$ is the interval $(0, 4)$. Because of the transformations, the supports of the $Y$ variables are different. The support of $Y_1$ is the interval $(0, 16)$. The support of $Y_2$ is also $(0, 16)$. The support of $Y_3$ is $(0, 2)$. The support of $Y_4$ is also $(0, 2)$.

There are two ways to derive the pdfs of $Y_i$, $i=1,2,3,4$. One way is the CDF method: find the CDF of the new variable and then take the derivative to get the pdf. Another way is the method of transformation, which is the focus here. We show how to use the CDF method on $Y_1$ in order to draw out the idea of the method of transformation.

Let $f_{Y_1}(y)$ and $F_{Y_1}(y)$ be the pdf and CDF of $Y_1$. Let $F(x)$ be the CDF of $X$. The following derives $F_{Y_1}(y)$.

\displaystyle \begin{aligned} F_{Y_1}(y)&=P(Y_1 \le y) \\&=P(X^2 \le y) \\&=P(X \le \sqrt{y}) \\&=F(\sqrt{y}) \end{aligned}

Thus the CDF $F_{Y_1}(y)$ is the CDF $F(x)$ evaluated at $\sqrt{y}$. Since the pdf is the derivative of the CDF, the pdf $f_{Y_1}(y)$ is obtained by taking the derivative of $F(\sqrt{y})$.

\displaystyle \begin{aligned} f_{Y_1}(y)&=\frac{d}{dy} F_{Y_1}(y) \\&=\frac{d}{dy} F(\sqrt{y}) \\&=f(\sqrt{y}) \ \frac{d}{dy} \sqrt{y} \ \ \ \ \ *\\&=\frac{1}{8} \biggl(4-\sqrt{y} \biggr) \frac{1}{2 \sqrt{y}} \\&=\frac{1}{16} \biggl(\frac{4}{\sqrt{y}}-1 \biggr) \ \ \ \ \ \ \ \ \ \ 0<y<16 \end{aligned}

The step that is labeled with * is the key step in the derivation and will be discussed in further detail.

The Method of Transformation

Let’s describe the method demonstrated in the above derivation. There is a starting probability distribution represented by the random variable $X$. Its pdf is $f(x)$ whose support is a subset of the x-axis, likely an interval (of finite or infinite length). Let’s call the support $S$. The support of a pdf is the set of all $x$ such that $f(x)>0$. We have a differentiable function $g(x)$ defined on $S$. This function $g(x)$ is a one-to-one function over the support $S$. The function $g(x)$ does not have to be a one-to-one function over all of the x-axis. It just has to be one-to-one over the support $S$. As a result, the function $g(x)$ is either an increasing function or a decreasing function over the support.

Since $y=g(x)$ is a one-to-one function, it has an inverse $x=g^{-1}(y)$. The inverse $g^{-1}(y)$ is defined over the set $g(S)$.

Consider the new random variable $Y=g(X)$. The following gives the pdf of $Y$.

(1)……$\displaystyle f_Y(y)=f(g^{-1}(y)) \ \biggl \lvert \frac{d}{dy} \ g^{-1}(y) \biggr \rvert$

The formula (1) gives the method of transformation and is illustrated by the step labeled with * above. With $Y_1=X^2$, the transformation is the function $g(x)=x^2$. It is not a one-to-one function over the entire x-axis but it is a one-to-one function on the support $(0, 4)$. In fact, $g(x)=x^2$ is an increasing function over $(0, 4)$. The inverse function is then $g^{-1}(y)=\sqrt{y}$. Applying (1) gives the pdf $f_Y(y)$.

One thing to keep in mind is that the method works only if the transformation $g(x)$ is a one-to-one function over the support of the original pdf $f(x)$ (either $g(x)$ is increasing or decreasing). If not, the method will produce a wrong answer. Another thing to keep in mind is that when the transformation $g(x)$ is a decreasing function, its inverse $g^{-1}(y)$ is also a decreasing function. Then its derivative would be negative. In (1), we use the absolute value of the derivative.
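Formula (1) translates directly into code. The sketch below is our own illustration and the helper name `transformed_pdf` is hypothetical. It takes the original pdf, the inverse $g^{-1}$ and the derivative of the inverse, and returns the pdf of $Y=g(X)$. The caller is assumed to supply a transformation that is one-to-one on the support and to evaluate the result only for $y$ in $g(S)$.

```python
from math import sqrt


def pdf_x(x):
    """The starting pdf: f(x) = (4 - x) / 8 on the support (0, 4)."""
    return (4 - x) / 8 if 0 < x < 4 else 0.0


def transformed_pdf(pdf, inverse, inverse_derivative):
    """Return the pdf of Y = g(X), given g's inverse and the inverse's derivative."""
    def pdf_y(y):
        # Formula (1): f(g^{-1}(y)) times |d/dy g^{-1}(y)|
        return pdf(inverse(y)) * abs(inverse_derivative(y))
    return pdf_y


# Y1 = X^2 is increasing on (0, 4); its inverse is sqrt(y).
pdf_y1 = transformed_pdf(pdf_x, sqrt, lambda y: 1 / (2 * sqrt(y)))

# Y2 = 16 - X^2 is decreasing on (0, 4); the inverse sqrt(16 - y) has a
# negative derivative, which is why formula (1) takes the absolute value.
pdf_y2 = transformed_pdf(
    pdf_x,
    lambda y: sqrt(16 - y),
    lambda y: -1 / (2 * sqrt(16 - y)),
)

print(pdf_y1(4.0))   # 0.0625
print(pdf_y2(12.0))  # 0.0625
```

Dropping the absolute value in the decreasing case would give a negative "density", which is the quickest sign that a step was missed.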

Examples Continued

Using the method of transformation, the following shows the pdfs of $Y_i$, $i=1,2,3,4$.

$\displaystyle f_{Y_1}(y)=\frac{1}{16} \biggl(\frac{4}{\sqrt{y}}-1 \biggr) \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 0<y<16$

$\displaystyle f_{Y_2}(y)=\frac{1}{16} \biggl(\frac{4}{\sqrt{16-y}}-1 \biggr) \ \ \ \ \ \ \ \ \ \ 0<y<16$

$\displaystyle f_{Y_3}(y)=\frac{1}{4} \ (4 y-y^3 ) \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 0<y<2$

$\displaystyle f_{Y_4}(y)=\frac{1}{4} \ y^3 \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 0<y<2$
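To see how formula (1) produces these pdfs, here is a sketch of the calculation for the three transformations not worked out earlier. In each case the inverse is taken on the support $(0, 4)$ of $X$ and $f(x)=\frac{1}{8} (4-x)$.

For $Y_2=16-X^2$, the inverse is $g^{-1}(y)=\sqrt{16-y}$, a decreasing function.

$\displaystyle f_{Y_2}(y)=f(\sqrt{16-y}) \ \biggl \lvert \frac{-1}{2 \sqrt{16-y}} \biggr \rvert=\frac{1}{8} \biggl(4-\sqrt{16-y} \biggr) \frac{1}{2 \sqrt{16-y}}=\frac{1}{16} \biggl(\frac{4}{\sqrt{16-y}}-1 \biggr)$

For $Y_3=\sqrt{X}$, the inverse is $g^{-1}(y)=y^2$, an increasing function on $(0, 2)$.

$\displaystyle f_{Y_3}(y)=f(y^2) \ \lvert 2y \rvert=\frac{1}{8} \ (4-y^2) \ 2y=\frac{1}{4} \ (4y-y^3)$

For $Y_4=\sqrt{4-X}$, the inverse is $g^{-1}(y)=4-y^2$, a decreasing function on $(0, 2)$.

$\displaystyle f_{Y_4}(y)=f(4-y^2) \ \lvert -2y \rvert=\frac{1}{8} \ \bigl(4-(4-y^2) \bigr) \ 2y=\frac{1}{4} \ y^3$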

It will be instructive to examine the graphs of the pdfs. The following is the graph of the pdf of $X$.

Figure 1

The starting pdf (Figure 1) is a straight line with negative slope. In this distribution, more probabilities are found near zero. For example $P(X \le 1)=0.4375$. About 44% of the values from this distribution are expected to be less than 1. The following is the graph of the pdf of $Y_1=X^2$.

Figure 2

Figure 2 shows that the effect of the transformation $y=x^2$ is to push the probabilities further to zero. The 44% that is less than 1 in Figure 1 is further pushed toward zero. Hence the graph in Figure 2 is extremely positively skewed (or skewed to the right since the right tail is longer). The following is the graph of the pdf of $Y_2=16-X^2$.

Figure 3

Figure 3 shows that the effect of the transformation $y=16-x^2$ is to push the probabilities in the opposite direction, toward 16. Hence the distribution of $Y_2$ is extremely negatively skewed (or left skewed since the left tail is longer).

Remarks

The method of transformation is a great tool making it possible to create new distributions with desired characteristics from old ones.

Some named distributions are generated from transformation. For example, the lognormal distribution is a transformation from the normal distribution where the transformation is an exponential function. More specifically, if $X$ has a normal distribution with mean $\mu$ and variance $\sigma^2$, then $Y=e^X$ has a lognormal distribution and parameters $\mu$ and $\sigma^2$. The transformation goes the other way too. If $Y$ has a lognormal distribution and parameters $\mu$ and $\sigma^2$, then $X=\ln(Y)$ has a normal distribution with mean $\mu$ and variance $\sigma^2$. See here for a discussion of lognormal distribution. Practice problems on lognormal distribution are found here.
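The relationship between the two distributions can be illustrated with the standard library. The sketch below is our own illustration and the function name `lognormal_cdf` is hypothetical. It computes $P(Y \le y)$ for $Y=e^X$ by evaluating the normal CDF at $\ln(y)$, which is the CDF method applied to the exponential transformation.

```python
from math import exp, log
from statistics import NormalDist


def lognormal_cdf(y, mu, sigma):
    """Return P(Y <= y) where Y = exp(X) and X is normal with mean mu and sd sigma."""
    if y <= 0:
        return 0.0  # a lognormal variable is always positive
    # P(exp(X) <= y) = P(X <= ln y) because exp is increasing.
    return NormalDist(mu, sigma).cdf(log(y))


# The median of Y is exp(mu), since the median of X is mu.
print(lognormal_cdf(exp(1.0), mu=1.0, sigma=0.5))  # 0.5
```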

Another named distribution that is generated from a transformation is the Weibull distribution. It is generated by raising an exponential random variable to a power (discussed here). The topic of raising an exponential distribution to a power is further discussed here. For more distributions created by transformation, explore the site through the given links.

Practice problems are found in the next post.



Practice Problem Set 5 – bivariate normal distribution

This post provides practice problems to reinforce the concept of bivariate normal distribution discussed in two posts – one is a detailed introduction to bivariate normal distribution and the other is a further discussion that brings out more mathematical properties of the bivariate normal distribution. The properties discussed in these two posts form the basis for the calculation behind the practice problems presented here.

The practice problems presented here are mostly on calculating probabilities. The normal probabilities can be obtained using a normal table or a calculator that has a function for normal distribution (such as TI84+). The answers for normal probabilities given at the end of the post have two versions – one using a normal table (found here) and the other one using TI84+.

.

 Practice Problem 5-A Suppose that $X$ and $Y$ follow a bivariate normal distribution with parameters $\mu_X=15$, $\sigma_X=4$, $\mu_Y=20$, $\sigma_Y=5$ and $\rho=-0.7$. Determine the following. Compute the probability $P[12 For $X=20$, determine the mean and standard deviation of the conditional distribution of $Y$ given $X=20$. Determine $P[12, the probability that $12 given $X=20$.

.

 Practice Problem 5-B Suppose that $X$ and $Y$ follow a bivariate normal distribution with parameters $\mu_X=6$, $\sigma_X=1.6$, $\mu_Y=4$, $\sigma_Y=1.2$ and $\rho=0.8$. Determine the following. Compute the probability $P[3 Determine $E[Y \lvert X=x]$, the mean of the conditional distribution of $Y$ given $X=x$. Determine $\sigma_{Y \lvert x}^2=Var[Y \lvert X=x]$ and $\sigma_{Y \lvert x}$, the variance and the standard deviation of the conditional distribution of $Y$ given $X=x$. For each of the $x$ values 6, 8, 10 and 12, determine the 99.7% interval $(a,b)$ for the conditional distribution of $Y$ given $x$, i.e. $a$ is three standard deviations below the mean and $b$ is 3 standard deviations above the mean. For each of the $x$ values 6, 8, 10 and 12, determine $P[3. Explain the magnitude of each of these probabilities based on the intervals in 6.

$\text{ }$

 Practice Problem 5-C Let $X$ and $Y$ have a bivariate normal distribution with parameters $\mu_X=50$, $\sigma_X=10$, $\mu_Y=60$, $\sigma_Y=5$ and $\rho=0.6$. Determine the following. Calculate $P[100 Determine the 5 parameters of the bivariate normal random variables $L=X+Y$ and $M=X-Y$. Calculate $P[100

$\text{ }$

Practice Problem 5-D Suppose $X$ is the height (in inches) and $Y$ is the weight (in pounds) of a male student in a large university. Furthermore suppose that $X$ and $Y$ follow a bivariate normal distribution with parameters $\mu_X=69$, $\mu_Y=155$, $\sigma_X=2.5$, $\sigma_Y=20$ and $\rho=0.55$. What is the distribution of the weights of all male students who are 5 feet 11 inches tall (71 inches)? For a randomly chosen male student who is 71 inches tall, what is the probability that his weight is between 170 and 200 pounds? For male students who are 71 inches tall, what is the 90th percentile of weight?

$\text{ }$

 Practice Problem 5-E Suppose that $X$ and $Y$ have a bivariate normal distribution with parameters $\mu_X=70$, $\mu_Y=70$, $\sigma_X=5$, $\sigma_Y=10$ and $\rho>0$. Further suppose that $P[58.24. Determine $\rho$.

$\text{ }$

 Practice Problem 5-F Suppose that $X$ and $Y$ have a bivariate normal distribution with parameters $\mu_X=70$, $\mu_Y=60$, $\sigma_X=10$, $\sigma_Y=12$ and $\rho=0.8$. Compute $P[45 When $X=60$, 4 values of $Y$ are observed. Compute $P[45<\overline{Y}<55 \lvert X=60]$ where $\overline{Y}$ is the mean of the sample of size 4.

$\text{ }$

 Practice Problem 5-G Let $X$ and $Y$ have a bivariate normal distribution with parameters $\mu_X=70$, $\mu_Y=50$, $\sigma_X=10$, $\sigma_Y=12$ and $\rho=-0.65$. Determine the following. $P[X-Y<50]$ $\displaystyle P[55<\frac{X+Y}{2}<65]$

$\text{ }$

 Practice Problem 5-H Let $X$ and $Y$ have a bivariate normal distribution with parameters $\mu_X=70$, $\sigma_X=5$, $\mu_Y=50$, $\sigma_Y=10$ and $\rho=0.75$. Determine the following probabilities. $P \biggl[ \frac{X+Y}{2}<68 \biggr]$ $P \biggl[ \frac{X+Y}{2}<68 \ \biggl \lvert Y=60 \biggr]$

$\text{ }$

 Practice Problem 5-I For a couple from a large population of married couples, let $X$ be the height (in inches) of the husband and let $Y$ be the height (in inches) of the wife. Suppose that $X$ and $Y$ have a bivariate normal distribution with parameters $\mu_X=68$, $\mu_Y=65$, $\sigma_X=2.2$, $\sigma_Y=2.5$ and $\rho=0.5$. For a randomly selected wife from this population, determine the probability that her height is between 68 inches and 72 inches. For a randomly selected wife from this population whose husband is 72 inches tall, determine the probability that her height is between 68 inches and 72 inches. For a randomly selected couple from this population, determine the probability that the wife is taller than the husband.

$\text{ }$

 Practice Problem 5-J Suppose that the annual revenues of Company X and Company Y follow a bivariate normal distribution and are positively correlated, with a correlation coefficient of 0.65. The annual revenue of Company X is, on average, 4,500 with standard deviation 1,500. The annual revenue of Company Y is, on average, 5,500 with standard deviation 2,000. Calculate the probability that the annual revenue of Company X is less than 6,800 given that the annual revenue of Company Y is 6,800. Calculate the probability that the annual revenue of Company X is greater than that of Company Y given that their total revenue is 12,000.


5-A
1. $P[12 (table), 0.8904014212 (TI84+)
2. $E[Y \lvert X=20]=15.625$, $Var[Y \lvert X=20]=12.75$
3. $P[12 (table), 0.8447309876 (TI84+)
5-B
• $P[3 (table),0.5953433508 (TI84+)
• $\displaystyle E[Y \lvert x]=0.4+0.6 \ x$
• $\displaystyle Var[Y \lvert x]=0.5184$, standard deviation = 0.72
• For x = 6, (1.84, 6.16)
• For x = 8, (3.04, 7.36)
• For x = 10, (4.24, 8.56)
• For x = 12, (5.44, 9.76)
• $P[3 (table), 0.8351333522 (TI84+)
• $P[3 (table), 0.3894682472 (TI84+)
• $P[3 (table), 0.025919702 (TI84+)
• $P[3 (table), 0.0001524802646 (TI84+)
5-C
• $P[100 (table), 0.7551912515 (TI84+)
• $\displaystyle \mu_L=110 \ \ \ \sigma_L=\sqrt{185} \ \ \ \mu_M=-10 \ \ \ \sigma_M=\sqrt{65} \ \ \ \rho_{L,M}=\frac{75}{\sqrt{185} \sqrt{65}}$
• $P[100 (table), 0.8966089617 (TI84+)
5-D
• Normal with mean 163.8 and standard deviation $\sqrt{279}$.
• $P[170 (table), 0.3401418637 (TI84+)
• 90th percentile = 185.18 (table), 185.2061314 (TI84+)
5-E
• 0.8
5-F
• $P[45 (table), 0.5119251771 (TI84+)
• $P[45<\overline{Y}<55 \lvert X=60]=0.8329$ (table), 0.8325288097 (TI84+)
5-G
• $P[X-Y<50]=0.9332$ (table), 0.9331927713 (TI84+)
• $\displaystyle P[55<\frac{X+Y}{2}<65]=0.7154$ (table), 0.7135779177 (TI84+)
5-H
• $P \biggl[ \frac{X+Y}{2}<68 \biggr]=0.8708$ (table), 0.8710504336 (TI84+)
• $P \biggl[ \frac{X+Y}{2}<68 \ \biggl \lvert Y=60 \biggr]=0.7517$ (table), 0.7518542213 (TI84+)
5-I
• 0.1125 (table), 0.1125145409 (TI84+)
• 0.3523 (table), 0.3539664536 (TI84+)
• 0.1020 (table), 0.1022447094 (TI84+)
5-J
• 0.9279 (table), 0.9280950079 (TI84+)
• 0.1736 (table), 0.1736950626 (TI84+)
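Answers such as those for 5-G and 5-H rest on the fact that a linear combination $aX+bY$ of bivariate normal variables is normal with mean $a \mu_X+b \mu_Y$ and variance $a^2 \sigma_X^2+b^2 \sigma_Y^2+2ab \rho \sigma_X \sigma_Y$. The following Python sketch checks the two answers of 5-G under that fact. The helper names `normal_cdf` and `linear_combo` are made up for this illustration and do not come from any library.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def linear_combo(a, b, mu_x, sigma_x, mu_y, sigma_y, rho):
    """Mean and standard deviation of a*X + b*Y for bivariate normal (X, Y)."""
    mean = a * mu_x + b * mu_y
    var = (a * sigma_x) ** 2 + (b * sigma_y) ** 2 \
        + 2 * a * b * rho * sigma_x * sigma_y
    return mean, math.sqrt(var)

# Parameters of Practice Problem 5-G
params = dict(mu_x=70, sigma_x=10, mu_y=50, sigma_y=12, rho=-0.65)

# P[X - Y < 50]
m_diff, s_diff = linear_combo(1, -1, **params)
p_diff = normal_cdf((50 - m_diff) / s_diff)

# P[55 < (X + Y)/2 < 65]
m_avg, s_avg = linear_combo(0.5, 0.5, **params)
p_avg = normal_cdf((65 - m_avg) / s_avg) - normal_cdf((55 - m_avg) / s_avg)

print(round(p_diff, 4), round(p_avg, 4))
```

The results should agree with the calculator values listed for 5-G to about four decimal places. The table values can differ slightly because the z-scores are rounded before the table lookup.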


$\copyright$ 2019 – Dan Ma

Calculating bivariate normal probabilities

This post extends the discussion of the bivariate normal distribution started in this post from a companion blog. Practice problems are given in the next post.

Suppose that the continuous random variables $X$ and $Y$ follow a bivariate normal distribution with parameters $\mu_X$, $\sigma_X$, $\mu_Y$, $\sigma_Y$ and $\rho$. What to make of these five parameters? According to the previous post, we know that

• $\mu_X$ and $\sigma_X$ are the mean and standard deviation of the marginal distribution of $X$,
• $\mu_Y$ and $\sigma_Y$ are the mean and standard deviation of the marginal distribution of $Y$,
• and finally $\rho$ is the correlation coefficient of $X$ and $Y$.

So the five parameters of a bivariate normal distribution are the means and standard deviations of the two marginal distributions and the fifth parameter is the correlation coefficient that serves to connect $X$ and $Y$. If $\rho=0$, then $X$ and $Y$ are simply two independent normal random variables.

When calculating probabilities involving a bivariate normal distribution, keep in mind that both marginal distributions are normal. Furthermore, the conditional distribution of one variable given a value of the other is also normal. Much more can be said about the conditional distributions.

The conditional distribution of $Y$ given $X=x$ is usually denoted by $Y \lvert X=x$ or $Y \lvert x$. In addition to being a normal distribution, it has a mean that is a linear function of $x$ and has a variance that is constant (it does not matter what $x$ is, the variance is always the same). The linear conditional mean and constant variance are given by the following:

$\displaystyle E[Y \lvert X=x]=\mu_Y+\rho \ \frac{\sigma_Y}{\sigma_X} \ (x-\mu_X)$

$\displaystyle Var[Y \lvert X=x]=\sigma_Y^2 \ (1-\rho^2)$

Similarly, the conditional distribution of $X$ given $Y=y$ is usually denoted by $X \lvert Y=y$ or $X \lvert y$. In addition to being a normal distribution, it has a mean that is a linear function of $y$ and has a variance that is constant. The linear conditional mean and constant variance are given by the following:

$\displaystyle E[X \lvert Y=y]=\mu_X+\rho \ \frac{\sigma_X}{\sigma_Y} \ (y-\mu_Y)$

$\displaystyle Var[X \lvert Y=y]=\sigma_X^2 \ (1-\rho^2)$

The information about the conditional distribution of $Y$ on $X=x$ is identical to the information about the conditional distribution of $X$ on $Y=y$, except for the switching of $X$ and $Y$. An example is helpful.

Example 1
Suppose that the continuous random variables $X$ and $Y$ follow a bivariate normal distribution with parameters $\mu_X=10$, $\sigma_X=10$, $\mu_Y=20$, $\sigma_Y=5$ and $\rho=0.6$. The first two parameters are the mean and standard deviation of the marginal distribution of $X$. The next two parameters are the mean and standard deviation of the marginal distribution of $Y$. The parameter $\rho$ is the correlation coefficient of $X$ and $Y$. Both marginal distributions are normal.

Let’s focus on the conditional distribution of $Y$ given $X=x$. It is normally distributed. Its mean and variance are:

$\displaystyle \begin{aligned} E[Y \lvert X=x]&=\mu_Y+\rho \ \frac{\sigma_Y}{\sigma_X} \ (x-\mu_X) \\&=20+0.6 \ \frac{5}{10} \ (x-10) \\&=20+0.3 \ (x-10) \\&=17+0.3 \ x \end{aligned}$

$\displaystyle \sigma_{Y \lvert x}^2=Var[Y \lvert X=x]=\sigma_Y^2 (1-\rho^2)=25 \ (1-0.6^2)=16$

$\displaystyle \sigma_{Y \lvert x}=4$

The line $y=17+0.3 \ x$ is also called the least squares regression line. It gives the mean of the conditional distribution of $Y$ given $x$. Because $X$ and $Y$ are positively correlated, the least squares line has positive slope. In this case, the larger the $x$, the larger is the mean of $Y$. The standard deviation of $Y$ given $x$ is constant across all possible $x$ values.
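The conditional mean and standard deviation translate directly into a few lines of code. The following is a minimal Python sketch using the parameters of Example 1. The function name `conditional_y_given_x` is made up for this illustration and is not from any library.

```python
import math

def conditional_y_given_x(mu_x, sigma_x, mu_y, sigma_y, rho, x):
    """Return (mean, sd) of Y given X = x for a bivariate normal pair."""
    mean = mu_y + rho * (sigma_y / sigma_x) * (x - mu_x)
    sd = sigma_y * math.sqrt(1.0 - rho ** 2)  # does not depend on x
    return mean, sd

# Parameters of Example 1: mu_X=10, sigma_X=10, mu_Y=20, sigma_Y=5, rho=0.6
mean_25, sd_25 = conditional_y_given_x(10, 10, 20, 5, 0.6, x=25)
mean_0, sd_0 = conditional_y_given_x(10, 10, 20, 5, 0.6, x=0)

print(mean_25, sd_25)  # conditional mean and sd at x = 25
print(mean_0, sd_0)    # conditional mean and sd at x = 0
```

The conditional mean moves along the line $y=17+0.3 \ x$ as $x$ changes, while the conditional standard deviation stays at 4.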

With mean and standard deviation known, we can now compute normal probabilities. Suppose the realized value of $X$ is 25. Then the mean of $Y \lvert 25$ is $E[Y \lvert 25]=24.5$. The standard deviation, as indicated above, is 4. In fact, for any other $x$, the standard deviation of $Y \lvert x$ is also 4. Now calculate the probability $P[20. We first calculate it using a normal table found here.

\displaystyle \begin{aligned} P[20

Using a TI84+ calculator, $P[20. In contrast, the probability $P[20 is (using the table found here):

\displaystyle \begin{aligned} P[20

Using a TI84+ calculator, $P[20. Note that $P[20 is for the marginal distribution of $Y$. It is not conditioned on any realized value of $X$.
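The contrast between a conditional probability and a marginal probability can also be checked in software instead of a table or calculator. The Python sketch below uses the conditional mean 24.5 and standard deviation 4 of $Y \lvert 25$ and the marginal mean 20 and standard deviation 5 of $Y$ from Example 1. The interval from 20 to 30 is an assumption chosen only for illustration (in particular the upper endpoint 30), and the helper names are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_interval_prob(lo, hi, mean, sd):
    """P[lo < W < hi] for a normal W with the given mean and sd."""
    return normal_cdf((hi - mean) / sd) - normal_cdf((lo - mean) / sd)

# Illustrative interval; the upper endpoint 30 is an assumption.
lo, hi = 20.0, 30.0

# Conditional distribution of Y given X = 25 (mean 17 + 0.3*25, sd 4)
p_conditional = normal_interval_prob(lo, hi, 17 + 0.3 * 25, 4.0)

# Marginal distribution of Y (mean 20, sd 5), no conditioning on X
p_marginal = normal_interval_prob(lo, hi, 20.0, 5.0)

print(round(p_conditional, 4), round(p_marginal, 4))
```

For this interval the conditional probability is noticeably larger than the marginal one, since knowing $X=25$ shifts the mean of $Y$ from 20 to 24.5 and reduces the standard deviation from 5 to 4.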


$\copyright$ 2018 – Dan Ma

Practice Problem Set 4 – Correlation Coefficient

This post provides practice problems to reinforce the concept of correlation coefficient discussed in this post in a companion blog. The post in the companion blog shows how to evaluate the covariance $\text{Cov}(X,Y)$ and the correlation coefficient $\rho$ of two continuous random variables $X$ and $Y$. It also discusses the connection between $\rho$ and the regression curve $E[Y \lvert X=x]$ and the least squares regression line.

The structure of the practice problems found here is quite simple. Given a joint density function for a pair of random variables $X$ and $Y$ (with an appropriate region in the xy-plane as support), determine the following four pieces of information.

• The covariance $\text{Cov}(X,Y)$
• The correlation coefficient $\rho$
• The regression curve $E[Y \lvert X=x]$
• The least squares regression line $y=a+b x$

The least squares regression line is $y=a+bx$, where the slope $b$ and the y-intercept $a$ are given by:

$\displaystyle b=\rho \ \frac{\sigma_Y}{\sigma_X}$

$\displaystyle a=\mu_Y-b \ \mu_X$

where $\mu_X=E[X]$, $\sigma_X^2=Var[X]$, $\mu_Y=E[Y]$ and $\sigma_Y^2=Var[Y]$.
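As a sketch of how these quantities can be computed, take the density $f(x,y)=\frac{1}{2} \ (x+y) \ e^{-x-y}$ for $x>0$, $y>0$ (the density of Problem 4-E below). Since $\int_0^\infty t^n e^{-t} \ dt=n!$, every moment $E[X^a Y^b]$ has a closed form, so the covariance and the regression line can be found with exact fractions. The function name `moment` is made up for this illustration.

```python
from fractions import Fraction
from math import factorial

def moment(a, b):
    """E[X^a Y^b] for f(x,y) = (1/2)(x+y) e^{-x-y}, x > 0, y > 0.

    Splitting (x+y) into two terms and using the gamma integral gives
    (1/2) * [(a+1)! b! + a! (b+1)!].
    """
    return Fraction(factorial(a + 1) * factorial(b)
                    + factorial(a) * factorial(b + 1), 2)

mu_x, mu_y = moment(1, 0), moment(0, 1)
var_x = moment(2, 0) - mu_x ** 2
var_y = moment(0, 2) - mu_y ** 2
cov = moment(1, 1) - mu_x * mu_y

# Slope b = rho * sigma_Y / sigma_X, which equals Cov(X,Y) / Var[X]
b = cov / var_x
a = mu_y - b * mu_x

# rho^2 stays rational even when rho itself is not
rho_squared = cov ** 2 / (var_x * var_y)

print(cov, b, a, rho_squared)
```

Writing the slope as $\text{Cov}(X,Y)/Var[X]$ avoids square roots, so the arithmetic stays exact. The sign of $\rho$ is the sign of the covariance.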


For some of the problems, the regression curves $E[Y \lvert X=x]$ coincide with the least squares regression lines. When the regression curve is in a linear form, it coincides with the least squares regression line.

As mentioned, the practice problems are to reinforce the concepts discussed in this post.


 Practice Problem 4-A $\displaystyle f(x,y)=\frac{3}{4} \ (2-y) \ \ \ \ \ \ \ 0<x<y<2$

$\text{ }$

 Practice Problem 4-B $\displaystyle f(x,y)=\frac{1}{2} \ \ \ \ \ \ \ \ \ \ \ \ 0<x<y<2$

$\text{ }$

 Practice Problem 4-C $\displaystyle f(x,y)=\frac{1}{8} \ (x+y) \ \ \ \ \ \ \ \ \ 0<x<2, \ 0<y<2$

$\text{ }$

 Practice Problem 4-D $\displaystyle f(x,y)=\frac{1}{2 \ x^2} \ \ \ \ \ \ \ \ \ \ \ \ \ 0<x<2, \ 0<y<x^2$

$\text{ }$

 Practice Problem 4-E $\displaystyle f(x,y)=\frac{1}{2} \ (x+y) \ e^{-x-y} \ \ \ \ \ \ \ \ \ \ \ \ \ x>0, \ y>0$

$\text{ }$

 Practice Problem 4-F $\displaystyle f(x,y)=\frac{3}{8} \ x \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 0<y<x<2$

$\text{ }$

 Practice Problem 4-G $\displaystyle f(x,y)=\frac{1}{2} \ xy \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 0<y<x<2$

$\text{ }$

 Practice Problem 4-H $\displaystyle f(x,y)=\frac{3}{14} \ (xy +x) \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 0<y<x<2$

$\text{ }$

 Practice Problem 4-I $\displaystyle f(x,y)=\frac{3}{32} \ (x+y) \ xy \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ 0<x<2, \ 0<y<2$

$\text{ }$

 Practice Problem 4-J $\displaystyle f(x,y)=\frac{3y}{(x+1)^6} \ \ e^{-y/(x+1)} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ x>0, \ y>0$

$\text{ }$

 Practice Problem 4-K $\displaystyle f(x,y)=\frac{y}{(x+1)^4} \ \ e^{-y/(x+1)} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ x>0, \ y>0$ For this problem, only work on the regression curve $E[Y \lvert X=x]$. Note that $E[X]$ and $Var[X]$ do not exist.


4-A
• $\displaystyle \text{Cov}(X,Y)=\frac{1}{10}$
• $\displaystyle \rho=\sqrt{\frac{1}{3}}=0.57735$
• $\displaystyle E[Y \lvert X=x]=\frac{2 (4-3 x^2+x^3)}{3 (4- 4x+x^2)}=\frac{2 (2+x-x^2)}{3 (2-x)} \ \ \ \ \ 0<x<2$
• $\displaystyle y=\frac{2}{3} \ (x+1)$
4-B
• $\displaystyle \text{Cov}(X,Y)=\frac{1}{9}$
• $\displaystyle \rho=\frac{1}{2}$
• $\displaystyle E[Y \lvert X=x]=1+\frac{1}{2} x \ \ \ \ \ 0<x<2$
• $\displaystyle y=1+\frac{1}{2} x$
4-C
• $\displaystyle \text{Cov}(X,Y)=-\frac{1}{36}$
• $\displaystyle \rho=-\frac{1}{11}$
• $\displaystyle E[Y \lvert X=x]=\frac{x+\frac{4}{3}}{x+1} \ \ \ \ \ 0<x<2$
• $\displaystyle y=\frac{14}{11}-\frac{1}{11} x$
4-D
• $\displaystyle \text{Cov}(X,Y)=\frac{1}{3}$
• $\displaystyle \rho=\frac{1}{2} \ \sqrt{\frac{15}{7}}=0.7319$
• $\displaystyle E[Y \lvert X=x]=\frac{x^2}{2} \ \ \ \ \ 0<x<2$
• $\displaystyle y=-\frac{1}{3}+ x$
4-E
• $\displaystyle \text{Cov}(X,Y)=-\frac{1}{4}$
• $\displaystyle \rho=-\frac{1}{7}=-0.1429$
• $\displaystyle E[Y \lvert X=x]=\frac{x+2}{x+1} \ \ \ \ \ x>0$
• $\displaystyle y=\frac{12}{7}-\frac{1}{7} x$
4-F
• $\displaystyle \text{Cov}(X,Y)=\frac{3}{40}$
• $\displaystyle \rho=\frac{\sqrt{3}}{\sqrt{19}}=0.3974$
• $\displaystyle E[Y \lvert X=x]=\frac{x}{2} \ \ \ \ \ 0<x<2$
• $\displaystyle y=\frac{x}{2}$
4-G
• $\displaystyle \text{Cov}(X,Y)=\frac{16}{225}$
• $\displaystyle \rho=\frac{4}{\sqrt{66}}=0.4924$
• $\displaystyle E[Y \lvert X=x]=\frac{2}{3} x \ \ \ \ \ 0<x<2$
• $\displaystyle y=\frac{2}{3} x$
4-H
• $\displaystyle \text{Cov}(X,Y)=\frac{298}{3675}$
• $\displaystyle \rho=\frac{149}{3 \sqrt{12259}}=0.4486$
• $\displaystyle E[Y \lvert X=x]=\frac{x (2x+3)}{3x+6} \ \ \ \ \ 0<x<2$
• $\displaystyle y=-\frac{2}{41}+\frac{149}{246} x$
4-I
• $\displaystyle \text{Cov}(X,Y)=-\frac{1}{144}$
• $\displaystyle \rho=-\frac{5}{139}=-0.03597$
• $\displaystyle E[Y \lvert X=x]=\frac{4x+6}{3x+4} \ \ \ \ \ 0<x<2$
• $\displaystyle y=\frac{204}{139}-\frac{5}{139} x$
4-J
• $\displaystyle \text{Cov}(X,Y)=\frac{3}{2}$
• $\displaystyle \rho=\frac{1}{\sqrt{3}}=0.57735$
• $\displaystyle E[Y \lvert X=x]=2 (x+1) \ \ \ \ \ x>0$
• $\displaystyle y=2 (x+1)$
4-K
• $\displaystyle E[Y \lvert X=x]=2 (x+1) \ \ \ \ \ x>0$
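
The note in Problem 4-K that $E[X]$ does not exist can be checked with a short calculation. The following is a sketch that uses the gamma integral $\displaystyle \int_0^\infty y^n \ e^{-y/\theta} \ dy=n! \ \theta^{n+1}$ with $\theta=x+1$.

$\displaystyle f_X(x)=\int_0^\infty \frac{y}{(x+1)^4} \ e^{-y/(x+1)} \ dy=\frac{(x+1)^2}{(x+1)^4}=\frac{1}{(x+1)^2} \ \ \ \ \ x>0$

$\displaystyle E[Y \lvert X=x]=\frac{1}{f_X(x)} \int_0^\infty y \ f(x,y) \ dy=\frac{1}{(x+1)^2} \int_0^\infty y^2 \ e^{-y/(x+1)} \ dy=\frac{2 \ (x+1)^3}{(x+1)^2}=2 (x+1)$

$\displaystyle E[X]=\int_0^\infty \frac{x}{(x+1)^2} \ dx=\infty$

The last integral diverges because the integrand behaves like $1/x$ for large $x$. Without $E[X]$ and $Var[X]$, the covariance, the correlation coefficient and the least squares line are not defined, while the regression curve $E[Y \lvert X=x]$ is still available.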


$\copyright$ 2018 – Dan Ma