Practice Problem Set 4 – Correlation Coefficient

This post provides practice problems to reinforce the concept of the correlation coefficient discussed in a post in a companion blog. That post shows how to evaluate the covariance \text{Cov}(X,Y) and the correlation coefficient \rho of two continuous random variables X and Y. It also discusses the connection between \rho, the regression curve E[Y \lvert X=x] and the least squares regression line.

The structure of the practice problems found here is quite simple. Given a joint density function for a pair of random variables X and Y (with an appropriate region in the xy-plane as support), determine the following four pieces of information.

  • The covariance \text{Cov}(X,Y)
  • The correlation coefficient \rho
  • The regression curve E[Y \lvert X=x]
  • The least squares regression line y=a+b x

The least squares regression line is y=a+bx, where the slope b and the y-intercept a are given by:

    \displaystyle b=\rho \ \frac{\sigma_Y}{\sigma_X}

    \displaystyle a=\mu_Y-b \ \mu_X

where \mu_X=E[X], \sigma_X^2=Var[X], \mu_Y=E[Y] and \sigma_Y^2=Var[Y].
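
As a rough numerical cross-check of these formulas, the following sketch approximates the moments of a joint density with a midpoint rule on a rectangle and then applies b=\rho \sigma_Y / \sigma_X and a=\mu_Y-b \mu_X. The function name, the grid size and the choice of the density from Problem 4-C as the test case are illustrative assumptions, not part of the original post.

```python
import math

def regression_summary(f, x_max, y_max, n=200):
    """Midpoint-rule estimate of Cov, rho and the least squares line
    for a joint density f supported on (0, x_max) x (0, y_max).
    (Hypothetical helper; n controls the grid resolution.)"""
    hx, hy = x_max / n, y_max / n
    ex = ey = exx = eyy = exy = 0.0
    for i in range(n):
        x = (i + 0.5) * hx
        for j in range(n):
            y = (j + 0.5) * hy
            w = f(x, y) * hx * hy      # probability mass of the grid cell
            ex += x * w
            ey += y * w
            exx += x * x * w
            eyy += y * y * w
            exy += x * y * w
    var_x = exx - ex ** 2
    var_y = eyy - ey ** 2
    cov = exy - ex * ey
    rho = cov / math.sqrt(var_x * var_y)
    b = rho * math.sqrt(var_y / var_x)  # slope: rho * sigma_Y / sigma_X
    a = ey - b * ex                     # intercept: mu_Y - b * mu_X
    return cov, rho, a, b

# Density of Problem 4-C: f(x,y) = (x + y) / 8 on 0<x<2, 0<y<2
cov, rho, a, b = regression_summary(lambda x, y: (x + y) / 8, 2.0, 2.0)
print(cov, rho, a, b)
```

For a triangular support such as 0<x<y<2, the same idea works if the density function returns 0 outside the support, although a finer grid is then advisable because of the boundary.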


For some of the problems, the regression curve E[Y \lvert X=x] turns out to be linear in x. Whenever the regression curve is linear, it coincides with the least squares regression line.

As mentioned, the practice problems are meant to reinforce the concepts discussed in the companion post.


Practice Problem 4-A
    \displaystyle f(x,y)=\frac{3}{4} \ (2-y) \ \ \ \ \ \ \ 0<x<y<2

\text{ }

Practice Problem 4-B
    \displaystyle f(x,y)=\frac{1}{2} \ \ \ \ \ \ \ \ \ \ \ \ 0<x<y<2

\text{ }

Practice Problem 4-C
    \displaystyle f(x,y)=\frac{1}{8} \ (x+y) \ \ \ \ \ \ \ \ \ 0<x<2, \ 0<y<2

\text{ }

Practice Problem 4-D
    \displaystyle f(x,y)=\frac{1}{2 \ x^2} \ \ \ \  \ \ \ \ \ \ \ \ \ 0<x<2, \ 0<y<x^2

\text{ }

Practice Problem 4-E
    \displaystyle f(x,y)=\frac{1}{2} \ (x+y) \ e^{-x-y} \ \ \ \  \ \ \ \ \ \ \ \ \ x>0, \ y>0

\text{ }

Practice Problem 4-F
    \displaystyle f(x,y)=\frac{3}{8} \ x \ \ \ \ \ \ \  \ \ \ \ \ \ \ \ \ 0<y<x<2

\text{ }

Practice Problem 4-G
    \displaystyle f(x,y)=\frac{1}{2} \ xy \ \ \ \ \ \ \  \ \ \ \ \ \ \ \ \ 0<y<x<2

\text{ }

Practice Problem 4-H
    \displaystyle f(x,y)=\frac{3}{14} \ (xy +x) \ \ \ \ \ \ \  \ \ \ \ \ \ \ \ \ 0<y<x<2

\text{ }

Practice Problem 4-I
    \displaystyle f(x,y)=\frac{3}{32} \ (x+y) \ xy \ \ \ \ \ \ \  \ \ \ \ \ \ \ \ \ 0<x<2, \ 0<y<2

\text{ }

Practice Problem 4-J
    \displaystyle f(x,y)=\frac{3y}{(x+1)^6} \ \ e^{-y/(x+1)} \ \ \ \ \ \ \  \ \ \ \ \ \ \ \ \ x>0, \ y>0

\text{ }

Practice Problem 4-K
    \displaystyle f(x,y)=\frac{y}{(x+1)^4} \ \ e^{-y/(x+1)} \ \ \ \ \ \ \  \ \ \ \ \ \ \ \ \ x>0, \ y>0

For this problem, only work on the regression curve E[Y \lvert X=x]. Note that E[X] and Var[X] do not exist.


Answers
4-A
  • \displaystyle \text{Cov}(X,Y)=\frac{1}{10}
  • \displaystyle \rho=\sqrt{\frac{1}{3}}=0.57735
  • \displaystyle E[Y \lvert X=x]=\frac{2 (4-3 x^2+x^3)}{3 (4- 4x+x^2)}=\frac{2 (2+x-x^2)}{3 (2-x)}=\frac{2}{3} \ (x+1) \ \ \ \ \ 0<x<2
  • \displaystyle y=\frac{2}{3} \ (x+1)
4-B
  • \displaystyle \text{Cov}(X,Y)=\frac{1}{9}
  • \displaystyle \rho=\frac{1}{2}
  • \displaystyle E[Y \lvert X=x]=1+\frac{1}{2} x \ \ \ \ \ 0<x<2
  • \displaystyle y=1+\frac{1}{2} x
4-C
  • \displaystyle \text{Cov}(X,Y)=-\frac{1}{36}
  • \displaystyle \rho=-\frac{1}{11}
  • \displaystyle E[Y \lvert X=x]=\frac{x+\frac{4}{3}}{x+1} \ \ \ \ \ 0<x<2
  • \displaystyle y=\frac{14}{11}-\frac{1}{11} x
4-D
  • \displaystyle \text{Cov}(X,Y)=\frac{1}{3}
  • \displaystyle \rho=\frac{1}{2} \ \sqrt{\frac{15}{7}}=0.7319
  • \displaystyle E[Y \lvert X=x]=\frac{x^2}{2} \ \ \ \ \ 0<x<2
  • \displaystyle y=-\frac{1}{3}+ x
4-E
  • \displaystyle \text{Cov}(X,Y)=-\frac{1}{4}
  • \displaystyle \rho=-\frac{1}{7}=-0.1429
  • \displaystyle E[Y \lvert X=x]=\frac{x+2}{x+1} \ \ \ \ \ x>0
  • \displaystyle y=\frac{12}{7}-\frac{1}{7} x
4-F
  • \displaystyle \text{Cov}(X,Y)=\frac{3}{40}
  • \displaystyle \rho=\frac{3}{\sqrt{57}}=0.3974
  • \displaystyle E[Y \lvert X=x]=\frac{x}{2} \ \ \ \ \ 0<x<2
  • \displaystyle y=\frac{x}{2}
4-G
  • \displaystyle \text{Cov}(X,Y)=\frac{16}{225}
  • \displaystyle \rho=\frac{4}{\sqrt{66}}=0.4924
  • \displaystyle E[Y \lvert X=x]=\frac{2}{3} x \ \ \ \ \ 0<x<2
  • \displaystyle y=\frac{2}{3} x
4-H
  • \displaystyle \text{Cov}(X,Y)=\frac{298}{3675}
  • \displaystyle \rho=\frac{149}{3 \sqrt{12259}}=0.4486
  • \displaystyle E[Y \lvert X=x]=\frac{x (2x+3)}{3x+6}  \ \ \ \ \ 0<x<2
  • \displaystyle y=-\frac{2}{41}+\frac{149}{246} x
4-I
  • \displaystyle \text{Cov}(X,Y)=-\frac{1}{144}
  • \displaystyle \rho=-\frac{5}{139}=-0.03597
  • \displaystyle E[Y \lvert X=x]=\frac{4x+6}{3x+4}  \ \ \ \ \ 0<x<2
  • \displaystyle y=\frac{204}{139}-\frac{5}{139} x
4-J
  • \displaystyle \text{Cov}(X,Y)=\frac{3}{2}
  • \displaystyle \rho=\frac{1}{\sqrt{3}}=0.57735
  • \displaystyle E[Y \lvert X=x]=2 (x+1) \ \ \ \ \ x>0
  • \displaystyle y=2 (x+1)
4-K
  • \displaystyle E[Y \lvert X=x]=2 (x+1) \ \ \ \ \ x>0

Daniel Ma mathematics

Dan Ma math

Daniel Ma probability

Dan Ma probability

Daniel Ma statistics

Dan Ma statistics

\copyright 2018 – Dan Ma


Practice Problem Set 3 – The Big 3 Discrete Distributions

This post presents exercises on the big 3 discrete distributions – binomial, Poisson and negative binomial, reinforcing the concepts discussed in several blog posts (here and here).

A previous problem set on Poisson and gamma is found here.

A previous problem set on Poisson distribution is found here.

\text{ }

Practice Problem 3-A

The amount of damage from an auto collision accident is modeled by an exponential distribution with mean 5. Ten unrelated auto collision claims are examined by an insurance adjuster.

  • What is the probability that five of the claims will have damages exceeding the mean damage amount?
  • What is the probability that at most two of the claims will have damages exceeding the mean damage amount?
  • What is the expected number of claims with damages exceeding the mean?
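
A minimal sketch of how a problem of this type can be set up: the probability that an exponential loss exceeds its own mean is e^{-1}, and the number of such claims among ten independent claims is then binomial. The helper name `binom_pmf` and the variable names are illustrative assumptions.

```python
import math

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# For an exponential distribution, P(X > mean) = e^{-1} regardless of the mean.
p_exceed = math.exp(-1)
n_claims = 10

prob_exactly_five = binom_pmf(5, n_claims, p_exceed)
prob_at_most_two = sum(binom_pmf(k, n_claims, p_exceed) for k in range(3))
expected_count = n_claims * p_exceed  # binomial mean n * p

print(prob_exactly_five, prob_at_most_two, expected_count)
```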

\text{ }

Practice Problem 3-B

The jackpot of the Powerball lottery can sometimes be in the hundreds of millions of dollars. The odds of winning the jackpot are one in 292 million. However, there are prizes other than the jackpot (some of the lesser prizes are $100 and $7). The odds of winning a prize in Powerball are one in 24.87. A Powerball player buys one ticket every month for a year.

  • What is the probability of winning at least one prize?
  • What is the probability of winning at least two prizes?
  • What is the probability of winning at least three prizes?
  • What is the probability of winning at least four prizes?

See here for the calculation of Powerball winning odds.

\text{ }

Practice Problem 3-C

According to a poll conducted by AAA, 94% of teen drivers acknowledge the dangers of texting and driving but 35% admitted to doing it anyway. In a random sample of 20 teen drivers,

  • what is the probability that exactly five of the teen drivers text while driving?
  • what is the probability that more than five of the teen drivers text while driving?

\text{ }

Practice Problem 3-D

According to aviation statistics in the commercial airline industry, approximately one in 225 checked bags is lost. A business executive will be flying frequently next year and will be checking 100 bags during that year.

  • Determine the probability that the business executive will not lose any bags during his travel.
  • Determine the probability that the business executive will lose one or two bags during his travel.

\text{ }

Practice Problem 3-E

A large group of insured drivers are classified as high risk and low risk. About 10% of the drivers in this group are considered high risk while the remaining 90% are considered low risk drivers. The number of auto accidents in a year for a high risk driver in this group is modeled by a binomial distribution with mean 0.8 and variance 0.64. The number of auto accidents in a year for a low risk driver is modeled by a binomial distribution with mean 0.4 and variance 0.36. Suppose that an insured driver is randomly selected from this group.

  • What is the probability that the randomly selected insured driver will have no auto accident in the next policy year?
  • What is the probability that the randomly selected insured driver will have more than 1 auto accident in the next policy year?
  • What is the variance of the number of auto accidents for the randomly selected insured driver in the next policy year?

\text{ }

Practice Problem 3-F
The number of TV sets of a particular brand sold in a given week at an electronic store has a Poisson distribution with mean 4.

  • Determine the probability that the store will sell more than 4 TV sets next week.
  • Determine the minimum number of TV sets that the manager should order for the next week so that the probability of having more sales than available TV sets is less than 0.10.
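
The stocking question amounts to finding the smallest n with P(X>n) below a threshold. A sketch of that search follows, with `poisson_cdf` and `smallest_stock` as hypothetical helper names.

```python
import math

def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson random variable with the given mean."""
    return sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k + 1))

def smallest_stock(mean, max_shortfall_prob):
    """Smallest stock level n such that P(demand > n) < max_shortfall_prob."""
    stock = 0
    while 1 - poisson_cdf(stock, mean) >= max_shortfall_prob:
        stock += 1
    return stock

prob_more_than_four = 1 - poisson_cdf(4, 4.0)
min_order = smallest_stock(4.0, 0.10)
print(prob_more_than_four, min_order)
```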

\text{ }

Practice Problem 3-G

The number of vacant rooms in a given night in a certain hotel follows a Poisson distribution with mean 1.75. Three travelers without reservation walk into the hotel one night. Assume that they do not know each other.

  • Determine the probability that rooms are available for all three travelers.
  • Given that rooms are available for all three travelers, determine the probability that the hotel will still be able to accommodate three more travelers without reservation who also do not know each other.

\text{ }

Practice Problem 3-H

Cars running the red light arrive at a busy intersection according to a Poisson process with the rate of 0.5 per hour.

  • What is the probability that there will be at most 4 cars running the red light in a 5-hour period?
  • After a period with no cars running the red light, what is the probability that it will take more than 90 minutes to see another car running the red light?
  • After a period with no cars running the red light, what is the probability that it will take more than 90 minutes to see two cars running the red light?

\text{ }

Practice Problem 3-I
Consider a roulette wheel consisting of 38 numbers – 1 through 36, and 0 and 00. A player always bets on the block of numbers 1 through 12, winning whenever one of these 12 numbers comes up.

  • Determine the probability that the player will lose his first 5 bets.
  • Determine the probability that the first win of the player will occur on the 5th bet.
  • Determine the probability that the first win of the player will occur no later than the 5th bet.

\text{ }

Practice Problem 3-J
Suppose that roughly 10% of the adult population have type II diabetes. A researcher wishes to find 3 adult patients who are diabetic. Suppose that the researcher evaluates one patient at a time until finding three diabetic patients.

  • What is the probability that the third diabetic patient is found after evaluating 10 or 11 patients?

\text{ }

Practice Problem 3-K

For any high risk insured driver, the number of auto accidents in a year has a negative binomial distribution with mean 1.6 and variance 2.88. One such insured driver is selected at random and observed for one year. What is the probability that the insured driver will have more than one accident?
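
When a negative binomial distribution is specified through its mean and variance, the parameters have to be recovered first. The sketch below assumes the parametrization that counts failures before the r-th success, so that the mean is r(1-p)/p and the variance is r(1-p)/p^2; the function names are illustrative.

```python
import math

def nb_params(mean, var):
    """Recover (r, p) from mean = r(1-p)/p and variance = r(1-p)/p^2."""
    p = mean / var
    r = mean * mean / (var - mean)
    return r, p

def nb_pmf(k, r, p):
    """P(X = k) for the negative binomial; lgamma allows a non-integer r."""
    log_coeff = math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
    return math.exp(log_coeff + r * math.log(p) + k * math.log(1 - p))

r, p = nb_params(1.6, 2.88)
prob_more_than_one = 1 - nb_pmf(0, r, p) - nb_pmf(1, r, p)
print(r, p, prob_more_than_one)
```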

\text{ }

Practice Problem 3-L

A discrete probability distribution has the following probability function.

    \displaystyle P(X=k)=\frac{(k+1) (k+2)}{2} \ \biggl(\frac{4}{9} \biggr)^3 \ \biggl(\frac{5}{9} \biggr)^k \ \ \ \ \ k=0,1,2,3,\cdots

Determine the mean and variance of X.

\text{ }

Practice Problem 3-M

A large pool of insureds is made up of two subgroups – low risk (75% of the pool) and high risk (25% of the pool). The number of claims in a year for each insured can be any non-negative integer 0, 1, 2, 3, … The number of claims in a year for each insured in the low risk group has a negative binomial distribution with mean 0.5 and variance 0.625. The number of claims in a year for each insured in the high risk group has a negative binomial distribution with mean 0.75 and variance 0.9375.

If a randomly selected insured from the pool is observed to have one claim in a given year, what is the probability that the insured is a high risk insured?

\text{ }

Practice Problem 3-N

An American roulette wheel has 38 areas – numbers 1 through 36 and 0 and 00. A player bets on odd numbers (1, 3, 5, 7, …, 35). He leaves the game when he wins 5 bets.

  • What is the expected number of bets the player will lose before winning 5 bets?
  • What is the probability that the player will lose 5 bets before leaving the game?


Answers
3-A
  • 0.171367
  • 0.2247123
  • 10 e^{-1}=3.67879
3-B
  • 0.388889698
  • 0.081670443
  • 0.010882596
  • 0.000997406
3-C
  • 0.127199186
  • 0.754604255
3-D
  • 0.640545556
  • 0.348149413
3-E
  • 0.63145
  • 0.06515
  • 0.4024
3-F
  • \displaystyle 1-\frac{103}{3} e^{-4}=0.371163065
  • min is 7 since P(X>6)=0.11 and P(X>7)=0.0511
3-G
  • \displaystyle 1-4.28125 e^{-1.75}=0.256030305
  • 0.035673762
3-H
  • 0.891178019
  • \displaystyle e^{-0.75}=0.472366553
  • \displaystyle 1.75 e^{-0.75}=0.826641467
3-I
  • \displaystyle (13/19)^5=0.1499507895
  • \displaystyle (6/19) (13/19)^4=0.0692
  • \displaystyle 1-(13/19)^5=0.85
3-J
  • 0.036589713
3-K
  • \displaystyle \frac{304}{729}=0.417
3-L
  • 3.75
  • 8.4375
3-M
  • \displaystyle \frac{0.0768}{0.2688}=0.2857
3-N
  • \displaystyle \frac{50}{9}=5.56
  • 0.1213520403


\copyright 2018 – Dan Ma

Practice Problem Set 2 – Poisson and Gamma

This post presents exercises on the gamma distribution and the Poisson distribution, reinforcing the concepts discussed in a blog post in a companion blog and in blog posts in another blog. Because the shape parameter of the gamma distribution in the following problems is a positive integer, the gamma probabilities can be computed from Poisson probabilities: for shape parameter n and rate parameter \beta, P(X>t)=\sum_{k=0}^{n-1} e^{-\beta t} (\beta t)^k / k!.
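
The link between the two distributions can be sketched in a few lines: for a gamma distribution with positive integer shape n and rate \beta, the event X>t is the same as seeing at most n-1 Poisson events in a time window of length t. The function name `gamma_survival` is an assumption, and the example numbers are the shape 5 and rate 1/5 read off from the density in Problem 2-A below.

```python
import math

def gamma_survival(t, shape, rate):
    """P(X > t) for a gamma distribution with positive-integer shape,
    computed as P(Poisson(rate * t) <= shape - 1)."""
    mean = rate * t
    return sum(math.exp(-mean) * mean ** k / math.factorial(k) for k in range(shape))

# Density of Problem 2-A: shape 5, rate 1/5, evaluated at t = 20
prob_2a = gamma_survival(20, 5, 1 / 5)
print(prob_2a)
```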

\text{ }

Practice Problem 2-A
Suppose that X is the useful working life (in years) of a brand new industrial machine. The following is the probability density function of X.

    \displaystyle f(t)=\frac{1}{24} \ \biggl(\frac{1}{5}\biggr)^5 \ t^4 \ e^{-\frac{1}{5} \ t} \ \ \ \ \ \ t>0

A manufacturing plant has just purchased such a new machine. Determine the probability that the machine will be in operation for the next 20 years.

\text{ }

Practice Problem 2-B

The annual rainfall (in inches) in Western Colorado is modeled by a distribution with the following cumulative distribution function.

    \displaystyle F(x)=1-e^{-0.2 x}-0.2 \ x \ e^{-0.2 x}-0.02 \ x^2 \ e^{-0.2 x} \ \ \ \ \ \ \ 0<x<\infty

In a year in which the annual rainfall is above 20 inches, determine the probability that the annual rainfall is above 30 inches.

\text{ }

Practice Problem 2-C

The annual rainfall (in inches) in Western Colorado is modeled by a distribution with the following cumulative distribution function.

    \displaystyle F(x)=1-e^{-0.2 x}-0.2 \ x \ e^{-0.2 x}-0.02 \ x^2 \ e^{-0.2 x} \ \ \ \ \ \ \ 0<x<\infty

Determine the mean and the variance of the annual rainfall in this region.

\text{ }

Practice Problem 2-D

The repair time (in hours) for an industrial machine has a gamma distribution with mean 1.5 and variance 0.75.

  • Determine the probability that a repair time exceeds 2 hours.
  • Determine the probability that a repair time is at least 5 hours given that it already exceeds 2 hours.

\text{ }

Practice Problem 2-E

Customers arrive at a jewelry store according to a Poisson process with an average rate of 2.5 per hour. The store opens its doors at 9 AM.

  • What is the probability that the first customer arrives at the store before 11 AM?
  • What is the probability that the first two customers arrive at the store before 11 AM?
  • What is the probability that the first three customers arrive at the store before 11 AM?
  • What is the probability that the first five customers arrive at the store before 11 AM?

\text{ }

Practice Problem 2-F
In a certain city, telephone calls to the 911 emergency response system arrive at an average rate of two every 3 minutes. Suppose that the arrivals of 911 calls are modeled by a Poisson process.

  • What is the probability of four or more calls arriving in a 5-minute period?
  • A call to the 911 system just ended. What is the probability that the wait time for the next 6 calls is more than 10 minutes?

\text{ }

Practice Problem 2-G

Customers arrive at a store at an average rate of 30 per hour according to a Poisson process.

  • Determine the probability that at least 5 customers arrive at the store in the first 10 minutes after opening on a given day.
  • Determine the probability that, after opening, it will take more than 15 minutes for the 6th customer to arrive at the store.

\text{ }

Practice Problem 2-H
Cars arrive at a highway tollbooth at an average rate of 6 cars every 10 minutes according to a Poisson process.

  • Determine the probability that the toll collector will have to wait longer than 20 minutes before collecting the seventh toll.
  • A toll collector just starts his shift. Determine the median time (in minutes) until he collects the first toll.

\text{ }

Practice Problem 2-I
The number of claims in a year for an insured from a large group of insureds is modeled by the following model.

    \displaystyle P(X=x \lvert \lambda)=\frac{e^{-\lambda} \lambda^x}{x!} \ \ \ \ \ x=0,1,2,3,\cdots

The parameter \lambda varies from insured to insured. However, it is known that \lambda is modeled by the following density function.

    \displaystyle g(\lambda)=32 \ \lambda^2 \ e^{-4 \lambda} \ \ \ \ \ \ \lambda>0

An insured is randomly selected from the large group of insureds. Determine the mean and the variance of the number of claims for this insured in the next year.

\text{ }

Practice Problem 2-J
Suppose that the number of accidents per year per driver in a large group of insured drivers follows a Poisson distribution with mean \lambda. The parameter \lambda follows a gamma distribution with mean 0.9 and variance 0.27.

Given that a randomly selected insured has at least one claim, determine the probability that the insured has more than one claim.
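
A sketch of one way to organize this kind of calculation, assuming the standard result that a Poisson count whose mean follows a gamma distribution with shape \alpha and scale \theta is negative binomial with r=\alpha and p=1/(1+\theta). The function and variable names are illustrative.

```python
import math

def mixed_poisson_pmf(k, shape, scale):
    """P(N = k) when N | lam ~ Poisson(lam) and lam ~ gamma(shape, scale).
    The unconditional distribution is negative binomial with p = 1/(1+scale)."""
    p = 1 / (1 + scale)
    log_coeff = math.lgamma(k + shape) - math.lgamma(k + 1) - math.lgamma(shape)
    return math.exp(log_coeff + shape * math.log(p) + k * math.log(1 - p))

# Gamma parameters from mean = shape * scale and variance = shape * scale^2
mean, var = 0.9, 0.27
scale = var / mean
shape = mean / scale

p0 = mixed_poisson_pmf(0, shape, scale)
p1 = mixed_poisson_pmf(1, shape, scale)
prob_more_than_one_given_some = (1 - p0 - p1) / (1 - p0)
print(prob_more_than_one_given_some)
```

The last line divides P(N>1) by P(N \ge 1), which is what distinguishes the conditional probability from the unconditional one.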

\text{ }

Practice Problem 2-K

Customers arrive at a shop according to a Poisson process. The waiting time (in minutes) until the 5th customer is modeled by the following density function.

    \displaystyle f(t)=324 \ t^4 \ e^{-6 \ t} \ \ \ \ \ \ t>0

  • Determine the mean and variance of the time until the 6th customer after the opening of the shop on a given day.
  • Determine the probability that the wait for the 7th customer is longer than 2 minutes.


Answers
2-A
  • \displaystyle \frac{103}{3} e^{-4}=0.6288
2-B
  • \displaystyle \frac{25}{13} e^{-2}=0.2603
2-C
  • mean = 15
  • variance = 75
2-D
  • \displaystyle P(X>2)=13 e^{-4}
  • \displaystyle P(X>5 \lvert X>2)=\frac{61}{13} \ e^{-6}=0.01163
2-E
  • \displaystyle 1-e^{-5}=0.99326
  • \displaystyle 1-6 e^{-5}=0.95957
  • \displaystyle 1-18.5 e^{-5}=0.87535
  • \displaystyle 1-\frac{1569}{24} e^{-5}=0.55951
2-F
  • \displaystyle 1-\frac{1301}{81} e^{-10/3}=0.4270
  • \displaystyle \frac{197789}{729} e^{-20/3}=0.3453
2-G
  • \displaystyle 1-\frac{1569}{24} e^{-5}=0.5595
  • \displaystyle \frac{52383.28125}{120} e^{-7.5}=0.241436
2-H
  • \displaystyle 7457.8 e^{-12}=0.04582
  • \displaystyle \frac{\text{ln}(0.5)}{-0.6}=1.15525 min
2-I
  • mean = 0.75
  • variance = 0.9375
2-J
  • \displaystyle \frac{P(X>1)}{P(X \ge 1)}=\frac{0.6561}{1.3 \ (1.3^3-1)}=\frac{0.6561}{1.5561}=0.4216
2-K
  • mean = 1, variance = 1/6
  • \displaystyle 7457.8 e^{-12}=0.04582


\copyright 2018 – Dan Ma

Practice Problem Set 1 – Order Statistics

This post presents exercises on order statistics, reinforcing the concepts discussed in two blog posts in a companion blog on mathematical statistics.

The first blog post from the companion blog is an introduction to order statistics. That post presents the probability distributions of the order statistics, both individually and jointly. The second post presents basic examples illustrating how to calculate the order statistics.

\text{ }

Practice Problem 1-A
A random sample of size 4 is drawn from a population that has a uniform distribution on the interval (0,5). The resulting order statistics are X_{(1)}, X_{(2)}, X_{(3)} and X_{(4)}.

Determine the cumulative distribution function (CDF) of the 3rd order statistic X_{(3)}. Evaluate the probability P(X_{(3)}>2).
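
The CDF of an order statistic can be written as a binomial tail: X_{(k)} \le x exactly when at least k of the n sample values are at most x. A minimal sketch of that idea, with hypothetical function names, applied to the uniform population of this problem:

```python
import math

def order_stat_cdf(x, k, n, base_cdf):
    """P(X_(k) <= x) for a random sample of size n from a population with CDF base_cdf."""
    p = base_cdf(x)
    return sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))

def uniform_0_5_cdf(x):
    """CDF of the uniform distribution on (0, 5)."""
    return min(max(x / 5, 0.0), 1.0)

prob_third_exceeds_2 = 1 - order_stat_cdf(2, 3, 4, uniform_0_5_cdf)
print(prob_third_exceeds_2)
```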

\text{ }

Practice Problem 1-B

As in Problem 1-A, a random sample of size 4 is drawn from a population that has a uniform distribution on the interval (0,5). The resulting order statistics are X_{(1)}, X_{(2)}, X_{(3)} and X_{(4)}.

Evaluate the conditional probability P(X_{(4)}>4 \lvert X_{(3)} >2).

\text{ }

Practice Problem 1-C

The random sample X_1,X_2,\cdots,X_9 of size 9 is drawn from a population that has a uniform distribution on the interval (0,10).

Evaluate the mean and variance of the 7th order statistic X_{(7)}.

\text{ }

Practice Problem 1-D

Suppose that the random sample X_1,X_2 of size 2 is drawn from a population that has an exponential distribution with mean 10. Let X_{(1)} be the sample minimum and let X_{(2)} be the sample maximum.

Evaluate the conditional probability P(X_{(1)}<5 \lvert X_{(2)} <10).

\text{ }

Practice Problem 1-E

Suppose that X_1,X_2,X_3 is a random sample drawn from an exponential distribution with mean 10. The sample median here is the 2nd order statistic X_{(2)}.

Evaluate the probability that the sample median is between 5 and 10.

\text{ }

Practice Problem 1-F
Suppose that X_1,X_2,X_3 is a random sample drawn from a uniform distribution on the interval (0,2). Let R=X_{(3)}-X_{(1)} be the sample range.

  • Determine the CDF of the sample range R.
  • Evaluate the mean and variance of the sample range R.

\text{ }

Practice Problem 1-G
As in Problem 1-F, X_1,X_2,X_3 is a random sample drawn from a uniform distribution on the interval (0,2). Let R=X_{(3)}-X_{(1)} be the sample range. The following relationship relates the variance of the sample range R with the variances and covariance of X_{(1)} and X_{(3)}.

    Var(R)=Var(X_{(1)})+Var(X_{(3)})-2 \ Cov(X_{(1)},X_{(3)})

  • Evaluate Cov(X_{(1)},X_{(3)}) using the above relationship.
  • Evaluate the correlation coefficient \rho of X_{(1)} and X_{(3)}.
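
The relationship above can be rearranged to solve for the covariance once the three variances are known. The sketch below uses exact fractions and relies on two standard facts that are assumptions relative to this post: for a sample of size n from a uniform distribution on (0,c), X_{(k)}/c has a beta distribution with parameters k and n-k+1, and R/c has a beta distribution with parameters n-1 and 2.

```python
from fractions import Fraction

def beta_variance(a, b):
    """Variance of a beta(a, b) distribution: ab / ((a+b)^2 (a+b+1))."""
    return Fraction(a * b, (a + b) ** 2 * (a + b + 1))

n, c = 3, 2                                  # sample size, upper end of (0, c)
var_min = c ** 2 * beta_variance(1, n)       # Var(X_(1))
var_max = c ** 2 * beta_variance(n, 1)       # Var(X_(3))
var_range = c ** 2 * beta_variance(n - 1, 2) # Var(R)

# Rearranged: Cov = (Var(X_(1)) + Var(X_(3)) - Var(R)) / 2
cov = (var_min + var_max - var_range) / 2

# Here var_min == var_max, so sqrt(var_min * var_max) equals var_min exactly.
rho = cov / var_min
print(cov, rho)
```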

\text{ }

Practice Problem 1-H
Suppose that X_1,X_2,X_3 is a random sample drawn from a uniform distribution on the interval (0,5). Let X_{(1)},X_{(2)} and X_{(3)} be the resulting order statistics.

  • Determine the conditional density function of X_{(3)} given that X_{(1)}=x, X_{(2)}=y for all 0<x<y<5.
  • What is the distribution indicated by the conditional density function?
  • Evaluate the conditional mean E(X_{(3)} \lvert  X_{(1)}=x, X_{(2)}=y) and the conditional variance Var(X_{(3)} \lvert  X_{(1)}=x, X_{(2)}=y).

\text{ }

Practice Problem 1-I
Suppose that X_1,X_2,X_3 is a random sample drawn from an exponential distribution with mean \theta. Let X_{(1)},X_{(2)} and X_{(3)} be the resulting order statistics.

  • Determine the conditional density function of X_{(3)} given that X_{(1)}=x, X_{(2)}=y for all 0<x<y<\infty.
  • What is the distribution indicated by the conditional density function?
  • Evaluate the conditional mean E(X_{(3)} \lvert  X_{(1)}=x, X_{(2)}=y) and the conditional variance Var(X_{(3)} \lvert  X_{(1)}=x, X_{(2)}=y).

\text{ }

Practice Problem 1-J

Suppose that X_1,X_2,X_3 is a random sample drawn from a continuous distribution with density function f(x)=2x where 0<x<1. Let the resulting order statistics be X_{(1)}, X_{(2)} and X_{(3)} where X_{(1)} is the sample minimum, X_{(2)} is the sample median and X_{(3)} is the sample maximum.

  • Evaluate the mean and variance of the sample minimum X_{(1)}.
  • Evaluate the mean and variance of the sample maximum X_{(3)}.

\text{ }

Practice Problem 1-K

As in Problem 1-J, suppose that X_1,X_2,X_3 is a random sample drawn from a continuous distribution with density function f(x)=2x where 0<x<1. Let the resulting order statistics be X_{(1)}, X_{(2)} and X_{(3)} where X_{(1)} is the sample minimum, X_{(2)} is the sample median and X_{(3)} is the sample maximum.

  • Evaluate the covariance between X_{(1)} and X_{(3)}.
  • Evaluate the correlation between X_{(1)} and X_{(3)}.


Answers
1-A
  • \displaystyle F_{X_{(3)}}(x)=\frac{20}{625} \ x^3-\frac{3}{625} \ x^4
  • \displaystyle \frac{513}{625}=0.8208
1-B
  • \displaystyle \frac{337}{513}=0.65692
1-C
  • E(X_{(7)})=7
  • \displaystyle Var(X_{(7)})=\frac{21}{11}=1.909
1-D
  • \displaystyle P(X_{(2)}<10)=(1-e^{-1})^2
  • \displaystyle P(X_{(1)}<5 \lvert X_{(2)}<10)=\frac{1-3 e^{-1}+2 e^{-1.5}}{(1-e^{-1})^2}=0.8575
1-E
  • \displaystyle 3 \ (e^{-1}-e^{-2})-2 \ (e^{-1.5}-e^{-3})=0.3509
1-F
  • \displaystyle F_R(r)=\frac{3}{4} \ r^2-\frac{1}{4} \ r^3
  • \displaystyle E(R)=1
  • \displaystyle Var(R)=\frac{1}{5}
1-G
  • \displaystyle Cov(X_{(1)},X_{(3)})=\frac{1}{20}
  • \displaystyle \rho=\frac{1}{3}
1-H
  • For 0<x<y<5, \displaystyle f_{X_{(3)} \lvert X_{(1)}=x,X_{(2)}=y}(z \lvert x, y)=\frac{1}{5-y} \ \ \ \ \ y<z<5
  • This is a uniform distribution on (y,5)
  • \displaystyle E(X_{(3)} \lvert X_{(1)}=x,X_{(2)}=y)=\frac{1}{2} \ (y+5)
  • \displaystyle Var(X_{(3)} \lvert X_{(1)}=x,X_{(2)}=y)=\frac{(5-y)^2}{12}
1-I
  • For 0<x<y<\infty, \displaystyle f_{X_{(3)} \lvert X_{(1)}=x,X_{(2)}=y}(z \lvert x, y)=\frac{\frac{1}{\theta} e^{-z/\theta}}{e^{-y/\theta}} \ \ \ \ \ y<z<\infty
  • This is an exponential distribution conditional on z>y
  • \displaystyle E(X_{(3)} \lvert X_{(1)}=x,X_{(2)}=y)=y+\theta
  • \displaystyle Var(X_{(3)} \lvert X_{(1)}=x,X_{(2)}=y)=\theta^2
1-J
  • \displaystyle E(X_{(1)})=\frac{16}{35}
  • \displaystyle Var(X_{(1)})=\frac{201}{4900}
  • \displaystyle E(X_{(3)})=\frac{6}{7}
  • \displaystyle Var(X_{(3)})=\frac{3}{196}
1-K
  • \displaystyle Cov(X_{(1)},X_{(3)})=\frac{2}{245}
  • \displaystyle \rho=\frac{8}{\sqrt{603}}=0.326


\copyright 2018 – Dan Ma

Basic exercises for lognormal distribution

This post presents exercises on the lognormal distribution. These exercises are to reinforce the basic properties discussed in this companion blog post.

Additional resources: another discussion of lognormal, a concise summary of lognormal and a problem set on lognormal.

_____________________________________________________________________________________

Exercises

Exercise 1
Let X be a normal random variable with mean 6.5 and standard deviation 0.8. Consider the random variable Y=e^X. What is the probability P(800 \le Y \le 1000)?

\text{ }

Exercise 2
Suppose Y follows a lognormal distribution with parameters \mu=1 and \sigma=1. Let Y_1=1.25 Y. Determine the following:

  • The probability that Y_1 exceeds 1.
  • The 40th percentile of Y_1.
  • The 80th percentile of Y_1.

\text{ }

Exercise 3
Suppose that Y follows a lognormal distribution with parameters \mu=4 and \sigma=0.9. Compute the mean, second moment, variance, third moment and fourth moment.

\text{ }

Exercise 4
Let Y be the same lognormal distribution as in Exercise 3. Use the results in Exercise 3 to compute the coefficient of variation, coefficient of skewness and the kurtosis.
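
A sketch of how Exercises 3 and 4 fit together, using the lognormal raw moment formula E(Y^k)=e^{k \mu+k^2 \sigma^2/2} and then converting raw moments into central moments. The function name is an illustrative assumption.

```python
import math

def lognormal_raw_moment(k, mu, sigma):
    """E(Y^k) for a lognormal distribution with parameters mu and sigma."""
    return math.exp(k * mu + 0.5 * (k * sigma) ** 2)

mu, sigma = 4.0, 0.9
m1, m2, m3, m4 = (lognormal_raw_moment(k, mu, sigma) for k in (1, 2, 3, 4))

var = m2 - m1 ** 2
sd = math.sqrt(var)

cv = sd / m1                                                   # coefficient of variation
third_central = m3 - 3 * m1 * m2 + 2 * m1 ** 3                 # E[(Y - mean)^3]
fourth_central = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
skewness = third_central / sd ** 3
kurtosis = fourth_central / var ** 2
print(cv, skewness, kurtosis)
```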

\text{ }

Exercise 5
Given the following facts about a lognormal distribution:

  • The lower quartile (i.e. the 25th percentile) is 1000.
  • The upper quartile (i.e. the 75th percentile) is 4000.

Determine the mean and variance of the given lognormal distribution.

\text{ }

Exercise 6
Suppose that a random variable Y follows a lognormal distribution with mean 149.157 and variance 223.5945. Determine the probability P(Y>150).

\text{ }

Exercise 7
Suppose that a random variable Y follows a lognormal distribution with mean 1200 and median 1000. Determine the probability P(Y>1300).
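
For problems that specify a lognormal distribution indirectly, the parameters can be backed out first: the median is e^{\mu} and the mean is e^{\mu+\sigma^2/2}. The sketch below does this and then evaluates a tail probability with the exact standard normal CDF, so the result may differ in the third decimal place from an answer obtained with a normal table and a rounded z-value. The function names are illustrative.

```python
import math
from statistics import NormalDist

def lognormal_from_mean_median(mean, median):
    """Recover (mu, sigma) from median = e^mu and mean = e^(mu + sigma^2 / 2)."""
    mu = math.log(median)
    sigma = math.sqrt(2 * math.log(mean / median))
    return mu, sigma

def lognormal_survival(y, mu, sigma):
    """P(Y > y) for a lognormal distribution with parameters mu and sigma."""
    return 1 - NormalDist().cdf((math.log(y) - mu) / sigma)

mu, sigma = lognormal_from_mean_median(1200, 1000)
prob_exceeds_1300 = lognormal_survival(1300, mu, sigma)
print(mu, sigma, prob_exceeds_1300)
```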

\text{ }

Exercise 8
Customers of a very popular restaurant usually have to wait in line for a table. Suppose that the wait time Y (in minutes) for a table follows a lognormal distribution with parameters \mu=3.5 and \sigma=0.10. Concerned about long wait time, the restaurant owner improves the wait time by expanding the facility and hiring more staff. As a result, the wait time for a table is cut by half. After the restaurant expansion,

  • what is the probability distribution of the wait time for a table?
  • what is the probability that a customer will have to wait more than 20 minutes for a table?

\text{ }
_____________________________________________________________________________________


Answers

Exercise 1

  • 0.1040

\text{ }

Exercise 2

  • 0.8888
  • 2.6462
  • 7.8707

\text{ }

Exercise 3

  • E(Y)=e^{4.405}
  • E(Y^2)=e^{9.62}
  • E(Y^3)=e^{15.645}
  • E(Y^4)=e^{22.48}
  • Var(Y)=(e^{0.81}-1) \ e^{8.81}

\text{ }

Exercise 4

  • \displaystyle \text{CV}= 1.117098
  • \displaystyle \gamma_1= 4.74533
  • \displaystyle \beta_2= 60.41075686

\text{ }

Exercise 5

  • \displaystyle E(Y)= 3415.391045
  • \displaystyle E(Y^2)= 34017449.61
  • \displaystyle Var(Y)= 22352553.62

\text{ }

Exercise 6

  • 0.4562

\text{ }

Exercise 7

  • 0.3336

\text{ }

Exercise 8

  • lognormal with \mu=3.5+\log(0.5) and \sigma=0.1 where \log is the natural logarithm.
  • 0.0294

_____________________________________________________________________________________

\copyright \ 2015 \text{ by Dan Ma}
Revised October 18, 2018

Calculating the skewness of a probability distribution

This post presents exercises on calculating the moment coefficient of skewness. These exercises are to reinforce the calculation demonstrated in this companion blog post.

For a given random variable X, Pearson’s moment coefficient of skewness (or the coefficient of skewness) is denoted by \gamma_1 and is defined as follows:

    \displaystyle \begin{aligned} \gamma_1&=\frac{E[ (X-\mu)^3 ]}{\sigma^3} \qquad \qquad (1) \\&=\frac{E(X^3)-3 \mu E(X^2)+3 \mu^2 E(X)-\mu^3}{\sigma^3} \\&=\frac{E(X^3)-3 \mu [E(X^2)-\mu E(X)]-\mu^3}{\sigma^3} \\&=\frac{E(X^3)-3 \mu \sigma^2-\mu^3}{\sigma^3} \qquad \qquad (2) \\&=\frac{E(X^3)-3 \mu \sigma^2-\mu^3}{(\sigma^2)^{\frac{3}{2}}} \qquad \qquad (3) \end{aligned}

(1) is the definition which is the ratio of the third central moment to the cube of the standard deviation. (2) and (3) are forms that may be easier to calculate. Essentially, if the first three raw moments E(X), E(X^2) and E(X^3) are calculated, then the skewness coefficient can be derived via (3). For a more detailed discussion, see the companion blog post.
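
Formula (3) translates directly into a small function of the first three raw moments. In the sketch below the function name is an assumption, and the check uses an exponential distribution with mean 1, whose raw moments are E(X^k)=k! and whose skewness is known to be 2.

```python
def skewness_from_raw_moments(m1, m2, m3):
    """Moment coefficient of skewness via formula (3):
    (E(X^3) - 3*mu*sigma^2 - mu^3) / (sigma^2)^(3/2)."""
    var = m2 - m1 ** 2
    return (m3 - 3 * m1 * var - m1 ** 3) / var ** 1.5

# Exponential distribution with mean 1: E(X) = 1, E(X^2) = 2, E(X^3) = 6
exp_skew = skewness_from_raw_moments(1.0, 2.0, 6.0)
print(exp_skew)
```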

_____________________________________________________________________________________

Practice Problems

Practice Problems 1
Let X be a random variable with density function f(x)=10 x^9 where 0<x<1. This is a beta distribution. Calculate the moment coefficient of skewness in two ways. One is to use formula (3) above. The other is to use the following formula for the skewness coefficient for beta distribution.

    \displaystyle \gamma_1=\frac{2(\beta-\alpha) \ \sqrt{\alpha+\beta+1}}{(\alpha+\beta+2) \ \sqrt{\alpha \ \beta}} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ (4)
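
The two routes can be compared numerically. The sketch below, in Python, evaluates formula (4) and also form (3) using the standard beta raw moments E(X^k)=\prod_{r=0}^{k-1} (\alpha+r)/(\alpha+\beta+r). The function names and the parameters \alpha=2, \beta=5 are hypothetical choices for illustration and are not taken from the problems.

```python
from math import sqrt


def beta_skewness(alpha, beta):
    """Skewness of a beta(alpha, beta) distribution, formula (4)."""
    return (2 * (beta - alpha) * sqrt(alpha + beta + 1)
            / ((alpha + beta + 2) * sqrt(alpha * beta)))


def beta_raw_moment(k, alpha, beta):
    """E(X^k) as the product of (alpha+r)/(alpha+beta+r) for r = 0..k-1."""
    moment = 1.0
    for r in range(k):
        moment *= (alpha + r) / (alpha + beta + r)
    return moment


def skewness_via_form_3(alpha, beta):
    """Skewness from the first three raw moments, form (3)."""
    m1, m2, m3 = (beta_raw_moment(k, alpha, beta) for k in (1, 2, 3))
    var = m2 - m1 ** 2
    return (m3 - 3 * m1 * var - m1 ** 3) / var ** 1.5


a, b = 2, 5   # hypothetical parameters, chosen only for illustration
print(beta_skewness(a, b), skewness_via_form_3(a, b))
```

The two printed values should agree up to floating point rounding.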

\text{ }

Practice Problems 2
Calculate the moment coefficient of skewness for Y=X^2 where X is as in Practice Problem 1. It will be helpful to first calculate a formula for the raw moments E(X^k) of X.
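
As a sketch of the suggested first step, the raw moments of X follow from a single integral, and the raw moments of Y=X^2 are the raw moments of X of even order:

    \displaystyle E(X^k)=\int_0^1 x^k \ 10 x^9 \ dx=\frac{10}{k+10}

    \displaystyle E(Y^j)=E(X^{2j})=\frac{10}{2j+10}=\frac{5}{j+5}

The values for j=1,2,3 can then be substituted into (3).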

\text{ }

Practice Problems 3
Let X be a random variable with density function f(x)=8 (1-x)^7 where 0<x<1. This is a beta distribution with parameters \alpha=1 and \beta=8. Calculate the moment coefficient of skewness using (4).

\text{ }

Practice Problems 4
Suppose that X follows a gamma distribution with PDF f(x)=4 x e^{-2x} where x>0.

  • Show that E(X)=1, E(X^2)=\frac{3}{2} and E(X^3)=3.
  • Use the first three raw moments to calculate the moment coefficient of skewness.
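
The three raw moments can be sanity-checked numerically before doing the integrals by hand. The Python sketch below compares the standard gamma moment formula E(X^k)=\Gamma(\alpha+k)/(\Gamma(\alpha) \ \beta^k), with shape \alpha=2 and rate \beta=2 read off from the PDF, against a crude midpoint-rule integration. The function names, the truncation point and the step count are assumptions made for illustration.

```python
from math import exp, gamma


def gamma_raw_moment(k, shape=2.0, rate=2.0):
    """Closed form E(X^k) = Gamma(shape + k) / (Gamma(shape) * rate^k)."""
    return gamma(shape + k) / (gamma(shape) * rate ** k)


def numeric_raw_moment(k, upper=40.0, steps=100_000):
    """Midpoint-rule approximation of the integral of x^k * 4x e^{-2x} over (0, upper).

    The tail beyond `upper` is ignored; it is negligible for this density.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** k * 4 * x * exp(-2 * x)
    return total * h


for k in (1, 2, 3):
    print(k, gamma_raw_moment(k), numeric_raw_moment(k))
```

The closed form also gives the higher moments E(X^k) that are useful when working with powers of X.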

\text{ }

Practice Problems 5
Calculate the moment coefficient of skewness for Y=X^2 where X is as in Practice Problem 4. It will be helpful to first calculate a formula for the raw moments E(X^k) of X.

\text{ }

Practice Problems 6
Verify the calculation of \gamma_1 and the associated calculation of Example 6 in this companion blog post.

\text{ }

Practice Problems 7
Verify the calculation of \gamma_1 and the associated calculation of Example 7 in this companion blog post.

\text{ }

Practice Problems 8
Verify the calculation of \gamma_1 and the associated calculation of Example 8 in this companion blog post.

\text{ }
_____________________________________________________________________________________


Answers

Practice Problems 1

  • \displaystyle \gamma_1=\frac{-36 \sqrt{3}}{13 \sqrt{10}}=-1.516770159

\text{ }

Practice Problems 2

  • \displaystyle \gamma_1=\frac{- \sqrt{7}}{\sqrt{5}}=-1.183215957

\text{ }

Practice Problems 3

  • \displaystyle \gamma_1=\frac{7 \sqrt{10}}{11 \sqrt{2}}=1.422952349

\text{ }

Practice Problems 4

  • \displaystyle \gamma_1=\sqrt{2}

\text{ }

Practice Problems 5

  • \displaystyle \gamma_1=\frac{138}{7 \sqrt{21}}=4.302009836

\text{ }

_____________________________________________________________________________________

\copyright \ 2015 \text{ by Dan Ma}

Practice problems for order statistics and multinomial probabilities

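This post presents exercises on calculating probabilities for order statistics using multinomial probabilities. Throughout, Y_1<Y_2<\cdots<Y_n denote the order statistics of the random sample X_1,X_2,\cdots,X_n. These exercises are to reinforce the calculation demonstrated in this blog post.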
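
The multinomial approach can be sketched in code. Split the support into cells at the cut points that appear in the event, note that Y_k<c holds exactly when at least k sample items fall below c, and add up the multinomial probabilities of all count vectors that satisfy the event. The Python helpers below are a minimal sketch; their names and the small U(0,1) example are hypothetical illustrations, not taken from the problems.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def multinomial_prob(counts, probs):
    """Probability that the cells receive exactly the given counts."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)          # exact at every step
    p = Fraction(coef)
    for c, q in zip(counts, probs):
        p *= Fraction(q) ** c
    return p


def event_prob(n, probs, condition):
    """Sum the multinomial probabilities of all count vectors meeting `condition`.

    `probs` should hold exact cell probabilities (Fractions) that sum to 1.
    """
    total = Fraction(0)
    for counts in product(range(n + 1), repeat=len(probs)):
        if sum(counts) == n and condition(counts):
            total += multinomial_prob(counts, probs)
    return total


# Illustrative event: sample of size 5 from U(0,1), P(Y_2 < 1/2 < Y_4).
# Cells are (0,1/2) and (1/2,1). The event holds when 2 or 3 items fall below 1/2.
half = Fraction(1, 2)
print(event_prob(5, [half, half], lambda c: 2 <= c[0] <= 3))   # prints 5/8
```

Using `Fraction` keeps every probability exact, which makes it easy to compare results with answers stated as fractions.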

_____________________________________________________________________________________

Practice Problems

Practice Problems 1
Draw a random sample X_1,X_2,\cdots,X_{11} of size 11 from the uniform distribution U(0,6). Calculate the following:

  • P(Y_4<2<Y_5<Y_7<4<Y_8)
  • P(Y_4<2<Y_6<Y_7<4<Y_8)

\text{ }

Practice Problems 2
Draw a random sample X_1,X_2,\cdots,X_7 of size 7 from the uniform distribution U(0,5). Calculate the probability P(Y_4<2<4<Y_7).

\text{ }

Practice Problems 3
Same setting as in Practice Problem 2. Calculate P(Y_7>4 \ | \ Y_4<2) and P(Y_7>4). Compare the conditional probability with the unconditional probability. Does the answer for P(Y_7>4 \ | \ Y_4<2) make sense in relation to P(Y_7>4)?

\text{ }

Practice Problems 4
Same setting as in Practice Problem 2. Calculate the following:

  • P(Y_4<2<Y_7<4)
  • P(2<Y_7<4 \ | \ Y_4<2)
  • P(2<Y_7<4)
  • Does the answer for P(2<Y_7<4 \ | \ Y_4<2) make sense in relation to P(2<Y_7<4)?

\text{ }

Practice Problems 5
Draw a random sample X_1,X_2,\cdots,X_6 of size 6 from the uniform distribution U(0,4). Consider the conditional distribution Y_3 \ | \ Y_5<2. Calculate the following:

  • P(Y_3 \le t \ | \ Y_5<2)
  • f_{Y_3}(t \ | \ Y_5<2)
  • E(Y_3 \ | \ Y_5<2)
  • E(Y_3)

where 0<t<2. Compare E(Y_3) and E(Y_3 \ | \ Y_5<2). Does the answer for the conditional mean make sense?

\text{ }

Practice Problems 6
Draw a random sample X_1,X_2,\cdots,X_7 of size 7 from the uniform distribution U(0,5). Calculate the following:

  • P(Y_4 > 4 \ | \ Y_2>2)
  • P(Y_4 > 4)
  • Compare the two probabilities. Does the answer for the conditional probability make sense?

\text{ }
_____________________________________________________________________________________


Answers

Practice Problems 1

  • \displaystyle \frac{11550}{177147}
  • \displaystyle \frac{18480}{177147}

\text{ }

Practice Problems 2

  • \displaystyle \frac{14448}{78125}

\text{ }

Practice Problems 3

  • \displaystyle P(Y_7>4 \ | \ Y_4<2)=\frac{14448}{22640}
  • \displaystyle P(Y_7>4)=\frac{61741}{78125}

\text{ }

Practice Problems 4

  • \displaystyle P(Y_4<2<Y_7<4)=\frac{8064}{78125}
  • \displaystyle P(2<Y_7<4 \ | \ Y_4<2)=\frac{8064}{22640}
  • \displaystyle P(2<Y_7<4)=\frac{16256}{78125}

\text{ }

Practice Problems 5

  • \displaystyle P(Y_3 \le t \ | \ Y_5<2)=\frac{-10t^6+144t^5-540t^4+640t^3}{448}
  • \displaystyle f_{Y_3}(t \ | \ Y_5<2)=\frac{-60t^5+720t^4-2160t^3+1920t^2}{448}
  • \displaystyle E(Y_3 \ | \ Y_5<2)=\frac{48}{49}
  • \displaystyle E(Y_3)=\frac{84}{49}
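
One way to check the conditional mean independently is to condition on how many sample items fall below 2. The event Y_5<2 means that 5 or 6 of the six items fall in (0,2); given that exactly m items fall in (0,2), those items behave like a random sample of size m from U(0,2), and the k-th order statistic of m draws from U(0,c) has mean c k/(m+1). The Python sketch below carries out this mixture calculation exactly; the helper name is a hypothetical choice.

```python
from fractions import Fraction


def uniform_order_stat_mean(k, m, upper):
    """Mean of the k-th smallest of m independent U(0, upper) draws."""
    return Fraction(upper * k, m + 1)


# Sample of size 6 from U(0,4): each item falls below 2 with probability 1/2.
p_five = 6 * Fraction(1, 2) ** 6        # exactly five items below 2
p_six = Fraction(1, 2) ** 6             # all six items below 2
p_cond = p_five + p_six                 # P(Y_5 < 2)

cond_mean = (p_five * uniform_order_stat_mean(3, 5, 2)
             + p_six * uniform_order_stat_mean(3, 6, 2)) / p_cond
uncond_mean = uniform_order_stat_mean(3, 6, 4)

print("P(Y_5<2) =", p_cond)
print("E(Y_3 | Y_5<2) =", cond_mean)
print("E(Y_3) =", uncond_mean)
```

The printed values can be compared with the answers listed above.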

\text{ }

Practice Problems 6

  • \displaystyle \frac{1401}{12393}
  • \displaystyle \frac{2605}{78125}
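
The answers for the sample of size 7 from U(0,5) (Practice Problems 2, 3, 4 and 6) can be checked by direct enumeration. The Python sketch below counts the items falling in (0,2), (2,4) and (4,5) as a, b and c, and sums the multinomial terms for each event. The variable and function names are assumptions made for illustration. Note that `Fraction` prints results in lowest terms, so a value such as 22640/78125 appears reduced.

```python
from fractions import Fraction
from math import comb

N = 7                  # sample size
WEIGHTS = (2, 2, 1)    # lengths of (0,2), (2,4), (4,5); cell probability = length / 5


def prob(condition):
    """Exact probability of the event described by condition(a, b, c)."""
    total = 0
    for a in range(N + 1):
        for b in range(N - a + 1):
            c = N - a - b
            if condition(a, b, c):
                ways = comb(N, a) * comb(N - a, b)
                total += ways * WEIGHTS[0] ** a * WEIGHTS[1] ** b * WEIGHTS[2] ** c
    return Fraction(total, 5 ** N)


# Y_4 < 2 means a >= 4; Y_7 > 4 means c >= 1; 2 < Y_7 < 4 means c == 0 and b >= 1.
p_y4_lt_2 = prob(lambda a, b, c: a >= 4)
p_prob2 = prob(lambda a, b, c: a >= 4 and c >= 1)             # P(Y_4<2<4<Y_7)
p_y7_gt_4 = prob(lambda a, b, c: c >= 1)
p_prob4 = prob(lambda a, b, c: a >= 4 and c == 0 and b >= 1)  # P(Y_4<2<Y_7<4)
p_y7_mid = prob(lambda a, b, c: c == 0 and b >= 1)            # P(2<Y_7<4)

# Y_2 > 2 means a <= 1; Y_4 > 4 means a + b <= 3.
p_y2_gt_2 = prob(lambda a, b, c: a <= 1)
p_y4_gt_4 = prob(lambda a, b, c: a + b <= 3)
p_prob6 = prob(lambda a, b, c: a <= 1 and a + b <= 3)

print("P(Y_4<2<4<Y_7) =", p_prob2)
print("P(Y_7>4 | Y_4<2) =", p_prob2 / p_y4_lt_2, " P(Y_7>4) =", p_y7_gt_4)
print("P(Y_4<2<Y_7<4) =", p_prob4)
print("P(2<Y_7<4 | Y_4<2) =", p_prob4 / p_y4_lt_2, " P(2<Y_7<4) =", p_y7_mid)
print("P(Y_4>4 | Y_2>2) =", p_prob6 / p_y2_gt_2, " P(Y_4>4) =", p_y4_gt_4)
```

A useful consistency check on any conditional answer is that the joint probability can never exceed the probability of either event on its own.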

\text{ }

_____________________________________________________________________________________

\copyright \ 2015 \text{ by Dan Ma}