More lognormal calculation

This post presents more calculation examples for the lognormal distribution, supplementing previous posts on the lognormal distribution. A practice problem set is found here.

A basic introduction of the lognormal distribution is found here, with an accompanying set of practice problems found here.

Additional discussion of the lognormal model is found here, using it as a model of security prices.

Lognormal Percentiles

If the mean $\mu$ and the variance $\sigma^2$ are known, the normal distribution is completely determined. Likewise, the lognormal distribution is fixed once its mean and variance are known. It is also the case that the lognormal distribution is determined by any two of its percentiles. The following example shows that given two lognormal percentiles, the parameters $\mu$ and $\sigma$ can be determined. They can then be used to calculate any other distributional quantity, such as another percentile.

Example 1
Suppose that the random variable $X$ follows a lognormal distribution such that its 90th percentile is 95.880549 and its 99th percentile is 774.87305. Determine its 95th percentile.

The normal 90th, 95th and 99th percentiles are:

$\mu+z_{0.90} \cdot \sigma$

$\mu+z_{0.95} \cdot \sigma$

$\mu+z_{0.99} \cdot \sigma$

where $z_{0.90}$ is the 90th percentile of the standard normal distribution, etc. The lognormal percentiles are obtained by raising $e$ to the normal percentiles. We first solve the problem by using table values for the standard normal percentiles. We also give the answers using the calculator TI84+.

First, we use table values $z_{0.90}=1.282$ , $z_{0.95}=1.645$ and $z_{0.99}=2.326$ . We start with the following equations.

(1) ….. $\displaystyle e^{\mu+z_{0.90} \cdot \sigma}=95.880549$

(2) ….. $\displaystyle e^{\mu+z_{0.99} \cdot \sigma}=774.87305$

Dividing the second equation by the first, we obtain:

(3) ….. $\displaystyle e^{(z_{0.99}-z_{0.90}) \cdot \sigma}=\frac{774.87305}{95.880549}$

Taking natural log of both sides of (3), we obtain:

(4) ….. $\displaystyle (z_{0.99}-z_{0.90}) \cdot \sigma=\text{Ln} \biggl( \frac{774.87305}{95.880549} \biggr)$

(5) ….. $\sigma=2.001528807$

Plugging $\sigma$ into (1), we obtain $\mu$ as follows:

(6) ….. $\displaystyle e^\mu=\frac{95.880549}{e^{z_{0.90} \cdot \sigma}}$

(7) ….. $\displaystyle \mu=\text{Ln}(95.880549)-z_{0.90} \cdot \sigma=1.997143205$

From (5) and (7), we take $\mu=2$ and $\sigma=2$ . The 95th lognormal percentile is:

(8) ….. $\displaystyle e^{2+z_{0.95} \cdot 2}=198.3434254$

To get a more precise answer, we use normal percentiles from the TI84+ calculator: $z_{0.90}=1.281551567$ , $z_{0.95}=1.644853626$ and $z_{0.99}=2.326347877$ . Going through the same series of calculations from (1) to (8), we obtain:

(9) ….. $\displaystyle e^{2+z_{0.95} \cdot 2}=198.2853692$
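The steps (1) to (8) can be sketched in a few lines of Python. This is only an illustration, not part of the original calculation: the function names `lognormal_from_percentiles` and `lognormal_percentile` are made up for this sketch, and the standard library `statistics.NormalDist` stands in for the table or the TI84+. Unlike the hand calculation, the sketch does not round $\mu$ and $\sigma$ to 2, so the 95th percentile it prints should agree with (9) only to within a small rounding difference.

```python
from math import exp, log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def lognormal_from_percentiles(p1, x1, p2, x2):
    """Recover (mu, sigma) from two lognormal percentiles.

    x1 is the 100*p1-th percentile and x2 is the 100*p2-th percentile.
    (Hypothetical helper written for this illustration.)
    """
    z1 = STD_NORMAL.inv_cdf(p1)
    z2 = STD_NORMAL.inv_cdf(p2)
    # Equations (3)-(5): take logs and subtract to isolate sigma.
    sigma = (log(x2) - log(x1)) / (z2 - z1)
    # Equations (6)-(7): back-substitute to obtain mu.
    mu = log(x1) - z1 * sigma
    return mu, sigma


def lognormal_percentile(mu, sigma, p):
    """Return the 100*p-th percentile of the lognormal distribution."""
    return exp(mu + STD_NORMAL.inv_cdf(p) * sigma)


mu, sigma = lognormal_from_percentiles(0.90, 95.880549, 0.99, 774.87305)
p95 = lognormal_percentile(mu, sigma, 0.95)
print(f"mu = {mu:.6f}, sigma = {sigma:.6f}, 95th percentile = {p95:.4f}")
```

The same two functions work for any pair of distinct percentile levels, since two equations in the two unknowns $\mu$ and $\sigma$ are all that is needed.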

Order Statistics from Lognormal Samples

We use a specific lognormal distribution to demonstrate the concept. Suppose $X$ follows a lognormal distribution with parameters $\mu=1$ and $\sigma=0.5$ . We draw a random sample $X_1,X_2,\cdots,X_{11}$ of size 11 from this population. We rank the 11 sample items from the smallest to the largest. The ranked results are labeled $Y_1<Y_2< \cdots <Y_6 < \cdots < Y_{11}$ . In this example, $Y_1$ is the sample minimum, $Y_6$ is the sample median and $Y_{11}$ is the sample maximum. In general, $Y_j$ is the $j$th order statistic where $1 \le j \le 11$ .

One important tool in learning about order statistics is their density functions and joint density functions. When the random sample is drawn from a lognormal population, the density functions of the order statistics can be derived but are not easy to work with for the purpose of calculation. However, we can still evaluate probability statements about the order statistics using either a binomial calculation (when it involves only one order statistic) or a multinomial calculation (when it involves 2 or more order statistics).

The multinomial approach we use here is discussed in this previous post. The only difference is that the random samples discussed here are drawn from the lognormal distribution. A practice problem set for the multinomial approach is found here. Another set of practice problems for order statistics is found here.

We demonstrate with examples using the random sample of size 11 drawn from the lognormal distribution with $\mu=1$ and $\sigma=0.5$ , indicated above.

Example 2
Suppose that the random variable $X$ has a lognormal distribution with parameters $\mu=1$ and $\sigma=0.5$ . A random sample $X_1,\cdots,X_{11}$ of size 11 is drawn from the population represented by $X$ . Let $Y_1,\cdots,Y_{11}$ represent the corresponding order statistics. Evaluate the following probabilities.

$P(Y_5 < 2.5)$
$P(Y_9 > 4)$

Here, $Y_5$ is the 5th order statistic, which is the fifth smallest sample item in the random sample while $Y_9$ is the ninth order statistic in the random sample. Before we evaluate these probabilities, we evaluate the probabilities $P(X < 2.5)$ and $P(X < 4)$ .

$\displaystyle \begin{aligned} P(X < 2.5)&=P[\text{Ln}(X) < \text{Ln}(2.5)] \\&=P \biggl[\frac{\text{Ln}(X)-1}{0.5} < \frac{\text{Ln}(2.5)-1}{0.5} \biggr] \\&=\Phi \biggl( \frac{\text{Ln}(2.5)-1}{0.5} \biggr) \\&=\Phi(-0.17) \\&=1-0.5675=0.4325=P \end{aligned}$

$\displaystyle \begin{aligned} P(X < 4)&=P[\text{Ln}(X) < \text{Ln}(4)] \\&=P \biggl[\frac{\text{Ln}(X)-1}{0.5} < \frac{\text{Ln}(4)-1}{0.5} \biggr] \\&=\Phi \biggl( \frac{\text{Ln}(4)-1}{0.5} \biggr) \\&=\Phi(0.77) \\&=0.7794=Q \end{aligned}$

The above probabilities $P$ and $Q$ are based on the standard normal table. We now translate the probability statement $P(Y_5 < 2.5)$ into a binomial probability. For the event $Y_5 < 2.5$ to happen, there must be at least 5 sample items $X_i$ that are less than 2.5.

$\displaystyle \begin{aligned} P(Y_5 < 2.5)&=P(\text{at least 5 } X_i < 2.5) \\&=1-P(\text{at most 4 } X_i < 2.5) \\&=1-(1-P)^{11}-11 \cdot P^1 \cdot (1-P)^{10}-55 \cdot P^2 \cdot (1-P)^9 \\& \ \ -165 \cdot P^3 \cdot (1-P)^8-330 \cdot P^4 \cdot (1-P)^7 \\&=0.556249306 \end{aligned}$

Note that $P(Y_9 > 4)=1-P(Y_9 \le 4)$ . For the event $Y_9 \le 4$ to happen, at least 9 sample items $X_i$ must be less than or equal to 4.

$\displaystyle \begin{aligned} P(Y_9 > 4)&=1-P(Y_9 \le 4)\\&=1-P(\text{at least 9 } X_i \le 4) \\&=1-55 \cdot Q^9 \cdot (1-Q)^{2}-11 \cdot Q^{10} \cdot (1-Q)^1 - Q^{11} \\&=0.450738923 \end{aligned}$

To get more precise answers, use $P=0.433520366$ and $Q=0.7801171596$ in the above calculation (from TI84+). Plugging these values in will produce the following:

$P(Y_5 < 2.5)=0.5590023377$ …..(using TI84+)

$P(Y_9 > 4)=0.4483855009$ …..(using TI84+)

There is about a 56% chance that in a random sample of size 11 from this lognormal population, the 5th smallest sample item $Y_5$ is less than 2.5. On the other hand, there is a 45% chance that the third largest sample item $Y_9$ is greater than 4.
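The binomial translation used in this example can be checked with a short Python sketch. The helper names `lognormal_cdf` and `order_stat_cdf` are hypothetical, introduced only for this illustration, and `statistics.NormalDist` replaces the normal table, so the results should be close to the TI84+ values rather than the table values.

```python
from math import comb, log
from statistics import NormalDist


def lognormal_cdf(x, mu, sigma):
    """P(X <= x) for a lognormal X with parameters mu and sigma."""
    return NormalDist().cdf((log(x) - mu) / sigma)


def order_stat_cdf(j, n, p):
    """P(Y_j <= x) for the j-th order statistic of a sample of size n.

    p is P(X <= x) for one observation. The event happens when at
    least j of the n sample items fall at or below x, which is a
    binomial tail probability.
    """
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(j, n + 1))


mu, sigma, n = 1.0, 0.5, 11
P = lognormal_cdf(2.5, mu, sigma)  # P(X < 2.5)
Q = lognormal_cdf(4.0, mu, sigma)  # P(X < 4)

prob_y5_below = order_stat_cdf(5, n, P)      # P(Y_5 < 2.5)
prob_y9_above = 1 - order_stat_cdf(9, n, Q)  # P(Y_9 > 4)
print(P, Q, prob_y5_below, prob_y9_above)
```

Because the lognormal distribution is continuous, strict and non-strict inequalities give the same probabilities, so the sketch does not distinguish between $<$ and $\le$ .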

Example 3
As discussed previously, the lognormal distribution has parameters $\mu=1$ and $\sigma=0.5$ and a random sample of size 11 is drawn. The order statistics are $Y_1,\cdots,Y_{11}$ . Evaluate the following probabilities:

$P(Y_5<2.5<Y_6<Y_8<4<Y_9)$
$P(Y_5<2.5<Y_7<Y_8<4<Y_9)$
$P(Y_5<2.5<Y_7<4<Y_9)$
$P(Y_5<2.5<4<Y_9)$

We work the first two probabilities in this example. The remaining two are done in the next example.

These probabilities involve 2 or more order statistics. One way to evaluate these probabilities is to obtain the appropriate joint density function and then integrate the joint density over an appropriate region. As mentioned earlier, the approach we take here is the multinomial approach.

Take the first probability $P(Y_5<2.5<Y_6<Y_8<4<Y_9)$ . The random sample $X_1,\cdots,X_{11}$ of size 11 can be viewed as a multinomial experiment. There are three intervals to consider: $(0, 2.5)$ , $(2.5, 4)$ and $(4, \infty)$ . Each of the 11 sample items falls into exactly one of these three intervals. The probabilities of a sample item $X_i$ falling into these intervals are:

$P(X < 2.5)=0.4325=p_1$

$P(2.5 < X < 4)=0.7794 - 0.4325=0.3469=p_2$

$P(4 < X)=1-0.4325-0.3469=0.2206=p_3$

These probabilities are calculated in Example 2. The experiment is to draw 11 observations from the lognormal distribution with $\mu=1$ and $\sigma=0.5$ . Each sample item falls into one of the intervals $(0, 2.5)$ , $(2.5, 4)$ and $(4, \infty)$ with probabilities 0.4325, 0.3469 and 0.2206, respectively. For the event $Y_5<2.5<Y_6<Y_8<4<Y_9$ to happen, exactly 5 of the sample items must fall into $(0, 2.5)$ , exactly 3 must fall into $(2.5, 4)$ and exactly 3 must fall into $(4, \infty)$ . Consider the following multinomial probability:

$\displaystyle \begin{aligned} P(Y_5<2.5<Y_6<Y_8<4<Y_9)&=\frac{11!}{5! \cdot 3! \cdot 3!} \cdot p_1^5 \cdot p_2^3 \cdot p_3^3 \\&=\frac{11!}{5! \cdot 3! \cdot 3!} \cdot 0.4325^5 \cdot 0.3469^3 \cdot 0.2206^3 \\&=9240 \cdot 0.4325^5 \cdot 0.3469^3 \cdot 0.2206^3 \\&=0.0626659971 \end{aligned}$

When the probabilities $p_1$ , $p_2$ and $p_3$ are obtained using TI84+, we have the following answer to $P(Y_5<2.5<Y_6<Y_8<4<Y_9)$ .

$P(X < 2.5)=0.433520366=p_1$ …..(using TI84+)

$P(2.5 < X < 4)=0.3465967937=p_2$ …..(using TI84+)

$P(4 < X)=1-p_1-p_2=0.2198828404=p_3$ …..(using TI84+)

$\displaystyle \begin{aligned} P(Y_5<2.5<Y_6<Y_8<4<Y_9)&=\frac{11!}{5! \cdot 3! \cdot 3!} \cdot p_1^5 \cdot p_2^3 \cdot p_3^3 \\&=0.0626277965 \end{aligned}$

We now work the second probability $P(Y_5<2.5<Y_7<Y_8<4<Y_9)$ . For the event $Y_5<2.5<Y_7<Y_8<4<Y_9$ to occur, either 5 or 6 sample items are less than 2.5 (since $Y_6$ may fall on either side of 2.5), while exactly 8 sample items are less than 4. We must account for these 2 cases.

$\displaystyle \begin{aligned} P(Y_5<2.5<Y_7<Y_8<4<Y_9)&=\frac{11!}{5! \cdot 3! \cdot 3!} \cdot p_1^5 \cdot p_2^3 \cdot p_3^3 \\& \ \ \ + \frac{11!}{6! \cdot 2! \cdot 3!} \cdot p_1^6 \cdot p_2^2 \cdot p_3^3\\&=9240 \cdot 0.4325^5 \cdot 0.3469^3 \cdot 0.2206^3 \\& \ \ \ + 4620 \cdot 0.4325^6 \cdot 0.3469^2 \cdot 0.2206^3 \\&=0.101730632 \end{aligned}$

When the probabilities $p_1$ , $p_2$ and $p_3$ are obtained using TI84+, we have the following answer to $P(Y_5<2.5<Y_7<Y_8<4<Y_9)$ .

$P(X < 2.5)=0.433520366=p_1$ …..(using TI84+)

$P(2.5 < X < 4)=0.3465967937=p_2$ …..(using TI84+)

$P(4 < X)=1-p_1-p_2=0.2198828404=p_3$ …..(using TI84+)

$\displaystyle \begin{aligned} P(Y_5<2.5<Y_7<Y_8<4<Y_9)&=0.1017949581 \end{aligned}$
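The two multinomial calculations in this example can be reproduced with the following Python sketch. It uses the table-based values of $p_1$ , $p_2$ and $p_3$ from above, and the helper `multinomial_pmf` is a hypothetical name chosen for this illustration.

```python
from math import factorial


def multinomial_pmf(counts, probs):
    """Probability that the cells receive exactly the given counts."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # each partial quotient is an exact integer
    result = float(coef)
    for c, p in zip(counts, probs):
        result *= p**c
    return result


# Table-based cell probabilities for (0, 2.5), (2.5, 4) and (4, infinity).
p1, p2 = 0.4325, 0.3469
p3 = 1 - p1 - p2
probs = (p1, p2, p3)

# P(Y_5 < 2.5 < Y_6 < Y_8 < 4 < Y_9): exactly 5, 3 and 3 items per interval.
first = multinomial_pmf((5, 3, 3), probs)

# P(Y_5 < 2.5 < Y_7 < Y_8 < 4 < Y_9): 5 or 6 items below 2.5, 8 items below 4.
second = multinomial_pmf((5, 3, 3), probs) + multinomial_pmf((6, 2, 3), probs)
print(first, second)
```

Replacing `p1` and `p2` with the TI84+ values 0.433520366 and 0.3465967937 should give results close to the more precise answers shown above.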

Example 4
We now complete Example 3 by calculating the following probabilities.

$P(Y_5<2.5<Y_7<4<Y_9)$
$P(Y_5<2.5<4<Y_9)$

Consider the probability $P(Y_5<2.5<Y_7<4<Y_9)$ . This one involves 4 cases. Out of 11 sample items, either 5 or 6 of them fall into the interval $(0,2.5)$ (either $Y_6$ falls into $(2.5,4)$ or $Y_6$ falls into $(0,2.5)$ ). For each of these scenarios, there are two cases: either $Y_8$ falls into $(2.5,4)$ or $Y_8$ falls into $(4,\infty)$ . The following shows the 4 separate calculations and the total.

$\displaystyle \frac{11!}{5! \cdot 2! \cdot 4!} \cdot p_1^5 \cdot p_2^2 \cdot p_3^4=0.0298878328$

$\displaystyle \frac{11!}{5! \cdot 3! \cdot 3!} \cdot p_1^5 \cdot p_2^3 \cdot p_3^3=0.0626659971$

$\displaystyle \frac{11!}{6! \cdot 1! \cdot 4!} \cdot p_1^6 \cdot p_2^1 \cdot p_3^4=0.0124209548$

$\displaystyle \frac{11!}{6! \cdot 2! \cdot 3!} \cdot p_1^6 \cdot p_2^2 \cdot p_3^3=0.039064635$

$P(Y_5<2.5<Y_7<4<Y_9)=0.1440394197$

When the probabilities $p_1$ , $p_2$ and $p_3$ are obtained using TI84+, we have the following answer to $P(Y_5<2.5<Y_7<4<Y_9)$ .

$P(X < 2.5)=0.433520366=p_1$ …..(using TI84+)

$P(2.5 < X < 4)=0.3465967937=p_2$ …..(using TI84+)

$P(4 < X)=1-p_1-p_2=0.2198828404=p_3$ …..(using TI84+)

$\displaystyle \begin{aligned} P(Y_5<2.5<Y_7<4<Y_9)&=0.1440174396 \end{aligned}$

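For the probability $P(Y_5<2.5<4<Y_9)$ , there are 10 cases. The number of sample items below 2.5 can be 5, 6, 7 or 8, and the number of sample items in $(2.5,4)$ can be any value that keeps the total number of items below 4 at no more than 8. The calculation is shown below.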

$\displaystyle \frac{11!}{5! \cdot 0! \cdot 6!} \cdot p_1^5 \cdot p_2^0 \cdot p_3^6$

$\displaystyle \frac{11!}{5! \cdot 1! \cdot 5!} \cdot p_1^5 \cdot p_2^1 \cdot p_3^5$

$\displaystyle \frac{11!}{5! \cdot 2! \cdot 4!} \cdot p_1^5 \cdot p_2^2 \cdot p_3^4$

$\displaystyle \frac{11!}{5! \cdot 3! \cdot 3!} \cdot p_1^5 \cdot p_2^3 \cdot p_3^3$

$\displaystyle \frac{11!}{6! \cdot 0! \cdot 5!} \cdot p_1^6 \cdot p_2^0 \cdot p_3^5$

$\displaystyle \frac{11!}{6! \cdot 1! \cdot 4!} \cdot p_1^6 \cdot p_2^1 \cdot p_3^4$

$\displaystyle \frac{11!}{6! \cdot 2! \cdot 3!} \cdot p_1^6 \cdot p_2^2 \cdot p_3^3$

$\displaystyle \frac{11!}{7! \cdot 0! \cdot 4!} \cdot p_1^7 \cdot p_2^0 \cdot p_3^4$

$\displaystyle \frac{11!}{7! \cdot 1! \cdot 3!} \cdot p_1^7 \cdot p_2^1 \cdot p_3^3$

$\displaystyle \frac{11!}{8! \cdot 0! \cdot 3!} \cdot p_1^8 \cdot p_2^0 \cdot p_3^3$

$P(Y_5<2.5<4<Y_9)=0.1723237892$

With $p_1$ , $p_2$ and $p_3$ obtained from TI84+, the answer is

$P(Y_5<2.5<4<Y_9)=0.1723606109$
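Listing the cases by hand is error prone, so here is a minimal Python sketch that enumerates them. The function names `multinomial_pmf` and `cases` are made up for this illustration, and the table-based probabilities are used. Each case is a triple $(n_1,n_2,n_3)$ giving the number of sample items in $(0,2.5)$ , $(2.5,4)$ and $(4,\infty)$ .

```python
from math import factorial


def multinomial_pmf(counts, probs):
    """Probability that the cells receive exactly the given counts."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    result = float(coef)
    for c, p in zip(counts, probs):
        result *= p**c
    return result


def cases(n, n1_values, below_b_values):
    """List (n1, n2, n3) where n1 items fall below the lower cut point
    and n1 + n2 items fall below the upper cut point."""
    found = []
    for n1 in n1_values:
        for below_b in below_b_values:
            n2 = below_b - n1
            if n2 >= 0:
                found.append((n1, n2, n - below_b))
    return found


p1, p2 = 0.4325, 0.3469
probs = (p1, p2, 1 - p1 - p2)

# Y_5 < 2.5 < Y_7 < 4 < Y_9: 5 or 6 items below 2.5, 7 or 8 items below 4.
four_cases = cases(11, range(5, 7), range(7, 9))
total_four = sum(multinomial_pmf(c, probs) for c in four_cases)

# Y_5 < 2.5 < 4 < Y_9: at least 5 items below 2.5, at most 8 items below 4.
ten_cases = cases(11, range(5, 9), range(0, 9))
total_ten = sum(multinomial_pmf(c, probs) for c in ten_cases)

print(len(four_cases), total_four)
print(len(ten_cases), total_ten)
```

The enumeration confirms the case counts of 4 and 10 used in this example.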

Example 5
Use Example 2 and Example 4 to compute the conditional probability $P(4<Y_9 \lvert Y_5<2.5)$ . Compare this with the unconditional probability $P(4<Y_9)$ .

$\displaystyle \begin{aligned} P(4<Y_9 \lvert Y_5<2.5)&=\frac{P(Y_5<2.5<4<Y_9)}{P(Y_5<2.5)} \\&=\frac{0.1723237892}{0.5562493060} \\&=0.3097959627 \end{aligned}$


$\displaystyle \begin{aligned} P(4<Y_9 \lvert Y_5<2.5)&=\frac{P(Y_5<2.5<4<Y_9)}{P(Y_5<2.5)} \\&=\frac{0.1723606109}{0.5590023377} \\&=0.3083361183 \end{aligned}$ …..(using TI84+)

From Example 2, $P(4<Y_9)$ is about 0.45. Without knowing additional information, there is a 45% chance that the 9th order statistic $Y_9$ is greater than 4. But if we know that there are at least 5 sample items smaller than 2.5, it is less likely that $Y_9$ is greater than 4 (about 31% chance).
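As a rough sanity check on the conditional probability, one can simulate the sampling experiment. The sketch below is not part of the original calculation: the function `simulate`, the number of trials and the seed are arbitrary choices made for this illustration, and the estimates carry simulation error, so they should only be expected to land near 0.31 and 0.45.

```python
import random


def simulate(trials=50_000, seed=12345):
    """Estimate P(Y_9 > 4 | Y_5 < 2.5), P(Y_5 < 2.5) and P(Y_9 > 4)
    by repeatedly drawing samples of size 11 from the lognormal
    distribution with mu = 1 and sigma = 0.5."""
    rng = random.Random(seed)
    y5_below = y9_above = both = 0
    for _ in range(trials):
        sample = sorted(rng.lognormvariate(1.0, 0.5) for _ in range(11))
        y5, y9 = sample[4], sample[8]  # 5th and 9th order statistics
        if y9 > 4:
            y9_above += 1
        if y5 < 2.5:
            y5_below += 1
            if y9 > 4:
                both += 1
    return both / y5_below, y5_below / trials, y9_above / trials


cond_prob, p_y5, p_y9 = simulate()
print(f"P(Y9 > 4 | Y5 < 2.5) ~ {cond_prob:.3f}")
print(f"P(Y5 < 2.5) ~ {p_y5:.3f}, P(Y9 > 4) ~ {p_y9:.3f}")
```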

Large Lognormal Samples

A sum of independent lognormal random variables is not lognormal. However, if the sample is large enough, we can approximate the independent sum (and hence the sample mean) using the normal distribution due to the central limit theorem. We present one example.

Example 6
For a certain insurance company, insurance claims follow a lognormal distribution with parameters $\mu=5$ and $\sigma=1$ .

Calculate the probability that a randomly selected claim is between 200 and 250.
The insurance company is to process fifty claims this month. Approximate the probability that the average claim amount is between 200 and 250.

For an individual claim $X$ , the mean is $E(X)=e^{\mu + 0.5 \sigma^2}=e^{5.5}=\mu_X$ and the second moment is $E(X^2)=e^{2 \mu + 2 \sigma^2}=e^{12}$ . This means the variance of an individual claim is $Var(X)=e^{12}-e^{11}=e^{11} (e-1)$ . Thus the standard deviation of an individual claim is $\sigma_X=\sqrt{e^{11} (e-1)}$ .
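As a reminder of where these two moments come from, both are instances of the general lognormal moment formula. Here $Z$ denotes a standard normal variable, a symbol introduced only for this note, so that $X=e^{\mu+\sigma Z}$ .

$\displaystyle E(X^k)=E \bigl[ e^{k (\mu+\sigma Z)} \bigr]=e^{k \mu + \frac{1}{2} k^2 \sigma^2}$

$\displaystyle Var(X)=E(X^2)-E(X)^2=e^{2 \mu + 2 \sigma^2}-e^{2 \mu + \sigma^2}=e^{2 \mu + \sigma^2} (e^{\sigma^2}-1)$

Setting $k=1$ and $k=2$ gives the mean and the second moment. With $\mu=5$ and $\sigma=1$ , the variance formula reduces to $e^{11} (e-1)$ .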

For a random sample of size 50, $X_1,\cdots,X_{50}$ , the sample mean is $\overline{X}=\frac{1}{50} (X_1+\cdots+X_{50})$ . The mean of the sample mean is $\mu_{\overline{X}}=e^{5.5}$ and the standard deviation of the sample mean is $\sigma_{\overline{X}}=\frac{\sigma_X}{\sqrt{50}}=\frac{1}{\sqrt{50}} \sqrt{e^{11} (e-1)}$ .

We first calculate the probability $P(200<X<250)$ .

$\displaystyle \begin{aligned} P(200<X<250)&=P[\text{Ln}(200)<\text{Ln}(X)<\text{Ln}(250)] \\&=\Phi \biggl[ \frac{\text{Ln}(250)-5}{1} \biggr]-\Phi \biggl[ \frac{\text{Ln}(200)-5}{1} \biggr] \\&=\Phi(0.52)-\Phi(0.30)\\&=0.6985-0.6179\\&=0.0806 \end{aligned}$

The above probability using TI84+ is 0.0817076952. The following calculates the probability concerning the sample mean.

$\displaystyle \begin{aligned} P(200<\overline{X}<250)&\approx \Phi \biggl[ \frac{250-\mu_{\overline{X}}}{\sigma_{\overline{X}}} \biggr]-\Phi \biggl[ \frac{200-\mu_{\overline{X}}}{\sigma_{\overline{X}}} \biggr] \\&= \Phi \biggl[ \frac{250-e^{5.5}}{\frac{1}{\sqrt{50}} \sqrt{e^{11} (e-1)}} \biggr]-\Phi \biggl[ \frac{200-e^{5.5}}{\frac{1}{\sqrt{50}} \sqrt{e^{11} (e-1)}} \biggr] \\&=\Phi(0.12)-\Phi(-0.99)\\&=0.5478-(1-0.8389)\\&=0.3867 \end{aligned}$

Note the difference in calculation between the probability for individual $X$ and for $\overline{X}$ . For the former, we take the natural log to transform $X$ into a normal variable. For the latter, $\overline{X}$ is approximately normal since we apply the central limit theorem. Thus we do not need to apply natural log on $\overline{X}$ .

With $\sigma_{\overline{X}}=45.36$ , the spread of the sample mean $\overline{X}$ is much smaller than the spread for individual $X$ where $\sigma_X=320.75$ . Thus it is much less likely for an individual observation of $X$ to fall between 200 and 250.
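The two calculations in this example can be compared side by side in a short Python sketch. It is only an illustration: the variable names are chosen for this sketch, and `statistics.NormalDist` is used in place of the normal table, so the sample mean probability will differ slightly from the table-based 0.3867 because the z-values are not rounded to two decimal places.

```python
from math import exp, log, sqrt
from statistics import NormalDist

mu, sigma, n = 5.0, 1.0, 50

# Moments of an individual lognormal claim.
mean_x = exp(mu + 0.5 * sigma**2)                 # E(X)
var_x = exp(2 * mu + 2 * sigma**2) - mean_x**2    # E(X^2) - E(X)^2
sd_x = sqrt(var_x)
sd_mean = sd_x / sqrt(n)                          # std. deviation of the sample mean

# Individual claim: take logs, then use the standard normal distribution.
std_normal = NormalDist()
p_single = std_normal.cdf((log(250) - mu) / sigma) - std_normal.cdf(
    (log(200) - mu) / sigma
)

# Sample mean: normal approximation from the central limit theorem, no logs.
approx = NormalDist(mean_x, sd_mean)
p_mean = approx.cdf(250) - approx.cdf(200)

print(f"sd of X = {sd_x:.2f}, sd of sample mean = {sd_mean:.2f}")
print(f"P(200 < X < 250) = {p_single:.4f}")
print(f"P(200 < sample mean < 250) ~ {p_mean:.4f}")
```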

Practice Problems

A practice problem set is found here.


Tagged: Central Limit Theorem, Lognormal Distribution, Order statistics
