More lognormal calculation

This post presents more calculation examples for the lognormal distribution, complementing previous posts on the lognormal distribution. A practice problem set is found here.

A basic introduction of the lognormal distribution is found here, with an accompanying set of practice problems found here.

Additional discussion of the lognormal model is found here, where it is used as a model of security prices.

Lognormal Percentiles

If the mean \mu and the variance \sigma^2 are known, the normal distribution is completely determined. Likewise, knowing the mean and the variance fixes the lognormal distribution. A lognormal distribution is also determined by two of its percentiles. The following example shows how the parameters \mu and \sigma can be recovered from two lognormal percentiles. They can then be used to calculate any other distributional quantity, such as another percentile.

Example 1
Suppose that the random variable X follows a lognormal distribution such that its 90th percentile is 95.880549 and its 99th percentile is 774.87305. Determine its 95th percentile.

The normal 90th, 95th and 99th percentiles are:

    \mu+z_{0.90} \cdot \sigma

    \mu+z_{0.95} \cdot \sigma

    \mu+z_{0.99} \cdot \sigma

where z_{0.90} is the 90th percentile of the standard normal distribution, etc. The lognormal percentiles are obtained by raising e to the normal percentiles. We first solve the problem using table values for the standard normal percentiles. We also give the answers using the TI84+ calculator.

First, we use the table values z_{0.90}=1.282, z_{0.95}=1.645 and z_{0.99}=2.326. We start with the following equations.

    (1) …..\displaystyle e^{\mu+z_{0.90} \cdot \sigma}=95.880549

    (2) …..\displaystyle e^{\mu+z_{0.99} \cdot \sigma}=774.87305

Dividing the second equation by the first, we obtain:

    (3) …..\displaystyle e^{(z_{0.99}-z_{0.90}) \cdot \sigma}=\frac{774.87305}{95.880549}

Taking natural log of both sides of (3), we obtain:

    (4) …..\displaystyle (z_{0.99}-z_{0.90}) \cdot \sigma=\text{Ln} \biggl( \frac{774.87305}{95.880549} \biggr)

    (5) …..\sigma=2.001528807

Plugging \sigma into (1), we obtain \mu as follows:

    (6) …..\displaystyle e^\mu=\frac{95.880549}{e^{z_{0.90} \cdot \sigma}}

    (7) …..\displaystyle \mu=\text{Ln}(95.880549)-z_{0.90} \cdot \sigma=1.997143205

From (5) and (7), we take \mu=2 and \sigma=2. The 95th lognormal percentile is:

    (8) …..\displaystyle e^{2+z_{0.95} \cdot 2}=198.3434254…..(using table values)

To get a more precise answer, we use normal percentiles from the TI84+ calculator: z_{0.90}=1.281551567, z_{0.95}=1.644853626 and z_{0.99}=2.326347877. Going through the same series of calculations from (1) to (8), we obtain:

    (9) …..\displaystyle e^{2+z_{0.95} \cdot 2}=198.2853692…..(using TI84+)
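The steps (1) to (9) can also be sketched in Python. This is only a sketch under stated assumptions: it uses NormalDist from the Python standard library in place of the table or the TI84+, and the function names lognormal_params_from_percentiles and lognormal_percentile are hypothetical helpers introduced for illustration, not something from the original calculation.

```python
from math import exp, log
from statistics import NormalDist  # standard library, Python 3.8+


def lognormal_params_from_percentiles(p1, x1, p2, x2):
    """Recover (mu, sigma) from two lognormal percentiles.

    x1 is the 100*p1-th percentile and x2 is the 100*p2-th percentile.
    Hypothetical helper: it mirrors equations (1) to (7) above.
    """
    std_normal = NormalDist()
    z1 = std_normal.inv_cdf(p1)
    z2 = std_normal.inv_cdf(p2)
    # Equations (3)-(5): divide the two equations and take natural log.
    sigma = (log(x2) - log(x1)) / (z2 - z1)
    # Equations (6)-(7): plug sigma back into the first equation.
    mu = log(x1) - z1 * sigma
    return mu, sigma


def lognormal_percentile(p, mu, sigma):
    """Raise e to the corresponding normal percentile."""
    return exp(mu + NormalDist().inv_cdf(p) * sigma)


mu, sigma = lognormal_params_from_percentiles(0.90, 95.880549, 0.99, 774.87305)
p95 = lognormal_percentile(0.95, mu, sigma)
print(mu, sigma, p95)
```

Unlike the worked solution, the sketch does not round \mu and \sigma to 2 before computing the 95th percentile, so the last digits may differ slightly from (9).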

Order Statistics from Lognormal Samples

We use a specific lognormal distribution to demonstrate the concept. Suppose X follows a lognormal distribution with parameters \mu=1 and \sigma=0.5. We draw a random sample X_1,X_2,\cdots,X_{11} of size 11 from this population. We rank the 11 sample items from the smallest to the largest. The ranked results are labeled Y_1<Y_2< \cdots <Y_6 < \cdots < Y_{11}. In this example, Y_1 is the sample minimum, Y_6 is the sample median and Y_{11} is the sample maximum. In general, Y_j is the jth order statistic where 1 \le j \le 11.

One important tool for studying order statistics is their density functions and joint density functions. When the random sample is drawn from a lognormal population, the density functions of the order statistics can be derived, but they are not easy to work with for the purpose of calculation. However, we can still evaluate probability statements about the order statistics using either a binomial calculation (when only one order statistic is involved) or a multinomial calculation (when 2 or more order statistics are involved).

The multinomial approach we use here is discussed in this previous post. The only difference is that the random samples discussed here are drawn from the lognormal distribution. A practice problem set for the multinomial approach is found here. Another set of practice problems for order statistics is found here.

We demonstrate with examples using the random sample of size 11 drawn from the lognormal distribution with \mu=1 and \sigma=0.5, as indicated above.

Example 2
Suppose that the random variable X has a lognormal distribution with parameters \mu=1 and \sigma=0.5. A random sample X_1,\cdots,X_{11} of size 11 is drawn from the population represented by X. Let Y_1,\cdots,Y_{11} represent the corresponding order statistics. Evaluate the following probabilities.

  • P(Y_5 < 2.5)
  • P(Y_9 > 4)

Here, Y_5 is the 5th order statistic, which is the fifth smallest sample item in the random sample while Y_9 is the ninth order statistic in the random sample. Before we evaluate these probabilities, we evaluate the probabilities P(X < 2.5) and P(X < 4).

    \displaystyle \begin{aligned} P(X < 2.5)&=P[\text{Ln}(X) < \text{Ln}(2.5)] \\&=P \biggl[\frac{\text{Ln}(X)-1}{0.5} < \frac{\text{Ln}(2.5)-1}{0.5} \biggr] \\&=\Phi \biggl( \frac{\text{Ln}(2.5)-1}{0.5} \biggr) \\&=\Phi(-0.17) \\&=1-0.5675=0.4325=P \end{aligned}

    \displaystyle \begin{aligned} P(X < 4)&=P[\text{Ln}(X) < \text{Ln}(4)] \\&=P \biggl[\frac{\text{Ln}(X)-1}{0.5} < \frac{\text{Ln}(4)-1}{0.5} \biggr] \\&=\Phi \biggl( \frac{\text{Ln}(4)-1}{0.5} \biggr) \\&=\Phi(0.77) \\&=0.7794=Q \end{aligned}

The above probabilities P and Q are based on the standard normal table. We now translate the probability statement P(Y_5 < 2.5) into a binomial probability. For the event Y_5 < 2.5 to happen, there must be at least 5 sample items X_i that are less than 2.5.

    \displaystyle \begin{aligned} P(Y_5 < 2.5)&=P(\text{at least 5 } X_i < 2.5) \\&=1-P(\text{at most 4 } X_i < 2.5) \\&=1-(1-P)^{11}-11 \cdot P^1 \cdot (1-P)^{10}-55 \cdot P^2 \cdot (1-P)^9 \\& \ \ -165 \cdot P^3 \cdot (1-P)^8-330 \cdot P^4 \cdot (1-P)^7 \\&=0.556249306  \end{aligned}

Note that P(Y_9 > 4)=1-P(Y_9 \le 4). For the event Y_9 \le 4 to happen, at least 9 sample items X_i must be less than or equal to 4.

    \displaystyle \begin{aligned} P(Y_9 > 4)&=1-P(Y_9 \le 4)\\&=1-P(\text{at least 9 } X_i \le 4) \\&=1-55 \cdot Q^9 \cdot (1-Q)^{2}-11 \cdot Q^{10} \cdot (1-Q)^1 - Q^{11}  \\&=0.450738923  \end{aligned}

To get more precise answers, use P=0.433520366 and Q=0.7801171596 in the above calculation (from TI84+). Plugging these values in will produce the following:

    P(Y_5 < 2.5)=0.5590023377…..(using TI84+)

    P(Y_9 > 4)=0.4483855009…..(using TI84+)

There is about a 56% chance that in a random sample of size 11 from this lognormal population, the 5th smallest sample item Y_5 is less than 2.5. On the other hand, there is a 45% chance that the third largest sample item Y_9 is greater than 4.
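The binomial translation in Example 2 can be sketched in Python as follows. The sketch assumes NormalDist from the standard library as a stand-in for the TI84+, and the helper names lognormal_cdf and binom_at_least are hypothetical, chosen only for this illustration.

```python
from math import comb, log
from statistics import NormalDist  # standard library, Python 3.8+


def lognormal_cdf(x, mu, sigma):
    """P(X <= x) for a lognormal X: standardize Ln(x) and apply Phi."""
    return NormalDist().cdf((log(x) - mu) / sigma)


def binom_at_least(k, n, p):
    """P(at least k successes in n independent trials with success probability p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


n, mu, sigma = 11, 1.0, 0.5
P = lognormal_cdf(2.5, mu, sigma)  # P(X < 2.5)
Q = lognormal_cdf(4.0, mu, sigma)  # P(X < 4)

# Y_5 < 2.5 happens exactly when at least 5 sample items are below 2.5.
prob_y5_below = binom_at_least(5, n, P)
# Y_9 > 4 is the complement of "at least 9 sample items are at most 4".
prob_y9_above = 1 - binom_at_least(9, n, Q)
print(prob_y5_below, prob_y9_above)
```

Summing the upper tail directly, rather than subtracting the lower tail from 1 as in the hand calculation, gives the same probability for Y_5 and keeps the helper reusable for both events.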

Example 3
As discussed previously, the lognormal distribution has parameters \mu=1 and \sigma=0.5 and a random sample of size 11 is drawn. The order statistics are Y_1,\cdots,Y_{11}. Evaluate the following probabilities:

  • P(Y_5<2.5<Y_6<Y_8<4<Y_9)
  • P(Y_5<2.5<Y_7<Y_8<4<Y_9)
  • P(Y_5<2.5<Y_7<4<Y_9)
  • P(Y_5<2.5<4<Y_9)

We work the first two probabilities in this example. The remaining two are done in the next example.

These probabilities involve 2 or more order statistics. One way to evaluate these probabilities is to obtain the appropriate joint density function and then integrate the joint density over an appropriate region. As mentioned earlier, the approach we take here is the multinomial approach.

Take the first probability P(Y_5<2.5<Y_6<Y_8<4<Y_9). The random sample X_1,\cdots,X_{11} of size 11 can be viewed as a multinomial experiment. There are three intervals to consider – (0, 2.5), (2.5, 4) and (4, \infty). Each of the 11 sample items falls into exactly one of these three intervals. The probabilities of a sample item X_i falling into these intervals are:

    P(X < 2.5)=0.4325=p_1

    P(2.5 < X < 4)=0.7794 - 0.4325=0.3469=p_2

    P(4 < X)=1-0.4325-0.3469=0.2206=p_3

These probabilities are calculated in Example 2. The experiment is to sample from the lognormal distribution with \mu=1 and \sigma=0.5 11 times. Each random sample item falls into one of these intervals (0, 2.5), (2.5, 4) and (4, \infty) with probabilities 0.4325, 0.3469 and 0.2206, respectively. For the event Y_5<2.5<Y_6<Y_8<4<Y_9 to happen, 5 of the sample items must fall into (0, 2.5), 3 of the sample items must fall into (2.5, 4) and 3 of the sample items must fall into (4, \infty). Consider the following multinomial probability:

    \displaystyle \begin{aligned} P(Y_5<2.5<Y_6<Y_8<4<Y_9)&=\frac{11!}{5! \cdot 3! \cdot 3!} \cdot p_1^5 \cdot p_2^3 \cdot p_3^3 \\&=\frac{11!}{5! \cdot 3! \cdot 3!} \cdot 0.4325^5 \cdot 0.3469^3 \cdot 0.2206^3 \\&=9240 \cdot 0.4325^5 \cdot 0.3469^3 \cdot 0.2206^3  \\&=0.0626659971  \end{aligned}

When the probabilities p_1, p_2 and p_3 are obtained using TI84+, we have the following answer to P(Y_5<2.5<Y_6<Y_8<4<Y_9).

    P(X < 2.5)=0.433520366=p_1…..(using TI84+)

    P(2.5 < X < 4)=0.3465967937=p_2…..(using TI84+)

    P(4 < X)=1-p_1-p_2=0.2198828404=p_3…..(using TI84+)

    \displaystyle \begin{aligned} P(Y_5<2.5<Y_6<Y_8<4<Y_9)&=\frac{11!}{5! \cdot 3! \cdot 3!} \cdot p_1^5 \cdot p_2^3 \cdot p_3^3   \\&=0.0626277965  \end{aligned}

We now work the second probability P(Y_5<2.5<Y_7<Y_8<4<Y_9). For the event Y_5<2.5<Y_7<Y_8<4<Y_9 to occur, either 5 or 6 sample items are less than 2.5. We must account for these 2 cases.

    \displaystyle \begin{aligned} P(Y_5<2.5<Y_7<Y_8<4<Y_9)&=\frac{11!}{5! \cdot 3! \cdot 3!} \cdot p_1^5 \cdot p_2^3 \cdot p_3^3 \\& \ \ \ + \frac{11!}{6! \cdot 2! \cdot 3!} \cdot p_1^6 \cdot p_2^2 \cdot p_3^3\\&=9240 \cdot 0.4325^5 \cdot 0.3469^3 \cdot 0.2206^3 \\& \ \ \ + 4620 \cdot 0.4325^6 \cdot 0.3469^2 \cdot 0.2206^3  \\&=0.101730632  \end{aligned}

When the probabilities p_1, p_2 and p_3 are obtained using TI84+, we have the following answer to P(Y_5<2.5<Y_7<Y_8<4<Y_9).

    P(X < 2.5)=0.433520366=p_1…..(using TI84+)

    P(2.5 < X < 4)=0.3465967937=p_2…..(using TI84+)

    P(4 < X)=1-p_1-p_2=0.2198828404=p_3…..(using TI84+)

    \displaystyle \begin{aligned} P(Y_5<2.5<Y_7<Y_8<4<Y_9)&=0.1017949581  \end{aligned}
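The two multinomial calculations in this example can be sketched in Python. As before, NormalDist from the standard library is assumed as a substitute for the TI84+, and multinomial_prob is a hypothetical helper written for this illustration.

```python
from math import factorial, log
from statistics import NormalDist  # standard library, Python 3.8+


def multinomial_prob(counts, probs):
    """Probability that the cells receive exactly the given counts."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # stays an exact integer at every step
    result = float(coef)
    for c, p in zip(counts, probs):
        result *= p**c
    return result


mu, sigma = 1.0, 0.5
std_normal = NormalDist()
cdf_25 = std_normal.cdf((log(2.5) - mu) / sigma)
cdf_4 = std_normal.cdf((log(4.0) - mu) / sigma)
# Cell probabilities for (0, 2.5), (2.5, 4) and (4, infinity).
probs = (cdf_25, cdf_4 - cdf_25, 1 - cdf_4)

# Y_5 < 2.5 < Y_6 < Y_8 < 4 < Y_9: exactly 5, 3 and 3 items per interval.
first = multinomial_prob((5, 3, 3), probs)
# Y_5 < 2.5 < Y_7 < Y_8 < 4 < Y_9: Y_6 may fall on either side of 2.5.
second = first + multinomial_prob((6, 2, 3), probs)
print(first, second)
```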

Example 4
We now complete Example 3 by calculating the following probabilities.

  • P(Y_5<2.5<Y_7<4<Y_9)
  • P(Y_5<2.5<4<Y_9)

Consider the probability P(Y_5<2.5<Y_7<4<Y_9). This one involves 4 cases. Out of 11 sample items, either 5 or 6 of them fall into the interval (0,2.5) (either Y_6 falls into (0,2.5) or Y_6 falls into (2.5,4)). For each of these scenarios, there are two cases – either Y_8 falls into (2.5,4) or Y_8 falls into (4,\infty). The following shows the 4 separate calculations and the total.

    \displaystyle \frac{11!}{5! \cdot 2! \cdot 4!} \cdot p_1^5 \cdot p_2^2 \cdot p_3^4=0.0298878328

    \displaystyle \frac{11!}{5! \cdot 3! \cdot 3!} \cdot p_1^5 \cdot p_2^3 \cdot p_3^3=0.0626659971

    \displaystyle \frac{11!}{6! \cdot 1! \cdot 4!} \cdot p_1^6 \cdot p_2^1 \cdot p_3^4=0.0124209548

    \displaystyle \frac{11!}{6! \cdot 2! \cdot 3!} \cdot p_1^6 \cdot p_2^2 \cdot p_3^3=0.039064635

    P(Y_5<2.5<Y_7<4<Y_9)=0.1440394197

When the probabilities p_1, p_2 and p_3 are obtained using TI84+, we have the following answer to P(Y_5<2.5<Y_7<4<Y_9).

    P(X < 2.5)=0.433520366=p_1…..(using TI84+)

    P(2.5 < X < 4)=0.3465967937=p_2…..(using TI84+)

    P(4 < X)=1-p_1-p_2=0.2198828404=p_3…..(using TI84+)

    \displaystyle \begin{aligned} P(Y_5<2.5<Y_7<4<Y_9)&=0.1440174396  \end{aligned}

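For the probability P(Y_5<2.5<4<Y_9), there are even more cases. The event occurs when at least 5 sample items fall into (0,2.5) and at least 3 sample items fall into (4,\infty), which gives 10 cases. The calculation is shown below.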

    \displaystyle \frac{11!}{5! \cdot 0! \cdot 6!} \cdot p_1^5 \cdot p_2^0 \cdot p_3^6

    \displaystyle \frac{11!}{5! \cdot 1! \cdot 5!} \cdot p_1^5 \cdot p_2^1 \cdot p_3^5

    \displaystyle \frac{11!}{5! \cdot 2! \cdot 4!} \cdot p_1^5 \cdot p_2^2 \cdot p_3^4

    \displaystyle \frac{11!}{5! \cdot 3! \cdot 3!} \cdot p_1^5 \cdot p_2^3 \cdot p_3^3

    \displaystyle \frac{11!}{6! \cdot 0! \cdot 5!} \cdot p_1^6 \cdot p_2^0 \cdot p_3^5

    \displaystyle \frac{11!}{6! \cdot 1! \cdot 4!} \cdot p_1^6 \cdot p_2^1 \cdot p_3^4

    \displaystyle \frac{11!}{6! \cdot 2! \cdot 3!} \cdot p_1^6 \cdot p_2^2 \cdot p_3^3

    \displaystyle \frac{11!}{7! \cdot 0! \cdot 4!} \cdot p_1^7 \cdot p_2^0 \cdot p_3^4

    \displaystyle \frac{11!}{7! \cdot 1! \cdot 3!} \cdot p_1^7 \cdot p_2^1 \cdot p_3^3

    \displaystyle \frac{11!}{8! \cdot 0! \cdot 3!} \cdot p_1^8 \cdot p_2^0 \cdot p_3^3

    P(Y_5<2.5<4<Y_9)=0.1723237892

With p_1, p_2 and p_3 obtained from TI84+, the answer is

    P(Y_5<2.5<4<Y_9)=0.1723606109
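Listing the cases by hand is error prone, so it may help to let a short program enumerate them. The following Python sketch loops over all possible counts (n_1, n_2, n_3) for the three intervals and keeps the ones that satisfy the event. It assumes NormalDist from the standard library in place of the TI84+, and the helpers multinomial_prob and sum_cases are hypothetical names used only here.

```python
from math import factorial, log
from statistics import NormalDist  # standard library, Python 3.8+


def multinomial_prob(counts, probs):
    """Probability that the cells receive exactly the given counts."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    result = float(coef)
    for c, p in zip(counts, probs):
        result *= p**c
    return result


def sum_cases(n, probs, allowed):
    """Add up multinomial probabilities over all (n1, n2, n3) that allowed() accepts."""
    total, cases = 0.0, []
    for n1 in range(n + 1):
        for n2 in range(n - n1 + 1):
            if allowed(n1, n2):
                counts = (n1, n2, n - n1 - n2)
                cases.append(counts)
                total += multinomial_prob(counts, probs)
    return total, cases


mu, sigma, n = 1.0, 0.5, 11
std_normal = NormalDist()
cdf_25 = std_normal.cdf((log(2.5) - mu) / sigma)
cdf_4 = std_normal.cdf((log(4.0) - mu) / sigma)
probs = (cdf_25, cdf_4 - cdf_25, 1 - cdf_4)

# Y_5 < 2.5 < Y_7 < 4 < Y_9: 5 or 6 items below 2.5, and 7 or 8 items below 4.
prob_a, cases_a = sum_cases(n, probs, lambda n1, n2: 5 <= n1 <= 6 and 7 <= n1 + n2 <= 8)
# Y_5 < 2.5 < 4 < Y_9: at least 5 items below 2.5, and at most 8 items below 4.
prob_b, cases_b = sum_cases(n, probs, lambda n1, n2: n1 >= 5 and n1 + n2 <= 8)
print(len(cases_a), prob_a)
print(len(cases_b), prob_b)
```

The lists cases_a and cases_b hold the count triples, which can be compared against the 4 and 10 cases written out above.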

Example 5
Use Example 2 and Example 4 to compute the conditional probability P(4<Y_9 \lvert Y_5<2.5). Compare this with the unconditional probability P(4<Y_9).

    \displaystyle \begin{aligned} P(4<Y_9 \lvert Y_5<2.5)&=\frac{P(Y_5<2.5<4<Y_9)}{P(Y_5<2.5)} \\&=\frac{0.1723237892}{0.5562493060} \\&=0.3097959627  \end{aligned}…..(using table values)

    \displaystyle \begin{aligned} P(4<Y_9 \lvert Y_5<2.5)&=\frac{P(Y_5<2.5<4<Y_9)}{P(Y_5<2.5)} \\&=\frac{0.1723606109}{0.5590023377} \\&=0.3083361183  \end{aligned}…..(using TI84+)

From Example 2, P(4<Y_9) is about 0.45. Without knowing additional information, there is a 45% chance that the 9th order statistic Y_9 is greater than 4. But if we know that there are at least 5 sample items smaller than 2.5, it is less likely that Y_9 is greater than 4 (about 31% chance).
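As a rough sanity check on the conditional probability, we can also simulate the experiment. The Python sketch below is a Monte Carlo estimate, not part of the original calculation: it draws many samples of size 11 with random.lognormvariate from the standard library, and the function name simulate_conditional, the number of trials and the seed are arbitrary choices made for illustration.

```python
import random


def simulate_conditional(trials, seed=0):
    """Estimate P(Y_9 > 4 | Y_5 < 2.5) for samples of size 11, mu=1, sigma=0.5."""
    rng = random.Random(seed)
    given = both = 0
    for _ in range(trials):
        sample = [rng.lognormvariate(1.0, 0.5) for _ in range(11)]
        below = sum(x < 2.5 for x in sample)  # items smaller than 2.5
        above = sum(x > 4.0 for x in sample)  # items greater than 4
        if below >= 5:  # same event as Y_5 < 2.5
            given += 1
            if above >= 3:  # same event as Y_9 > 4
                both += 1
    return both / given


estimate = simulate_conditional(40_000)
print(estimate)
```

The estimate fluctuates with the seed and the number of trials, but it should land near the value of about 0.31 computed above.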

Large Lognormal Samples

A sum of independent lognormal random variables is not lognormal. However, if the sample is large enough, we can approximate the independent sum using the normal distribution, due to the central limit theorem. We present one example.

Example 6
For a certain insurance company, insurance claims follow a lognormal distribution with parameters \mu=5 and \sigma=1.

  • Calculate the probability that a randomly selected claim is between 200 and 250.
  • The insurance company is to process fifty claims this month. Approximate the probability that the average claim amount is between 200 and 250.

For an individual claim X, the mean is E(X)=e^{\mu + 0.5 \sigma^2}=e^{5.5}=\mu_X and the second moment is E(X^2)=e^{2 \mu + 2 \sigma^2}=e^{12}. This means the variance of an individual claim is Var(X)=e^{12}-e^{11}=e^{11} (e-1). Thus the standard deviation of an individual claim is \sigma_X=\sqrt{e^{11} (e-1)}.

For a random sample of size 50, X_1,\cdots,X_{50}, the sample mean is \overline{X}=\frac{1}{50} (X_1+\cdots+X_{50}). The mean of the sample mean is \mu_{\overline{X}}=e^{5.5} and the standard deviation of the sample mean is \sigma_{\overline{X}}=\frac{\sigma_X}{\sqrt{50}}=\frac{1}{\sqrt{50}} \sqrt{e^{11} (e-1)}.

We first calculate the probability P(200<X<250).

    \displaystyle \begin{aligned} P(200<X<250)&=P[\text{Ln}(200)<\text{Ln}(X)<\text{Ln}(250)] \\&=\Phi \biggl[ \frac{\text{Ln}(250)-5}{1} \biggr]-\Phi \biggl[ \frac{\text{Ln}(200)-5}{1} \biggr] \\&=\Phi(0.52)-\Phi(0.30)\\&=0.6985-0.6179\\&=0.0806 \end{aligned}

The above probability using TI84+ is 0.0817076952. The following calculates the probability concerning the sample mean.

    \displaystyle \begin{aligned} P(200<\overline{X}<250)&\approx \Phi \biggl[ \frac{250-\mu_{\overline{X}}}{\sigma_{\overline{X}}} \biggr]-\Phi \biggl[ \frac{200-\mu_{\overline{X}}}{\sigma_{\overline{X}}} \biggr]   \\&= \Phi \biggl[ \frac{250-e^{5.5}}{\frac{1}{\sqrt{50}} \sqrt{e^{11} (e-1)}} \biggr]-\Phi \biggl[ \frac{200-e^{5.5}}{\frac{1}{\sqrt{50}} \sqrt{e^{11} (e-1)}} \biggr] \\&=\Phi(0.12)-\Phi(-0.99)\\&=0.5478-(1-0.8389)\\&=0.3867 \end{aligned}

Note the difference in calculation between the probability for individual X and for \overline{X}. For the former, we take the natural log to transform X into a normal variable. For the latter, \overline{X} is approximately normal since we apply the central limit theorem. Thus we do not need to apply natural log on \overline{X}.

With \sigma_{\overline{X}}=45.36, the spread of the sample mean \overline{X} is much smaller than the spread for individual X where \sigma_X=320.75. Thus it is much less likely for an individual observation of X to fall between 200 and 250 (about 8%) than for the sample mean \overline{X} (about 39%).
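The two calculations in Example 6 can be sketched in Python. The sketch assumes NormalDist from the standard library instead of the normal table, so the sample mean probability comes out slightly different from the table-based 0.3867 because the z-values are not rounded to two decimal places. The variable names are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist  # standard library, Python 3.8+

mu, sigma, n = 5.0, 1.0, 50
std_normal = NormalDist()

# Individual claim: take natural log and standardize.
p_single = std_normal.cdf((log(250) - mu) / sigma) - std_normal.cdf((log(200) - mu) / sigma)

# Lognormal moments: E(X) = e^(mu + sigma^2/2), E(X^2) = e^(2 mu + 2 sigma^2).
mean_x = exp(mu + 0.5 * sigma**2)
var_x = exp(2 * mu + 2 * sigma**2) - mean_x**2
sd_x = sqrt(var_x)
sd_mean = sd_x / sqrt(n)

# Sample mean of 50 claims: approximately normal by the central limit theorem.
xbar = NormalDist(mean_x, sd_mean)
p_mean = xbar.cdf(250) - xbar.cdf(200)
print(p_single, sd_x, sd_mean, p_mean)
```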

Practice Problems

A practice problem set is found here.

\copyright 2020 – Dan Ma
