Consider the following problems. We work Problem 1; Problem 2 is left as an exercise.
Problem 1
Suppose that the following is the probability function of the random variable .
where .
Evaluate the mean and variance of this random variable.
_____________________________________________________________
Problem 2
Suppose that the following is the probability function of the random variable .
where is the probability function of the binomial distribution with
trials and probability of success
,
is the probability function of the binomial distribution with
trials and probability of success
, and
is the probability function of the binomial distribution with
trials and probability of success
.
Evaluate the mean and variance of this random variable.
Answers for Problem 2 are found at the end of the post.
_____________________________________________________________
Discussion of Problem 1
The probability function is the weighted average of two binomial probability functions. The weights are $\frac{1}{2}$ and $\frac{1}{2}$. Thus the probability distribution of the random variable is said to be a mixture of two binomial distributions with equal mixing weights.
One interpretation of the probability function is that the underlying phenomenon can be one of two phenomena. As an actuarial science example, suppose that a block of insurance policies is divided into two groups, roughly equal in size. One group is a low risk group. It has a low claim frequency (the probability of a policyholder in this group having a claim in a given year is
). The other group is a high risk group. It has a high claim frequency (the probability of a policyholder having a claim is
). Suppose you pick a policyholder at random from this block. What is the expected number of claims in a year from this randomly chosen insured? What is the variance of the number of claims?
Since the probability function is a weighted average of binomial probability functions, the mean and the higher raw moments (e.g. the second moment) of the mixture are the weighted averages of the corresponding binomial moments. However, as you will see below, the variance of the mixture is not the weighted average of the binomial variances.
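As a minimal sketch of this point, the following Python snippet builds an equal-weight mixture of two binomial probability functions and computes its moments by direct summation. The parameters (`n = 4`, claim probabilities `0.1` and `0.9`) are hypothetical values chosen only for illustration; they are not the values of Problem 1. The helper names `binom_pmf`, `moment` and `mixture_pmf` are likewise made up for this sketch.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(Y = x) for Y ~ binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def moment(pmf, n, k):
    """k-th raw moment of a distribution on 0..n, given its pmf."""
    return sum(x**k * pmf(x) for x in range(n + 1))

# Hypothetical parameters (NOT the values from Problem 1).
n, p_low, p_high = 4, 0.1, 0.9

def mixture_pmf(x):
    """Equal-weight mixture of the two binomial probability functions."""
    return 0.5 * binom_pmf(x, n, p_low) + 0.5 * binom_pmf(x, n, p_high)

mean = moment(mixture_pmf, n, 1)
second = moment(mixture_pmf, n, 2)
variance = second - mean**2

# Weighted average of the two binomial variances, for comparison.
avg_of_variances = 0.5 * n * p_low * (1 - p_low) + 0.5 * n * p_high * (1 - p_high)

print(mean, variance, avg_of_variances)
```

With these hypothetical inputs the mixture mean is 2 and the mixture variance is about 2.92, while the weighted average of the two binomial variances is only 0.36.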
Before we work the problem, we need some preliminary facts about binomial distributions. Suppose $Y$ has a binomial distribution with parameters $n$ and $p$. This fact is denoted by the notation $Y \sim \text{binom}(n,p)$. Then the mean and variance of $Y$ are $E[Y]=np$ and $Var[Y]=np(1-p)$, respectively. Since $Var[Y]=E[Y^2]-E[Y]^2$, it follows that:
$E[Y^2]=np(1-p)+n^2 p^2$
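The second-moment fact for the binomial distribution can be checked numerically. The sketch below compares the closed form (variance plus squared mean) with a direct summation over the probability function. The function names and the test values of the trial count and success probability are hypothetical, chosen only for illustration.

```python
from math import comb

def binomial_second_moment_direct(n, p):
    """E[Y^2] by summing x^2 * P(Y = x) over x = 0..n."""
    return sum(x**2 * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1))

def binomial_second_moment_closed(n, p):
    """E[Y^2] = Var[Y] + E[Y]^2 = n p (1 - p) + (n p)^2."""
    return n * p * (1 - p) + (n * p)**2

# Hypothetical test values, chosen only for illustration.
for n, p in [(3, 0.2), (6, 0.5), (10, 0.85)]:
    direct = binomial_second_moment_direct(n, p)
    closed = binomial_second_moment_closed(n, p)
    assert abs(direct - closed) < 1e-9, (n, p)
```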
Now the calculation. The idea is that the mean and the second moment of the mixture are the weighted averages of the means and second moments of the two binomial distributions. The variance of the mixture is then the second moment minus the square of the mean.
We use the insurance example indicated earlier to interpret the unconditional variance. The two binomial distributions in the probability function are conditional distributions (e.g. conditional on which group of insureds the randomly chosen policyholder comes from, high risk or low risk). To put the result into perspective, note that both of these binomial distributions have the same variance. Yet the unconditional variance is much higher than this common variance. The additional variance is due to the uncertainty in the risk parameter of the insured (the uncertainty of which group the randomly chosen policyholder comes from).
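The size of the additional variance can be made explicit. As a sketch, write $X$ for the mixture, and $\mu_1, \mu_2$ and $\sigma_1^2, \sigma_2^2$ for the means and variances of the two binomial distributions (this notation is introduced here for illustration and may differ from the original notation). With equal mixing weights:

```latex
\begin{aligned}
E[X]   &= \tfrac{1}{2}\mu_1 + \tfrac{1}{2}\mu_2 \\
E[X^2] &= \tfrac{1}{2}(\sigma_1^2 + \mu_1^2) + \tfrac{1}{2}(\sigma_2^2 + \mu_2^2) \\
Var[X] &= E[X^2] - E[X]^2
        = \tfrac{1}{2}(\sigma_1^2 + \sigma_2^2) + \tfrac{1}{4}(\mu_1 - \mu_2)^2
\end{aligned}
```

The first term is the weighted average of the two conditional variances. The second term is the extra variance that arises because the two groups have different expected claim counts; it vanishes only when $\mu_1 = \mu_2$.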
The two binomial distributions in the probability function are conditional distributions indexed by a parameter variable that is implicit in the probability function. For example, when the randomly chosen policyholder is from the low risk group, the parameter is the low claim probability and the number of claims follows the corresponding binomial distribution. When the randomly chosen policyholder is from the high risk group, the parameter is the high claim probability and the number of claims follows the other binomial distribution. The uncertainty in the risk parameter has the effect of increasing the unconditional variance of the mixture.
The increase in variance is a key characteristic of mixture distributions. Whenever a probability distribution is the mixture of conditional distributions, the unconditional variance is the weighted average of the conditional variances plus the variance of the conditional means. The uncertainty in the parameter variable therefore increases the unconditional variance whenever the conditional means differ. In the insurance example, the uncertainty of the risk characteristics of the insureds across the entire block is reflected in the higher unconditional variance (as demonstrated in Problem 1).
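This is the law of total variance: the unconditional variance equals the expected conditional variance plus the variance of the conditional mean. A minimal sketch of that split for a mixture of binomial distributions with arbitrary weights follows. The function name `variance_decomposition` and the three-component parameters are hypothetical; in particular they are not the parameters of Problem 2.

```python
def variance_decomposition(weights, ns, ps):
    """Split the variance of a binomial mixture into two parts.

    Returns (expected conditional variance, variance of conditional means).
    Their sum is the unconditional variance of the mixture.
    """
    means = [n * p for n, p in zip(ns, ps)]
    variances = [n * p * (1 - p) for n, p in zip(ns, ps)]
    overall_mean = sum(w * m for w, m in zip(weights, means))
    within = sum(w * v for w, v in zip(weights, variances))
    between = sum(w * (m - overall_mean)**2 for w, m in zip(weights, means))
    return within, between

# Hypothetical three-component mixture (NOT the parameters of Problem 2).
weights = [0.5, 0.3, 0.2]
ns = [5, 5, 5]
ps = [0.1, 0.4, 0.7]

within, between = variance_decomposition(weights, ns, ps)
total = within + between
print(within, between, total)
```

The `between` part is zero only when every component has the same mean, which is why mixing distributions with different means always pushes the unconditional variance above the weighted average of the conditional variances.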
_____________________________________________________________
Answers for Problem 2