Model Selection Guided Tutorials

Here is a set of exercises that guide the viewer through some of the theoretical foundations of Loss Data Analytics. Each tutorial is based on one or more questions from the professional actuarial examinations, typically the Society of Actuaries Exam C.

Tutorial Structure. Each guided tutorial has a strategy set that describes the context. When you hit the “Start Quiz” button, you begin the tutorial, which consists of a series of mini-questions designed to lead you to the target question. At each stage, hints are provided, as well as feedback on the correct solution of each mini-question.

Your Assignment. In reviewing these exercises, ideally the viewer will:

Work the problem posed, referring only to basic theory.
Even if you get the answer correct, review the strategy for this type of problem by clicking (revealing) the Strategy for … header.
If you feel comfortable with the strategy and got the problem correct, then you may choose to move on. However, you might also decide to follow the step-by-step process for solving the problem by clicking on the “Start Quiz” button. It is not really a quiz; it is a guided tutorial.

Strategy for Solving Kernel Smoothing Problems

One way that you can provide a smooth estimate of a density without reference to a parametric family is through a so-called kernel density estimate. A kernel density estimate of a probability density function \(f(x)\) has the following form:
$$ f_n(x) = \frac{1}{nb} \sum_{i=1}^n k\left(\frac{x-X_i}{b}\right).$$
In this expression, \(X_1, \ldots, X_n\) is our random sample of \(n\) observations, the positive constant \(b\) is known as the bandwidth, and the function \(k(\cdot)\) is called a kernel function. The following are standard choices of the kernel function:

uniform kernel, \(k(y) = \frac{1}{2} I(|y| \le 1)\)
triangular kernel, \(k(y) = (1-|y|) \times I(|y| \le 1)\)
Epanechnikov kernel, \(k(y) = \frac{3}{4}(1-y^2) \times I(|y| \le 1)\)
Gaussian kernel, \(k(y) = \phi(y)\), where \(\phi(\cdot)\) is the standard normal density function.

For some situations, you can also use kernel methods to give a smooth approximation of the distribution function as follows:
$$ \hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^n K\left(\frac{x-X_i}{b}\right).$$
Here, \(\hat{F}_n(x)\) is known as the kernel density estimator of a distribution function. The function \(K\) is a probability distribution function associated with the kernel density \(k\). To illustrate, for the uniform kernel, we have
$$ K(y) = \begin{cases}
0 & y < -1 \\
\frac{y+1}{2} & -1 \le y < 1 \\
1 & y \ge 1 .
\end{cases} $$
To solve questions using kernel smoothing, you can do the following:

From the problem, identify the data points \(X_i\), the kernel function to be applied, and the bandwidth \(b\).
Rescale the observations about a point \(x\). Call the rescaled observations \(y_i = (x - X_i)/b\).
Apply the kernel density \(k(y_i)\) or distribution function \(K(y_i)\), as appropriate. Kernels such as the uniform, triangular, and Epanechnikov have a limited domain, so make sure that you account for this feature when you evaluate the kernel function.
Then, take the average. For the density estimator, divide this average by \(b\). (This division is not needed for the distribution function.)
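The four steps above can be sketched in a few lines of Python. This is only a minimal illustration, not part of the original tutorial: the function names and the small sample are made up for the example, and the triangular distribution function `K_triangular` is obtained here by integrating the triangular kernel density.

```python
import math

# Kernel densities k(y); the first two are zero outside [-1, 1].
def k_uniform(y):
    return 0.5 if abs(y) <= 1 else 0.0

def k_triangular(y):
    return 1.0 - abs(y) if abs(y) <= 1 else 0.0

def k_gaussian(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

# Distribution functions K(y) associated with the uniform and triangular kernels.
def K_uniform(y):
    if y < -1:
        return 0.0
    if y < 1:
        return (y + 1) / 2
    return 1.0

def K_triangular(y):
    # Integral of the triangular density from -1 up to y.
    if y < -1:
        return 0.0
    if y < 0:
        return (1 + y) ** 2 / 2
    if y < 1:
        return 1 - (1 - y) ** 2 / 2
    return 1.0

def kernel_density(x, data, b, k):
    """Rescale, apply k, average, then divide by the bandwidth b."""
    ys = [(x - xi) / b for xi in data]
    return sum(k(y) for y in ys) / (len(data) * b)

def kernel_cdf(x, data, b, K):
    """Rescale, apply K, and average (no division by b)."""
    ys = [(x - xi) / b for xi in data]
    return sum(K(y) for y in ys) / len(data)

# Hypothetical sample (not the tutorial's data) with bandwidth 2.
sample = [1, 3, 4, 8]
print(kernel_density(3.5, sample, 2.0, k_uniform))
print(kernel_density(3.5, sample, 2.0, k_triangular))
print(kernel_cdf(3, sample, 2.0, K_uniform))
print(kernel_cdf(3, sample, 2.0, K_triangular))
```

Passing the kernel as a function argument keeps the averaging logic in one place, so switching from a uniform to a triangular kernel changes only the last argument.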

Kernel Smoothing 268
You are given the following ages at time of death for 10 individuals:
$$\begin{array}{cccccccccc}25 & 30 & 35 & 35 & 37 & 39 & 45 & 47 & 49 & 55 \end{array} $$

With a bandwidth of \(b=10\):
(a) Calculate the kernel density estimate of \(f(40)\), using a uniform kernel.
(b) Calculate the kernel density estimate of \(f(40)\), using a triangular kernel.
(c) Calculate the kernel density estimate of \(F(40)\), using a uniform kernel.
(d) Calculate the kernel density estimate of \(F(40)\), using a triangular kernel.
[WpProQuiz 68]

Strategy for Solving Nonparametric Estimators of the Distribution Function with Censored Data Problems

Nonparametric Distribution Function Censored Data 252
The following is a sample of 10 payments:
$$\begin{array}{cccccccccc}4 & 4 & 5+ & 5+ & 5+ & 8 & 10+ & 10+ & 12 & 15 \end{array} $$

where + indicates that a loss exceeded the policy limit.

(a) Using the Kaplan-Meier product-limit estimator, calculate the probability that the loss on a policy exceeds 11, \(\widehat{S}(11)\).
(b) Using the Nelson-Aalen estimator, calculate the probability that the loss on a policy exceeds 11, \(\widehat{S}(11)\).
(c) Calculate Greenwood’s approximation to the variance of the product-limit estimate \(\widehat{S}(11)\).
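
The three estimators in this problem share the same ingredients: at each distinct uncensored value \(y_j\), the number of losses \(s_j\) observed at that value and the size \(r_j\) of the risk set. The following Python sketch shows one way to organize the calculation. It is an illustration only: the function names and the sample are hypothetical, and it assumes that a censored observation at a value counts as still at risk at that value.

```python
import math

def risk_table(values, censored):
    """Return [(y_j, s_j, r_j)] for each distinct uncensored value y_j.

    s_j = number of uncensored observations equal to y_j
    r_j = number of observations (censored or not) with value >= y_j
    """
    event_values = sorted({v for v, c in zip(values, censored) if not c})
    table = []
    for y in event_values:
        s = sum(1 for v, c in zip(values, censored) if v == y and not c)
        r = sum(1 for v in values if v >= y)
        table.append((y, s, r))
    return table

def kaplan_meier(t, table):
    """Product-limit estimate of S(t): product of (1 - s_j / r_j) over y_j <= t."""
    surv = 1.0
    for y, s, r in table:
        if y <= t:
            surv *= 1 - s / r
    return surv

def nelson_aalen(t, table):
    """exp(-H(t)), where H(t) is the sum of s_j / r_j over y_j <= t."""
    cum_hazard = sum(s / r for y, s, r in table if y <= t)
    return math.exp(-cum_hazard)

def greenwood_variance(t, table):
    """Greenwood's approximation; assumes r_j > s_j for every y_j <= t."""
    total = sum(s / (r * (r - s)) for y, s, r in table if y <= t)
    return kaplan_meier(t, table) ** 2 * total

# Hypothetical sample (not the tutorial's data): 2, 3, 3+, 5, 7+, 9
values = [2, 3, 3, 5, 7, 9]
censored = [False, False, True, False, True, False]
table = risk_table(values, censored)
print(table)
print(kaplan_meier(6, table), nelson_aalen(6, table), greenwood_variance(6, table))
```

For this made-up sample, the estimates at 6 work out to \(4/9\) for Kaplan-Meier, \(e^{-0.7}\) for Nelson-Aalen, and \(4/81\) for Greenwood's approximation, which you can confirm by hand from the printed table.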

[WpProQuiz 69]

Strategy for Solving Problems Involving Estimators of the Distribution Function with Censored Data

Distribution Function for Censored Data 199
You consider personal auto property damage claims in a certain region. A sample of four claims is:
130 240 300 540
The values of two additional claims are known to exceed 1000.

You wish to compare a nonparametric fit to a fit based on the parametric Weibull distribution
$$ F(x) = 1- \exp \left[- \left(\frac{x}{\theta} \right)^{0.2} \right], \quad x > 0.$$
(a) Using the nonparametric Nelson-Aalen estimator, calculate the probability that the loss on a policy exceeds 500, \(\widehat{S}_{NA}(500)\).
(b) Using the Weibull distribution, calculate the maximum likelihood estimator of \(\theta\).
(c) With your fitted Weibull distribution from part (b), determine the estimate of the survival function at 500, \(S(500)\).
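
When the Weibull shape parameter \(\tau\) is known (it is 0.2 in the problem above), the likelihood with right-censored data can be maximized in closed form. Each uncensored claim contributes its density and each censored claim contributes its survival probability, and setting the derivative of the log-likelihood to zero gives \(\hat{\theta}^{\tau} = \sum_i z_i^{\tau} / d\), where the \(z_i\) run over all observed values and censoring points and \(d\) is the number of uncensored claims. The sketch below implements this under those assumptions; the function names and the test values are hypothetical, not the tutorial's.

```python
import math

def weibull_theta_mle(uncensored, censoring_points, tau):
    """MLE of the Weibull scale theta when the shape tau is known.

    uncensored       -- fully observed claim amounts
    censoring_points -- values that right-censored claims are known to exceed
    """
    total = sum(x ** tau for x in uncensored)
    total += sum(u ** tau for u in censoring_points)
    return (total / len(uncensored)) ** (1 / tau)

def weibull_survival(x, theta, tau):
    """S(x) = exp(-(x / theta) ** tau)."""
    return math.exp(-((x / theta) ** tau))

# Hypothetical data: one observed claim of 3, one claim known to exceed 4, shape 2.
theta_hat = weibull_theta_mle([3], [4], tau=2)
print(theta_hat)
print(weibull_survival(5, theta_hat, tau=2))
```

With \(\tau = 1\) the same function reduces to the familiar exponential result: total exposure divided by the number of uncensored claims.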

[WpProQuiz 70]

Strategy for Solving Bayesian Estimation Problems

Bayesian Estimation 64
For a group of insureds, you are given:
(i) The amount of a claim is uniformly distributed but will not exceed a certain unknown limit \(\theta\).
(ii) The prior distribution of \(\theta\) is \(\pi(\theta)=\frac{500}{\theta^2}, \theta>500\).
(iii) Two independent claims of 400 and 600 are observed.

(a) Calculate the posterior probability that \(700 < \theta < 900\).
(b) Calculate the probability that the next claim will exceed 550.
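
A general pattern sits behind this problem. If claims are uniform on \((0, \theta)\) and the prior is of the form \(\pi(\theta) = a c^a / \theta^{a+1}\) for \(\theta > c\), then after observing \(n\) claims with largest value \(m\), the posterior is proportional to \(\theta^{-(a+n+1)}\) on \(\theta > \max(c, m)\). That is the same family with updated parameters \(a^* = a + n\) and \(c^* = \max(c, m)\). The Python sketch below encodes this update and the two quantities asked for. It is an illustration under these assumptions, with hypothetical function names and made-up numbers rather than the tutorial's.

```python
def posterior_params(a, c, claims):
    """Update (a, c) of the prior a * c**a / theta**(a + 1), theta > c."""
    return a + len(claims), max(c, max(claims))

def posterior_prob_between(lo, hi, a_post, c_post):
    """Posterior probability that lo < theta < hi."""
    def cdf(t):
        return 0.0 if t <= c_post else 1.0 - (c_post / t) ** a_post
    return cdf(hi) - cdf(lo)

def predictive_exceeds(d, a_post, c_post):
    """Probability that the next claim exceeds d.

    Averages P(X > d | theta) = max(0, 1 - d / theta) over the posterior.
    """
    if d <= c_post:
        # Every possible theta exceeds d, so use E[1 / theta] = a / ((a + 1) * c).
        return 1.0 - d * a_post / ((a_post + 1) * c_post)
    # Only theta > d contributes when d is above the posterior lower bound.
    return (c_post / d) ** a_post / (a_post + 1)

# Hypothetical prior (a = 2, c = 10) and hypothetical claims of 4 and 20.
a_post, c_post = posterior_params(2, 10, [4, 20])
print(a_post, c_post)
print(posterior_prob_between(20, 40, a_post, c_post))
print(predictive_exceeds(10, a_post, c_post))
```

Note how the largest observed claim, not the sample mean, drives the update: a claim above the prior lower bound \(c\) rules out every \(\theta\) below it.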

[WpProQuiz 71]

Strategy for Solving Moments/Percentile Matching Problems

Moments/Percentile Matching 143
The parameters of the inverse Pareto distribution
$$F(x)=\left(\frac{x}{x+\theta}\right)^{\tau}$$

are to be estimated based on the following on the following data :
$$begin{array}{c}15 &45& 140& 250 &560 &1340 end{array} $$

Calculate the estimate of \(\theta\) obtained by
(a) matching \(k\)th moments with \(k=-1\) and \(k=-2\).
(b) percentile matching, using the 36th and 60th empirically smoothed percentile estimates.
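
Two building blocks are useful here. For the inverse Pareto distribution, the negative moments are \(E[X^{-1}] = \frac{1}{\theta(\tau-1)}\) and \(E[X^{-2}] = \frac{2}{\theta^2(\tau-1)(\tau-2)}\) (for \(\tau > 2\)), so equating them to the sample moments \(m_1\) and \(m_2\) and writing \(r = m_2/m_1^2\) gives \(\tau = (2r-2)/(r-2)\) and \(\theta = 1/(m_1(\tau-1))\). For percentile matching, the smoothed empirical percentile interpolates linearly between order statistics at position \((n+1)p\). The Python sketch below covers both pieces; the function names and sample are hypothetical, and the final step of percentile matching (solving \(F(x) = p\) at the two percentiles for \(\theta\) and \(\tau\)) is left to the reader.

```python
import math

def inverse_pareto_negative_moments(theta, tau):
    """Theoretical E[X^-1] and E[X^-2]; requires tau > 2."""
    m1 = 1 / (theta * (tau - 1))
    m2 = 2 / (theta ** 2 * (tau - 1) * (tau - 2))
    return m1, m2

def solve_from_negative_moments(m1, m2):
    """Solve the two moment equations for (theta, tau); needs m2 / m1**2 > 2."""
    r = m2 / m1 ** 2
    tau = (2 * r - 2) / (r - 2)
    theta = 1 / (m1 * (tau - 1))
    return theta, tau

def sample_negative_moments(data):
    """Sample averages of 1/x and 1/x**2."""
    n = len(data)
    return sum(1 / x for x in data) / n, sum(1 / x ** 2 for x in data) / n

def smoothed_percentile(data, p):
    """Smoothed empirical 100p-th percentile via position g = (n + 1) * p."""
    xs = sorted(data)
    n = len(xs)
    g = (n + 1) * p
    if not 1 <= g <= n:
        raise ValueError("percentile position falls outside the sample")
    j = math.floor(g)          # 1-based index of the lower order statistic
    frac = g - j
    if frac == 0:
        return float(xs[j - 1])
    return xs[j - 1] + frac * (xs[j] - xs[j - 1])

# Hypothetical sample (not the tutorial's data).
sample = [10, 20, 40, 80]
print(smoothed_percentile(sample, 0.5))
print(sample_negative_moments(sample))
```

A quick sanity check on the moment equations is a round trip: compute the theoretical moments for a chosen \((\theta, \tau)\) and confirm that the solver returns the same pair.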

[WpProQuiz 72]
