Example: Massachusetts Bodily Injury Claims

For our first look at fitting the normal curve to a set of data, we consider data from Rempala and Derrig (2005), who studied claims arising from automobile bodily injury insurance coverages. These are amounts incurred for outpatient medical treatments that arise from automobile accidents, typically sprains, broken collarbones, and the like. The data consist of a sample of 272 claims from Massachusetts that were closed in 2001 (by “closed,” we mean that the claim is settled and no additional liabilities can arise from the same accident). Rempala and Derrig were interested in developing procedures for handling mixtures of “typical” claims and others from providers who reported claims fraudulently. For this sample, we consider only the typical claims, ignoring the potentially fraudulent ones.

Table 1.2 provides several statistics that summarize different aspects of the distribution. Claim amounts are in units of natural logarithms of thousands of dollars. The average logarithmic claim is 0.481, corresponding to \$1,617.77 ($\approx 1000 \exp(0.481)$). The smallest and largest claims are $-3.101$ (about \$45) and $3.912$ (about \$50,000), respectively.
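
The conversion from the log scale back to dollars can be checked with a few lines of R. This is a minimal sketch that uses only the rounded values printed in Table 1.2; the vector names `log_claims` and `dollars` are illustrative. Because the inputs are rounded to three decimals, the results agree with the dollar figures quoted in the text only approximately (the rounded mean gives roughly \$1,617.69).

```r
# Rounded summary values from Table 1.2, in logs of thousands of dollars
log_claims <- c(mean = 0.481, minimum = -3.101, maximum = 3.912)

# Undo the natural log, then convert thousands of dollars to dollars
dollars <- 1000 * exp(log_claims)

# Display the back-transformed amounts to the nearest cent
print(round(dollars, 2))
```

Note that exponentiating the average of the logs gives the geometric mean of the claims, which is generally smaller than the arithmetic mean of the dollar amounts.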

$$
{\small
\begin{matrix}
\begin{array}{c}
\text{Table 1.2. Summary Statistics of}\\
\text{Massachusetts Automobile Bodily Injury Claims}
\end{array} \\
\tiny
\begin{array}{l|cccccccc}
\hline
\text{Variable} & & & & \text{Standard} & & & \text{25th} & \text{75th} \\
 & \text{Number} & \text{Mean} & \text{Median} & \text{Deviation} & \text{Minimum} & \text{Maximum} & \text{Percentile} & \text{Percentile} \\
\hline
\text{Claims} & 272 & 0.481 & 0.793 & 1.101 & -3.101 & 3.912 & -0.114 & 1.168 \\
\hline
\end{array} \\
\begin{array}{l}
\textit{Note}: \text{Data are in logs of thousands of dollars.}
\end{array}
\end{matrix}
}
$$

R Code and Output for Table 1.2
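
As a minimal sketch of how statistics like those in Table 1.2 could be computed in R, assume the log claims are held in a numeric vector. The helper name `summarize_claims` and the toy values below are hypothetical and are not the Massachusetts sample. `quantile()` is called with R's default rule (type 7), which may differ slightly from the method behind the published percentiles.

```r
# Hypothetical helper: summarize a numeric vector in the layout of Table 1.2
summarize_claims <- function(x) {
  # 25th and 75th percentiles using R's default quantile rule (type 7)
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  c(Number  = length(x),
    Mean    = mean(x),
    Median  = median(x),
    StdDev  = sd(x),
    Minimum = min(x),
    Maximum = max(x),
    Pct25   = q[1],
    Pct75   = q[2])
}

# Illustrative dollar amounts only, NOT the Massachusetts claims data
toy_dollars <- c(250, 900, 1600, 3200, 12000)

# Convert to logs of thousands of dollars, the scale used in Table 1.2
toy_logs <- log(toy_dollars / 1000)

print(round(summarize_claims(toy_logs), 3))
```

Applying `summarize_claims` to the actual vector of 272 log claims would be expected to give values close to those in Table 1.2, up to the choice of percentile rule.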
