- Describe different types of variables typically encountered in insurance practice
- Classify a variable into the appropriate category
Before discussing how to use insurance data to make decisions, it is helpful to first describe common data features. People, firms, and other entities that we want to understand are described in a dataset by numerical characteristics. As these characteristics vary by entity, they are commonly known as variables. To manage insurance systems, it will be critical to understand the distribution of each variable and how the variables are associated with one another. We will encounter datasets that have many variables (“high dimensional”), and so it is useful to begin by classifying them into different types. As will be seen, this classification is not strict; there is overlap among the types. Nonetheless, the classification summarized in Table 1.1 and explained in the remainder of this section provides a solid first step in framing a dataset.
Table 1.1. Variable Types
Variable Type | Example |
Qualitative | |
Binary | Sex |
Categorical (Unordered, Nominal) | Territory (e.g., state/province) in which an insured resides |
Ordered Category (Ordinal) | Claimant satisfaction (five-point scale ranging from 1=dissatisfied to 5=satisfied) |
Quantitative | |
Continuous | Policyholder's age, weight, income |
Discrete | Amount of deductible |
Count | Number of insurance claims |
Combinations of Discrete and Continuous | Policy losses, a mixture of 0's (for no loss) and positive claim amounts |
Interval Variable | Driver Age: 16-24 (young), 25-54 (intermediate), 55 and over (senior) |
Circular Data | Time-of-day measures of customer arrival |
Multivariate Variable | |
High Dimensional Data | Characteristics of a firm purchasing workers' compensation insurance (location of plants, industry, number of employees, and so on) |
Spatial Data | Longitude/latitude of the location of an insurance hailstorm claim |
Missing Data | Policyholder's age (continuous/interval) and “-99” for “not reported,” that is, missing |
Censored and Truncated Data | Amount of insurance claims in excess of a deductible |
Aggregate Claims | Losses recorded for each claim in a motor vehicle policy |
Stochastic Process Realizations | The time and amount of each occurrence of an insured loss |
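To make the classification in Table 1.1 concrete, the following Python sketch assigns a rough type to a column of raw values. It is only an illustration under stated assumptions: the function names `clean` and `classify`, the sample columns, and the decision rules are all hypothetical and not part of the text, and the sentinel code -99 for “not reported” is borrowed from the missing-data row of the table.

```python
from typing import List, Sequence, Union

Value = Union[int, float, str, None]

# Assumed sentinel for "not reported", following the "-99" example in Table 1.1.
MISSING_CODES = {-99}


def clean(values: Sequence[Value]) -> List[Value]:
    """Drop None and any sentinel codes that stand for missing values."""
    return [v for v in values if v is not None and v not in MISSING_CODES]


def classify(values: Sequence[Value]) -> str:
    """Return a rough Table 1.1 type for one column (heuristic, not definitive)."""
    observed = clean(values)
    if not observed:
        return "missing"
    distinct = set(observed)
    if len(distinct) == 2:
        return "binary"
    if any(isinstance(v, str) for v in observed):
        return "categorical"
    # Non-negative whole numbers are treated as counts.
    if all(float(v).is_integer() and v >= 0 for v in observed):
        return "count"
    # Exact zeros alongside positive, non-integer amounts suggest a mixture.
    if 0 in distinct and any(v > 0 and not float(v).is_integer() for v in observed):
        return "mixed discrete-continuous"
    return "continuous"


# Hypothetical columns, loosely modeled on the examples in Table 1.1.
columns = {
    "sex": ["F", "M", "M", "F"],
    "territory": ["NY", "WI", "ON", "WI"],
    "num_claims": [0, 1, 0, 2],
    "policy_loss": [0, 0, 1250.5, 0, 310.75],
    "age": [34.5, -99, 51.2, 27.0],
}

for name, values in columns.items():
    print(f"{name}: {classify(values)}")
```

Rules like these cannot separate every category from the values alone. For example, a column of deductible amounts such as 0, 250, 500 would be labeled a count here, and an ordinal satisfaction scale looks the same as any other small set of integers. This is consistent with the point that the classification is not strict, so domain knowledge is still needed to settle the final type.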