Qualitative Variables

Let us start with the simplest type, a binary variable. As suggested by its name, a binary variable is one with only two possible values. Although not necessary, the two values are commonly taken to be a “0” and a “1.” Binary variables are typically used to indicate whether or not an entity possesses an attribute. For example, we might code a variable to be a “1” if an insured is female and a “0” if male. (An insured is a person who is covered under an insurance agreement.)
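As a minimal sketch of this coding, the following Python fragment turns a list of recorded sexes into a binary variable. The data and the names `sexes` and `is_female` are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical records: the sex of each of five insureds.
sexes = ["female", "male", "female", "female", "male"]

# Code the binary variable: 1 if the insured is female, 0 if male.
is_female = [1 if s == "female" else 0 for s in sexes]

print(is_female)       # [1, 0, 1, 1, 0]

# Because the codes are 0 and 1, their sum counts the insureds
# who possess the attribute.
print(sum(is_female))  # 3
```

Any other pair of distinct values would serve as well; the “0” and “1” coding is only a convenient convention.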

More generally, a qualitative, or categorical, variable is one for which the measurement denotes membership in a set of groups, or categories. For example, if you were coding the area of the country in which an insured resides, you might use a “1” for the northern part, a “2” for the southern part, and a “3” for everything else. A binary variable is a special type of categorical variable where there are only two categories. This location variable is an example of a nominal variable, one for which the levels have no natural ordering. Any analysis of nominal variables should not depend on the labeling of the categories. For example, instead of using “1,2,3” for “north, south, other,” I should arrive at the same set of summary statistics if I used a “2,1,3” coding, interchanging north and south.
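The relabeling argument can be checked with a small sketch. The data, the two coding dictionaries, and the helper `counts_by_region` below are all hypothetical. Category counts come out the same under either coding, whereas the arithmetic mean of the codes changes, which suggests that the mean is not a sensible summary for a nominal variable.

```python
from collections import Counter

# Hypothetical regions of residence for six insureds.
regions = ["north", "south", "other", "north", "north", "south"]

# Two labelings of the same categories; north and south are interchanged.
coding_a = {"north": 1, "south": 2, "other": 3}
coding_b = {"north": 2, "south": 1, "other": 3}

def counts_by_region(regions, coding):
    """Code the data, count each code, then report the counts by category."""
    codes = [coding[r] for r in regions]
    decode = {code: name for name, code in coding.items()}
    return {decode[c]: n for c, n in Counter(codes).items()}

counts_a = counts_by_region(regions, coding_a)
counts_b = counts_by_region(regions, coding_b)
print(counts_a == counts_b)  # True: counts do not depend on the labels

# The mean of the codes, by contrast, depends on the arbitrary labels.
mean_a = sum(coding_a[r] for r in regions) / len(regions)
mean_b = sum(coding_b[r] for r in regions) / len(regions)
print(mean_a == mean_b)      # False
```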

In contrast, an ordinal variable is a type of categorical variable for which an ordering does exist. For example, with a survey to see how satisfied customers are with our claims servicing department, we might use a five-point scale that ranges from a “1,” meaning “dissatisfied,” to a “5,” meaning “satisfied.” Ordinal variables provide a clear ordering of the levels of a variable, but the amount of separation between levels is unknown.
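One way to see what an unknown amount of separation implies is to recode the scale with different spacing while keeping the order. In the sketch below, the responses and the alternative coding `recode` are hypothetical. The median response falls at the same level under both codings, because the median uses only the ordering, while the mean changes with the assumed spacing.

```python
from statistics import mean, median

# Hypothetical survey responses on the five-point satisfaction scale.
responses = [1, 2, 4, 5, 5, 3, 4]

# An alternative, order-preserving coding with unequal gaps between levels.
recode = {1: 1, 2: 2, 3: 5, 4: 6, 5: 20}
decode = {new: old for old, new in recode.items()}

recoded = [recode[r] for r in responses]

# The median depends only on the ordering, so it identifies the same level.
median_level = median(responses)
median_level_recoded = decode[median(recoded)]
print(median_level, median_level_recoded)  # 4 4

# The mean depends on the spacing between codes, which is unknown
# for an ordinal variable.
print(mean(responses) == mean(recoded))    # False
```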
