5.1 An Iterative Approach to Data Analysis and Modeling

In this section, you learn how to:
  • Describe the iterative approach to data analysis and modeling.


In our introduction to basic linear regression in Chapter 2, we examined the data graphically, hypothesized a model structure, and compared the data to a candidate model in order to formulate an improved model. Box (1980) describes this as an iterative process, shown in Figure 5.1.

Figure 5.1 The iterative model specification process.

This iterative process provides a useful recipe for structuring the task of specifying a model to represent a set of data. The first step, the model formulation stage, is accomplished by examining the data graphically and using prior knowledge of relationships, such as from economic theory or standard industry practice. The second step in the iteration is based on the assumptions of the specified model. These assumptions must be consistent with the data to make valid use of the model. The third step, diagnostic checking, is also known as data and model criticism; the data and model must be consistent with one another before additional inferences can be made. Diagnostic checking is an important part of the model formulation; it can reveal mistakes made in previous steps and provide ways to correct these mistakes.
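The cycle of formulating, fitting, and checking can be sketched in code. The following is a minimal illustration, not a method prescribed by the text. It assumes simulated data with a quadratic relationship, and it uses the correlation between the residuals and an omitted higher-order term as a simple stand-in for diagnostic checking. The function names `fit_ols` and `iterate_specification` and the cutoff of 0.3 are hypothetical choices made only for this example.

```python
import numpy as np


def fit_ols(X, y):
    """Fit by least squares; return the coefficients and the residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta


def iterate_specification(x, y, max_degree=3, threshold=0.3):
    """Hypothetical loop: formulate a model, fit it, check it, revise if needed."""
    history = []
    for degree in range(1, max_degree + 1):
        # Formulation: a polynomial in x of the current degree.
        X = np.vander(x, degree + 1, increasing=True)
        # Fitting: estimate the coefficients of the candidate model.
        beta, resid = fit_ols(X, y)
        # Diagnostic check (illustrative only): do the residuals still
        # move with the next higher-order term that the model omits?
        omitted = x ** (degree + 1)
        corr = abs(np.corrcoef(resid, omitted)[0, 1])
        history.append((degree, corr))
        if corr < threshold:
            # Data and model look consistent; stop revising.
            return degree, history
    return max_degree, history


# Simulated data (an assumption for illustration): a quadratic relationship.
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 200)
y = 1.0 + 2.0 * x + 1.5 * x**2 + rng.normal(scale=1.0, size=x.size)

degree, history = iterate_specification(x, y)
for d, corr in history:
    print(f"degree {d}: |corr(residuals, omitted term)| = {corr:.3f}")
print("selected degree:", degree)
```

Here the straight-line model fails the check because its residuals retain a strong pattern in the squared term, so the model is reformulated and refit. In practice, diagnostic checking relies on residual plots and judgment rather than a single numeric cutoff.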

The iterative process also emphasizes the skills you need to make regression analysis work. First, you need a willingness to summarize information numerically and portray this information graphically. Second, it is important to develop an understanding of model properties. You should understand how a theoretical model behaves in order to match a set of data to it. Third, understanding the theoretical properties of the model is also important for inferring general relationships based on the behavior of the data.
