Multiple Imputation in Stata: Introduction

Many SSCC members are eager to use multiple imputation in their research, or have been told by reviewers or advisors that they should. This series is intended to be a practical guide to the technique and its implementation in Stata, based on the questions SSCC members are asking the SSCC's statistical computing consultants.

The series assumes you are already familiar with the basic concepts of multiple imputation and is not intended as a substitute for a study of the literature on it. White, Royston and Wood's article "Multiple imputation using chained equations: Issues and guidance for practice" (Statistics in Medicine, November 2010) is a good starting point for that study.

The series also assumes you are familiar with Stata usage and syntax. If you are not, we suggest working through our Stata for Researchers series and (optionally but usefully) Stata Programming Essentials. We also note that the Stata documentation on the mi commands used for multiple imputation is very good and much broader than this series, which will focus on the most common scenarios among SSCC members.

One of the best ways to get an intuitive sense for how multiple imputation works is to run examples using constructed data sets where the right answers are known. We've created a set of examples to go along with this series. For the sake of narrative continuity they have been placed in a separate article, but you'll find links at the appropriate places. We strongly suggest you read them; ideally, you'll also run the provided code yourself and experiment with changing it.
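
To give a flavor of what a constructed example can look like, here is a minimal sketch. It is not one of the examples from the separate article: the variable names, sample size, true coefficients, missingness rate, number of imputations, and seeds are all assumptions chosen for illustration. It builds a data set where the true relationship is known, deletes some values completely at random, imputes them, and then estimates the model on the imputed data.

```stata
* Illustrative sketch only: all names and numbers here are assumptions.
clear
set seed 12345
set obs 1000

* Construct data where the right answer is known: y = 1 + 2*x + noise
gen x = rnormal()
gen y = 1 + 2*x + rnormal()

* Make roughly 20% of x missing completely at random
replace x = . if runiform() < 0.2

* Declare the mi structure, register x as imputed, and create 10 imputations
mi set wide
mi register imputed x
mi impute regress x y, add(10) rseed(54321)

* Estimate the model on the imputed data and combine the results
mi estimate: regress y x
```

Because the data are constructed, you can compare the coefficient on x from mi estimate with the value used to generate y, and then see how the comparison changes as you vary the missingness rate or the way values are made missing.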

This series currently includes the following sections:

  1. Introduction
  2. Deciding to Impute
  3. Creating Imputation Models
  4. Imputing
  5. Managing Multiply Imputed Data
  6. Estimating
  7. Examples
  8. Recommended Readings

Next: Deciding to Impute

Last Revised: 1/14/2013