/*******************
transform.do

Example file demonstrating the danger of transforming a variable
to make it normal if the other variables are related to the
untransformed variable

Written by Russell Dimond, Summer 2012
for the Social Science Computing Cooperative at UW-Madison
********************/

clear all
set more off

// generate random data
// x1 drawn from log normal distribution
// x2 drawn from normal distribution
// y is sum of x1 and x2 plus normal error term
set obs 10000
set seed 4409
gen x1=exp(invnorm(runiform()))
gen x2=invnorm(runiform())
gen y=invnorm(runiform())+x1+x2
reg y x1 x2

// drop values at random
replace y=. if runiform()<.1
replace x1=. if runiform()<.1
replace x2=. if runiform()<.1
misstable sum, gen(m_)

// complete cases analysis
reg y x1 x2

// lnx1 is log transform of x1, thus normal
gen lnx1=ln(x1)

// imputation model for lnx1
// misspecification obvious in rvfplot
reg lnx1 y x2
rvfplot
graph export transform1.png, replace
kdensity x1 if !m_x1
graph export transform2.png, replace

preserve
// impute using regress with transformed x1
// passive ix1 is imputed value of x1 (untransformed)
mi set wide
mi register imputed y lnx1 x2
mi impute chained (regress) y lnx1 x2, add(10)
mi passive: gen ix1=exp(lnx1)
mi estimate: reg y ix1 x2
restore

preserve
// impute using pmm with transformed x1
// passive ix1 is imputed value of x1 (untransformed)
mi set wide
mi register imputed y lnx1 x2
mi impute chained (regress) y x2 (pmm) lnx1, add(10)
mi passive: gen ix1=exp(lnx1)
mi estimate: reg y ix1 x2
restore

preserve
// impute using regress on original x1
mi set wide
mi register imputed y x1 x2
mi impute chained (regress) y x1 x2, add(10)
mi estimate: reg y x1 x2
mi xeq 1: kdensity x1 if m_x1
graph export transform3.png, replace
restore

preserve
// impute using truncreg on original x1
/*
Unfortunately, this imputation model crashes for reasons as yet unknown.
If you can tell us why we'd love to hear it.

mi set wide
mi register imputed y x1 x2
mi impute chained (regress) y x2 (truncreg, ll(0)) x1, add(10)
mi estimate: reg y x1 x2
mi xeq 1: kdensity x1 if m_x1
graph export transform4.png, replace
restore

preserve
*/

// impute using pmm on original x1
mi set wide
mi register imputed y x1 x2
mi impute chained (regress) y x2 (pmm) x1, add(10)
mi estimate: reg y x1 x2
mi xeq 1: kdensity x1 if m_x1
graph export transform5.png, replace