---
title: "Structural Equation Modeling with Stata"
author: "Doug Hemken"
date: "October 2015"
---

```{r setup, echo=FALSE, message=FALSE}
source("../StataMDsetup.r")
```

# Introduction
[Stata notes](../stata.html)

This workshop series assumes you already know structural equation modeling and are mainly interested in learning how to use Stata to estimate these models. We will start with simple models and make things more complicated and nuanced from there.

There are two core Stata commands for structural equation modeling: `sem` for models built on multivariate normal assumptions, and `gsem` for models with generalized linear components.

In the usual Stata command style, both `sem` and `gsem` are used as *estimation* commands, and each allows a host of *post-estimation* commands to further examine the models.

We will take our first example from the MPlus documentation.

```{r infile, collectcode=TRUE}
infile x1-x3 using "..\..\MPlus\Basics\Sample stats\ex3.1.dat"
* The file is found at
* "http://www.ssc.wisc.edu/~hemken/MPlus/Basics/Sample stats/ex3.1.dat"
```

A quick visualization of our data shows us three variables with differing degrees of correlation:

```{r graphmatrix, results="hide", echo=1}
graph matrix _all, half maxes(yscale(range(-5 5)) ylabel(-4(4)4))
graph export "Covmodel_files/scattermatrix.png", replace
```

![Scatterplot matrix](Covmodel_files/scattermatrix.png)

We can start by using the usual Stata commands to look at some descriptive statistics for our data. One of the advantages of using Stata for SEM is that we have all of the usual data manipulation and statistical commands at our fingertips!

```{r correlate, collectcode=TRUE}
correlate , means covariance
```

These are sample covariances, with $N-1$ used in the denominator. Stata's `sem` command reports maximum likelihood covariances, with $N$ used in the denominator.
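
Written out, the two estimates differ only by a scale factor. The notation here is introduced for illustration and does not appear in Stata's output: $s_{jk}$ is the sample covariance of variables $j$ and $k$, and $\hat{\sigma}_{jk}$ is the corresponding maximum likelihood estimate.

$$
s_{jk} = \frac{1}{N-1}\sum_{i=1}^{N}(x_{ij}-\bar{x}_j)(x_{ik}-\bar{x}_k),
\qquad
\hat{\sigma}_{jk} = \frac{1}{N}\sum_{i=1}^{N}(x_{ij}-\bar{x}_j)(x_{ik}-\bar{x}_k) = \frac{N-1}{N}\, s_{jk}
$$

Every element of the covariance matrix is rescaled by the same factor $(N-1)/N$, so the correlations are unaffected.
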
We can use the usual Stata command language to convert from one to the other like this:

```{r convert}
matrix CV = r(C)*(r(N)-1)/r(N)
matrix list CV
```

# Covariance Model
The covariances (plus the means) form a saturated model for our data; that is, they perfectly fit the covariance matrix (plus the means vector). (Check this against the previous result.)

```{r covmodel}
sem (x1-x3 -> )
```

## Model Specification
*Paths* are specified in `sem` using parentheses and an "arrow", which may point either to the right or to the left (`->` or `<-`); a path written one way can equally be written the other way with its two sides swapped. Multiple variables (we may use Stata's typical *varlist* shortcuts) may be collected on either side of the path arrow, or paths may be specified separately. Our covariance model could be written in any of these ways:

```
sem (x1-x3 -> )
sem (<- x1-x3 )
sem (x1->)(<-x2)(x3->)
```

(You ought to be able to come up with a few more variants.)

## Sample Covariances, instead
Alternatively, we can have `sem` report sample covariances:

```{r altcov}
sem (x1-x3 -> ), nm1 // sample covariances instead of ML
```

## Correlations
If we are interested in the standardized solution to this model, it is just the correlation matrix.

```{r std}
correlate
pwcorr, sig
sem (x1-x3 -> ), standardized
```

Next: [Elementary path models](ElementaryPaths.html)

```{r cleanup, engine="R", echo=FALSE}
unlink("profile.do")
```