* Interpreting margins, logistic regression example
* Uses the classic example of predicting low birth weight (webuse lbw)
clear
webuse lbw
/* Preliminary Exploration */
* We'll work with the variables "age" and "race"
tabulate race low, row
/* In these data, black mothers have the highest incidence of
low birth weight, while white mothers have the lowest incidence. */
logistic low age
/* The odds ratio for age is below 1: increasing age predicts decreasing
incidence of low birth weight. */
tabstat age, by(race) stat(mean sd)
regress age i.race, notable
/* Mean age also differs by race, so age and race can confound each
other's association with low birth weight. */
/* Main Example */
/* Here we will use mother's age and race to predict the risk/incidence
of low birth weight babies. */
logistic low age i.race
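/* logistic displays odds ratios, but _b[] holds the log-odds coefficients.
The constant is the log odds for the base category (white) when age == 0;
adding a race coefficient gives the log odds for that race when age == 0.
The inverse logit, 1/(1+exp(-xb)), converts these to predicted
probabilities. */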
display 1/(1+exp(-_b[_cons])) // white, age=0
display 1/(1+exp(-(_b[_cons]+_b[2.race]))) // black, age=0
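/* A sketch of the same calculation using Stata's built-in invlogit(),
extended to the third category (race==3, "other"), which the two lines
above leave out. This assumes the model from "logistic low age i.race"
is still the active estimation result. */
display invlogit(_b[_cons])              // white, age=0
display invlogit(_b[_cons]+_b[2.race])   // black, age=0
display invlogit(_b[_cons]+_b[3.race])   // other, age=0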
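/* Margins over race give, for each race, the average predicted probability
when every observation in the sample is treated as that race while keeping
its observed age. Each race is therefore evaluated over the same age
distribution: that of the full sample, not the race-specific one.
Margins, dydx reflects differences between each race category and the base
category (white) - similar to the interpretation of coefficients - but on
the probability scale and averaged over the whole age distribution as
given by the data, rather than at age==0 . */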
margins i.race
margins, dydx(i.race)
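/* For a factor variable, the dydx results above are discrete differences
from the base category. As a hedged cross-check, the reference-category
contrast operator r. should give the same black-vs-white and
other-vs-white differences in average predicted probability. This is a
sketch of an alternative route to the same numbers, not a replacement for
the dydx call. */
margins r.race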
/* If we fix age at 0, margins reproduces the predicted probabilities
computed from the coefficients above, and dydx gives the differences
between them. */
margins i.race, at(age=0)
margins, dydx(i.race) at(age=0)
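/* A small arithmetic check of the link between these probabilities and the
odds ratio that logistic reports. It assumes the logistic estimates are
still active (margins without the post option leaves them in place). The
scalar names p_white and p_black are chosen here for illustration. */
scalar p_white = invlogit(_b[_cons])                 // white, age=0
scalar p_black = invlogit(_b[_cons] + _b[2.race])    // black, age=0
* Ratio of the two odds, built from the predicted probabilities
display (p_black/(1-p_black)) / (p_white/(1-p_white))
* Odds ratio for black vs. white taken directly from the coefficient
display exp(_b[2.race])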
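/* This shows how the margins are calculated: treat every observation as
each race in turn, predict the probability at the observed ages, and
average over the sample. The means of white and black match the first two
rows of "margins i.race"; the mean of diff matches the black-vs-white row
of "margins, dydx(i.race)". */
logistic low age i.race
preserve
replace race=1                     // treat everyone as white
predict white                      // predicted probability (the default)
replace race=2                     // treat everyone as black
predict black
generate diff = black - white      // per-observation difference
summarize white black diff         // means are the margins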
restore
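/* The same by-hand approach, sketched for the third category (race==3,
"other"), which the block above skips. The mean of other should match the
third row of "margins i.race". The variable name other is chosen here for
illustration and is dropped again by restore. */
preserve
replace race=3                     // treat everyone as "other"
predict other                      // predicted probability at observed ages
summarize other
restore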
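/* This shows how the margins at age=0 are calculated: the same steps, but
with age set to 0 for every observation, so each prediction is constant
across the sample (standard deviation 0). */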
preserve
replace age=0
replace race=1
predict white
replace race=2
predict black
generate diff = black - white
summarize white black diff
restore