
Stata notes

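There are a couple of approaches one could take to add a single point to a scatterplot. One is to overlay the scatterplot with a plot produced by scatteri, the immediate form of scatter, which takes its coordinates as arguments typed in the command rather than from variables in the data.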

In this example, we will plot the overall mean of both the $$x$$ and the $$y$$ variables. A linear regression of $$y$$ on $$x$$ (with an intercept) always passes through this point. A regression with higher-order terms seldom does!
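
Both claims can be checked with a short derivation, sketched here under the assumption that the model includes an intercept and is fit by ordinary least squares. The symbols $$\hat{\beta}_0$$, $$\hat{\beta}_1$$, $$\hat{\beta}_2$$ (fitted coefficients) and $$\overline{x^2}$$ (the mean of the squared $$x$$ values) are notation introduced for this illustration. For the linear fit:

$$
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
\quad\Longrightarrow\quad
\hat{y}(\bar{x}) = \hat{\beta}_0 + \hat{\beta}_1 \bar{x} = \bar{y}
$$

For the quadratic fit, the intercept condition becomes $$\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x} + \hat{\beta}_2 \overline{x^2}$$, so:

$$
\hat{y}(\bar{x}) = \hat{\beta}_0 + \hat{\beta}_1 \bar{x} + \hat{\beta}_2 \bar{x}^2
= \bar{y} - \hat{\beta}_2 \left( \overline{x^2} - \bar{x}^2 \right)
$$

The term in parentheses is the variance of $$x$$ (computed with divisor $$n$$), which is positive whenever $$x$$ varies, so the quadratic curve passes through the mean point only if $$\hat{\beta}_2 = 0$$.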

Identifying the mean

First we find the mean values of $$y$$ (here mpg) and $$x$$ (here price) and save them in the local macros Y and X.

  sysuse auto
  summarize price, meanonly
  local X = r(mean)
  summarize mpg, meanonly
  local Y = r(mean)
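
Local macros exist only within the do-file (or interactive session) in which they are defined, so if these lines are run separately from the graph commands, the macros may come up empty. As a quick check, one might display them right after defining them. This is a minimal sketch; the display text is purely illustrative.

```stata
sysuse auto, clear
summarize price, meanonly
local X = r(mean)
summarize mpg, meanonly
local Y = r(mean)

* Show the stored values; a blank after the equals sign means the macro is undefined
display "mean price (X) = `X'"
display "mean mpg   (Y) = `Y'"
```

Running the whole block at once (rather than line by line from the Do-file Editor) keeps the macros defined for the commands that use them.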

Overlay scatter and scatteri

Then we overlay the scatterplot and the immediate scatterplot of the single point.

  twoway (scatter mpg price) (scatteri `Y' `X', msymbol(D))

The msymbol(D) option gives us a large, diamond-shaped point marker. Note that scatteri takes the $$y$$ coordinate first, then the $$x$$ coordinate.

Label the point, add a regression line

  twoway (scatter mpg price) (lfit mpg price) ///
    (scatteri `Y' `X' (6) "Grand Mean", msymbol(D))

The "(6)" is a clock position for the point label: 6 o'clock places the label directly below the marker.
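
To illustrate how the clock position works, the sketch below repeats the same plot but moves the label to 12 o'clock, directly above the marker. The choice of 12 is arbitrary and only for comparison; the lines use `///` continuation, so they are meant to be run from a do-file.

```stata
sysuse auto, clear
summarize price, meanonly
local X = r(mean)
summarize mpg, meanonly
local Y = r(mean)

* Place the label at 12 o'clock (above the marker) instead of 6 o'clock (below)
twoway (scatter mpg price) (lfit mpg price) ///
    (scatteri `Y' `X' (12) "Grand Mean", msymbol(D))
```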

Use a quadratic fit, add better annotation

  twoway (scatter mpg price) (qfit mpg price) ///
    (scatteri `Y' `X' (6) "Grand Mean", msymbol(D)), ///
    xtitle("Price (\$)") ytitle("Mileage (mpg)") ///
    legend(order(1 "Observed" 2 "Predicted" 3 "Grand Mean"))

Here we can see that the quadratic fit, unlike the linear regression line, misses the grand mean.
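
The same comparison can be made numerically instead of by eye. The following is a rough sketch, assuming that a regression of mpg on price and its square (written with factor-variable notation) reproduces the curve drawn by qfit. The macro names lin and quad are introduced only for this illustration.

```stata
sysuse auto, clear
summarize price, meanonly
local X = r(mean)
summarize mpg, meanonly
local Y = r(mean)

* Linear fit: the prediction at mean price should equal mean mpg
quietly regress mpg price
local lin = _b[_cons] + _b[price]*`X'

* Quadratic fit: the prediction at mean price generally differs from mean mpg
quietly regress mpg c.price##c.price
local quad = _b[_cons] + _b[price]*`X' + _b[c.price#c.price]*(`X')^2

display "mean mpg:              " `Y'
display "linear fit at mean:    " `lin'
display "quadratic fit at mean: " `quad'
```

The first two displayed values should agree up to rounding, while the third should differ, which is the gap visible in the graph.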
