By default, PROC CORR gives you descriptive statistics as well as bivariate correlations and significance tests for all pairs of numeric variables in the data set.
proc corr data=sashelp.class;
run;
The SAS System
The CORR Procedure
3 Variables: Age Height Weight
Simple Statistics
Variable     N        Mean     Std Dev         Sum     Minimum     Maximum
Age         19    13.31579     1.49267   253.00000    11.00000    16.00000
Height      19    62.33684     5.12708        1184    51.30000    72.00000
Weight      19   100.02632    22.77393        1901    50.50000   150.00000
Pearson Correlation Coefficients, N = 19
Prob > |r| under H0: Rho=0
                 Age      Height      Weight
Age          1.00000     0.81143     0.74089
                          <.0001      0.0003
Height       0.81143     1.00000     0.87779
              <.0001                  <.0001
Weight       0.74089     0.87779     1.00000
              0.0003      <.0001
Often you will be interested in only selected variables. Name them in a VAR statement; the NOSIMPLE option suppresses the descriptive statistics.
proc corr data=sashelp.class nosimple;
var weight height;
run;
The SAS System
The CORR Procedure
2 Variables: Weight Height
Pearson Correlation Coefficients, N = 19
Prob > |r| under H0: Rho=0
              Weight      Height
Weight       1.00000     0.87779
                          <.0001
Height       0.87779     1.00000
              <.0001
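When you want correlations between two distinct sets of variables rather than a full square matrix, a WITH statement can be paired with VAR. The following is a minimal sketch using the same sashelp.class data; the choice of Age as the WITH variable is only for illustration.

```sas
/* Correlate each VAR variable with each WITH variable.      */
/* Age as the WITH variable is an illustrative choice only.  */
proc corr data=sashelp.class nosimple;
  var weight height;   /* columns of the correlation table */
  with age;            /* rows of the correlation table    */
run;
```

The WITH variables label the rows of the resulting table and the VAR variables label the columns, so this request yields a single row for Age.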
As with most SAS procedures, you can compute these statistics within subsets of the data. Using BY requires that the data first be sorted by the BY variable.
proc sort data=sashelp.class out=class;
by sex;
run;
proc corr data=class nosimple;
by sex;
var weight height;
run;
The SAS System
-------------------------------------------- Sex=F ---------------------------------------------
The CORR Procedure
2 Variables: Weight Height
Pearson Correlation Coefficients, N = 9
Prob > |r| under H0: Rho=0
              Weight      Height
Weight       1.00000     0.88655
                          0.0014
Height       0.88655     1.00000
              0.0014
The SAS System
-------------------------------------------- Sex=M ---------------------------------------------
The CORR Procedure
2 Variables: Weight Height
Pearson Correlation Coefficients, N = 10
Prob > |r| under H0: Rho=0
              Weight      Height
Weight       1.00000     0.85008
                          0.0018
Height       0.85008     1.00000
              0.0018
Use WHERE to process just a subset of the data.
proc corr data=sashelp.class nosimple;
where age gt 12;
var weight height;
run;
The SAS System
The CORR Procedure
2 Variables: Weight Height
Pearson Correlation Coefficients, N = 12
Prob > |r| under H0: Rho=0
              Weight      Height
Weight       1.00000     0.79650
                          0.0019
Height       0.79650     1.00000
              0.0019
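The same subset can also be requested with the WHERE= data set option instead of a WHERE statement. This is a sketch of that alternative, which should select the same observations as the WHERE statement above.

```sas
/* Alternative form: apply the filter as a data set option  */
/* on the input data set instead of a WHERE statement.      */
proc corr data=sashelp.class(where=(age gt 12)) nosimple;
  var weight height;
run;
```

The data set option keeps the filter attached to one specific input data set, which can be convenient when the same condition is reused across several steps.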
PROC CORR can produce an output data set containing correlations, means, standard deviations, and sample sizes, which can be used as input to other SAS procedures such as PROC REG.
proc corr data=sashelp.class outp=classcorr noprint;
run;
proc reg data=classcorr(type=corr);
model weight = age height;
run;
The SAS System
The REG Procedure
Model: MODEL1
Dependent Variable: Weight
Analysis of Variance
                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F
Model               2      7215.63710    3607.81855      27.23    <.0001
Error              16      2120.09974     132.50623
Corrected Total    18      9335.73684
Root MSE           11.51114    R-Square    0.7729
Dependent Mean    100.02632    Adj R-Sq    0.7445
Coeff Var          11.50811
Parameter Estimates
                  Parameter      Standard
Variable    DF     Estimate         Error    t Value    Pr > |t|
Intercept    1   -141.22376      33.38309      -4.23      0.0006
Age          1      1.27839       3.11010       0.41      0.6865
Height       1      3.59703       0.90546       3.97      0.0011
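To see what PROC REG is reading, it can help to print the output data set. This short sketch assumes the classcorr data set created by the OUTP= option above still exists in the WORK library.

```sas
/* Inspect the TYPE=CORR data set written by OUTP=.          */
/* Assumes work.classcorr was created by the PROC CORR step. */
proc print data=classcorr;
run;
```

In the printed data set, the _TYPE_ variable identifies each row as a MEAN, STD, N, or CORR record, and _NAME_ identifies the variable that each CORR row belongs to.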
PROC CORR can produce bivariate scatterplots, or a scatterplot matrix, using the PLOTS= option. (In ordinary interactive use, you do not have to enable ODS HTML and ODS Graphics, but in batch mode you do.)
proc corr data=sashelp.class nosimple nocorr plots=matrix;
run;
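For batch mode, the ODS statements mentioned above might look like the following sketch. The output file name corr_plots.html is a hypothetical placeholder, and PLOTS=SCATTER is used here to show the bivariate alternative to PLOTS=MATRIX.

```sas
/* Enable ODS Graphics and route output to an HTML file.    */
/* The file name is a hypothetical placeholder.             */
ods graphics on;
ods html file="corr_plots.html";

/* PLOTS=SCATTER requests a scatterplot for each pair of variables. */
proc corr data=sashelp.class nosimple nocorr plots=scatter;
  var weight height;
run;

/* Close the HTML destination so the file is written out. */
ods html close;
ods graphics off;
```

Closing the HTML destination at the end matters in batch runs, since the file is only finalized when the destination is closed.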