Using R from SAS
Doug Hemken
July 2017
SAS can call R to pass data directly back and forth and to capture R output, but R can only call SAS in batch mode.
What follows, and more, is documented in the SAS Online Help.
Setup in SAS
SAS requires two configuration options in order to communicate with R. First, the RLANG option must be set when SAS is started. This may be set either in a custom configuration file (not currently implemented by SSCC) or on the SAS command line.
On Winstat you can implement either solution by putting a SAS shortcut on your desktop and changing its properties, such as adding -rlang at the end of the command line.
Second, SAS needs an R_HOME environment variable to point it to the correct, available version of R.
The acceptable versions of R depend upon which version of SAS you are running.
On Winstat, the available versions are R 3.1.2 and R 3.4.0, but only the former works with our version of SAS, which is SAS 9.4 TS1M2. The most reliable way to set R_HOME is to include the statement
options set=R_HOME='C:\Program Files\R\R-3.1.2';
within your SAS command file.
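Before going further, it can help to confirm that both settings took effect. The following is a minimal sketch, assuming the Winstat R path given above; PROC OPTIONS reports whether RLANG (or NORLANG) is in effect, and the short R submission simply asks R to identify itself.

```sas
/* Report whether SAS was started with RLANG or NORLANG */
proc options option=rlang;
run;

/* Point SAS at the R installation (path taken from the text above) */
options set=R_HOME='C:\Program Files\R\R-3.1.2';

/* Ask R for its version string; an error here suggests that
   RLANG is not set or that R_HOME points to the wrong place */
proc iml;
submit / R;
  R.version.string
endsubmit;
quit;
```

If the version R reports is not the one you expected, check the R_HOME path first.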
Sending SAS data to R
SAS can pass data to an R session and ask R for an analysis. All communication with R is done via SAS's PROC IML. Note that capitalization matters in R, and that character variables are automatically converted to factors. In this example, then, it is important that the variable names be capitalized exactly as they are stored in the SAS data set.
proc iml;
call ExportDataSetToR("Sashelp.Class", "dframe" );
submit / R;
names(dframe)
lm(Weight ~ Height + Age + Sex, data=dframe)
endsubmit;
quit;
[1] "Finishing Rprofile.site from C:/Program Files/R/R-3.1.2/etc"
[1] "Name" "Sex" "Age" "Height" "Weight"
Call:
lm(formula = Weight ~ Height + Age + Sex, data = dframe)
Coefficients:
(Intercept)       Height          Age         SexM
   -125.115        2.873        3.113        8.744
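The point about capitalization can be checked directly. This is a small sketch, not part of the original example, that exports the same data set and asks R whether a correctly and an incorrectly capitalized name exist in the data frame, then lists the class of each column so you can see how the character variables arrived.

```sas
proc iml;
call ExportDataSetToR("Sashelp.Class", "dframe");
submit / R;
  # R names are case-sensitive: "Weight" exists, "weight" does not
  any(names(dframe) == "Weight")
  any(names(dframe) == "weight")
  # Show the R class of each column
  sapply(dframe, class)
endsubmit;
quit;
```

Given the names printed above, the first test returns TRUE and the second FALSE, which is why a formula written with lowercase variable names would fail in lm.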
Loading a package in R
You can load packages in R in the usual way, so long as the package is installed and in a location where R will find it. In this example, we have R load the MASS package, run a linear model with one of its data sets, and send the default R output back to SAS.
proc iml;
submit / R;
library(MASS, lib.loc=.Library)
# use a data frame from MASS
lm(VitC ~ Cult + Date + HeadWt, data=cabbages)
endsubmit;
quit;
[1] "Finishing Rprofile.site from C:/Program Files/R/R-3.1.2/etc"
Call:
lm(formula = VitC ~ Cult + Date + HeadWt, data = cabbages)
Coefficients:
(Intercept)      Cultc52      Dated20      Dated21       HeadWt
     63.334       10.135       -1.213        4.186       -4.412
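If you are unsure whether a package is installed where R will find it, you can ask R before trying to load it. The sketch below is one way to do that, using base R's .libPaths() and requireNamespace(); MASS is used only because it is the package from the example above.

```sas
proc iml;
submit / R;
  # Directories R searches for packages in this session
  .libPaths()
  # TRUE if MASS can be loaded from one of those directories
  requireNamespace("MASS", quietly = TRUE)
endsubmit;
quit;
```

A result of FALSE means the package needs to be installed, or that it lives in a directory that is not on R's library path.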
Importing a data frame from R
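R matrices and data frames may be brought back into SAS as well, for any manipulation you might want to do in SAS. Here, we just grab the cabbages data frame from R and show that SAS's PROC GLM "agrees" with R's lm function once you realize they use different reference categories: R treats the first level of each factor (c39, d16) as the reference, while PROC GLM treats the last (c52, d21). For example, the R intercept of 63.334 equals the SAS intercept plus the SAS estimates for c39 and d16: 77.655 - 10.135 - 4.186 = 63.334.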
proc iml;
submit / R;
library(MASS, lib.loc=.Library)
endsubmit;
call ImportDataSetFromR("cabbages", "cabbages");
quit;
proc glm data=cabbages;
class Cult Date;
model VitC = Cult Date HeadWt / solution;
run;
[1] "Finishing Rprofile.site from C:/Program Files/R/R-3.1.2/etc"
The GLM Procedure
Class Level Information
Class     Levels    Values
Cult           2    c39 c52
Date           3    d16 d20 d21
Number of Observations Read 60
Number of Observations Used 60
The GLM Procedure
Dependent Variable: VitC
                                Sum of
Source             DF          Squares     Mean Square    F Value    Pr > F
Model               4      4035.062033     1008.765508      27.66    <.0001
Error              55      2005.787967       36.468872
Corrected Total    59      6040.850000
R-Square     Coeff Var      Root MSE     VitC Mean
0.667963      10.42096      6.038946      57.95000
Source             DF        Type I SS     Mean Square    F Value    Pr > F
Cult                1      2496.150000     2496.150000      68.45    <.0001
Date                2       909.300000      454.650000      12.47    <.0001
HeadWt              1       629.612033      629.612033      17.26    0.0001
Source             DF      Type III SS     Mean Square    F Value    Pr > F
Cult                1      1303.360264     1303.360264      35.74    <.0001
Date                2       259.433518      129.716759       3.56    0.0353
HeadWt              1       629.612033      629.612033      17.26    0.0001
                                    Standard
Parameter           Estimate           Error   t Value  Pr > |t|
Intercept        77.65535683 B    2.45990290     31.57    <.0001
Cult c39        -10.13496356 B    1.69531779     -5.98    <.0001
Cult c52          0.00000000 B             .         .         .
Date d16         -4.18644031 B    2.01826561     -2.07    0.0427
Date d20         -5.39955164 B    2.11225494     -2.56    0.0134
Date d21          0.00000000 B             .         .         .
HeadWt           -4.41229218      1.06191296     -4.16    0.0001
NOTE: The X'X matrix has been found to be singular, and a generalized
inverse was used to solve the normal equations. Terms whose
estimates are followed by the letter 'B' are not uniquely estimable.
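To see the agreement between the two programs directly, you can ask R to use the same reference categories as PROC GLM and then bring the coefficients back into SAS as a matrix. This is a sketch rather than part of the original example: the names cab, fit, est and b are made up for illustration, relevel() is base R's function for changing a factor's reference level, and ImportMatrixFromR is the matrix counterpart of ImportDataSetFromR.

```sas
proc iml;
submit / R;
  library(MASS, lib.loc=.Library)
  # Work on a copy so the original cabbages data frame is unchanged
  cab <- cabbages
  # Use the last level of each factor as the reference, as PROC GLM does
  cab$Cult <- relevel(cab$Cult, ref = "c52")
  cab$Date <- relevel(cab$Date, ref = "d21")
  fit <- lm(VitC ~ Cult + Date + HeadWt, data = cab)
  # Store the coefficients as a one-column numeric matrix
  est <- as.matrix(coef(fit))
endsubmit;
/* Copy the R matrix est into the IML matrix b and print it */
call ImportMatrixFromR(b, "est");
print b;
quit;
```

With the reference levels matched, the values in b should line up with the nonzero estimates in the PROC GLM parameter table above.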