8 SAS National Language Support

8.1 Intro

SAS allows you to work with a wide variety of language encodings, and provides user interfaces for a handful of these as well. This also means that your SAS output might be encoded in many different ways. However, Rmarkdown assumes that all input is in the UTF-8 encoding (which should accommodate the whole variety).

8.2 Setup

Set up your document by loading SASmarkdown.

library(SASmarkdown)

Using a language encoding that is not your SAS default requires additional setup when SAS is started. This is accomplished by adding a -config option to the SAS command line. In Rmarkdown, this is done through the engine.opts chunk option.

To then read SAS output that is not in the “latin1” encoding into Rmarkdown, use the encoding chunk option.

8.3 Default Language

Your default language depends on how you have SAS configured. In my case, SAS defaults to an English language interface, and a “latin1” (or “wlatin1” on Windows) language encoding.

First, consider an example that gives us the default listing output:

proc means data=sashelp.class(keep=height);
run;
                            The MEANS Procedure

                        Analysis Variable : Height 
 
     N            Mean         Std Dev         Minimum         Maximum
    ------------------------------------------------------------------
    19      62.3368421       5.1270752      51.3000000      72.0000000
    ------------------------------------------------------------------

8.4 French

I can switch to French language output by pointing SAS to the French language configuration file.

sasopts <- "-nosplash -ls 75 -config 'C:/Program Files/SASHome/SASFoundation/9.4/nls/fr/sasv9.cfg'"
knitr::opts_chunk$set(engine.opts=list(sas=sasopts, saslog=sasopts))

Because French output is also encoded in the “latin1” standard, nothing special is required to use the SAS output.

The result is:

proc means data=sashelp.class(keep=height);
run;
                            La procédure MEANS

                       Variable d'analyse : Height 
 
     N         Moyenne         Ec-type         Minimum         Maximum
    ------------------------------------------------------------------
    19      62.3368421       5.1270752      51.3000000      72.0000000
    ------------------------------------------------------------------

And using HTML output:

proc means data=sashelp.class(keep=height);
run;
Variable d'analyse : Height
 N      Moyenne     Ec-type     Minimum     Maximum
19   62.3368421   5.1270752  51.3000000  72.0000000

8.5 Chinese

Switching to a Chinese encoding requires one extra step. Not only do we have to ensure that SAS produces Chinese output (this might be your default, but it is not mine), but we also have to instruct Rmarkdown to transcode from Chinese (in this case, the “gbk” standard) to UTF-8.

So the SAS setup is

sasopts <- "-nosplash -ls 75 -config 'C:/Program Files/SASHome/SASFoundation/9.4/nls/zh/sasv9.cfg'"
knitr::opts_chunk$set(engine.opts=list(sas=sasopts, saslog=sasopts, 
                                       sashtml=sasopts, sashtmllog=sasopts))

This Chinese output is encoded in the “gbk” standard rather than “latin1”, so a chunk option is required for Rmarkdown to use the SAS output properly.

The code chunk looks like this:

```{sas, encoding="gbk"} 
proc means data=sashelp.class(keep=height);
run;
```

The result is:

proc means data=sashelp.class(keep=height);
run;
                              MEANS PROCEDURE

                       分析变量: Height 身高(英寸)
 
   数目            均值          标准差          最小值          最大值
   --------------------------------------------------------------------
     19      62.3368421       5.1270752      51.3000000      72.0000000
   --------------------------------------------------------------------

And with HTML output:

proc means data=sashelp.class(keep=height);
run;
分析变量: Height 身高(英寸)
数目        均值      标准差      最小值      最大值
  19  62.3368421   5.1270752  51.3000000  72.0000000
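
The encoding="gbk" chunk option used above amounts to a transcoding step from GBK to UTF-8. The following is a minimal base R sketch of that kind of conversion, not SASmarkdown's actual implementation. It assumes your platform's iconv supports the name "GBK" (support is platform dependent), and the variable names are only illustrative.

```r
# Two Chinese characters from the output above, "mean", written with
# Unicode escapes so the source stays ASCII.
utf8_text <- "\u5747\u503c"

# Encode as GBK, standing in for the bytes SAS writes in a gbk session.
gbk_text <- iconv(utf8_text, from = "UTF-8", to = "GBK")
nchar(gbk_text, type = "bytes")  # GBK stores each of these characters in 2 bytes

# Transcode back to UTF-8, which is what Rmarkdown expects.
round_trip <- iconv(gbk_text, from = "GBK", to = "UTF-8")
identical(round_trip, utf8_text)
```

If the round trip does not reproduce the original string, or iconv() reports an unsupported conversion, the encoding name is probably not usable on that platform.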

8.6 Documentation

To see what encodings you may use for SASmarkdown, use the function iconvlist(). The result is platform dependent.
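
As a quick check, the sketch below looks up a few candidate names in iconvlist(). The candidate list is just the set of names mentioned in this chapter. On some platforms iconvlist() may return little or nothing, and being listed is no guarantee that a conversion succeeds.

```r
# Candidate names to look up; these are the ones discussed in this chapter.
candidates <- c("latin1", "gbk", "gb2312", "euc-cn")

# Capitalisation in iconvlist() varies by platform, so compare
# case-insensitively.
listed <- toupper(candidates) %in% toupper(iconvlist())
names(listed) <- candidates
listed
```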

To see what encodings SAS is capable of producing, see “Encoding Values for a SAS Session”. The options again depend on your operating system (platform).

To see what encoding SAS is using, run PROC OPTIONS and check the log:

2          proc options option=encoding;
3          run;
     SAS (R) PROPRIETARY SOFTWARE RELEASE 9.4  TS1M6

 ENCODING=EUC-CN   指定 SAS 会话的默认字符集编码。

Be aware that the encoding name given by SAS may not match the encoding name used by R. Here, for instance, SAS calls its encoding “EUC-CN”, but that value fails in R (despite appearing in the iconvlist() output).

After a little experimentation, I find “gb2312” and “gbk” (an expanded version of “gb2312”) both appear to work. (But I don’t read or speak any Chinese language, so someone tell me if my choice is wrong!) I came to these guesses by

  • looking at the output file with the Notepad++ text editor, which guessed the file is “gb2312”
  • and by saving SAS HTML output and looking at the character set it declared, which was “gbk”.
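
This kind of experimentation can also be done from R. The sketch below uses a hypothetical helper, try_encodings() (not part of SASmarkdown), to report which candidate names convert a string to UTF-8 without an error or an NA result. It assumes your iconv supports "GBK" for building the sample text.

```r
# Hypothetical helper (not part of SASmarkdown): report which candidate
# encoding names convert a string to UTF-8 without an error or NA.
try_encodings <- function(text, candidates) {
  vapply(candidates, function(enc) {
    converted <- tryCatch(
      iconv(text, from = enc, to = "UTF-8"),
      error = function(e) NA_character_  # unsupported encoding name
    )
    !is.na(converted)                    # NA means the bytes are invalid for `enc`
  }, logical(1))
}

# Stand-in for a line of SAS output: two Chinese characters as GBK bytes.
sample_text <- iconv("\u5747\u503c", from = "UTF-8", to = "GBK")

try_encodings(sample_text, c("gbk", "gb2312", "latin1"))
```

A TRUE here only means the bytes are valid in that encoding, not that the text is right. A permissive single-byte encoding such as “latin1” will typically convert as well and produce garbled characters, so the converted output still needs to be inspected by eye.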

Written using

  • SASmarkdown version 0.8.0.
  • knitr version 1.40.
  • R version 4.2.2 (2022-10-31 ucrt).