# Writing Dynamic Markdown Documents Using Stata (DRAFT)
## Comparing *stmd*, *dyndoc*, *markstat*, *markdoc*, and *webdoc*
#### Doug Hemken
#### `stata c(current_date)`

- [Software Requirements](#software-requirements)
- [Workflow](#workflow)
- [Discussion](#discussion)
- [Code Blocks](#code-blocks)
- [In-line Code](#display-inline)
- [Graphs](#graphs)
- [Tables](#tables)
- [Formulas](#formulas)
- [Processing](#processing)

In Stata there are several commands available to generate HTML documents from dynamically specified Markdown source documents. As of Stata 15, `dyndoc` is the official Stata command for this task. The `stmd` command was developed to let users write documents in standard Markdown style while using many of the underlying `dyndoc` capabilities. `markstat` and `markdoc` are previously released user-written commands from Germán Rodríguez and E.F. Haghish, respectively.

Another command, `webdoc` from Ben Jann, takes dynamically specified Markdown source documents and generates plain Markdown documents. A second step is then required to convert plain Markdown to HTML. This can be accomplished with either the recently released `markdown` command, or the `pandoc` command (installed with `markdoc`).

The central function of these commands is to take a file containing both text written in Markdown and code written in Stata, and produce an HTML document that includes the text, the code, and the results from the code. (`markstat` and `markdoc` both produce documents in other formats as well.)

The fundamental dynamic features of these commands boil down to executing and displaying the results of:

- blocks of Stata code
- the Stata `display` command, in line with other text
- Stata graphics commands

## Software Requirements

- `dyndoc` has everything it needs (in Stata 15).
- `markstat` and `markdoc` both require installing *pandoc*, and specifying its location.
- `markstat` additionally requires the `whereis` package. `markdoc` requires the `weaver` and `statax` packages, as well as *wkhtmltopdf*.
- `webdoc` requires either *pandoc* or Stata 15.

For users working in a computing environment where they do not have administrator rights to install software, or users who are not familiar with specifying the paths to executables, these are hurdles.

## Workflow

- `dyndoc` and `markstat` both embed Stata code within the Markdown text.
- `markdoc` and `webdoc` embed the Markdown text within Stata code comments.

The differences between these commands are great enough that you must choose among them once you begin mixing text and code - they are decidedly ***not*** alternative engines for rendering documents from the same file.

For documentation of simple coding tasks, I find it pretty straightforward to simply begin writing a document in Markdown, including the Stata code as I go. Both `dyndoc` and `markstat` are well suited to this writing workflow.

For documentation of tasks that require some coding effort, I usually find myself writing just the code first, often interspersed with comments that will eventually be fleshed out as text. `markdoc` and `webdoc` seem more oriented toward this writing workflow.

If you are writing your document in the Stata do-file editor -- which does not recognize ***any*** of these formats -- you can conveniently test selections of your code interactively at any point in the writing process.

When you are nearing the end of the process of combining code and text, you will probably appreciate the format that is visually the simplest - this is, after all, one of the main points of Markdown.

For working in the do-file editor, it may be convenient to turn off syntax highlighting (Edit - Preferences), especially to work with `markdoc` and `webdoc` files where plain text (in Stata comments) is rendered green.
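
Since the choice of workflow depends partly on which commands are available, it may help to check what is already installed. The following is a minimal sketch, not part of any of these packages: it assumes Stata 15 or later (for `dyndoc`) and simply asks Stata's `which` command whether each command can be found. The macro names `cmds` and `missing` are illustrative only.

```stata
* Sketch: report which of the documentation commands Stata can find.
* The macro names (cmds, missing) are illustrative, not from any package.
local cmds dyndoc markstat markdoc webdoc whereis
local missing
foreach cmd of local cmds {
    capture which `cmd'
    if _rc {
        display as text "`cmd' was not found"
        local missing `missing' `cmd'
    }
    else {
        display as text "`cmd' is available"
    }
}
display as text "Not found: `missing'"
```

This checks only Stata commands; *pandoc* and *wkhtmltopdf* are external programs and have to be located separately.
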

## Discussion

The spirit of Markdown is to have a written format that provides a few formatting options (headers, lists, code blocks, image and url links), but that is nearly as readable before processing as after. Adding code to be processed necessarily makes this a little more complicated - but not that much. A dynamic document should look pretty much like a pure Markdown document, without the results and graphs.

Of these commands, `markstat` is clearly the best at keeping the ink on the page to a minimum, corralling the visual clutter. It gives the user the best source documents to work with.

`dyndoc` is the clear winner on simplicity of processing syntax (the `dyndoc` command itself) and use. No extra installation is required, and the user need not know anything about locating and specifying executables on their computer(s).

`dyndoc` also produces the least file clutter. `markstat`, `markdoc`, and `webdoc` all leave intermediate files littering your directories. While these files can come in handy at times, keeping them should be an option, not a requirement.

On the other hand, `markstat` and `markdoc` are both capable of taking the same source file and creating documents in other formats, notably `.pdf`.

None of these formats has an accompanying utility command to extract the plain Stata code for use in a live demonstration. `markstat`, however, produces a do file as a side effect (but this requires running the do file first).

To my mind, the ideal dynamic documentation command and document format would combine the simplicity of `markstat` (which most closely conforms to Markdown standards for non-Stata languages), with the simple installation and lack of file clutter of `dyndoc`, and the flexibility of output formats provided by `markstat` and `markdoc`.

Take a look at some of the details, and see if you don't agree.

## Code Blocks

A dynamic document is composed of text and code to be executed.
It is dynamic in the sense that, after writing, the document is processed to produce the final version for reading. To be processed by Stata, some distinction has to be made between text and code. Each of these commands does this in a different style.

### `dyndoc`

For use with `dyndoc`, code blocks begin with `<<dd_do>>` and end with `<</dd_do>>`, and are usually formatted with Markdown code fences, like this:

~~~~
Code               | With Context
-------------------|-------------------
                   | Some text.
```                | ```
<<dd_do>>          | <<dd_do>>
sysuse auto        | sysuse auto
<</dd_do>>         | <</dd_do>>
```                | ```
                   | More text.
~~~~

The result in your document would be rendered as:

```
<<dd_do>>
sysuse auto
<</dd_do>>
```

### `markstat` ` ```{s} `

For use with `markstat`, you can demarcate code blocks several different ways. Perhaps the clearest, visually, is to use backticks marked with an `{s}`, as in:

~~~~
Code               | With Context
-------------------|-------------------
                   | Some text.
```{s}             | ```{s}
sysuse auto        | sysuse auto
```                | ```
                   | More text.
~~~~

The braces are optional, for an even cleaner look. And if you are willing to give up some Markdown formatting features for lists, you can simply use indentation and blank lines to demarcate code. In context:

```
Some text.

    sysuse auto

More text.
```

Uniquely, `markstat` allows you to use an "m" instead of an "s" for the code fence "info tag", to work directly in Mata.

This visual style, and the use of the info tag to signify a code language, makes `markstat` the most in sync with non-Stata dynamic Markdown use.

### `markdoc` ` /*** `

With `markdoc`, the Stata code is written in ordinary .do file style, the text is demarcated, and the first step of processing is to produce a Stata log file in `smcl` format. Our last example would look like this:

```
qui log using somefile
/***
Some text.
***/
sysuse auto
/***
More text.
***/
qui log c
```

~~~~
Code                       | With Context
---------------------------|-------------------
quietly log using somefile | quietly log using somefile
                           | /***
                           | Some text.
                           | ***/
sysuse auto                | sysuse auto
                           | /***
                           | More text.
                           | ***/
qui log c                  | qui log c
~~~~

(It is important that the final `log close` be abbreviated as above.)

### `webdoc` ` /*** `

As with `markdoc`, for `webdoc` the Stata code is written in ordinary .do file style, and the text is demarcated. Here the first step produces a Markdown document, i.e. all of the dynamic elements are resolved and replaced. A second step is then required to take this to an HTML document. Our example would look like this:

~~~~
Code                                  | With Context
--------------------------------------|-------------------
webdoc init example, logall plain md  | webdoc init example, logall plain md
                                      | /***
                                      | Some text.
                                      | ***/
sysuse auto                           | sysuse auto
                                      | /***
                                      | More text.
                                      | ***/
~~~~

## *Display* inline

In addition to showing your results in separate code blocks in your final document, you can also use the results of code in line in your text.

### `dyndoc` `<<dd_display>>`

With `dyndoc`, anything that can be returned by the `display` command can be included in a line with text. For example

```
Today's date is <<dd_display: c(current_date)>>.
```

would appear as:

Today's date is <<dd_display: c(current_date)>>.

### `markstat` `` `s ``

With `markstat` this is visually simpler. Code is just demarcated with backticks and an "s" info tag.

```
Today's date is `s c(current_date)`.
```

### `markdoc` `txt` command

In `markdoc` format, in-line text and results are added to your document by a command embedded within Stata code.

```
***/
txt "Today's date is " c(current_date) "."
/***
```

### `webdoc` `webdoc substitute` command

In `webdoc` format, results can be substituted into in-line text by a command embedded within Stata code.

```
webdoc substitute "XXX" "`c(current_date)'"
/***
Today's date is XXX
***/
```

## Graphs

In addition to the sorts of results you might find in the Results window, you may also want to include a graph in your document.
None of these commands will completely automate that for you; that is, they will not detect that you have issued a graph command or sent output to a graph window. However, not too much extra work is required.

For `dyndoc`, all you have to do (after creating the graph) is include a dynamic tag where you want the graph to appear in your document.

For `markstat` and `markdoc`, you will need to save the graph as a file, then add a link to the file in your document. A small amount of extra effort is required to hide the `graph export` from the reader.

Suppose we first make a graph with:

```
<<dd_do>>
histogram price
<</dd_do>>
```

### `dyndoc` `<<dd_graph>>`

Include the graph with:

```
<<dd_graph>>
```

The result then will be:

<<dd_graph>>

### `markstat` `graph export`

For `markstat`, you would first save the graph using the `graph export` command in Stata, then include an image link in your Markdown text.

~~~~
```{s}
graph export hist_ms.svg, replace
```
Some text.

![Prices](hist_ms.svg)
~~~~

### `markdoc` `graph export`

Like `markstat`, first save the graph, then link it.

~~~~
***/
graph export hist_md.svg, replace
/***
Some text.

![Prices](hist_md.svg)
~~~~

### `webdoc` `webdoc graph`

`webdoc` is similar to `dyndoc`, in that you just include a directive where you want the graph to appear in the document - you do not need to include code to save the graph first. This directive appears as Stata code.

~~~~
Some text.
***/
webdoc graph
/***
~~~~

## Example Files

Simple documents summarizing what we have covered so far:

[review-dyndoc.smd](review-dyndoc.smd) and [review-dyndoc.html](review-dyndoc.html)

Note the file extension for `markstat` ***must*** be `.stmd`

[review-markstat.stmd](review-markstat.stmd) and [review-markstat.html](review-markstat.html)

[review-markdoc.do](review-markdoc.do) and [review-markdoc.html](review-markdoc.html)

[review-webdoc.do](review-webdoc.do) and [review-webdoc.html](review-webdoc.html)

## Tables

Uniquely, `dyndoc` is able to place *some* Stata output tables in the document text, rendered as *html tables*. Instead of just

```
<<dd_do>>
tabulate rep78 foreign
<</dd_do>>
```

You use (note the `markdown` option and the lack of code fences)

```
<<dd_do>>
tabulate rep78 foreign, markdown
<</dd_do>>
```

To produce

<<dd_do>>
tabulate rep78 foreign, markdown
<</dd_do>>

The same effect can be achieved with the other commands, but not nearly as simply (for those commands where Stata currently supports the `markdown` option).

[`markstat`](http://data.princeton.edu/stata/markdown/tables) requires you to specify each cell value as inline code, which takes a lot of hidden code to set up.

[`markdoc`](https://github.com/haghish/MarkDoc/wiki/tbl), like `markstat`, requires specifying each cell value as inline code, but perhaps simplifies this a little with a special `tbl` command.

`webdoc`, geared as it is to produce Markdown, would be very similar to `markstat`, relying on substitution rather than in-line code.

And it should be noted that for *arbitrary* tables in `dyndoc`, the same tedious approach would be required - this is only partially implemented in Stata, and is so far largely undocumented as well. Some commands that produce output tables formatted in Markdown are:

- `_coef_table, markdown`
- `estimates table [stores], markdown`
- `tabulate var1 [var2], markdown`
- `table [specification], markdown`

## Formulas

If your web site serves MathJax, all these commands will pass formulas along to your web site.
A display formula like:

```
$$ mpg = \beta_{0} + \beta_{1} \times weight $$
```

becomes:

$$ mpg = \beta_{0} + \beta_{1} \times weight $$

Since the output of any of these commands could be rendered from Markdown to HTML by Pandoc, they can all use Pandoc to render formulas in other ways as well (e.g. Unicode). Except in graphs, you would *not* use `SMCL`.

## Processing

### `dyndoc`

`dyndoc` requires no additional packages or software, and is processed with:

```
dyndoc review-dyndoc.smd, replace
```

### `markstat`

`markstat` requires installing the additional `whereis` package and the pandoc software, so I am glossing over several steps here. Once these are installed, before your first use of `markstat` you issue the command:

```
whereis pandoc "C:\Program Files\RStudio\bin\pandoc\pandoc.exe"
```

Thereafter you use `markstat` with:

```
markstat using review-markstat.stmd, strict
```

### `markdoc`

`markdoc` also requires additional packages and software - this command takes the most effort to set up. Once everything is installed, processing a document means processing a do file to create a log, then processing the log.

```
do review-markdoc.do
markdoc review-markdoc, export(html) pandoc("C:\Program Files\RStudio\bin\pandoc\pandoc.exe")
```

### `webdoc`

`webdoc` also requires two steps for every document, like `markdoc`.

```
webdoc do review-webdoc.do
markdown review-webdoc.md, saving(review-webdoc.html)
```
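
When processing documents repeatedly, a small guard can avoid a confusing error if the source file is not in the working directory. This is a minimal sketch for the `dyndoc` case only, reusing the example file name above; the macro name `src` is illustrative, and the same pattern could presumably be adapted to the other commands.

```stata
* Sketch: run dyndoc only if the source file exists in the working directory.
* The macro name src is illustrative; the file name follows the example above.
local src "review-dyndoc.smd"
capture confirm file "`src'"
if _rc == 0 {
    dyndoc "`src'", replace
}
else {
    display as error "`src' not found in the working directory"
}
```
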