
literate programming (documents with code): knitr, R Markdown, Sweave, Latex, etc.

paciorek edited this page Sep 25, 2014 · 2 revisions

[NOTE for Fall 2014: the examples below use Sweave (.Rnw) formatting, but we've asked you to use knitr-style (.Rtex) formatting. So you'll need to translate what appears below from the Sweave (.Rnw) style to the knitr (.Rtex) style in your coding.]

Summary (Oct. 18, 2013)

I've summarized my current understanding of using R/bash/Python chunks in the form of a test Sweave (.Rnw) document and accompanying R code file and compiled PDF (the files can also be found in your local git repository for the class under 'literateProgramming'). This demonstrates some of the cases that cause lines to overflow off the page and some ways to avoid that, though in some cases I did not find a fix or a workaround. Suggestions are welcome!

Note that when I converted the testRnw.Rnw file to an Rtex file, the results were the same in terms of what formatted well and what did not.

Controlling whether chunks of code are included in the PDF and whether they are evaluated in R

To run a chunk of code but not have the code included in the document, you can do:

<<myChunk, include=FALSE>>=
a <- 7
@

To include the code in the document but not run/evaluate the R code, you can do:

<<myChunk2, eval=FALSE>>=
b <- 7
@
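
Since the note at the top asks for knitr (.Rtex) style, here is a sketch of how the two Sweave-style chunks above might look after translation. It follows the same %% begin.rcode / %% end.rcode pattern used in the line-break examples further down this page; the chunk labels are simply carried over.

```
%% begin.rcode myChunk, include=FALSE
% a <- 7
%% end.rcode

%% begin.rcode myChunk2, eval=FALSE
% b <- 7
%% end.rcode
```

In the .Rtex style every line of the chunk, including the R code itself, starts with a % so that the file is still a valid LaTeX file before knitting.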

Including a code chunk where the code is in another file

If you want to keep all your code in a separate file and just pull it into your document on the fly, see the demo above, where "chunk5" lives in the .R file and is then referred to by name in the .Rnw file. You need to call "read_chunk" on the .R file at the beginning of the .Rnw file.
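
As a rough sketch of that setup (the file name myCode.R and the code inside chunk5 are made up for illustration, not taken from the demo files), the .R file marks the start of each named chunk with a special comment:

```r
## ---- chunk5
x <- rnorm(10)
mean(x)
```

The .Rnw file then reads the .R file once near the top and pulls the chunk in with an empty chunk of the same name:

```
<<readCode, include=FALSE>>=
library(knitr)
read_chunk('myCode.R')  # hypothetical file name
@

<<chunk5>>=
@
```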

Dealing with line breaks

  • To split long lines of bash code within an Rnw or Rtex file try this:
    echo "my long line\
    of stuff" > tmp.txt
  • or this to split outside of a string. (Make sure there is no space after the "\")
    echo "my long line"\
    > tmp2.txt
  • You can break lines of R code in many places.
  • You can also try the tidy=TRUE chunk option, which asks knitr to reformat the code in the chunk:
    %% begin.rcode, tidy=TRUE
    % print(mean(c(12341234,12341234,23412,3431234,21341234,1234,1234,1234,12341234)))
    % x=rnorm(5)
    %% end.rcode
  • Another thing to try is to set the width in a setup chunk before the \begin{document}
    %% begin.rcode setup, include=FALSE
    % options(width=60)
    %% end.rcode
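
To illustrate the point above about breaking lines of R code by hand, here is a minimal sketch (the variable names are made up). R keeps reading onto the next line as long as the expression is incomplete, so it is safe to break after an operator, after a comma, or inside open parentheses.

```r
# Break after an operator: the trailing + tells R the expression continues.
total <- 12341234 + 23412 +
  3431234

# Break after a comma inside a function call.
avg <- mean(c(12341234, 12341234, 23412,
              3431234, 21341234, 1234))

# Build a long string from shorter pieces instead of one long literal.
msg <- paste("my long line",
             "of stuff")
```

Breaking before the operator instead (starting the second line with +) would not work, because the first line would already be a complete expression.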

Some tips on LaTeX and knitr

  • LaTeX: To suppress numbering for a section, add an asterisk between the command name and the opening curly brace. For instance:
\section*{Introduction}
  • LaTeX: Use the verbatim environment to display text exactly as you typed it. For instance:
\begin{verbatim}
x <- c(1, rep(1e-16, 1e4))
x.sum <- x[1]
for(i in 2:length(x)){
  x.sum <- x.sum + x[i]
}
print(x.sum)
\end{verbatim}

  • knitr: Using other languages in knitr
<<engine='bash'>>=
cd ~/Desktop
wget "http://www.gutenberg.org/cache/epub/100/pg100.txt"
@

<<engine='python'>>=
def testing():
    return "Hello, World!"
print(testing())
@

Matt's contribution on breaking lines

Put the following as the first chunk in your Rnw document. It adjusts the settings for all of the chunks that follow, so that knitr runs the code and then passes both the source code and the output to the listings package for formatting.

    <<setup, include=FALSE, cache=FALSE>>=
    library(knitr)
    opts_chunk$set(fig.path = 'figure/listings-')
    options(replace.assign = TRUE)
    render_listings()
    @

This defaults to a kind of strange color scheme for the actual PDF output. You can change the fonts and the colors by adjusting the listings package preferences in the Sweavel.sty file that is created after you knit the document for the first time. This link includes a list of some of the settings that you can change: http://en.wikibooks.org/wiki/LaTeX/Source_Code_Listings#Using_the_listings_package

Note that if you look at the Sweavel.sty file, there is by default a "breaklines=true" setting that is actually causing the line breaking.

Dealing with blank lines

I have a workaround for leaving blank lines between pieces of code in an R chunk. First, set the background color of the chunk to white (the same as the background of the rest of the document) by using the option background='#FFFFFF' in the chunk initialization line. Then, when you want an empty line, end the chunk and use a command like \vspace{\baselineskip} or \newline to create a blank line in the LaTeX portion of the document, and start a new chunk for the code that follows. I haven't looked into other solutions, since this issue is not something I've had to deal with myself. I use a Mac, but with RStudio and knitr. So even though the person who was having issues is using Windows, I think that since they use RStudio and knitr as well, it should work... but I might be wrong. Tom
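
A sketch of what that workaround might look like in an Rnw file is below. The chunk labels and the code are made up, and \vspace{\baselineskip} is just one way to get roughly a line's worth of vertical space between the two chunks; treat it as an assumption to adjust to taste.

```
<<part1, background='#FFFFFF'>>=
a <- 7
@

\vspace{\baselineskip}

<<part2, background='#FFFFFF'>>=
b <- a + 1
@
```

Because both chunks have a white background, the gap between them should look like a blank line inside a single block of code.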

Sweave demos

I used two different ways of compiling my PDF files, and I find Sweave (.Rnw) much handier, since we don't need to save the file as a .Rtex and RStudio can handle all of the compiling. I found a website with some very useful demos, where you can find demo.rnw as well as its compiled PDF file: http://users.stat.umn.edu/~geyer/Sweave/ . Just remember to start your code with <<...>>= and end it with @. I think I have encountered fewer problems using Sweave.

Jimmy Gu

Tessa's overview of document creation

Here are some thoughts.

  1. LaTeX is a typesetting language that takes a source (text) file in which the user has typed raw text along with commands that denote more complicated structures such as equations, Greek letters, mathematical symbols, tables, and figures. This source file is saved as a .tex file, e.g. 'filename.tex'. Executing 'pdflatex filename.tex' at the command line outputs a PDF file containing the nicely formatted text and beautifully rendered mathematical symbols. This typesetting step can also be accomplished using a GUI like TeXShop or TeXnicCenter, which will take any .tex file and turn it into a .pdf file.
    
  2. R is a computing language that takes a source (text) file in which the user has typed code that uses functions to create objects. These objects may be displayed as numerical values, plots, or other kinds of output, which a user may wish to include in a nice report, as generated by LaTeX.
    
  3. Each of the languages above has its own syntax and its own file structure. Traditionally, including output from R in LaTeX is done by copying and pasting, which is inefficient and error prone.
  4. Sweave and knitr are ways of weaving or knitting together the two types of sources in order to produce a unified output. Sweave (and its .Rnw extension) predates knitr (and its .Rtex extension).
  5. Making use of these begins by creating a source (text) file, with either the .Rnw or .Rtex extension. The file contains raw text, some of which refers to things that should simply be typeset and formatted (such as mathematical equations, or written commentary about what is being done), and some of which should be evaluated and placed into the formatted document (such as code snippets, numerical values, or plots). The text file must somehow indicate which items are to be simply typeset and formatted (as would appear in a regular .tex file), and which bits need to be executed before being typeset (as in a .R file). This syntax is what differs between .Rnw and .Rtex. I would suggest you look at the knitr demo page for examples.
  6. As the homework assignment says, knitting the .Rnw or .Rtex file in R will output a LaTeX file. This is a standard .tex file, as explained in (1). You might try opening that .tex file to see what it looks like. Look for similarities and differences between your original .Rnw/.Rtex and the new .tex file.
    
  7. This new .tex file is still simply a file full of text, waiting to be turned into a .pdf document with nice symbols, plots, etc. As explained in (1), the .tex file must be compiled/typeset into a .pdf, in one of the manners explained there.
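
The two steps in (6) and (7) can also be run from R itself. The helper below is only a sketch: build_pdf is a made-up name, and it assumes the knitr package and a LaTeX installation are available when the function is actually called.

```r
# Hypothetical helper: knit a .Rnw/.Rtex file into a .tex file,
# then compile that .tex file into a PDF.
build_pdf <- function(input) {
  tex_file <- knitr::knit(input)   # step (6): returns the path of the .tex file it wrote
  tools::texi2pdf(tex_file)        # step (7): runs LaTeX on the .tex file
  # texi2pdf writes the PDF into the current working directory
  sub("\\.tex$", ".pdf", basename(tex_file))
}

# Example call (not run here): build_pdf("filename.Rtex")
```

Running 'pdflatex filename.tex' at the command line, as described in (1), does the same job as the second step.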