A simple workflow for using R with Microsoft Office products

The challenge of integrating Microsoft products with R software has been an outstanding issue for several years. Reasons for these issues are complicated and related to fundamental differences in developing proprietary vs open-source products. To date, I don’t believe there has been a satisfactory solution but I present this blog as my attempt to work around at least some of the issues using the two. As a regular contributor to R-bloggers, I stress that one should use MS products as little as possible given the many issues that have been described (for example, here, here, and here). It’s not my intent to pick on Microsoft. In fact, I think Excel is a rather nifty program that has its place in specific situations. However, most of my work is not conducive to the point-and-click style of spreadsheet analysis and the surprising limited number of operations available in Excel prevent all but the simplest analyses. I try my best to keep my work within the confines of RStudio, given its integration with multiple document preparation systems.

I work with several talented researchers that have different philosophies than my own on the use of Microsoft products. It’s inevitable that we’re occasionally at odds. Our difficulties go both directions — my insistence on using pdfs for creating reports or manuscripts and the other party’s inclination towards the spreadsheet style of analysis. It seems silly that we’re limited by the types of medium we prefer. I’ve recently been interested in developing a workflow that addresses some of the issues of using end-products from different sources under the notion of reproducibility. To this end, I used Pandoc and relevant R packages (namely gdata and knitr) to develop a stand-alone workflow that allows integration of Microsoft products with my existing workflows. The idea is simple. I want to import data sent to me in .xlsx format, conduct the analysis and report generation entirely within RStudio, and convert the output to .docx format on completion. This workflow allows all tasks to be completed within RStudio, provided the supporting documents, software, and packages work correctly.

Of course, I don’t propose this workflow as a solution to all issues related to Office products and R. I present this material as a conceptual and functional design that could be used by others with similar ideas. I’m quite happy with this workflow for my personal needs, although I’m sure it could be improved upon. I describe this workflow using the pdf below and provide all supporting files on Github: https://github.com/fawda123/pan_flow.

\documentclass[xcolor=svgnames]{beamer}
\usetheme{Boadilla}
\usecolortheme[named=SeaGreen]{structure}
\usepackage{graphicx}
\usepackage{breqn}
\usepackage{xcolor}
\usepackage{booktabs}
\usepackage{verbatim}
\usepackage{tikz}
\usetikzlibrary{shadows,arrows,positioning}
\definecolor{links}{HTML}{2A1B81}
\hypersetup{colorlinks,linkcolor=links,urlcolor=links}
\usepackage{pgfpages}

\tikzstyle{block} = [rectangle, draw, text width=9em, text centered, rounded corners, minimum height=3em, minimum width=7em, top color = white, bottom color=brown!30,  drop shadow]

\newcommand{\ShowSexpr}[1]{\texttt{{\char`\\}Sexpr\{#1\}}}

\begin{document}

\title[R with Microsoft]{A simple workflow for using R with Microsoft products}
\author[M. Beck]{Marcus W. Beck}

\institute[USEPA NHEERL]{USEPA NHEERL Gulf Ecology Division, Gulf Breeze, FL\\
Email: \href{mailto:beck.marcus@epa.gov}{beck.marcus@epa.gov}, Phone: 850 934 2480}

\date{May 21, 2014}

%%%%%%
\begin{frame}
\vspace{-0.3in}
\titlepage
\end{frame}

%%%%%%
\begin{frame}{The problem...}
\begin{itemize}
\item R is great and has an increasing user base\\~\\
\item RStudio is integrated with multiple document preparation systems \\~\\
\item Output documents are not in a format that facilitates collaboration with 
non R users, e.g., pdf, html \\~\\
\item Data coming to you may be in a proprietary format, e.g., xls spreadsheet
\end{itemize}
\end{frame}

%%%%%%
\begin{frame}{The solution?}
\begin{itemize}
\item Solution one - Make liberal use of `projects' within RStudio \\~\\
\item Solution two - Use \texttt{gdata} package to import excel data \\~\\
\item Solution three - Get pandoc to convert document formats - \href{http://johnmacfarlane.net/pandoc/}{http://johnmacfarlane.net/pandoc/} \\~\\
\end{itemize}
\onslide<2->
\large
\centerline{\textit{Not recommended for simple tasks unless you really, really love R}}
\end{frame}

%%%%%
\begin{frame}{An example workflow}
\begin{itemize}
\item I will present a workflow for integrating Microsoft products within RStudio as an approach to working with non R users \\~\\
\item Idea is to never leave the RStudio environment - dynamic documents! \\~\\
\item General workflow... \\~\\
\end{itemize}
\small
\begin{center}
\begin{tikzpicture}[node distance=2.5cm, auto, >=stealth]
  \onslide<2->{
	\node[block] (a) {1. Install necessary software and packages};}
	\onslide<3->{
	\node[block] (b)  [right of=a, node distance=4.2cm] {2. Create project in RStudio};
 	\draw[->] (a) -- (b);}
 	\onslide<4->{
 	\node[block] (c)  [right of=b, node distance=4.2cm]  {3. Setup supporting docs/functions};
 	\draw[->] (b) -- (c);}
   \onslide<5->{
   \node[block] (d)  [below of=a, node distance=2.5cm]  {4. Import with \texttt{gdata}, summarize};
 	\draw[->] (c) -- (d);}
   \onslide<6->{
   \node[block] (e)  [right of=d, node distance=4.2cm]  {5. Create HTML document using knitr Markdown};
 	\draw[->] (d) -- (e);}
     \onslide<7->{
   \node[block] (f)  [right of=e, node distance=4.2cm]  {6. Convert HTML doc to Word with Pandoc};
   \draw[->] (e) -- (f);}
\end{tikzpicture}
\end{center}
\end{frame}

%%%%%%
\begin{frame}[shrink]{The example}
You are sent an Excel file of data to summarize and report but you love R and want to do everything in RStudio...
<<echo = F, results = 'asis', message = F>>=
library(gdata)
library(xtable)

prl_pth <- 'C:/strawberry/perl/bin/perl.exe'
url <- 'https://beckmw.wordpress.com/wp-content/uploads/2014/05/my_data.xlsx'
dat <- read.xls(xls = url, sheet = 'Sheet1', perl = prl_pth)
out.tab <- xtable(dat, digits=4)
print.xtable(out.tab, type = 'latex', include.rownames = F, 
  size = 'scriptsize')
@
\end{frame}

%%%%%%
\begin{frame}{Step 1}
Install necessary software and Packages \\~\\
\onslide<1->
\begin{itemize}
\onslide<2->
\item R and RStudio (can do with other R editors)\\~\\
\item Microsoft Office\\~\\
\onslide<3->
\item Strawberry Perl for using \texttt{gdata} package\\~\\
\item Pandoc\\~\\
\onslide<4->
\item Packages: \texttt{gdata}, \texttt{knitr}, \texttt{utils}, \texttt{xtable}, others as needed...
\end{itemize}
\end{frame}

%%%%%%
\begin{frame}{Step 2}
Create a project in RStudio \\~\\
\begin{itemize}
\item Create a folder or use existing on local machine \\~\\
\item Add .Rprofile file to the folder for custom startup \\~\\
\item Move all data you are working with to the folder \\~\\
\item Literally create project in RStudio \\~\\
\item Set options within RStudio \\~\\
\end{itemize}
\end{frame}

%%%%%%
\begin{frame}[fragile]{Step 3}
Setup supporting docs/functions, i.e., .Rprofile, functions, report, master
\scriptsize
\begin{block}{.Rprofile}
<<echo = T, eval = F, results = 'markup'>>=
# library path
.libPaths('C:\\Users\\mbeck\\R\\library')

# startup message
cat('My project...\n')

# packages to use
library(utils) # for system commands
library(knitr) # for markdown
library(gdata) # for import xls
library(reshape2) # data format conversion
library(xtable) # easy tables
library(ggplot2) # plotting

# perl path for gdata
prl_pth <- 'C:/strawberry/perl/bin/perl.exe'

# functions to use
source('my_funcs.r')
@
\end{block}
\end{frame}

%%%%%%
\begin{frame}[t, fragile]{Step 3}
Setup supporting docs/functions, i.e., .Rprofile, functions, report, master
\scriptsize
\begin{block}{my\_funcs.r}
<<echo = T, eval = F, results = 'markup'>>=
######
# functions for creating report,
# created May 2014, M. Beck

######
# processes data for creating output in report,
# 'dat_in' is input data as data frame,
# output is data frame with converted variables
proc_fun<-function(dat_in){
  
  # convert temp to C
  dat_in$Temperature <- round((dat_in$Temperature - 32) * 5/9)
  
  #  convert data to long format
  dat_in <- melt(dat_in, measure.vars = c('Restoration', 'Reference'))
  
  return(dat_in)
  
}

######
# creates linear model for data,
# 'proc_dat' is processed data returned from 'proc_fun',
# output is linear model object
mod_fun <- function(proc_in) lm(value ~ variable + Year, dat = proc_in)
@
\end{block}
\end{frame}

%%%%%%
\begin{frame}[fragile,shrink]{Step 3}
Setup supporting docs/functions, i.e., .Rprofile, functions, report, master
\scriptsize
\begin{block}{report.Rmd}
\begin{verbatim}
======================
Here's a report I made for `r gsub('/|.xlsx','',name)`
----------------------

```{r echo=F, include=F}  
# import data
url <- paste0('https://beckmw.wordpress.com/wp-content/uploads/2014/05', name)
dat <- read.xls(xls = url, sheet = 'Sheet1', perl = prl_pth)

# process data for tables/figs
dat <- proc_fun(dat)

# model of data
mod <- mod_fun(dat)
```

### Model summary
```{r results='asis', echo=F}
print.xtable(xtable(mod, digits = 2), type = 'html')
```

### Figure of restoration and reference by year
```{r reg_fig, echo = F, fig.width = 5, fig.height = 3, dpi=200}
ggplot(dat, aes(x = Year, y = value, colour = variable)) + 
  geom_point() +
  stat_smooth(method = 'lm')
```
\end{verbatim}
\end{block}
\end{frame}

%%%%%%
\begin{frame}[t, fragile]{Step 3}
Setup supporting docs/functions, i.e., .Rprofile, functions, report, master
\scriptsize
\begin{block}{master.r}
<<echo = T, eval = F, results = 'markup'>>=
# file to process
name <- '/my_data.xlsx'

# rmd to html
knit2html('report.Rmd')

# pandoc conversion of html to word doc
system(paste0('pandoc -o report.docx report.html'))
@
\end{block}
\end{frame}

%%%%%%
\begin{frame}[fragile]{Steps 4 - 6}
\small
After creating supporting documents in Project directory, final steps are completed by running `master.r'
\begin{itemize}
\item Step 4 - xls file imported using \texttt{gdata} package, implemented in `report.Rmd'
\item Step 5 - HTML document created by converting `report.Rmd' with \texttt{knit2html} in `master.r'
\item Step 6 - HTML document converted to Word with Pandoc by invoking system command
\end{itemize}
\begin{block}{master.r}
<<echo = T, eval = F, results = 'markup'>>=
# file to process
name <- '/my_data.xlsx'

# rmd to html
knit2html('report.Rmd')

# pandoc conversion of html to word doc
system(paste0('pandoc -o report.docx report.html'))
@
\end{block}
\end{frame}


\end{document}

To use the workflow, start a new version control project through Git in RStudio, pull the files from the repository, and run the master file. An excellent introduction for using RStudio with Github can be found here. I’ve also included two excel files that can be used to generate the reports. You can try using each one by changing the name variable in the master file and then running the commands:

name <- 'my_data.xlsx'
knit2html('report.Rmd')
system(paste0('pandoc -o report.docx report.html'))

or…

name <- 'my_data_2.xlsx'
knit2html('report.Rmd')
system(paste0('pandoc -o report.docx report.html'))

The output .docx file should be different depending on which Excel file you use as input. As the pdf describes, none of this will work if you don’t have the required software/packages, i.e., R/RStudio, Strawberry Perl, Pandoc, MS Office, knitr, gdata, etc. You’ll also need Git installed if you are pulling the files for local use (again, see here). I’d be interested to hear if anyone finds this useful or any general comments on improvements/suggestions for the workflow.

Cheers,

Marcus

Brief introduction on Sweave and Knitr for reproducible research

A few weeks ago I gave a presentation on using Sweave and Knitr under the guise of promoting reproducible research. I humbly offer this presentation to the blog with full knowledge that there are already loads of tutorials available online. This presentation is \LaTeX specific and slightly biased towards Windows OS, so it probably has limited relevance if you are interested in other methods. Anyhow, I hope this is useful to some of you.

Cheers,

Marcus

\documentclass[xcolor=svgnames]{beamer}
%\documentclass[xcolor=svgnames,handout]{beamer}
\usetheme{Boadilla}
\usecolortheme[named=Sienna]{structure}
\usepackage{graphicx}
\usepackage[final]{animate}
%\usepackage[colorlinks=true,urlcolor=blue,citecolor=blue,linkcolor=blue]{hyperref}
\usepackage{breqn}
\usepackage{xcolor}
\usepackage{booktabs}
\usepackage{verbatim}
\usepackage{tikz}
\usetikzlibrary{shadows,arrows,positioning}
\usepackage[noae]{Sweave}
\definecolor{links}{HTML}{2A1B81}
\hypersetup{colorlinks,linkcolor=links,urlcolor=links}
\usepackage{pgfpages}
%\pgfpagesuselayout{4 on 1}[letterpaper, border shrink = 5mm, landscape]

\tikzstyle{block} = [rectangle, draw, text width=7em, text centered, rounded corners, minimum height=3em, minimum width=7em, top color = white, bottom color=brown!30,  drop shadow]

\newcommand{\ShowSexpr}[1]{\texttt{{\char`\\}Sexpr\{#1\}}}

\begin{document}
\SweaveOpts{concordance=TRUE}

\title[Nuts and bolts of Sweave/Knitr]{The nuts and bolts of Sweave/Knitr for reproducible research with \LaTeX}
\author[M. Beck]{Marcus W. Beck}

\institute[USEPA NHEERL]{ORISE Post-doc Fellow\\
USEPA NHEERL Gulf Ecology Division, Gulf Breeze, FL\\
Email: \href{mailto:beck.marcus@epa.gov}{beck.marcus@epa.gov}, Phone: 850 934 2480}

\date{January 15, 2014}

%%%%%%
\begin{frame}
\vspace{-0.3in}
\titlepage
\end{frame}

%%%%%%
\begin{frame}{Reproducible research}
\onslide<+->
In it's most general sense... the ability to reproduce results from an experiment or analysis conducted by another.\\~\\
\onslide<+->
From Wikipedia... `The ultimate product is the \alert{paper along with the full computational environment} used to produce the results in the paper such as the code, data, etc. that can be \alert{used to reproduce the results and create new work} based on the research.'\\~\\
\onslide<+->
Concept is strongly based on the idea of \alert{literate programming} such that the logic of the analysis is clearly represented in the final product by combining computer code/programs with ordinary human language [Knuth, 1992].
\end{frame}

%%%%%%
\begin{frame}{Non-reproducible research}
\begin{center}
\begin{tikzpicture}[node distance=2.5cm, auto, >=stealth]
	\onslide<2->{
	\node[block] (a) {1. Gather data};}
	\onslide<3->{
	\node[block] (b)  [right of=a, node distance=4.2cm] {2. Analyze data};
 	\draw[->] (a) -- (b);}
 	\onslide<4->{
 	\node[block] (c)  [right of=b, node distance=4.2cm]  {3. Report results};
 	\draw[->] (b) -- (c);}
%  	\onslide<5->{
%  	\node [right of=a, node distance=2.1cm] {\textcolor[rgb]{1,0,0}{X}};
%  	\node [right of=b, node distance=2.1cm] {\textcolor[rgb]{1,0,0}{X}};}
\end{tikzpicture}
\end{center}
\vspace{-0.5cm}
\begin{columns}[t]
\onslide<2->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item Begins with general question or research objectives
\item Data collected in raw format (hard copy) converted to digital (Excel spreadsheet)
\end{itemize}
\end{column}}
\onslide<3->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item Import data into stats program or analyze directly in Excel
\item Create figures/tables directly in stats program
\item Save relevant output
\end{itemize}
\end{column}}
\onslide<4->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item Create research report using Word or other software
\item Manually insert results into report
\item Change final report by hand if methods/analysis altered
\end{itemize}
\end{column}}
\end{columns}

\end{frame}

%%%%%%
\begin{frame}{Reproducible research}
\begin{center}
\begin{tikzpicture}[node distance=2.5cm, auto, >=stealth]
	\onslide<1->{
	\node[block] (a) {1. Gather data};}
	\onslide<1->{
	\node[block] (b)  [right of=a, node distance=4.2cm] {2. Analyze data};
 	\draw[<->] (a) -- (b);}
 	\onslide<1->{
 	\node[block] (c)  [right of=b, node distance=4.2cm]  {3. Report results};
 	\draw[<->] (b) -- (c);}
\end{tikzpicture}
\end{center}
\vspace{-0.5cm}
\begin{columns}[t]
\onslide<1->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item Begins with general question or research objectives
\item Data collected in raw format (hard copy) converted to digital (\alert{text file})
\end{itemize}
\end{column}}
\onslide<1->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item Create \alert{integrated script} for importing data (data path is known) 
\item Create figures/tables directly in stats program
\item \alert{No need to export} (reproduced on the fly)
\end{itemize}
\end{column}}
\onslide<1->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item Create research report using RR software
\item \alert{Automatically include results} into report
\item \alert{Change final report automatically} if methods/analysis altered
\end{itemize}
\end{column}}
\end{columns}

\end{frame}

%%%%%%
\begin{frame}{Reproducible research in R}
Easily adopted using RStudio [\href{http://www.rstudio.com/}{http://www.rstudio.com/}]\\~\\
Also possible w/ Tinn-R or via command prompt but not as intuitive\\~\\
Requires a \LaTeX\ distribution system - use MikTex for Windows [\href{http://miktex.org/}{http://miktex.org/}]\\~\\
\onslide<2->{
Essentially a \LaTeX\ document that incorporates R code... \\~\\
Uses Sweave (or Knitr) to convert .Rnw file to .tex file, then \LaTeX\ to create pdf\\~\\
Sweave comes with \texttt{utils} package, may have to tell R where it is \\~\\
}
\end{frame}

%%%%%%
\begin{frame}{Reproducible research in R}
Use same procedure for compiling a \LaTeX\ document with one additional step

\begin{center}
\begin{tikzpicture}[node distance=2.5cm, auto, >=stealth]
	\onslide<2->{
	\node[block] (a) {1. myfile.Rnw};}
	\onslide<3->{
	\node[block] (b)  [right of=a, node distance=4.2cm] {2. myfile.tex};
 	\draw[->] (a) -- (b);\node [right of=a, above=0.5cm, node distance=2.1cm] {Sweave};}
 	\onslide<4->{
 	\node[block] (c)  [right of=b, node distance=4.2cm]  {3. myfile.pdf};
 	\draw[->] (b) -- (c);
 	\node [right of=b, above=0.5cm, node distance=2.1cm] {pdfLatex};}
\end{tikzpicture}
\end{center}
\vspace{-0.5cm}
\begin{columns}[t]
\onslide<2->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item A .tex file but with .Rnw extension
\item Includes R code as `chunks' or inline expressions
\end{itemize}
\end{column}}
\onslide<3->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item .Rnw file is converted to a .tex file using Sweave
\item .tex file contains output from R, no raw R code
\end{itemize}
\end{column}}
\onslide<4->{
\begin{column}{0.33\textwidth}
\begin{itemize}
\item .tex file converted to pdf (or other output) for final format
\item Include biblio with bibtex
\end{itemize}
\end{column}}
\end{columns}

\end{frame}

%%%%%%
\begin{frame}[containsverbatim]{Reproducible research in R} \label{sweaveref}
\begin{block}{.Rnw file}
\begin{verbatim}
\documentclass{article}
\usepackage{Sweave}

\begin{document}

Here's some R code:

\Sexpr{'<<eval=true,echo=true>>='}
options(width=60)
set.seed(2)
rnorm(10)
\Sexpr{'@'}

\end{document}
\end{verbatim}
\end{block}

\end{frame}

%%%%%%
\begin{frame}[containsverbatim,shrink]{Reproducible research in R}
\begin{block}{.tex file}
\begin{verbatim}
\documentclass{article}
\usepackage{Sweave}

\begin{document}

Here's some R code:

\begin{Schunk}
\begin{Sinput}
> options(width=60)
> set.seed(2)
> rnorm(10)
\end{Sinput}
\begin{Soutput}
 [1] -0.89691455  0.18484918  1.58784533 -1.13037567  
 [5] -0.08025176  0.13242028  0.70795473 -0.23969802  
 [9]  1.98447394 -0.13878701
\end{Soutput}
\end{Schunk}

\end{document}
\end{verbatim}
\end{block}

\end{frame}

%%%%%%
\begin{frame}{Reproducible research in R}
The final product:\\~\\
\centerline{\includegraphics{ex1_input.pdf}}
\end{frame}

%%%%%%
\begin{frame}[fragile]{Sweave - code chunks}
\onslide<+->
R code is entered in the \LaTeX\ document using `code chunks'
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<>>='}
\Sexpr{'@'}
\end{verbatim}
\end{block}
Any text within the code chunk is interpreted as R code\\~\\
Arguments for the code chunk are entered within \verb|\Sexpr{'<<here>>'}|\\~\\
\onslide<+->
\begin{itemize}
\item{\texttt{eval}: evaluate code, default \texttt{T}}
\item{\texttt{echo}: return source code, default \texttt{T}}
\item{\texttt{results}: format of output (chr string), default is `include' (also `tex' for tables or `hide' to suppress)}
\item{\texttt{fig}: for creating figures, default \texttt{F}}
\end{itemize}
\end{frame}

%%%%%%
\begin{frame}[fragile]{Sweave - code chunks}
Changing the default arguments for the code chunk:
\begin{columns}[t]
\begin{column}{0.45\textwidth}
\onslide<+->
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<>>='}
2+2
\Sexpr{'@'}
\end{verbatim}
\end{block}
<<>>=
2+2
@
\onslide<+->
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<eval=F,echo=F>>='}
2+2
\Sexpr{'@'}
\end{verbatim}
\end{block}
Returns nothing...
\end{column}
\begin{column}{0.45\textwidth}
\onslide<+->
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<eval=F>>='}
2+2
\Sexpr{'@'}
\end{verbatim}
\end{block}
<<eval=F>>=
2+2
@
\onslide<+->
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<echo=F>>='}
2+2
\Sexpr{'@'}
\end{verbatim}
\end{block}
<<echo=F>>=
2+2
@
\end{column}
\end{columns}
\end{frame}

%%%%%%
\begin{frame}[t,fragile]{Sweave - figures}
\onslide<1->
Sweave makes it easy to include figures in your document
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<myfig,fig=T,echo=F,include=T,height=3>>='}
set.seed(2)
hist(rnorm(100))
\Sexpr{'@'}
\end{verbatim}
\end{block}
\onslide<2->
<<myfig,fig=T,echo=F,include=T,height=3>>=
set.seed(2)
hist(rnorm(100))
@
\end{frame}

%%%%%%
\begin{frame}[t,fragile]{Sweave - figures}
Sweave makes it easy to include figures in your document
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<myfig,fig=T,echo=F,include=T,height=3>>='}
set.seed(2)
hist(rnorm(100))
\Sexpr{'@'}
\end{verbatim}
\end{block}
\vspace{\baselineskip}
Relevant code options for figures:
\begin{itemize}
\item{The chunk name is used to name the figure, myfile-myfig.pdf}
\item{\texttt{fig}: Lets R know the output is a figure}
\item{\texttt{echo}: Use \texttt{F} to suppress figure code}
\item{\texttt{include}: Should the figure be automatically include in output}
\item{\texttt{height}: (and \texttt{width}) Set dimensions of figure in inches}
\end{itemize}
\end{frame}

%%%%%%
\begin{frame}[t,fragile]{Sweave - figures}
An alternative approach for creating a figure
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<myfig,fig=T,echo=F,include=F,height=3>>='}
set.seed(2)
hist(rnorm(100))
\Sexpr{'@'}
\includegraphics{rnw_name-myfig.pdf}
\end{verbatim}
\end{block}
\includegraphics{Sweave_intro-myfig.pdf}
\end{frame}

%%%%%%
\begin{frame}[t,fragile]{Sweave - tables}
\onslide<1->
Really easy to create tables
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<results=tex,echo=F>>='}
library(stargazer)
data(iris)
stargazer(iris,title='Summary statistics for Iris data')
\Sexpr{'@'}
\end{verbatim}
\end{block}
\onslide<2->
<<results=tex,echo=F>>=
data(iris)
library(stargazer)
stargazer(iris,title='Summary statistics for Iris data')
@

\end{frame}

%%%%%%
\begin{frame}[t,fragile]{Sweave - tables}
Really easy to create tables
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<results=tex,echo=F>>='}
library(stargazer)
data(iris)
stargazer(iris,title='Summary statistics for Iris data')
\Sexpr{'@'}
\end{verbatim}
\end{block}
\vspace{\baselineskip}
\texttt{results} option should be set to `tex' (and \texttt{echo=F})\\~\\
Several packages are available to convert R output to \LaTeX\ table format
\begin{itemize}
\item{xtable: most general package}
\item{hmisc: similar to xtable but can handle specific R model objects}
\item{stargazer: fairly effortless conversion of R model objects to tables}
\end{itemize}
\end{frame}

%%%%%%
\begin{frame}[fragile]{Sweave - expressions}
\onslide<1->
All objects within a code chunk are saved in the workspace each time a document is compiled (unless \texttt{eval=F})\\~\\
This allows the information saved in the workspace to be reproduced in the final document as inline text, via \alert{expressions}\\~\\
\onslide<2->
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<echo=F>>='}
data(iris)
dat<-iris
\Sexpr{'@'}
\end{verbatim}
Mean sepal length was \ShowSexpr{mean(dat\$Sepal.Length)}.
\end{block}
\onslide<3->
<<echo=F>>=
data(iris)
dat<-iris
@
\vspace{\baselineskip}
Mean sepal length was \Sexpr{mean(dat$Sepal.Length)}.
\end{frame}

%%%%%%
\begin{frame}[fragile]{Sweave - expressions}
Change the global R options to change the default output\\~\\
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<echo=F>>='}
data(iris)
dat<-iris
options(digits=2)
\Sexpr{'@'}
\end{verbatim}
Mean sepal length was \ShowSexpr{format(mean(dat\$Sepal.Length))}.
\end{block}
<<echo=F>>=
data(iris)
dat<-iris
options(digits=2)
@
\vspace{\baselineskip}
Mean sepal length was \Sexpr{format(mean(dat$Sepal.Length))}.\\~\\
\end{frame}

%%%%%%
\begin{frame}{Sweave vs Knitr}
\onslide<1->
Does not automatically cache R data on compilation\\~\\
\alert{Knitr} is a useful alternative - similar to Sweave but with minor differences in args for code chunks, more flexible output\\~\\
\onslide<2->
\begin{columns}
\begin{column}{0.3\textwidth}
Must change default options in RStudio\\~\\
Knitr included with RStudio, otherwise download as package
\end{column}
\begin{column}{0.6\textwidth}
\centerline{\includegraphics[width=0.8\textwidth]{options_ex.png}}
\end{column}
\end{columns}
\end{frame}

%%%%%%
\begin{frame}[fragile]{Knitr}
\onslide<1->
Knitr can be used to cache code chunks\\~\\
Date are saved when chunk is first evaluated, skipped on future compilations unless changed\\~\\
This allows quicker compilation of documents that import lots of data\\
~\\
\begin{block}{}
\begin{verbatim}
\Sexpr{'<<mychunk, cache=TRUE, eval=FALSE>>='}
load(file='mydata.RData')
\Sexpr{'@'}
\end{verbatim}
\end{block}
\end{frame}

%%%%%%
\begin{frame}[containsverbatim,shrink]{Knitr} \label{knitref}
\begin{block}{.Rnw file}
\begin{verbatim}
\documentclass{article}

\Sexpr{'<<setup, include=FALSE, cache=FALSE>>='}
library(knitr)

#set global chunk options
opts_chunk$set(fig.path='H:/docs/figs/', fig.align='center', 
dev='pdf', dev.args=list(family='serif'), fig.pos='!ht')

options(width=60)
\Sexpr{'@'}

\begin{document}

Here's some R code:

\Sexpr{'<<eval=T, echo=T>>='}
set.seed(2)
rnorm(10)
\Sexpr{'@'}

\end{document}
\end{verbatim}
\end{block}

\end{frame}

%%%%%%
\begin{frame}{Knitr}
The final product:\\~\\
\centerline{\includegraphics[width=\textwidth]{knit_ex.pdf}}
\end{frame}

%%%%%%
\begin{frame}[containsverbatim,shrink]{Knitr}
Figures, tables, and expressions are largely the same as in Sweave\\~\\

\begin{block}{Figures}
\begin{verbatim}
\Sexpr{'<<myfig,echo=F>>='}
set.seed(2)
hist(rnorm(100))
\Sexpr{'@'}
\end{verbatim}
\end{block}
\vspace{\baselineskip}
\begin{block}{Tables}
\begin{verbatim}
\Sexpr{"<<mytable,results='asis',echo=F,message=F>>="}
library(stargazer)
data(iris)
stargazer(iris,title='Summary statistics for Iris data')
\Sexpr{'@'}
\end{verbatim}
\end{block}

\end{frame}

%%%%%%
\begin{frame}{A minimal working example}
\onslide<1->
Step by step guide to creating your first RR document\\~\\
\begin{enumerate}
\onslide<2->
\item Download and install \href{http://www.rstudio.com/}{RStudio}
\onslide<3->
\item Dowload and install \href{http://miktex.org/}{MikTeX} if using Windows
\onslide<4->
\item Create a unique folder for the document - This will be the working directory
\onslide<5->
\item Open a new Sweave file in RStudio
\onslide<6->
\item Copy and paste the file found on slide \ref{sweaveref} for Sweave or slide \ref{knitref} for Knitr into the new file (and select correct compile option)
\onslide<7->
\item Compile the pdf (runs Sweave/Knitr, then pdfLatex)\\~\\
\end{enumerate}
\onslide<7->
\centerline{\includegraphics[width=0.6\textwidth]{compile_ex.png}}
\end{frame}

%%%%%%
\begin{frame}{If things go wrong...}
\LaTeX\ Errors can be difficult to narrow down - check the log file\\~\\
Sweave/Knitr errors will be displayed on the console\\~\\
Other resources
\begin{itemize}
\item{`Reproducible Research with R and RStudio' by C. Garund, CRC Press}
\item{\LaTeX forum (like StackOverflow) \href{http://www.latex-community.org/forum/}{http://www.latex-community.org/forum/}}
\item Comprehensive Knitr guide \href{http://yihui.name/knitr/options}{http://yihui.name/knitr/options}
\item Sweave user manual \href{http://stat.ethz.ch/R-manual/R-devel/library/utils/doc/Sweave.pdf}{http://stat.ethz.ch/R-manual/R-devel/library/utils/doc/Sweave.pdf}
\item Intro to Sweave \href{http://www.math.ualberta.ca/~mlewis/links/the_joy_of_sweave_v1.pdf}{http://www.math.ualberta.ca/~mlewis/links/the_joy_of_sweave_v1.pdf}
\end{itemize}
\vspace{\baselineskip}
\end{frame}

\end{document}