Parks Canada; Ecological Integrity Monitoring Program
Yourselves
Workshop website: https://andyteucher.ca/pc-intermediate-r/
WTF Book: https://rstats.wtf/
.Rprofile and .Renviron files
Enable RStudio keyboard shortcuts via:
Settings > Workbench > Keybindings: RStudio Keybindings
Ctrl + Shift + P
Often we are (or think we are) only doing our data work on our own
When we want to facilitate collaboration or focus on reproducibility, we need new strategies
What They Forgot to Teach You About R: much of this material is distilled from this book.
Jenny Bryan is a hero in the R world.
An early adopter teaching R/GitHub as a prof at UBC, now at Posit
Establish the concept of the project as the basic organizational unit of work.
Apply best practices for, and leverage the benefits of, a project-oriented workflow.
Creating robust file paths that travel well in time and space.
Constructing human and machine readable file names that sort nicely.
Differentiating workflow elements, analysis inputs, and analysis outputs in project structure to create navigable programming interfaces.
Restarting R frequently, with a blank slate.
Don’t fret over past mistakes.
Raise the bar for new work.
Saving code is an absolute requirement for reproducibility. (Future you, future us)
Save commands as “scripts” (.R) or “notebooks” (.Rmd).
It doesn’t have to be polished.
Just save it!
Everything that really matters should be achieved through code that you save
Contrast: Series of unrecorded mouse clicks
The process is important, the product is just an outcome
work on more than 1 thing at a time
collaborate, communicate, distribute
start and stop safely
dedicated directory
RStudio Project or Positron Workspace
Git repo, probably syncing to a remote
Project-oriented workflows
There is another path.
Project-oriented workflow designs this away. 🙌
One folder per project.
Report? R package? Chapter? Website? Whatever.
Can be the same unit as a GitHub Repo.
If using RStudio, it’s a Project (capital P)
If using Positron, it’s a Workspace
Each Project gets its own R instance
R starts at the project root working directory: all paths are relative to the project’s folder.
my-project/
├── 01_read-data.R
├── 02_clean-data.R
├── 03_analysis.R
├── 04_output.R
├── R
├── README.md
├── data
│   ├── derived_data
│   └── raw_data
├── outputs
└── paper
    ├── paper.qmd
    └── references.bib
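A minimal sketch of scaffolding a layout like this with base R. It runs inside a temporary directory so nothing is written to a real project; the `root` variable and the use of `tempfile()` are assumptions for illustration only.

```r
# Scaffold the example layout in a throwaway location (assumption: a temp
# directory stands in for wherever a real project would live).
root <- file.path(tempfile("demo-"), "my-project")

dirs <- c(
  "R",
  file.path("data", "raw_data"),
  file.path("data", "derived_data"),
  "outputs",
  "paper"
)
for (d in dirs) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
}

# Numbered scripts sort in the order they are meant to be run.
scripts <- c("01_read-data.R", "02_clean-data.R", "03_analysis.R", "04_output.R")
created <- file.create(file.path(root, c(scripts, "README.md")))

# List everything relative to the project root.
list.files(root, recursive = TRUE, include.dirs = TRUE)
```

Keeping raw data, derived data, and outputs in separate folders makes it obvious which files are safe to delete and regenerate.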
Open Project = dedicated instance of RStudio
RStudio leaves notes to itself in foo.Rproj
Open Project = dedicated instance of Positron
Often just a project folder that’s been opened in its own window via Open Folder or similar
.git/ directory
.Rproj file
.vscode/settings.json file
_quarto.yml file
DESCRIPTION file
renv.lock file
.here file
The chance of setwd() having the desired effect – making the file paths work – for anyone besides its author is ~0%.
It’s also unlikely to work for the author one or two years or computers from now.
Hard-wired, absolute paths, especially when sprinkled throughout the code, make a project brittle. Such code does not travel well across time or space.
relative to a stable base, not absolute paths.
use file system functions, not paste(), strsplit(), etc.
Instead of:
Or:
Set your work up as an RStudio or Positron Project/Workspace and use relative paths:
Or:
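A minimal sketch of the contrast, assuming a hypothetical file `data/raw_data/plots.csv` inside the project. The absolute path in the comments is invented for illustration, and the `here::here()` call only runs if the here package happens to be installed.

```r
# Brittle (shown as comments only): absolute paths tied to one machine.
# The path below is hypothetical.
# setwd("C:/Users/me/Documents/my-project")
# dat <- read.csv("C:/Users/me/Documents/my-project/data/raw_data/plots.csv")

# Portable: build the path relative to the project root with file.path().
raw_path <- file.path("data", "raw_data", "plots.csv")

# Use file system functions rather than paste() or strsplit() to take
# paths apart.
file_name <- basename(raw_path)
folder    <- dirname(raw_path)
stem      <- tools::file_path_sans_ext(file_name)
extension <- tools::file_ext(file_name)

# If the here package is installed, here::here() resolves the same relative
# path against the project root, whatever the current working directory is.
if (requireNamespace("here", quietly = TRUE)) {
  raw_path_here <- here::here("data", "raw_data", "plots.csv")
}
```

Because R starts at the project root when a Project or Workspace is opened, `read.csv(raw_path)` would work unchanged on a collaborator's machine.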
Because each project uses an isolated R process
Which of these persist after rm(list = ls())?

| Option | Persists? |
|---|---|
| A. library(dplyr) | |
| B. summary <- head | |
| C. options(stringsAsFactors = FALSE) | |
| D. Sys.setenv(LANGUAGE = "fr") | |
| E. x <- 1:5 | |
| F. attach(iris) | |
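One way to check the answers is to try it in a fresh session. The sketch below is a hedged stand-in for the quiz rather than a copy of it: `library(tools)` replaces `library(dplyr)` so that no extra package is needed, and `options(digits = 3)` replaces option C so the change is visible. All variable names after the `rm()` call are made up for illustration.

```r
# Objects in the global environment
x <- 1:5
summary <- head                 # masks base::summary()

# Session state that lives outside the global environment
library(tools)                  # attach a package (tools ships with R)
options(digits = 3)             # stand-in for option C
Sys.setenv(LANGUAGE = "fr")
attach(iris)

rm(list = ls())

# Removed: both objects lived in the global environment.
x_gone        <- !exists("x", envir = globalenv(), inherits = FALSE)
summary_reset <- identical(summary, base::summary)

# Still here: rm(list = ls()) touches none of these.
pkg_attached  <- "package:tools" %in% search()
data_attached <- "iris" %in% search()
digits_option <- getOption("digits")
language_var  <- Sys.getenv("LANGUAGE")
```

Only the objects created by assignment are cleared. Attached packages, options, environment variables, and attached data all survive, which is why restarting R is the more reliable reset.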
Do not save your workspace to an .Rdata file.
Do not load your workspace from an .Rdata file.
Tools > Global Options.
usethis::use_blank_slate()
This is the default (and not customizable) in Positron
Session -> Restart R

Windows: Ctrl + Shift + F10
Mac: Cmd + Shift + 0
usethis::create_project("~/i_am_new")
File -> New Project -> New Directory -> New Project
usethis::create_project("~/i_exist")
File -> New Project -> Existing Directory
usethis::create_project("~/i_am_new", rstudio = FALSE)
File -> New Folder From Template
usethis::create_project("~/i_exist", rstudio = FALSE)
File -> New Project -> Existing Directory
Note: if you don’t specify rstudio = FALSE, it will create an RStudio .Rproj file. This does no harm.
Try either option now with a folder containing (or that will contain) Bivalve Data 2014-2022_ICE.csv.
machine readable
human readable
sort nicely

Jenny Bryan “Naming things” video
NormConf · Dec 4, 2022:
What features differentiate 😔 vs 😍?
😔
myabstract.docx
Joe’s Filenames Use Spaces and Punctuation.xlsx
figure 1.png
homework.R
JW7d^(2sl@deletethisandyourcareerisoverWx2*.txt
😍
2018-01_teucher-abstract-conference.docx
joes-filenames-are-getting-better.xlsx
fig01_scatterplot-talk-length-vs-interest.png
2024-07-25_ecol-455_assignment-5.R
1986-01-28_raw-data-from-challenger-o-rings.txt
use underscore _ to separate different chunks
use hyphen - for words in the same chunk
This creates names that are regular expression and globbing friendly, and easy to compute on! 🎉
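A small sketch of what "regular expression and globbing friendly" can look like in practice, using base R only. The four file names are assumed for illustration and follow the date_site_type pattern used elsewhere in this material.

```r
# Example file names (assumed for illustration).
files <- c(
  "2024-07-16_site-2_plot-data.csv",
  "2024-07-25_site-2_plot-data.csv",
  "2024-08-12_site-1_plot-data.csv",
  "2024-08-18_site-1_plot-data.csv"
)

# Regular expression: files whose date field falls in July.
july <- files[grepl("^[0-9]{4}-07-[0-9]{2}_", files)]

# Regular expression: files whose second field is site-1.
site1 <- files[grepl("_site-1_", files)]

# Glob: utils::glob2rx() converts a wildcard pattern into a regex.
august <- files[grepl(utils::glob2rx("2024-08-*_plot-data.csv"), files)]
```

The same patterns can be passed to `list.files(pattern = ...)` to select files on disk.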
Adapted from
https://djnavarro.net/slides-project-structure/#1.
name contains info on content
name anticipates context
concept of a slug 🐌 from user-friendly URLs
1986-01-28_raw-data-from-challenger-o-rings.txt
concise, meaningful description
usually appended to the end
put something numeric in there
left pad with zeros for constant width, nice sorting, 01
use the ISO 8601 standard for dates, YYYY-MM-DD
order = chronological or … consider common sense
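A short sketch of why zero padding and ISO 8601 dates matter for sorting, using base R. The figure numbers and dates are made up for illustration, and `method = "radix"` is used so the comparison does not depend on the session's locale.

```r
fig_numbers <- c(1L, 2L, 10L)

# Without padding, "fig10" sorts before "fig2".
unpadded        <- paste0("fig", fig_numbers)
unpadded_sorted <- sort(unpadded, method = "radix")

# Left-padding to a constant width keeps alphabetical and numeric order aligned.
padded        <- sprintf("fig%02d", fig_numbers)
padded_sorted <- sort(padded, method = "radix")

# ISO 8601 dates (YYYY-MM-DD) sort chronologically as plain text.
dates      <- as.Date(c("2024-08-12", "2024-07-16"))
file_names <- paste0(format(dates, "%Y-%m-%d"), "_plot-data.csv")
chronological <- sort(file_names, method = "radix")
```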
Intuitive sorting.
# A tibble: 2 × 1
files
<fs::path>
1 _examples/data/2024-07-16_site-2_plot-data.csv
2 _examples/data/2024-08-12_site-1_plot-data.csv
Easy to filter in R (or the shell or whatever)
# A tibble: 4 × 3
  date  site                     data_type
  <chr> <chr>                    <chr>
1 ""    examples/data/2024-07-16 site-2
2 ""    examples/data/2024-07-25 site-2
3 ""    examples/data/2024-08-12 site-1
4 ""    examples/data/2024-08-18 site-1
Intentional delimiters means meta-data is easily recovered.
_ delimits fields; - delimits words
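A hedged sketch of recovering metadata from delimited file names with base R. The two paths are taken from the listing earlier in this section; the column names and the choice to call `basename()` first (so the directory does not leak into the first field) are assumptions of this example.

```r
paths <- c(
  "_examples/data/2024-07-16_site-2_plot-data.csv",
  "_examples/data/2024-08-12_site-1_plot-data.csv"
)

# Drop the directory and the extension, leaving only the delimited fields.
stems <- tools::file_path_sans_ext(basename(paths))

# "_" delimits fields; "-" stays inside each field.
parts <- do.call(rbind, strsplit(stems, "_", fixed = TRUE))

meta <- data.frame(
  date      = as.Date(parts[, 1]),
  site      = parts[, 2],
  data_type = parts[, 3],
  stringsAsFactors = FALSE
)

# The recovered fields can then be used like any other column.
site1 <- meta[meta$site == "site-1", ]
```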
Rename the file Bivalve Data 2014-2022_ICE.csv to something that is machine readable, human readable, and sorts nicely.
Read the file into R using a relative path
machine readable, human readable, and sort nicely
easy to implement NOW
payoffs accumulate as your skills evolve and projects get more complex
R packages
base R
14 base + 15 recommended packages
ships with all binary distributions of R
library(pkg) function to attach a package
The system library (base/recommended packages).
.Library
All libraries for the current session
.libPaths()
All installed packages
installed.packages()
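A brief sketch of how these three pieces fit together, using only base R. The variable names are made up for illustration, and the number of libraries and packages will differ from machine to machine.

```r
# The system library: where the packages that ship with R are installed.
system_library <- .Library

# Every library searched in this session; the system library is typically last.
all_libraries <- .libPaths()

# installed.packages() returns a character matrix, one row per installed package.
pkgs <- installed.packages()

# How many packages live in each library?
table(pkgs[, "LibPath"])

# Restrict the listing to the system library only.
system_pkgs <- rownames(installed.packages(lib.loc = .Library))
head(system_pkgs)
```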
Using installed.packages(), what are the base and recommended packages?
.Rprofile - contains R code to be run at the start of each session.
.Renviron - contains environment variables to be set in R sessions.
.Rprofile
If it matters for code you share, it should not be in .Rprofile
Should these go in .Rprofile?
library(tidyverse)
f <- dplyr::filter
theme_set(theme_bw())
.Renviron
VAR_NAME=value
GOOGLE_API_KEY=your_api_key_here
Access environment variables in R with Sys.getenv("VAR_NAME")
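A minimal sketch of reading environment variables defensively. The variable names `DEMO_API_KEY` and `DEMO_NOT_SET_ANYWHERE` and the helper `require_env()` are hypothetical; in real use the value would come from a line in .Renviron rather than from `Sys.setenv()`.

```r
# For this demo only, set the variable from R. In practice a line such as
# DEMO_API_KEY=... would live in .Renviron and be read at startup.
Sys.setenv(DEMO_API_KEY = "not-a-real-key")   # hypothetical name and value

api_key <- Sys.getenv("DEMO_API_KEY")

# Unset variables come back as "" by default; `unset` supplies a fallback.
missing_default <- Sys.getenv("DEMO_NOT_SET_ANYWHERE")
missing_custom  <- Sys.getenv("DEMO_NOT_SET_ANYWHERE", unset = NA)

# Hypothetical helper: fail early, with a clear message, if a required
# variable is absent.
require_env <- function(name) {
  value <- Sys.getenv(name)
  if (!nzchar(value)) {
    stop("Environment variable not set: ", name)
  }
  value
}

key <- require_env("DEMO_API_KEY")
```

Keeping secrets in .Renviron and reading them this way keeps them out of scripts that get shared or committed.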
Create an environment variable in .Renviron called MY_NAME with your name as the value.
Add code to your .Rprofile that gives you a personalized message when you start R.
Ctrl + Shift + P
Pipe (%>% or |>)
Assignment (<-)
.Last.value
Shows last evaluated value in console
Show .Last.value in environment listing
Controlled by an option:
setwd()