ETC5513: Collaborative and Reproducible Practices

Tutorial 10

Author

Michael Lydeamore

Published

20 May 2025

Objectives

  • Learn to use renv for dependency management
  • Collaborate with a partner to test reproducibility
  • Practice using GitHub for version control

Pair Programming Setup

  1. Pair up with a partner and decide who will be Person A and Person B.

  2. Both partners should create a new repository on GitHub:

    • Go to GitHub and log in.
    • Click on the New Repository button.
    • Give your repository a name (e.g., renv-tutorial).
    • Check the boxes to include a license, a .gitignore file, and a README file.
    • Click Create Repository.
  3. Clone your repository to your local machine:

    • Copy the repository URL from GitHub.
    • In RStudio, go to File > New Project > Version Control > Git.
    • Paste the repository URL and choose a local folder to clone the repository.
  4. Open the cloned repository in RStudio.

  5. Inside your repository, create a new Quarto file:

    • Go to File > New File > Quarto Document.
    • Save the file as analysis.qmd.

Initializing renv

  1. Initialize renv in your R session not inside the Quarto document:

    renv::init()
    • This will create a local library for your project and generate renv files.
  2. Stage, commit, and push the changes to GitHub:

    git add .
    git commit -m "Initialize renv"
    git push origin main

Data Analysis Tasks

Person A: Palmer Penguins

  1. Install the palmerpenguins package:

    install.packages("palmerpenguins")
  2. Use the penguins dataset to:

    • Create a summary table of the data.
    • Generate a plot (e.g., scatterplot of flipper length vs. body mass).
Tip

Make sure you add the library command to your Quarto document

  1. Add cross-references to the plot in your Quarto document.

  2. Snapshot the project dependencies in the R console:

    renv::snapshot()
  3. Stage, commit, and push your changes to GitHub.

Person B: Life Tables

  1. Install the HistData package:

    install.packages("HistData")
  2. Use the Breslau dataset, filtered to age >= 5, to:

    • Create a summary table of the data.
    • Generate a plot (e.g., Age at death against number of deaths).
  3. Add cross-references to the plot in your Quarto document.

  4. Snapshot the project dependencies:

    renv::snapshot()
  5. Stage, commit, and push your changes to GitHub.

Testing Reproducibility

  1. Share your repository details with your partner.

  2. Clone each other’s repository and open it in RStudio.

  3. Restore the project dependencies:

    renv::restore()

    You should see renv installing the necessary dependencies for the project.

  4. Check your library paths with .libPaths(). Do you recognise the path?

  5. Render the Quarto document. Verify that you can reproduce the analysis and outputs.

  6. Discuss any issues encountered and how to resolve them.

Extension: Try adding a development package from GitHub (e.g., naniar) and follow the renv workflow. Verify if the package is recorded in the lockfile and how it differs from CRAN packages.