In the tutorial, you got to know more about R and some of the R and RStudio resources available to help you through the semester.
You were also introduced to ChatGPT, which you can use to assist your learning. We will use ChatGPT ethically, as per the University guidelines.
Today’s plan
Aim
Quarto documents
R Code Chunk Options
Including images and figures
Computer file architecture
RStudio Projects
Good coding practices
Second hour: hands-on practice
Scaffolding of reproducible research & reporting
Think of reproducible reporting as a project
The project needs to contain all the resources needed to produce a reproducible output.
Definition: Computational Reproducibility
Obtaining consistent computational results using the same input data, computational steps, methods, code, and conditions of analysis.
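To make the definition concrete, here is a minimal sketch of one ingredient of reproducibility: fixing the random seed so the same code gives the same numbers every time. The function `run_analysis()` and the seed value are hypothetical names chosen for this illustration only.

```{r}
# Hypothetical mini-analysis: simulate some data, then summarise it.
run_analysis <- function(seed) {
  set.seed(seed)                               # fix the random number stream
  simulated <- rnorm(100, mean = 10, sd = 2)   # same seed, same simulated data
  c(mean = mean(simulated), sd = sd(simulated))
}

# Same code, same seed, same conditions
first_run  <- run_analysis(seed = 5513)
second_run <- run_analysis(seed = 5513)

# TRUE: both runs produce exactly the same summary
identical(first_run, second_run)
```

Without the call to `set.seed()`, the two runs would generally give different summaries, so the result could not be reproduced exactly.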
Elements of a reproducible project
We need to have a plan to organise, store and make all the project files available
All the elements of the project should be files
All files should be stored within the project location (typically a folder)
All your files should be explicitly tied together
Project organisation example
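One possible layout can be sketched in R. The folder and file names below (`my-report`, `data`, `R`, `figures`, `report.qmd`) are assumptions for illustration, not a required structure, and everything is created inside a temporary folder so your own files are untouched.

```{r}
# Build a hypothetical project skeleton inside a temporary folder.
project_dir <- file.path(tempdir(), "my-report")   # hypothetical project name

# Hypothetical sub-folders: raw data, R scripts, and saved figures
sub_folders <- c("data", "R", "figures")
for (folder in sub_folders) {
  dir.create(file.path(project_dir, folder),
             recursive = TRUE, showWarnings = FALSE)
}

# An empty placeholder for the Quarto report
file.create(file.path(project_dir, "report.qmd"))

# List everything the project folder now contains
list.files(project_dir, recursive = TRUE, include.dirs = TRUE)
```

Because every element lives under `project_dir`, the whole project can be moved or shared as a single folder.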
Workflow for reproducible research
Stages for reproducible data analysis and reporting
Clear research questions to be investigated
Clear objectives: what is the goal of this report?
Data gathering
Exploratory data analysis
Data analysis
Results presentation
All of the above needs to be documented and tied together
In this unit
We will create documents that are reproducible
Incorporate analyses that are reproducible
Include report text
All combined together
Our reproducible documents will be created using the scripting language R combined with Quarto.
Let’s talk about computer paths
And then RStudio Projects
Computer paths
Where are files and folders stored on our computer?
Definition: Path
A path is the complete location or name of where a computer file, directory, device, or web page is located
Some examples:
Windows: C:\Documents\ETC5513
Mac/Linux: /Users/Documents/ETC5513
Internet: http://rcp.numbat.space/
Absolute and Relative Paths
Definition: Absolute Path
An absolute (or full) path starts from the top of the file system, typically a drive letter (Windows) or the root / (Mac/Linux), and spells out every folder down to the target.
Definition: Relative Path
A relative path refers to a location relative to the current directory. It typically starts with a . (although this may be hidden from the user).
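The difference between the two kinds of path can be sketched in R using base functions only. The file `data/survey.csv` is a hypothetical example and does not need to exist for the code to run.

```{r}
# Absolute path of the current working directory
current_dir <- getwd()

# A relative path to a hypothetical file: it only makes sense
# relative to some starting directory
relative_path <- file.path("data", "survey.csv")

# Joining the working directory and the relative path gives the
# absolute path that R would look up
absolute_path <- file.path(current_dir, relative_path)

relative_path   # "data/survey.csv"
absolute_path   # depends on where R is running

# "." is shorthand for the current directory
normalizePath(".")
```

The relative path is the same on every computer, while the absolute path changes with the machine and the folder you are working in. This is why relative paths are easier to share.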
toc: Table of contents. You can read more about that here
This is the resulting HTML
Tables and Captions
Code:
```{r}
library(dslabs)
data(murders)
table_data <- head(murders, 5)
knitr::kable(table_data,
  caption = "Gun murder data from FBI reports by state",
  digits = 2
)
```
Result:
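Gun murder data from FBI reports by state

| state      | abb | region | population | total |
|:-----------|:----|:-------|-----------:|------:|
| Alabama    | AL  | South  |    4779736 |   135 |
| Alaska     | AK  | West   |     710231 |    19 |
| Arizona    | AZ  | West   |    6392017 |   232 |
| Arkansas   | AR  | South  |    2915918 |    93 |
| California | CA  | West   |   37253956 |  1257 |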
For more information, type ?knitr::kable into your R console.
Figures and captions
Figures from R are created inside code chunks.
Typically, we will generate figures using ggplot2
Inside the code chunk, we use the fig-cap chunk option to generate a caption.
You will also want to include a label that starts with fig- so the figure gets a number and can be cross-referenced.
```{r}
#| label: fig-cars-plot
#| fig-cap: "Distance taken for a car to stop, against its speed during the test."
library(ggplot2)
ggplot(cars, aes(x = speed, y = dist)) +
  geom_point()
```
Distance taken for a car to stop, against its speed during the test.
Inserting external images/photos/figures
There are two different ways to include external pictures.