Lecturer: Michael Lydeamore
Department of Econometrics and Business Statistics
Aim
Learning more on creating reproducible reports:
More on Git:
Solving git conflicts:
So far:
Next:
Options inside the R code chunks:
- `fig-align`: Controls the alignment of figures in the report: `default`, `center`, `left`, or `right`
- `fig-cap`: Captions, for example `fig-cap: "My amazing graph."`
- `fig-height`, `fig-width`: Size of the figure in inches
- `height`, `width`: Size of your plot in the final file. For example, `width = "50%"` means half of the width of the image container (if the image is directly contained by a page instead of a child element of the page, that means half of the page width).
Using Markdown syntax:
![Caption](path-to-image-here){fig-align="center"}
Using the `knitr` package:
Global options are those that are applied to the entire document.
It is best to add this R code chunk at the beginning of the document, before the libraries R code chunk.
They can be overridden by the individual R code chunk options!
Quarto automatically includes the referencing functionality that used to be part of the `bookdown` package. It’s one of the many advantages of moving to Quarto over RMarkdown.
If you’ve used Bookdown before, just note that you no longer need to swap output formats for references to work.
`filename_files`: figures will be saved in a subfolder called `figure-html` (or the appropriate document type) and will be named using the R code chunk names (remember to name your R code chunks!).
This will create a new folder called Images and will place all the figures inside.
To reference figures, we have to include a `label` and a `fig-cap`. For example,
```{r}
#| label: fig-scatterplot
#| fig-cap: "Normalised mileage of cars. Positive values represent above average mileage, negative values indicate below average mileage."
#| eval: false
library(ggplot2) # provides ggplot() and the layers used below
data("mtcars") # load data
mtcars$`car name` <- rownames(mtcars) # create new column for car names
mtcars$mpg_z <- round((mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg), 2) # compute normalised mpg
mtcars$mpg_type <- ifelse(mtcars$mpg_z < 0, "below", "above") # above / below avg flag
mtcars <- mtcars[order(mtcars$mpg_z), ] # sort
mtcars$`car name` <- factor(mtcars$`car name`, levels = mtcars$`car name`) # convert to factor to retain sorted order in plot
# Diverging bar chart
ggplot(mtcars, aes(x = `car name`, y = mpg_z, label = mpg_z)) +
  geom_bar(stat = "identity", aes(fill = mpg_type), width = .5) +
  scale_fill_manual(name = "Mileage",
                    labels = c("Above Average", "Below Average"),
                    values = c("above" = "#00ba38", "below" = "#f8766d")) +
  # labs() names the aesthetics, so x is still the car name after coord_flip()
  labs(subtitle = "Normalised mileage from 'mtcars'",
       title = "Diverging Bars", x = "Car Name", y = "Normalised mileage") +
  coord_flip()
```
Code:
@fig-scatterplot shows the normalised miles per gallon of a variety of makes of car.
Output:
Figure 1 shows the normalised miles per gallon of a variety of makes of car.
Citing a table follows the same syntax:
| | mpg | cyl | disp | hp |
|---|---|---|---|---|
| Cadillac Fleetwood | 10.4 | 8 | 472 | 205 |
| Lincoln Continental | 10.4 | 8 | 460 | 215 |
| Camaro Z28 | 13.3 | 8 | 350 | 245 |
| Duster 360 | 14.3 | 8 | 360 | 245 |
| Chrysler Imperial | 14.7 | 8 | 440 | 230 |
| Maserati Bora | 15.0 | 8 | 301 | 335 |
In text:
We can see the results in @tbl-summarytable
Output:
We can see the results in Table 1
Warning
In order for a table to be cross-referenceable, its label must start with `tbl-`.
`kable` function from the `kableExtra` package.
Note that we don’t have to add the caption inside `kable`; we can use a chunk option instead. The functional form will still work.
To reference a section, use `@sec-label`, and add the `#sec-` identifier to the heading. For example:
## Introduction {#sec-introduction}
which we would then reference with `@sec-introduction`.
Note that for this to work, we need to set `number-sections: true` in the YAML, as sections are only referred to by numbers.
Let’s have a look at an example.
- `git clone`: targets an existing repository and creates a clone, or copy, of the target repository
- `git pull`: fetches and downloads content from a remote repository and immediately updates the local repository to match that content
- `git status`: displays the state of the working directory and the staging area
- `git add file_name`: adds a change in the working directory to the staging area
- `git commit -m "Message"`: creates a snapshot of the staged changes along the timeline of a Git project's history (`-m` = message for the commit)
- `git push origin branch_name`: uploads local repository content to a remote repository
Each repository has one default branch, and can have multiple other branches. Branching is a great feature of version control!
Branching is particularly important with Git as it is the mechanism that is used when you are collaborating with other researchers/data scientists.
`HEAD` is a pointer that Git uses to reference the current snapshot that we are looking at.
You can create branches directly on GitHub. More info here
You can also delete branches directly on GitHub
As you get more comfortable with git, you might find this a bit slow and tedious.
We will be using our command line interface/Terminal or Git Bash to create and move across branches.
We use the `git branch` and `git checkout` commands.
- `git branch`: shows us the branches we have in our repo and marks our current branch with `*`
- `git branch newbranch_name`: creates a new branch but does not move the `HEAD` of the repo there
- `git checkout newbranch_name`: moves the `HEAD` to `newbranch_name`
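The difference between creating a branch and moving to it can be tried out safely in a throwaway repository. The following is a minimal sketch: the temporary directory, the file `report.qmd`, and the "Demo User" identity are assumptions made for illustration, not part of the lecture material.

```shell
# Throwaway repository so the commands are safe to run anywhere.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch main, whatever the local default is
git config user.name "Demo User"        # hypothetical identity, stored only in this repo
git config user.email "demo@example.com"

# A branch needs at least one commit to point at.
echo "# Report" > report.qmd
git add report.qmd
git commit -q -m "Initial commit"

git branch newbranch_name               # creates the branch; HEAD stays on main
git branch                              # the * is still next to main
before="$(git symbolic-ref --short HEAD)"

git checkout -q newbranch_name          # now HEAD moves
git branch                              # the * is next to newbranch_name
after="$(git symbolic-ref --short HEAD)"

echo "before: $before, after: $after"   # prints "before: main, after: newbranch_name"
```

`git symbolic-ref --short HEAD` is used here only to capture the current branch name in a variable; reading the `*` in the `git branch` output gives the same information.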
`HEAD` and `checkout`
How does Git know what branch you’re currently on?
By using the pointer `HEAD`. In Git, this is a pointer to the local branch you are currently on.
Internally, the `git checkout` command updates the `HEAD` to point to either the specified branch or commit.
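One way to see the pointer at work is to read the `.git/HEAD` file before and after a checkout. This is a sketch that assumes Git's default file-based storage of references; the repository, the file `notes.txt`, and the identity are made up for the example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch main, whatever the local default is
git config user.name "Demo User"        # hypothetical identity, stored only in this repo
git config user.email "demo@example.com"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "First commit"

# On a branch, HEAD is a reference to that branch.
head_on_main="$(cat .git/HEAD)"
echo "$head_on_main"                    # ref: refs/heads/main

# Checking out another branch rewrites the reference.
git checkout -q -b newbranch_name
head_on_new="$(cat .git/HEAD)"
echo "$head_on_new"                     # ref: refs/heads/newbranch_name

# Checking out a commit makes HEAD point at the commit itself.
commit_id="$(git rev-parse HEAD)"
git checkout -q "$commit_id"
head_on_commit="$(cat .git/HEAD)"
echo "$head_on_commit"                  # the full commit hash
```

The last state is usually called a detached `HEAD`; `git checkout main` moves the pointer back to a branch.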
Using the `checkout` command
- `git checkout -b newbranch_name`: creates a new branch and moves the repo `HEAD` to this branch
- `git branch`: to see which branch you are currently on
- `git push origin newbranch_name`
Alternatively, if we had files or changes added into that branch:
- `git add .` (adding all the modified files into the staging area)
- `git commit -m "Updating new newbranch_name"`
- `git push origin newbranch_name`
- `git checkout main`: First move to the branch we want to move content into
- `git merge newbranch_name -m "Merging branches"`
- `git push origin main`: to update the remote repository
Remember, we can use `git status` to check the status of our repo at any time.
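The whole cycle (create a branch, commit on it, push it, merge it into `main`, push `main`) can be rehearsed without GitHub. In this sketch a bare repository in a temporary folder stands in for the remote; that stand-in, the file `analysis.R`, and the identity are assumptions for illustration only.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"   # plays the role of the GitHub repository

git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main   # call the first branch main, whatever the local default is
git config user.name "Demo User"        # hypothetical identity, stored only in this repo
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

echo "line 1" > analysis.R
git add analysis.R
git commit -q -m "Initial commit"
git push -q origin main

# Work on a new branch and publish it.
git checkout -q -b newbranch_name
echo "line 2" >> analysis.R
git add .
git commit -q -m "Updating newbranch_name"
git push -q origin newbranch_name

# Bring the work into main and update the remote.
git checkout -q main
git merge -q newbranch_name -m "Merging branches"
git push -q origin main
git status --short --branch
```

Because `main` did not change while the branch was being worked on, Git can simply move `main` forward here (a fast-forward merge), so no separate merge commit is created and the `-m` message goes unused.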
It is essential to `git checkout main` before creating a new branch.
If the branch where you are currently working was already merged with the main branch you’ll need to undo almost all the changes from the old branch that did not make it into the main branch.
Reason: all the old changes from that branch will appear as new changes in combination with the changes that are actually new.
It is fixable but a mess that you want to avoid!
Caution
Don’t create branches from a branch that is not the main branch unless you are doing so deliberately.
To delete a branch from your local repository:
- `git branch -a`: List all the branches
- `git checkout main`: Move to the `main` branch
- `git branch -d Name_of_branch`: Delete the unwanted branch
Caution
You cannot delete a branch if your HEAD is on that branch
To delete a branch from your remote repository (GitHub):
git push origin --delete Name_of_branch
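The caution about `HEAD` is easy to check locally. The sketch below uses a throwaway repository (the file `report.qmd` and the identity are made up) and only covers the local deletion; the remote command above would additionally need a remote called `origin`.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch main, whatever the local default is
git config user.name "Demo User"        # hypothetical identity, stored only in this repo
git config user.email "demo@example.com"

echo "draft" > report.qmd
git add report.qmd
git commit -q -m "Initial commit"

git checkout -q -b Name_of_branch       # HEAD is now on the branch we want to delete

# Git refuses to delete the branch that HEAD is on.
if git branch -d Name_of_branch 2>/dev/null; then
  refused="no"
else
  refused="yes"
fi
echo "deletion refused while checked out: $refused"

git branch -a                           # list all the branches
git checkout -q main                    # move to main
git branch -d Name_of_branch            # now the deletion works
remaining="$(git branch --list Name_of_branch)"
```

Note that `git branch -d` only deletes branches whose work is already merged; here the branch has no commits of its own, so it qualifies.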
Imagine that you are working on your local repository and a collaborator has created a new branch in your remote repo.
You are currently working on your local repo and want to have a look at the new branch. That means that the local repo and your remote repo have diverged.
That is, both the local and remote repositories are not currently synchronized.
git fetch origin
`git fetch origin` looks where `origin` is and fetches any data from it that you don’t yet have.
HEAD) to its new, more up-to-date position.
Note: If the git repo contains more than one remote, such as origin and upstream, `git fetch --all` will fetch the changes from all of the remotes, while `git fetch origin` will only fetch the changes from the remote `origin`.
- `git fetch`: updates the remote-tracking branches (by default from `origin`)
- `git remote`: lets you create, view, and delete connections to remote repositories
- `git fetch origin`: fetches the changes from the remote `origin` (fetching is what you do when you want to see what everybody else has been working on in the remote repo)
- `git branch -a`: shows all the branches available in the local repository plus all the branches fetched from the remote
The branches fetched from the remote `origin` are preceded by `remotes/origin/`.
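The scenario of a collaborator adding a branch can be simulated on one machine. In this sketch a bare repository stands in for GitHub and a second clone plays the collaborator; the folder names, the branch `collab_branch`, the files, and the identities are all hypothetical.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"   # plays the role of the GitHub repository

# Your local repository.
git init -q "$work/you"
cd "$work/you"
git symbolic-ref HEAD refs/heads/main   # call the first branch main, whatever the local default is
git config user.name "Demo User"        # hypothetical identity
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"
echo "shared" > README.md
git add README.md
git commit -q -m "Initial commit"
git push -q origin main

# The collaborator clones, creates a branch and pushes it.
git clone -q --branch main "$work/remote.git" "$work/collaborator"
cd "$work/collaborator"
git config user.name "Collaborator"     # hypothetical identity
git config user.email "collab@example.com"
git checkout -q -b collab_branch
echo "new idea" > idea.txt
git add idea.txt
git commit -q -m "Add idea"
git push -q origin collab_branch

# Back in your repository: the new branch is invisible until you fetch.
cd "$work/you"
before="$(git branch -a)"
git fetch -q origin
after="$(git branch -a)"
echo "$after"                           # now lists remotes/origin/collab_branch

git checkout -q collab_branch           # creates a local branch from the fetched one
```

Fetching only updates the `remotes/origin/...` entries; your own branches and working files stay untouched until you check out or merge the fetched work.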
To do that:
- First make sure you are working in that branch in your local repo: `git branch -a`
- Add changes to the staging area, commit, and push the changes to the corresponding branch in the remote repository: `git add files`, `git commit -m "Message"`, `git push origin name-of-the-branch`
git checkout branchname
Imagine that you have two branches:
To check which branch you are currently on, run `git branch` or `git branch -a`; you will see an `*` marking the branch where the `HEAD` of your repository currently is.
To go back to the `main` branch (assuming that you were there before): `git checkout main`
Resource here.
Suppose we have two branches, `main` and `new_development`, and our goal is to bring changes from the branch `new_development` into our `main` branch:
- Switch to the `main` branch: `git checkout main`
- `git merge new_development`
- `git push origin main`
If those steps are successful, your `new_development` branch will be fully integrated within the `main` branch.
However, it is possible that Git will not be able to automatically resolve some conflicts:
# Auto-merging index.html
# CONFLICT (content): Merge conflict in index.html
# Automatic merge failed; fix conflicts and then commit the result.
Important
Do not panic.
No-one likes merge conflicts but they happen, and are fixable.
You will have to resolve them manually.
This normally happens when two branches have the same file but with two different versions of the file. In that case Git is not able to figure out which version to use and is asking you to resolve the conflict.
First, figure out which files are affected by the conflict:
git status
git status
# On branch main
# You have unmerged paths.
# (fix conflicts and run "git commit")
#
# Unmerged paths:
# (use "git add <file>..." to mark resolution)
#
# both modified: example.Rmd
#
# no changes added to commit (use "git add" and/or "git commit -a")
- Look for the conflict markers `<<<<<<<`, `=======`, and `>>>>>>>`
- Edit the file
- `git add filename`
- `git commit -m "Message"`
- `git push origin main`
When you open the conflicted file in a text editor such as RStudio, you will see the conflicted part marked like this:
/* code unaffected by conflict */
<<<<<<< HEAD
/* code from main that caused conflict */
=======
/* code from feature that caused conflict */
>>>>>>> feature
When Git encounters a conflict, it adds `<<<<<<<`, `=======`, and `>>>>>>>` to highlight the parts that caused the conflict and need to be resolved.
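To practise the steps above without risking real work, a conflict can be produced on purpose. This sketch changes the same line of a hypothetical `example.Rmd` on `main` and on a `feature` branch, then resolves the conflict by overwriting the file; in practice you would edit it by hand. The repository and identity are assumptions for the example.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # call the first branch main, whatever the local default is
git config user.name "Demo User"        # hypothetical identity, stored only in this repo
git config user.email "demo@example.com"

echo "title: Draft" > example.Rmd
git add example.Rmd
git commit -q -m "Initial commit"

# The same line is changed differently on each branch.
git checkout -q -b feature
echo "title: Feature version" > example.Rmd
git commit -q -am "Change title on feature"

git checkout -q main
echo "title: Main version" > example.Rmd
git commit -q -am "Change title on main"

# The merge stops and reports the conflict.
if git merge feature; then
  echo "merged cleanly (not expected in this sketch)"
else
  echo "merge stopped with a conflict"
fi

conflicted="$(git diff --name-only --diff-filter=U)"   # files with unmerged changes
markers="$(grep -c '^<<<<<<<' example.Rmd)"            # number of conflict blocks in the file
cat example.Rmd                                        # shows both versions between the markers

# Resolve: keep the text we want, without any markers, then stage and commit.
echo "title: Main and feature version" > example.Rmd
git add example.Rmd
git commit -q -m "Merge feature into main, resolving conflict"
git status --short --branch
```

The final commit is the merge commit: it records both `main` and `feature` as its parents.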
`main` branch
`git add` to stage the file/s and `git commit` to commit the changes: this will generate the merge commit.
Important
When we create a branch using RStudio, the branch is created in both the local and the remote repository at the same time.
Otherwise some of your branches and changes might not be updated.
Please follow the link and get the GitHub Education pack.
From now on we will be using VSCode for managing git. Please look at the instructions in Week 3 to get it installed.
After your tutorial this week:
In the next weeks, we will be using VSCode
Summary
ETC5513 Week 4