ETC5513: Collaborative and Reproducible Practices
Tutorial 7
Cleaning Up Your Commit History
With git rm and .gitignore
🧭 Goal
Learn how to remove files from Git tracking with git rm, and prevent them from being re-added with a .gitignore file.
1️⃣ Create and Clone a Repo
On GitHub
Create a new repository: git-ignore-cleanup
✅ Check “Add a README file”
In RStudio
- Go to File → New Project → Version Control → Git
- Paste the repo URL (SSH or HTTPS)
- Choose a location and click Create Project
✅ You’re now working in a Git-tracked project.
2️⃣ Add and Commit a File
Create a new .qmd file in RStudio:
- Go to File → New File → Quarto Document
- Save the file as notes.qmd
Add some content like:
summary(mtcars)
Stage and commit the file (in the Git pane or the terminal) with the message:
"Add analysis script"
3️⃣ Accidentally Add a Data File
Download the data from Week 2 on Moodle and save it into your project as data.csv
Stage and commit:
git add data.csv
git commit -m "Add raw data"
4️⃣ Remove the File from Git, But Keep It Locally
You now realise you don't want this file in version control, but you still need it locally.
In the terminal:
git rm --cached data.csv
git commit -m "Stop tracking data.csv"
✅ data.csv is still on your computer, but Git will no longer track it.
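To see the effect of --cached without touching your tutorial repo, the sketch below repeats steps 3 and 4 in a throwaway repository. The temporary directory, the demo identity, and the two-line data.csv are placeholders invented for this illustration.

```shell
#!/bin/sh
set -eu

# Scratch repository in a temporary directory (placeholder, not your project).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

# Commit a data file by mistake, as in step 3.
printf 'x,y\n1,2\n' > data.csv
git add data.csv
git commit -q -m "Add raw data"

# Remove it from the index only; the working copy is left alone.
git rm -q --cached data.csv
git commit -q -m "Stop tracking data.csv"

# Files Git tracks versus files that exist on disk.
tracked=$(git ls-files)
on_disk=$(ls)
echo "tracked: [$tracked]"
echo "on disk: [$on_disk]"
```

The tracked list should come back empty while data.csv is still listed on disk. Note that the earlier "Add raw data" commit still contains the file; git rm --cached only stops tracking from this point on.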
5️⃣ Add It to .gitignore
To prevent it from being accidentally added again:
- Open (or create) a .gitignore file in your repo root
- Add:
data.csv
- Save and stage the .gitignore file
- Commit:
"Ignore data.csv"
🔍 Check It Worked
- Run:
git status
✅ You should not see data.csv listed anywhere. Git is now ignoring it.
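git status only shows that data.csv is absent. If you want Git to report which rule is responsible, git check-ignore can do that. A minimal sketch in a throwaway repository (the temporary directory and the file contents are placeholders for illustration):

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

printf 'x,y\n1,2\n' > data.csv      # untracked placeholder data file
printf 'data.csv\n' > .gitignore    # the rule added in step 5

# -v prints the source file, line number, and pattern that matched.
rule=$(git check-ignore -v data.csv)
echo "$rule"

# Short status: data.csv should be absent; .gitignore shows as untracked (??).
status=$(git status --porcelain)
echo "$status"
```

In your own project, run git check-ignore -v data.csv from the repo root. If it prints nothing, no ignore rule matches the file.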
6️⃣ Squash Commits with Interactive Rebase
Let’s now try squashing a few commits into one clean one.
Step 1: Make a Messy Commit History
Edit your .qmd file and make 3 separate commits:
- Add a new section or chunk → commit: "Add section"
- Fix a typo → commit: "Fix typo"
- Add a final comment → commit: "Add footnote"
✅ Commit after each change using the Git pane or terminal.
Step 2: Check Your Commit History
Run in the Terminal:
git log --oneline
You should see something like:
c3d4e5f Add footnote
b2c3d4e Fix typo
a1b2c3d Add section
...
Step 3: Start an Interactive Rebase
git rebase -i HEAD~3
You’ll see:
pick a1b2c3d Add section
pick b2c3d4e Fix typo
pick c3d4e5f Add footnote
Step 4: Squash the Commits
Change it to:
pick a1b2c3d Add section
squash b2c3d4e Fix typo
squash c3d4e5f Add footnote
Save and close the file. Git then opens the editor again so you can write a new combined commit message like:
Add section with typo fix and footnote
Save again to finish the rebase.
Step 5: Confirm It Worked
Run:
git log --oneline
✅ You should now see one clean commit where there were three.
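If you would like to rehearse the squash before doing it by hand, the sketch below runs the same rebase in a throwaway repository. It is a scripted stand-in, not the tutorial's method: GIT_SEQUENCE_EDITOR with sed takes the place of editing the todo list yourself, GIT_EDITOR=true accepts the default combined message, and git commit --amend then sets the final message. The temporary directory, demo identity, and file contents are placeholders.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

# One base commit, then three small commits to squash.
echo "# Notes" > notes.qmd
git add notes.qmd
git commit -q -m "Add analysis script"
for msg in "Add section" "Fix typo" "Add footnote"; do
  echo "$msg" >> notes.qmd
  git commit -q -am "$msg"
done

# Stand-in for editing the todo list by hand: keep the first "pick" and
# turn every later "pick" into "squash". GIT_EDITOR=true accepts the
# default combined message without opening an editor.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/squash/'" \
GIT_EDITOR=true \
git rebase -q -i HEAD~3

# Replace the combined message with a single clean one.
git commit -q --amend -m "Add section with typo fix and footnote"

commit_count=$(git rev-list --count HEAD)
last_subject=$(git log -1 --format=%s)
git log --oneline
```

The log should show two commits: the base commit and the single squashed commit carrying the new message.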
🧠 Reflect
- Why is --amend useful when working on a single file?
- When is it good practice to squash commits?
- What would happen if you did this after pushing?
✅ Summary
Action | Command |
---|---|
Fix your last commit | git commit --amend |
Combine multiple commits | git rebase -i HEAD~N |
Keep your history clean | Use these before pushing |
🎉 You’ve just learned to write cleaner, more professional commit histories!
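The summary lists git commit --amend, so here is a small sketch of it in a throwaway repository: a commit with a misspelled message and a missing line is corrected in place. The temporary directory, demo identity, and file contents are placeholders for illustration.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

echo "# Notes" > notes.qmd
git add notes.qmd
git commit -q -m "Add analysis scirpt"    # typo in the message, on purpose
before=$(git rev-parse HEAD)

# Stage the forgotten line and fold it into the last commit,
# correcting the message at the same time.
echo "summary(mtcars)" >> notes.qmd
git add notes.qmd
git commit -q --amend -m "Add analysis script"
after=$(git rev-parse HEAD)

commit_count=$(git rev-list --count HEAD)
echo "commits: $commit_count"
echo "hash before: $before"
echo "hash after:  $after"
```

There is still only one commit, but its hash has changed. Amending replaces the commit rather than editing it, which is why the table recommends doing this before pushing.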
🔁 Extension Activity: Merge vs Rebase
🧭 Goal
Understand the difference between git merge and git rebase by applying both to the same branches and comparing the result.
1️⃣ Setup: Create a Feature Branch
In your GitHub-connected RStudio project:
Create a file: experiment.R
Add one line:
Main branch version
Save, stage, and commit:
"Add base file on main"
Create a new branch called feature:
git switch -c feature
2️⃣ Add Work on the Feature Branch
Edit experiment.R again, adding the line:
Feature branch addition
Save and commit:
"Add feature content"
✅ You now have two commits on separate branches.
3️⃣ Add a Change to main
- Switch back to main:
git switch main
Add to the file again:
Main branch additional note
Save and commit:
"Add note on main branch"
📊 At This Point…
Your Git history looks like this:
     A (feature)
    /
---O---C (main)
- O = Original commit
- A = Feature commit
- C = Main branch commit
Next, we'll combine the two branches with a merge or a rebase.
4️⃣ Option A: Merge the Feature Branch
git merge feature
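Both branches added a line at the same place in experiment.R, so Git will probably stop and report a merge conflict. If it does, edit the file to keep both lines, remove the conflict markers, then stage the file and commit to finish the merge.
You'll get a merge commit (M), like this:
     A-----.    (feature)
    /       \
---O---C-----M  (main)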
✅ History shows a clear branching path and merge point.
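The merge can also be rehearsed end to end in a throwaway repository. The sketch below rebuilds the O, A, C history from the steps above and merges feature into main, resolving the expected conflict by keeping both lines. The temporary directory, demo identity, and the order of the resolved lines are assumptions made for this illustration.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main     # name the first branch "main"
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

# O: base commit on main.
echo "Main branch version" > experiment.R
git add experiment.R
git commit -q -m "Add base file on main"

# A: one commit on feature.
git switch -q -c feature
echo "Feature branch addition" >> experiment.R
git commit -q -am "Add feature content"

# C: one commit on main, touching the same place in the file.
git switch -q main
echo "Main branch additional note" >> experiment.R
git commit -q -am "Add note on main branch"

# Merge. If Git reports a conflict, resolve it by keeping both lines.
if ! git merge -q -m "Merge branch 'feature'" feature >/dev/null 2>&1; then
  printf '%s\n' "Main branch version" \
                "Main branch additional note" \
                "Feature branch addition" > experiment.R
  git add experiment.R
  git commit -q -m "Merge branch 'feature'"
fi

# A merge commit has two parents.
parent_count=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "parents of HEAD: $parent_count"
git log --oneline --graph --all
```

The --graph output draws the fork and the merge point, which is the quickest way to compare this history with the rebased one.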
5️⃣ Option B: Try It Again with Rebase
This will recreate the same setup and use rebase instead of merge.
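- Undo the merge commit (run this while still on main):
git reset --hard HEAD~1
- Switch to the feature branch:
git switch feature
- Rebase it onto main:
git rebase main
If the rebase stops on a conflict in experiment.R, edit the file to keep both lines, stage it, and run git rebase --continue.
- Now go back to main and fast-forward:
git switch main
git merge feature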
📊 After Rebase
Your Git history now looks like:
---O---C---A' (main, feature)
- A' is a new version of A, replayed on top of C
- No merge commit needed: the history is linear
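The rebase route can be rehearsed the same way. This sketch rebuilds the O, A, C history in a throwaway repository, replays feature on top of main, and then fast-forwards main. The temporary directory, demo identity, the order of the resolved lines, and the use of --ff-only (to make the fast-forward explicit) are choices made for this illustration.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main     # name the first branch "main"
git config user.name "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

echo "Main branch version" > experiment.R
git add experiment.R
git commit -q -m "Add base file on main"          # O

git switch -q -c feature
echo "Feature branch addition" >> experiment.R
git commit -q -am "Add feature content"           # A

git switch -q main
echo "Main branch additional note" >> experiment.R
git commit -q -am "Add note on main branch"       # C

# Replay A on top of C. On conflict, keep both lines and continue.
git switch -q feature
if ! git rebase -q main >/dev/null 2>&1; then
  printf '%s\n' "Main branch version" \
                "Main branch additional note" \
                "Feature branch addition" > experiment.R
  git add experiment.R
  GIT_EDITOR=true git rebase --continue >/dev/null
fi

# main can now fast-forward to feature; --ff-only refuses anything else.
git switch -q main
git merge -q --ff-only feature

merge_commits=$(git rev-list --merges --count HEAD)
total_commits=$(git rev-list --count HEAD)
echo "merge commits: $merge_commits, total commits: $total_commits"
git log --oneline --graph --all
```

The graph should be a single straight line of three commits, with main and feature pointing at the same one.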
🧠 Reflect
- What's the key difference between merge and rebase?
- Which history is easier to read?
- When is a merge preferred?
- Why must you be careful rebasing pushed commits?
✅ Summary
Action | Result |
---|---|
git merge feature | Preserves both branches + merge commit |
git rebase main (on feature) | Rewrites feature history as linear |
Use merge after pushing | ✅ Safe for shared work |
Use rebase before pushing | ✅ Keeps history clean |
🎉 You’ve now seen both strategies in action — use the right one for the right job!