
04 Collaborating with Git & GitHub

Neha Moopen edited this page Aug 4, 2021 · 1 revision

The Data Privacy Handbook is hosted on GitHub (where you are now!) and version controlled by git.

  • Git is software that tracks the history of files (who made which changes and when). It allows reverting to a previous version and can run both online (e.g., here on GitHub) and offline (on your local PC).
  • GitHub is a platform that you can use to collaborate on projects that use git. It additionally allows for threaded discussions (issues), pull requests (see below) and several great apps, but is not something you can "run" on your local PC, because it is a website.

For the Data Privacy Handbook, we use git to track changes to the handbook content and GitHub to host the content online. For now, we assume that you will use RStudio to edit and version control the handbook content. However, you can also open .Rmd files in other programs (e.g., Atom, VS Code, Zettlr), edit them there, and then use the command line to work with git.

Installation

  1. Create a GitHub account.
  2. Download and install git locally. To work with git from within RStudio, you also need RStudio, which you should already have installed in step 01.
  3. For editing the handbook content, please refer to the Contributing Guidelines.

How does git(hub) work?

The git workflow

When working on a git project (within a folder called a git repository), you will always perform the following steps:

  1. Make changes to some file and save them like you normally would
  2. Stage the changes: select which files you want to make a snapshot of (this step is most explicit if you work in the command line)
  3. Commit the changes: make a snapshot of the staged changes. A commit (snapshot) is always accompanied by a commit message explaining what was changed. Each commit gets a unique identifier that can be used to refer to it later, for example to reverse (undo) the commit.

Stage- and commit-related commands (these are simply clicks if you work in RStudio):

  • Check which files are changed but not yet staged or committed: git status
  • Stage a file (tip: press the Tab key to autocomplete the filename): git add filename
  • Stage multiple files: git add filename1 filename2 filename3
  • Stage all unstaged files in the workspace: git add -A .
  • Commit the change(s) you staged: git commit -m "Change x and y to z"
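As an illustration, the three steps above can be walked through in a throwaway repository. This is only a minimal sketch: the file name chapter.Rmd, the commit message, and the author name and email are placeholders, and the temporary folder is used so that no real project is touched.

```shell
#!/bin/sh
set -e

# Work in a temporary folder so nothing real is affected.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q

# Placeholder identity, set for this repository only, so the commit succeeds.
git config user.name "Example Author"
git config user.email "author@example.com"

# 1. Make a change and save it (chapter.Rmd is a placeholder file).
echo "# Draft chapter" > chapter.Rmd
git status --short    # chapter.Rmd is listed as untracked (??)

# 2. Stage the change.
git add chapter.Rmd
git status --short    # chapter.Rmd is now listed as added (A)

# 3. Commit the staged change with a message.
git commit -q -m "Add draft chapter"
git log --oneline     # one line: the short commit identifier and the message
```

After the commit, git status reports nothing left to stage or commit, which is a quick way to confirm that the snapshot contains everything you intended.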

Branches

A git repository can exist in multiple “versions”, which are called branches. The main branch should always be considered the clean branch. Besides that, you can create other branches in which to make your own changes or try something different without dirtying the clean (main) version. If you want to update the clean version with your edits, you can merge your branch into the main branch (or ask for it to be merged).

Some branch-related commands:

  • Check which branch you are working on now (and list which branches there are): git branch -v
  • Change branches: git checkout branchname
  • Create a new branch: git checkout -b newbranchname
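The branch commands above can be tried out in a throwaway repository. The sketch below makes a few assumptions: the branch name my-edits, the file handbook.Rmd and the author identity are placeholders, and the final git merge is the local command-line equivalent of merging, which in the handbook itself would normally go through the workflow in the Contributing Guidelines.

```shell
#!/bin/sh
set -e

# Throwaway repository with a first commit on a branch called main.
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git symbolic-ref HEAD refs/heads/main   # make sure the first branch is named main
git config user.name "Example Author"   # placeholder identity
git config user.email "author@example.com"
echo "Introduction" > handbook.Rmd
git add handbook.Rmd
git commit -q -m "Add introduction"

# Create a new branch and switch to it.
git checkout -q -b my-edits

# Make and commit a change on the new branch; main stays untouched.
echo "New section" >> handbook.Rmd
git add handbook.Rmd
git commit -q -m "Add new section"

# List the branches; the current one is marked with an asterisk.
git branch -v

# Switch back to main and merge the edits into it.
git checkout -q main
git merge -q my-edits
```

Until the merge, the file on main still contains only the first line, which is what keeps the main branch clean while you experiment.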

Workflow on GitHub

Please refer to the Contributing Guidelines for the complete and preferred workflow on GitHub.

Keeping your local copy (clone) up to date

If you are working on a project with many collaborators, your own fork (online copy) and/or clone (local copy) will quickly become out of date. It is therefore recommended to update those copies each time before you start making changes yourself, so that you are working on the most recent versions of the files.

  • If you are working locally from the command line, you can set up the owner’s repository as the "upstream" repository and then pull all commits from the upstream repository to your local PC:
    • Setting up the original repository as the upstream: git remote add upstream https://github.com/ownername/repositoryname.git
    • Pulling changes from the upstream repository: git pull upstream branchname
  • If you are working locally from RStudio, please refer to this page
  • To update your online version of the repository, simply push the changes (e.g., git push origin main after pulling from the upstream)
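The command-line route above can be rehearsed without touching GitHub. In this sketch, local folders stand in for the online repositories: owner.git plays the owner's repository, fork.git plays your fork, and all names, file contents and the collaborator identity are made up for the example. The --ff-only option is an addition here; it makes the pull stop instead of merging if your copy has diverged from the upstream.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# A collaborator's repository with one commit (stand-in content).
git init -q "$work/collaborator"
cd "$work/collaborator"
git symbolic-ref HEAD refs/heads/main
echo "Chapter 1" > handbook.Rmd
git add handbook.Rmd
git -c user.name="Example Collaborator" -c user.email="collab@example.com" \
    commit -q -m "Add chapter 1"

# Local stand-ins for the owner's repository and your fork on GitHub.
git clone -q --bare "$work/collaborator" "$work/owner.git"
git clone -q --bare "$work/owner.git" "$work/fork.git"

# Your clone (local copy); its "origin" remote points at your fork.
git clone -q "$work/fork.git" "$work/clone"

# Meanwhile, the collaborator adds a commit to the owner's repository.
echo "Chapter 2" >> handbook.Rmd
git add handbook.Rmd
git -c user.name="Example Collaborator" -c user.email="collab@example.com" \
    commit -q -m "Add chapter 2"
git push -q "$work/owner.git" main

# Back in your clone: register the upstream, pull from it, then update the fork.
cd "$work/clone"
git remote add upstream "$work/owner.git"
git pull -q --ff-only upstream main
git push -q origin main
```

With real repositories, the local paths would be replaced by the GitHub URLs, as in the git remote add upstream command shown above.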

Resources