Farley Lai edited this page Apr 30, 2021 · 1 revision

Essential Git tasks.

Credentials over HTTPS

Cache the username and password after the first prompt until the timeout expires:

git config --global credential.helper 'cache --timeout=21600' # 6hrs
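To confirm what was stored, the setting can be read back. The following is a minimal sketch that points HOME at a throwaway directory (an assumption made only so the demo leaves the real ~/.gitconfig untouched). The helper value is quoted so that --timeout reaches the cache helper instead of being parsed as an option of git config.

```shell
# Demo only: use a temporary HOME so the real global config is not modified.
export HOME="$(mktemp -d)"
unset XDG_CONFIG_HOME

# Quote the helper string so "--timeout=21600" is stored as part of the value.
git config --global credential.helper 'cache --timeout=21600'

# Read the value back to verify it.
helper="$(git config --global --get credential.helper)"
echo "$helper"
```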

Branching

Create and check out a new branch:

git checkout -b branch_name

Alternatively, fetch and check out a remote branch:

git fetch <remote-repo> <remote-branch>:<local-branch>
git checkout <local-branch>

Work on the files in the new branch.
Push the newly created branch to the remote repository:

git push -u origin branch_name
git push -u origin heads/branch_name            # if a tag has the same name

Delete local and remote branches:

git branch -d branch_name
git push origin --delete branch_name
git push origin --delete heads/branch_name      # delete with a branch ref
git push origin --delete refs/heads/branch_name # delete with a full branch ref
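The branch lifecycle above can be tried end to end without touching a real server. This is a sketch under stated assumptions: a local bare repository in a temporary directory stands in for origin, and the commit identity is a placeholder set through environment variables.

```shell
# Demo only: a local bare repository stands in for the remote "origin".
work="$(mktemp -d)"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q --bare "$work/origin.git"
git init -q "$work/repo"
cd "$work/repo" || exit 1
git remote add origin "$work/origin.git"
git commit -q --allow-empty -m "initial commit"
default_branch="$(git symbolic-ref --short HEAD)"

# Create the branch and publish it with upstream tracking.
git checkout -q -b branch_name
git push -q -u origin branch_name
heads_after_push="$(git ls-remote --heads origin branch_name | wc -l | tr -d ' ')"

# Leave the branch before deleting it locally, then delete it on the remote.
git checkout -q "$default_branch"
git branch -q -d branch_name
git push -q origin --delete branch_name
heads_after_delete="$(git ls-remote --heads origin branch_name | wc -l | tr -d ' ')"

echo "after push: $heads_after_push, after delete: $heads_after_delete"
```

Git refuses to delete the branch that is currently checked out, which is why the sketch switches back to the default branch first.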

Tagging

To tag the current commit:

git tag v1.5                                    # simple tag
git tag -a v1.5 -m "description"                # with annotation

To push tags to the remote:

git push origin tag_name                        # push a particular tag
git push origin tags/tag_name                   # if a branch has the same name
git push origin --tags                          # push all tags

Delete local and remote tags:

git tag -d tag_name                             # delete a local tag
git push origin --delete tag_name               # delete a tag from remote
git push origin --delete tags/tag_name          # delete with a tag ref
git push origin --delete refs/tags/tag_name     # delete with a full tag ref

Query the latest tag regardless of annotations:

git describe --tags --abbrev=0
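The effect of --tags is easiest to see with one annotated and one lightweight tag. The sketch below builds a throwaway repository (the tag name v1.4 and the commit identity are made up for the demo) and compares the two forms of git describe.

```shell
# Demo only: throwaway repository with placeholder identity.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q "$work/repo"
cd "$work/repo" || exit 1

git commit -q --allow-empty -m "first"
git tag -a v1.4 -m "annotated release"   # annotated tag on the first commit

git commit -q --allow-empty -m "second"
git tag v1.5                             # lightweight tag on the second commit

# With --tags, lightweight tags are considered too.
with_tags="$(git describe --tags --abbrev=0)"
# Without --tags, only annotated tags are considered.
annotated_only="$(git describe --abbrev=0)"

echo "with --tags: $with_tags, annotated only: $annotated_only"
```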

Fork an External Repo

First import the external repo into our GitLab. Then, in the fork, add the original repo as a remote named upstream:

$ git remote add upstream https://github.com/ORIGINAL_OWNER/ORIGINAL_REPOSITORY.git

# To show all the remotes:
$ git remote -v

See here for more details.

Merge Upstream Updates

Once an external repo is forked or imported as a submodule, updates from upstream need to be merged from time to time. Assuming the working branch is master, the following commands fetch and merge the updates from the upstream repo:

git fetch upstream
git checkout master
git merge upstream/master

Make sure to check out the right branch before merging.
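The fetch-and-merge flow can be rehearsed locally. In this sketch a plain local repository plays the role of upstream (the directory names original and fork are assumptions for the demo), and the default branch name is read from the repository rather than assumed to be master.

```shell
# Demo only: local repositories in a temporary directory, placeholder identity.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# The "original" repository that plays the role of upstream.
git init -q "$work/original"
git -C "$work/original" commit -q --allow-empty -m "upstream: initial commit"
branch="$(git -C "$work/original" symbolic-ref --short HEAD)"

# The fork: a clone with an extra remote named upstream.
git clone -q "$work/original" "$work/fork"
cd "$work/fork" || exit 1
git remote add upstream "$work/original"

# Upstream moves ahead by one commit.
git -C "$work/original" commit -q --allow-empty -m "upstream: new work"

# Fetch and merge the update into the working branch.
git fetch -q upstream
git checkout -q "$branch"
git merge -q "upstream/$branch"

fork_head="$(git rev-parse HEAD)"
upstream_head="$(git -C "$work/original" rev-parse HEAD)"
echo "fork: $fork_head upstream: $upstream_head"
```

Because the fork has no commits of its own here, the merge is a fast-forward and both heads end up identical; a fork with local commits would get a merge commit instead.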

Submodules

Initialize a submodule to reference the ML repository:

git submodule add [-b dev] https://gitlab.com/necla-ml/ML.git submodules/ML
git submodule init 

git submodule add creates .gitmodules and stages it together with the submodule path; commit both afterwards.

Set up the package in the project root

Create a symlink to the package in the submodule for ease of import:

ln -s submodules/ML/ml

Working with submodules

Clone a project together with its submodules:

git clone --recursive [URL to Git repo]

Initialize and update submodules if the repo was cloned without --recursive:

git submodule update --init
# if there are nested submodules:
git submodule update --init --recursive

Check out the desired branch in each submodule, which otherwise stays on a detached HEAD:

git submodule foreach -q --recursive 'git checkout $(git config -f $toplevel/.gitmodules submodule.$name.branch || echo master)'
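The inner command looks up the branch recorded for each submodule in .gitmodules and falls back to master when none is recorded. The lookup can be examined on its own with a hand-written .gitmodules file; the second submodule name, other, is hypothetical and exists only to show the fallback.

```shell
# Demo only: build a .gitmodules file in a temporary directory.
work="$(mktemp -d)"
cd "$work" || exit 1

# Entry with a recorded branch, as "git submodule add -b dev" would write it.
git config -f .gitmodules submodule.submodules/ML.path submodules/ML
git config -f .gitmodules submodule.submodules/ML.url https://gitlab.com/necla-ml/ML.git
git config -f .gitmodules submodule.submodules/ML.branch dev

# Hypothetical entry without a recorded branch.
git config -f .gitmodules submodule.other.path other

# Same lookup as in the foreach command, with $name filled in by hand.
name="submodules/ML"
tracked="$(git config -f .gitmodules "submodule.$name.branch" || echo master)"
name="other"
fallback="$(git config -f .gitmodules "submodule.$name.branch" || echo master)"

echo "submodules/ML -> $tracked, other -> $fallback"
```

Inside git submodule foreach, $name and $toplevel are set by Git for each submodule, so the single quotes in the original command matter: they keep the calling shell from expanding those variables too early.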

Pull submodules:

# pull all changes in the repo including changes in the submodules
git pull --recurse-submodules

# pull all changes for the submodules only
git submodule update --remote

Push submodules:

# update submodule in the master branch
# skip this if you use --recurse-submodules
# and have the master branch checked out
cd [submodule directory]
git checkout master
git pull

# commit the change in main repo
# to use the latest commit in master of the submodule
cd ..
git add [submodule directory]
git commit -m "move submodule to latest commit in master"

# share your changes
git push

Run a command in each submodule:

git submodule foreach 'git reset --hard'
# including nested submodules
git submodule foreach --recursive 'git reset --hard'

Remove submodules:

git submodule deinit -f mymodule
rm -rf .git/modules/mymodule
git rm -f mymodule

Git Large File Storage

Install Git and Git LFS first if necessary, e.g. with conda install git git-lfs.

git lfs install                               # First time installation
git lfs track [file | pattern]                # Track the file or pattern with LFS

git add .gitattributes
git add tracked_files
git commit -m "Add tracked files to LFS"
git push                                      # Upload LFS objects

git lfs pull --include="tracked_files"        # Download from LFS storage

GIT_LFS_SKIP_SMUDGE=1 git clone <repo>        # Clone the repo without LFS objects

Migrating existing repository data to LFS

Files that were committed directly to the repository but should have been tracked by LFS can be migrated to Git LFS storage as follows:

git lfs migrate import --include="pattern"

Remove Large Files from Recent Commit

Binaries and other files that can be installed or generated by running separate scripts or commands should not be pushed. Once pushed, they become part of the history and are downloaded on every later clone of the repo. If the offending commit is recent, the server repo can be restored with an interactive rebase followed by a forced push:

git rebase -i commit-sha1
git push --force

Git lists the commits made after commit-sha1 so that each can be picked or dropped manually. Dropping the commits that added the large files and force-pushing the result restores the server repo.
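When exactly one known commit has to go, the same result can be reached without the interactive editor. The sketch below swaps in git rebase --onto, which is a different invocation from the interactive rebase above, and runs it in a throwaway repository; the file name big.bin and the commit identity are made up for the demo.

```shell
# Demo only: throwaway repository with placeholder identity.
work="$(mktemp -d)"
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q "$work/repo"
cd "$work/repo" || exit 1

echo "readme" > README.txt
git add README.txt
git commit -q -m "add readme"

# Hypothetical large binary that should not have been committed.
echo "pretend this is large" > big.bin
git add big.bin
git commit -q -m "add big.bin by mistake"
bad="$(git rev-parse HEAD)"

echo "notes" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# Replay everything after $bad onto its parent, which drops $bad itself.
git rebase -q --onto "$bad^" "$bad"

commit_count="$(git rev-list --count HEAD)"
echo "commits left: $commit_count"
```

In a real repo the rewritten branch would then be published with git push --force, as in the steps above.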

Remove Large Files from History

To allow offline work, git clone copies the entire repo, including its history. Therefore, even after a large file is deleted, its presence in the history still bloats every clone. To address this, the large file must be removed from all previous history.

A utility called BFG simplifies the process of cleaning the repo as follows:

git clone --mirror git://example.com/some-big-repo.git
java -jar /path/to/downloaded/bfg.jar --strip-blobs-bigger-than 100M some-big-repo.git

cd some-big-repo.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push

Make sure the local repo is up to date and pushed before the cleaning. The cleaning changes the commit hashes in the remote repo, so it is best to clone the repo again before continuing work.

Repository Cleanup

See documentation on GitLab.