Clean up large .faiss files from .git #39

Open
xkortex opened this issue Jun 24, 2020 · 2 comments

Comments

@xkortex

xkortex commented Jun 24, 2020

I noticed that some large files were accidentally committed to the repo. This is more of an inconvenience than a major issue, but it means a large download every time you pip install from the git repo, since that implicitly clones it. I'm not sure whether you are still actively maintaining this, but if you are interested in purging these files, I was able to shrink the directory from ~160MB to 1.8MB with the procedure below. Definitely practice with a backup repo and an alternate remote! You can check out my results here.

git clone "$repo" /tmp/medifor
cd /tmp/medifor
find . -name '*.faiss' -exec rm {} \; # remove faiss files
git add .
git commit -m "remove large faiss files"

Then comes the fun part. Use dockerized BFG repo cleaner to purge the history:

docker run --rm -it -v "$PWD":/data -w /data soodesune/bfg-repo-cleaner --strip-blobs-bigger-than 10M

That strips the files from the history but doesn't fully prune them yet, so then you run

git reflog expire --expire=now --all && git gc --prune=now --aggressive

to prune and collect garbage, and voila! .git should be much smaller.
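
Before picking a threshold for --strip-blobs-bigger-than, it can help to see which blobs in the history are actually large. Here is a minimal sketch, assuming a POSIX shell with git and awk available. The function names large_blobs and filter_large_blobs are made up for illustration, and the default of 10485760 bytes assumes that 10M means 10 MiB:

```shell
# Hypothetical helpers, not part of BFG or of this repo.

# Keep only "blob <size> <path>" lines at or above the threshold and
# print them as "<size><TAB><path>", largest first.
filter_large_blobs() {
  awk -v min="$1" '
    $1 == "blob" && $2 + 0 >= min + 0 {
      size = $2
      sub(/^[^ ]+ [^ ]+ /, "")   # drop type and size, keep the path as-is
      print size "\t" $0
    }' | sort -rn
}

# Walk every object reachable from any ref and report the large blobs.
large_blobs() {
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    filter_large_blobs "${1:-10485760}"
}

# Usage (from inside the clone): large_blobs 10485760
```

Running large_blobs before and after the cleanup should show whether the .faiss blobs are really gone from every ref.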

@shiblon
Contributor

shiblon commented Jun 25, 2020

rtyley/bfg-repo-cleaner#36

It looks like we can't do this in a repository that contains pull requests. We can do it by creating a new repo, moving stuff, deleting this one, then renaming it, but I don't know that it's the best idea to engage in that right now.

Unfortunately, that doesn't help you, since pip install doesn't have a --depth option and apparently won't anytime soon.
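
One possible way around the full clone, not something tried in this thread: pip can also install from a source archive URL, and GitHub serves a zip of a single ref without any history. This is a sketch under assumptions: the helper name archive_url is made up, the branch name master is a guess, and it only helps if the package's setup files sit at the repository root. The archive still contains whatever large files exist at that ref.

```shell
# Hypothetical helper: build the GitHub archive URL for owner/repo at a ref.
archive_url() {
  printf 'https://github.com/%s/%s/archive/%s.zip\n' "$1" "$2" "$3"
}

# Usage (the branch name "master" is an assumption):
#   pip install "$(archive_url mediaforensics medifor master)"
```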

I'm leaving this open in case we get the gumption to fix it or have a bright idea. Meanwhile, what I'm seeing is a lot of this:

$ git push
Enumerating objects: 13, done.
Counting objects: 100% (13/13), done.
Writing objects: 100% (29/29), 6.10 KiB | 6.10 MiB/s, done.
Total 29 (delta 13), reused 13 (delta 13), pack-reused 16
remote: Resolving deltas: 100% (15/15), completed with 10 local objects.
To https://github.com/mediaforensics/medifor.git
 ! [remote rejected] refs/pull/1/head -> refs/pull/1/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/10/head -> refs/pull/10/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/11/head -> refs/pull/11/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/12/head -> refs/pull/12/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/14/head -> refs/pull/14/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/16/head -> refs/pull/16/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/17/head -> refs/pull/17/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/17/merge -> refs/pull/17/merge (deny updating a hidden ref)
 ! [remote rejected] refs/pull/18/head -> refs/pull/18/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/19/head -> refs/pull/19/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/19/merge -> refs/pull/19/merge (deny updating a hidden ref)
 ! [remote rejected] refs/pull/2/head -> refs/pull/2/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/20/head -> refs/pull/20/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/21/head -> refs/pull/21/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/22/head -> refs/pull/22/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/23/head -> refs/pull/23/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/24/head -> refs/pull/24/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/25/head -> refs/pull/25/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/27/head -> refs/pull/27/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/29/head -> refs/pull/29/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/3/head -> refs/pull/3/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/30/head -> refs/pull/30/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/31/head -> refs/pull/31/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/32/head -> refs/pull/32/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/33/head -> refs/pull/33/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/34/head -> refs/pull/34/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/35/head -> refs/pull/35/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/36/head -> refs/pull/36/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/37/head -> refs/pull/37/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/38/head -> refs/pull/38/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/4/head -> refs/pull/4/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/5/head -> refs/pull/5/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/6/head -> refs/pull/6/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/7/head -> refs/pull/7/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/8/head -> refs/pull/8/head (deny updating a hidden ref)
 ! [remote rejected] refs/pull/9/head -> refs/pull/9/head (deny updating a hidden ref)

https://stackoverflow.com/questions/34265266/remote-rejected-errors-after-mirroring-a-git-repository

@xkortex
Author

xkortex commented Jun 26, 2020

Huh, interesting. Yeah, like I said, it isn't a tremendous issue. I just did it because I had to fork the medifor API anyway (because...reasons, frankly inadequate ones), so I figured I'd give it a shot and report my findings to you folks.
