git-annex: size miscalculation #32

Open
kousu opened this issue Nov 30, 2022 · 3 comments

kousu commented Nov 30, 2022

The summary size shown for a repo (top right) can fall out of sync with its git-annex files. In this example, a single file (6.6MB) is larger than the size the repo reports for itself (3.3MB):

Screenshot 2022-11-30 at 00-52-07 test

The number seems to be generated -- and then cached -- by this function:

https://github.com/neuropoly/gitea/blob/fa43bce541507c8a702723f84764a48db9278506/modules/repository/create.go#L288-L301

So this is a cache invalidation problem: git annex copy --to is not triggering a recomputation like it should be.
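
To make the failure mode concrete, here is a minimal sketch of the same pattern outside Gitea. The names repoRecord, dirSize, updateRepoSize and demo are hypothetical stand-ins (dirSize plays the role of util.GetDirectorySize, which may behave differently), and the "upload" is just a file written into a temporary directory:

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// dirSize walks root and sums the sizes of all regular files.
// Hypothetical stand-in for util.GetDirectorySize.
func dirSize(root string) (int64, error) {
	var total int64
	err := filepath.WalkDir(root, func(_ string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.Type().IsRegular() {
			info, err := d.Info()
			if err != nil {
				return err
			}
			total += info.Size()
		}
		return nil
	})
	return total, err
}

// repoRecord is a hypothetical stand-in for the database row holding the cached size.
type repoRecord struct {
	Path string
	Size int64 // cached value; only changes when updateRepoSize runs
}

// updateRepoSize recomputes the on-disk size and stores it in the record.
func updateRepoSize(r *repoRecord) error {
	size, err := dirSize(r.Path)
	if err != nil {
		return err
	}
	r.Size = size
	return nil
}

// demo returns the cached size at creation, after an out-of-band upload,
// and after an explicit recomputation.
func demo() (before, stale, fresh int64, err error) {
	dir, err := os.MkdirTemp("", "repo")
	if err != nil {
		return 0, 0, 0, err
	}
	defer os.RemoveAll(dir)

	repo := &repoRecord{Path: dir}
	if err = updateRepoSize(repo); err != nil {
		return 0, 0, 0, err
	}
	before = repo.Size

	// Simulate an annexed object arriving without any push.
	obj := filepath.Join(dir, "annex-object")
	if err = os.WriteFile(obj, make([]byte, 1024), 0o644); err != nil {
		return 0, 0, 0, err
	}
	stale = repo.Size // nothing invalidated the cache, so this is unchanged

	if err = updateRepoSize(repo); err != nil {
		return 0, 0, 0, err
	}
	fresh = repo.Size
	return before, stale, fresh, nil
}

func main() {
	before, stale, fresh, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(before, stale, fresh)
}
```

The cached value only catches up once something calls the update function again, which is the step that appears to be missing for git annex copy --to.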

kousu commented Nov 30, 2022

Here are all the callers:

p115628@joplin:~/src/neurogitea/gitea$ git grep UpdateRepoSize
models/repo/update.go:// UpdateRepoSize updates the repository size, calculating it using util.GetDirectorySize
models/repo/update.go:func UpdateRepoSize(ctx context.Context, repoID, size int64) error {
modules/repository/create.go:// UpdateRepoSize updates the repository size, calculating it using util.GetDirectorySize
modules/repository/create.go:func UpdateRepoSize(ctx context.Context, repo *repo_model.Repository) error {
modules/repository/create.go:   return repo_model.UpdateRepoSize(ctx, repo.ID, size+lfsSize)
modules/repository/create.go:   if err = UpdateRepoSize(ctx, repo); err != nil {
modules/repository/generate.go: if err := UpdateRepoSize(ctx, generateRepo); err != nil {
modules/repository/repo.go:             if err = UpdateRepoSize(ctx, repo); err != nil {
routers/web/repo/view.go:               if err = repo_module.UpdateRepoSize(ctx, ctx.Repo.Repository); err != nil {
routers/web/repo/view.go:                       ctx.ServerError("UpdateRepoSize", err)
services/mirror/mirror_pull.go: if err := repo_module.UpdateRepoSize(ctx, m.Repo); err != nil {
services/repository/check.go:                   if err := repo_module.UpdateRepoSize(ctx, repo); err != nil {
services/repository/fork.go:    if err := repo_module.UpdateRepoSize(ctx, repo); err != nil {
services/repository/push.go:    if err = repo_module.UpdateRepoSize(ctx, repo); err != nil {

I notice that last line says 'push' so I took an educated guess and tried adding a commit:

p115628@joplin:~/src/neurogitea/test/test$ touch a
p115628@joplin:~/src/neurogitea/test/test$ git add a
p115628@joplin:~/src/neurogitea/test/test$ git commit -m "touch"
[main 18bb9db] touch
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 a
p115628@joplin:~/src/neurogitea/test/test$ git push
Locking support detected on remote "origin". Consider enabling it with:
  $ git config lfs.https://localhost/kousu/test.git/info/lfs.locksverify true
Enumerating objects: 4, done.
Counting objects: 100% (4/4), done.
Delta compression using up to 128 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 269 bytes | 269.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0), pack-reused 0
remote: . Processing 1 references
remote: Processed 1 references in total
To localhost:kousu/test.git
   10bed66..18bb9db  main -> main

Now the size is fixed:

Screenshot 2022-11-30 at 01-03-25 test

so it really does seem to be a relatively simple cache-invalidation issue.

So maybe what we want here is to add an UpdateRepoSize() call to the git-annex-shell code, somewhere around here:

https://github.com/neuropoly/gitea/blob/0901a8cadf1da063c774465a888d6e019e60cfc5/cmd/serv.go#L322-L334

https://github.com/neuropoly/gitea/blob/0901a8cadf1da063c774465a888d6e019e60cfc5/cmd/serv.go#L380
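
A rough sketch of the shape such a hook could take, not the actual cmd/serv.go code. serveCommand, sizeChangingVerbs and errCommandFailed are hypothetical names, and the refresh step is passed in as a callback because this sketch makes no assumption about how serv.go would reach UpdateRepoSize:

```go
package main

import (
	"errors"
	"fmt"
)

// sizeChangingVerbs lists SSH verbs after which the cached size should be refreshed.
// Assumption for illustration: only git-annex-shell needs this, since pushes
// already trigger a recomputation elsewhere.
var sizeChangingVerbs = map[string]bool{
	"git-annex-shell": true,
}

// errCommandFailed is a placeholder error used by the demo below.
var errCommandFailed = errors.New("command failed")

// serveCommand runs the command for verb. If it succeeds and the verb may have
// written data, it calls refresh afterwards.
func serveCommand(verb string, run, refresh func() error) error {
	if err := run(); err != nil {
		return fmt.Errorf("%s: %w", verb, err)
	}
	if !sizeChangingVerbs[verb] {
		return nil
	}
	if err := refresh(); err != nil {
		return fmt.Errorf("refresh size after %s: %w", verb, err)
	}
	return nil
}

func main() {
	refreshes := 0
	refresh := func() error { refreshes++; return nil }
	ok := func() error { return nil }
	fail := func() error { return errCommandFailed }

	_ = serveCommand("git-annex-shell", ok, refresh)   // refreshes
	_ = serveCommand("git-upload-pack", ok, refresh)   // read-only verb, no refresh
	_ = serveCommand("git-annex-shell", fail, refresh) // failed command, no refresh
	fmt.Println("refreshes:", refreshes)
}
```

Refreshing after every successful git-annex-shell invocation is the bluntest option: it would also fire on downloads, but it avoids having to parse which annex operation was requested.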

kousu commented Nov 30, 2022

In practice this bug is going to be pretty rare: after running git annex copy --to you usually must also run git annex sync or git push origin git-annex:git-annex (or use git annex sync --content in the first place), otherwise the files are inaccessible, and that push triggers the recomputation.

kousu commented Feb 19, 2023

Heads up: this is getting significantly more complicated, see go-gitea#22900
