Renaming paths into pre-existing path causes double-nested paths (sometimes) #538

Open
ChrisCarini opened this issue Jan 29, 2024 · 1 comment

Comments

@ChrisCarini

ChrisCarini commented Jan 29, 2024

Hello,

We are seeing quite strange behavior (only sometimes*) when using git-filter-repo to re-home an entire repository one directory down. Below are simple example directory structures showing (a) what we start with and (b) what we end up with; (b) is split into two examples because the result varies.

"Starting with" structure

$ > tree -a -L 2 .
.
├── D_bar
├── D_batz
├── D_foo
│   ├── F_blop
│   ├── F_flap
│   └── F_flop
├── F_blatz
├── F_blub
└── F_flub

(Note: D_ represents a directory, and F_ represents a file throughout the examples.)

Command we run

git filter-repo --to-subdirectory-filter D_foo

"End up with" structure

Correct (what we want); we get this sometimes*:
$ > tree -a -L 3
.
└── D_foo
    ├── D_bar
    ├── D_batz
    ├── D_foo
    │   ├── F_blop
    │   ├── F_flap
    │   └── F_flop
    ├── F_blatz
    ├── F_blub
    └── F_flub

Incorrect (what we don't want); we also get this sometimes*:
$ > tree -a -L 4
.
└── D_foo
    └── D_foo
        ├── D_bar
        ├── D_batz
        ├── D_foo
        │   ├── F_blop
        │   ├── F_flap
        │   └── F_flop
        ├── F_blatz
        ├── F_blub
        └── F_flub

The 'unexpected part'

You'll notice that in the second one there are two nested D_foo directories at the top (i.e. D_foo/D_foo/{D_foo,D_batz,D_bar,...} instead of D_foo/{D_foo,D_batz,D_bar,...}).
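
One way to tell the two outcomes apart mechanically is to check whether every tracked path begins with the doubled prefix. The sketch below is an illustration, not part of git-filter-repo: both function names are hypothetical, and the check assumes the original repository had at least one file outside D_foo (as in the example above), since otherwise even a correct result would sit entirely under D_foo/D_foo/.

```python
import subprocess


def is_double_nested(paths, subdir="D_foo"):
    """Return True if every path sits under <subdir>/<subdir>/.

    In the correct layout only the files that originally lived in
    <subdir> carry that prefix; in the incorrect layout all files do.
    """
    prefix = f"{subdir}/{subdir}/"
    paths = list(paths)
    return bool(paths) and all(p.startswith(prefix) for p in paths)


def tracked_paths(repo_dir):
    """List files tracked at HEAD (hypothetical helper; needs git on PATH)."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "ls-tree", "-r", "--name-only", "-z", "HEAD"],
        check=True, capture_output=True, text=True,
    )
    # -z separates entries with NUL bytes, so unusual file names stay intact.
    return [p for p in result.stdout.split("\0") if p]
```

After a run, `is_double_nested(tracked_paths("path/to/filtered/clone"))` would report which of the two layouts was produced, where the path is a placeholder for the filtered clone.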

*"sometimes"

What is very weird here is that this only seems to happen 'sometimes': the command does what we expect for several hours on end, and then does what we don't expect for several more hours on end. For example, I can run this command on a fresh clone of a repository (cloned fresh before every invocation of git-filter-repo) every few minutes for 4+ hours and get the expected result. Then, seemingly out of nowhere, the invocations of git-filter-repo (again, each on a fresh clone) start doing what we don't expect (double-nested directories).
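
Because the result flips over time even on fresh clones, it may help to log a small environment fingerprint next to each run and compare fingerprints from a "good" period with those from a "bad" period. The following is only a sketch: the function names and the choice of fields are assumptions, and none of these fields is known to be related to the cause.

```python
import hashlib
import shutil
import subprocess
import sys
from datetime import datetime, timezone


def file_sha256(path):
    """Hash a file so two runs can be checked for using identical bytes."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def environment_fingerprint(tool="git-filter-repo"):
    """Collect facts about the tools that a run would pick up from PATH."""
    git_path = shutil.which("git")
    tool_path = shutil.which(tool)
    git_version = None
    if git_path:
        git_version = subprocess.run(
            [git_path, "--version"], capture_output=True, text=True
        ).stdout.strip()
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        # Version of the Python running this script, which is not
        # necessarily the interpreter the tool's shebang resolves to.
        "python": sys.version.split()[0],
        "git_path": git_path,
        "git_version": git_version,
        "tool_path": tool_path,
        "tool_sha256": file_sha256(tool_path) if tool_path else None,
    }
```

Writing one such record per invocation, together with whether the output was single- or double-nested, would show whether anything on the machine changed at the moment the behavior flipped.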

I have tried debugging the git-filter-repo script, stepping through what I believe is the part of the tool that does the path renaming, and I do not see how, why, or when this happens.

--to-subdirectory-filter <dir> vs --path-rename :<dir>/

We have also tried using --path-rename :<dir>/ instead, and see similar behavior. This is not a big surprise after looking at the code, since even the docs say these two options are equivalent.
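
To make the equivalence concrete, here is a toy model of an empty-prefix rename. It is not git-filter-repo's code: `prepend_subdir` is a hypothetical name, and the model only assumes that both options amount to prepending `<dir>/` to every path. Under that assumption, one application gives the correct layout and two applications give exactly the incorrect one. That is an observation about the shape of the output, not a diagnosis.

```python
def prepend_subdir(paths, subdir):
    """Toy model: move every path under subdir/ by prepending the prefix."""
    prefix = subdir.rstrip("/") + "/"
    return [prefix + p for p in paths]


# Files from the "starting with" example (directories hold no state in git).
start = [
    "D_foo/F_blop", "D_foo/F_flap", "D_foo/F_flop",
    "F_blatz", "F_blub", "F_flub",
]

once = prepend_subdir(start, "D_foo")   # matches the correct layout
twice = prepend_subdir(once, "D_foo")   # matches the incorrect layout
```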

Attempts to 'work around' the strange behavior

We have also tried splitting the operation into two git-filter-repo commands, something like below:

git filter-repo --path-rename D_foo/:D_foo_existing/ --path-rename :D_foo/ && \
git filter-repo --path-rename D_foo/D_foo_existing/:D_foo/D_foo/ && \

This, too, showed the same behavior: sometimes it does what we want/expect, and sometimes it does not.

We are also seeing this behavior across a few developer machines, so we don't believe it is related to any machine-specific Git configuration.

Please advise! 😄 🙏

@newren
Owner

newren commented Aug 2, 2024

You have me stumped; I have absolutely no idea how that could possibly happen. Perhaps try adding the --debug flag and compare the .git/filter-repo/fast-export.{original,filtered} files. Does the fast-export.filtered version show the nesting of paths? Of course, whether it does or doesn't I'm still stumped, but I guess it lets us at least cut down the side where the problem might be a little bit.
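
The comparison suggested above could be scripted roughly as follows. This is a sketch under assumptions: it only looks at filemodify (`M`) lines, it does not unquote paths that fast-export wraps in double quotes, and the helper names are hypothetical. It skips counted `data` blocks so that commit-message text is never mistaken for a command.

```python
def modified_paths(stream):
    """Yield paths named on 'M <mode> <dataref> <path>' lines.

    `stream` is a binary file object containing a fast-export stream.
    """
    while True:
        line = stream.readline()
        if not line:
            break
        if line.startswith(b"data "):
            # Counted form: 'data <n>' is followed by exactly n raw bytes.
            stream.read(int(line[5:].strip()))
        elif line.startswith(b"M "):
            parts = line.rstrip(b"\n").split(b" ", 3)
            if len(parts) == 4:
                yield parts[3]


def count_with_prefix(stream, prefix=b"D_foo/D_foo/"):
    """Return (paths starting with prefix, all paths seen)."""
    nested = total = 0
    for path in modified_paths(stream):
        total += 1
        if path.startswith(prefix):
            nested += 1
    return nested, total
```

Opening `.git/filter-repo/fast-export.original` and `.git/filter-repo/fast-export.filtered` in `"rb"` mode and passing each to `count_with_prefix` gives two pairs to compare. If the filtered file reports every path under the doubled prefix, the nesting is already present in that file.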
