Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nondeterministic macOS is_symlink assertion failure in overwriting_files_and_lone_directories_works #1373

Closed
EliahKagan opened this issue May 20, 2024 · 1 comment · Fixed by #1374
Labels
acknowledged an issue is accepted as shortcoming to be fixed

Comments

@EliahKagan
Copy link
Member

Current behavior 😯

In gix-worktree-state/tests/state/checkout.rs, the test overwriting_files_and_lone_directories_works fails occasionally on the assertion:

https://github.com/Byron/gitoxide/blob/3c2174101ed35dcb9bdb4585b3245507b15efe59/gix-worktree-state/tests/state/checkout.rs#L238

Usually it passes, but occasionally it fails. This happens both on CI and in manual runs. As far as I know, the failure only happens on macOS. Here's an example from CI in my fork:

        FAIL [   0.422s] gix-worktree-state-tests::worktree state::checkout::overwriting_files_and_lone_directories_works

--- STDOUT:              gix-worktree-state-tests::worktree state::checkout::overwriting_files_and_lone_directories_works ---

running 1 test
test state::checkout::overwriting_files_and_lone_directories_works ... FAILED

failures:

failures:
    state::checkout::overwriting_files_and_lone_directories_works

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 10 filtered out; finished in 0.42s


--- STDERR:              gix-worktree-state-tests::worktree state::checkout::overwriting_files_and_lone_directories_works ---
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.11s
thread 'state::checkout::overwriting_files_and_lone_directories_works' panicked at gix-worktree-state/tests/state/checkout.rs:238:9:
assertion `left == right` failed
  left: false
 right: true
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I don't know why this happens, but maybe it is somehow related to concurrency? I say this based only on #1354 (comment), indicating that "the filesystem probe" for symlinks is "still not safe for concurrent operations."

Expected behavior 🤔

The test should pass in all runs, rather than just most.

Git behavior

I'm unsure if there is a corresponding Git behavior, because I am unsure if this is a test bug or a bug in the code under test. Above the line with the assertion, the test has this comment:

https://github.com/Byron/gitoxide/blob/3c2174101ed35dcb9bdb4585b3245507b15efe59/gix-worktree-state/tests/state/checkout.rs#L237

I'm not sure what aspect of Git behavior that's referring to, though. As far as I know, Git determines whether it will attempt to create symlinks on Windows by checking core.symlinks and, unlike on other systems, defaults it to false if it is not set in any scope.

Steps to reproduce 🕹

I don't know a good procedure. I've been running tests a fair amount and occasionally seeing the problem. Running the tests many times in succession would likely trigger it. Also running them many times in succession while only allowing one test to run at a time might shed some light on whether concurrency is involved, but this would of course be even more time-consuming.

@Byron Byron added the acknowledged an issue is accepted as shortcoming to be fixed label May 21, 2024
@Byron
Copy link
Member

Byron commented May 21, 2024

Thanks for reporting!

The filesystem probe would be thread-safe (within the worktree-state test-suite) now and I would expect it to be stable - that I have now put to the test as pre-condition.

Otherwise, I don't seem to be able to reproduce the issue locally, neither in single-threaded nor in multi-threaded mode, running it 100 times in short succession.

❯ hyperfine -N -m100 /Users/byron/dev/github.com/Byron/gitoxide/target/debug/deps/worktree-9f25a11ba27ab3a7
Benchmark 1: /Users/byron/dev/github.com/Byron/gitoxide/target/debug/deps/worktree-9f25a11ba27ab3a7
  Time (mean ± σ):     212.4 ms ±  11.5 ms    [User: 113.8 ms, System: 77.8 ms]
  Range (min … max):   204.6 ms … 314.8 ms    100 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

In multi-threaded mode, strangely enough it does appear stable as well.

cargo test -p gix-worktree-state-tests --features gix-features-parallel
❯ hyperfine -N -m100 /Users/byron/dev/github.com/Byron/gitoxide/target/debug/deps/worktree-b64ec3555922d84e
Benchmark 1: /Users/byron/dev/github.com/Byron/gitoxide/target/debug/deps/worktree-b64ec3555922d84e
  Time (mean ± σ):     211.2 ms ±  11.9 ms    [User: 150.0 ms, System: 106.2 ms]
  Range (min … max):   198.2 ms … 322.9 ms    100 runs

  Warning: The first benchmarking run for this command was significantly slower than the rest (322.9 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.

However, I think the key here is cargo nextest, which runs all tests in parallel, thus making it possible for the 'gitoxide repository probe' to be run in parallel.

Screenshot 2024-05-21 at 09 22 25

This may happen in a few places.

In order to fix this particular issue, I think the probe code should throw a lock down into the repository to use the filesystem for synchronization or, probably better, do synchronization in-process. However, the in-process synchronization is probably not good enough for nextest as it builds all test binaries and then runs them.

An alternative to a lock-file, which I prefer, is to make sure the processes don't step on each others feet, so files they create are uniquely named.

This is certainly the solution we'd have to go for here.

Byron added a commit that referenced this issue May 22, 2024
If we are dependent on symlinks, we should be sure that the probe
actually detects symlinks.
Byron added a commit that referenced this issue May 22, 2024
This is achieved by making filenames unique so they won't clash.
@Byron Byron mentioned this issue May 22, 2024
23 tasks
@Byron Byron linked a pull request May 22, 2024 that will close this issue
23 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
acknowledged an issue is accepted as shortcoming to be fixed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants