Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

push: Avoid a race on the SSH server socket #18

Merged
merged 1 commit into from
Jan 11, 2024
Merged

Conversation

dbnicholson
Copy link
Owner

Sometimes the SSH server is slow to exit so the tempdir cleanup manages to see its socket file before the server deletes it, and then fails when trying to unlink it since it is already gone.

Wait for the SSH server to exit to avoid the issue. And do not trust it to exit on SIGTERM to avoid getting stuck waiting indefinitely.

For instance, running the testsuite on Debian Buster I frequently get this traceback:

  def test_refs(self, source_repo, dest_repo, sshd, ssh_options,
                tmp_files_path, capfd):
      random_commit(source_repo, tmp_files_path, 'test1')
      random_commit(source_repo, tmp_files_path, 'test2')
      self.push_refs(source_repo, dest_repo, sshd, ssh_options, capfd,
                     refs=['test1'])
      self.push_refs(source_repo, dest_repo, sshd, ssh_options, capfd,
>                    refs=['test2'])

tests/test_push.py:263:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/test_push.py:223: in push_refs
    ssh_options=ssh_options, command='dumpenv')
otpush/push.py:346: in push_refs
    ssh.run(cmd)
/usr/lib/python3.7/tempfile.py:944: in __exit__
    self.cleanup()
/usr/lib/python3.7/tempfile.py:948: in cleanup
    _rmtree(self.name)
/usr/lib/python3.7/shutil.py:491: in rmtree
    _rmtree_safe_fd(fd, path, onerror)
/usr/lib/python3.7/shutil.py:449: in _rmtree_safe_fd
    onerror(os.unlink, fullname, sys.exc_info())
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

topfd = 26, path = '/tmp/ostree-push-673fumwk', onerror = <function rmtree.<locals>.onerror at 0x7f569eb728c8>

    def _rmtree_safe_fd(topfd, path, onerror):
        try:
            with os.scandir(topfd) as scandir_it:
                entries = list(scandir_it)
        except OSError as err:
            err.filename = path
            onerror(os.scandir, path, sys.exc_info())
            return
        for entry in entries:
            fullname = os.path.join(path, entry.name)
            try:
                is_dir = entry.is_dir(follow_symlinks=False)
                if is_dir:
                    orig_st = entry.stat(follow_symlinks=False)
                    is_dir = stat.S_ISDIR(orig_st.st_mode)
            except OSError:
                is_dir = False
            if is_dir:
                try:
                    dirfd = os.open(entry.name, os.O_RDONLY, dir_fd=topfd)
                except OSError:
                    onerror(os.open, fullname, sys.exc_info())
                else:
                    try:
                        if os.path.samestat(orig_st, os.fstat(dirfd)):
                            _rmtree_safe_fd(dirfd, fullname, onerror)
                            try:
                                os.rmdir(entry.name, dir_fd=topfd)
                            except OSError:
                                onerror(os.rmdir, fullname, sys.exc_info())
                        else:
                            try:
                                # This can only happen if someone replaces
                                # a directory with a symlink after the call to
                                # os.scandir or stat.S_ISDIR above.
                                raise OSError("Cannot call rmtree on a symbolic "
                                              "link")
                            except OSError:
                                onerror(os.path.islink, fullname, sys.exc_info())
                    finally:
                        os.close(dirfd)
            else:
                try:
>                   os.unlink(entry.name, dir_fd=topfd)
E                   FileNotFoundError: [Errno 2] No such file or directory: 'socket'

/usr/lib/python3.7/shutil.py:447: FileNotFoundError

This supersedes #15.

Sometimes the SSH server is slow to exit so the tempdir cleanup manages
to see its socket file before the server deletes it, and then fails when
trying to unlink it since it is already gone.

Wait for the SSH server to exit to avoid the issue. And do not trust it
to exit on SIGTERM to avoid getting stuck waiting indefinitely.

For instance, running the testsuite on Debian Buster I frequently get
this traceback:

      def test_refs(self, source_repo, dest_repo, sshd, ssh_options,
                    tmp_files_path, capfd):
          random_commit(source_repo, tmp_files_path, 'test1')
          random_commit(source_repo, tmp_files_path, 'test2')
          self.push_refs(source_repo, dest_repo, sshd, ssh_options, capfd,
                         refs=['test1'])
          self.push_refs(source_repo, dest_repo, sshd, ssh_options, capfd,
    >                    refs=['test2'])

    tests/test_push.py:263:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    tests/test_push.py:223: in push_refs
        ssh_options=ssh_options, command='dumpenv')
    otpush/push.py:346: in push_refs
        ssh.run(cmd)
    /usr/lib/python3.7/tempfile.py:944: in __exit__
        self.cleanup()
    /usr/lib/python3.7/tempfile.py:948: in cleanup
        _rmtree(self.name)
    /usr/lib/python3.7/shutil.py:491: in rmtree
        _rmtree_safe_fd(fd, path, onerror)
    /usr/lib/python3.7/shutil.py:449: in _rmtree_safe_fd
        onerror(os.unlink, fullname, sys.exc_info())
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    topfd = 26, path = '/tmp/ostree-push-673fumwk', onerror = <function rmtree.<locals>.onerror at 0x7f569eb728c8>

        def _rmtree_safe_fd(topfd, path, onerror):
            try:
                with os.scandir(topfd) as scandir_it:
                    entries = list(scandir_it)
            except OSError as err:
                err.filename = path
                onerror(os.scandir, path, sys.exc_info())
                return
            for entry in entries:
                fullname = os.path.join(path, entry.name)
                try:
                    is_dir = entry.is_dir(follow_symlinks=False)
                    if is_dir:
                        orig_st = entry.stat(follow_symlinks=False)
                        is_dir = stat.S_ISDIR(orig_st.st_mode)
                except OSError:
                    is_dir = False
                if is_dir:
                    try:
                        dirfd = os.open(entry.name, os.O_RDONLY, dir_fd=topfd)
                    except OSError:
                        onerror(os.open, fullname, sys.exc_info())
                    else:
                        try:
                            if os.path.samestat(orig_st, os.fstat(dirfd)):
                                _rmtree_safe_fd(dirfd, fullname, onerror)
                                try:
                                    os.rmdir(entry.name, dir_fd=topfd)
                                except OSError:
                                    onerror(os.rmdir, fullname, sys.exc_info())
                            else:
                                try:
                                    # This can only happen if someone replaces
                                    # a directory with a symlink after the call to
                                    # os.scandir or stat.S_ISDIR above.
                                    raise OSError("Cannot call rmtree on a symbolic "
                                                  "link")
                                except OSError:
                                    onerror(os.path.islink, fullname, sys.exc_info())
                        finally:
                            os.close(dirfd)
                else:
                    try:
    >                   os.unlink(entry.name, dir_fd=topfd)
    E                   FileNotFoundError: [Errno 2] No such file or directory: 'socket'

    /usr/lib/python3.7/shutil.py:447: FileNotFoundError

Signed-off-by: Emanuele Aina <[email protected]>
@dbnicholson dbnicholson merged commit ec0b7cc into master Jan 11, 2024
8 checks passed
@dbnicholson dbnicholson deleted the push-cleanup-race branch January 11, 2024 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants