Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate intermittent test failures in PushManifestListTestCase #1275

Closed
lubosmj opened this issue May 4, 2023 · 9 comments · Fixed by #1376
Closed

Investigate intermittent test failures in PushManifestListTestCase #1275

lubosmj opened this issue May 4, 2023 · 9 comments · Fixed by #1376
Assignees
Labels

Comments

@lubosmj
Copy link
Member

lubosmj commented May 4, 2023

=================================== FAILURES ===================================
____________ PushManifestListTestCase.test_push_manifest_list_v2s2 _____________

self = <pulp_container.tests.functional.api.test_push_content.PushManifestListTestCase testMethod=test_push_manifest_list_v2s2>

    def test_push_manifest_list_v2s2(self):
        """Push the created manifest list in the v2s2 format."""
        self.registry.login(
            "-u", self.user_admin["username"], "-p", self.user_admin["password"], self.registry_name
        )
>       self.registry.manifest_push(
            self.v2s2_tag,
            self.v2s2_image_path,
            "--all",
            "--format",
            "v2s2",
        )

usr/local/lib/python3.8/site-packages/pulp_container/tests/functional/api/test_push_content.py:432: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
usr/local/lib/python3.8/site-packages/pulp_smash/cli.py:975: in <lambda>
    self.manifest_push = lambda *args: self._dispatch_command(
usr/local/lib/python3.8/site-packages/pulp_smash/cli.py:1036: in _dispatch_command
    result = self._client.run(cmd)
usr/local/lib/python3.8/site-packages/pulp_smash/cli.py:322: in run
    return self.response_handler(completed_process)
usr/local/lib/python3.8/site-packages/pulp_smash/cli.py:54: in code_handler
    completed_proc.check_returncode()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = CompletedProcess(args=('podman', 'manifest', 'push', 'manifest_list', 'pulp.example.com:443/foo:manifest_list', '--all...: initiating layer upload to /v2/foo/blobs/uploads/ in pulp.example.com:443: name invalid: Invalid repository name.\n')

    def check_returncode(self):
        """Raise an exception if ``returncode`` is non-zero.
    
        Raise :class:`pulp_smash.exceptions.CalledProcessError` if
        ``returncode`` is non-zero.
    
        Why not raise ``subprocess.CalledProcessError``? Because stdout and
        stderr are not included when str() is called on a CalledProcessError
        object. A typical message is::
    
            "Command '('ls', 'foo')' returned non-zero exit status 2"
    
        This information is valuable. One could still make
        ``subprocess.CalledProcessError`` work by overloading ``args``:
    
        >>> if isinstance(args, (str, bytes)):
        ...     custom_args = (args, stdout, stderr)
        ... else:
        ...     custom_args = tuple(args) + (stdout, stderr)
        >>> subprocess.CalledProcessError(args, returncode)
    
        But this seems like a hack.
    
        In addition, it's generally good for an application to raise expected
        exceptions from its own namespace, so as to better abstract away
        dependencies.
        """
        if self.returncode != 0:
>           raise exceptions.CalledProcessError(
                self.args, self.returncode, self.stdout, self.stderr
            )
E           pulp_smash.exceptions.CalledProcessError: Command ('podman', 'manifest', 'push', 'manifest_list', 'pulp.example.com:443/foo:manifest_list', '--all', '--format', 'v2s2', '--tls-verify=true') returned non-zero exit status 125.
E           
E           stdout: 
E           
E           stderr: Getting image list signatures
E           Copying 3 of 3 images in list
E           Copying image sha256:d8fbbbf3fec1857c32c110292a9decf9744f9f97d7247019ae4776c241395221 (1/3)
E           Getting image source signatures
E           Copying blob sha256:463b9bfafbaf88a2dee0aa77fd293a7cb91558af9664bc23d4ed00830463615d
E           Copying blob sha256:d7ab49ce0b755a907e914d148cc8c1f437f7688048aea18a15d965f60c5d0a43
E           Error: copying image 1/3 from manifest list: writing blob: initiating layer upload to /v2/foo/blobs/uploads/ in pulp.example.com:443: name invalid: Invalid repository name.
@lubosmj
Copy link
Member Author

lubosmj commented May 15, 2023

Let's see if the issue persists after merging: #1273.

I changed the names of test repositories to avoid collisions with references. The repository foo could be deleted by running tests' cleanups (i.e., test_push_empty_manifest_list, test_push_manifest_list_oci) while we handle the next push operation from test_push_manifest_list_v2s2. Delete tasks do not block the execution of a new test case.

        self.addCleanup(self.distributions_api.delete, distribution.pulp_href)
        cls.v2s2_tag = "manifest_list"
+        cls.v2s2_image_path = f"{cls.registry_name}/foo_v2s2:{cls.v2s2_tag}"
        cls.registry._dispatch_command("manifest", "create", cls.v2s2_tag)
        cls.registry._dispatch_command("manifest", "add", cls.v2s2_tag, cls.manifest_a)
        cls.registry._dispatch_command("manifest", "add", cls.v2s2_tag, cls.manifest_b)
        cls.registry._dispatch_command("manifest", "add", cls.v2s2_tag, cls.manifest_c)

        # create a new manifest list composed of the pulled manifest images
        cls.oci_tag = "manifest_list_oci"
+        cls.oci_image_path = f"{cls.registry_name}/foo_oci:{cls.oci_tag}"
        cls.registry._dispatch_command("manifest", "create", cls.oci_tag)
        cls.registry._dispatch_command("manifest", "add", cls.oci_tag, cls.manifest_a)
        cls.registry._dispatch_command("manifest", "add", cls.oci_tag, cls.manifest_b)
        cls.registry._dispatch_command("manifest", "add", cls.oci_tag, cls.manifest_c)

        # create an empty manifest list
 +       cls.empty_image_tag = "empty_manifest_list"
        cls.empty_image_path = f"{cls.registry_name}/foo_empty:{cls.empty_image_tag}"
        cls.registry._dispatch_command("manifest", "create", cls.empty_image_tag)

@lubosmj
Copy link
Member Author

lubosmj commented May 25, 2023

The issue persists after introducing the above change:

            )
E           pulp_smash.exceptions.CalledProcessError: Command ('podman', 'manifest', 'push', 'manifest_list_oci', 'pulp.example.com:443/foo_oci:manifest_list_oci', '--all', '--format', 'oci', '--tls-verify=true') returned non-zero exit status 125.
E           
E           stdout: 
E           
E           stderr: Getting image list signatures
E           Copying 3 of 3 images in list
E           Copying image sha256:d8fbbbf3fec1857c32c110292a9decf9744f9f97d7247019ae4776c241395221 (1/3)
E           Getting image source signatures
E           Copying blob sha256:463b9bfafbaf88a2dee0aa77fd293a7cb91558af9664bc23d4ed00830463615d
E           Copying blob sha256:d7ab49ce0b755a907e914d148cc8c1f437f7688048aea18a15d965f60c5d0a43
E           Error: copying image 1/3 from manifest list: writing blob: initiating layer upload to /v2/foo_oci/blobs/uploads/ in pulp.example.com:443: name invalid: Invalid repository name.

@pulpbot pulpbot moved this from In Progress to Free to take in RH Pulp Kanban board May 26, 2023
@lubosmj
Copy link
Member Author

lubosmj commented Jun 28, 2023

I am removing the prio-list label because it is not bugging us that often.

@lubosmj
Copy link
Member Author

lubosmj commented Aug 25, 2023

Something suspicious is going on here:

pulp [d6a18192f452492eb6401c497ca73f47]: django.request:WARNING: Not Found: /v2/foo_v2s2/blobs/sha256:d7ab49ce0b755a907e914d148cc8c1f437f7688048aea18a15d965f60c5d0a43
pulp [cf1449a8054943baa655f1c20b8d7005]: django.request:WARNING: Not Found: /v2/foo_v2s2/blobs/sha256:463b9bfafbaf88a2dee0aa77fd293a7cb91558af9664bc23d4ed00830463615d
pulp [a37babc8ad1248669d66f93a55e25247]: django.request:WARNING: Bad Request: /v2/foo_v2s2/blobs/uploads/
...
pulp [ee118ef4fd8c40c09b1081c49ca4dd63]: 127.0.0.1 - - [25/Aug/2023:03:13:49 +0000] "POST /v2/foo_v2s2/blobs/uploads/ HTTP/1.0" 202 0 "-" "containers/5.22.0 (github.com/containers/image)"
pulp [a37babc8ad1248669d66f93a55e25247]: 127.0.0.1 - - [25/Aug/2023:03:13:49 +0000] "POST /v2/foo_v2s2/blobs/uploads/ HTTP/1.0" 400 102 "-" "containers/5.22.0 (github.com/containers/image)"
pulp [e8644858037c44d99c3a087985baf681]: 127.0.0.1 - - [25/Aug/2023:03:13:49 +0000] "PATCH /v2/foo_v2s2/blobs/uploads/018a2aaf-09c6-70d8-9f51-2aa3ab05f05e HTTP/1.0" 204 0 "-" "containers/5.22.0 (github.com/containers/image)"
pulp [e3f7dabb59bf434a8ee6793c35cb387a]: 127.0.0.1 - - [25/Aug/2023:03:13:49 +0000] "PUT /v2/foo_v2s2/blobs/uploads/018a2aaf-09c6-70d8-9f51-2aa3ab05f05e?digest=sha256%3Ad7ab49ce0b755a907e914d148cc8c1f437f7688048aea18a15d965f60c5d0a43 HTTP/1.0" 201 0 "-" "containers/5.22.0 (github.com/containers/image)"

Pulp returns 400 during the upload for some reason.

@ipanova
Copy link
Member

ipanova commented Aug 25, 2023

anything in the logs why 400?

@ipanova
Copy link
Member

ipanova commented Aug 25, 2023

it says invalid repository name

@ipanova
Copy link
Member

ipanova commented Aug 25, 2023

There is a race condition here.
podman has parallel blob upload defaulting to 6

pulp [ee118ef4fd8c40c09b1081c49ca4dd63]: 127.0.0.1 - - [25/Aug/2023:03:13:49 +0000] "POST /v2/foo_v2s2/blobs/uploads/ HTTP/1.0" 202 0 "-" "containers/5.22.0 (github.com/containers/image)"
pulp [a37babc8ad1248669d66f93a55e25247]: 127.0.0.1 - - [25/Aug/2023:03:13:49 +0000] "POST /v2/foo_v2s2/blobs/uploads/ HTTP/1.0" 400 102 "-" "containers/5.22.0 (github.com/containers/image)"

Note the time.
We need to use serializer.get_or_create() instead of create() to avoid the race https://github.com/pulp/pulp_container/blob/main/pulp_container/app/registry_api.py#L300

@ipanova
Copy link
Member

ipanova commented Aug 25, 2023

I just don't understand why it manifests specifically in the manifest list push and not during regular image

@ipanova
Copy link
Member

ipanova commented Aug 25, 2023

Actually, it does not fail on serializer.create() because that it would fail with integrity error which we catch later on, but rather on serializer.is_valid line which ends up raising ValidaitonError

repo_serializer.is_valid(raise_exception=True)
Traceback (most recent call last):
  File "/usr/local/lib/pulp/lib64/python3.10/site-packages/IPython/core/interactiveshell.py", line 3398, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-c26b317cf155>", line 1, in <cell line: 1>
    repo_serializer.is_valid(raise_exception=True)
  File "/usr/local/lib/pulp/lib64/python3.10/site-packages/rest_framework/serializers.py", line 235, in is_valid
    raise ValidationError(self.errors)
rest_framework.exceptions.ValidationError: {'name': [ErrorDetail(string='This field must be unique.', code='unique')]}

lubosmj added a commit to lubosmj/pulp_container that referenced this issue Aug 25, 2023
@pulpbot pulpbot moved this to Needs review in RH Pulp Kanban board Aug 25, 2023
@lubosmj lubosmj moved this from Not Started to In Progress in Pulp Container Roadmap Oct 30, 2023
lubosmj added a commit to lubosmj/pulp_container that referenced this issue Oct 31, 2023
lubosmj added a commit to lubosmj/pulp_container that referenced this issue Nov 1, 2023
lubosmj added a commit to lubosmj/pulp_container that referenced this issue Nov 1, 2023
lubosmj added a commit to lubosmj/pulp_container that referenced this issue Nov 1, 2023
lubosmj added a commit to lubosmj/pulp_container that referenced this issue Nov 8, 2023
lubosmj added a commit to lubosmj/pulp_container that referenced this issue Nov 8, 2023
lubosmj added a commit to lubosmj/pulp_container that referenced this issue Nov 8, 2023
lubosmj added a commit to lubosmj/pulp_container that referenced this issue Nov 8, 2023
lubosmj added a commit to lubosmj/pulp_container that referenced this issue Nov 8, 2023
lubosmj added a commit to lubosmj/pulp_container that referenced this issue Nov 8, 2023
lubosmj added a commit to lubosmj/pulp_container that referenced this issue Nov 9, 2023
lubosmj added a commit that referenced this issue Nov 9, 2023
@github-project-automation github-project-automation bot moved this from In Progress to Done in Pulp Container Roadmap Nov 9, 2023
@pulpbot pulpbot moved this from Needs review to Done in RH Pulp Kanban board Nov 9, 2023
@lubosmj lubosmj moved this from Done to Shipped in Pulp Container Roadmap Feb 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Shipped
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants