Static Docker containers need to be restarted with --cpuset-cpus="0-3" #3375

Closed
Haroon-Khel opened this issue Feb 8, 2024 · 7 comments

@Haroon-Khel
Contributor

ref #3360 (comment)

All of our containers need to be restarted with the proper option to assign 4 CPUs: --cpuset-cpus="0-3" needs to be used instead of --cpus=4.0. That way the test jobs can read the number of CPUs available to the container, rather than the 160 cores on the dockerhost, and assign the appropriate concurrency. At the moment test jobs are running with -concurrency:81 on the containers, when it should be -concurrency:3.
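To illustrate why the two options behave differently, here is a minimal Python sketch. It assumes Linux, and it is only an analogy: how the test jobs actually count CPUs is not shown in this issue. As far as I know, --cpus only sets a CPU time quota, so the container still sees every host CPU, whereas --cpuset-cpus restricts the set of CPUs the process may run on. The function names are hypothetical.

```python
import os


def host_cpus() -> int:
    """Number of CPUs the system reports, regardless of any cpuset."""
    return os.cpu_count() or 1


def visible_cpus() -> int:
    """Number of CPUs this process may be scheduled on.

    On Linux this reflects the affinity mask, which a cpuset restricts.
    """
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback for platforms without sched_getaffinity (assumption: no cpuset).
    return host_cpus()


if __name__ == "__main__":
    print(f"host CPUs: {host_cpus()}, visible CPUs: {visible_cpus()}")
```

A tool that sizes its concurrency from the first number would overshoot inside a container started with --cpus=4.0, while one that uses the second would see 4 under --cpuset-cpus="0-3".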

The following nodes have been restarted with --cpuset-cpus="0-3"

dockerhost-equinix-ubuntu2204-armv8l-1

dockerhost-equinix-ubuntu2004-armv8l-1

@Haroon-Khel changed the title from: Docker containers need to be restarted with --cpuset-cpus="0-3" to: Static Docker containers need to be restarted with --cpuset-cpus="0-3" (Feb 8, 2024)
@sxa
Member

sxa commented Feb 8, 2024

@Haroon-Khel Is test-docker-sles15-armv8l-1 based on the BCI image referenced in #3135?

@adamfarley
Contributor

adamfarley commented Feb 14, 2024

Note: This PR should cap test concurrency to either:

  • (0.5*cores)+1
    or
  • (0.5*gigs-of-memory)

Whichever is smaller.

Also, we calculate "memory" as either the machine memory or the cgroup (container) memory, whichever is smaller.
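The cap described above can be sketched as follows. This is an illustration rather than the code from the PR: the function name, the rounding down, and the floor of 1 are assumptions, and the memory figures in the usage lines are made up.

```python
from typing import Optional


def capped_concurrency(
    cores: int,
    machine_mem_gb: float,
    cgroup_mem_gb: Optional[float] = None,
) -> int:
    """Return the smaller of (0.5 * cores) + 1 and 0.5 * memory in GB."""
    # "Memory" is the machine memory or the cgroup memory, whichever is smaller.
    if cgroup_mem_gb is None:
        mem_gb = machine_mem_gb
    else:
        mem_gb = min(machine_mem_gb, cgroup_mem_gb)

    by_cpu = cores // 2 + 1      # assumption: round down
    by_mem = int(mem_gb // 2)    # assumption: round down

    # Assumption: never return less than 1.
    return max(1, min(by_cpu, by_mem))


if __name__ == "__main__":
    print(capped_concurrency(160, 512))  # 81: what the jobs see on the dockerhost
    print(capped_concurrency(4, 512))    # 3: what they should see with 4 CPUs
```

With enough memory, 160 cores gives 81 and 4 cores gives 3, which matches the -concurrency values quoted at the top of the issue.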

@Haroon-Khel
Contributor Author

Haroon-Khel commented Feb 16, 2024

✅ indicates that the container has been rerun with --cpuset-cpus="0-3"

[
    {
        "name": "dockerhost-equinix-ubuntu2004-armv8-1",
        "ip": "147.75.35.203",
        "containers": [
            "build-docker-ubuntu2004-armv7l-1",  Does not exist on machine
            "test-docker-alpine313-aarch64-1", replaced by test-docker-alpine319-armv8-2 ✅ 
            "test-docker-alpine314-aarch64-1", replaced by test-docker-alpine319-armv8-4 ✅ 
            "test-docker-fedora39-armv8l-1", ✅ 
            "test-docker-sles15-armv8l-1", ✅ 
            "test-docker-ubuntu1804-armv8l-4", ✅ 
            "test-docker-ubuntu2004-armv7l-1",
            "test-docker-ubuntu2004-armv7l-2",
            "test-docker-ubuntu2004-armv7l-3",
            "test-docker-ubuntu2004-armv8l-1", ✅ 
            "test-docker-ubuntu2004-armv8l-2", ✅ 
            "test-docker-ubuntu2004-armv8l-3", ✅ 
            "test-docker-ubuntu2204-armv8l-2", ✅ 
            "test-docker-ubuntu2310-armv8l-1" ✅ 
        ],
        "containersCount": 14
    },
    {
        "name": "dockerhost-equinix-ubuntu2004-x64-1",
        "ip": "145.40.114.58",
        "containers": [
            "test-docker-alpine314-x64-1",
            "test-docker-alpine317-x64-1",
            "test-docker-centos8-x64-1",
            "test-docker-debian11-x64-1",
            "test-docker-fedora35-x64-1",
            "test-docker-fedora37-x64-1",
            "test-docker-fedora37-x64-3",
            "test-docker-ubi8-x64-1",
            "test-docker-ubuntu2004-x64-1",
            "test-docker-ubuntu2204-x64-1",
            "test-docker-ubuntu2204-x64-3"
        ],
        "containersCount": 11
    },
    {
        "name": "dockerhost-equinix-ubuntu2204-armv8-1",
        "ip": "139.178.86.243",
        "containers": [
            "test-docker-alpine314-armv8-1", replaced by test-docker-alpine319-armv8-3 ✅ 
            "test-docker-alpine314-armv8-3", duplicate of test-docker-alpine314-armv8-1
            "test-docker-alpine315-armv8-2", exists in jenkins but not on dockerhost (ghost)
            "test-docker-alpine319-armv8-1", ✅ 
            "test-docker-debain12-armv8l-1", ✅ 
            "test-docker-ubuntu2004-armv7l-4", ✅ 
            "test-docker-ubuntu2004-armv7l-5", ✅ 
            "test-docker-ubuntu2004-armv7l-6", ✅ 
            "test-docker-ubuntu2204-armv8-1", ✅ 
            "test-docker-ubuntu2204-armv8-2", ✅ 
            "test-docker-ubuntu2204-armv8-3" ✅ 
        ],
        "containersCount": 11
    },
    {
        "name": "dockerhost-equinix-ubuntu2204-x64-1",
        "ip": "145.40.113.173",
        "containers": [
            "test-docker-alpine314-x64-2",
            "test-docker-alpine317-x64-2",
            "test-docker-centos8-x64-2",
            "test-docker-debian11-x64-2",
            "test-docker-fedora35-x64-2",
            "test-docker-fedora37-x64-2",
            "test-docker-ubi8-x64-2",
            "test-docker-ubuntu2004-x64-2",
            "test-docker-ubuntu2204-x64-2"
        ],
        "containersCount": 9
    },
    {
        "name": "dockerhost-marist-ubuntu2204-s390x-1",
        "ip": "148.100.74.237",
        "containers": [
            "test-docker-sles12-s390x-1", ✅ 
            "test-docker-sles15-s390x-1" ✅ 
        ],
        "containersCount": 2
    },
    {
        "name": "dockerhost-osuosl-ubuntu2004-ppc64le-1",
        "ip": "140.211.168.214",
        "containers": [
            "docker-osuosl-ubuntu2004-ppc64le-1", duplicate of dockerhost-osuosl-ubuntu2004-ppc64le-1
            "test-docker-fedora33-ppc64le-1", replaced with test-docker-fedora39-ppc64le-1
            "test-docker-ubuntu1804-ppc64le-1", replaced with test-docker-ubuntu2004-ppc64le-1
            "test-docker-ubuntu2010-ppc64le-1" replaced with test-docker-ubuntu2204-ppc64le-3
        ],
        "containersCount": 4
    },
    {
        "name": "dockerhost-osuosl-ubuntu2204-aarch64-1",
        "ip": "140.211.167.67",
        "containers": [],
        "containersCount": 0
    },
    {
        "name": "dockerhost-rise-ubuntu2204-aarch64-1",
        "ip": "34.72.108.242",
        "containers": [],
        "containersCount": 0
    },
    {
        "name": "dockerhost-skytap-ubuntu2004-ppc64le-1",
        "ip": "20.61.136.212",
        "containers": [
            "test-docker-debian11-ppc64le-1", ✅ 
            "test-docker-debian11-ppc64le-2", ✅ 
            "test-docker-debian11-ppc64le-3", ✅ 
            "test-docker-debian11-ppc64le-4", ✅ 
            "test-docker-ubuntu2204-ppc64le-1", ✅ 
            "test-docker-ubuntu2204-ppc64le-2" ✅ 
        ],
        "containersCount": 6
    },
    {
        "name": "dockerhost-skytap-ubuntu2204-x64-1",
        "ip": "20.61.136.254",
        "containers": [
            "test-docker-debian12-x64-1", ✅ 
            "test-docker-fedora39-x64-1", ✅ 
            "test-docker-ubuntu2204-x64-4", ✅ 
            "test-docker-ubuntu2204-x64-5" ✅ 
        ],
        "containersCount": 4
    }
]

@Haroon-Khel
Contributor Author

Haroon-Khel commented Feb 16, 2024

Annoyingly, rerunning a container with different parameters isn't as simple as docker stop $container followed by docker start $container --new-options. As far as I can tell, I need to stop the running container, remove it, and then run a new container from the same image with the new options. I can use https://github.com/adoptium/infrastructure/blob/master/ansible/playbooks/AdoptOpenJDK_Unix_Playbook/dockernode.yml to automate this, but I need to update some of the dockerfiles first.
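The stop, remove, and rerun sequence could look roughly like the sketch below. It only builds the command lines and does not execute them. The function name and the image name are hypothetical, and a real container would almost certainly need its other original run options (ports, restart policy and so on), which are not listed in this issue.

```python
import shlex
from typing import List


def recreate_commands(container: str, image: str, cpuset: str = "0-3") -> List[List[str]]:
    """Build the docker commands to recreate a container pinned to a cpuset."""
    return [
        ["docker", "stop", container],
        ["docker", "rm", container],
        # Assumption: any other options the container was started with go here too.
        ["docker", "run", "-d", "--name", container, f"--cpuset-cpus={cpuset}", image],
    ]


if __name__ == "__main__":
    # "example/image:tag" is a placeholder, not a real image from this issue.
    for cmd in recreate_commands("test-docker-ubuntu2004-armv8l-1", "example/image:tag"):
        print(shlex.join(cmd))
```

Printing the commands first gives a chance to review them before running anything against a live dockerhost.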

@Haroon-Khel
Contributor Author

I won't restart the x64 Equinix nodes, as we want to start decommissioning them anyway, as per #3378 (comment)

@Haroon-Khel
Contributor Author

With the x64 Equinix dockerhost machines decommissioned, there are just the ppc64le nodes on dockerhost-osuosl-ubuntu2004-ppc64le-1 left

@Haroon-Khel
Contributor Author

Haroon-Khel commented Apr 11, 2024

dockerhost-osuosl-ubuntu2004-ppc64le-1 nodes have been restarted. Issue is closed
