Azure driver is not working correctly, when it tries to execute more than one job in parallel #384

Ben10k · 2023-12-02T11:49:22Z

I am using the latest versions of:

self-hosted Drone ( 2.20.0 )
drone-runner-aws ( v1.0.0-rc.105 )

.drone_pool.yml that works:

version: "1"
instances:
- name: ubuntu-azure
  default: true
  type: azure
  pool: 0
  limit: 1
  platform:
    os: linux
    arch: amd64
  spec:
    account:
      client_id: "****"
      client_secret: "****"
      subscription_id: "****"
      tenant_id: "****"
    resource_group: drone-runners
    location: eastus2
    size: Standard_B4s_v2
    image:
      username: "****"
      password: "****"
      publisher: canonical
      offer: 0001-com-ubuntu-server-focal
      sku: 20_04-lts-gen2
      version: latest

When I run pipelines with this configuration, everything works. But if I increase the limit to anything above 1 and try to run pipelines in parallel, something stops working.

Both pipeline stages turn yellow and the timer starts.
In the Azure portal and the runner's logs I can see that 2 VMs are provisioned.
After about 60-80 seconds, one job starts running and works as expected. When the pipeline finishes, the VM gets destroyed (visible both in the runner's logs and in the Azure portal).
The other job stays in progress, but its steps do not start.
After about 20-21 minutes, the pipeline stage fails with context deadline exceeded.
After enabling trace and debug logs on the runner, I found the log messages below. They indicate that when several VMs are started at the same time, the runner records the same IP address for all of them, even though only one VM actually has that IP and the other 2 VMs have their own unique public IPs.

time="2023-12-02T11:36:06Z" level=debug msg="azure: [provision] complete" cloud=azure fields.time=47.23s image=0001-com-ubuntu-server-focal ip=20.230.100.19 name=drone-runner-vm-cd8fb7c9b-6gn44-ubuntu-azure-XWp05980 pool=ubuntu-azure size=Standard_B4s_v2 zone="[]"
time="2023-12-02T11:36:08Z" level=debug msg="azure: [provision] complete" cloud=azure fields.time=48.40s image=0001-com-ubuntu-server-focal ip=20.230.100.19 name=drone-runner-vm-cd8fb7c9b-6gn44-ubuntu-azure-basMfX91 pool=ubuntu-azure size=Standard_B4s_v2 zone="[]"
time="2023-12-02T11:36:09Z" level=debug msg="azure: [provision] complete" cloud=azure fields.time=47.53s image=0001-com-ubuntu-server-focal ip=20.230.100.19 name=drone-runner-vm-cd8fb7c9b-6gn44-ubuntu-azure-5v88gnFS pool=ubuntu-azure size=Standard_B4s_v2 zone="[]"

The VM that actually has that IP continues executing the pipeline, but the other 2 hang for 20 minutes, until they are removed.
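
For anyone who wants to check their own runner logs for the same symptom, here is a minimal sketch that groups the "azure: [provision] complete" lines by their ip= field and reports any IP recorded for more than one VM. It assumes the log layout shown above (ip= immediately followed by name=). The function name find_shared_ips is hypothetical and not part of the runner.

```python
import re
from collections import defaultdict

# Captures the ip= and name= fields of an "azure: [provision] complete" line.
# Assumes the field order seen in the debug logs above (ip= directly before name=).
PROVISION_RE = re.compile(
    r'msg="azure: \[provision\] complete".*?\bip=(\S+)\s+name=(\S+)'
)


def find_shared_ips(log_lines):
    """Return {ip: [vm names]} for every IP that was logged for more than one VM."""
    vms_by_ip = defaultdict(list)
    for line in log_lines:
        match = PROVISION_RE.search(line)
        if match:
            ip, name = match.groups()
            vms_by_ip[ip].append(name)
    # Keep only IPs that appear for two or more VMs.
    return {ip: names for ip, names in vms_by_ip.items() if len(names) > 1}


if __name__ == "__main__":
    import sys

    # Usage: python check_ips.py < runner.log
    for ip, names in find_shared_ips(sys.stdin).items():
        print(f"{ip} was recorded for {len(names)} VMs: {', '.join(names)}")
```

Run against the three lines above, this would report 20.230.100.19 for all three VM names, which matches what I see in the portal (only one VM really owns that address).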

If I increase the pool value to anything more than 0, then all pipelines time out with the same issue.

Note: I have tried the same runner with the same pipelines with another pool of AWS instances and it worked perfectly.


Ben10k · 2023-12-02T21:30:38Z

I found the issue and was able to solve it locally.
Will raise a PR shortly.

Ben10k · 2023-12-15T09:41:03Z

@raghavharness I am tagging you as I see you are actively maintaining this repository.

Can you please take a look?

Ben10k mentioned this issue Dec 2, 2023

fix(azure): remove race condition on IP address when creating multiple VMs #385

Closed

Ben10k changed the title ~~Azure drivwer is not working correctly, when it tries to execute more than one job in parallel~~ Azure driver is not working correctly, when it tries to execute more than one job in parallel Dec 2, 2023
