Actual behaviour
The second replica fails because there are no idle instances to reuse. The run is terminated because the second replica failed to start.
 #  BACKEND  REGION    INSTANCE       RESOURCES                   SPOT  PRICE
 1  gcp      us-west4  e2-standard-2  2xCPU, 8GB, 100.0GB (disk)  yes   $0.0095  busy
Active run my-service already exists. Detected configuration changes that can be updated in-place: ['replicas']
my-service provisioning completed (running)
Service is published at http://localhost:3000/proxy/services/ilya/my-service/
Serving HTTP on 0.0.0.0 port 8000 (http://localhost:3000/proxy/services/ilya/my-service/) ...
Run failed with error code TERMINATED_BY_SERVER.
Check CLI, server, and run logs for more details
Expected behaviour
The second replica fails because there are no idle instances to reuse. The run remains running.
dstack version
0.18.24
Server logs
INFO dstack._internal.server.services.runs:861 run(46bef6)my-service: scaling UP 1 replica(s)
[09:06:24] DEBUG dstack._internal.server.background.tasks.process_submitted_jobs:99 job(c9251a)my-service-0-1: provisioning has started
[09:06:29] DEBUG dstack._internal.server.background.tasks.process_submitted_jobs:99 job(c9251a)my-service-0-1: provisioning has started
DEBUG dstack._internal.server.background.tasks.process_submitted_jobs:222 job(c9251a)my-service-0-1: reuse instance failed
INFO dstack._internal.server.background.tasks.process_runs:330 run(46bef6)my-service: run status has changed RUNNING -> TERMINATING
INFO dstack._internal.server.services.jobs:283 job(c9251a)my-service-0-1: job status is FAILED, reason: FAILED_TO_START_DUE_TO_NO_CAPACITY
DEBUG dstack._internal.server.services.jobs:192 job(c1f616)my-service-0-0: stopping runner 34.16.243.7
DEBUG dstack._internal.server.services.jobs:234 job(c1f616)my-service-0-0: stopping container
[09:07:02] INFO dstack._internal.server.services.jobs:268 job(c1f616)my-service-0-0: instance 'cloud-0' has been released, new status is IDLE
INFO dstack._internal.server.services.jobs:283 job(c1f616)my-service-0-0: job status is TERMINATED, reason: TERMINATED_BY_SERVER
INFO dstack._internal.server.services.runs:848 run(46bef6)my-service: run status has changed TERMINATING -> FAILED, reason: JOB_FAILED
Additional information
It is important to keep the existing replica running to avoid service downtime. One purpose of running multiple replicas is fault tolerance, so the failure of one replica should not affect the others.
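To make the expected behaviour concrete, here is a minimal sketch of the status rule being asked for. It is not dstack's actual implementation: the `JobStatus` and `RunStatus` enums and the `expected_run_status` function are hypothetical names used only for illustration. The assumption is that a run should stay RUNNING as long as at least one replica is still running, regardless of why another replica failed.

```python
from enum import Enum
from typing import Iterable


class JobStatus(Enum):
    """Hypothetical per-replica job statuses (illustrative, not dstack's own)."""
    PROVISIONING = "provisioning"
    RUNNING = "running"
    FAILED = "failed"
    TERMINATED = "terminated"


class RunStatus(Enum):
    """Hypothetical run-level statuses (illustrative, not dstack's own)."""
    PROVISIONING = "provisioning"
    RUNNING = "running"
    FAILED = "failed"


def expected_run_status(replica_statuses: Iterable[JobStatus]) -> RunStatus:
    """Derive the run status from the statuses of its replicas.

    Assumed policy: a failed replica only fails the run when no other
    replica is running or still being provisioned.
    """
    statuses = list(replica_statuses)
    if any(s is JobStatus.RUNNING for s in statuses):
        # At least one healthy replica keeps the service available.
        return RunStatus.RUNNING
    if any(s is JobStatus.PROVISIONING for s in statuses):
        # Nothing is serving yet, but a replica may still come up.
        return RunStatus.PROVISIONING
    # No replica is running or starting, so the run cannot serve traffic.
    return RunStatus.FAILED


# The scenario from this report: replica 0 is running, replica 1 failed to start.
print(expected_run_status([JobStatus.RUNNING, JobStatus.FAILED]))
```

Under this rule, the scale-up failure in the logs above would leave `my-service` running on the first replica instead of moving the run to TERMINATING.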
Steps to reproduce
This can be reproduced both with auto-scaling and with manual in-place update. This example uses in-place update.
1. Run the service with `--reuse`.
2. Change `replicas` in the configuration and run it again with `--reuse`.