Reset Dask worker to use TCP even if it was configured to use TLS in yaml file #836
Comments
Thanks for raising this @weiwang217. I've opened #837 to resolve this. Would you mind testing that PR out and letting me know if it solves your problem? |
Thanks! How can I build and install the dask operator?
|
We have documentation on how to do this here https://kubernetes.dask.org/en/latest/testing.html#testing-operator-controller-prs |
It worked. When will the code be merged into main? Thanks!
|
Hi Jacob, I suspect that change may have caused a regression when working with replicas > 1. When I start a new DaskJob, all but one of the replicas fail to connect to the scheduler because of duplicate names. Indeed when I run I see:
Worker 2:
Because the last-defined environment variable holds the first replica's name, all replicas share the same name. Do you mind taking a look? (Context: I'm on the same team as weiwang217 and we just noticed this change recently.) |
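To illustrate the failure mode reported above, here is a minimal sketch of how a container's env list resolves when the same variable name appears more than once. It assumes the later entry takes precedence, which matches the report. The entry values and the helper `effective_env` are hypothetical and are not taken from the operator's code or from the elided output.

```python
def effective_env(env_entries):
    """Collapse a Kubernetes-style env list into a dict; later duplicates win."""
    resolved = {}
    for entry in env_entries:
        # A later entry with the same name overwrites the earlier one.
        resolved[entry["name"]] = entry["value"]
    return resolved


# Hypothetical env list for the second replica: its own name is defined first,
# but a stale entry carrying the first replica's name comes last.
worker_2_env = [
    {"name": "DASK_WORKER_NAME", "value": "example-worker-2"},
    {"name": "DASK_WORKER_NAME", "value": "example-worker-1"},
]

print(effective_env(worker_2_env)["DASK_WORKER_NAME"])  # example-worker-1
```

Under that assumption every replica would resolve to the same worker name, which is consistent with the scheduler rejecting all but one of them.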
Thanks for reporting this @kjleftin. Why are you setting the |
Hi Jacob, I'm following the example code in https://kubernetes.dask.org/en/latest/operator_resources.html#daskjob. Specifically, I'm passing the DASK_WORKER_NAME environment variable to the dask worker CLI:
Note that I'm not setting DASK_WORKER_NAME explicitly. That is handled by the Dask Operator. (Before this change, each worker would have a different value for DASK_WORKER_NAME, but after this change, each worker has the same value). |
@kjleftin ok thanks for the clarification. I expect we may need to use |
Describe the issue:
The Dask operator resets the worker to use TCP even if it was configured to use TLS.
The code to append the config is here:
dask-kubernetes/dask_kubernetes/operator/controller/controller.py
Line 156 in fa7255b
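As a rough sketch of one way such a reset could be avoided, the function below appends default environment variables only when the user's spec has not already set them. It assumes the operator injects a `tcp://` scheduler address through `DASK_SCHEDULER_ADDRESS`; that assumption, the helper `add_default_env`, and all example values are hypothetical, and this is not necessarily what the controller or #837 does.

```python
def add_default_env(env, defaults):
    """Return a new env list with defaults appended only for names not already set."""
    existing = {entry["name"] for entry in env}
    # Copy each entry so the caller's list (e.g. a spec shared by replicas) is not mutated.
    merged = [dict(entry) for entry in env]
    for name, value in defaults.items():
        if name not in existing:
            merged.append({"name": name, "value": value})
    return merged


# Hypothetical user-supplied env that configures TLS.
user_env = [
    {"name": "DASK_SCHEDULER_ADDRESS", "value": "tls://example-scheduler:8786"},
]
# Hypothetical defaults an operator might want to inject.
defaults = {
    "DASK_WORKER_NAME": "example-worker-1",
    "DASK_SCHEDULER_ADDRESS": "tcp://example-scheduler:8786",
}

merged = add_default_env(user_env, defaults)
for entry in merged:
    print(entry["name"], "=", entry["value"])
```

Returning a fresh list rather than appending in place also keeps values computed for one replica from leaking into the next when the same spec is reused.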
Minimal Complete Verifiable Example:
# Put your MCVE code here
Anything else we need to know?:
Environment: