enhance flow for custom machineconfigs for specific machinesets #1619
Comments
And the real problem here is that the MCS serves: [...]
So new nodes will only get the last config that successfully rolled out, so this attempt deadlocks.
Another idea: some sort of "force MCS to serve [...]"
This, though, gets into the "node identity" problem - see also #784
If someone wants to change the machine type of their workers today I think it'd require a custom [...]. Alternatively, scaling down to 0 workers then back up should work, but obviously that's a lot more disruptive.
The other alternative would be having something like a [...]
An entirely different idea is for something in the stack to have a high-level tunable knob (something like the hypothetical sketch below).
This would be a MachineConfig fragment. This logic could potentially live in the MCD: rather than having a MachineConfig object describe a specific number of disks, the MCD would dynamically look at the node, say "oh, this instance has 2", and set things up accordingly. The tricky part with this is that simply waiting for "all disks" gets racy. It'd really be best done with knowledge of how many disks to expect. We may need some integration with the machineAPI so we know how many disks to expect per instance type (per configuration) - it feels dangerous to just unilaterally take over block devices we see, though maybe there are provider-specific ways to know they're ephemeral disks.
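Purely as an illustration of that idea - none of the fields below exist in the MCO API; the names are invented here - such a knob might look something like:

```bash
# Hypothetical illustration only: "localEphemeralDisks" is NOT a real
# MachineConfig field; it just sketches the kind of knob discussed above.
cat > hypothetical-disk-knob.yaml <<'EOF'
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-ephemeral-disks
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  # Invented field: let the MCD discover however many instance-store
  # disks the node has and use them for container storage.
  localEphemeralDisks:
    useForContainerStorage: true
    mountPoint: /var/lib/containers
EOF
```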
The machineconfig provided in the first comment and openshift/release#8102 work together well. All you need to do is create a custom user-data secret, which can be done easily. First, find an existing machineset you want to modify and download the user-data secret it references.
Scrape the secret contents into a new file and base64-decode them (you'll get some JSON). Edit the URL near the beginning of the JSON to point at the new MCP name you created. Base64 re-encode the modified JSON (make sure to use the -w0 option so you don't get line wrapping), paste that into the user-data YAML file, update the name to something sensible, and oc apply it. Then update your preferred machineset to use the m5d.4xlarge instance type and the new user-data secret you created (a sketch of this flow follows below). That's it.
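A rough sketch of that flow as shell commands, assuming the existing secret is named worker-user-data and the new pool is called worker-nvme (adjust names and namespaces to your cluster):

```bash
# Sketch of the flow above; secret/pool names are examples, not objects
# taken from this issue.

# 1. Dump the existing user-data secret as a template, and decode the
#    embedded Ignition pointer config for editing.
oc -n openshift-machine-api get secret worker-user-data -o yaml > user-data.yaml
oc -n openshift-machine-api get secret worker-user-data \
  -o jsonpath='{.data.userData}' | base64 -d > pointer.json

# 2. Edit pointer.json: the URL near the top ends in /config/worker;
#    change "worker" to the new pool name (e.g. worker-nvme).

# 3. Re-encode without line wrapping, paste the output back into the
#    userData field of user-data.yaml, rename the secret (and drop the
#    uid/resourceVersion metadata), then apply it.
base64 -w0 pointer.json
oc apply -f user-data.yaml

# 4. Point the machineset's providerSpec at the new secret and switch
#    its instance type to m5d.4xlarge.
oc -n openshift-machine-api edit machineset <your-machineset>
```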
This part is pretty user-hostile I think. But doing better requires much tighter MCO/machineAPI integration; something like having the machineAPI ask the MCO "please give me the user data secret for pool X" or so.
We more or less are doing that by specifying a user-data secret. The MCO could stamp out user-data for each pool, and users could just update/create a machineset that points to that secret.
I think in general a good pattern would be using [...]
I think /var/lib/containers should be its own drive. All of /var is too broad and might make the system impossible to recover if that volume is lost for whatever reason.
Agree this is the most obvious thing to start with for OpenShift.
Remember that for CoreOS systems, the system boots with an empty /var.
Right, that's why I think maybe it's not the best idea to use all of /var as an ephemeral volume. That opens a totally different can of worms. I think if we identify what the primary purpose of using these drives is, it can help us better align what the implementation should look like. For me, the primary purpose is increased IOPS for containers. Since the containers themselves are ephemeral, an ephemeral disk seems like a good choice. If a pod needs persistent storage, it should utilize a PVC (static pods notwithstanding, etcd, etc.).
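As a concrete illustration of that direction, a MachineConfig along these lines could put /var/lib/containers on a dedicated disk. This is only a sketch: the device name, filesystem, and Ignition spec version are assumptions that vary by platform and release.

```bash
# Sketch: format an assumed second disk (/dev/nvme1n1) and mount it at
# /var/lib/containers via a worker MachineConfig. Validate device names
# and the Ignition version against your release before using.
cat <<'EOF' | oc apply -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 98-worker-var-lib-containers
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      filesystems:
        - device: /dev/nvme1n1
          format: xfs
          path: /var/lib/containers
          wipeFilesystem: true
    systemd:
      units:
        - name: var-lib-containers.mount
          enabled: true
          contents: |
            [Unit]
            Before=local-fs.target
            [Mount]
            What=/dev/nvme1n1
            Where=/var/lib/containers
            Type=xfs
            [Install]
            WantedBy=local-fs.target
EOF
```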
/remove-lifecycle stale
/lifecycle frozen
One thing that was pointed out in a chat about this is that today, one can also do this by providing a custom user data Ignition config for the desired machineset. A core problem today is that nothing really "owns" the pointer data. But, we could make this cleaner if we had an explicit way to include a machineconfig fragment in a particular machineset's pointer data. That would get us out of the MCO having to be aware of this. But, it would mean "day 2" changes for that machineconfig object wouldn't work, or at least would only work for new nodes.
Also tangentially related to this, I did do a spike on https://github.com/cgwalters/coreos-cloud-instance-store-provisioner/
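For context, the "pointer data" here is the small stub Ignition config stored in the machineset's user-data secret; it just tells the booting node where to fetch its real config from the MCS. A sketch of its shape (secret name, hostname, and spec version are illustrative):

```bash
# Decode a machineset's user-data secret to see the pointer config.
# The decoded JSON is roughly the stub shown in the comments below;
# everything interesting lives behind the MCS URL (port 22623).
oc -n openshift-machine-api get secret worker-user-data \
  -o jsonpath='{.data.userData}' | base64 -d
# {
#   "ignition": {
#     "version": "3.1.0",
#     "config": {
#       "merge": [
#         { "source": "https://api-int.<cluster-domain>:22623/config/worker" }
#       ]
#     },
#     "security": { "tls": { "certificateAuthorities": [ { "source": "data:..." } ] } }
#   }
# }
```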
The CI team is trying to use new AWS m5d.xlarge instances which have two NVMe disks attached. We crafted a custom RAID partition machineconfig to enable that.
We added this as part of the main worker pool - the MCD will fail to roll out the partitioning on the existing workers, but that's fine because the plan was to "roll" the worker pool: basically, get the new MC into the pool, have new workers come online with that config, then scale down the old workers. However, there are a few issues here.
First, this whole thing would obviously be a lot better if we had machineset-specific machineconfigs. That would solve a bunch of races and be much more elegant.
What we're seeing right now is that one new m5d node went OutOfDisk=true because it was booted with just a 16G root volume from the old config. That unschedulable node then blocks rollout of further changes. I think we can unstick ourselves here by deleting that node and getting the MCO to roll out the new config.
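For reference, the kind of machineconfig fragment being described here (RAID-0 across the two instance-store NVMe devices) might look roughly like the sketch below. This is not the actual config from the linked PR; device names, mount point, and Ignition spec version are assumptions, and the resulting filesystem would be mounted with a unit like the one in the earlier /var/lib/containers sketch.

```bash
# Sketch only: assemble the two assumed instance-store NVMe devices into
# a RAID-0 array via Ignition and put an XFS filesystem on top of it.
# Adjust devices, mount point, and spec version to your environment.
cat <<'EOF' > nvme-raid0-fragment.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  name: 99-worker-nvme-raid0
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      raid:
        - name: data
          level: raid0
          devices:
            - /dev/nvme1n1
            - /dev/nvme2n1
      filesystems:
        - device: /dev/md/data
          format: xfs
          path: /var/lib/containers
          wipeFilesystem: true
EOF
```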