Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

correct the diskID for device path resolution #309

Conversation

fmoehler
Copy link

@fmoehler fmoehler commented May 4, 2023

For ali and aws the device path is not correctly resolved, due to some differences in the ID/Name of the disk between /dev/disk/by-id/ and the ID/Name of the disk in the IAAS provider

@fmoehler fmoehler marked this pull request as draft May 4, 2023 13:34
@fmoehler fmoehler marked this pull request as ready for review May 4, 2023 14:22
@rkoster rkoster requested review from a team, klakin-pivotal and nouseforaname and removed request for a team May 4, 2023 15:04
@@ -57,7 +57,7 @@ var _ = Describe("IDDevicePathResolver", func() {
err = fs.Symlink("/dev/intermediate/fake-device-path", "/dev/disk/by-id/virtio-fake-disk-id-include-longname")
Expect(err).ToNot(HaveOccurred())

fs.SetGlob("/dev/disk/by-id/*fake-disk-id-include-longname", []string{"/dev/disk/by-id/virtio-fake-disk-id-include-longname"})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might just misunderstand the PR... but why did you remove fake- from here and elsewhere? Does that have anything to do with the change?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also this can be changed to be clearer. The different cloud providers usually have a disk ID starting with some characters, then a hyphen character and then continuing with a random string. (e.g. AWS: vol-123456789ABCD). However the symlink in by-id does not have the hyphen character (for AWS (e.g. nvme-Amazon_Elastic_Block_Store_vol123456789ABCD). For Ali, everything before and including the hyphen character is missing in by-id (e.g. Ali DiskID: d-123456789ABCD; by-id: virtio-123456789ABCD). This means the device path resolution always fails atm because the Glob looks for "*vol-123456789ABCD" (for AWS) or "*d-123456789ABCD" (for Ali), but cannot find it in by-id. Therefore, we remove everything before the hyphen to make the Glob work again. In this case this means removing the "fake-" at the beginning of the mocked fake disk id. Do you think the mocked Disk ID should be changed to be clearer? Something like "fake_disk-some_random_identifier"?

@klakin-pivotal klakin-pivotal requested review from cunnie and klakin-pivotal and removed request for klakin-pivotal May 4, 2023 17:01
@fmoehler fmoehler requested a review from klakin-pivotal May 5, 2023 06:43
@max-soe
Copy link

max-soe commented May 11, 2023

@klakin-pivotal @nouseforaname @cunnie,
This is an important bugfix for us. Can you please review this PR?

@nouseforaname
Copy link
Contributor

nouseforaname commented May 11, 2023

Hi,

can you please provide examples of the errors on ali and aws that you've seen?

In general, the updates look okay but right now I cannot tell what error you've seen that creates the need for this PR.

Additionally, since this only happens with specific disks I'm wondering if this should be enough reason to actually add an integration test for this..

@fmoehler
Copy link
Author

Hi,

in the linked issue the errors are described. Shall I provide some more examples?
If needed, I could also add an integration test for this scenario.

@nouseforaname
Copy link
Contributor

Ah nice. I think I'd be okay if the tests you already added got updated with a few real live ali / aws paths so we leave a track of how these look RN..

The reason is that I don't know if this
a) never worked for these types of disks
b) something in ali / aws changed?

@fmoehler fmoehler force-pushed the resolution-of-device-path-by-id-fails branch from 1b74f65 to 4296bbb Compare May 11, 2023 12:14
Some IaaS provider have disk ID with prefix which is separated
by a hyphen character. However the symlink doesn't have the prefix
and the hyphen character that is why the resolution doesn't work in this
case. This pr removes the prefix with the hyphen from the disk ID before
the resolution to provide a solution for this case
@fmoehler fmoehler force-pushed the resolution-of-device-path-by-id-fails branch from 4296bbb to 6636681 Compare May 11, 2023 12:19
Copy link
Contributor

@rkoster rkoster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had some internal discussion and posted a follow up question here (the issue context, so lets discuss the problem a bit more, and figure out where to fix it): #310 (comment)

@rkoster
Copy link
Contributor

rkoster commented Jun 1, 2023

Proposed architecture, during Working group meeting is to create a configuration property for the bosh agent to configure a strip-volume-prefix-regex (or something like that), which would allow specifying a regex prefix per IaaS in the bosh stemcell builder here (in the case of Ali Cloud).

@rkoster
Copy link
Contributor

rkoster commented Jun 15, 2023

Closing in favor of #311

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

6 participants