Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address problem of missing pool in StoppedPools property #3352

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mulkieran
Copy link
Member

@mulkieran mulkieran commented May 25, 2023

No description provided.

@mulkieran mulkieran self-assigned this May 25, 2023
@mulkieran mulkieran marked this pull request as ready for review May 25, 2023 22:20
@mulkieran mulkieran requested review from jbaublitz and bgurney-rh May 25, 2023 22:20
@mulkieran mulkieran removed the request for review from jbaublitz May 26, 2023 15:49
@jbaublitz
Copy link
Member

I just wanted to mention that maybe we should also handle the case (although less likely to be triggered) in handle.rs too.

@mulkieran
Copy link
Member Author

There seems to be a bit of a bug in that, the pool is still started and fully running. If stratisd is stopped and restarted, the pool should appear in either stopped or started pools, but appears in neither.

@mulkieran mulkieran force-pushed the error-if-no-unlock-method branch from a37ee64 to 379337c Compare May 26, 2023 19:19
@mulkieran
Copy link
Member Author

There is a deeper question. By corrupting the devices in a specific way, we can engineer a state where the pool is fully set up, but when we restart Stratis the pool is placed in the stopped pools data structure. If we teardown the pool, we make it impossible to restart it. This is a deeper issue, I'll file a separate issue for that.

@mulkieran mulkieran requested a review from jbaublitz May 30, 2023 15:53
@mulkieran
Copy link
Member Author

@jbaublitz Whoops, I forgot the remark here: #3352 (comment) . Taking a look now.

self.internal.len(),
self.internal.values().map(|info| info.encryption_info()),
)
.ok()
.map(|info| StoppedPoolInfo {
.unwrap_or({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this does the right thing. Errors are only returned when they're a mixture of encrypted and unencrypted devices. I think this aspect of the code should be handled in engine/shared.rs in the gather methods. Gather takes a function as a parameter so it can likely be modified to calculate whether it's PoolEncryption::KeyDesc(MaybeInconsistent::Yes), etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the two arguments to gather_encryption_info are constructed, the iterator derived from internal.values() returns None in one place, so that the flattened length is less than the unflattened.

I think that what you're saying is that when the information is being gathered, it is possible to determine what is the encryption info on the devices w/ valid encryption info.

So, if all the devices w/ valid LUKS metadata are KeyDescription only, then you would like to see PoolEncryptionInfo::KeyDesc(MaybeInconsistent::Yes) rather than both showing inconsistent?

I think we can do that, but my own position is that in the general case where the LUKS info is actually missing because invalid, that information is not certainly known, so I decided to use the most confused value.

If all the devices had valid LUKS2 metadata, and each had a key description, but some key descriptions were different from others, then I would think that it would be appropriate to return
`PoolEncryptionInfo::KeyDesc(MaybeInconsistent::Yes), but it actually make more sent to me to return Both``` as I do here, in this situation.

.filter_map(|(pool_uuid, map)| {
map.stopped_pool_info().map(|info| (*pool_uuid, info))
})
.map(|(pool_uuid, map)| (*pool_uuid, map.stopped_pool_info()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks correct. It seems like we may have been only displaying encrypted pools in general for stopped pools, not just in the case of this bug!

If there is encryption info on some devices but not others, just assume
maximum possible inconsistency in the encryption.

Signed-off-by: mulhern <[email protected]>
@mulkieran mulkieran force-pushed the error-if-no-unlock-method branch from e30e767 to 451b196 Compare May 31, 2023 17:03
@mulkieran mulkieran changed the title Return an error if there is no unlock method Address problem of missing pool in StoppedPools property May 31, 2023
@mulkieran
Copy link
Member Author

I don't think that gather_encryption_info should return an error under any circumstances. At present, encryption_info() on one device in a DeviceSet may return None. If other return Some, then the answer is for the PoolEncryptionInfo to be maximally inconsistent, as it is here. But, that can be done in in the gather_encryption_info method. I'll give that a go. How that relates to the related method gather_pool_name is an interesting question. As ever, I'ld like to do this with a set and some rules.

If exactly one element of the set and that is None, then the pool encryption info is None.

If two elements of the set, then something is inconsistent.
None + anything => maximally inconsistent, since there should be some encryption info on that other device and we have no idea what it is.

Then feed the rest to the from method for PoolEncryptionInfo, as before.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: In Progress (long term)
Development

Successfully merging this pull request may close these issues.

2 participants