-
Notifications
You must be signed in to change notification settings - Fork 284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CP-45016 Implement inbound SXM for SMAPIv3 SRs #6101
base: master
Are you sure you want to change the base?
CP-45016 Implement inbound SXM for SMAPIv3 SRs #6101
Conversation
f8fbee6
to
9d2be3e
Compare
b9ff795
to
1254a96
Compare
26aca8a
to
9095100
Compare
9095100
to
7067707
Compare
2dcb915
to
c54a8da
Compare
ocaml/xenopsd/lib/xenops_server.ml
Outdated
let final_handshake () = | ||
clean_src_vm () ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this is not part of the handshake, I prefer the make this call separately in the sequence below (L2797 and L2814).
I'm actually surprised that this hasn't caused problems for plain, non-storage live migration on SMAPIv3 SRs... or has it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well this will only cause problems if it is a storage migration. For pure VM migration, I think it is fine since that implies the VM will move onto a different host (say from host A to host B), and having the datapath activated on the src host A is ok as that does not affect host B.
With storage migration, though, the VM could be staying on the same host (or not), but it does not matter. If the destinatoin VM attempts to activate the datapath on the mirror VDI, it will give you a VDI in use as it is currently being used by mirroring (or by source VM as well if it is a pure storage migration on the same host), as designed by SMAPIv3
I can't see anything obviously wrong with this, but this is a big PR that modifies the storage migration code that is generally hard to fully follow and understand. The changes related to SMAPIv3 SRs are not live until those SRs are configured to expose a new feature, but this PR also changes SMAPIv1 code paths in a number of ways. We therefore need to carefully test especially the SMAPIv1 storage migration use cases, including VM migration from a pool running "old" code to a pool that contains these changes. So before accepting this, I'd like to understand how this has been tested. |
a72bf25
to
41ce262
Compare
0080177
to
fe9d48b
Compare
mirror_vm. The storage_interface defines mirror_vm first. But logging | ||
suggests that Storage_mux receives copy_vm first so I had to reverse the ordering | ||
of these two params to make it work. *) | ||
SMAPI.DATA.MIRROR.start dbg vconf.sr vconf.location new_dp vconf.copy_vm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If anyone understands what is going on here, please let me know :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't make any sense at all... Can you use the labelled arguments, for everyone's sanity?
fe9d48b
to
9470845
Compare
Marking this PR as ready for review again as I finished some of the changes. For testing, I plan to run all the XRT tests available for SXM and see if there are any regressions. We could alternatively also consider feature branch if it is too risky to merge all of this into master. Although I am slightly against that as this PR contains some general code improvements as well. |
ocaml/vhd-tool/src/impl.ml
Outdated
if not (List.mem destination_protocol possible_protocols) then | ||
if | ||
not | ||
(List.exists |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure I understand what this code does; maybe define a new function and use it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a hack, it converts any Nbd export_name
into a dummy_nbd which is just Nbd ""
(empty string) to check for possible protocols. The intention of this check is to see if nbd is part of the list of possible protocols, no matter what export name it has. Maybe it would be easier to understand if I extract the (=)
thing and add a comment?
; sm_config= | ||
List.filter_map | ||
(fun (k, v) -> | ||
if String.starts_with ~prefix:_sm_config_prefix_key k then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would benefit from an comment that includes an example what is being transformed here and a motivation.
) | ||
fl | ||
|
||
(** [parse_string_int64 features] takes a [features] list in its plain string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was expecting some code that splits a string into two parts but did not see it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see string_int64_of_string_opt
Signed-off-by: Vincent Liu <[email protected]>
Signed-off-by: Vincent Liu <[email protected]>
The previous feature processing logic was a bit ad hoc with various random conversion functions. Abstract them into one module and refactor a few functions to make their uses more explicit. Also try to encourage users, when processing features, to convert them into `Feature.t` and use the provided functions in this module rather than writing `List.mem` directly. Signed-off-by: Vincent Liu <[email protected]>
This is so that `Storage_migrate` can use this function, otherwise there would be a dependency cycle. This function also sounds like a utility function. Also add an mli file for `Storage_utils`. Signed-off-by: Vincent Liu <[email protected]>
This includes: - VDI.compose for composing VDIs. Note the ordering of the parameter `parent` and `child` is reversed, due to the inconsistency between SMAPIv2 and SMAPIv3 - DP.attach_info which calls the idempotent DP.attach3 on a already attached VDI to get the attach info again. - VDI.{add_to,remove_from}sm_config for xapi to keep track of the mirror vdi - VDI.set_content_id for determining similarity between VDIs Signed-off-by: Vincent Liu <[email protected]>
This API call prepares a nbd server that will be used for mirroring. This gets mapped to different logic on SMAPIv1 and SMAPIv3. - for SMAPIv1, this will reuse the old logic of xapi using Tapctl to retrieve this information from tapdisk - for SMAPIv3, this uses the new `import_activate` call to prepare an nbd server Also adjust the position of the module `DATA` to use the `DP` module. Signed-off-by: Vincent Liu <[email protected]>
Previously we hardcoded the export to be empty string when the nbd client negotiates with the server. This patch allows specifying this as a protocol parameter. This is useful for SXM of SMAPIv3 SRs when we copy the snapshot from source to destination, where the export name is needed when we copy using nbd. Signed-off-by: Vincent Liu <[email protected]>
This is the main commit that implements the tracking logic of SXM for inbound SMAPIv3 SRs. It mainly consists of the following changes: - Adding parameters to various SMAPIv2 calls to track the domain slice that is invoking the SMAPI calls. For SXM, we track two particular domains: the mirror domain and the copy domain, corresponding to two of the operations happening during migration: mirror of the VDI and copy of the base image. Although this was not needed for SMAPIv1 calls, it is now necessary for SMAPIv3 calls since it requires explicit tracking of the domain slices. - Extend various SMAPIv2 calls with the new domain parameter since SMAPIv3 has one qemu-dp process per domain, and therefore requires us to keep track of the domain parameter. - Invoking the `import_activate` call in the new `nbd_handler` on the receiving end to prepare an nbd server for mirroring. Signed-off-by: Vincent Liu <[email protected]>
SXM using qemu datapaths is done in two parts: inbound and outbound, hence the storage side has declared a `VDI_MIRROR_IN` features. Implement the checking logic for this feature here. This is done in two parts: 1. Check the relevant SM for the feature `VDI_MIRROR_IN`. This is done on the destination host. If this fails, then storage migration will be prevented from happening in the beginning. Consider this as a pre-check. 2. Check in xapi-storage-script that the chosen datapath supports mirroring. Technically mirroring is a datapath feature so it makes sense to check it there. But currently there is no easy way to register a datapath feature into the SM object in xapi database, so instead the volume plugins for SMAPIv3 SRs would also declare this feature. However, we still need a final check on the datapath features to make sure the selected datapath really supports this feature. Note this check happens while trying to set up the mirror, after storage migration has started, and will give a relatively cryptic error to the user. The end result of this is that if either the volume plugin does not declare `VDI_MIRROR_IN`/`VDI_MIRROR` or the dp plugin does not declare any of these two, SXM will fail (upfront/mid way). Note current SXM of SMAPIv1 SRs are not affected since `VDI_MIRROR` is a superset of `VDI_MIRROR_IN`. Signed-off-by: Vincent Liu <[email protected]>
`VDI.compose` operation links a child volume to a parent, which changes the internal structure of the volume. This is unsafe to do while the mirroring datapath is activated. Instead, we move `VDI.compose` into `receive_finalize`, which is when the source VM is paused, and the mirroring datapath deactivated. Implement `receive_finalize2` for this purpose, and also maintain backwards compatibility. Note this change does not make the dummy snapshot of the mirror VDI on the destination host redundant, as there are two purposes for this dummy snapshot: 1. to prevent GC from doing leaf coalescing; 2. to get a differencing disk, which is needed for VDI.compose. The first purpose is no longer needed due to this change, while the second purpose still remains valid. Signed-off-by: Vincent Liu <[email protected]>
This new function returns the real nbd server of the underlying storage backend. This nbd server will not accept any fd passing. This is in contrast to import_activate, which returns a "special" nbd server that will accept fds via SCM_RIGHTS. Signed-off-by: Vincent Liu <[email protected]>
The new http handler nbd_proxy accepts a connection from an nbd client, and acts as a proxy between the client and the nbd server of the underlying storage backend, as returned by get_nbd_server. This proxy is used in place of the old style of fd passing to tapdisks for VDI copying during storage migration, since the new nbd server from qemu-dp does not support accepting fds. This might also be used in the future for outbound storage migration for mirroring. Signed-off-by: Vincent Liu <[email protected]>
Previously the source VM is cleaned up at the very end of the migration, after the destination VM is up and running. This has caused problems for SMAPIv3 SRs as the new VM needs to plug in the new VBD which requires a datapath activate on the VDI which is currently in use by the old VM (and the mirroring process). This window needs to be reduced since we cannot have two datapath activated at the same time. Signed-off-by: Vincent Liu <[email protected]>
Improve the documentation of functions such as `VDI.compose` and `DATA.MIRROR.receive_cancel`. Also use more consistent logging style. Signed-off-by: Vincent Liu <[email protected]>
9470845
to
fc559ea
Compare
; sm_config= [] | ||
; physical_utilisation= | ||
x.Xapi_storage.Control.physical_utilisation | ||
(* All the sm_config stored is of the form (k, v) where k will be prefixed by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment should appear after the ;
that separates physical_utilisation
from sm_config
| c, c' when c = c' -> | ||
true | ||
| _ -> | ||
false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| c, c' when c = c' -> | |
true | |
| _ -> | |
false | |
| c, c' -> | |
c = c' |
@@ -352,6 +352,8 @@ end | |||
|
|||
let _nonpersistent = "NONPERSISTENT" | |||
|
|||
let _vdi_mirror_in = "VDI_MIRROR_IN" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the commit
CP-45016: Implement checking of mirroring features
What could be done to make the error less cryptic? maybe something along the error path to transform the error?
Local.VDI.compose dbg r.sr r.parent_vdi r.leaf_vdi ; | ||
(* On SMAPIv3, compose would have removed the now invalid dummy vdi, so | ||
there is no need to destroy it anymore, while this is necessary on SMAPIv1 SRs. *) | ||
log_and_ignore_exn (fun () -> Local.VDI.destroy dbg r.sr r.dummy_vdi) ; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does log_and_ignore_exn
produce scary errors in the logs?
I am aware this is quite a large PR and there are a couple of things that could be done in different ways, so would probably benefit from opening it a bit earlier for discussion.
Currently this feature is controlled by the feature flag
sm3_import_mirror
on the storage side, which in turn controls the SM featureVDI_MIRROR_IN
, which controls whether SXM can run or not. Current SXM of SMAPIv1 SRs are not affected, sinceVDI_MIRROR
containsVDI_MIRROR_IN
Undraft this as the test went surprisingly well (all green).
Best reviewed by commits.
There is also a lightweight design doc on xs-documents if any one wants to read it.