Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CP-45016 Implement inbound SXM for SMAPIv3 SRs #6101

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

Vincent-lau
Copy link
Contributor

@Vincent-lau Vincent-lau commented Nov 4, 2024

I am aware this is quite a large PR and there are a couple of things that could be done in different ways, so would probably benefit from opening it a bit earlier for discussion.

Currently this feature is controlled by the feature flag sm3_import_mirror on the storage side, which in turn controls the SM feature VDI_MIRROR_IN, which controls whether SXM can run or not. Current SXM of SMAPIv1 SRs are not affected, since VDI_MIRROR contains VDI_MIRROR_IN

Undraft this as the test went surprisingly well (all green).

Best reviewed by commits.

There is also a lightweight design doc on xs-documents if any one wants to read it.

@Vincent-lau Vincent-lau marked this pull request as draft November 4, 2024 14:36
@Vincent-lau Vincent-lau force-pushed the private/shul2/sxm-in2 branch from f8fbee6 to 9d2be3e Compare November 4, 2024 14:41
ocaml/vhd-tool/src/impl.ml Outdated Show resolved Hide resolved
ocaml/xapi/smint.ml Outdated Show resolved Hide resolved
@Vincent-lau Vincent-lau force-pushed the private/shul2/sxm-in2 branch from b9ff795 to 1254a96 Compare November 4, 2024 16:45
@Vincent-lau Vincent-lau force-pushed the private/shul2/sxm-in2 branch 3 times, most recently from 26aca8a to 9095100 Compare November 5, 2024 11:31
@Vincent-lau Vincent-lau marked this pull request as ready for review November 5, 2024 14:01
@Vincent-lau Vincent-lau force-pushed the private/shul2/sxm-in2 branch from 9095100 to 7067707 Compare November 5, 2024 14:06
@Vincent-lau Vincent-lau force-pushed the private/shul2/sxm-in2 branch 3 times, most recently from 2dcb915 to c54a8da Compare November 15, 2024 17:37
ocaml/xapi/smint.ml Outdated Show resolved Hide resolved
let final_handshake () =
clean_src_vm () ;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is not part of the handshake, I prefer the make this call separately in the sequence below (L2797 and L2814).
I'm actually surprised that this hasn't caused problems for plain, non-storage live migration on SMAPIv3 SRs... or has it?

Copy link
Contributor Author

@Vincent-lau Vincent-lau Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well this will only cause problems if it is a storage migration. For pure VM migration, I think it is fine since that implies the VM will move onto a different host (say from host A to host B), and having the datapath activated on the src host A is ok as that does not affect host B.

With storage migration, though, the VM could be staying on the same host (or not), but it does not matter. If the destinatoin VM attempts to activate the datapath on the mirror VDI, it will give you a VDI in use as it is currently being used by mirroring (or by source VM as well if it is a pure storage migration on the same host), as designed by SMAPIv3

@robhoes
Copy link
Member

robhoes commented Nov 18, 2024

I can't see anything obviously wrong with this, but this is a big PR that modifies the storage migration code that is generally hard to fully follow and understand.

The changes related to SMAPIv3 SRs are not live until those SRs are configured to expose a new feature, but this PR also changes SMAPIv1 code paths in a number of ways. We therefore need to carefully test especially the SMAPIv1 storage migration use cases, including VM migration from a pool running "old" code to a pool that contains these changes.

So before accepting this, I'd like to understand how this has been tested.

@Vincent-lau Vincent-lau marked this pull request as draft November 18, 2024 17:37
@Vincent-lau Vincent-lau force-pushed the private/shul2/sxm-in2 branch 2 times, most recently from a72bf25 to 41ce262 Compare December 5, 2024 17:34
@Vincent-lau Vincent-lau force-pushed the private/shul2/sxm-in2 branch 2 times, most recently from 0080177 to fe9d48b Compare December 10, 2024 15:17
@Vincent-lau Vincent-lau marked this pull request as ready for review December 10, 2024 17:24
mirror_vm. The storage_interface defines mirror_vm first. But logging
suggests that Storage_mux receives copy_vm first so I had to reverse the ordering
of these two params to make it work. *)
SMAPI.DATA.MIRROR.start dbg vconf.sr vconf.location new_dp vconf.copy_vm
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If anyone understands what is going on here, please let me know :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't make any sense at all... Can you use the labelled arguments, for everyone's sanity?

@Vincent-lau Vincent-lau force-pushed the private/shul2/sxm-in2 branch from fe9d48b to 9470845 Compare December 10, 2024 17:38
@Vincent-lau
Copy link
Contributor Author

Marking this PR as ready for review again as I finished some of the changes. For testing, I plan to run all the XRT tests available for SXM and see if there are any regressions. We could alternatively also consider feature branch if it is too risky to merge all of this into master. Although I am slightly against that as this PR contains some general code improvements as well.

if not (List.mem destination_protocol possible_protocols) then
if
not
(List.exists
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure I understand what this code does; maybe define a new function and use it here?

Copy link
Contributor Author

@Vincent-lau Vincent-lau Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a hack, it converts any Nbd export_name into a dummy_nbd which is just Nbd "" (empty string) to check for possible protocols. The intention of this check is to see if nbd is part of the list of possible protocols, no matter what export name it has. Maybe it would be easier to understand if I extract the (=) thing and add a comment?

; sm_config=
List.filter_map
(fun (k, v) ->
if String.starts_with ~prefix:_sm_config_prefix_key k then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would benefit from an comment that includes an example what is being transformed here and a motivation.

)
fl

(** [parse_string_int64 features] takes a [features] list in its plain string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was expecting some code that splits a string into two parts but did not see it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, see string_int64_of_string_opt

The previous feature processing logic was a bit ad hoc with various
random conversion functions. Abstract them into one module and refactor
a few functions to make their uses more explicit.

Also try to encourage users, when processing features, to convert them
into `Feature.t` and use the provided functions in this module rather
than writing `List.mem` directly.

Signed-off-by: Vincent Liu <[email protected]>
This is so that `Storage_migrate` can use this function, otherwise there
would be a dependency cycle. This function also sounds like a utility
function.

Also add an mli file for `Storage_utils`.

Signed-off-by: Vincent Liu <[email protected]>
This includes:
- VDI.compose for composing VDIs. Note the ordering of the parameter
  `parent` and `child` is reversed, due to the inconsistency between
  SMAPIv2 and SMAPIv3
- DP.attach_info which calls the idempotent DP.attach3 on a already
  attached VDI to get the attach info again.
- VDI.{add_to,remove_from}sm_config for xapi to keep track of the mirror
  vdi
- VDI.set_content_id for determining similarity between VDIs

Signed-off-by: Vincent Liu <[email protected]>
This API call prepares a nbd server that will be used for mirroring.
This gets mapped to different logic on SMAPIv1 and SMAPIv3.

- for SMAPIv1, this will reuse the old logic of xapi using Tapctl to
  retrieve this information from tapdisk
- for SMAPIv3, this uses the new `import_activate` call to prepare an
  nbd server

Also adjust the position of the module `DATA` to use the `DP` module.

Signed-off-by: Vincent Liu <[email protected]>
Previously we hardcoded the export to be empty string when the nbd
client negotiates with the server. This patch allows specifying this
as a protocol parameter.

This is useful for SXM of SMAPIv3 SRs when we copy the snapshot from
source to destination, where the export name is needed when we copy
using nbd.

Signed-off-by: Vincent Liu <[email protected]>
This is the main commit that implements the tracking logic of SXM for
inbound SMAPIv3 SRs. It mainly consists of the following changes:

- Adding parameters to various SMAPIv2 calls to track the domain slice
  that is invoking the SMAPI calls. For SXM, we track two particular
  domains: the mirror domain and the copy domain, corresponding to two
  of the operations happening during migration: mirror of the VDI and
  copy of the base image. Although this was not needed for SMAPIv1 calls,
  it is now necessary for SMAPIv3 calls since it requires explicit tracking
  of the domain slices.
- Extend various SMAPIv2 calls with the new domain parameter since
  SMAPIv3 has one qemu-dp process per domain, and therefore requires us
  to keep track of the domain parameter.
- Invoking the `import_activate` call in the new `nbd_handler` on the
  receiving end to prepare an nbd server for mirroring.

Signed-off-by: Vincent Liu <[email protected]>
SXM using qemu datapaths is done in two parts: inbound and outbound,
hence the storage side has declared a `VDI_MIRROR_IN` features.
Implement the checking logic for this feature here. This is done in two
parts:

1. Check the relevant SM for the feature `VDI_MIRROR_IN`. This is done
   on the destination host. If this fails, then storage migration will
   be prevented from happening in the beginning. Consider this as a
   pre-check.
2. Check in xapi-storage-script that the chosen datapath supports
   mirroring. Technically mirroring is a datapath feature so it makes
   sense to check it there. But currently there is no easy way to
   register a datapath feature into the SM object in xapi database, so
   instead the volume plugins for SMAPIv3 SRs would also declare this
   feature. However, we still need a final check on the datapath
   features to make sure the selected datapath really supports this
   feature. Note this check happens while trying to set up the mirror,
   after storage migration has started, and will give a relatively
   cryptic error to the user.

The end result of this is that if either the volume plugin does not
declare `VDI_MIRROR_IN`/`VDI_MIRROR` or the dp plugin does not declare
any of these two, SXM will fail (upfront/mid way). Note current SXM of
SMAPIv1 SRs are not affected since `VDI_MIRROR` is a superset of
`VDI_MIRROR_IN`.

Signed-off-by: Vincent Liu <[email protected]>
`VDI.compose` operation links a child volume to a parent, which changes
the internal structure of the volume. This is unsafe to do while the
mirroring datapath is activated. Instead, we move `VDI.compose` into
`receive_finalize`, which is when the source VM is paused, and the
mirroring datapath deactivated.

Implement `receive_finalize2` for this purpose, and also maintain
backwards compatibility.

Note this change does not make the dummy snapshot of the mirror VDI on the
destination host redundant, as there are two purposes for this dummy
snapshot: 1. to prevent GC from doing leaf coalescing; 2. to get a
differencing disk, which is needed for VDI.compose. The first purpose is
no longer needed due to this change, while the second purpose still
remains valid.

Signed-off-by: Vincent Liu <[email protected]>
This new function returns the real nbd server of the underlying storage
backend. This nbd server will not accept any fd passing. This is in
contrast to import_activate, which returns a "special" nbd server that
will accept fds via SCM_RIGHTS.

Signed-off-by: Vincent Liu <[email protected]>
The new http handler nbd_proxy accepts a connection from an nbd client,
and acts as a proxy between the client and the nbd server of the
underlying storage backend, as returned by get_nbd_server.

This proxy is used in place of the old style of fd passing to tapdisks
for VDI copying during storage migration, since the new nbd server from
qemu-dp does not support accepting fds.

This might also be used in the future for outbound storage migration for
mirroring.

Signed-off-by: Vincent Liu <[email protected]>
Previously the source VM is cleaned up at the very end of the migration,
after the destination VM is up and running. This has caused problems for
SMAPIv3 SRs as the new VM needs to plug in the new VBD which requires a
datapath activate on the VDI which is currently in use by the old VM
(and the mirroring process). This window needs to be reduced since we
cannot have two datapath activated at the same time.

Signed-off-by: Vincent Liu <[email protected]>
Improve the documentation of functions such as `VDI.compose` and
`DATA.MIRROR.receive_cancel`. Also use more consistent logging style.

Signed-off-by: Vincent Liu <[email protected]>
@Vincent-lau Vincent-lau force-pushed the private/shul2/sxm-in2 branch from 9470845 to fc559ea Compare December 12, 2024 14:37
; sm_config= []
; physical_utilisation=
x.Xapi_storage.Control.physical_utilisation
(* All the sm_config stored is of the form (k, v) where k will be prefixed by
Copy link
Member

@psafont psafont Dec 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment should appear after the ; that separates physical_utilisation from sm_config

Comment on lines +164 to +167
| c, c' when c = c' ->
true
| _ ->
false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| c, c' when c = c' ->
true
| _ ->
false
| c, c' ->
c = c'

@@ -352,6 +352,8 @@ end

let _nonpersistent = "NONPERSISTENT"

let _vdi_mirror_in = "VDI_MIRROR_IN"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the commit

CP-45016: Implement checking of mirroring features

What could be done to make the error less cryptic? maybe something along the error path to transform the error?

Local.VDI.compose dbg r.sr r.parent_vdi r.leaf_vdi ;
(* On SMAPIv3, compose would have removed the now invalid dummy vdi, so
there is no need to destroy it anymore, while this is necessary on SMAPIv1 SRs. *)
log_and_ignore_exn (fun () -> Local.VDI.destroy dbg r.sr r.dummy_vdi) ;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does log_and_ignore_exn produce scary errors in the logs?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants