Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi-Version Drivers #6163

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

lindig
Copy link
Contributor

@lindig lindig commented Dec 6, 2024

MVD CP-52334 multi-version driver API/CLI

Provide an API/CLI for multi-version drivers. Two new classes represent
them: Host_driver and Driver_variant. See

  * doc/content/toolstack/features/MVD/index.md

for technical background and feature overview.

The current implementation mocks the missing drivertool that manipulates
the file system and observes driver state. We expect to integrate this
at a later point. Consequently, the current implementation is incomplete
and exists to faciliate development of API and XE clients.

On Xapi start and restart driver information is purged from the
database and re-built from scrach. This could be improved in the future
to be incremental.

quality-gate.sh Fixed Show fixed Hide fixed
@lindig lindig marked this pull request as ready for review December 9, 2024 13:18
@lindig lindig force-pushed the private/christianlin/multi-driver-v2 branch 2 times, most recently from 6d992d8 to f0db7d8 Compare December 9, 2024 16:16
@Vincent-lau
Copy link
Contributor

I understand this is a draft version. But would be easier to split it into smaller commits for review, or is that not very feasible?

@lindig
Copy link
Contributor Author

lindig commented Dec 10, 2024

I worked from the original work by @last-genius but ended up having so many changes that the commit history was not meaningful. In addition, re-basing made a mess of them. In the end, this is difficult to split up - it's almost impossible to split this up such that each commit still compiles because it references each other. I could split it by module or something like it - but that is not entirely logical either.

@lindig
Copy link
Contributor Author

lindig commented Dec 12, 2024

I have additional code to make the database update incremental on scan() but would like to see the basics reviewed first.


## Variant vs. Version

A driver comes in several variants, each of which has a version. A
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is a driver variant useful?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be useful when you want to update (backport) a driver (e.g for newer hardware or some risky fixes) without the risk of introducing regressions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean updating a driver to a variant, but that update itself is safe? How does it achieve that safety?

ocaml/idl/datamodel_driver_variant.ml Show resolved Hide resolved
call ~name:"deselect" ~in_oss_since:None ~lifecycle:[]
~doc:
"UNSUPPORTED. Deselect the currently active variant of this driver after \
reboot. No action will be taken if no variant is currently active."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So selected and active are independent operations, as mentioned in the design. What happens when there is no active driver but there is a selected driver? Will it be deselected?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A selected driver will become active on next reboot.

Copy link
Contributor

@Vincent-lau Vincent-lau Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my confusion comes from this statement "Deselect the currently active variant of this driver", which says if the current driver is active (i.e. in use, but will not be loaded on next boot), calling deselect on it will make it deselected? I was expecting something like "deselect the currently selected variant of this driver", so that it will not be loaded on next boot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the wording could be improved. What happens is that the currently active driver/variant combination will not become active again on reboot. Technically a symlink is removed such that this variant is no longer loaded by the kernel.

@@ -581,17 +602,6 @@ let gc_PVS_cache_storage ~__context =
(fun x -> valid_ref __context x.pVS_cache_storage_host)
Db.PVS_cache_storage.destroy

(*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove these?

ocaml/xapi/xapi_host_driver.ml Outdated Show resolved Hide resolved
ocaml/xapi-cli-server/cli_operations.ml Outdated Show resolved Hide resolved

(** destroy driver and recursively its variants, too *)
let destroy ~__context ~self =
D.debug "Destroying driver %s" (Ref.string_of self) ;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The %s __FUNCTION__ can be added here as well.

Copy link
Member

@robhoes robhoes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very long single commit... I would normally split out at least the datamodel definition, internal logic, tests and mock tool into separate commits.

doc/content/toolstack/features/MVD/index.md Outdated Show resolved Hide resolved
ocaml/idl/datamodel_driver_variant.ml Outdated Show resolved Hide resolved
ocaml/idl/datamodel_driver_variant.ml Show resolved Hide resolved
"Unique versions of this driver variant"
; field ~lifecycle:[] ~qualifier:DynamicRO ~ty:Bool "hardware_present"
"True if the hardware for this variant is present on the host"
; field ~lifecycle:[] ~qualifier:DynamicRO ~ty:Float "priority"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not mentioned in the markdown doc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True; the role of priority is not clear at the moment; we are not actively using it except to order presentation. The value is reported by the lower level.

ocaml/idl/datamodel_driver_variant.ml Outdated Show resolved Hide resolved
ocaml/xapi/message_forwarding.ml Outdated Show resolved Hide resolved
ocaml/xapi/xapi_host_driver.ml Outdated Show resolved Hide resolved
Comment on lines 96 to 140
D.debug "%s selecting driver %s variant %s" __FUNCTION__ drv var ;
let d = Db.Host_driver.get_record ~__context ~self in
let v = Db.Driver_variant.get_record ~__context ~self:variant in
let stdout =
drivertool ["select"; d.API.host_driver_name; v.API.driver_variant_name]
in
info "%s: %s" __FUNCTION__ stdout ;
Db.Host_driver.set_selected_variant ~__context ~self ~value:variant
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a copy of Variant.select, except you have the driver ref already. But I'm not sure we actually need Driver_variant.select in the API at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One uses the name of the variant, the other the ref. Both may be the most convenient for an API client; I would make the decision to remove this later as the feature progresses.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So both for select and rescan have redundant versions in the datamodel now. I prefer to keep the API simple and have just one way of doing something, and keep just one of each kind. But fine if you want to decide later which one is best.

ocaml/xapi/xapi_host_driver.ml Outdated Show resolved Hide resolved
ocaml/xapi/xapi_host_driver.ml Outdated Show resolved Hide resolved
@lindig lindig force-pushed the private/christianlin/multi-driver-v2 branch from f0db7d8 to 2a1a1d1 Compare December 18, 2024 13:17

The format of the description is vendor specific and is used for
a null-terminated string holding the version. The name is fixed to
"XenServer". The exact format is described in [ELF notes].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this configurable? The spec says the identifier is vendor-specific, so can another vendor provide .note.$VENDOR instead and tell XAPI that it's where it should look for the extra driver information?

Alternatively, if not configurable, would Xapi be a better fit as an identifier for such a feature of the XAPI Project?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this is quite far out and we are trying to get this off the ground. It's too early to encode this kind of flexibility where we are far away from any end-to-end testing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand regarding the configurability. However I tried to suggest changing the hardcoded names to Xapi (paths, ELF notes...), which seemed a good and neutral identifier for downstreams, in the previous draft iteration of this PR, but without an answer at the time, that's why I'm asking again now. This would likely avert the need for configurable paths and identifiers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll bring it up in a discussion. I understand the desire to keep XenServer out of this.

@robhoes
Copy link
Member

robhoes commented Dec 19, 2024

The diff for ocaml/idl/datamodel_lifecycle.ml should be removed and regenerated as versions have moved on.
Some of my comments have not yet been addressed (all simple fixes).

Provide an API/CLI for multi-version drivers. Two new classes represent
them: Host_driver and Driver_variant. See

  * doc/content/toolstack/features/MVD/index.md

for technical background and feature overview.

The current implementation mocks the missing drivertool that manipulates
the file system and observes driver state. We expect to integrate this
at a later point. Consequently, the current implementation is incomplete
and exists to faciliate development of API and XE clients.

On Xapi start and restart driver information is purged from the
database and re-built from scrach. This could be improved in the future
to be incremental.

Signed-off-by: Christian Lindig <[email protected]>
@lindig lindig force-pushed the private/christianlin/multi-driver-v2 branch from 78b7ff3 to fa3b4ae Compare December 20, 2024 10:44
Christian Lindig added 5 commits December 20, 2024 10:47
When scanning a host for drivers, don't remove and re-create them but
instead update an entry if it already exists.

Signed-off-by: Christian Lindig <[email protected]>

Signed-off-by: Christian Lindig <[email protected]>
Address review comments

Signed-off-by: Christian Lindig <[email protected]>
Signed-off-by: Christian Lindig <[email protected]>
@lindig lindig force-pushed the private/christianlin/multi-driver-v2 branch from fa3b4ae to 0e02275 Compare December 20, 2024 10:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants