| title | authors | reviewers | approvers | creation-date | last-updated | status |
|-------|---------|-----------|-----------|---------------|--------------|--------|
| upstream-OSDK-features-to-controller-runtime | | | | 2019-09-22 | 2019-09-22 | implementable |
- Enhancement is implementable
- Design details are appropriately documented from clear requirements
- Test plan is defined
An operator project scaffolded with the Operator SDK will primarily use library code that lives upstream in the controller-runtime project. There is a small set of packages in the SDK that extends the functionality of the controller-runtime APIs to cover more specific use cases. Most of these features along with their documentation should be contributed upstream to the controller-runtime.
Additionally, the Operator SDK is preparing to use Kubebuilder for scaffolding Go operator projects for better alignment with the upstream community. See kubebuilder-project-board and the upstream proposal for the integration of Operator SDK and Kubebuilder. As part of this integration, some features from the SDK’s scaffolding machinery need to be contributed upstream to Kubebuilder before the SDK can use Kubebuilder as its upstream.
Contributing the extra features in the SDK to the controller-runtime would make them available to all controller-runtime users. It would also reduce the maintenance burden on the SDK, since there are more contributors and users upstream who can improve these features over time.
Contributing scaffolding enhancements to Kubebuilder that cover the SDK’s use cases removes blockers in the SDK’s proposal to use Kubebuilder as upstream, and helps align the SDK and Kubebuilder on a common project layout and workflow.
- The packages and features that are suitable for upstream contribution should be made available in the controller-runtime such that they cover the same use cases.
- Downstream SDK users can easily switch over to using those features in the controller-runtime. In most cases this should amount to just changing the import paths. For function signature changes there should be documentation to explain the breaking changes.
- The contributed features should have sufficient documentation in the upstream godocs.
- Once available upstream, those packages and APIs should be marked as deprecated and eventually removed from the SDK.
- Not all SDK library code is suitable for upstream contributions.
- For instance the test-framework library is closely tied to the SDK’s testing workflow and not generally applicable outside SDK projects.
- The SDK's leader-for-life leader election package already has an alternative in the controller-runtime’s leader election package which uses lease based leader election.
The user stories outline the individual features that are suitable for upstream contribution. Some of these features may already be merged upstream or are currently under review.
Story 1 - Operator Developers can use a dynamic RestMapper that discovers new resource types
The default RestMapper used in the controller-runtime does not update to reflect new resource types registered with the API server after the RestMapper is first initialized at startup. See operator-sdk #1328 and controller-runtime #321.
The SDK has pkg/restmapper (see operator-sdk #1329) that provides a dynamic RestMapper which will reload the cached rest mappings on lookup errors due to a stale cache. This dynamic restmapper is currently under review for upstream contribution with some improvements like thread safety and rate limiting. See controller-runtime #554.
Story 2 - Operator Developers can filter out update events that do not change an object's generation
The SDK provides a predicate called GenerationChangedPredicate in pkg/predicate that filters out update events for objects that have no change in their metadata.Generation. This is commonly used for ignoring update events for CustomResource objects that only have their status block updated with no change to the spec block.
This feature has already been incorporated upstream with godocs on the predicate and its caveats. See controller-runtime #553 and the GenerationChangedPredicate godocs.
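The core check behind the predicate reduces to a generation comparison, sketched here with an illustrative `meta` struct rather than the real controller-runtime event types. The key fact it relies on is that the API server increments `metadata.generation` on spec changes but not on status-only updates.

```go
package main

import "fmt"

// meta is a minimal stand-in for object metadata; only Generation
// matters for this predicate.
type meta struct {
	Generation int64
}

// generationChanged mirrors the core check in GenerationChangedPredicate:
// an update event passes only if metadata.generation changed.
func generationChanged(oldObj, newObj meta) bool {
	return oldObj.Generation != newObj.Generation
}

func main() {
	statusOnly := generationChanged(meta{Generation: 2}, meta{Generation: 2})
	specChange := generationChanged(meta{Generation: 2}, meta{Generation: 3})
	fmt.Println(statusOnly, specChange) // false true
}
```

One caveat the upstream godocs call out: generation is not incremented for all resource types (e.g. it behaves differently without a status subresource), so the predicate is not a universal spec-change filter.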
Story 3 - Operator Developers can configure the zap based logger via command line flags
The SDK has pkg/log/zap that provides a zap based logr logger that allows a number of fine grained configurations (e.g. debug level, encoder formatting) via command line flags passed to the operator.
The controller-runtime’s zap based logger has recently been made configurable via functional options that should allow all the configurations in the SDK’s own zap logger (see controller-runtime #560). This needs to be followed up by adding predefined options for each configuration of the logger to the controller-runtime so that users don’t have to write them themselves.
Ideally the flagset for setting all the logger configurations could also live upstream. However, because the controller-runtime’s pkg/log/zap allows instantiating multiple zap loggers with different configs, a global flagset that provides a singular configuration for all instantiated loggers may not be suitable. This point needs more discussion, and it’s possible that the configuration flags will have to live downstream in the SDK.
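The functional-options pattern the controller-runtime adopted can be sketched in isolation. The `config`, `Option`, `UseDevMode`, and `Level` names below are illustrative, not the actual pkg/log/zap API; the point is that each predefined option mutates a config struct, and a flagset could map each flag onto one of these options.

```go
package main

import "fmt"

// config collects the knobs the flags would set.
type config struct {
	development bool
	level       string
}

// Option is a functional option that mutates the logger config.
type Option func(*config)

// UseDevMode toggles development-friendly output (illustrative name).
func UseDevMode(dev bool) Option { return func(c *config) { c.development = dev } }

// Level sets the minimum enabled log level (illustrative name).
func Level(l string) Option { return func(c *config) { c.level = l } }

// newLogger applies options over defaults, the pattern introduced
// upstream in controller-runtime #560.
func newLogger(opts ...Option) config {
	c := config{level: "info"} // defaults
	for _, o := range opts {
		o(&c)
	}
	return c
}

func main() {
	c := newLogger(UseDevMode(true), Level("debug"))
	fmt.Println(c.development, c.level) // true debug
}
```

Because options are just values, two loggers in the same process can be built with different option sets, which is exactly why a single global flagset is an awkward fit.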
Story 4 - Operator Developers can use SDK’s method of building images that run as non-root users in Kubebuilder projects
Kubebuilder scaffolded projects previously ran the operator base image as the root user. See Kubebuilder’s pkg/scaffold/v2/dockerfile.go.
Operator SDK scaffolds a project Dockerfile that runs as non-root by default and allows the operator image to run with arbitrary UIDs on OpenShift. For details, see the OpenShift container guidelines for non-root images, and how the SDK includes the user setup in the image build and uses a custom entrypoint:
- internal/pkg/scaffold/build_dockerfile.go
- internal/pkg/scaffold/entrypoint.go
- internal/pkg/scaffold/usersetup.go
Kubebuilder should support scaffolding projects that allow the base image to run as non-root and support arbitrary user IDs. Currently, with kubebuilder #983, a non-root base image should be supported by Kubebuilder.
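The shape of the SDK's approach can be illustrated with a Dockerfile fragment. This is a hedged sketch, not the exact scaffolded file: the base image, paths, and the `user_setup`/`entrypoint` script names follow the referenced scaffold files, and the key ideas are creating a non-root default user and switching to it with `USER`, while the entrypoint accommodates the arbitrary (GID 0) UIDs that OpenShift assigns at runtime.

```dockerfile
# Illustrative sketch of the SDK's non-root image setup.
FROM registry.access.redhat.com/ubi8/ubi-minimal:latest

ENV OPERATOR=/usr/local/bin/operator \
    USER_UID=1001 \
    USER_NAME=operator

# Install the operator binary and the helper scripts.
COPY build/_output/bin/operator ${OPERATOR}
COPY build/bin /usr/local/bin

# user_setup creates the non-root user and makes the home directory
# group-writable so an arbitrary UID in group 0 can still run.
RUN /usr/local/bin/user_setup

# entrypoint ensures the runtime UID has a passwd entry before exec'ing
# the operator, then runs it as the non-root user.
ENTRYPOINT ["/usr/local/bin/entrypoint"]
USER ${USER_UID}
```

The `USER ${USER_UID}` directive is what Kubebuilder's scaffold previously lacked; without it, the container defaults to root.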
Story 5 - Operator Developers can use the prometheus-operator’s ServiceMonitor API to configure Prometheus to scrape their operator metrics in Kubebuilder projects
The Operator SDK’s pkg/metrics has helpers that let operators configure and create ServiceMonitors, which let a Prometheus instance on a cluster target the operator’s Service object that exposes operator metrics.
Instead of having this functionality live in the controller-runtime as helpers that can be called to set up ServiceMonitors, it can be added as manifests that Kubebuilder scaffolds for an operator project. This would be similar to other resources that need to be created alongside the operator and can be customized in the manifest. Upstream issue at kubebuilder #887.
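A scaffolded manifest of this kind might look like the fragment below. This is an illustrative sketch, not the manifest kubebuilder #887 proposes: the names and labels are assumptions, and the essential requirement is that the ServiceMonitor's label selector matches the labels on the operator's metrics Service.

```yaml
# Illustrative ServiceMonitor for scraping operator metrics.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: controller-manager-metrics-monitor
  labels:
    control-plane: controller-manager
spec:
  # Must match the labels on the operator's metrics Service.
  selector:
    matchLabels:
      control-plane: controller-manager
  endpoints:
    - port: metrics
      path: /metrics
```

Because it is a plain manifest rather than helper code, users can customize it (TLS settings, scrape interval, relabeling) with the same tooling they use for the rest of their deployment manifests.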
Story 6 - Operator Developers have documentation that demonstrates how to create and expose custom operator metrics
Once Kubebuilder supports scaffolding ServiceMonitor manifests, the Kubebuilder book documentation on recording custom metrics should be extended to show how to expose these metrics via the ServiceMonitor.
Almost all of the upstream work entails breaking changes for SDK users. Before the features are removed from the SDK, they should first be deprecated in a prior release with sufficient documentation explaining how to use the equivalent upstream features in the controller-runtime.
The features that have replacements in Kubebuilder will need to wait until the SDK is ready to use Kubebuilder as its upstream for Go operators before being removed.
All library features being upstreamed into the controller-runtime will need unit tests that run as part of the controller-runtime’s CI.
Similarly any scaffolding and manifest changes to Kubebuilder would need e2e tests that verify the effects of those manifest changes as part of Kubebuilder’s CI.