# Points to the mirrored VP install chart
export PATTERN_DISCONNECTED_HOME=registry.internal.disconnected.net/hybridcloudpatterns
./pattern.sh make install
Validated Pattern: Multi-Cloud GitOps
Validated Patterns: The Story so far
Our first foray into the realm of Validated Patterns was the adaptation of the MANUela application and its associated tooling to ArgoCD and Tekton, to demonstrate the deployment of a fairly involved IoT application designed to monitor industrial equipment and use AI/ML techniques to predict failure. This resulted in the Industrial Edge validated pattern, which you can see here.
This was our first use of a framework to deploy a significant application, and we learned a lot by doing it. It was good to be faced with a number of problems in the “real world” before taking a look at what is really essential for the framework and why.
All patterns have at least two parts: A “common” element (which we expect to be the basic framework that nearly all of our patterns will share) and a pattern-specific element, which uses the common pattern and expands on it with pattern-specific content. In the case of Industrial Edge, the common component included secret handling, installation of the GitOps operator, and installation of Red Hat Advanced Cluster Management. The pattern-specific components included the OpenShift Pipelines Operator, the AMQ Broker and Streams operators, Camel-K, the Seldon Operator, OpenDataHub, and Jupyter Notebooks and S3 storage buckets.
Multi-Cloud GitOps: The Why and What
After finishing with Industrial Edge, we recognized that there were some specific areas where we needed to tell a better story. There were several areas where we thought we could improve the user experience of working with our tools and repos. And we recognized that the pattern might be a lot clearer in form and design to those of us who worked on it than to an interested user from the open Internet.
So we had several categories of work to do as we scoped this pattern:
- Make a clear starting point: Make a clear “entry point” into pattern development, and define the features that we think should be common to all patterns. This pattern should be usable as a template for both us and other users to be able to clone as a starting point for future pattern development.
- Make “common” common: Since this pattern is going to be foundational to future patterns, remove elements from the common framework that are not expected to be truly common to all future patterns (or at least a large subset of them). Many elements specific to Industrial Edge found their way into common; in some cases we thought those elements truly were common and later re-thought them.
- Improve secrets handling: Provide a secure credential store such that we can manage secrets in that store rather than primarily as YAML files on a developer workstation. Broker access to that secret store via the External Secrets Operator to ensure a level of modularity and allow users to choose different secret stores if they wish. We also want to integrate the usage of that secret store into the cluster and demonstrate how to use it.
- Improve support for cluster scalability: For the Industrial Edge pattern, the edge cluster was technically optional (we have a supported model where both the datacenter and factory applications can run on the same cluster). We want these patterns to be more clearly scalable, and we identified two categories of that kind of scalability:
- Clusters vary only in name, but run the same applications and the same workloads
- Clusters vary in workloads, sizing, and configuration, and allow for considerable variability.
Many of the elements needed to support these were present in the initial framework, but it may not have been completely clear how to use these features, or they were couched in terms that only made sense to the people who worked on the pattern. This will now be clearer for future patterns, and we will continue to evolve the model with user and customer feedback.
Key Learning: Submodules are hard, and probably not worth it
We considered three approaches for including the common layer in each pattern repository:
- Copy and Paste
- Git Submodules
- Git Subtrees
We rejected the notion of copy and paste because we reasoned that once patterns diverged in their “common” layer it would be too difficult and painful to bring them back together later. More importantly, there would be no motivation to do so.
In Industrial Edge, we decided to make common a git submodule. Git submodules have been a feature of git for a long time, originally intended to make compiling a large project with multiple libraries more straightforward, by having a parent repo and an arbitrary set of submodule repos. Git submodules require a number of exceptions to the “typical” git workflow - the initial clone works differently, and keeping the submodule up to date can trip users up. Most importantly, they require the practical management of multiple repositories, which can make life difficult in disconnected environments, which are important for us to support. It was confusing for our engineers to understand how to contribute code to the submodule repository. Finally, user response to the exceptions they had to make because of submodules was universally negative.
So going forward, because it is still important to have a common basis for patterns, and a clear mechanism and technical path to get updates to the common layer, we have moved to the subtree model as a mechanism for including common. This allows consumers of the pattern to treat the repo as a single entity instead of two, and does not require special syntax or commands to be run when switching branches, updating or, in many cases, contributing to common itself.
Key Learning: Secrets Management
One of our biggest challenges in following GitOps principles for the deployment of workloads is the handling of secrets. GitOps tells us that the git repo should be the source of truth - but we know that we should not store secrets directly in publicly accessible repositories. Previously, our patterns standardized on YAML files on the developer workstation as the de facto authoritative secret store. This is problematic for at least two reasons: first, if two people are working on the same repo, which secret store is “right”? Second, a developer workstation is a comparatively easy target for credential theft or accidental exposure. Our systems will not work without secrets, and we need a better way of working with them.
Highlight: Multi-cloud GitOps is the “Minimum Viable Pattern”
One of the goals of this pattern is to provide the minimal pattern that both demonstrates the goals, aims and purpose of the framework and does something that we think users will find interesting and valuable. We plan to use it as a starting point for our own future pattern development; and as such we can point to it as the pattern to clone and start with if a user wants to start their own pattern development from scratch.
New Feature: Hub Cluster Vault instance
In this pattern, we introduce the ability to reference upstream helm charts and pass overrides to them in a native ArgoCD way. The first application we are treating this way is HashiCorp Vault. Bringing in Vault also allows us to make it the authoritative source of truth for secrets in the framework, and improves our security posture by making it significantly easier to rotate secrets and have OpenShift “do the right thing” by re-deploying and restarting workloads as necessary.
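Conceptually, the application entry points ArgoCD at the upstream chart and carries the overrides with it. Here is a hypothetical values-file fragment to illustrate the shape of that definition (the field names are illustrative only, not the framework’s exact schema):
# Hypothetical values-file fragment - field names are illustrative,
# not the framework's exact schema.
clusterGroup:
  applications:
    vault:
      name: vault
      namespace: vault
      chart: vault                                  # upstream HashiCorp chart
      repoURL: https://helm.releases.hashicorp.com  # upstream chart repository
      chartVersion: 0.20.1                          # pin a known-good version (example)
      overrides:
        - name: global.openshift                    # values understood by the upstream chart
          value: true
        - name: ui.enabled
          value: true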
For the purposes of shipping this pattern as a runnable demonstration, we take certain shortcuts with security that we understand are not best practices - storing the vault keys unencrypted on a developer drive, for example. If you intend to run code derived from this pattern in production, we strongly recommend you consider and follow the practices documented here.
New Feature: External Secrets Operator
While it is important to improve the story for secret handling in the pattern space overall, it is also important to leave room for multiple solutions inside patterns. Because of this, we include the External Secrets Operator, which in this pattern uses Vault, but which supports a number of other secret providers and can be extended to support providers it does not handle today. Furthermore, the external secrets approach is less disruptive to existing applications, since it works by managing the secret objects applications are already used to consuming. Vault provides other options for integration, including the agent injector, but that approach is very specific to Vault and not clearly portable.
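To make that concrete, here is a minimal sketch of an ExternalSecret asking the operator to materialize a Kubernetes Secret from a Vault path (the names, namespace, and Vault path are made up for illustration, and the API version shown is the upstream external-secrets.io one, which may differ from the operator build a given pattern release ships):
# Minimal sketch - names and paths are illustrative, not the pattern's
# actual objects.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: config-demo-secret
  namespace: config-demo
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend          # a ClusterSecretStore pointing at the hub Vault
    kind: ClusterSecretStore
  target:
    name: config-demo-secret     # the Kubernetes Secret the operator creates
  data:
    - secretKey: secret
      remoteRef:
        key: secret/data/hub/config-demo
        property: secret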
On a similar note to the Vault feature above: the approach we take in this release of the pattern has some known security deficiencies. In RHACM prior to 2.5, policies containing secrets do not properly cloak the secrets in the policy objects, and do not properly encrypt the secrets at rest. RHACM 2.5+ includes a fromSecret function that addresses both issues, securing these secrets in transit and at rest. (Of course, any entity with cluster-admin access can recover the contents of a secret object in the cluster.) One additional deficiency of this approach is that the lookup function we use in the policies to copy secrets only runs when the policy object is created or refreshed - which means there is currently no mechanism within RHACM to detect that a secret has changed and the policy needs to be refreshed. We are hoping this functionality will be included in RHACM 2.6.
New Feature: clusterGroups can have multiple cluster members
Using Advanced Cluster Management, we can inject per-cluster configuration into the ArgoCD application definition. We do this, for example, with the global.hubClusterDomain and global.localClusterDomain variables, which are available to use in helm templates in applications that use the framework.
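For example, a chart in the pattern can consume those values directly in its templates. A minimal sketch of a ConfigMap doing so (the ConfigMap name and keys are illustrative):
# templates/configmap.yaml - minimal sketch; name and keys are illustrative
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-demo-domains
data:
  hubDomain: {{ .Values.global.hubClusterDomain | quote }}
  localDomain: {{ .Values.global.localClusterDomain | quote }}
  appUrl: "https://config-demo.{{ .Values.global.localClusterDomain }}"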
This capability enables one of our key new features: the ability to deploy multiple clusters that differ only in local, cluster-defined ways (such as the FQDNs they publish for their routes). This is a need we identified while working on Industrial Edge, where we had to add the FQDN of the local cluster to a config map for use by an application that was defined in Kubernetes but runs in a user’s browser.
The config-demo namespace uses an httpd deployment based on the Red Hat Universal Base Image to demonstrate how the framework passes variables from the application definition through to config maps. The config-demo app shows a secret being defined on the hub cluster and securely transferred to remote clusters, as well as the use of the hub cluster base domain and the local cluster base domain in the configuration of applications running on either the hub or managed clusters.
Where do we go from here?
One of the next things we are committed to delivering in the new year is a pattern to extend the concept of GitOps to include elements that are outside of OpenShift and Kubernetes - specifically Red Hat Enterprise Linux nodes, including Red Hat Enterprise Linux For Edge nodes, as well as Red Hat Ansible Automation Platform.
We plan on developing a number of new patterns throughout the new year, which will showcase various technologies. Keep watching this space for updates, and if you would like to get involved, visit our site at https://validatedpatterns.io!
Validated Pattern: Ansible Edge GitOps
Ansible Edge GitOps: The Why and What
As we have been working on new validated patterns and the pattern framework, we have seen a need and interest from the community in expanding the use cases covered by the framework to include other parts of the portfolio besides OpenShift. We understand that Edge computing environments are very complex, and while OpenShift may be the right choice for some Edge environments, it will not be feasible or practical for all of them. Can environments other than Kubernetes-native ones benefit from GitOps? If so, what would that look like? This pattern works to answer those questions.
GitOps is currently a hot topic in technology. It is a natural outgrowth of the Kubernetes approach in particular, and is informed by now decades of practice in managing large fleets of systems. But is GitOps a concept that is only for Kubernetes? Or can we use the techniques and patterns of GitOps in other systems as well? We believe that by applying specific practices and techniques to Ansible code, and using a Git repo as the authoritative source of configuration, we can do exactly that.
One of the first problems we knew we would have to solve in developing this pattern was to work out how to model an Edge environment that was running Virtual Machines. We started with the assumption that we were going to use the Ansible Automation Platform Operator for OpenShift to manage these VMs. But how should we run the VMs themselves?
It is certainly possible to use the various public cloud offerings to spin up instances in the clouds, but that would require a lot of long-term maintenance of the pattern to keep up with different image types and any changes the clouds might make to their provisioning schemes. Additionally, since the purpose of including VMs in this pattern is to model an Edge environment, modeling them as ordinary public cloud instances might seem odd. As a practical matter, the pattern user would also have to keep track of the instances and spin them down when spinning down the pattern.
To begin solving these problems, this pattern introduces OpenShift Virtualization to the pattern framework. While OpenShift Virtualization today supports AWS and on-prem baremetal clusters, we hope that it will also bring support to GCP and Azure in the not too distant future. The use of OpenShift Virtualization enables the simulated Edge environment to be modeled entirely in a single cluster, and any instances will be destroyed along with the cluster.
The pattern itself focuses on the installation of a containerized application (Inductive Automation Ignition) on simulated kiosks running RHEL 8 in kiosk mode. This installation pattern is based on work Red Hat did with a customer in the Petrochemical industry.
Highlight: Imperative and Declarative Automation, and GitOps
The validated patterns framework has been committed to GitOps as a philosophy and operational practice since the beginning. The framework’s use of ArgoCD as a mechanism for deploying applications and components is proof of our commitment to GitOps core principles of having a declared desired end state, and a designated agent to bring about that end state.
Many decades of automation practice that focus on individual OS instances (whether they be virtual machines or baremetal) may lead us to believe that the only way to manage such instances is imperatively - that is, focusing on the steps required to configure a machine to the state you want it to be in as opposed to the actual end state you want.
By way of example, consider a situation where you want an individual OS instance to synchronize its time to a source you specify. The imperative way to do this would be to write a script that does some or all of the following:
- Install the software that manages system time synchronization
- Write a configuration file for the service that specifies the time source in question
- If the configuration file or other configuration mechanism that influences the service has changed, restart the time synchronization service.
Along the way, there are subtle differences between different operating systems, such as the name of the time synchronization package (ntp or chrony, for example); differences in which package manager to use; differences in configuration file formats; differences in service names. It is all rather a lot to consider, and the kinds of scripts that managed these sorts of things at scale, when written in Shell or Perl, could get quite convoluted.
Meanwhile, would it not be great if we could put the focus on end state, instead of on the steps required to get to that end state? So we could specify what we want, and we could trust the framework to “make it so” for us? Languages that have this capability rose to the forefront of IT consciousness this century and became wildly popular - languages like Puppet, Chef, Salt and, of course, Ansible. (And yes, they all owe quite a lot to CFEngine, which has been around, and is still around.) The development and practices that grew up around these languages significantly influenced Kubernetes and its development in turn.
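As an illustration of the declarative style, here is a minimal Ansible sketch of the time-synchronization example above (the playbook layout, template name, and host group are illustrative, not taken from any pattern):
# Minimal sketch of the time-sync example in a declarative style.
- name: Ensure time synchronization is configured
  hosts: all
  become: true
  tasks:
    - name: Install chrony
      ansible.builtin.package:
        name: chrony
        state: present

    - name: Point chrony at our chosen time source
      ansible.builtin.template:
        src: chrony.conf.j2       # illustrative template carrying the time source
        dest: /etc/chrony.conf
      notify: Restart chronyd

  handlers:
    - name: Restart chronyd       # only runs if the configuration actually changed
      ansible.builtin.service:
        name: chronyd
        state: restarted
        enabled: true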
Because these languages all provide a kind of hybrid model, they all have mechanisms that allow you to violate one or more of the core tenets of GitOps. For example, while many people run their configuration management code from a Git repository, none of these languages specifically require that, and all provide mechanisms to run in an ad-hoc mode. And yet, all of these languages have a fairly strong declarative flavor that can be used to specify configurations with them; again, this is not mandatory, but it is still quite common. So maybe there is a way to apply the stricter definitions of GitOps to these languages and include their use in a larger GitOps system.
Even within Kubernetes, where we clearly have first-class support for declarative systems, there are aspects of configuration that we may want to make deterministic, but not explicitly code into a git repository. For example, best practice for cluster availability is to spread a worker pool over three different availability zones in a public cloud region. Which three zones should they be? Those decisions are bound to the region the cluster is installed in. Do the AZs themselves really matter, or is the only important constraint that there be three? These kinds of things are state that matters to operators, and an imperative framework for dealing with questions like this can vastly simplify the task of cluster administrators, who can then use this automation to create clusters in multiple regions and clouds and trust that resources will be optimized for maximum availability.
Another crucial point about Declarative and Imperative systems is that it is impossible to conceive of a declarative system that does not have or require reconciliation loops. These reconciliation loops are by definition imperative processes. They often have additional conventions that apply - for example, the convention that in Kubernetes Operators the reconciliation loop will change one thing and then retry - but those processes are still inherently imperative.
A final crucial point on Declarative and Imperative systems is that, especially when we are talking about Edge installations, many of the systems that are important parts of those ecosystems do not have the same level of support for declarative-style configuration management that server operating systems and layers like Kubernetes have. Here we consider crucial elements of Edge environments like routers, switches, access points, and other network gear; as we consider IoT sensors like IP Cameras, it seems unlikely that we will have a Kubernetes-native way to manage devices like these in the foreseeable future.
With these points in mind, it seems that if we cannot bring devices to GitOps, perhaps we should bring GitOps to devices. Ansible has long been recognized for its ability to orchestrate and manage devices in an agentless model. Is there a way to run Ansible that we can recognize as a GitOps mechanism? We believe that there is, by using it with the Ansible Automation Platform components (formerly known as Tower), and recording the desired state in the git repository or repositories that the system uses. In doing so, we believe that we can and should bring GitOps to Edge environments.
Highlight: Including Ansible in Validated Patterns
A new element of this pattern is the use of the Ansible Automation Platform Operator, which we install in the hub cluster of the pattern.
The Ansible Automation Platform Operator is the Kubernetes-native way of running AAP. It provides the Controller function, which supports Execution Environments. The pattern provides its own Execution Environment (with the definition files, so that you can see what is in it or customize it if you like), and loads its own Ansible content into the AAP instance. It uses a dynamic inventory technique to deal with certain aspects of running the VMs it manages under Kubernetes.
The key function of AAP in this pattern is to configure and manage the kiosks. The included content takes the freshly provisioned VMs, registers them to the Red Hat CDN, installs Firefox, configures kiosk mode, and then downloads and manages the Ignition application container so that both Firefox and the application container start at boot time.
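A heavily simplified sketch of what that configuration might look like as Ansible tasks (the module choices, variable names, and image reference are illustrative; the pattern’s actual roles are more involved):
# Heavily simplified sketch - modules, variables, and image are illustrative.
- name: Configure kiosks
  hosts: kiosks
  become: true
  tasks:
    - name: Register to the Red Hat CDN
      community.general.redhat_subscription:
        username: "{{ rhsm_username }}"
        password: "{{ rhsm_password }}"
        auto_attach: true

    - name: Install Firefox
      ansible.builtin.package:
        name: firefox
        state: present

    - name: Run the Ignition application container at boot
      containers.podman.podman_container:
        name: ignition
        image: "{{ ignition_image }}"
        state: started
        restart_policy: always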
The playbook that configures the kiosks is scheduled to run every 10 minutes on all kiosks, so that if there is a temporary error on a kiosk, the configuration is simply attempted again on the next scheduled run.
Highlight: Including Virtualization in Validated Patterns
As discussed above, another key element of this pattern is the introduction of OpenShift Virtualization to model the Edge environment with kiosks. The pattern installs the OpenShift Virtualization operator, configures it, and provisions a metal node in order to run the virtual machines. It is possible to run the VMs with software emulation instead of hardware acceleration, but the resulting VMs have terrible performance.
The virtual machines we build as part of this pattern are x86_64 RHEL machines, but it should be straightforward to extend this pattern to model other architectures, or other operating systems or versions.
The chart used to define the virtual machines is designed to be open and flexible - replacing the values.yaml file in the chart’s directory will allow you to define different kinds of virtual machine sets; the chart may give you some ideas on how to manage virtual machines under OpenShift in a GitOps way.
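To give a sense of what such overrides could look like, here is a hypothetical values.yaml fragment describing a set of kiosk VMs (the key names and sizing are illustrative only, not the chart’s actual schema):
# Hypothetical values.yaml fragment - key names are illustrative,
# not the chart's actual schema.
vms:
  kiosk:
    count: 2            # number of kiosk VMs to create
    memory: 4Gi
    cores: 2
    os: rhel8
    sshPubKey: ""       # injected from the secret store at deploy time
    workloadTag: kiosk  # used to group hosts in the AAP dynamic inventory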
Highlight: Including RHEL in Validated Patterns
One of the highlights of this pattern is its use of RHEL. There are a number of interesting developments in RHEL that we have been working on, and we expect to highlight more of these in future patterns. We expect that this pattern will be the basis for future patterns that include RHEL, Ansible, and/or Virtualization.
Where do we go from here?
We believe this pattern breaks some new and interesting ground in bringing Ansible, Virtualization, and RHEL to the validated pattern framework. Like all of our patterns, this pattern is Open Source, and we encourage you to use it, tinker with it, and submit your ideas, changes and fixes.
Documentation for how to install the pattern is here, where there are detailed installation instructions, more technical details on the different components in the pattern (especially the use of AAP and OpenShift Virtualization), and some ideas for customization.
Push or Pull? Strategies for Large Scale Technology Change Management on the Edge
What is Technology Change Management?
There is a segment of the technology industry dedicated to keeping track of what is changing in an IT environment, when, and how. These include systems like ServiceNow, Remedy, JIRA, and others. This is definitely a kind of change management, and these systems are important - but the focus of this blog post is not how the work of change management is tracked, but the actual means and strategy of making those changes.
Edge technology solutions involve hardware and software, and they all require some kind of technology maintenance. Our focus here is on software maintenance - these tasks can involve updating applications, patching underlying operating systems, and performing remote administration such as restarting applications and services. Coordinating change for a complex application can be daunting even for a centralized datacenter application - but on the Edge, where we have hundreds, thousands, maybe millions of devices and application instances to keep track of, it is harder.
What Do You Mean, Push or Pull?
In this article, we are going to discuss two primary strategies for systems that are responsible for making and recording changes on other systems. We are making the assumption here that the system in question is making changes and also recording the results of those changes for later review or troubleshooting. Highly regulated organizations often have audit requirements to show that their financial statements are accurate, and this means demonstrating that there are business processes in place to authorize and schedule changes.
In this context, when we say “Push”, we mean that a hub or centralized system originates and makes changes on other systems. The key differentiator is that the “Push” system stays in contact with the managed system throughout the process of the change. The “Push” system may also keep a record of changes made for its own purposes.
In a “Pull” system, on the other hand, the centralized system waits for managed systems to connect to it to get their configuration instructions. “Pull” systems often have agents that use a dedicated protocol to define changes. There may be several steps in a “Pull” conversation, as defined by the system. A “Pull” system might also be able to cache and apply a previous configuration. The key differentiator of a “Pull” system is that it does not need to maintain constant contact with the central system to do its work.
Push and Pull, in this context, represent “strategies” for managing change. A given system can have both “push” and “pull” aspects; for example, Ansible has an ansible-pull command which clearly works in a “pull” mode, even though most people recognize Ansible as being primarily a push-based system. While specific systems or products may be mentioned, the goal is not to evaluate the systems themselves, but to talk about the differences and the relative merits and pitfalls of push and pull as strategies for managing change.
What do you mean by “Large”?
The term “Large” is quoted because it can mean different things in different contexts. For change management systems, it can include, but is not necessarily limited to:
- The count of individual systems managed
As an arbitrary number, a system that manages 10,000 instances could probably be considered “large” regardless of other considerations. But sheer numbers of managed systems are only one aspect in play here; you may still have a “large” problem with far fewer instances. The volume of managed systems imposes many constraints on the system that has to change and track them - records of changes have to be stored and indexed, and there has to be a way to represent the different desired configurations.
- The complexity of different configurations across those systems
The number of configurations represented across your fleet might be a better predictor for how “large” the problem is. It is easier to manage 10 configurations with 1,000 instances each than to manage 50 configurations with 100 instances each, for example.
- The organizational involvement in managing configurations
Fleets have a certain team overhead in managing them as they grow. Who decides what hardware gets deployed, and when? Who decides when new Operating System versions are rolled out? Who does testing? If there are several teams involved in these activities, the problem is almost certainly a “large” one.
- The geographic distribution of systems managed
Another aspect of complexity is how widely dispersed the fleet is. It is easier to manage 10,000 instances in one or even two locations than it is to manage 5 instances in each of 2,000 locations. Geographic distribution also includes operating in multiple legal jurisdictions, and possibly in multiple nations. These impose requirements of various kinds on systems, and thus also on the systems responsible for maintaining and changing them.
- It feels like a “large” problem to you
If none of the other criteria listed so far apply to you, but it still feels like a “large” problem to you, it probably is one.
Managing Change - An Ongoing Challenge
In a perfect world, we could deploy technology solutions that maintain themselves. We would not need to update them; they would know how to update themselves. They could coordinate outage times, and could ensure that they can run successfully on a proposed platform. They would know about their own security vulnerabilities, and know how to fix them. Best of all, they could take requests from their users, turn those into systematic improvements, and deploy them without any other interaction.
Are you laughing yet? Most of the work of IT administrators is done in one of the areas listed above. Even very competent IT organizations sometimes struggle balancing some of these priorities.
What is the Edge?
One aspect of the current computing environment is the prominence of Edge computing, which places its focus on the devices and applications that use computing resources far from central clouds and data centers - these compute resources run in retail stores, in pharmacies, hospitals, and warehouses; they run in cars and sometimes even spacecraft. In Edge environments, compute resources have to deal with intermittent network connectivity, if they have network connectivity at all. Sometimes, groups of devices have access to local server resources - as might be the case in a large retail store, or in a factory or warehouse - but sometimes the closest “servers” are in the cloud and centralized.
One interesting aspect of Edge deployments is that there are often many copies of effectively the same deployment. A large retail chain might deploy the same set of applications to each of its stores, and within each store there may be many devices of a single type. Think of the number of personal computers or cash registers you can see at a large retail store: it would not be unusual to have ten PCs and twenty cash registers per store, and a large chain could have hundreds or even thousands of locations. Newer technologies, like Internet of Things deployments, require an even higher degree of connectivity - the single retail store we are considering could have three hundred cameras to manage, which would need to be integrated into its IoT stack. And there could be hundreds, or thousands, of sites just like this one. The scope and scale of systems to manage can get daunting very quickly.
So, some of the defining qualities of Edge environments are scale (anyone operating some edge installations probably has a lot of edge installations, and the success of their business depends on operating more of them) and network limitations (whether there is a connection at all and, if so, how reliable it is; bandwidth - how much data it can transfer at a time; and latency - how long it takes to get where it is going). Making changes in these environments is challenging, because it means keeping track of large numbers of entities in an environment where our ability to contact those entities and verify their status may be limited, if it is present at all. But we still must make changes in those environments, because those solutions need maintenance - their platforms may need security updates, and their operators may want to update application features and functionality. That requires making changes to these devices, and that requires technology change management.
Common Consideration: Workload Capacity Management
Workload Capacity Management focuses on what is needed to manage the scale of deployments. With large Edge deployments, the work needs to get done, and that work needs to be replicated on every node in scope for the deployment - so the same job or configuration content may need to apply to hundreds, thousands, or more individual nodes. Since the control point (central or distributed) differs between push- and pull-based systems, so does how they distribute the work needed to deliver changes: push-based systems must send the work out directly, while pull-based systems can potentially overwhelm a centralized system with a “thundering herd.”
Common Consideration: Inventory/Data Aggregation
Inventory and Data Aggregation are crucial considerations for both kinds of systems. Inventory is the starting point for determining what systems are in scope for a given unit of work; data aggregation matters because as units of work get done, we often need proof or validation that the work was done. With large numbers of edge nodes, there are certain to be exceptions, and the ability to keep track of where the work stands is crucial to completing the task.
Common Consideration: Authentication/Authorization
Since these systems are responsible for making changes on devices, how they interact with authentication and authorization systems is an important aspect of how they work. Is the user who the user claims to be? Which users are allowed to make which changes? Authentication and authorization are things we must consider for systems that make changes. Additionally, many large organizations have additional technology requirements for systems that can make changes to other systems.
Common Consideration: Dealing with Network Interruptions
In Edge deployments, network connectivity is by definition limited, unreliable, or non-existent. There are differences in how respective types of systems can detect and behave in the presence of network interruptions.
Common Consideration: Eventual Consistency / Idempotence
Regardless of whether a system is push-based or pull-based, it is valuable and useful for the configuration units managed by that system to be safe to apply and re-apply at will. This is the common meaning of the term idempotence. One strategy for minimizing the effect of many kinds of problems in large-scale configuration management is writing content that is idempotent, that is, the effect of running the same content multiple times is the same as the effect of running it once. Practically speaking, this means that such systems make changes only when they need to, and do not have “side effects”. This makes it safe to run the same configuration content on the same device many times, so if it cannot be determined whether a device has received a particular configuration or not, the solution would be to apply the configuration to it, and the devices should then be in the desired, known state when the configuration is done.
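As a quick illustration of the difference (a hypothetical pair of Ansible tasks, not taken from any pattern), compare a non-idempotent shell task with an idempotent module-based one:
# Non-idempotent: appends the line every time the play runs.
- name: Add registry host (non-idempotent)
  ansible.builtin.shell: echo "10.0.0.5 registry.example.com" >> /etc/hosts

# Idempotent: converges on the desired line and reports a change only
# when something actually had to change.
- name: Add registry host (idempotent)
  ansible.builtin.lineinfile:
    path: /etc/hosts
    line: "10.0.0.5 registry.example.com"
    state: present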
Approach 1: Push Strategy
The first approach we will consider is the “push” strategy. In a “push” strategy, the centralized change management system itself reaches out to managed devices and triggers updates in some way. This could involve making an API call to a device, logging in to a device through SSH, or using a dedicated client/server protocol. Red Hat’s Ansible operates as a push-based system, where a central console reaches out to devices and manages them.
Push Consideration: Workload Capacity Management
A push based system has much more control over how it parcels out configuration workload, since it is in control of how configuration workloads are driven. It can more easily perceive its own load state, and potentially “back off” or “throttle” if it is processing too much work too quickly - or increase it if the scope of the desired change is smaller than the total designed capacity of the system. It is easier to influence the “rate of change” on a push-based system for this reason.
Push Consideration: Dealing with Network Interruptions
Push-based systems are at a disadvantage when dealing with network interruptions and limitations. The most common network failure scenarios are ambiguous: if an attempt to reach an individual device fails, is that because there was a problem with the device, or a problem with the network path used to reach it? The push-based system can only know things about the devices it manages when it is told. An additional potential problem with network interruption is that a device can successfully apply a unit of configuration change but fail to report that because of a network problem - the report is dropped because of a network path outage, for example, or the central collection infrastructure was overwhelmed. In such a situation, it is best to have the option to re-apply the configuration, and for that it is best to have confidence that the configuration will not have undesired side effects and will only make the changes it needs to make.
Approach 2: Pull Strategy
The second approach we will consider is the “pull” strategy. The key difference in the “pull” strategy is that devices themselves initiate communication with the central management system. They can do this by making a request to the management system (which can be a notification, API call, or some other mechanism). That is to say - the central management system “waits” for check-ins from the managed devices. Client-server Puppet is a pull-based system, in which managed devices reach out to server endpoints, which give the devices instructions on what configurations to apply to themselves. Puppet also has options for operating in a push-based model; historically this could be done through puppet kick, mcollective orchestration, application orchestration, or bolt.
Pull Consideration: Workload Capacity Management
Pull-based systems have some challenges in regard to workload capacity for the pieces that need to be centralized (particularly reporting and inventory functions). The reason for this is that the devices managed will not have a direct source of information about the load level of centralized infrastructure, unless this is provided by an API; some load balancing schemes can do this in a rudimentary way by directing new requests to an instance via a “least connection” balancing scheme. Large deployments typically have to design a system to stagger check-ins to ensure the system does not get overwhelmed by incoming requests.
Pull Consideration: Authentication/Authorization
Pull-based systems typically have agents that run on the systems that are managed, and as such are simpler to operate from an authentication and authorization standpoint. Agents on devices can often be given administrative privilege, and the practical authentication/authorization problems have to do with access to the central management console, and the ability to change the configurations distributed or see the inventory and reports of attempts to configure devices.
Pull Consideration: Dealing with Network Interruptions
Pull-based systems have a distinct advantage when they encounter network interruptions. While it is in no way safe to assume that a managed device is still present or relevant from the standpoint of central infrastructure, it is almost always safe for a device, when it finds it cannot connect to central infrastructure, to assume that it is experiencing a temporary network outage and simply retry the operation later. Care must be taken, especially in large deployments, not to overwhelm central infrastructure with requests. Additionally, we must remember that since network interruption can occur at any time on the edge, the operation we are interested in may indeed have completed successfully, but the device was simply unable to report this to us. As was the case for push-based systems, the best cure for this is to ensure that content can be safely re-applied as needed or desired.
Conclusions
Push and Pull based systems have different scaling challenges as they grow
Push and Pull-based systems have different tradeoffs. It can be easier to manage a push-based system for smaller numbers of managed devices; some of the challenges of both styles clearly increase as systems grow to multiple thousands of nodes.
Meanwhile, both push and pull based systems, as a practical matter, have to make sense and be usable for small installations as well as large, and grow and scale as smoothly as possible. Many installations will never face some or maybe even any of these challenges - and systems of both types must be easy to understand and learn, or else they will not be used.
Pull-based systems are better for Edge uses despite scaling challenges
Pull based systems can deal better with the network problems that are inherent with edge devices. When connectivity to central infrastructure is unreliable, pull-based systems can still operate. Pull-based systems can safely assume that a network partition is temporary, and thus do not suffer from the inherent ambiguity of “could not reach target system” kinds of errors.
Idempotence matters more than whether a system is push or pull based
The fix for nearly all of the operational problems in large-scale configuration management is to be able to apply the same configuration to the same device multiple times and expect the same result. This takes discipline and effort, but that effort pays off well in the end.
To help scale, introduce a messaging or queuing layer to hold on to data in flight if possible
Many of the operational considerations are related to limited network connectivity or overtaxing centralized infrastructure. Both of these problems can be mitigated significantly by introducing a messaging or queuing layer in the configuration management system to hold on to reports, results, and inventory updates until the system can confirm receipt and processing of those elements.
Using clusterGroups in the Validated Patterns Framework
What Is a clusterGroup?
The Validated Patterns framework defines itself in terms of clusterGroups. A clusterGroup is a set of one or more Kubernetes clusters that are managed to have the same deployments (that is, the same subscriptions, applications, namespaces, and projects) applied to each member of the group. Two different members of the same clusterGroup will differ in cluster name and in other cluster-specific details (such as PKI and tokens), but will have the same subscriptions and applications as the other members.
Essentially, a clusterGroup is an abstract definition of what the pattern will install on each respective cluster in the pattern. It is possible to install multiple clusterGroups on the same cluster, as we do (for example) in Industrial Edge where we have both datacenter and factory clusterGroups, and the factory clusterGroup is installed on the datacenter cluster by default.
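To make this more concrete, here is a minimal, illustrative sketch of what a hub clusterGroup definition can look like in a pattern’s values-hub.yaml; the specific namespaces, subscriptions, and applications shown are placeholders rather than the contents of any particular pattern:

```yaml
# values-hub.yaml (illustrative sketch)
clusterGroup:
  name: hub
  isHubCluster: true
  namespaces:
    - open-cluster-management
    - vault
    - golang-external-secrets
  subscriptions:
    acm:
      name: advanced-cluster-management
      namespace: open-cluster-management
  applications:
    acm:
      name: acm
      namespace: open-cluster-management
      project: hub
      path: common/acm
```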
Single-Cluster and Multi-Cluster Patterns
Because Validated Patterns started as an Edge initiative, we designed the notion of multi-cluster patterns into the framework from the beginning. The first Validated Pattern, Industrial Edge, models a central data lake and an optional remote factory cluster. The factory cluster does not need the CI or test system, nor the central data lake. Since we have different configuration needs on the two types of cluster, we define them as different clusterGroups.
Other patterns (so far) require only a single cluster, since they model their Edge requirements in different ways. Medical Diagnosis brings a pre-provided set of data (which would ordinarily come in from Edge clusters). Ansible Edge GitOps has RHEL instances as its Edge environment, not OpenShift clusters - and it runs its VMs through OpenShift Virtualization, so it only uses the one cluster. In these cases, it is simplest to designate the single cluster as a Hub cluster.
Uses for a Hub Cluster in a Multi-Cluster Deployment
But in multi-cluster deployments, it is often helpful to define a hierarchy. Even if there are exactly two clusters as part of a multi-cluster pattern, it may be helpful to define one as the “hub”, in case the pattern grows to more clusters later. Modern architectures can scale to many instances, in some cases thousands, and there are certain responsibilities that are convenient to define as “central” or “hub” functions. Some examples that exist in the Framework and its applications so far:
- Configuration Management (via Advanced Cluster Management, or by Ansible Automation Platform)
- Data Aggregation (via AMQ Streams)
- Continuous Integration/Continuous Delivery Pipelines (via OpenShift Pipelines)
- Data Visualization (via Grafana)
- Secrets storage and maintenance (via Vault and the External Secrets Operator)
Some other potential uses for the “hub” role or function include:
- General “control plane” functions
- Metrics aggregation
The hub role defines a place in the architecture to run these vital functions, while reserving capacity on the Edge for data gathering and pre-processing functions.
But this naturally brings up a question - what are the different roles we have considered in the Validated Patterns framework, and how should they be used?
Types of clusterGroup
While we describe the Validated Patterns framework as “opinionated”, we do not want to make it overly constricting. The framework currently considers two types of clusterGroups: Hub and non-hub. We consider “Hub” primarily as a role, but it is also used as the default name for the Hub clusterGroup. There are two important aspects of a Hub clusterGroup:
- It is expected to be a singleton (there is expected to be a single hub cluster in the hub clusterGroup)
- There is expected to be a single Hub clusterGroup in a given pattern
Generally, we expect that non-Hub clusterGroups will be Edge clusters, but this is not essential. A clusterGroup should have at least one cluster that it will apply to (otherwise, why define one?) and can have multiple clusters. The framework sets ArgoCD’s ignoreMissingValueFiles setting to true unconditionally, and the framework also provides an extraValueFiles variable, which can define multiple optional additional values files on a per-clusterGroup basis.
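For example, a clusterGroup could reference optional overrides through extraValueFiles; this is an illustrative sketch, and the file paths shown are hypothetical:

```yaml
# Illustrative sketch: optional, per-clusterGroup values files.
# Because ignoreMissingValueFiles is set, files that do not exist are simply skipped.
clusterGroup:
  name: hub
  extraValueFiles:
    - /overrides/values-AWS.yaml      # hypothetical platform-specific override
    - /overrides/values-region1.yaml  # hypothetical site-specific override
```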
How the Hierarchy Works - the clustergroup Chart
The clustergroup chart begins by looking in the pattern’s values-global.yaml for the main.clusterGroupName value. This value is then used to compute the next values file to process - if the value of that variable is hub, then the application(s) that are created will use both the values-global.yaml and values-hub.yaml files in the root of the pattern repository. The clusterGroup structure will then be parsed for its name and isHubCluster values, and its namespaces, subscriptions, projects, and applications will be applied. Any managedClusterGroups (on the hub cluster) will be defined in terms of Advanced Cluster Management (as is done in Industrial Edge). Because the factory match expression is very general (vendor In OpenShift), it will match the hub cluster and any cluster that is joined to the hub cluster’s ACM instance.
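Putting those pieces together, an illustrative (not pattern-specific) pair of values files might look like the following; the managed group name and selector are examples only:

```yaml
# values-global.yaml (illustrative sketch)
main:
  clusterGroupName: hub    # tells the clustergroup chart which values-<name>.yaml to load next

# values-hub.yaml (illustrative sketch)
clusterGroup:
  name: hub
  isHubCluster: true
  managedClusterGroups:
    factory:
      name: factory
      # Example ACM placement: a deliberately broad match expression like this
      # selects every OpenShift cluster joined to the hub's ACM instance.
      clusterSelector:
        matchExpressions:
          - key: vendor
            operator: In
            values:
              - OpenShift
```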
Frequently Asked Questions (FAQs)
I have a single-cluster Pattern. Do I need to call my hub cluster “hub”?
You can, but you do not have to. If you want to call it something other than hub:
- Define a different main.clusterGroupName value in values-global.yaml
- Create the values-yourhubgroupname.yaml file in the root of your pattern repository.
- Make sure the clusterGroup.name variable in values-yourhubgroupname.yaml matches your intended hub group.
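As an illustrative sketch (the group name here is made up), renaming the hub clusterGroup to datacenter would look roughly like this:

```yaml
# values-global.yaml (illustrative)
main:
  clusterGroupName: datacenter   # hypothetical name used instead of "hub"

# values-datacenter.yaml (illustrative) - lives in the root of the pattern repository
clusterGroup:
  name: datacenter
  isHubCluster: true
```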
I have a multi-cluster Pattern. All of my Edge clusters will have the same configuration. Do I need multiple clusterGroups to model it?
No! This is the exact scenario we had in mind when we designed clusterGroups. Just add multiple clusters with the same criteria you defined to the ACM instance on your hub cluster, and each cluster you add will get the same subscriptions and applications.
I have a multi-cluster Pattern. How do I decide if I need multiple clusterGroups to model it?
Do you have different subscriptions or applications that you plan to use on your different clusterGroups? If so, you should define different clusterGroups and define them as managedClusterGroups for your hub cluster.
QUICKLINKS
QUICKLINKS
CONTRIBUTE
As you can see, the subdomain property was replaced with the host property, but the host now includes the ingress domain for the cluster. Why is this important? I can now apply this Route to any OpenShift cluster without worrying about its ingress domain, since it will get appended automatically.
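For reference, a minimal sketch of a Route that relies on spec.subdomain rather than a hard-coded host might look like this (the names are illustrative):

```yaml
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: hello-world          # illustrative name
spec:
  subdomain: hello-world     # the router appends the cluster's ingress domain
  to:
    kind: Service
    name: hello-world        # illustrative backing service
```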
There is currently a known issue when using the subdomain property: the name given is not published with the correct template, and routes are instead published with the default ‘${name}-${namespace}.subdomain.local’. This has been fixed, and the fix is available in OpenShift 4.11.
Conclusion
Using the subdomain property when defining a route is very useful if you are deploying your application to different clusters, because it means you do not have to hard-code the ingress domain for every cluster.
If you have any questions or want to see what we are working on please feel free to visit our Validated Patterns site. If you are excited or intrigued by what you see here we’d love to hear your thoughts and ideas! Try the patterns contained in our Validated Patterns Repo. We will review your pull requests to our pattern repositories.
In the ArgoCD interface, we see that the rollout is paused, just like with blue-green.
But what about our application - we said we only want 20% of the traffic to go to the new app:
That is awesome! So now let’s promote the application and see what happens - the expectation is that it will incrementally update the percentage of connections to the new application until completely promoted.
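For context, the canary behavior described here is driven by the Rollout’s strategy. A minimal illustrative sketch (not the exact manifest used in this demo; the names and image are placeholders) looks like this:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: demo-app                           # illustrative name
spec:
  replicas: 5
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: demo-app
          image: quay.io/example/demo-app:v2   # hypothetical image
  strategy:
    canary:
      steps:
        - setWeight: 20        # send 20% of the traffic to the new version
        - pause: {}            # wait here until the rollout is promoted
        - setWeight: 50
        - pause: {duration: 30s}
        - setWeight: 100
```

The indefinite pause after setWeight: 20 is what produces the paused state seen in the ArgoCD interface; promotion then walks through the remaining steps.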
Let’s take a look at what it looks like using the Argo Rollouts plugin.
Conclusion
Argo Rollouts makes progressive delivery of our applications super easy. Whether you want to deploy using blue-green or the more advanced canary rollout is up to you. The canary rollout is very powerful and, as we saw, gives us the ultimate control, with insights and flexibility to deploy applications. There is so much more that Argo Rollouts can do - this demo barely scratches the surface! Keep an eye out for Argo Rollouts as part of OpenShift GitOps in ‘23.
Multicluster DevSecOps
Software supply chain security: The why and what
Today more and more organizations are turning to agile development models and DevOps. With this approach, development organizations can deliver more enhancements and bug fixes in a timely manner, providing more value to their customers. While DevOps can include security earlier in the software lifecycle, in practice this has not always been the case. DevSecOps explicitly calls on organizations to pay attention to security best practices and to automate them or “Shift Left” as much as possible.
DevSecOps means baking application and infrastructure security in from the start. To be successful, organizations must look both upstream, at where their dependencies come from, and at how their components integrate together in the production environment. It also means automating security gates to keep the DevOps workflow from slowing down. As we learn from experience, we codify that learning into the automation process.
A successful DevSecOps based supply chain must consider four areas of concern:
- Secure developer dependencies
- Secure code development
- Secure deployment of resources into a secure environment
- Software Bill of Materials (SBOM)
Within each of these areas there are also many best practices to be applied particularly in Cloud Native development using container technology.
- Scanning new development code for potential vulnerabilities
- Scanning dependent images that new code will be layered upon
- Attesting to the veracity of images using image signing
- Scanning images for known CVEs
- Scanning the environment for potential networking vulnerabilities
- Scanning for misconfiguration of images and other assets
- Ensuring consistent automated deployment of secure configuration using GitOps
- Continuous upgrading of security policies from both trusted third parties and experience
This pattern deploys several Red Hat Products:
- Red Hat OpenShift Container Platform (Kubernetes platform)
- Red Hat OpenShift GitOps (ArgoCD)
- Red Hat Advanced Cluster Management (Open Cluster Management)
- Red Hat OpenShift Pipelines (Tekton)
- Red Hat Quay (OCI image registry with security features enabled)
- Red Hat OpenShift Data Foundation (highly available storage)
- Red Hat Advanced Cluster Security (scanning and monitoring)
Highlight: Multicluster
While all of the components can be deployed on a single cluster, which makes for a simple demo, this pattern deploys a real world architecture where the central management, development environments, and production are all deployed on different clusters. This ensures that the pattern is structured for real-world deployments, with all the functionality needed to make such an architecture work already built-in, so that pattern consumers can concentrate on what is being delivered, rather than how.
The heavy lifting in the pattern includes a great deal of integration between components, especially those spanning across clusters:
- Deployment of Quay Enterprise with OpenShift Data Foundation as a storage backend
- Deployment of the Quay Bridge operator, configured to connect with Quay Enterprise on the hub cluster
- Deployment of ACS on managed nodes with integration back to ACS central on the hub
- Deployment of a secure pipeline with scanning and signing tools, including ACS
Highlight: DevSecOps with Pipelines
“OpenShift Pipelines makes CI/CD concepts such as a ‘pipeline’, a ’task’, a ‘step’ natively instantiatable [sic] so it can use the scalability, security, ease of deployment capabilities of Kubernetes.” (Introducing OpenShift Pipelines). The pattern consumes many of the OpenShift Pipelines out-of-the-box tasks, but also defines new tasks for scanning and signing and includes them in enhanced DevSecOps pipelines.
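To give a feel for the shape of such a pipeline, here is a simplified, hypothetical sketch rather than the pattern’s actual pipeline definition; the scan and sign task names (code-scan, acs-image-scan) are placeholders, while git-clone and buildah are common catalog tasks:

```yaml
apiVersion: tekton.dev/v1beta1
kind: Pipeline
metadata:
  name: devsecops-build-and-scan      # hypothetical pipeline name
spec:
  params:
    - name: git-url
      type: string
    - name: image
      type: string
  workspaces:
    - name: source
  tasks:
    - name: clone
      taskRef:
        name: git-clone               # common catalog task
      params:
        - name: url
          value: $(params.git-url)
      workspaces:
        - name: output
          workspace: source
    - name: static-analysis
      runAfter: [clone]
      taskRef:
        name: code-scan               # placeholder for a SonarQube-style scan task
      workspaces:
        - name: source
          workspace: source
    - name: build-image
      runAfter: [static-analysis]
      taskRef:
        name: buildah                 # common catalog task
      params:
        - name: IMAGE
          value: $(params.image)
      workspaces:
        - name: source
          workspace: source
    - name: image-scan
      runAfter: [build-image]
      taskRef:
        name: acs-image-scan          # placeholder for an ACS image-scan task
      params:
        - name: image
          value: $(params.image)
```

Image signing itself is typically handled outside the pipeline definition: Tekton Chains observes completed TaskRuns and signs their results and produced images.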
While these pipelines are included in the pattern, the pattern also implements the Pipelines-as-Code feature, where the pipeline can be part of the application code repository. “This allows developers to ship their CI/CD pipelines within the same git repository as their application, making it easier to keep both of them in sync in terms of release updates.”
Highlight: Using the CI Pipeline to provide supply chain security
This pattern includes some other technologies in the development CI pipeline, including Cosign, a Sigstore project, implemented with Tekton Chains. Cosign supports container signing, verification, and storage in an OCI registry. It enables consumers to sign their pipeline resources and images and to share the attestation files, giving downstream consumers assurance that they are consuming a trusted artifact.
We also implement open source tools like SonarQube for static code analysis, Nexus for securely storing build artifacts in-cluster, and an open source reports application that is used to upload and present the reports from the security pipeline.
Not using these tools in your environment? That’s not a problem. The pattern framework is flexible. Organizations using different services can swap out what’s in the pattern with their software of choice to fit their environment.
Where do we go from here?
This pattern provides a complete deployment solution for Multicluster DevSecOps that can be used as part of a supply chain deployment pattern across different industries.
Documentation for how to install the pattern is here, where there are detailed installation instructions and more technical details on the different components in the pattern.
Introducing a Simplified Tier Naming Scheme
The efforts here started off with two different classes of patterns, “Community” and “Validated”; however, this terminology dates back to before the effort had arrived at “Validated Patterns” as the official project name.
Having standardized on the use of “Validated Patterns” to refer to the overall initiative, it became confusing to refer to “Community” Validated Patterns and “Validated” Validated Patterns.
In addressing that confusion, we took the opportunity to design a new set of tiers.
Generally speaking, Sandbox aligns with the old Community tier, and Maintained aligns with the old Validated tier. However, some of the requirements associated with those previous tiers were structured around our bandwidth and our priorities. In revisiting the tiers, we have removed many of those value judgements and shifted the emphasis to where the bar is, rather than who is doing the work.
With a new set of tier names and a new set of requirements, we are going to start all patterns off in the Sandbox tier rather than grandfather them into the new names. Don’t panic: the code behind your favorite patterns has not suddenly regressed. We’re using the opportunity to work out any kinks in the promotion process and ensure patterns are classified consistently.
We expect to have finished re-reviewing the previous validated tier patterns by the end of 2023 and the previous community tier patterns by the end of Q1 2024.
Pattern Testing on Nutanix
I am pleased to announce the addition of the Nutanix platform to our CI dashboard for the Multi-cloud GitOps pattern.
Pattern consumers can now rest assured that the core pattern functionality will continue to work for deployments of OpenShift on the Nutanix platform.
This would not be possible without the wonderful co-operation of Nutanix, who are doing all the work of deploying OpenShift and our pattern on their platform, executing the tests, and reporting the results.
To facilitate this, the patterns team has begun the process of open sourcing the downstream tests for all our patterns. Soon all tests will live alongside the patterns they target, allowing them to be easily executed and/or improved by pattern consumers and platform owners.
Our thanks once again to Nutanix.
Background on pattern development
Introduction
This section provides details on how to create a new pattern using the validated patterns framework. Creating a new pattern might start from scratch or it may start from an existing deployment that would benefit from a repeatable framework based on GitOps.
This introduction explains some of the framework design decisions and why they were chosen. There are some high-level concepts that are required for the framework. While those concepts can be implemented using a variety of open source projects, this framework is prescriptive and names both the project and the (downstream) product that was used. For example, for development builds we use Tekton (the project) and specifically OpenShift Pipelines (the product).
The framework uses popular Cloud Native Computing Foundation (CNCF) projects as much as possible. The CNCF landscape contains many projects that solve the same or similar problem. The validated patterns effort has chosen specific projects but it is not unreasonable for users to switch out one project for another. (See more on Operators below).
There is no desire to replicate efforts already in the CNCF. If a new open source project comes out of this framework, the plan would be to contribute it to the CNCF.
Who is a pattern developer?
Many enterprise-class Cloud Native applications are complex and require many different application services integrated together. Organizations can learn from each other how to create robust, scalable, and maintainable systems. When you find a pattern that seems to work, it makes sense to promote those best practices to others so that they do not have to repeat the many failures you probably made while getting to your killer pattern.
In the world of DevOps (including DevSecOps and GitOps), teams should include personnel from development, operations, security, and architects. What makes DevOps work is the collaboration of all these IT personnel, the business owners, and others. As DevOps practices move through your organization, best practices are shared and standards evolve.
This validated patterns framework has evolved since it was started in 2019, and it will likely continue to evolve. What we learned is that there are some common concepts that need to be addressed once you want to generalize your organization's framework.
Therefore, the goal is that developers, operators, security teams, and architects will use this framework to get a secure and repeatable day-one deployment mechanism and maintenance automation for day-two operations.
A common platform
One of the most important goals of this framework is to provide consistency across any cloud provider - public or private. Public cloud providers each have Kubernetes distributions. While they keep up with the Kubernetes release cycle, they are not always running on the same version. Furthermore, each cloud provider has their own sets of services that developers often consume. So while you could automate the handling for each of the cloud providers, the framework utilizes one Kubernetes distribution that runs on public or private clouds - the hybrid and/or multi cloud model.
The framework depends on Red Hat OpenShift Container Platform (OCP). Once you have deployed Red Hat OCP wherever you wish to deploy your cloud native application pattern, then the framework can deploy on that platform in a few easy steps.
Containment beyond containers
If you are reading this, chances are you are already familiar with Linux containers. But in the Cloud Native environment, there is more to containers than Linux containers.
Containers
Containers allow you to encapsulate your program/process and all its dependencies in one package called a container image. The container runtime starts an instance of this container using only the Linux kernel and the directory structure, with program and dependencies, provided in the container image. This ensures that the program is running isolated from any other packages, programs, or files loaded on the host system.
Kubernetes, and the Cloud Native community of services, use Linux containers as their basic building block.
Operators
While Linux containers provide an incredibly useful way to isolate the dependencies for an application or application service, containers also require some lifecycle management. For example, at start-up a container may need to set up access to networks or extra storage. This type of setup usually happens with a human operator deciding how the container will connect to networks or host storage. The operator may also have to do routine maintenance. For example, if the container contains a database, the human operator may need to do a backup or routine scrubbing of the database.
Kubernetes Operators are an extension to Kubernetes "that make sue of custom resources to manage applications and their components." I.e. it provides an extra layer of encapsulation on top of containers that packages up some operation automation with the container. It puts what the human operator would do into an Operator pattern for the service or set of services.
Many software providers and vendors have created Operators to manage their application or service lifecycle. Red Hat OpenShift provides a catalog of certified Operators that application developers can consume as part of their overall application. The validated patterns framework makes use of these certified Operators as much as possible. Having a common platform like Red Hat OpenShift helps reduce risk by using certified Operators.
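As an illustration of what consuming a certified Operator looks like, the sketch below applies an OLM Subscription for the OpenShift GitOps operator; the channel shown is only an example, so check the catalog entry for the correct value:

[source,yaml]
----
# Sketch of an OLM Subscription for a certified Operator.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-gitops-operator
  namespace: openshift-operators
spec:
  channel: latest            # example channel; check the catalog entry
  name: openshift-gitops-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
----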
Validated patterns
Assembling Operators into a common pattern provides another layer of encapsulation. As with an Operator, where the developer can take advantage of the best practices of an experienced human operator, a validated pattern provides a way of taking advantage of best practices for deploying Operators and other assets for a particular type of solution. Rather than starting from scratch to figure out how to deploy and manage a complex set of integrated and dependent containerized services, a developer can take a validated pattern and know that a lot of experience has been put into it.
A validated pattern has been tested and continues to be tested as the lifecycles of its individual parts (Operators) change through release cycles. Red Hat’s Quality Engineering team provides Continuous Integration of the pattern for new releases of Red Hat products (Operators).
The validated patterns framework takes advantage of automation technology. It uses Cloud Native automation technology as much as possible. Occasionally the framework resorts to some scripts in order to get a pattern up and running faster.
Automation has many layers
As mentioned above, gaining consistency and robustness for deploying complex Cloud Native applications requires automation. Many Kubernetes distributions, including OpenShift, provide excellent user interfaces for deploying and managing applications, but these are mostly useful during development, or for debugging when things go wrong. Being able to consistently deploy complex applications is critical.
But which automation tool should be used? Or which automation tools, plural? During the development of the validated patterns framework we learned important lessons on the different areas of automation.
Automation for building application code
When developing container based Cloud Native applications, a developer needs to build executable code and create a new container image for deployment into their Kubernetes test environment. Once tested, that container image needs to be moved through the continuous integration and continuous deployment (CI/CD) pipeline until it ends up in production. Tekton is a Cloud Native CI/CD project that is built for the hybrid cloud. OpenShift Pipelines is a Red Hat product based on Tekton.
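As a minimal, illustrative sketch (not taken from any particular pattern), a Tekton Task wraps a containerized build step that OpenShift Pipelines can chain into a pipeline; the image and script below are placeholders:

[source,yaml]
----
# Illustrative Tekton Task; image and script are placeholders.
apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: build-image
spec:
  params:
    - name: IMAGE
      type: string
  steps:
    - name: build
      image: registry.access.redhat.com/ubi9/ubi-minimal
      script: |
        echo "Building and pushing $(params.IMAGE)..."
----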
Automation for application operations
There are two aspects to consider for operations when doing automation. First, you must be able to package up much of the configuration that is required for deploying Operators and pods. The validated patterns framework started with a project called Kustomize, which allows you to assemble complex deployment YAML to apply to your Kubernetes cluster. Kustomize is a powerful tool and almost achieved what we needed. However, it fell short when we needed to propagate variable data into our deployment YAML. Instead we chose Helm, because it provides templating and can therefore handle the injection of variable data into the deployment package. See more on templating here.
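To make the contrast concrete, here is a hypothetical Helm template fragment showing how variable data is injected; the value paths (.Values.appName, .Values.git.revision) are illustrative, not the framework's actual schema:

[source,yaml]
----
# Hypothetical Helm template fragment; value paths are illustrative.
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Values.appName }}-config
data:
  targetRevision: {{ .Values.git.revision | default "main" | quote }}
----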
The second aspect of automation for application operations deals with both workflow and GitOps. The validated patterns framework requires a workflow that deploys the various components of the complex application. Visibility into the success or failure of those application components is really important. After the initial deployment, it is important to roll out configuration changes in an automated way using a code repository. This is achieved using GitOps; that is, using a Git repository as the mechanism for configuration changes that trigger the automatic roll-out of those changes.
"Application definitions, configurations, and environments should be declarative and version controlled. Application deployment and lifecycle management should be automated, auditable, and easy to understand." - Argo CD project
OpenShift GitOps is based on the Argo CD project. It is a GitOps continuous delivery tool for Kubernetes.
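For example, an Argo CD Application resource declares what OpenShift GitOps should keep in sync; the repository URL, path, and namespaces below are placeholders:

[source,yaml]
----
# Illustrative Argo CD Application; repoURL, path, and namespaces are placeholders.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: config-demo
  namespace: openshift-gitops
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/multicloud-gitops.git
    targetRevision: main
    path: charts/all/config-demo
  destination:
    server: https://kubernetes.default.svc
    namespace: config-demo
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
----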
Secret handling
Validated patterns often depend on resources that require certificates or keys. These secrets need to be handled carefully. While it’s tempting to focus on just the deployment of a pattern and "handle security later", that’s a bad idea. In the spirit of DevSecOps, the validated patterns effort has decided to "shift security left"; that is, build security in early in the lifecycle.
When it comes to security, the approach requires patience and care to set up. There is no avoiding some manual steps, but validated patterns try to automate as much as possible while at the same time taking the lid off so developers can see what was done and what still needs to be done.
There are two approaches to secret handling with validated patterns: some of the validated patterns use configuration files (for now), while others, like Multicloud GitOps, use Vault. See Vault Setup for more information.
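With the Vault approach, secret values are typically synced into the cluster rather than committed to Git. A minimal sketch using the External Secrets Operator API is shown below; the ClusterSecretStore name (vault-backend) and the Vault path are placeholders:

[source,yaml]
----
# Sketch only: store name and Vault path are placeholders.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: config-demo-secret
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    name: config-demo-secret
  data:
    - secretKey: secret
      remoteRef:
        key: secret/hub/config-demo
        property: secret
----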
Policy
While many enterprise Cloud Native applications are open source, many of the products used require licenses or subscriptions. Policies help enforce license and subscription management and the channels needed to get access to those licenses or subscriptions.
Similarly, in multicloud deployments and complex edge deployments, policies can help define and select the correct GitOps workflows that need to be managed for various sites or clusters. For example, defining an OpenShift cluster as a "Factory" in the Industrial Edge validated pattern provides a simple trigger to roll out the entire Factory deployment. Policy is a powerful tool in automation.
Validated patterns use Red Hat Advanced Cluster Management for Kubernetes to control clusters and applications from a single console, with built-in security policies.
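As an illustration of label-driven placement (the names and labels here are hypothetical), an ACM PlacementRule can select every cluster labeled as a factory so that the corresponding GitOps workload rolls out automatically:

[source,yaml]
----
# Hypothetical PlacementRule selecting clusters labeled site=factory.
apiVersion: apps.open-cluster-management.io/v1
kind: PlacementRule
metadata:
  name: factory-clusters
  namespace: open-cluster-management
spec:
  clusterConditions:
    - type: ManagedClusterConditionAvailable
      status: "True"
  clusterSelector:
    matchLabels:
      site: factory
----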
For more information on attributes, see https://docs.asciidoctor.org/asciidoc/latest/key-concepts/#attributes.
Formatting
Use the following links to refer to AsciiDoc markup and syntax.
If you are graduating to AsciiDoc from Markdown, see the AsciiDoc to Markdown syntax comparison by example.
Formatting commands and code blocks
To enable syntax highlighting, use [source,terminal] for any terminal commands, such as oc commands and their outputs. For example:

[source,terminal]
----
$ oc get nodes
----

To enable syntax highlighting for a programming language, use [source,<language>] tags in the code block. For example:

[source,yaml]

[source,go]

[source,javascript]
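For instance, a YAML manifest would be wrapped like this; the ConfigMap shown is only a placeholder:

[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-config
data:
  key: value
----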
Size matters
If things are taking a long time to deploy, use the OpenShift console to check for memory and other potential capacity issues with the cluster. If you are running in a cloud, you may wish to increase the machine size. Check the sizing charts.
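From the command line, a quick way to spot resource pressure is to check node utilization and the machine sets backing the cluster; the second command assumes a cloud-provisioned cluster that uses the Machine API:

[source,terminal]
----
$ oc adm top nodes
$ oc get machinesets -n openshift-machine-api
----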
Add, Commit & Push
Steps:
- Use git status to see what has changed that you need to add to your commit, and add those changes using git add.
- Commit the changes to the branch.
- Push the branch to your fork.
~/git/multicloud-gitops> git status
~/git/multicloud-gitops> git add <the assets created/changed>
~/git/multicloud-gitops> git commit -m "Added Kafka using AMQ Streams operator and Helm charts"
~/git/multicloud-gitops> git push origin multicloud-gitops
Watch the OpenShift GitOps hub cluster UI and see Kafka get deployed
Let’s check the OpenShift console. It can take a bit of time for Argo CD to pick up the changes and deploy the assets.
Select Installed Operators. Is the AMQ Streams Operator deployed?
Select the Red Hat Integration - AMQ Streams operator.
Select the Kafka tab. Is there a new lab-cluster created?
Select the Kafka Topic tab. Is there a lab-streams topic created?
This is a very simple and minimal Kafka setup. It is likely you will need to add more manifests to the chart, but it is a good start.
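For reference, the kind of manifests such a chart might template are the Kafka and KafkaTopic custom resources. The sketch below reuses the lab-cluster and lab-streams names from the steps above; the exact schema can vary with the AMQ Streams (Strimzi) version:

[source,yaml]
----
# Sketch of the custom resources the chart might render; schema may differ by version.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: lab-cluster
spec:
  kafka:
    replicas: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
    storage:
      type: ephemeral
  zookeeper:
    replicas: 3
    storage:
      type: ephemeral
  entityOperator:
    topicOperator: {}
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: lab-streams
  labels:
    strimzi.io/cluster: lab-cluster
spec:
  partitions: 1
  replicas: 1
----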
Support Policies
Purpose
The purpose of this support policy is to define expectations for the time in which consumers and developers of the patterns framework can expect to receive assistance with their queries to the Validated Patterns team.
Continuous Integration (CI) Failures
Expected Response time: 5 business days
The Validated Patterns team will collectively triage any CI failures for patterns to which this policy applies each Monday. If necessary, a Jira issue will be created and tracked by the team.
Reporting Pattern Issues
Normally there is a path to support all products within a pattern. Either they are directly supported by the vendor (of which Red Hat may be one), or an enterprise version of that product exists.
All product issues should be directed to the vendor of that product.
For problems deploying patterns, unhealthy GitOps applications, or broken demos, please create an issue within the pattern’s GitHub repository, where it will be reviewed by the appropriate SME.
To ensure we can best help you, please provide the following information:
Environment Details (Machine Sizes, Specialized Network, Storage, Hardware)
The output of the error
Any changes that were made prior to the failure
Expected Outcome: What you thought should have happened
If you are unsure if your issue is product or pattern related, please reach out to the community using https://groups.google.com/g/validatedpatterns or by emailing validatedpatterns@googlegroups.com
Any pattern-based security issues found, such as hard-coded secrets, should be reported to validated-patterns-team@redhat.com. You can expect a response within 5 business days.
Pull Requests
Pull Requests against Patterns to which this policy applies will be reviewed by the appropriate SME or by the patterns team. We will endeavor to provide initial feedback within 10 business days, but ask for patience during busy periods, or if we happen to be on vacation.
Feature Enhancements
Create an issue, use the enhancement label, and be clear about what the desired functionality is and why it is necessary. For enhancements that could or should apply across multiple patterns, please file them against common. Use the following as a guide for creating your feature request:
Proposed title of the feature request
What is the nature and description of the request?
Why do you need / want this? (List business requirements here)
List any affected packages or components
Validated Patterns tiers
The different tiers of Validated Patterns are designed to facilitate ongoing maintenance, support, and testing effort for a pattern. To contribute to a pattern that suits your solution or to learn about onboarding your own pattern, understand the following pattern tiers.
Sandbox: A pattern categorized under the sandbox tier provides you with an entry point to onboard to Validated Patterns. The minimum requirement to qualify for the sandbox tier is to start with the patterns framework and include minimal documentation. The patterns in this tier might be in a work-in-progress state, and they might have been manually tested on a limited set of platforms.
Tested: A pattern categorized under the tested tier implies that the pattern might have been recently working on at least one recent version of Red Hat OpenShift Container Platform. Qualifying for this tier might require additional work for the pattern’s owner, who might be a partner or a motivated subject matter expert (SME). The patterns in this tier might have a defined business problem with a demonstration. The patterns might have a manual or automated test plan, which passes at least once for each new Red Hat OpenShift Container Platform minor version.
Maintained: A pattern categorized under the maintained tier implies that the pattern might have been functional on all currently supported extended update support (EUS) versions of Red Hat OpenShift Container Platform. Qualifying for this tier might require additional work for the pattern’s owner, who might be a partner or a motivated SME. The patterns in this tier might have a formal release process with patch releases. They might have continuous integration (CI) automation testing.
Overview of Validated Patterns
Validated Patterns are an advanced form of reference architectures that offer a streamlined approach to deploying complex business solutions. Validated Patterns are deployable, testable software artifacts with automated deployment, enhancing speed, reliability, and consistency across environments. These patterns are rigorously tested blueprints designed to meet specific business needs, reducing deployment risks.
Building on traditional architectures, Validated Patterns focus on customer solutions involving multiple Red Hat products. Successful deployments serve as the foundation for these patterns, which include example applications and the necessary open source projects. Users can easily change these patterns to fit their specific needs.
The creation process involves selecting customer use cases, validating the patterns with engineering teams, and developing GitOps-based automation. This automation supports Continuous Integration (CI) pipelines, allowing for proactive updates and maintenance as new product versions are released.
Relationship to reference architectures
Validated Patterns enhance reference architectures with automation and rigorous validation. Reference architectures provide a conceptual framework for building solutions. Validated Patterns take this further by offering a deployable software artifact that automates and optimizes the framework, ensuring consistent and efficient deployments. This approach allows businesses to implement complex solutions rapidly and with confidence, knowing that the patterns have been thoroughly tested and optimized for their use case.
The problem Validated Patterns solve
Deploying complex business solutions involves multiple steps, each of which, if done haphazardly, can introduce potential errors or inefficiencies. Validated Patterns address this by offering a pre-validated, automated deployment process. This reduces guesswork, minimizes manual intervention, and ensures faster, more reliable deployments. Organizations can then focus on strategic business objectives rather than deployment complexities.
Validated Patterns goals
The main goals of the Validated Patterns project include:
Consistency: Ensure that deployments are uniform across different environments, reducing variability and potential issues.
Reliability: Ensure that solutions are thoroughly tested and validated, minimizing the risk of errors during deployment.
Efficiency: Reduce the time and resources required for deployment by providing pre-configured, automated solutions.
Scalability: Enable businesses to deploy solutions that can scale to meet growing demands.
Automation: Streamline the deployment of complex business solutions by automating key processes. While automation is a key aspect of the framework, its primary role is to support and enhance the core goals of consistency, reliability, efficiency, and scalability.
Who should use Validated Patterns?
Validated Patterns are particularly suited for IT architects, advanced developers, and system administrators who are familiar with Kubernetes and the Red Hat OpenShift Container Platform. These patterns are ideal for those who need to deploy complex business solutions quickly and reliably across various environments. The framework incorporates advanced Cloud Native concepts and projects, such as OpenShift GitOps (ArgoCD), Advanced Cluster Management (Open Cluster Management), and OpenShift Pipelines (Tekton), making them especially beneficial for users familiar with these tools.
Examples of Use Cases:
Enterprise-Level Deployments: Organizations implementing large-scale, multi-tier applications can use Validated Patterns to ensure reliable and consistent deployments across all environments.
Cloud Migration: Companies transitioning their infrastructure to the cloud can use Validated Patterns to automate and streamline the migration process.
DevOps Pipelines: Teams relying on continuous integration and continuous deployment (CI/CD) pipelines can use Validated Patterns to automate the deployment of new features and updates, ensuring consistent and repeatable outcomes.
The Validated Patterns community and ecosystem
A vibrant community and ecosystem support and contribute to the ongoing development and refinement of Validated Patterns. This community-driven approach ensures that Validated Patterns stay current with the latest technological advancements and industry best practices. The ecosystem includes contributions from various industries and technology partners, ensuring that the patterns are applicable to a wide range of use cases and environments. This collaborative effort keeps the patterns relevant and fosters a culture of continuous improvement within the Red Hat ecosystem.
Red Hat’s involvement in Validated Patterns
Red Hat plays a pivotal role in the development, validation, and promotion of Validated Patterns. As a leader in open source solutions, Red Hat leverages its extensive expertise to create and maintain these patterns, ensuring they meet the highest standards of quality and reliability. Red Hat’s involvement extends beyond tool provision; it includes continuous updates to align these patterns with the latest technological advancements and industry needs. This ensures that organizations using Validated Patterns are always equipped with the most effective and up-to-date solutions available. Additionally, Red Hat collaborates closely with the community to expand the catalog of Validated Patterns, making these valuable resources accessible to organizations worldwide.
Application deployment workflows
Effective deployment workflows are crucial for ensuring that applications are deployed consistently and efficiently across various environments. By leveraging OpenShift clusters and automation tools, these workflows can streamline the process, reduce errors, and ensure scalability. Below, we outline the general structure for deploying applications, including edge patterns and GitOps integration.
General structure
All patterns assume you have an available OpenShift cluster for deploying applications. If you don’t have one, you can use cloud.redhat.com. The documentation uses the oc command syntax, but you can use kubectl interchangeably. For each deployment, ensure you’re logged into a cluster using the oc login command or by exporting the KUBECONFIG path.
The following diagram outlines the general deployment flow for a data center application. Before proceeding, users must create a fork of the pattern repository to allow for changes to operational elements (such as configurations) and application code. These changes can then be successfully pushed to the forked repository as part of DevOps continuous integration (CI). Clone the repository to your local machine, and push future changes to your fork.
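For example, logging in before a deployment might look like the following; the token, server URL, and kubeconfig path are placeholders:

[source,terminal]
----
$ oc login --token=<token> --server=https://api.<cluster-domain>:6443
$ export KUBECONFIG=~/path/to/kubeconfig   # alternative: point oc/kubectl at a kubeconfig
----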
In your fork, if needed, edit the values files, such as values-global.yaml or values-hub.yaml, to customize or personalize your deployment. These values files specify subscriptions, operators, applications, and other details. Additionally, each Validated Pattern contains a values-secret template file, which provides the secret values required to successfully install the pattern. Patterns do not require committing secret material to Git repositories. It is important to avoid pushing sensitive information to a public repository accessible to others. The Validated Patterns framework includes components to facilitate the safe use of secrets.
Deploy the application as specified by the pattern, usually by using a make command (make install). When the workload is deployed, the pattern first deploys the Validated Patterns operator, which in turn installs OpenShift GitOps. OpenShift GitOps then ensures that all components of the pattern, including required operators and application code, are deployed.
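To make the values-file step concrete, here is a minimal sketch of what a values-global.yaml might contain; the keys shown (global.pattern, main.clusterGroupName, main.options.useCSV) are illustrative and individual patterns may differ:

[source,yaml]
----
# Sketch of a values-global.yaml; keys are illustrative.
global:
  pattern: multicloud-gitops
main:
  clusterGroupName: hub
  options:
    useCSV: false
----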
Most patterns also include the deployment of an Advanced Cluster Management (ACM) operator to manage multi-cluster deployments.
Edge Patterns
Many patterns include both a data center and one or more edge clusters. The following diagram outlines the general deployment flow for applications on an edge cluster. Edge OpenShift clusters are typically smaller than data center clusters and might be deployed on a three-node cluster that allows workloads on master nodes, or even on a single-node cluster (SNO). These edge clusters can be deployed on bare metal, local virtual machines, or in a public or private cloud.
GitOps for edge
After provisioning the edge cluster, import or join it with the hub or data center cluster. For instructions on importing the cluster, see Importing a cluster.
After importing the cluster, ACM (Advanced Cluster Management) on the data center deploys an ACM agent and agent-addon pod into the edge cluster. ACM then installs OpenShift GitOps, which deploys the required applications based on the specified criteria.
In this example, the imperative section defines two jobs, "deploy-kubevirt-worker" and "configure-aap-controller".
The "deploy-kubevirt-worker" job is responsible for ensuring that the cluster runs on AWS. It uses the OpenShift MachineSet API to add a baremetal node for running virtual machines.
The "configure-aap-controller" job sets up the Ansible Automation Platform (AAP), a crucial component of the Ansible Edge GitOps platform. This job entitles AAP and sets up projects, jobs, and credentials. Unlike the default container image, this example uses a different image.
Additionally, an optional clusterRoleYaml section is defined. By default, the imperative job runs under Role-based access control (RBAC) that provides read-only access to all resources within its cluster. However, if a job requires write access to alter or generate settings, such permissions can be specified within the clusterRoleYaml section. In the AnsibleEdge scenario, the "deploy-kubevirt-worker" job needs permissions to manipulate and create machinesets, while the "configure-aap-controller" job requires read-only access to Kubernetes objects.
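A sketch of what such a section might look like follows; the field names are illustrative and may differ from the actual clustergroup chart schema:

[source,yaml]
----
# Illustrative only; verify against the clustergroup chart's values schema.
imperative:
  clusterRoleYaml:
    # read access everywhere by default, plus write access to MachineSets
    - apiGroups: ["machine.openshift.io"]
      resources: ["machinesets"]
      verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
----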
FAQ
What is a Validated Pattern?
Validated Patterns are collections of applications (in the ArgoCD sense) that demonstrate aspects of hub/edge computing that seem interesting and useful. Validated Patterns will generally have a hub or centralized component, and an edge component. These will interact in different ways.
Many things have changed in the IT landscape in the last few years - containers and Kubernetes have taken the industry by storm, but they introduce many technologies and concepts. It is not always clear how these technologies and concepts play together - and Validated Patterns is our effort to show these technologies working together on non-trivial applications in ways that make sense for real customers and partners to use.
The first Validated Pattern is based on MANUela, an application developed by Red Hat field associates. This application highlights some interesting aspects of the industrial edge in a cloud-native world - the hub component features pipelines to build the application, a "twin" for testing purposes, a central data lake, and an S3 component to gather data from the edge installations (which are factories in this case). The edge component has machine sensors, which are responsible only for gathering data from instrumented line devices and sharing it via MQTT messaging. The edge also features Seldon, an AI/ML framework for making predictions, a custom Node.js application to show data in real time, and messaging components supporting both MQTT and Kafka protocols. The local applications use MQTT to retrieve data for display, and the Kafka components move the data to the central hub for storage and analysis.
We are actively developing new Validated Patterns. Watch this space for updates!
How are they different from XYZ?
Many technology demos can be very minimal - such demos have an important place in the ecosystem to demonstrate the intent of an individual technology. Validated Patterns are meant to demonstrate groups of technologies working together in a cloud native way. And yet, we hope to make these patterns general enough to allow for swapping application components out — for example, if you want to swap out ActiveMQ for RabbitMQ to support MQTT - or use a different messaging technology altogether, that should be possible. The other components will require reconfiguration.
What technologies are used?
Key technologies in the stack for Industrial Edge include:
Red Hat OpenShift Container Platform
Red Hat Advanced Cluster Management
Red Hat OpenShift GitOps (based on ArgoCD)
Red Hat OpenShift Pipelines (based on Tekton)
Red Hat Integration - AMQ Broker (ActiveMQ Artemis MQTT)
Red Hat Integration - AMQ Streams (Kafka)
Red Hat Integration - Camel K
Seldon Operator
In the future, we expect to make further use of Red Hat OpenShift and expand the integrations with other elements of the ecosystem. How can the concept of GitOps integrate with a fleet of devices that are not running Kubernetes? What about integrations with bare-metal or VM servers? Sounds like a job for Ansible! We expect to tackle some of these problems in future patterns.
How are they structured?
Validated Patterns come in parts - we have a common repository with logic that applies to multiple patterns. Layered on top of that is our first pattern - Industrial Edge. This layout allows individual applications within a pattern to be swapped out by customizing the values files in the root of the repository to point to different branches, forks, or even entirely different repositories for those individual components. (At present, the repositories all have to be on github.com and accessible with the same token.)
The common repository is primarily concerned with how to deploy the GitOps operator, and to create the namespaces that will be necessary to manage the pattern applications.
The pattern repository has the application-specific layout, and determines which components are installed in which places - hub or edge. The pattern repository also defines the hub and edge locations. Both the hub and edge are expected to have multiple components each - the hub will have pipelines and the CI/CD framework, as well as any centralization components or data analysis components. Edge components are designed to be smaller as we do not need to deploy Pipelines or the test and staging areas to the Edge.
Each application is described as a series of resources that are rendered into GitOps (ArgoCD) via Helm and Kustomize. The values for these charts are set by values files that need to be "personalized" (with your local cluster values) as the first step of installation. Subsequent pushes to the GitOps repository will be reflected in the clusters running the applications.
Who is behind this?
Today, a team of Red Hat engineers including Andrew Beekhof (@beekhof), Lester Claudio (@claudiol), Martin Jackson (@mhjacks), William Henry (@ipbabble), Michele Baldessari (@mbaldessari), Jonny Rickard (@day0hero) and others.
Excited or intrigued by what you see here? We’d love to hear your thoughts and ideas! Try the patterns contained here and see below for links to our repositories and issue trackers.
How can I get involved?
Try out what we’ve done and submit issues to our issue trackers.
We will review pull requests to our pattern repositories.
Technical requirements
Consider these requirements specific to the implementation of all Validated Patterns and their tiers.
The requirements are categorized as follows:
- Must
These are nonnegotiable, core requirements that must be implemented.
- Should
These are important but not critical; their implementation enhances the pattern.
- Can
These are optional or desirable features, but their absence does not hinder the implementation of a pattern.
Must
Patterns must include one or more Git repositories in a publicly accessible location, containing configuration elements that can be consumed by the Red Hat OpenShift GitOps Operator without supplying custom Argo CD images.
Patterns must be useful without all content stored in private Git repositories.
Patterns must include a list of names and versions of all the products and projects that the pattern consumes.
Patterns must be useful without any sample applications that are private or that lack public sources.
Patterns must not degrade due to lack of updates or opaque incompatibilities in closed source applications.
Patterns must not store sensitive data elements including, but not limited to, passwords in Git repositories.
It must be possible to deploy patterns on any installer-provisioned infrastructure OpenShift cluster (BYO).
Validated Patterns distinguish between the provisioning and configuration requirements of the initial cluster (Patterns) and of clusters or machines that are managed by the initial cluster (Managed clusters).
Patterns must use a standardized clustergroup Helm chart as the initial Red Hat OpenShift GitOps application that describes all namespaces, subscriptions, and any other GitOps applications which contain the configuration elements that make up the solution (a minimal sketch of such a values file follows this list).
Managed clusters must operate on the premise of eventual consistency (automatic retries, and an expectation of idempotence), which is one of the essential benefits of the GitOps model.
Imperative elements must be implemented as idempotent code stored in a Git repository.
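As a rough sketch of the values file the standardized clustergroup chart consumes: the field names below follow the layout commonly seen in a pattern's values-hub.yaml, but the chart's own documentation is authoritative and exact keys vary by version.
clusterGroup:
  name: hub                      # name of this cluster group
  isHubCluster: true
  namespaces:                    # namespaces the chart creates
    - open-cluster-management
    - vault
  subscriptions:                 # Operator subscriptions to install
    acm:
      name: advanced-cluster-management
      namespace: open-cluster-management
  applications:                  # further GitOps applications that make up the solution
    acm:
      name: acm
      namespace: open-cluster-management
      project: hub
      path: common/acm           # illustrative chart path; real paths vary by pattern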
Should
Patterns should include sample applications to demonstrate the business problems addressed by the pattern.
Patterns should try to indicate which parts are foundational as opposed to being for demonstration purposes.
Patterns should use the Validated Patterns Operator to deploy patterns. However, anything that creates the OpenShift GitOps subscription and initial clustergroup application could be acceptable.
Patterns should embody the Open Hybrid Cloud model unless there is a compelling reason to limit the availability of functionality to a specific platform or topology.
Patterns should use industry standards and Red Hat products for all required tooling.
Patterns should follow current best practices at the time of pattern development. Solutions that do not conform to best practices should expect to justify non-conformance or expend engineering effort to conform.
Patterns should not make use of upstream or community Operators and images except where, depending on the market segment, they are critical to the overall solution.
Such Operators are forbidden in an increasing number of customer environments, which limits pattern reuse. Alternatively, consider productizing the Operator, or building it in-cluster from trusted sources as part of the pattern.
Patterns should be decomposed into modules that perform a specific function, so that they can be reused in other patterns.
For example, Bucket Notification is a capability in the Medical Diagnosis pattern that could be used for other solutions.
Patterns should use Red Hat Ansible Automation Platform to drive the declarative provisioning and management of managed hosts, for example, Red Hat Enterprise Linux (RHEL). See also Imperative elements.
Patterns should use Red Hat Advanced Cluster Management (RHACM) to manage policy and compliance on any managed clusters.
Patterns should use RHACM and a standardized RHACM chart to deploy and configure OpenShift GitOps to managed clusters.
Managed clusters should be loosely coupled to their hub, and use OpenShift GitOps to consume applications and configuration directly from Git as opposed to having hard dependencies on a centralized cluster.
Managed clusters should use the pull deployment model for obtaining their configuration.
Imperative elements should be implemented as Ansible playbooks.
Imperative elements should be driven declaratively, meaning that the playbooks should be triggered by Jobs or CronJobs that are stored in Git and delivered by OpenShift GitOps (a sketch of one such CronJob follows this list).
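One way this could look, sketched with placeholder names and a placeholder runner image; the framework's own imperative machinery may differ.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: imperative-config        # hypothetical name
  namespace: imperative          # hypothetical namespace
spec:
  schedule: "*/10 * * * *"       # periodic re-runs give eventual consistency
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: ansible
              image: quay.io/example/ansible-runner:latest   # placeholder image
              # The playbook itself must be idempotent so repeated runs are safe.
              command: ["ansible-playbook", "/ansible/playbooks/site.yaml"]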
Can
Patterns can include additional configuration and/or demo elements located in one or more additional private Git repositories.
Patterns can include automation that deploys a known set of clusters and/or machines in a specific topology.
Patterns can limit functionality/testing claims to specific platforms, topologies, and cluster/node sizes.
Patterns can consume Operators from established partners (for example, HashiCorp Vault and Seldon).
Patterns can include managed clusters.
Patterns can include details or automation for provisioning managed clusters, or rely on the admin to pre-provision them out-of-band.
Patterns can also choose to model multi-cluster solutions as an uncoordinated collection of initial hub clusters.
Imperative elements can interact with cluster state or external influences.
Importing a managed cluster
Many validated patterns require importing a cluster into a managed group. These groups have specific application sets that will be deployed and managed. Some examples are factory clusters in the Industrial Edge pattern, or development clusters in the Multi-cluster DevSecOps pattern.
Red Hat Advanced Cluster Management (RHACM) can be used to create a cluster of a specific cluster group type. You can deploy a specific cluster that way if you have RHACM set up with credentials for deploying clusters. However, in many cases an OpenShift cluster has already been created and will be imported into the set of clusters that RHACM is managing.
While you can create and deploy clusters in this manner, this section concentrates on importing an existing cluster and designating a specific managed cluster group type.
To deploy a cluster that can be imported into RHACM, use the openshift-install program provided at console.redhat.com. You will need login credentials.
Importing a cluster using the RHACM User Interface
Getting to the RHACM user interface
After RHACM is installed, a "Web console update is available" message is displayed. Click the "Refresh web console" link.
On the upper-left side you’ll see a pull-down labeled "local-cluster". Select "All Clusters" from this pull-down. This navigates to the RHACM console and its "Clusters" section.
Select the "Import cluster" option.
Importing the cluster
On the "Import an existing cluster" page, enter the cluster name (arbitrary) and choose Kubeconfig as the "import mode". Add the tag clusterGroup=<managed-cluster-group>, using the appropriate cluster group specified in the pattern. Press Import.
Using this method, you are done. Skip to the section in your pattern documentation that describes how to confirm that the pattern deployed correctly on the managed cluster.
Other potential import tools
There are two other known ways to join a cluster to the RHACM hub. These methods are not supported but have been tested once. The patterns team no longer tests these methods. If these methods become supported, we will maintain the documentation here.
Using the cm-cli tool
Using the clusteradm tool
Importing a cluster using the cm-cli tool
Install the cm-cli (cm) (cluster management) command-line tool. See the installation instructions here: cm-cli installation.
Obtain the KUBECONFIG file from the managed cluster.
On the command line, log in to the hub/datacenter cluster (use oc login or export the KUBECONFIG).
Run the following command:
cm attach cluster --cluster <cluster-name> --cluster-kubeconfig <path-to-KUBECONFIG>
Importing a cluster using the clusteradm tool
You can also use clusteradm to join a cluster. The following instructions explain what needs to be done. clusteradm is still in testing.
To deploy an edge cluster, you will need to get the hub/datacenter cluster’s token, and you will need to install clusteradm. When it is installed, run the following on the existing hub/datacenter cluster:
clusteradm get token
When you run the clusteradm command above, it replies with the token and also shows you the command to use on the managed cluster. Log in to the managed cluster with either of the following methods:
oc login
or
export KUBECONFIG=~/<path-to-kubeconfig>
Then request that the managed cluster join the datacenter hub.
clusteradm join --hub-token <token from clusteradm get token command> <managed-cluster-name>
Back on the hub cluster, accept the join request.
clusteradm accept --clusters <managed-cluster-name>
Designate the new cluster as a devel site
If you use the command-line tools above, you need to explicitly indicate that the imported cluster is part of a specific clusterGroup. If you haven’t tagged the cluster as clusterGroup=<managed-cluster-group>, do that now. Some examples of clusterGroup are factory, devel, or prod.
We do this by adding the label referenced in the managedSite’s clusterSelector.
Find the new cluster.
oc get managedclusters.cluster.open-cluster-management.io
Apply the label.
oc label managedclusters.cluster.open-cluster-management.io/<your-cluster> clusterGroup=<managed-cluster-group>
Background
Each validated pattern has infrastructure requirements. The majority of the validated patterns will run Red Hat OpenShift, while some parts will run directly on Red Hat Enterprise Linux (RHEL) or, more likely, a version of RHEL called RHEL for Edge. It is expected that consumers of validated patterns already have the infrastructure in place using existing reliable and supported deployment tools. For more information and tools, head over to console.redhat.com
Sizing
In this section we provide general minimum sizing requirements for such infrastructure, but it is important to review the specific requirements for a given validated pattern. For example, Industrial Edge 2.0 employs AI/ML technology that requires large machine instances to support the applications deployed on OpenShift at the datacenter.
An example Helm chart layout:
├── Chart.yaml (1)
├── templates (2)
│   └── example.yaml
└── values.yaml (3)
1 The Chart.yaml file contains chart metadata, such as the name and version of the chart.
2 The templates directory contains files that define application resources such as deployments.
3 The values.yaml file contains default values for the chart.
ArgoCD and Helm Integration
ArgoCD integrates with Helm to provide a powerful GitOps-based deployment mechanism. The validated patterns framework uses ArgoCD and Helm to streamline application deployment by defining applications as Helm charts stored in Git repositories. ArgoCD is the tool of choice to apply the desired state of an application to the target cluster environment.
ArgoCD automates the deployment and synchronization of these applications to OpenShift Container Platform clusters, ensuring consistency, reliability, and efficiency in managing Kubernetes applications. This integration supports automated, declarative, and version-controlled deployments, enhancing operational efficiency and maintaining application state across environments. ArgoCD helps implement continuous deployment for cloud-native applications.
Values
Values files are essential for customizing settings in applications, services, or validated patterns, particularly in Kubernetes deployments using Helm charts. These files, written in plain YAML format, provide a structured and flexible approach to set parameters and configurations for deploying validated patterns. The values files contain the variables that drive the configurations of your namespaces, subscriptions, applications, and other resources. The variables defined in your values files are referenced within your Helm chart templates. This ensures consistency and enables dynamic configurations. Combined with the power of Helm’s templating language, you can implement conditionals and loops for adaptable and scalable configurations.
Key characteristics of values files include:
Plain YAML Format: The human-readable and easy-to-edit syntax of YAML makes configuration settings accessible and straightforward to manage.
Hierarchical Nature: Values files support a hierarchy of values, allowing logical organization and structuring of configurations, which is especially useful in handling complex deployments.
In Helm charts, values files define configuration settings for deploying applications and managing resources within an OpenShift Container Platform cluster. They enable flexible, per-cluster customization while ensuring consistency with the overall validated pattern. This ensures that organizations can achieve efficient, secure, and consistent deployments across multiple OpenShift Container Platform clusters.
A common practice is to use a base values file, such as values-global.yaml, for global settings, and then have cluster-specific values files, for example values-cluster1.yaml and values-cluster2.yaml, that override or add to the global settings. This approach allows for comprehensive customization while maintaining a centralized and organized configuration structure, promoting best practices for deployment and resource management.
For more information, see Exploring values.
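A minimal sketch of this layering, with illustrative keys; a chart template would then read the merged result, for example via {{ .Values.global.storageClass }}.
# values-global.yaml (defaults shared by every cluster; keys are illustrative)
global:
  pattern: example-pattern
  storageClass: gp3
# values-cluster1.yaml (cluster-specific override of the same key)
global:
  storageClass: ocs-storagecluster-cephfs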
Applications
The applications section in the Helm values file plays a crucial role in defining and managing the deployment of various applications within an OpenShift Container Platform cluster. By leveraging Helm charts and adhering to validated patterns, it ensures consistency, best practices, and simplified management, leading to reliable and scalable application deployments.
The path field in each application entry in the values file points to the location of the Helm chart and associated configuration files. These charts contain the Kubernetes manifests and configuration necessary to deploy the application. Helm charts are used to package Kubernetes applications and manage their deployment in a consistent and reproducible manner.
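For example, an application entry might look like the following, where the keys and names are illustrative and path is resolved relative to the root of the pattern repository.
applications:
  mysample:
    name: mysample
    namespace: mysample
    project: hub
    path: charts/all/mysample    # directory containing the Helm chart for this application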
When these applications are deployed, the following Kubernetes resources are typically created:
Deployments: Define the desired state and replicas for the application’s pods.
Services: Expose the application’s pods to other services or external traffic.
ConfigMaps and Secrets: Store configuration data and sensitive information.
PersistentVolumeClaims (PVCs): Request storage resources for the application.
Ingress or Routes: Provide external access to the application.
RBAC (Role-Based Access Control): Define access permissions and roles.
Red Hat Advanced Cluster Management (RHACM)
One of the applications deployed by the Validated Patterns Operator is Red Hat Advanced Cluster Management (RHACM). RHACM is a comprehensive solution designed to manage multiple OpenShift Container Platform clusters, whether that is ten clusters or a thousand, and to enforce policies across those clusters from a single pane of glass.
RHACM plays a pivotal role in the validated pattern framework by providing robust capabilities for managing Kubernetes clusters and enforcing policies across heterogeneous environments. RHACM is only installed when a pattern spans multiple clusters. It supports operational efficiency, scalability, compliance, and security, making it an essential tool for organizations looking to manage their Kubernetes infrastructure effectively.
The Validated Patterns framework uses RHACM policies to ensure that applications targeted for specific clusters are deployed to the appropriate cluster environments. The single pane of glass allows you to see information about your clusters. RHACM supports multiple cloud providers out of the box, and its observability feature gives you a clear insight into the resources of each cluster.
ClusterGroups
In a validated pattern, a ClusterGroup organizes and manages clusters sharing common configurations, policies, or deployment needs, with the default group initially encompassing all clusters unless assigned elsewhere. Multiple cluster groups within a pattern allow for tailored management, enabling specific configurations and policies based on roles, environments, or locations. This segmentation enhances efficiency and consistency, and simplifies complex environments. In the validated patterns framework, a ClusterGroup is a key entity representing either a single cluster or a collection of clusters with unique configurations, determined by Helm charts and Kubernetes features. Typically, a ClusterGroup serves as the foundation for each pattern, with the primary one named in values-global.yaml, often referred to as hub. Managed ClusterGroups can also be defined, specifying characteristics and policies for additional clusters. Managed cluster groups are sets of clusters, grouped by function, that share a common configuration set. There is no limitation on the number of groups, or the number of clusters within each group.
When joining a managed cluster to Red Hat Advanced Cluster Management (RHACM) or deploying a new cluster with RHACM, it must be assigned to at least one ClusterGroup. RHACM identifies the managed cluster’s membership in a ClusterGroup and proceeds to set up the cluster, including installing the RHACM agent. Once the setup is complete, RHACM deploys GitOps and supplies it with information about the ClusterGroup. GitOps then retrieves the associated values file and proceeds to deploy the Operators, configurations, and charts accordingly.
For more information, see ClusterGroup configuration in values files.
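As a hedged sketch of how a managed ClusterGroup might be declared alongside the hub; the exact keys vary between framework versions, so treat these as illustrative rather than authoritative.
clusterGroup:
  name: hub
  isHubCluster: true
  managedClusterGroups:
    factory:                     # illustrative group of managed clusters
      name: factory
      acmlabels:
        - name: clusterGroup     # clusters labeled clusterGroup=factory join this group
          value: factory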
GitOps
GitOps is a way to manage cloud-native systems that are powered by Kubernetes. It leverages a policy-as-code approach to define and manage every layer of the modern application stack, from infrastructure and networking to application code and the GitOps pipeline itself.
The key principles of GitOps are:
Declarative: The methodology requires describing the desired state, achieved through raw manifests, helm charts, kustomize, or other forms of automation.
Versioned and immutable: Git ensures versioning and immutability, serving as the definitive source of truth. Version control and historical tracking offer insights into changes that impact the clusters.
Pulled automatically: The GitOps controller pulls the state automatically to prevent any errors introduced by humans, and it also allows the application an opportunity to heal itself.
Continuously reconciled: The GitOps controller has a reconciliation loop that by default runs every 3 minutes. When the reconciler identifies a diff between Git and the cluster, it reconciles the change onto the cluster during the next synchronization.
GitOps within the validated pattern framework ensures that infrastructure and application configurations are managed declaratively, consistently, and securely. GitOps ensures consistency across our environments, platforms and applications.
For more information, see GitOps.
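These principles surface directly in an Argo CD Application definition; a minimal sketch follows, assuming a placeholder repository and chart path.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: example-app
  namespace: openshift-gitops      # namespace where the GitOps instance watches Applications (illustrative)
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/your-pattern.git   # placeholder fork
    targetRevision: main
    path: charts/all/example-app                            # placeholder chart path
  destination:
    server: https://kubernetes.default.svc
    namespace: example-app
  syncPolicy:
    automated:
      prune: true     # remove resources that were deleted from Git
      selfHeal: true  # reconcile drift back to the declared state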
Namespaces
Namespaces in a validated pattern are essential for organizing and managing resources within an OpenShift Container Platform cluster, ensuring security, consistency, and efficient resource allocation. Recommendations for defining namespaces include using consistent naming conventions, ensuring isolation and security through policies and RBAC, setting resource quotas, tailoring configurations to specific environments, and designing namespaces for modularity and reusability across patterns or applications.
Operators generally create their own namespaces, but you might need to create additional ones. Check whether the product already provides the expected namespaces before creating new ones.
For more information, see Understanding Namespace Creation using the Validated Patterns Framework.
Subscriptions
Subscriptions in a validated pattern typically refer to a methodical approach to managing and deploying applications and services within an OpenShift cluster. Subscriptions within a ClusterGroup in validated patterns streamline access management and resource allocation, allowing administrators to efficiently control user and application permissions, allocate resources precisely, and enforce governance policies.
Subscriptions are defined in the values files and they are OpenShift Operator subscriptions from the Operator Hub. Subscriptions contribute to the creation of a software bill of materials (SBOM), enhancing transparency and security by detailing all intended installations within the ClusterGroup. Managed through the Operator Lifecycle Manager (OLM), these subscriptions ensure continuous operation and upgrades of operators, similar to RPM packages on RHEL, thus maintaining cluster health and security.
To maximize the benefits of subscriptions, it is crucial to align them with organizational needs, integrate automated monitoring and alerting, and regularly review and update subscription plans.
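For reference, the kind of Subscription object that such a values entry ultimately produces follows the standard OLM shape; a minimal sketch, with an illustrative channel.
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: advanced-cluster-management
  namespace: open-cluster-management
spec:
  channel: release-2.x            # illustrative channel; pin to the channel your pattern lists
  name: advanced-cluster-management
  source: redhat-operators
  sourceNamespace: openshift-marketplace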
Secrets
Enterprise applications, especially in multi-cluster and multi-site environments, require robust security measures, including the use of certificates and other secrets to establish trust. Managing these secrets effectively is crucial.
Ignoring security during the development of distributed enterprise applications can lead to significant technical debt. The DevSecOps model addresses this by emphasizing the need to integrate security early in the development lifecycle, known as "shifting security to the left."
In the OpenShift Container Platform, secrets are used to securely store sensitive information like passwords, API keys, and certificates. These secrets are managed using Kubernetes secret objects within validated patterns, ensuring consistent, secure, and compliant deployments. This approach promotes best practices for security and simplifies the management of sensitive data across OpenShift Container Platform deployments.
For more information, see Overview of secrets management.
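A minimal Kubernetes Secret, shown only to illustrate the object shape; in a validated pattern the real values should come from the pattern's secret store rather than being committed to Git.
apiVersion: v1
kind: Secret
metadata:
  name: example-credentials      # illustrative name
  namespace: example-app
type: Opaque
stringData:
  username: exampleuser          # placeholder value
  password: changeme             # placeholder value; never commit real credentials to Git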
Shared values
Shared values files are YAML files used in a validated pattern to centralize and manage configuration settings. They define common parameters and settings that can be reused across multiple clusters, applications, and environments. This approach promotes consistency and simplifies configuration management.
Shared values files are a powerful mechanism in a validated pattern, enabling centralized, consistent, and reusable configuration management across multiple clusters and environments. By defining global settings and leveraging cluster-specific overrides, they ensure that configurations are both standardized and flexible enough to accommodate specific needs of individual clusters.
Tests
Tests within the Validated Pattern Framework are essential components to validate and ensure that the deployed patterns and configurations adhere to operational standards, security protocols, performance expectations, and compliance requirements in Kubernetes and OpenShift Container Platform environments.
About the Validated Patterns Maintained tier
A pattern categorized under the maintained tier implies that the pattern is known to be functional on all currently supported extended update support (EUS) versions of Red Hat OpenShift Container Platform. Qualifying for this tier might require additional work for the pattern’s owner, who might be a partner or a sufficiently motivated subject matter expert (SME).
Nominating a pattern for the maintained tier
If your pattern qualifies or meets the criteria for maintained tier, submit your nomination to validatedpatterns@googlegroups.com.
Each maintained pattern represents an ongoing maintenance, support, and testing effort. Finite team capacity means that it is not possible for the team to take on this responsibility for all Validated Patterns.
For this reason we have designed the tiers and our processes to facilitate this to occur outside of the team by any sufficiently motivated party, including other parts of Red Hat, partners, and even customers.
In limited cases, the Validated Patterns team may consider taking on that work; however, it is recommended that you contact the team at least 4 weeks prior to the end of a given quarter for the necessary work to be considered as part of the following quarter’s planning process.
Requirements for the maintained tier
The maintained patterns have deliverables and requirements in addition to those specified for the Tested tier.
The requirements are categorized as follows:
- Must
These are nonnegotiable, core requirements that must be implemented.
- Should
These are important but not critical; their implementation enhances the pattern.
- Can
These are optional or desirable features, but their absence does not hinder the implementation of a pattern.
Must
A maintained pattern must continue to meet the following criteria to remain in maintained tier:
A maintained pattern must conform to the common technical implementation requirements.
A maintained pattern must only make use of components that are either supported, or easily substituted for supportable equivalents, for example, HashiCorp Vault, which has community and enterprise variants.
A maintained pattern must not rely on functionality in tech-preview, or hidden behind feature gates.
A maintained pattern must have its architecture reviewed by a representative of each Red Hat product it consumes to ensure consistency with the product teams' intentions and roadmaps. Your pattern's SME (for example, a services rep) can help coordinate this.
A maintained pattern must include a link to a hosted presentation (Google Slides or similar) intended to promote the solution. The focus should be on the architecture and business problem being solved. No customer, or otherwise sensitive, information should be included.
A maintained pattern must include test plan automation that runs on every change to the pattern, or on a schedule no less frequent than once per week.
A maintained pattern must be tested on all currently supported Red Hat OpenShift Container Platform extended update support (EUS) releases.
A maintained pattern must fix breakage in a timely manner.
A maintained pattern must document their support policy.
The individual products used in a Validated Pattern are backed by the full Red Hat support experience, conditional on the customer’s subscription to those products and each product’s support policy.
Additional components in a Validated Pattern that are not supported by Red Hat (for example, HashiCorp Vault and Seldon Core) require a customer to obtain support from that vendor directly.
The Validated Patterns team will try to address any problems in the Validated Patterns Operator and in the common Helm charts, but cannot offer any SLAs at this time.
The maintained patterns do not imply an obligation of support for partner or community Operators by Red Hat.
OpenShift General Sizing
Recommended node host practices
The OpenShift Container Platform node configuration file contains important options. For example, two parameters control the maximum number of pods that can be scheduled to a node: podsPerCore and maxPods. When both options are in use, the lower of the two values limits the number of pods on a node. Exceeding these values can result in:
- Increased CPU utilization.
- Slow pod scheduling.
- Potential out-of-memory scenarios, depending on the amount of memory in the node.
- Exhausting the pool of IP addresses.
- Resource overcommitting, leading to poor user application performance.
In Kubernetes, a pod that holds a single container actually uses two containers: the second container sets up networking before the actual container starts. Therefore, a system running 10 pods actually has 20 containers running.
podsPerCore sets the number of pods the node can run based on the number of processor cores on the node. For example, if podsPerCore is set to 10 on a node with 4 processor cores, the maximum number of pods allowed on the node will be 40.
kubeletConfig:
  podsPerCore: 10
Setting podsPerCore to 0 disables this limit. The default is 0. podsPerCore cannot exceed maxPods.
maxPods sets the number of pods the node can run to a fixed value, regardless of the properties of the node.
kubeletConfig:
  maxPods: 250
For more information about sizing and Red Hat standard host practices see the Official OpenShift Documentation Page for recommended host practices.
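On OpenShift 4, these kubelet settings are typically applied through a KubeletConfig custom resource that targets a machine config pool. The following is a minimal sketch, assuming a worker MachineConfigPool labeled custom-kubelet: set-max-pods; the resource name and label are hypothetical and not part of any specific pattern.
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-max-pods                 # hypothetical name
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: set-max-pods   # hypothetical label on the worker MachineConfigPool
  kubeletConfig:
    podsPerCore: 10                  # per-core limit described above
    maxPods: 250                     # absolute cap; the lower of the two values wins
Applying a resource like this causes the Machine Config Operator to roll the new kubelet configuration out to the selected nodes.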
Control plane node sizing
The control plane node resource requirements depend on the number of nodes in the cluster. The following control plane node size recommendations are based on the results of control plane density focused testing. The control plane tests create the following objects across the cluster in each of the namespaces depending on the node counts:
- 12 image streams
- 3 build configurations
- 6 builds
- 1 deployment with 2 pod replicas mounting two secrets each
- 2 deployments with 1 pod replica mounting two secrets
- 3 services pointing to the previous deployments
- 3 routes pointing to the previous deployments
- 10 secrets, 2 of which are mounted by the previous deployments
- 10 config maps, 2 of which are mounted by the previous deployments
Number of worker nodes | Cluster load (namespaces) | CPU cores | Memory (GB)
25                     | 500                       | 4         | 16
100                    | 1000                      | 8         | 32
250                    | 4000                      | 16        | 96
On a cluster with three masters or control plane nodes, CPU and memory usage spikes when one of the nodes is stopped, rebooted, or fails, because the remaining two nodes must handle the load in order to remain highly available. This is also expected during upgrades, because the masters are cordoned, drained, and rebooted serially to apply the operating system updates as well as the control plane Operator updates. To avoid cascading failures on large and dense clusters, keep the overall resource usage on the master nodes to at most half of all available capacity to handle the resource usage spikes. Increase the CPU and memory on the master nodes accordingly.
The node sizing varies depending on the number of nodes and object counts in the cluster. It also depends on whether the objects are actively being created on the cluster. During object creation, the control plane is more active in terms of resource usage than when the objects are in the running phase.
If you used an installer-provisioned infrastructure installation method, you cannot modify the control plane node size in a running OpenShift Container Platform 4.5 cluster. Instead, you must estimate your total node count and use the suggested control plane node size during installation.
The recommendations are based on the data points captured on OpenShift Container Platform clusters with OpenShiftSDN as the network plug-in.
In OpenShift Container Platform 4.5, half of a CPU core (500 millicore) is now reserved by the system by default compared to OpenShift Container Platform 3.11 and previous versions. The sizes are determined taking that into consideration.
For more information about sizing and Red Hat standard host practices see the Official OpenShift Documentation Page for recommended host practices.
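Because the control plane size cannot be changed after an installer-provisioned installation, the machine type is chosen in install-config.yaml before installing. The snippet below is a minimal sketch of the controlPlane and compute stanzas only, assuming AWS and the 100-worker sizing row above; the instance type and replica counts are illustrative, and the other required install-config.yaml fields (metadata, platform, pull secret, and so on) are omitted.
controlPlane:
  name: master
  replicas: 3
  platform:
    aws:
      type: m5.2xlarge   # 8 vCPU / 32 GiB, matching the 100-worker-node guidance above
compute:
- name: worker
  replicas: 100          # estimated total worker count for this sizing row
  platform: {}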
Recommended etcd practices
For large and dense clusters, etcd can suffer from poor performance if the keyspace grows excessively large and exceeds the space quota. Periodic maintenance of etcd, including defragmentation, must be performed to free up space in the data store. It is highly recommended that you monitor Prometheus for etcd metrics and defragment etcd when required, before it raises a cluster-wide alarm that puts the cluster into a maintenance mode which only accepts key reads and deletes. Some of the key metrics to monitor are etcd_server_quota_backend_bytes, which is the current quota limit; etcd_mvcc_db_total_size_in_use_in_bytes, which indicates the actual database usage after a history compaction; and etcd_debugging_mvcc_db_total_size_in_bytes, which shows the database size including free space waiting for defragmentation. Instructions on defragmenting etcd can be found in the Defragmenting etcd data section.
Etcd writes data to disk, so its performance strongly depends on disk performance. Etcd persists proposals on disk: slow disks and disk activity from other processes might cause long fsync latencies, which can cause etcd to miss heartbeats and fail to commit new proposals to disk on time, leading to request timeouts and temporary leader loss. It is highly recommended to run etcd on machines backed by SSD/NVMe disks with low latency and high throughput.
Some of the key metrics to monitor on a deployed OpenShift Container Platform cluster are p99 of etcd disk write ahead log duration and the number of etcd leader changes. Use Prometheus to track these metrics.
etcd_disk_wal_fsync_duration_seconds_bucket reports the etcd disk fsync duration, and etcd_server_leader_changes_seen_total reports the leader changes. To rule out a slow disk and confirm that the disk is reasonably fast, the 99th percentile of etcd_disk_wal_fsync_duration_seconds_bucket should be less than 10 ms.
For more information about sizing and Red Hat standard host practices see the Official OpenShift Documentation Page for recommended host practices.
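As an illustration, the 10 ms fsync guidance can be turned into an alerting rule. The following is a minimal sketch of a PrometheusRule, assuming the cluster is configured to accept user-defined rules; the rule name, namespace, and alert wording are hypothetical and not part of the validated pattern itself.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: etcd-disk-latency      # hypothetical name
  namespace: example-monitoring   # hypothetical namespace for user-defined rules
spec:
  groups:
  - name: etcd-disk
    rules:
    - alert: EtcdSlowWALFsync  # hypothetical alert name
      expr: histogram_quantile(0.99, sum by (instance, le) (rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m]))) > 0.01
      for: 10m
      labels:
        severity: warning
      annotations:
        message: "p99 etcd WAL fsync latency has been above 10ms for 10 minutes on {{ $labels.instance }}"
The expression computes the 99th percentile of the WAL fsync duration per etcd instance over a 5-minute window and fires when it stays above 10 ms.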
About the Validated Patterns Sandbox tier
A pattern categorized under the sandbox tier provides you with an entry point for onboarding to Validated Patterns. The minimum requirement to qualify for the sandbox tier is that you start with the patterns framework and include minimal documentation.
Nominating a pattern for the sandbox tier
The Validated Patterns team prefers to empower others rather than take credit for their work.
Where there is an existing application or a demonstration, there is also a strong preference for the originating team to own any changes that are needed for the implementation to become a validated pattern. Alternatively, if the Validated Patterns team drives the conversion, then to prevent confusion and duplicated efforts, we are likely to ask for a commitment to phase out use of the previous implementation for future engagements such as demos, presentations, and workshops.
The goal is to avoid bringing a parallel implementation into existence which divides engineering resources, and creates confusion internally and with customers as the implementations drift apart.
In both scenarios, the originating team can choose where to host the primary repository, will be given admin permissions to any fork in https://github.com/validatedpatterns, and will receive ongoing assistance from the Validated Patterns team.
Requirements for the sandbox tier
Consider these requirements for all sandbox tier patterns.
The requirements are categorized as follows:
- Must
These are nonnegotiable, core requirements that must be implemented.
- Should
These are important but not critical; their implementation enhances the pattern.
- Can
These are optional or desirable features, but their absence does not hinder the implementation of a pattern.
Must
A sandbox pattern must continue to meet the following criteria to remain in the sandbox tier:
A sandbox pattern must conform to the common technical implementation requirements.
A sandbox pattern must be deployable onto a freshly deployed OpenShift cluster without prior modification or tuning.
A sandbox pattern must include a top-level README file that highlights the business problem and how the pattern solves it.
A sandbox pattern must include an architecture drawing. The specific tool or format is flexible as long as the meaning is clear.
A sandbox pattern must undergo an informal technical review by a community leader to ensure that it meets basic reuse standards.
A sandbox pattern must undergo an informal architecture review by a community leader to ensure that the solution has the right components, and they are generally being used as intended. For example, not using a database as a message bus.
Because Red Hat acts as a community leader, contributions from within Red Hat might be subject to a higher level of scrutiny. While we strive to be inclusive, the community maintains quality standards, and simply using the framework does not automatically imply that a solution is suitable for the community to endorse or publish.
A sandbox pattern must document its support policy.
It is anticipated that most sandbox patterns will be supported by the community on a best-effort basis, but this should be stated explicitly. The Validated Patterns team commits to maintaining the framework, but will also accept help.
Can
A sandbox pattern (including works-in-progress) can be hosted in the https://github.com/validatedpatterns-sandbox GitHub organization.
A sandbox pattern can be listed on the https://validatedpatterns.io site.
A sandbox pattern meeting additional criteria can be nominated for promotion to the Tested tier.
Secrets
Secret Management
One area that has been impacted by a more automated approach to security is secret management. DevOps (and DevSecOps) environments require the use of many different services:
- Code repositories
- GitOps tools
- Image repositories
- Build pipelines
All of these services require credentials (or they should!), and keeping those credentials secret is very important. For example, pushing your credentials to your personal GitHub/GitLab repository is not a secure solution.
While file-based secret management can work if done correctly, most organizations opt for a more enterprise-grade solution using a secret management product or project. The Cloud Native Computing Foundation (CNCF) hosts many such projects. The Validated Patterns project has started with the HashiCorp Vault secret management product, but we look forward to contributions that add support for other projects.
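To illustrate how a pattern can consume credentials from Vault without committing them to Git, here is a minimal sketch of an External Secrets Operator ExternalSecret; the store name, namespace, and Vault path are hypothetical and would be defined by the specific pattern.
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: git-credentials            # hypothetical name
  namespace: example-app           # hypothetical namespace
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend            # hypothetical ClusterSecretStore pointing at HashiCorp Vault
    kind: ClusterSecretStore
  target:
    name: git-credentials          # Kubernetes Secret created and kept in sync by the operator
  data:
  - secretKey: password
    remoteRef:
      key: secret/data/global/git  # hypothetical Vault KV path
      property: password
The operator reads the value from Vault and materializes it as a native Kubernetes Secret, so applications and GitOps tooling never need the raw credential in a repository.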