feat(DMVP-5664): flagger operator integration #109

Merged
merged 1 commit into from
Nov 7, 2024
4 changes: 4 additions & 0 deletions README.md
@@ -212,6 +212,7 @@ worker_groups = {
| <a name="module_eks-cluster"></a> [eks-cluster](#module\_eks-cluster) | ./modules/eks | n/a |
| <a name="module_external-dns"></a> [external-dns](#module\_external-dns) | ./modules/external-dns | n/a |
| <a name="module_external-secrets"></a> [external-secrets](#module\_external-secrets) | ./modules/external-secrets | n/a |
| <a name="module_flagger"></a> [flagger](#module\_flagger) | ./modules/flagger | n/a |
| <a name="module_fluent-bit"></a> [fluent-bit](#module\_fluent-bit) | ./modules/fluent-bit | n/a |
| <a name="module_metrics-server"></a> [metrics-server](#module\_metrics-server) | ./modules/metrics-server | n/a |
| <a name="module_nginx-ingress-controller"></a> [nginx-ingress-controller](#module\_nginx-ingress-controller) | ./modules/nginx-ingress-controller/ | n/a |
@@ -260,9 +261,11 @@ worker_groups = {
| <a name="input_ebs_csi_version"></a> [ebs\_csi\_version](#input\_ebs\_csi\_version) | EBS CSI driver addon version | `string` | `"v1.15.0-eksbuild.1"` | no |
| <a name="input_efs_id"></a> [efs\_id](#input\_efs\_id) | EFS filesystem id in AWS | `string` | `null` | no |
| <a name="input_efs_storage_classes"></a> [efs\_storage\_classes](#input\_efs\_storage\_classes) | Additional storage class configurations. By default, two storage classes are created: efs-sc and efs-sc-root (which uses uid 0). Further storage classes can be added besides these two. | <pre>list(object({<br> name : string<br> provisioning_mode : optional(string, "efs-ap")<br> file_system_id : string<br> directory_perms : optional(string, "755")<br> base_path : optional(string, "/")<br> uid : optional(number)<br> }))</pre> | `[]` | no |
| <a name="input_enable_alb_ingress_controller"></a> [enable\_alb\_ingress\_controller](#input\_enable\_alb\_ingress\_controller) | Whether the ALB ingress controller is enabled. | `bool` | `true` | no |
| <a name="input_enable_api_gw_controller"></a> [enable\_api\_gw\_controller](#input\_enable\_api\_gw\_controller) | Whether to enable the API-GW controller | `bool` | `false` | no |
| <a name="input_enable_ebs_driver"></a> [enable\_ebs\_driver](#input\_enable\_ebs\_driver) | Whether to enable the EBS-CSI driver | `bool` | `true` | no |
| <a name="input_enable_efs_driver"></a> [enable\_efs\_driver](#input\_enable\_efs\_driver) | Whether to install the EFS driver in EKS | `bool` | `false` | no |
| <a name="input_enable_external_secrets"></a> [enable\_external\_secrets](#input\_enable\_external\_secrets) | Whether to enable external-secrets operator | `bool` | `true` | no |
| <a name="input_enable_kube_state_metrics"></a> [enable\_kube\_state\_metrics](#input\_enable\_kube\_state\_metrics) | Enable kube-state-metrics | `bool` | `false` | no |
| <a name="input_enable_metrics_server"></a> [enable\_metrics\_server](#input\_enable\_metrics\_server) | Enable metrics-server | `bool` | `false` | no |
| <a name="input_enable_node_problem_detector"></a> [enable\_node\_problem\_detector](#input\_enable\_node\_problem\_detector) | n/a | `bool` | `true` | no |
@@ -272,6 +275,7 @@ worker_groups = {
| <a name="input_enable_waf_for_alb"></a> [enable\_waf\_for\_alb](#input\_enable\_waf\_for\_alb) | Enables WAF and WAF V2 addons for ALB | `bool` | `false` | no |
| <a name="input_external_dns"></a> [external\_dns](#input\_external\_dns) | Installs the external-dns helm chart and related roles, which automatically create R53 records based on ingress/service domain/host configs | <pre>object({<br> enabled = optional(bool, false)<br> configs = optional(any, {})<br> })</pre> | <pre>{<br> "enabled": false<br>}</pre> | no |
| <a name="input_external_secrets_namespace"></a> [external\_secrets\_namespace](#input\_external\_secrets\_namespace) | The namespace of external-secret operator | `string` | `"kube-system"` | no |
| <a name="input_flagger"></a> [flagger](#input\_flagger) | Deploys the flagger operator to enable custom rollout strategies such as canary and blue-green, and optionally creates custom flagger metric templates | <pre>object({<br> enabled = optional(bool, false)<br> namespace = optional(string, "ingress-nginx") # The flagger operator helm chart is installed in the same namespace as the mesh/ingress provider, so set this field based on the ingress/mesh in use; more info at https://artifacthub.io/packages/helm/flagger/flagger<br> configs = optional(any, {}) # available options can be found at https://artifacthub.io/packages/helm/flagger/flagger<br> metric_template_configs = optional(any, {}) # available options can be found at https://github.com/dasmeta/helm/tree/flagger-metric-template-0.1.0/charts/flagger-metric-template<br> enable_metric_template = optional(bool, false)<br> enable_loadtester = optional(bool, false)<br> })</pre> | <pre>{<br> "enabled": false<br>}</pre> | no |
| <a name="input_fluent_bit_configs"></a> [fluent\_bit\_configs](#input\_fluent\_bit\_configs) | Fluent Bit configs | <pre>object({<br> enabled = optional(string, true)<br> fluent_bit_name = optional(string, "")<br> log_group_name = optional(string, "")<br> system_log_group_name = optional(string, "")<br> log_retention_days = optional(number, 90)<br> values_yaml = optional(string, "")<br> configs = optional(object({<br> inputs = optional(string, "")<br> filters = optional(string, "")<br> outputs = optional(string, "")<br> cloudwatch_outputs_enabled = optional(bool, true)<br> }), {})<br> drop_namespaces = optional(list(string), [])<br> log_filters = optional(list(string), [])<br> additional_log_filters = optional(list(string), [])<br> kube_namespaces = optional(list(string), [])<br> image_pull_secrets = optional(list(string), [])<br> })</pre> | <pre>{<br> "additional_log_filters": [<br> "ELB-HealthChecker",<br> "Amazon-Route53-Health-Check-Service"<br> ],<br> "configs": {<br> "cloudwatch_outputs_enabled": true,<br> "filters": "",<br> "inputs": "",<br> "outputs": ""<br> },<br> "drop_namespaces": [<br> "kube-system",<br> "opentelemetry-operator-system",<br> "adot",<br> "cert-manager",<br> "opentelemetry.*",<br> "meta.*"<br> ],<br> "enabled": true,<br> "fluent_bit_name": "",<br> "image_pull_secrets": [],<br> "kube_namespaces": [<br> "kube.*",<br> "meta.*",<br> "adot.*",<br> "devops.*",<br> "cert-manager.*",<br> "git.*",<br> "opentelemetry.*",<br> "stakater.*",<br> "renovate.*"<br> ],<br> "log_filters": [<br> "kube-probe",<br> "health",<br> "prometheus",<br> "liveness"<br> ],<br> "log_group_name": "",<br> "log_retention_days": 90,<br> "system_log_group_name": "",<br> "values_yaml": ""<br>}</pre> | no |
| <a name="input_manage_aws_auth"></a> [manage\_aws\_auth](#input\_manage\_aws\_auth) | n/a | `bool` | `true` | no |
| <a name="input_map_roles"></a> [map\_roles](#input\_map\_roles) | Additional IAM roles to add to the aws-auth configmap. | <pre>list(object({<br> rolearn = string<br> username = string<br> groups = list(string)<br> }))</pre> | `[]` | no |
2 changes: 1 addition & 1 deletion alb-ingress-controller.tf
@@ -1,7 +1,7 @@
module "alb-ingress-controller" {
source = "./modules/aws-load-balancer-controller"

- count = var.create ? 1 : 0
+ count = var.create && var.enable_alb_ingress_controller ? 1 : 0

account_id = local.account_id
region = local.region
22 changes: 22 additions & 0 deletions examples/eks-with-flagger/0-setup.tf
@@ -0,0 +1,22 @@
provider "aws" {
  region = "eu-central-1"
}

provider "helm" {
  kubernetes {
    host                   = module.this.cluster_host
    cluster_ca_certificate = module.this.cluster_certificate
    token                  = module.this.cluster_token
  }
}

# Prepare for test
data "aws_availability_zones" "available" {}
data "aws_vpcs" "ids" {
  tags = {
    Name = "default"
  }
}
data "aws_subnet_ids" "subnets" {
  vpc_id = data.aws_vpcs.ids.ids[0]
}
81 changes: 81 additions & 0 deletions examples/eks-with-flagger/1-example.tf
@@ -0,0 +1,81 @@
module "this" {
  source = "../.."

  cluster_name = "test-cluster-with-flagger"

  vpc = {
    link = {
      id                 = data.aws_vpcs.ids.ids[0]
      private_subnet_ids = data.aws_subnet_ids.subnets.ids
    }
  }

  node_groups = {
    "default" : {
      "desired_size" : 1,
      "max_capacity" : 1,
      "max_size" : 1,
      "min_size" : 1
    }

  }
  node_groups_default = {
    "capacity_type" : "SPOT",
    "instance_types" : ["t3.medium"]
  }

  alarms = {
    enabled   = false
    sns_topic = ""
  }
  enable_ebs_driver             = false
  enable_external_secrets       = false
  create_cert_manager           = false
  enable_alb_ingress_controller = false
  enable_node_problem_detector  = false
  metrics_exporter              = "disabled"
  fluent_bit_configs = {
    enabled = false
  }

  nginx_ingress_controller_config = {
    enabled          = true
    name             = "nginx"
    create_namespace = true
    namespace        = "ingress-nginx"
    replicacount     = 1
    metrics_enabled  = true
  }

  external_dns = {
    enabled = true
    configs = {
      configs = { sources = ["service", "ingress"] }
    }
  }

  flagger = {
    enabled           = true
    namespace         = "ingress-nginx"
    enable_loadtester = true
    configs = {
      meshProvider = "nginx"
      prometheus = {
        install = true
      }
    }
  }
}

resource "helm_release" "http_echo" {
  name       = "http-echo"
  repository = "https://dasmeta.github.io/helm"
  chart      = "base"
  namespace  = "default"
  version    = "0.2.7"
  wait       = true

  values = [file("${path.module}/http-echo-canary-eks.yaml")]

  depends_on = [module.this]
}
37 changes: 37 additions & 0 deletions examples/eks-with-flagger/README.md
@@ -0,0 +1,37 @@
# eks-with-flagger

<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
## Requirements

No requirements.

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | 4.67.0 |
| <a name="provider_helm"></a> [helm](#provider\_helm) | 2.16.1 |

## Modules

| Name | Source | Version |
|------|--------|---------|
| <a name="module_this"></a> [this](#module\_this) | ../.. | n/a |

## Resources

| Name | Type |
|------|------|
| [helm_release.http_echo](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release) | resource |
| [aws_availability_zones.available](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/availability_zones) | data source |
| [aws_subnet_ids.subnets](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/subnet_ids) | data source |
| [aws_vpcs.ids](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/vpcs) | data source |

## Inputs

No inputs.

## Outputs

No outputs.
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
82 changes: 82 additions & 0 deletions examples/eks-with-flagger/http-echo-canary-eks.yaml
@@ -0,0 +1,82 @@
image:
  repository: mendhak/http-https-echo
  tag: 34

containerPort: 8080

service:
  enabled: true
  type: ClusterIP

autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 2
  targetCPUUtilizationPercentage: 99

readinessProbe:
  initialDelaySeconds: 5
  failureThreshold: 1
  httpGet:
    path: /health
    port: http
livenessProbe:
  initialDelaySeconds: 5
  failureThreshold: 3
  httpGet:
    path: /health
    port: http
resources:
  requests:
    cpu: 6m

ingress:
  enabled: true
  class: nginx
  hosts:
    - host: http-echo.devops.dasmeta.com
      paths:
        - path: "/ping"
          backend:
            serviceName: http-echo
            servicePort: 80

rolloutStrategy:
  enabled: true
  operator: flagger
  configs:
    progressDeadlineSeconds: 60 # the maximum time in seconds for the canary deployment to make progress before it is rolled back (default 600s)
    canaryReadyThreshold: 51 # minimum percentage of canary pods that must be ready before considering canary ready for traffic shifting (default 100)
    primaryReadyThreshold: 51 # minimum percentage of primary pods that must be ready before considering primary ready for traffic shifting (default 100)
    interval: 11s # schedule interval (default 60s)
    threshold: 11 # max number of failed metric checks before rollback (default 10)
    maxWeight: 31 # max traffic percentage (0-100) routed to canary (default 30)
    stepWeight: 11 # canary increment step percentage (0-100) (default 10)
    # min and max replica counts for the primary hpa; they default to the main app hpa values. The main app hpa values are also used for the canary deploy hpa, so use these options to set custom values for the primary hpa
    primaryScalerMinReplicas: 2
    primaryScalerMaxReplicas: 5
    metrics: # metric template configs used to decide whether the canary deploy handles requests normally; the `request-success-rate` and `request-duration` templates are available by default, and you can create custom metric templates
      - name: request-success-rate
        # minimum request success rate (non-5xx responses) percentage (0-100)
        thresholdRange:
          min: 99
        interval: 1m
      - name: request-duration
        # maximum request duration P99, in milliseconds
        thresholdRange:
          max: 500
        interval: 1m

    webhooks: # webhooks can be used for load testing before traffic is switched to canaries (using the `pre-rollout` type) and also for generating traffic
      - name: acceptance-test
        type: pre-rollout
        url: http://flagger-loadtester.ingress-nginx/
        timeout: 30s
        metadata:
          type: bash
          cmd: "curl -sd 'test' http://http-echo-canary/ping | grep ping"
      - name: load-test
        url: http://flagger-loadtester.ingress-nginx/
        timeout: 5s
        metadata:
          cmd: "hey -z 1m -q 3 -c 1 http://http-echo.devops.dasmeta.com/ping"
17 changes: 16 additions & 1 deletion main.tf
@@ -255,7 +255,7 @@ module "metrics-server" {
module "external-secrets" {
source = "./modules/external-secrets"

- count = var.create ? 1 : 0
+ count = var.create && var.enable_external_secrets ? 1 : 0

namespace = var.external_secrets_namespace

@@ -388,3 +388,18 @@ module "external-dns" {
module.eks-cluster
]
}

module "flagger" {
  count = var.create && var.flagger.enabled ? 1 : 0

  source                  = "./modules/flagger"
  namespace               = var.flagger.namespace
  configs                 = var.flagger.configs
  metric_template_configs = var.flagger.metric_template_configs
  enable_metric_template  = var.flagger.enable_metric_template
  enable_loadtester       = var.flagger.enable_loadtester

  depends_on = [
    module.eks-cluster
  ]
}
65 changes: 65 additions & 0 deletions modules/flagger/README.md
@@ -0,0 +1,65 @@
# Terraform module to deploy the flagger operator, enabling custom rollout strategies such as canary and blue-green, and to create custom flagger metric templates
## For more info, see https://flagger.app and https://artifacthub.io/packages/helm/flagger/flagger


## Example
```terraform
module "flagger" {
  source  = "dasmeta/eks/aws//modules/flagger"
  version = "2.18.0"

  configs = {
    meshProvider = "nginx"
    prometheus = {
      install = true # Prometheus is most likely already installed; in that case set this to false and use the `metricsServer` option to point to the Prometheus endpoint
    }
  }
}
```
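
The comment in the example above mentions reusing an existing Prometheus. A minimal sketch of that variant follows, assuming `configs` is passed through to the flagger helm chart values as the inputs table describes. The `metricsServer` URL is a hypothetical in-cluster address, not something this module defines, and has to be replaced with the real endpoint.

```terraform
module "flagger" {
  source  = "dasmeta/eks/aws//modules/flagger"
  version = "2.18.0"

  configs = {
    meshProvider = "nginx"

    # Assumption: Prometheus already runs in the cluster and is reachable at this
    # (hypothetical) service URL; replace it with your own endpoint.
    metricsServer = "http://prometheus-server.monitoring:80"

    prometheus = {
      install = false # do not install the Prometheus bundled with the flagger chart
    }
  }
}
```

Flagger's metric checks query this endpoint, so the Prometheus instance it points to needs to scrape the ingress controller metrics for the canary analysis to have data to evaluate.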
<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.3.0 |
| <a name="requirement_helm"></a> [helm](#requirement\_helm) | >= 2.0 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_helm"></a> [helm](#provider\_helm) | >= 2.0 |

## Modules

No modules.

## Resources

| Name | Type |
|------|------|
| [helm_release.flagger_loadtester](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release) | resource |
| [helm_release.flagger_metric_template](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release) | resource |
| [helm_release.this](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release) | resource |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_atomic"></a> [atomic](#input\_atomic) | Whether to use helm deploy with the --atomic flag | `bool` | `false` | no |
| <a name="input_chart_version"></a> [chart\_version](#input\_chart\_version) | The app chart version | `string` | `"1.38.0"` | no |
| <a name="input_configs"></a> [configs](#input\_configs) | Configurations to pass and override default ones. Check the helm chart available configs here: https://artifacthub.io/packages/helm/flagger/flagger?modal=values | `any` | `{}` | no |
| <a name="input_create_namespace"></a> [create\_namespace](#input\_create\_namespace) | Create namespace if requested | `bool` | `true` | no |
| <a name="input_enable_loadtester"></a> [enable\_loadtester](#input\_enable\_loadtester) | Whether to install the loadtester helm chart | `bool` | `false` | no |
| <a name="input_enable_metric_template"></a> [enable\_metric\_template](#input\_enable\_metric\_template) | Whether to install the flagger-metric-template helm chart | `bool` | `false` | no |
| <a name="input_metric_template_chart_version"></a> [metric\_template\_chart\_version](#input\_metric\_template\_chart\_version) | The metric template chart version | `string` | `"0.1.0"` | no |
| <a name="input_metric_template_configs"></a> [metric\_template\_configs](#input\_metric\_template\_configs) | Configurations to pass and override default ones. Check the helm chart available configs here: https://github.com/dasmeta/helm/tree/flagger-metric-template-0.1.0/charts/flagger-metric-template | `any` | `{}` | no |
| <a name="input_namespace"></a> [namespace](#input\_namespace) | The namespace to install the main helm chart in. | `string` | `"ingress-nginx"` | no |
| <a name="input_wait"></a> [wait](#input\_wait) | Whether to use helm deploy with the --wait flag | `bool` | `true` | no |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_helm_metadata"></a> [helm\_metadata](#output\_helm\_metadata) | Helm release metadata |
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
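
As a rough illustration of how the optional inputs listed above might be combined, the sketch below enables the loadtester and metric template releases next to the operator. All input names come from the table. The values chosen are assumptions for illustration only, in particular `create_namespace = false`, which assumes the ingress controller has already created the `ingress-nginx` namespace.

```terraform
module "flagger" {
  source  = "dasmeta/eks/aws//modules/flagger"
  version = "2.18.0"

  # Install into the namespace of the ingress/mesh provider.
  namespace        = "ingress-nginx"
  create_namespace = false # assumption: the namespace already exists

  # Helm deploy behaviour.
  atomic = true
  wait   = true

  # Optional companion releases.
  enable_loadtester      = true
  enable_metric_template = true

  configs = {
    meshProvider = "nginx"
    prometheus = {
      install = true
    }
  }
}
```

With `enable_metric_template` set, `metric_template_configs` can carry the chart-specific options; they are left at the default `{}` here because the available keys are documented in the linked chart rather than in this README.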
9 changes: 9 additions & 0 deletions modules/flagger/examples/basic/0-setup.tf
@@ -0,0 +1,9 @@
terraform {
  required_version = ">= 1.3.0"

  required_providers {
    helm = ">= 2.0"
  }
}

provider "helm" {}
10 changes: 10 additions & 0 deletions modules/flagger/examples/basic/1-example.tf
@@ -0,0 +1,10 @@
module "this" {
  source = "../.."

  configs = {
    meshProvider = "nginx"
    prometheus = {
      install = true # Prometheus is most likely already installed; in that case set this to false and use the `metricsServer` option to point to the Prometheus endpoint
    }
  }
}