Replace Fluentbit with ADOT Container Logs Collector #261

Closed
wants to merge 2 commits into from
18 changes: 10 additions & 8 deletions docs/eks/logs.md
@@ -20,18 +20,19 @@ to enable Amazon CloudWatch as a data source. Make sure to provide permissions.
Amazon CloudWatch data source has already been setup for you.

All logs are delivered in the following CloudWatch Log groups naming pattern:
`/aws/eks/observability-accelerator/{cluster-name}/{namespace}`. Log streams
follow `{container-name}.{pod-name}`. In Grafana, querying and analyzing logs
`/aws/eks/observability-accelerator/{cluster-name}/workloads`. Log streams
Member

Is there a way to keep logs separated per namespace? That was one of our requests before, and it allows different patterns per log group (subscriptions, log class, ...).

Author

I haven't been able to find an elegant way to achieve this, as the ADOT operator add-on config schema currently supports only the `$CLUSTER_NAME` and `$NODE_NAME` configuration variables for the CW logs exporter (see "Container Logs configurable values").

follow the naming pattern `{node-name}`. In Grafana, querying and analyzing logs
is done with [CloudWatch Logs Insights](https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AnalyzingLogData.html)

### Example - ADOT collector logs

Select one or many log groups and run the following query. The example below,
queries AWS Distro for OpenTelemetry (ADOT) logs
Select workloads log group for the cluster and run the following query. The example below,
queries container logs from `kube-system` namespace.

```console
fields @timestamp, log
| order @timestamp desc
fields @timestamp, @message, @logStream, @log, resource.k8s.namespace.name
| filter resource.k8s.namespace.name = "kube-system"
| sort @timestamp desc
| limit 100
```

@@ -49,8 +50,9 @@ In the example below, we use the following query to graph the number of metrics
collected by the ADOT collector

```console
fields @timestamp, log
| parse log /"#metrics": (?<metrics_count>\d+)}/
fields @timestamp, attributes.log
| filter resource.k8s.namespace.name = "adot-collector-kubeprometheus"
| parse attributes.log /\"metrics\": (?<metrics_count>\d+?)(,|\})/
| stats avg(metrics_count) by bin(5m)
| limit 100
```
6 changes: 3 additions & 3 deletions modules/eks-monitoring/README.md
@@ -3,7 +3,7 @@
This module provides EKS cluster monitoring with the following resources:

- AWS Distro For OpenTelemetry Operator and Collector for Metrics and Traces
- Logs with [AWS for FluentBit](https://github.com/aws/aws-for-fluent-bit)
- Logs with [ADOT Container Logs Collector](https://aws-otel.github.io/docs/getting-started/adot-eks-add-on/config-container-logs)
- Installs Grafana Operator to add AWS data sources and create Grafana Dashboards to Amazon Managed Grafana.
- Installs FluxCD to perform GitOps sync of a Git Repo to EKS Cluster. We will use this later for creating Grafana Dashboards and AWS datasources to Amazon Managed Grafana.
- Installs External Secrets Operator to retrieve and Sync the Grafana API keys from AWS SSM Parameter Store.
@@ -37,8 +37,8 @@ See examples using this Terraform modules in the **Amazon EKS** section of [this

| Name | Source | Version |
|------|--------|---------|
| <a name="module_adot_logs"></a> [adot\_logs](#module\_adot\_logs) | ./add-ons/adot-logs | n/a |
| <a name="module_external_secrets"></a> [external\_secrets](#module\_external\_secrets) | ./add-ons/external-secrets | n/a |
| <a name="module_fluentbit_logs"></a> [fluentbit\_logs](#module\_fluentbit\_logs) | ./add-ons/aws-for-fluentbit | n/a |
| <a name="module_helm_addon"></a> [helm\_addon](#module\_helm\_addon) | github.com/aws-ia/terraform-aws-eks-blueprints//modules/kubernetes-addons/helm-addon | v4.32.1 |
| <a name="module_istio_monitoring"></a> [istio\_monitoring](#module\_istio\_monitoring) | ./patterns/istio | n/a |
| <a name="module_java_monitoring"></a> [java\_monitoring](#module\_java\_monitoring) | ./patterns/java | n/a |
@@ -90,7 +90,7 @@ See examples using this Terraform modules in the **Amazon EKS** section of [this
| <a name="input_enable_istio"></a> [enable\_istio](#input\_enable\_istio) | Enable ISTIO workloads monitoring, alerting and default dashboards | `bool` | `false` | no |
| <a name="input_enable_java"></a> [enable\_java](#input\_enable\_java) | Enable Java workloads monitoring, alerting and default dashboards | `bool` | `false` | no |
| <a name="input_enable_kube_state_metrics"></a> [enable\_kube\_state\_metrics](#input\_enable\_kube\_state\_metrics) | Enables or disables Kube State metrics exporter. Disabling this might affect some data in the dashboards | `bool` | `true` | no |
| <a name="input_enable_logs"></a> [enable\_logs](#input\_enable\_logs) | Using AWS For FluentBit to collect cluster and application logs to Amazon CloudWatch | `bool` | `true` | no |
| <a name="input_enable_logs"></a> [enable\_logs](#input\_enable\_logs) | Using ADOT container logs collector to collect cluster and application logs to Amazon CloudWatch | `bool` | `true` | no |
| <a name="input_enable_managed_prometheus"></a> [enable\_managed\_prometheus](#input\_enable\_managed\_prometheus) | Creates a new Amazon Managed Service for Prometheus Workspace | `bool` | `true` | no |
| <a name="input_enable_nginx"></a> [enable\_nginx](#input\_enable\_nginx) | Enable NGINX workloads monitoring, alerting and default dashboards | `bool` | `false` | no |
| <a name="input_enable_node_exporter"></a> [enable\_node\_exporter](#input\_enable\_node\_exporter) | Enables or disables Node exporter. Disabling this might affect some data in the dashboards | `bool` | `true` | no |
53 changes: 53 additions & 0 deletions modules/eks-monitoring/add-ons/adot-logs/README.md
@@ -0,0 +1,53 @@
# AWS Distro for OpenTelemetry (ADOT) Container Logs Collector

[AWS Distro for OpenTelemetry (ADOT)](https://aws-otel.github.io/) is a secure,
production-ready, AWS-supported distribution of the OpenTelemetry project.
Part of the Cloud Native Computing Foundation, OpenTelemetry provides open
source APIs, libraries, and agents to collect distributed traces and metrics
for application monitoring.

This module generates the
[ADOT Container Logs Collector](https://aws-otel.github.io/docs/getting-started/adot-eks-add-on/config-container-logs) configuration for Amazon EKS ADOT add-on.
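
As a rough illustration of how the module might be called, the sketch below overrides the default retention period. The module label, the `local.addon_context` reference, and the 30-day value are assumptions made for this example and are not part of the module itself; `addon_context` has to be an object matching the type listed under Inputs.

```hcl
# Hypothetical caller, shown only as a sketch.
module "adot_logs" {
  source = "./add-ons/adot-logs"

  addon_config = {
    enable_logs = true
    logs_config = {
      # Assumed value; the module default is 90 days.
      cw_log_retention_days = 30
    }
  }

  # Assumed local: an object with every attribute required by addon_context.
  addon_context = local.addon_context
}
```

With `enable_logs = false`, the log group, IAM policy, and IRSA role are skipped and the output is an empty object.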

<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.1.0 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 3.72 |
| <a name="requirement_kubernetes"></a> [kubernetes](#requirement\_kubernetes) | >= 2.10 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.72 |

## Modules

| Name | Source | Version |
|------|--------|---------|
| <a name="module_adot_logs_iam_role"></a> [adot\_logs\_iam\_role](#module\_adot\_logs\_iam\_role) | terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks | 5.33.0 |

## Resources

| Name | Type |
|------|------|
| [aws_cloudwatch_log_group.adot_log_group](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_log_group) | resource |
| [aws_iam_policy.adot_logs_iam_policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_iam_policy_document.adot_logs_iam_policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_addon_config"></a> [addon\_config](#input\_addon\_config) | ADOT Container Logs Collector config | <pre>object({<br> enable_logs = bool<br> logs_config = object({<br> cw_log_retention_days = number<br> })<br> })</pre> | <pre>{<br> "enable_logs": true,<br> "logs_config": {<br> "cw_log_retention_days": 90<br> }<br>}</pre> | no |
| <a name="input_addon_context"></a> [addon\_context](#input\_addon\_context) | Input configuration for the addon | <pre>object({<br> aws_caller_identity_account_id = string<br> aws_caller_identity_arn = string<br> aws_eks_cluster_endpoint = string<br> aws_partition_id = string<br> aws_region_name = string<br> eks_cluster_id = string<br> eks_oidc_issuer_url = string<br> eks_oidc_provider_arn = string<br> irsa_iam_role_path = string<br> irsa_iam_permissions_boundary = string<br> tags = map(string)<br> })</pre> | n/a | yes |

## Outputs

| Name | Description |
|------|-------------|
| <a name="output_adot_logs_collector_config"></a> [adot\_logs\_collector\_config](#output\_adot\_logs\_collector\_config) | ADOT Container Logs Collector configuration |
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
33 changes: 33 additions & 0 deletions modules/eks-monitoring/add-ons/adot-logs/data.tf
@@ -0,0 +1,33 @@
data "aws_iam_policy_document" "adot_logs_iam_policy" {
  statement {
    sid       = "PutLogEvents"
    effect    = "Allow"
    resources = ["arn:${var.addon_context.aws_partition_id}:logs:${var.addon_context.aws_region_name}:${var.addon_context.aws_caller_identity_account_id}:log-group:/aws/eks/observability-accelerator/${var.addon_context.eks_cluster_id}/workloads:log-stream:*"]
    actions   = ["logs:PutLogEvents"]
  }

  statement {
    sid       = "DescribeLogGroups"
    effect    = "Allow"
    resources = ["*"]

    actions = [
      "logs:DescribeLogGroups",
    ]
  }

  statement {
    sid    = "LogStreams"
    effect = "Allow"
    resources = [
      "arn:${var.addon_context.aws_partition_id}:logs:${var.addon_context.aws_region_name}:${var.addon_context.aws_caller_identity_account_id}:log-group:/aws/eks/observability-accelerator/${var.addon_context.eks_cluster_id}/workloads",
      "arn:${var.addon_context.aws_partition_id}:logs:${var.addon_context.aws_region_name}:${var.addon_context.aws_caller_identity_account_id}:log-group:/aws/eks/observability-accelerator/${var.addon_context.eks_cluster_id}/workloads:log-stream:*"
    ]

    actions = [
      "logs:CreateLogStream",
      "logs:DescribeLogStreams",
    ]
  }

}
42 changes: 42 additions & 0 deletions modules/eks-monitoring/add-ons/adot-logs/main.tf
@@ -0,0 +1,42 @@
resource "aws_cloudwatch_log_group" "adot_log_group" {
  count = var.addon_config.enable_logs ? 1 : 0

  name = "/aws/eks/observability-accelerator/${var.addon_context.eks_cluster_id}/workloads"

  retention_in_days = var.addon_config.logs_config.cw_log_retention_days

  tags = var.addon_context.tags
}

resource "aws_iam_policy" "adot_logs_iam_policy" {
  count = var.addon_config.enable_logs ? 1 : 0

  name        = "${substr(var.addon_context.eks_cluster_id, 0, 30)}-${var.addon_context.aws_region_name}-adot-logs-policy"
  path        = "/"
  description = "IAM Policy for ADOT Container Logs Collector"

  policy = data.aws_iam_policy_document.adot_logs_iam_policy.json
  tags   = var.addon_context.tags
}

module "adot_logs_iam_role" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "5.33.0"

  count = var.addon_config.enable_logs ? 1 : 0

  role_name = "${substr(var.addon_context.eks_cluster_id, 0, 30)}-${var.addon_context.aws_region_name}-adot-logs-irsa"

  role_policy_arns = {
    policy = resource.aws_iam_policy.adot_logs_iam_policy[0].arn
  }

  oidc_providers = {
    main = {
      provider_arn               = var.addon_context.eks_oidc_provider_arn
      namespace_service_accounts = ["opentelemetry-operator-system:adot-col-container-logs"]
    }
  }

  tags = var.addon_context.tags
}
37 changes: 37 additions & 0 deletions modules/eks-monitoring/add-ons/adot-logs/outputs.tf
@@ -0,0 +1,37 @@
output "adot_logs_collector_config" {
  description = "ADOT Container Logs Collector configuration"
  value = jsondecode(length(resource.aws_cloudwatch_log_group.adot_log_group) > 0 ? jsonencode({
    resources = {
      limits = {
        cpu    = "1000m"
        memory = "750Mi"
      }

      requests = {
        cpu    = "300m"
        memory = "512Mi"
      }
    }

    serviceAccount = {
      annotations = {
        "eks.amazonaws.com/role-arn" = module.adot_logs_iam_role[0].iam_role_arn
      }
    }

    exporters = {
      awscloudwatchlogs = {
        log_group_name  = "/aws/eks/observability-accelerator/$CLUSTER_NAME/workloads"
        log_stream_name = "$NODE_NAME"
      }
    }

    pipelines = {
      logs = {
        cloudwatchLogs = {
          enabled = true
        }
      }
    }
  }) : "{}")
}
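
For context, here is a sketch of how this output might be handed to the ADOT add-on through `configuration_values`, which `aws_eks_addon` expects as an encoded string. The `collector.containerLogs` nesting is an assumption about the add-on's configuration schema and is not shown anywhere in this diff; the local name is hypothetical as well.

```hcl
# Hypothetical wiring in the parent module; only the output name comes from this diff.
locals {
  adot_configuration_values = jsonencode({
    collector = {
      # Assumed key: verify it against the add-on's published configuration schema.
      containerLogs = module.adot_logs.adot_logs_collector_config
    }
  })
}
```

The resulting string could then be passed as `configuration_values` in the `addon_config` of the `adot-operator` add-on. When logs are disabled, the output decodes to an empty object, so `containerLogs` would be `{}`.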
32 changes: 32 additions & 0 deletions modules/eks-monitoring/add-ons/adot-logs/variables.tf
@@ -0,0 +1,32 @@
variable "addon_context" {
  description = "Input configuration for the addon"
  type = object({
    aws_caller_identity_account_id = string
    aws_caller_identity_arn        = string
    aws_eks_cluster_endpoint       = string
    aws_partition_id               = string
    aws_region_name                = string
    eks_cluster_id                 = string
    eks_oidc_issuer_url            = string
    eks_oidc_provider_arn          = string
    irsa_iam_role_path             = string
    irsa_iam_permissions_boundary  = string
    tags                           = map(string)
  })
}

variable "addon_config" {
  description = "ADOT Container Logs Collector config"
  type = object({
    enable_logs = bool
    logs_config = object({
      cw_log_retention_days = number
    })
  })
  default = {
    enable_logs = true
    logs_config = {
      cw_log_retention_days = 90
    }
  }
}
14 changes: 14 additions & 0 deletions modules/eks-monitoring/add-ons/adot-logs/versions.tf
@@ -0,0 +1,14 @@
terraform {
  required_version = ">= 1.1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 3.72"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = ">= 2.10"
    }
  }
}
2 changes: 2 additions & 0 deletions modules/eks-monitoring/add-ons/adot-operator/main.tf
@@ -34,6 +34,8 @@ resource "aws_eks_addon" "adot" {
service_account_role_arn = try(var.addon_config.service_account_role_arn, null)
preserve = try(var.addon_config.preserve, true)

configuration_values = try(var.addon_config.configuration_values, null)

tags = merge(
var.addon_context.tags,
try(var.addon_config.tags, {}),
55 changes: 0 additions & 55 deletions modules/eks-monitoring/add-ons/aws-for-fluentbit/README.md

This file was deleted.

23 changes: 0 additions & 23 deletions modules/eks-monitoring/add-ons/aws-for-fluentbit/data.tf

This file was deleted.
