Skip to content

Commit

Permalink
Merge pull request #19 from dasmeta/feat/DMVP-4666-dashboard-creation…
Browse files Browse the repository at this point in the history
…-ability

Feat/dmvp-4666 dashboard creation ability
  • Loading branch information
mrdntgrn authored Aug 6, 2024
2 parents afb8a94 + 08ebdc9 commit 04fd258
Show file tree
Hide file tree
Showing 111 changed files with 3,797 additions and 5 deletions.
3 changes: 3 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -32,3 +32,6 @@ override.tf.json
# Ignore CLI configuration files
.terraformrc
terraform.rc

.terraform.lock.hcl
/node_modules
59 changes: 55 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,12 +1,55 @@
# terraform-onpremise-grafana
https://registry.terraform.io/modules/dasmeta/grafana/onpremise/latest

This module is created to manage OnPremise Grafana stack with Terraform.
At this moment we support managing
- Grafana Dashboard with `dashboard` submodule
- Grafana Alerts with `alerts` submodule
- Grafana Contact Points with `contact-points` submodule
- Grafana Notification Policies with `notifications` submodule

More parts are coming soon.

## example for dashboard
```hcl
module "grafana_monitoring" {
source = "dasmeta/grafana/onpremise"
version = "1.2.0"
name = "Test-dashboard"
application_dashboard = {
rows : [
{ type : "block/sla" },
{ type : "block/ingress" },
{ type : "block/service", name : "service-name-1", host : "example.com" },
{ type : "block/service", name : "service-name-2" },
{ type : "block/service", name : "service-name-3" }
]
data_source = {
uid : "00000"
}
variables = [
{
"name" : "namespace",
"options" : [
{
"selected" : true,
"value" : "prod"
},
{
"value" : "stage"
},
{
"value" : "dev"
}
],
}
]
}
}
```

## Example for Alert Rules
```
module "grafana_alerts" {
Expand Down Expand Up @@ -161,11 +204,14 @@ module "grafana_alerts" {
```

## Usage
Check `modules/alerts/tests`, `modules/contact-points/tests` and `modules/notifications/tests` folders to see more examples.
Check `./tests`, `modules/alerts/tests`, `modules/contact-points/tests` and `modules/notifications/tests` folders to see more examples.
<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
## Requirements

No requirements.
| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.3.0 |
| <a name="requirement_grafana"></a> [grafana](#requirement\_grafana) | >= 3.7.0 |

## Providers

Expand All @@ -176,6 +222,7 @@ No providers.
| Name | Source | Version |
|------|--------|---------|
| <a name="module_alerts"></a> [alerts](#module\_alerts) | ./modules/alerts | n/a |
| <a name="module_application_dashboard"></a> [application\_dashboard](#module\_application\_dashboard) | ./modules/dashboard/ | n/a |
| <a name="module_contact_points"></a> [contact\_points](#module\_contact\_points) | ./modules/contact-points | n/a |
| <a name="module_notifications"></a> [notifications](#module\_notifications) | ./modules/notifications | n/a |

Expand All @@ -188,12 +235,16 @@ No resources.
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_alert_interval_seconds"></a> [alert\_interval\_seconds](#input\_alert\_interval\_seconds) | The interval, in seconds, at which all rules in the group are evaluated. If a group contains many rules, the rules are evaluated sequentially. | `number` | `10` | no |
| <a name="input_alert_rules"></a> [alert\_rules](#input\_alert\_rules) | This varibale describes alert folders, groups and rules. | <pre>list(object({<br> name = string # The name of the alert rule<br> no_data_state = optional(string, "NoData") # Describes what state to enter when the rule's query returns No Data<br> exec_err_state = optional(string, "Error") # Describes what state to enter when the rule's query is invalid and the rule cannot be executed<br> summary = optional(string, "") # Rule annotation as a summary<br> priority = optional(string, "P2") # Rule priority level: P2 is for non-critical alerts, P1 will be set for critical alerts<br> folder_name = optional(string, "Main Alerts") # Grafana folder name in which the rule will be created<br> datasource = string # Name of the datasource used for the alert<br> expr = optional(string, null) # Full expression for the alert<br> metric_name = optional(string, "") # Prometheus metric name which queries the data for the alert<br> metric_function = optional(string, "") # Prometheus function used with metric for queries, like rate, sum etc.<br> metric_interval = optional(string, "") # The time interval with using functions like rate<br> settings_mode = optional(string, "replaceNN") # The mode used in B block, possible values are Strict, replaceNN, dropNN<br> settings_replaceWith = optional(number, 0) # The value by which NaN results of the query will be replaced<br> filters = optional(any, {}) # Filters object to identify each service for alerting<br> function = optional(string, "mean") # One of Reduce functions which will be used in B block for alerting<br> equation = string # The equation in the math expression which compares B blocks value with a number and generates an alert if needed. Possible values: gt, lt, gte, lte, e<br> threshold = number # The value against which B blocks are compared in the math expression<br> }))</pre> | `[]` | no |
| <a name="input_alert_rules"></a> [alert\_rules](#input\_alert\_rules) | This variable describes alert folders, groups and rules. | <pre>list(object({<br> name = string # The name of the alert rule<br> no_data_state = optional(string, "NoData") # Describes what state to enter when the rule's query returns No Data<br> exec_err_state = optional(string, "Error") # Describes what state to enter when the rule's query is invalid and the rule cannot be executed<br> summary = optional(string, "") # Rule annotation as a summary<br> priority = optional(string, "P2") # Rule priority level: P2 is for non-critical alerts, P1 will be set for critical alerts<br> folder_name = optional(string, "Main Alerts") # Grafana folder name in which the rule will be created<br> datasource = string # Name of the datasource used for the alert<br> expr = optional(string, null) # Full expression for the alert<br> metric_name = optional(string, "") # Prometheus metric name which queries the data for the alert<br> metric_function = optional(string, "") # Prometheus function used with metric for queries, like rate, sum etc.<br> metric_interval = optional(string, "") # The time interval with using functions like rate<br> settings_mode = optional(string, "replaceNN") # The mode used in B block, possible values are Strict, replaceNN, dropNN<br> settings_replaceWith = optional(number, 0) # The value by which NaN results of the query will be replaced<br> filters = optional(any, {}) # Filters object to identify each service for alerting<br> function = optional(string, "mean") # One of Reduce functions which will be used in B block for alerting<br> equation = string # The equation in the math expression which compares B blocks value with a number and generates an alert if needed. Possible values: gt, lt, gte, lte, e<br> threshold = number # The value against which B blocks are compared in the math expression<br> }))</pre> | `[]` | no |
| <a name="input_application_dashboard"></a> [application\_dashboard](#input\_application\_dashboard) | Dashboard for monitoring applications | <pre>object({<br> rows = optional(any, [])<br> data_source = object({ # global/default datasource, TODO: create datasource inside the module<br> uid = string<br> type = optional(string, "prometheus")<br> })<br> variables = optional(list(object({ # Allows to define variables to be used in dashboard<br> name = string<br> type = optional(string, "custom")<br> hide = optional(number, 0)<br> includeAll = optional(bool, false)<br> multi = optional(bool, false)<br> query = optional(string, "")<br> queryValue = optional(string, "")<br> skipUrlSync = optional(bool, false)<br> options = optional(list(object({<br> selected = optional(bool, false)<br> value = string<br> text = optional(string, null)<br> })), [])<br> })), [])<br> })</pre> | <pre>{<br> "data_source": null,<br> "rows": [],<br> "variables": []<br>}</pre> | no |
| <a name="input_name"></a> [name](#input\_name) | Dashboard name | `string` | n/a | yes |
| <a name="input_notifications"></a> [notifications](#input\_notifications) | Represents the configuration options for Grafana notification policies. | <pre>object({<br> contact_point = optional(string, "Slack") # The default contact point to route all unmatched notifications to.<br> group_by = optional(list(string), ["grafana_folder", "alertname"]) # A list of alert labels to group alerts into notifications by.<br> group_interval = optional(string, "5m") # Minimum time interval between two notifications for the same group.<br> repeat_interval = optional(string, "4h") # Minimum time interval for re-sending a notification if an alert is still firing.<br><br> policy = optional(object({<br> contact_point = optional(string, null) # The contact point to route notifications that match this rule to.<br> continue = optional(bool, false) # Whether to continue matching subsequent rules if an alert matches the current rule. Otherwise, the rule will be 'consumed' by the first policy to match it.<br> group_by = optional(list(string), [])<br> mute_timings = optional(list(string), []) # A list of mute timing names to apply to alerts that match this policy.<br><br> matcher = optional(object({<br> label = optional(string, "priority") # The name of the label to match against.<br> match = optional(string, "=") # The operator to apply when matching values of the given label. Allowed operators are = for equality, != for negated equality, =~ for regex equality, and !~ for negated regex equality.<br> value = optional(string, "P1") # The label value to match against.<br> }))<br> }))<br> })</pre> | `{}` | no |
| <a name="input_opsgenie_endpoints"></a> [opsgenie\_endpoints](#input\_opsgenie\_endpoints) | OpsGenie contact points list. | <pre>list(object({<br> name = string # The name of the contact point.<br> api_key = string # The OpsGenie API key to use.<br> auto_close = optional(bool, false) # Whether to auto-close alerts in OpsGenie when they resolve in the Alertmanager.<br> message = optional(string, "") # The templated content of the message.<br> api_url = optional(string, "https://api.opsgenie.com/v2/alerts") # Allows customization of the OpsGenie API URL.<br> disable_resolve_message = optional(bool, false) # Whether to disable sending resolve messages.<br> }))</pre> | `[]` | no |
| <a name="input_slack_endpoints"></a> [slack\_endpoints](#input\_slack\_endpoints) | Slack contact points list. | <pre>list(object({<br> name = string # The name of the contact point.<br> endpoint_url = optional(string, "https://slack.com/api/chat.postMessage") # Use this to override the Slack API endpoint URL to send requests to.<br> icon_emoji = optional(string, "") # The name of a Slack workspace emoji to use as the bot icon.<br> icon_url = optional(string, "") # A URL of an image to use as the bot icon.<br> recipient = optional(string, null) # Channel, private group, or IM channel (can be an encoded ID or a name) to send messages to.<br> text = optional(string, "") # Templated content of the message.<br> title = optional(string, "") # Templated title of the message.<br> token = optional(string, "") # A Slack API token,for sending messages directly without the webhook method.<br> webhook_url = optional(string, "") # A Slack webhook URL,for sending messages via the webhook method.<br> username = optional(string, "") # Username for the bot to use.<br> disable_resolve_message = optional(bool, false) # Whether to disable sending resolve messages.<br> }))</pre> | `[]` | no |

## Outputs

No outputs.
| Name | Description |
|------|-------------|
| <a name="output_data"></a> [data](#output\_data) | n/a |
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
10 changes: 10 additions & 0 deletions dashboard.tf
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
module "application_dashboard" {
source = "./modules/dashboard/"

count = length(var.application_dashboard) > 0 ? 1 : 0

name = var.name
rows = var.application_dashboard.rows
data_source = var.application_dashboard.data_source
variables = var.application_dashboard.variables
}
1 change: 1 addition & 0 deletions modules/dashboard/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
*.tfstate
Loading

0 comments on commit 04fd258

Please sign in to comment.