Tool to manage AxonOps settings
To run the Ansible AxonOps Playbook you will need:
- Ansible >= 2.10
- Python 3.8 or above
- make
The Ansible playbook takes environment variables as input. The standard ones for a SaaS connection are:
# export your org
# This is the only mandatory variable value
export AXONOPS_ORG='example'
# Create your API token within your AxonOps console. You will need DBA level access or above to the clusters
# you will be configuring.
export AXONOPS_TOKEN='aaaabbbbccccddddeeee'
To simplify the process, the `export_tokens.sh` file has been created with all the accepted variables. Modify this file with your specific details, and then export the variables:
source ./export_tokens.sh
The `AXONOPS_TOKEN` parameter is used only for AxonOps SaaS. For AxonOps on-premises, you can use a username and password or configure it to allow anonymous login. Refer to `export_tokens.sh` for more information on configuring the Ansible playbook for AxonOps on-premises and on the accepted environment variables.
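As an illustration only (the authoritative variable names live in `export_tokens.sh`; the names below are assumptions, not a confirmed interface), an on-premises setup might export something along these lines:

```bash
# Illustrative sketch only - check export_tokens.sh for the actual variable names
export AXONOPS_ORG='example'
export AXONOPS_URL='https://axonops.example.internal'  # assumed variable for the on-prem server URL
export AXONOPS_USERNAME='admin'                        # assumed variable name
export AXONOPS_PASSWORD='changeme'                     # assumed variable name
```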
The playbooks have been tested on Ansible 2.10, but they should work on most versions.
NOTE: We prefer using GNU Make to run these playbooks, but the `make.sh` script is available if you'd rather not use Make.
The system `ansible` package should work fine for RedHat and RockyLinux >= 8:
sudo dnf -y install epel-release
sudo dnf -y install ansible make
It has also been tested on Ubuntu 22.04 and Debian Bookworm:
sudo apt update
sudo apt -y install ansible make
If you're using virtualenv, simply create a Python 3 environment and install Ansible into it:
virtualenv ~/py-axonops
source ~/py-axonops/bin/activate
pip3 install -r requirements.txt
We recommend using `pipenv` to manage the environment. After installing `pipenv`, simply run:
pipenv install
and then export the variable:
export PIPENV=true
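Putting it together, a pipenv-based run of one of the make targets described below looks like this:

```bash
pipenv install
export PIPENV=true
make endpoints AXONOPS_ORG=example   # make targets are described below
```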
The configuration is structured in folders under the `config` directory, following this format:
The `config/REPLACE_WITH_ORG_NAME` folder contains `alert_endpoints.yml`, which defines alert endpoints at the org level, since alert endpoints are defined for the entire org and shared across all your clusters. All other configurations are defined per cluster. This folder also contains the following files:
- metric_alert_rules.yml
- log_alert_rules.yml
- service_checks.yml
The configurations defined in this folder are applied to all of your clusters, so define your common alert rules and service checks here.
To define cluster-specific configurations that override the rules and configurations defined at the org level, example files are provided under `config/REPLACE_WITH_ORG_NAME/REPLACE_WITH_CLUSTER_NAME`. The rules and configurations in this folder are appended to and override the settings provided in the org folder.
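The resulting layout looks like this (the `REPLACE_WITH_...` folder names are placeholders to rename):

```
config/
└── REPLACE_WITH_ORG_NAME/
    ├── alert_endpoints.yml          # org-level, shared by all clusters
    ├── metric_alert_rules.yml       # applied to all clusters
    ├── log_alert_rules.yml          # applied to all clusters
    ├── service_checks.yml           # applied to all clusters
    └── REPLACE_WITH_CLUSTER_NAME/
        ├── metric_alert_rules.yml   # appends to / overrides the org-level rules
        ├── log_alert_rules.yml
        └── service_checks.yml
```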
Alert endpoints such as Slack, Teams, PagerDuty, and OpsGenie can be configured using this Ansible playbook.
Since alert endpoint configurations are an org-level AxonOps setting, the configuration file is placed at `./config/<org_name>/alert_endpoints.yml`.
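Purely as an illustrative sketch (the field names below are assumptions; the example `alert_endpoints.yml` defines the real schema for each integration), a Slack endpoint entry might look like:

```yaml
# Illustrative sketch only - see the example alert_endpoints.yml for the
# real schema of each integration (Slack, Teams, PagerDuty, OpsGenie).
slack:
  - name: ops-alerts
    webhook_url: https://hooks.slack.com/services/XXXX   # assumption
    present: true
```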
The metric alert rules are configured against the charts that exist on the AxonOps dashboards for each cluster. Multiple alert rules can be configured against each chart.
An example configuration for a metric alert is shown below.
- name: CPU usage per host
  dashboard: System
  chart: CPU usage per host
  operator: '>='
  critical_value: 99
  warning_value: 90
  duration: 1h
  description: Detected High CPU usage
  present: true
- `name:` is the name of the alert.
- `dashboard:` must correspond to the dashboard name in the AxonOps right-hand menu.
- `chart:` must correspond to the name of the chart within the above dashboard, in this case `CPU usage per host`. The metric query is automatically detected by specifying the chart name.
- `operator:` options are `==`, `>=`, `>`, `<=`, `<`, `!=`.
- `critical_value:` is the critical value threshold.
- `warning_value:` is the warning value threshold.
- `duration:` is the duration the warning or critical values must violate the operator rule before the alert is triggered.
- `description:` sets the description of the alert. You may want to add a description of an action to take when this alert is raised.
- `present:` `true|false` - setting it to `false` removes the alert.
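For example, to remove a previously applied alert, keep its definition and set `present` to `false`:

```yaml
- name: CPU usage per host
  dashboard: System
  chart: CPU usage per host
  operator: '>='
  critical_value: 99
  warning_value: 90
  duration: 1h
  description: Detected High CPU usage
  present: false   # removes the alert if it exists
```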
Log alerts can be defined using this Ansible playbook.
An example configuration for a log alert is shown below.
- name: TLS failed to handshake with peer
  warning_value: 50
  critical_value: 100
  duration: 5m
  content: \"Failed to handshake with peer\"
  source: "/var/log/cassandra/system.log"
  description: "Detected TLS handshake error with peer"
  level: warning,error,info
  present: true
- `name:` is the name of the alert.
- `warning_value:` is the warning value threshold, based on the count of matched logs.
- `critical_value:` is the critical value threshold, based on the count of matched logs.
- `duration:` is the duration the warning or critical values must violate the operator rule before the alert is triggered.
- `content:` is the text search. Double quotes must be escaped. Currently the following matching syntax is supported (a combined example follows this list):
  - `hello` - matches `hello`
  - `hello world` - matches `hello` or `world`
  - `"hello world"` - matches exact `hello world`
  - `+-hello` - matches excluding `hello`
  - `+-"hello world"` - matches excluding `hello world`
  - `+-hello +-world` - matches excluding `hello` or `world`
- `source:` specifies the log source. This must match one of the options available in the `Source` filter found in the Logs & Events view.
- `description:` sets the description of the alert. You may want to add a description of an action to take when this alert is raised.
- `level:` sets the event level filter, a comma-separated list with the following values: `debug`, `error`, `warning`, `info`.
- `present:` `true|false` - setting it to `false` removes the alert.
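As referenced above, a `content` value can combine exact matching with exclusion. For instance (the excluded term `retry` is purely illustrative):

```yaml
content: \"Failed to handshake with peer\" +-retry   # exact phrase match, excluding lines matching "retry"
```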
Service checks in AxonOps can be configured using this playbook. Example service check configurations can be found in `./config/REPLACE_WITH_ORG_NAME/REPLACE_WITH_CLUSTER_NAME/service_checks.yml`.
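Purely as an illustrative sketch (all field names below are assumptions, not the confirmed schema; the example `service_checks.yml` is authoritative), a TCP service check might look something like:

```yaml
# Illustrative sketch only - see the example service_checks.yml for the real schema.
# Service checks cover TCP and shell checks (see the make targets below).
- name: cassandra_native_port
  type: tcp                  # assumption
  target: localhost:9042     # assumption
  interval: 1m               # assumption
  present: true
```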
Backup schedules can be created and backup snapshots taken.
Supported backup locations are:
- local
- s3
- sftp
- azure
Remote backup paths take the form `$remote_path/cassandra/$cluster_name/$node_id`.
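For example, with `remote_path: mybucket/backups` on a cluster named `testcluster`, snapshots would land under `mybucket/backups/cassandra/testcluster/<node_id>`.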
The following options apply to all backup configurations:
| Option | Required | Type | Description |
|---|---|---|---|
| present | No | Bool | Whether a backup schedule should exist. Setting to False will remove an existing schedule. Defaults to True |
| local_retention | No | Str | How long to keep a snapshot on the local node. Defaults to 10d (10 days) |
| remote | No | Bool | Whether the backup is to a remote location. Defaults to False |
| remote_retention | No | Str | How long to keep a snapshot on the remote location. Defaults to 60d (60 days) |
| remote_type | Only if remote is True | Str | Where to send backups. One of 'local', 's3', 'sftp', 'azure'. Defaults to local |
| timeout | No | Str | Time before the backup times out. Defaults to 10h (10 hours) |
| transfers | No | Int | File transfer parallelism |
| tps_limit | No | Int | Throttle transactions per second |
| bw_limit | No | Str | Apply bandwidth throttling. Use a suffix b |
| tag | No | Str | Tag to apply to the backup |
| datacenters | Yes | List(Str) | Datacenters to include in the backup |
| nodes | No | List(Str) | Nodes to include in the backup |
| tables_keyspace | No | List(Str) | Mutually exclusive with tables |
| tables | No | List(Str) | Tables to include in the backup. Mutually exclusive with tables_keyspace |
| keyspaces | No | List(Str) | Keyspaces to include in the backup |
| schedule | No | Bool | Whether to schedule a future backup. If False, an immediate snapshot will be taken |
| schedule_expr | No | Str | Crontab expression for the backup schedule. Defaults to '0 1 * * *' |
Backs up to the local filesystem of the node.
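A minimal sketch of a local-only schedule, using only the common options above (cluster and datacenter values are illustrative):

```yaml
- name: Daily local snapshot
  cluster: testcluster
  datacenters: dc1
  local_retention: 10d
  remote: False              # keep snapshots on the node only
  schedule: True
  schedule_expr: 0 1 * * *   # the default: daily at 01:00
```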
Sends backups to an S3 bucket
| Option | Required | Type | Description |
|---|---|---|---|
| remote_path | Yes | Str | Path to store the backups. Needs to include the bucket name, e.g. mybucket/path/to/backups |
| s3_region | Yes | Str | S3 region the bucket is in |
| s3_access_key_id | No | Str | S3 access key ID if not using IAM authentication |
| s3_secret_access_key | No | Str | S3 secret access key if not using IAM authentication |
| s3_storage_class | No | Str | Storage class of the bucket. Defaults to STANDARD. One of 'default', 'STANDARD', 'reduced_redundancy', 'standard_ia', 'onezone_ia', 'glacier', 'deep_archive', 'intelligent_tiering' |
| s3_acl | No | Str | ACL type of the bucket. Defaults to private. One of 'private', 'public-read', 'public-read-write', 'authenticated-read', 'bucket-owner-read' |
| s3_encryption | No | Str | Encryption to apply. Defaults to AES256. One of 'none', 'AES256' |
| s3_no_check_bucket | No | Bool | |
| s3_disable_checksum | No | Bool | |
Sends backups to an SFTP/SSH server
| Option | Required | Type | Description |
|---|---|---|---|
| remote_path | Yes | Str | Path to store the backups on the remote server |
| host | Yes | Str | Host to connect to |
| ssh_user | Yes | Str | Username to connect as |
| ssh_pass | No | Str | Password to connect with. Either ssh_pass or key_file needs to be set |
| key_file | No | Str | Location of the key file on the host. Either ssh_pass or key_file needs to be set |
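By way of illustration, a schedule combining the common options with the SFTP options above (host names, users, and paths are made up):

```yaml
- name: Schedule a backup to an SFTP server
  remote_type: sftp
  cluster: testcluster
  datacenters: dc1
  remote_path: /backups/cassandra
  remote: True
  remote_retention: 60d
  schedule: True
  schedule_expr: 0 1 * * *
  host: sftp.example.com                    # illustrative
  ssh_user: backupuser                      # illustrative
  key_file: /home/backupuser/.ssh/id_rsa    # either key_file or ssh_pass
```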
Sends backups to an Azure Storage Blob container
| Option | Required | Type | Description |
|---|---|---|---|
| remote_path | Yes | Str | Path to store the backups. Needs to include the container name, e.g. mycontainer/path/to/backups |
| azure_account | Yes | Str | The name of the Azure storage account |
| azure_endpoint | No | Str | Overrides the endpoint destination for the Azure storage account. Generally not required |
| azure_key | No | Str | Storage account key. Only required if not using Azure MSI authentication |
| azure_msi | No | Bool | Whether to use Azure MSI authentication to connect to the storage account |
| azure_msi_object_id | No | Str | Only required if there are multiple user-assigned identities. Mutually exclusive with azure_msi_client_id and azure_msi_mi_res_id |
| azure_msi_client_id | No | Str | Only required if there are multiple user-assigned identities. Mutually exclusive with azure_msi_object_id and azure_msi_mi_res_id |
| azure_msi_mi_res_id | No | Str | Only required if there are multiple user-assigned identities. Mutually exclusive with azure_msi_object_id and azure_msi_client_id |
- name: Schedule a backup to S3 bucket
  remote_type: s3
  cluster: testcluster
  datacenters: dc1
  remote_path: bucketname/path
  local_retention: 10d
  remote_retention: 60d
  tag: "scheduled backup"
  timeout: 10h
  remote: True
  schedule: True
  schedule_expr: 0 1 * * *
  s3_region: eu-west-2
  s3_acl: private
- name: Snapshot a table to an Azure Blob
  remote_type: azure
  cluster: testcluster
  datacenters: dc1
  remote_path: foo
  local_retention: 10d
  remote_retention: 30d
  tag: "Snapshot appTable"
  timeout: 10h
  remote: True
  tables: ['appKeyspace.appTable']
  keyspaces: ['appKeyspace']
  schedule: False
  azure_account: azure_storage_account_name
  azure_msi: True
The playbooks are designed to run in a predefined order as some of them depend on the others. For example, you'll need to create the alert endpoints before you can set up alert routing.
- Set up alert endpoints
- Set up routes
- Set up metrics alerts
- Set up log alerts
- Set up service checks
- Set up backup schedules
The provided Makefile is the easiest way to run the playbooks:
❯ make help
metrics-alerts Create alerts based on metrics
check run pre-commit tests
endpoints Create alert endpoints and integrations
log-alerts Create alerts based on logs
routes Create alert routes
service-checks Create alerts for TCP and shell connections
backups Create backup schedules
You can decide either to configure all the parameters as explained above using the `export_tokens.sh` file, or to set them on the command line, overriding the environment configuration:
make endpoints AXONOPS_ORG=ORG_NAME
make routes AXONOPS_ORG=ORG_NAME AXONOPS_CLUSTER=CLUSTER_NAME
make metrics-alerts AXONOPS_ORG=ORG_NAME AXONOPS_CLUSTER=CLUSTER_NAME
make log-alerts AXONOPS_ORG=ORG_NAME AXONOPS_CLUSTER=CLUSTER_NAME
make service-checks AXONOPS_ORG=ORG_NAME AXONOPS_CLUSTER=CLUSTER_NAME
make backups AXONOPS_ORG=ORG_NAME AXONOPS_CLUSTER=CLUSTER_NAME
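For example, a complete first-time run for a single cluster, following the order described earlier (the org and cluster names are illustrative):

```bash
make endpoints AXONOPS_ORG=example
make routes AXONOPS_ORG=example AXONOPS_CLUSTER=cluster1
make metrics-alerts AXONOPS_ORG=example AXONOPS_CLUSTER=cluster1
make log-alerts AXONOPS_ORG=example AXONOPS_CLUSTER=cluster1
make service-checks AXONOPS_ORG=example AXONOPS_CLUSTER=cluster1
make backups AXONOPS_ORG=example AXONOPS_CLUSTER=cluster1
```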
The provided playbooks are only examples. Adapt the rules and configurations to suit your enterprise requirements.