This sample deploys an Agones multi-cluster configuration on Amazon EKS: one routing cluster and two DGS (dedicated game server) clusters, with the multi-cluster allocation feature enabled.
This sample also works as a good Terraform example for the following features:
- Deploy Agones with a Network Load Balancer (NLB) instead of a Classic Load Balancer, using AWS Load Balancer Controller
- Aggregate logs and metrics into Amazon CloudWatch using Fluent Bit and OpenTelemetry
- View Kubernetes resources with Kubernetes Dashboard
- Adjust the number of nodes with Karpenter (you can also find a Cluster Autoscaler version in this tag)
The architecture overview of this sample is shown in the image below.
We adopt the Dedicated Cluster Responsible For Routing topology, which is discussed here. This way, your cluster configurations stay symmetric: all the DGS clusters can share the same configuration, which is simpler than the Single Cluster Responsible For Routing topology, where one special DGS cluster serves game servers in addition to performing the allocation routing function. The All Clusters Responsible For Routing topology seems overkill for a single-region deployment, because it is unlikely that only a single cluster fails while the other clusters in the same region keep working normally. It might improve availability in a multi-region deployment, though.
You must install the following tools before deploying this sample:
- Terraform CLI
- Kubectl
- Helm
- AWS CLI
- After installation, configure credentials with permissions equivalent to the Administrator IAM policy
Please open variable.tf and check the parameters.
You can deploy without any modification, but you may want to change some of the settings, such as the AWS region to deploy to. You can also improve security by restricting the CIDRs that are allowed to connect to the servers; by default, all the servers are protected by mTLS but accept connections from any IP address.
To deploy this sample, you need to run the following commands:
# Install required modules
terraform init
# deploy to your account
terraform apply -auto-approve
It usually takes 20-30 minutes to deploy.
After the deployment completes, check that all the pods are running properly (i.e. in Running state) with the commands below:
aws eks update-kubeconfig --name dgs01
kubectl get pods -A
aws eks update-kubeconfig --name dgs02
kubectl get pods -A
aws eks update-kubeconfig --name router
kubectl get pods -A
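The three checks above can be wrapped in a small helper. The `check_pods` function below is a convenience sketch (not part of the sample): it reads `kubectl get pods -A --no-headers` output and fails if any pod is in a state other than Running or Completed.

```shell
# check_pods: read `kubectl get pods -A --no-headers` output on stdin and exit
# non-zero if any pod is not Running or Completed (status is the 4th column)
check_pods() {
  awk '$4 != "Running" && $4 != "Completed" { bad = 1; print "Not ready: " $2 " (" $4 ")" } END { exit bad }'
}

# Usage (run against each cluster):
#   aws eks update-kubeconfig --name dgs01
#   kubectl get pods -A --no-headers | check_pods && echo "dgs01 looks healthy"
```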
You can follow the official guide to connect to a game server, or run the following commands:
aws eks update-kubeconfig --name dgs01 # or --name dgs02
kubectl get gs
# you will get output like the following
NAME STATE ADDRESS PORT NODE AGE
dgs-fleet-2l7fs-8pjfb Ready ec2-redacted.us-west-2.compute.amazonaws.com 7684 ip-10-0-177-35.us-west-2.compute.internal 66m
dgs-fleet-2l7fs-dtz7c Ready ec2-redacted.us-west-2.compute.amazonaws.com 7039 ip-10-0-177-35.us-west-2.compute.internal 66m
# get the ADDRESS and PORT values from the above output
nc -u {ADDRESS} {PORT}
# now you can send a message and see an ACK returned
As a sample game server, we are running simple-game-server. The available commands are described in its README.md.
You can allocate a game server pod either by using GameServerAllocation API aggregation or via an allocator service client.
To use GameServerAllocation API aggregation, run the following command:
aws eks update-kubeconfig --name router
kubectl create -f example/allocation.yaml
# Try to run above command several times
# You can see some DGS pods are allocated
aws eks update-kubeconfig --name dgs01
kubectl get gs
aws eks update-kubeconfig --name dgs02
kubectl get gs
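For reference, a GameServerAllocation manifest with multi-cluster allocation enabled likely resembles the sketch below; check example/allocation.yaml for the actual content used by this sample (the fleet label here is an assumption based on the game server names shown earlier):

```yaml
apiVersion: allocation.agones.dev/v1
kind: GameServerAllocation
spec:
  # Forward this request to the clusters registered for multi-cluster allocation
  multiClusterSetting:
    enabled: true
  required:
    matchLabels:
      agones.dev/fleet: dgs-fleet   # assumed fleet name, inferred from the gs names above
```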
To use an allocator service client, run the following commands. You can use either the gRPC or the REST interface. Since both are protected by mTLS, you need to set up TLS certificates and private keys first.
NAMESPACE=default # replace with any namespace
EXTERNAL_IP=$(terraform output -raw allocation_service_hostname)
KEY_FILE=client.key
CERT_FILE=client.crt
TLS_CA_FILE=ca.crt
# get certificates locally
terraform output -raw allocation_service_client_tls_key | base64 -d > $KEY_FILE
terraform output -raw allocation_service_client_tls_crt | base64 -d > $CERT_FILE
terraform output -raw allocation_service_server_tls_crt | base64 -d > $TLS_CA_FILE
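Optionally, you can sanity-check the downloaded files before using them. The `verify_keypair` helper below is our sketch (not part of the sample, and it assumes the openssl CLI is installed): it confirms that a certificate and a private key carry the same public key.

```shell
# verify_keypair CERT KEY: succeed only if the certificate and the private key
# carry the same public key (assumes the openssl CLI is available)
verify_keypair() {
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 1
  key_pub=$(openssl pkey -in "$2" -pubout) || return 1
  [ "$cert_pub" = "$key_pub" ] && echo "cert and key match"
}

# Usage:
#   verify_keypair "$CERT_FILE" "$KEY_FILE"
```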
mv $KEY_FILE $CERT_FILE $TLS_CA_FILE ./example
cd ./example
# Using go example code for gRPC interface
go run allocation-client.go --ip ${EXTERNAL_IP} --namespace ${NAMESPACE} --key ${KEY_FILE} --cert ${CERT_FILE} --cacert ${TLS_CA_FILE} --multicluster true
# Using curl for REST interface
curl --key ${KEY_FILE} \
--cert ${CERT_FILE} \
--cacert ${TLS_CA_FILE} \
-H "Content-Type: application/json" \
--data '{"namespace":"'${NAMESPACE}'", "multiClusterSetting":{"enabled":true}}' \
https://${EXTERNAL_IP}/gameserverallocation \
-v
Note that allocation requests are forwarded from the routing cluster to the DGS clusters by the Agones multi-cluster allocation feature.
You can open the Kubernetes dashboard to see and manage Kubernetes resources in detail. It is already installed in all the clusters. Follow the steps below to open it and log in.
aws eks update-kubeconfig --name <cluster name> # cluster name: dgs01, dgs02, router
kubectl proxy
# Now, open http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:https/proxy/#/login
# Retrieve access token by the below command:
kubectl -n kubernetes-dashboard get secret $(kubectl -n kubernetes-dashboard get sa/admin-user -o jsonpath="{.secrets[0].name}") -o go-template="{{.data.token | base64decode}}"
Agones logs and metrics are aggregated into CloudWatch in this sample. You can easily check them in CloudWatch management console.
To check logs, open the Log groups page and inspect the relevant log groups (e.g. /aws/containerinsights/dgs01/application).
Here you can see application logs ingested in near real time by Fluent Bit. You can configure which logs are included or excluded by modifying modules/fluent_bit/manifests. Please also check the official document for further details.
For metrics, you can open either the All metrics page or the Dashboards page from the CloudWatch management console.
In All metrics page, you can check each metric one by one, which can be useful to check metrics in an ad-hoc manner.
In the CloudWatch Dashboards page, you can create a dashboard to monitor all the required metrics at a glance. This sample includes a sample dashboard for monitoring Agones. You can import the dashboard with the following command:
aws cloudwatch put-dashboard --dashboard-name agones-demo-dashboard --dashboard-body file://example/dashboard.json
Note that the AWS region us-west-2 is hard-coded in dashboard.json. If you deployed this sample in another region, please replace it before creating the dashboard.
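Instead of editing the file by hand, a one-line helper can do the substitution. `set_dashboard_region` below is our sketch (not part of the sample), using sed to swap the region string in place:

```shell
# set_dashboard_region REGION FILE: replace the hard-coded us-west-2 with the
# given region in a dashboard definition (keeps a .bak copy of the original)
set_dashboard_region() {
  sed -i.bak "s/us-west-2/$1/g" "$2"
}

# Usage before importing the dashboard:
#   set_dashboard_region eu-west-1 example/dashboard.json
```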
After put-dashboard succeeds, you can open the imported dashboard from the CloudWatch management console.
You can freely and intuitively customize the dashboard via management console. Please also refer to this document if you need further information.
You can add or remove the metrics ingested into CloudWatch by modifying otel.yaml.
The main documentation for AWS Distro for OpenTelemetry is here.
You can also refer to the URLs commented in the file for further details of each config.
Currently there are only two DGS clusters, but you can easily add more.
To add a DGS cluster, open main.tf and declare another instance of the ./modules/dgs_cluster module. You also need to add the new module to the local.dgs_clusters list variable.
# Add this
module "dgs03" {
source = "./modules/dgs_cluster"
cluster_name = "dgs03"
vpc = module.vpc
cluster_endpoint_public_access_cidrs = var.cluster_endpoint_allowed_cidrs
gameserver_allowed_cidrs = var.gameserver_allowed_cidrs
}
# Don't forget to also edit this variable
locals {
dgs_clusters = [module.dgs01, module.dgs02, module.dgs03]
}
To avoid incurring future charges, clean up the resources you created.
You can remove all the AWS resources deployed by this sample by running the following command:
terraform destroy -auto-approve
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.