Important
This Guidance requires the use of AWS CodeCommit, which is no longer available to new customers. Existing customers of AWS CodeCommit can continue using and deploying this Guidance as normal.
The games industry is increasing adoption of the Games-as-a-Service operating model, where games have become more like a service than a product, and recurring revenue is frequently generated through in-app purchases, subscriptions, and other techniques. With this change, it is critical to develop a deeper understanding of how players use the features of games and related services. This understanding allows game developers to continually adapt, and make the necessary changes to keep players engaged.
The Game Analytics Pipeline guidance helps game developers apply a flexible and scalable DataOps methodology to their games. It allows them to continuously integrate and continuously deploy (CI/CD) a scalable serverless data pipeline for ingesting, storing, and analyzing telemetry data generated from games and services. The guidance supports streaming ingestion of data, so users can gain critical insights from their games and other applications in near real time, and can focus on expanding and improving the game experience instead of managing the underlying infrastructure operations. Because the guidance has been codified as a CDK application, game developers can determine the modules or components that best fit their use case, and can test and QA the architecture before deploying it into production. This modular system also allows additional AWS capabilities, such as AI/ML models, to be integrated into the architecture to further support real-time decision making and automated LiveOps using AIOps, further enhancing player engagement.
Before deploying the sample code, ensure that the following required tools have been installed:
- GitHub Account
- Visual Studio Code
- Docker Desktop (local)
- AWS Cloud Development Kit (CDK) 2.92
- Python >=3.8
- NodeJS >= 20.0.0
NOTE: A Visual Studio Code dev container configuration has been provided for you. This container image contains the Python, NodeJS, and AWS CDK versions needed to implement this guidance. It is recommended that you use this pre-configured environment as your development environment.
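If you are not using the dev container, the version requirements above can be checked with a small helper. The following is a minimal sketch, not part of the guidance itself: the names `MINIMUM_VERSIONS`, `parse_version`, and `meets_minimum` are hypothetical, and the version strings passed in are assumed to come from commands such as `node --version` or `python --version`.

```python
import re
import sys
from typing import Tuple

# Minimum versions taken from the prerequisites list above.
MINIMUM_VERSIONS = {
    "python": "3.8",
    "node": "20.0.0",
}


def parse_version(text: str) -> Tuple[int, ...]:
    """Extract the first dotted number from strings such as 'v20.11.1' or 'Python 3.8.10'."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when the installed version is at least the minimum version."""
    have, need = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '3.8' compares equal to '3.8.0'.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


# The running interpreter can be checked directly, without calling any external tool.
python_ok = meets_minimum("{}.{}.{}".format(*sys.version_info[:3]), MINIMUM_VERSIONS["python"])
```

For example, `meets_minimum("v18.19.0", MINIMUM_VERSIONS["node"])` returns `False`, which indicates that NodeJS needs to be upgraded before deploying.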
Before deploying the sample code, it needs to be customized to suit your specific usage requirements. Guidance configuration and customization is managed using a config.yaml file, located in the infrastructure folder of the repository.
The following steps walk you through how to customize the sample code configuration to suit your usage requirements:
- Log into your GitHub account, and fork this repository into your GitHub account.
- Follow the instructions in [Create a connection to GitHub](https://docs.aws.amazon.com/dtconsole/latest/userguide/connections-create-github.html#connections-create-github-console) to connect AWS CodePipeline to the forked copy of this repository. Once the connection has been created, make a note of the Amazon Resource Name (ARN) for the connection.
- A configuration template file, called config.yaml.TEMPLATE, has been provided as a reference for use case customizations. Using the provided Visual Studio Code devcontainer environment, run the following command to create a usable copy of this file:
cp ./infrastructure/config.yaml.TEMPLATE ./infrastructure/config.yaml
- Open the ./infrastructure/config.yaml file for editing.
The following settings can be adjusted to suit your use case:
WORKLOAD_NAME
- Description: The name of the workload that will be deployed. This name will be used as a prefix for any component deployed into your AWS Account.
- Type: String
- Example:
"GameAnalyticsPipeline"
CDK_VERSION
- Description: The version of the CDK installed in your environment. To see the current version of the CDK, run the cdk --version command. The guidance has been tested using version 2.92.0 of the CDK. If you are using a different version of the CDK, ensure that this version is also reflected in the ./infrastructure/package.json file.
- Type: String
- Example:
"2.92.0"
NODE_VERSION
- Description: The version of NodeJS being used. The default value is set to "latest", and should only be changed if you require a specific version.
- Type: String
- Example:
"latest"
PYTHON_VERSION
- Description: The version of Python being used. The default value is set to "3.8", and should only be changed if you require a specific version.
- Type: String
- Example:
"3.8"
DEV_MODE
- Description: Whether or not to enable developer mode. This mode ensures that synthetic data and shorter retention times are enabled. It is recommended that you set the value to true when first deploying the sample code for testing, as this setting will enable S3 versioning, and won't delete S3 buckets on teardown. This setting can be changed at a later time, and the infrastructure re-deployed through CI/CD.
- Type: Boolean
- Example:
true
ENABLE_STREAMING_ANALYTICS
- Description: Whether or not to enable the Kinesis Data Analytics component/module of the guidance. It is recommended to set this value to true when first deploying this sample code for testing, as this setting will allow you to verify if streaming analytics is required for your use case. This setting can be changed at a later time, and the guidance re-deployed through CI/CD.
- Type: Boolean
- Example:
true
STREAM_SHARD_COUNT
- Description: The number of Kinesis shards, or sequence of data records, to use for the data stream. The default value has been set to 1 for initial deployment and testing purposes. This value can be changed at a later time, and the guidance re-deployed through CI/CD. For information about determining the shards required for your use case, refer to Amazon Kinesis Data Streams Terminology and Concepts in the Amazon Kinesis Data Streams Developer Guide.
- Type: Integer
- Example:
1
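As a rough aid for choosing STREAM_SHARD_COUNT, the sketch below estimates a shard count from expected traffic. It assumes the commonly documented per-shard limits for provisioned Kinesis data streams (1 MiB/s and 1,000 records/s for writes, 2 MiB/s of shared read throughput); these should be verified against the current service quotas. The function name `estimate_shard_count` and its parameters are hypothetical and not part of the guidance.

```python
import math

# Per-shard limits for provisioned Kinesis data streams. These values are an
# assumption based on the published quotas; verify them before relying on the result.
WRITE_BYTES_PER_SECOND = 1024 * 1024      # 1 MiB/s of ingest per shard
WRITE_RECORDS_PER_SECOND = 1000           # 1,000 records/s of ingest per shard
READ_BYTES_PER_SECOND = 2 * 1024 * 1024   # 2 MiB/s of shared read throughput per shard


def estimate_shard_count(events_per_second: float, average_event_bytes: float, consumers: int = 1) -> int:
    """Estimate the shards needed for a given event rate, event size, and number of consumers."""
    if events_per_second < 0 or average_event_bytes < 0 or consumers < 0:
        raise ValueError("rates, sizes, and consumer counts must not be negative")
    write_bytes = events_per_second * average_event_bytes
    # Every consumer that shares the read throughput reads the full stream.
    read_bytes = write_bytes * consumers
    needed = max(
        write_bytes / WRITE_BYTES_PER_SECOND,
        events_per_second / WRITE_RECORDS_PER_SECOND,
        read_bytes / READ_BYTES_PER_SECOND,
    )
    # A stream always needs at least one shard.
    return max(1, math.ceil(needed))
```

With small events, the records-per-second limit tends to dominate: 5,000 events per second at 512 bytes each needs 5 shards under these assumptions, even though the byte volume alone would fit in 3.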
CODECOMMIT_REPO
- Description: The name of the AWS CodeCommit repository used as source control for the codified infrastructure and CI/CD pipeline.
- Type: String
- Example:
"game-analytics-pipeline"
RAW_EVENTS_PREFIX
- Description: The prefix for new/raw data files stored in S3.
- Type: String
- Example:
"raw_events"
PROCESSED_EVENTS_PREFIX
- Description: The prefix for processed data files stored in S3.
- Type: String
- Example:
"processed_events"
RAW_EVENTS_TABLE
- Description: The name of the AWS Glue table within which all new/raw data is cataloged.
- Type: String
- Example:
"raw_events"
GLUE_TMP_PREFIX
- Description: The name of the temporary data store for AWS Glue.
- Type: String
- Example:
"glueetl-tmp"
S3_BACKUP_MODE
- Description: Whether or not to enable Kinesis Data Firehose to send a backup of new/raw data to S3. The default value has been set to false for initial deployment and testing purposes. This value can be changed at a later time, and the guidance re-deployed through CI/CD.
- Type: Boolean
- Example:
false
CLOUDWATCH_RETENTION_DAYS
- Description: The default number of days for which Amazon CloudWatch stores all the logs. The default value has been set to 30 for initial deployment and testing purposes. This value can be changed at a later time, and the guidance re-deployed through CI/CD.
- Type: Integer
- Example:
30
API_STAGE_NAME
- Description: The name of the REST API stage for the Amazon API Gateway configuration endpoint for sending telemetry data to the pipeline. This provides an integration option for applications that cannot integrate with Amazon Kinesis directly. The API also provides configuration endpoints for admins to use for registering their game applications with the guidance, and generating API keys for developers to use when sending events to the REST API. The default value is set to live.
- Type: String
- Example:
"live"
EMAIL_ADDRESS
- Description: The email address that receives operational notifications delivered by CloudWatch.
- Type: String
- Example:
"[email protected]"
GITHUB_USERNAME
- Description: The user name for the GitHub account into which the guidance has been forked.
- Type: String
GITHUB_REPO_NAME
- Description: The repository name of the fork in your GitHub account.
- Type: String
- Example:
"guidance-for-game-analytics-pipeline-on-aws"
CONNECTION_ARN
- Description: The ARN for the GitHub connection, created during the Configuration Setup section.
- Type: String
- Example:
"arn:aws:codeconnections:us-east-1:123456789123:connection/6506b29d-429e-4bf3-8ab4-78cb2fc011b3"
accounts
- Description: Leverages the CDK cross-account, cross-region capabilities for deploying separate CI/CD pipeline stages to separate AWS Accounts and AWS Regions. For more information on cross-account CI/CD pipelines using the CDK, refer to the Building a Cross-account CI/CD Pipeline workshop.
- Example:
accounts:
  - NAME: "QA"
    ACCOUNT: "<YOUR-ACCOUNT-NUMBER>"
    REGION: "<QA-ACCOUNT-REGION>"
  - NAME: "PROD"
    ACCOUNT: "<YOUR-ACCOUNT-NUMBER>"
    REGION: "<PROD-ACCOUNT-REGION>"
NOTE: It is recommended that you use the same AWS Account, as well as the same AWS Region, for both the QA and PROD stages when first deploying the guidance.
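Before deploying, it can help to sanity-check the edited settings. The following is a minimal sketch that validates a subset of the settings described above once they have been loaded into a Python dictionary (for example with a YAML parser of your choice). The names `EXPECTED_TYPES`, `STAGE_FIELDS`, and `validate_config` are hypothetical, and the checks are an illustration rather than the schema the guidance itself enforces.

```python
from typing import Any, Dict, List

# Setting names and types as described in the list above. This is a subset chosen
# for illustration, not the schema the guidance itself enforces.
EXPECTED_TYPES = {
    "WORKLOAD_NAME": str,
    "CDK_VERSION": str,
    "DEV_MODE": bool,
    "ENABLE_STREAMING_ANALYTICS": bool,
    "STREAM_SHARD_COUNT": int,
    "S3_BACKUP_MODE": bool,
    "CLOUDWATCH_RETENTION_DAYS": int,
    "API_STAGE_NAME": str,
    "GITHUB_REPO_NAME": str,
    "CONNECTION_ARN": str,
}
STAGE_FIELDS = ("NAME", "ACCOUNT", "REGION")


def validate_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of readable problems; an empty list means none were found."""
    problems: List[str] = []
    for key, expected in EXPECTED_TYPES.items():
        if key not in config:
            problems.append(f"{key}: missing")
            continue
        value = config[key]
        # bool is a subclass of int in Python, so reject booleans for Integer settings.
        wrong_type = not isinstance(value, expected) or (expected is int and isinstance(value, bool))
        if wrong_type:
            problems.append(f"{key}: expected {expected.__name__}, got {type(value).__name__}")

    arn = config.get("CONNECTION_ARN")
    if isinstance(arn, str) and not arn.startswith("arn:"):
        problems.append("CONNECTION_ARN: does not look like an ARN")

    accounts = config.get("accounts")
    if not isinstance(accounts, list) or not accounts:
        problems.append("accounts: expected a non-empty list of stages")
    else:
        for index, stage in enumerate(accounts):
            for field in STAGE_FIELDS:
                if not isinstance(stage, dict) or field not in stage:
                    problems.append(f"accounts[{index}]: missing {field}")
    return problems


# Example values copied from the settings list above; account IDs and Regions are placeholders.
example_config = {
    "WORKLOAD_NAME": "GameAnalyticsPipeline",
    "CDK_VERSION": "2.92.0",
    "DEV_MODE": True,
    "ENABLE_STREAMING_ANALYTICS": True,
    "STREAM_SHARD_COUNT": 1,
    "S3_BACKUP_MODE": False,
    "CLOUDWATCH_RETENTION_DAYS": 30,
    "API_STAGE_NAME": "live",
    "GITHUB_REPO_NAME": "guidance-for-game-analytics-pipeline-on-aws",
    "CONNECTION_ARN": "arn:aws:codeconnections:us-east-1:123456789123:connection/6506b29d-429e-4bf3-8ab4-78cb2fc011b3",
    "accounts": [
        {"NAME": "QA", "ACCOUNT": "<YOUR-ACCOUNT-NUMBER>", "REGION": "<QA-ACCOUNT-REGION>"},
        {"NAME": "PROD", "ACCOUNT": "<YOUR-ACCOUNT-NUMBER>", "REGION": "<PROD-ACCOUNT-REGION>"},
    ],
}
problems = validate_config(example_config)
```

A common mistake this catches is quoting a Boolean: a value of `"true"` for DEV_MODE is a string, so it is reported as `DEV_MODE: expected bool, got str`.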
Once you have added your own custom configuration settings and saved the config.yaml file, the following steps can be used to deploy the CI/CD pipeline:
- Build the sample code dependencies, by running the following command:
npm run build
- Bootstrap the sample code, by running the following command:
npm run deploy.bootstrap
- Deploy the sample code, by running the following command:
npm run deploy
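The three steps above must run in order, and a later step should not run if an earlier one fails. The following is a minimal sketch of a wrapper that enforces this, assuming the commands are run from the directory where you would otherwise type them. The names `DEPLOY_STEPS` and `run_steps` are hypothetical and not part of the guidance.

```python
import subprocess
from typing import Callable, List, Sequence

# The three commands from the steps above, in the order they must run.
DEPLOY_STEPS: List[List[str]] = [
    ["npm", "run", "build"],
    ["npm", "run", "deploy.bootstrap"],
    ["npm", "run", "deploy"],
]


def run_steps(steps: Sequence[Sequence[str]], run: Callable = subprocess.run) -> int:
    """Run each command in order, stop at the first failure, and return how many succeeded."""
    completed = 0
    for command in steps:
        result = run(command)
        if result.returncode != 0:
            # Skip the remaining steps so a failed build is never deployed.
            break
        completed += 1
    return completed
```

Calling `run_steps(DEPLOY_STEPS)` returns 3 when every step succeeds. The `run` parameter exists so that the wrapper can be exercised with a stand-in function instead of starting real processes.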
After the sample code has been deployed, two CloudFormation stacks are created within your AWS Account and AWS Region:
- PROD-<WORKLOAD NAME>: The deployed version of the guidance infrastructure.
- <WORKLOAD NAME>-Toolchain: The CI/CD pipeline for the guidance.
The PROD-<WORKLOAD NAME> stack hosts the deployed production version of the AWS resources for you to validate, and further optimize the guidance for your use case.
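To confirm that both stacks exist, the expected names can be derived from WORKLOAD_NAME and compared with the stack names shown in the CloudFormation console. The sketch below assumes that the WORKLOAD_NAME value is substituted verbatim for `<WORKLOAD NAME>`; the function names `expected_stack_names` and `missing_stacks` are hypothetical.

```python
from typing import Dict, Iterable, List


def expected_stack_names(workload_name: str) -> Dict[str, str]:
    """Build the two stack names described above for a given WORKLOAD_NAME value."""
    return {
        "infrastructure": f"PROD-{workload_name}",
        "toolchain": f"{workload_name}-Toolchain",
    }


def missing_stacks(workload_name: str, deployed_names: Iterable[str]) -> List[str]:
    """Return the expected stack names that do not appear in the deployed names."""
    deployed = set(deployed_names)
    return [name for name in expected_stack_names(workload_name).values() if name not in deployed]
```

For example, `missing_stacks("GameAnalyticsPipeline", ["GameAnalyticsPipeline-Toolchain"])` reports that the `PROD-GameAnalyticsPipeline` stack has not been created yet.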
Once the deployed infrastructure has been validated, or further optimized for your use case, you can trigger continuous deployment by committing any updated source code into the newly created CodeCommit repository, using the following steps:
- Copy the clone URL for the CodeCommit repository that you specified in the config.yaml file. See the View repository details (console) section of the AWS CodeCommit User Guide for more information on how to view the clone URL for the repository.
- Create a new Git repository, by running the following commands:
rm -rf .git
git init --initial-branch=main
- Add the CodeCommit repository as the origin, using the following command:
git remote add origin <CodeCommit Clone URL>
- Commit the code to trigger the CI/CD process, by running the following commands:
git add -A
git commit -m "Initial commit"
git push --set-upstream origin main
Make any code changes to subsequently optimize the guidance for your use case. Committing these changes will trigger a subsequent continuous integration and deployment of the deployed production stack, PROD-<WORKLOAD NAME>.
To clean up any of the deployed resources, you can either delete the stack through the AWS CloudFormation console, or run the cdk destroy command.
NOTE: Deleting the deployed resources will not delete the Amazon S3 bucket, in order to protect any game data already ingested and stored within the data lake. The Amazon S3 bucket, and its data, can be deleted using the Amazon S3 console, AWS SDKs, AWS Command Line Interface (AWS CLI), or REST API. See the Deleting Amazon S3 objects section of the user guide for more information.
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.