The Replay Analysis web app is used to analyze and compare replays. It aims to make Workload Replicator results easier to understand and to simplify the analysis so that customers can make the right decisions about their use of Redshift.
- Execute replay (either using the WorkloadReplicator or NodeConfigCompare)
- For WorkloadReplicator, execute the replay after adding the following parameters in $REDSHIFT_TEST_DRIVE_ROOT/config/replay.yaml:
- analysis_iam_role
- analysis_output
- For NodeConfigCompare, complete the Step Function execution.
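As a quick sanity check before starting a WorkloadReplicator replay, the sketch below confirms that both analysis parameters appear as keys in replay.yaml. The function name `check_analysis_params` is hypothetical and not part of redshift-test-drive; it only checks that the keys are present, not that their values are valid.

```shell
# Hypothetical helper: verify that replay.yaml defines both analysis parameters.
# $1: path to the replay config file
check_analysis_params() {
  local config_file="$1"
  local key
  local missing=0
  for key in analysis_iam_role analysis_output; do
    # Match "key:" at the start of a line, allowing leading whitespace.
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*:" "$config_file"; then
      echo "missing parameter: ${key}" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: check_analysis_params "$REDSHIFT_TEST_DRIVE_ROOT/config/replay.yaml"
```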
- After the Step Function execution is complete (for NodeConfigCompare) or the replay is complete (for WorkloadReplicator), find the S3 location of the analysis files and copy it to the clipboard. To find the location:
- Navigate to the CloudWatch Logs Insights console.
- Select the log group that is used by NodeConfigCompare or WorkloadReplicator
- For NodeConfigCompare, the log group is listed in the CloudFormation console, in the Resources section of the stack created from the CloudFormation template used for the NodeConfigCompare run.
- Use the following Insights query to find the S3 location of the analysis files. Note: When you are asked to enter the analysis bucket below, use the path that is the closest ancestor of the analysis folder. For example, if your analysis files are in s3://<bucket_name>/analysis/<analysis files>, enter s3://<bucket_name>.
fields @message
| filter @message like /can be used in Replay Analysis./
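The rule in the note (use the closest ancestor of the analysis folder) can be sketched as a small shell helper. The function name `analysis_bucket_root` is hypothetical, and the sketch assumes the folder is literally named `analysis` and cuts the path at its first occurrence.

```shell
# Hypothetical helper: derive the path to enter in the web app from the
# S3 location of the analysis files.
# $1: S3 location, e.g. s3://<bucket_name>/analysis/<analysis files>
analysis_bucket_root() {
  local path="${1%/}"   # drop one trailing slash, if any
  case "$path" in
    */analysis)   printf '%s\n' "${path%/analysis}" ;;      # path ends at the folder
    */analysis/*) printf '%s\n' "${path%%/analysis/*}" ;;   # path points inside the folder
    *)            printf '%s\n' "$path" ;;                  # no analysis folder found
  esac
}

# Example: analysis_bucket_root "s3://my-bucket/analysis/replay_1/file.json"
```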
- Pre-install Node.js (Node.js 14 or later is required for React).
- If you do not have Node.js installed, follow the steps below:
- If using an EC2 instance, run the following commands in order:
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
source ~/.bashrc
nvm install --lts
- Check that Node.js is installed by running:
which node
- To install Node.js on your local computer, follow the link here.
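Beyond confirming that Node.js is installed, it can help to confirm that the version meets the minimum of 14. The sketch below is one way to do that; the function name `node_version_ok` is hypothetical, and it assumes a version string in the form printed by `node --version` (for example `v18.17.0`).

```shell
# Hypothetical helper: succeed if a Node.js version string is 14 or later.
# $1: version string as printed by `node --version`, e.g. v18.17.0
node_version_ok() {
  local version="${1#v}"         # strip the leading "v"
  local major="${version%%.*}"   # keep only the major version number
  [ "$major" -ge 14 ] 2>/dev/null
}

# Example: node_version_ok "$(node --version)" && echo "Node.js version is supported"
```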
- If using an EC2 instance to view the Replay Analysis web UI, allowlist your IP address so that you can access the webpage hosted by the EC2 instance:
- Follow this link to configure inbound rules in the security group of your EC2 instance.
- For Type, choose Custom TCP
- For Port Range, enter 3000
- For Source, select My IP to allow inbound traffic from only your local computer's public IPv4 address.
- The resulting inbound rule should allow Custom TCP traffic on port 3000 from your IP address only.
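The same inbound rule can also be added from the command line. The following is a minimal sketch, assuming the AWS CLI is installed and configured with permission to modify the security group; the function names and the example IDs in the comments are placeholders, not values from this project.

```shell
# Build a /32 CIDR block (a single address) from an IPv4 address.
ip_to_cidr() {
  printf '%s/32\n' "$1"
}

# Allow inbound TCP traffic on port 3000 from one IP address.
# $1: security group ID (placeholder, e.g. sg-0123456789abcdef0)
# $2: your local computer's public IPv4 address
allow_web_app_access() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol tcp \
    --port 3000 \
    --cidr "$(ip_to_cidr "$2")"
}

# Example: allow_web_app_access sg-0123456789abcdef0 203.0.113.10
```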
NOTE: Complete all the initial setup steps outlined in the Redshift-test-drive Readme before proceeding.
Execute this from the root directory of your clone of redshift-test-drive (aliased to $REDSHIFT_TEST_DRIVE_ROOT):
cd $REDSHIFT_TEST_DRIVE_ROOT && make replay_analysis
- Installs all the requirements needed to launch the web app.
- Opens the web app, which lets customers choose multiple replays for comparison.
The start page of the interface will prompt users to provide the bucket location(s) with replays they want to analyze. It provides users the ability to select a number of replays for analysis.
- If you are using a local computer, click the links generated in the terminal. You can access the web app using either http://localhost:3000 or http://YOUR-LOCAL-IP:3000
- If using an EC2 instance, identify the public IPv4 address of your instance:
- Console option: Go to the EC2 console and find your instance. The public IPv4 address of your instance is listed on the Networking tab.
- Command line option: Use describe-instances to find the public IPv4 address.
- Open a local browser and enter http://YOUR-PUBLIC-IPV4-ADDRESS:3000. The web link should look something like this: http://12.123.123.1:3000/
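For the command line option, the lookup and the resulting URL can be sketched as follows. This assumes the AWS CLI is installed and configured; the function names and the instance ID in the comments are placeholders chosen for illustration.

```shell
# Print the public IPv4 address of an instance.
# $1: instance ID (placeholder, e.g. i-0123456789abcdef0)
instance_public_ip() {
  aws ec2 describe-instances \
    --instance-ids "$1" \
    --query 'Reservations[].Instances[].PublicIpAddress' \
    --output text
}

# Build the web app URL for a given address (the app listens on port 3000).
web_app_url() {
  printf 'http://%s:3000\n' "$1"
}

# Example: web_app_url "$(instance_public_ip i-0123456789abcdef0)"
```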
- The start page will prompt the user for the following inputs:
- Credentials Type
The credentials used to access the analysis bucket can be either of the following:
- Use a Profile
Enter the user profile for the account where the replay bucket resides.
- Use an IAM Role
Enter the IAM role for the bucket mentioned in the file location. This IAM role should have read access to the bucket.
- Replay analysis file location
The bucket that holds the analysis output pre-generated by running the replay. Copy this from step 2 of the prerequisites section.
- Replays
The list of replays available in the bucket for comparison.
- Select the replays and click the Analysis button for generating the comparison analysis of the replay runs.
- Replay Analysis
This section of the web UI consists of the following metrics:
- Compare Throughput
A graph that shows the number of queries executed per second. This data is filtered by the selected query types, users, and time range.
- Aggregated Metrics
A table that displays the percentiles of execution time, elapsed time, and queue time across the selected replays. These values reflect the selected query types, users, and time range.
- Query Latency
A graph that displays the distribution of query latency.
- Longest Running Queries
A table that shows the execution time metrics for each replay and displays the top 100 longest-running queries.
- Replay Validation
This section gives more insight into the success and validity of a given replay. Validity is defined by success and error rates, the distribution of errors, and differences in data ingested. It includes the following metrics:
- Query Errors
Errors encountered across the selected replays.
- Error Category Distribution
A stacked bar chart that shows the distribution of errors across each replay and gives insight into which errors occurred most frequently.
- COPY Ingestion Metrics
Aggregated execution metrics of COPY ingestion by replay.