[Issue 894] analytics documentation (#984)
* docs: Adds docs/analytics/
Adds an analytics section to docs to mirror structure of API and frontend with:
- development.md - Instructions for installing and running locally
- formatting-and-linting.md - Guidance on formatting and linting in the project
- testing.md - Guide on running and writing tests
- usage.md - Usage guide for CLI
- metrics/ - Sub-directory with descriptions of analytics metrics
* docs: Refactors analytics/README
widal001 authored Jan 22, 2024
1 parent 125eae0 commit ed81c9d
Showing 11 changed files with 857 additions and 199 deletions.
238 changes: 40 additions & 198 deletions analytics/README.md
@@ -1,207 +1,49 @@
# Simpler Grants Analytics

This sub-directory enables users to run analytics on data generated within the Simpler Grants project.

## Getting started

> Note: The following guide will focus on interacting with our analytics package through GitHub and Slack. If you'd like to run this package locally skip to the section on [how to install it locally](#installing-the-analytics-package-locally).
### See the daily reports

We have some automation set up in this repository that automatically runs our analytics and posts the results to Slack on a daily basis. To see these results, use the following steps to discover and join the `#z_bot-sprint-reporting` channel for updates.

1. Join our Slack workspace. **Note:** This option will be available for open source contributors shortly.
2. Within Slack, click the "Channels" dropdown menu, then select "Manage > Browse channels".
<img alt="Screenshot of browsing channels in slack" src="./static/screenshot-browse-channels-slack.png" width=500>
3. On the browse channel page, search for `#z_bot-sprint-reporting` and select it from the list of results.
4. On the channel page, select "Join channel" at the bottom of the page.

### Triggering a report

> Note: This option is only available to project maintainers with write access to the repo.
If you're a project maintainer and want to run the reports outside of the daily schedule, you can also trigger the report to run manually using the following steps:

1. Go to the [Run analytics package GitHub Action](https://github.com/HHS/simpler-grants-gov/actions/workflows/run-analytics.yml) page.
2. Select the "Run workflow" dropdown menu, then click "Run workflow".
3. For more information about triggering GitHub actions, including running a version of this workflow from another branch, check out [the GitHub documentation](https://docs.github.com/en/actions/using-workflows/manually-running-a-workflow).

<img alt="Screenshot triggering a GitHub action manually" src="./static/screenshot-trigger-gh-action.png" width=750>

## Installing the analytics package locally

### Pre-requisites

- Python version 3.11
- Poetry
- GitHub CLI

Check that you have all of these installed with: `make check-prereqs`

### Installation

1. Clone the GitHub repo: `git clone https://github.com/HHS/simpler-grants-gov.git`
2. Change directory into the analytics folder: `cd simpler-grants-gov/analytics`
4. Set up the project: `make setup` -- This will install the required packages and prompt you to authenticate with GitHub
5. Create a `.secrets.toml` with the following details, see the next section to discover where these values can be found:
```toml
reporting_channel_id = "<REPLACE_WITH_CHANNEL_ID>"
slack_bot_token = "<REPLACE_WITH_SLACKBOT_TOKEN_ID>"
```

### Configuring secrets

#### Prerequisites

In order to correctly set the value of the `slack_bot_token` and `reporting_channel_id` you will need:

1. To be a member of the Simpler.Grants.gov slack workspace
2. To be a collaborator on the Sprint Reporting Bot slack app

If you need to be added to the slack workspace or to the list of collaborators for the app, contact a project maintainer.

#### Finding reporting channel ID

1. Go to the `#z_bot-sprint-reporting` channel in the Simpler.Grants.gov slack workspace.
2. Click on the name of the channel in the top left part of the screen.
3. Scroll down to the bottom of the resulting dialog box until you see where it says `Channel ID`.
4. Copy and paste that ID into your `.secrets.toml` file under the `reporting_channel_id` variable.

<img alt="Screenshot of dialog box with channel ID" src="./static/screenshot-channel-id.png" height=500>

#### Finding slackbot token

1. Go to [the dashboard](https://api.slack.com/apps) that displays the slack apps for which you have collaborator access
2. Click on `Sprint Reporting Bot` to go to the settings for our analytics slackbot
3. From the side menu, select `OAuth & Permissions` and scroll down to the "OAuth tokens for your workspace" section
4. Copy the "Bot user OAuth token" which should start with `xoxb` and paste it into your `.secrets.toml` file under the `slack_bot_token` variable.

<img alt="Screenshot of slack app settings page with bot user OAuth token" src="./static/screenshot-slackbot-token.png" width=750>

## Using the make commands

In most cases, the reports you'd like to run are already available as `make` commands, specified in our [`Makefile`](./Makefile)

### Export data and run reports

If you want to run reports with the most recent data from GitHub, the easiest way to do it is with the `make sprint-reports-with-latest-data` command.

That should result in something like the following being logged to the command line:

<img alt="Screenshot of terminal after running make sprint-reports-with-latest-data" src="./static/screenshot-make-sprint-reports.png" width=750>

It should also open two new browser tabs, each with a separate report:

**Sprint burndown by points for the current sprint**

![Screenshot of burndown for sprint 10](static/screenshot-sprint-burndown.png)

**Percent of points complete by deliverable**

![Screenshot of deliverable percent complete by points](static/screenshot-deliverable-pct-complete-points.png)

### Other relevant make commands

- `make issue-data-export` - Exports issue data from HHS/simpler-grants-gov
- `make sprint-data-export` - Exports project data from the [Sprint Planning GitHub project](https://github.com/orgs/HHS/projects/13)
- `make gh-data-export` - Exports both issue and sprint data
- `make sprint-burndown` - Runs the sprint burndown report
- `make percent-complete` - Runs the percent complete by deliverable report
- `make sprint-reports` - Runs both percent complete and sprint burndown (without exporting data first)

## Using the command line interface

For a bit more control over the underlying analytics package, you can use the *full* `analytics` command line interface. The following sections describe how to work with the analytics CLI.

### Learning how to use the command line tool

The `analytics` package comes with a built-in CLI that you can use to discover the reporting features available:

Start by simply typing `poetry run analytics --help` which will print out a list of available commands:

![Screenshot of passing the --help flag to CLI entry point](static/screenshot-cli-help.png)

Discover the arguments required for a particular command by appending the `--help` flag to that command:

```bash
poetry run analytics export gh_issue_data --help
```

![Screenshot of passing the --help flag to a specific command](static/screenshot-command-help.png)

### Exporting GitHub data

After following the installation steps above, you can use the following commands to export data from GitHub for local analysis:

#### Exporting issue data

```bash
poetry run analytics export gh_issue_data --owner HHS --repo simpler-grants-gov --output-file data/issue-data.json
```

Let's break this down piece by piece:

- `poetry run` - Tells poetry to execute a package installed in the virtual environment
- `analytics` - The name of the analytics package installed locally
- `export gh_issue_data` - The specific sub-command in the analytics CLI we want to run
- `--owner HHS` Passes `HHS` to the `--owner` argument, which identifies the owner of the repo whose issue data we want to export
- `--repo simpler-grants-gov` We want to export issue data from the `simpler-grants-gov` repo owned by `HHS`
- `--output-file data/issue-data.json` We want to write the exported data to the file with the relative path `data/issue-data.json`
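
If you need to run the same export from a script, the pieces described above map directly onto an argument list. This is a sketch under stated assumptions: `build_issue_export_command` is a hypothetical helper, not something the analytics package provides, and it only assembles the command shown above without running it.

```python
import shlex


def build_issue_export_command(owner: str, repo: str, output_file: str) -> list[str]:
    """Assemble the argument list for the issue export command (hypothetical helper)."""
    return [
        "poetry", "run", "analytics",   # run the analytics CLI inside the poetry environment
        "export", "gh_issue_data",      # the sub-command that exports issue data
        "--owner", owner,               # owner of the repo, e.g. HHS
        "--repo", repo,                 # repo whose issues we want to export
        "--output-file", output_file,   # relative path for the exported JSON
    ]


command = build_issue_export_command("HHS", "simpler-grants-gov", "data/issue-data.json")
print(shlex.join(command))
```

Passing the resulting list to `subprocess.run(command, check=True)` from the `analytics` directory would run the export. Keeping the arguments as a list avoids shell quoting problems if an owner or file name ever contains spaces.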

#### Exporting project data

Exporting project data works almost the same way, except it expects a `--project` argument instead of a `--repo` argument. **NOTE:** The project should be the project number as it appears in the URL, not the name of the project.

```bash
poetry run analytics export gh_project_data --owner HHS --project 13 --output-file data/sprint-data.json
```

### Calculating metrics

#### Calculating sprint burndown

Once you've exported the sprint and issue data from GitHub, you can start calculating metrics. We'll begin with sprint burndown:

```bash
poetry run analytics calculate sprint_burndown --sprint-file data/sprint-data.json --issue-file data/issue-data.json --sprint @current --unit points --show-results
```
## Introduction

This is a command line interface (CLI) tool written in Python that is used to run analytics on operational data for the Simpler.Grants.gov initiative. For a more in-depth discussion of the tools used and the structure of the codebase, view the technical details for the analytics package.

## Project directory structure

Outlines the structure of the analytics codebase, relative to the root of the simpler-grants-gov repo.

```text
root
├── analytics
│ └── src
│ └── analytics
│ └── datasets Create re-usable data interfaces for calculating metrics
│ └── integrations Integrate with external systems used to export data or metrics
│ └── metrics Calculate the project's operational metrics
│ └── tests
│ └── integrations Integration tests, mostly for src/analytics/integrations
│ └── datasets Unit tests for src/analytics/datasets
│ └── metrics Unit tests for src/analytics/metrics
│
│ └── config.py Load configurations from environment vars or local .toml files
│ └── settings.toml Default configuration settings, tracked by git
│ └── .secrets.toml Gitignored file for secrets and configuration management
│ └── Makefile Frequently used commands for setup, development, and CLI usage
│ └── pyproject.toml Python project configuration file
```

A couple of important notes about this command:

- `--sprint @current` In order to calculate burndown, you'll need to specify either `@current` for the current sprint or the name of another sprint, e.g. `"Sprint 10"`
- `--unit points` In order to calculate burndown based on story points, you pass `points` to the `--unit` option. The other option for unit is `issues`
- `--show-results` In order to see the output in a browser, you'll need to pass this flag.
## Using the tool

![Screenshot of burndown for sprint 10](static/screenshot-sprint-burndown.png)
Project maintainers and members of the public have a few options for interacting with the tool and the reports it produces. Read more about each option in the [usage guide](../documentation/analytics/usage.md):

You can also post the results of this metric to a Slack channel:

```bash
poetry run analytics calculate sprint_burndown --sprint-file data/sprint-data.json --issue-file data/issue-data.json --sprint "Sprint 10" --unit points --post-results
```
1. [Viewing the reports in Slack](../documentation/analytics/usage.md#view-daily-reports-in-slack)
2. [Triggering reports from GitHub](../documentation/analytics/usage.md#trigger-a-report-from-github)
3. [Triggering reports from the command line](../documentation/analytics/usage.md#trigger-a-report-from-the-command-line)

> **NOTE:** This requires you to have the `.secrets.toml` configured according to the directions in step 5 of the [installation section](#installation)
![Screenshot of burndown report in slack](static/screenshot-slack-burndown.png)

### Calculating deliverable percent complete

Another key metric you can report is the percentage of issues or points completed per 30k deliverable.
You can specify the unit you want to use for percent complete (e.g. points or issues) using the `--unit` flag.

For example, here we're calculating percentage completion based on the number of tickets under each deliverable.

```bash
poetry run analytics calculate deliverable_percent_complete --sprint-file data/sprint-data.json --issue-file data/issue-data.json --show-results --unit issues
```
![Screenshot of deliverable percent complete by issues](static/screenshot-deliverable-pct-complete-tasks.png)

And here we're calculating it based on the total story point value of those tickets.

```bash
poetry run analytics calculate deliverable_percent_complete --sprint-file data/sprint-data.json --issue-file data/issue-data.json --show-results --unit points
```
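
To make the difference between the two units concrete, here is a small illustrative sketch. It is not the package's implementation: the `percent_complete` function and the `deliverable`, `points`, and `closed` field names are assumptions chosen for the example, and the sample issues are made up.

```python
from collections import defaultdict


def percent_complete(issues: list[dict], unit: str = "points") -> dict[str, float]:
    """Compute percent complete per deliverable, by issue count or by story points.

    Illustrative only; field names are hypothetical, not the package's data model.
    """
    if unit not in ("points", "issues"):
        raise ValueError("unit must be 'points' or 'issues'")
    total = defaultdict(float)
    done = defaultdict(float)
    for issue in issues:
        # Each issue weighs 1 when counting issues, or its point value otherwise
        weight = 1 if unit == "issues" else issue["points"]
        total[issue["deliverable"]] += weight
        if issue["closed"]:
            done[issue["deliverable"]] += weight
    # Skip deliverables with zero total weight to avoid dividing by zero
    return {name: round(100 * done[name] / total[name], 1) for name in total if total[name]}


sample = [
    {"deliverable": "Site launch", "points": 5, "closed": True},
    {"deliverable": "Site launch", "points": 1, "closed": False},
    {"deliverable": "Site launch", "points": 2, "closed": False},
]
print(percent_complete(sample, unit="issues"))
print(percent_complete(sample, unit="points"))
```

With this sample, one of three issues is closed, but that issue carries five of the eight points, so the two units give noticeably different pictures of the same deliverable.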
## Contributing to the tool

![Screenshot of deliverable percent complete by points](static/screenshot-deliverable-pct-complete-points.png)
Project maintainers or open source contributors are encouraged to contribute to the tool. Follow the guides linked below for more information:

The `deliverable_percent_complete` sub-command also supports the `--post-results` flag if you want to post this data to Slack.
1. [Technical overview](../documentation/analytics/technical-overview.md)
2. [Installation and development guide](../documentation/analytics/development.md)
- [Adding a new data source](../documentation/analytics/development.md#adding-a-new-dataset)
- [Adding a new metric](../documentation/analytics/development.md#adding-a-new-metric)
3. [Writing and running tests](../documentation/analytics/testing.md)
4. [Command line interface (CLI) user guide](../documentation/analytics/usage.md#using-the-command-line-interface)
5. [Description of existing metrics](../documentation/analytics/metrics/README.md)
2 changes: 1 addition & 1 deletion analytics/src/analytics/datasets/__init__.py
@@ -1 +1 @@
"""Create a set of data interfaces to use when claculating metrics."""
"""Create a set of data interfaces to use when calculating metrics."""
3 changes: 3 additions & 0 deletions documentation/analytics/README.md
@@ -0,0 +1,3 @@
# Operational analytics documentation

Documentation for the operational analytics package can be found in this directory and the [Analytics README.md](../../analytics/README.md).