[Issue 894] analytics documentation (#984)

* docs: Adds docs/analytics/

  Adds an analytics section to docs to mirror structure of API and frontend with:
  - development.md - Instructions for installing and running locally
  - formatting-and-linting.md - Guidance on formatting and linting in the project
  - testing.md - Guide on running and writing tests
  - usage.md - Usage guide for CLI
  - metrics/ - Sub-directory with descriptions of analytics metrics

* docs: Refactors analytics/README
HHS · Jan 22, 2024 · ed81c9d
1 parent 125eae0
commit ed81c9d

Showing 11 changed files with 857 additions and 199 deletions.
diff --git a/analytics/README.md b/analytics/README.md
@@ -1,207 +1,49 @@
 # Simpler Grants Analytics
 
-This sub-directory enables users to run analytics on data generated within the Simpler Grants project.
-
-## Getting started
-
-> Note: The following guide will focus on interacting with our analytics package through GitHub and Slack. If you'd like to run this package locally skip to the section on [how to install it locally](#installing-the-analytics-package-locally).
-
-### See the daily reports
-
-We have some automation set up in this repository that automatically runs our analytics and posts the results to Slack on a daily basis. To see these results, use the following steps to discover and join the `#z_bot-sprint-reporting` channel for updates.
-
-1. Join our Slack workspace. **Note:** This option will be available for open source contributors shortly.
-2. Within Slack, click the "Channels" dropdown menu, then select "Manage > Browse channels".
-   <img alt="Screenshot of browsing channels in slack" src="./static/screenshot-browse-channels-slack.png" width=500>
-3. On the browse channel page, search for `#z_bot-sprint-reporting` and select it from the list of results.
-4. On the channel page, select "Join channel" at the bottom of the page.
-
-### Triggering a report
-
-> Note: This option is only available to project maintainers with write access to the repo.
-
-If you're a project maintainer and want to run the reports outside of the daily schedule, you can also trigger the report to run manually using the following steps:
-
-1. Go to the [Run analytics package GitHub Action](https://github.com/HHS/simpler-grants-gov/actions/workflows/run-analytics.yml) page.
-2. Select the "Run workflow" dropdown menu, then click "Run workflow".
-3. For more information about triggering GitHub actions, including running a version of this workflow from another branch, checkout [the GitHub documentation](https://docs.github.com/en/actions/using-workflows/manually-running-a-workflow).
-
-<img alt="Screenshot triggering a GitHub action manually" src="./static/screenshot-trigger-gh-action.png" width=750>
-
-## Installing the analytics package locally
-
-### Pre-requisites
-
-- Python version 3.11
-- Poetry
-- GitHub CLI
-
-Check that you have the following with: `make check-prereqs`
-
-### Installation
-
-1. Clone the GitHub repo: `git clone https://github.com/HHS/simpler-grants-gov.git`
-2. Change directory into the analytics folder: `cd simpler-grants-gov/analytics`
-4. Set up the project: `make setup` -- This will install the required packages and prompt you to authenticate with GitHub
-5. Create a `.secrets.toml` with the following details, see the next section to discover where these values can be found:
-   ```toml
-   reporting_channel_id = "<REPLACE_WITH_CHANNEL_ID>"
-   slack_bot_token = "<REPLACE_WITH_SLACKBOT_TOKEN_ID>"
-   ```
-
-### Configuring secrets
-
-#### Prerequisites
-
-In order to correctly set the value of the `slack_bot_token` and `reporting_channel_id` you will need:
-
-1. To be a member of the Simpler.Grants.gov slack workspace
-2. To be a collaborator on the Sprint Reporting Bot slack app
-
-If you need to be added to the slack workspace or to the list of collaborators for the app, contact a project maintainer.
-
-#### Finding reporting channel ID
-
-1. Go to the `#z_bot-sprint-reporting` channel in the Simpler.Grants.gov slack workspace.
-2. Click on the name of the channel in the top left part of the screen.
-3. Scroll down to the bottom of the resulting dialog box until you see where it says `Channel ID`.
-4. Copy and paste that ID into your `.secrets.toml` file under the `reporting_channel_id` variable.
-
-<img alt="Screenshot of dialog box with channel ID" src="./static/screenshot-channel-id.png" height=500>
-
-#### Finding slackbot token
-
-1. Go to [the dashboard](https://api.slack.com/apps) that displays the slack apps for which you have collaborator access
-2. Click on `Sprint Reporting Bot` to go to the settings for our analytics slackbot
-3. From the side menu, select `OAuth & Permissions` and scroll down to the "OAuth tokens for your workspace" section
-4. Copy the "Bot user OAuth token" which should start with `xoxb` and paste it into your `.secrets.toml` file under the `slack_bot_token` variable.
-
-<img alt="Screenshot of slack app settings page with bot user OAuth token" src="./static/screenshot-slackbot-token.png" width=750>
-
-## Using the make commands
-
-In most cases, the reports you'd like to run are already available as `make` commands, specified in our [`Makefile`](./Makefile)
-
-### Export data and run reports
-
-If want to run reports with the most recent data from GitHub, the easiest way to do it is with the `make sprint-reports-with-latest-data`.
-
-That should result in something like the following being logged to the command line:
-
-<img alt="Screenshot of terminal after running make sprint-reports-with-latest-data" src="./static/screenshot-make-sprint-reports.png" width=750>
-
-It should also open two new browser tabs, each with a separate report:
-
-**Sprint burndown by points for the current sprint**
-
-![Screenshot of burndown for sprint 10](static/screenshot-sprint-burndown.png)
-
-**Percent of points complete by deliverable**
-
-![Screenshot of deliverable percent complete by points](static/screenshot-deliverable-pct-complete-points.png)
-
-### Other relevant make commands
-
-- `make issue-data-export` - Exports issue data from HHS/simpler-grants-gov
-- `make sprint-data-export` - Exports project data from the [Sprint Planning GitHub project](https://github.com/orgs/HHS/projects/13)
-- `make gh-data-export` - Exports both issue and sprint data
-- `make sprint-burndown` - Runs the sprint burndown report
-- `make percent-complete` - Runs the percent complete by deliverable report
-- `make sprint-reports` - Runs both percent complete and sprint burndown (without exporting data first)
-
-## Using the command line interface
-
-For a bit more control over the underlying analytics package, you can use the *full* `analytics` command line interface. The following sections describe how to work with the analytics CLI.
-
-### Learning how to use the command line tool
-
-The `analytics` package comes with a built-in CLI that you can use to discover the reporting features available:
-
-Start by simply typing `poetry run analytics --help` which will print out a list of available commands:
-
-![Screenshot of passing the --help flag to CLI entry point](static/screenshot-cli-help.png)
-
-Discover the arguments required for a particular command by appending the `--help` flag to that command:
-
-```bash
-poetry run analytics export gh_issue_data --help
-```
-
-![Screenshot of passing the --help flag to a specific command](static/screenshot-command-help.png)
-
-### Exporting GitHub data
-
-After following the installation steps above, you can use the following commands to export data from GitHub for local analysis:
-
-#### Exporting issue data
-
-```bash
-poetry run analytics export gh_issue_data --owner HHS --repo simpler-grants-gov --output-file data/issue-data.json
-```
-
-Let's break this down piece by piece:
-
-- `poetry run` - Tells poetry to execute a package installed in the virtual environment
-- `analytics` - The name of the analytics package installed locally
-- `export gh_issue_data` - The specific sub-command in the analytics CLI we want to run
-- `--owner HHS` Passing `HHS` to the `--owner` argument for this sub-command, the owner of the repo whose issue data we want to export, in this case `HHS`
-- `--repo simpler-grants-gov` We want to export issue data from the `simpler-grants-gov` repo owned by `HHS`
-- `--output-file data/issue-data.json` We want to write the exported data to the file with the relative path `data/issue-data.json`
-
-#### Exporting project data
-
-Exporting project data works almost the same way, except it expects a `--project` argument instead of a `--repo` argument. **NOTE:** The project should be the project number as it appears in the URL, not the name of the project.
-
-```bash
-poetry run analytics export gh_project_data --owner HHS --project 13 --output-file data/sprint-data.json
-```
-
-### Calculating metrics
-
-#### Calculating sprint burndown
-
-Once you've exported the sprint and issue data from GitHub, you can start calculating metrics. We'll begin with sprint burndown:
-
-```bash
-poetry run analytics calculate sprint_burndown --sprint-file data/sprint-data.json --issue-file data/issue-data.json --sprint @current --unit points --show-results
+## Introduction
+
+This is a command line interface (CLI) tool written in Python that is used to run analytics on operational data for the Simpler.Grants.gov initiative. For a more in-depth discussion of the tools used and the structure of the codebase, view the technical details for the analytics package.
+
+## Project directory structure
+
+The tree below outlines the structure of the analytics codebase, relative to the root of the simpler-grants-gov repo.
+
+```text
+root
+├── analytics
+│   ├── src
+│   │   └── analytics
+│   │       ├── datasets      Create re-usable data interfaces for calculating metrics
+│   │       ├── integrations  Integrate with external systems used to export data or metrics
+│   │       └── metrics       Calculate the project's operational metrics
+│   ├── tests
+│   │   ├── integrations      Integration tests, mostly for src/analytics/integrations
+│   │   ├── datasets          Unit tests for src/analytics/datasets
+│   │   └── metrics           Unit tests for src/analytics/metrics
+│
+│   ├── config.py             Load configurations from environment vars or local .toml files
+│   ├── settings.toml         Default configuration settings, tracked by git
+│   ├── .secrets.toml         Gitignored file for secrets and configuration management
+│   ├── Makefile              Frequently used commands for setup, development, and CLI usage
+│   └── pyproject.toml        Python project configuration file
 ```
 
-A couple of important notes about this command:
-
-- `--sprint @current` In order to calculate burndown, you'll need to specify either `@current` for the current sprint or the name of another sprint, e.g. `"Sprint 10"`
-- `--unit points` In order to calculate burndown based on story points, you pass `points` to the `--unit` option. The other option for unit is `issues`
-- `--show-results` In order to the see the output in a browser you'll need to pass this flag.
+## Using the tool
 
-![Screenshot of burndown for sprint 10](static/screenshot-sprint-burndown.png)
+Project maintainers and members of the public have a few options for interacting with the tool and the reports it produces. Read more about each option in the [usage guide](../documentation/analytics/usage.md):
 
-You can also post the results of this metric to a Slack channel:
-
-```bash
-poetry run analytics calculate sprint_burndown --sprint-file data/sprint-data.json --issue-file data/issue-data.json --sprint "Sprint 10" --unit points --post-results
-```
+1. [Viewing the reports in Slack](../documentation/analytics/usage.md#view-daily-reports-in-slack)
+2. [Triggering reports from GitHub](../documentation/analytics/usage.md#trigger-a-report-from-github)
+3. [Triggering reports from the command line](../documentation/analytics/usage.md#trigger-a-report-from-the-command-line)
 
-> **NOTE:** This requires you to have the `.secrets.toml` configured according to the directions in step 5 of the [installation section](#installation)
-
-![Screenshot of burndown report in slack](static/screenshot-slack-burndown.png)
-
-### Calculating deliverable percent complete
-
-Another key metric you can report is the percentage of issues or points completed per 30k deliverable.
-You can specify the unit you want to use for percent complete (e.g. points or issues) using the `--unit` flag.
-
-For example, here we're calculating percentage completion based on the number of tickets under each deliverable.
-
-```bash
-poetry run analytics calculate deliverable_percent_complete --sprint-file data/sprint-data.json --issue-file data/issue-data.json --show-results --unit issues
-```
-![Screenshot of deliverable percent complete by issues](static/screenshot-deliverable-pct-complete-tasks.png)
-
-And here we're calculating it based on the total story point value of those tickets.
-
-```bash
-poetry run analytics calculate deliverable_percent_complete --sprint-file data/sprint-data.json --issue-file data/issue-data.json --show-results --unit points
-```
+## Contributing to the tool
 
-![Screenshot of deliverable percent complete by points](static/screenshot-deliverable-pct-complete-points.png)
+Project maintainers or open source contributors are encouraged to contribute to the tool. Follow the guides linked below for more information:
 
-The `deliverable_pct_complete` sub-command also supports the `--post-results` flag if you want to post this data to slack.
+1. [Technical overview](../documentation/analytics/technical-overview.md)
+2. [Installation and development guide](../documentation/analytics/development.md)
+   - [Adding a new data source](../documentation/analytics/development.md#adding-a-new-dataset)
+   - [Adding a new metric](../documentation/analytics/development.md#adding-a-new-metric)
+3. [Writing and running tests](../documentation/analytics/testing.md)
+4. [Command line interface (CLI) user guide](../documentation/analytics/usage.md#using-the-command-line-interface)
+5. [Description of existing metrics](../documentation/analytics/metrics/README.md)
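
The README text removed in this hunk documented the `export gh_issue_data` and `calculate sprint_burndown` invocations. As a rough sketch only, those command lines could be assembled from Python as shown here. The helper names `export_issue_data_cmd` and `sprint_burndown_cmd` are hypothetical and are not part of the analytics package; the flags are limited to those that appear in the removed text.

```python
import shlex


def export_issue_data_cmd(owner: str, repo: str, output_file: str) -> list[str]:
    """Build the argv list for the issue export command (hypothetical helper)."""
    return [
        "poetry", "run", "analytics", "export", "gh_issue_data",
        "--owner", owner,
        "--repo", repo,
        "--output-file", output_file,
    ]


def sprint_burndown_cmd(
    sprint_file: str,
    issue_file: str,
    sprint: str = "@current",
    unit: str = "points",
    show_results: bool = True,
) -> list[str]:
    """Build the argv list for the sprint burndown command (hypothetical helper)."""
    # The removed README lists only these two values for --unit.
    if unit not in {"points", "issues"}:
        raise ValueError(f"unit must be 'points' or 'issues', got {unit!r}")
    cmd = [
        "poetry", "run", "analytics", "calculate", "sprint_burndown",
        "--sprint-file", sprint_file,
        "--issue-file", issue_file,
        "--sprint", sprint,
        "--unit", unit,
    ]
    if show_results:
        cmd.append("--show-results")
    return cmd


if __name__ == "__main__":
    # shlex.join quotes arguments that contain spaces, such as "Sprint 10".
    print(shlex.join(export_issue_data_cmd("HHS", "simpler-grants-gov", "data/issue-data.json")))
    print(shlex.join(sprint_burndown_cmd("data/sprint-data.json", "data/issue-data.json", sprint="Sprint 10")))
```

Building the command as a list rather than a single string avoids shell-quoting mistakes if the list is later passed to `subprocess.run`.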
diff --git a/analytics/src/analytics/datasets/__init__.py b/analytics/src/analytics/datasets/__init__.py
@@ -1 +1 @@
-"""Create a set of data interfaces to use when claculating metrics."""
+"""Create a set of data interfaces to use when calculating metrics."""
diff --git a/documentation/analytics/README.md b/documentation/analytics/README.md
@@ -0,0 +1,3 @@
+# Operational analytics documentation
+
+Documentation for the operational analytics package can be found in this directory and the [Analytics README.md](../../analytics/README.md).