Merge branch 'main' into quidel_flutest
nmdefries committed Dec 14, 2024
2 parents 5d49b9f + 0cd6b18 commit 2b31745
Showing 1,066 changed files with 925,913 additions and 160,018 deletions.
5 changes: 5 additions & 0 deletions .bumpversion.cfg
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
[bumpversion]
current_version = 0.3.56
commit = True
message = chore: bump covidcast-indicators to {new_version}
tag = False
6 changes: 6 additions & 0 deletions .git-blame-ignore-revs
@@ -0,0 +1,6 @@
# Format geomap.py
d4b056e7a4c11982324e9224c9f9f6fd5d5ec65c
# Format test_geomap.py
79072dcdec3faca9aaeeea65de83f7fa5c00d53f
# Sort setup.py dependencies
6912077acba97e835aff7d0cd3d64309a1a9241d
145 changes: 145 additions & 0 deletions .github/CONTRIBUTING.md
@@ -0,0 +1,145 @@
# Contributing to COVIDcast indicator pipelines

## Branches

* `main`

The primary branch of this repository is called `main`, and contains the version of the code and supporting libraries currently under development. This should be your starting point when creating a new indicator. It is protected so that only reviewed pull requests can be merged in. The main branch is configured to deploy to our staging environment on push. CI is set up to build and test all indicators on PR.

* `prod`

The production branch is configured to automatically deploy to our production environment on push, and is protected so that only administrators can push or merge. CI is set up to build and test all indicators on PR.

* everything else

All other branches are development branches. We don't enforce a naming policy, but it is recommended to prefix all branches you create with your name, username, or initials (e.g. `username/branch-name`).

We don't forbid force-pushing, but please keep it to a minimum, and take extra care when other people are working on the same branch.

## Issues

Issues are the main communication point when it comes to bugfixes, new features, or other possible changes. The repository has several issue templates that help to structure issues.

If you ensure that each issue deals with a single topic (i.e., a single new proposed data source, or a single data quality problem), we'll all be less likely to drop subordinate tasks on the floor. We also recognize that many of the people filing issues in this repository are new to large project management and not used to focusing their thoughts in this way. It's okay, we'll all learn and get better together.

Admins will assign issues to one or more people based on balancing expediency, expertise, and team robustness. It may be faster for one person to fix something, but we can reduce the risk of having too many single points of failure if two people work on it together.

## General workflow for indicator creation and deployment

So, how does one go about developing a pipeline for a new data source?

### tl;dr

1. Create your new indicator branch from `main`.
2. Build it using the [indicator template](https://github.com/cmu-delphi/covidcast-indicators/tree/main/_template_python), following the guidelines in the included README.md, REVIEW.md, and INDICATOR_DEV_GUIDE.md files.
3. Make some stuff!
4. When your stuff works, push your development branch to remote, and open a PR against `main` for review.
5. Once your PR has been merged, consult with a platform engineer for the remaining production setup needs. They will create a deployment workflow for your indicator including any necessary production parameters. Production secrets are encrypted in the Ansible vault. This workflow will be tested in staging by admins, who will consult you about any problems they encounter.
6. Following [the source documentation template](https://github.com/cmu-delphi/delphi-epidata/blob/main/docs/api/covidcast-signals/_source-template.md), create public API documentation for the source. You can submit this as a pull request against the delphi-epidata repository.
7. If your peers like the code, the documentation is ready, and the staging runs are successful, work with admins to schedule your indicator in production, merge the documentation, and announce the new indicator to the mailing list.
8. Rejoice!

### Starting out

The `main` branch should contain up-to-date code and supporting libraries. This should be your starting point when creating a new indicator.

```shell
# Hint
#
git checkout main
git checkout -b dev-my-feature-branch
```

### Creating your indicator

Create a directory for your new indicator by making a copy of `_template_python`. (We also make a `_template_r` available, but R should only be used as a last resort, due to complications using it in production.) Add the name of the directory to the list found in `jobs > build > strategy > matrix > packages` in `.github/workflows/python-ci.yml`, which will enable automated checks for your indicator when you make PRs. The template copies of `README.md` and `REVIEW.md` include the minimum requirements for code structure, documentation, linting, testing, and method of configuration. Beyond that, we don't have any established restrictions on implementation; you can look at other existing indicators to see some examples of code layout, organization, and general approach.
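
As a rough sketch of the copy step described above, something like the following could be run from the repository root. The function name `new_indicator_from_template` and the directory name `my_new_indicator` are placeholders for illustration and are not part of the repository.

```shell
# Sketch only: `new_indicator_from_template` is a hypothetical helper,
# not something the repository provides.
new_indicator_from_template() {
  template_dir="$1"   # e.g. _template_python
  target_dir="$2"     # e.g. my_new_indicator (placeholder name)

  # Fail early if the template is missing (wrong working directory, typo).
  if [ ! -d "$template_dir" ]; then
    echo "template directory not found: $template_dir" >&2
    return 1
  fi

  # Never clobber an existing indicator directory.
  if [ -e "$target_dir" ]; then
    echo "refusing to overwrite existing path: $target_dir" >&2
    return 1
  fi

  cp -R "$template_dir" "$target_dir"
}

# Example call, from the repository root:
# new_indicator_from_template _template_python my_new_indicator
```

The copy alone does not enable CI for the new directory; its name still has to be added to the `packages` matrix in `.github/workflows/python-ci.yml` as described above.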

* Consult your peers with questions! :handshake:

Once you have something that runs locally and passes tests, set up your remote branch for eventual review and production deployment.

```shell
# Hint
#
git push -u origin dev-my-feature-branch
```

You can then draft [public API documentation](https://cmu-delphi.github.io/delphi-epidata/) for people who would fetch this
data from the API. Public API documentation is kept in the delphi-epidata
repository, and there is a [template Markdown
file](https://github.com/cmu-delphi/delphi-epidata/blob/main/docs/api/covidcast-signals/_source-template.md)
that outlines the features that need to be documented. You can create a pull
request to add a new file to `docs/api/covidcast-signals/` for your source. Our
goal is to have public API documentation for the data at the same time as it
becomes available to the public.

### Setting up for review and deployment

Once you have your branch set up, get in touch with a platform engineer to pair up on the remaining production needs. These include:

* Adding the necessary Jenkins scripts for your indicator.
* Preparing the runtime host with any Automation configuration necessities.
* Reviewing the workflow to make sure it meets the general guidelines and will run as expected on the runtime host.

Once all the last-mile configuration is in place, you can create a pull request against `prod` to initiate the CI/CD pipeline, which will build, test, and package your indicator for deployment.

If everything looks ok, you've drafted source documentation, platform engineering has validated the last mile, and the pull request is accepted, you can merge the PR. Deployment will start automatically.

Hopefully it'll be a full on :tada:, after that :crossed_fingers:

If not, circle back and try again.

## Production overview

### Running production code

Currently, the production indicators all live and run on the venerable and perennially useful Delphi primary server (also known generically as "the runtime host").

### Delivering an indicator to the production environment

We use a branch-based git workflow coupled with [Jenkins](https://www.jenkins.io/) and [Ansible](https://www.ansible.com/) to build, test, package, and deploy each indicator individually to the runtime host.

* Jenkins dutifully manages the whole process for us by executing several "stages" in the context of a [CI/CD pipeline](https://dzone.com/articles/learn-how-to-setup-a-cicd-pipeline-from-scratch). Each stage does something unique, building on the previous stage. The stages are:
  * Environment - Set up some environment-specific needs that the other stages depend on.
  * Build - Create the Python venv on the Jenkins host.
  * Test - Run linting and unit tests.
  * Package - Tar and gzip the built environment.
  * Deploy - Trigger an Ansible playbook to place the built package onto the runtime host, place any necessary production configuration, and adjust the runtime environment (if necessary).
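
The Package stage amounts to archiving the built environment. Here is a minimal sketch, assuming the indicator code and its venv live together in one directory. The helper name `package_indicator` and the example paths are hypothetical, and the actual Jenkins scripts may do this differently.

```shell
# Sketch only: `package_indicator` is a hypothetical helper, not one of
# the repository's Jenkins scripts.
package_indicator() {
  indicator_dir="$1"    # directory holding the built indicator and its venv
  output_tarball="$2"   # e.g. /tmp/my_indicator.tar.gz (placeholder path)

  # -C switches into the parent directory first, so the archive contains a
  # single top-level directory named after the indicator.
  tar -czf "$output_tarball" \
    -C "$(dirname "$indicator_dir")" \
    "$(basename "$indicator_dir")"
}

# Example call with placeholder paths:
# package_indicator /path/to/my_indicator /tmp/my_indicator.tar.gz
```

Keeping a single top-level directory in the archive means it can be unpacked on the runtime host without scattering files into the current directory.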

There are several additional Jenkins-specific files that will need to be created for each indicator, as well as some configuration additions to the runtime host.
It will be important to pair with a platform engineer to prepare the necessary production environment needs, test the workflow, validate on production, and ultimately sign off on a production release.

### Preparing container images of indicators

It may be desirable to build a container image from an indicator. To do this:

* Edit the `.github/workflows/build-container-images.yml` file and add your indicator directory name to the `matrix.packages` section of the `jobs:` block:

```yaml
...
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        packages: [ new_indicator ] # indicator directory name
...
```

* Create a suitable Dockerfile in the root of your indicator directory.

* GitHub Actions will try to build this for you and register it in our private repo.

Currently we will build container images off of `main` and `prod` branches. These can be pulled by systems or humans that have access to the registry.

* `main` builds create a registered image of:

```text
ghcr.io/cmu-delphi/covidcast-indicators-${indicator_name}:dev
```

* `prod` builds create a registered image of:

```text
ghcr.io/cmu-delphi/covidcast-indicators-${indicator_name}:latest
```
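
The branch-to-tag convention above can be sketched as a small shell helper, for example to build the name to pass to `docker pull`. The function name `image_name_for_branch` is hypothetical and not part of the repository; the sketch only covers the two branches named above and treats anything else as an error.

```shell
# Sketch only: `image_name_for_branch` is a hypothetical helper that mirrors
# the main -> dev and prod -> latest convention described above.
image_name_for_branch() {
  indicator_name="$1"   # indicator directory name, e.g. backfill_corrections
  branch="$2"           # main or prod

  case "$branch" in
    main) tag="dev" ;;
    prod) tag="latest" ;;
    *)
      echo "branch not covered by this sketch: $branch" >&2
      return 1
      ;;
  esac

  echo "ghcr.io/cmu-delphi/covidcast-indicators-${indicator_name}:${tag}"
}

# Example (requires access to the registry):
# docker pull "$(image_name_for_branch backfill_corrections main)"
```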
1 change: 1 addition & 0 deletions .github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1 @@
blank_issues_enabled: true
21 changes: 21 additions & 0 deletions .github/ISSUE_TEMPLATE/data_quality_issue.md
@@ -0,0 +1,21 @@
---
name: Data quality issue
about: Missing data, weird data, broken data
title: ''
labels: 'data quality'
assignees: 'nolangormley'
---

**Actual Behavior:**

<!--Provide a description of the problem and a minimal reproducible example, if relevant. Please include the source and signal names, as well as sample observations, with geo region name, date, and data, demonstrating the problem.-->

When I...

**Expected behavior**

<!--A clear and concise description of what you expected to happen.-->

**Context**

<!--Add any context about the problem here.-->
19 changes: 19 additions & 0 deletions .github/ISSUE_TEMPLATE/source_signal_request.md
@@ -0,0 +1,19 @@
---
name: 🚀 New Source or Signal
about: Suggest incorporation of a new source or signal
title: ''
labels: 'API addition'
assignees: ''
---

<!--A clear and concise description of the source or signal you would like to add or modify, and how you imagine it working.-->

It would be great if ...

**Data details**

<!--Please link and briefly describe the proposed source. How is the raw data made available? Describe the geographic and time resolution of the raw data. Which fields should be extracted? Please describe proposed processing of the data, especially if this is a variant of an existing source or signal. -->

**Additional context**

<!--Add any other context or screenshots about the feature request here.-->
10 changes: 10 additions & 0 deletions .github/pull_request_template.md
@@ -0,0 +1,10 @@
### Description
Type of change (bug fix, new feature, etc), brief description, and motivation for these changes.

### Changelog
Itemize code/test/documentation changes and files added/removed.
- change1
- change2

### Associated Issue(s)
- Addresses #(issue)
49 changes: 49 additions & 0 deletions .github/workflows/backfill-corr-ci.yml
@@ -0,0 +1,49 @@
# This workflow uses actions that are not certified by GitHub.
# They are provided by a third-party and are governed by
# separate terms of service, privacy policy, and support
# documentation.
#
# See https://github.com/r-lib/actions/tree/master/examples#readme for
# additional example workflows available for the R community.

name: R backfill corrections

on:
  push:
    branches: [main, prod]
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]
    branches: [main, prod]

jobs:
  build:
    runs-on: ubuntu-latest
    if: github.event.pull_request.draft == false
    defaults:
      run:
        working-directory: backfill_corrections/delphiBackfillCorrection

    steps:
      - uses: actions/checkout@v4

      - name: Set up R 4.2
        uses: r-lib/actions/setup-r@v2
        with:
          use-public-rspm: true
          r-version: 4.2

      - name: Install and cache dependencies
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        uses: r-lib/actions/setup-r-dependencies@v2
        with:
          extra-packages: any::rcmdcheck
          working-directory: backfill_corrections/delphiBackfillCorrection
          upgrade: "TRUE"

      - name: Check package
        uses: r-lib/actions/check-r-package@v2
        with:
          working-directory: backfill_corrections/delphiBackfillCorrection
          args: 'c("--no-manual", "--test-dir=unit-tests")'
          error-on: '"error"'
49 changes: 49 additions & 0 deletions .github/workflows/build-container-images.yml
@@ -0,0 +1,49 @@
name: Build indicator container images and upload to registry

on:
  push:
    branches: [main, prod]
  workflow_dispatch:

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        packages: [backfill_corrections]
    steps:
      - name: Checkout code
        uses: actions/checkout@v2

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: cmu-delphi-deploy-machine
          password: ${{ secrets.CMU_DELPHI_DEPLOY_MACHINE_PAT }}

      - name: Build, tag, and push image to Github
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          baseRef="${GITHUB_REF#*/}"
          baseRef="${baseRef#*/}"
          case "${baseRef}" in
          main)
            imageTag="dev"
            ;;
          prod)
            imageTag="latest"
            ;;
          *)
            imageTag="${baseRef//\//_}" # replace `/` with `_` in branch name
            ;;
          esac
          if [ -z ${{ matrix.packages }} ]; then
            echo "The matrix list is empty so we will not build any images."
          else
            cd ${{ github.workspace }}/${{ matrix.packages }}
            echo "using tag: --${imageTag}--"
            DOCKER_BUILDKIT=1 BUILDKIT_PROGRESS=plain docker build --secret id=GITHUB_TOKEN -t ghcr.io/${{ github.repository }}-${{ matrix.packages }}:$imageTag --file Dockerfile .
            docker push ghcr.io/${{ github.repository }}-${{ matrix.packages }}:$imageTag
          fi
80 changes: 80 additions & 0 deletions .github/workflows/create-release.yml
@@ -0,0 +1,80 @@
name: Create Release

on:
  workflow_dispatch:
    inputs:
      versionName:
        description: 'Semantic Version Number (i.e., 5.5.0 or patch, minor, major, prepatch, preminor, premajor, prerelease)'
        required: true
        default: patch

jobs:
  create-release:
    runs-on: ubuntu-latest
    steps:
      - name: Check out code
        uses: actions/checkout@v2
        with:
          ref: prod
          ssh-key: ${{ secrets.CMU_DELPHI_DEPLOY_MACHINE_SSH }}
      - name: Reset prod branch
        run: |
          git fetch origin main:main
          git reset --hard main
          git config --global user.email [email protected]
          git config --global user.name "Delphi Deploy Bot"
      - name: Set up Python 3.8
        uses: actions/setup-python@v2
        with:
          python-version: 3.8
      - name: Install bump2version
        run: python -m pip install bump2version
      - name: Check for delphi-utils changes
        uses: dorny/paths-filter@v2
        id: changes
        with:
          base: 'prod'
          ref: 'main'
          filters: |
            utils:
              - '_delphi_utils_python/**'
      - name: Bump delphi-utils version
        id: utils-changed
        if: steps.changes.outputs.utils == 'true'
        working-directory: ./_delphi_utils_python
        run: |
          echo -n "::set-output name=version::"
          bump2version --list ${{ github.event.inputs.versionName }} | grep ^new_version | sed -r s,"^.*=",,
          echo -e "\n::set-output name=msg::(*new*)"
      - name: Detect delphi-utils version
        id: utils-unchanged
        if: steps.changes.outputs.utils == 'false'
        working-directory: ./_delphi_utils_python
        run: |
          echo -n "::set-output name=version::"
          bump2version --list -n ${{ github.event.inputs.versionName }} | grep ^current_version | sed -r s,"^.*=",,
          echo -e "\n::set-output name=msg::(same as it was)"
      - name: Bump covidcast-indicators version
        id: indicators
        run: |
          echo -n "::set-output name=version::"
          bump2version --list ${{ github.event.inputs.versionName }} | grep ^new_version | sed -r s,"^.*=",,
      - name: Copy version to indicator directory
        run: |
          indicator_list=("changehc" "claims_hosp" "doctor_visits" "google_symptoms" "hhs_hosp" "nchs_mortality" "nssp" "quidel_covidtest" "sir_complainsalot")
          for path in ${indicator_list[@]}; do
            echo "current_version = ${{ steps.indicators.outputs.version }}" > $path/version.cfg
          done
      - name: Create pull request into prod
        uses: peter-evans/create-pull-request@v3
        with:
          branch: release/indicators_v${{ steps.indicators.outputs.version }}_utils_v${{ steps.utils-changed.outputs.version }}${{ steps.utils-unchanged.outputs.version }}
          base: prod
          title: Release covidcast-indicators ${{ steps.indicators.outputs.version }}
          labels: chore
          reviewers: melange396
          assignees: melange396
          body: |
            Releasing:
            * covidcast-indicators ${{ steps.indicators.outputs.version }}
            * delphi-utils ${{ steps.utils-changed.outputs.version }}${{ steps.utils-unchanged.outputs.version }} ${{steps.utils-changed.outputs.msg }}${{ steps.utils-unchanged.outputs.msg }}