Skip to content

Commit

Permalink
feat+lint+ci: unify linters, add make format
Browse files Browse the repository at this point in the history
* all linters now use the same configuration file
* add `make format` command to format all code
* make sure all indicators pass with the new config
* update the template files as well
* add a new darker format job to CI
* add python package caching
* update README
  • Loading branch information
dshemetov committed Jun 5, 2024
1 parent f011857 commit 6912077
Show file tree
Hide file tree
Showing 47 changed files with 189 additions and 288 deletions.
49 changes: 33 additions & 16 deletions .github/workflows/python-ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,28 +16,40 @@ jobs:
if: github.event.pull_request.draft == false
strategy:
matrix:
packages:
[
_delphi_utils_python,
changehc,
claims_hosp,
doctor_visits,
google_symptoms,
hhs_hosp,
nchs_mortality,
nwss_wastewater,
quidel_covidtest,
sir_complainsalot,
]
include:
- package: "_delphi_utils_python"
dir: "delphi_utils"
- package: "changehc"
dir: "delphi_changehc"
- package: "claims_hosp"
dir: "delphi_claims_hosp"
- package: "doctor_visits"
dir: "delphi_doctor_visits"
- package: "google_symptoms"
dir: "delphi_google_symptoms"
- package: "hhs_hosp"
dir: "delphi_hhs"
- package: "nchs_mortality"
dir: "delphi_nchs_mortality"
- package: "nwss_wastewater"
dir: "delphi_nwss"
- package: "quidel_covidtest"
dir: "delphi_quidel_covidtest"
- package: "sir_complainsalot"
dir: "delphi_sir_complainsalot"
defaults:
run:
working-directory: ${{ matrix.packages }}
working-directory: ${{ matrix.package }}
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Python 3.8
uses: actions/setup-python@v2
uses: actions/setup-python@v5
with:
python-version: 3.8
cache: "pip"
cache-dependency-path: "setup.py"
- name: Install testing dependencies
run: |
python -m pip install --upgrade pip
Expand All @@ -51,3 +63,8 @@ jobs:
- name: Test
run: |
make test
- uses: akaihola/[email protected]
with:
options: "--check --diff --isort --color"
src: "${{ matrix.package }}/${{ matrix.dir }}"
version: "~=2.1.1"
41 changes: 32 additions & 9 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@

In early April 2020, Delphi developed a uniform data schema for [a new Epidata endpoint focused on COVID-19](https://cmu-delphi.github.io/delphi-epidata/api/covidcast.html). Our intent was to provide signals that would track in real-time and in fine geographic granularity all facets of the COVID-19 pandemic, aiding both nowcasting and forecasting. Delphi's long history in tracking and forecasting influenza made us uniquely situated to provide access to data streams not available anywhere else, including medical claims data, electronic medical records, lab test records, massive public surveys, and internet search trends. We also process commonly-used publicly-available data sources, both for user convenience and to provide data versioning for sources that do not track revisions themselves.

Each data stream arrives in a different format using a different delivery technique, be it sftp, an access-controlled API, or an email attachment. The purpose of each pipeline in this repository is to fetch the raw source data, extract informative aggregate signals, and output those signals---which we call **COVID-19 indicators**---in a common format for upload to the [COVIDcast API](https://cmu-delphi.github.io/delphi-epidata/api/covidcast.html).
Each data stream arrives in a different format using a different delivery technique, be it sftp, an access-controlled API, or an email attachment. The purpose of each pipeline in this repository is to fetch the raw source data, extract informative aggregate signals, and output those signals---which we call **COVID-19 indicators**---in a common format for upload to the [COVIDcast API](https://cmu-delphi.github.io/delphi-epidata/api/covidcast.html).

For client access to the API, along with a variety of other utilities, see our [R](https://cmu-delphi.github.io/covidcast/covidcastR/) and [Python](https://cmu-delphi.github.io/covidcast/covidcast-py/html/) packages.

Expand All @@ -13,18 +13,19 @@ For interactive visualizations (of a subset of the available indicators), see ou
## Organization

Utilities:
* `_delphi_utils_python` - common behaviors
* `_template_python` & `_template_r` - starting points for new data sources
* `ansible` & `jenkins` - automated testing and deployment
* `sir_complainsalot` - a Slack bot to check for missing data

- `_delphi_utils_python` - common behaviors
- `_template_python` & `_template_r` - starting points for new data sources
- `ansible` & `jenkins` - automated testing and deployment
- `sir_complainsalot` - a Slack bot to check for missing data

Indicator pipelines: all remaining directories.

Each indicator pipeline includes its own documentation.
Each indicator pipeline includes its own documentation.

* Consult README.md for directions to install, lint, test, and run the pipeline for that indicator.
* Consult REVIEW.md for the checklist to use for code reviews.
* Consult DETAILS.md (if present) for implementation details, including handling of corner cases.
- Consult README.md for directions to install, lint, test, and run the pipeline for that indicator.
- Consult REVIEW.md for the checklist to use for code reviews.
- Consult DETAILS.md (if present) for implementation details, including handling of corner cases.

## Development

Expand All @@ -35,6 +36,28 @@ Each indicator pipeline includes its own documentation.
3. Add new commits to your branch in response to feedback.
4. When approved, tag an admin to merge the PR. Let them know if this change should be released immediately, at a set future date, or if it can just go along for the ride whenever the next release happens.

### Linting and Formatting

Each indicator has a `make lint` command to check for linting errors and a `make
format` command to incrementally format your code (using
[darker](https://github.com/akaihola/darker)). These are both automated with a
[Github Action](.github/workflows/python-ci.yml).

If you get the error `ERROR:darker.git:fatal: Not a valid commit name <hash>`,
then it's likely because your local main branch is not up to date; either you
need to rebase or merge. Note that `darker` reads from `pyproject.toml` for
default settings.

If the lines you change are in a file that uses 2 space indentation, `darker`
will indent the lines around your changes and not the rest, which will likely
break the code; in that case, you should probably just pass the whole file
through black. You can do that with the following command (using the same
virtual environment as above):

```sh
env/bin/black <file>
```

## Release Process

The release process consists of multiple steps which can all be done via the GitHub website:
Expand Down
22 changes: 0 additions & 22 deletions _delphi_utils_python/.pylintrc

This file was deleted.

5 changes: 4 additions & 1 deletion _delphi_utils_python/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -14,9 +14,12 @@ install-ci: venv
pip install .

lint:
. env/bin/activate; pylint delphi_utils
. env/bin/activate; pylint delphi_utils --rcfile=../pyproject.toml
. env/bin/activate; pydocstyle delphi_utils

format:
. env/bin/activate; darker delphi_utils

test:
. env/bin/activate ;\
(cd tests && ../env/bin/pytest --cov=delphi_utils --cov-report=term-missing)
Expand Down
5 changes: 2 additions & 3 deletions _delphi_utils_python/delphi_utils/geomap.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,6 @@
Authors: Dmitry Shemetov @dshemetov, James Sharpnack @jsharpna, Maria Jahja
"""

# pylint: disable=too-many-lines
from os.path import join
from collections import defaultdict
from typing import Iterator, List, Literal, Optional, Set, Union
Expand All @@ -13,7 +12,7 @@
from pandas.api.types import is_string_dtype


class GeoMapper: # pylint: disable=too-many-public-methods
class GeoMapper:
"""Geo mapping tools commonly used in Delphi.
The GeoMapper class provides utility functions for translating between different
Expand Down Expand Up @@ -624,7 +623,7 @@ def get_geos_within(
if contained_geocode_type == "state":
if container_geocode_type == "nation" and container_geocode == "us":
crosswalk = self._crosswalks["state"]["state"]
return set(crosswalk["state_id"]) # pylint: disable=unsubscriptable-object
return set(crosswalk["state_id"])
if container_geocode_type == "hhs":
crosswalk_hhs = self._crosswalks["fips"]["hhs"]
crosswalk_state = self._crosswalks["fips"]["state"]
Expand Down
11 changes: 6 additions & 5 deletions _delphi_utils_python/delphi_utils/logger.py
Original file line number Diff line number Diff line change
@@ -1,9 +1,10 @@
"""Structured logger utility for creating JSON logs."""
"""Structured logger utility for creating JSON logs.
# the Delphi group uses two ~identical versions of this file.
# try to keep them in sync with edits, for sanity.
# https://github.com/cmu-delphi/covidcast-indicators/blob/main/_delphi_utils_python/delphi_utils/logger.py # pylint: disable=line-too-long
# https://github.com/cmu-delphi/delphi-epidata/blob/dev/src/common/logger.py
The Delphi group uses two ~identical versions of this file.
Try to keep them in sync with edits, for sanity.
https://github.com/cmu-delphi/covidcast-indicators/blob/main/_delphi_utils_python/delphi_utils/logger.py
https://github.com/cmu-delphi/delphi-epidata/blob/dev/src/common/logger.py
"""

import contextlib
import logging
Expand Down
20 changes: 6 additions & 14 deletions _delphi_utils_python/delphi_utils/smooth.py
Original file line number Diff line number Diff line change
Expand Up @@ -304,17 +304,11 @@ def left_gauss_linear_smoother(self, signal):
n = len(signal)
signal_smoothed = np.zeros_like(signal)
# A is the regression design matrix
A = np.vstack([np.ones(n), np.arange(n)]).T # pylint: disable=invalid-name
A = np.vstack([np.ones(n), np.arange(n)]).T
for idx in range(n):
weights = np.exp(
-((np.arange(idx + 1) - idx) ** 2) / self.gaussian_bandwidth
)
AwA = np.dot( # pylint: disable=invalid-name
A[: (idx + 1), :].T * weights, A[: (idx + 1), :]
)
Awy = np.dot( # pylint: disable=invalid-name
A[: (idx + 1), :].T * weights, signal[: (idx + 1)].reshape(-1, 1)
)
weights = np.exp(-((np.arange(idx + 1) - idx) ** 2) / self.gaussian_bandwidth)
AwA = np.dot(A[: (idx + 1), :].T * weights, A[: (idx + 1), :])
Awy = np.dot(A[: (idx + 1), :].T * weights, signal[: (idx + 1)].reshape(-1, 1))
try:
beta = np.linalg.solve(AwA, Awy)
signal_smoothed[idx] = np.dot(A[: (idx + 1), :], beta)[-1]
Expand Down Expand Up @@ -389,9 +383,7 @@ def savgol_coeffs(self, nl, nr, poly_fit_degree):
if nr > 0:
warnings.warn("The filter is no longer causal.")

A = np.vstack( # pylint: disable=invalid-name
[np.arange(nl, nr + 1) ** j for j in range(poly_fit_degree + 1)]
).T
A = np.vstack([np.arange(nl, nr + 1) ** j for j in range(poly_fit_degree + 1)]).T

if self.gaussian_bandwidth is None:
mat_inverse = np.linalg.inv(A.T @ A) @ A.T
Expand All @@ -406,7 +398,7 @@ def savgol_coeffs(self, nl, nr, poly_fit_degree):
coeffs[i] = (mat_inverse @ basis_vector)[0]
return coeffs

def savgol_smoother(self, signal): # pylint: disable=inconsistent-return-statements
def savgol_smoother(self, signal):
"""Smooth signal with the savgol smoother.
Returns a convolution of the 1D signal with the Savitzky-Golay coefficients, respecting
Expand Down
2 changes: 0 additions & 2 deletions _delphi_utils_python/delphi_utils/validator/dynamic.py
Original file line number Diff line number Diff line change
Expand Up @@ -320,15 +320,13 @@ def create_dfs(self, geo_sig_df, api_df_or_error, checking_date, geo_type, signa
#
# These variables are interpolated into the call to `api_df_or_error.query()`
# below but pylint doesn't recognize that.
# pylint: disable=unused-variable
reference_start_date = recent_cutoff_date - self.params.max_check_lookbehind
if signal_type in self.params.smoothed_signals:
# Add an extra 7 days to the reference period.
reference_start_date = reference_start_date - \
timedelta(days=7)

reference_end_date = recent_cutoff_date - timedelta(days=1)
# pylint: enable=unused-variable

# Subset API data to relevant range of dates.
reference_api_df = api_df_or_error.query(
Expand Down
3 changes: 2 additions & 1 deletion _delphi_utils_python/setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,12 +2,13 @@
from setuptools import find_packages

with open("README.md", "r") as f:
long_description = f.read()
long_description = f.read()

required = [
"boto3",
"covidcast",
"cvxpy",
"darker[isort]~=2.1.1",
"epiweeks",
"freezegun",
"gitpython",
Expand Down
22 changes: 0 additions & 22 deletions _template_python/.pylintrc

This file was deleted.

5 changes: 4 additions & 1 deletion _template_python/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -17,9 +17,12 @@ install-ci: venv
pip install .

lint:
. env/bin/activate; pylint $(dir)
. env/bin/activate; pylint $(dir) --rcfile=../pyproject.toml
. env/bin/activate; pydocstyle $(dir)

format:
. env/bin/activate; darker $(dir)

test:
. env/bin/activate ;\
(cd tests && ../env/bin/pytest --cov=$(dir) --cov-report=term-missing)
Expand Down
1 change: 1 addition & 0 deletions _template_python/setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
from setuptools import find_packages

required = [
"darker[isort]~=2.1.1",
"numpy",
"pandas",
"pydocstyle",
Expand Down
24 changes: 0 additions & 24 deletions changehc/.pylintrc

This file was deleted.

5 changes: 4 additions & 1 deletion changehc/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -17,9 +17,12 @@ install-ci: venv
pip install .

lint:
. env/bin/activate; pylint $(dir)
. env/bin/activate; pylint $(dir) --rcfile=../pyproject.toml
. env/bin/activate; pydocstyle $(dir)

format:
. env/bin/activate; darker $(dir)

test:
. env/bin/activate ;\
(cd tests && ../env/bin/pytest --cov=$(dir) --cov-report=term-missing)
Expand Down
1 change: 0 additions & 1 deletion changehc/delphi_changehc/download_ftp_files.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
"""Download files from the specified ftp server."""

# standard
import datetime
import functools
from os import path

Expand Down
3 changes: 0 additions & 3 deletions changehc/delphi_changehc/sensor.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,6 @@
"""

# standard packages
import logging

# third party
import numpy as np
import pandas as pd
Expand Down
Loading

0 comments on commit 6912077

Please sign in to comment.