CTX-5783: Mypy Linter Action #171

Merged: 38 commits, Aug 20, 2024
Changes shown from 23 commits

Commits (38)
f516935
CTX-5783: Fixed coretex-jobs linter errors
Jun 28, 2024
7c1fe66
CTX-5783: Additional mypy error fixes
Jul 1, 2024
9d536c9
Merge branch 'develop' into CTX-5783
Jul 4, 2024
9930eb7
CTX-5783: Additional typing improvements
Jul 4, 2024
8e2906d
CTX-5783: Added missing mypy.ini files
Jul 4, 2024
3a272f5
CTX-5783: Additional linter errors fixed
Jul 4, 2024
0d76879
CTX-5783: Added type: ignores
Jul 5, 2024
9cd99b3
CTX-5783: Added github action
Jul 5, 2024
861da8e
CTX-5783: Removed .mypy.ini from audio-analytics
Jul 5, 2024
132183f
CTX-5783: Fixed dependencies
Jul 5, 2024
abbce15
CTX-5783: Removed hardcoded tensorflow versions in requirements.txts
Jul 8, 2024
827d3d4
CTX-5783: Extracted code to bash script and explicit python version u…
Jul 9, 2024
f5f4f1e
Merge branch 'develop' into CTX-5783
Jul 9, 2024
ad82602
CTX-5783: Fixed mypy errors for sql-connector
Jul 9, 2024
da7722b
first commit
Jul 17, 2024
654cc30
removed unnecessary file
Jul 17, 2024
57fe81a
parallelism implemented
Jul 17, 2024
038c363
merged CTX-5783 into develop
Jul 17, 2024
3f33f7e
deleted debugging lines from scripts
Jul 17, 2024
5429dea
CTX-5783: deleted unnecessary lines
Jul 17, 2024
c19c3b0
CTX-5783: removed unnecessary lines
Jul 17, 2024
26ac0b5
Merge pull request #2 from nemanjamt/CTX-5783
VukManojlovic Jul 17, 2024
e0e05fc
CTX-5783: Fixed linter errors for model-transfer and dataset-split
Jul 18, 2024
ed6cdb9
CTX-5783: Removed unecessary packages from mypy ignore imports
Aug 6, 2024
57d89ef
Merge branch 'develop' into CTX-5783
Aug 9, 2024
46d2766
CTX-5783: Fixed linter error
Aug 12, 2024
00d37cb
CTX-5783: Fixed linter error
Aug 13, 2024
84dbea9
CTX-5783: Fixed linter error (img-seg)
Aug 13, 2024
2d2fe62
CTX-5783: Fixed linter error (diffusion-fn)
Aug 13, 2024
66761eb
CTX-5783: FIxed unpacking error (image-quality-predictor)
Aug 13, 2024
c86d52b
CTX-5783: Fixed linter error (translation-ollama)
Aug 14, 2024
5148830
CTX-5783: Fixed linter error (tabular-data-diagnostics)
Aug 14, 2024
f19db18
CTX-5783: Fixed linter error (translation-ollama)
Aug 14, 2024
5ab69a0
CTX-5783: Fixed linter error (translation-ollama)
Aug 14, 2024
a6f6932
CTX-5783: Removed ignore_missing_imports
Aug 20, 2024
6850068
Merge branch 'develop' into CTX-5783
Aug 20, 2024
4048cd4
CTX-5783: Fixed linter error (image-extractor)
Aug 20, 2024
eb35f64
CTX-5783: Fixed linter error (synthetic-image-generator)
Aug 20, 2024
49 changes: 49 additions & 0 deletions .github/workflows/linter-code-check.sh
@@ -0,0 +1,49 @@
#!/bin/bash

eval "$(conda shell.bash hook)"
dir=$1
echo "Checking directory: $dir"

# Skip the directory if no .mypy.ini file is found
if [ ! -f "$dir/.mypy.ini" ]; then
    echo "No .mypy.ini file found in $dir, skipping..."
    exit 0
fi

if [ -f "$dir/environment.yml" ]; then
    echo "Setting up conda environment for $dir"
    conda env create -n $(basename "$dir") -f "$dir/environment.yml"
    echo "Created conda environment"
    conda activate $(basename "$dir")
    pip install mypy
elif [ -f "$dir/requirements.txt" ]; then
    echo "Setting up venv for $dir"
    python3.9 -m venv "$dir/venv"
    echo "activate venv"
    source "$dir/venv/bin/activate"
    echo "install requirements"
    pip install --upgrade pip
    pip install -r "$dir/requirements.txt"
    pip install mypy
fi

echo "Running mypy in $dir"
set +e  # Disable exit on error
mypy_output=$(mypy --config-file "$dir/.mypy.ini" "$dir" 2>&1)
set -e  # Re-enable exit on error

echo "$mypy_output"
if echo "$mypy_output" | grep -q 'error:'; then
    echo "Running install-types in $dir"
    mypy --install-types --non-interactive --config-file "$dir/.mypy.ini" "$dir"
fi

if [ -f "$dir/environment.yml" ]; then
    conda deactivate
    conda remove -y -n $(basename "$dir") --all
elif [ -f "$dir/requirements.txt" ]; then
    deactivate
    rm -rf "$dir/venv"
fi
43 changes: 43 additions & 0 deletions .github/workflows/linter-code-check.yml
@@ -0,0 +1,43 @@
name: Linter code check

on:
  push:
    branches:
      - main
      - stage
      - develop
  pull_request:
    types: [opened, reopened, synchronize]
    branches:
      - main
      - stage
      - develop

jobs:
  define-dirs:
    runs-on: ubuntu-latest
    outputs:
      dirs: ${{ steps.dirs.outputs.dirs }}
    steps:
      - uses: actions/checkout@v3
      - name: Define Dirs
        id: dirs
        run: result=$(echo tasks/*/ | sed 's/\([^ ]*\)/"\1",/g') && result="${result%,}" && echo "dirs=[$result]" >> "$GITHUB_OUTPUT"
  build:
    runs-on: ubuntu-latest
    needs: define-dirs
    strategy:
      matrix:
        dirs: ${{ fromJSON(needs.define-dirs.outputs.dirs) }}
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v3
        with:
          python-version: "3.9"
      - name: Install mypy globally
        run: |
          pip install mypy
      - name: Analysing templates with mypy
        run: |
          bash .github/workflows/linter-code-check.sh ${{matrix.dirs}}
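The `define-dirs` job builds a JSON array of task directories with a sed one-liner so that `build` can fan out one matrix job per directory via `fromJSON`. The same transformation can be sketched in Python as a local sanity check (the helper name is illustrative, not part of the PR):

```python
import json

def buildDirsOutput(dirs: list[str]) -> str:
    # Equivalent of the sed pipeline: quote each directory, join with
    # commas, wrap in brackets, and prefix with the output variable name
    return "dirs=" + json.dumps(dirs, separators = (",", ":"))

print(buildDirsOutput(["tasks/task-a/", "tasks/task-b/"]))
# prints dirs=["tasks/task-a/","tasks/task-b/"]
```

The resulting string is what the workflow appends to `$GITHUB_OUTPUT`, which `fromJSON` then parses back into a list for `strategy.matrix`.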
36 changes: 36 additions & 0 deletions tasks/annotated-image-extractor/.mypy.ini
@@ -0,0 +1,36 @@
# Global options:

[mypy]
python_version = 3.9
pretty = True
warn_return_any = True
warn_no_return = True
warn_redundant_casts = True
warn_unused_configs = True
warn_unused_ignores = True
warn_unreachable = True
disallow_subclassing_any = True
disallow_untyped_calls = True
disallow_untyped_defs = True
disallow_incomplete_defs = True
no_implicit_optional = True
strict_optional = True
allow_redefinition = False


# Per-module options:
# https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
[mypy-ultralytics.*]
ignore_missing_imports = True

[mypy-tensorflow.*]
ignore_missing_imports = True

[mypy-scipy.*]
ignore_missing_imports = True

[mypy-transformers.*]
ignore_missing_imports = True

[mypy-coretex.*]
ignore_missing_imports = True
2 changes: 1 addition & 1 deletion tasks/audio-analytics/main.py
@@ -3,7 +3,7 @@
import logging

from coretex import CustomDataset, TaskRun, currentTaskRun
-from coretex.nlp import AudioTranscriber
+from coretex.nlp import AudioTranscriber # type: ignore[attr-defined]

from src import text_search
from src.utils import createTranscriptionArtfacts, fetchModelFile
2 changes: 1 addition & 1 deletion tasks/audio-analytics/src/text_search.py
@@ -1,4 +1,4 @@
-from coretex.nlp import Token
+from coretex.nlp import Token # type: ignore[attr-defined]

from .occurence import EntityOccurrence

2 changes: 1 addition & 1 deletion tasks/audio-analytics/src/utils.py
@@ -7,7 +7,7 @@
import logging

from coretex import CustomSample, cache, TaskRun, folder_manager
-from coretex.nlp import Token
+from coretex.nlp import Token # type: ignore[attr-defined]

from .occurence import NamedEntityRecognitionResult

33 changes: 33 additions & 0 deletions tasks/bio-bodysite-prediction-nn/.mypy.ini
@@ -0,0 +1,33 @@
# Global options:

[mypy]
python_version = 3.9
pretty = True
warn_return_any = True
warn_no_return = True
warn_redundant_casts = True
warn_unused_configs = True
warn_unused_ignores = True
warn_unreachable = True
disallow_subclassing_any = True
disallow_untyped_calls = True
disallow_untyped_defs = True
disallow_incomplete_defs = True
no_implicit_optional = True
strict_optional = True
allow_redefinition = False


# Per-module options:
# https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
[mypy-tensorflow.*]
ignore_missing_imports = True

[mypy-scipy.*]
ignore_missing_imports = True

[mypy-matplotlib.*]
ignore_missing_imports = True

[mypy-sklearn.*]
ignore_missing_imports = True
10 changes: 8 additions & 2 deletions tasks/bio-bodysite-prediction-nn/resources/function/function.py
@@ -6,6 +6,8 @@

from coretex import folder_manager, functions

+import numpy as np
+
from load_data import loadDataAtlas
from load_data_std import loadDataStd

@@ -29,7 +31,7 @@ def unzip(inputPath: Path, dataFormat: int) -> Path:
    return inputPath


-def inference(modelInput: Path, model: Model, uniqueTaxons: dict[str, int]) -> list[str]:
+def inference(modelInput: Path, model: Model, uniqueTaxons: dict[str, int]) -> np.ndarray:
    BATCHE_SIZE = 562
    sampleCount = len(list(modelInput.iterdir()))

@@ -45,7 +47,11 @@ def response(requestData: dict[str, Any]) -> dict[str, Any]:
    with open(modelDir / "model_descriptor.json", "r") as jsonFile:
        modelDescriptor = json.load(jsonFile)

-    dataFormat = int(requestData.get("dataFormat")) # 0 - MBA, 1 - Microbiome Forensics Institute Zuric
+    dataFormatRaw = requestData.get("dataFormat")
+    if not isinstance(dataFormatRaw, str) and not isinstance(dataFormatRaw, int):
+        return functions.badRequest("Invalid dataFormat. (0 - MBA, 1 - Microbiome Forensics Institute Zuric)")
+
+    dataFormat = int(dataFormatRaw) # 0 - MBA, 1 - Microbiome Forensics Institute Zuric
    inputPath = requestData.get("inputFile")

    if not isinstance(inputPath, Path):
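The `dataFormat` fix above is a common mypy pattern: `dict.get` returns an untyped/optional value, so it is narrowed with `isinstance` checks before being passed to `int()`. A minimal standalone sketch of the same pattern (names are illustrative, not the PR's code):

```python
from typing import Any

def parseDataFormat(requestData: dict[str, Any]) -> int:
    # dict.get returns Any/None; narrow to str or int before
    # calling int(), as the diff above does for dataFormat
    dataFormatRaw = requestData.get("dataFormat")
    if not isinstance(dataFormatRaw, (str, int)):
        raise ValueError("Invalid dataFormat")
    return int(dataFormatRaw)
```

`isinstance(x, (str, int))` with a tuple of types is equivalent to the two chained `isinstance` checks in the diff.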
@@ -73,6 +73,9 @@ def loadDataAtlas(
) -> tuple[Path, dict[str, int], dict[str, int], list[str]]:

    workerCount = os.cpu_count() # This value should not exceed the total number of CPU cores
+    if workerCount is None:
+        workerCount = 1
+
    logging.info(f">> [MicrobiomeForensics] Using {workerCount} CPU cores to read the file")

    fileSize = inputPath.stat().st_size

@@ -89,8 +92,9 @@
        uniqueBodySites = pickle.load(f)

    def onProcessingFinished(future: Future) -> None:
-        if future.exception() is not None:
-            raise future.exception()
+        exception = future.exception()
+        if exception is not None:
+            raise exception

    logging.info(f">> [MicrobiomeForensics] Reading: {inputPath}")

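Both fixes in this hunk follow the same Optional-narrowing idiom: `os.cpu_count()` is typed `Optional[int]`, and `Future.exception()` returns `Optional[BaseException]`, so each value is defaulted or bound to a local before use. A self-contained sketch:

```python
import os
from concurrent.futures import Future

def getWorkerCount() -> int:
    # os.cpu_count() may return None, so fall back to a single worker
    workerCount = os.cpu_count()
    if workerCount is None:
        workerCount = 1
    return workerCount

def onProcessingFinished(future: "Future[None]") -> None:
    # binding the result to a local lets mypy narrow
    # Optional[BaseException] between the None check and the raise;
    # calling future.exception() twice defeats that narrowing
    exception = future.exception()
    if exception is not None:
        raise exception
```

This is why the diff rewrites `raise future.exception()` as a bind-then-raise: mypy cannot assume two calls to the same method return the same value.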
@@ -9,12 +9,12 @@
from objects import Sample, Taxon


-def loadDataStd(inputPath: Path, modelDir: Path, level: int) -> tuple[int, int, dict[str, int], list[int]]:
+def loadDataStd(inputPath: Path, modelDir: Path, level: int) -> tuple[Path, dict[str, int], dict[str, int], list[str]]:
    with open(modelDir / "uniqueTaxons.pkl", "rb") as f:
-        uniqueTaxons = pickle.load(f)
+        uniqueTaxons: dict[str, int] = pickle.load(f)

    with open(modelDir / "uniqueBodySites.pkl", "rb") as f:
-        uniqueBodySites = pickle.load(f)
+        uniqueBodySites: dict[str, int] = pickle.load(f)

    datasetPath = folder_manager.createTempFolder("dataset")
18 changes: 9 additions & 9 deletions tasks/bio-bodysite-prediction-nn/resources/function/model.py
@@ -16,7 +16,7 @@
from utils import convertFromOneHot


-class GatingLayer(tf.keras.layers.Layer):
+class GatingLayer(tf.keras.layers.Layer): # type: ignore[misc]

    def __init__(
        self,
@@ -83,7 +83,7 @@ def hard_sigmoid(self, x: Tensor, a: Tensor) -> Tensor:
        return x


-class Model(tf.keras.Model):
+class Model(tf.keras.Model): # type: ignore[misc]

    def __init__(
        self,
@@ -148,7 +148,7 @@ def __init__(
        self.lam = lam

        self._activation_gating = activation_gating
-        self.activation_gating = activation_gating # will overwrite _activation_gating
+        self.activation_gating = activation_gating # type: ignore[assignment]

        self.activation_pred = activation_pred

@@ -325,7 +325,7 @@ def _valid_step(self, X: Tensor, y: Tensor) -> Tensor:
        return y_pred_hot


-    def predict(self, data: tf.data.Dataset, batches: int):
+    def predict(self, data: tf.data.Dataset, batches: int) -> np.ndarray:
        y_pred: list[list[int]] = []

        for i, batch in enumerate(data):
@@ -337,7 +337,7 @@
        return convertFromOneHot(np.array(y_pred))


-    def test(self, data: tf.data.Dataset, batches: int) -> tuple[np.ndarray, np.ndarray, float]:
+    def test(self, data: tf.data.Dataset, batches: int) -> tuple[np.ndarray, np.ndarray]:

        y_pred: list[list[int]] = [] # List of one hot vectors
        y_true: list[list[int]] = []
@@ -363,7 +363,7 @@ def test_from_array(self, X: ArrayLike) -> np.ndarray:
        if type(X) == sparse.csr_matrix:
            X = X.toarray().astype(np.float32)

-        return self.soft_to_hot(self._predict_from_array(X)).numpy()
+        return self.soft_to_hot(self._predict_from_array(X)).numpy() # type: ignore[no-any-return]


    @tf.function
@@ -374,11 +374,11 @@ def _predict_from_array(self, X: ArrayLike) -> Tensor:

    @property
    def activation_gating(self) -> Callable:
-        return self._activation_gating
+        return self._activation_gating # type: ignore[return-value]


    @activation_gating.setter
-    def activation_gating(self, value: str) -> Callable:
+    def activation_gating(self, value: str) -> Callable: # type: ignore[return]
        if value == 'relu':
            self._activation_gating = tf.nn.relu
        elif value == 'l_relu':
@@ -388,7 +388,7 @@ def activation_gating(self, value: str) -> Callable:
        elif value == 'tanh':
            self._activation_gating = tf.nn.tanh
        elif value == 'none':
-            self._activation_gating = lambda x: x
+            self._activation_gating = lambda x: x # type: ignore[assignment]
        else:
            raise NotImplementedError('activation for the gating network not recognized')
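Several of the `type: ignore` comments above cluster around the `activation_gating` property/setter pair: the setter is annotated `-> Callable` (property setters should return `None`), which triggers mypy's `[return]` error. The underlying name-to-callable pattern, sketched without TensorFlow (a hypothetical minimal class, with the setter annotated `-> None` so no ignore is needed):

```python
from typing import Callable

class Gate:
    def __init__(self) -> None:
        self._activation: Callable[[float], float] = lambda x: x

    @property
    def activation(self) -> Callable[[float], float]:
        return self._activation

    @activation.setter
    def activation(self, value: str) -> None:
        # map an activation name onto the callable it denotes;
        # annotating the setter -> None avoids mypy's [return] error
        if value == "relu":
            self._activation = lambda x: max(0.0, x)
        elif value == "none":
            self._activation = lambda x: x
        else:
            raise NotImplementedError("activation for the gating network not recognized")
```

The PR keeps the original `-> Callable` annotation and suppresses the error instead, which preserves the file's existing signature at the cost of the ignore comment.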
23 changes: 14 additions & 9 deletions tasks/bio-bodysite-prediction-nn/resources/function/utils.py
@@ -1,10 +1,9 @@
-from typing import Optional
-
-from numpy.typing import ArrayLike
+from typing import Optional, Union

import numpy as np

-def oneHotEncoding(vector: ArrayLike, num_classes: Optional[int] = None) -> np.ndarray:
+
+def oneHotEncoding(vector: Union[np.ndarray, int], numClasses: Optional[int] = None) -> np.ndarray:

    """
    Converts an input 1-D vector of integers into an output
@@ -16,7 +15,7 @@ def oneHotEncoding(vector: ArrayLike, num_classes: Optional[int] = None) -> np.n
    ----------
    vector : ArrayLike
        A vector of integers
-    num_classes : int
+    numClasses : int
        Optionally declare the number of classes (can not exceed the maximum value of the vector)

    Returns
@@ -26,23 +25,29 @@ def oneHotEncoding(vector: ArrayLike, num_classes: Optional[int] = None) -> np.n

    Example
    -------
-    >>> v = np.array((1, 0, 4))
+    >>> v = np.array([1, 0, 4])
    >>> one_hot_v = oneHotEncoding(v)
    >>> print one_hot_v
    [[0 1 0 0 0]
     [1 0 0 0 0]
     [0 0 0 0 1]]
    """

-    vecLen = 1 if isinstance(vector, int) else len(vector)
+    if isinstance(vector, int):
+        vector = np.array([vector])
+
+    vecLen = vector.shape[0]

    if numClasses is None:
        numClasses = vector.max() + 1

-    result = np.zeros(shape = (vecLen, num_classes))
+    result = np.zeros(shape = (vecLen, numClasses))
    result[np.arange(vecLen), vector] = 1
    return result.astype(int)


def convertFromOneHot(matrix: np.ndarray) -> np.ndarray:
-    numOfRows = len(matrix) if isinstance(matrix, list) else matrix.shape[0]
+    numOfRows = matrix.shape[0]
    if not numOfRows > 0:
        raise RuntimeError(f">> [MicrobiomeForensics] Encountered array with {numOfRows} rows when decoding one hot vector")
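Assembled from the new lines in the hunk above, the updated `oneHotEncoding` can be exercised against the docstring's own example (a runnable sketch; only the body shown in the diff is reproduced):

```python
from typing import Optional, Union

import numpy as np

def oneHotEncoding(vector: Union[np.ndarray, int], numClasses: Optional[int] = None) -> np.ndarray:
    # promote a bare int to a 1-element array so .shape and fancy indexing work
    if isinstance(vector, int):
        vector = np.array([vector])

    vecLen = vector.shape[0]

    if numClasses is None:
        numClasses = int(vector.max()) + 1

    # one row per input element, a 1 in the column given by its value
    result = np.zeros(shape = (vecLen, numClasses))
    result[np.arange(vecLen), vector] = 1
    return result.astype(int)

print(oneHotEncoding(np.array([1, 0, 4])))  # matches the docstring example
```

Accepting `Union[np.ndarray, int]` instead of `ArrayLike` is what lets mypy verify the `.shape` and `.max()` calls that replaced the old `len()`-based branch.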