Update Pytorch and Pytorch-Geometric versions #105

Merged
merged 31 commits into from
Feb 6, 2024
Changes from 15 commits
Commits
123757c
Update environment.yml with newer torch version and less straining re…
CharlesGaydon Dec 20, 2023
99257b5
Fix all incorrect imports and invalid Trainer flag
CharlesGaydon Dec 20, 2023
9b84ca0
Fix logging
CharlesGaydon Jan 2, 2024
cc51008
Fix loading of model at inference time
CharlesGaydon Jan 2, 2024
35492a2
Do not change API: keep predict.ckpt_path as parameter for inference
CharlesGaydon Jan 2, 2024
d527e55
Fix use of cpu accelerator in test using the new syntax
CharlesGaydon Jan 2, 2024
99e60f3
Refactor iou into a separate script for clarity
CharlesGaydon Jan 2, 2024
6167fde
Revert conda environment name to myria3d
CharlesGaydon Jan 2, 2024
83e34d1
Load checkpoint via the parent class directly
CharlesGaydon Jan 2, 2024
d28b756
Update version and changelog to V3.7.0
CharlesGaydon Jan 2, 2024
90a2622
Merge branch 'main' into upgrade-torch
CharlesGaydon Jan 2, 2024
86e4977
Downgrade to pytorch-lightning==2.0.8 to avoid error
CharlesGaydon Jan 2, 2024
11eaaa1
Merge branch 'main' into upgrade-torch
CharlesGaydon Jan 3, 2024
4c7c208
Manually reset the metrics after each end of epoch
CharlesGaydon Jan 8, 2024
03d12c0
Update signature of class LogLogsPath's setup hook
CharlesGaydon Jan 8, 2024
05c681c
Rename conda env to myria3d
CharlesGaydon Jan 10, 2024
ee70453
Remove dead comments in environment.yml
CharlesGaydon Jan 10, 2024
2c4edba
Install with conda whenever possible in environment.yml
CharlesGaydon Jan 10, 2024
dac04f2
Revert name of conda env to myria3d
CharlesGaydon Jan 10, 2024
2300917
Update docker image to use mamba based image + use conda packages as …
leavauchier Jan 11, 2024
f57d9af
Add proxy parameters in gh action
leavauchier Jan 30, 2024
deecc64
Use root user to build conda env in docker image
leavauchier Jan 30, 2024
9387ddb
No need to save criterion as a hyperparameter since already checkpointed
CharlesGaydon Feb 1, 2024
9c6a1a6
Fix setting model.criterion, using kwargs instead of hparams.
CharlesGaydon Feb 1, 2024
997b4b3
refactor: follow python conventions of lowercase/uppercase use
CharlesGaydon Feb 6, 2024
72c2195
Merge branch 'main' into upgrade-torch
CharlesGaydon Feb 6, 2024
03d4921
dev: autofind available gpu in tests
CharlesGaydon Feb 6, 2024
1a15143
Changelog: indicate refactor of single-class IoUs
CharlesGaydon Feb 6, 2024
9033a39
Flake8
CharlesGaydon Feb 6, 2024
ca919f2
Mention the retrocompatibility of changes and the need to update pred…
CharlesGaydon Feb 6, 2024
e5f8a64
Update config used in cicd
CharlesGaydon Feb 6, 2024
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -1,9 +1,11 @@
# CHANGELOG

## 3.7.0
- Update all versions of Pytorch, Pytorch Lightning, and Pytorch Geometric.

### 3.6.1
- Set urllib3<2 for comet logging to function and add back seaborn for plotting optimal LR graph.

## 3.6.0
- Remove the "EPSG:2154" by default and use the metadata of the lidar file, unless a parameter is given.

6 changes: 0 additions & 6 deletions configs/callbacks/default.yaml
@@ -12,12 +12,6 @@ lr_monitor:
logging_interval: "step"
log_momentum: true

# This logs IoU at validation and test time
# Predictions are aggregated and saved at test time in a way coherent with prediction logic.
log_iou_by_class:
_target_: myria3d.callbacks.logging_callbacks.LogIoUByClass
classification_dict: ${dataset_description.classification_dict}

model_checkpoint:
_target_: pytorch_lightning.callbacks.ModelCheckpoint
monitor: "val/loss_epoch" # name of the logged metric which determines when model is improving
1 change: 0 additions & 1 deletion configs/experiment/DebugFineTune.yaml
@@ -18,7 +18,6 @@ trainer:
limit_test_batches: 1
max_epochs: 1
num_sanity_val_steps: 0
# gpus: [1]

callbacks:
finetune:
3 changes: 1 addition & 2 deletions configs/experiment/RandLaNet_base_run_FR-MultiGPU.yaml
@@ -10,5 +10,4 @@ trainer:
strategy: ddp_find_unused_parameters_false
# Replace by cpu to simulate multi-cpus training.
accelerator: gpu
num_processes: 2
gpus: 2
devices: 2
9 changes: 1 addition & 8 deletions configs/model/default.yaml
@@ -3,6 +3,7 @@ _target_: myria3d.models.model.Model
## Inputs and outputs
d_in: ${dataset_description.d_in} # XYZ (3) + Other features (N)
num_classes: ${dataset_description.num_classes}
classification_dict: ${dataset_description.classification_dict}

# Architecture defined in sub-configs
ckpt_path: null # str, for resuming training and finetuning.
@@ -13,14 +14,6 @@ neural_net_hparams: ???
interpolation_k: ${predict.interpolator.interpolation_k} # interpolation at eval time
num_workers: 4 # for knn_interpolate

## Evaluation metric - partial for triple (train/val/test) init
iou:
_target_: functools.partial
_args_:
- "${get_method:torchmetrics.JaccardIndex}"
- ${model.num_classes}
absent_score: 1.0 # do not penalize if a class is absent from labels.

## Optimization
momentum: 0.9 # arbitrary
monitor: "val/loss_epoch"
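
Note on the removed `iou` entry: it wrapped the metric class in `functools.partial` so that one config entry could build three independent metric objects (train/val/test). A minimal sketch of that pattern follows. `ToyIoU` is a hypothetical stand-in written for this note, not `torchmetrics.JaccardIndex`, and its handling of `absent_score` is an assumption based on the config comment ("do not penalize if a class is absent from labels").

```python
from functools import partial


class ToyIoU:
    """Hypothetical stand-in for a stateful IoU metric (not the torchmetrics class)."""

    def __init__(self, num_classes, absent_score=0.0):
        self.num_classes = num_classes
        self.absent_score = absent_score
        self.reset()

    def reset(self):
        # Per-class counters, cleared between epochs.
        self.intersection = [0] * self.num_classes
        self.union = [0] * self.num_classes

    def update(self, preds, targets):
        for p, t in zip(preds, targets):
            if p == t:
                self.intersection[p] += 1
                self.union[p] += 1
            else:
                # A mismatch enlarges the union of both classes involved.
                self.union[p] += 1
                self.union[t] += 1

    def compute(self):
        # Classes never seen in preds or targets get absent_score.
        scores = [
            (i / u) if u > 0 else self.absent_score
            for i, u in zip(self.intersection, self.union)
        ]
        return sum(scores) / self.num_classes


# One partial, three independent instances with identical settings.
make_iou = partial(ToyIoU, 3, absent_score=1.0)
train_iou, val_iou, test_iou = make_iou(), make_iou(), make_iou()

train_iou.update(preds=[0, 0, 1, 1], targets=[0, 1, 1, 1])
train_score = train_iou.compute()
val_score = val_iou.compute()  # untouched instance: every class is absent
```

Each call to `make_iou()` returns a fresh object, so updating the train metric leaves the validation and test metrics unchanged.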
2 changes: 1 addition & 1 deletion configs/predict/default.yaml
@@ -1,7 +1,7 @@
src_las: "/path/to/input.las" # Any glob pattern can be used to predict on multiple files.
output_dir: "/path/to/output_dir/" # Predictions are saved in a new file which shares src_las basename.
ckpt_path: "/path/to/lightning_model.ckpt" # Checkpoint of trained model.
gpus: 0 # 0 for none, 1 for one, [gpu_id] to specify which gpu to use e.g [1]
gpus: 0

# Probas interpolation parameters
# subtile_overlap=25 to use a sliding window of inference of which predictions will be merged.
3 changes: 2 additions & 1 deletion configs/task/default.yaml
@@ -1,2 +1,3 @@
# Task at hand. Can be train or predict
task_name: fit # "fit" or "test" or "fit+test", or "predict", or "finetune"
task_name: fit # "fit" or "test" or "fit+test", or "predict", or "finetune"
auto_lr_find: false # override with true to run the LR-range test in train.py.
50 changes: 0 additions & 50 deletions configs/trainer/all_params.yaml

This file was deleted.

12 changes: 4 additions & 8 deletions configs/trainer/default.yaml
@@ -1,14 +1,10 @@
_target_: pytorch_lightning.Trainer

# set `1` to train on GPU, `0` to train on CPU only
gpus: 0

min_epochs: 1
max_epochs: 1300
log_every_n_steps: 1

weights_summary: null
progress_bar_refresh_rate: 1

auto_lr_find: false # override with true to run the LR-range test in train.py.

# set to gpu for gpu training (if devices > 1, set ddp_find_unused_parameters_false: true)
accelerator: cpu
devices: 1
num_nodes: 1
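
Note on the trainer configs: the hunks above replace `gpus` (and `num_processes`) with `accelerator` plus `devices`, and drop `weights_summary`, `progress_bar_refresh_rate` and `auto_lr_find` from the Trainer arguments. The same translation can be sketched as a plain dict transformation. `migrate_trainer_kwargs` is a hypothetical helper written for this note and is not part of myria3d or Lightning. Using `num_processes` as the CPU device count is also an assumption.

```python
def migrate_trainer_kwargs(old):
    """Hypothetical helper: rewrite legacy Trainer kwargs in the accelerator/devices style."""
    # Keys that this PR removes from the trainer configs.
    removed = {
        "gpus",
        "num_processes",
        "weights_summary",
        "progress_bar_refresh_rate",
        "auto_lr_find",
    }
    new = {key: value for key, value in old.items() if key not in removed}

    gpus = old.get("gpus", 0)
    if not gpus:
        # gpus: 0 (or unset) meant CPU-only training.
        new.setdefault("accelerator", "cpu")
        # Assumption: the former num_processes becomes the CPU device count.
        new.setdefault("devices", old.get("num_processes", 1))
    else:
        # gpus: 2 or gpus: [1] carries over unchanged as the devices value.
        new["accelerator"] = "gpu"
        new["devices"] = gpus
    return new


legacy_default = {
    "gpus": 0,
    "min_epochs": 1,
    "max_epochs": 1300,
    "weights_summary": None,
    "progress_bar_refresh_rate": 1,
    "auto_lr_find": False,
}
legacy_multi_gpu = {
    "strategy": "ddp_find_unused_parameters_false",
    "accelerator": "gpu",
    "num_processes": 2,
    "gpus": 2,
}

new_default = migrate_trainer_kwargs(legacy_default)
new_multi_gpu = migrate_trainer_kwargs(legacy_multi_gpu)
```

With these inputs, `new_default` ends up with `accelerator: cpu` and `devices: 1`, and `new_multi_gpu` with `accelerator: gpu` and `devices: 2`, matching the values in the updated YAML files.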
7 changes: 3 additions & 4 deletions docs/source/apidoc/default_config.yml
@@ -5,13 +5,11 @@ print_config: true
ignore_warnings: true
trainer:
_target_: pytorch_lightning.Trainer
gpus: 0
accelerator: cpu
devices: 1
min_epochs: 1
max_epochs: 1
log_every_n_steps: 1
weights_summary: null
progress_bar_refresh_rate: 1
auto_lr_find: false
limit_train_batches: 1
limit_val_batches: 1
limit_test_batches: 1
@@ -253,6 +251,7 @@ logger:
disabled: true
task:
task_name: fit
auto_lr_find: false
predict:
src_las: /path/to/input.las
output_dir: /path/to/output_dir/
2 changes: 1 addition & 1 deletion docs/source/guides/train_new_model.md
@@ -36,7 +36,7 @@ After training, you model best checkpoints and hydra config will be saved in a `
### Optimized learning rate

Pytorch Lightning supports an [automated learning rate finder](https://pytorch-lightning.readthedocs.io/en/stable/common/trainer.html#auto-lr-find), by means of a Learning Rate-range test (see section 3.3 in [this paper](https://arxiv.org/pdf/1506.01186.pdf) for reference).
You can perform this automatically before training by setting `trainer.auto_lr_find=true` when calling training on your dataset. The best learning rate will be logged and results saved as an image, so that you do not need to perform this test more than once.
You can perform this automatically before training by setting `task.auto_lr_find=true` when calling training on your dataset. The best learning rate will be logged and results saved as an image, so that you do not need to perform this test more than once.

### Multi-GPUs

29 changes: 13 additions & 16 deletions environment.yml
@@ -1,17 +1,14 @@
# Simple install with
# mamba env create -f environment.yml
name: myria3d
name: myria3d_latest_pytorch
channels:
- conda-forge
- anaconda
dependencies:
- python==3.9.*
- pip
# cudatoolkit to specify the cuda driver in the conda env
- conda-forge::cudatoolkit=11.3.1 # single equal sign there, not a typo
- numba==0.55.1
# --------- data formats --------- #
- numpy==1.20
# - numpy
- h5py
# --------- geo --------- #
- pygeos
@@ -38,26 +35,26 @@ dependencies:
- pip:
# --------- Deep Learning --------- #
# Extra index may need to be on first line
- --extra-index-url https://download.pytorch.org/whl/cu113
- torch==1.11.*
- --extra-index-url https://download.pytorch.org/whl/cu118
- torch==2.1.*
- torchvision
- pytorch-lightning==1.5.9
- torchmetrics==0.7.* # Else, pytorch-lightning will install the latest
- comet_ml==3.31.*
- pytorch-lightning==2.0.8
- torchmetrics
- comet_ml==3.31.* # Version to update!
- torch_geometric
- urllib3<2 # To solve for https://github.com/GeneralMills/pytrends/issues/591
# Wheels for torch-geometric optional dependencies
- https://data.pyg.org/whl/torch-1.11.0%2Bcu113/torch_cluster-1.6.0-cp39-cp39-linux_x86_64.whl
- https://data.pyg.org/whl/torch-1.11.0%2Bcu113/torch_scatter-2.0.9-cp39-cp39-linux_x86_64.whl
- https://data.pyg.org/whl/torch-1.11.0%2Bcu113/torch_sparse-0.6.14-cp39-cp39-linux_x86_64.whl
- git+https://github.com/pyg-team/[email protected]
- https://data.pyg.org/whl/torch-2.1.0%2Bcu118/torch_cluster-1.6.3%2Bpt21cu118-cp39-cp39-linux_x86_64.whl
- https://data.pyg.org/whl/torch-2.1.0%2Bcu118/torch_scatter-2.1.2%2Bpt21cu118-cp39-cp39-linux_x86_64.whl
- https://data.pyg.org/whl/torch-2.1.0%2Bcu118/torch_sparse-0.6.18%2Bpt21cu118-cp39-cp39-linux_x86_64.whl
# Nota: if libcusparse.so.11. errors occur, run
# export LD_LIBRARY_PATH="/home/${USER}/miniconda/envs/lib:$LD_LIBRARY_PATH"
# or
# export LD_LIBRARY_PATH="/home/${USER}/anaconda3/envs/lib:$LD_LIBRARY_PATH"
# see https://github.com/pyg-team/pytorch_geometric/issues/2040#issuecomment-766610625
# --------- Visualization --------- #
- pandas==1.4.*
- matplotlib==3.5.*
- pandas
- matplotlib
# --------- hydra configs --------- #
- hydra-core==1.1.*
- hydra-colorlog==1.1.*
8 changes: 4 additions & 4 deletions myria3d/callbacks/comet_callbacks.py
@@ -12,7 +12,7 @@
from typing import Optional

from pytorch_lightning import Callback, Trainer
from pytorch_lightning.loggers import CometLogger, LoggerCollection
from pytorch_lightning.loggers import CometLogger
from pytorch_lightning.utilities import rank_zero_only

from myria3d.utils import utils
@@ -27,7 +27,7 @@ def get_comet_logger(trainer: Trainer) -> Optional[CometLogger]:
if isinstance(trainer.logger, CometLogger):
return trainer.logger

if isinstance(trainer.logger, LoggerCollection):
if isinstance(trainer.logger, list):
for logger in trainer.logger:
if isinstance(logger, CometLogger):
return logger
@@ -65,9 +65,9 @@ class LogLogsPath(Callback):
"""Logs run working directory to comet.ml"""

@rank_zero_only
def on_init_end(self, trainer):
def setup(self, trainer, pl_module, stage):
logger = get_comet_logger(trainer=trainer)
if logger:
log_path = os.getcwd()
log.info(f"----------------\n LOGS DIR is {log_path}\n ----------------")
logger.experiment.log_parameter("experiment_logs_dirpath", log_path)
logger.experiment.log_parameter("experiment_logs_dirpath", log_path)
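
Note on `get_comet_logger`: with `LoggerCollection` gone, the lookup has to work on a plain list of loggers. In recent Lightning versions, as far as I know, `trainer.logger` returns a single logger and the full list is exposed as `trainer.loggers`. This is worth checking against the pinned `pytorch-lightning==2.0.8`. The sketch below shows a list-based lookup under that assumption. `FakeCometLogger`, `FakeOtherLogger` and `FakeTrainer` are stand-ins written for this note so it runs without Lightning or comet_ml installed.

```python
from typing import Optional


class FakeCometLogger:
    """Stand-in for pytorch_lightning.loggers.CometLogger (illustration only)."""


class FakeOtherLogger:
    """Stand-in for any other logger type."""


class FakeTrainer:
    """Stand-in exposing the two attributes the lookup relies on (assumed behaviour)."""

    def __init__(self, loggers):
        self.loggers = list(loggers)

    @property
    def logger(self):
        # Assumed: the first configured logger, or None when there is none.
        return self.loggers[0] if self.loggers else None


def get_comet_logger(trainer) -> Optional[FakeCometLogger]:
    """Return the first comet logger attached to the trainer, if any."""
    for logger in trainer.loggers:
        if isinstance(logger, FakeCometLogger):
            return logger
    return None


comet = FakeCometLogger()
found = get_comet_logger(FakeTrainer([FakeOtherLogger(), comet]))
missing = get_comet_logger(FakeTrainer([FakeOtherLogger()]))
```

Iterating over `trainer.loggers` covers the zero, one and many logger cases with a single loop, so no separate `isinstance(trainer.logger, list)` branch is needed.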