
Error on new solo version #66

Closed

cnk113 opened this issue Aug 9, 2021 · 8 comments

cnk113 commented Aug 9, 2021

Hello,

I updated to the newest version on GitHub as well as scvi-tools.
However, when I start it, it fails on the first step:

Min cell depth: 500.0, Max cell depth: 40136.0                                                                                                                                                                   
INFO     No batch_key inputted, assuming all cells are same batch                                                                                                                                                
INFO     No label_key inputted, assuming all cells have same label                                                                                                                                               
INFO     Using data from adata.X                                                                                                                                                                                 
INFO     Computing library size prior per batch                                                                                                                                                                  
INFO     Successfully registered anndata object containing 14643 cells, 36601 vars, 1 batches, 1 labels, and 0 proteins. Also registered 0 extra categorical covariates and 0 extra continuous covariates.       
INFO     Please do not further modify adata until model is trained.                                                                                                                                              
GPU available: True, used: True                                                                                                                                                                                  
TPU available: False, using: 0 TPU cores                                                                                                                                                                         
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]                                                                                                                                                                        
Epoch 1/2000:   0%|                                                                              | 1/2000 [00:02<1:34:48,  2.85s/it, loss=8.43e+03, v_num=1]
...
  File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 871, in run_train
    self.train_loop.run_training_epoch()
  File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 577, in run_training_epoch
    self.trainer.optimizer_connector.update_learning_rates(interval='epoch')
  File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/pytorch_lightning/trainer/connectors/optimizer_connector.py", line 66, in update_learning_rates
    f'ReduceLROnPlateau conditioned on metric {monitor_key}'
pytorch_lightning.utilities.exceptions.MisconfigurationException: ReduceLROnPlateau conditioned on metric reconstruction_loss_validation which is not available. Available metrics are: ['train_loss_step', 'train_loss_epoch', 'train_loss', 'elbo_train', 'reconstruction_loss_train', 'kl_local_train', 'kl_global_train']. Condition can be set using `monitor` key in lr scheduler dict
Epoch 1/2000:   0%| 
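
For context on this exception: PyTorch Lightning only lets a ReduceLROnPlateau scheduler monitor a metric that the LightningModule actually logs, and the metric name is declared through the `monitor` key returned from `configure_optimizers`. A minimal, generic sketch of that convention (illustrative only, not solo's or scvi-tools' actual training code) looks like this:

```python
# Illustrative sketch of the Lightning convention the error refers to;
# this is NOT solo's or scvi-tools' training code.
import torch
import pytorch_lightning as pl


class SketchModule(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(10, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.layer(x), y)
        self.log("train_loss", loss)
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self.layer(x), y)
        # The scheduler below can only monitor a metric that gets logged;
        # if the validation loop never runs (or the metric name changes),
        # Lightning raises the MisconfigurationException shown above.
        self.log("val_loss", loss)

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer)
        return {
            "optimizer": optimizer,
            "lr_scheduler": {"scheduler": scheduler, "monitor": "val_loss"},
        }
```

The traceback suggests the scheduler here monitors `reconstruction_loss_validation`, but with this scvi-tools/Lightning combination only training metrics were logged, so the monitored name was never available.
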
njbernstein (Contributor) commented:

Hi there,

Thanks for bringing this to my attention. I will try to track down this issue today.

@njbernstein
Copy link
Contributor

Hi there,

It seems like the most recent version of scvi-tools broke this. Please roll back to:

pip install scvi-tools==0.11.0
pip install pytorch-lightning==1.2.3

I'll be pinning these in the requirements.txt shortly.
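
If it helps when checking the rollback, both packages expose a standard `__version__` attribute, so a quick sanity check of what the environment actually resolved could look like this (illustrative snippet, not part of solo):

```python
# Confirm which versions are importable in the active environment.
import pytorch_lightning
import scvi

print("scvi-tools:", scvi.__version__)
print("pytorch-lightning:", pytorch_lightning.__version__)
```
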

cnk113 (Author) commented Aug 10, 2021

I changed the versions as above, but now I get this:

Min cell depth: 500.0, Max cell depth: 40136.0                                                                                                                                                                   
INFO     No batch_key inputted, assuming all cells are same batch                                                                                                                                                
INFO     No label_key inputted, assuming all cells have same label                                                                                                                                               
INFO     Using data from adata.X                                                                                                                                                                                 
INFO     Computing library size prior per batch                                                                                                                                                                  
INFO     Successfully registered anndata object containing 14643 cells, 36601 vars, 1 batches, 1 labels, and 0 proteins. Also registered 0 extra categorical covariates and 0 extra continuous covariates.       
INFO     Please do not further modify adata until model is trained.                                                                                                                                              
GPU available: True, used: True                                                                                                                                                                                  
TPU available: None, using: 0 TPU cores                                                                                                                                                                          
Epoch 1/2000:   0%|                                                                              | 0/2000 [00:06<?, ?it/s]
Traceback (most recent call last):
  File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 1102, in call_hook
    trainer_hook(*args, **kwargs)
  File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/pytorch_lightning/trainer/callback_hook.py", line 85, in on_train_epoch_end
    callback.on_train_epoch_end(self, self.lightning_module, outputs)
  File "/home/chang/miniconda3/envs/venv/lib/python3.7/site-packages/scvi/train/_progress.py", line 89, in on_train_epoch_end
    super().on_train_epoch_end(trainer, pl_module, unused=unused)
TypeError: on_train_epoch_end() got an unexpected keyword argument 'unused'
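
For context, this TypeError comes from a change in the `on_train_epoch_end` callback hook signature between pytorch-lightning releases: scvi's progress-bar subclass forwards an `unused` keyword that the installed base class does not accept. A rough illustration of a version-tolerant override (not the fix shipped here, and assuming the `ProgressBar` base class available in the Lightning 1.x releases discussed in this thread) would forward only the keywords the parent hook declares:

```python
# Illustrative only -- not scvi's or solo's actual fix.
import inspect

from pytorch_lightning.callbacks import ProgressBar


class TolerantProgressBar(ProgressBar):
    """Forward only the keyword arguments the installed Lightning
    release's base hook actually accepts."""

    def on_train_epoch_end(self, trainer, pl_module, *args, **kwargs):
        accepted = inspect.signature(super().on_train_epoch_end).parameters
        kwargs = {k: v for k, v in kwargs.items() if k in accepted}
        super().on_train_epoch_end(trainer, pl_module, *args, **kwargs)
```

In practice the cleaner solution is the one taken below: pin compatible scvi-tools and pytorch-lightning versions rather than patching callbacks.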

njbernstein reopened this Aug 11, 2021
njbernstein (Contributor) commented:

Very weird. Okay, let me try to track this down.

njbernstein (Contributor) commented:

Okay, try now. If you still have issues after installing the latest version from GitHub in a clean environment, could you post the output of pip freeze?

cnk113 (Author) commented Aug 13, 2021

Different error now:

Min cell depth: 500.0, Max cell depth: 40022.0                                                                                                                                                                   
INFO     No batch_key inputted, assuming all cells are same batch
INFO     No label_key inputted, assuming all cells have same label                                                                                                                                               
INFO     Using data from adata.X
INFO     Computing library size prior per batch                                                                                                                                                              
INFO     Successfully registered anndata object containing 14637 cells, 36601 vars, 1 batches, 1 labels, and 0 proteins. Also registered 0 extra categorical covariates and 0 extra continuous covariates.       
INFO     Please do not further modify adata until model is trained.                                                                                                                                              
GPU available: True, used: True
TPU available: False, using: 0 TPU cores                                                                                                                                                                         
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Epoch 1/2000:   0%|                                                                                                                                   | 1/2000 [00:02<1:34:36,  2.84s/it, loss=7.04e+03, v_num=1]
File "/home/chang/miniconda3/envs/solo/lib/python3.7/site-packages/pytorch_lightning/trainer/trainer.py", line 871, in run_train
    self.train_loop.run_training_epoch() 
File "/home/chang/miniconda3/envs/solo/lib/python3.7/site-packages/pytorch_lightning/trainer/training_loop.py", line 577, in run_training_epoch
   self.trainer.optimizer_connector.update_learning_rates(interval='epoch')
File "/home/chang/miniconda3/envs/solo/lib/python3.7/site-packages/pytorch_lightning/trainer/connectors/optimizer_connector.py", line 66, in update_learning_rates
f'ReduceLROnPlateau conditioned on metric {monitor_key}'
pytorch_lightning.utilities.exceptions.MisconfigurationException: ReduceLROnPlateau conditioned on metric reconstruction_loss_validation which is not available. Available metrics are: ['train_loss_step', 'train_loss_epoch', 'train_loss', 'elbo_train', 'reconstruction_loss_train', 'kl_local_train', 'kl_global_train']. Condition can be set using `monitor` key in lr scheduler dict
Epoch 1/2000:   0%|  

pip freeze output
freeze.txt

njbernstein (Contributor) commented:

@cnk113 do a `pip install pytorch-lightning==1.3.1`, please.

cnk113 (Author) commented Aug 18, 2021

It works! Thanks for the fast turnaround.

cnk113 closed this as completed Aug 18, 2021