Problem with loom file #56

Closed
marzamKI opened this issue Jan 13, 2021 · 11 comments

Comments

@marzamKI

Hey,

I'm trying to use your tool on a loom file, and it's giving me an error I don't quite know how to solve.
Would you mind having a look and telling me how you think I could solve it?

Kind regards,
Margherita

(base) KI-C02Z42TFLVDM:solo marzam$ solo solo/solo_params_example.json dev_all.loom
[2021-01-13 16:12:25,205] INFO - scvi._settings | 'scvi' logger already has a StreamHandler, set its level to 10.
Cuda is not available, switching to cpu running!
[2021-01-13 16:12:25,205] INFO - scvi.dataset.loom | Preprocessing dataset
Traceback (most recent call last):
  File "/Users/marzam/miniconda3/bin/solo", line 33, in <module>
    sys.exit(load_entry_point('solo-sc', 'console_scripts', 'solo')())
  File "/Users/marzam/OneDrive - KI.SE/Mac/Documents/sequencing/ionut/doublets/solo/solo/solo/solo.py", line 119, in main
    scvi_data = LoomDataset(data_path)
  File "/Users/marzam/miniconda3/lib/python3.7/site-packages/scvi/dataset/loom.py", line 66, in __init__
    delayed_populating=delayed_populating,
  File "/Users/marzam/miniconda3/lib/python3.7/site-packages/scvi/dataset/dataset.py", line 2026, in __init__
    self.populate()
  File "/Users/marzam/miniconda3/lib/python3.7/site-packages/scvi/dataset/loom.py", line 138, in populate
    data = ds[:, select].T  # change matrix to cells by genes
  File "/Users/marzam/.local/lib/python3.7/site-packages/loompy/loompy.py", line 206, in __getitem__
    return self.layers[""][slice_]
  File "/Users/marzam/.local/lib/python3.7/site-packages/loompy/loom_layer.py", line 88, in __getitem__
    return self.ds._file['/matrix'].__getitem__(slice)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/Users/marzam/miniconda3/lib/python3.7/site-packages/h5py/_hl/dataset.py", line 777, in __getitem__
    selection = sel.select(self.shape, args, dataset=self)
  File "/Users/marzam/miniconda3/lib/python3.7/site-packages/h5py/_hl/selections.py", line 82, in select
    return selector.make_selection(args)
  File "h5py/_selector.pyx", line 272, in h5py._selector.Selector.make_selection
  File "h5py/_selector.pyx", line 183, in h5py._selector.Selector.apply_args
TypeError: Indexing arrays must have integer dtypes
(base) KI-C02Z42TFLVDM:solo marzam$ 
@njbernstein
Contributor

I haven't seen the error before. I think your indexes must be cast to integers. I can try to help troubleshoot if you are willing to send me your data or a subset of it.
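
One way to read the suggestion above, as a minimal sketch: the traceback fails at `ds[:, select]`, and if `select` is a boolean mask (an assumption; the thread does not show how it is built), it can be converted to integer column positions with `np.flatnonzero` before indexing. The helper name `to_integer_index` and the array `matrix` are hypothetical; a plain NumPy array stands in for the loom matrix, so this does not exercise h5py or loompy themselves.

```python
import numpy as np


def to_integer_index(select):
    """Return integer positions for a boolean mask; pass integer arrays through."""
    select = np.asarray(select)
    if select.dtype == bool:
        # Positions of the True entries of the mask.
        return np.flatnonzero(select)
    if not np.issubdtype(select.dtype, np.integer):
        raise TypeError(f"cannot use dtype {select.dtype} as an index")
    return select


# Hypothetical genes-by-cells count matrix standing in for the loom file.
matrix = np.array([[0, 3, 0, 1],
                   [0, 0, 0, 2]])
select = matrix.sum(axis=0) > 0   # boolean mask: cells with any counts
idx = to_integer_index(select)    # integer positions of those cells
data = matrix[:, idx].T           # cells by genes, as in the traceback
```

Whether the same conversion is enough inside scvi's loader has not been checked here; it only illustrates what "cast to integers" could mean for an index array.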

@marzamKI
Author

Hey,

Thank you for your prompt reply!
I'm using the loom files from the Linnarsson lab (http://loom.linnarssonlab.org/), in particular L6_Immature_Neurons, though I suspect I'd get the same error with the others as well.

@scSeqLiubice

Hi @marzamKI, @njbernstein,

I am running into the exact same problem when using solo on prefiltered loom files from the kallisto|bustools Python implementation. @marzamKI, did you find a solution to your problem?

This is the loom file structure, in case it helps:

>>> Bl6.adata
AnnData object with n_obs × n_vars = 198891 × 55487
    var: 'gene_name'
    layers: 'matrix', 'spliced', 'unspliced'

>>> adata.var.index
Index(['ENSMUSG00000079800.2', 'ENSMUSG00000079192.2', 'ENSMUSG00000094799.1',
       'ENSMUSG00000079794.2', 'ENSMUSG00000095092.1', 'ENSMUSG00000079190.3',
       'ENSMUSG00000095672.1', 'ENSMUSG00000094514.1', 'ENSMUSG00000095787.1',
       'ENSMUSG00000095250.1',
       ...
       'ENSMUSG00000103524.1', 'ENSMUSG00000103796.1', 'ENSMUSG00000103757.1',
       'ENSMUSG00000106618.1', 'ENSMUSG00000118466.1', 'ENSMUSG00000118447.1',
       'ENSMUSG00000118422.1', 'ENSMUSG00000118472.1', 'ENSMUSG00000118404.1',
       'ENSMUSG00000118417.1'],
      dtype='object', length=55487)

>>> adata.obs.index
Index(['AAACCCAAGAAACCCG', 'AAACCCAAGACAACTA', 'AAACCCAAGACGCAGT',
       'AAACCCAAGACGGTTG', 'AAACCCAAGACTCTTG', 'AAACCCAAGAGCAACC',
       'AAACCCAAGAGTCTGG', 'AAACCCAAGAGTCTTC', 'AAACCCAAGCACCGTC',
       'AAACCCAAGCACTCGC',
       ...
       'TTTGTTGTCTCAACGA', 'TTTGTTGTCTCATTTG', 'TTTGTTGTCTGGCTGG',
       'TTTGTTGTCTGTAACG', 'TTTGTTGTCTGTCTCG', 'TTTGTTGTCTGTGCAA',
       'TTTGTTGTCTGTGCTC', 'TTTGTTGTCTGTTGGA', 'TTTGTTGTCTTACTGT',
       'TTTGTTGTCTTCACGC'],
      dtype='object', length=198891)

@njbernstein
Contributor

So sorry about the slow response. Working on this today.

@njbernstein
Contributor

njbernstein commented Feb 3, 2021

@marzamKI @scSeqLiubice I'm unable to recreate this error with the L6_Immature_Neurons loom file from http://loom.linnarssonlab.org/.

Do you get the same results if you run the following:

import scvi
from scvi.dataset import AnnDatasetFromAnnData, LoomDataset, \
    GeneExpressionDataset, Dataset10X
scvi_data = LoomDataset(data_path)

on its own?

Again sorry about the slow response.

@njbernstein
Contributor

Which version of solo-sc are you both using? If you are not using the current version in master, could you try that version?
Here is how to install it:

git clone git@github.com:calico/solo.git && cd solo && conda create -n solo python=3.6 && conda activate solo && pip install -e .

@scSeqLiubice

scSeqLiubice commented Feb 4, 2021

Hi @njbernstein,

I really appreciate your response!
I'm using solo-sc 0.6, but I did not use conda for installing; instead I ran pip install -e . directly.

I tried using scvi on its own, and I get the same error while it removes non-expressing cells. So maybe this issue is more scvi related?

>>> scvi_data=LoomDataset(pth + '/counts_unfiltered.Bl6.46/adata.loom')
[2021-02-04 10:08:47,833] INFO - scvi.dataset.loom | Preprocessing dataset
[2021-02-04 10:11:47,726] WARNING - scvi.dataset.loom | Removing non-expressing cells
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/site-packages/scvi/dataset/loom.py", line 62, in __init__
    super().__init__(
  File "/usr/local/lib/python3.8/site-packages/scvi/dataset/dataset.py", line 2026, in __init__
    self.populate()
  File "/usr/local/lib/python3.8/site-packages/scvi/dataset/loom.py", line 138, in populate
    data = ds[:, select].T  # change matrix to cells by genes
  File "/usr/local/lib/python3.8/site-packages/loompy/loompy.py", line 206, in __getitem__
    return self.layers[""][slice_]
  File "/usr/local/lib/python3.8/site-packages/loompy/loom_layer.py", line 88, in __getitem__
    return self.ds._file['/matrix'].__getitem__(slice)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/usr/local/lib/python3.8/site-packages/h5py/_hl/dataset.py", line 777, in __getitem__
    selection = sel.select(self.shape, args, dataset=self)
  File "/usr/local/lib/python3.8/site-packages/h5py/_hl/selections.py", line 82, in select
    return selector.make_selection(args)
  File "h5py/_selector.pyx", line 272, in h5py._selector.Selector.make_selection
  File "h5py/_selector.pyx", line 183, in h5py._selector.Selector.apply_args
TypeError: Indexing arrays must have integer dtypes

Package incompatibilities? I'm using scvi 0.6.7.
Thanks for your work on this!

Edit:
I reread your comment and saw that you specifically create a Python 3.6 env. I installed into Python 3.8; I will use the Python 3.6 approach and try again.
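
Since the question here is whether package versions are incompatible, a small report of what is actually installed may help when comparing environments. This is a minimal sketch: the helper name `installed_versions` is hypothetical, and the distribution names in the list are assumptions taken from the tracebacks and comments in this thread.

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:  # Python 3.7 and earlier need the importlib_metadata backport
    from importlib_metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names assumed from the tracebacks and comments in this thread.
report = installed_versions(["solo-sc", "scvi", "loompy", "h5py", "numpy"])
for name, ver in report.items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Pasting this output alongside the traceback makes it easier to compare a failing environment with a working one.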

@njbernstein
Contributor

@scSeqLiubice 0.6.7 should work. I'm gonna change the default to that.

Can you preprocess your dataset to remove empty cells?

If not, can you send me your data so I can troubleshoot more easily?

@marzamKI I believe changing the scVI version to 0.6.7 should solve your issue.
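
The preprocessing step asked about above (removing empty cells before handing the file to solo) could look roughly like this. It is a sketch on an in-memory NumPy array in loom orientation (genes by cells); the function name `drop_empty_cells` and the toy barcodes are hypothetical, and reading from or writing back to a loom file is left out. It does not by itself claim to avoid the TypeError.

```python
import numpy as np


def drop_empty_cells(matrix, barcodes):
    """Remove cells (columns) whose total count is zero.

    matrix   : genes-by-cells array of counts (loom orientation)
    barcodes : one label per cell, in the same order as the columns
    """
    matrix = np.asarray(matrix)
    barcodes = np.asarray(barcodes)
    if matrix.shape[1] != barcodes.shape[0]:
        raise ValueError("need exactly one barcode per column")
    keep = matrix.sum(axis=0) > 0  # True for cells with at least one count
    return matrix[:, keep], barcodes[keep]


counts = np.array([[0, 5, 0],
                   [0, 1, 0],
                   [0, 0, 0]])
cells = ["AAAC", "AAAG", "AAAT"]  # made-up barcodes
filtered, kept = drop_empty_cells(counts, cells)
print(filtered.shape, list(kept))
```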

@carloelle

carloelle commented Feb 25, 2021

Hi,

I had the same issue, but I worked around it by using older versions of numpy and h5py. I'm using Python 3.7.
@marzamKI, try uninstalling numpy and h5py and then run:

pip install h5py==2.10.0
pip install numpy==1.19.5

In my case, when I run solo jsonfile.json dataset.loom again, I no longer get the TypeError: Indexing arrays must have integer dtypes; solo finishes preprocessing the data and starts training the neural network.
It seems to be an h5py/numpy issue, see here.

Best,
Carlo

@ymahmoud

Hi! I had the same issue and this worked for me. Thanks!

@njbernstein
Contributor

@davek44 can you close this out?

@davek44 davek44 closed this as completed Oct 4, 2023
6 participants