Execution fails in colbert.index_objs() with assert classname.endswith('Vector') #340

Open
MayaBercovitch opened this issue May 2, 2024 · 1 comment


Hi,
Can you please explain why I'm getting this error?
I have a function using ColBERT to index some text objects and it keeps failing with this error. I'm using the CPU version.

Thanks


The full traceback is:
Traceback (most recent call last):
  File ".pyenv/versions/3.11.5/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File ".pyenv/versions/3.11.5/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File ".pyenv/versions/solid3.11/lib/python3.11/site-packages/colbert/infra/launcher.py", line 134, in setup_new_process
    return_val = callee(config, *args)
                 ^^^^^^^^^^^^^^^^^^^^^
  File ".pyenv/versions/solid3.11/lib/python3.11/site-packages/colbert/indexing/collection_indexer.py", line 33, in encode
    encoder.run(shared_lists)
  File "pyenv/versions/solid3.11/lib/python3.11/site-packages/colbert/indexing/collection_indexer.py", line 68, in run
    self.train(shared_lists) # Trains centroids from selected passages
    ^^^^^^^^^^^^^^^^^^^^^^^^
  File ".pyenv/versions/solid3.11/lib/python3.11/site-packages/colbert/indexing/collection_indexer.py", line 232, in train
    centroids = self._train_kmeans(sample, shared_lists)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".pyenv/versions/solid3.11/lib/python3.11/site-packages/colbert/indexing/collection_indexer.py", line 304, in _train_kmeans
    centroids = compute_faiss_kmeans(*args_)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".pyenv/versions/solid3.11/lib/python3.11/site-packages/colbert/indexing/collection_indexer.py", line 507, in compute_faiss_kmeans
    kmeans.train(sample)
  File ".pyenv/versions/solid3.11/lib/python3.11/site-packages/faiss/extra_wrappers.py", line 564, in train
    centroids = faiss.vector_float_to_array(clus.centroids)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".pyenv/versions/solid3.11/lib/python3.11/site-packages/faiss/array_conversions.py", line 109, in vector_float_to_array
    return vector_to_array(v)
           ^^^^^^^^^^^^^^^^^^
  File "pyenv/versions/solid3.11/lib/python3.11/site-packages/faiss/array_conversions.py", line 109, in vector_to_array
    assert classname.endswith('Vector')
AssertionError
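
For readers following the traceback: the last frame fails on a check of a class name, after k-means training has already been invoked. The sketch below only illustrates what such a check does. It assumes `classname` is taken from the class of the object passed in (the traceback does not show that line), and `ExampleFloatVector` and `OpaqueHandle` are hypothetical stand-ins, not faiss types.

```python
class ExampleFloatVector:
    """Hypothetical stand-in: class name ends in 'Vector'."""


class OpaqueHandle:
    """Hypothetical stand-in: class name does not end in 'Vector'."""


def passes_vector_check(obj):
    # Assumption: the class name is read from the object itself.
    classname = obj.__class__.__name__
    # Same condition as the assertion in the final traceback frame.
    return classname, classname.endswith("Vector")


for candidate in (ExampleFloatVector(), OpaqueHandle()):
    name, ok = passes_vector_check(candidate)
    status = "passes the check" if ok else "would raise AssertionError"
    print(f"{name}: {status}")
```

If the check works this way, the assertion says that `clus.centroids` arrived as an object of an unexpected type, which is a statement about the faiss Python wrapper's return value rather than directly about the indexed text.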

Luffy241 commented Sep 9, 2024

Yeah, I am facing the same issue while using fitz, but if I change to pdfplumber for text extraction it works.
Can you help me clarify the issue?
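
Since the outcome reportedly changes with the text extractor, one thing that may be worth comparing is the extracted passages themselves before they reach the indexer. The following is a minimal sketch, not a confirmed diagnosis: `summarize_passages` and the `min_chars` threshold are hypothetical helpers introduced here, and the sample lists are made-up placeholders for the output of each extractor.

```python
def summarize_passages(passages, min_chars=20):
    """Count passages that are empty or very short after stripping whitespace."""
    stripped = [p.strip() for p in passages]
    return {
        "total": len(stripped),
        "empty": sum(1 for p in stripped if not p),
        "short": sum(1 for p in stripped if 0 < len(p) < min_chars),
    }


# Placeholder data standing in for the text produced by two extractors.
sample_a = ["First page text that is long enough to index.", "", "   \n"]
sample_b = ["First page text that is long enough to index.", "Second page, also fine here."]

print("extractor A:", summarize_passages(sample_a))
print("extractor B:", summarize_passages(sample_b))
```

Running the same summary on the real output of both extractors would show whether they differ in passage count or in the number of empty passages, which would help narrow down whether the input is involved at all.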
