Optimize table crashes #3330

Open
andrijazz opened this issue Jan 2, 2025 · 0 comments

andrijazz commented Jan 2, 2025

My table has around 100k rows. I populated the table with Ray's write_datasink. The optimize operation crashes with the following error. I can provide the database if needed.

>>> tbl.optimize()
thread 'tokio-runtime-worker' panicked at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-data-52.2.0/src/transform/utils.rs:42:56:
offset overflow
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'tokio-runtime-worker' panicked at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/lance-encoding-0.19.2/src/decoder.rs:1448:65:
called `Result::unwrap()` on an `Err` value: JoinError::Panic(Id(995), "offset overflow", ...)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/lancedb/table.py", line 2091, in optimize
    asyncio.get_running_loop()
RuntimeError: no running event loop

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/lancedb/table.py", line 2099, in optimize
    asyncio.run(
  File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/dist-packages/lancedb/table.py", line 2116, in _async_optimize
    return await table.optimize(
  File "/usr/local/lib/python3.10/dist-packages/lancedb/table.py", line 2995, in optimize
    return await self._inner.optimize(cleanup_older_than, delete_unverified)
pyo3_asyncio.RustPanic: rust future panicked: unknown error

The same panic occurs when compact_files is run:

>>> tbl.compact_files()
thread 'lance_background_thread' panicked at /root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-data-52.2.0/src/transform/utils.rs:42:56:
offset overflow
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread '<unnamed>' panicked at /home/runner/work/lance/lance/rust/lance-encoding/src/decoder.rs:1448:65:
called `Result::unwrap()` on an `Err` value: JoinError::Panic(Id(1602), "offset overflow", ...)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/lancedb/table.py", line 2041, in compact_files
    return self.to_lance().optimize.compact_files(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/lance/dataset.py", line 3086, in compact_files
    return Compaction.execute(self._dataset, opts)
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: JoinError::Panic(Id(1602), "offset overflow", ...)
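
One possible reading of the "offset overflow" panic, sketched here as an assumption and not confirmed by the backtrace: when variable-length values (strings, binary blobs, lists) are merged into a single array that uses 32-bit offsets, the running byte offset can exceed 2^31 - 1. The helper below is a hypothetical pure-Python illustration of that arithmetic. It is not lance or arrow code, and the per-row size used in the usage note is invented for the example.

```python
# Hypothetical illustration only; this is not the lance or arrow implementation.
I32_MAX = 2**31 - 1  # largest value a signed 32-bit offset can hold


def append_offsets(offsets, value_lengths, max_offset=I32_MAX):
    """Extend cumulative offsets by the given value lengths.

    Raises OverflowError once the running offset no longer fits in
    max_offset, mimicking a checked add on 32-bit offsets.
    """
    out = list(offsets)
    last = out[-1]
    for length in value_lengths:
        last += length
        if last > max_offset:
            raise OverflowError("offset overflow")
        out.append(last)
    return out


def total_fits_in_i32(value_lengths):
    """Return True if all values could share one 32-bit-offset array."""
    return sum(value_lengths) <= I32_MAX
```

For instance, 100,000 rows that each carry a 30,000-byte value add up to 3,000,000,000 bytes, which is above 2^31 - 1, so `append_offsets([0], [30_000] * 100_000)` raises `OverflowError`. Whether this table actually holds values of that size is not known from the report.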
