fix cannot import name 'escape' from 'jinja2' #183

Open: sd3ntato wants to merge 4 commits into master

@sd3ntato sd3ntato commented May 14, 2024

Together with some fixes for the dependency hell, I added a Docker Compose config.

sd3ntato added 3 commits May 14, 2024 12:26
After the previous commit, I ran:
```
python3.8 -m venv .venv38
source .venv38/bin/activate
pip install .
excalibur initdb
excalibur webserver
```

and got the app running, but when I uploaded a PDF I got this error:
```
ERROR:root:PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead.
Traceback (most recent call last):
  File "/Users/valeriomariani/Desktop/test_excalibur/excalibur/.venv38/lib/python3.8/site-packages/excalibur/tasks.py", line 22, in split
    extract_pages, total_pages = get_pages(file.filepath, file.pages)
  File "/Users/valeriomariani/Desktop/test_excalibur/excalibur/.venv38/lib/python3.8/site-packages/excalibur/utils/task.py", line 29, in get_pages
    infile = PdfFileReader(inputstream, strict=False)
  File "/Users/valeriomariani/Desktop/test_excalibur/excalibur/.venv38/lib/python3.8/site-packages/PyPDF2/_reader.py", line 1974, in __init__
    deprecation_with_replacement("PdfFileReader", "PdfReader", "3.0.0")
  File "/Users/valeriomariani/Desktop/test_excalibur/excalibur/.venv38/lib/python3.8/site-packages/PyPDF2/_utils.py", line 369, in deprecation_with_replacement
    deprecation(DEPR_MSG_HAPPENED.format(old_name, removed_in, new_name))
  File "/Users/valeriomariani/Desktop/test_excalibur/excalibur/.venv38/lib/python3.8/site-packages/PyPDF2/_utils.py", line 351, in deprecation
    raise DeprecationError(msg)
PyPDF2.errors.DeprecationError: PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead.
```
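
For reference, PyPDF2 3.0.0 removed `PdfFileReader` in favour of `PdfReader`, so an alternative to pinning an older release is a small compatibility shim. This is only a minimal sketch: `pick_reader_class` is a hypothetical helper (it is not part of excalibur or PyPDF2), and the stand-in modules below exist only so the snippet runs without PyPDF2 installed.

```python
from types import SimpleNamespace


def pick_reader_class(pdf_module):
    """Return the reader class exposed by the given PDF module.

    Prefers the new name (PdfReader) and falls back to the legacy
    name (PdfFileReader) for releases that predate the rename.
    """
    for name in ("PdfReader", "PdfFileReader"):
        reader = getattr(pdf_module, name, None)
        if reader is not None:
            return reader
    raise ImportError("no known PDF reader class found in module")


# Stand-in modules (assumption: they mimic only the attribute names).
new_style = SimpleNamespace(PdfReader="new", PdfFileReader="legacy")
old_style = SimpleNamespace(PdfFileReader="legacy")
print(pick_reader_class(new_style))  # new
print(pick_reader_class(old_style))  # legacy
```

In `excalibur/utils/task.py` the idea would be `PdfReader = pick_reader_class(PyPDF2)` followed by `infile = PdfReader(inputstream, strict=False)`. This is untested against excalibur itself, and other renamed PyPDF2 calls may need the same treatment.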

So I pinned PyPDF2==2.0, and the app now manages to read PDFs and detect tables, but when I try to "view and download data" I get this:

```
ERROR:root:to_excel() got an unexpected keyword argument 'encoding'
Traceback (most recent call last):
  File "/Users/valeriomariani/Desktop/test_excalibur/excalibur/.venv38/lib/python3.8/site-packages/excalibur/tasks.py", line 126, in extract
    tables.export(f_datapath, f=f, compress=True)
  File "/Users/valeriomariani/Desktop/test_excalibur/excalibur/.venv38/lib/python3.8/site-packages/camelot/core.py", line 736, in export
    table.df.to_excel(writer, sheet_name=sheet_name, encoding="utf-8")
TypeError: to_excel() got an unexpected keyword argument 'encoding'
```
`pip freeze | grep pandas` reports `pandas==2.0.3` (pandas 2.0 removed the `encoding` argument of `to_excel()`), but https://github.com/camelot-dev/camelot/blob/5c23e10702b1e53fefa71136d5e8ac2f0de9368a/pyproject.toml#L27 says `pandas = "^1.5.3"`.
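
To make the mismatch concrete: in Poetry's caret notation, `^1.5.3` means `>=1.5.3,<2.0.0`, so pandas 2.0.3 falls outside it. A rough sketch of that check follows; the helper names are made up for illustration, and pre-release tags and all-zero versions are not handled.

```python
def parse(version):
    """Turn '1.5.3' into (1, 5, 3); plain numeric versions only."""
    return tuple(int(part) for part in version.split("."))


def caret_upper_bound(base):
    """Exclusive upper bound: bump the leftmost non-zero component."""
    for i, part in enumerate(base):
        if part != 0:
            return base[:i] + (part + 1,)
    raise ValueError("all-zero versions are not handled in this sketch")


def satisfies_caret(version, constraint_base):
    """Check `version` against a caret constraint such as ^1.5.3."""
    v, base = parse(version), parse(constraint_base)
    return base <= v < caret_upper_bound(base)


print(satisfies_caret("2.0.3", "1.5.3"))  # False: the installed pandas
print(satisfies_caret("1.5.3", "1.5.3"))  # True: the pinned pandas
```

For real projects, a maintained version parser is a safer choice than this hand-rolled tuple comparison.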

So I added the requirement "pandas==1.5.3". You can now run `docker compose up --build` and access http://localhost:5001.

@kuirolo kuirolo commented Jun 25, 2024

Combining this with #169 gets the local server working for me (mostly). I didn't test changes to docker though.

I noticed that camelot now uses pypdf instead of PyPDF2. Seems like the next dependency hell is getting both modules on the same pdf stack.
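
A quick way to see which of the two libraries an environment actually provides is to ask the import system. This is a small sketch using only the standard library; `importable` is a hypothetical helper name.

```python
import importlib.util


def importable(module_names):
    """Return the subset of top-level module names that can be imported."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is not None
    ]


# Lists whichever of the two PDF libraries are installed here.
print(importable(["pypdf", "PyPDF2"]))
```

If both names come back, excalibur and camelot are each likely loading a different PDF library, which is the situation described above.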
