ImportError: cannot import name 'extract_pages' from 'pdfminer.high_level' #25

Open
AbubakrChan opened this issue Jul 24, 2023 · 1 comment


I'm getting this error and I don't know why:

Traceback (most recent call last):
  File "C:\Users\l\streamlit-google-oauth\chatgpt-retrieval\chatgpt.py", line 38, in <module>
    index = VectorstoreIndexCreator().from_loaders([loader])
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\indexes\vectorstore.py", line 72, in from_loaders
    docs.extend(loader.load())
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\directory.py", line 108, in load
    self.load_file(i, p, docs, pbar)
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\directory.py", line 69, in load_file
    raise e
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\directory.py", line 63, in load_file
    sub_docs = self.loader_cls(str(item), **self.loader_kwargs).load()
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\unstructured.py", line 71, in load
    elements = self._get_elements()
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\unstructured.py", line 106, in _get_elements
    from unstructured.partition.auto import partition
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\unstructured\partition\auto.py", line 21, in <module>
    from unstructured.partition.image import partition_image
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\unstructured\partition\image.py", line 5, in <module>
    from unstructured.partition.pdf import partition_pdf_or_image
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\unstructured\partition\pdf.py", line 9, in <module>
    from pdfminer.high_level import extract_pages
ImportError: cannot import name 'extract_pages' from 'pdfminer.high_level' (C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\pdfminer\high_level.py)
PS C:\Users\l\streamlit-google-oauth\chatgpt-retrieval> ^C
PS C:\Users\l\streamlit-google-oauth\chatgpt-retrieval> pip install pdfminer.six
Requirement already satisfied: pdfminer.six in c:\users\l\appdata\local\programs\python\python39\lib\site-packages (20191110)
Requirement already satisfied: pycryptodome in c:\users\l\appdata\local\programs\python\python39\lib\site-packages (from pdfminer.six) (3.17)
Requirement already satisfied: sortedcontainers in c:\users\l\appdata\local\programs\python\python39\lib\site-packages (from pdfminer.six) (2.4.0)
Requirement already satisfied: chardet in c:\users\l\appdata\local\programs\python\python39\lib\site-packages (from pdfminer.six) (3.0.4)
Requirement already satisfied: six in c:\users\l\appdata\local\programs\python\python39\lib\site-packages (from pdfminer.six) (1.16.0)

[notice] A new release of pip is available: 23.0.1 -> 23.2.1
[notice] To update, run: python.exe -m pip install --upgrade pip
PS C:\Users\l\streamlit-google-oauth\chatgpt-retrieval> python chatgpt.py "what is my dog's name"
Traceback (most recent call last):
  File "C:\Users\l\streamlit-google-oauth\chatgpt-retrieval\chatgpt.py", line 38, in <module>
    index = VectorstoreIndexCreator().from_loaders([loader])
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\indexes\vectorstore.py", line 72, in from_loaders
    docs.extend(loader.load())
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\directory.py", line 108, in load
    self.load_file(i, p, docs, pbar)
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\directory.py", line 69, in load_file
    raise e
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\directory.py", line 63, in load_file
    sub_docs = self.loader_cls(str(item), **self.loader_kwargs).load()
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\unstructured.py", line 71, in load
    elements = self._get_elements()
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\langchain\document_loaders\unstructured.py", line 106, in _get_elements
    from unstructured.partition.auto import partition
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\unstructured\partition\auto.py", line 21, in <module>
    from unstructured.partition.image import partition_image
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\unstructured\partition\image.py", line 5, in <module>
    from unstructured.partition.pdf import partition_pdf_or_image
  File "C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\unstructured\partition\pdf.py", line 9, in <module>
    from pdfminer.high_level import extract_pages
ImportError: cannot import name 'extract_pages' from 'pdfminer.high_level' (C:\Users\l\AppData\Local\Programs\Python\Python39\lib\site-packages\pdfminer\high_level.py)
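
The pip output above reports pdfminer.six 20191110 as already satisfied, and `pip install` without `--upgrade` leaves an installed package at its current version. A likely cause, though nothing in this thread confirms it, is that this installed release predates `extract_pages`. The sketch below is one way to check what the interpreter actually sees; `check_symbol` is a hypothetical helper name, not part of pdfminer, unstructured, or langchain.

```python
import importlib
from importlib import metadata


def check_symbol(dist_name, module_name, symbol):
    """Report a distribution's installed version and whether a module exposes a symbol.

    `check_symbol` is an illustrative helper, not an API of any library in the traceback.
    """
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None  # distribution is not installed under this name

    try:
        module = importlib.import_module(module_name)
        has_symbol = hasattr(module, symbol)
    except ImportError:
        has_symbol = False  # module itself could not be imported

    return {"version": version, "has_symbol": has_symbol}


if __name__ == "__main__":
    # Names taken from the traceback and pip output in this issue.
    print(check_symbol("pdfminer.six", "pdfminer.high_level", "extract_pages"))
```

If `has_symbol` comes back `False`, running `pip install --upgrade pdfminer.six` and rerunning the check would be a reasonable next step, on the assumption that a newer release provides the function.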

3dylson commented Aug 10, 2023

#32 (comment)
