Resource punkt not found. Please use the NLTK Downloader to obtain the resource: #14
Comments
You can simply run
in the notebook to download the required files |
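A minimal sketch of that download step, assuming the intended call is nltk.download('punkt') (the resource named in the issue title). The helper name ensure_punkt and its find/download parameters are hypothetical; the parameters exist only so the logic can be exercised without nltk or a network connection.

```python
def ensure_punkt(find=None, download=None):
    """Download punkt only if it is not already on NLTK's search path.

    `find` and `download` default to nltk.data.find and nltk.download.
    """
    if find is None or download is None:
        import nltk  # imported lazily so the helper can be tested offline
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find("tokenizers/punkt")
        return True  # resource already installed, nothing to fetch
    except LookupError:
        # nltk.download returns True on success and False on failure
        return bool(download("punkt"))

# In a notebook cell (needs nltk and network access):
#   ensure_punkt()
```

Checking first avoids hitting the network on every notebook run once the data is in place.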
punkt is the NLTK tool for tokenizing text documents. With an old or downgraded version of the nltk module, we generally need to download the remaining data separately. |
[nltk_data] Error loading punkt: <urlopen error [SSL: |
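If the SSL error is a certificate verification failure (an assumption, since the message above is cut off), one commonly cited workaround is to let the download run without certificate checks. The sketch below is only that, a workaround under that assumption: it relies on private attributes of Python's ssl module, and it turns off a security check, so it restores the original setting as soon as the block exits. The name unverified_https is hypothetical.

```python
import ssl
from contextlib import contextmanager

@contextmanager
def unverified_https():
    """Temporarily make urllib's default HTTPS context skip certificate checks."""
    original = ssl._create_default_https_context
    ssl._create_default_https_context = ssl._create_unverified_context
    try:
        yield
    finally:
        # Always restore verification, even if the download raised
        ssl._create_default_https_context = original

# Usage (needs nltk and a network connection):
#   import nltk
#   with unverified_https():
#       nltk.download("punkt")
```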
Got this same thing |
Try this:
|
import nltk worked for me, thanks :) |
This worked for me thanks. |
This worked for me too. Thanks!
|
Worked for me, thanks :) |
I am receiving this error as well and have tried everything in the comments. |
An easy way to get past this 'urlopen error' is to do the process manually. Go to https://www.nltk.org/nltk_data/, download the required zip file, and extract its contents. On Windows, go to user/AppData/local/Programs/Python/Python(version)/lib and create a folder named nltk_data. Then create the matching subfolder inside it. For example, for 'punkt', create the folder tokenizers and copy the 'punkt' folder from the extracted archive into it. Most of this information is given by the terminal itself. Run your program. Cheers! EDIT 1: Of course, downloading all the files can be time-consuming, but it's the only option if the "urlopen error" persists. EDIT 2: It is also often your router or network that prevents the nltk files from downloading. Try changing your network and that should help. |
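The manual steps above can be sketched in a few lines, assuming the downloaded archive is a punkt.zip whose top-level folder is punkt. The function name install_tokenizer_zip and the paths in the usage comment are hypothetical.

```python
import zipfile
from pathlib import Path

def install_tokenizer_zip(zip_path, nltk_data_dir):
    """Extract a manually downloaded punkt.zip into <nltk_data_dir>/tokenizers."""
    target = Path(nltk_data_dir) / "tokenizers"
    target.mkdir(parents=True, exist_ok=True)  # creates nltk_data/tokenizers if missing
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)  # yields nltk_data/tokenizers/punkt/...
    return target / "punkt"

# Usage (hypothetical paths):
#   install_tokenizer_zip("Downloads/punkt.zip", Path.home() / "nltk_data")
```

The nltk_data folder has to be one of the locations listed under "Searched in:" in the error message, otherwise NLTK will not look there.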
TRY CHANGING YOUR NETWORK |
Code downloads Punkt tokenizer successfully for me |
Try This:
OR
|
Getting this error guys. Any help would be very helpful. Thanks in advance
|
As mentioned by several people here, including me, the primary cause of this error is a faulty or unstable network connection. import nltk itself works fine. |
It works fine if the network connection is stable; otherwise it crashes. |
I ran into the same problem but just needed to add the code mentioned above (plus a few additional lines) to get it to work. Here is the original code: Here is the modified and working code: You'll notice I just added 3 lines. The first is based on the comments above and the other two were derived by extension of the same logic. Hope this helps! |
I've downloaded it manually. What do I do next? |
I face the same issue. The main problem is that we cannot connect to the raw GitHub URL from which NLTK downloads the data. You can use the following tutorial to solve this issue. |
This solution worked for me as well. |
This worked for me ! |
This works!!!!1 |
you're god! |
this help!!!! |
🪲 It's a bug; add these parameters to the word_tokenize function |
I solved this by providing an absolute path (as I needed to perform calculations on a remote server that didn't have an internet connection). Download the resource you need and save it under For example
|
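A minimal sketch of the absolute-path approach for an offline server, assuming the data was copied to a directory of your choosing. nltk.data.path is the list of directories NLTK searches; the helper name add_nltk_data_dir and the path /opt/shared/nltk_data are hypothetical.

```python
from pathlib import Path

def add_nltk_data_dir(data_dir, search_path):
    """Put data_dir at the front of an NLTK-style search path list."""
    resolved = str(Path(data_dir).expanduser().resolve())  # make the path absolute
    if resolved not in search_path:
        search_path.insert(0, resolved)  # searched before the default locations
    return resolved

# Typical use (requires nltk; the directory is a hypothetical location):
#   import nltk
#   add_nltk_data_dir("/opt/shared/nltk_data", nltk.data.path)
#   tokenizer = nltk.data.load("tokenizers/punkt/english.pickle")
```

The directory passed in is the one that contains the tokenizers folder, not the punkt folder itself.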
import nltk |
Ahhhhh @jangmaga Folks, after I did that, I received the following status. See image: |
|
this was my initial code |
I am not yet at the NLP guru level of others here. But I would suggest ensuring you do the following to ensure you have NLTK: I am using jupyter notebook and had to do this install: then the following..... you may need this one.... and of course, this one.... NOTE: |
ohh damn this worked for me thank you very much...
…On Fri, Nov 29, 2024 at 9:16 PM PWAz ***@***.***> wrote:
I am not yet at the NLP guru level of others here.
But I would suggest ensuring you do the following to ensure you have NLTK:
------------------------------
I am using jupyter notebook and had to do this install:
--->>>> !pip install -U NLTK
then the following.....
--->>>> import nltk
--->>>> nltk.download('punkt')
you may need this one....
--->>>> nltk.download('punkt_tab')
and of course, this one....
--->>>> from nltk.tokenize import word_tokenize
NOTE: AND wouldn't ya know it - I just discovered (Nov 29th) this on the NLTK site <https://nltk.org/api/nltk.tokenize.punkt.html> - going to have to update my own web page for this content:
--->>>> from nltk.tokenize import PunktTokenizer
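The download steps quoted above could be wrapped roughly like this, trying both 'punkt' and 'punkt_tab' and reporting which ones succeeded. This is a sketch, not the poster's code: fetch_tokenizer_data is a hypothetical name, and the download parameter exists only so the loop can be exercised offline.

```python
def fetch_tokenizer_data(download=None, packages=("punkt", "punkt_tab")):
    """Try each package and report which downloads succeeded."""
    if download is None:
        import nltk  # imported lazily so the helper can be tested offline
        download = nltk.download
    results = {}
    for name in packages:
        try:
            results[name] = bool(download(name))
        except Exception as exc:  # e.g. a network error raised by the downloader
            print(f"could not fetch {name}: {exc}")
            results[name] = False
    return results

# In a notebook, after `!pip install -U nltk`:
#   fetch_tokenizer_data()
#   from nltk.tokenize import word_tokenize
```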
|
Guys, I have tried every single one of the comments and still get no module named nltk.tokenize.punkt error |
Got the error below in notebook 5_2_munging_frankenstein.ipynb.
Please help with this.
LookupError Traceback (most recent call last)
in ()
----> 1 tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
2 with open(args.raw_dataset_txt) as fp:
3 book = fp.read()
4 sentences = tokenizer.tokenize(book)
/usr/local/lib/python3.6/dist-packages/nltk/data.py in load(resource_url, format, cache, verbose, logic_parser, fstruct_reader, encoding)
832
833 # Load the resource.
--> 834 opened_resource = _open(resource_url)
835
836 if format == 'raw':
/usr/local/lib/python3.6/dist-packages/nltk/data.py in _open(resource_url)
950
951 if protocol is None or protocol.lower() == 'nltk':
--> 952 return find(path, path + ['']).open()
953 elif protocol.lower() == 'file':
954 # urllib might not use mode='rb', so handle this one ourselves:
/usr/local/lib/python3.6/dist-packages/nltk/data.py in find(resource_name, paths)
671 sep = '*' * 70
672 resource_not_found = '\n%s\n%s\n%s\n' % (sep, msg, sep)
--> 673 raise LookupError(resource_not_found)
674
675
LookupError:
Resource punkt not found.
Please use the NLTK Downloader to obtain the resource:
Searched in:
- '/root/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
- '/usr/nltk_data'
- '/usr/lib/nltk_data'
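To see whether any of the searched directories actually holds the tokenizer data, a small diagnostic along these lines may help. It assumes the data lives either as a tokenizers/punkt folder or as tokenizers/punkt.zip under one of those directories; locate_punkt is a hypothetical helper name.

```python
from pathlib import Path

def locate_punkt(search_paths):
    """Return the search-path entries that actually contain tokenizers/punkt."""
    hits = []
    for base in search_paths:
        tokenizers = Path(base) / "tokenizers"
        # Accept either the extracted folder or the still-zipped archive
        if (tokenizers / "punkt").is_dir() or (tokenizers / "punkt.zip").is_file():
            hits.append(str(base))
    return hits

# With nltk installed:
#   import nltk
#   print(locate_punkt(nltk.data.path))
```

An empty result means the data is missing from every location NLTK searches, which matches the LookupError above.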