Preprocessing error reporting #19

Closed
1607722534 opened this issue Aug 12, 2021 · 7 comments

Comments

@1607722534

Preprocessing error reporting:
Parse ./data/train/news.tsv
malloc(): invalid next size (unsorted)
Aborted (core dumped)

Is the problem in the code or in my environment?

@yusanshi
Owner

I'm not sure, but it looks like an out-of-memory error?

How much memory does the machine have? Please check the available memory while the script is running. If it does run out of memory, one possible solution would be to enlarge the swap space.
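One rough way to check memory from inside the script itself is to print the process's peak resident memory at a few points during preprocessing. The sketch below uses only the standard library and assumes a Unix-like system; the helper name `peak_memory_mib` is made up for illustration and is not part of this repository.

```python
import resource
import sys


def peak_memory_mib():
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


if __name__ == "__main__":
    print(f"Peak memory so far: {peak_memory_mib():.1f} MiB")
```

If the printed value stays far below the machine's total memory right up to the crash, running out of memory is probably not the cause.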

@1607722534
Author

I have tried two kinds of servers. Memory usage is below 3%, while CPU usage is at 100%.

@yusanshi
Owner

Could you find out on which line of code the error occurs? (e.g., by using a debugger or adding print statements)
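For a crash that aborts the interpreter, like the `malloc()` message above, one option is Python's built-in `faulthandler` module, which dumps the Python traceback when the process receives a fatal signal such as SIGABRT or SIGSEGV. A minimal sketch follows; the wrapper name `enable_crash_traceback` is hypothetical, and the call would go at the top of the preprocessing script.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=None):
    """Dump the Python traceback of all threads if the process crashes."""
    # The stream must be a real file (it needs a file descriptor)
    # and has to stay open for as long as the handler is enabled.
    target = stream if stream is not None else sys.stderr
    faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()
```

The same effect is available without editing the code by starting the interpreter with `python -X faulthandler` followed by the script name.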

@1607722534
Author

Resource punkt not found.
Please use the NLTK Downloader to obtain the resource:

@yusanshi
Owner

Resource punkt not found.
Please use the NLTK Downloader to obtain the resource:

Please refer to delip/PyTorchNLPBook#14 or https://stackoverflow.com/questions/4867197/failed-loading-english-pickle-with-nltk-data-load.

PS: you can try searching it with search engines first. See this, this and this. (No offense.)
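For reference, the fix described in those links usually comes down to downloading the missing tokenizer data once. The sketch below assumes NLTK is installed (`pip install nltk`) and that the machine has network access; the helper name `ensure_punkt` and its optional `nltk_module` parameter are hypothetical and exist only to keep the example self-contained. Newer NLTK releases may ask for a differently named resource, in which case the name printed in the error message is the one to use.

```python
def ensure_punkt(nltk_module=None):
    """Download the NLTK 'punkt' data only if it is missing.

    Returns True if a download was triggered, False otherwise.
    """
    if nltk_module is None:
        import nltk as nltk_module  # third-party package

    try:
        # Raises LookupError when the resource is not installed.
        nltk_module.data.find("tokenizers/punkt")
        return False
    except LookupError:
        nltk_module.download("punkt")
        return True
```

Calling `ensure_punkt()` once before running the preprocessing script should be enough; later runs find the data locally and skip the download.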

@1607722534
Author

I got the preprocessed files. Thank you for your patient replies. My undergraduate major was not computer science, so my coding skills are weak.
(╥﹏╥)

@yusanshi
Owner

My pleasure. 😁
