Unclear instructions and code errors #2

Open
an-k45 opened this issue Aug 18, 2023 · 0 comments

an-k45 commented Aug 18, 2023

Hello,

I managed to get the code in this repository running and produced a copy of CCOHA. I had to sort out a few setup details along the way, which would be useful to include in the README for future users:

  • Stating that the code runs on Python 2.7
  • Describing how to set up a virtual environment with the required packages (docopt, HTMLParser, nltk), and stating that nltk==3.4.5 specifically is needed to run the code
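
A minimal sketch of a startup check for the two version requirements above, in case it helps whoever updates the README. The function name `environment_problems` and the two constants are hypothetical and not part of the repository; the only facts taken from this issue are Python 2.7 and nltk==3.4.5.

```python
from __future__ import print_function
import sys

# Versions reported in this issue; the names are illustrative only.
REQUIRED_PYTHON = (2, 7)
REQUIRED_NLTK = "3.4.5"


def environment_problems(python_version, nltk_version):
    """Return a list of human-readable mismatches (empty if all is well)."""
    problems = []
    found_python = tuple(python_version[:2])
    if found_python != REQUIRED_PYTHON:
        problems.append(
            "Python %d.%d required, found %d.%d" % (REQUIRED_PYTHON + found_python)
        )
    if nltk_version != REQUIRED_NLTK:
        problems.append(
            "nltk==%s required, found %s" % (REQUIRED_NLTK, nltk_version)
        )
    return problems


if __name__ == "__main__":
    try:
        import nltk
        found_nltk = nltk.__version__
    except ImportError:
        found_nltk = "not installed"
    # Report mismatches on stderr without aborting.
    for problem in environment_problems(sys.version_info, found_nltk):
        print(problem, file=sys.stderr)
```

Calling something like this at the top of the main script would turn a confusing import or syntax error into a direct message about the expected versions.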

I also had to fix some basic code issues, which would be worth noting in the README or fixing in a new commit:

  • Setting all tabs to spaces
  • Renaming body to results in the function write_to_file(...)
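
For the tab fix, a small helper along these lines would do it. This is only a sketch: the name `tabs_to_spaces` is hypothetical, and a width of 4 spaces per tab is my assumption, so adjust it to match the indentation the files already use.

```python
import re


def tabs_to_spaces(source, width=4):
    """Replace each tab in a line's leading whitespace with `width` spaces.

    Tabs that appear after the first non-whitespace character (for example
    inside string literals) are left alone. This is a plain substitution,
    not tab-stop aware, and `width=4` is an assumed default.
    """
    def fix(match):
        return match.group(0).replace("\t", " " * width)

    return re.sub(r"^[ \t]+", fix, source, flags=re.MULTILINE)
```

It takes the file contents as a string and returns the converted text, so it can be applied to each source file with an ordinary read and write.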

Finally, the provided processor failed to handle a very small number of files. The current code logs each failure and then raises an exception, which stops the run. Given how few files were affected (roughly 5 out of 100K), fixing the underlying problem didn't seem worthwhile, so I simply skipped those files. To do that, however, I had to comment out the raise statements, which would have been helpful to mention in the README.
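
As an alternative to commenting out the raise statements, the loop could log and skip failures itself. The sketch below is not the repository's code: `process_all` and `process_file` are hypothetical names standing in for whatever the processor actually calls per file.

```python
import logging


def process_all(paths, process_file):
    """Run process_file on each path; log failures and keep going.

    Returns (done, failed) so the skipped files can be reviewed afterwards.
    """
    done, failed = [], []
    for path in paths:
        try:
            process_file(path)
            done.append(path)
        except Exception:
            # Log the traceback, record the file, and continue with the rest.
            logging.exception("Skipping %s: processing failed", path)
            failed.append(path)
    return done, failed
```

Returning the list of failed paths keeps the handful of problem files visible at the end of the run instead of losing them in the log.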
