The link to the PubMed Abstracts dataset in Chapter 5, Section 4 ("Big Datasets") is broken.
The broken link is in this line:
data_files = "https://the-eye.eu/public/AI/pile_preliminary_components/PUBMED_title_abstracts_2019_baseline.jsonl.zst"
Chapter here
I was able to continue the course by using this link instead:
data_files = "https://the-eye.eu/public/AI/pile_v2/data/NIH_ExPORTER_awarded_grant_text.jsonl.zst"
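Since these URLs have moved more than once, one way to guard against a dead link is to probe a list of candidate URLs and use the first one that responds. The following is a minimal sketch, not part of the course: `url_is_reachable` and `first_reachable` are hypothetical helper names, the check assumes the server answers HEAD requests, and the replacement URL above appears from its filename to be a different Pile component (NIH ExPORTER grant text) rather than PubMed Abstracts.

```python
import http.client
import urllib.request
from typing import Callable, Iterable, Optional


def url_is_reachable(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request to `url` succeeds (hypothetical helper)."""
    try:
        # Building the Request can raise ValueError for a malformed URL,
        # so it lives inside the try block as well.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # URLError, HTTPError (4xx/5xx) and timeouts are all OSError subclasses.
        return False


def first_reachable(
    urls: Iterable[str],
    is_reachable: Callable[[str], bool] = url_is_reachable,
) -> Optional[str]:
    """Return the first URL that passes `is_reachable`, or None if none do."""
    for url in urls:
        if is_reachable(url):
            return url
    return None


# Possible usage with the two URLs from this thread (needs network access and
# the `datasets` package, so it is left commented out):
#
# from datasets import load_dataset
# candidates = [
#     "https://the-eye.eu/public/AI/pile_preliminary_components/PUBMED_title_abstracts_2019_baseline.jsonl.zst",
#     "https://the-eye.eu/public/AI/pile_v2/data/NIH_ExPORTER_awarded_grant_text.jsonl.zst",
# ]
# data_files = first_reachable(candidates)
# if data_files is not None:
#     dataset = load_dataset("json", data_files=data_files, split="train")
```

The reachability check is passed in as a parameter so the selection logic can be exercised without network access. If the whole host is down, as a later comment suggests, `first_reachable` returns `None` and a different mirror is needed.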
It looks like this URL has changed and broken the link before (see #324).
Note that there is another broken link further down the page, in the following code block:
law_dataset_streamed = load_dataset(
    "json",
    data_files="https://the-eye.eu/public/AI/pile_preliminary_components/FreeLaw_Opinions.jsonl.zst",
    split="train",
    streaming=True,
)
next(iter(law_dataset_streamed))
Same issue here. It looks like the Pile has been taken down for copyright reasons.