Thanks so much for maintaining this resource.
I received an API key (thank you) and followed the instructions to download the S2ORC dataset here and here. The download listed 30 files on AWS, each containing almost 7,000 rows, where each row is a paper with text. I believe the S2ORC dataset should include over 8M papers with full text, not 30 * 7,000 = 210k. What am I missing?
I would love to ask this in the Slack community, but I cannot create an account because I do not belong to any of the 5 organizations listed.