forked from Mondego/spacetime-crawler4py
Questions
Ethan Wong edited this page Apr 20, 2023 · 6 revisions
wtf
- What does it mean when there are sets of similar pages with no info? Does this refer to ics.uci.edu#aaa vs ics.uci.edu#aaa?
- How do we detect large files? Some parameter in pickles?
- What counts as a page with "high textual content"?
- What counts as a unique page? Is ".ics.uci.edu/about" a different page from ".ics.uci.edu/department"?
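
One possible reading of the unique-page question, sketched below, is that two URLs count as the same page when they differ only in the fragment (the part after `#`). This is an assumption, not a confirmed rule. The helper name `unique_pages` and the full example URLs are hypothetical and used only for illustration. Under this reading, "/about" and "/department" are different pages, while "#aaa" and "#bbb" on the same path are not.

```python
from urllib.parse import urldefrag


def unique_pages(urls):
    """Return the set of URLs that remain after discarding fragments.

    Assumption: uniqueness is decided by the URL alone, ignoring
    anything after '#'. Page content is not compared here.
    """
    return {urldefrag(url).url for url in urls}


# Hypothetical URLs, chosen only to illustrate the idea.
pages = unique_pages([
    "http://www.ics.uci.edu/about#aaa",
    "http://www.ics.uci.edu/about#bbb",
    "http://www.ics.uci.edu/department",
])
print(len(pages))
```

This only covers URL-level uniqueness. Detecting similar pages with little information would need a comparison of page content, which this sketch does not attempt.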