Scrape, filter, and store daily arXiv update pages - so you don't have to.
pip install -r requirements.txt
In main.yaml
, change url
to desired daily update page and filters
to your desired keyword in lower case.
python scrape.py [--cfg <CONFIG PATH>] [--out <OUT DIR>]
- Open Task Scheduler
- Action -> Create Basic Task
- Choose weekly, click on day & time you want script to run
- Start a Program -> Input path to your python exe
- For 'add arguments' provide full path to
scrape.py
along with full path to--cfg
and--out
For example, an example argument would be
C:arxiv-scraper\scrape.py --cfg C:arxiv-scraper\main.yaml --out C:arxiv-scraper\out
The code in this repository is released under the GNU General Public License v3.0. A copy can be found in the LICENSE file.