Wikidata-Discussion-Parser

Python classes for processing Wikidata XML dumps

This code downloads the Wikidata edit history dumps from the Wikimedia Foundation archives (https://dumps.wikimedia.org/wikidatawiki/). The dumps are bz2-compressed XML files (e.g., wikidatawiki-20221201-pages-meta-history1.xml-p1p154.bz2). To fetch the newest dump, change the url variable in the main.py script to point to the latest dump version, then call wiki_downloader in a loop to save each file.