A simple python script to recover bible text from biblegateway

bowenchin/bible-crawler

bible-crawler

A simple Python script to crawl BibleGateway (https://www.biblegateway.com/). It currently works only for Bible versions that provide a direct mapping between verse and verse number (for example, it does not work for the MSG translation).

Tested on a MacBook Pro running macOS Mojave 10.14.4.

Environment information:

  • Python 3.6.5

Installation

To install dependencies, run:

pip install -r requirements.txt

Usage

scrapy runspider spider.py -o [FILENAME].json

Replace FILENAME with the name of the file in which the JSON output should be stored. Before running, change the start link in the script to Genesis 1 in your desired version.
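As a rough illustration of what a start link for Genesis 1 can look like, the sketch below builds a BibleGateway passage URL from a version abbreviation. The helper name `start_url` and the constant `BASE_URL` are hypothetical and not part of spider.py; check the link already present in the script and follow its exact form.

```python
from urllib.parse import urlencode

# Assumed passage endpoint; compare against the start link in spider.py.
BASE_URL = "https://www.biblegateway.com/passage/"


def start_url(version: str, passage: str = "Genesis 1") -> str:
    """Build a passage URL for the given version abbreviation (e.g. "NIV")."""
    query = urlencode({"search": passage, "version": version})
    return f"{BASE_URL}?{query}"


print(start_url("NIV"))
# prints https://www.biblegateway.com/passage/?search=Genesis+1&version=NIV
```

Swapping "NIV" for another version abbreviation gives the corresponding start link for that version.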

Also provided is a bundler.py script that bundles the crawling output into a single JSON file with the following structure:

{
    Book1
        {
            Chapter1 : Verses {}
            Chapter2 : Verses {}
            ...
        }
    Book2
        {
            Chapter1 : Verses {}
            Chapter2 : Verses {}
            ...
        }
    ...
}

bundler.py expects the .json input files (generated by the crawler) to be in the bundler_input directory. It stores the bundled output in the bundler_output directory, creating that directory if it does not already exist.
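The bundling step described above could be sketched roughly as follows. This is not the actual bundler.py: the item fields `book`, `chapter`, `verse`, and `text` are assumptions about what the crawler emits, and the function name `bundle` is hypothetical. It only illustrates the idea of grouping a flat list of crawled items into the nested Book / Chapter / Verses layout.

```python
import json
from pathlib import Path


def bundle(input_dir="bundler_input", output_dir="bundler_output"):
    """Group flat crawler items into {book: {chapter: {verse: text}}}."""
    in_path = Path(input_dir)
    out_path = Path(output_dir)
    # Create the output directory if it does not exist yet.
    out_path.mkdir(parents=True, exist_ok=True)

    written = []
    for src in sorted(in_path.glob("*.json")):
        # A Scrapy JSON feed is a single array of item objects.
        items = json.loads(src.read_text(encoding="utf-8"))
        bundled = {}
        for item in items:
            # Field names here are assumed, not taken from spider.py.
            book = bundled.setdefault(item["book"], {})
            chapter = book.setdefault(str(item["chapter"]), {})
            chapter[str(item["verse"])] = item["text"]
        dest = out_path / src.name
        dest.write_text(
            json.dumps(bundled, ensure_ascii=False, indent=4),
            encoding="utf-8",
        )
        written.append(dest)
    return written
```

Calling `bundle()` with no arguments reads every .json file in bundler_input and writes one bundled file per input into bundler_output.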
