Merge pull request #1 from theMashUp/dev
v0.1
Showing 6 changed files with 246 additions and 125 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,44 @@ | ||
# FindMyBooks | ||
From a list of books, parse desired online library catalogues to figure out where a book is available. | ||
Note: This tool is still in development. The README will be updated when the tool is complete. | ||
[example code] | ||
# FindMyBooks v0.1 | ||
Ever found yourself facing a to-read list of 100+ books on Goodreads, without knowing which ones are available from your local library? | ||
FindMyBooks lets you find out where all of those books are available. | ||
|
||
## Getting started | ||
To use the tool, first clone this repository. | ||
Then, download your Goodreads library ([link](https://www.goodreads.com/review/import)). | ||
```Python | ||
python find_my_books MY_GOODREADS_LIBRARY.csv | ||
``` | ||
You will then find a MY_GOODREADS_LIBRARY_output.csv file with one column per target library. Each column contains the search URL if the book was found in that library, and is empty otherwise. | ||
A detailed list of supported command line arguments can be found below in the Configuration section. | ||
|
||
## Features | ||
- Import a list of books from Goodreads, ... | ||
- Find out which book is available where | ||
- Import a list of books from Goodreads | ||
- Find out in which library your books are available | ||
|
||
## Configuration | ||
### Goodreads library file | ||
Mandatory. Path (string) to the Goodreads library file. | ||
Example: | ||
```Python | ||
python find_my_books MY_GOODREADS_LIBRARY.csv | ||
``` | ||
|
||
### Output file | ||
*-o, --output OUTPUT_FILE.csv* | ||
|
||
Optional, path to desired output file to be created. | ||
Default: Goodreads library file with "_output" suffix. | ||
|
||
### Debug | ||
*-d, --debug* | ||
|
||
Activate debug mode, which displays more information in the console. | ||
|
||
## Contributing | ||
### New libraries | ||
Everyone is encouraged to add libraries via the libraries.json file. Please create one PR with all the libraries you wish to add. | ||
### New features | ||
New features are always welcome. If the feature is substantial, please create an issue first, so that it can be discussed. If the feature is minor, you can directly create a PR and we'll look at it. | ||
|
||
... | ||
## Licensing | ||
The code in this project is licensed under the MIT license. |
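A note on the default output name described in the Configuration section: the sketch below shows one way to derive it with `pathlib`, so that only the final extension is touched and a file without a `.csv` extension still gets a distinct output name. The helper name `default_output_path` is hypothetical and not part of this commit, which uses a plain string replacement instead.

```python
from pathlib import Path


def default_output_path(library_path: str) -> str:
    """Return library_path with "_output" inserted before the last extension.

    Hypothetical helper, not part of the commit.
    """
    path = Path(library_path)
    # stem is the file name without its last extension, suffix is that extension
    # (an empty string if there is none), so the result never equals the input.
    return str(path.with_name(f"{path.stem}_output{path.suffix}"))


print(default_output_path("MY_GOODREADS_LIBRARY.csv"))
print(default_output_path("books"))
```

With this approach, an input file named `books` maps to `books_output` rather than to itself, which avoids overwriting the input.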
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,138 @@ | ||
# -*- coding: utf-8 -*- | ||
|
||
# import built-in modules | ||
import re | ||
import json | ||
import logging | ||
import argparse | ||
import csv | ||
|
||
# import third-party modules | ||
import requests | ||
|
||
|
||
APP_DESCRIPTION = "Parse a Goodreads books list to find in which library they are available." | ||
SUPPORTED_LIBRARIES_FILE = "libraries.json" | ||
# TODO: Figure out what is the minimal header required. | ||
REQUEST_HEADERS = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:101.0) Gecko/20100101 Firefox/101.0', | ||
                   'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8', | ||
                   'Accept-Language': 'en-US,en;q=0.5', | ||
                   'Accept-Encoding': 'gzip, deflate, br', | ||
                   'Upgrade-Insecure-Requests': '1', | ||
                   'Sec-Fetch-Dest': 'document', | ||
                   'Sec-Fetch-Mode': 'navigate', | ||
                   'Sec-Fetch-Site': 'none', | ||
                   'Sec-Fetch-User': '?1', | ||
                   'Connection': 'keep-alive'} | ||
VERSION = "0.1" | ||
|
||
|
||
def parse_argv() -> dict: | ||
    """ | ||
    Parse command-line arguments into a dict. | ||
    """ | ||
    arg_parser = argparse.ArgumentParser(description=APP_DESCRIPTION) | ||
    arg_parser.add_argument("goodreads_library", type=str, | ||
                            help="Goodreads library .csv file") | ||
    arg_parser.add_argument("-o", "--output", type=str, required=False, | ||
                            dest="output", | ||
                            help="output .csv file") | ||
    arg_parser.add_argument("-d", "--debug", action="store_true", required=False, default=False, | ||
                            help="display debug logging lines") | ||
    args = arg_parser.parse_args() | ||
    args_dict = vars(args) | ||
    return args_dict | ||
|
||
|
||
def get_book_search_url(book_title: str, book_author: str, url_template: str) -> str: | ||
    """ | ||
    Replace {TITLE} and {AUTHOR} in url_template by book_title and book_author respectively. | ||
    """ | ||
    # Remove parenthesized parts, " - ..." suffixes and subtitles after a colon from the book title to facilitate search. | ||
    re_remove_parenthesis = r"\(.*\)|\s-\s.*" | ||
    book_title = re.sub(re_remove_parenthesis, "", book_title) | ||
    re_remove_colon = r":.+" | ||
    book_title = re.sub(re_remove_colon, "", book_title) | ||
|
||
    url = url_template.replace("{AUTHOR}", book_author.replace(" ", "+")) | ||
    url = url.replace("{TITLE}", book_title.replace(" ", "+")) | ||
    return url | ||
|
||
|
||
# Script starts here | ||
if __name__ == '__main__': | ||
    # Parse command line arguments | ||
    args = parse_argv() | ||
    goodreads_library_file_path = args["goodreads_library"] | ||
    if args["output"]: | ||
        output_file_path = args["output"] | ||
    else: | ||
        output_file_path = goodreads_library_file_path.replace(".csv", "_output.csv") | ||
|
||
    # Set-up logging | ||
    if args["debug"]: | ||
        logging_level = logging.DEBUG | ||
    else: | ||
        logging_level = logging.INFO | ||
|
||
    logging.basicConfig(level=logging_level) | ||
|
||
    print(f"FindMyBooks v{VERSION}") | ||
    print(f"\tGoodreads library file path: {goodreads_library_file_path}") | ||
    print(f"\tOutput file path: {output_file_path}") | ||
|
||
    # Load Goodreads library file | ||
    books_to_check = list() | ||
    with open(goodreads_library_file_path, newline="") as csvfile: | ||
        reader = csv.DictReader(csvfile) | ||
|
||
        # Consider only "to-read" books | ||
        for book in reader: | ||
            if book["Exclusive Shelf"] == "to-read": | ||
                books_to_check.append({"title": book["Title"], | ||
                                       "author": book["Author"]}) | ||
    print(f"\tFound {len(books_to_check)} books in to-read shelf.") | ||
|
||
    # Load supported libraries from json config file | ||
    with open(SUPPORTED_LIBRARIES_FILE) as f: | ||
        supported_libraries = json.load(f)["libraries"] | ||
    print(f"\tFound {len(supported_libraries)} supported libraries.") | ||
|
||
    print("\tBeginning library search:") | ||
    for id, book in enumerate(books_to_check): | ||
        print(f"\t\tBook {id+1}/{len(books_to_check)}: {book['author']}, \"{book['title']}\"") | ||
|
||
        for lib in supported_libraries: | ||
            books_to_check[id][lib["name"]] = "" | ||
            url = get_book_search_url(book["title"], book["author"], lib["url"]) | ||
|
||
            logging.debug(f"Library: {lib['name']} ({lib['response_length_threshold']} bytes response threshold)") | ||
            logging.debug(f"URL: {url}") | ||
|
||
            # Using a GET is a waste of download. However, not all websites provide Content-Length with HEAD, | ||
            # and some do not answer to HEAD at all. | ||
            # TODO: Figure out way to check libraries with smaller footprint. Idea: pretend to be mobile, | ||
            # very small screen, etc. | ||
            # REQUEST_HEADERS is used because some websites require a "realistic" header to answer the request. | ||
            response = requests.get(url, headers=REQUEST_HEADERS) | ||
|
||
            if response: | ||
                response_content_length = len(response.content) | ||
                logging.debug(f"Content size: {response_content_length} bytes") | ||
                if response_content_length > lib["response_length_threshold"]: | ||
                    books_to_check[id][lib["name"]] = url | ||
            else: | ||
                logging.warning(f"Request not successful. Response status code: {response.status_code}") | ||
|
||
    print("\tLibrary search completed.") | ||
|
||
    # Create output file | ||
    with open(output_file_path, "w", newline="") as csvfile: | ||
        fieldnames = books_to_check[0].keys() | ||
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames) | ||
        writer.writeheader() | ||
        for book in books_to_check: | ||
            writer.writerow(book) | ||
    print(f"\tResults written to output file {output_file_path}")   | ||
|
||
    print("End of script.") |
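A possible refinement of the URL building in `get_book_search_url`, offered as a sketch rather than as what this commit does: the commit only swaps spaces for `+`, so accented characters, apostrophes and question marks reach the URL unencoded, and a double space in an author name becomes `++`. The version below applies the same title clean-up and then percent-encodes both values with the standard library's `urllib.parse.quote_plus`. The names `clean_title` and `build_search_url` and the `library.example` URL are hypothetical.

```python
import re
from urllib.parse import quote_plus


def clean_title(title: str) -> str:
    """Drop parenthesized parts, " - ..." suffixes and subtitles after a colon."""
    title = re.sub(r"\(.*\)|\s-\s.*", "", title)
    title = re.sub(r":.+", "", title)
    # Removing a trailing "(...)" leaves a space behind, so strip it.
    return title.strip()


def build_search_url(title: str, author: str, url_template: str) -> str:
    """Fill {TITLE} and {AUTHOR} in url_template with percent-encoded values."""
    # Collapse runs of whitespace in the author name before encoding.
    author = " ".join(author.split())
    url = url_template.replace("{AUTHOR}", quote_plus(author))
    return url.replace("{TITLE}", quote_plus(clean_title(title)))


template = "https://library.example/search?title={TITLE}&author={AUTHOR}"
print(build_search_url("The Time of Contempt (The Witcher, #2)", "Andrzej Sapkowski", template))
```

Whether a given catalogue accepts percent-encoded queries would need to be checked per library, since the thresholds in `libraries.json` were presumably tuned against the unencoded URLs.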
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
{ | ||
"libraries": [ | ||
{"name": "BANQ (Overdrive)", | ||
"url": "https://banq.overdrive.com/search/title?query={TITLE}&creator={AUTHOR}&mediaType=ebook&sortBy=newlyadded", | ||
"response_length_threshold": 95000}, | ||
{"name": "Ville de Québec (Prêt Numérique)", | ||
"url": "https://quebec.pretnumerique.ca/resources?utf8=%E2%9C%93&keywords={TITLE}&author={AUTHOR}&narrator=&publisher=&collection_title=&issued_on_range=&language=&audience=&category_standard=thema&category=&nature=ebook&medium=", | ||
"response_length_threshold": 8000}, | ||
{"name": "Kobo Plus", | ||
"url": "https://www.kobo.com/ca/en/search?query=query&fcmedia=Book~BookSubscription&nd=true&ac=1&ac.author={AUTHOR}&ac.title={TITLE}&sort=PublicationDateDesc&sortchange=1", | ||
"response_length_threshold": 250000} | ||
] | ||
} |
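For contributors adding entries to this file, a small check along the following lines could catch malformed entries before a PR is opened. It is a sketch under assumptions: the three keys are taken from the entries above, and requiring both `{TITLE}` and `{AUTHOR}` in the URL is inferred from those entries rather than stated anywhere in the commit. The function name `validate_library` and the `library.example` URLs are hypothetical.

```python
import json

# Assumption: every search URL template should contain both placeholders,
# as all three entries in this commit do.
REQUIRED_PLACEHOLDERS = ("{TITLE}", "{AUTHOR}")


def validate_library(entry: dict) -> list:
    """Return a list of problems found in one library entry (empty if none)."""
    problems = []
    name = entry.get("name")
    if not isinstance(name, str) or not name.strip():
        problems.append("missing or empty 'name'")
    url = entry.get("url")
    if not isinstance(url, str):
        problems.append("missing 'url'")
    else:
        for placeholder in REQUIRED_PLACEHOLDERS:
            if placeholder not in url:
                problems.append(f"'url' lacks {placeholder}")
    threshold = entry.get("response_length_threshold")
    # bool is a subclass of int, so it is excluded explicitly.
    if not isinstance(threshold, int) or isinstance(threshold, bool) or threshold <= 0:
        problems.append("'response_length_threshold' must be a positive integer")
    return problems


# Made-up sample data in the same shape as libraries.json.
sample = json.loads("""
{"libraries": [
  {"name": "Example Library",
   "url": "https://library.example/search?title={TITLE}&author={AUTHOR}",
   "response_length_threshold": 8000},
  {"name": "Broken entry",
   "url": "https://library.example/search?q={TITLE}"}
]}
""")
for entry in sample["libraries"]:
    print(entry.get("name"), validate_library(entry))
```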
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
Book Id,Title,Author,Author l-f,Additional Authors,ISBN,ISBN13,My Rating,Average Rating,Publisher,Binding,Number of Pages,Year Published,Original Publication Year,Date Read,Date Added,Bookshelves,Bookshelves with positions,Exclusive Shelf,My Review,Spoiler,Private Notes,Read Count,Recommended For,Recommended By,Owned Copies,Original Purchase Date,Original Purchase Location,Condition,Condition Description,BCID | ||
,The Little Grave,Carolyn Arnold,,,,,,,,,,,,,,to-read,,to-read,,,,,,,,,,,, | ||
35022387,Décroissance versus développement durable: Débats pour la suite du monde,Yves-Marie Abraham,"Abraham, Yves-Marie","Louis Marion, Hervé Philippe",,,0,3.7,Écosociété,Kindle Edition,301,2012,,,2022/03/29,to-read,to-read (#186),to-read,,,,0,,,0,,,,, | ||
14781491,"The Time of Contempt (The Witcher, #2)",Andrzej Sapkowski,"Sapkowski, Andrzej",David French,0316219134,9780316219136,3,4.17,Orbit,Paperback,331,2013,1995,2022/06/24,2022/05/08,,,read,,,,1,,,0,,,,, | ||
39739146,L'Intelligence des plantes,Stefano Mancuso,"Mancuso, Stefano","Alessandra Viola, Renaud Temperini",,,0,3.89,Albin Michel,Kindle Edition,240,2018,2013,,2022/06/10,to-read,to-read (#218),to-read,,,,0,,,0,,,,, | ||
34324534,Against the Grain: A Deep History of the Earliest States,James C. Scott,"Scott, James C.",,0300182910,9780300182910,0,4.13,Yale University Press,Hardcover,312,2017,2017,,2022/06/20,to-read,to-read (#229),to-read,,,,0,,,0,,,,, | ||
27889241,Future Trends in Microelectronics: Journey Into the Unknown,Serge Luryi,"Luryi, Serge","Jimmy Xu, Alexander Zaslavsky",1119069114,9781119069119,0,0,Wiley,Hardcover,384,2016,,,2022/06/19,to-read,to-read (#228),to-read,,,,0,,,0,,,,, | ||
23209924,The Water Knife,Paolo Bacigalupi,"Bacigalupi, Paolo",,0385352875,9780385352871,0,3.84,Knopf,Hardcover,371,2015,2015,,2022/06/17,to-read,to-read (#227),to-read,,,,0,,,0,,,,, | ||
44882,Code: The Hidden Language of Computer Hardware and Software,Charles Petzold,"Petzold, Charles",,0735611319,9780735611313,0,4.39,Microsoft Press,Paperback,400,2000,1999,,2022/06/16,to-read,to-read (#226),to-read,,,,0,,,0,,,,, | ||
8701960,"The Information: A History, a Theory, a Flood",James Gleick,"Gleick, James",,0375423729,9780375423727,0,4.02,Knopf Doubleday Publishing Group,Hardcover,527,2011,2011,,2022/06/13,to-read,to-read (#225),to-read,,,,0,,,0,,,,, | ||
40175096,Thus Spoke the Plant: A Remarkable Journey of Groundbreaking Scientific Discoveries and Personal Encounters with Plants,Monica Gagliano,"Gagliano, Monica",,,9781623172435,0,3.77,North Atlantic Books,Paperback,176,2018,2018,,2022/06/10,to-read,to-read (#217),to-read,,,,0,,,0,,,,, | ||
804069,A Brief History of the Future: The Origins of the Internet,John Naughton,"Naughton, John",,075381093X,9780753810934,0,3.81,Orion Publishing Group,Paperback,334,2006,1999,,2022/06/05,"to-read, history-of-technology","to-read (#204), history-of-technology (#6)",to-read,,,,0,,,0,,,,, | ||
8201080,The Master Switch: The Rise and Fall of Information Empires,Tim Wu,"Wu, Tim",,0307269930,9780307269935,0,3.87,Knopf,Hardcover,384,2010,2010,,2022/06/05,"to-read, history-of-technology","to-read (#203), history-of-technology (#5)",to-read,,,,0,,,0,,,,, | ||
753865,Inventing the Internet,Janet Abbate,"Abbate, Janet",,0262511150,9780262511155,0,3.87,MIT Press,Paperback,268,2000,1999,,2022/06/05,"to-read, history-of-technology","to-read (#202), history-of-technology (#4)",to-read,,,,0,,,0,,,,, | ||
509866,"The Rise of the Network Society: The Information Age: Economy, Society and Culture, Volume I",Manuel Castells,"Castells, Manuel",,0631221409,9780631221401,0,3.98,Wiley-Blackwell,Paperback,624,2000,1996,,2022/06/05,"to-read, history-of-technology","to-read (#201), history-of-technology (#3)",to-read,,,,0,,,0,,,,, | ||
35068671,The Perfectionists: How Precision Engineers Created the Modern World,Simon Winchester,"Winchester, Simon",,0062652575,9780062652577,0,4.13,Harper,ebook,416,2018,2018,,2022/06/05,"to-read, history-of-technology","to-read (#200), history-of-technology (#2)",to-read,,,,0,,,0,,,,, | ||
51619298,Technology and the Environment in History,Sara B. Pritchard,"Pritchard, Sara B.",Carl A. Zimring,1421438992,9781421438993,0,3,Johns Hopkins University Press,Paperback,264,2020,,,2022/06/05,"to-read, history-of-technology","to-read (#211), history-of-technology (#1)",to-read,,,,0,,,0,,,,, | ||
13330922,"The Black Count: Glory, Revolution, Betrayal, and the Real Count of Monte Cristo",Tom Reiss,"Reiss, Tom",Gabriel Stoian,030738246X,9780307382467,0,3.97,Crown,Hardcover,414,2012,2012,,2022/06/03,to-read,to-read (#199),to-read,,,,0,,,0,,,,, | ||
59469433,The Temporary European: Lessons and Confessions of a Professional Traveler,Cameron Hewitt,"Hewitt, Cameron",,,,0,4.55,,Kindle Edition,,2022,,,2022/06/01,to-read,to-read (#198),to-read,,,,0,,,0,,,,, | ||
32783223,Problems of Life: An Evaluation of Modern Biological Thought,Ludwig Von Bertalanffy,"Bertalanffy, Ludwig Von",,161427701X,9781614277019,0,4,Martino Fine Books,Paperback,226,2014,,,2022/05/11,to-read,to-read (#196),to-read,,,,0,,,0,,,,, | ||
6043781,"Blood of Elves (The Witcher, #1)",Andrzej Sapkowski,"Sapkowski, Andrzej",Danusia Stok,031602919X,9780316029193,4,4.1,Hachette Book Group,Mass Market Paperback,398,2009,1994,2022/05/08,2022/05/05,,,read,,,,1,,,0,,,,, | ||
25454056,"Sword of Destiny (The Witcher, #0.7)",Andrzej Sapkowski,"Sapkowski, Andrzej",David A French,,,4,4.28,Orbit,Kindle Edition,384,2015,1992,2022/05/03,2021/11/22,,,read,,,,1,,,0,,,,, | ||
39328584,Greenwood,Michael Christie,"Christie, Michael",,1984822004,9781984822000,0,4.34,Hogarth,Hardcover,528,2020,2019,,2022/05/01,to-read,to-read (#194),to-read,,,,0,,,0,,,,, | ||
56404444,Bewilderment,Richard Powers,"Powers, Richard",,0393881148,9780393881141,0,3.97,W. W. Norton Company,Hardcover,278,2021,2021,,2022/04/30,to-read,to-read (#193),to-read,,,,0,,,0,,,,, | ||
40180098,The Overstory,Richard Powers,"Powers, Richard",,039335668X,9780393356687,5,4.12,W.W. Norton & Company,Paperback,502,2019,2018,2022/04/30,2022/04/04,,,read,,,,1,,,0,,,,, | ||
25848636,The Time Traveller's Wife,Audrey Niffenegger,"Niffenegger, Audrey",,,,0,3.99,,Hardcover,518,2003,2003,,2022/04/28,to-read,to-read (#192),to-read,,,,0,,,0,,,,, | ||
50403493,Wagnerism: Art and Politics in the Shadow of Music,Alex Ross,"Ross, Alex",,0374285934,9780374285937,0,4.2,"Farrar, Straus and Giroux",Hardcover,784,2020,2020,,2022/04/18,to-read,to-read (#191),to-read,,,,0,,,0,,,,, | ||
11543839,Did Jesus Exist?: The Historical Argument for Jesus of Nazareth,Bart D. Ehrman,"Ehrman, Bart D.",,0062089943,9780062089946,0,3.84,HarperCollins Publishers,ebook,304,2012,2012,,2022/04/18,to-read,to-read (#190),to-read,,,,0,,,0,,,,, | ||
13587193,"Permanent Present Tense: The Unforgettable Life of the Amnesic Patient, H. M.",Suzanne Corkin,"Corkin, Suzanne",,0465031595,9780465031597,3,3.7,Basic Books,Hardcover,386,2013,2012,,2013/10/23,,,read,,,,1,,,0,,,,, | ||
53328332,Less is More: How Degrowth Will Save the World,Jason Hickel,"Hickel, Jason",,1786091216,9781786091215,5,4.53,Windmill Books,Paperback,320,2021,2020,,2022/03/29,,,read,,,,1,,,0,,,,, |
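The script keeps only rows whose `Exclusive Shelf` column is `to-read`. A minimal, self-contained sketch of that filter is shown below, run against a shortened made-up sample instead of the full export. The function name `to_read_books` is hypothetical; the column names come from the header row above.

```python
import csv
import io


def to_read_books(csv_text: str) -> list:
    """Return title/author dicts for rows on the "to-read" exclusive shelf."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{"title": row["Title"], "author": row["Author"]}
            for row in reader
            if row["Exclusive Shelf"] == "to-read"]


# Shortened sample with only the three columns the filter needs.
sample = (
    "Title,Author,Exclusive Shelf\n"
    '"The Information: A History, a Theory, a Flood",James Gleick,to-read\n'
    "The Overstory,Richard Powers,read\n"
)
print(to_read_books(sample))
```

Because `csv.DictReader` handles quoted fields, titles containing commas (as in the first sample row) stay intact.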
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
title,author,BANQ (Overdrive),Ville de Québec (Prêt Numérique),Kobo Plus | ||
The Little Grave,Carolyn Arnold,,, | ||
Décroissance versus développement durable: Débats pour la suite du monde,Yves-Marie Abraham,,https://quebec.pretnumerique.ca/resources?utf8=%E2%9C%93&keywords=Décroissance+versus+développement+durable&author=Yves-Marie+Abraham&narrator=&publisher=&collection_title=&issued_on_range=&language=&audience=&category_standard=thema&category=&nature=ebook&medium=, | ||
L'Intelligence des plantes,Stefano Mancuso,,https://quebec.pretnumerique.ca/resources?utf8=%E2%9C%93&keywords=L'Intelligence+des+plantes&author=Stefano+Mancuso&narrator=&publisher=&collection_title=&issued_on_range=&language=&audience=&category_standard=thema&category=&nature=ebook&medium=, | ||
Against the Grain: A Deep History of the Earliest States,James C. Scott,,, | ||
Future Trends in Microelectronics: Journey Into the Unknown,Serge Luryi,,, | ||
The Water Knife,Paolo Bacigalupi,https://banq.overdrive.com/search/title?query=The+Water+Knife&creator=Paolo+Bacigalupi&mediaType=ebook&sortBy=newlyadded,, | ||
Code: The Hidden Language of Computer Hardware and Software,Charles Petzold,,, | ||
"The Information: A History, a Theory, a Flood",James Gleick,https://banq.overdrive.com/search/title?query=The+Information&creator=James+Gleick&mediaType=ebook&sortBy=newlyadded,, | ||
Thus Spoke the Plant: A Remarkable Journey of Groundbreaking Scientific Discoveries and Personal Encounters with Plants,Monica Gagliano,,, | ||
A Brief History of the Future: The Origins of the Internet,John Naughton,,, | ||
The Master Switch: The Rise and Fall of Information Empires,Tim Wu,https://banq.overdrive.com/search/title?query=The+Master+Switch&creator=Tim+Wu&mediaType=ebook&sortBy=newlyadded,, | ||
Inventing the Internet,Janet Abbate,,, | ||
"The Rise of the Network Society: The Information Age: Economy, Society and Culture, Volume I",Manuel Castells,,, | ||
The Perfectionists: How Precision Engineers Created the Modern World,Simon Winchester,,, | ||
Technology and the Environment in History,Sara B. Pritchard,,, | ||
"The Black Count: Glory, Revolution, Betrayal, and the Real Count of Monte Cristo",Tom Reiss,https://banq.overdrive.com/search/title?query=The+Black+Count&creator=Tom+Reiss&mediaType=ebook&sortBy=newlyadded,, | ||
The Temporary European: Lessons and Confessions of a Professional Traveler,Cameron Hewitt,,,https://www.kobo.com/ca/en/search?query=query&fcmedia=Book~BookSubscription&nd=true&ac=1&ac.author=Cameron+Hewitt&ac.title=The+Temporary+European&sort=PublicationDateDesc&sortchange=1 | ||
Problems of Life: An Evaluation of Modern Biological Thought,Ludwig Von Bertalanffy,,, | ||
Greenwood,Michael Christie,https://banq.overdrive.com/search/title?query=Greenwood&creator=Michael+Christie&mediaType=ebook&sortBy=newlyadded,https://quebec.pretnumerique.ca/resources?utf8=%E2%9C%93&keywords=Greenwood&author=Michael+Christie&narrator=&publisher=&collection_title=&issued_on_range=&language=&audience=&category_standard=thema&category=&nature=ebook&medium=, | ||
Bewilderment,Richard Powers,https://banq.overdrive.com/search/title?query=Bewilderment&creator=Richard+Powers&mediaType=ebook&sortBy=newlyadded,, | ||
The Time Traveller's Wife,Audrey Niffenegger,,, | ||
Wagnerism: Art and Politics in the Shadow of Music,Alex Ross,https://banq.overdrive.com/search/title?query=Wagnerism&creator=Alex++Ross&mediaType=ebook&sortBy=newlyadded,, | ||
Did Jesus Exist?: The Historical Argument for Jesus of Nazareth,Bart D. Ehrman,,, |
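To read an output file like the one above, a short script can group the found titles by library. The sketch below assumes the layout shown here: `title` and `author` columns followed by one column per library, holding a URL when the book was found and an empty string otherwise. The function name `availability_by_library` and the sample data are hypothetical.

```python
import csv
import io


def availability_by_library(csv_text: str) -> dict:
    """Map each library column name to the list of titles found there."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Every column other than title and author is treated as a library.
    libraries = [name for name in (reader.fieldnames or [])
                 if name not in ("title", "author")]
    found = {name: [] for name in libraries}
    for row in reader:
        for name in libraries:
            # A non-empty cell holds the search URL, meaning the book was found.
            if row[name]:
                found[name].append(row["title"])
    return found


# Made-up sample in the same shape as the output file.
sample = (
    "title,author,Library A,Library B\n"
    "Greenwood,Michael Christie,https://a.example/q,https://b.example/q\n"
    "Inventing the Internet,Janet Abbate,,\n"
    "Bewilderment,Richard Powers,https://a.example/q2,\n"
)
print(availability_by_library(sample))
```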