GoodReads Preliminary Notes and Discussion

Grunthos edited this page Jan 4, 2012 · 1 revision

##GoodReads (GR) Integration

###Benefits to BookCatalogue (BC)

  • user ratings and reviews
  • often requested feature
  • more series data; ability to 'show other books in series', and 'show other books by author'

Aside: We may want to consider updating book search to include GR and to show a list of several matching books (and covers) before going to the 'Add Book' screen. We may also want to add an 'I own this book' flag to the books table, since GR makes that distinction.

###Important Structural Data Differences

BC just has the concept of a 'Book'. Each book has an (optional) ISBN and a unique ID. It is possible to have more than one 'Book' with the same ISBN, i.e. more than one copy, but each will have a different unique ID in BC. Each 'Book' can be assigned to more than one shelf, and individual books with the same ISBN can be assigned to different shelves.

GR has:

  • 'Book' -- roughly analogous to BC 'Book', typically unique by ISBN and/or unique ID from GR
  • 'Work' -- fits above 'Book' in the logical hierarchy and seems to encompass all versions of a 'book'; e.g. editions apply to works, and works (not books) go in series. Every book has an associated work.
  • 'OwnedBook' -- corresponds to physical items
  • 'Edition' -- not sure this is really relevant to BC integration. It is assumed to fit below 'Book' in the hierarchy.

In GR, 'works' are placed on shelves, NOT books or ownedBooks. This means that all items that share the same ISBN will be on the same shelf, unlike BC.
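As a rough illustration of this mismatch, the sketch below folds BC copies that share an ISBN into a single shelf set, which is roughly how they would look from the GR side. The class `BcBook`, the function `shelves_as_seen_by_gr`, and the choice to union the shelves are all assumptions made for illustration; neither BC nor GR prescribes this mapping.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BcBook:
    """One BC 'Book' row: a single copy with its own ID and shelves (hypothetical shape)."""
    bc_id: int
    isbn: Optional[str]
    shelves: set = field(default_factory=set)


def shelves_as_seen_by_gr(books):
    """Union the shelves of all BC copies that share an ISBN.

    Assumption: GR shelves apply to everything sharing an ISBN, so per-copy
    shelf assignments in BC cannot be kept apart. Books without an ISBN are
    skipped because they cannot be matched this way.
    """
    merged = defaultdict(set)
    for book in books:
        if book.isbn:
            merged[book.isbn] |= book.shelves
    return dict(merged)


copies = [
    BcBook(1, "9780000000001", {"Lounge"}),
    BcBook(2, "9780000000001", {"Loaned"}),
    BcBook(3, None, {"Lounge"}),
]
print(shelves_as_seen_by_gr(copies))
```

The two copies end up on both shelves, so the per-copy distinction that BC keeps is lost in this direction.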

###Important Update Differences

GR allows shelves to be renamed. BC does not.

##GR Features

###GR Features Deemed Essential

  • Shelves
  • Books
  • GET reviews
  • lookup book details

###GR Features To Implement Some Time

  • Personal status update
  • Adding/updating reviews

###GR Features Deemed Low Priority

  • Listopia
  • Discussions interface
  • Friends management
  • queries about other GR users
  • Followers

##Synchronization

Many of the data items relating to books are considered read-only in GR; they sensibly guard this data. Some may be updated by 'Librarians', but only via the web. A 'Librarian' seems to be defined as someone with 50 or more books.

###What will be synchronized?

  • New books
  • Updated ratings and notes on books
  • New shelves
  • Books on shelves

The ONLY fields relevant to BC that can be updated in GR are: shelf, review, read_start and rating.

###What MAY be synchronized?

  • Read End Date (GR does not support start date)
  • Publisher (one-way: GR -> BC)
  • Publication date (one-way: GR -> BC)
  • Pages (one-way: GR -> BC)
  • Genre (one-way: GR -> BC)
  • Series (one-way: GR -> BC)

###What will NOT be synchronized?

  • General book metadata. These are derived from many sources; it is not appropriate to copy from site to site.
  • Any author info

###Data Changes to support GR

  • Add goodreads_book_id(int) and (maybe) goodreads_owned_book_id(int) to the books table. ESSENTIAL for sync.
  • Add goodreads_author_id to the authors table. MAYBE.
  • Add goodreads_shelf_id to the bookshelves table. ESSENTIAL for sync.

Alternatively, in order to keep core tables 'clean', consider adding a goodreads_books table that has a 1:1 relationship with the books table; ditto bookshelves and authors. This is logically equivalent to adding fields to the base tables.
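A minimal sketch of the side-table alternative, using Python's `sqlite3` module purely for illustration. The `books` table here is a simplified stand-in (its `_id`, `title` and `isbn` columns are assumed), and the `book_id` key column is a hypothetical name; only `goodreads_book_id` and `goodreads_owned_book_id` come from the notes above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Simplified stand-in for BC's existing books table (assumed columns).
    CREATE TABLE books (
        _id   INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        isbn  TEXT
    );
    -- Side table: at most one row per book, keyed by the book's own ID.
    CREATE TABLE goodreads_books (
        book_id                 INTEGER PRIMARY KEY
                                REFERENCES books(_id) ON DELETE CASCADE,
        goodreads_book_id       INTEGER NOT NULL,
        goodreads_owned_book_id INTEGER
    );
""")
conn.execute(
    "INSERT INTO books (_id, title, isbn) VALUES (1, 'Example', '9780000000001')"
)
conn.execute("INSERT INTO goodreads_books VALUES (1, 4242, NULL)")

# A LEFT JOIN keeps books that have never been sent to GR in the result.
row = conn.execute("""
    SELECT b.title, g.goodreads_book_id
    FROM books b LEFT JOIN goodreads_books g ON g.book_id = b._id
    WHERE b._id = 1
""").fetchone()
print(row)
```

Making `book_id` the primary key of the side table is what enforces the 1:1 relationship; the same pattern would apply to bookshelves and authors.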

###Synchronization Method

Synchronization needs to be developed with large personal libraries in mind (where large is defined as more than 1000 books).

In that context, GR does not provide useful APIs for retrieving lists of books. Ideally, it would be possible to get a sorted list consisting of just GR IDs and 'last-update-date', so that we could perform a mix-and-match search against our own database and update as appropriate. What IS provided by GR is a huge dump of books including reviews, author details etc., which would be unmanageable for synchronizing large libraries.
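To make the 'mix-and-match' idea concrete, here is a sketch of a single-pass comparison over two lists sorted by GR ID. It assumes the lightweight (ID, last-update-date) list that GR does not currently provide, so the `remote` input is hypothetical, as is the function name `merge_sorted`.

```python
def merge_sorted(local, remote):
    """Walk two lists of (gr_id, last_update), both sorted by gr_id, in one pass.

    Returns three lists of GR IDs: (only_local, only_remote, remote_newer).
    Dates are ISO 'YYYY-MM-DD' strings, which compare correctly as text.
    """
    only_local, only_remote, remote_newer = [], [], []
    i = j = 0
    while i < len(local) and j < len(remote):
        l_id, l_date = local[i]
        r_id, r_date = remote[j]
        if l_id == r_id:
            if r_date > l_date:
                remote_newer.append(l_id)
            i += 1
            j += 1
        elif l_id < r_id:
            only_local.append(l_id)
            i += 1
        else:
            only_remote.append(r_id)
            j += 1
    # Whatever is left on either side has no counterpart on the other.
    only_local.extend(gr_id for gr_id, _ in local[i:])
    only_remote.extend(gr_id for gr_id, _ in remote[j:])
    return only_local, only_remote, remote_newer


bc = [(10, "2012-01-01"), (20, "2012-01-02"), (40, "2012-01-03")]
gr = [(20, "2012-01-05"), (30, "2012-01-01"), (40, "2012-01-03")]
print(merge_sorted(bc, gr))
```

Because both lists are only read once, memory use stays flat even for libraries of several thousand books, which is the point of asking for a sorted, minimal list.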

####New Items

#####New Items in BC (OK)

If any of the goodreads_* IDs are null/0, it implies that the item is new and needs to be added to GR.
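The null/0 check could look something like the query below. This is a sketch only: it assumes `goodreads_book_id` has been added directly to a `books` table whose key column is `_id`, and `find_unsynced_books` is a hypothetical helper name.

```python
import sqlite3


def find_unsynced_books(conn):
    """Return BC book IDs whose goodreads_book_id is NULL or 0."""
    rows = conn.execute(
        "SELECT _id FROM books "
        "WHERE goodreads_book_id IS NULL OR goodreads_book_id = 0 "
        "ORDER BY _id"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books (_id INTEGER PRIMARY KEY, title TEXT, goodreads_book_id INTEGER)"
)
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [(1, "Synced", 4242), (2, "Never sent", None), (3, "Zeroed", 0)],
)
print(find_unsynced_books(conn))
```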

#####New Books in GR

Detecting new books in GR is not possible due to the size of the list-based API output. The best solution here might be to implement a new Add option, 'Add From GoodReads Collection', which would look up the named book or ISBN in the user's personal collection and add it to BC.

#####New Shelves in GR

See 'Deleted Shelves', below.

####Deleted Items

#####Deleted Books

GR provides no means to detect that a book was deleted. The only way this could be implemented would be to look up each book in BC by its GR ID and check that it still exists. This would be extremely cumbersome.

#####Deleted Authors

Don't matter.

#####Deleted Shelves

Since the number of shelves is relatively small, it is possible to mix-and-match bookshelf lists. A deletion in BC could be detected by using a trigger to log bookshelf deletions, and a deletion in GR could be detected if a BC shelf has a GR ID that no longer exists. However, no shelf should be deleted in BC without asking the user, and a shelf can never be deleted if doing so would leave a book without a shelf.
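The shelf comparison is small enough to do with plain sets. The sketch below assumes BC stores a GR shelf ID against each shelf name and that the current set of GR shelf IDs has already been fetched (for example via shelves.list); the function names and data shapes are hypothetical.

```python
def classify_shelves(bc_shelves, gr_shelf_ids):
    """Compare BC shelves (name -> stored GR ID, or None) with current GR IDs."""
    known = {gr_id for gr_id in bc_shelves.values() if gr_id}
    return {
        # No GR ID yet: the shelf was created in BC and never sent.
        "new_in_bc": sorted(n for n, gr_id in bc_shelves.items() if not gr_id),
        # GR ID stored in BC but no longer present in GR.
        "deleted_in_gr": sorted(
            n for n, gr_id in bc_shelves.items()
            if gr_id and gr_id not in gr_shelf_ids
        ),
        # GR IDs that no BC shelf refers to.
        "new_in_gr": sorted(gr_shelf_ids - known),
    }


def can_delete_shelf(shelf, shelves_by_book):
    """A shelf may be removed only if every book on it is also on another shelf."""
    return all(
        len(shelves) > 1
        for shelves in shelves_by_book.values()
        if shelf in shelves
    )


bc_shelves = {"Default": 11, "Loaned": 12, "Wishlist": None}
gr_ids = {11, 13}
report = classify_shelves(bc_shelves, gr_ids)
print(report)

shelves_by_book = {1: {"Default", "Loaned"}, 2: {"Loaned"}}
print(can_delete_shelf("Loaned", shelves_by_book))
```

In this example 'Loaned' has disappeared from GR, but book 2 sits only on that shelf, so it could not be removed from BC even if the user agreed.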

####Updated Items

It would be possible to log changes in BC so that we could sync all changed books since the last sync to GR, but the reverse is not possible. GR does not provide an API that returns a list of books updated since a given time.

The simplest first pass solution may be to add a 'Send to GR' and 'Update from GR' option while displaying book details, possibly allowing specific fields to be selected.

#####Tracking Updates in BC

When updating book details in BC it would be desirable to update GR instantaneously, but this will not always be possible. When a book is saved, it may be desirable to check if any GR-exported fields have been updated, and if so, queue a GR update for those fields.

To achieve this, book editing will need to load the existing book details before saving, and check what has changed, if anything. It will also be necessary to add a new table, perhaps 'goodreads_exported_book_fields', to flag which fields are relevant. If an exported field has been changed and GR is unavailable, the old and new images will need to be saved.
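A sketch of that comparison step, under stated assumptions: the exported field names below are placeholders, the in-memory `pending_updates` list stands in for whatever queue table is eventually used, and `save_book` is a hypothetical hook rather than an existing BC method.

```python
# Placeholder set; the real list would come from a table such as
# 'goodreads_exported_book_fields'.
EXPORTED_FIELDS = {"rating", "notes"}

# Stand-in for a persistent queue of updates waiting for GR to be reachable.
pending_updates = []


def changed_exported_fields(old, new, exported=EXPORTED_FIELDS):
    """Return {field: (old_value, new_value)} for exported fields that differ."""
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(exported)
        if old.get(name) != new.get(name)
    }


def save_book(book_id, old, new, gr_available):
    """Diff the old and new images and queue the change if GR cannot be reached."""
    changes = changed_exported_fields(old, new)
    if changes and not gr_available:
        pending_updates.append((book_id, changes))
    return changes


old = {"title": "Example", "rating": 3, "notes": ""}
new = {"title": "Example (2nd ed.)", "rating": 4, "notes": ""}
print(save_book(7, old, new, gr_available=False))
print(pending_updates)
```

The title change is ignored because it is not an exported field; only the rating change is queued, together with its old and new values.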

Alternatively, every time a book is edited and saved, mark it as in need of export to GR. This approach is much simpler but risks overwriting updates made directly in GR. This is a good method for the first release.

##'Complete' list of GR APIs and their initial importance to BC

###Essential APIs in First Release

#####auth.user — Get id of user who authorized OAuth.

ESSENTIAL. Done

#####book.isbn_to_id — Get the Goodreads book ID given an ISBN.

ESSENTIAL. Done

#####book.show — Get the reviews for a book given a Goodreads book id.

ESSENTIAL. This gets the full details of a book: ID, ISBN, covers, publishing info, authors, rating, every(?) review. This could be a lot of data, and may need to go into the database to reduce memory usage. Notably missing from this data is the list of SERIES a book appears in.

#####owned_book.create — Add to books owned.

ESSENTIAL. Done

#####shelves.add_to_shelf — Add a book to a shelf.

ESSENTIAL. Done.

###High Priority APIs

#####review.create — Add review.

HIGH. This is not only the means by which personal star-ratings of books are updated, it is also the method used to update comments, date-read, 'owned' etc.

#####review.destroy — Destroy a review.

HIGH. See review.create.

#####review.show — Get a review.

HIGH.

#####search.books — Find books by title, author, or ISBN.

HIGH...I think. Another source of books is good. This also returns a great deal of data, eg. reviews.

#####series.work — See all series a work is in.

HIGH.

#####shelves.list — Get a user's shelves.

HIGH. Useful for sync. Issues may occur because shelves can be renamed in GR.

###Low Priority APIs

#####book.review_counts — Get review statistics given a list of ISBNs.

LOW. Fast way to get review summaries for a list of books

#####book.show_by_isbn — Get the reviews for a book given an ISBN.

LOW. See book.show; probably lower priority.

#####notifications — See the current user's notifications.

LOW. Not in first implementation

#####owned_book.list — List books owned by a user.

LOW. Maybe. This returns every owned book in one request. It also gets all reviews, authors, etc etc. It will be a huge amount of data, again best stored temporarily in the database. The sheer volume of data will probably make this API useless.

Unfortunately, there seems to be no other API for listing books, so syncing with GR will not be easily automated.

#####owned_book.show — Show an owned book.

LOW - at least until we work out how to handle the BC-Book vs. the GR-Book relationship.

#####owned_book.update — Update an owned book.

LOW - see owned_book.show

#####quotes.create — Add a quote.

ONE DAY. Maybe. Kind of nice feature.

#####ratings.create — Rate a review.

LOW. Comes after adding/editing reviews.

#####reviews.list — Get the books on a member's shelf. {Actually, it gets the REVIEWS for books on a shelf}

LOW.

#####review.recent_reviews — Recent reviews from all members.

ONE DAY. Maybe. Use GR web site, or GR app for this.

#####review.show_by_user_and_book — Get a user's review for a given book.

LOW.

#####review.update — Update book reviews.

LOW.

#####search.authors — Find an author by name.

LOW. BC does not do much with authors at the current time. It could be useful in finding a book, but not in the first release.

#####series.show — See a series.

LOW. See all the works in a series.

#####series.list — See all series by an author.

ONE DAY MAYBE.

#####user_shelves.create — Add book shelf. {Actually, add a SHELF}

LOW? It seems that shelves.add_to_shelf automatically creates the shelf.

#####user_shelves.destroy — Delete book shelf.

LOW. Probably do this from web or GR app. Deleting a shelf is not really something to play with in first release.

#####user_shelves.update — Edit book shelf.

LOW (Not sure). BC really does not do much with shelves. If a shelf needs editing, probably best to do it in GR.

#####user_status.create — Update user status.

LOW. Seems nice to do, but probably not in first release.

#####user_status.destroy — Delete user status.

LOW. Seems nice to do, but probably not in first release.

#####user_status.index — View user status.

LOW. Seems nice to do, but probably not in first release.

#####work.editions — See all editions by work.

LOW. Seems nice to do, but probably not in first release. Also requires extra permissions.

###GR API - No Plans to Implement in short to medium term.

#####author.books — Paginate an author's books.

#####author.show — Get info about an author by id.

#####book.title — Get the reviews for a book given a title string.

#####comment.create — Create a comment.

#####comment.list — List comments on a subject.

#####events.list — Events in your area.

#####followers.create — Follow a user.

#####friends.create — Add a friend.

#####group.list — List groups for a given user.

#####group.members — Return members of a particular group.

#####group.show — Get info about a group by id.

#####list.book — Get the listopia lists for a given book.

#####list.show — Get the books from a listopia list. This API requires extra permission please contact us.

#####list.tag — Get the listopia lists for a given tag. This API requires extra permission please contact us.

#####topic.create — Create a new topic via OAuth.

#####topic.group_folder — Get list of topics in a group's folder.

#####topic.show — Get info about a topic by id.

#####topic.unread_group — Get a list of topics with unread comments.

#####updates.friends — Get your friend updates.

#####user.show — Get info about a member by id or username.

#####user.compare — Compare books with another member.

#####user.followers — Get a user's followers.

#####user.following — Get people a user is following.

#####user.friends — Get a user's friends.