Replies: 2 comments 3 replies
-
We don’t care what you do with the data. If you want to build a database, knock yourself out; even commercial use is fine. The biggest issue we have is API scraping, which is what we combat constantly. I’m fine with people building databases, but I don’t want them using the API to do it (meaning issuing 100 million requests just to get all the episodes). I’d rather provide a download. It’s just that the full download (with episodes) is 150 gigs, which is why I currently only provide the feeds table.

I’m not sure what you mean by the weekly dump “not being at a predictable URL”. The download URL for that hasn’t changed in a long time. If there is a bug, let me know.

What @mitchdowney does for Podverse is track the /recent/data endpoint and just follow along with all the feed updates. There are actually a bunch of ways to stay up to date, and I’m glad to share them.

Sorry for the late reply. This should probably be in the “database” repo instead of in the namespace repo.
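For reference, the "/recent/data" tracking approach could look roughly like this. A minimal Python sketch, not Podverse's actual code: the credentials are placeholders, and the auth scheme shown (an `Authorization` header set to the SHA-1 of key + secret + unix timestamp) is the one the Podcast Index API documents.

```python
# Hedged sketch: follow Podcast Index feed updates by polling /recent/data.
# API key/secret are placeholders; get real ones from api.podcastindex.org.
import hashlib
import json
import time
import urllib.request

BASE = "https://api.podcastindex.org/api/1.0"


def auth_headers(key: str, secret: str, now: int = None) -> dict:
    """Build the headers the Podcast Index API expects:
    Authorization = sha1(key + secret + unix-timestamp)."""
    ts = str(int(now if now is not None else time.time()))
    digest = hashlib.sha1((key + secret + ts).encode()).hexdigest()
    return {
        "User-Agent": "example-sync/0.1",
        "X-Auth-Key": key,
        "X-Auth-Date": ts,
        "Authorization": digest,
    }


def poll_recent_data(key: str, secret: str, since: int = 0, max_items: int = 500):
    """Fetch one page of recent feed/item updates since the given timestamp."""
    url = f"{BASE}/recent/data?since={since}&max={max_items}"
    req = urllib.request.Request(url, headers=auth_headers(key, secret))
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

A consumer would call `poll_recent_data` in a loop, advancing `since` from each response, and apply the updates to its local copy.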
-
I understand the frustration with the API terms. Your workaround with the weekly data dumps sounds smart! Have you considered publishing differentials, or using podping for updates?
-
Reading Section 5.e. of the Podcast Index APIs Terms of Service (March 2, 2021), [you will not]:
It seems from this that the way I would want to use the API is actually not permitted, and that's why I have instead built a sync tool to import the weekly data dumps, compute a diff of what's changed, and then replicate all of the APIs that I need. I understand that the weekly data dump is there to prevent scraping the API, but it's not clear to me why someone who already has a copy of the data dump shouldn't also be able to make occasional API calls and integrate those responses into a local copy of the database.
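The diff step of the sync tool described above could be sketched as follows. This is a minimal illustration, not the actual tool: the schema (a `feeds` table with `id` and `lastUpdate` columns) is assumed for the example and is not necessarily the dump's real layout.

```python
# Hedged sketch: diff two weekly dump snapshots by attaching both SQLite
# files and comparing rows on an id plus a last-update column.
# The table/column names here are illustrative assumptions.
import sqlite3


def diff_feeds(old_db: str, new_db: str):
    """Return (added_ids, removed_ids, changed_ids) between two dumps."""
    con = sqlite3.connect(new_db)
    con.execute("ATTACH DATABASE ? AS old", (old_db,))
    added = [r[0] for r in con.execute(
        "SELECT id FROM feeds WHERE id NOT IN (SELECT id FROM old.feeds)")]
    removed = [r[0] for r in con.execute(
        "SELECT id FROM old.feeds WHERE id NOT IN (SELECT id FROM feeds)")]
    changed = [r[0] for r in con.execute(
        "SELECT f.id FROM feeds f JOIN old.feeds o ON f.id = o.id "
        "WHERE f.lastUpdate <> o.lastUpdate")]
    con.close()
    return added, removed, changed
```

The returned id lists would then drive the import: insert the added feeds, delete the removed ones, and re-fetch or update the changed ones in the local replica.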
Prohibition 2 is also a concern to the extent that it relates to prohibition 1:
I am also curious to ask, what do the cache headers referenced in prohibition 1 actually say?
As for what I'm currently doing (downloading the whole database every week, computing a diff, then integrating that), I feel this could be streamlined. The file is large to download, the diff/import process takes time to complete, and I don't think the dump is at a predictable URL, so it's a process I have to do manually. It would perhaps be more convenient and efficient to publish diffs. Or, and maybe you have reasons against this, since the Podcast Index already knows when every podcast gets updated, it could just send podping notifications on behalf of the feeds that don't already use podping.