SEPTA: ZIP file contains multiple ZIP archives within #154

Open
mlundblad opened this issue Sep 20, 2024 · 4 comments

@mlundblad

Issue description
The GTFS feed (obtained via Transitland) for SEPTA (Southeastern Pennsylvania Transportation Authority) contains two GTFS archives within a single ZIP file.

There are two links, one for the bus feed and one for the rail feed.

Last update of GTFS Feed
2024-09-07

Hash of the GTFS Feed
SHA1: adb983d5fae46af17e07ae8ae31423b2a91b6916
SHA1: da7a6dc4e8f83f9b6dd4b1dc1e984b56a25c96b5

GTFS Feed Download Link
https://github.com/septadev/GTFS/releases/latest/download/gtfs_public.zip#google_rail.zip
https://github.com/septadev/GTFS/releases/latest/download/gtfs_public.zip#google_bus.zip

Corresponding Transitland pages:
https://www.transit.land/feeds/f-dr4-septa~rail
https://www.transit.land/feeds/f-dr4-septa~bus

@mlundblad changed the title from "Agency Short Name: SEPTA" to "SEPTA: ZIP file contains multiple ZIP archives within" on Sep 20, 2024
@mlundblad
Author

Actually, the "anchor part" (after the #) corresponds to the file name of the archive inside the "outer" ZIP. So maybe the intention is that the parser treats it as an "address" into the ZIP…
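
To illustrate that reading of the URL, here is a minimal Python sketch that splits a feed URL into the part to download and the name of the inner archive. The function name `split_feed_url` is hypothetical and is not taken from any existing parser; whether the fragment is really meant as an address into the ZIP is only the assumption discussed above.

```python
from typing import Optional, Tuple
from urllib.parse import urldefrag


def split_feed_url(url: str) -> Tuple[str, Optional[str]]:
    """Split a feed URL into (download URL, inner archive name or None).

    Assumption: the URL fragment (the part after '#') names an archive
    stored inside the downloaded "outer" ZIP.
    """
    base, fragment = urldefrag(url)
    # An absent or empty fragment means the download itself is the feed.
    return base, fragment or None


base, inner = split_feed_url(
    "https://github.com/septadev/GTFS/releases/latest/download/gtfs_public.zip#google_rail.zip"
)
print(base)
print(inner)
```

The fragment is never sent to the server, so both links above download the same outer file and differ only in which inner archive they point at.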

@hbruch
Member

hbruch commented Sep 21, 2024

Hi @mlundblad!

Thanks for reporting this issue here. I was not aware that @septadev already had a GTFS GitHub repository that they use to publish their feeds and to track issues people have with them. That's great, and significantly better than what all the other agencies I know of do.

I suggest opening an issue directly there, as they will surely monitor their own repo.

@mlundblad
Author

It seems this might be intentional on SEPTA's part:
septadev/GTFS#14

In the meantime, I tested implementing support for treating the "trailing path" after # in the URL as the name of a "sub ZIP file": the downloaded ZIP is opened, and the "addressed" inner ZIP is extracted and written to disk. See:

public-transport/transitous#518
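
For readers who do not want to open the PR, the idea can be sketched roughly as follows. This is not the code from that PR, only a minimal illustration using Python's standard `zipfile` module; the function name `extract_inner_zip` is made up for this example.

```python
import io
import zipfile


def extract_inner_zip(outer_bytes: bytes, inner_name: str) -> bytes:
    """Return the raw bytes of the archive `inner_name` stored in the outer ZIP."""
    with zipfile.ZipFile(io.BytesIO(outer_bytes)) as outer:
        if inner_name not in outer.namelist():
            raise KeyError(f"{inner_name!r} not found in outer archive")
        data = outer.read(inner_name)
    # Sanity check: the extracted member should itself be a ZIP archive.
    if not zipfile.is_zipfile(io.BytesIO(data)):
        raise ValueError(f"{inner_name!r} is not a ZIP archive")
    return data
```

The returned bytes can then be written out as the actual GTFS feed, e.g. `open("septa_rail.zip", "wb").write(data)`, where the output file name is again just an example.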

@mlundblad
Author

Aha, and actually there seem to be direct links as well (not via the GitHub page).

So, maybe we should just use an HTTP source instead.
