This is a simple utility that uses yt-dlp and the YouTube Data API to archive the contents of a YouTube channel.
It uses a SQLite database to cache channel information so that it calls the YouTube API as rarely as possible.
To use it, you need an API key: register an application, then enable the YouTube Data API for that application.
After you've obtained your key, assign it to the YT_CH_ARCHIVER_API_KEY environment variable.
Set a path for downloading the videos with the YT_CH_ARCHIVER_ROOT_PATH environment variable. A directory named after each channel will be created under this path.
You can optionally set YT_CH_ARCHIVER_DB_PATH to the location where you want the SQLite database to be stored. By default it will be ~/.local/share/yt-ch-archiver/videos.db.
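As a rough illustration of how these three variables fit together, here is a minimal Python sketch. The `load_config` helper is hypothetical and not taken from app.py; only the variable names and the default database path come from this README.

```python
import os
from pathlib import Path

# Default database location stated in this README.
DEFAULT_DB_PATH = Path("~/.local/share/yt-ch-archiver/videos.db")

REQUIRED = ("YT_CH_ARCHIVER_API_KEY", "YT_CH_ARCHIVER_ROOT_PATH")


def load_config(env=None):
    """Return (api_key, root_path, db_path) read from the environment.

    Hypothetical helper: the real app.py may read these values differently.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Fall back to the documented default when no database path is set.
    db_path = Path(env.get("YT_CH_ARCHIVER_DB_PATH") or DEFAULT_DB_PATH).expanduser()
    return env["YT_CH_ARCHIVER_API_KEY"], Path(env["YT_CH_ARCHIVER_ROOT_PATH"]), db_path
```

Calling `load_config()` with no arguments reads from the real process environment; passing a plain dict makes the behaviour easy to check in isolation.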
The general usage for this utility is to run get commands, then list commands. The former fetch information from YouTube and cache it locally, while the latter work on the cached data.
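The split between fetching and listing can be pictured with the sketch below. It assumes a made-up `videos` table and a `fetch_remote` callable standing in for the YouTube Data API; the schema and function names are illustrative only, not the ones the utility actually uses.

```python
import sqlite3


def open_cache(db_path=":memory:"):
    """Open the cache and create a hypothetical videos table if needed."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS videos ("
        "id TEXT PRIMARY KEY, channel TEXT NOT NULL, title TEXT NOT NULL)"
    )
    return conn


def get_videos(conn, channel, fetch_remote):
    """'get' step: call the remote API once and store the results locally."""
    rows = [(v["id"], channel, v["title"]) for v in fetch_remote(channel)]
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR REPLACE INTO videos (id, channel, title) VALUES (?, ?, ?)",
            rows,
        )
    return len(rows)


def list_videos(conn, channel):
    """'list' step: read only from the local cache, with no network access."""
    cur = conn.execute(
        "SELECT id, title FROM videos WHERE channel = ? ORDER BY id", (channel,)
    )
    return cur.fetchall()
```

With this shape, repeated list calls never touch the network, which is the point of caching the channel data.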
First, get the list of all the videos for the channel:
./app.py videos get <channel-name>
Now retrieve the playlists:
./app.py playlists get <channel-name>
Playlists sometimes contain videos that are unlisted on the main channel. They also often contain videos from other channels.
Run this command to cache the information for the unlisted and external videos:
./app.py playlists ls <channel-name> --add-unlisted --add-external
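One way to think about the two flags is as a classification of playlist items. The sketch below assumes that "unlisted" means a video owned by the channel that is missing from its cached video list, and "external" means a video owned by any other channel. The function name and data shapes are hypothetical.

```python
def classify_playlist_items(channel_id, listed_ids, playlist_items):
    """Split playlist items into (unlisted, external) video ID lists.

    playlist_items is an iterable of (video_id, owner_channel_id) pairs.
    listed_ids is the set of video IDs already cached for the channel.
    """
    unlisted, external = [], []
    for video_id, owner_id in playlist_items:
        if owner_id != channel_id:
            # Owned by another channel: an external video.
            external.append(video_id)
        elif video_id not in listed_ids:
            # Owned by this channel but absent from its video list.
            unlisted.append(video_id)
    return unlisted, external
```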
If you now run ./app.py list-channels you will see the main channel, plus all the channels that relate to any external videos that were on playlists.
Now it's useful to run ./app.py list-videos <channel-name>. You will see all the videos for the channel, plus any unlisted videos that were on playlists. Videos not yet downloaded are coloured red, while those that have been are coloured green. At this point they should all be red.
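The red/green colouring can be done with standard ANSI escape codes. The following is only a sketch of that idea; the `format_video_line` helper and the exact line layout are assumptions, and the utility may produce its colours some other way.

```python
# Standard ANSI escape sequences for terminal colours.
RED = "\033[31m"
GREEN = "\033[32m"
RESET = "\033[0m"


def format_video_line(video_id, title, downloaded):
    """Return one listing line, green if downloaded and red otherwise."""
    colour = GREEN if downloaded else RED
    return f"{colour}{video_id}  {title}{RESET}"
```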
Now download them:
./app.py videos download <channel-name>
Using yt-dlp, the utility will download the video, thumbnail, info and description for each video for that channel. It configures yt-dlp to download the best mp4 video available, and failing that, the best video otherwise available.
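For readers who want to reproduce a similar setup by hand, the sketch below builds a yt-dlp options dictionary along these lines. The `build_ydl_opts` helper, the format selector string and the output template are assumptions rather than the exact values app.py uses; the option keys themselves (`format`, `outtmpl`, `writethumbnail`, `writeinfojson`, `writedescription`) are standard yt-dlp options.

```python
from pathlib import Path


def build_ydl_opts(root_path, channel_name):
    """Build a yt-dlp options dict (hypothetical helper, not from app.py)."""
    video_dir = Path(root_path) / channel_name / "video"
    return {
        # Prefer the best mp4; otherwise fall back to the best available format.
        "format": "best[ext=mp4]/best",
        # Name each file after the YouTube ID rather than the title.
        "outtmpl": str(video_dir / "%(id)s.%(ext)s"),
        "writethumbnail": True,    # save the thumbnail image
        "writeinfojson": True,     # save the metadata as .info.json
        "writedescription": True,  # save the description as .description
    }
```

A dictionary like this would typically be passed to `yt_dlp.YoutubeDL(opts)` and followed by a call to its `download()` method with a list of video URLs.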
The videos will be downloaded to YT_CH_ARCHIVER_ROOT_PATH/channel_name/video. By default yt-dlp uses the title of the video in the filename, but these can be huge and unwieldy. This utility just saves the video using the YouTube ID. For this reason, I added a command to generate a basic HTML file to function as an index, so it's easy to tell which video relates to which ID. Generate the index by running this command:
./app.py channels generate-index <channel-id>
This will output an index.html file at YT_CH_ARCHIVER_ROOT_PATH/channel_name/video/index.html.
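To give a feel for what such an index might contain, here is a minimal sketch that renders a table mapping each YouTube ID to its title. The `render_index` function and the HTML layout are illustrative assumptions; the file that the command actually generates may look different.

```python
from html import escape


def render_index(channel_name, videos):
    """Return an HTML page listing (video_id, title) pairs in a table."""
    rows = "\n".join(
        f"    <tr><td>{escape(video_id)}</td><td>{escape(title)}</td></tr>"
        for video_id, title in videos
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f"<head><meta charset='utf-8'><title>{escape(channel_name)}</title></head>\n"
        "<body>\n"
        "  <table>\n"
        "    <tr><th>ID</th><th>Title</th></tr>\n"
        f"{rows}\n"
        "  </table>\n"
        "</body>\n"
        "</html>\n"
    )
```

Escaping the titles matters here because video titles often contain characters such as `&` and `<` that would otherwise break the markup.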