Accessing the obtained URL results in HTTP ERROR 403 #386

Open
MimuraKyosuke opened this issue Dec 21, 2024 · 13 comments
Labels
bug Something isn't working

Comments

@MimuraKyosuke

MimuraKyosuke commented Dec 21, 2024

🛑 DO NOT REMOVE OR SKIP THIS ISSUE TEMPLATE 🛑

Issues with incomplete or missing information will be closed automatically.

🐞 Bug Description

Accessing the obtained URL results in HTTP ERROR 403.
I was able to play the video until yesterday.
The same error occurs on a server with a different IP address.
If I retrieve the best_url with yt-dlp without using the potoken, the video can be played, so it doesn't seem to be an issue with the potoken.
I couldn't determine if it was the same issue as #384.

🔢 Code Snippet

Include the minimal code snippet that reproduces the issue.

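import sys
from pytubefix import YouTube

# Expect the YouTube video ID as the only command-line argument.
if len(sys.argv) > 1:
    video_id = sys.argv[1]
else:
    print("Usage: python script.py <video_id>")
    sys.exit(1)

video_url = f'https://www.youtube.com/watch?v={video_id}'

try:
    yt = YouTube(video_url)
    # Direct media URL of the highest-resolution stream.
    best_url = yt.streams.get_highest_resolution().url
except Exception as e:
    # Report the failure instead of exiting silently.
    print(f"Failed to resolve stream URL: {e}", file=sys.stderr)
    sys.exit(1)

print(best_url)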

🎯 Expected Behavior

Describe what you expected to happen instead of the observed behavior.
Accessing the obtained URL allows the video to play.
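
A rough way to check this expectation without a browser is to request the first byte of the printed URL and look at the status code. The sketch below is not part of the original report; `probe_stream_url` and `is_playable` are hypothetical helper names, and the one-byte `Range` request is an assumption about what is enough to trigger the 403.

```python
import urllib.error
import urllib.request


def probe_stream_url(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for a one-byte ranged GET of `url`."""
    request = urllib.request.Request(url, headers={"Range": "bytes=0-0"})
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is carried on the exception.
        return err.code


def is_playable(status):
    """Treat 200 (full body) and 206 (partial content) as success."""
    return status in (200, 206)
```

Passing the output of the script above to `probe_stream_url` and then to `is_playable` gives a quick yes/no answer; the `opener` argument exists only so the helper can be exercised without network access.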

📸 Screenshots or Logs


🖥️ Environment Details

Fill in the details below about your setup:

  • Operating System: [Ubuntu 24.04.1 LTS]
  • Python Version: [Python 3.12.3]
  • Pytubefix Version: [pytubefix 8.8.2]

📋 Additional Context

Add any additional information or context that might help us resolve the issue.


🚀 Next Steps

Once submitted, we will triage the issue. Make sure to respond to follow-up questions to keep the process smooth.

MimuraKyosuke added the bug label on Dec 21, 2024
@NannoSilver
Contributor

I have also seen an abnormally high number of "HTTP Error 403: Forbidden" exceptions in recent days.

@felipeucelli
Contributor

I also started seeing it recently. Maybe YouTube is messing with the ANDROID_VR client.

Maybe soon we will have to switch to a WEB-based client, which will reduce the library's performance.

@MimuraKyosuke
Author

I wonder why YouTube chose to return a URL that cannot be played, instead of simply not returning a URL at all.

@felipeucelli
Contributor

I believe it is related to PoToken, see more here: #209

@NannoSilver
Contributor

NannoSilver commented Dec 24, 2024

> I believe it is related to PoToken, see more here: #209

I analyzed how this error affects my system, which is small and uses only a couple of IPs.
The 403 error also affects IPs that were never flagged as bots by YouTube. In some cases, switching to another IP and retrying got past the 403 error, but in other cases it did not.
This looks like a new issue, but poToken remains a real possibility.
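
The "replace the IP and retry" approach described above could be sketched roughly as follows. This is an illustration only: `first_success` is a hypothetical helper, `fetch` stands for whatever call performs the download, and the proxy dictionaries are placeholders. As the comment notes, switching IPs did not always help.

```python
import urllib.error


def first_success(fetch, proxy_configs):
    """Call fetch(proxies) for each config and return the first result
    that does not fail with HTTP 403."""
    last_error = None
    for proxies in proxy_configs:
        try:
            return fetch(proxies)
        except urllib.error.HTTPError as err:
            if err.code != 403:
                raise  # only a 403 is treated as "try another IP"
            last_error = err
    if last_error is None:
        raise ValueError("proxy_configs is empty")
    # Every configuration returned 403; surface the last failure.
    raise last_error
```

Here `fetch` might wrap a pytubefix download that receives `proxies` as an argument, assuming the constructor accepts a `proxies` mapping as pytube does.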

@MimuraKyosuke
Author

On a server with the same IP, URLs obtained via pytubefix cannot play videos, but URLs obtained using yt-dlp without a potoken can always play them. I hope this information is helpful.

@MimuraKyosuke
Author

Even with the URL obtained by inputting the potoken and visitorData as shown below, the video cannot be played and results in a 403 error.

yt = YouTube(video_url, 'WEB', use_po_token=True)

@NannoSilver
Contributor

NannoSilver commented Dec 24, 2024

> On the same IP server, the URLs obtained via pytubefix cannot play videos, but the URLs obtained using yt-dlp without a potoken can always play videos. I hope this information is helpful.

I tried yt-dlp on my server, without any proxy, and got this error:

ERROR: [youtube] h5qtlg9fKq8: Sign in to confirm you’re not a bot. Use --cookies-from-browser or --cookies for the authentication. See https://github.com/yt-dlp/yt-dlp/wiki/FAQ#how-do-i-pass-cookies-to-yt-dlp for how to manually pass cookies. Also see https://github.com/yt-dlp/yt-dlp/wiki/Extractors#exporting-youtube-cookies for tips on effectively exporting YouTube cookies

@felipeucelli
Contributor

Recently yt-dlp implemented poToken in the IOS client. And the fact that the error appears randomly may indicate that YouTube is doing A/B testing.

I think that at the moment we just have to wait for the A/B tests to be completed and prepare to change the client.

@Shaadalam9

I am also getting the same error while downloading a video from YouTube:
HTTP Error 403: Forbidden

@MimuraKyosuke
Author

> I tried yt-dlp in my server, without any proxy, and got this error:
>
> ERROR: [youtube] h5qtlg9fKq8: Sign in to confirm you’re not a bot. Use --cookies-from-browser or --cookies for the authentication. See https://github.com/yt-dlp/yt-dlp/wiki/FAQ#how-do-i-pass-cookies-to-yt-dlp for how to manually pass cookies. Also see https://github.com/yt-dlp/yt-dlp/wiki/Extractors#exporting-youtube-cookies for tips on effectively exporting YouTube cookies

It seems that such an error occurs when accessing YouTube multiple times from the same server.
I tried using a server that does not usually access YouTube.
From that server, I was still able to obtain URLs and play videos using yt-dlp.

This 403 error has also been discussed in yt-dlp, so it doesn't seem to be an issue specific to pytubefix.

@MimuraKyosuke
Author

With the latest yt-dlp, the URL obtained using the following command can be played without resulting in a 403 error.
I hope this information is helpful.

yt-dlp -f b -g https://www.youtube.com/watch?v=aqz-KE-bpKQ --extractor-args "youtube:player-client=android_vr"
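
For anyone comparing from Python rather than the shell, the same options can probably be passed through yt-dlp's embedding API. This is a sketch under assumptions: the helper names are hypothetical, and the `extractor_args` mapping is my reading of how `youtube:player-client=android_vr` translates into the options dictionary.

```python
def android_vr_options():
    """yt-dlp options mirroring `-f b --extractor-args "youtube:player-client=android_vr"`."""
    return {
        "format": "b",
        "quiet": True,
        "extractor_args": {"youtube": {"player_client": ["android_vr"]}},
    }


def best_url(video_url):
    """Resolve the direct media URL without downloading (the `-g` behaviour)."""
    # Imported lazily so the options helper works without yt-dlp installed.
    import yt_dlp

    with yt_dlp.YoutubeDL(android_vr_options()) as ydl:
        info = ydl.extract_info(video_url, download=False)
    # With a single selected format, the chosen format's URL should be in `info`.
    return info["url"]
```

Calling `best_url("https://www.youtube.com/watch?v=aqz-KE-bpKQ")` should then correspond to the command above, though I have not verified that both return a playable URL in every case.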

@felipeucelli
Contributor

In my tests, I noticed that the 403 error appears after many requests coming from a single IP, and in some cases, it is immediately detected as a bot.

I managed to fix this by manually adding a visitorData value to the API request. This does not help once the IP has already been detected as a bot; it only helps when the request would otherwise produce the 403 error.

Using poToken seems to work as it sends the visitorData (I'm not sure but it seems to expire after several requests).

We can add visitorData automatically, but I believe that for now it is better to wait for the A/B tests and prepare for the changes.
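
To illustrate what "adding a visitorData to the API request" could look like, here is a minimal sketch. It is not pytubefix's internal code: the payload shape is an assumption about InnerTube-style requests, `with_visitor_data` is a hypothetical helper, and the client version and visitorData strings are placeholders.

```python
import copy


def with_visitor_data(payload, visitor_data):
    """Return a copy of an InnerTube-style payload with visitorData set
    on the client context, leaving the original payload untouched."""
    updated = copy.deepcopy(payload)
    client = updated.setdefault("context", {}).setdefault("client", {})
    client["visitorData"] = visitor_data
    return updated


# Placeholder payload; clientVersion and the visitorData value are illustrative only.
base_payload = {
    "context": {"client": {"clientName": "ANDROID_VR", "clientVersion": "<client version>"}},
    "videoId": "aqz-KE-bpKQ",
}
payload = with_visitor_data(base_payload, "<visitorData value>")
```

Copying the payload rather than mutating it keeps a clean base around, which matters if the visitorData value expires after several requests and has to be swapped out.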

Projects
Status: waiting