videocardz.com - unable to decode content correctly #2883

Open
2 tasks done
hazzuk opened this issue Oct 10, 2024 · 1 comment

Comments

@hazzuk

hazzuk commented Oct 10, 2024

TL;DR

Videocardz.com needs specific feed settings to get past Cloudflare, but the content isn't displayed correctly, either because of these settings or because of the site's content itself.


Setup

When trying to add the feed https://videocardz.com/rss-feed, I'm unable to do so without additional options. I believe this is because the feed/website is placed behind Cloudflare. To get this to work I need to both 'Disable HTTP/2 to avoid fingerprinting' and 'Override Default User Agent':

General

  • Feed URL: https://videocardz.com/rss-feed

Network Settings

  • Override Default User Agent:
    Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.5938.132 Safari/537.36
  • Fetch original content
  • Disable HTTP/2 to avoid fingerprinting

Rules

  • Scraper Rules: #videocardz-article
  • Rewrite Rules: remove("div.socialbar")
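
To check what the server itself sends, the network settings above can be approximated outside the feed reader. This is a minimal sketch using only Python's standard library, not the reader's own fetch code. `build_feed_request` and `declared_charset` are hypothetical helper names. `urllib` only speaks HTTP/1.1, which roughly corresponds to the "Disable HTTP/2" option. Whether Cloudflare lets this request through is an assumption that has not been verified here.

```python
import urllib.request

FEED_URL = "https://videocardz.com/rss-feed"
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/117.0.5938.132 Safari/537.36"
)


def build_feed_request(url=FEED_URL, user_agent=USER_AGENT):
    """Build (but do not send) a request carrying the overridden User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def declared_charset(request, timeout=10.0):
    """Send the request and return the charset named in the Content-Type
    header, or None if the server does not declare one."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get_content_charset()
```

Calling `declared_charset(build_feed_request())` would show which encoding the server claims to use, which can then be compared with the bytes it actually returns.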

Problem

However, the content returned is not correctly displayed:

Screenshot 2024-10-10 125224

A non-breaking space becomes Â, another character becomes ’, and there are many more issues that seem related to UTF-8 decoding.
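
Stray Â characters are a typical symptom of UTF-8 bytes being decoded with a single-byte encoding somewhere along the way. The sketch below reproduces that effect in Python, assuming Windows-1252 as the wrong encoding. That encoding is a guess, and `mojibake` and `repair` are hypothetical helper names, so this illustrates the symptom rather than diagnosing the site or the reader.

```python
def mojibake(text, wrong_encoding="cp1252"):
    """Encode text as UTF-8, then decode the bytes with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled, wrong_encoding="cp1252"):
    """Undo mojibake() by reversing the two steps."""
    return garbled.encode(wrong_encoding).decode("utf-8")


# A non-breaking space (U+00A0) is the two bytes C2 A0 in UTF-8.
# Read as Windows-1252, those bytes are "Â" followed by a non-breaking space.
nbsp_garbled = mojibake("\u00a0")

# A right single quotation mark (U+2019) is E2 80 99 in UTF-8,
# which reads as three separate characters in Windows-1252.
quote_garbled = mojibake("\u2019")

print(repr(nbsp_garbled), repr(quote_garbled))
print(repair(quote_garbled) == "\u2019")
```

If the garbling happened exactly once, `repair` recovers the original text. Text that has been garbled twice, or with a different encoding, will not round-trip this way.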


Note

When searching for options that could help with decoding the content, I applied the rewrite rule base64_decode and it caused this error:

Database error: store: unable to create entry "https://videocardz.com/newz/intel-confirms-5th-gen-npu-for-panther-lake" (feed # 27): pq: invalid byte sequence for encoding "UTF8": 0xf5 0x39 0x3c 0x2f.
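
The byte sequence in that error can be checked directly. Byte 0xf5 never occurs in well-formed UTF-8, and the three bytes after it are plain ASCII. The sketch below only confirms that the reported sequence is invalid; it does not explain how the rewrite rule produced it. `is_valid_utf8` is a hypothetical helper name.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The four bytes quoted in the database error message.
REPORTED = bytes([0xF5, 0x39, 0x3C, 0x2F])

print(is_valid_utf8(REPORTED))       # False
print(REPORTED[1:].decode("ascii"))  # 9</
```

The trailing `</` looks like the start of an HTML closing tag, so the invalid byte sits right next to ordinary markup.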

@jvoisin
Contributor

jvoisin commented Nov 10, 2024

It looks like an encoding issue on their side to be honest.
