-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Enhancement] Remove tubeup host IP address from info.json prior to IA upload #192
Comments
I like this solution, said a variant of it a while back, just drop that element. We don't need it even for diagnosing problems on our end. YT-DLP would need that though, but we don't. |
also I think YTDL guys did the filename thing to try to see if filesystem limits would trigger a failed rip. So like "()" in NTFS are illegal, as are raw Twitter video dumps because the extractor puts the content of the tweet into the filename which can break on both NTFS and EXT4 (this is because of filename length). You'd have to filter output to 'extractor-videoID'. Anyway, we don't need that diagnostic information. I've always thought dropping those elements in the JSON and keeping the rest was good. Thanks, Brandon. |
We've discussed this with Pukaden and theres no satisfactory way to sanitize IPs from the JSON without negativly affecting the video description or other false flags. Yes there is a secondary copy of the description but I still don't like it's touching the JSON that messes with IAs bots that fix items. I move to close as not feasibly fixable @brandongalbraith what do you think? |
@vxbinaca Let me spike this out next week and if I don't get any traction in short order, we'll close it out as WONTFIX. |
My apologies to all, I have not had the bandwidth to work on this. Please consider this comment a Request for Contributions; contributors can propose a fiat amount for the contribution and I will consider reasonable offers. |
we have lives it's fine |
This reverts commit 0911477. urllib3 is needed to postprocess/parse/URL for sanitization and privacy purpose (bibanon#192)
* Revert "Remove urllib3 dependency" This reverts commit 0911477. urllib3 is needed to postprocess/parse/URL for sanitization and privacy purpose (#192) * IA currently leaks the IP address of the submitter. This is bad. We fix this by carefully redacting the IP address in the JSON fields known to contain it. * added tests * flake8 linting
Previous PR on the topic: #146
Had a bit of an epiphany on this topic while working on another problem; we shouldn't attempt to pattern match for an IP. We should wholesale parse the video URL in info.json and then remove the
ip
query param in theurl
key that contains the temporary video URL generated.Stackoverflow link on the topic with hints on code for the solution for my notes: https://stackoverflow.com/questions/7734569/how-do-i-remove-a-query-string-from-url-using-python
The text was updated successfully, but these errors were encountered: