Multiple problems in scraping of multimedia content #890
Comments
Many things are broken because of the
Just merged a few pull requests and things are looking much better :)
@ISNIT0 We need automated tests for this multimedia scraping. I have lost count of the tickets I have opened in the past for multimedia content not being mirrored properly, and I had to open another one a week ago. I don't want to open new ones in the future. This has to be secured. BTW, I'm quite sure there is a way to inject wikicode into the Parsoid/MSC API and get the HTML back. So the automated tests should use that instead of starting directly from HTML (which offers no guarantee that this is the kind of HTML that MediaWiki still delivers).
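A minimal sketch of the test approach suggested in this comment, under stated assumptions: it assumes the wikitext-to-HTML transform endpoint of the Wikimedia REST API is the kind of API meant here, and the helper names `build_transform_request` and `find_media_sources` are hypothetical, not part of MWoffliner. It only shows how a test could start from wikicode and then list the media URLs found in the returned HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the Wikimedia REST API transform endpoint is the API to use.
TRANSFORM_URL = "https://en.wikipedia.org/api/rest_v1/transform/wikitext/to/html"

# Tags whose "src" attribute points at a media file.
MEDIA_TAGS = {"img", "audio", "video", "source", "track"}


def build_transform_request(wikitext, url=TRANSFORM_URL):
    """Build (but do not send) a POST request that submits wikitext."""
    body = urlencode({"wikitext": wikitext}).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


class _MediaCollector(HTMLParser):
    """Collect (tag, src) pairs for media elements, in document order."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        src = dict(attrs).get("src")
        if tag in MEDIA_TAGS and src:
            self.found.append((tag, src))


def find_media_sources(html):
    """Return every media URL referenced by the given HTML."""
    collector = _MediaCollector()
    collector.feed(html)
    collector.close()
    return collector.found
```

To run this against the live service, the request would be passed to `urllib.request.urlopen` and the decoded response body to `find_media_sources`; a test could then assert that each of those files is present in the scraped output.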
Testing this is not in scope for 1.9 or 2.0.
I will take a detailed look at this ticket to see if it works now.
This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.
I have created a small test page covering multiple types of content and ways to include them; everything on it is standard. It is available at https://en.m.wikipedia.org/wiki/User:Kelson/MWoffliner_CI_reference.
I have scraped it with 1.9.4 and the result was a bit disappointing. We have many problems here, most of them being that the content is simply not made available. I think such a page should be tested properly to make sure that we no longer have big problems with displaying multimedia content.