New ZIM request: NHS conditions #323
Comments
Excellent idea. It looks like a pretty straightforward design, have you tried it on zimit? |
Running it through youzim.it seems to do a great job 👏 (maybe just hitting the 1000 file limit). |
Recipe created |
File is ready at the library |
Same CSS fix should be applied as in #1138 |
Custom CSS created, recipe updated to publish to dev with this custom CSS and requested, let's see. |
@benoit74 I've just noticed with this ZIM (I'm testing for the first time, having been away) that none of the videos appear to work. See for example the Heart Attack video at bottom of this page: https://library.kiwix.org/viewer#nhs.uk-conditions_en_all_2024-09/www.nhs.uk/conditions/heart-attack/ . There are other examples such as the Menstrual Cycle video at the bottom of this page: https://library.kiwix.org/viewer#nhs.uk-conditions_en_all_2024-09/www.nhs.uk/conditions/periods/ . Clearly this is Zimit-related, and not specific to this ZIM, but I thought I should note it here. EDIT: I tested in library.kiwix.org and in the PWA. Videos don't play in either. |
For the record, you published this file to production on August 15, so you probably already tested it, or at least you should have. The fact that videos don't work is a known limitation of the scraper. Only YouTube videos are known to work in Zimit/Warc2zim, and this is not going to change in the coming months / years. Is it critical enough that we remove the ZIM from production? Or is the information present sufficiently valuable without videos? |
Hi @benoit74 I think you think you're replying to a different person! (I am not involved in publishing ZIMs.) The decision on whether it's critical is more for your team to decide, but personally I'd say it's not critical because there is a lot of textual information. I don't know whether the underlying video files have been scraped, but if they have, then it bloats the ZIM if they can't be accessed, and it might be an idea to exclude them. |
Sorry @Jaifroid, too soon in the morning, I was convinced it was Ravan speaking ^^ Your point regarding whether videos are bloating the ZIM is indeed a good one |
I confirm the ZIM is bloated with the first seconds of every video. Unfortunately I don't think we have sufficient tooling to exclude them from the ZIM, AFAIK we can do it only with openzim/zimit#353. I think it would be super cool if we could also replace or even watermark video posters in such situations so that we have something saying "videos not available in ZIM". I've opened openzim/warc2zim#396 to keep track of the idea. I've also opened openzim/warc2zim#397 for a "let's dream a bit" scenario. Regarding the current NHS conditions ZIM, and until these issues are solved, should we manually remove the useless items and publish it manually? It is work only a developer can do, but if we agree that we will not update the ZIM for the coming year, this might be worth it to avoid a big ZIM for nothing. |
I was going to ask how bloated is bloated, but considering that NHS conditions is 4.5GB and NHS medicine is 13.5MB, I suspect I have an answer. |
Do we agree this is a one-shot manual operation, and that I will not do it again for many months (i.e. the recipe will be disabled)? We have no tooling for this, so I will have to do it "by hand", which is quite time consuming. |
@benoit74 Personally (but I guess it's @Popolechien's call), I'd say it is not something you should have to do "by hand", but rather something that could wait till openzim/zimit#353 is ready and it can be done automatically. I don't think it's so urgent as to take up valuable time that could be spent on other things. Sorry if I'm speaking (writing) out of turn! JMHO. |
Ah no, I thought that your hands would be writing a handy script and voilà. Never mind, then. Let's wait for openzim/zimit/issues/353 as flagged by @Jaifroid |
Then we have to remove the file from production, right? If so, then please open a separate issue since the assignees are different. |
Yup. Opened #1163 |
Wait - what's the policy again here? Keep it open as it's not ready, or close it because the recipe exists? |
Never close unless we know we will never make the ZIM. Here we have good hopes to do the ZIM, so only flag it as upstream + bug. |