🐛 Support HTMLRewriter replacing HTML with content from a ReadableStream or Response #2758
Comments
Message from @dotjs from this morning:
That's how we landed on this issue on workerd. @jasnell @anonrig I'm not a kj expert, but I know you have a lot of experience. Do you understand what Andrew's message above means and how much work it would be for us to fix it in the runtime? Is that something you would be comfortable doing, or do we need someone else from the runtime team? As for priority, this has become a blocking issue for the wider Fractus rollout because the lack of this feature means we can't support streaming SSR, which is a big performance problem for us. |
It's been a while since I've dug through this code, would need some time to analyze it again. |
Could you take a peek please? We need to get unblocked soon (within the next few days), and if this path will take longer than that, we'll most likely reach for a less performant alternative that supports streaming, which in the end is going to be an overall perf boost for us even if the transformation itself is inefficient. Our current top alternative is something like https://trumpet.gofunky.fun/ which seems fine, but we'd have to turn on the nodejs_compat layer, which further complicates things but might be worth it if we get unblocked. |
We don't need full Rust/KJ stream or async integration here. We just need lol-html to support an API where we can incrementally provide chunks of bytes into the element body and it'll write those back out into the response stream. For now it's fine if lol-html even thinks the API is synchronous -- we'll continue tricking it with fibers just like we always have. |
That said, I think it's more than a few days of work. |
Yep, just going through the code a bit here and I'd estimate that it's probably about a week's worth of dev at a minimum. The existing fiber model should work well. I think the limitation is more on the lol-html side: as @kentonv suggests, there needs to be an API where lol-html will accept content incrementally, chunk by chunk, which is likely the most significant part of what's needed here. |
It's doable in a sync manner, but I can't say it's very straightforward:
/// A trait for byte chunk iterator that can be inserted as content in rewritable units mutations.
pub trait ChunkIterator {
    fn next(&mut self) -> Option<&[u8]>;
}

/// A trait for converting into [`ChunkIterator`].
pub trait IntoChunkIterator {
    type IntoChunkIter: ChunkIterator;
    fn into_chunk_iterator(self) -> Self::IntoChunkIter;
}
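To make the proposed traits concrete, here is a minimal sketch of how a caller might implement and consume them. The two traits are copied from the snippet above; `VecChunks` and `write_chunks` are hypothetical names invented for illustration and are not part of lol-html. The sketch assumes the chunks are already buffered in memory, whereas the runtime would presumably pull them from a stream behind a fiber.

```rust
use std::io::{self, Write};

/// Trait as proposed in the snippet above.
pub trait ChunkIterator {
    fn next(&mut self) -> Option<&[u8]>;
}

/// Trait as proposed in the snippet above.
pub trait IntoChunkIterator {
    type IntoChunkIter: ChunkIterator;
    fn into_chunk_iterator(self) -> Self::IntoChunkIter;
}

/// Hypothetical chunk source backed by chunks that are already in memory.
pub struct VecChunks {
    chunks: Vec<Vec<u8>>,
    pos: usize,
}

impl ChunkIterator for VecChunks {
    fn next(&mut self) -> Option<&[u8]> {
        // Returns None once every chunk has been handed out.
        let chunk = self.chunks.get(self.pos)?;
        self.pos += 1;
        Some(chunk.as_slice())
    }
}

impl IntoChunkIterator for Vec<Vec<u8>> {
    type IntoChunkIter = VecChunks;

    fn into_chunk_iterator(self) -> VecChunks {
        VecChunks { chunks: self, pos: 0 }
    }
}

/// Hypothetical consumer: synchronously drains every chunk into `sink`
/// and returns the number of bytes written.
pub fn write_chunks<I, W>(content: I, sink: &mut W) -> io::Result<usize>
where
    I: IntoChunkIterator,
    W: Write,
{
    let mut iter = content.into_chunk_iterator();
    let mut total = 0;
    while let Some(chunk) = iter.next() {
        sink.write_all(chunk)?;
        total += chunk.len();
    }
    Ok(total)
}
```

Because `next` returns a slice borrowed from the iterator itself, each chunk has to be consumed before the next one is requested, which is why the consumer uses a `while let` loop rather than the standard `Iterator` adapters.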
|
@inikulin do you have an idea about how much work this is? your plan is very detailed which gives me hope that you've done the hard part already, but would it be possible to implement this within the next few days or is that out of the question? thanks for your help |
@IgorMinar no work is done on that, I'm just giving the directions as the original author of the library. We're discussing with Andrew G who can carry this work as I'm currently occupied with other stuff. |
Apologies been a busy week - @kornelski should be able to start looking at this next week. |
@jasnell assuming that @kornelski will do the lol-html changes soon, would you be able to do the workerd integration this week please? We see escalations related to this issue so we'd like to get them resolved ASAP. |
This week for me is unlikely. I'll talk to @kflansburg to see who may be able to work on this this week. |
I'm looking at this now |
Thank you @kornelski, I'm working with @jasnell and @southpolesteve on staffing for the runtime part of this. |
@kornelski any luck? could you please post an update? thank you |
I need reviewers: cloudflare/lol-html#229 |
How should text encodings be supported?
|
@kornelski we expect only utf-8 encoding support at the moment. |
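One thing a UTF-8-only chunked API has to account for is that a multi-byte character may be split across two chunks. The following is a minimal sketch of one way to handle that, using only the Rust standard library. `Utf8Chunker` is a hypothetical helper invented for illustration; it does not describe how lol-html or workerd actually treat chunk boundaries.

```rust
use std::str;

/// Hypothetical helper that validates UTF-8 across chunk boundaries.
pub struct Utf8Chunker {
    /// Bytes of an incomplete trailing sequence carried over to the next chunk.
    pending: Vec<u8>,
}

impl Utf8Chunker {
    pub fn new() -> Self {
        Self { pending: Vec::new() }
    }

    /// Appends `chunk` and returns the longest valid UTF-8 prefix seen so far.
    /// An incomplete sequence at the end is kept for the next call;
    /// a genuinely invalid byte sequence is reported as an error.
    pub fn push(&mut self, chunk: &[u8]) -> Result<String, str::Utf8Error> {
        self.pending.extend_from_slice(chunk);
        match str::from_utf8(&self.pending) {
            Ok(s) => {
                let out = s.to_owned();
                self.pending.clear();
                Ok(out)
            }
            // `error_len() == None` means the input ended in the middle of a
            // character, so more bytes may still complete it.
            Err(e) if e.error_len().is_none() => {
                let valid = e.valid_up_to();
                let out = str::from_utf8(&self.pending[..valid])
                    .expect("prefix was reported as valid")
                    .to_owned();
                self.pending.drain(..valid);
                Ok(out)
            }
            Err(e) => Err(e),
        }
    }
}
```

A caller would still need to decide what to do if bytes remain pending when the stream ends, since that indicates a truncated character.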
You can try it out now: https://github.com/cloudflare/lol-html/tree/streaming-handlers-chunked |
Thank you @kornelski. @southpolesteve, have you decided who from the runtime team could help with the runtime bits? Is this something @anonrig could take a look at? Based on the previous discussion, it seems that the workerd part of this work is minimal, so my educated guess is that @anonrig should be able to tackle it quickly unless there are more surprises and we need more help from lol-html and @kornelski. |
This has been assigned to @npaun who is out today but will start looking tomorrow |
Currently, HTMLRewriter only supports replacing HTML content with strings:
workerd/src/workerd/api/html-rewriter.c++
Lines 693 to 697 in bb0f8a0
Being able to accept content via a ReadableStream or Response would be very useful for improving the latency of the response to the end user when (for example) embedding content from an upstream fetch into another HTML document.
This depends on first adding underlying support for this in https://github.com/cloudflare/lol-html which will then need to be exposed in the runtime.