Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make CropTarget serializable #24

Closed
wants to merge 3 commits into from

Conversation

eladalon1983
Copy link
Member

@eladalon1983 eladalon1983 commented Mar 8, 2022

Fixes #21.


Preview | Diff

index.html Outdated Show resolved Hide resolved
Comment on lines +167 to +168
{{CropTarget}} objects are [= serializable objects =] [[!HTML]]. Their [= serialization
steps =], given <var>value</var> and <var>serialized</var>, are:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to accept forStorage here as well, and throw an error (not sure which one, DataCloneError maybe?) if it is true.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The introduction of these steps should omit mention of the forStorage argument if it is not relevant to the algorithm."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the use case for forStorage being set to true?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want this to be able to end up in IDB/pushState/some other things or not, essentially.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want it to be possible for this to be stringified and sent over a REST call, as I know some Web applications that can cooperate either in a single-tab or multi-tab setting, and they use REST (and REST-like) for as much as they can, since that's already set up and works for them. I think it's a valid use-case and there's no reason to forbid it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean through Fetch? Serializable doesn't help with that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If Serializable forStorage=true does not allow conversion to a string, could you please advise what does?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to convert a platform object to a string, you need to define a stringifier for it. That seems like a topic for a new issue though as #21 doesn't cover it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I've added a slot for [[Data]].
  • Indeed, stringification is a separate issue. If you happen to know the mechanism, I'd be happy to hear it. At any rate, we can do so as a follow-up.

index.html Outdated
steps =], given <var>value</var> and <var>serialized</var>, are:
</p>
<ol>
<li>Make <var>serialized</var> a randomly generated, globally unique {{DOMString}}.</li>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning a randomly generated value each time seems problematic. How does this correlate to the element?

Unless there's some shortcut I'm unaware of, my guess is we need an internal slot — ideally on the element itself, or less ideal in some weakmap<element, domstring> — to store this value, so that it can be correlated to the element, and the same value returned for this element on subsequent calls.

Since this intersects with HTMLElement, it might be good to secure review from someone familiar with the HTML spec. cc @annevk

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internal slot makes sense to me. Could you suggest the exact wording?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot set serialized to a value. You need to set serialized.[[Data]] or some such to a value. (Otherwise you also overwrite serialized.[[Type]] and break the caller. This should probably be documented better, maybe file an issue against whatwg/html?

I don't really have enough context here to provide further suggestions about what might need to be done.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added [[Data]].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jan-ivar, the current text allows the user agent to determine its own implementation. A mapping is one possibility.

@youennf
Copy link
Contributor

youennf commented Mar 17, 2022

Overall it seems fine.
I would go with forStorage=false as a first step.

Also, if serialising then deserialising, we should get an equivalent object (representing the same element) but not exactly the same object.
I wonder whether we could have CropTarget have an internal slot storing its element.
Then the serialisation and deserialisation steps would be about reading/setting the CropTarget internal slot.

@eladalon1983
Copy link
Member Author

Overall it seems fine. I would go with forStorage=false as a first step.

Also, if serialising then deserialising, we should get an equivalent object (representing the same element) but not exactly the same object. I wonder whether we could have CropTarget have an internal slot storing its element. Then the serialisation and deserialisation steps would be about reading/setting the CropTarget internal slot.

Done with an internal slot for the serialized object, but not for CropTarget, where that would be a different topic. PTAL at the PR.

@youennf
Copy link
Contributor

youennf commented Mar 21, 2022

Looking at the new PR, I think it is missing the forStorage handling, something like:

  • Add forStorage as parameter to the serialisation steps algorithm.
  • As first step of the algorithm, throw an exception if forStorage is set to true.

I also think it would be preferable to move away from using UUIDs, which is currently a half-defined mechanism anyway.
Instead, we could do the following:

  • A CropTarget has a weak reference to its element as a private slot, say cropTarget.[[Element]] (weak reference since we do not want to keep the element alive if crop targets are still in the air).
  • When serialisation happens, set serialised.[[cropTargetElement]] to cropTarget.[[Element]].
  • When deserialisation happens, create a CropTarget and set its [[Element]] to serialised.[[cropTargetElement]].

@eladalon1983
Copy link
Member Author

Looking at the new PR, I think it is missing the forStorage handling, something like:

  • Add forStorage as parameter to the serialisation steps algorithm.
  • As first step of the algorithm, throw an exception if forStorage is set to true.

On the PR itself, I have quoted this part:

The introduction of these steps should omit mention of the forStorage argument if it is not relevant to the algorithm.

It's taken from here.
I read it as "ignore this if it's not relevant." Seems irrelevant enough to me. Or why do you think it's important to specify throwing?

I also think it would be preferable to move away from using UUIDs, which is currently a half-defined mechanism anyway.

The PR does not mention UUIDs. (And that's intentional.)

Instead, we could do the following:

  • A CropTarget has a weak reference to its element

Please recall that the element is in another document. Potentially a cross-origin document.

@youennf
Copy link
Contributor

youennf commented Mar 23, 2022

I drafted a version of what I think could be done at youennf@e2cb285

Please recall that the element is in another document. Potentially a cross-origin document.

Internal slots are not exposed to JS so I think it is fine.

@jan-ivar
Copy link
Member

jan-ivar commented Mar 24, 2022

I drafted a version of what I think could be done at youennf@e2cb285

With #32 (comment) merged Youenn's drafted version LGTM, modulo s/Element/ElementHandle/

@eladalon1983 Do you agree? If so, let's get these PRs merged.

<ol>
<li>
If <var>serialized</var>.[[\Data]] was obtained by serializing previous
{{CropTarget}}, then set <var>value</var> to an equivalent {{CropTarget}}.
Copy link
Contributor

@youennf youennf Mar 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "equivalent {{CropTarget}}" is not very well defined.
We would be on solid ground by introducing in the spec a direct CropTarget<->Element relationship, through an element reference or a slot storing an element referrence.
{{CropTarget}}s would then be equivalent if referencing the same element.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment applies here too; only its second half is relevant, so I'll repeat:
It might be that using weak references is standard procedure and would be well understood by readers. If that is the case, then I do not wish to delay us further. Could you point out an example of a Transferable/Serializable object that stores a weak reference to an Element in the original context, and where the weak reference gets replicated in the new context?

If <var>serialized</var>.[[\Data]] was obtained by serializing previous
{{CropTarget}}, then set <var>value</var> to an equivalent {{CropTarget}}.
</li>
<li>Otherwise, set <var>value</var> to {{undefined}}.</li>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the serialisation steps are done correctly, we should always successfully deserialise, which does not seem to be the case here if we set value to undefined.
Is the intention to serialise a CropTarget and deserialise it into undefined?
Does this mean serialising a CropTarget from top level page A to top level page B then back to top level page A would not work?

Introducing a direct element <-> CropTarget link, for instance by having CropTarget having an element reference seems a good approach to tighten things.

Copy link
Member Author

@eladalon1983 eladalon1983 Mar 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The otherwise section intended to deal with deserialization of invalid input. However, since we don't have a (legal) way to obtain an invalid CropTarget, or an invalid serialization thereof, I'd gladly drop this line - if that is indeed what you're suggesting...?

@jan-ivar
Copy link
Member

Looking at the new PR, I think it is missing the forStorage handling, something like:

  • Add forStorage as parameter to the serialisation steps algorithm.
  • As first step of the algorithm, throw an exception if forStorage is set to true.

I agree with @youennf here.

On the PR itself, I have quoted this part:

The introduction of these steps should omit mention of the forStorage argument if it is not relevant to the algorithm.

It's taken from here. I read it as "ignore this if it's not relevant."

I think you're misinterpreting it. It says "The introduction of these steps should omit mention of the forStorage argument if it is not relevant to the algorithm." which seems to say to not list the argument in the argument list unless it is referenced in the algorithm. Since the this algorithm would throw on it, that is satisfied.

Cc-ing @annevk as the authority on what the intent as far as default behavior here is, if any.

Seems irrelevant enough to me. Or why do you think it's important to specify throwing?

We should narrow the scope of functionality to what's necessary for the use cases we're solving. If we don't need to implement storage, we shouldn't. The iterative process for adding functionality, is we hear use cases that demand it, and then discuss adding it.

It's easy to add more functionality later. What's hard is to remove it.

@eladalon1983
Copy link
Member Author

We should narrow the scope of functionality to what's necessary for the use cases we're solving. If we don't need to implement storage, we shouldn't. The iterative process for adding functionality, is we hear use cases that demand it, and then discuss adding it.

It's easy to add more functionality later. What's hard is to remove it.

I suggest we document lack of consensus here and revisit this in the context of conversion to/from string. As I've mentioned to you out-of-band, I am in contact with Web-developers who often communicate using a shared cloud infrastructure (e.g. REST) even when communicating between frames in a given tab. For them, stringification of CropTargets is desirable.

@youennf
Copy link
Contributor

youennf commented Mar 28, 2022

stringification of CropTargets is orthogonal to serialisation of CropTargets.
forStorage is used as an example for IDB storage, this will not help your REST developers.

@eladalon1983
Copy link
Member Author

stringification of CropTargets is orthogonal to serialisation of CropTargets. forStorage is used as an example for IDB storage, this will not help your REST developers.

If you're sure about that, then we can throw on forStorage as far as I'm concerned.

@eladalon1983
Copy link
Member Author

eladalon1983 commented Mar 28, 2022

Another problem I have with a slot that holds a "weak reference" to an Element, is that it becomes needlessly scary when considering later extensions of this API to cropping a track derived from capturing another tab. I don't think we should make it harder for ourselves to later extend this API. I want a formulation that makes it abundantly clear that the only thing an application could ever do with a CropTarget, is to use it to shave off pixels that it is already getting.

Clarification: In the future, I think it will be quite useful if an application running in another tab could mint a CropTarget for one of its Elements, stringify that CropTarget, then post this string over Capture Handle. A capturer could then convert the string back to a CropTarget and use it to shave off pixels. I believe this to be very safe (because we only shave off pixels). If we have a "weak reference" to an Element in the other tab, it's not so patently safe anymore...

@jan-ivar
Copy link
Member

@eladalon1983 We need:

  1. To reference an element within the same process
  2. This reference must not prevent GC of the element in that process
  3. We need to pass this reference to another process and back
  4. We don't need to dereference it in the other process

Since we won't dereference in other processes, I see nothing scary, unsafe, harder, permissive, or unclear.

A capturer could then convert the string back to a CropTarget and use it to shave off pixels.

That's one level up, and seems irrelevant. I think the interesting part is how "the user agent calculates the bounding box of the pixels belonging to the element, and crops the frame to the coordinates of this bounding box", which I would think has to happen in the capturee process where the element is, to track layout-movements of the element. Hence the "and back".

@eladalon1983
Copy link
Member Author

eladalon1983 commented Mar 29, 2022

We need to pass this reference to another process and back

Citation needed for "and back" - see below.

which I would think has to happen in the capturee process where the element is

The process where the element originates does not need to be consulted again. That might be a valid implementation, but it's not a necessary implementation. Allow me to tell you about Chrome's implementation. Chrome "tags" the Element in the rendering pipeline as soon as produceCropTarget() is called. This tagging travels along to the GPU process, which collates data from the many processes associated with the different frames. The process where the Element lives never finds out that it's being captured, let alone that an active capture is being cropped to an Element in said process.

To (re-)summarize, Chrome does not use a "weak reference to an Element in another process". That proves that implementations are possible that do not involve a weak reference back to the Element. Things are more patently safe when such references are not posted cross-process.

@jan-ivar
Copy link
Member

That proves that implementations are possible that do not involve a weak reference back to the Element.

I think this WG works off the idea that any implementation that is indistinguishable from a spec algorithm is a valid implementation.

Things are more patently safe when such references are not posted cross-process.

Can you explain the specific threat vector you worry about? A "reference" can be an id or a key/string for the user agent to look something up later in a different context, a "tag" if you will. It doesn't need to mean a memory pointer. Even if it were implemented that way, it'd 'd be a pointer that would be unusable in other processes without some lookup mechanism invented by the user agent, since neither the element itself or its environment or any other information about it leaves or gets copied to any other process. The only thing that leaves the process is whatever we specify in the serialization steps, which will probably be some id or key.

While it's good to give implementation advice where it matters¹, specs normally don't care about these details, and trusts user agents to be responsible. Specs typically care mostly about JS observable behaviors, like GC, i.e. the 3 properties I mentioned.

My view is @youennf's PR satisfies these, and that the prose you've provided in response does not, but please illuminate me, so we don't keep tripping up over details like this.


1. In WebRTC we use a [[KeyingMaterialHandle]] internal slot for the arguably much more sensitive information. If this is unsafe, this would be good to learn.

@eladalon1983
Copy link
Member Author

@jan-ivar, I am concerned that "weak reference" might not be a well understood term in the way it's used here. I am afraid readers would think "pointer" and be alarmed. If there is a history of using "weak reference" in the sense it's being used here, please do let me know. Otherwise, it seems prudent to keep to the beaten path here. All CropTarget aims to do is serve as input to cropTo(). I think we can find a formulation that highlights this fact.

@youennf
Copy link
Contributor

youennf commented Mar 29, 2022

Another problem I have with a slot that holds a "weak reference" to an Element, is that it becomes needlessly scary when considering later extensions of this API to cropping a track derived from capturing another tab.

@eladalon1983, with regards to #24 (comment), if you invoke security concerns, please provide as precise security assessments. I honestly do not understand your message, or how it weakens weak :0) references.

FWIW, the web is full of well dealt with relationships between objects from different origins, living in potentially different processes which have security implications. Security will be best ensured and assessed with precisely defined relationships, semantics and algorithms
This is the intent of this work, there is no intent in changing the behavior or outlawing a particular implementation.

If you're sure about that, then we can throw on forStorage as far as I'm concerned.

An important part of the standardisation is to cross check and make sure we are all on the same page after this work.
In other words, do not trust me and please cross-check ;o)

@eladalon1983
Copy link
Member Author

eladalon1983 commented Mar 29, 2022

FWIW, the web is full of well dealt with relationships between objects from different origins, living in potentially different processes which have security implications.

As mentioned, if you have examples of such "weak references" being employed elsewhere, then we can fall in with this pattern. Otherwise, I think a less controversial phrasing is preferable. Do you have such examples?

An important part of the standardisation is to cross check and make sure we are all on the same page after this work.
In other words, do not trust me and please cross-check ;o)

Apologies, I could have phrased things more directly. I am saying - I agree to throw on forStorage under our mutual understanding that this does not relate to stringification. (It is unfortunate that forStorage is not documented better. I am writing this 2022-03-29. Hopefully this is improved in the future.)

@youennf
Copy link
Contributor

youennf commented Mar 29, 2022

As mentioned, if you have examples of such "weak references" being employed elsewhere, then we can fall in with this pattern. Otherwise, I think a less controversial phrasing is preferable. Do you have such examples?

Given GC should not be observable, whether weak or not is not a hard requirement, it is a clarification.
If you do not like weak (I am not sure why), I am fine reverting to the original PR: no mention of weak but a GC note.

As of objects referencing other objects that are potentially in other process, this is an existing pattern that proved to work well, let's use it.

@jan-ivar
Copy link
Member

If there is a history of using "weak reference" in the sense it's being used here, please do let me know.

HTML uses "An OffscreenCanvas object may hold a weak reference to a placeholder canvas element,"

@jan-ivar
Copy link
Member

"Weak reference" is also the term the TAG uses in their design principles in § 5.3. Don’t expose garbage collection.

@eladalon1983
Copy link
Member Author

Given GC should not be observable, whether weak or not is not a hard requirement, it is a clarification.

We're in agreement here.

If you do not like weak

My concern is mostly with "reference".

I am fine reverting to the original PR: no mention of weak but a GC note.

To ensure we're discussing the same thing, could you please link me to the revision you're referring to?

@youennf: As of objects referencing other objects that are potentially in other process, this is an existing pattern that proved to work well, let's use it.

@jan-ivar: HTML uses "An OffscreenCanvas object may hold a weak reference to a placeholder canvas element,"

Thank you for the reference, @jan-ivar. Please allow me some time to dig into these more closely.

"Weak reference" is also the term the TAG uses in their design principles in § 5.3. Don’t expose garbage collection.

That example seems to refer to "weak reference" precisely in the interpretation that I want to avoid, is it not? "This means that you shouldn’t expose any API that acts as a weak reference, e.g. with a property that becomes null once garbage collection runs." It is this interpretation of "weak reference" that I worry would be confusing.

@eladalon1983
Copy link
Member Author

eladalon1983 commented Mar 29, 2022

Btw, I realize that CropTarget could be posted over from another tab using a BroadcastChannel (even with Storage Partitioning). Moreover, that tab could be closed before deserialization occurs. I wonder if we should then:

  • Fail deserialization?
  • Set the internal slot to a "weak reference" to an Element that no longer exists?
  • Set the internal slot to undefined?

What are your thoughts on this, @jan-ivar and @youennf?

(FWIW, this question would be trivially answered if we said that CropTarget is a UUID, or has a single internal slot that is a UUID. Or a unique string other than UUID, for that matter.)

@eladalon1983
Copy link
Member Author

As a compromise, and in the interest of progress, let's go with Youenn's proposed PR, which he'll upload shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CropTarget should be made serializable
4 participants