General approach to capability negotiation #176
I'm thrilled the fingerprinting analysis is good. This section of the explainer lays out the design philosophy for the current API shape. It mentions your idea as well as a potential follow-up that could work in tandem with this design (so far, not something we pursued). There are some cases where having the browser pick works well. For instance, having the browser say which format it prefers is sometimes still compatible with the newer APIs: with MSE as used by sites like YouTube, this could work fine. But with EME and WebRTC, it's more complicated. For EME, a site like Netflix may balance the most performant stream configuration against the most secure stream configuration. When these are aligned, the choice is easy. But they are not always aligned, and the site is in a better position to break a tie. With WebRTC, you may have a preferred format, but you have to participate in a negotiation with your peers to arrive at a format that everyone supports. An API that can tell you info about each possible format is better suited to populating the format negotiation ladder. |
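To make the EME tie-breaking point concrete, here is a minimal sketch of a site-side policy. It assumes the site has already gathered, for each candidate stream, a result shaped like what `decodingInfo()` reports (`supported`, `smooth`, `powerEfficient`). The `securityLevel` ranking, the `Policy` type, the `pickStream` helper, and the stream names are all hypothetical and only illustrate why per-format information lets the site break the tie.

```typescript
// Fields supported/smooth/powerEfficient mirror the MediaCapabilities result.
// securityLevel is a hypothetical site-side ranking (higher = more robust).
interface Candidate {
  name: string;
  supported: boolean;
  smooth: boolean;
  powerEfficient: boolean;
  securityLevel: number;
}

// Hypothetical site policy used only when security and performance disagree.
type Policy = "security-first" | "performance-first";

// Simple performance score: smooth playback outweighs power efficiency.
function perfScore(c: Candidate): number {
  return (c.smooth ? 2 : 0) + (c.powerEfficient ? 1 : 0);
}

// Drop unsupported candidates, then sort by the primary criterion and use
// the other criterion as the tie-breaker.
function pickStream(candidates: Candidate[], policy: Policy): Candidate | undefined {
  const usable = candidates.filter((c) => c.supported);
  const sorted = usable.slice().sort((a, b) => {
    const bySecurity = b.securityLevel - a.securityLevel;
    const byPerf = perfScore(b) - perfScore(a);
    return policy === "security-first" ? bySecurity || byPerf : byPerf || bySecurity;
  });
  return sorted[0];
}

// Made-up candidates: the most secure supported stream is not the smoothest.
const exampleCandidates: Candidate[] = [
  { name: "hw-secure-1080p", supported: true, smooth: false, powerEfficient: true, securityLevel: 3 },
  { name: "sw-secure-1080p", supported: true, smooth: true, powerEfficient: true, securityLevel: 1 },
  { name: "hw-secure-4k", supported: false, smooth: false, powerEfficient: false, securityLevel: 3 },
];

console.log(pickStream(exampleCandidates, "security-first")?.name);
console.log(pickStream(exampleCandidates, "performance-first")?.name);
```

With these made-up inputs the two policies select different streams, which is the misaligned case described in the comment above.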
@chcunningham, Thank you!
This isn't quite as detailed as I had hoped for. As you say, it mentions the possibility of the UA picking, but it says little about why that path isn't being chosen.
If "we're following the example" is the argument, then I'd like to push back. I'm not convinced that these others got it right, and I'd like to take a fresh look here.
The user might also have an opinion in this case; e.g., the user might have low power availability and might prefer the lower-power choice. And as you point out, there may be misalignment. Given that, I would argue that the UA - as the "user's agent" - is in the better place to break the tie, not the site.
The more-than-two-party case (of WebRTC) seems different than the two-party case, and my understanding was that this API is for two-party cases, right? In that case, having the site provide information about what it supports - rather than the UA supply that information - would seem to provide sufficient completeness, right? |
The fresh look is welcome, but I think the proposed design is not feasible at this time. The MediaCapabilities API is widely implemented and used. We have an opportunity to make additions, improvements, refinements, etc... but we cannot make a breaking change of this magnitude.
Sites may offer users this choice while also factoring in their secret sauce for whatever they think makes the best user experience.
For WebRTC usage, this API is not limited to two parties. The API can describe the send and receive capabilities of the local machine. The app could then exchange this information with the N parties in a conference call setup as part of format negotiation. |
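As a rough illustration of the N-party case, the sketch below assumes each participant has already queried its own send and receive capabilities and shared them with the app. The `PeerCapabilities` shape and the `negotiate` helper are hypothetical; real WebRTC negotiation happens through SDP offer/answer, and this only shows the set logic of finding a format the sender can encode and every receiver can decode.

```typescript
// Hypothetical summary of what one participant reported to the app.
interface PeerCapabilities {
  peer: string;
  canSend: string[];    // formats this peer can encode, in its preference order
  canReceive: string[]; // formats this peer can decode
}

// Return the sender's most preferred format that every other peer can decode,
// or undefined if the sender is unknown or no common format exists.
function negotiate(sender: string, peers: PeerCapabilities[]): string | undefined {
  const source = peers.filter((p) => p.peer === sender)[0];
  if (!source) return undefined;
  const receivers = peers.filter((p) => p.peer !== sender);
  const common = source.canSend.filter((fmt) =>
    receivers.every((r) => r.canReceive.indexOf(fmt) !== -1)
  );
  return common[0];
}

// Made-up three-party call.
const call: PeerCapabilities[] = [
  { peer: "alice", canSend: ["av1", "vp9", "h264"], canReceive: ["av1", "vp9", "h264"] },
  { peer: "bob", canSend: ["vp9", "h264"], canReceive: ["vp9", "h264"] },
  { peer: "carol", canSend: ["h264"], canReceive: ["h264", "av1"] },
];

console.log(negotiate("alice", call));
```

In this example alice prefers `av1`, but bob cannot decode it and carol cannot decode `vp9`, so the only format that works for everyone is `h264`.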
Please correct me if I'm wrong, but isn't this the first time the WG has sought the Privacy IG's review of this spec? |
You are correct, this is the first time review has been requested. I accept responsibility for the delay in making the request. This spec was my first time navigating the W3C process. My aim in the comment about "feasibility" is to provide important background. I'm happy to continue discussion on the merits of various designs. |
Just to second @samuelweiler (twice): I think the substance of Sam's issue is important, given that for some users the values here will be highly identifying for browser fingerprinting. If the approach Sam is suggesting isn't workable, then other fingerprinting protections are needed in the spec. I appreciate and agree with Sam that the text discussing fingerprinting issues is great, but the spec also needs normative protections against the fingerprinting risk. I think the process points I read in Sam's comment are important too. The purpose of reviews is to identify privacy risks in specs, and make sure they're addressed before things move to REC. Doubly so when the spec touches on topics called out as needing extra care by the TAG Design Principles. I see Sam identifying a place where the current spec doesn't seem to follow the least-power principle the TAG suggests (or align with the fingerprinting risks PING is generally concerned with). @chcunningham are you saying that the WG isn't interested in moving the spec in a direction more in line with TAG guidance (and reducing fingerprinting risk)? Or that a capability negotiation approach (or something else more in line with the TAG principles) sounds good, but would need to be achieved in a different way than has been discussed in the thread so far? Or that it's simply too late to make any significant changes (in this respect) at all? |
No. I am happy to make changes that reduce fingerprinting risk. I think being transparent about feasibility of proposed changes is essential to having a good faith conversation about making improvements.
Feasibility aside, I do not think the capability approach sounds good. I gave a few examples of issues in my earlier comments. IMO those examples demonstrate that the current API does align with the least power principle (more power was needed). |
Can you elaborate on this? I'd like to explore other mitigations. |
@samuelweiler wrote:
At least in our system (Netflix), and I imagine in others, the available media formats vary significantly by title and require a small amount of work server-side to compute. In our case, also, computing the CDN URLs for the various streams involves a larger amount of server-side work. At the moment we do these tasks in a single network request. These tasks can be done speculatively to a certain extent (when there are signals as to which title might be presented next), but we would not want to waste resources on the CDN calculations for stream formats that are not supported by the device. If they cannot be done speculatively, then doing them in a single request is desirable from a responsiveness point of view, rather than one request to get the available formats and another to get the URLs and other metadata for the chosen format. If I understand correctly, an API that allows the browser to choose a format from a provided list exposes all the same information about device capabilities, since it could be called repeatedly with different lists, so the privacy advantage of that approach is only that those requests could be rate limited and abuse might then be easier to detect. A site like Netflix, though, would need to call this frequently at first as we drive speculative preparation for titles visible in the gallery, for example, so heavy throttling could have a user experience impact and differentiating between normal usage and abuse may not be so easy anyway. It should always be possible for privacy-sensitive browsers to monitor whether sites request capability information (in general, not just this API) and then do not go on to use the capabilities detected. Browsers can also choose to advertise only a common baseline capability and offer users the choice to expose more information only when a site actually uses that capability. |
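The claim that a "browser picks from a list" API reveals the same information when called repeatedly can be sketched as follows. Everything here is hypothetical: `Picker` stands in for an imagined pick-one API, `makePicker` simulates a browser with a given support list, `probe` is what a fingerprinting script might do, and `rateLimited` models the throttling mentioned above. None of these names come from the Media Capabilities spec.

```typescript
// Hypothetical "browser picks" API: returns the chosen format, or null if
// none of the offered formats is supported.
type Picker = (offered: string[]) => string | null;

// Simulated browser: picks its most preferred supported format among those offered.
function makePicker(supportedInPreferenceOrder: string[]): Picker {
  return (offered) => {
    const match = supportedInPreferenceOrder.filter((f) => offered.indexOf(f) !== -1);
    return match.length > 0 ? match[0] : null;
  };
}

// Probing: offering one format at a time recovers the full support set.
function probe(pick: Picker, universe: string[]): string[] {
  return universe.filter((f) => pick([f]) !== null);
}

// Throttling wrapper: throws once the call budget is exhausted.
function rateLimited(pick: Picker, maxCalls: number): Picker {
  let calls = 0;
  return (offered) => {
    if (calls >= maxCalls) throw new Error("rate limit exceeded");
    calls += 1;
    return pick(offered);
  };
}

const browser = makePicker(["av1", "h264"]);
console.log(probe(browser, ["h264", "vp9", "av1", "hevc"]));
```

A budget imposed through `rateLimited` would stop this probe after a few calls, but the same budget would also constrain a site that legitimately asks about many titles in quick succession, which is the trade-off raised in the comment.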
Discussed in Media WG meeting 12 December 2023 (minutes). Next step: update our privacy considerations. |
Sorry, I missed the discussion last month. Happy to help draft text for the streaming case, based on the note above. I can prepare a PR if no one else is doing it. |
I thank the editors for what appears to be an excellent fingerprinting analysis. This is exactly the sort of thing I'm looking for in specs.
As a general thing, why are we exposing device capabilities to the app for purposes of negotiation? Couldn't we instead have sites expose available media formats and have browsers (perhaps in a way not exposed to the application) pick the one they like best? That way a browser wishing to be more privacy preserving could simply make a consistent choice, without having to fake an answer to this API, as recommended in https://w3c.github.io/media-capabilities/#decoding-encoding-fingerprinting.
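A minimal sketch of the alternative proposed here, under stated assumptions: the site offers a list of formats and the user agent picks one. The `UserAgentModel` shape, the `BASELINE` list, and `chooseFormat` are hypothetical names used only to show how a privacy-preserving browser could make the same choice regardless of the underlying hardware.

```typescript
// Hypothetical model of a user agent's decode support and privacy setting.
interface UserAgentModel {
  supported: string[];  // formats the device can decode, in preference order
  privacyMode: boolean; // when true, only baseline formats are ever chosen
}

// Hypothetical common baseline that every participating browser would accept.
const BASELINE: string[] = ["h264"];

// The UA picks its most preferred format among those the site offers.
// In privacy mode the candidate set is restricted to the baseline, so the
// choice does not depend on device-specific capabilities.
function chooseFormat(ua: UserAgentModel, offered: string[]): string | undefined {
  const allowed = ua.privacyMode
    ? ua.supported.filter((f) => BASELINE.indexOf(f) !== -1)
    : ua.supported;
  return allowed.filter((f) => offered.indexOf(f) !== -1)[0];
}

const siteOffer = ["av1", "vp9", "h264"];
const highEnd: UserAgentModel = { supported: ["av1", "vp9", "h264"], privacyMode: true };
const lowEnd: UserAgentModel = { supported: ["h264"], privacyMode: true };

console.log(chooseFormat(highEnd, siteOffer), chooseFormat(lowEnd, siteOffer));
```

In this sketch both devices make the same choice in privacy mode, so the site cannot tell them apart from the selection alone; with `privacyMode` off, the high-end device would pick its preferred format instead.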