Related Website Sets (formerly First-Party Sets) #342

Closed
3 of 5 tasks
mikewest opened this issue Feb 8, 2019 · 83 comments
Labels
privacy-tracker Group bringing to attention of Privacy, or tracked by the Privacy Group but not needing response. Provenance: Privacy Sandbox Resolution: unsatisfied The TAG does not feel the design meets required quality standards Topic: privacy Topic: security features

Comments

@mikewest

mikewest commented Feb 8, 2019

Guten TAG!

I'm requesting a TAG review of:

Further details (optional):

You should also know that y'all are marvelous.

We'd prefer the TAG provide feedback as:

  • open issues in our Github repo for each point of feedback
  • open a single issue in our Github repo for the entire review
  • leave review feedback as a comment in this issue and @-notify [github usernames]
@dbaron
Member

dbaron commented Feb 8, 2019

A few notes from reading through the explainer (which I haven't fully digested yet):

  • the rules that the browser not cache the set if there's any mismatch seems a little problematic if a site wants to evolve the set over time (e.g., add a host to it). If something depends on the set being present, they'd need to make sure to roll out the change across multiple hosts at exactly the same time, probably including dropping HTTP cache expiration times for a period leading up to the switchover. Is that avoidable? (There are some thoughts on this later in the "Incremental Verification" section... but I suspect there might be other options for rollout if that isn't viable.)
  • Why is a host that is a registrable domain disallowed? ("None of the origins specified is itself a registrable domain") (I'm guessing you have a good reason, but I don't see it, and it seems a little confusing to me.) (Though it seems to contradict the "The design above relies on origins. Shouldn't we evaluate registrable domains instead?" FAQ item.)
  • This does appear to pose some risk to users ("How will malicious actors abuse this mechanism?"), but it's not clear to me what the user benefit is over what happens today. It seems like the explainer should be clearer about that.
  • The abuse point of hopping between sets seems to be mentioned only in passing, but it seems pretty concerning.
mikewest added a commit to WICG/first-party-sets that referenced this issue Feb 8, 2019
@mikewest
Author

mikewest commented Feb 8, 2019

Thanks for the feedback, @dbaron! It might also be useful to get feedback from @englehardt and @ehsan, with whom I briefly discussed this proposal. I'd love to figure out if we can make this more robust together. :)

the rules that the browser not cache the set if there's any mismatch seems a little problematic if a site wants to evolve the set over time (e.g., add a host to it)

First, as is probably clear, the rules are somewhat up in the air. This first pass seems like a reasonable balance between deployment difficulty and stability, but we might well want to revisit some of the restrictions. The incremental verification suggestion you mention is one route that occurred to me, but I'm sure there are others. For example, the X-Bikeshed-This-Origin-Asserts-A-First-Party-Set could have a version number rather than a boolean, bypassing the cache expiration.

Second, I don't know how much we need this sort of thing to be trivial to change. The history of Mozilla's disconnect-entitylist.json shows that entities do indeed shift over time, but each individual entity seems relatively stable.

Why is a host that is a registrable domain disallowed?

That's a typo. :) Should have read "is itself a public suffix". Fixing it in WICG/first-party-sets@34adb31.

This does appear to pose some risk to users ("How will malicious actors abuse this mechanism?"), but it's not clear to me what the user benefit is over what happens today. It seems like the explainer should be clearer about that.

I'm hopeful that we can create non-proprietary and publicly auditable alternatives to the lists Apple, Google, and Mozilla are independently maintaining for various features. In the best case, something like this feature would give us the ability to offer entity-related features like credential sharing, and reduce the risk of rolling out tighter controls on cross-entity sharing.

The abuse point of hopping between sets seems to be mentioned only in passing, but it seems pretty concerning.

I think the current design fairly substantially mitigates this risk by making deployment bidirectional and atomic (e.g. the pain point you noted at the top). It seems to me that we can mitigate it more by locking sites into a given set for some period of time if we decide that the inherent difficulty is either unacceptable or not enough.
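
As a rough illustration of what "bidirectional and atomic" could mean in practice, the sketch below accepts a set only when every listed origin publishes an identical assertion. The function name `verify_set` and the shape of its input are assumptions made for this example; they are not taken from the explainer, and how a browser would fetch the assertions is out of scope.

```python
from typing import Dict, FrozenSet, Optional


def verify_set(assertions: Dict[str, FrozenSet[str]]) -> Optional[FrozenSet[str]]:
    """Return the agreed set if every member asserts the same membership, else None.

    `assertions` maps each origin to the set of origins it claims to belong
    with (a hypothetical input shape chosen for this sketch).
    """
    if not assertions:
        return None
    # Take the first origin's claim as the candidate set.
    candidate = next(iter(assertions.values()))
    for origin in candidate:
        # Bidirectional: every listed origin must publish its own assertion,
        # and atomic: that assertion must match the candidate exactly.
        if assertions.get(origin) != candidate:
            return None
    # An origin that asserts membership but is not listed is also a mismatch.
    if set(assertions) != set(candidate):
        return None
    return candidate
```

Under this reading, adding a host requires every existing member to update its assertion before the new set validates, which is the rollout pain point raised earlier in the thread.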

Thanks again!

@plinss plinss added this to the 2019-03-05-telcon milestone Feb 26, 2019
@dbaron
Member

dbaron commented Mar 5, 2019

I'm hopeful that we can create non-proprietary and publicly auditable alternatives to the lists Apple, Google, and Mozilla are independently maintaining for various features. In the best case, something like this feature would give us the ability to offer entity-related features like credential sharing, and reduce the risk of rolling out tighter controls on cross-entity sharing.

I think it would be useful to say something like that in the explainer.

@mikewest
Author

@plinss: Skimming the minutes, I don't think y'all got to this in the 05.03 meeting, and it looks like it fell off the radar for the 12.03 meeting. Perhaps there's an upcoming slot it could fit into?

@dbaron: Yes. I need to restructure the explainer a bit to improve the explanation of the problem I'm aiming to solve, as it grew out of a different document with a distinct purpose. I'll certainly take some time to do that (though I don't think it'll create substantive changes, and hopefully won't block y'all taking a closer look).

Thank you both!

@lknik
Member

lknik commented Apr 3, 2019

Hi Mike!

Hope you missed me. Lovely explainer. May I ask about a few bits below.

Still, it seems likely that folks will want to stretch the bounds of what first-party sets enables over time

Can you please elaborate why it's likely, and which folks specifically do you mean here? Not asking for all their names and addresses, of course.

Tying those two domains together in the same first-party set could increase the risk of credential leakage, if browsers aren't careful about how they expose the credential sharing behavior discussed above

Any other risks that you can imagine (apart from the stuff listed later in the explainer)? Aside from the Ordinary User not knowing about the existence of first/third party stuff, would it make sense to require browser UI changes to indicate that some site is linked with another?

It would be fatal to the design if https://subdomain1.advertiser.example/ could live in one first-party set while https://subdomain2.advertiser.example/ could live in another

That looks unfortunate indeed. Good the explainer is listing plenty of concerns.

Given this reality, we need to add a registrable domain constraint to the design above such that each registrable domain may live in one and only one first-party set.

Would there be a way to deregister from the set, and e.g. change sets in quick time intervals, or something like that? I'm simply wondering if site1 can easily change its membership (rather than: being member of two separate sets at the same time, which is already marked as concern). Apart from the natural expiration of 7 days you speak of, unless it could be the same.

We can mitigate this risk to some extent by limiting the maximum number of registrable domains that can live together in a first-party set, rejecting sets that exceed this number

How would the risk after such mitigation compare with today's risk of making the same? Would you imagine it conceivable that advertisers will start serving their stuff from XXXYYYZZZ.ccTLD, and smartly game the number-limited system? (but: "Forget the entity" looks good).

As the declaration is public by nature, the style of abuse noted here will be trivially obvious to observers, which creates exciting opportunities for out-of-band intervention

Sounds like an opportunity for a new batch of research papers? I'm sure many will be happy ;-)

@torgo
Member

torgo commented Apr 17, 2019

@mikewest we discussed at today's call having a focused discussion on this one at our next f2f - week of 20th of May. Is that going to be too late to be useful for you? If not, would you like to dial in for that (we will be in ~UTC).

@mikewest
Author

Thanks, @torgo!

we discussed at today's call having a focused discussion on this one at our next f2f - week of 20th of May. Is that going to be too late to be useful for you?

Sure! Chrome will likely have begun implementation by then, but y'all's feedback would be quite welcome as we work through the initial stages.

If not, would you like to dial in for that (we will be in ~UTC).

I can probably make time to chat with y'all; that week looks pretty open. Let me know when you're closer to scheduling something?

@mikewest
Author

Thanks, @lknik! Sorry I missed your feedback when you first provided it.

Still, it seems likely that folks will want to stretch the bounds of what first-party sets enables over time

Can you please elaborate why it's likely, and which folks specifically do you mean here? Not asking for all their names and addresses, of course.

The example I linked in the document (https://lists.w3.org/Archives/Public/public-webappsec/2017Mar/0034.html) came to mind as an existence proof of folks with interesting ideas about loosening the same-origin policy based on affiliation.

Any other risks that you can imagine (apart from the stuff listed later in the explainer)? Aside from the Ordinary User not knowing about the existence of first/third party stuff, would it make sense to require browser UI changes to indicate that some site is linked with another?

The document lists the risks I've thought about. If I come up with more, I'll add them. :)

I don't personally think there's any value in exposing the relationship between A and B to users directly via browser UI, but I'm not at all a UI guy. I'd expect folks like @estark37 to have strong, well-informed opinions on these topics, and I'd defer to them completely. That said, however Chrome comes down on that question, I don't think it makes sense to specify UI in this kind of document.

Would there be a way to deregister from the set, and e.g. change sets in quick time intervals, or something like that? I'm simply wondering if site1 can easily change its membership (rather than: being member of two separate sets at the same time, which is already marked as concern). Apart from the natural expiration of 7 days you speak of, unless it could be the same.

I don't think it would be helpful to create a way to deregister oneself in an accelerated fashion without also taking some catastrophic action against the data that's been built up given the existing first-party relationships. I could imagine the Clear-Site-Data: "*" mechanism being draconian enough to enable this, for example.

How would the risk after such mitigation compare with today's risk of making the same? Would you imagine it conceivable that advertisers will start serving their stuff from XXXYYYZZZ.ccTLD, and smartly game the number-limited system? (but: "Forget the entity" looks good).

How would this "game the system"? The risk mitigated by limiting the size of a set is the incentive that would otherwise exist to create a single global set of all of an advertiser's otherwise unrelated publishers (e.g. doubleclick.net + cnn.com + sz.de + vox.net + ∞). Allowing an advertiser (or anyone else) to bind their matching ccTLDs together seems different in kind from that scenario.
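
The two constraints under discussion here (a cap on set size, and each registrable domain living in at most one set) can be sketched roughly as follows. This is a minimal illustration under stated assumptions: `accept_sets` is a hypothetical helper, its inputs are assumed to already be registrable domains (no Public Suffix List lookup is performed), and `max_size` is left as a required parameter because the thread does not fix a number at this point.

```python
from typing import Dict, Iterable, List


def accept_sets(candidate_sets: List[Iterable[str]], max_size: int) -> Dict[str, int]:
    """Accept candidate sets in order and return domain -> index of its set.

    A set is rejected outright if it exceeds `max_size` or if any of its
    registrable domains already belongs to a previously accepted set.
    """
    owner: Dict[str, int] = {}
    for index, members in enumerate(candidate_sets):
        domains = set(members)
        # Size cap: removes the incentive to declare one enormous global set.
        if len(domains) > max_size:
            continue
        # One set per registrable domain: no overlap with accepted sets.
        if any(domain in owner for domain in domains):
            continue
        for domain in domains:
            owner[domain] = index
    return owner
```

For example, with `max_size=2`, the candidates `[["a.example", "b.example"], ["b.example", "c.example"]]` yield a mapping for `a.example` and `b.example` only, because the second set reuses `b.example`.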

Sounds like an opportunity for a new batch of research papers? I'm sure many will be happy ;-)

I agree! Mechanisms that encourage transparency are good.

@mikewest
Author

mikewest commented May 8, 2019

Regarding use cases, I'd like to draw your attention to https://mikewest.github.io/cookie-samesite-firstparty/draft-west-cookie-samesite-firstparty.html (http://tools.ietf.org/html/draft-west-cookie-samesite-firstparty if you prefer "paginated" text), which builds upon the primitive described here in a way that might allow us to avoid some developer pain points while tightening cookie controls over time.

@lknik
Member

lknik commented May 21, 2019

@mikewest Thanks for the answer (We're discussing at f2f Reykjavik).

@dbaron
Member

dbaron commented Sep 11, 2019

Given that the repo is now at https://github.com/krgovind/first-party-sets, feels like ccing @krgovind might be useful.

@torgo
Member

torgo commented Sep 11, 2019

@mikewest just picking this up again, I think we are stalled and this topic has gone into our "abyss." Can you let us know the status and (most usefully) if there are specific questions where the TAG might weigh in. Does it make sense to discuss this at TPAC?

@torgo torgo added this to the 2023-02-13-week milestone Feb 12, 2023
@torgo
Member

torgo commented Feb 14, 2023

Thank you for the update and the detailed explanation of the changes. Just to let you know: we discussed this in our virtual f2f last week and we agreed we like the direction this is going in. We're still formulating some questions to ask - and the same regarding requestStorageAccessForOrigin.

@torgo torgo modified the milestones: 2023-02-13-week, 2023-04-03-week, 2023-04-10-week Mar 23, 2023
@slightlyoff
Member

There's now an I2S out for this; was the TAG able to discuss?

https://groups.google.com/a/chromium.org/g/blink-dev/c/7_6JDIfE1as/m/wModmpcaAgAJ

@torgo
Member

torgo commented Apr 11, 2023

@slightlyoff right now the focus is more on requestStorageAccessFor. Do you have any thoughts you can share there?

@hadleybeeman
Member

Hi all. We are looking at this at our TAG f2f. We're noting a pause in the I2S; can you tell us more context about this? How might any consequences impact the design of FPS, especially any of the aspects we've discussed in this review?

@krgovind

krgovind commented Aug 2, 2023

@hadleybeeman Thank you for checking!

We announced a pause in our rollout of the feature in Chrome because we discovered a regression in some of our metrics. Our analysis suggests that the regression may be due to the fact that our implementation currently blocks all network requests on the completion of the set change handling algorithm described here. The algorithm clears out all site data for any site that is leaving an existing set, so we had previously chosen the more conservative route of blocking all network requests until this operation was complete, in order to avoid being in a state where a site that is about to have its data cleared receives requests before the clearing is complete. We are currently waiting for our tentative fix (optimizing this implementation) to propagate, and hope to validate it in the next couple of weeks.

With regards to how that impacts this review; your question made us realize that we had previously neglected to specify this implementation detail in the algorithm, so we've now written up a PR at WICG/first-party-sets/pull/169 that describes this behavior, along with a note recommending that user agents optimize for request latency by being smarter about what requests/fetches to block.
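
To make the trade-off concrete, here is a simplified sketch of the behavior described above: work out which sites are leaving a set, and hold back only the requests that touch those sites rather than all network traffic. The names `sites_to_clear` and `may_start_request`, and the site-to-set-identifier mapping, are assumptions for illustration. This is one possible reading of "being smarter about what requests to block", not the algorithm in the specification or in Chromium.

```python
from typing import Dict, Set


def sites_to_clear(old: Dict[str, str], new: Dict[str, str]) -> Set[str]:
    """Sites that were in a set before and are no longer in that same set.

    `old` and `new` map a site to an identifier for its set (a hypothetical
    representation chosen for this sketch).
    """
    return {site for site, set_id in old.items() if new.get(site) != set_id}


def may_start_request(site: str, pending_clear: Set[str]) -> bool:
    """Allow a request unless its site is still waiting for data clearing."""
    return site not in pending_clear
```

A usage example: if `old` is `{"a.example": "s1", "b.example": "s1"}` and `new` is `{"a.example": "s1"}`, only `b.example` is pending, so requests to `a.example` can proceed immediately while those to `b.example` wait until clearing completes.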

@krgovind

krgovind commented Oct 2, 2023

Hi TAG!

We wanted to provide you with some updates:

  • We are renaming "First-Party Sets" to "Related Website Sets", which we think better reflects its purpose. We have begun to gradually update relevant artifacts.
  • We have increased the "Associated Subset" domain limit slightly from three to five, in response to industry feedback. More on this choice here.
  • The feature has now been rolled out to 100% of Chrome stable users. As I mentioned in my last comment, we had previously paused the rollout due to some regressions caused by our implementation to prevent privacy leaks on list changes. We have now resolved that issue, but have a remaining race condition in the Chromium implementation which occurs exceptionally rarely in practice: in those rare scenarios, if a URL is on the list of sites whose data needs to be cleared due to a set membership change and that URL is loaded during startup, those loads might complete before we've cleared data. We are tracking this issue here.
@plinss plinss closed this as completed Feb 27, 2024
@plinss plinss added Resolution: unsatisfied The TAG does not feel the design meets required quality standards and removed Progress: unreviewed labels Feb 27, 2024
@plinss plinss reopened this Feb 27, 2024
@plinss plinss removed the Resolution: unsatisfied The TAG does not feel the design meets required quality standards label Feb 27, 2024
@plinss
Member

plinss commented Feb 27, 2024

We recognize that there are some substantial changes to the API since our original review.

We acknowledge that our use of the term origin in this feedback may have led to confusion, and that we should have used the term site in this context. However, the fundamental points raised in that original review stand.

The primary change here is a reliance on the Storage Access API. However, the decision to automatically grant requests has the same net effect as in the original proposal. We do not think that labeling an auto-grant decision as "implementation-defined" is acceptable, as that defines how the web is experienced by websites and users.

The argument has been made that other browsers ship heuristics that do much the same thing. Our argument has been and continues to be that these heuristics are essentially a workaround and that we should not be building new technologies into the platform that cement this way of working. Furthermore, we observe that many browsers are working to eliminate the need for these heuristics.

We are therefore re-closing this review.

@plinss plinss closed this as completed Feb 27, 2024
@plinss plinss added the Resolution: unsatisfied The TAG does not feel the design meets required quality standards label Feb 27, 2024
@martinthomson martinthomson changed the title First-Party Sets Feb 28, 2024