Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase the maximum number of dynamic rules to 30,000 #319

Open
105th opened this issue Nov 10, 2022 · 20 comments
Open

Increase the maximum number of dynamic rules to 30,000 #319

105th opened this issue Nov 10, 2022 · 20 comments
Labels
implemented: chrome Implemented in Chrome implemented: safari Implemented in Safari supportive: firefox Supportive from Firefox topic: dnr Related to declarativeNetRequest

Comments

@105th
Copy link

105th commented Nov 10, 2022

In the case of ad blockers, dynamic rules can serve several purposes:

  1. Differential updates for the static rulesets.
  2. Allowing users to add custom filter lists which are not included into the extension packages.
  3. Applying custom rules made by the users themselves.

Differential updates

The limit of 5,000 is simply not enough for this purpose. The problem is pretty much the same as with the current “disableRules” limit. Here are some graphs that show how many rules are being added to popular filter lists that are used by AdGuard users.

Number of added network rules to AdGuard popular global lists

Raw data

On average about 2600+ new rules are added every week to the most popular filter lists. Note, that this does not include changes to regional lists (many users have 1-3 regional lists). Considering the release cycle of 3-6 weeks, we need to increase the dynamic rules limit to at least 20,000 just to make it possible to use it for differential updates.

Custom filter lists

Ad blockers provide users with the possibility to add custom filter lists, the ones that are not included into the list of static rulesets. This is crucial for the ecosystem, that’s how new filter lists appear and become popular and eventually get included into the static rulesets. Is 5,000 enough for this purpose?

We analyzed public issues from AdGuard users from the AdguardFilters repo to see what sorts of custom lists are used and how many rules are there on average. It appears that about 5% of the reports come from the users that have custom lists. And every one in four of those users uses more than the maximum limit of 5,000 rules.

Custom user-made rules

Unfortunately, there’s no open stats that we can share on that. Although, from our experience, users rarely have over several hundreds of user-made rules.

Summary

Considering all the cases that are listed above we suppose that the adequate limit of dynamic rules is 30,000.

@Rob--W
Copy link
Member

Rob--W commented Dec 8, 2022

As noted in the meeting (https://github.com/w3c/webextensions/blob/main/_minutes/2022-12-08-wecg.md), this would also be a good time to separate the rule limits for session and dynamic rules.

A potential downside of separating the quotas for session and dynamic rules is that extensions could try to maximize the number of rules by reading and installing session rules at startup. In my opinion, if there is a concern on the registering session rules at startup, then the more sensible thing to do is to have lower limits for session rules, than to continue maintaining a shared limit for session + dynamic rules. E.g. continue limiting session rules to 5k, and expanding dynamic session rules to 30k.

@ameshkov
Copy link

ameshkov commented Aug 2, 2023

@oliverdunk hi Oliver, we're waiting for some feedback from your team on this one, any chance you had an internal discussion?

@oliverdunk
Copy link
Member

@oliverdunk hi Oliver, we're waiting for some feedback from your team on this one, any chance you had an internal discussion?

Hi Andrey 👋 Nothing specific to share yet, but we have been discussing this and I'm personally very keen to address it.

If you're up for gathering some more data, one thing I would be interested in is the breakdown of the types of rules you'd be dynamically adding (how many are block rules, how many are redirects to extension URLs, how many are external redirects, how many are modifications etc).

That would be really useful in figuring out what sort of usage to expect if we were to increase this limit :)

@ameshkov
Copy link

ameshkov commented Aug 2, 2023

Got it, we'll update the stats then, thank you!

@105th
Copy link
Author

105th commented Aug 3, 2023

@oliverdunk hi Oliver, we collected needed data with categories.

Filters were grouped by the same 3 groups: popular, recommended and others - they are represented on three pictures, one picture for one group of filters.

Added network rules divided into 6 categories:

The top chart shows all categories of rules, the bottom chart shows everything except block rules.

All these categories were extracted from DNR RuleActionType and from our process of conversion.

You can see that block rules take up about 98-99% of the changes.

Popular filters
popular

Recommended filters
recommended

Other filters
other

Raw data here

@oliverdunk
Copy link
Member

@105th I forget if I said already but thank you so much for this! We get a lot from all of the data you share and really appreciate it - will try to have some follow-up on this for the next meeting :)

@oliverdunk
Copy link
Member

oliverdunk commented Aug 17, 2023

We've been discussing this request for some time on the Chrome team and have put together a new proposal:

Proposal: Safe rules in declarativeNetRequest

In short, the idea is to establish a smaller subset of rule types which are subject to an increased dynamic rule limit. There is a lot more detail in the document, so please do take a look. Importantly, there is no change to what you can do with rules in declarativeNetRequest or the existing limits - this is purely additive.

Feedback is very much welcome - in particular, the "safe rules" naming is something we'd like to discuss (no rule is definitively safe, just as all rules have valid use cases). We also want to make sure this addresses the most important use cases and fits in to the bigger picture of how the declarativeNetRequest API is used (for example, I know supporting differential updates is a key goal that may require more than one moving piece to fully support).

Happy to discuss here for now although we could consider opening a new issue if that would work better.

@oliverdunk oliverdunk removed the follow-up: chrome Needs a response from a Chrome representative label Aug 17, 2023
@Rob--W
Copy link
Member

Rob--W commented Aug 17, 2023

@oliverdunk Your proposal document currently proposes a higher limit for "block", "allow" and "allowAllRequests". Can it be just "allow" and "block", without "allowAllRequests"?

"allowAllRequests" is expensive in its design and it is not clear whether the request here actually needs it.

@oliverdunk
Copy link
Member

@oliverdunk Your proposal document currently proposes a higher limit for "block", "allow" and "allowAllRequests". Can it be just "allow" and "block", without "allowAllRequests"?

"allowAllRequests" is expensive in its design and it is not clear whether the request here actually needs it.

Assuming there aren't any use cases we're missing, I don't have any strong feelings there. I added it for completeness but "block" and "allow" are definitely the important ones.

@lyphyser
Copy link

lyphyser commented Aug 22, 2023

Just remove the limit on the number of rules. There is of course no need to have a limit and an arbitrary magic number is obviously always wrong since there is no way to prove that the choice is optimal or correct.

The limit is suspected to have been added by Google for anti-user purposes (enriching their ad business at the expense of the user by making ad blockers less effective), or was possibly added due to sheer incompetence.

If you want to prevent DoS by high memory usage, then account the rules as memory usage by the extension, and act on the memory usage of the whole extensions. Obviously this must not be an arbitrary limit, but instead some way of informing the user that the extension might be using too much memory and might be causing performance or crash issues.

(in case it's not obvious, with a proper implementation (i.e. a trie for simple patterns or NFA for regexes) performance does not depend asymptotically on the number of rules, but only on the size of the text to match, excluding the construction of the data structure which is irrelevant since it can be done in the background before the extension update is finalized and saved to disk)

@ameshkov
Copy link

ameshkov commented Aug 23, 2023

@oliverdunk

First of all, it's awesome that you agree to increase the dynamic limit, thank you!

Generally, the proposal covers what was requested pretty well.

The concept of "unsafe rules" seems interesting to me, maybe it can be used further in the future (optional permission?). Just one question: why upgradeScheme is not considered safe in the proposal?

@oliverdunk
Copy link
Member

@oliverdunk

First of all, it's awesome that you agree to increase the dynamic limit, thank you!

Generally, the proposal covers what was requested pretty well.

The concept of "unsafe rules" seems interesting to me, maybe it can be used further in the future (optional permission?). Just one question: why upgradeScheme is not considered safe in the proposal?

Thanks for taking a look at this!

I don't think there's any specific reason to exclude upgradeScheme - I'd be curious if you have thoughts on the use cases for this or how common you think it would be.

The main reason to focus on block, allow and allowAllRequests for now was just to keep this as well scoped as possible, but we're definitely open to conversations if it seems like what we have is too restrictive.

@ameshkov
Copy link

upgradeScheme is not too useful for us specifically since it's more for extensions like HTTPS everywhere, but I assume they wouldn't be against increasing the dynamic limit either. I am just trying to think of any scenario where upgradeScheme can be problematic and so far can't think of any.

@sammacbeth
Copy link

To give some context on upgradeScheme rules, in the context of DuckDuckGo's Smarter Encryption feature (which is what HTTPS everywhere is based off). We currently ship a list of ~250k domains to be upgraded in the MV3 version of our extension. By using the requestDomains option, we can encode these into 10 DNR rules, so the rule cap is currently not an issue for this use-case. We ship these as static rules anyway due to the list size - the ruleset file is currently 4.7Mb.

I wrote up some notes on the MV3 implementation for this feature, in case it is of interest: https://sammacbeth.eu/mv3-smarter-encryption

@SHAGGAR
Copy link

SHAGGAR commented Nov 17, 2023

Has any consideration been given to letting the user choose limits for each extension?

It seems like limiting the maximum rule count for safety reasons doesn't make any sense. All it would take is one malicious rule to do damage. Maybe the user can take on the accepted (theoretical) risk of large dynamic lists by choosing to enable and/or set a max value. This would allow users who want to retain maximum functionality to eliminate the limits entirely, while the concern google et.al. have of legal risk for "approved" extensions goes away. Alternatively just slap a warning on the extension page that says the filters are dynamic and may result in unexpected behavior. We see this today for things like video games where a single player campaign is rated, but the online play is not because its out of control of the developer.

TBH this whole thing seems pretty silly since the entire concept of a web browser is that you're going to view totally dynamic, unknown content every time you load a page. I mean right now you could go to a search engine and get an ad that leads to a scam. Or maybe you load a page and a tracker builds a profile of you that it sells to advertisers/scammers. Or maybe there's a vulnerability in the browser and a malicious website uses it to gain elevated access to your computer.

These are all risks we accept with the web and I don't think anyone from w3c or any of the browser vendors would ever consider tackling those issues in the form of a pre-approved browser whitelist. So why are we doing that here with MV3?

Let the user decide what risk they're willing to take.

@xeenon
Copy link
Collaborator

xeenon commented Nov 17, 2023

While the idea of allowing users to set their own limits for each extension is interesting, it introduces complexities, especially in terms of performance. The increase of the content blocker extension rule limit in WebKit to 150,000 was a balance between functionality and memory usage — see the linked commit for the juicy details. (That WebKit limit is also separate from Safari's dNR limits, for other technical reasons.)

Allowing users to choose their own limits could lead to performance issues in a way that might not be immediately obvious to them. This is crucial, considering the diverse range of devices and operating systems browsers support. Performance degradation due to high rule counts might be hard for users to diagnose, leading to a suboptimal browsing experience.

@SHAGGAR
Copy link

SHAGGAR commented Nov 18, 2023

As a ublock origin user I have never had an issue with performance related to its dynamic lists. Are these performance issues a new problem resulting from the design adopted for MV3? If so perhaps that design should be reconsidered or abandoned.

If this is not an issue of the MV3 design and the same issue exists in MV2 then why have users not noticed it yet? Is it maybe because the performance hit of the blocker's lists is nothing compared to the performance hit of running the unblocked javascript of ads?

I think most users would accept the insignificant performance hit of unlimited dynamic lists if it eliminates the extremely noticeable and unbearable performance cost of unblocked ads. It seems like it would be better to let the user make that choice.

If the issue is low end devices not handling the lists, then maybe those devices should have limits set based on available resources. Setting a static limit for all extensions across all hardware to support the lowest common denominator seems like not just a bad choice but also unnecessary. Browsers could automatically size lists to resources.

And even if there is an issue with low end devices we run into the same tradeoff: blocking the javascript of the ads is almost always going to be a vast performance improvement. Effective ad blockers are even more critical for low end devices as any ads that sneak through will be detrimental to the user's experience.

@Dealman
Copy link

Dealman commented Dec 21, 2023

While the idea of allowing users to set their own limits for each extension is interesting, it introduces complexities, especially in terms of performance. The increase of the content blocker extension rule limit in WebKit to 150,000 was a balance between functionality and memory usage — see the linked commit for the juicy details. (That WebKit limit is also separate from Safari's dNR limits, for other technical reasons.)

Allowing users to choose their own limits could lead to performance issues in a way that might not be immediately obvious to them. This is crucial, considering the diverse range of devices and operating systems browsers support. Performance degradation due to high rule counts might be hard for users to diagnose, leading to a suboptimal browsing experience.

The keyword here being could. Surely some diagnostics to alert the user if an extension is using a substantial amount of memory is a rather nice trade-off?

Looking at those numbers, I really don't see any issue other than for potentially low-end mobile devices. Whereas high-end devices will barely notice a few hundreds of MB being used.

Why resort to this kind of collective punishment only to satisfy low-end devices? I wholly agree with the other statements that some of the content that could otherwise be blocked but wouldn't, because of these imposed limits could have a far bigger performance impact.

While I like to try and keep things on-topic and professional, this rather obviously isn't decided on with the end-user in interest. That's but a façade, what's truly in interest here is Google and that alone.

I mean, W3C's own guidelines champion that the user should be able to render the content the way they want - including assistive extensions and blocking of unwanted content. Why deny the user the ability to tweak these values themselves, if it means they could do just that?

The fact the limit was set to a shockingly low value of 5,000 just further cements that Google has no end-user interest in this. Even if increased to 30,000, that's laughably low when like you demonstrated - WebKit allows for 150,000.

I hate to sound a bit pessimistic, but I feel there's going to be some real Streisand effect fallout from all of this if it goes through the way Google wants it to.

@dotproto dotproto added the implemented: chrome Implemented in Chrome label Mar 18, 2024
@xeenon xeenon added the follow-up: safari Needs a response from a Safari representative label Mar 18, 2024
@Rob--W
Copy link
Member

Rob--W commented Mar 18, 2024

FYI Firefox hasn't bumped the limit yet, it is tracked as part of https://bugzilla.mozilla.org/show_bug.cgi?id=1803370.

@xeenon
Copy link
Collaborator

xeenon commented Mar 18, 2024

This has been fixed in WebKit, not shipping in Safari yet.

@xeenon xeenon added implemented: safari Implemented in Safari and removed supportive: safari Supportive from Safari follow-up: safari Needs a response from a Safari representative labels Mar 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
implemented: chrome Implemented in Chrome implemented: safari Implemented in Safari supportive: firefox Supportive from Firefox topic: dnr Related to declarativeNetRequest
10 participants