DDG is a specialized blocker, while uBO is a wide-spectrum blocker which you can configure to your liking by adding more filter lists or by enabling advanced default-deny modes.
Performance-wise, uBO stands its ground compared to DDG, keeping in mind that uBO has the non-trivial burden of having to enforce pattern-based and cosmetic filtering from EasyList/EasyPrivacy/etc., while DDG does strictly hostname-based filtering.
For instance, while DDG reports that ad.doubleclick.net was blocked on CNBC, the network pane reports that not ALL requests to ad.doubleclick.net were blocked, and as a result, ad.doubleclick.net still gets to know which page you are visiting.
https://twitter.com/gorhill/status/1273263784828731393
Do you have any additional lists you'd recommend? I added some custom ones for Reddit sponsored divs, which was easy enough (although manual), and some for local news sites' custom DNS (which they're changing quarterly now... I assume to game some KPI).
No mention of content blockers that stop ad network requests from even going out in the first place... I was under the impression that content blockers could even speed up page loads.
The news article in the test loaded 3 times as fast with the slowest adblocker as with no adblocker, and 19 times as fast with the fastest adblocker.
The charts are confusing. When visiting example.com, adblockers are worse on all metrics, but then in the Adblock section they visit a different page, which makes the general extension evaluation suspect, since it's evaluated only on example.com and not on a more representative page. I would do it on the top websites and create a composite score (and perhaps a separate score for example.com-style pages that are unlikely to have complex content).
That would let you understand the impact of these extensions on different website types.
Another issue with the methodology is the lack of any combination of extensions, although they may not have the data to build the experiment for that. The assumption one might make is that these extensions stack up but I could imagine there could be non-linear effects combining extensions (unless there’s prior research indicating it is linear in all or almost all cases?).
Overall though it’s an impressive undertaking. Kudos to the author.
Generally the site you're on doesn't matter. Ad blockers are different, since they have a positive impact on sites with ads. Some extensions are site-specific, e.g. Wikiwand. I'm also mentioning Avira as a special case. If I tested each extension on the top 1000 sites I could identify more examples, but I don't think it's worth the effort/cost.
> The assumption one might make is that these extensions stack up but I could imagine there could be non-linear effects combining extensions (unless there’s prior research indicating it is linear in all or almost all cases?).
Yeah, if different extensions add more JavaScript bundles, the performance impact just adds up. I've not looked at this in depth, but the 2018 post takes a quick look at combining multiple extensions. https://www.debugbear.com/public/blog/chrome-extension-perfo...
> Interestingly Avira is now significantly faster, 600ms instead of 2.7s! What’s going on?
> The Avira Browser Safety extension contains a website allowlist with 30k+ regular expressions. When the user navigates to a new page Avira checks if the page URL is in that allowlist.
Though I think the author might not have the time to run this on more sites, and what's "representative" anyway? Every user visits different sites.
Did you check the charts? Removing ads (in this case by blocking, but it could be done by simply not having them) reduces the load time. There's no trade-off between load time and ads being shown...
Or just have ads that are unobtrusive and non-deceptive. The problem when I turn off ad blocking is that, to this day, ads are still too often atrocious and distracting in ways that they never were for printed content.
When Google started with its ads, they were both unobtrusive and non-deceptive. This, together with the fact that they were context-sensitive, was one of the reasons they were much more appreciated by users (can't quickly find examples right now). Unfortunately, it seems that obtrusive and deceptive ads make more money.
Exactly this, making money isn't good enough. You have to make more and more each quarter. Shame Google haven't found many more ways of doing this other than infesting the web with ever more aggressive and annoying ads.
I don't run adsense on sites anymore, but the last time I did, I enabled 'auto ads' and the results were nothing short of atrocious. Literally, almost a wall of ads?
That's the issue at the heart of it. Ads are annoying, but a certain type of lucrative person clicks on them. A YouTuber asking to hit that bell and like button is repetitive, but it works. Sometimes superfluous additions ruin products, but customers will purchase things that make them feel good.
We have a network of local news sites in the UK which are almost totally unusable on mobile and desktop unless JavaScript is disabled and/or aggressive adblocking is used.
That wouldn't be so bad if the actual news content was worth reading; it's mostly click-bait at best.
The owner's clear intent is to milk every last drop of ad revenue they can before the inevitable happens, which is really sad, since there is a clear need for local news coverage that isn't available elsewhere.
Running a persistent content blocker as a third-party HTTP proxy process is more efficient than running in-page: you get all the CPU benefits of content blocking, without the RAM drawbacks of loading JS blocklists through page injections. I would not consider the browser extension model to be viable for content blockers in light of these statistics, as the overhead of a cold start simply can’t match the readiness of a network proxy.
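As a rough sketch of that model (assuming hostname-level blocking at the proxy; the blocklist entries and port are made up for illustration), a forward proxy only has to refuse the connection, so blocked requests never leave the machine or network segment the proxy sits on:

```typescript
// Minimal sketch of a blocking forward proxy (illustrative, not a real
// product): plain-HTTP requests are checked against a hostname blocklist,
// and HTTPS CONNECT tunnels are refused for blocked hosts. Pattern-based
// and cosmetic filtering would still have to happen elsewhere.
import * as http from "node:http";
import * as net from "node:net";

const BLOCKED = new Set(["ad.doubleclick.net", "googletagmanager.com"]); // illustrative

const isBlocked = (host: string): boolean =>
  [...BLOCKED].some((b) => host === b || host.endsWith("." + b));

const proxy = http.createServer((req, res) => {
  // Proxied plain-HTTP requests arrive with an absolute URL.
  const url = new URL(req.url ?? "", `http://${req.headers.host}`);
  if (isBlocked(url.hostname)) {
    res.writeHead(403).end();
    return;
  }
  const upstream = http.request(url, { method: req.method, headers: req.headers }, (up) => {
    res.writeHead(up.statusCode ?? 502, up.headers);
    up.pipe(res);
  });
  upstream.on("error", () => res.writeHead(502).end());
  req.pipe(upstream);
});

// HTTPS arrives as CONNECT tunnels: we can still block by hostname here,
// but inspecting the page contents themselves would need a local root CA
// (see the MITM discussion below).
proxy.on("connect", (req, clientSocket, head) => {
  const [host, port = "443"] = (req.url ?? "").split(":");
  if (isBlocked(host)) {
    clientSocket.end("HTTP/1.1 403 Forbidden\r\n\r\n");
    return;
  }
  const serverSocket = net.connect(Number(port), host, () => {
    clientSocket.write("HTTP/1.1 200 Connection Established\r\n\r\n");
    serverSocket.write(head);
    serverSocket.pipe(clientSocket);
    clientSocket.pipe(serverSocket);
  });
  serverSocket.on("error", () => clientSocket.end());
});

proxy.listen(8080); // point the browser's HTTP/HTTPS proxy setting here
```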
Yet here's a drawback: you can't handle cases where the domain doubles as both a service that you depend on and a source of advertising and tracking that you wish to block. Facebook and Google are the two that come most readily to mind, but there are countless others.
EDIT: my apologies, I misread your comment and assumed you meant host-based blocking. You seem to be advocating for the proxy to parse the page contents and do the ad-blocking there, before serving it on to you. It's an interesting idea. At scale it would have some significant hurdles to overcome WRT TLS, but I am nonetheless intrigued.
> At scale it would have some significant hurdles to overcome WRT TLS
Proxomitron and other MITM proxies have been doing this for years. You do need to generate your own root CA and change the TLS config of whatever clients will go through it so that they trust your root, but after that it works perfectly. As a bonus, you can do "HTTPS upgrade" with applications that don't support HTTPS at all or only support old TLS versions.
Those aren't third-party HTTP proxies, they're local HTTP proxies.
If you want to do it locally, then you're paying the CPU / RAM cost yourself, not to mention doubling a lot of the work (the page needs to be parsed / interpreted / JS executed by the proxy, and then again by your browser) -- so why not just do it in the browser?
If you want to do it third-party, you need to trust this third-party because they're terminating the TLS connection for you (not to mention being fundamentally incompatible with TLS 1.3)
AdGuard has this ability. It will even install a certificate to your OS so it can parse and block TLS connections (but only if you explicitly instruct it to).
uMatrix and uBlock Origin are my two go-to addons on Firefox. For the most part, whatever uBlock Origin misses, uMatrix will catch, and I really like the grid layout of uMatrix (despite recently learning on HN that there is an advanced mode of uBlock Origin that is similar to uMatrix). I guess uMatrix isn't included because it's not technically an adblocker?
I feel sorry for the masses that still browse without these addons.
FYI, development has stopped on uMatrix (the GitHub repository is archived and read-only). I've switched to using uBlock Origin in advanced mode. It's not as flexible, but it does similar work.
For example, Medium takes forever to load in Firefox mobile with uBlock Origin, which I'm assuming is from time spent waiting on network requests and retrying them.
> which I'm assuming is from time spent waiting on network requests and retrying them.
There shouldn't be timeouts. uBlock Origin prevents the request from happening; there should be neither a DNS lookup nor an HTTP request generated by the browser.
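To illustrate the mechanism (a toy sketch, not uBO's actual code; the blocklist is made up), a Manifest V2 extension can cancel the request in a blocking webRequest listener before the browser opens any connection at all:

```typescript
// Toy sketch of the Manifest V2 blocking-webRequest mechanism that blockers
// like uBlock Origin build on; uBO's real matching engine is far more
// sophisticated. Returning { cancel: true } stops the request before any
// DNS lookup or connection attempt. Requires the "webRequest" and
// "webRequestBlocking" permissions in the extension manifest.
const BLOCKED_HOSTS = new Set(["ads.example.com", "tracker.example.net"]); // illustrative

chrome.webRequest.onBeforeRequest.addListener(
  (details) => {
    const host = new URL(details.url).hostname;
    return BLOCKED_HOSTS.has(host) ? { cancel: true } : {};
  },
  { urls: ["<all_urls>"] },
  ["blocking"]
);
```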
It would be more interesting to know what consumes more energy on smartphones, the ads or the blocker. My hypothesis is that the ads are more energy-consuming...
I worked on improving the performance of an extension that had to parse the DOM and any mutations.
Performance was horrible in Firefox; it was not nearly as bad in Chromium browsers, but improvements could still be found.
What worked: parsing the DOM with a TreeWalker (instantly rejecting nodes that didn't need to be checked, like style tags) and throttling checks with a MutationObserver, i.e. mutations would be collected and parsed together after a timeout interval. It wasn't too difficult; I think some of these things can be overlooked because front-end engineers typically don't concern themselves with performance.
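Roughly like this (a sketch of the approach described above; the names, skip list and throttle interval are illustrative, and the per-element check is a placeholder):

```typescript
// Walk the DOM once with a TreeWalker, rejecting whole subtrees that never
// need to be checked, then batch MutationObserver records and process them
// together after a quiet period instead of on every DOM change.
const SKIP = new Set(["STYLE", "SCRIPT", "NOSCRIPT"]); // illustrative skip list

function check(el: Element): void {
  // placeholder for whatever per-element work the extension actually does
}

function scan(root: Node): void {
  if (root instanceof Element && !SKIP.has(root.tagName)) {
    check(root); // the root itself isn't visited by nextNode()
  }
  const walker = document.createTreeWalker(root, NodeFilter.SHOW_ELEMENT, {
    acceptNode: (node) =>
      SKIP.has((node as Element).tagName)
        ? NodeFilter.FILTER_REJECT // skips the node and its entire subtree
        : NodeFilter.FILTER_ACCEPT,
  });
  for (let node = walker.nextNode(); node; node = walker.nextNode()) {
    check(node as Element);
  }
}

// Throttle: collect mutation records and handle them in one batch.
let pending: MutationRecord[] = [];
let timer: number | undefined;

const observer = new MutationObserver((records) => {
  pending.push(...records);
  if (timer === undefined) {
    timer = window.setTimeout(() => {
      const batch = pending;
      pending = [];
      timer = undefined;
      for (const record of batch) {
        record.addedNodes.forEach((added) => scan(added));
      }
    }, 200); // timeout interval is a tunable assumption
  }
});

observer.observe(document.documentElement, { childList: true, subtree: true });
scan(document.documentElement);
```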
It looks like LibreJS is about blocking proprietary JavaScript. Certainly malware or user-hostile JavaScript could include an open source licence, if people started using that tool.
Probably because the browsing experience with this extension is downright miserable. Most major web sites are completely unusable with it enabled.
As an added kick in the teeth, it doesn't even render the contents of <noscript> tags, so usability with LibreJS is often worse than what you'd get if you'd disabled Javascript entirely.
Unfortunately, this could make your blocker look pretty heavy if it's "only" a 2x speed-up, compared to the results in the posted article.
But more to the point, what does it mean to test “your” ad blocker, if what you’re doing is providing a rules list to Safari?
“Content Blockers are app extensions that you build using Xcode. They indicate to Safari a set of rules to use to block content in the browser window. Blocking behaviors include hiding elements, blocking loads, and stripping cookies from Safari requests.”
“You use a containing app to contain and deliver a Content Blocker on the App Store. ... Apps tell Safari in advance what kinds of content to block. Because Safari doesn't have to consult with the app during loading, and because Xcode compiles Content Blockers into bytecode, this model runs efficiently. Additionally, Content Blockers have no knowledge of users' history or the websites they visit.”
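For reference, those rules are shipped as a JSON list, roughly like the following (shown here as a typed constant for readability; the domains, pattern and selector are purely illustrative, and the exact trigger/action fields are documented by Apple):

```typescript
// Rough shape of a Safari content-blocker rule list (normally a JSON file
// bundled with the extension and compiled to bytecode by Safari); the
// entries below are made-up examples, not real rules.
type Rule = {
  trigger: { "url-filter": string; "if-domain"?: string[] };
  action: { type: "block" | "block-cookies" | "css-display-none"; selector?: string };
};

const rules: Rule[] = [
  // Block any load whose URL matches this pattern.
  { trigger: { "url-filter": "https://ads\\.example\\.com/" }, action: { type: "block" } },
  // Hide an element cosmetically, but only on one site.
  {
    trigger: { "url-filter": ".*", "if-domain": ["example.org"] },
    action: { type: "css-display-none", selector: ".ad-banner" },
  },
  // Strip cookies from requests to a tracker.
  { trigger: { "url-filter": "tracker\\.example\\.net" }, action: { type: "block-cookies" } },
];
```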
Note: I've tested dozens of ad blockers on iOS for interoperability with fintechs that users expect to "just work". Not performance testing, just using them as daily drivers and doing banking tasks.
I look for which apps let ads through on popular sites while breaking brand name banks. The antidote to this is great active curation by the core team (or curated upstream), plus a convenient way for the end user to block new things, and to allow things.
For a tech audience that uses macOS and iOS native Safari and misses the ability to customize rules, I tend to recommend 1Blocker:
Of the many I've tested, 1Blocker lets the lowest percentage of ad or tracker requests through to the DNS lookup stage (as logged/blocked by Pi-hole), while the web still works as normal users expect.
We drop from >40% of DNS requests blocked with no ad-blocking to ~5% of DNS requests blocked with 1Blocker enabled.
Performance changes depending upon the rule list provided to Safari. A better (not necessarily larger) rule list means that more unnecessary items are blocked and web pages therefore load more quickly. This means that there can be a large difference between ad blockers even if they use the same underlying blocking mechanism.
As you note, the rule quality is also important. We judge our rule quality based upon:
- How many ads, trackers and annoyances it blocks
- How frequently rules are updated to meet changes to pages
- How large the rule set is and whether it includes out-of-date or redundant items (many ad blockers, including 1Blocker, have overly large rulesets that increase memory use and reduce maintainability)
- Not breaking any required web page functionality
Not surprising. The advertising logic in the JavaScript served to the user is often degenerate garbage, perhaps written by people who would rather be doing something else, compiling code to JavaScript.
I used to be happy with ad blockers but now every site detects them and it's taking me longer to access content and I get to see the ads anyway. Does anyone actually have a good solution?
uBlock Origin and Adblock Plus on FF block all the YouTube ads for me; DDG didn't. I am okay with sacrificing CPU to not watch the ads, and hence I'm sticking with the former duo.
FYI, AdBlock Plus is made by a garbage company who allow companies to buy their way around the blocker in return for fulfilling some token guidelines about ad formatting etc.
I first started using an ad blocker (adblock plus or some such firefox plugin, don't remember) somewhere around 2003 when people were hyping it on tech forums. About a year later, I removed it, because I felt (wrongly or rightly) that it was eating more CPU to have it on my already strained machine (yes, in 2020 it is more worth it). Instead of relying on the ad blocker to fix Firefox's broken and non-working "disable popups" option, I set dom.popup_allowed_events to "" in about:config. Which also does not fully work (false positives and negatives), but is the least evil.
Around 2006 or so I started understanding the topic of computer security, and continued to not use an ad blocker, as that would increase the attack surface, and ad blockers are written by amateur hobbyists. Years later, my infosec buddies constantly talked about how X and Y plugins can be leveraged to do things ranging from RCE to XSS in Firefox. Of course this doesn't really matter since any browser at this point already had an unrecoverably large attack surface, but it was still a good exploration.
In 2009, I started only browsing the internet over Tor, as this is how the internet is meant to be used. IP addresses are an implementation detail and nothing other than high end real time low latency games have any business seeing it.
Then around 2011 Cloudflare came out and decided to block Tor (for pedants: yes, Tor was actually blocked automatically by their WAF; they didn't explicitly block Tor. Does that make their WAF good? No! Cloudbleed, anyone?). This caused about 50% of ads to stop loading. I subsequently filled out thousands of captchas for the next 7 years to read 3-paragraph text articles (which consume fewer resources than the captcha itself) and developed wrist problems.
Around 2013 or so, the web became so bad that it was not usable on my sub-$1000 machines without entirely disabling JS, with or without an ad blocker, with or without Tor. So I set javascript.enabled to false in about:config. No, this is not what NoScript is for. NoScript is an engine for mitigating web attacks such as XSS and CSRF. Basically a firewall. I do not use firewalls, they are marketing hype. I would not install that just to use one tiny feature of it. NoScript has also _caused_ XSS vulns in the past. I will not have some hobbyist plugin arbitrarily modify the DOM or whatever else it modifies when forced to access my bank online, which mysteriously thinks using the most insecure computing platform in the history of mankind for sensitive data is a good idea. Disabling JavaScript is not only a good performance optimization, but will get rid of most ads.
Around 2018, I stopped using the internet for a year or two, and just went to the library instead, as this was now more efficient than trying to use the web. During this time I wrote multiple pieces of software, each better than anything I had written before. Sometimes I had something I wanted to look up online, and I just wrote it down and went to a cafe to look it up and save the results. I do not consider buying a phone and sending photo ID to websites (literally: Steam [a website embedded in a terrible GUI marketed to 3 year olds] will now ask for a photo ID before you can go on a game when you log on from your friend's house down the street) to view static content consisting of a paragraph of text surrounded by marketing boilerplate a reasonable compromise. The web is simply dead to me at this point. Even using the web like a normal person, you will just get a non-stop stream of SEO'd garbage, half-working programs (most websites are now programs written by low-quality programmers that interactively fetch the 300 bytes of ASCII you wanted to read in some inefficient and error-prone way from a server whose WAF will block you for false positives), and delays. Even in the "hacker community" at this point, it's just a bunch of hippies parroting "X and Y is hard", and racing to implement the new best practices like linting, password hashing, 2FA, KYC, forward secrecy (yup, good to have, but people believe it's the only thing they should care about, so much so that I've seen them literally miss gaping holes in their software because of it), etc., etc., while not understanding software engineering very well in the first place.
But none of this matters. There is no correct way to use the web. It's broken by design. And so is all this new decentralized tech like Ethereum, Freenet, etc. that uses web tech as the front end. I merely use the web as described above out of principle, because I want computing to be good as opposed to 1000 layers of bad, and that's what we're going to make, and you can't do anything about it. People will object because "nooo you can't just view a paragraph of text, what if it is saved to your hard drive? that will be piracy!!11oneone", and "noo you can't just use a memory safe language, that is slow because of GC and insecure because you might make an initial reference implementation in C, checkmate!" but we will not care about these pathetic tech reactionary ideas anymore. Yes, I know about the Gemini protocol. That's how the web should have been in 1990, but not 2000.
The web should have only supported static content
The web should have had a real markup format as opposed to something that will be arbitrarily parsed however the implementor feels like
The web should have been on content addressable storage
The web should not have been based on strange military/UNIX crap like MIME
Domain names should not exist
X.509 should not exist
The web should have been anonymous
"Web applications" (which are no different than real software aside from having 10x more hacks) should have been the same as .exe but with proper isolation of programs. It's windows, mac, and linux's faults that web applications are viable.