Discussed a few days ago here on HN, and described by ‘ivanr’ as “misusing TLS s...

ivanr · on April 13, 2021

Indeed, it's a misuse of TLS session IDs for pervasive surveillance. The early web application firewalls also used TLS session IDs to better understand clients, for example correlating the observed values with HTTP session identifiers. I believe that many intrusion detection systems today support rich parsing of TLS traffic to enhance their operation.

Back on TLS, there are several other ways in which it leaks information that could be used for surveillance. Say, a decade ago, back when revocation checking with OCSP used to be much more widely supported, you could track someone's browsing habits by observing which OCSP responders they talked to. OCSP responses are signed, but the traffic is otherwise not encrypted. Given that most certificates were (and still are) issued by a handful of CAs, you only needed to listen for global traffic at a small number of points. Of course, the fact that OCSP traffic is not encrypted further increases the amount of information available for fingerprinting.

Then there is client fingerprinting at TLS level, which is possible because TLS is actually a mix of protocols, cipher suites, extensions, and various other parameters. ClientHello messages (which are used to initiate TLS handshakes) leak a lot of information that can be converted to useful signals for fingerprinting.
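To make the ClientHello point concrete, here is a minimal sketch of how those parameters could be collapsed into a single fingerprint, loosely in the style of JA3-type fingerprints. The function name, the choice of fields, and the use of SHA-256 are assumptions made for illustration; the input values are presumed to have been parsed out of a captured ClientHello already.

```python
import hashlib
from typing import Sequence


def clienthello_fingerprint(
    version: int,
    cipher_suites: Sequence[int],
    extensions: Sequence[int],
    curves: Sequence[int],
    point_formats: Sequence[int],
) -> str:
    """Hash the ordered ClientHello parameters into one stable identifier.

    Hypothetical helper: order is preserved on purpose, because the order
    in which a client offers its options is itself a signal.
    """
    groups = (cipher_suites, extensions, curves, point_formats)
    fields = [str(version)] + ["-".join(str(v) for v in g) for g in groups]
    canonical = ",".join(fields)
    return hashlib.sha256(canonical.encode("ascii")).hexdigest()


# Two clients offering the same options in a different order look different.
fp_a = clienthello_fingerprint(771, [4865, 4866], [0, 10, 11], [29, 23], [0])
fp_b = clienthello_fingerprint(771, [4866, 4865], [0, 10, 11], [29, 23], [0])
```

Two connections from the same client build tend to produce the same value, so the fingerprint works as a signal even when no cookie or session ID is available.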

Even at a single web site level, information is leaked via TLS record (i.e., packet) length. A careful observer can look at the record lengths and deduce which resources (e.g., pages, images) are downloaded.
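As a rough illustration of the record-length leak, the sketch below matches observed record lengths against a catalog of known resource sizes. Everything here is an assumption for the sake of the example: the function name, the fixed per-record overhead of 22 bytes, the tolerance, and the catalog itself. Real overhead depends on the TLS version and cipher suite, and real traffic is noisier.

```python
from typing import Dict, Optional, Sequence


def guess_resource(
    record_lengths: Sequence[int],
    catalog: Dict[str, int],
    overhead: int = 22,   # assumed per-record overhead in bytes
    tolerance: int = 64,  # assumed acceptable size mismatch in bytes
) -> Optional[str]:
    """Return the catalog entry whose size best matches the observed records."""
    # Estimate the plaintext size by stripping the assumed overhead per record.
    estimate = sum(max(n - overhead, 0) for n in record_lengths)
    best = min(catalog.items(), key=lambda kv: abs(kv[1] - estimate), default=None)
    if best is None or abs(best[1] - estimate) > tolerance:
        return None
    return best[0]


# Hypothetical site map built by crawling the site in advance.
catalog = {"/index.html": 5120, "/logo.png": 20480}
guess = guess_resource([16406, 4118], catalog)
```

The observer never decrypts anything; knowing the sizes of the site's public resources is enough to narrow down what was fetched.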

baybal2 · on April 13, 2021

On top of that, TLS ignores:

CORS

Cookie sanitization for img tag GET requests in modern browsers

3rd party cookie filter/discard

All kinds of other privacy filtering

It's just like TLS SNI: a very obvious feature that runs completely contrary to the privacy purpose of TLS, but which very few people know about.