Discussed a few days ago here on HN [1], and described by ‘ivanr’ as “misusing TLS session IDs, but the technique doesn't work reliably and it's not secure”:
Indeed, it's a misuse of TLS session IDs for pervasive surveillance. The early web application firewalls also used TLS session IDs to better understand clients, for example correlating the observed values with HTTP session identifiers. I believe that many intrusion detection systems today support rich parsing of TLS traffic to enhance their operation.
Back on TLS, there are several other ways in which it leaks information that could be used for surveillance. Say, a decade ago, back when revocation checking with OCSP used to be much more widely supported, you could track someone's browsing habits by observing which OCSP responders they talked to. OCSP responses are signed, but the traffic is otherwise not encrypted. Given that most certificates were (and still are) issued by a handful of CAs, you only needed to listen for global traffic at a small number of points. Of course, the fact that OCSP traffic is not encrypted further increases the amount of information available for fingerprinting.
Then there is client fingerprinting at TLS level, which is possible because TLS is actually a mix of protocols, cipher suites, extensions, and various other parameters. ClientHello messages (which are used to initiate TLS handshakes) leak a lot of information that can be converted to useful signals for fingerprinting.
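To make the ClientHello point concrete, here is a minimal sketch of how the offered parameters could be collapsed into a fingerprint. It is loosely modelled on the JA3 approach (hash the version, cipher suites, extensions, groups and point formats, in the order offered), which is not something the quoted comment names. The `ClientHello` class, the `fingerprint` function and the sample values are assumptions made up for illustration; a real tool would parse these fields off the wire.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List


def is_grease(value: int) -> bool:
    """True for GREASE placeholder values (0x0a0a, 0x1a1a, ..., 0xfafa)."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


@dataclass
class ClientHello:
    # Hypothetical container: fields already extracted from a ClientHello.
    version: int
    cipher_suites: List[int]
    extensions: List[int]
    supported_groups: List[int] = field(default_factory=list)
    ec_point_formats: List[int] = field(default_factory=list)


def fingerprint(hello: ClientHello) -> str:
    """Hash the offered parameters, preserving the order the client used."""

    def join(values: List[int]) -> str:
        # GREASE values are random per connection, so drop them.
        return "-".join(str(v) for v in values if not is_grease(v))

    parts = [
        str(hello.version),
        join(hello.cipher_suites),
        join(hello.extensions),
        join(hello.supported_groups),
        join(hello.ec_point_formats),
    ]
    return hashlib.md5(",".join(parts).encode("ascii")).hexdigest()


if __name__ == "__main__":
    # Sample values only; not taken from any particular browser.
    hello = ClientHello(
        version=771,
        cipher_suites=[0x1301, 0x1302, 0xC02F],
        extensions=[0, 10, 11, 43],
        supported_groups=[29, 23],
        ec_point_formats=[0],
    )
    print(fingerprint(hello))
```

Order matters here: two clients offering the same cipher suites in a different order get different fingerprints, which is a large part of why the signal is useful for telling TLS stacks apart.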
Even at a single web site level, information is leaked via TLS record (i.e., packet) length. A careful observer can look at the record lengths and deduce which resources (e.g., pages, images) are downloaded.
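A rough sketch of the record-length idea, under stated assumptions: the observer has crawled the site beforehand and built a catalog of resource sizes, each response arrives as one burst of records, and every record carries a fixed overhead (17 bytes here, which matches a TLS 1.3 AEAD record: 16-byte tag plus 1 content-type byte, with no padding). The function names, the catalog and the tolerance are hypothetical.

```python
from typing import Dict, List, Optional


def estimate_plaintext_size(record_lengths: List[int], per_record_overhead: int = 17) -> int:
    """Estimate payload size from the ciphertext lengths in the record headers."""
    return sum(max(length - per_record_overhead, 0) for length in record_lengths)


def guess_resource(
    record_lengths: List[int],
    catalog: Dict[str, int],
    tolerance: int = 64,
    per_record_overhead: int = 17,
) -> Optional[str]:
    """Return the catalog entry whose size is closest to the estimate,
    or None if nothing falls within the tolerance."""
    if not catalog:
        return None
    estimate = estimate_plaintext_size(record_lengths, per_record_overhead)
    name, size = min(catalog.items(), key=lambda item: abs(item[1] - estimate))
    return name if abs(size - estimate) <= tolerance else None


if __name__ == "__main__":
    # Hypothetical sizes (in bytes) gathered by crawling the site in advance.
    catalog = {"/index.html": 5120, "/logo.png": 20000}
    # Two records observed in one response burst.
    print(guess_resource([16401, 3650], catalog))
```

The tolerance is there to absorb HTTP headers and similar noise. Real traffic is messier (compression, multiplexing, padding), so treat this as an illustration of the principle rather than a working attack.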
1. https://news.ycombinator.com/item?id=26770261