On many websites, you will literally be the only person doing that. This is a unique fingerprint.
This might not matter to you, since it sounds like privacy isn’t your primary motivation here, but it is worth pointing out that custom patched browsers are going to be more fingerprintable, not less.
Yes, I understand some HN readers have this thought. I have gotten similar replies before. However, consider that I only send two to three headers: Host, Connection and (optionally) Cookie. There is nothing unique about the text-only browser by virtue of the patches. The TCP connection and TLS are handled by a proxy. Sometimes I send the TCP requests with netcat, tcpclient, socat, etc. Then I open the HTML file with the browser.
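To make the "two to three headers" point concrete, here is a rough Python sketch of such a request. It is only an illustration: the function name `build_minimal_request`, the host `example.com` and the `Connection: close` value are assumptions for the example, and the setup described above uses netcat-style clients behind a TLS proxy, not Python.

```python
def build_minimal_request(host, path="/", cookie=None):
    """Return raw HTTP/1.1 request bytes with only Host, Connection
    and (optionally) Cookie headers."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",  # assumed value; the source only names the header
    ]
    if cookie is not None:
        lines.append(f"Cookie: {cookie}")
    # A blank line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


if __name__ == "__main__":
    # Print the request exactly as it would go over the wire.
    print(build_minimal_request("example.com").decode("ascii"), end="")
```

The resulting bytes could be piped to any TCP client; there is no User-Agent, Accept or Accept-Language line to inspect.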
Perhaps the proxy has a fingerprint, but it is a very popular proxy in widespread use.
Maybe the OS, e.g., the networking stack, has a fingerprint.
Maybe the TCP clients have fingerprints.
But seriously, what is the point of thinking about these things. Who is going to go to such lengths to try to "identify" me. What is their purpose. Am I a spy trying to hide from computer forensics nerds. No. Am I trying to "blend in". No. I am trying to improve the web experience. That involves 1. sending the minimum data (avoid feeding the online advertising juggernaut) and 2. reducing advertising, ideally to zero. I have done a good job of both 1 and 2.
Who is going to try to advertise to a user who is using a text-only browser.
Also, assuming hypothetically, for argument's sake, we tried to get every user to "blend in" by using the exact same browser with the exact same settings on the exact same computer. Which would be easier: (a) users have to copy all the settings and idiosyncrasies of a "modern" graphical browser including numerous HTTP headers or (b) users have to refrain from running CSS or Javascript and limit themselves to sending only two to three headers (Host, Connection and, optionally, Cookie) and only send one request at a time. The more points of differentiation one has to worry about, the greater the chance one will overlook something. Needless to say, the "modern" browser presents a greater number of points of differentiation than the text-only browser.
The goal for me is to improve the web experience, including minimising (a) how much data I voluntarily share and (b) advertising. I have succeeded. The use of a text-only browser in lieu of a graphical one for recreational web use is part of the solution. I also benefit from other strategies that help with (b). Attempts to advertise to me online are few and far between. It is a fruitless endeavour.
It is not a goal of mine to try to "blend in" with other web users. It appears this is a goal of some HN commenters. Alas, other web users share heaps of data voluntarily and subject themselves to large amounts of advertising. There is arguably a price to pay for trying to appear "same".
> But seriously, what is the point of thinking about these things. Who is going to go to such lengths to try to "identify" me.
Nobody's going to very much trouble at all. They're just dumping every characteristic they can gather about you into an AI system, like a Bayesian classifier or a Convolutional Neural Net. It doesn't require very much work to take into account clearly discrete data like the set of headers you submit, or the delay between switching pages, or parts of your IP address.
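As a hedged illustration of how little work "the set of headers you submit" takes to turn into a discrete feature, here is a small Python sketch. The function name `header_name_feature` is hypothetical, and this is not a description of how any particular tracker works; it only shows that the header set falls out of a few lines of parsing and could then be fed to whatever classifier is in use.

```python
def header_name_feature(raw_request: bytes) -> tuple:
    """Return the sorted, lower-cased header names of a raw HTTP request.

    The resulting tuple is hashable, so it can serve as a categorical
    feature or a dictionary key.
    """
    # Everything before the first blank line is the request line plus headers.
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    names = []
    for line in head.split("\r\n")[1:]:  # skip the request line
        name, sep, _value = line.partition(":")
        if sep:  # ignore malformed lines without a colon
            names.append(name.strip().lower())
    return tuple(sorted(names))


if __name__ == "__main__":
    raw = b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n"
    print(header_name_feature(raw))
```

A client that sends only Host and Connection produces a very short tuple, which is itself a value the classifier can see.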
You hear a lot of stuff on HN about how inaccurate AI is, and much of it is true. But for figuring out when the set of HTTP headers correlates with your shopping habits, it should actually do a pretty good job, because it's basically just a matter of finding ways to correlate data together. No need to recognize when it's missing some form of outside context, because it doesn't "fail" or "succeed", it just does "worse" or "better." As long as it does better than a coin flip, it's worth it.
Right now, it's pretty effective to block ads by just not loading them, but there's no universal law that says it will always be that way. That already doesn't work on YouTube, which serves the ads from the same domain as the content, meaning that most ad blockers don't work on it. If ad blocking keeps becoming more popular, tactics like that will become more common. Once the ad serving becomes strictly first-party, relying on JavaScript looks like an increasingly terrible idea, not because of the minuscule number of people blocking JavaScript, but because you can't trust the potentially-malicious client to defend against click fraud.
These replies about "uniqueness" are in response to me disclosing I use a text-only browser or some non-graphical client to access websites. Why should "uniqueness" matter to me. As I said, I am just trying to avoid the annoyances of graphical web browsers. I am successful in doing that.
The majority of web use for me is not shopping. Why should I use the same browser for shopping that I use for recreational web use.
As for YouTube, this has been brought up many times. I cannot speak for other users, but I see zero ads when using YouTube. I search and download videos from the command line. With very few exceptions I never need to use youtube-dl because the signature values are already in the web page. There is no need for a "Javascript video player" to submit HTTP requests. The Javascript-enabled behavioural tracking on the YouTube website is insane. I use tiny shell scripts to search and download. I am aware of "SponsorBlock", which suggests some videos have ads embedded in them; however, I have never seen such a video. Most videos I watch are non-commercial.
"Click-fraud" is IMO secondary to fraud on the part of Big Tech and Big Tech wannabes who induce advertisers to purchase online advertising knowing, but not adequately disclosing, that it suffers from such inherent technical flaws.
> These replies about "uniqueness" are in response to me disclosing I use a text-only browser or some non-graphical client to access websites.
And your disclosure was in response to a CSS-based fingerprinting demo. If being fingerprinted doesn’t even matter to you, and you use a text-only browser just because you prefer the UX, then why bring it up on this article in the first place?
Because when there is a thread about a demo, and for some users the demo does not work, it is common to see comments that the demo did not work.
Being fingerprinted does matter to me. It is one more reason why the large, complex, graphical browsers supported directly or indirectly by online advertising annoy me. It is nigh impossible for users to control those programs.
As it happens, using a text-only browser, the TCP clients and the use of a proxy to remove headers all make fingerprinting less useful. The fingerprinting techniques used by "tech" companies tend to rely on the features of the large, complex, graphical browsers supported directly or indirectly by online advertising. For example, CSS fingerprinting does not work with a text-only browser doing its own formatting and ignoring CSS. Although this is not the primary reason I use a text-only browser, TCP clients and a proxy, any attempted fingerprint of that setup would indicate a user who cannot see ads. What would be the use of the fingerprint then.
Thanks for this comment. This is basically my position. While I prefer that sites would respect "do not track", ultimately I just don't want to see all the shitty ads. I don't really care if you're trying super hard to track me, though I think you're a fool if you do based on what I do online (read HN, wikipedia; download academic papers, mostly?).
I sympathize with the people who are going for pure anonymity. If I could be anonymous and still have a usable web, then I would do that. If you really think you'll learn something about me by tracking me, then whatever. I have still never been served a relevant ad in my life, so, uh, great job there. But in the end I just don't want to see all the shitty ads.
I would be interested to know what these people think they know about me. Based on my experience as a cognitive scientist, I suspect they know an awful lot less than they think they know ... or at least claim in their sales pitches to advertisers.
What these replies about "uniqueness" seem to ignore is that the majority of traffic on the internet is so-called "bots". In other words, it is traffic from clients that are not Chrome, Safari, etc. It is ridiculously easy to be mistaken for a "bot" when submitting requests manually, if one does not know what they are doing. For example, editing a single HTTP header is often enough. "Bot detection" is more often than not based on laughably crude heuristics. What happens if the user makes a single request manually and that header is missing. There is nothing to check. In almost all cases, nothing happens. There is no penalty for reducing the amount of information sent. In any event, it is rather easy to unintentionally "blend in" with the majority of internet traffic, which comes from "bots".
Those professing to have superior knowledge about user behaviour, including Big Tech, still cannot tell if someone is submitting requests manually or not.^1 (Absent keylogging on the user's computer.) Their superior knowledge of user behaviour only applies to users who use "modern" browsers that place high emphasis on graphics. Chrome, Safari, etc.
No one is going to try to advertise to a "bot", i.e., a non-graphical client. It would be ineffective. The online advertising industry relies on graphical web browsers like Chrome and Safari.
Using a common browser with the default settings to try to "remain" anonymous comes at a cost. Default settings do not include installation of extensions, e.g., ad blockers.
1. Contrast this with the different question of determining whether or not a user is using a certain client, e.g., Chrome, Safari, etc. That is an easier question to answer. However, detection of other clients is not done. I am never notified whether or not I am using, e.g., tcpclient, original netcat, socat, openssl, etc. How does one detect the difference. And assuming they could tell, then what. How will ads be served. The HN commenters replying about "uniqueness" fail to consider why there is so much effort to "fingerprint". It is driven by advertising which puts a monetary value on gathering user data. Using a modern browser, sending more data voluntarily to "blend in", feeds the online advertising industry and ensures such surveillance efforts will only increase. As checkyoursudo suggests, the data collectors will "claim in their sales pitches to advertisers" that they know a great deal about users, regardless of whether the data they have collected is truly accurate or usefully informative. Feeding the data collectors "fake" or "non-unique" data is one idea, but another idea is not sending the data at all. For HTTP requests, the latter works for me.
Yes. It is a relief from the times I must use the graphical browsers.
For example, I can search for and consume information much faster and more efficiently, free from distraction.
Perhaps one needs more than one client for the web. For example, I need a netcat-like program, some helper programs for working with HTTP and a text-only browser for viewing HTML. Plus a TLS proxy to deal with all the HTTPS. This is in addition to graphical, everything-but-the-kitchen-sink desktop or mobile browsers.
Perhaps a single, large, graphical browser directly or indirectly funded by advertisers is insufficient for all web use. Other clients may work better in certain situations. Certainly this is true for me. The smaller programs are faster, more robust and offer me greater flexibility.
It does not send ETag headers either. The local forward proxy removes all HTTP headers except Host and Connection, and Cookie where needed.
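The allow-list behaviour described above can be sketched in a few lines of Python. This is an illustration of the filtering logic only, not the configuration of the proxy actually in use; `ALLOWED` and `strip_headers` are hypothetical names.

```python
# Header names that are allowed through; everything else is dropped.
ALLOWED = ("host", "connection", "cookie")


def strip_headers(headers, allowed=ALLOWED):
    """Keep only allow-listed headers from a list of (name, value) pairs.

    Header names are compared case-insensitively and the original order
    of the surviving headers is preserved.
    """
    return [(name, value) for name, value in headers if name.lower() in allowed]


if __name__ == "__main__":
    incoming = [
        ("Host", "example.com"),
        ("User-Agent", "some-browser"),
        ("If-None-Match", '"abc"'),
        ("Connection", "close"),
    ]
    print(strip_headers(incoming))
```

For requests where Cookie is not needed, passing `allowed=("host", "connection")` drops it as well.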