Replace the Text Encoding menu with a single override item (bugzilla.mozilla.org)
88 points by serentty on Feb 17, 2022 | 128 comments



It’s pretty unclear to me whether people use this feature much. Two possibilities seem likely to me:

1. Very few people use this feature but a few vocal users who are particularly good at navigating bugtracking bureaucracy do use it and care deeply.

2. Of all Firefox users, a small percentage speak languages for which there are many non-ASCII letters and a relatively large amount of pre-Unicode content online, and of those users a small percentage read niche (e.g. non-Facebook and old) sites with broken character encoding.

Both situations could probably lead to few uses of the menu in telemetry. Perhaps we should care more about users in the second category, as they are likely more clustered by other factors like location, whereas the first set may be more uniformly distributed across the population. I think other points are also being disregarded:

- ordinary users may find the feature cryptic and confusing, or at least undiscoverable

- dealing with character encoding and supporting the feature is painful (though they will still deal with it under the hood). Mozilla have limited resources and they (and Firefox users) want the browser to be successful and sustainable, and this means focusing resources on things that will make the browser easier to adopt and less confusing rather than chasing the short-lived gains from appeasing a small and stagnant pool of ‘power users’.


A complicating factor on the telemetry is that both this new "guess" option and its predecessor list are/were in the menubar View menu and have no way to be accessed through the "hamburger," so there wasn't a default-visible way to get to this functionality for Windows and Linux users.

This is a pretty familiar pattern for controversial telemetry-based browser decisions: first the feature is made more obscure or hidden, then later the fact that few people use it is cited to justify changing or removing it. How many users simply abandoned a site they used to use, or switched browsers, is not captured in the telemetry (though browsers to switch to that still offer this feature are pretty thin on the ground).

To be clear I don't think this was a "blind" telemetry decision; the developer seems to have done a pretty thoughtful and thorough analysis about it. But you see the pattern of such changes over the years and you start to imagine there's a thumb on the scale...


Pro-tip: press Alt and you get the menu back, at least on my Linux.


Windows as well.


> A complicating factor on the telemetry is that both this new "guess" option and its predecessor list are/were in the menubar View menu and have no way to be accessed through the "hamburger," so there wasn't a default-visible way to get to this functionality for Windows and Linux users.

Collapsing the submenu of the View menu into a single item shipped in Firefox 91. The submenu was removed from the hamburger menu in Firefox 89.

This order of changes was unfortunate. If I had gotten the single-item design done sooner, perhaps the rewritten hamburger menu would have kept the single item instead of removing the entry point altogether.


I’m not convinced that making it easier to find the menu of cryptic encodings would help. I think users would more easily discover that Chrome correctly detects the encoding, so perhaps the issue is just that the auto-detector needs to be a little better. Indeed, if users were just going to Chrome, that would not show up in telemetry, and so a low number would still not mean the menu needs to be more visible. But this is all speculation really.


In cases where Chrome autodetects, Firefox autodetects, too.

In the cases where you'd use the remaining menu item in Firefox, Chrome offers no recourse (apart from extensions).


> dealing with character encoding and supporting the feature is painful (though they will still deal with it under the hood). Mozilla have limited resources and they (and Firefox users) want the browser to be successful and sustainable, and this means focusing resources on things that will make the browser easier to adopt and less confusing rather than chasing the short-lived gains from appeasing a small and stagnant pool of ‘power users’.

Dealing with character encodings is indeed painful, but they are not dropping support for all of these legacy encodings. If they did, I am sure it would simplify things a lot in terms of maintenance, but they are too widespread in existing content to do that. Firefox will still use a legacy encoding if it detects it. All they are doing is removing the ability to tell Firefox that a website uses a certain encoding. The idea that this option is a maintenance burden is kind of hard for me to believe.


I find it hard to believe as well, but he made some concrete mentions of other tickets that were apparently complicated by the ability to manually set encodings:

> Additional non-user-facing benefit:

> The back end code for supporting the non-Automatic menu items got in the way of implementing bug 673087. Based on that experience, chances are that fixing bug 1701828 would be much easier if we first implemented the change proposed here.


Maybe, but if so, users should still be able to submit a patch reimplementing the feature properly after these fixes land?


I still don't understand why someone would conflate whether something is frequently used with whether it is useful.

Ask yourself: how often do you use the emergency number, and how small is the percentage of the population that does in a day? Or even in a lifetime? In light of this, would you want it discontinued?


Right, but that’s a matter of life or death. Let’s have some perspective. Another comparison is perhaps a small back road that no one ever drives down. How many years do you keep maintaining it before you close it off?


The point of the example is that there are things that are useful even if they are not frequently used. You simply cannot equate use with usefulness.


Wouldn't you have to ask where exactly the back road leads to?


> It’s pretty unclear to me whether people use this feature much.

People only dealing with languages that use the Latin script probably don't use it at all. But, speaking from my own experience with Russian, this feature is useful. Now that Unicode is the default, text encoding problems are very rare, but they do sometimes still happen. Tools like this[1] were created out of necessity, not curiosity.

[1] https://www.artlebedev.ru/decoder/


There are multiple legacy encodings for text in Latin script - the only subset where encoding might not matter all that much is ISO 646, which is a subset of 7-bit US-ASCII. Even then EBCDIC, while relatively uncommon, is totally incompatible with even the ISO 646 subset.


> dealing with character encoding and supporting the feature is painful

This feature is in no way "painful". It's a legitimate facility for dealing with non-Unicode, legacy content, and AFAICT it's not clear that it can be reliably replaced, even by a plugin. Any argument that the menu entry is "confusing" for most users can be addressed by hiding it behind an advanced preference setting, and I think this was done already.


This feature should be replicable in a plugin: it just needs to inject an appropriate Content-Type header into a response (similarly to how user-agent extensions override the User-Agent headers in requests). A fancier plugin could even remember overrides per domain/path/site, making it even more useful than the original menu.

To be clear, no such extension appears to exist yet, but it should be a matter of time before one is created.
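
For the curious, here is a minimal sketch of what such an extension's background script might look like, assuming a Manifest V2 extension with the webRequest, webRequestBlocking, and host permissions (the `overrides` map, and how a popup would populate it, are hypothetical):

    // `browser` is Firefox's WebExtension namespace (types via webextension-polyfill).
    // Hypothetical per-host override table, filled in from the extension's UI.
    const overrides = new Map<string, string>(); // hostname -> charset label

    browser.webRequest.onHeadersReceived.addListener(
      (details) => {
        const charset = overrides.get(new URL(details.url).hostname);
        if (!charset || !details.responseHeaders) return {};
        // Force the charset parameter on the Content-Type response header.
        const headers = details.responseHeaders;
        const ct = headers.find((h) => h.name.toLowerCase() === "content-type");
        const mime = ct?.value?.split(";")[0].trim() ?? "text/html";
        const value = `${mime}; charset=${charset}`;
        if (ct) ct.value = value;
        else headers.push({ name: "Content-Type", value });
        return { responseHeaders: headers };
      },
      { urls: ["<all_urls>"], types: ["main_frame", "sub_frame"] },
      ["blocking", "responseHeaders"]
    );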


...and because it sounded easy, I went ahead and just did it: https://addons.mozilla.org/en-US/firefox/addon/override-text...

The hardest part was doing all the UI bits, because I had to learn how to make a UI that would look halfway decent in the popup menu. But hey, it works!


The reason why this seems easy while the built-in feature got in the way of other changes is that the built-in feature dealt with more issues. For example: https://hsivonen.com/test/moz/never-show-user-supplied-conte...


> Mozilla have limited resources and they (and Firefox users) want the browser to be successful and sustainable,

To begin with, nearly nobody uses Firefox today. Very successful, and sustainable.

I have to pour heavy sarcasm here. It's honestly warranted.

Firefox went from 50%+ to sub-1% under the current leadership.


> To begin with, nearly nobody uses Firefox today. Very successful, and sustainable.

215 million MAU is still a lot of people: https://data.firefox.com/dashboard/user-activity


I'm of a few different minds on this.

First off, what needs to be acknowledged is that any website that requires you to override the charset is broken and the greatest part of your ire should be directed at the people who write and operate such websites for expecting that everybody else should be responsible for cleaning up their messes. Let's also acknowledge that avoiding this breakage has been possible for (checks notes) 22 years.

Since this is very legacy stuff, at what point can we say that it is no longer the end consumer's job to deal with legacy cruft? The motivation for why a user-level override was ever necessary in the first place--the fact that once upon a time computers couldn't simultaneously represent content in different locales--is quite ancient history. I don't think an answer of "never" is compelling: why then aren't we complaining that browsers don't let us select EBCDIC or UTF-7 as an encoding (both of which were once supported, but have been dropped)?

At the same time, this change does somehow still feel premature; dropping the ability to override the charset encoding does feel very "WTF?" to me. But... if that's the case, then what can be done to hasten the moment where we live in a world where dropping this feature doesn't feel like that? (This makes me glad I work in compilers, where no one attacks you for deciding that some inputs are just too wrong to attempt to deal with.)


Unicode encoding was far from universally adopted in the early 2000s. Some legacy content will always exist, and the moment where dropping the feature is harmless is also the moment where there's no reason to drop it in the first place, since it only ever comes into play with broken legacy content; otherwise, it is entirely hidden and has no harmful effect whatsoever.


Unicode encoding isn't what I was referring to--it was the ability to specify <META http-equiv="Content-Type" content="text/html; charset=EUC-JP"> (copied from HTML 4.01 specification, dated December 24, 1999). That might not be the first time it was possible to declare the charset of an HTML document within HTML itself, but I don't see any mention of it in HTML 3.2.

Also, it's wrong to say that there's "no harmful effect whatsoever"--the ability to do charset overrides requires code to support it, and that code could force awkward compromises in (say) your HTTP layer. All features have costs, even if they're invisible.


I also would like to point out in the context of the issue of “legacy” content that not all web content is some sort of single-page application that gets maintained. Of course, we don’t expect software to remain compatible with other software forever, but not all websites should be considered software. Consider a video file, for example. We have the expectation that a video file, which is static content with no scripting, will continue to play forever in future media players, or at least that you will be asked to download some obscure codec pack if it doesn’t play by default. I think that a static website with no scripting and just text is the same. A website with just text should continue to work. I think this is pretty different from something like Flash or ActiveX or even cookies, where we are no longer talking about static content.


Taking your video example, it also just shows that browsers are different: there's tons of video content out there on the web that browsers can no longer play, since it relied on plugins and plugins are dead (RealVideo, WMV, MPEG-1/2/4, etc.). As always, ffmpeg is a treasure, though.

There's a whole world of patents that make video a particularly problematic space for browsers, but just the basic philosophy of continuing to work with old static content forever isn't that strongly held (and really, isn't that strongly held in much of software: consider opening really old Word or WordPerfect documents just as an example).


> this change does somehow still feel premature

Especially in this heated sort of discussion, I think we need to know more than 'feels premature' or overripe or whatever. What about the data they have? The developer's research is linked in this discussion somewhere.

And of course we are deep in a bubble. Almost no end user knows what character encoding is, and few have any hope of fixing the problem manually. In fact, calling the menu item 'Repair Character Encoding' (or whatever they chose) is probably poor UI - you need something that end users will understand, more like 'Repair Gibberish Text'.


The telemetry-driven desire to redesign everything and remove lots of 'infrequently used' features is a real plague on software, and it really sucks that it seems to have infested the web because of the number of novelty-obsessed designers, PMs, and engineers who move from one company to another. Lots of Chrome, Firefox, and Safari devs are people who previously worked on a different browser or at a company with this culture, and eventually your team has enough people like that to push through things like 'remove this important low-cost feature because only 1% of our billion users are impacted', as if 1% of a billion isn't a significant number of people.

It used to be that even minor web platform changes wouldn't go through if they would break 0.1% of webpages, but we don't live in that world anymore.


> The telemetry-driven desire to redesign everything and remove lots of 'infrequently used' features is a real plague on software

I disagree. Maintaining features requires engineering overhead. It sucks to have to maintain features that barely anybody uses when you have a backlog of features that are in demand by a large number of your users.


Right, but not all features need to be used often to be useful. This is a great example of such a feature.


That is the core problem with product management: Presumably every feature is useful to someone, somewhere, but you don't have infinite engineering resources to ship and maintain them all.

No one is suggesting the feature isn't ever useful. The issue is that they've reached a point where they have to choose what stays and what goes, and the least frequently used features are obviously at the top of the list for what to cut.

You can't make everyone happy in these scenarios, just as you can't deliver everything.


To make a concrete argument here: if Mozilla has the sweng resources to devote to their frequent frontend redesigns, they can keep the encoding menu working. And the encoding menu has a significant, difficult-to-replace function to serve for people from non-English-speaking nations who want to consume content in their native language. For reference, I worked on Firefox.

Removing something like the encoding menu is a choice, not a necessity.


Right, but I'm challenging the notion that use means usefulness. I don't think that can be assumed to correlate without further justification, especially given the ease of producing counterexamples and scenarios where this is not true.

Some features are used a lot because some other features are inefficient. Sometimes a feature is very useful in some rare situations (compare with seatbelts).


> but we don't live in that world anymore.

And I'm glad. I'm sick of using software that has 2 trillion toolbars, menus, and settings. I want something which sets the defaults to pretty much what I want and exposes as settings the common things I might need to change. Telemetry-driven design lets designers build what the majority like, rather than what the most vocal, bug-tracker-using power users want, which is exactly the opposite of what the average user wants.


Must be nice to always live in the middle of the bell curve.


The defaults aren’t always exactly what I want, but I find it preferable to just accept them as they are rather than to have a billion customisation options. I pick the best bit of software that works for me and then just change my workflow to fit it.


Just another casualty of telemetry.

I'm sure it doesn't help that Chrome killed its own encoding menu years ago.


> Just another casualty of telemetry.

Sure, but you're missing the bigger picture, which is that each feature has all sorts of other costs (maintenance, security risk, codebase complexity, etc). As mentioned in the bug, this feature got in the way of implementing another feature.

Telemetry helps guide the balance between those costs and the value the feature brings, and you can disagree with how that balance was struck in each decision, but it is still a fact that codebase health is important, and if you don't prune, you'll very quickly end up with a huge messy project, no matter how careful you are along the way.

Of course, the other option would've been to rewrite/refactor the feature, but again, is an engineer's time better spent working on something that helps 50 people or 50K people?


Exactly. With infinite engineering resources, you can afford to keep every feature around and maintained forever.

But nobody has infinite engineering resources. Maintaining everything comes at a cost, and eventually you have to make some hard decisions about which features to abandon so you can reallocate engineers to features people actually use.

Frustrating for the small number of people who use that specific feature, but you can't please everyone.


> But nobody has infinite engineering resources. Maintaining everything comes at a cost

Irrelevant in this case. There are far more effective ways of paying down technical debt and enhancing future maintainability than removing a small, self-contained feature that occasionally addresses a critical user need. This is Firefox we're talking about; they've got a humongous legacy codebase to play with.


They still have the feature, but not the menu.

I bet the maintenance burden of the menu is extremely low.


I would be prepared to bet that most people that use the more advanced functionality disable telemetry.


And then they wonder why the developers think nobody uses said niche feature


Should I have to sit around defensively pressing buttons I like to keep around but don't have use for every day just to be sure the devs don't think they are redundant? Seems silly.

Alright 15 minutes have passed, here we go again. Back, reload, view page source, bookmarks,...


Really, at this point users should band together to steer Firefox back into better graces. All that’s needed is a bunch of servers running Firefox with something like Playwright or Selenium to juke the stats on features, since the marketers and developers are ONLY going to make decisions based off that data.


No, you should use it as you normally do, as will others, and then they'll know what is actually used.


And then the vendor wonders why privacy conscious users don't use their browser (a core demographic, judging by their product line and social media.)


Given that it's known that power users disable telemetry, developers should not rely on telemetry alone when deciding to remove a feature. The opposite is spiting power users, and you can see how that's going for Mozilla.


My feeling is that telemetry is used in most places as a gut check, to ensure that what you are doing isn't a terrible idea. It isn't necessarily going to drive you to invest further into something (you generally have better ways of getting that kind of feedback).


> developers should not rely on telemetry alone when deciding to remove a feature

What better data should they use?

> you can see how that's going for Mozilla

HN commenters (not necessarily readers) react this way to most changes in everything. Mozilla doesn't stand out IME.


> What better data should they use?

For one, they could take comments on their bug tracker to heart instead of locking the discussion as soon as there are enough users unhappy with a change.


They should just remove Firefox since telemetry shows not a lot of people use it.


Their telemetry is coming from Firefox installs. Kind of impossible to escape from that cave.


Chrome's position was "use an extension for this", but then users have to risk privacy by installing a third-party extension. What harm does having the option tucked somewhere in advanced settings do?


Every release they probably have to test that it still works.

Not that I would ever justify removing something like this - it's clear that for a percentage of the userbase, this is necessary for them to browse. But I can see a product manager trying to cut "fat" if they see that 0.01% of people use it.


Isn't this issue fixed by websites fixing their encoding? If no browser is able to properly display the content, I imagine they will be fixed pretty quickly if they are at all maintained.


Yes, but, they’re not. Huge swaths of the web are effectively abandoned, and will not get fixed.


Then the best option would be to copy the useful content out to new, maintained sites and let the old ones die, rather than keeping browsers tied to supporting long-dead features. UTF-8 has been the standard for over a decade now. I think there has been plenty of migration time. More than Flash had.


A while back, Firefox removed the ability to override the text encoding of pages. If Firefox gets the encoding wrong, you are screwed.

The thing is, according to the developers themselves, detecting various single-byte Latin encodings is very unreliable, and they have indicated that making Firefox detect one encoding more reliably nearly always means introducing failures at detecting another one, so there will be no progress in this area. Despite this, they have seen it fit to remove the encoding menu, because telemetry shows that most users don’t use it, and because when they do, it often takes them multiple tries to guess the correct encoding. Their solution? To replace it with a “guess again” button, completely removing the ability to choose manually.

Henri, one of the developers responsible for this change, has argued that as long as both encodings are Latin-script encodings with a common ASCII subset, it is not catastrophic for the user to be stuck with the wrong encoding, because the text is likely “still legible”. To give an example of what he is calling an “acceptable” level of mojibake, consider this text from the Polish Wikipedia, encoded as Latin-2 and decoded as Latin-1.

> Hamidiye turecki kr±¿ownik pancernopok³adowy z pocz±tku XX wieku, wodowany w 1903 roku, zbudowany w brytyjskiej stoczni Armstronga. Wyporno¶æ normalna okrêtu wynosi³a 3904 tony, a d³ugo¶æ siêga³a 112 metrów. Napêd stanowi³y maszyny parowe o mocy 12 000 KM, pozwalaj±ce na osi±ganie maksymalnej prêdko¶ci 22 wêz³y. Artyleria g³ówna sk³ada³a siê z dwóch pojedynczych dzia³ kalibru 152 mm i o¶miu dzia³ kalibru 120 mm. S³u¿y³ w marynarce Imperium Osmañskiego podczas wojen ba³kañskich oraz I wojny ¶wiatowej, a nastêpnie w marynarce Republiki Turcji do 1947 roku.
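
(To reproduce this kind of mismatch yourself, here is a minimal sketch using the standard TextDecoder API; the bytes are the ISO-8859-2 encoding of “krążownik” from the quote above:)

    // "krążownik" encoded as ISO-8859-2 bytes (ą = 0xB1, ż = 0xBF).
    const bytes = new Uint8Array([0x6b, 0x72, 0xb1, 0xbf, 0x6f, 0x77, 0x6e, 0x69, 0x6b]);

    new TextDecoder("iso-8859-2").decode(bytes); // "krążownik" - correct
    new TextDecoder("iso-8859-1").decode(bytes); // "kr±¿ownik" - the mojibake above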

Personally, I think this change is ridiculous. It has been a few versions already since it was rolled out, but people are still complaining in the issue tracker. I know one Russian guy who has resorted to using an extension that replaces arbitrary strings to correct common mojibake sequences, to deal with the regression in functionality brought about by this change.

If you feel strongly about this like I do, I strongly encourage you to comment in the Bugzilla thread. Yes, the web should be using Unicode these days, but if it isn’t, that is not your fault as a user, and making the experience miserable for the end user is not justifiable.


> Hamidiye turecki kr±¿ownik pancernopok³adowy z pocz±tku XX wieku, wodowany w 1903 roku, zbudowany w brytyjskiej stoczni Armstronga. Wyporno¶æ normalna okrêtu wynosi³a 3904 tony, a d³ugo¶æ siêga³a 112 metrów. Napêd stanowi³y maszyny parowe o mocy 12 000 KM, pozwalaj±ce na osi±ganie maksymalnej prêdko¶ci 22 wêz³y. Artyleria g³ówna sk³ada³a siê z dwóch pojedynczych dzia³ kalibru 152 mm i o¶miu dzia³ kalibru 120 mm. S³u¿y³ w marynarce Imperium Osmañskiego podczas wojen ba³kañskich oraz I wojny ¶wiatowej, a nastêpnie w marynarce Republiki Turcji do 1947 roku.

As a Polish person who has been seeing this sort of mis-encoded Polish text for over two decades now, my gut instinct is to immediately reach for the encoding menu. That menu is gone now.

We live in the era of almost omnipresent UTF-8, but it simply feels wrong to remove backwards compatibility with older documents on the Polish web that are mis-encoded like that - and there are still some of them out there.


Would it be impossible to put this in an extension?

What is the compatibility worth if people have to manually figure out how to switch it on?


I made one after seeing this discussion: https://addons.mozilla.org/en-US/firefox/addon/override-text...

Feedback & bug reports welcome!


Yep, as someone from the Balkans with old sites and ISO-8859-2 vs. cp1250, I used this feature sometimes too... well, I guess not anymore.


I wonder how possible it would be to make an extension


It turned out to be pretty easy; I hacked one up in a few hours and got it published on AMO: https://addons.mozilla.org/en-US/firefox/addon/override-text...

It's open-sourced on GitHub in case you're curious. It just overrides the Content-Type header on pages where you've turned it on, setting the `charset` parameter to whatever you select.


After the submenu was replaced with a single item in August, how many times have you 1) encountered Polish mojibake and 2) the single item wasn't able to fix it?


I seldom use this feature, but when I do and I can't find the encoding menu, it is a very frustrating experience. Their logic seems to be: dialing emergency services is a rarely used feature according to telemetry, so we will just remove it. No discussion is allowed; if you challenge them, they ask whether you have data about the use case of dialing emergency services. People can still send text to emergency services, so it is fine.

And because of this post, I just checked whether Chromium supports an encoding menu. Yep, they removed it long ago. Basically, when you need to read some very old Japanese site whose encoding is pretty non-standard, you are screwed.


Per https://hsivonen.fi/encoding-telemetry/, it looks like they run a special encoding detector on .jp sites, which is presumably designed specifically to choose between the various Japanese encodings (Shift-JIS, EUC-JP, etc).

I think your example of emergency services is a bit hyperbolic. This is a feature that is really not often used, whose omission is not fatal, which often requires several tries to get right, and which is increasingly less useful thanks to gradual changes on the Web. Much more widely-used features like FTP and Flash have been deprecated; people howl and yell every time, but yet things still seem to work.


Of course, it is not about the fatality. It is an example to point out that the fact that a feature is rarely used does not mean it is not important.

Another example is Google Search removing the ability to perform an exact text search, such as "some phrase I found important to search". Maybe, based on statistics, 20% of users were using that and 80% of users did not rely on it. The logic says messing up 20% of users is not a problem; we are serving the other 80% of users pretty great.

The current paradigm of UX design, providing a "good default" to serve 80% of users while removing the customization relied on by the other 20% of advanced users, makes me feel pretty helpless when using modern apps. That's why I prefer CLIs nowadays, for their flexibility.


So now you have to rely on hardcoded heuristics based on the website’s domain name instead of just being able to choose. There are so many complicated heuristics for choosing this hard-to-predict factor, and it still gets it wrong often. Removing the user’s control over this seems like a dreadful move.

> This is a feature that is really not often used, whose omission is not fatal, which often requires several tries to get right, and which is increasingly less useful thanks to gradual changes on the Web.

How often do you visit small indie websites in non-English languages? Because in my experience, as soon as you do, this is not a rare occurrence.


It breaks things without any possibility of repair for a certain group of users. How is this not an issue?


> Basically, when you need to read some very old Japanese site where their encoding is pretty non standard, you are screwed.

As it happens, the primary reason why the single menu item remains instead of an override being totally gone is usage in Japan. (Many people in this thread comment as if Firefox had done what Chrome did: complete removal of override as you note.) The remaining menu item is pretty accurate for its primary use case, which is dealing with Japanese legacy sites that misdeclare their encoding. (There is one exception: If the page declares UTF-8 but is actually ISO-2022-JP, then you don't have recourse.)


> when you need to read some very old Japanese site where their encoding is pretty non standard, you are screwed

If you need to read them frequently, yes. Otherwise, if it is just a one-off, maybe the user can download the page source, extract the text content, paste it into a text editor, and select the encoding? Do you have an example CJK site I may try?


I rarely need to do that, so I cannot find an example at the moment. It is usually a one-off thing, so yes, I could do the conversion manually, but it is obviously a usability degradation.


It seems he did extensive background research to investigate and find an optimal solution for this issue: https://hsivonen.fi/chardetng/

The goal was apparently to bring FF to feature parity with Chrome. Sad to see it's not working as well as intended. (Even sadder to see him being attacked by some commenters below.)


The problem is that it's impossible to guess correctly 100% of the time. You can improve the guessing algorithm as much as you want to (not that I'm saying it's useless - it's very good for regular users who only rarely encounter such issues), but you can never achieve total accuracy, which is why some sort of manual override is absolutely necessary in cases when it fails.


I agree that good detection is important and I am glad that he is working on it, but I don’t think that the level of detection which is possible is a good replacement for the menu, based on what he himself has indicated he thinks is possible to achieve in Firefox.


Extensive maybe, correct apparently not so much.


Too late, code merged. Bugzilla locked. You, the user, do not really matter to Firefox anymore.

If the encoding is broken, it cannot be fixed by Henri Sivonen's auto-detection code, but he can claim he "simplified the menus" as a bullet point to project managers.

This is the gradual but inevitable backslide of Firefox into Chrome.


Chrome has marketshare.


Simplifying a submenu by replacing it with a quick option that works in most cases is a great improvement, but removing the underlying feature that lets users select the text encoding manually is bad. There must be a good way to keep the menu bar simple while keeping the option.

Firefox already shows page-specific info on the left side of the address bar. There's usually a shield icon and a lock icon, and permission requests appear there too. Why not add an "encoding icon" that shows whenever the encoding repair feature is active? Such a marker would also be a great place to put an "override with custom encoding" menu.


The quick option was already there though - it was labeled "automatic". The only thing he did was remove all the other options.


Good luck, but I am not hopeful that Firefox's behavior will be corrected. See also this 13-year-old pearl courtesy of Bugzilla:

"Set screen coordinates during HTML5 drag event"

> The current HTML5 spec describes that all DragEvent properties should be available during all the events - according to editor Ian Hickson.

>> Note though that it doesn't specify what the properties should be set to, just that they should be set and we currently set them to 0.

https://bugzilla.mozilla.org/show_bug.cgi?id=505521


Seems like they restricted the comments, because, you know, head in the sand is the standard Mozilla approach.


Or, maybe, hear me out here: deliberately posting a link to an issue to a widely-visited community with an inflammatory title, with the explicit intent of dogpiling the maintainer led to them locking the thread? You know, something like what HN itself does to prevent vote manipulation?


Your example is not an example of an actual Firefox failure mode: if I encode your text as ISO-8859-1 and run chardetng on the bytes, it says ISO-8859-2. It’s generally unlikely that chardetng would misdetect ISO-8859-2 Polish as windows-1252.

When you claim that something isn’t working for you (and it’s completely plausible that you have encountered a case where chardetng doesn’t work for you), it would be polite to post the actual failure and not something that’s not an actual example of failure.


[flagged]


This person makes a decision supported by reasonable evidence and argumentation, confirmed through discussion with other members of the organization and wider community, and shipped after a thorough review (all happening in the open on a bug tracker!); and you're calling for them to be fired?

I sure hope you don't make any mistakes at your company!


> all happening in the open on a bug tracker!

I find it suspicious that there was not a single comment from a user before the change shipped, but immediately after, there were various complaints. That's not the first time I've seen this pattern in a bug.

This looks more like the "Hitchhiker's Guide to the Galaxy" definition of "open" - where the discussion is technically public, but you heavily rely on the fact that no one knows it exists.


That's a good point. The removal of such user-critical features should at the very least be announced well in advance via a clear deprecation notice, to allow for the collection of relevant feedback.


> user-critical features

They didn't remove JavaScript or the tabs. Very few users will know it's gone - I wouldn't have noticed.


If the page becomes unreadable I would say that it's critical.


How could Mozilla publicize every change they make? And then discuss them all?


By submitting every new Bugzilla ticket as a new HN post, of course!


Do other browsers offer character encoding menus or are they from improper companies? (Or were the developers fired but their changes kept?)


He "simplified the menus", remaining broken old web pages be damned. Mozilla's goal is to be as user hostile as Chrome.


Seems like a clickbait title. I have seen this issue a few times over the years, and it was always a game of "Which encoding should I choose?"

Having an automated way to solve this seems pretty good to me. Especially if the solution works pretty much 100% of the time (especially for non-ASCII languages, like Japanese or Chinese) - see the paragraph "Document-length Accuracy" in https://hsivonen.fi/chardetng/


East Asian languages are not the main concern here, actually. It’s the various single-byte Latin script encodings, which are incredibly similar. The issue of these being confused is addressed in the thread itself, where he says that such issues will not be fixed, because they are so confusable that fixing the detection of one means breaking the detection of another.


I view it as Firefox fixing incorrectly written pages. It's nice that FF has a "fix-it" option, but ultimately, the author should fix her site. The single-byte Latin scripts are semi-legible, so that is enough for me.

When a website breaks because third-party cookies go the way of the dodo, or because the page relies on ActiveX, should Firefox provide an option to run it?

I understand that some people have a different opinion, but any page past 2005 should have its encoding in order, and the owners of older sites should fix them.


“Should” is not “is”. As a user, I run into broken sites regularly enough. I still want to be able to see them. I get the “let’s bury all broken legacy and hacks as quickly as possible” mindset, but when it comes to the issue of the main text of a page being unreadable, I think there is a high bar.

> older sites should fix their sites.

A lot of the reason they are considered “older sites” in the first place is because they are not heavily maintained. The thing to remember about the web is this: not all websites are “software” that is maintained. Some of them are static works similar to a movie, with static content.

> The single-byte Latin scripts are semi legible, so that is enough for me.

Did you see the example that I gave of Polish text? I am sorry, but if you feel that reading multiple paragraphs this way is bearable, then we will just be stuck in disagreement.


I guess we are stuck in disagreement.

I did see the text. If it bothered me that much, I would write a proxy that passed the HTML through the iconv tool before serving it to the user (with a dictionary to remember the setting for each site).
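
A rough sketch of what I mean, assuming Node 18+ (the per-site charset table is hypothetical, and TextDecoder plays the role of iconv here):

    import * as http from "node:http";

    // Hypothetical dictionary of legacy charsets, one entry per misbehaving site.
    const siteCharsets: Record<string, string> = {
      "example.pl": "iso-8859-2",
    };

    http.createServer(async (req, res) => {
      // Usage: http://localhost:8080/http://example.pl/page.html
      const target = new URL(req.url!.slice(1));
      const upstream = await fetch(target);
      const bytes = await upstream.arrayBuffer();
      const charset = siteCharsets[target.hostname] ?? "utf-8";
      // Decode with the remembered legacy charset, then re-serve as UTF-8.
      const text = new TextDecoder(charset).decode(bytes);
      res.setHeader("Content-Type", "text/html; charset=utf-8");
      res.end(text);
    }).listen(8080);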

But more likely, I would stop visiting such a site or complain to the owner. The situation doesn't get better by covering up errors.


As a matter of usage and severity of failure mode, East Asian languages are very much the main concern. Japanese specifically is the primary motivator for keeping the feature in any form.

If there's more elaborate UI surface in order to address Latin-script cases where autodetection fails (I estimate 1% failure rate for Polish), it risks getting the whole thing removed completely even for Japanese users (as happened with the hamburger menu entry point in Firefox when the hamburger menu was rewritten, as happened in Chrome previously, and as happened with Safari for iOS in the sense of the menu never making it to iOS).


The old menu did have an automated way to solve this. I'm not sure if it goes all the way back to Firefox 1.0, but it's pretty old. (At least Waterfox Classic exists, and you can install it and see the menu there.) That it wasn't very good (as detailed in the post) suggests replacing it with a better version, but this goes beyond that in removing the ability to pick something specifically when the automatic detection is wrong.


This decision is offensively user-hostile. It breaks a critical feature for dealing with broken and legacy webpages (which, for better or worse, are never going to stop existing) without a good justification. Even for English-speaking users this sucks; there's lots of mailing list archives and the like that mishandle their encoding.


All the discussion in this thread and in the Bugzilla thread feels very odd to me. Like arguing that it's not much used and is a maintenance burden... when it was already hidden off in a submenu!

I'm, of course, glad that the guesser can work more effectively. But isn't this a bit different from something like FTP (which can introduce nasty security bugs)? And you still end up with "change encoding of page" functionality via the guesser, so it's not like you are removing some lower-level capability.

It's unfortunate because, yeah, people mention Flash getting removed in a similar tone, but at least with Flash there are things like Ruffle that can pick up the slack. I don't really want to have to install a plugin for the couple of times I land on a page that misses this encoding, but when I do hit such a page I'll be frustrated.

Of course the original sin here was browsers not throwing warnings or something on every page without the encoding set, and having users become the people who complain to webmasters about it like... 15 years ago or whatever. But meanwhile we had a really _fine_ place for the encoding picker. Maybe first click is "Guess", second reveals the menu... or something.

I don't want to hold too much malice towards the devs here, cuz getting yelled at by all the internet sucks. But I hope there can be some resolution here that doesn't lead to massive burnout. I will be very happy if some backup solution is added (even if it's behind a conf flag), and I would see this as an overall positive for FF if that happens (listening to feedback!)


This headline is unfairly editorialized. What the Firefox dev said was that the automatic encoding detector confusing some ISO/IEC 8859 encodings for each other is an intractable problem, and that fixing one Latin text encoding being auto-detected as a different Latin text encoding is a very low-priority use of time. Also, removing the encoding override is completely separate from that, since it's (apparently) a very complicated feature to keep around that 1) breaks webpages (e.g. etrade) and 2) keeps confusing users and leading them to think Firefox is worse at text encoding than Chrome.

Regardless of your opinion of the change in question, these two statements are completely different from what the headline implies.


He straight-up said that it's not as much of an issue because the text is still legible, since not all of the letters are borked. I don't see how this is a misrepresentation of what he said.

> Also, removing the encoding override is completely separate from that,

It is not separate at all. If the detector cannot do it, and the ability for the user to do it is removed, these issues are intrinsically linked. Sure, there are the other issues you mentioned around keeping it around, but you can’t act like these issues can be considered separately.


The etrade breakage was a failure of the previous detector, which expected an encoding declaration within the first few KB of the page and introduced a page reload otherwise. Switching the current encoding manually does not reload the page per se.


> This headline is unfairly editorialized.

I'm not sure why it doesn't follow HN guidelines to use the page title:

"Replace the Text Encoding menu with a single override item Repair Text Encoding"


To people thinking that this was removed gratuitously: please read the bug and the linked bugs carefully.

This got in the way of implementing various improvements to <meta charset> and interop improvements to XHR.


Encoding privilege is not something I expected to be enforced by Firefox. I guess I'm lucky not to be hindered by this, because I can't read any language that uses non-English characters to the point where I can't guess what they would be.


The title, by HN guidelines, is "Replace the Text Encoding menu with a single override item Repair Text Encoding"


In addition to the concrete case at hand, this is another instance where explicit user abilities are taken away and replaced by an algorithm.

I really hope the tech industry will see the light at some point and stop this trend.


After all the heated discussion here, I decided to make this into a plugin: https://addons.mozilla.org/en-US/firefox/addon/override-text...

It's pretty bare-bones and probably has some bugs (after all, I hacked it up in a few hours...) but it does what it says on the tin: it lets you set the text encoding for any webpage manually, just like the good ol' Encoding menu.


I hugely appreciate this! I will definitely install it.


I like this change. If a website misconfigures its encoding and users have an encoding override menu, the result is that 99% of users see a broken website while 1% know how to fix it. The menu doesn't meaningfully improve the situation, but it does keep the people who could improve it from caring. Take that away, and I predict 10%+ of currently misconfigured websites will get fixed for 100% of visitors, and that's a big improvement over 100% of websites being fixed for 1% of visitors.


So you think that making things miserable for the remaining 1% is good because it gets them to care, and therefore for the website to be fixed? I’m sorry, but most of the time when a website has a borked encoding, it is some website where I doubt I would ever get in contact with the webmaster anyway. It’s mostly small websites. But small websites are important to maintaining a healthy web where things are not dominated by a few big players.

But perhaps most importantly, I have never agreed with any mindset that encourages punishing some subset of users in order to get someone to change their ways, even more so if that isn’t even the users themselves.


> I’m sorry, but most of the time when a website has a borked encoding, it is some website where I doubt I would ever get in contact with the webmaster anyway.

So you wouldn't bother to try to get the webmaster to fix the issue, you would just resolve it on your own, letting everyone else be miserable?

> But perhaps most importantly, I have never agreed with any mindset that encourages punishing some subset of users in order to get someone to change their ways, even more so if that isn’t even the users themselves.

It sounds like you don't care if people are punished, as long as it isn't you.


> So you wouldn't bother to try to get the webmaster to fix the issue, you would just resolve it on your own, letting everyone else be miserable?

I mean that I would not be able to. Huge swaths of the web are static content that was just put there and then abandoned. The web will remain this way unless everything is moved onto SPA social networks, and I hope that we agree that that would not be a good thing.

> It sounds like you don't care if people are punished, as long as it isn't you.

If the added benefit of causing people more inconvenience is that they push for problems to be fixed, I don’t think it’s fair to cause that inconvenience with the intent of using that to solve the problem. I think it’s a manipulative tactic. That is my idea of what “punishment” we’re talking about. What kind of punishment are you thinking of if you think I’m fine with punishment, and who is getting punished by it?


> If the added benefit of causing people more inconvenience is that they push for problems to be fixed, I don’t think it’s fair to cause that inconvenience with the intent of using that to solve the problem. I think it’s a manipulative tactic. That is my idea of what “punishment” we’re talking about.

We are talking about the same punishment.

If fixing the pages would reduce the number of times people have to manually fix page encodings locally, just working around the problem leaves the punishment in place for more people.

You express that you are being manipulated into prompting webmasters to solve the underlying problem, even though many more people would benefit from not having to manually work around it.


The thing is that I do not agree with you that the ends justify the means. You seem to think that people seeing fewer pages in the wrong encoding in the long run justifies making the problem worse when it does happen. I do not.

And again, I will reiterate that this does not solve that problem anyway because vast swaths of the web are static content that was put there and never touched again, so no one will fix it. Unless we want everything to live on content farms, it is essential to be able to access content like this. You are viewing the web as something that is actively maintained, similar to a software project. That is true for a certain category of website, such as the one that we are on right now. It is not true for the kind of website which is most likely to have borked text encoding.


Probably much less than 1% of English speakers know how to select a text encoding in a browser, but for speakers of languages where multiple non-UTF encodings were common until recently, this knowledge is more widespread.

My anecdata: once, having only a smartphone with me, I needed information from a .txt file in cp1251 encoding. The server was configured correctly for .html, but not for .txt. The mobile browsers available on the smartphone didn't allow me to override the text encoding. It was a very frustrating experience, and it would have been enough for me to switch to another browser in the future, except I don't know of a mobile browser which allows selecting the text encoding and isn't much worse than mobile Firefox. And I don't blame the maintainer of that site - he does (or did) this in his spare time and had more important tasks than checking text encodings for every single URL on the site so that users of crippled web browsers would have no problems.

I like the comparison of this feature to emergency services - you rarely need it, but when you need it, you need it very much.


I use it a few times a year to fix websites sending ISO-8859-2 (or Windows-1250), and the guess mode has worked. As a user affected by this change, it is an improvement (less need to click around trying 5 different encodings).


I am definitely not against the addition of the guesser. It was added a bit before the encoding menu was removed, though, so there was a time when they were both there. I mainly wanted to draw attention to the removal of the menu.


There are still documents on the internet that use ASCII encoding, and while English-speaking users don't have this problem, in some countries several different encoding systems were used in the past. While most believe that web content is resilient and that we will still be able to browse old websites forever, recent web browser developments show otherwise. Please do not remove working features, especially if they have no effect on security or privacy.


> Please do not remove working features, especially if they have no effect on security or privacy.

As it happens, part of what made the menu back end cause other changes to take more effort than they should have taken on their face was the security aspects of the menu in Firefox.

Try https://hsivonen.com/test/moz/never-show-user-supplied-conte... in Firefox 90 (which had the menu) and in another browser that has a menu or an extension that adds a menu.


> There are still documents on the internet that use ASCII encoding

Probably a great many, because ASCII is a subset of UTF-8, and AFAICT it is common for English when the site doesn't use more sophisticated typography such as curly quotes, em dashes, etc.


Could the menu be restored by an add-on?





