Hit the link expecting to read about UTF-8 Byte Order Marks at the top of the fi...

SAI_Peregrinus · 2025-11-20T16:48:05 1763657285

UTF-8 doesn't have a BOM. UTF-16 does. UTF-32 does. "UTF-8 with BOM" is not a standards-compliant text format, it's a proprietary binary format that happens to have a bunch of embedded UTF-8. Just because you can run `strings` on a file & get a bunch of text out doesn't mean it's a text file!

jborean93 · 2025-11-20T22:37:29 1763678249

This seems a bit pedantic. You may be correct (I honestly don't know which standard this refers to), but the UTF-8 BOM is a thing that some tools do know about. Even then, in the context of OP's question, the BOM with UTF-8 isn't the specific problem; the problem is how the shebang interpreter reads the actual ASCII byte sequences, so a UTF-16 "text" file with a BOM would also fail.

cm2187 · 2025-11-20T06:17:32 1763619452

tbh it is lame for any program reading a text file to not support BOM. It's just one if.
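For illustration, the "one if" could look roughly like the Python sketch below. The function name `strip_utf8_bom` is made up for this example; `codecs.BOM_UTF8` is the standard-library constant for the bytes EF BB BF.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 signature (EF BB BF), if present."""
    # The single check: drop the three-byte prefix when the input starts with it.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

When decoding to `str`, Python's built-in `utf-8-sig` codec does the same thing: it skips the signature if it is there and otherwise behaves like `utf-8`.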

theblazehen · 2025-11-20T06:53:52 1763621632

There isn't really any one "text file" format, though. The kernel checks whether the first two bytes of the file match the ASCII encoding of "#!".

https://www.youtube.com/watch?v=J8nblo6BawU is some great watching on how "Plain text isn't that simple"
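A rough model of the two-byte check described above, as a Python sketch. This is a simplification for illustration, not the kernel's actual code, and `looks_like_script` is a made-up name. It shows why any BOM in front of the shebang makes the match fail, whether the file is UTF-8 or UTF-16.

```python
import codecs


def looks_like_script(head: bytes) -> bool:
    """Simplified model: the file must begin with the ASCII bytes for '#!'."""
    return head[:2] == b"#!"


plain = b"#!/bin/sh\necho hi\n"
utf8_with_bom = codecs.BOM_UTF8 + plain
utf16_with_bom = codecs.BOM_UTF16_LE + "#!/bin/sh\necho hi\n".encode("utf-16-le")

for name, data in [("plain", plain), ("utf8+bom", utf8_with_bom), ("utf16+bom", utf16_with_bom)]:
    # Only the file whose first two bytes are 0x23 0x21 passes the check.
    print(name, looks_like_script(data))
```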

SAI_Peregrinus · 2025-11-20T16:49:28 1763657368

UTF-8 is a text format with no BOM. Just like ASCII doesn't support a BOM. The BOM is a UTF-16 or UTF-32 thing, so "UTF-8 with BOM" is a binary file that happens to contain some UTF-8 strings as well. Since it's not a text file, it makes sense that utilities expecting text files don't handle it.

masfuerte · 2025-11-20T17:24:59 1763659499

Eh? A utf8 file starting with ZERO WIDTH NO-BREAK SPACE is not a text file? How do you figure that?

SAI_Peregrinus · 2025-11-22T22:30:33 1763850633

If it starts with 0xFE 0xFF, but is otherwise UTF-8 instead of UTF-16, it's a binary file. If it starts with 0xEF 0xBB 0xBF, it's a text file with a ZERO WIDTH NO-BREAK SPACE at the start.
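The distinction can be checked with a short Python sketch (the helper name `decodes_as_utf8` is hypothetical): EF BB BF is the valid UTF-8 encoding of U+FEFF, while 0xFE and 0xFF are bytes that never occur in well-formed UTF-8.

```python
def decodes_as_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# EF BB BF decodes to the single code point U+FEFF (ZERO WIDTH NO-BREAK SPACE).
assert b"\xef\xbb\xbf".decode("utf-8") == "\ufeff"

# A file that starts with the UTF-16BE BOM bytes is not valid UTF-8 at all.
assert not decodes_as_utf8(b"\xfe\xffhello")

# A file that starts with EF BB BF is still valid UTF-8.
assert decodes_as_utf8(b"\xef\xbb\xbfhello")
```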

masfuerte · 2025-11-23T21:57:33 1763935053

> If it starts with 0xFE 0xFF, but is otherwise UTF-8 instead of UTF-16, it's a binary file

Sure, but who does this? All the Microsoft tooling writes 0xEF 0xBB 0xBF if you output utf8 with a BOM.