Many things make sense to me, but as we can all guess, this will never become a thing :(
But the "magic number" thing is, to me, a waste of space. If this standard is accepted, then if there is no magic number you have corrected UTF-8.
As for \r\n, not a big deal to me. I would like to see it forbidden, if only to force Microsoft to use \n like UN*X and Apple. I still have to deal with \r\n showing up in files every so often.
"If this standard is accepted, then if there is no magic number you have corrected UTF-8."
That's true only if "corrected UTF-8" is accepted and existing UTF-8 becomes obsolete. That can't happen. There's too much existing UTF-8 text that will never be translated to a newer standard.
You do realize that it's the UNIX people who are the strange ones here? CRLF has been used as the line delimiter by everyone (except IBM, who always lived in their own special EBCDIC land) since the late sixties, but then Thompson decided that he'd rather do LF-to-CRLF translation in the kernel tty driver than store the text on the disk as-is, like literally every other OS did (and continued to do).
Besides, terminal emulators nowadays speak UTF-8 natively, and they absolutely do behave differently for naked LF and CRLF; you can see it for yourself if you run "stty -onlcr" and then try to echo or cat some stuff. Sure, you can try to persuade every single terminal emulator's author to adopt "automatic carriage return", but most will refuse; and you will also need to somehow persuade people to stop emitting the CR+LF combination in raw mode... but then you'll need to give them back the old LF functionality (go down one line, scroll if necessary) somehow. Now, such functionality exists as the IND character, which is in the now forbidden C1 block. Simply amazing!
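To make the naked-LF versus CRLF difference concrete, here is a toy model in Python. It is only a sketch, not how any real terminal emulator is written. The names `onlcr` and `render` are made up for illustration: `onlcr` mimics what the tty driver's ONLCR output flag does (every LF becomes CR LF), and `render` assumes CR means "go to column 0" and LF means "go down one line, same column".

```python
def onlcr(data: str) -> str:
    """Mimic the ONLCR output flag: map every LF to CR LF."""
    return data.replace("\n", "\r\n")


def render(data: str) -> list[str]:
    """Replay a character stream onto a tiny screen model.

    CR moves the cursor to column 0, LF moves it down one row
    without touching the column, anything else is printed at the
    cursor and advances it by one column.
    """
    rows: list[list[str]] = [[]]
    row = col = 0
    for ch in data:
        if ch == "\r":
            col = 0
        elif ch == "\n":
            row += 1
            if row == len(rows):
                rows.append([])
        else:
            line = rows[row]
            # Pad with spaces if the cursor is past the end of the row.
            while len(line) <= col:
                line.append(" ")
            line[col] = ch
            col += 1
    return ["".join(r) for r in rows]


if __name__ == "__main__":
    print(render("ab\ncd"))          # naked LF: second line starts at column 2
    print(render(onlcr("ab\ncd")))   # with ONLCR: second line starts at column 0
```

In this model the naked-LF case gives the familiar staircase effect, which is roughly what you see after "stty -onlcr".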
There's no point in a carriage return without a newline. So why have both just because of the 1933 teletype's hardware implementation? It's purely a hardware thing. That's why Multics used \n, and that's likely why Thompson chose to continue that practice.
When ASCII came about, it wasn't really about text files. Computers didn't talk to each other back then. ASCII was about sending characters between devices, and for compatibility reasons a lot of devices copied \r\n from the teletype. But there were a lot of devices that didn't as well. Putting it in the driver makes perfect sense from the point of view of someone developing a system in the 1960s.
> There's no point in a carriage return without a newline.
Progress bars. And TUIs in general: CR without LF is still quite handy for them. But even on paper teletypes it has a marginal use in correcting typos: print spaces over the correct text, and overtype the corrections on top of the mistyped letters.
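For anyone who hasn't seen the progress bar trick, a minimal sketch in Python follows. The function names, the bar width and the `#` fill character are arbitrary choices for illustration. Each update starts with CR, so the cursor returns to column 0 and the new bar overwrites the old one on the same line; LF is only emitted once at the end.

```python
import sys
import time


def render_progress(done: int, total: int, width: int = 20) -> str:
    """Build one progress line; the leading CR rewinds to column 0."""
    filled = width * done // total
    bar = "#" * filled + " " * (width - filled)
    percent = 100 * done // total
    return f"\r[{bar}] {percent:3d}%"


def run_demo(steps: int = 10, delay: float = 0.0, out=sys.stdout) -> None:
    """Redraw the same line once per step, then finish with a newline."""
    for i in range(steps + 1):
        out.write(render_progress(i, steps))
        out.flush()
        time.sleep(delay)
    out.write("\n")


if __name__ == "__main__":
    run_demo(steps=20, delay=0.1)
```

Every redraw here has the same length, so nothing stale is left behind; a bar whose text can shrink would also need to blank out the tail of the previous line.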
> When ASCII came about, it wasn't really about text files. Computers didn't talk to each other back then.
By the time it was finished being standardized, they did. One of the earliest RFCs, RFC 20, talks, among other things, about using ASCII in "HOST-HOST primary connections". Actually, reading the descriptions of the "format effectors" in that RFC is quite illuminating: it explicitly mentions "display devices", that is, "glass" terminals. It also talks about the possibility of using LF as NL, but notes that this requires exact matching of the semantics in both sender and receiver. But even without networks, exchanging data between computers on magnetic/paper tapes, punch cards etc. was already a well-established thing. After all, they did not invent and standardize ASCII simply because they had nothing better to do with their time!
Don't get me wrong, I too think that having a single-character newline delimiter/line terminator for use in text files is better than using a two-character sequence cobbled together. But many disagreed, and all of the RFC-documented protocols up until very recently use CRLF as the line separator, so this convention obviously used to have rather wide support. Now, whether LF-to-CRLF translation, and line discipline in general, belongs in the kernel is a different question; I personally think it should've been lifted out of there and not conflated with serial port management, but alas, it is what it is.
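In practice this means that software with LF-delimited text on disk ends up converting at the protocol boundary. A rough sketch of that conversion in Python is below; `to_wire` and `from_wire` are hypothetical helper names, UTF-8 is assumed as the encoding, and a real protocol implementation would have more rules than this.

```python
import re

# Matches an LF that is not already preceded by CR.
_BARE_LF = re.compile(r"(?<!\r)\n")


def to_wire(text: str) -> bytes:
    """Turn LF line endings into CRLF without doubling existing CRLFs."""
    return _BARE_LF.sub("\r\n", text).encode("utf-8")


def from_wire(data: bytes) -> str:
    """Turn CRLF line endings back into bare LF for local storage."""
    return data.decode("utf-8").replace("\r\n", "\n")


if __name__ == "__main__":
    print(to_wire("first line\nsecond line\n"))
    print(repr(from_wire(b"first line\r\nsecond line\r\n")))
```

The same `from_wire` step is also what you end up running on stray \r\n files that show up on a UN*X box.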
Using CR for progress bars and overstrikes is a neat hack, sure. But it's a hack. You're using it for cursor control, and there's a lot more to cursor control than just moving to the left without advancing down. TUIs need a lot more than that, and none of it is part of ASCII.
CR and NL were separated for mechanical reasons - namely it took the same time to move to the beginning of a new line as it did to print two characters. That convention stuck because computers were rarely in the loop, so software conversion wasn't an option.
As far as the nascent ARPANET project goes, I doubt it was much of a concern for the ASCII committee. This was the '60s: computing news traveled by word of mouth or in journals. You couldn't just jump onto the IETF's website and download the latest RFCs. Unless a large organization was pushing things (and DARPA wasn't, not really), people mostly concerned themselves with what they worked on and were familiar with. The ASCII committee would have had lots of people familiar with telegraphy, AUTODIN, and similar device-to-device networks. The computer-savvy people would have been thinking of tapes and punch cards and other I/O.
If ASCII had come out ten years later, then sure I could see them being concerned about computer text formats.
Yeah, Macs IIRC just stored the keyboard input instead (the Enter key generates CR, which is also an old and venerable tradition; that's why there is an ICRNL flag for terminal input).
Obviously, in a perfect world we would have a single NL character for storing in text files in memory/on disk/in transit, and terminals would use entirely different CR and IND control codes, and internally translate NL to CR+IND combination when printing text, and send NL to the host as the keycode of the Enter key when it's pressed. Alas, that train has sailed long ago (and let's not even start on choosing BS versus DEL).