Except that the "comma" was a poor choice for a separator, the CSV is just a pla...

thw_9a83c · 2025-09-10T11:52:23 1757505143

The notion of a "platform" caught my attention. Funny story: About five years ago, I got a little nostalgic and wanted to retrieve some data from my Atari XL computer (8-bit) from my preteen years. Back then, I created primitive software that showed a map of my village with notable places, like my friends' homes. I was able to transfer all the BASIC files (stored on cassette tape) from the Atari computer to my PC using a special "SIO2PC" cable, but the mapping data were still locked in the BASIC files. So I had the idea to create a simple BASIC program that would run in an Atari 8-bit emulator on the PC, linearize all the coordinates and metadata, and export them as a CSV file. The funny thing is that the 8-bit Atari didn't even use ASCII, but an unusual ATASCII encoding. But it's the same for letters, numbers, and even for the comma. Finally, I got the data and wrote a little Python script to create an SVG map. So yes, CSV forever! :)

humanfromearth9 · 2025-09-10T11:12:34 1757502754

And the best thing about CSV is that it is a text file with a standardized, well known, universally shared encoding, so you don't have to guess it when opening a CSV file. Exactly in the same way as any other text file. The next best thing with CSV is that separators are also standardized and never positional, you never have to guess.

nradov · 2025-09-10T15:12:30 1757517150

Technically there is a CSV standard in IETF RFC 4180, although compliance isn't required and of course many implementations are broken.

https://www.ietf.org/rfc/rfc4180.txt

whizzter · 2025-09-10T11:50:55 1757505055

Almost missed the sarcasm :)

dirkt · 2025-09-11T03:54:00 1757562840

Try exporting things from Excel to CSV on a Mac with a non-US locale.

Some genius at Microsoft decided that exporting to CSV should follow the locale convention. Which means I get a "semicolon-separated value" file instead of a comma-separated one, unless I change my locale to US.

Line breaks are also fun...
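
For anyone stuck reading such an export, here is a minimal Python sketch. It assumes you already know the file is semicolon-separated with decimal commas; the sample rows and column names are made up for illustration.

```python
import csv
import io

# Hypothetical export from a locale that uses ";" between fields
# and "," as the decimal mark.
export = "name;price\nwidget;1,50\ngadget;2,75\n"

# Tell the standard-library reader which delimiter to expect.
rows = list(csv.reader(io.StringIO(export, newline=""), delimiter=";"))

# Convert the decimal comma by hand; csv does not interpret numbers.
prices = [float(price.replace(",", ".")) for _, price in rows[1:]]
```

The delimiter is passed explicitly here rather than guessed, since guessing is exactly the part that tends to go wrong.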

jstanley · 2025-09-10T11:06:09 1757502369

JSON has the major annoyance that grep doesn't work well on it. You need tooling to work with JSON.

re · 2025-09-10T11:54:01 1757505241

As soon as you encounter any CSVs where field values may contain double quotes, commas, or newlines, you need tooling to work with CSV as well.

(TSV FTW)
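
A small Python sketch of the difference, using a made-up record whose fields contain a comma, doubled double quotes, and a newline. It compares naive splitting with the standard-library `csv` reader.

```python
import csv
import io

# Hypothetical sample: a header row plus one record whose fields contain
# a comma, an escaped double quote, and an embedded newline.
raw = 'name,comment\r\n"Smith, J.","said ""hi""\nthen left"\r\n'

# Naive approach: split on line breaks, then on commas.
naive_rows = [line.split(",") for line in raw.splitlines()]

# Quote-aware approach: let the csv module track quoting state.
parsed_rows = list(csv.reader(io.StringIO(raw, newline="")))

print(len(naive_rows), len(parsed_rows))
```

The naive version sees three "rows" and splits the name in two; the reader returns the header plus a single two-field record.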

IAmBroom · 2025-09-10T15:52:30 1757519550

TSV is superior to CSV, and it still angers me that Excel doesn't offer it as a standard input option, but your examples are fairly easily handled by eye in a text file.

Tools definitely make it faster and more reliable.

spicybbq · 2025-09-10T15:00:06 1757516406

One of my first tasks as a junior dev was replacing an incorrect/incomplete "roll your own" CSV parsing regex (which broke in production) with a library.

euroderf · 2025-09-11T13:26:58 1757597218

ASCII FS GS RS US ... just make decent font entries for them.

jstanley · 2025-09-11T13:36:30 1757597790

And keys on the keyboard.

euroderf · 2025-09-11T20:16:28 1757621788

Yes! But nobody ever came up with decent font entries that would look snappy on keys. Not even IBM (or Data General or Burroughs or whoever) I guess.

rogue7 · 2025-09-11T07:52:59 1757577179

For this I use gron [0]. It's very convenient.

[0]: https://github.com/tomnomnom/gron
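
To illustrate the general idea of making JSON greppable (this is a simplified sketch of the concept, not gron's actual implementation or exact output format), one can flatten a document into one line per leaf value. The `flatten` helper and the sample document are hypothetical.

```python
import json

def flatten(value, path="json"):
    """Yield one 'path = value;' line per leaf so line-based tools can match it."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        # Leaves are re-encoded as JSON so strings keep their quotes.
        yield f"{path} = {json.dumps(value)};"

doc = json.loads('{"user": {"name": "ada", "tags": ["a", "b"]}}')
lines = list(flatten(doc))

for line in lines:
    print(line)
```

Each output line carries its full path, so a plain grep for `tags` finds both the value and where it lives. This sketch skips empty containers and does not quote unusual keys.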

theknarf · 2025-09-10T11:10:42 1757502642

grep is a tool. jq is a good tool for json.

kergonath · 2025-09-10T11:22:51 1757503371

grep is POSIX and you can count on it being installed pretty much anywhere. That’s not the case for jq.

whizzter · 2025-09-10T11:47:33 1757504853

Do people restrict themselves to a POSIX-conformant grep subset in practice, or do you mean GNU grep that probably doesn't behave according to spec unless POSIXLY_CORRECT is set?

IAmBroom · 2025-09-10T15:53:51 1757519631

"Anywhere" does not include Windows environments, which are over half the work computers out there.

krogenx · 2025-09-12T12:49:53 1757681393

If a workstation has Git installed on it, which I’d think would be the case for a substantial number of engineers out there (…not just software engineers), grep is there due to Git BASH.

keeperofdakeys · 2025-09-10T23:03:01 1757545381

Arguably, "comma as a separator" is close enough to comma's usage in (many) written languages that it makes it easier for less technical users to interact with CSV.

wlesieutre · 2025-09-11T01:51:52 1757555512

Easier as long as they don't try to put any of those written languages in the CSV

Commas and quotation marks suddenly make it complicated

john_the_writer · 2025-09-10T11:10:10 1757502610

100%.. xml also worked here too..

YAML is a pain because it has ever so slightly different versions that sometimes don't play nice.

CSV or TSV files are almost always portable.

renox · 2025-09-11T07:42:49 1757576569

I'd say that is not its biggest issue. The way to escape things is by far its biggest issue; a passwd-like \, \", \\ would have been far easier.

talles · 2025-09-11T13:53:08 1757598788

What separator would be better?

freetinker · 2025-09-11T05:08:52 1757567332

The comma makes it more human-readable. What separator would you suggest?

snthpy · 2025-09-11T05:55:42 1757570142

So ASCII actually had dedicated characters for this, 0x1C-0x1F. The problem is that they are non-printing.

Unicode has rendered analogs, U+241C-U+241F, but they take more bytes to encode, which can significantly increase file size in large USV files.

So my ideal would be to use ASV files rendered as USV in editors.

https://github.com/SixArm/usv
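
A rough Python sketch of that "store as ASV, render as USV" idea, assuming an editor or viewer would apply something like the hypothetical `render` function at display time. It also shows the size difference in UTF-8.

```python
# Map ASCII FS, GS, RS, US (0x1C-0x1F) to their visible Unicode
# symbols (U+241C-U+241F), which sit exactly 0x2400 code points higher.
TO_VISIBLE = {code: code + 0x2400 for code in range(0x1C, 0x20)}

def render(asv_text):
    """Return a display-only copy with the control separators made visible."""
    return asv_text.translate(TO_VISIBLE)

stored = "a\x1fb\x1e"    # what would be written to disk (ASV)
shown = render(stored)   # what a viewer could display (USV-style symbols)

stored_size = len(stored.encode("utf-8"))
shown_size = len(shown.encode("utf-8"))
print(shown, stored_size, shown_size)
```

Each ASCII separator is one byte in UTF-8, while each visible symbol takes three, which is where the size increase comes from.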

snthpy · 2025-09-11T06:05:46 1757570746

The benefits are that ASV / USV files are trivial to parse with simple string splitting since you don't have to worry about nesting and quoting.

Here's an example of what a USV looks like:

Folio1␟␞ Sheet1␟␞ a␟b␟␞ c␟d␟␞ ␝ Sheet2␟␞ e␟f␟␞ g␟h␟␞ ␝␜ Folio2␟␞ Sheet3␟␞ a␟b␟␞ c␟d␟␞ ␝ Sheet4␟␞ e␟f␟␞ g␟h␟␞ ␝␜
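
As a minimal Python sketch of the "simple string splitting" claim, assuming an ASV layout where the unit separator sits between units and the record separator ends each record (a simplification; the USV example above uses a slightly different, terminator-style layout):

```python
US = "\x1f"  # ASCII unit separator, between fields
RS = "\x1e"  # ASCII record separator, after each record

# Hypothetical data: two records of two fields each. Note the comma,
# quote, and newline inside the fields need no escaping at all.
data = f'a, "quoted"{US}line1\nline2{RS}c{US}d{RS}'

# Split into records, drop the empty tail after the final RS,
# then split each record into fields.
records = [record.split(US) for record in data.split(RS) if record]

print(records)
```

The catch is that this only works as long as the separator characters never appear inside the data itself, and it silently drops fully empty records.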

joz1-k · 2025-09-11T05:44:21 1757569461

The comma is too prevalent in the data to be a suitable separator. A semicolon would be a better choice.

r721 · 2025-09-11T12:47:00 1757594820

"|" looks pretty good (and is relatively rarely-used).

conception · 2025-09-10T21:21:25 1757539285

|| separated for life