
Bit of a rant: what annoys me about these lists is how they just give off a huge "you are dumb for making any assumptions, how could you not think of <extremely obscure edge case>" vibe. I'd be interested to see what the effects are of these assumptions failing, because often they are pretty reasonable assumptions for a reasonable subset of the universe. Software is imperfect and you can't cover every possibility. Like ok technically 10 flights with the same number could leave the same gate at the same time, but if 99.99% of the time they don't and you assume that, what is the real impact to people?

Reminds me of a list that came up ages ago that presented an assumption of "X code always runs" with the counterpoint that you could unplug the computer. Ok sure, but then why write software at all? Clearly no point assuming any code will ever run since you can just terminate the program at any random time.


I don't agree that this list has the attitude you describe--if anything, they just seem proud that they have many fewer of these corner case bugs than anyone else--so it is difficult to work with your example of the flight number. These are, in fact, misconceptions made by programmers, often without having the in-depth knowledge of this specific area that comes from being an actual expert (the kind that often people don't allocate for in their budgets), and this list isn't an over-the-top portrayal of such: it feels weird to become offended?

That said, I do appreciate some of these lists--which maybe has put you on edge to the paradigm--do have an edge to them... but, in all honesty, I think they should? The bugs and edge cases that these lists tend to expose aren't random glitches that equally affect every user: they usually segment users into the ones whose lives "follow the happy path" (which often just means "are intuitive and familiar to the culture near the developer") and the users who get disproportionately (or even continually!) screwed every time they dare interact with a computer.

And like, it is actually a problem that the other side of this is almost always a developer who doesn't really give a shit and considers that user's (or even an entire region/country's) existence to somehow be a negligible statistic not worth their time or energy, and I really do think that they deserve to take some flak for that (the same way I try to not get offended if someone points out how my being a cis-het white male blinds me to stuff: I think I deserve to get held to task harder by frustrated minorities rather than force them to be nice all the time in a world that penalizes them).


I don't disagree with you at all. My point was more like what another commenter said, that software adheres to a strict and very finite set of rules, the real world is way more complicated than that. It's so trivially easy to find real world counterexamples to just about any software that it's a barely interesting exercise (IMO). So you define a reasonable subset and work with that. And the reasonable subset is probably defined by positive/negative outcomes.

It would have been cool if the blog post discussed those outcomes so we can reason about it properly, otherwise it's just a list of claims at face value. If the programmer making an assumption means a screen at a gate says the wrong boarding time when there's a human there controlling the boarding, then not the end of the world. But if the programmer making an assumption causes 1/10000 flights to crash, then that's interesting and worthwhile calling out. It's just endless speculation without a proper outcome to tie it down.


At a general level I think these lists make developers more aware of uniqueness and constraints.

When designing data I think these questions (skepticisms) should be front of mind:

1) natural values are not unique.

2) things identified by number are best stored as a string. If you're not going to do math on it, it's not a number. That "customer number" should be treated as "customer id" and as a string.

3) be careful constraining data. Those "helpful checks" to make sure the "zip code is valid" are harmful not helpful.

4) those tiny edge cases may "almost never happen" but they will end up consuming your support department. Challenge your own assumptions at every possible opportunity. Never assume anything you "know" is true.
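To make point 2 concrete, here is a minimal sketch of what gets lost when an identifier is parsed as a number. The `raw_id` value and the `same_customer` helper are invented for illustration, not taken from any real system.

```python
# Hypothetical customer identifier; the value is made up for illustration.
raw_id = "007421"

as_number = int(raw_id)  # parsing as a number silently drops the leading zeros
as_string = raw_id       # storing as a string keeps the identifier verbatim


def same_customer(a: str, b: str) -> bool:
    """Compare identifiers as opaque strings, never as numbers."""
    return a == b


print(as_number)                  # 7421
print(str(as_number) == raw_id)   # False: the round trip does not restore the id
```

Once the leading zeros are gone there is no way to tell "007421" from "7421", which is why treating the value as an opaque string is the safer default.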

It's hard to measure time saved, and problems avoided, with good design. But it's easy to see bad design as it plays out over decades.

And (especially today) never optimize design for "size". Y2K showed that folly once and for all.


> 2)

This implies denormalization, which is rarely needed for performance, despite what so many believe. Now you’ve introduced referential integrity issues, and have taken a huge performance hit at scale.

> 3)

I mean, maybe don’t try to use a regex on an email address beyond “is there a local and domain portion,” but a ZIP code, as in U.S. only, seems pretty straightforward to check. I would much rather have to update a check constraint if proven wrong than to risk bad data the rest of the time.
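As a rough sketch of the kind of check meant here, assuming "ZIP code" strictly means the U.S. five-digit or ZIP+4 format (the function name is hypothetical):

```python
import re

# Assumption: only U.S. ZIP (12345) and ZIP+4 (12345-6789) formats are accepted.
ZIP_RE = re.compile(r"[0-9]{5}(-[0-9]{4})?")


def looks_like_us_zip(value: str) -> bool:
    """Format check only; it does not confirm the ZIP code is actually assigned."""
    return ZIP_RE.fullmatch(value.strip()) is not None
```

This only validates the shape of the value, and it would reject every non-U.S. postal code, so it is only appropriate where the field really is U.S.-only.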

> never optimize for size

Optimize for size when it doesn’t introduce other issues. Anyone working on 2-digit years could have and likely did see that issue, but opted to ignore it for various reasons (“not my problem,” etc.). But for example, _especially_ since Postgres has a native type for IP addresses, there is zero reason to store them as strings in dotted quad. Even if you have MySQL, store them as a UINT32, and use its built-in functions to cast back and forth.
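For illustration, the same dotted-quad to 32-bit integer round trip can be sketched in Python with the standard `ipaddress` module. The helper names are made up; on the MySQL side the built-in functions are presumably `INET_ATON()` and `INET_NTOA()`.

```python
import ipaddress


def ip_to_uint32(dotted: str) -> int:
    """Convert a dotted-quad IPv4 string to its unsigned 32-bit integer value."""
    return int(ipaddress.IPv4Address(dotted))


def uint32_to_ip(value: int) -> str:
    """Convert an unsigned 32-bit integer back to dotted-quad notation."""
    return str(ipaddress.IPv4Address(value))


print(ip_to_uint32("192.168.1.1"))   # 3232235777
print(uint32_to_ip(3232235777))      # 192.168.1.1
```

The integer form takes 4 bytes instead of up to 15 characters, and range comparisons work directly. Note this sketch covers IPv4 only.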


>It's so trivially easy to find real world counterexamples to just about any software that it's a barely interesting exercise (IMO).

These lists hopefully make programmers aware that a lot of their assumptions about the real world might be wrong, or at least questionable.

Examples are assumptions on the local part of email addresses without checking the appropriate RFCs. Which then get enshrined in e.g. JavaScript libraries which everyone copies. I've been annoyed for the last 30 years by websites where the local part is expected to be composed of only [a-z0-9_-] although the plus sign (and many other characters) are valid constituents of a local part.
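A small sketch of the difference, with hypothetical names: the narrow pattern described above rejects a perfectly valid plus-addressed local part, while a deliberately lenient check only asks whether there is something on both sides of the last "@".

```python
import re

# The overly narrow character class described above.
STRICT_LOCAL = re.compile(r"[a-z0-9_-]+")


def has_local_and_domain(address: str) -> bool:
    """Lenient check: non-empty text on both sides of the last '@'."""
    local, at, domain = address.rpartition("@")
    return bool(local) and bool(at) and bool(domain)


address = "user+tag@example.com"
local_part = address.rpartition("@")[0]

print(STRICT_LOCAL.fullmatch(local_part) is not None)  # False: '+' is rejected
print(has_local_and_domain(address))                   # True
```

The lenient check is not full RFC validation; it just avoids rejecting addresses the strict pattern gets wrong.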

Or assumptions on telephone numbers. Including various ways (depending on local culture) of structuring their notation, e.g. "123 456 789" versus "12-3456-89" where software is too dumb to just ignore spaces or dashes, or even a stray whitespace character copied by accident with the mouse.
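Ignoring spaces and dashes takes very little code. A minimal sketch, assuming only whitespace and hyphens need to be dropped (the function name is made up):

```python
import re


def normalize_phone(raw: str) -> str:
    """Drop whitespace and dashes so differently formatted input compares equal."""
    return re.sub(r"[\s\-]+", "", raw)


print(normalize_phone("123 456 789"))     # 123456789
print(normalize_phone(" 123-456-789\n"))  # 123456789
```

Other separators (dots, parentheses) and a leading "+" are left untouched here; whether to strip those too depends on the local conventions you need to support.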

And those forms where you have to enter a credit card (or bank account number) in fields of n characters each, which makes cut/copy/paste difficult because your notes contain it in the "wrong" format.

So while some examples may count as "just usability" it all stems from naive assumptions by programmers who think one size fits all (it doesn't).


I disagree, in my view they do not inherently give off such vibes at all. In this post for example, they specifically broach the topic like so:

> There are a lot of assumptions one could make when designing data types and schemas for aviation data that turn out to be inaccurate.

Sounds like a pretty explicit acknowledgement of the notion that these are otherwise reasonable assumptions that just happen to fail when put to the test, I'd say.

It's very easy to self-deprecate, especially if one has insecurities. But that doesn't mean that articles like this actually mean to do so. I think it's worthwhile for everyone involved to always evaluate whether the feeling is actually coming from the source you're looking at, or if that source just happened to trigger it inside you. More often than not, in my anecdotal experience, it's the latter.

I'd also find it interesting to learn what happens when these falsehoods nonetheless make it into an implementation though.


> I'd be interested to see what the effects are of these assumptions failing

Mostly confusion, but the combination of aviation and confusion can be dangerous and even deadly. Not directly related to this list, but I'm reminded of [1]: no one entity has set out to inconvenience the hapless traveler, but the combination of history and practice are a constant source of irritation, and at the times of heightened tensions and security might even lead to scary incidents. All because of the name.

[1] https://travel.stackexchange.com/questions/149323/my-name-ca...


Usually I use lists like this to define design constraints. This sort of thing becomes a template for the tables in the database.

This feels like an unnecessarily defensive take. I think these lists are more meant to be humor or thought-provoking. If anything, I think they serve to point out to non-programmers why programming is difficult, not to call programmers stupid.


