Small suggestion -- it seems like you're using the wrong units (or maybe abbreviations) for your displayed average page sizes. You're using lower-case b to indicate bits, but I suspect you mean upper-case B to indicate bytes? Also, the lower-case k is the correct prefix for 1,000, but lower-case m is milli, or 1/1,000. You want M for mega, which is 1,000,000.
Also if you really want to be precise you should consider whether you're using binary prefixes vs SI prefixes, e.g. kB (10^3 bytes) vs KiB (2^10 bytes). That doesn't matter as much because the error is small for these lower values, but the casing errors definitely do matter. "mb" means millibars to me, not Megabytes!
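To make the distinction concrete, here is a minimal Python sketch of a size formatter that emits either SI or binary prefixes with the casing described above. The function name `format_bytes` and its `binary` flag are made up for illustration; they are not from the blog post or any particular library.

```python
# Hypothetical helper: render a byte count with correctly cased prefixes.
SI_PREFIXES = ["", "k", "M", "G", "T"]       # powers of 1000
IEC_PREFIXES = ["", "Ki", "Mi", "Gi", "Ti"]  # powers of 1024


def format_bytes(n, binary=False):
    """Return n bytes as a string using SI (kB, MB) or IEC (KiB, MiB) units."""
    base = 1024 if binary else 1000
    prefixes = IEC_PREFIXES if binary else SI_PREFIXES
    value = float(n)
    index = 0
    # Step up one prefix at a time until the value fits under the base.
    while value >= base and index < len(prefixes) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {prefixes[index]}B"


print(format_bytes(2_500_000))               # 2.50 MB
print(format_bytes(2_500_000, binary=True))  # 2.38 MiB
```

The same byte count reads differently depending on the prefix family, which is why the label needs to say which one was used.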
Huh... I've always described 2^10 bytes as a "kilobyte" (kB) but I've always hated the ambiguity, even if the difference between 2^10 and 10^3 is usually not important. Thanks to this comment, I learned there is a formal set of units which are distinct from their SI counterparts[0].
Nail on the head. They're just weird, at least in the context of English phonetics. That's the reason I discourage their adoption. I'd much rather we just all agree that kilobyte means 2^10 bytes. Or find words that aren't weird.
If I remember correctly, a big motivation for this change was the fact that disk manufacturers intentionally used base-10 definitions so they could advertise larger numbers for disk capacity. But presumably they still do that, and presumably people still often don't notice.
Sure; there are a lot of distinctions that serve to frustrate casual users. And this is hardly limited to computers - the rant that sticks in my memory is that of a family friend being absolutely infuriated that nuts and bolts can be the same size but have different threads.
The difference does matter, though, and matters a lot when you're working with storage at any scale. So people tend to use the right labels just to avoid ambiguity.
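A quick way to see why scale matters: the gap between a binary prefix and its SI namesake grows with every step up. The sketch below is plain arithmetic; the helper name `binary_excess_percent` is just an illustrative choice.

```python
# Pairs of (SI prefix, binary prefix) for 1000**n and 1024**n, n = 1..4.
PREFIXES = [("k", "Ki"), ("M", "Mi"), ("G", "Gi"), ("T", "Ti")]


def binary_excess_percent(power):
    """How much larger 1024**power is than 1000**power, in percent."""
    return (1024 ** power / 1000 ** power - 1) * 100


for power, (si, iec) in enumerate(PREFIXES, start=1):
    print(f"{iec}B vs {si}B: {binary_excess_percent(power):.2f}% larger")
```

The difference is about 2.4% at the kilo level and close to 10% at the tera level, so it is easy to ignore for page sizes and hard to ignore for storage arrays.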
> I'd much rather we just all agree that kilobyte means 2^10 bytes.
Kilo means 10^3 though. I use a lot of SI units every day and that's what it always means for every one of them, just like Mega is always 10^6. These prefixes shouldn't have different meanings depending on the unit being used; that breaks SI. The SI prefixes were first adopted in 1795, before computing even existed as a concept, let alone computers existing as actual objects. The overloading of already very-well-established prefixes to mean something different was always a mistake, and can probably partially be blamed on the US's failure to adopt the metric system.
Why should we use 2^10? Bits matter in the small—a machine with a 12-bit word and 1024 words of memory or whatever was popular in the ’70s—but at the scale of gigabytes, individual bits don’t matter anymore, so may as well just use decimal because that’s what our number system is based on. I don’t see any point besides retro nostalgia to use base 2 after you move out of the each bit counts space.
SI prefixes are fine for mass/block storage and network speeds, because there's no particular reason they would fall precisely into buckets of powers of 2. But for CPU cache and system/GPU memory in particular, and maybe even some flash memory, it does continue to make sense to use MiB and GiB, because of the particular way that memory itself is addressed and packaged. Memory very much does fall precisely along power of 2 boundaries.
For example, I recently bought two 32 GiB DIMMs for my computer. I guess you could call them 34.3597 GB DIMMs, but that's strictly worse! Knowing that they're exactly 32 GiB makes it obvious that it takes 35-bit pointers to address every location by byte in one of those DIMMs (32 GiB is 2^35 bytes, so they obviously require a 64-bit architecture to take advantage of!), or 36-bit pointers to address memory locations across both of them.
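The pointer-width arithmetic is easy to check: 32 GiB is 2^35 bytes, so 35 address bits cover one DIMM and 36 bits cover the pair. A small Python sketch (the helper name `address_bits` is hypothetical):

```python
GIB = 2 ** 30  # one gibibyte in bytes


def address_bits(num_bytes):
    """Minimum pointer width, in bits, needed to address every byte."""
    if num_bytes < 1:
        raise ValueError("num_bytes must be positive")
    # The highest address is num_bytes - 1; its bit length is the width needed.
    return (num_bytes - 1).bit_length()


one_dimm = 32 * GIB
print(one_dimm)                    # 34359738368 bytes, i.e. about 34.36 GB
print(address_bits(one_dimm))      # 35
print(address_bits(2 * one_dimm))  # 36
```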
Notably, storage vendors and many operating system vendors use different units to describe capacity, so hard disks will always seem small as a result of this.
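As a rough illustration of that mismatch, assume a drive advertised in decimal units and an operating system that divides by powers of 1024 when it displays capacity (how a given OS labels the result varies). The function name below is made up for the example.

```python
def decimal_bytes_to_gib(num_bytes):
    """Express a byte count in GiB (2**30 bytes)."""
    return num_bytes / 2 ** 30


advertised = 1 * 10 ** 12  # a "1 TB" drive as the vendor counts it
print(f"{decimal_bytes_to_gib(advertised):.2f} GiB")  # 931.32 GiB
```

Nothing is missing from the drive; the two numbers describe the same capacity in different units.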
This is a hilarious nit-pick. In the context of an internet blog describing average page-size, it's completely obvious that mb == MB ... not millibars...
There's an insane amount of abbreviations and acronyms that have multiple meanings in different contexts. How many Wikipedia pages have [disambiguation] here?...
It is not ambiguous, it is wrong. People will probably figure out what is meant from the context, but it will slow them down. Are there really any reasons for using the wrong abbreviation other than laziness or ignorance in this case?
Maybe not officially, but in practice that's how it's used:
> The unit's official symbol is bar; the earlier symbol b is now deprecated and conflicts with the use of b denoting the unit barn, but it is still encountered, especially as mb (rather than the proper mbar) to denote the millibar.
Yeah, in general, there are many, many "unit collisions" that can really only be exactly interpreted from context. I think it would be great if everyone started using bracket notation (or similar) for prefixes, e.g. [k]B or [Mi]B. This is the convention used in pqm.js and it works really well. It would go a long way toward units that can be accurately read by a computer.
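To show why bracketed prefixes are easy for a machine to read, here is a minimal Python sketch of a parser for strings like [k]B or [Mi]B. This is not pqm.js's API or implementation; the function name, the regular expression, and the prefix table are all assumptions made for illustration.

```python
import re
from fractions import Fraction

# Hypothetical prefix table; keys are case-sensitive on purpose.
PREFIX_MULTIPLIERS = {
    "": 1,
    "m": Fraction(1, 1000),
    "k": 10 ** 3,
    "M": 10 ** 6,
    "G": 10 ** 9,
    "Ki": 2 ** 10,
    "Mi": 2 ** 20,
    "Gi": 2 ** 30,
}

BRACKETED = re.compile(r"^\[(?P<prefix>[A-Za-z]*)\](?P<unit>[A-Za-z]+)$")


def parse_bracketed(symbol):
    """Split '[prefix]unit' into (multiplier, unit); reject anything else."""
    match = BRACKETED.match(symbol)
    if match is None:
        raise ValueError(f"not in [prefix]unit form: {symbol!r}")
    prefix = match.group("prefix")
    if prefix not in PREFIX_MULTIPLIERS:
        raise ValueError(f"unknown prefix: {prefix!r}")
    return PREFIX_MULTIPLIERS[prefix], match.group("unit")


print(parse_bracketed("[k]B"))
print(parse_bracketed("[Mi]B"))
print(parse_bracketed("[m]bar"))
```

Because the prefix is delimited explicitly, the parser never has to guess where the prefix ends and the unit begins.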
But is the SI system ambiguous? I almost never have to think about prefix v unit, it is always quite clear. Of course, strange combinations can occur (MNm - Meganewtonmeter as an example perhaps), but even those are unambiguous on second thought.
Collisions seem to be most prevalent in "IT" units or for improper SI usage. But maybe the latter is really the reason for the former in this case: if people separated SI and other units properly for IT units, it seems to me it would also be perfectly fine (but like me, a lot of people seem to have no idea of the correct definitions).
If you are strictly sticking to the SI system of units, you should be fine. However, some of us work in industries and countries (You know where) that don't fully embrace SI, and mixing other systems with SI is common.
Pressure measurements/specifications are part of my daily work. I have never seen "b" referring to bar, also not in any of the many American (imperial) papers I have come across. So this is either properly old usage indeed, or very specific to certain regions or industries (like usage of relative v absolute pressure). Or Wikipedia being properly pedantic.
You should learn to understand MB/mb/Mb/mB for megabyte and mbit for megabit. Anything else is inviting error and sorrow, because many people use it that way.
This is flat-out wrong. Most times when I see a lower-case b in software or documentation it really does mean bit, not byte. The standards exist for a reason and you should follow them. Anything else is inviting error and sorrow.
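For what it's worth, treating the symbols as case-sensitive is cheap to do in code. Below is a minimal Python sketch with a made-up helper, `to_bits`, that accepts only a small table of decimal prefixes and distinguishes b (bit) from B (byte); the table is an assumption for the example, not a complete list.

```python
# Hypothetical, deliberately strict symbol tables.
PREFIX_FACTORS = {"": 1, "k": 10 ** 3, "M": 10 ** 6, "G": 10 ** 9}
UNIT_BITS = {"b": 1, "B": 8}  # lower-case b is a bit, upper-case B is a byte


def to_bits(value, symbol):
    """Convert value in the given unit symbol (e.g. 'MB', 'Mb') to bits."""
    prefix, unit = symbol[:-1], symbol[-1:]
    if prefix not in PREFIX_FACTORS or unit not in UNIT_BITS:
        raise ValueError(f"unrecognized symbol: {symbol!r}")
    return value * PREFIX_FACTORS[prefix] * UNIT_BITS[unit]


print(to_bits(1, "MB"))  # 8000000
print(to_bits(1, "Mb"))  # 1000000
```

With this table, "mb" is rejected outright instead of being silently read as megabytes, which is the factor-of-eight mistake the casing rules are there to prevent.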