Problems emerge for a unified /dev/*random (lwn.net)
232 points by bitcharmer on March 29, 2022 | 195 comments



These random/urandom blocking/nonblocking discussions have been going on for at least a decade at this point, maybe even longer. I get that RNG algorithms are hard. What’s surprising to me is that it seems so hard to even figure out the desired interface of such a basic thing in the kernel. Security in entropy-starved environments seems to be very hard to square with backwards compatibility.


It's not hard by itself. These interfaces are actually pretty straightforward to design and implement; Linux made things hard for itself by designing a weirdly complicated interface with an entropy credit system, an idea that's more or less discredited now.

What's hard is accurately surveying the last 15 years of systems built on top of Linux's needlessly-complicated LRNG and making changes that don't violate the expectations of those systems. Nobody should be running "jitter entropy daemons" in userland (userland RNGs are essentially the source of all practical CSPRNG vulnerabilities in the past 2 decades), but people do, and if those systems exceeded some threshold of security before Donenfeld's patches, you can't merge those patches if they push security below that threshold.

It's a really irritating problem.


Also, the problem seems mostly to be around Linux running in a VM rather than on actual hardware. Would it be possible to have the host's urandom show up as a hardware random number generator in the VM, and then use that for initializing things? Seems less hacky than having to fiddle with seeds. Dunno how feasible that is, though.


You can do that where for instance the VM has PCI and can plug in a virtio RNG device. But a lot of the QEMU setups Guenter Roeck reported issues with are emulating real (usually old) hardware which didn't have a PCI bus, or any other hardware RNG interface.


Also lighter systems such as routers (e.g., many OpenWrt targets)



Yes, and that is commonly used.

But Linux must continue to work even on VMs without that, and that is the hard part.


The VMs described in the article are (faithfully) emulating real hardware lacking unpredictable timers or other sources of entropy. It’s not exclusively a VM problem; it’s a weird/old hardware problem, too.


an entropy credit system, an idea that's more or less discredited now

... no pun intended, I'm sure.


Nope, just coming off a migraine and not catching stuff like that. :)


> userland RNGs are essentially the source of all practical CSPRNG vulnerabilities in the past 2 decades

Is your point that CSPRNGs should not be implemented in userland, but instead should just read from /dev/random or do the equivalent on non-Linux operating systems?


Yes, that’s been Thomas’s position for many years[1]. And I mostly agree with him on that.

[1]: https://sockpuppet.org/blog/2014/02/25/safely-generate-rando...


> What's hard is accurately surveying the last 15 years of systems built on top of Linux's needlessly-complicated LRNG and making changes that don't violate the expectations of those systems.

Sounds like they need a compatibility or feature (cmdline) flag.

Add the flag and you can use the new, improved system. Establish dates for when the old method will no longer be the default (but can still be used with the flag) and when it will be removed for good.


That won't align with the "never break userspace" mantra. And honestly, I hate these time-bomb solutions some API vendors put up. In a never-breaking system you can't remove anything; you can only add things or change stuff internally to make it work like before. They never said they'd get rid of /dev/random or /dev/urandom, just that both would practically be the same. From reading about the problem and the different use cases and expectations, it looks like it will stay like this for another decade or more.


> Nobody should be running "jitter entropy daemons" in userland (userland RNGs are essentially the source of all practical CSPRNG vulnerabilities in the past 2 decades), but people do, and if those systems exceeded some threshold of security before Donenfeld's patches, you can't merge those patches if they push security below that threshold.

Why not break their code? Seems justified in this case.


For better or worse Linux doesn't break userland.


I ran into it for the first time in 2005, and it was a widely known issue, easy to diagnose once you found where your program was pseudo-randomly blocking (pun intended). FreeBSD already had a unified random/urandom by that point.

I’m not sure if the Linux folks find something wrong with existing, time tested approaches or if there’s some NIH going on, but it doesn’t seem like a multi-decade problem.


Pedantic question: would it be pseudo-randomly blocking or randomly blocking?


It would be somewhat deterministic and predictable and therefore pseudo-random.


I don't think it's hard to figure out the desired interface. As you said, the hard part is changing the interface to what's desired decades later.


You'd be amazed at how much stupidity lingers about this issue. I don't know if the present-day kernel devs are affected by it, but an awful lot of users are.


Why is random so hard? If we assume best intentions then:

    uint64_t lemur64(void) {
      /* uint128_t is not a standard type; GCC and Clang spell it __uint128_t */
      static __uint128_t s =
        (__uint128_t)426679527491843471ull << 64 | 2131259787901769494ull;
      return (s *= 15750249268501108917ull) >> 64;
    }
It passes BigCrush and PractRand, and it won't ever cause your system to hang on boot.


Yeah, but it always returns the same pseudo random sequence of numbers. THAT'S part of what they're trying to fix by including entropy.


What if you add a timestamp?


Here's a fun article about hacking online poker, in part by taking advantage of a time-based seed:

https://www.developer.com/guides/how-we-learned-to-cheat-at-...

"The system clock seed gave us an idea that reduced the number of possible shuffles even further. By synchronizing our program with the system clock on the server generating the pseudo-random number, we are able to reduce the number of possible combinations down to a number on the order of 200,000 possibilities. After that move, the system is ours, since searching through this tiny set of shuffles is trivial and can be done on a PC in real time."


From what source? That’s the hard part.

Think of it in terms of VM snapshots. If you boot from a snapshot, there’s no way to get any timestamp guaranteed to be secure.

But “secure” in this context is misleading. I subscribe to the “just use urandom” school of thought. It’s sufficiently secure in practice that no real security threats have emerged (can anyone point to a CVE?) and it seems like a waste of time to focus on this rather than the hundreds of other cases that cause dozens of CVEs.



The source shouldn't be an issue even if the timestamp is reused. Timestamps just don't add much entropy - a timestamp is generally 64 bits, so even if it were actually random you could only ever get 64 bits of entropy. And in practice it's way, way less than that, since the clock is probably somewhat accurate, the computer is probably less than ~10 years old, and it's probably not set way in the future.

So it should be fine to add in, but I don't know how many bits you'd want to count that as.


That would also make it predictable. Essentially, if you get a sequence of random numbers, it should be quite hard to predict what the next one would be.


Then if I know about when you turned your computer on, I can guess every random number you'll ever generate.


If they used a timestamp then all you really need is a couple random numbers to find the seed. Just keep guessing timestamps until you find a sequence that matches.


Not all computers have a RTC.


The presence or absence of a real-time clock doesn't significantly change the problem here.


Yep, that is a good idea! But a timestamp only adds a little bit of entropy, and you want a few hundred if not a few thousand bits.


What it will also do is allow anyone to predict your pseudo-random numbers with high accuracy (which is bad, as the use case here is security).


What kind of wizardry is this doing?! :)


It's one of the simplest ways to make random numbers. Multiply state by a constant, return the upper bits/digits of state.

There's some math that goes into picking a good constant, but honestly that's optional. Usually you'd also add a constant each round but this is going for absolute simplicity.

And the line with 'static' is just an awkward way of initializing the state integer.


> There's some math that goes into picking a good constant, but honestly that's optional.

It isn’t optional. For example, if your constant is even, every iteration adds another zero to the end of your state, and after (in this case) 127 iterations you’ll start returning zeroes forever.

Also, if your multiplier is smallish, the lower-order bits of the current value can be used to (somewhat) predict the higher-order bits of the next value.


Yes it has to be odd and it has to be the size asked for. I don't count checking that as math.

If you're particularly worried about the top couple digits of the number being undersized you could return fewer bits or use a wider multiplier.


Those numbers are not random.


It’s a multiplicative congruential generator (MCG), a subset of linear congruential generators (LCG). Basically a multipy (and for LCG, an add) between each generated number. Very fast, small state requirements.

Wholly unsuitable for cryptographic random numbers.



"Nobody knows what entropy really is, so in a debate you will always have the advantage."

~ Claude Shannon

I've become more and more uncomfortable with the word "entropy" as used in cryptography. The word "random" has always been problematic. I try to use "unpredictable" instead, because I have half a clue what that word means.

I think "entropy" is used to mean "unpredictable in principle", which would exclude chaotic processes like CPU jitter. Chaotic processes like the weather are predictable in principle; but they are so complex that in practice they are unpredictable. Quantum processes such as radioactive decay or tunneling seem to be unpredictable in principle. If it's unpredictable, then to my mind it's sufficiently unpredictable for cryptography.

The Intel HWRNG depends on chaos - the jitter in gate propagation times. I'd settle for that, if it weren't for the fact that the raw chaos from the process isn't available for inspection - you can only see the result of obfuscating it using AES. How much bias is there in the raw bitstream from the unencrypted HWRNG output? I have no way of finding out.

Entropy accounting in the Linux RNG seems to be somewhere between heuristics and guesswork. It seems to be awfully complicated, given that it comes down to finger-in-the-air estimates.

/me not a cryptographer, mathematician or physicist.


You need to get access to the raw entropy stream in order to characterize it and test it under a number of different situations. At Cryptography Research, we did a number of reviews of hardware entropy sources.

You have to look into behavior during very early startup (power on reset), suspend/resume from low power states, under high heat and thermal shutdown, as well as stable operation. You look at different samples of chips to look for production variation. You build software models that try to simulate the underlying hardware behavior to see how close they get to predicting outputs (which is a bad thing if it works too well!)

You then review how the system processes this entropy since ideas that look good to hardware engineers (like a string filter) are actually really bad for entropy. You analyze the path for side channels or race conditions that could leak raw entropy across process boundaries.

Anyway, here are the reports:

https://www.rambus.com/wp-content/uploads/2015/08/IntelRNG.p... (1999)

https://www.rambus.com/wp-content/uploads/2015/08/VIA_rng.pd... (2003)

https://web.archive.org/web/20141230024150/http://www.crypto... (2012)


> ~ Claude Shannon

I misattributed that quote. It's from a conversation between Shannon and Von Neumann; Shannon attributed the quote to Von Neumann.

  I thought of calling it "information", but the word was overly used, so I decided to call it "uncertainty". [...] Von Neumann told me, "You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, nobody knows what entropy really is, so in a debate you will always have the advantage."
https://en.wikipedia.org/wiki/Entropy#Entropy_of_a_system

Sorry.


> I think "entropy" is used to mean "unpredictable in principle", which would exclude chaotic processes like CPU jitter. Chaotic processes like the weather are predictable in principle; but they are so complex that in practice they are unpredictable.

Chaotic systems are practically unpredictable because they have an exponential dependence on their initial conditions. It has nothing to do with complexity -- a double pendulum is a very simple system and is still chaotic.

In other words, as you evolve t, you need more and more bits of precision in the state variables of the initial condition in order to continue predicting the future state correctly. As soon as you start drifting away from the real state the gap between your prediction and reality exponentially increases.

As there is a physical limit to what accuracy you can measure things, at some point is becomes physically impossible to continue accurately predicting the future state.


> I've become more and more uncomfortable with the word "entropy" as used in cryptography.

I don't like the word either. On the other hand, the meaning in the kernel seems clear: given a threat model for the attacker, it's the number of bits that could be generated from internal state that are believed to be random, i.e. not predictable by the attacker.

> The word "random" has always been problematic. I try to use "unpredictable" instead, because I have half a clue what that word means.

Yes, "random" has many meanings. Sometimes it means simply that variable has non-repeating values with few desired statistical properties. This is not enough in cryptography, where we care about the attacker, so random there is much more demanding.

If a bit stream is looking unpredictable, but is known to be actually produced purely algorithmically, it is then not called random, but pseudo-random, because in principle, with enough knowledge of internal state, it can be predicted.

> I think "entropy" is used to mean "unpredictable in principle", which would exclude chaotic processes like CPU jitter.

"In principle" is too strong and vague. We don't have a way to test and verify a source is unpredictable in principle. We can only pronounce it based on current state of knowledge. In practice, the criterion is always "unpredictable in practice". A number source like CPU jitter may be declared unpredictable in practice. It is a decision of the implementor.

> Chaotic processes like the weather are predictable in principle; but they are so complex that in practice they are unpredictable.

Chaotic processes like weather are theoretical continuous processes, i.e. whose description is based on real numbers. We can measure such numbers only with unknown non-zero error, which makes predictions have errors too, which makes fine enough details of these processes unpredictable. There is nothing qualitatively better in unpredictability that quantum phenomena bring here. Classical thermal noise is already unpredictable.

> The Intel HWRNG depends on chaos - the jitter in gate propagation times. I'd settle for that, if it weren't for the fact that the raw chaos from the process isn't available for inspection - you can only see the result of obfuscating it using AES. How much bias is there in the raw bitstream from the unencrypted HWRNG output? I have no way of finding out.

How do they obfuscate it using AES? Maybe the AES obfuscation could be reversed?

> Entropy accounting in the Linux RNG seems to be somewhere between heuristics and guesswork. It seems to be awfully complicated, given that it comes down to finger-in-the-air estimates.

This seems to be the state of the things. Kernel programmers are constantly dabbling in RNG science which they shouldn't - they end up changing kernel RNG system design and breaking stuff. I don't expect kernel to provide high quality RNG, I expect it to boot and get out of my way.


> whose description is based on real numbers.

Yes, I get it. But you mean real real numbers from the real world, not the representations used in computers. Real real numbers have infinite precision, which means you can't measure a real real quantity.

cyphar (parents sibling) said:

> a double pendulum is a very simple system and is still chaotic

You're right, it's nothing to do with complexity. It's that immeasurably small changes in initial conditions lead to huge differences in the way a system develops. So a chaotic system that depends on real-world real quantities is in principle unpredictable, because you can't measure real-world reals precisely.

Thanks people, for nailing a stake through my misguided notion that you need a quantum process to be sure your RNG is immune to prediction by some hypothetical arbitrarily-powerful computer.


It seems like one of the main issues is that writes to /dev/urandom aren't counted towards the entropy count.

> Torvalds's suggestion would help much. Ted Ts'o, the other kernel RNG maintainer, cautioned that writing to /dev/urandom is not a privileged operation, so a malicious user-space program could write specific values to it; that is the reason why inputs to the device are not used "until there is a chance for it to be mixed in with other entropy which is hopefully not under the control of malicious userspace".

> "we can haphazardly mix whatever any user wants, without too much concern" as long as it is not given entropy credit.

What's the attack here? An attacker can't lower the security of a system by writing data to urandom if the data gets hashed and there's any other entropy in the system. If there isn't entropy in the system, how is it a regression?

> The Android system writes its kernel command line to /dev/urandom early in the boot process with the expectation that it will not count as entropy, "given that the command line might not contain much entropy, or any at all".

So I assume the problem here is that if you write low-entropy data here and then the system counts it, the threshold to reach 256 bits is lowered?

Can a crypto-person explain? Is the issue basically that an attacker would be able to make the system look like it had more entropy than it does?


You've got it: this is purely an issue during boot-up, before enough quality entropy has been gathered. If a malicious userspace program writes predictable data to /dev/random immediately on boot, and the system counts it toward the entropy credit, then it's possible that other services on the system reading from /dev/random will get predictable, improperly seeded randomness. It's fine to allow writes to /dev/random at any time, as long as they are never given entropy credit.


I see. That is a bit tricky... the entire "count the entropy" thing sounds like a pain. I guess I don't see a way around it - they link to Fortuna, but I don't see how that's going to fix it. It sounds like Fortuna is a better PRNG, so in theory if you have N outputs it's harder to get back to the seed? But the attacker is so early in the boot process...

It would help I guess.

I'll have to read up on Fortuna, at the least, as it sounds really neat.

edit: OK Fortuna is more than I expected. I have to read up.


> If a malicious userspace program writes predictable data to /dev/random immediately on boot,

Hmmm, how did the program get there? A case for secure boot, no? At least initialization of RNG state should be included in the trusted part.


The hash of known data is known data. So such an attacker could flood the system with zero-entropy data and predict, with higher than baseline probability, the output of "random" generators.


You can't reduce the entropy just by adding low-entropy data to the pool. If I hand you 256 bits of entropy, you can hash "a" into it as many times as you like - it won't reduce the entropy.

The only thing I can think of here is that the attacker would trick the system into thinking it had hit its 256-bit entropy minimum before it actually had.

At which point... I don't know what they gain, but sure, it's at least a dick move.


MENLO PARK, March 30 (Associated Press) - In what was widely characterized as "at least a dick move", malicious hackers of unknown origin were discovered yesterday to have been spamming the entropy pool of thousands of McDonalds self service kiosks with the phrase "i can haz jalapeno cheezeburger" during system boot.

While the direct ramifications of this action remain unclear, informal discussions are already being held in various constellations in congress to probe whether the FCC should be given an extended mandate to regulate randomness, according to several sources close to the food sales tech stack regulations sector who requested to remain anonymous.

In response to the swift call for regulatory action, several privately held businesses that supply PoS devices for tea shops have pledged to promptly deploy OTA updates that replace all sources of randomness in their systems with an infinite repetition of the second amendment.

Said Harry Parks, 37, of International Teaselling Machines Inc. - "In this country it is the given right of each man to select the randomness that he wants, so be damned, and I will happily sacrifice the security of my customers to make a stand for the unique freedom and liberty we have in this country."


No, that's not the problem at all. Nobody is predicting anything. The issue as I understand it is that it might have been possible in some pathologically bad but real configurations for malicious userland code to trick the kernel into believing the LRNG had been seeded when it hadn't yet been.


There are a few intertwined closely related pitfalls that are each subtly different:

1) "premature first" wrt non-local attacker: this is the problem you identified - the RNG initializes when there's actually only 1 bit of entropy, and then SSH generates keys that some researchers bruteforce years later.

2) "premature first" wrt local attacker: the RNG has no entropy. Something legit feeds it 32 bits of entropy, and the kernel mixes that entropy directly into the key that's generating the /dev/urandom stream. Local unpriv'd attacker reading /dev/urandom (or some remote attacker who has access to overly large nonces or something) then bruteforces those 32 bits of entropy, compromising it, since it's only 32 bits.

3) "premature next": the RNG has some entropy. That entropy gets compromised somehow. Then the "premature first" wrt local attacker scenario happens. Maybe you think this is no big deal, since a compromise of the RNG state probably indicates something worse. But compromised seed files do happen, and in general, a "nice" property to have is that the RNG eventually recovers after compromise -- "post compromise security".

Problem set A) A malicious entropy source can currently cause any of these due to the lack of a fortuna-like scheduler. Since we just count "bits" linearly, and any source that is credit-worthy bumps that same counter, a malicious source can bump 255 bits and a legit source 1 bit, and then an attacker brute forces the 1 bit.

Problem set B) Making writes into /dev/[u]random automatically credit would cause the same issue, since it's already common for people to write non-entropic stuff into there (e.g. Android cmdline), and because others manually credit afterwards, and mixing into the /dev/urandom key without crediting would also cause a premature next issue, since some things trickle in 32 bits at a time. And other things that trickle in more bits at a time might still only have a few of those be entropic. Yada yada yada, it would cause some combination of the problems outlined above.

In spite of problem set A, the kernel currently does do a few things to prevent against these issues. First, it avoids problem set B, by not implementing that behavior. More generally, /dev/urandom extracts from the entropy pool every "256 bits" and 5 minutes. And, in order to prevent against a "premature first" it relaxes that 5 minutes to 5 seconds, then 10 seconds, then 20 seconds, then 40 seconds --> 5 minutes during early boot, so at least a potential "premature first" gets mitigated somewhat quickly.

Problem set A still exists, however. Whether anybody cares and what the code complexity cost is versus the actual risk of the issue remains to be seen, and should make for some interesting research.


Yeah. I see (1) and (2) as instances of the same basic problem, and (3) as mostly a non-problem (like, you do the best you can to get compromise recovery from a CSPRNG, you don't do nothing, but you don't hold up progress on it).

But from my read of the backstory here, the problem is userland regressions on (1) and (2), and I buy that you simply can't have those.


> (3) as mostly a non-problem (like, you do the best you can to get compromise recovery from a CSPRNG, you don't do nothing, but you don't hold up progress on it).

Mitigating that attack is the main selling point of Fortuna, which makes this attack way harder. I think this is the primary thing we would get from a Fortuna-like scheduler that we don't currently have or can't currently have given the present design.

> But from my read of the backstory here, the problem is userland regressions on (1) and (2), and I buy that you simply can't have those.

Yea so the way these interact with the current story is in two totally opposite directions.

The original thing -- unifying /dev/urandom+/dev/random -- was desirable because it'd prevent (1)-like issues. People who use /dev/urandom at early boot instead of getrandom(0) wouldn't get into trouble.

Then, in investigating why we had to revert that, I noticed that the way non-systemd distros seed the RNG is buggy/vulnerable/useless, but fixing it in the kernel would lead to issue (2) by introducing problem set B. So instead I'm fixing userspaces by submitting https://git.zx2c4.com/seedrng/tree/seedrng.c to various distros and alternative init systems.

By the way, running around to every distro and userspace and trying to cram that code in really is not a fun time. Some userspaces are easygoing, while others have "quirks", and, while there has been reasonably quick progress so far, it's quite tedious. Working on the Linux kernel has its quirks too, of course, but it's just one project, versus a handful of odd userspaces.


Unification discussed on HN:

https://news.ycombinator.com/item?id=30373351 (62 comments)


What about the idea of initializing the random pool from some entropy that was saved to a file on the previous boot?

Before the first boot, the entropy file could be initialized by the installation system.

Yes, these entropy files could be compromised, but that would be no worse than compromising the kernel binary or any number of other sensitive files. So as long as you have trust in the security of the most sensitive files on your filesystem, adding another sensitive file should not be much of a concern, as long as it's protected properly.


Yes, that works iff (1) you have writable media, (2) you can actually feed it to the kernel RNG (and have it credited) before anything critical blocks on RNG output, and (3) the seed isn't (intentionally or accidentally) predictable (e.g., cloning an existing filesystem image, or cloning a VM, or any other design where the same entropy file is reused by multiple instances).

The kernel has limited ability to protect against (3). Some forms of it can be prevented with the VM Generation Counter PCI device, which Linux recently added support for.


You don't actually have to have writable media on the system itself, as you should in principle be able to pass the entropy information to the system over netboot.

And now that I think about it, with BIOS being so sophisticated these days (with UEFI and the like), even before booting many systems should be able to use the network to get entropy information.

Regarding point (3), if an image is used in a VM, the VM host should be able to pass entropy to the guest, so the guest doesn't have to use its own entropy to boot.


That's what this demo code project is about:

https://git.zx2c4.com/seedrng/tree/seedrng.c

https://git.zx2c4.com/seedrng/about/

https://twitter.com/EdgeSecurity/status/1509002499507818500

It's trying to do the seed file thing the "right way", and be portable enough that init systems can just copy and paste this where it fits.


This is in fact discussed in the article.


I remember very early versions of Redhat (as in late 90s) saving the RNG seed on shutdown.


How do other OSes handle this "lack of entropy at startup" problem? (Windows, FreeBSD, OpenBSD, etc.) I've only ever seen this discussed in the context of Linux.


In practice it's not that hard to solve if you're only supporting a limited number of CPU architectures (e.g., all the world's x86) or only one bootloader. Even though some of the BSD systems support multiple architectures, in practice they are mostly used on x86 servers --- and are mostly judged by how well or poorly they work on x86. In contrast, Linux has to work on a very large number of embedded architectures, and some of those CPU architectures don't even have a fine-grained CPU cycle counter, let alone something like RDRAND. Some architectures have practically no peripherals that provide unpredictable input, and some of them very cleverly generate RSA keys and X.509 certificates as the very first thing they do as part of their "out of box" experience.

If you can assume that you're always running on x86 architecture, with RDRAND and RDSEED, and pretty much all desktops, servers, and laptops have TPM chips (which have their own hardware random number generator) and are using UEFI boot (which also has a random number generator) --- and while maybe one of these are either incompetently designed, or backdoored by either the NSA or MSS, hopefully not all of them have been compromised, it's really not that hard.

The challenge has always been the crap embedded/mobile devices, where manufacturers live and die by a tenth of a penny in BOM costs..... (and where they tend to have hardware engineers writing firmware and device drivers, contractors implementing their Minimum Viable Product, and no one ever going back to retrofit security....)


> Even if some of the BSD systems support multiple architectures...In contrast, Linux has to work on a very large number of embedded architectures

Doesn't NetBSD target an absurd number of platforms?


OpenBSD also still targets 7 or 8 architectures, and supported a bunch more until a while ago.


As does FreeBSD, and of course those non-x86 platforms are where the pain points are re: RNG -- same as Linux. Tytso is just unfamiliar with the BSD landscape.


They both use a system with multiple entropy pools, and incoming entropy is distributed over them in a staggered fashion such that eventually one of them will accumulate sufficient entropy. You'll notice there is discussion in all of these Linux patches/blog posts about eventually moving to such a design, because entropy estimation is too hard and full of pitfalls.

I thought that neither tried to estimate entropy, but I see that Apple provides a blocking getrandom(2) system call, so maybe they additionally do some entropy estimation alongside the Fortuna design?

They are both very well documented (pdf):

https://download.microsoft.com/download/1/c/9/1c9813b8-089c-...

https://www.schneier.com/wp-content/uploads/2015/12/fortuna....


The entropy pool design protects against some attacks, but, critically, it actually makes this boot-time initial seeding problem harder because you need to gather a lot more total entropy (spread over all pools) before the initial pool has enough bits, vs a non-staggered design.


IIRC OpenBSD saves a seed each time the system is turned off. Of course, the very first boot is problematic, but that one tends to be the installation.


Saving a seed on disk is fine (and good), but orthogonal to the Fortuna entropy pool design.

FreeBSD does the same, but not every bootloader can put it in kernel memory. And/or not every configuration has a writeable /boot. Writing some entropy from the previous boot (or installer) would be helpful for a similar set of Linux systems, but would not solve the problem in the hardest cases. And Linux userspace is developed by a different group than kernel, so it is somewhat harder to make these kind of systemic changes than in BSD land.


So does Linux (although by default it only seeds it after initial boot, so doesn't help with boot-time seeding), but the problem is that the seed must be trusted. Which may be true (or true enough) in a number of specific cases, but it's not true in general.


how does the kernel estimate entropy?


It's not that this is hard to design, it's that it's hard to retrofit given the installed base of weird shit that already cryptically relies on the existing behavior, bugs and all.


It's kind of both. There's the existing behavior, and it's also just sort of impossible to solve on some of these weird archs with nothing unpredictable going on.


I wasn't aware randomness had behavior.


In FreeBSD, we're equally screwed on these architectures with no good source of random data and no ability to persist some entropy from a previous boot.


There’s a nice overview of OpenBSD’s RNG system here[1], or video if you prefer [2].

[1] https://www.openbsd.org/papers/hackfest2014-arc4random/index...

[2] https://m.youtube.com/watch?v=aWmLWx8ut20


Off topic: I decided to try to figure out what LWN stands for. Not on front page. Not in FAQ (unless I missed it. Should be first question.)

I think this oversight is typical for humans but especially for experts. It’s like the inverse of Dunning-Kruger: you’re too smart to realize what others don’t know.


According to the FAQ: https://lwn.net/op/FAQ.lwn#general

> What does LWN stand for, anyway?

> LWN, initially, was "Linux Weekly News." That name has been deemphasized over time as we have moved beyond just the weekly coverage, and as we have looked at the free software community as a whole. We have yet to come up with a better meaning for LWN, however.


It's the last entry in the first section of the FAQ:

> LWN, initially, was "Linux Weekly News." That name has been deemphasized over time as we have moved beyond just the weekly coverage, and as we have looked at the free software community as a whole. We have yet to come up with a better meaning for LWN, however.

AT&T used to stand for American Telephone & Telegraph, now it's just AT&T. Same idea.


Thanks. I completely missed it. I went back and still missed it once more.


Wikipedia [0] states that:

    The acronym "LWN" originally stood for Linux Weekly News; that name is no longer used because the site no longer covers exclusively Linux-related topics, and it has daily as well as weekly content.
[0] https://en.wikipedia.org/wiki/LWN.net


Oh interesting. So maybe it’s intentionally meaningless now? Probably a good FAQ leading question. Thanks for looking it up.



What are the kind of boot/init processes that require the RNG? Is it only a requirement on some systems, or does basically every gameboy, laptop, soda machine etc have init processes that would need the RNG?

The issue that triggered this revert, if I'm understanding it, was that RNG reads had to be minimally decent and non-blocking already in order for a certain system to seed the RNG.

This is definitely breakage, but it'd be interesting to know about the less "reflexive" applications.


Lots of network things need (or desire anyway) randomness. A soda machine probably doesn't need randomness (unless it contacts a payment network), most early video game systems are deterministic, some games can have their 'random' state easily manipulated by user input.

A laptop is likely to need randomness during boot. DHCP is supposed to have a random 4-byte XID in the discover message (to be matched by the client in responses), ipv6 slaac privacy wants randomness too. 802.1x (flawed as it is) needs random values as well. If you're doing those, probably you've got a live network cable which might provide some entropy.

Things like address space randomization want to happen as well.


KASLR, for one.

The original gameboys have absolutely no entropy sources other than reading uninitialized SRAM memory.


I believe there are quite a few systems that generate and install their own private-keys / certificates at boot time. Those should be rather unpredictable, being private-keys and all.


Jason is doing great work here. These aren’t fun to solve problems.

Even if long term he can bootstrap a proper userspace ecosystem of using /dev/*random, will the unification ever be possible if we have to support these old systems?

I am generally in favor of maintaining 100% userspace backward compatibility but in this case I would support changing the behavior since the old shell scripts were exploiting behavior that wasn’t documented and was always incorrect.


In the context of Linux on embedded and mobile breaking user space would lead to hardware manufacturers essentially freezing on the kernel version right before the break. (For both new projects or in the exceedingly rare case where they upgrade deployed devices).

This would be exceedingly unwise and pushes against the vast amount of effort the kernel community is expending to secure the kernel and bring OEMs on non ancient versions of the kernel.

Essentially these are not old devices, they are new and you likely have purchased some of them without even knowing.


Wouldn’t it be easier to patch their startup scripts than freezing their kernel?


Wouldn't switching the order of operations & getting Fortuna to be the PRNG be an easier first step and then try merging /dev/urandom and /dev/random? Since Fortuna doesn't have the entropy estimation problem, doesn't that bypass the issue? I imagine it would also bypass the related RNDADDTOENTCNT issue because you can just ignore that API call and writes would be contributing entropy just like anything else.

On a separate note, I'm not sure I understood the problem with crediting writes to random as contributing that many entropy bits when the write happens by Admin & the pool isn't yet initialized. It seems like it would largely solve this class of problems without introducing any security issues, would it?


> On a separate note, I'm not sure I understood the problem with crediting writes to random as contributing that many entropy bits when the write happens by Admin & the pool isn't yet initialized. It seems like it would largely solve this class of problems without introducing any security issues, would it?

There are existing startup scripts out there that write, as root, completely predictable data (the example given is the kernel commandline in Android).


At what point does "you made a willfully wrong program" override "we don't break userland"? Normally I would expect something like "we don't break documented and intended behavior we told userland to expect". It makes sense that Linux tries for a bit more than that, but if you do it too much you get almost instant ossification, which is bad.


Those programs weren't "willfully wrong". That status quo is that whatever you wrote would be potentially mixed into the pool, but not counted as adding any kind of unpredictability unless the ioctl() was used to specifically indicate this.
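The two paths are visible directly in the interface. A sketch (Linux-specific; the ioctl request number below is the common 64-bit encoding of RNDADDENTROPY, and actually crediting entropy requires CAP_SYS_ADMIN, so unprivileged runs fall through to the mix-only path):

```python
import fcntl
import os
import struct

# RNDADDENTROPY = _IOW('R', 0x03, struct rand_pool_info): the header
# is (entropy_count_in_bits, buf_size_in_bytes), then the bytes.
RNDADDENTROPY = 0x40085203

def credit_entropy(data: bytes, bits: int) -> None:
    """Mix `data` into the pool AND credit `bits` of entropy.
    Requires CAP_SYS_ADMIN -- which is the point: plain writes to
    /dev/random are unprivileged and therefore credit nothing."""
    req = struct.pack("ii", bits, len(data)) + data
    with open("/dev/random", "wb") as f:
        fcntl.ioctl(f, RNDADDENTROPY, req)

seed = os.urandom(32)
try:
    credit_entropy(seed, 8 * len(seed))
    print("mixed and credited 256 bits")
except PermissionError:
    # The unprivileged path: the bytes are still mixed in, but the
    # pool's initialized/uninitialized state is unaffected.
    with open("/dev/random", "wb") as f:
        f.write(seed)
    print("mixed without credit")
```

This split is why scripts that blindly write predictable data (like the Android kernel command line) are harmless today: the data is mixed but uncredited, and crediting such writes automatically is exactly the behavior change under debate.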


Admittedly, this would assume the computer is connected to the Internet, but is there not some theoretically reasonable way to get entropy off the Internet? Maybe some server with a known certificate that can return entropy from lava lamps or similar?

I realize part of the issue here is kickstarting the process right after boot. But is that the entire issue?


You're awoken abruptly, you've just got out of grub, you don't have networking interfaces, you don't have any higher level cryptography libraries, the concept of connecting to something on the internet to get entropy in order to continue booting seems like a complete fantasy. You can gather what little entropy you have, mix it with suspicious sources like RDRAND, and paper over the cracks once you've got some better sources on the burner.


you could try rowhammering yourself maybe? aren't the bitflips both spatially and temporally random, depending on em conditions?

i never knew that any user could write into the entropy sources by writing to the random block devices... that's sketch.


I think that rowhammering is too hardware specific. I could imagine much slower embedded devices being immune because their old slow RAM is too robust, and it would really suck if ECC RAM (perhaps even opaquely ECC RAM) causes your boot to hang.


maybe you could do something with race conditions?

it's an interesting question: is there a clever way to extract random bits from a clocked ttl system by driving it in some weird way that is platform independent enough to cover the bases that don't already have hardware support for generating random bits (or have cycle counters)?

i suppose the problem is that not all chips have a cycle counter so extracting clock drift/jitter isn't always feasible... i wonder what else could work...

race conditions could be an interesting way to try to draw random bits, but probably very hardware dependent... i wonder if there are other glitchy things...


Some previous LWN discussions on this approach:

https://lwn.net/Articles/642166/

https://lwn.net/Articles/802360/


Why is that sketchy? There’s nothing you could write that would make the output of the RNG worse.


well, i'm not a random number scientist or anything... but i do find it a little troubling that the random number generator at the core of linux that is trusted by everything has a secret "add more random" button on it.


Is it secret? You can just cat file.txt > /dev/random if you want to seed the generator.


can you? anyone in the village is allowed to dump old junk in there. old shoes, dead cats, kernel command lines, politician emails, old diaries and missing tax returns. and then what happens once it's in there? well, it can't really be trusted or used to actually seed the rng. half the junk in there is static. so it just kinda quietly gets mixed into the entropy hash pools... to what end exactly? it's neither ignored nor actually used in a meaningful way, so the whole problem it's trying to solve is ill-posed. this is a distinct code smell, where the problem was very hard and no good solution was found, so there are still remnants of tried and aborted approaches laying around like pieces of disused viaduct after a major earthquake. for something as critical as this, yes, i'd argue that it is indeed, sketchy.

but that's less interesting than generating entropy by exploiting the digital systems equivalent of quantum effects. which has me wondering now: what if quantum entanglement is just rowhammer for reality...


No matter what you send into the device, it can't make the output lower in quality, but it might make it better. Seems like a win-win in my books.


it's the notion, that it could make it "better", that is problematic.

does this mean that users who don't feed the pool manually have inferior keys?


If your hardware is impervious to rowhammer you'd get nothing


You can easily seed randomness with user input, like the mouse.

https://security.stackexchange.com/questions/10438/what-are-...


What mouse? You haven't finished booting yet. You don't know if you're a laptop, a router or a lightbulb.


Ok then. Laptops have temperature sensors that can add entropy. A super small embedded device like a lightbulb should keep one of its I/O pins open. A floating wire also gives entropy.

If the question is can you generate random numbers on a computer in a sealed room with absolutely no input, a static image loaded on it, and no storage devices to keep any state, of course you can't generate a random number.

The vast majority of things we put computers in do have some form of input that is random enough for a seed unless you design it that way on purpose.


boot the mouse first, then the os


Somehow I managed to generate RSA keys with PGP 2.x in the early 1990s on a Linux system without a mouse or an always-on connection to the Internet. It took about five minutes of mashing the keyboard, but I was able to do it.

Yet we can't find enough entropy in 2022 to boot a system?

Also, why don't the BSDs seem to have as many problems on the subject as Linux apparently does?


> It took about five minutes while I mashed the keyboard

> Yet we can't find enough entropy in 2022 to boot a system?

A Linux system will have no trouble at all if you're around to mash the keyboard while it boots. Nor if it's doing normal computer things like (IIRC) running processes on the CPU, talking over the network, or accessing files, all of which create more than enough activity & jitter to seed the entropy pool within a few seconds. Or if your computer has some form of writable storage, in which case the kernel loads some high-quality entropy that was saved from the previous boot.

The issue in question is a quite rare edge case. It only affects certain kinds of embedded systems that lack hardware capable of generating entropy on-demand, don't have writable storage, and run software that (for whatever reason) absolutely needs random numbers early during the boot process. In this situation, it is possible to get into a deadlock where a process is waiting for entropy to be available, but the system cannot reach a state where it is capable of generating entropy until that process completes.

Linux currently provides /dev/urandom to work around the issue by providing insecure randomness instead of blocking if insufficient entropy is available. This is rarely a good solution, and it can obviously lead to unexpected security vulnerabilities. Making the issue worse is the fact that /dev/random used to have much more annoying problems, so a lot of old code uses /dev/urandom when it definitely should be using /dev/random in this day and age.

The kernel developers tried to solve the problem once and for all by adding a "jitter dance" that can generate entropy out of thin air, and making /dev/urandom the same thing as /dev/random. But, unsurprisingly, this broke buggy code on platforms that can't do the jitter dance.

> Also, why don't the BSDs seem to have as many problems on the subject as Linux apparently does?

A bit of googling shows they have exactly the same problems. The FreeBSD manpage for /dev/random [0] says that /dev/random and /dev/urandom behave identically on BSDs, with a setting allowing the system administrator to choose whether they block or return insecure randomness in that edge-case situation.

The reason you've seen so much news about the problem on Linux is because the Linux kernel devs recognize that the current situation is very non-ideal, and they really want to find a way to fix it without breaking compatibility with buggy software running in niche environments. I don't know why you don't see as much news about the BSDs, but it must be either because BSD is less popular than Linux, or maybe BSD developers haven't put as much effort into fixing the problem. In any case, I see these discussions as reassuring: it demonstrates that Linux kernel developers care about fixing issues that only affect a tiny minority of users, and that they care about maintaining near-perfect backwards compatibility, even with ancient, buggy, and niche software.

[0]: https://www.freebsd.org/cgi/man.cgi?query=random&sektion=4


> I don't know why you don't see as much news about the BSDs, but it must be either because BSD is less popular than Linux, or maybe BSD developers haven't put as much effort into fixing the problem.

I think that's certainly a significant portion of it, but also, the BSD behavior has been fairly consistent for a long time, while the Linux behavior has been changing recently. It's a bigger deal if your embedded system (or virtual machine) breaks when you upgrade to a newer kernel than if it's always been broken.


From `man urandom` on OpenBSD 7.0:

> Entropy data stored previously is provided to the kernel during the boot sequence and used as inner-state of a stream cipher. High quality data is available immediately upon kernel startup.

> For portability reasons, never use /dev/random. On OpenBSD, it is an alias for /dev/urandom, but on many other systems misbehaves by blocking because their random number generators lack a robust boot-time initialization sequence.


The scheme described there has the disadvantage of requiring a writable filesystem, whereas Linux is designed to run on embedded hardware with no writable storage. If you're running "normal" hardware, Linux can do the same "save some entropy in a file" trick.

> For portability reasons, never use /dev/random. On OpenBSD, it is an alias for /dev/urandom, but on many other systems misbehaves by blocking

This seems like bad advice to me (at least as a blanket statement). On recent versions of Linux, /dev/random only blocks if the kernel cannot guarantee cryptographically secure randomness. If you use /dev/urandom in this situation, you get potentially insecure randomness, which can lead to security vulnerabilities. My understanding of the current best practice is: unless you're running on weird embedded hardware, writing an application which absolutely needs randomness during earlyboot, AND you don't care about the quality of the randomness, you shouldn't use urandom.

Perhaps the manpage was referencing an older version of Linux where the entropy pool could be "drained" and had to be re-seeded periodically, but that was not a great design and was replaced in 2020: https://lwn.net/Articles/808575/


It's my understanding that openbsd boot will take a bit longer to seed the random seed file when there is not a prior seed file on a writable filesystem, just like during the first boot after the os install. This was my experience when I created an openbsd router that booted from a single file and had no persistent storage.

It's also my understanding that /dev/urandom on openbsd will never block for a user process, nor anything that I've ever needed to worry about in the kernel.

Some additional info that I found: https://www.openbsd.org/papers/hackfest2014-arc4random/


Something something perfect good blah blah.


Most Linux systems don't even have a mouse, and even those that do, users tend not to fiddle with them in early boot, so no signal there, even if you've gotten around to initializing it already.


The majority of Linux devices do not have any keyboard or mouse, and those that do will see absolutely no input during init; you've never seen "move the mouse around a bunch to begin boot".


I am wrong, please ignore my above post.


great windows solution


No it isn't the entire issue. Or more precisely on your average desktop system this isn't really a problem. The jitter dance will suffice here.

The clue is in randomness seed files. You won't have one of these on a PC because you don't need it. Once you have your 128/256 bits of entropy you can stretch this using a DRBG for actual use and it'll be good.

The problem is systems that are much more deterministic, because they have simpler CPUs that don't do speculative execution, or have anything fancy by way of caching. Their peripherals are fixed at boot time via a devicetree file and udev style peripheral plug and play doesn't happen.

These tend to be embedded use cases. To get around this, you have a seed file to give you at least some good entropy. Then you mix in the best you can get and update that seed file for next boot. Hopefully this plus what you get while the device is on is enough for next time.
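The seed-file dance can be sketched like this (path and size are illustrative; real implementations such as systemd-random-seed use a location under /var/lib and also re-save at shutdown):

```python
import os
import tempfile

# Illustrative location; real init systems use something like
# /var/lib/systemd/random-seed.
SEED_PATH = os.path.join(tempfile.gettempdir(), "random-seed.bin")

def save_seed(n: int = 512) -> None:
    """After boot (and at shutdown): stash pool output for next boot."""
    with open(SEED_PATH, "wb") as f:
        f.write(os.urandom(n))

def load_seed() -> int:
    """At early boot: write the saved seed back into the pool.
    Note a plain write only mixes -- it credits no entropy, so on
    its own it will not mark the pool as initialized."""
    with open(SEED_PATH, "rb") as f:
        seed = f.read()
    with open("/dev/urandom", "wb") as f:
        return f.write(seed)

save_seed()
print(load_seed())   # 512
```

The weak links are exactly the hard cases described here: on a read-only filesystem there is nowhere to put the seed file, and on first boot it doesn't exist yet.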

A lot of crypto including the tls you might use to get stuff off the internet will break if the quality of randomness is not good enough.

You are reading 'Linux' and hearing 'desktop' but the reality is there is a lot more embedded Linux than desktop Linux, and that's where the entropy problems start to creep in depending on the device in question.


> You are reading 'Linux' and hearing 'desktop' but the reality is there is a lot more embedded Linux than desktop Linux

Yeah, this is a weird example of availability bias. Of the literally billions of Linux instances on the planet, how many are desktops these days? One in a thousand? Even less? Most of them are Android phones, most of the rest (other) embedded devices, almost all of the rest, servers.


If you ask me, every motherboard should ship with a bit of Americium 241, that way they have a physical entropy source AND can detect when a fire has occurred inside the case (:


Processors already have instructions to generate random numbers from physical fluctuations.


And temperature sensors to detect fires!


Some even have instructions to halt and catch fire.


That’s one way to test that sensor.


Sending an ssl/tls request also requires some entropy! It's needed to avoid MITM and replay attacks.

A bit more info: https://security.stackexchange.com/questions/157684/why-does...


It doesn't need to be encrypted, just signed by the entropy source. The signature should be verifiable without entropy.


How does that prevent replay attacks?


Hmm good point. I guess you need entropy for a nonce.


Maybe include and sign a timestamp?


Sorta nice, but that wouldn't avoid replay issues.


Yes, seeding entropy from a server is a thing you can do. One such service is "pollinate".

I don't think that would help here.


Apart from all the other reasons this is a very bad idea, the brick wall you run into is that you need cryptographically secure randomness to safely fetch randomness from the Internet.


> I realize part of the issue here is kickstarting the process right after boot.

Really it's during the bootstrap phase, unless you define that process as complete once the CPU has branched into the kernel's address space, which I wouldn't consider a useful definition. After all, kernel's enumeration of the attached devices and such is also part of the bootstrap process.


I wrote https://mjg59.user.srcf.net/entropy/ over 20 years ago, so yes, this is an entirely solved problem.

(The issues associated with seeding an RNG with data entirely controlled by a third party are left as an exercise for the reader)


Even assuming trust of the 3p service - establishing an ssl connection actually requires entropy too. Even more raw - how could a zero entropy client generate a nonce to protect itself against replay attacks by man in the middle?

I suppose this situation works if you assume both a trusted third party service and a trusted networking layer, but all those assumptions greatly inhibit usefulness.


Anything you get from a network interface could have been interfered with by a third party.


Do they have CPU temperature at that point in boot? That should still work given enough time to gather entropy, on jitterless platforms.

They also have fixed seeds+ RTC on some platforms.

Seems like they should be able to find a few bits of entropy on almost any platform.


The problem is that some environments don't have that kind of information. For example, VMs.


VM host does, can expose emulated randomness source.


but it doesn't, so it isn't a solution


No, that's wrong. There are actually a number of ways VM hosts can and do expose host entropy to guests. On x86, for example, they can allow the guest to use RDRAND/RDSEED. On anything with PCI devices, there's VirtIORNG: https://wiki.qemu.org/Features/VirtIORNG . VM hypervisors absolutely implement this service, or similar guest services, and hosting providers enable them.


Sure. But the issue here is when these features are not used correctly. For non-VMs, Linux uses CPU jitter as a fallback if the random device is missing or misconfigured. But for VMs, that fallback doesn't work as well.


I'm sure the kernel devs can make it happen; I don't see why VM devs wouldn't want to add this if they saw clear demand.

They probably couldn't just use RDRAND and the like since Linux doesn't seem to trust that though, so the API could be tricky.


Being able to have a solution isn't enough, even having a solution in hand people could switch to isn't enough. You need to actually show no currently working setups are going to break as a result of the change, regardless what those setups could do to avoid breaking after the change.


How exactly can existing code break if you change randomness? It's already random! Changing random from random to random shouldn't make a difference.


There's an example described in the article, unless I'm misunderstanding this thread. It doesn't change the output distribution, but rather the behavior around the interface, causing certain users of the old interface to block forever on the new.


"break" in this case means made less random/more prone to malicious attacks, hence made less secure. The code won't break in the sense of visibly failing; it'd just silently switch to being less correct and more vulnerable - which is arguably worse.


> Seems like they should be able to find a few bits of entropy on almost any platform.

We often do... the seed file from last boot. Just need a /dev/no_im_root_and_i_really_want_to_add_and_credit_this_entropy_random device.


Sounds like they need a way for userspace to state it accepts that it is running on a system that doesn’t have randomness available. Unfortunately that requires a change to userspace.


getrandom(GRND_INSECURE) is this
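For reference, a sketch of that call from Python (Linux-only; os.getrandom has existed since Python 3.6, but the GRND_INSECURE flag value is defined by hand here because the os module doesn't export it):

```python
import os

GRND_INSECURE = 0x0004   # Linux >= 5.6; not exported by the os module

def early_boot_bytes(n: int) -> bytes:
    """Ask for n bytes without blocking on pool initialization,
    explicitly opting in to possibly-insecure output. Falls back to
    the normal blocking-until-initialized call on older kernels."""
    try:
        return os.getrandom(n, GRND_INSECURE)
    except OSError:   # EINVAL before Linux 5.6
        return os.getrandom(n)

print(len(early_boot_bytes(16)))   # 16
```

The opt-in is the point: the caller states "I accept that this system may have no randomness available", rather than the kernel guessing on its behalf.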


I mean some kind of toggle that disables the whole entropy counting so you don’t have to change all the executables.

Today you can emulate old computers in such a way you can run it 10 times and get exactly the same result each time. If you were able to run the Linux kernel like this there simply is no way to get randomness.

The solutions I can imagine are for the kernel to block, as happened here, for the kernel to give back numbers that aren’t really random (because if you rerun the emulation you’ll get the same numbers) or for userspace to say the system accepts numbers that aren’t secure random numbers.


Yes, I think that ultimately the solution is to remove entropy counting entirely.


Don't break userspace!


Do people trying to crack crypto go after trying to predict the outputs of these random number generators? Would they be used to generate keys for SSL key exchange?


Yes. Here's a non-crypto instance of using a seed value to deterministically identify the output of a random number generator. This only works in Python 2.x:

    import random
    for i in range(0,50):
        if not i % 15:
            random.seed(1178741599)
        print [i+1, "Fizz", "Buzz", "FizzBuzz"][random.randint(0,3)]
By seeding the PRNG with a known value, one can literally predict the output of a weak number generation utility. In this case, it solves the FizzBuzz puzzle.

https://en.wikipedia.org/wiki/Fizz_buzz


I recall a piece of research looking at all RSA public keys in certificates on the web they could find. They found a few keys that were very _very_ common. This was the result of exactly these kinds of low entropy key generation problems.

This is problematic even if no-one manages to predict the outcome of the random generator, because it means if someone throws a stupid amount of compute at the problem to brute-force the private key, they can compromise many different people.

But if the source of those repeated keys can be found, an attacker could get their own copy, reverse-engineer the key-generation process, and then try to predict the randomness. In which case they would have a rather cheap way to compromise many different keys.

Looking at it differently, if the problem is big enough, then it can leave a signature of duplicate keys being out there in the wild. These duplicate keys on their own are bad, but the signature can also be used by attackers to know much better where to look for bad entropy.


(1) They do if the RNG isn't seeded or is seeded improperly; otherwise, it's pointless (you might as well try to break AES).

(2) Yes.


someone needs to take a radio telescope and point it at a random (haha) bit of sky with a live feed of the noise being turned into a live stream of 1s & 0s. then create an API that allows one to get a number of bits from the live stream. let's make /dev/*random require an internet connection!

Also, we'll need some jammers so the haxors don't try to rick roll the signal


How do you verify, without being able to send a random challenge, that you are really talking to the radio telescope when it feeds you random data?

If you want to do TLS as a client you still need secure randomness. You need it as input to Diffie-Hellman (or whatever key-agreement system you are using). Or, if you are doing old TLS with pure RSA, you need an unpredictable RSA key.

In any case, if an attacker can predict your randomness, he can pretend to be the radio telescope over TLS. The problem is inherent in all asymmetric crypto you would use. You could 'get around this' by using symmetric crypto, but then you need a shared secret, which does not scale.


Thanks for reminding me why I'm not a comedian. If you couldn't tell that anything in my comment was a joke, then I'm obviously not good at comedy.


It was obvious your suggestion was a joke. But there was a slightly surprising reason (besides the other obvious ones) that your joke-suggestion would be a bad idea. Hence it seemed interesting for the general discussion to explain this reason.


Not quite stars, but this is super duper close:

https://www.random.org/

> RANDOM.ORG offers true random numbers to anyone on the Internet. The randomness comes from atmospheric noise, which for many purposes is better than the pseudo-random number algorithms typically used in computer programs.


There are fictional plot lines that involve using atmospheric/space noise as the methods for creating one time cypher pads for the ultimate in security. How much of that is based in reality is not known to me, but I know Tom Clancy used this type of encryption as a plot device.

For one time use cyphers or other non-realtime use, it seems like a good source. For realtime back and forth computer negotiating it seems to be much less useful.


Don't give the systemd people any ideas ! PLEASE!


SGI had an online random number source fed by cameras pointing at lava lamps.


Of course they did. Something that was slower than christmas to make noticeable changes, got hotter than hell, and took up way more space than anything else for the performance.



Reading all this makes me almost suspect that real world security for most people would be to just use the RDRAND from the cpu.


Yes, but lots of systems don't have it.


Why don’t all computers provide a random hardware generator? Randomness is trivial in dedicated hardware..


It is but you need to trust the hardware. Trust is the keyword here. Long battles have been fought over this. Many high profile authors have (successfully) argued against the use of unauditable sources of entropy being used in /dev/random.

https://en.wikipedia.org/wiki/RDRAND#Reception


If you don't trust the hardware, any software you run on it is already compromised.


Yeah, but a broken RNG hidden in unauditable, encrypted, obfuscated microcode is more likely to go undetected than a flaw in something hotter and more performance-critical. Even then, Meltdown and Spectre took a long time to figure out.


Because some computers are a 10c part in a 50c device.


Are they the same computers running Linux though?


Yes, Linux has a tremendous existing embedded footprint across a number of interesting minimally specced devices.

You (and everyone) should care because of the ever-expanding internet-of-things movement connecting these devices to the rest of the computing universe. This represents a brand-new frontier of attack surfaces, and if the kernel cannot provide foundational entropy, there's no plausible security story for the rest of the stack.

The cynic would be quick to point out that IoT is a nightmare anyway, and even if the kernel provides the tools, the manufacturer still needs to use them, and use them in the right way.


I’d be more convinced by an example of a 10c microprocessor that can run Linux. Obviously it sounds better to support everything, but there also are (and should be) bounds to what Linux should try to support.


Just about all Linux-capable computers have a hardware RNG by now, don't they? Just seed the random device from that.
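That's roughly what rngd does today. A minimal sketch of the mechanism, assuming Linux, root privileges, and the usual /dev/hwrng char device exported by the kernel's hw_random framework:

```python
# Sketch: feed hardware RNG bytes into the kernel entropy pool and
# credit them via the RNDADDENTROPY ioctl (requires CAP_SYS_ADMIN).
import fcntl
import os
import struct

RNDADDENTROPY = 0x40085203  # _IOW('R', 0x03, int[2]) from linux/random.h

def pack_rand_pool_info(data: bytes, credit_bits: int) -> bytes:
    # struct rand_pool_info { int entropy_count; int buf_size; __u32 buf[]; }
    return struct.pack("ii", credit_bits, len(data)) + data

def add_entropy(random_fd: int, data: bytes, credit_bits: int) -> None:
    fcntl.ioctl(random_fd, RNDADDENTROPY, pack_rand_pool_info(data, credit_bits))

if __name__ == "__main__" and os.geteuid() == 0 and os.path.exists("/dev/hwrng"):
    with open("/dev/hwrng", "rb") as hw, open("/dev/random", "wb") as rnd:
        seed = hw.read(64)
        add_entropy(rnd.fileno(), seed, credit_bits=8 * len(seed))
```

Writing to /dev/urandom alone mixes bytes in without crediting entropy; the ioctl is what credits it, which is exactly the part that needs privilege.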


Many computers have TPMs or Secure Enclaves or whatnot these days. Many CPUs have similar or subset functionality built in. Why don’t manufacturers start to add hardware entropy sources? Then /dev/*random could source entropy from a driver, not an algorithm.

Yes, I know that it only fixes things going forward, and I know that the answer to any question that starts with “why don’t they” is “money”.

Sigh.


> “why don’t they” is “money”.

More likely "trust". A purpose-built instruction with a black-box implementation, on hardware that has its own hidden OS with direct access to the network, is one issue. The fact that Intel's CPUs are a bug-riddled mess that has been a significant headache for anyone caring about kernel- and application-layer security is another. Who needs security if disabling all the mitigations makes you look great on benchmarks?


A hardware entropy source or an HRNG is observable and testable.

And we trust purpose-built security hardware all the time; think crypto accelerator cards and HSMs.


It's much harder to test the output of a TRNG.



CONFIG_RANDOM_TRUST_CPU=y

CONFIG_RANDOM_TRUST_BOOTLOADER=y

CONFIG_HW_RANDOM=y

CONFIG_HW_RANDOM_TPM=y

and so forth all exist.
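A sketch of checking those against the running kernel from userspace (both config paths are distro-dependent assumptions; /proc/config.gz only exists with CONFIG_IKCONFIG_PROC):

```python
# Sketch: report which RNG-related options the running kernel was built
# with, trying /proc/config.gz first, then the common distro path
# /boot/config-$(uname -r). Returns {} if neither is available.
import gzip
import os
import re

OPTIONS = (
    "CONFIG_RANDOM_TRUST_CPU",
    "CONFIG_RANDOM_TRUST_BOOTLOADER",
    "CONFIG_HW_RANDOM",
    "CONFIG_HW_RANDOM_TPM",
)

def parse_config(lines, options=OPTIONS):
    found = {}
    for line in lines:
        m = re.match(r"(CONFIG_[A-Z0-9_]+)=(\S+)", line)
        if m and m.group(1) in options:
            found[m.group(1)] = m.group(2)
    return found

def running_kernel_config():
    if os.path.exists("/proc/config.gz"):
        with gzip.open("/proc/config.gz", "rt") as f:
            return parse_config(f)
    path = "/boot/config-" + os.uname().release
    if os.path.exists(path):
        with open(path) as f:
            return parse_config(f)
    return {}

if __name__ == "__main__":
    print(running_kernel_config())
```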



