Just days after the release of a portable version of the crypto library, a flaw was reported in LibreSSL’s pseudo-random number generator – its PRNG, a vital component in strong encryption. Andrew Ayer was able to write a program that could call LibreSSL’s PRNG twice and get back the exact same stream of bits each time, which is not supposed to happen. Ayer, the founder of secure backup company Opsmate, described this bug as a “catastrophic failure of the PRNG.”
The OpenBSD project rejected this assessment as “overblown”, and instead said the glitch was “minor”. Nonetheless, the team promptly addressed the problem in LibreSSL version 2.0.2 and released it. OpenBSD contributor Bob Beck reckoned a “contrived test program”, such as the one produced by Ayer, was needed to reproduce the bug.
Ayer’s fork_rand proof-of-concept code exploited the fact that fork()ing a program linked with LibreSSL produces a child process with the same PRNG state as the parent – meaning they will both spit out the same sequence of pseudo-random numbers. This is not particularly brilliant because it means, for example, one process knows what another’s random bit stream will look like, and these bit streams are used to generate secure encryption keys. Secrets will be leaked, in other words.
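The general hazard is easy to reproduce with any generator that keeps its state in process memory. What follows is a minimal Python sketch, not Ayer's fork_rand program and not LibreSSL code. It assumes a Unix-like system where `os.fork()` is available, uses the standard library's `random.Random` as a stand-in for a userspace PRNG (it is not cryptographically secure), and the function name `fork_and_draw` is made up for illustration.

```python
import os
import random


def fork_and_draw(seed=1234, count=4):
    """Fork once and return (parent_values, child_values) drawn after the fork."""
    prng = random.Random(seed)  # toy stand-in for a userspace PRNG; not secure
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: it inherited a copy of the parent's PRNG state.
        status = 1
        try:
            os.close(read_fd)
            values = [prng.getrandbits(32) for _ in range(count)]
            os.write(write_fd, ",".join(str(v) for v in values).encode())
            os.close(write_fd)
            status = 0
        finally:
            os._exit(status)  # leave immediately, skipping the parent's cleanup
    # Parent: draw from its own copy of the same state.
    os.close(write_fd)
    parent_values = [prng.getrandbits(32) for _ in range(count)]
    with os.fdopen(read_fd, "rb") as pipe:
        child_values = [int(v) for v in pipe.read().decode().split(",")]
    os.waitpid(pid, 0)
    return parent_values, child_values


if __name__ == "__main__":
    parent, child = fork_and_draw()
    print("parent:", parent)
    print("child: ", child)
    print("identical streams:", parent == child)
```

Because neither process reseeds after the fork, both draw from identical state and produce the same values, which is exactly what a fork-aware PRNG has to prevent.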
LibreSSL tries to detect when it is running in a fork()ed process by checking its process ID (PID): if the PID has changed, it concludes it is running in a new child and duly reseeds its PRNG to produce different numbers. But Linux caps PIDs at 32,768 by default, and the numbering wraps around when it hits that limit. If a program fork()s hard enough, a grandchild could end up reusing its grandparent’s PID, and the aforementioned reseed check fails.
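The failure mode can be modelled without creating any real processes. The sketch below is a simulation under stated assumptions, not LibreSSL's actual code: the class `PidCheckedPrng`, the function `simulate_pid_wraparound`, the PIDs and the seeds are all invented for illustration, `copy.deepcopy` stands in for the state duplication that fork() performs, and `random.Random` is again a non-cryptographic placeholder.

```python
import copy
import random


class PidCheckedPrng:
    """Toy PRNG that reseeds only when it sees a PID it has not recorded."""

    def __init__(self, seed, pid):
        self._rng = random.Random(seed)  # stand-in generator; not secure
        self._seen_pid = pid
        self.reseeds = 0

    def draw(self, current_pid, fresh_seed):
        # The fork check: a changed PID is taken to mean "I am a new child".
        if current_pid != self._seen_pid:
            self._rng = random.Random(fresh_seed)
            self._seen_pid = current_pid
            self.reseeds += 1
        return self._rng.getrandbits(32)


def simulate_pid_wraparound():
    """Model grandparent -> child -> grandchild where the PID is recycled."""
    grandparent_pid, child_pid = 1000, 1001  # made-up PIDs
    grandparent = PidCheckedPrng(seed=42, pid=grandparent_pid)

    child = copy.deepcopy(grandparent)  # first fork(): state is duplicated
    grandchild = copy.deepcopy(child)   # second fork(); the child never drew

    # Assumption: the grandparent has exited and the PID counter wrapped,
    # so the grandchild was handed the grandparent's old PID.
    grandchild_pid = grandparent_pid

    return {
        # Stands for output the grandparent produced after forking.
        "grandparent": grandparent.draw(grandparent_pid, fresh_seed=1),
        "child": child.draw(child_pid, fresh_seed=2),
        "grandchild": grandchild.draw(grandchild_pid, fresh_seed=3),
        "child_reseeds": child.reseeds,
        "grandchild_reseeds": grandchild.reseeds,
    }


if __name__ == "__main__":
    print(simulate_pid_wraparound())
```

In this model the child sees a new PID and reseeds, while the grandchild sees the PID already recorded in its inherited state, skips the reseed, and repeats the grandparent's output.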
The LibreSSL developers say that OpenSSL gets around this problem with an ugly workaround, one they didn’t want to replicate. Instead, they told the Linux kernel developers to fix the circumstances leading to the problem in the kernel. Crypto subsystem maintainer Ted Ts’o agreed, but it will be a while until the change lands in a kernel release, and it might be years until it is actually deployed on production servers.
The getrandom(2) system call was requested by the LibreSSL Portable developers. It is analogous to the getentropy(2) system call in OpenBSD.
The rationale of this system call is to provide resilience against file descriptor exhaustion attacks, where the attacker consumes all available file descriptors, forcing the use of the fallback code where /dev/[u]random is not available. Since the fallback code is often not well-tested, it is better to eliminate this potential failure mode entirely.
The other feature provided by this new system call is the ability to request randomness from the /dev/urandom entropy pool, but to block until at least 128 bits of entropy have been accumulated in that pool. Historically, the emphasis in /dev/urandom development has been to ensure that the urandom pool is initialized as quickly as possible after system boot, and preferably before the init scripts start execution. This is because changing /dev/urandom reads to block represents an interface change that could potentially break userspace, which is not acceptable. In practice, on most x86 desktop and server systems, the entropy pool can be initialized before it is needed (and in modern kernels, we will printk a warning message if not). However, on an embedded system, this may not be the case. And so with a new interface, we can provide this requested functionality of blocking until the urandom pool has been initialized. Any userspace program which uses this new functionality must make sure that, if it is used in early boot, it will not cause the boot-up scripts or other portions of the system startup to hang indefinitely.
Ted Ts’o on the Linux Kernel Mailing List: http://thread.gmane.org/gmane.linux.kernel.cryptoapi/11666
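For a sense of what calling the new interface looks like from userspace, here is a hedged Python sketch. `os.getrandom()` and `os.GRND_NONBLOCK` are real standard-library names on Linux builds of Python 3.6 and later, but the wrapper `random_bytes` and its `allow_block` parameter are invented for this example. On platforms or kernels without the system call it falls back to `os.urandom()`.

```python
import errno
import os


def random_bytes(nbytes, allow_block=True):
    """Return nbytes of kernel randomness, preferring getrandom(2)."""
    if not hasattr(os, "getrandom"):
        # Platform without the wrapper: use the generic interface.
        return os.urandom(nbytes)
    # With flags=0 the call blocks until the urandom pool is initialized.
    # With GRND_NONBLOCK it raises BlockingIOError instead of waiting.
    flags = 0 if allow_block else os.GRND_NONBLOCK
    out = b""
    while len(out) < nbytes:
        try:
            # getrandom may return fewer bytes than requested, so loop.
            out += os.getrandom(nbytes - len(out), flags)
        except OSError as exc:
            if exc.errno == errno.ENOSYS:
                # Kernel predates the system call: fall back.
                return os.urandom(nbytes)
            raise
    return out


if __name__ == "__main__":
    key = random_bytes(32)
    print(len(key), "bytes:", key.hex())
```

No file descriptor is opened on the getrandom path, so descriptor exhaustion cannot push the caller into fallback code. A program that may run in early boot could pass `allow_block=False` and handle `BlockingIOError` itself rather than risk hanging system startup.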
People complaining about #LibreSSL PRNG ought to get their OS fixed to provide a decent entropy source instead. `Must be that tall to ride'
— Miod in the Middle (@MiodVallat) July 15, 2014