Bug 236010

Summary: rand(3) is a bad RNG, but lots of software uses it. Make it a shim around arcrandom(3)
Product: Base System Reporter: Yuri Victorovich <yuri>
Component: miscAssignee: freebsd-bugs (Nobody) <bugs>
Status: Closed Not A Bug    
Severity: Affects Some People CC: brooks, cem, delphij, gordon, kennedy.greg, koobs, pfg, secteam
Priority: --- Keywords: needs-patch, security
Version: 12.0-STABLEFlags: koobs: mfc-stable12?
koobs: mfc-stable11?
koobs: exp-run?
Hardware: Any   
OS: Any   

Description Yuri Victorovich freebsd_committer freebsd_triage 2019-02-24 21:11:19 UTC
---Problem description---
This simple program
> #include <stdlib.h>
> #include <stdio.h>
> int main() {
>   srand(time(0));
>   printf("%d\n", rand());
> }
demonstrates how a lot of software uses srand(3)/rand(3) functions.

The first random number seems to strongly correlate with time, here are subsequent run outputs:
> $ ./rand 
> 50768478
> $ ./rand 
> 50785285
> $ ./rand 
> 50802092
> $ ./rand 
> 50818899
> $ ./rand 
> 50852513
> $ ./rand 
> 50869320

It's always 50,000,000-something, which is strongly correlated with time.

---Practical problem that this causes---
Please observe this use of rand(3): https://github.com/ccxvii/mujs/blob/master/jsmath.c#L71 When the rand(3) output is simply mapped to a floating point like this, the resulting floating point values aren't random at all from run to run. Somebody could use such floating point pseudo-random number, truncate the less significant digits, and get a completely predictable values. The consecutive outputs of this routine that I observed are: 0.03040289788544448, 0.03041072425470755, 0.03041072425470755.

---Proposed solution---
Some 32-bit hash function can be used on the seed value that would map the value to a less seemingly related value.
Comment 1 Kubilay Kocak freebsd_committer freebsd_triage 2019-02-25 03:43:51 UTC
This sounds like a request to improve the algorithm, not necessarily achieving 'cryptographically secure', given these details from man rand(3):

     rand, srand, sranddev, rand_r – bad random number generator

     <snip>

     The functions described in this manual page are not cryptographically
     secure.  Cryptographic applications should use arc4random(3) instead.

     These interfaces are obsoleted by random(3).

     <snip>

     For better generator quality, use random(3) or lrand48(3).
Comment 2 Yuri Victorovich freebsd_committer freebsd_triage 2019-02-25 04:11:01 UTC
(In reply to Kubilay Kocak from comment #1)

Yes, srand(3)/rand(3) are bad random number generators, both in their algorithm and interface. But it is widely used.

I am suggesting that it can be easily improved by passing the supplied seed through one additional hash function that would further randomize it and make the rand(3) function somewhat 'less bad'.
Comment 3 Conrad Meyer freebsd_committer freebsd_triage 2019-02-25 05:37:18 UTC
I think any attempt to make rand(3) _look_ less obviously bad, but still *be* bad, is a step in the wrong direction.

A better alternative to make rand(3) more random-looking would be to take a page from OpenBSD and just make it a shim around the CSPRNG arcrandom(3).
Comment 4 Yuri Victorovich freebsd_committer freebsd_triage 2019-02-25 06:27:02 UTC
(In reply to Conrad Meyer from comment #3)

I agree.
Comment 5 Kubilay Kocak freebsd_committer freebsd_triage 2019-02-25 13:22:12 UTC
Would an exp-run for this shim around arcrandom(3) be handy? If not, unset exp-run flag.

This doesn't need to remain assigned to secteam for implementation, but they are CC'd.
Comment 6 Brooks Davis freebsd_committer freebsd_triage 2019-02-26 17:22:53 UTC
We might also consider adding a linker warning with the use of rand().  That would certainly require an exp-run and might prove too disruptive in practice.
Comment 7 Gordon Tetlow freebsd_committer freebsd_triage 2019-03-03 00:30:46 UTC
Adding a linker warning doesn't magically solve badly written software and I can't think of a single downside to moving rand and friends to using a CSPRNG. Can we just migrate to that instead and provide actual randomness?
Comment 8 Xin LI freebsd_committer freebsd_triage 2019-03-04 18:27:14 UTC
Note that there are actual cases where deterministic pseudo random is desirable (e.g. benchmarks for compression, I/O, query, etc.), otherwise the comparison would be less meaningful.

The OpenBSD approach seems to be a reasonable compromise (provide a new, explicit "srand_deterministic" interface to allow this usage, and change the default behavior to be a shim around arc4random(); they will also issue a link time warning whenever rand or rand_r is called to discourage its use).
Comment 9 Xin LI freebsd_committer freebsd_triage 2019-03-04 18:32:56 UTC
Note that for backward compatibility purposes, srand(3) version should be bumped, with the old version of symbol a call equivalent to srand_determistic() so we don't break existing binaries that wants the old behavior.
Comment 10 Pedro F. Giffuni freebsd_committer freebsd_triage 2019-05-04 19:16:15 UTC
I don't like the idea.

rand(3) is posix compliant and is still useful for the cases when you want deterministic randomness. The fact that few people actually require the deterministic feature is irrelevant: its an interface promise.

I'd rather prefer we do further work on improving the period by using one of the modern algorithms like xorshift1024*.(ache@ was working on this).
Comment 11 Pedro F. Giffuni freebsd_committer freebsd_triage 2019-05-04 19:19:04 UTC
(In reply to Pedro F. Giffuni from comment #10)

I should also say that is someone dares use rand(3) for something security-sensitive, the problem lies elsewhere and its something we won't fix by breaking the API.
Comment 12 Conrad Meyer freebsd_committer freebsd_triage 2019-05-15 22:07:29 UTC
Note that xorshift is also not a CSPRNG.

IIRC, the OpenBSD approach is to shim arc4random() IFF rand()/random() is called without srand()/srandom() first.

If the user requested a seeded rand(3)/random(3) by calling srand/srandom, then they honor that from that point on and produce a predictable sequence (complying with POSIX).

Obviously, programs that use srand(time(NULL)) are still going to be broken under this approach, but this hybrid approach retains POSIX compliance for srand/srandom.
Comment 13 Pedro F. Giffuni freebsd_committer freebsd_triage 2019-05-15 23:02:32 UTC
(In reply to Conrad Meyer from comment #12)

My point is that if you need a CSPRNG you shouldn't be calling rand(3) at all.

There is a market for deterministic PRNGS: the playstation 4 carries the credits for a the Mersenne Twister code along with FreeBSD's libc.

I haven't looked at the OpenBSD code in a while, I know they ran a sweep for rand(3) uses in their ports tree and I was under the impression that they added a deterministic_rand for code that required it. The thing is: if people want a non-deterministic, fully crypto safe, random number generator and they have the source code (which is basically true for anything in the ports tree), they may as well replace the sources with arc4ramdom directly and not have to wonder id they will be running the code in an "apparently safe" system or not.
Comment 14 Conrad Meyer freebsd_committer freebsd_triage 2019-05-15 23:44:52 UTC
(In reply to Pedro F. Giffuni from comment #13)
> My point is that if you need a CSPRNG you shouldn't be calling rand(3) at
> all.

I don't disagree.  The problem is that the names "rand" and "random" are misleading, and this leads to accidental misuse in actual practice.

Really-random-until-srand has little downside.  Savvy users that want a fast deterministic PRNG inevitably provide their own, because the libc one isn't going to be as fast or tailored to their application's needs as one they provide.

> The thing is: if people want a non-deterministic, fully crypto safe, random number generator and they have the source code (which is basically true for anything in the ports tree), they may as well replace the sources with arc4ramdom directly...

I agree, partially, with concerns on "have the source code" and "true for anything in ports."  If you or I are writing new code or have some degree of control over a project, yes, we can convert mistaken uses of rand(3) over to arc4random(3) or getentropy(3) or avoid the mistake entirely.

But I think that doing so in Ports is probably infeasible.  First, because analyzing all ports for valid or invalid use of rand is a lot of work; second, because patching and pushing back such changes to the relevant upstreams is a huge burden; and third, because upstreams may introduce new uses of the insecure functions in future port versions without any automated detection.  Ports is a downstream consumer of 3rd party software.
Comment 15 Pedro F. Giffuni freebsd_committer freebsd_triage 2019-05-16 01:55:20 UTC
(In reply to Conrad Meyer from comment #14)
(In reply to Pedro F. Giffuni from comment #13)
> > My point is that if you need a CSPRNG you shouldn't be calling rand(3) at
> > all.


>I don't disagree.  The problem is that the names "rand" and "random" are >misleading, and this leads to accidental misuse in actual practice.

The manpage is very clear "The functions described in this manual page are not cryptographically secure.  Cryptographic applications should use arc4random(3) instead." You can rightfully blame the C standard for not providing a crypto-ready algorithm but it is rare to find languages that have done so. C++ and stuff like LUA and Python do support Mersenne Twister though.

> Really-random-until-srand has little downside.  Savvy users that want a fast
> deterministic PRNG inevitably provide their own, because the libc one isn't
> going to be as fast or tailored to their application's needs as one they
> provide.

I am a committer in Apache OpenOffice (AOO for short): we use rand(3) for some visual effects when changing one slide to the next. It doesn't need a crypto-grade random, nor does it need seeding either: rand(3) just works there and is fairly portable. FWIW, AOO also uses much better random generators for some other things so we carry OpenSSL and APR which have plugs for all of the OSs out there that I would care about.

There are plenty legitimate uses for a poor RNG but that doesn't mean we can't do better than we currently do in rand() and random(). If we manage to adapt a longer seed, we can handle much longer periods with any of the fine, and fast, modern algorithms. I think no one has done it just due to lack of motivation: the existing LCG works fine for many things and it is easy to bring in Mersenne Twister for the rest.
Comment 16 Pedro F. Giffuni freebsd_committer freebsd_triage 2019-05-16 02:24:59 UTC
Perhaps on a more constructive twist, how about implementing rand_s instead:

https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/rand-s?view=vs-2019

Not sure if it's part of annex K, but developers would know what to expect and is somewhat portable.
Comment 17 Conrad Meyer freebsd_committer freebsd_triage 2019-05-16 03:29:00 UTC
(In reply to Pedro F. Giffuni from comment #15)
> (In reply to Conrad Meyer from comment #14)
> >I don't disagree.  The problem is that the names "rand" and "random" are >misleading, and this leads to accidental misuse in actual practice.
> 
> The manpage is very clear "The functions described in this manual page are
> not cryptographically secure.  Cryptographic applications should use
> arc4random(3) instead."

The *FreeBSD* man page is clear *now*, but it has not always been so clear, and e.g., the Linux man pages do not lead with the same bold disclaimer.  In fact, there is little discussion in Linux rand.3 at all about how it isn't actually random.  And the NOTES section at the bottom of the page says, "if good randomness is needed, use random(3)."

The point is, there are people writing code who do not read our most recent manual pages.  They might be developing on Linux; they might be careless; they might have written code against FreeBSD 5.0 with the older manual page.  It doesn't really matter.  Since rand(3) is a portable C89 API, the code could have originated anywhere.

> You can rightfully blame the C standard for not
> providing a crypto-ready algorithm

I am not making that specific criticism.  I believe it is unfortunate that the C standard took the "rand" and "random" global namespace for non-random PRNG functions, but given how unlikely C is to remove a function (as far as I know, the only function ever removed from the standard is gets(3), and that took until 2017-2018), it seems like we're stuck with that.

I don't think it is reasonable to criticize the C89 authors for failing to predict the value of CSPRNGs, unpredictable random numbers, and e.g. cryptography, which largely took off after 1989.  At the time, the best block cipher was DES.  Single DES.

> but it is rare to find languages that
> have done so. C++ and stuff like LUA and Python do support Mersenne Twister
> though.

Yeah, it's not a language level standard and not going to be, or else it becomes impossible to implement conforming C on super exotic / super embedded platforms.  MT is definitely not a CSPRNG and easily broken.

> I am a committer in Apache OpenOffice (AOO for short): we use rand(3) for
> some visual effects when changing one slide to the next. It doesn't need a
> crypto-grade random, nor does it need seeding either: rand(3) just works
> there and is fairly portable.

This is a good example of an application that really doesn't care whether rand(3) is a PRNG or a CSPRNG.  If rand(3) were arc4random(3) tomorrow, page flipping animations would continue to work just fine.

> There are plenty legitimate uses for a poor RNG

s/poor RNG/fast PRNG with good distribution/, then yes.  Mostly simulations, but PRNGs are also useful for e.g., reproducible builds (makefs with "random" inode gen numbers).

> but that doesn't mean we
> can't do better than we currently do in rand() and random(). If we manage to
> adapt a longer seed, we can handle much longer periods with any of the fine,
> and fast, modern algorithms. I think no one has done it just due to lack of
> motivation: the existing LCG works fine for many things and it is easy to
> bring in Mersenne Twister for the rest.

Sure, one could optimize the PRNGs.  I think that's basically orthogonal and unrelated to the idea of making rand/random CSPRNGs until srand/srandom is used explicitly.  It could address the original report's concern, though (poor distribution of rand() after srand(time(NULL))).

(In reply to Pedro F. Giffuni from comment #16)
> Perhaps on a more constructive twist, how about implementing rand_s instead:
> 
> https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/rand-
> s?view=vs-2019
> 
> Not sure if it's part of annex K, but developers would know what to expect
> and is somewhat portable.

I'm not a proponent of annex K functions generally.

As far as that specific API goes, I have a number of problems with it:

1. naming a CSPRNG function after rand() conflates the two.  It suggests they are in some way related, when they aren't.  They have different ranges (UINT_MAX vs RAND_MAX), utterly different behaviors, and different type signatures.  (This is a problem with many Annex K APIs.)

2. It's not clear why this routine would return an errno at all.  It doesn't take a "non-block" flag that would indicate EAGAIN.

3. It's not clear what benefit it provides over arc4random.  Annex K is widely unimplemented.  There isn't a single reference implementation.  It is not any more portable than arc4random(3) and the C standard committee is considering dropping it entirely in C2x.

4. It doesn't address accidental misuse of the existing rand/random APIs.
Comment 18 Pedro F. Giffuni freebsd_committer freebsd_triage 2019-05-16 05:42:11 UTC
(In reply to Conrad Meyer from comment #12)

> IIRC, the OpenBSD approach is to shim arc4random() IFF rand()/random() is
> called without srand()/srandom() first.

> If the user requested a seeded rand(3)/random(3) by calling srand/srandom, then > they honor that from that point on and produce a predictable sequence
> (complying with POSIX).

No. the OpenBSD approach breaks the standard:

https://man.openbsd.org/rand
"Standards insist that this interface return deterministic results. Unsafe usage is very common, so OpenBSD changed the subsystem to return non-deterministic results by default."

In order for rand(3) to be deterministic you have to call srand_deterministic(), which is a non-standard function;
http://src.illumos.org/source/xref/openbsd-src/lib/libc/stdlib/rand.c

Concerning their manpage comment, if they meant crypto-safe, rand(3) and random(3) were never ever meant to be used in "safe" ways.

About your comments concerning Xorshift, and MT not being CSPRNG, I agree, of course neither of them are meant to be crypto safe, but they work great for what they *are* meant to do (montecarlo simulations, etc) and people use them.

(In reply to Conrad Meyer from comment #17)

>Sure, one could optimize the PRNGs.  I think that's basically orthogonal and
>unrelated to the idea of making rand/random CSPRNGs until srand/srandom is
> used explicitly. ,,,

I disagree: once you break the API, people can't trust the function will work again in standard fashion and will require FreeBSD version checks to get things working again, it would be a mess. Our developers will also lose any motivation to fix them properly.

Please keep it working according to standards. I think its reasonable that you don't want to add the Microsoft variant, but you can still add a linker warning pointing people to arc4random.
Comment 19 Conrad Meyer freebsd_committer freebsd_triage 2019-05-16 14:23:00 UTC
(In reply to Pedro F. Giffuni from comment #18)
> (In reply to Conrad Meyer from comment #12)
> 
> > … shim arc4random() IFF rand()/random() is
> > called without srand()/srandom() first.
> 
> > If the user requested a seeded rand(3)/random(3) by calling srand/srandom,
> > then … honor that from that point on and produce a predictable sequence
> > (complying with POSIX).
> 
> No. the OpenBSD approach breaks the standard:

I am not proposing the design you go on to describe.  The proposal I am making is described above (excerpt from comment #12).

The only part of the standard broken is that srand(1) is no longer equivalent to the unseeded rand sequence.  It is difficult to determine how that would break programs, since simple changes in the rand() implementation (i.e., xorshift) would also break that characteristic.

In the proposal in comment #12, programs that use srand continue to get a predictable sequence after invoking srand.
Comment 20 Conrad Meyer freebsd_committer freebsd_triage 2019-05-16 14:33:40 UTC
One other maybe amusing possibility for implementing the predictable part of the PRNG would be to deterministically seed a CSPRNG like Chacha.  (E.g., use sha256(seed) as Chacha's key.)  Chacha20 is quite fast — faster than MT[1] if we compare cycles/B — and only a bit slower than xorshift.  A lower iteration version can be even more competitive, while purportedly maintaining similar security properties to AES-192.[2]

If seeds are not reused in practice (srand(time(NULL)) with infrequent restarts), then observation of some results cannot be used to predict PRNG state.  And there are no concerns about RNG distribution or period.

[1]: http://xoshiro.di.unimi.it/#quality
[2]: https://moderncrypto.org/mail-archive/noise/2017/001207.html
Comment 21 Pedro F. Giffuni freebsd_committer freebsd_triage 2019-05-16 14:48:28 UTC
(In reply to Conrad Meyer from comment #19)

While your method is less invasive, any deviation from the standard is significant. We currently have a default seed and changing the seed, and even the algorithm still keeps the sequence reproducible.

Out of the top of my head: your change could break reproducible builds. I replaced a use of arc4random() with random() in makefs to ensure the build of installable media was reproducible. I didn't bother to specify a seed: the default is/was OK. The code might have changed as we later adopted the NetBSD approach, but still it shows any small deviation can break things.

OTOH, your change could be fine for the kernel since we don't have to keep up with standards there. We still use rand/random in some edge cases: I forget the details but I had a patch in phabricator long ago that replaced non-safe uses of rand/random in the kernel with arc4random().
Comment 22 Conrad Meyer freebsd_committer freebsd_triage 2019-05-16 14:57:39 UTC
(In reply to Pedro F. Giffuni from comment #21)
> Out of the top of my head: your change could break reproducible builds. I
> replaced a use of arc4random() with random() in makefs to ensure the build
> of installable media was reproducible.

makefs always seeds with the current time or the timestamp provided; this is not a problem.

> OTOH, your change could be fine for the kernel

This PR is not about the kernel.
Comment 23 Pedro F. Giffuni freebsd_committer freebsd_triage 2019-05-16 15:20:03 UTC
(In reply to Conrad Meyer from comment #20)

I could bite that -- a CSPRNG that is reproducible. There are some catches with the seeding but it could be worked around.
Comment 24 kennedy.greg 2021-01-25 01:11:23 UTC
I actually really dislike the idea of "use arc4random UNLESS srand has been called", because when people want rand() to be random, they've historically been trained to called srand(time(0)) first.  So the outcome is backwards from what you'd expect: the pattern of srand(time(0)) -> rand() now becomes LESS random, when the dev was trying to get MORE random.

Meanwhile, people who didn't call srand() at all are presumably expecting stable output, and defaulting a new seed with unpredictable qualities now gives them UNstable, high-quality PRNG output with each run.

It's completely backwards.
Comment 25 Gordon Tetlow freebsd_committer freebsd_triage 2022-05-23 14:10:05 UTC
In an attempt to clean this up (and remove the security mailing list from owning this bug), I'm assigning this to cem and removing secteam from the maintainer-feedback. Leaving secteam on CC for this bug in case there is any movement. I'm inclined to close it as it doesn't seem to have any consensus on how to address this.