2014-04-28  Uniform random floats: How to generate a double-precision
floating-point number in [0, 1] uniformly at random given a uniform
random source of bits

(Permalink: https://mumble.net/~campbell/2014/04/28/uniform-random-float)

The naive approach of generating an integer in {0, 1, ..., 2^k - 1} for
some k and dividing by 2^k, as used by, e.g., Boost.Random and GCC
4.8's implementation of C++11 uniform_real_distribution, gives a
nonuniform distribution (both variants are sketched in C below):

- If k is 64, a natural choice for the source of bits, then because the
  set {0, 1, ..., 2^53 - 1} is represented exactly, whereas many
  subsets of {2^53, 2^53 + 1, ..., 2^64 - 1} are rounded to common
  floating-point numbers, outputs in [0, 2^-11) are underrepresented.

  [Update, 2016-08-23: I can't figure out what I meant by the above
  bias.  The probability that the floating-point draw lies in
  [0, 2^-11) should be 2^-11, which is the same as the probability
  that the integer draw lies in {0, 1, ..., 2^53 - 1}, namely
  2^53/2^64 = 2^-11.  Certainly values below 2^-64 are
  underrepresented, as below, but that's a separate issue.]

- If k is 53, in an attempt to avoid nonuniformity due to rounding,
  then numbers in (0, 2^-53) and 1 will never be output.  These outputs
  have very small, but nonnegligible, probability: the Bitcoin network
  today randomly guesses solutions every ten minutes to problems for
  which random guessing has much, much lower probability of success,
  closer to 2^-64.

What is the `uniform' distribution we want, anyway?  It is obviously
not the uniform discrete distribution on the finite set of
floating-point numbers in [0, 1] -- that would be silly.  For our
`uniform' distribution, we would like to imagine[*] drawing a real
number in [0, 1] uniformly at random, and then choosing the nearest
floating-point number to it.

To formalize this, start with the uniform continuous distribution on
[0, 1] in the real numbers, whose measure is mu([a, b]) = b - a for
any [a, b] contained in [0, 1].  Next, let rho be the default
(round-to-nearest/ties-to-even) rounding map from real numbers to
floating-point numbers, so that, e.g., rho(0.50000000000000001) = 0.5.
Note that the preimage under rho of any floating-point number -- that
is, for a floating-point number x, the set of real numbers that will
be rounded to x, or { u \in [0, 1] : rho(u) = x } -- is an interval.
Its measure, mu(rho^-1(x)), is the measure of the set of numbers in
[0, 1] that will be rounded to x, and that is precisely the
probability we want for x in our `uniform' distribution.

Instead of drawing from real numbers in [0, 1] uniformly at random, we
can imagine drawing from the space of infinite sequences of bits
uniformly at random, say (0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 0, ...), and
interpreting the result as the fractional part of the binary expansion
of a real number in [0, 1]:

  0*(1/2) + 0*(1/4) + 0*(1/8) + 1*(1/16) + 0*(1/32) + 1*(1/64) + ...

Then if we round that number, we'll get a floating-point number with
the desired probability.  But we can't just compute that sum directly,
one term at a time from left to right: once we get to 53 significant
bits, further bits will be rounded away by addition, so the result
will be biased to be `even', i.e. to have zero least significant bit.
And we obviously can't choose uniformly at random from the space of
infinite sequences of bits and round the result.
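For concreteness, the two naive conversions criticized above look
roughly like the following in C.  This is a sketch, not any particular
library's code: random64() stands for a hypothetical caller-supplied
source of 64 uniform random bits.

  #include <stdint.h>

  extern uint64_t random64(void);  /* hypothetical uniform 64-bit source */

  /* k = 64: convert a 64-bit draw and scale by 2^-64.  The
   * integer-to-double conversion rounds draws above 2^53 to nearby
   * representable values, and nothing in (0, 2^-64) can ever appear. */
  double
  naive_k64(void)
  {
          return (double)random64() * 0x1p-64;
  }

  /* k = 53: scale a 53-bit draw by 2^-53.  Every quotient is exact,
   * but the outputs lie on the grid {0, 2^-53, 2*2^-53, ...}: nothing
   * in (0, 2^-53), and never 1. */
  double
  naive_k53(void)
  {
          return (double)(random64() >> 11) * 0x1p-53;
  }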
We cannot draw infinitely many bits, but we can choose a finite
sequence of bits uniformly at random, and, except for a detail about
ties, after a certain constant number of bits, no choice of further
bits can change the floating-point number we get by rounding.  So we
can choose a sufficiently long finite sequence of bits and round that,
knowing that the result of rounding would not change no matter how
many extra bits we appended.

The detail about ties is that if we simply chose, say, 1088 bits --
the least multiple of 64 exceeding 1074, the magnitude of the exponent
of the smallest positive (subnormal) floating-point number, 2^-1074 --
and rounded in the default round-to-nearest/ties-to-even mode, the
result would be biased to be even.  The reason is that the probability
of seeing an apparent tie in a finite sequence of bits is nonzero,
whereas real ties occur only on a set of measure zero: they happen
only when *every* remaining bit in the infinite sequence is also zero.
To avoid this, before rounding, we can simulate a sticky bit by
setting the least significant bit of the finite sequence we chose, so
that what merely looks like a tie is never rounded to even.  (A sketch
of this procedure in C appears at the end of this article.)

One little, perhaps counterintuitive, detail remains: the boundary
values, 0 and 1, have half the probability you might naively expect,
because we exclude from consideration the half of the interval
rho^-1(0) that lies below 0 and the half of the interval rho^-1(1)
that lies above 1.

Some PRNG libraries provide variations on the naive algorithms that
include or omit one or both boundary values, e.g. so that log(x) or
log(1 - x) can be computed safely.  For example, instead of drawing
from {0, 1, ..., 2^53 - 1} and dividing by 2^53, some

- divide by 2^53 - 1 instead, to allow both boundary points;

- draw from {1, 2, ..., 2^53} instead, to omit 0 but allow 1; or

- draw from {-2^52, -2^52 + 1, ..., 2^52 - 2, 2^52 - 1} (i.e., the
  signed rather than unsigned 53-bit integers) and add 1/2 before
  dividing, to omit both boundary points.

These algorithms still have the biases described above, however.
Given a uniform distribution on [0, 1], you can always turn it into a
uniform distribution on (0, 1), (0, 1], or [0, 1) by rejection
sampling if 0 or 1 is a problem (also sketched at the end of this
article).

See for source code.

Thanks to Steve Canon and Alex Barnett for explaining the binary
expansion approach to me and talking the problem over.

[*] For the pedantic who may stumble upon this before reading ahead,
we cannot, of course, assign a meaningful probability to a choice of
real number in [0, 1].  What we can do, however, is imagine that it
were meaningful, and then formalize it in the next sentence with the
correct sense of measure.
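As a concrete illustration of the procedure described above -- draw a
long enough finite sequence of bits, force a sticky bit, and let the
default rounding mode do the rest -- here is a minimal sketch in C.
It is an illustration under stated assumptions, not the article's
accompanying source code: random64() is a hypothetical caller-supplied
source of 64 uniform random bits, and __builtin_clzll (a GCC/Clang
builtin) counts leading zero bits.  Rather than materializing all 1088
bits, it draws 64-bit words lazily, since leading zero words only
lower the exponent and almost never occur.

  #include <math.h>
  #include <stdint.h>

  extern uint64_t random64(void);  /* hypothetical uniform 64-bit source */

  double
  uniform01(void)
  {
          int exponent = -64;      /* result = significand * 2^exponent */
          int zero_words = 0;
          uint64_t significand;
          int shift;

          /* Read leading zero words into the exponent until a 1 bit
           * appears.  If the first 17*64 = 1088 bits are all zero, the
           * real number is below 2^-1088 < 2^-1075 (half the smallest
           * subnormal), so it rounds to zero; with a working random64
           * this effectively never happens. */
          while ((significand = random64()) == 0) {
                  exponent -= 64;
                  if (++zero_words == 17)
                          return 0;
          }

          /* Normalize: shift any leading zero bits into the exponent
           * and refill the vacated low-order bits with fresh random
           * bits, so the significand holds the 64 bits starting at the
           * leading 1. */
          shift = __builtin_clzll(significand);
          if (shift != 0) {
                  exponent -= shift;
                  significand <<= shift;
                  significand |= random64() >> (64 - shift);
          }

          /* Simulate a sticky bit: almost surely there is another 1
           * somewhere in the rest of the infinite bit sequence, so an
           * apparent tie must never be rounded to even. */
          significand |= 1;

          /* The conversion to double rounds the 64-bit significand to
           * 53 bits in the default round-to-nearest/ties-to-even mode;
           * ldexp then applies the binary exponent. */
          return ldexp((double)significand, exponent);
  }

And if, as discussed above, 0 or 1 is unwelcome -- say, because the
caller will compute log(x) or log(1 - x) -- rejection sampling on top
of such a generator gives the open or half-open interval:

  /* Uniform on (0, 1): redraw on the (astronomically rare) boundary
   * values.  Adjust the test to get (0, 1] or [0, 1). */
  double
  uniform01_open(void)
  {
          double x;

          do {
                  x = uniform01();
          } while (x == 0 || x == 1);
          return x;
  }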