Random Number Generation
From Qwiki
Random numbers are an important part of computer simulations, which in turn are important tools in the physical sciences. However with the exception of some new technologies (that use, for instance, Shot Noise fluctuations in electrical current to generate random numbers), real "randomness" is impossible to generate because computers all execute code in a purely deterministic manner. Fortunately, clever people have figured out ways to generate deterministic sequences of numbers that have properties that are not too different from what you'd expect if the numbers were generated from a truly random process. Such methods are known as pseudo-random number generators, and their desirable properties include, for example, very little statistical correlation between pairs numbers in the sequence and a very long repeat period (all computer-generated sequences obviously must repeat themselves eventually). The topic of pseudo-random number generators is well documented elsewhere (Numerical Recipes Textbooks is a good starting point), and we are by no means experts in it anyway, so we will not discuss it further. However, we list some guidelines below that may or may not be helpful for people writing computer simulations.
Contents |
Uniform pseudo-random number generation
The most common application of pseudo-random number generation is generating numbers chosen from a uniform distribution, with

where x is a random variable lying in some finite interval [a,a + L]. Typically, other distributions are simulated by appropriate transformations of uniform numbers, so the quality of the uniform random number generator used by your code is extremely important even if you do not use uniform random numbers.
- Never trust the rand() function built into a Fortran, C or C++ compiler. Some of them may be perfectly good but there are no standards that must be met by such functions, so some of them may be quite bad.
- RANLIB and some of the other libraries provided by netlib are easy to use and provide a simple but decent uniform generator and some methods for generating random numbers according to other distributions.
- MATLAB uses a fairly sophisticated uniform random generator and a dedicated algorithm for generating Gaussian random numbers, as well as a number of methods for generating random numbers according to other distributions.
- The GNU scientific library (GSL) incorporates several sophisticated uniform random number generators that can be selected by the user, as well as a large number of methods for generating random numbers according to other distributions.
- There are a number of other less-common implementations of standard uniform number generators. Check Google. Also, some generators such as Mersenne Twister have code made publicly available by their creators.
When given multiple uniform random number generators to choose from, the best choice is often a compromise between statistical performance and computation time. We at Mabuchi Lab tend to go with either the Matlab generators or the Mersenne Twister (built into GSL) because they are proven to produce high-quality random numbers and execute pretty quickly on modern computers.
Transformation to other distributions
One way to generate random numbers from a distribution on other than uniform is to transform some uniform random numbers. We know from Transformation of Random Variables that if we define an invertible function y = ξ(x) of a random variable x, then

relates the probability densities of x and y. If we think of x as a uniform random variable generated by a computer, and of py(y)
as the probability density of the random variables we would like to generate, then all we need to do is solve for ξ(x). If we generate a sequence {xn} of uniform random numbers, then the sequence
will be distributed according to py(y)!
Example: Exponential random variables
This seems to be the canonical example. Suppose we want to generate numbers according to

From above we have (assuming our uniform random numbers lie in an interval of unit length)

which becomes, by the Inverse function theorem,

We integrate to get ξ − 1(y) = e − y / μ, or ξ(x) = − μlog(x).
Example: Points on/in a sphere
Another example that might actually be useful is for generating points lying uniformly on the surface of a sphere of radius R. By this we mean that the probability that a point lies inside any region of area A is proportional to A. If we parametrize the sphere using the angles θ and φ in the standard way (φ is the azimuthal angle) then we know that the area of a region swept out by the infinitesimal increments dθ and dφ is
so that the probability density of the point (θ,φ) is

Since this does not depend on φ we can generate values for φ by simply taking uniform random numbers on the interval [0,2π). Noting that φ contributes a factor of 1 / 2π to the probability density and allowing x to lie uniformly in [ − 1,1] we have

If instead we had wanted to generate points in three dimensions uniformly distributed within the sphere, we would have had

As before when we treated φ and θ separately, we treat r by itself and select φ and θ as described above. For r we have (with x in [0,1]):

which gives ξr(x) = Rx1 / 3.
Transformation by rejection
The transformation method described above is rather elegant. However, it relies on the probability density py(y) to be both invertible and easily integrated in order to get a simple transformation formula. Sometimes these criteria are just not met.
It is always possible to generate random numbers according to any bounded distribution py(y) over a finite interval [a,b] in the following way:
- Generate x, a uniform random number on [a,b].
- Generate z, a uniform random number on
.
- Compute py(x). If
then we accept x as our random number. However, if z > py(x), we return to step 1.
Proof that this works
Note that this method can be proved by simple geometric arguments: If we plot the curve py(y) over its domain, and draw a rectangle of width b − a and height maxpy, then we see that every point (x,z) chosen lies inside the rectangle, but only the points that lie below the curve are accepted. It is closely related to methods for Monte-Carlo integration.
This does not mean that we shouldn't prove that it works algebraically. We now compute the probability density p(x) of the number that our procedure has generated, beginning by computing the probability that --- in a single iteration --- we generate an x that we reject:

In a similar way we compute the probability density that in one iteration we both choose a particular x and accept it:

Now we can compute what we are looking for. We note that x can be the number we choose to accept by either selecting it and accepting it on our first iteration, or by rejecting our first choice and choosing/accepting x on our second iteration, and so on. Thus

which evaluates to py(x), completing the proof that this technique produces random numbers with the appropriate distribution.

