Matthew Leadbetter bio photo

Matthew Leadbetter

Coder, data scientist and maths enthusiast.

So you might have seen the title and thought. “Why do I need to know about this? My language already has a perfectly good built in library! They wouldn’t give me something that’s rubbish.” To a degree you’d be right. It’s horses for courses. If you’re using it to randomly pick out a picture for your web site, it’s fine. Want it to do anything else? There’s a good chance that built in is terrible.

In this article I’ll cover:

  • Whether you should use your language’s built in.
  • How to use PRNGs in your code base with respect to threading and mocking.
  • What are some of the different PRNGs and what are they good for?
  • How to write your own PRNG with a modern algorithm (if you want/need to).

I will not cover:

  • Cryptographically secure PRNGs. If you need a CSRNG the advice on threading and mocking with the PRNG basics preliminaries are the only relevant parts of this article.
  • Too many theoretical details of PRNGs, I’ve tried to keep this relatively light and practical.

Should You Use The Built In?

It depends how good your language’s built in is. It’s surprisingly bad in a lot of languages. You can consult this table.

Language Should you use the built in?
C No.
C++11 Yes but read on.
C# No.
Haskell Yes but read on.
Java No.
Javascript Depends on the browser. (So no)
PHP I’d say no but you don’t have a choice.
Python Yes.
Ruby Yes.
SCALA No.

Python and Ruby are the best of the bunch offering a simple interface with a good algorithm. Both C++11 and Haskell offer a few PRNGs so you need to know which algorithms are good. Javascript you’re probably okay as long as the browser isn’t internet explorer. Some versions of PHP use a good algorithm, some use a bad one. All of the versions commit other atrocities however. Your best bet is to either write a C extension or sit in the corner and cry. Finally C and all the Java related languages only have terrible algorithms available.

It’s worth noting that all of these (except PHP, to my knowledge) have good external libraries with better algorithms. So don’t code your own unless you think it’s worth it.

PRNG Basics

For the moment you only need to know about two attributes of PRNGs - seeds and periods.

Seed

PRNGs don’t generate random numbers. They just generate numbers that look random. This is because everything that happens on a computer has to happen deterministically. When these algorithms are asked for a new number they modify their internal state and output a new random number based on that.

What if you wanted to run your program multiple times and have different output each time though? Because these algorithms all work deterministically it would be the same every time. To overcome this the algorithm is initialised with a different internal state each time the program runs. This initial value is called the seed. And supplying the seed requires some entropy.

Entropy is something that will be different each time the program is run for example the number of seconds since the epoch. Unlike the requirement of PRNGs, entropy does not need to be uniform random. Though it is preferable.

These days our operating systems can find lots of very good entropy and provide it to your program. How easy this is to get to depends on your programming language though. It used to be that your RNG was initialised based on time and only time. It’s now common to initialise based on random numbers passed by the OS. On some hardware/software however you may unfortunately still need to source your own entropy.

Period

The period of a PRNG is the number of generations before it repeats the whole sequence again. Obviously a long period is a good thing. Most have at least .

The higher a period is, the more memory that is needed. As a lower bound for a period, you need bytes. This is because at each point in the period the PRNG will need to be representing a unique state. Pretty much all PRNGs operate very closely to this lower bound.

Most PRNGs have massive periods however, so big that you probably don’t have to worry too much about it.

Threading

You have two choices for using threads with a PRNG. You can use mutexes to lock your PRNG down or you can supply each thread with it’s own PRNG (maybe taken from a singleton object pool). Of course to do this you will need more entropy. You can’t have your threads all producing the same random numbers.

Obviously the second solution is both the fastest and best solution. The problem is you will need much more entropy. That entropy will put each thread on some random point in the period and each thread will start to sequentially process from it’s point in the period.

Clearly for almost any application having two threads which overlapped each other’s random numbers would be a bad thing. Thankfully for a large enough period (say or more) you’re pretty much guaranteed this won’t happen in your life time, no matter how many cores you throw at it. For lower periods here is a table showing the probability of overlap for period, threads, and the number of generations per core.

Period Threads Generations Overlap Probability
4 1000000 0.0037
8 1000000 0.015
16 1000000 0.058
32 1000000 0.21
4 10000000 0.037
8 10000000 0.14
16 10000000 0.46
32 10000000 0.92
64 10000000 0.00015
128 10000000 0.00058
1024 100000000 0.31
10000000000 10000000000

However this is under the assumption that you are seeding each with a uniform random number. If you are seeding from the operating system, you’re fine. If you are seeding from another, mutexed PRNG you might not be not fine. PRNGs are not random, they just look random. So you may find you have some bias in that seed which brings the periods closer together. In practice it tends to be okay but try to use as much entropy as possible when generating your threaded PRNGs.

Mocking

This obviously doesn’t apply to all languages. If you use an OO language properly though, when you test classes that use random numbers you are probably going to want to be able to mock the PRNG. Having a lot of lengthy tests that test within highly unlikely error bounds is not good unit testing. I’m not even sure if it’s unit testing.

When unit testing you only test one class. You don’t also want to be testing that your PRNG works correctly. Instead you can test for if the PRNG was called correctly. For some languages (but not all) this means that even if you use the built in you will probably want to create your own PRNG class which can be mocked.

Types of PRNG

The quality of PRNGs is tested both by how much theory is known about them and how well they perform empirically in statistical tests. There are lots of PRNGs I’ll cover three of the most common.

Linear Congruence Generator

This was state of the art in the 80s. Things have moved on now. Conceivably it still has some uses though it is the fastest of them all.

The period of the linear congruence generator can be set to whatever you like, though if you want a large one you’ll need to use an arbitrary precision library. As a result is common. could be done easily and for reasons that escape me Java uses .

Linear Congruence Generators do not generate good random numbers however because they used to be as good as you could get most older languages implement this and their random number generator and haven’t changed it (probably for back compatibility reasons).

Mersenne Twister

This is the big daddy of PRNGs. It’s well studied, fast and produces excellent random output. If you don’t want to use the built in for your language because it’s terrible see if you can find an implementation of this someone coded.

It boasts one of the largest periods of all PRNGs at an enormous . I guess now you can see where it gets its name. If you’re thinking “Wow, that’s a lot of entropy to fill!”, it is. If there is hardware available though you should be fine. Some languages don’t bother filling it all PHP for example only lets you pass in an int for the seed. If you bear in mind the threading issues you will probably think that not seeding more than the 8 bytes that PHP seeds is a bad idea. It is, but in practice it’s unlikely to make a difference.

The Mersenne Twister passes most tests. It really only fails one test and it’s unlikely that issue will cause you a problem.

Unproven

Okay, so I say unproven but that’s not quite true. Maybe less proven would be a better choice of words. These typically have less strong mathematical theory associated with them however perform better on empirical tests than the Mersenne Twister. They are often faster too. These are the next generation of PRNGs.

One of the best at the moment is called xorshift*, which is twice as fast and fails no empirical tests on even the biggest test suites. However it does not have as much theory behind it the that Mersenne Twister does. It is based off an algorithm called xorbitshift which performs rather badly. However after being multiplied by a constant starts performing very well.

If you’re thinking “To hell with it, good in practice is good enough for me!” I agree! Lets write an xorshift* library.

Writing Your Own PRNG

I’ve tried to keep this all as language agnostic as possible, unfortunately now we have to write actual code. My favourite language is C++. If you don’t like C++ (and lets face it - you have every reason to) it shouldn’t be a problem. You can still mostly follow along.

This is actually much easier than you would think, hell someone else has already written the PRNG algorithm for you! So lets start with the most complicated part - getting a random integer between two values.

In Defence of Modulo

Your PRNG will only return a random number between 0 and the maximum size of the number it returns (probably or ). So you need to transform that number to a uniform random distribution between your maxValue and minValue. This is a lot harder than you would think. Chances are if you’ve coded in C or C++ you’ve at some point written something like this:

int r = ( rand() % (maxValue-minValue+1) ) + minValue;

Some people will chide you and tell you it’s technically wrong. They’re right but it almost certainly doesn’t matter. The problem with this code is that for any range that isn’t a power of 2 you will generate a non-uniform distribution favouring the lower values.

For example if you have a PRNG which generates two bit random numbers - from 0 to 3 and you want to generate numbers between 0 and 2. Here 0 has a chance of occurring, whereas 1 and 2 only have a chance of occurring. This is obviously a disaster.

Hold on though, lets try an int - 32 bits. Now it’s for 0 and for 1 and 2. How many decimal places is your Monte Carlo simulation supposed to be accurate to again? In short if the number you are using is below and you are generating a bit integer, the inaccuracy from using modulo just doesn’t matter.

You can find the total difference in probability between all the lower probability numbers and the higher probability numbers with:

Where is the range of values you are choosing and is the number of bits the PRNG is returning.

Here is a small table of possible values:

b r Error
32 3
32 10
32 1000
32 10000
32 100000
64 100000
64 1000000
64

As you can see if you’re using a 64 bit int the chance of it ever mattering is incredibly small. However if the code isn’t just for you who knows what a user will put in. In this case the code should be correct.

A Correct Way

While I’ve looked for good ways of doing this I have not seen anything definitive, so what follows is two algorithms that I have written and one used in GCC. I wrote my first PRNG library back before the C++11 standard. This is the algorithm I used:

// Generate a random number between 0 and maxValue (inclusive)
unsigned long long getRandomInt(const unsigned long long &maxValue)
{
    static const uint64_t max64Int = 0xffffffffffffffff;
    if(maxValue == max64Int)
    {
        return xorshift1024();
    }

    uint64_t inclusiveMax = maxValue + 1;

    uint64_t randomInt;
    uint64_t maxAcceptableValue = (max64Int / inclusiveMax) * inclusiveMax;
    do
    {
        randomInt = xorshift1024();
    } while(randomInt >= maxAcceptableValue);
    randomInt %= inclusiveMax;
    return randomInt;
}

This algorithm works by first generating a random number. If it is part of the remainder of the division of the max int value by the range then it is part of what will contribute to an uneven distribution. In the case it throws it away and creates another random number. As discussed before this is quite unlikely.

The GCC Correct Way

It might be worth looking at how a group of people that presumably know what they’re doing do it, rather than taking some idiot on the Internet’s word for it. Here is the code used in GCC 4.8.4, which I’ve cleaned up a little for ease of reading. Here urng is the random number engine that has been passed in.

const uctype urngmax = urng.max();
const uctype urngrange = urngmax - urngmin;
const uctype urange = maxValue - minValue;

uctype ret;

if (urngrange > urange)
{
    // downscaling
    const uctype uerange = urange + 1; // urange can be zero
    const uctype scaling = urngrange / uerange;
    const uctype past = uerange * scaling;
    do
    {
        ret =uctype(ng()) - urngmin;
    } while ret >= past);
    ret /= scaling;
}
else if (urngrange < urange)
{
    // upscaling
...

It’s clear that they have had to make substantial allowances for some of the more quirky possible RNGs. It also allows the user to put in a number higher than the highest value the PRNG can take, or “upscale” (code not shown). Other than that, in the end it’s remarkably similar to my algorithm.

I benchmarked1 it, removing all the cruft and it performs similarly to my algorithm, which isn’t surprising. It’s worth noting that with the cruft and just using the STL it is definitely slower.

The Quicker Correct Way

PRNGs are not CSRNGs. They are supposed to be fast. With my C++ hat on I can’t say either of these algorithms are fast. Two divisions/modulos - that’s two quite slow operations. I thought I could do better and it turns out I can2. We can get rid of those at the expense of increasing the chance of needing to call the PRNG again.

// Generate a random number between 0 and maxValue (inclusive)
unsigned long long getRandomInt(const unsigned long long &maxValue)
{
    int leadingZeros = countLeadingZeros64(maxValue);
    
    uint64_t randomInt;
    do
    {
        randomInt = xorshift1024() >> leadingZeros;
    } while(randomInt > maxValue);

    return randomInt;
}

Here countLeadingZeros64 is a function you will need to implement that counts the number of leading zeros in binary of a uint64_t. On some C++ compilers there’s a built in that will use a single CPU instruction if it’s available (it usually is). There are also lots of bitwise algorithms you can plug in there instead.

This algorithm takes the next largest power of 2 from the maxValue and generates random numbers within that power of 2 until one falls within the correct range. The worst case for this algorithm is that maxValue = . In this case we would expect the PRNG to be called approximately twice.

Which is fastest depends on both how fast your PRNG is and what kind of maximums you are using, in my own tests with Mersenne Twister (the slowest thing you’re likely to use) the leading zero algorithm is significantly faster across the board using the CPU instruction. However when I tested countLeadingZeros64 with an algorithm (I chose De Bruijn Multiplication) it came out slower in the worst case. Though it was still much faster in the best case. It looks better on average. For this reason I recommend using this version. If you can’t use a built in to count leading zeros here is an implementation using De Bruijn Multiplication.

const int debruijn64Index[64] = {
    63, 16, 62,  7, 15, 36, 61,  3,
     6, 14, 22, 26, 35, 47, 60,  2,
     9,  5, 28, 11, 13, 21, 42, 19,
    25, 31, 34, 40, 46, 52, 59,  1,
    17,  8, 37,  4, 23, 27, 48, 10,
    29, 12, 43, 20, 32, 41, 53, 18,
    38, 24, 49, 30, 44, 33, 54, 39,
    50, 45, 55, 51, 56, 57, 58,  0
};

int countLeadingZeros64(uint64_t toCount) {
    static const uint64_t debruijn64 = 0x03f79d71b4cb0a89ULL;
    toCount |= toCount >> 1;
    toCount |= toCount >> 2;
    toCount |= toCount >> 4;
    toCount |= toCount >> 8;
    toCount |= toCount >> 16;
    toCount |= toCount >> 32;
    return debruijn64Index[(toCount * debruijn64) >> 58];
}

Of course straight modulo runs away with the gold for speed absolutely destroying it’s competition. So if speed matters more to your calculation than absolute correctness, just use modulo.

Phew! Hard work, huh? Don’t worry, if you’ve read through to here it’s all plain sailing from now.

Final Utilities

But I promised you ints between two values, not 0 and a max value. Thankfully that’s a lot easier:

unsigned long long getRandomInt(const unsigned long long &minValue,
                                const unsigned long long &maxValue)
{
    unsigned long long range = maxValue - minValue;
    return getRandomInt(range) + minValue;
}

Floats and doubles are easy too!

// Generate a random number between 0 and 1
float getRandomFloat()
{
    static const uint64_t max64Int = 0xffffffffffffffff;
    uint64_t rand = xorshift1024();
    float retVal = (float)rand / max64Int;
    return retVal;
}

The PRNG

This bit is simple - other people have done the hard work for you!

uint64_t state[16];
int position;

uint64_t xorshift1024(void)
{
        uint64_t state0 = state[position];
        position = (position + 1) % 16;
        uint64_t state1 = state[position];
        state1 ^= state1 << 31;
        state1 ^= state1 >> 11;
        state0 ^= state0 >> 30;
        state[position] = state0 ^ state1;
        return state[position] * 1181783497276652981LL;
}

Copy. Paste. Done. Obviously state needs to be initialised with some entropy first and position can be initialised to 0.

Putting it all together

You now know enough to write your own (perhaps with your own choice of PRNG) or if you’re using C++ you can use mine. It’s written as a trade off to be fast, simple, high quality and apply to 99% of cases. I also created it header only to make it quick and easy to incorporate into your program.

It makes heavy use of C++11. Seeding is taken care of for you during creation with random_device and is thread safe. However generating random numbers is not thread safe. Give each thread it’s own generator. You can supply your own seed for reproducibility. All public functions are virtual for gmock. Assignment and the copy constructor have been deleted. It uses xorshift* as the PRNG engine and the leading zero’s algorithm.

Because in practice we know how big ints and doubles are on different platforms I’ve not written any upscaling. One day long double might be represented with more than 8 bytes, but not today. Still it would be nice to have an if(sizeof(T) > 8) around it with an upscaling algorithm for if it ever does happen.

For the leading zero algorithm - at the moment only GCC and MSC intrinsics are supported, if you add support for your favourite C++ compiler or find a faster or more accurate way of doing something or want to add support from statistical distributions please submit a pull request!

You can use it like this:

PRNG prng;
cout << prng.getRandomUnsignedInt(5, 10) << endl;

What could be easier?

  1. I’m using sys/time.h for all benchmarking, it’s not that scientific but I can only apologise. In this case I don’t think there’s a better tool for the job. All cases where I’ve said an algorithm is faster it has been at least 25% faster on my hardware. All benchmarks were performed using GCC with -O3 on a crap laptop. Max value was given after compilation to prevent optimising for it specifically.

  2. I say I did it, because I came up with it by myself. However I would be very, very surprised if someone else hasn’t already done it.