Implementing Eiserloh's Noise-Based RNG in Rust

Published:

thumbnail

A couple years ago, I watched a really interesting GDC talk by Squirrel Eiserloh about Noise-Based RNG, and it's really stuck with me. Lately, I've been looking at starting a small game side project to learn the Bevy engine, and to practice designing procedural narrative systems, which would require extensive use of seeded "RNG" and would like to use this idea. However, Eiserloh's examples are written in C++, not Rust, so implementing the function in Rust seemed like a great place to get started with actually writing some code.

The Talk

First, if you're interested in any kind of procedural content generation, I really recommend just watching the talk for yourself! But I'll give a high level overview here to get started. According to Eiserloh, seeded generation has benefits for a variety of uses, like world or NPC generation. And using a seed allows the program to discard unchanged content and regenerate it as needed since it will be the same, such as for on-demand generation in an infinite or near infinite playing space. One example would be for a Minecraft-like world generation. A randomly-accessible seed allows generating parts that all fit together correctly no matter which direction you come from, without having to generate the whole world at once.

What we need from a random number generator

An RNG for game development should have a fair statistical distribution, a statistically correct degree of repetition, and a high minimum repeat period. A wide range of possible seeds enables a nice variety of player experience, and a minimum of arbitrary restrictions on which seeds produce viable results allows more types of game data to be used to seed the number generation. The bits that make up the content of the results shouldn't be correlated with each other, either in a single result or in subsequent results. It also needs to be platform independent and reproducible across different platforms - you should get the same result whether it's running on an M1 Mac or an Android phone, or whatever else, in order to enable things like trading of seeds (and using the seed to compress data for saving or sharing generated content). The order of results should be independent - if you need the 75th result, you should be able to get it at any time, not only just after the first 74 results. You should be able to get the same results from the same seed deterministically from the generator every time. It also needs to be fast! You may need to get a lot of seeded results at once. This also means it needs a small memory footprint. It should also be thread safe for parallelization.

How traditional number generators fall short

Traditional random number generators have some significant cons when compared to the above list, including using C++'s rand(). Eiserloh gets into the numbers of several tests that he uses to characterize the results from different generator options, which I'd definitely suggest watching the talk to see if you're curious about the details. In short, they suffer from poor speed and poor statistical randomness overall. Many also have limitations on what seeds you can use to produce useful results, which makes it more complicated to use values such as entity IDs. And in a lot of cases, a seed doesn't actually give you a different set of random numbers, but instead just starts you at a different place in the same set of possible "random" numbers. Initial values may not be statistically useful due to slow warm up times, and some, such as Mersenne Twister, use internal state in a way that make them non-thread-safe. Worst of all, calls are order dependent, so each call affects the next call and game elements must be generated in the same order to be identical for a given seed.

How noise functions are better

Noise functions broadly solve the problems listed with more traditional RNG options. A noise function here behaves basically as if it is an endless, stateless table of possibilities from which you select the result you want based on which position in the "table" you request, making it order-independent. It has no state, so it is thread safe and endlessly instantiable. A different seed results in a unique infinite table, not an offset into the same table. They can be designed with as many dimensions as desired. Because it's order independent, it can be used to generate pieces of content on demand - world chunks, villages, NPCs - in whatever sequence the player needs them, without having to generate them ahead of time.

Some typical uses of noise functions are for things like tile variation, such as "which variant of grass should this tile be". But noise can be used for a lot more than that. It can be used to add variance in placement to make things feel more organic, such as using the peaks in a smoothed two dimensional noise graph to plot where trees should be placed by the engine. Want more trees? Increase the noise, now you have more peaks and thus more trees.

Ultimately, the biggest difference between a random number function and a noise function, is that while a random number function does some fancy modifications to a separately tracked state to produce a seemingly randomized result, a noise function does the fancy modifications to the input data and just returns the result.

Building noise functions in Rust

To follow along and play with the code in a repository, feel free to use the repo I created for this exercise. Of course, you can always just use std::hash and save yourself some time implementing something yourself, but seeing how these kinds of things work is an interesting exercise!

Example One: "Some Noise Function"

The first example in Eiserloh's talk is at about 39:00 in the video. He shows a couple different operations performed on an input position to mix up the result, which in C++ looks like this:

uint32_t SomeNoiseFunction(int position)
{
    uint32_t mangled = (uint32_t) position;
    mangled *= SOME_BIG_PRIME_NUMBER;
    mangled += SOME_OTHER_NUMBER;
    mangled *= mangled;
    mangled ^= (mangled >> 13);
    return mangled;
}

The code relies heavily on overflows to get its nice scrambling. So how would this look in Rust? It's considered an error in Rust to implicitly rely on overflows, but there are functions in the standard library to use if you want to explicitly use it. Let's go through rewriting this.

First, we'll use i32 in place of int, since in C++ int can change depending on architecture, then assign our mutable variable to start working with.

pub fn some_noise_function(position: i32) -> u32 {
    let mut mangled = position as u32;
    // TODO: actually change this.
    mangled
}

We can assign a couple constants to use in our mangling:

const SOME_BIG_PRIME_NUMBER: u32 = 27_644_437;
const SOME_OTHER_NUMBER: u32 = 17;

The first one is the fourth Bell Prime, because it was the first prime number over 10,000 that I saw on Wikipedia.

Now let's do the multiplication and addition to mangled. Since we're explicitly relying on overflows, we want to use wrapping_*:

pub fn some_noise_function(position: i32) -> u32 {
    let mut mangled = position as u32;
    mangled = mangled.wrapping_mul(SOME_BIG_PRIME_NUMBER);
    mangled = mangled.wrapping_add(SOME_OTHER_NUMBER);
    mangled = mangled.wrapping_mul(mangled);

    mangled
}

Finally, Eiserloh's example shows a bitwise XOR assignment with the same value, but bit shifted 13 bits to the right. This looks almost exactly the same in Rust as it does in C++, only without the unnecessary parentheses. This finally gives us our equivalent function for the first example:

pub fn some_noise_function(position: i32) -> u32 {
    let mut mangled = position as u32;
    mangled = mangled.wrapping_mul(SOME_BIG_PRIME_NUMBER);
    mangled = mangled.wrapping_add(SOME_OTHER_NUMBER);
    mangled = mangled.wrapping_mul(mangled);

    mangled ^= mangled >> 13;

    mangled
}

To test it, you can make a simple output with a brief range:

    println!("Some Noise Function results:");
    for number in 0..12 {
        println!(
            "Input {} produces output {}",
            number,
            some_noise_function(number)
        )
    }

Which produces results like these:

Some Noise Function results:
Input 0 produces output 289
Input 1 produces output 3715675734
Input 2 produces output 99453684
Input 3 produces output 2035159918
Input 4 produces output 933186350
Input 5 produces output 1088619820
Input 6 produces output 2501336958
Input 7 produces output 875985070
Input 8 produces output 508319700
Input 9 produces output 1397802838
Input 10 produces output 3545209601
Input 11 produces output 2654233028

Interesting to note, at least with the numbers I've chosen, that the result for 0 is such a low number. This could be fine, but probably not the most desirable, and is likely a result of starting our mangling with multiplication. (Zero times zero is, well, zero.)

Example Two: "Squirrel3"

From about 46:30 in the video, Eiserloh says that this is his third iteration of this hash function. Ultimately, his hand-crafted function produces results with an equally good statistical quality, and just slightly faster in C++ than C++'s std::hash<int>. I don't have the details to know how a Rust-based implementation would compare, but let's take a look at what it does. It takes both the position we're requesting, and the seed, as arguments.

uint32_t Squirrel3(int position, uint32_t seed)
{
    constexpr unsigned int BIT_NOISE1 = 0xB5297A4D;
    // He says in the talk that these are prime numbers, but this is an even number,
    // so it may be a typo. Regardless, we're using it as displayed in the talk.
    constexpr unsigned int BIT_NOISE2 = 0x68E31DA4;
    constexpr unsigned int BIT_NOISE3 = 0X1B56C4E9;

    unsigned int mangled = position;
    mangled *= BIT_NOISE1;
    mangled += seed;
    mangled ^= (mangled >> 8);
    mangled += BIT_NOISE2;
    mangled ^= (mangled << 8);
    mangled *= BIT_NOISE3;
    mangled ^= (mangled >> 8);
    return mangled;
}

There's a few more transformations here - and worth pointing out that the seed value is used after the first transformation, and not merely added to the position - but nothing new syntax wise from the above example. In Rust, following the same strategy with regards to i32 and u32 as in the first example, we end up with a function like this:

pub fn squirrel_3(position: i32, seed: u32) -> u32 {
    const BIT_NOISE1: u32 = 0xB5297A4D;
    const BIT_NOISE2: u32 = 0x68E31DA4;
    const BIT_NOISE3: u32 = 0x1B56C4E9;

    let mut mangled = position as u32;
    mangled = mangled.wrapping_mul(BIT_NOISE1);
    mangled += seed;
    mangled ^= mangled >> 8;
    mangled = mangled.wrapping_add(BIT_NOISE2);
    mangled ^= mangled << 8;
    mangled = mangled.wrapping_mul(BIT_NOISE3);
    mangled ^= mangled >> 8;

    mangled
}

Note if you forget to use wrapping_* here and try to run it, your program will immediately crash with a panic error when you try to run it. Ask me how I know...

Now if you run this with a couple basic ranges and some alternate seeds:

// In main.rs in this case.
    println!("Squirrel 3 results:");
    let seed = 12345; // Don't use this on your luggage.
    for number in 0..12 {
        println!(
            "Input {} with seed {} produces output {}",
            number,
            seed,
            squirrel_3(number, seed)
        )
    }
    let seed_two = 54321;
    for number in 0..12 {
        println!(
            "Input {} with seed {} produces output {}",
            number,
            seed_two,
            squirrel_3(number, seed_two)
        )
    }

You get results like this!

Squirrel 3 results:
Input 0 with seed 12345 produces output 3220422020
Input 1 with seed 12345 produces output 4234179005
Input 2 with seed 12345 produces output 334301668
Input 3 with seed 12345 produces output 145620291
Input 4 with seed 12345 produces output 2582164250
Input 5 with seed 12345 produces output 3262987543
Input 6 with seed 12345 produces output 63288327
Input 7 with seed 12345 produces output 2166186108
Input 8 with seed 12345 produces output 3083917344
Input 9 with seed 12345 produces output 28553252
Input 10 with seed 12345 produces output 2522297604
Input 11 with seed 12345 produces output 3818220281
Input 0 with seed 54321 produces output 3899447266
Input 1 with seed 54321 produces output 3175783065
Input 2 with seed 54321 produces output 3814005845
Input 3 with seed 54321 produces output 2040039807
Input 4 with seed 54321 produces output 1530743793
Input 5 with seed 54321 produces output 3295748292
Input 6 with seed 54321 produces output 327221713
Input 7 with seed 54321 produces output 3085187008
Input 8 with seed 54321 produces output 1267060869
Input 9 with seed 54321 produces output 3629072812
Input 10 with seed 54321 produces output 3952664767
Input 11 with seed 54321 produces output 900255667

These results don't show the issue noted in the first example with the weirdly low first number and give a nice range. Unfortunately, I'm not sure of the details on how to test this in a way that compares directly with the results in the talk, so I can't comment on that aspect. But we have successfully implemented a function to return a a random, yet totally predictable number for generating procedural content!

This post is tagged: