How Randomness Runs The Internet

Random numbers are at the heart of security on the internet. Without them, the internet would be a totally insecure place, a world where confidential online banking and messaging would be rendered impossible. If we couldn’t generate random numbers properly, I could literally read the contents of your Facebook messages.

In fact, random numbers are so critical to the modern world that the NSA is suspected of sneaking a backdoor into a “random number generator” algorithm, a backdoor which may have affected millions of devices around the world – but more on that later.

For now, welcome to the hidden world of random numbers. This will be a fun ride, I promise!

Before we dive in, here’s one of the places where random numbers are used.

Recall that when you log in to do your online banking, you only have to log in once every 10 minutes (say). That is, you can use the site for 10 minutes without having to log in again each and every time you do something.

This means the computer is somehow remembering the fact that you’re authenticated.

The details are technically involved, but the idea is simple: when you log in to your bank, the bank’s server sends back a random string of characters (say, 64 characters in length). Your computer stores this string temporarily, and every time you make a subsequent request to the site, your computer sends this string of characters to the bank. The bank can then look up who it gave that string to and confirm your identity.

 

[Figure: example conversation between a browser and a server]

 

That random string of characters is called a session token. In reality, there are many different types of tokens in use, but as you can see, they all rely on the assumption that others cannot guess them.
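To make the idea concrete, here is a minimal sketch in Python of the bank's side of that conversation. The `sessions` dictionary and the `log_in` and `who_is` functions are made up for illustration; a real bank would also expire tokens, store them securely and so on. The only real library call is `secrets.token_hex`, which asks the operating system for cryptographically strong random bytes.

```python
import secrets

# Hypothetical in-memory session store: token -> username.
sessions = {}

def log_in(username):
    """Issue a fresh session token for a user who has just authenticated."""
    token = secrets.token_hex(32)  # 32 random bytes -> 64 hex characters
    sessions[token] = username
    return token

def who_is(token):
    """Look up which user (if any) a presented token was issued to."""
    return sessions.get(token)

token = log_in("alice")
print(len(token))       # 64
print(who_is(token))    # alice
print(who_is("guess"))  # None
```

Every later request only has to carry the token; the server never needs to see the password again until the session ends.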

Some other use cases are multi-factor authentication codes, temporary passwords when you perform a password reset, etc.

You can imagine what would happen if the session tokens had some form of predictability to them. I’d be able to guess a session token and masquerade as someone else. Yes – if I could predict them accurately – it really would be that simple!
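Here is a toy illustration of that, under some loudly stated assumptions: a made-up server builds its tokens with Python's general-purpose `random` module (which is not designed for security) and seeds it with the login time in whole seconds. The function names, the timestamps and the `is_valid` callback are all hypothetical; the point is only that a guessable seed means a guessable token.

```python
import random

HEX = "0123456789abcdef"

def weak_token(seed):
    """Flawed design (assumption for this demo): token from a guessable seed."""
    rng = random.Random(seed)  # deterministic: same seed -> same output
    return "".join(rng.choice(HEX) for _ in range(64))

# The victim logs in at some moment the attacker doesn't know exactly.
login_time = 1_700_000_123
victim_token = weak_token(login_time)

def guess_token(window_start, window_end, is_valid):
    """Try every second in a time window until a candidate token is accepted."""
    for t in range(window_start, window_end):
        candidate = weak_token(t)
        if is_valid(candidate):
            return candidate
    return None

# The attacker only knows the login happened within a 10 minute window.
recovered = guess_token(1_700_000_000, 1_700_000_600,
                        lambda tok: tok == victim_token)
print(recovered == victim_token)  # True
```

Six hundred guesses instead of 16 to the power of 64: that is the difference between a token that looks random and one that actually is.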

Luckily, many smart people have gone to a lot of effort to make that a difficult task indeed.

What’s so hard about randomness?

A truly “random” random number generator would have absolutely zero bias, and knowing the entire prior history of the generator would give no information about the next value in the sequence.

Unfortunately, building such a beast out of a deterministic machine like a computer, and having it run fast and reliably, is a real challenge. Truly pure randomness needs to come from something that physically cannot be predicted, such as the decay of radioactive atoms, thermal noise in electronic circuits, or photons fired at a half-silvered mirror. This is all very complicated and, as you’d expect, not something your home computer can achieve all on its own. As such, the world relies on “pseudo-random number generators”, or PRNGs.

How good are PRNGs?

Pseudo-random number generators use mathematical tricks to take an incredibly small amount of randomness (something all devices can typically cook up) and stretch it into something that is larger and still looks reasonably random. The strength and security of these PRNGs relies purely on the mathematics and for the most part, they do a great job.
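As a rough sketch of what “stretching” means, the toy function below hashes a short seed together with a counter, over and over, to produce as many bytes as you ask for. The `stretch` function is invented for this post and is not a vetted design; real generators (the HMAC and CTR based constructions standardised by NIST, for example) are more careful, but the shape is similar.

```python
import hashlib

def stretch(seed: bytes, n_bytes: int) -> bytes:
    """Toy PRNG: expand a short seed by hashing seed + counter repeatedly."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        out += block
        counter += 1
    return bytes(out[:n_bytes])

seed = b"16 bytes of seed"
stream = stretch(seed, 1000)

print(len(stream))                                   # 1000
print(stretch(seed, 1000) == stream)                 # True
print(stretch(b"another seed....", 1000) == stream)  # False
```

Notice that the same seed always gives the same stream. The output is only as unpredictable as the seed is secret, which is why the seed matters so much.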

 

 

The challenge is that sometimes there’s a flaw in the mathematics (deliberate, or otherwise…), and an algorithm’s output might have a very slight bias. For example, there’s an algorithm called “Dual-EC” (Dual_EC_DRBG). Shortly after publication, researchers found that its output was measurably biased: an adversary with enough samples could tell it apart from truly random data with an advantage of around 0.1%. Now, that seems very small, but in the world of cryptography that’s enough to sign its death warrant. No self-respecting PRNG would have a bias that high.

(We did a write-up of the deliberate backdoor, and the logic behind it, here)

You might also be wondering where that initial seed of randomness comes from, given that a PRNG still needs an initial “activation” value to stretch in the first place. In a well-designed system this initialization value needs to be as random as possible (circular, I know) and must be kept secret.

This is how Windows does it, straight from the MSDN itself:

To form the seed for the random number generator, a calling application supplies bits it might have—for instance, mouse or keyboard timing input—that are then combined with both the stored seed and various system data and user data such as the process ID and thread ID, the system clock, the system time, the system counter, memory status, free disk clusters, the hashed user environment block. This result is used to seed the pseudorandom number generator (PRNG).

Microsoft Developer Network (MSDN) documentation

In short, it tries to collect bits of entropy from various sources around the computer and stores them in an entropy pool, which is then used by applications to seed a PRNG.
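Here is a heavily simplified sketch of that idea in Python. To be clear, this is not how Windows implements it: the `EntropyPool` class and the particular inputs are assumptions chosen for illustration. Each input on its own is fairly guessable; the hope is that hashing many of them together yields a seed nobody can reconstruct.

```python
import hashlib
import os
import time

class EntropyPool:
    """Toy entropy pool: mixes whatever bits it is given into one hash state."""

    def __init__(self):
        self._state = hashlib.sha256()

    def add(self, data: bytes) -> None:
        # Prefix each input with its length so inputs can't blur together.
        self._state.update(len(data).to_bytes(4, "big") + data)

    def seed(self) -> bytes:
        # Copy first so reading a seed doesn't disturb the pool.
        return self._state.copy().digest()

pool = EntropyPool()
pool.add(os.getpid().to_bytes(8, "big"))              # process ID
pool.add(time.time_ns().to_bytes(16, "big"))          # system time
pool.add(time.perf_counter_ns().to_bytes(16, "big"))  # high-resolution counter
pool.add(os.urandom(32))                              # the OS's own pool

seed = pool.seed()
print(len(seed))  # 32
```

In day-to-day code you would not build this yourself; you would ask the operating system for random bytes (in Python, `os.urandom` or the `secrets` module) and let it worry about the pool.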

That’s a brief introduction to the role of randomness in our ever-digital lives. Over the next few months I’ll publish an article about how weak random number generators can be exploited (with an example!) as well as the backstory of how NIST and the NSA apparently added a backdoor into a worldwide standard, including what made the world suspicious in the first place.

And there you have it. Who would have thought something totally random would be so incredibly useful?

 
