Generating Truly Random Sequences

Your brain can’t generate random numbers, and computers can’t either. Most of the ‘random’ numbers we come across in our lives are actually pseudorandom numbers; random enough for their purpose, but ordered enough to throw statistical analyses for a loop. [Giorgio] thought generating random sequences would make for an excellent project, so he whipped up a random sequence generator out of a few Opamps, resistors, and a handful of caps.

[Giorgio] used a Chua Circuit – a circuit that models nonlinear equations – to create a chaotic system. When pairs of points from these systems of equations are plotted on a graph, a fabulous and chaotic ‘double scroll’ pattern (seen above) can be found. After taking oscilloscope probes to different points on his Chua circuit, [Giorgio] watched chaos magically appear on his ‘oscope screen.

The double scroll pattern isn’t exactly random, but since the Z signal of his circuit chaotically varies between positive and negative, the only thing needed to create a random sequence of 1s and 0s is sending the Z signal through a comparator.

After calibrating and sampling his circuit [Giorgio] captured thousands of samples at a rate of 5 samples per second. From a cursory glance, it looks like [Giorgio]’s circuit is at least as good as flipping a coin, but proper tests for randomness require many more samples.

A very, very cool piece of work that is much, much more elegant than getting random bits from a Geiger counter.

69 thoughts on “Generating Truly Random Sequences

  1. Chaos is NOT randomness. You shouldn’t assume because you can’t see patterns in your random number generator that they don’t exist. The old random() function from microsoft DOS basic seems random but I wouldn’t use it for anything important.

    The strange attractor plot is interesting in its own right.

      1. No, if it’s chaotic then it doesn’t repeat. (IIRC, mathematicians haven’t proven that the strange attractor is chaotjc, but they’ve got good reason to beleive it is.) In 3 dimensions, being bounded doesn’t require repitition. It may be that chaos isn’t possible in 2 dimensions since it would have to repeat; I don’t remember

      2. If you know the curve (or at least a little about it) and you know the sampeling rate, you can estimate within what range the next sample would be.

        True randomness means that your ‘next sample’ has equal chance to get any value. The best example is this one: After you got 10 samples at 128 in a row, a human would expect it to be something else. How in the world could this be possible? Well, even if it were 10 million samples at 128, it still is equally possible that the next one is 128.

        Statistics.. fun stuff!

      3. “No, if it’s chaotic then it doesn’t repeat.”

        No. Chaotic simply means unpredictable.

        There is deterministic chaos, non-deterministic chaos and everything in between. The common factor is that chaotic systems are so sensitive to small variations in initial conditions or some variables that they’re essentially impossible to predict precisely.

      4. I had an undergrad course on non-linearity and chaos.
        A chaotic attractor is defined by a non-repeating sequence that is bounded in space.

        One important characteristic of chaos is that it is highly sensible to initial condition. Say you start at 1.00000000, you will get a very different result than if you start with 1.00000001. Our lack of precise knowledge on the exact starting point make chaotic patterns completely unpredictable. There can still be some predictable statistics, though.

      5. Dax wrote: “No. Chaotic simply means unpredictable…”

        It’s a free country, so I guess you can concoct whatever definition you want, but that’s not the standard mathematical definition.

        Here’s the definition from Nonlinear Dynamics and Chaos by Steven Strogatz (a popular text book on the subject):

        “Chaos is aperiodic long-term behavior in a deterministic system that exhibits sensitive dependence on initial conditions.”

        Aperiodic means it doesn’t repeat — like I wrote.

        Deterministic means it’s predictable — the opposite of what you wrote.

        The reason it appears unpredictable is because of the sensitive dependence on initial conditions; it’s predictability is limited by how accurately you can measure its initial condition.

  2. A way to test if a random number generator is generating random numbers is to calculate Pi! And since he has this already wired up to a RPi, it would be a cool way to test the circuits randomness. It is a basic comparison of of 2 random values (X and Y) to see if they fall inside a circle or inside of the square. The ratio of which should slowly converge towards pi.

    See the below random links for more info, I can’t find the one I first saw of this idea on a good few years ago.

    http://math.fullerton.edu/mathews/n2003/montecarlopimod.html

    http://www.stealthcopter.com/blog/2009/09/python-calculating-pi-using-random-numbers/

    http://http.developer.nvidia.com/GPUGems3/gpugems3_ch37.html

    1. That commercial site offers many claims, but little actual information on the specifics of what and how they are implementing their device.

      Even well accepted random source can have problems with specific physical implementations that cause biases that prevent resulting stream from being truly random.

      Compound that with the problem, that circuits that produce adequately random streams when first constructed can loose that ability over time.

      And none of that takes into account that people can influence even truly random entropy sources if the know enough about the design of the circuit.

      AVR random number generation experiments/code

  3. Fun project.

    I wonder how it could be effectively interfaced to some kind of touch device – hopefully not something as boring as a screen or flat surface – to reshape the image in realtime?

    Audio input could also play a roll.

    Hey, isn’t plotting the result kind of boring compared to using the coordinates to drive a laser pointer?

  4. I suspect that this circuit is not producing a truly random sequence, since the circuit is simply implementing a chaotic equation. Fundamentally that is basically was a computer’s psuedo-random number generation functions do.

    It would be interesting to see the output submitted to actual tests for randomness, even basic ones which the OP didn’t do.

    1. A computer’s pseudo-random generator can only sample the bits used in the computation… the chaotic circuit can sample the arbitrary low significant digits present in real life.

      The result is that the circuit system has far more chaotic potential, as chaos represents a strong dependence on low significant digits.

      1. Sorry, but the point is that since the entropy source is an equation, it is not truly random, ie it is from a deterministic source just like a computers software based pseudo-random number generator.

        Its possible the claim is correct; however, without easily providable evidence it is more than likely simply naive.

      2. The entropy source is not the equation modeled in the circuit, but the random thermodynamic noise which impacts the physical circuit.

        The circuit will diverge rapidly from a deterministic simulation of the circuit.

      3. If the entropy source is thermodynamic noise, then much simpler (and much faster) circuits already exist which make use of that entropy source.

        Essentially what you are saying is that the entire equation implemented by the OP as an analog computet is irrelevant to the entropy performance :)

  5. There is no reason to balance the circuit to be positive half of the time, and negative the other half. Simply invert every other output bit in software. This will cancel out any inherent bias the circuit might have.

  6. Have a look at “A New Kind of Science” by Stephan Wolfram. TL,DR: Computer programs *can* generate perfectly random* bit patterns, and with trivial programs.

    Note, these are deterministic sequences, and eventually must repeat if they are approximated in a computer.

    But a bitstream that takes a significant fraction of the age of the universe to get to a repeat, using only 32 bits of space, is a pretty damn good approximation.

    The point is – this kind of pattern (see “rule 30” in the book) is, from the point of view of feeding some chunk of its output through the usual statistical analysis, absolutely random.

    The fact that a trivial program can generate it is only “proof that it’s not random” by the generally given definition of “random”. Since such programs are by no means unique, (in the sense that a search of all possible programs in different kinds of very simple computational systems always turns a few up) then there is therefore no proof that the universe itself does not, in fact, depend at the lowest level on some ridiculously simple set of rules to determine the next step of time.

    So in general then, just because we don’t yet know what that simple fundamental mechanism is, doesn’t prove that there isn’t one. It also raises the possibility therefore that any “naturally random sequence” is itself just a consequence of such a computation.

    Knowing that it could be generated by such a simple ruleset is, it turns out, no help whatsoever in finding which program generates it.

    The whole point is that the very idea that “programs are deterministic, and therefore can create no new randomness, it must come from elsewhere” is fairly trivially proved wrong. This was what Wolfram discovered, and it’s absolutely fascinating.

    It’s one of these seemingly trivial breakthroughs that begins to up the ante right across science.

    The notion of a purely random source is brought into suspicion as just a perception. When we say “it’s random” what we generally mean is that “we can’t find a pattern to simply describe its behaviour”.

    The natural logic would then reasonably continue with the mistaken idea that “since I can’t figure it, and I am perfectly smart, therefore no-one can, it must be fundamentally unfigurable. A wall, an unbreakable law of nature.” It is from this logic that the idea of a “true source of randomness” comes.

    1. “Computer programs *can* generate perfectly random* bit patterns, and with trivial programs. Note, these are deterministic sequences, and eventually must repeat if they are approximated in a computer.”

      In other words, they aren’t random.

    2. Mayby I am wrong,
      but isnt i.e. computing the digits of pi (as long as its not proven to repeat) some entirely deterministic sort
      of random sequence generation in the sense that there
      is no other way to predict the next digit, other than computing it (see kolmogorov complexity).
      Of course thats of no practical use since so many digits are known already and you would get exactly the same sequence every time you start over again.

      1. No, in the mathematical context, truly random means unpredictable. Give just a few digits from Pi, it would be trivial to predict the output of such a ‘random’ number generator.

        The above generator would be far less trivial, but given that it is deterministic, it is certainly predictable.

      2. okay,
        I just looked it up and it seems that the pi sequence fullfills the criteria for kolmogorov randomness, which is: A sequence of symbols, which cannot be described by an algorithm shorter than the sequence itself.
        Also i just noticed, that a coin flip is only truly random as long as you dont throw the coin. after that it is only pseudo random, since its outcome can be predicted by measurement. thats pretty interesting.

    3. The problem isn’t that the number sequence take a trillion years to repeat, the problem is that someone who comes along can capture the output, analyse the numbers and figure out the coefficients and gain the ability to “predict” the next number sequences.

      Imagine the lottery was done using pseudorandom number generators – sure they’re “random enough”, but someone could well analyse historical trends and predict future lotto numbers.

      Of course this is much more difficult if the random number generator algorithm is hidden from the attacker, and if the algirthm is non-standard, but this is getting into the whole “security by obscurity” argument.

      All I’m saying is that there are situations where true randomness is required. And it’s not that difficult to get a true random generator, and that’s why they exist and are used in certain circumstances.

      1. look up Ronald Dale Harris, he got access to the algorithm used for the Keno pseudo random number generator So after some games he could predict the next winning numbers
        He got caught..

    4. Wolfram describes the Rule30 output as ‘perfectly random’, but does not show or describe any tests to indicate how good of randomness the rule produces.

      Wolfram at very many places in his big book makes a strong claim about a system without actually justifying it.

  7. Hm, lots of misunderstanding about terminology:

    Usually when we talk about pseudorandomness in crypto or CS, we’re talking about strings of numbers generated by a deterministic function.

    The string looks random, but you can always recreate it if you know the algorithm and the state it began in (and often you can specify that initial state as a “seed”). So the “pseudo” is not a pejorative or a statement on quality, just a label meaning “deterministic” or, if you prefer, “reproducible”.

    And when we talk about true randomness, we’re talking about something derived from a physical phenomenon with an unpredictable component, like the time at which a radioactive element decays.

    Neither of those terms address the /quality/ of the randomness — your radioactive decay monitor could spit out millions of 0’s between each 1, but when that 1 happens is still truly random. (And you can use a “whitener” to turn it into a uniformly distributed truly random source that will be much more useful).

    And your pseudorandom number generator (PRNG) could be a simple LFSR like many rand() implementations use (which any good cryptanalyst can reverse-engineer very quickly, making it useless for crypto), or a good one like RC4, which nobody knows how to predict unless they know the key.

    So if Georgio’s circuit relies on unpredictable electrical noise, then it’s truly random, even if it needs whitening to clean it up statistically. And likewise, the random events we see in the real world, if they derived from a random natural phenomenon, are truly random too.

    1. Your confusing the quality of the random generator with the distribution of values from said generator. The quality of a generator has nothing to do with its distribution, since any particular distribution can be mapped to any other mathematically.

      All that matters is that the original source be unpredictable, and this circuit, which is just an analog computer implementing a particular set of equations is does not meet that requirement.

  8. Truly random data was generated from 1997 through 2001 by SGI’s Lavarand system.

    http://en.wikipedia.org/wiki/Lavarand

    I participated in an IRC chat with one of the Lavarand inventors. He mentioned that they’d built the system for a “large financial institution” to protect data on a CD-ROM.

    In addition to all that security, plus the system being inside a secure room, the “large financial institution” had SGI place the CD-ROM drive inside a microwave oven, connected to the building’s security system. If someone triggered the alarm the disc would get fried.

    Paranoid much? To make off with the data a thief would’ve had to carry out the entire system and exactly duplicated everything within the video camera’s field of view to even have a hope of decrypting the data.

  9. Randomness does not exist.
    Given the same set of initial conditions the result is always the same.
    Good thing.
    If randomness existed the universe would quickly dissolve.
    If randomness existed mathematics could not exist.

  10. The word ‘random’ has a lot of street-level connotations that make discussion of the subject complicated.

    Statistically, ‘random’ means ‘equal probabilities for any possible outcome’. For most computer purposes, ‘random’ means ‘starting from a given N-bit output sequence, you have an equal chance of seeing every other N-bit output sequence next’. That gets harder to do as the sequence lengths approach the RNG’s period, but for the truly huge period systems there’s little practical difference.

    Some people take ‘random’ to mean ‘stochastic’.. a system that can be analyzed statistically but can’t be predicted from any finite amount of kowledge about the system’s internal state or previous behavior.

    Zener noise generators are stochastic because it’s impossible to predict the next bust of electrons, though in large samples the bursts fall into a well-defined pattern.

    Algorithmic RNGs are the opposite of stochastic, since the RNG itself ‘predicts’ the next output value based on its internal state and the previous output of the system.

    Very few systems need true stochastic randomness. Almost all concemporary security is based on statistically random but non-stochastic systems that make it hard to reverse-engineer information about the internal state of the system from any reasonable amount of output.. the standard defintion of ‘hard’ being “couldn’t do it in the lifespan of the universe with the fastest computers physically possible”.

    BTW – @rgd2: Wolfram’s 1d finite automata do generate beautiful statistically random sequences, but to get infinite randomness, they have to get infinitely large.

    It’s mathematically impossible to get a period longer than 2^32 from a RNG with 32 bits of storage if pattern N+1 is a function of pattern N.. the domain is a set of 2^32 possible states, and the RNG is a function that maps the domain back to itself. If the system is deterministic, that mapping can only be one-to-one. The longest possible path through such a mapping is the one that visits every member of the domain exactly once before repeating.

    Rule 30 adds two new bits of storage with each iteration.. its range of possible states grows twice as fast as its set of possible outputs. That’s what all the ‘ever-expanding triangle’ business is about.

    In practical terms, it’s a moot point because there isn’t enough energy in the universe to cycle through every possible combination of 512 bits (64 bytes), and it’s trivial to run rule 30 (or any 1d cellular automaton) on a space several kilobytes wide.

    1. This statement, “Statistically, ‘random’ means ‘equal probabilities for any possible outcome’.” is incorrect. Mathetically, a random number stream can have any distribution (uniform–what you describe, gaussian, etc…)

      The distribution is irrelevant to whether the stream is truly random. In simple english, a stream is random if it is ‘unpredictable’ or to use your term stochastic. Any stream that is produced by a deterministic function is presumed to be predictable even if current technology does not allow such prediction. That is why even psuedo-random number streams, ie derived from deterministic equations, are still useful. They can’t be predicted by current technology. It is also why they are not used for truly important activities; because you can never be sure that some savant will not find a way to predict there outcome.

  11. The behavior of chaotic systems depends very strongly on their initial state, doesn’t that imply that any noise in the system will also affect the state strongly?
    I would imagine that this circuit would be very good at turning small fluctuations into large ones and therefore would make quite a good RNG.

  12. Generating a random number is very easy using a computer (that someone is using at the time).

    All you need is a counter and an event that is truly independent of the counter.

    Since for example keyboard clicks and the CPU clock are perfectly independent at least well above a 32bit variable, you can just cycle a variable (count from 0-(2^32)) and stop the timer when a given event occurs (for example A is pressed).

    There are much more advanced methods but most rely on the independence of a internal clock and the user input, or the random values in a photograph.

    Larger “random data farms” usually use radio noise to generate large volumes of random data fast.

  13. Actually, it is possible to train people and even animals to produce numbers that are indistinguishable from random. Check out the work of Dr. Allen Neuringer. He suggests using similar methods as a way of training people to be more creative.

  14. Chaotic systems do generate true randomness. Sure, they are deterministic equations, but the initial conditions contain an infinite amount of random information, and this continuously “bubbles up” onto an observable scale. The rate at which previously unobservable randomness makes itself present is known as the Kolmogorov entropy (not related to Kolmogorov complexity, but named after the same person). This has a precise mathematical definition and is positive for all chaotic systems.

    Even though the state space is bounded, the sequence never repeats. The trajectory can come arbitrarily close to a state that it has been in before, but it will always be a little bit off and in the future it will not do the same as it had done in the past, since it is not quite exactly in the same place as it was in the past. This is the butterfly effect. It may be possible to make predictions about what will happen a short time from now, but it is not possible to predict where the system will be a large time from now.

  15. in my “opingion”,…

    PRE-PS: when i say “system” here, i mean apparatus andor software to produce your numbers/data

    anything derived from a “system” is bound, in some way, to the system that created it, and is thus part of a system. anything that is part of a system is not, and can not, be truly random, after all it came from a system.

    saying a number that came from a “system” is random is like saying the traces on a random computer circuit-board are random, no, they are based around a specific chip, or system-bus, somewhat random, but still based on rules. IE the reset is always connected to pin3 (of chip) ect. pin 3 can be oriented the other way, in fact, could be anywhere on the entire board, and the board could have the chip hanging off into space with connecting wires going to the board’s surface. but no matter HOW you arrange and mount stuff, there IS a patternl, if it was truly random, it would cease to work.

    YES, no two snowflakes are the same when you take a sample, but they are all influenced by the day’s weather and type of snowfall, they all pass through simillar cloud-cover, and over a simillarily temperature city, sampling site.
    overall charistics are of a category (in general) as you may have (mostly) huge snowflakes today, and (mostly) small ones tomorrow, DRASTICALY reducing the randomness, and by definition if randomness is reduced any any factor, no matter how small, it ceases to be random as it is following a pattern. patterns are the oppisite of true randomness.

    same with calculations and even physics, you say the atomic “noise” is random because it does not repeat with your lifetime analysis but the truth is, it came from something that has electrons circiling it in a near sphere, and although the specific path it takes around the nucliei is seemingly random, it circles in the same “space” for so long that it could NOT be truly random. a truly random orbit would eventually screw up and fly off into space. we would cease to be able to rely on the peirodic table, and thus we would not have life.

    everything is from a pattern, your life, your biology, your thoughts, even the time you were born, is all connected to a central existence and nothing is seperate from that.

    the search for true randomness is foolish, unless a near randomness would suffice, which is does. anyone believeing they can find true randomness is ignorant of the system we live in OR is attempting to prouve to the world that they can single-handedly overthrow nature and her “laws” which keep us alive (edit: ability to exist). the very same laws that affects everything around us, including your “random” data on your HDD.

    overandout, peace.

  16. I’d like to thank all the people who have visited my page and those who have taken the time to leave comments on here, the feedback is highly appreciated. To be honest, I didn’t think my experiment would spark such interest. I believe that the key to understand why the Chua Circuit can be used for this purpose, is in the definition of chaos itself:

    Chaos theory studies the behavior of dynamical systems that are highly sensitive to initial conditions, an effect which is popularly referred to as the butterfly effect. Small differences in initial conditions (such as those due to rounding errors in numerical computation) yield widely diverging outcomes for chaotic systems, rendering long-term prediction impossible in general. This happens even though these systems are deterministic, meaning that their future behavior is fully determined by their initial conditions, with no random elements involved. In other words, the deterministic nature of these systems does not make them predictable. This behavior is known as deterministic chaos, or simply chaos.

    (from: http://en.wikipedia.org/wiki/Chaos_theory )

    My initial goal was trying to reproduce the results presented in the paper listed as [3] on my page. It is quite possible that I have made some mistakes in the process or that I haven’t been able to explain clearly why we can expect the sequence coming from this apparatus to be truly random. In this case, if you haven’t already, I suggest you to find a copy of that paper and read it, I’m sure it would solve many doubts. Also, a more recent and very interesting article that explains in details a similar technique (with results from FIPS 140-1 Statistical Tests for Randomness and Diehard Test Suite) is “True Random Bit Generation From a Double-Scroll Attractor – M.E. Yalcin” (the pdf is easily found on google)
    I have been working on acquiring new, longer sequences and looked into the de-skewing techniques to extract unbiased bits from a “defective” generator with unknown bias. I will update my page as soon as I have other interesting results (probably in a week or so).

    1. All good pseduo-random number generators have chaotic equations at their heart. Non-linearity is a requirement for any deterministic equation to successfully mimic a truly random source.

      It might be possible that your circuit is producing a true random source, which would be due to electrical noise, not the underlying equation your analog computer is using; however, a fairly extensive set of testing is required to justify that claim.

      Personally, I look forward to hearing about your continued research on such circuits.

    1. A truly random number generator would only generate that stream approximately 1 in every 100000 sequences of 5 digits… Not likely to be random.

      Now, finding a string of five 9’s in a larger random sequence is much more likely…

      I forget the professor at UT, but he bets his class that he can distinguish a sequence of a given size generated by people versus generated by a RNG… And he does it visually.

      Now he doesn’t make that bet with the students after they have taken his class, since by then they will know what he looks for to distinguish the two…

      1. Also, that 1 in 100000 chance applies equally to every other possible sequence of 5 digits. Our brains look at that 99999 pattern and think it is less likely than 13893, but it’s not.

      2. You’re essentially correct about human beings typically being poor random number generators.

        There are a few ways to force your brain to do a better job. Sure, pseudorandom number generators are just algorithms, it is conceivable with time and paper you could run one in your brain… in a pinch, digits from irrational numbers are an improvement (in terms of distribution only) over ‘just picking something’.

        However, what I would do to win the bet with that professor is to arbitrarily pick sets of 2 students of the same gender. If the leftmost one is taller, output 1, if the rightmost one is taller, output 0. The height parameter in this case should be normally distributed-ish, and the act of comparing should whitewash the data.

        Arguably I’d be cheating since I’m not actually generating the numbers per se.

        Here’s a fun paper that plots the phase space of different psuedorandom number generators, with some graphically interesting results:

        http://www.cs.ru.ac.za/research/g02c2954/Final%20Writeup.htm

  17. When testing output from a quantum source, I found this quite useful:

    http://csrc.nist.gov/groups/ST/toolkit/rng/index.html

    I found 100 megabits (divided into 100 blocks)to be a useful minimum data set, which may be impractical in your case without an upgrade, but it’s still a fun tool to play with.

    Another fun trick is to divide the data into N columns, and calculate the mean of each row. Then, do a frequency plot of the means. You’ll get a binomial distribution if your data is “random-like”. It works with pretty small data sets, though is not by any means a reliable test.

    Also, kudos for using opamps for this. I’m notorious among my friends for suggesting them as a panacea, but somehow I never thought of using them as an entropy source!

Leave a Reply

Please be kind and respectful to help make the comments section excellent. (Comment Policy)

This site uses Akismet to reduce spam. Learn how your comment data is processed.