Computers are known to be precise and — usually — repeatable. That’s why it is so hard to get something that seems random out of them. Yet random things are great for games, encryption, and multimedia. Who wants the same order of a playlist or slide show every time?
It is very hard to get truly random numbers, but for a lot of cases, it isn’t that important. Even better, if you are programming or using a scripting language, there are lots of things that you can use to get some degree of randomness that is sufficient for many purposes.
The Root of Random
In your device directory are two quasi-files you might not have noticed before. The /dev/random and /dev/urandom files will output as many random bytes as you might want to read. Why are there two? The kernel grabs noisy data from different places. For example, it might read crypto hardware or measure time intervals between disk accesses. These numbers are not easy to predict and can make a good source of difficult to guess numbers. However, for a certain number of random bits you need a certain amount of random noise. The /dev/random device file fills with these environmental random bits, and if it needs more random measurements to complete the request, it will block until it gets them. The /dev/urandom file, on the other hand, will provide an “unlimited” number of bytes; it works by periodically re-seeding a pseudo-random number generator with environmental randomness.
If you program in any normal language, it is easy to just open either of these files and read the number of bytes you want. In normal shell scripting, it is easy, too. For example:
head -c 3 /dev/random | od -t x1 -A none
This command will give you three hex bytes. If you prefer, you could change the x1 to get decimal numbers or anything else you want.
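As a small variation on the command above, a different od format prints the bytes as a single decimal number. This is only a sketch: the function name rand_u16 is made up for illustration, and it reads from /dev/urandom instead of /dev/random so that it will not block.

```shell
# rand_u16 is a hypothetical helper name, not a standard command.
# It reads 2 bytes from /dev/urandom and prints them as one
# unsigned 16-bit decimal number (0 to 65535).
rand_u16() {
    # -A n suppresses the offset column; -t u2 means unsigned 2-byte decimal.
    head -c 2 /dev/urandom | od -A n -t u2 | tr -d ' \t\n'
    echo
}

rand_u16
```

The tr step only strips the padding that od adds around the number, which makes the result easier to use in shell arithmetic.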
Better Shell
Of course, the shell knows you want to do this. Bash keeps $RANDOM updated and you can read from it if you prefer:
for i in {1..5}; do echo $RANDOM; done
This will give you five random numbers each time.
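In bash, $RANDOM is an integer between 0 and 32767, so you usually need to squeeze it into the range you care about. Here is a minimal sketch, assuming bash; the helper name roll_die is invented for this example.

```shell
#!/usr/bin/env bash
# roll_die is a hypothetical helper: it maps $RANDOM (0 to 32767 in bash)
# onto the range 1 to 6. The modulo step is very slightly biased because
# 32768 is not a multiple of 6; that is fine for games, not for crypto.
roll_die() {
    echo $(( RANDOM % 6 + 1 ))
}

# Roll the die five times.
for i in 1 2 3 4 5; do
    roll_die
done
```

Assigning a value to RANDOM seeds bash's generator, which can help when you want a repeatable sequence while testing a script.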
Better Still
This is easy, but we can still do better. After all, suppose you have a bunch of sayings in a file, one per line. Even with a random number, you’d need to skip the lines and worry about how many lines are in the file total. There’s a better way: the shuf command.
This command seems simple at first but is actually quite powerful. The bare command reads a file, or standard input, and permutes it based on a random number. There are options to feed it your own source of random numbers if you care.
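In GNU shuf, the option for supplying your own random numbers is --random-source=FILE. The sketch below saves some random bytes in a temporary file and uses them twice, so both shuffles should come out in the same order. The 1024-byte size and the variable names are arbitrary choices for this example.

```shell
# Save some random bytes once, then reuse them as shuf's random source.
# The 1024-byte size is an arbitrary choice that is ample for ten items.
seed_file=$(mktemp)
head -c 1024 /dev/urandom > "$seed_file"

first=$(shuf -i 1-10 --random-source="$seed_file")
second=$(shuf -i 1-10 --random-source="$seed_file")

# Both runs consumed the same bytes, so the two orders should match.
if [ "$first" = "$second" ]; then
    echo "same order both times"
fi

rm -f "$seed_file"
```

This can be handy when you want a shuffle that looks random but can be reproduced later, for example while debugging a script.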
Sometimes you don’t want all the items in the file. For example, you might want to pick a single quote from a file, or just the next random song. The -n option limits the output to the given number of lines. If you want to shuffle numbers, you can use the -i option. For example:
shuf -n 1 -i1-10
This command will give you a single random number between 1 and 10. Very easy!
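By default, shuf outputs a permutation, so asking for several numbers never gives you the same value twice. If you want independent draws, like dice rolls, GNU shuf has a -r (--repeat) option. A short sketch, assuming a reasonably recent GNU coreutils:

```shell
# Five independent rolls of a six-sided die; values may repeat.
rolls=$(shuf -r -n 5 -i 1-6)
echo "$rolls"

# Without -r, shuf emits part of a permutation, so each value
# appears at most once.
unique=$(shuf -n 5 -i 1-6)
echo "$unique"
```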
Back to picking a random quote from a file, that’s as easy as:
shuf -n1 input_file.txt
Combined with a list of files, this can pick random files easily, too:
ls *.mp3 | shuf -n 1
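Piping ls works for ordinary file names. As a hedged alternative, GNU shuf's -e option treats each argument as an input line, so you can hand it the shell glob directly and skip ls. The helper name pick_random_mp3 below is made up for this sketch.

```shell
# pick_random_mp3 is a hypothetical helper name.
# Usage: pick_random_mp3 DIRECTORY
pick_random_mp3() {
    dir=${1:-.}
    # Expand the glob into the positional parameters.
    set -- "$dir"/*.mp3
    # If nothing matched, the pattern is left as-is and does not exist.
    [ -e "$1" ] || return 1
    # -e makes shuf treat each argument as one input line.
    shuf -n 1 -e -- "$@"
}

pick_random_mp3 . || echo "no mp3 files here"
```

Checking that the first match exists avoids printing the literal pattern when the directory has no matching files.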
When to Choose Which
Note the shuf command is part of the GNU Core Utilities, so some machines won’t have it. In BSD, the jot command is somewhat similar. For a more portable script, it would probably be wise to check that shuf exists, maybe look for jot, and if you find neither, try to see if $RANDOM changes. You could process the raw number with awk. Absent that, you could check for /dev/urandom and /dev/random, which would also require some processing.
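The fallback chain described above might look something like the following sketch. The function name rand_range is hypothetical, the processing uses shell arithmetic rather than awk, and only /dev/urandom is tried as the last resort. The modulo steps are slightly biased and the $RANDOM branch only makes sense for ranges up to 32768 values, so treat this as a starting point and not a finished tool.

```shell
# rand_range is a hypothetical helper name.
# Usage: rand_range LOW HIGH   (prints one integer, LOW <= n <= HIGH)
rand_range() {
    lo=$1
    hi=$2
    span=$(( hi - lo + 1 ))
    # Read $RANDOM a few times to see whether it actually changes.
    s1="${RANDOM:-} ${RANDOM:-}"
    s2="${RANDOM:-} ${RANDOM:-}"
    if command -v shuf >/dev/null 2>&1; then
        shuf -n 1 -i "$lo-$hi"
    elif command -v jot >/dev/null 2>&1; then
        jot -r 1 "$lo" "$hi"
    elif [ "$s1" != "$s2" ]; then
        # $RANDOM changes between reads, so the shell provides it.
        echo $(( RANDOM % span + lo ))
    elif [ -r /dev/urandom ]; then
        # Read 4 bytes as one unsigned 32-bit number, then reduce it.
        n=$(head -c 4 /dev/urandom | od -A n -t u4 | tr -d ' \t\n')
        echo $(( n % span + lo ))
    else
        echo "no random source found" >&2
        return 1
    fi
}

rand_range 1 10
```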
With these tools, you can write delightfully unpredictable scripts. (Of course, some of our scripts are less than delightfully unpredictable, too. But we can’t blame /dev/urandom for that.)
If you want to dig deep into /dev/random, check out Elliot’s writeup of the Linux entropy collecting system.
Interestingly, unlike shuf, mp3 player style shuffle is not actually random. If it were, you would get repeated songs more than you’d expect.
This is a really cool little one liner, but I’ve usually regretted most of the bash scripts I’ve written.
I’m sure it makes sense if you’re a sysadmin, but for home use, I think Python is just so much nicer than bash.
The main thing I like about bash is that it’s really convenient when you have a logicless list of commands to run, but when you start doing heavier programming, the syntax gets a bit annoying.
The language to use depends on the life expectancy of the script.
If it is less than a year, today’s hot language is good.
If it is less than five years, bash shouldn’t break in that time.
If it is more than five years, use sh and C89 (no external libs allowed). The C source should be readable from the script for recompilation.
If you need to target older systems, remember that Linux sed and awk, for example, are quite luxurious.
Also writing some stdin to stdout processing software is better than invoking tens of simple programs. Just keeping the processing simple and wrapping it with sh will give flexibility. Also you get srand(), rand() for not so random stuff and arc4random() for more random stuff on newer systems.
The only more irritating scripting problem than old script breaking by bit rot is old perl script breaking.
Python should work for the foreseeable future with minimal maintenance, but yeah, it’s not perfect.
I’m really surprised how casually most languages take backwards compatibility. All this Agile RefactorMercilessly doesn’t encourage stable anything.
There should really be a decent high level language (Preferably just a fork of an existing one) that everyone just agrees they will leave alone.
There are tons of file formats we can still read from the 90s, but for some reason OOP scripting languages seem to always want to shuffle it up constantly.
Archival Oriented Programming would be pretty awesome.
There already is: Python 2. It is no longer supported and therefore frozen and therefore left alone. Python 2 is not going away anytime soon simply because it is no longer updated.
I can easily imagine quite a few distros ditching 2 in the next few years, but it should be pretty safe.
Still, something like Wren+a very basic embedded style GUI might be a better choice. A decent modern language you can just compile yourself and include directly in the project, if you want to be really sure it’s going to stay around.
“Interestingly, unlike shuf, mp3 player style shuffle is not actually random. If it were, you would get repeated songs more than you’d expect.”
I don’t remember well, but it was either early Winamp or Foobar that, when playing “random” MP3s, at some point kept repeating the same songs.
A random shuffle does not give you repeats of songs back to back. It’s a permutation of the songs in the playlist; each one gets played once but the order is a random permutation. It has nothing to do with “truly random” or not.
Do NOT use /dev/random for embedded systems. They do not generate enough entropy to keep your task from blocking and you do not need “randomer numbers”. A PRNG seeded with HWRNG and system entropy is more than unpredictable enough. I don’t care that GPG and Systemd do it, most kernel developers disagree with those decisions and they are debating changing this behavior because crypto operations in early boot keep hanging machines.
+1 Can confirm the current systemd behavior leads to hangs and headache.
Question: I’ve been intrigued with creating tiny Linux instances, like the Business Card Linux project. Can you pick a random source, urandom/random/etc. with something like BuildRoot?
Generally you always get both and software gets whatever it asks for. If you have a supported CPU or TPM with HWRNG instructions, setting rng_core.default_quality to something sensible like 700 or 1000 will cause the kernel to credit entropy from them and not block on random or uninitialized urandom.
That’s neat, I wasn’t aware of that. I’ve been reading up on how Linux and micro-controllers handle proper “random”. Do it internally to a SoC or leverage some type of external entropy source.
Yes, and you can get blocking on non-embedded systems that load the filesystem into RAM, since there is just not enough I/O activity. The easy solution is to symlink random to urandom.
“When to choose which”
Unless you have good reasons, use urandom.
If you think you have a good reason, read https://www.2uo.de/myths-about-urandom/ and then ask again.
(The only good reason will turn out to be in the kernel near boot time)
Here’s a good reason to use something besides urandom:
You are writing a long-running unit test that requires random data, and you want the same set of random numbers each time so that your test yields the same result every time and is suitable for CI development. In this case you need to be able to seed the random number generator with the same seed every time, which is not supported with urandom.
If you think this is some sort of degenerate special case then your tests are probably not very good.
I’m sorry but most people tend to run their unit tests after the machine boots up
To clarify for people reading your comment: this is when you use the C stdlib srand() and rand() functions. This is NOT when you use /dev/random.
“Who wants the same order of a playlist or slide show every time?”
I do. Random playlists always seem to cause jarring mood changes and I can’t imagine how hard it would be to write a talk for a random slide show.
Super annoying how every player wants to be “helpful” in tracking down every piece of music on any of your drives and finds an mp3 alert sound in an application’s directory and slips that in between your actual music. It’s getting so that I want to set up a different user for each genre of music and lock down the search to their own home/user directory.
Yeah the folks who work in retail really appreciate it when the same terrible song plays over the PA system every day at the same time.
If it’s so short that it needs to be restarted daily then the songs would get old regardless of the order. Plus retail music is meant to be bad enough to annoy customers on the first play through. That’s why order doesn’t matter; it’s all the same rotten tripe and none has any mood other than anxiety that you might not get out of the store before the next track starts. Thankfully most retail has gotten rid of their music; silence is better.
$RANDOM as far as I know only exists in the bash shell.
And I think ksh as well. But it is not in many of the older shells.
And zsh as well (just checked, I was not sure either).
$RANDOM is supported in the default shell on macOS (/bin/zsh) and AIX (/bin/sh) so feel free to use it in your shell scripts.
A good question is: Is it in POSIX? Which as far as I know the answer is no. And I don’t think it’s one of the almost-POSIX things like ampersand redirection either.
Notably, it’s not in Busybox’s ash, so it’s probably unavailable in an initramfs or in resource constrained embedded environment. If you do a lot of system integration work, learning to use POSIX isms over bash or ksh isms is a useful skill because of this. Reducing the number of constructs in the language also makes it more orthogonal which may make scripts require less knowledge to read, even though it makes things a little more unpleasant to write.
Thank you!
:o)
What does “FU” mean in this context?
Your word fu is weak old man :-)
I’ve often wondered, and your query led me to actually looking for the answer.
“Fu” derives from a Taoist concept of returning, particularly returning to basics. There is some implication in the Taoist concept of this being cyclical, which does not seem to apply to computerese. The computer use of “Fu” might also carry an implication of skill.
In addition to these Linux-Fu articles which focus on basic Linux text commands, you might also consider GIMP’s script-fu, a scripting language for GIMP plugins.
I think “Fu” is just short for “kung fu” which is a generic Chinese term for martial arts.
https://en.wikipedia.org/wiki/Kung_fu_(term)
So Linux Fu would most likely mean the art or techniques in using Linux?
Sounds good to me. I suppose we’d have to do historical research and find out who originated the computer usage, and ask that person.
Etymology does not govern meaning, so that wouldn’t help. Knowing it is a loan-word used to mean “skill” is already to know the meaning. If the intent had been different, it wouldn’t affect the current meaning in this context.
For simple pseudo random there is sort -R.
I am soooo new and sooo old. I started writing programs on a green-screen Commodore 64, then the Trash 80. Sorry, TRS-80. I can get by well enough if I read it. Prob is windows 7 isCapt. I do not understand how to load Bionic Puppy Linux. I’ve tried several ways... nuttin. It also issues the “IPv6 No Access Available” message. I’ve reset config, winsock, twice. I see the date back on oymt and drove it back a week, which seemed to fix it. But now it’s been a week and it’s doing it again. Any help is appreciated. Ty.
Zippydoo
Someone here might find the HAVEGE-algorithm interesting :
https://github.com/jirka-h/haveged