If you are a traditional programmer, using bash for scripting may seem limiting sometimes, but for certain tasks, bash can be very productive. It turns out that some of the limits of bash are really limits of older shells, and people code to those limits to stay compatible. Still other perceived issues arise because some of the advanced functions in bash are arcane or confusing.
Strings are a good example. You don’t think of bash as a string manipulation language, but it has many powerful ways to handle strings. In fact, it may have too many ways, since the functionality winds up in more than one place. Of course, you can also call out to programs, and sometimes it is just easier to make a call to an awk or Python script to do the heavy lifting.
But let’s stick with bash-isms for handling strings. Obviously, you can put a string in a variable and pull it back out. I am going to assume you know how string interpolation and quoting work. In other words, this should make sense:
echo "Your path is $PATH and the current directory is ${PWD}"
The Long and the Short
Suppose you want to know the length of a string. That’s a pretty basic string operation. In bash, you can write ${#var} to find the length of $var:
#!/bin/bash
echo -n "Project Name? "
read PNAME
if (( ${#PNAME} > 16 ))
then
  echo Error: Project name longer than 16 characters
else
  echo ${PNAME} it is!
fi
The “((” forms an arithmetic context, which is why you can get away with an unquoted greater-than sign here. If you don’t mind using expr (which is an external program), there are at least two more ways to get there:
echo ${#STR}
expr length "${STR}"
expr match "${STR}" '.*'
Of course, if you allow yourself to call outside of bash, you could use awk or anything else to do this, too, but we’ll stick with expr as it is relatively lightweight.
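As a rough sketch, the length test can be wrapped in a small function so a script can reuse it. The function name name_ok and its second argument are made up for illustration; only ${#var} and the arithmetic context come from the text above.

```bash
#!/bin/bash
# name_ok is a hypothetical helper, not a bash built-in.
# It succeeds when the string in $1 is non-empty and no longer
# than the limit given in $2.
name_ok() {
    local name=$1
    local max=$2
    (( ${#name} > 0 && ${#name} <= max ))
}

for candidate in "hackaday" "a-very-long-project-name"
do
    if name_ok "$candidate" 16
    then
        echo "$candidate it is!"
    else
        echo "Error: $candidate is empty or longer than 16 characters"
    fi
done
```

Because the function returns the status of the arithmetic test, it can be used directly in an if statement without any extra comparison.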
Swiss Army Knife
In fact, expr can do a lot of string manipulations in addition to length and match. You can pull a substring from a string using substr. It is often handy to use index to find a particular character in the string first. The expr program uses 1 as the first character of the string. So, for example:
#!/bin/bash
echo -n "Full path? "
read FFN
LAST_SLASH=0
SLASH=$( expr index "$FFN" / ) # find first slash
while (( $SLASH != 0 ))
do
  let LAST_SLASH=$LAST_SLASH+$SLASH # point at next slash
  SLASH=$(expr index "${FFN:$LAST_SLASH}" / ) # look for another
done
# now LAST_SLASH points to last slash
echo -n "Directory: "
expr substr "$FFN" 1 $LAST_SLASH
echo -or-
echo ${FFN:0:$LAST_SLASH}
# Yes, I know about dirname but this is an example
Enter a full path (like /foo/bar/hackaday) and the script will find the last slash and print the name up to and including the last slash using two different methods. This script makes use of expr but also uses the syntax for bash’s built-in substring extraction, which starts at index zero. For example, if the variable FOO contains “Hackaday”:
- ${FOO} -> Hackaday
- ${FOO:1} -> ackaday
- ${FOO:5:3} -> day
The first number is an offset and the second is a length if it is positive. You can also make either of the numbers negative, although you need a space after the colon if the offset is negative (otherwise bash reads the :- as the default-value operator). The last character of the string is at index -1, for example. A negative length is shorthand for an absolute position from the end of the string. So:
- ${FOO: -3} -> day
- ${FOO:1:-4} -> ack
- ${FOO: -8:-4} -> Hack
Of course, either or both numbers could be variables, as you can see in the example.
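Here is a small sketch of how those offsets might be used to slice a fixed-width field. The STAMP value and the variable names are invented for the example.

```bash
#!/bin/bash
# STAMP is an invented sample value in YYYYMMDD form.
STAMP=20240131

YEAR=${STAMP:0:4}     # four characters starting at offset 0
MONTH=${STAMP:4:2}    # two characters starting at offset 4
DAY=${STAMP: -2}      # last two characters (note the space)

echo "$YEAR-$MONTH-$DAY"        # prints 2024-01-31

# Offsets and lengths can come from variables as well.
START=4
COUNT=2
echo "${STAMP:$START:$COUNT}"   # prints 01
```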
Less is More
Sometimes you don’t want to find something, you just want to get rid of it. bash has lots of ways to remove substrings using fixed strings or glob-based pattern matching. There are four variations. One pair (# and ##) removes the shortest and longest possible match from the front of the string, and the other pair (% and %%) does the same thing from the back of the string. Consider this:
TSTR=my.first.file.txt
echo ${TSTR%.*}   # prints my.first.file
echo ${TSTR%%.*}  # prints my
echo ${TSTR#*fi}  # prints rst.file.txt
echo ${TSTR##*fi} # prints le.txt
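A common use for these operators is pulling a path apart without calling dirname or basename. The sketch below assumes a path that contains a directory part and at least one dot in the file name; the path itself is invented.

```bash
#!/bin/bash
# FFN is an invented sample path.
FFN=/foo/bar/hackaday.tar.gz

DIR=${FFN%/*}       # shortest match of /* from the back  -> /foo/bar
BASE=${FFN##*/}     # longest match of */ from the front  -> hackaday.tar.gz
EXT=${BASE##*.}     # longest match of *. from the front  -> gz
STEM=${BASE%%.*}    # longest match of .* from the back   -> hackaday

echo "dir=$DIR base=$BASE ext=$EXT stem=$STEM"
```

These are only stand-ins for the real tools. For a root-level path such as /foo, ${FFN%/*} gives an empty string where dirname would print /, and for a name with no slash at all the expansion returns the name unchanged where dirname would print a dot.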
Transformation
Of course, sometimes you don’t want to delete, as much as you want to replace some string with another string. You can use a single slash to replace the first instance of a search string or two slashes to replace globally. You can also fail to provide a replacement string and you’ll get another way to delete parts of strings. One other trick is to add a # or % to anchor the match to the start or end of the string, just like with a deletion.
TSTR=my.first.file.txt
echo ${TSTR/fi/Fi}         # my.First.file.txt
echo ${TSTR//fi/Fi}        # my.First.File.txt
echo ${TSTR/#*./PREFIX-}   # PREFIX-txt (note: always longest match)
echo ${TSTR/%.*/.backup}   # my.backup (note: always longest match)
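As a further sketch, the same operators can tidy up a file name. The FILE value and the result variable names are invented; the pattern side is a glob, so a bracket expression works there too.

```bash
#!/bin/bash
# FILE is an invented sample name containing spaces.
FILE="my file name.txt"

UNDERSCORED=${FILE// /_}       # replace every space
NO_VOWELS=${FILE//[aeiou]/}    # no replacement given, so matches are deleted
BACKUP=${FILE/%.txt/.bak}      # replace only a trailing .txt

echo "$UNDERSCORED"   # prints my_file_name.txt
echo "$NO_VOWELS"     # prints my fl nm.txt
echo "$BACKUP"        # prints my file name.bak
```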
Miscellaneous
Some of the more common ways to manipulate strings in bash have to do with dealing with parameters. Suppose you have a script that expects a variable called OTERM to be set but you want to be sure:
REALTERM=${OTERM:-vt100}
Now REALTERM will have the value of OTERM or the string “vt100” if there was nothing in OTERM. Sometimes you want to set OTERM itself, so while you could assign to OTERM instead of REALTERM, there is an easier way. Use := instead of the :- sequence. If you do that, you don’t necessarily need an assignment at all, although you can use one if you like:
echo ${OTERM:=vt100} # now OTERM is vt100 if it was empty before
You can also reverse the sense so that you replace the value only if the main value is not empty, although that’s not as generally useful:
echo ${DEBUG:+"Debug mode is ON"} # reverse -; no assignment
A more drastic measure lets you print an error message to stderr and abort a non-interactive shell:
REALTERM=${OTERM:?"Error. Please set OTERM before calling this script"}
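One detail worth a quick sketch is the colon in these operators. With the colon, an empty variable is treated the same as an unset one; without it, only an unset variable triggers the default. The variable names A, B, and C are just for illustration.

```bash
#!/bin/bash
unset OTERM
A=${OTERM:-vt100}    # OTERM is unset, so A is vt100

OTERM=""
B=${OTERM:-vt100}    # with the colon, empty counts as missing: B is vt100
C=${OTERM-vt100}     # without the colon, OTERM is set, so C stays empty

echo "A=$A B=$B C=$C"   # prints A=vt100 B=vt100 C=
```

The same rule applies to the =, +, and ? forms.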
Just in Case
Converting things to upper or lower case is fairly simple. You can provide a glob pattern that matches a single character. If you omit it, it is the same as ?, which matches any character. You can elect to change all the matching characters or just attempt to match the first character. Here are the obligatory examples:
NAME="joe Hackaday"
echo ${NAME^}     # prints Joe Hackaday (first match of any character)
echo ${NAME^^}    # prints JOE HACKADAY (all of any character)
echo ${NAME^^[a]} # prints joe HAckAdAy (all a characters)
echo ${NAME,,}    # prints joe hackaday (all characters)
echo ${NAME,}     # prints joe Hackaday (first character matched and didn't convert)
NAME="Joe Hackaday"
echo ${NAME,,[A-H]} # prints Joe hackaday (apply pattern to all characters and convert A-H to lowercase)
Recent versions of bash (5.1 and later) can also convert to upper and lower case using ${VAR@U} and ${VAR@L}, and can uppercase just the first character using ${VAR@u}, but your mileage may vary.
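One place the case operators earn their keep is in comparisons that should ignore case. This is only a sketch: same_word is a made-up helper name, and the ^ and ,, operators need bash 4 or later.

```bash
#!/bin/bash
# same_word is a hypothetical helper name.
# It lowercases both arguments before comparing them.
same_word() {
    [[ "${1,,}" == "${2,,}" ]]
}

if same_word "Hackaday" "HACKADAY"
then
    echo "match"
fi

# Capitalize each word of a name, one word at a time.
FULL=""
for word in joe hackaday
do
    FULL+="${word^} "
done
FULL=${FULL% }      # drop the trailing space
echo "$FULL"        # prints Joe Hackaday
```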
Pass the Test
You probably realize that when you do a standard test, that traditionally calls a program (although bash also provides a built-in version of it):
if [ $f -eq 0 ]
then ...
If you do an ls on /usr/bin, you’ll see an executable actually named “[” used as a shorthand for the test program. However, bash has its own test in the form of two brackets:
if [[ $f == 0 ]]
then ...
That test built-in can handle regular expressions using =~ so that’s another option for matching strings:
if [[ "$NAME" =~ [hH]a.k ]] ...
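When the regular expression contains parenthesized groups, bash stores what each group matched in the BASH_REMATCH array, with the whole match at index 0. A minimal sketch, using an invented version string and variable names:

```bash
#!/bin/bash
# VERSION is an invented sample string.
VERSION="bash-5.2.15"

# Keeping the pattern in a variable avoids quoting surprises.
RE='^([a-z]+)-([0-9]+)\.([0-9]+)'

if [[ "$VERSION" =~ $RE ]]
then
    NAME=${BASH_REMATCH[1]}
    MAJOR=${BASH_REMATCH[2]}
    MINOR=${BASH_REMATCH[3]}
    echo "name=$NAME major=$MAJOR minor=$MINOR"   # prints name=bash major=5 minor=2
fi
```

Note that the pattern variable is left unquoted on the right side of =~; quoting it would make bash treat it as a literal string instead of a regular expression.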
Choose Wisely
Of course, if you are doing a slew of text processing, maybe you don’t need to be using bash. Even if you are, don’t forget you can always leverage other programs like tr, awk, sed, and many others to do things like this. Sure, performance probably won’t be as good, but if you are worried about performance, why are you writing a script?
Unless you just swear off scripting altogether, it is nice to have some of these tricks in your back pocket. Use them wisely.
Bash sucks as a programming language. Only donkeys use bash as a general purpose programming language.
Why waste time and space even talking about it?
Whenever I come across a multi-hundred line bash script some nutball wrote, I want to hunt them down and punish them.
Funny thing about bash… Even on the obscurest of systems, there’s usually no need to “apt-get” it.
Move on if you don’t like it. Or stay, you might learn something.
Like, for example: Bash isn’t a programming language.
> Like, for example: Bash isn’t a programming language.
Yeah that’s what they were saying.
Bash IS a programming language. It has been years since I used any other language for my programs.
This is Hackaday. Using something for programming, that isn’t a programming language, is exactly in the spirit of what we do! See also: “anything is possible with enough 555s” and “flagrant abuse of the C pre-processor”.
Donkeys confuse scripting with programming.
Monkeys know and use both :)
Only a buffoon would pretend to be a “programmer” and not realize/appreciate the importance of shell scripting
Hhmm, this is a really professional statement, lucky guy :-(
“punish them” really???
can’t hack it???
this is hackaday right???
also ‘bash’ is an acronym not a verb
Long time ago (early 90s) I was the Unix systems guy for a little company, which went out of business.
Afterwards, another ex-employee and I made a living for a while as freelance consultants/developers to the now-unsupported user base, he did the business and customer training aspects and I did the software/hardware/tech/comms.
Since we didn’t own the applications IP (they were compiled COBOL) and the systems were plain vanilla SysVr2 and Xenix with no internet connections, I wrote lots of shellscripts and sed/awk scripts as filters to take the outputs of A/R, G/L and Reports and generate new data and printouts from them. Management loved it. Everything worked real great except on the odd occasions when a single parenthesis etc. was misplaced in a thousand-line shellscript.
Really wish I had colour-coded editing back then, I mainly used vi and kermit on my Compaq LTE laptop as a serial terminal.
I prefer fish on Friday and the other six days. Strings are easier, color coding is helpful and the auto-suggestion seems to read my mind.
(Ba)sh/openssh or the best way to admin a server…
Great article!
Your dirname example could have been much simpler using the following non-greedy trailing match substitution:
echo -n ${FFN%/[!/]*}/
Perhaps as important as the code elegance, it avoids at least 2 instantiations of expr, and their associated context switches. If you needed to parse a large list of filenames this becomes critical:
~]$ cat test.sh
#!/bin/bash
while read -r FFN;
do
LAST_SLASH=0
SLASH=$( expr index "$FFN" / ) # find first slash
while (( $SLASH != 0 ))
do
let LAST_SLASH=$LAST_SLASH+$SLASH # point at next slash
SLASH=$(expr index "${FFN:$LAST_SLASH}" / ) # look for another
done
echo ${FFN:0:$LAST_SLASH}
done dir.list
~]$ cat test2.sh
#!/bin/bash
while read -r FFN;
do
echo ${FFN%/[!/]*}/
done dir2.list
~]$ find /lib/ -type f | head -n 10000 > file.list
~]$ wc -l file.list
10000 file.list
~]$ time test.sh
real 5m52.066s
user 1m10.284s
sys 4m14.552s
~]$ time ./test2.sh
real 0m0.216s
user 0m0.140s
sys 0m0.075s
The time savings are a factor of over 10^3….
Note though that the two scripts are not exactly the same functionally. test2.sh needs to remove a “/” and one or more characters that are not a “/” to work, so it will mangle a root level file or directory such as /foo.txt or /lib into /. But that limitation is avoided if you know that you have a list of pathed files. It would probably be easier to use sed or perl to handle those cases than fight with bash’s substitution…
HaD comment scrubbing strikes again. I really wish we had a decent mark-up language here. Both scripts are reading from file.list and outputting into dir.list using the typical bash while read;do;done file redirection structure, but HaD dumped the redirections…
Comments get scrubbed of all non-valid HTML-tag-like stuff. Keeps you from passing scripts and breaking stuff.
pre tag experiment …
Here it is
retry
How to use those “pre tags”??
Dunno if this site has some kind of mark-up-language implementation, but in HTML: Less-than-sign, the text “pre” (three letters, no quotes), greater-than-sign. End with the same tag, but with a forward-slash after the less-than-sign.
If it does use mark-up, perhaps the same but with square brackets instead of comparison operators.
HTH!
OK, let’s see if this works:
#!/bin/bash
while read -r FFN
do
LAST_SLASH=0
SLASH=$( expr index "$FFN" / ) # find first slash
while (( $SLASH != 0 ))
do
let LAST_SLASH=$LAST_SLASH+$SLASH # point at next slash
SLASH=$(expr index "${FFN:$LAST_SLASH}" / ) # look for another
done
echo ${FFN:0:$LAST_SLASH}
done dir.list
#!/bin/bash
while read -r FFN;
do
echo ${FFN%/[!/]*}/
done dir2.list
Nope.
I wonder if the comment scrubbing also created the syntax error above:
echo $TSTR##*fi}
my.first.file.txt##*fi}
echo ${TSTR##*fi}
le.txt
For me, bash is another shell I may have to understand if maintaining someone else’s work, but my brain got wired for csh many years ago, and there ain’t more room for another shell.
P.S. yes, it’s also a programming language, albeit interpreted.
A serious question for the author and a hacky example (for fun).
1. In the discussion of “test”, is there a square bracket missing at the end of line 1? It should be a double-bracket, yes? As in —
if [[ $f == 0 ]]
2. A crazy idea for a demo script –
#!/bin/bash
read SANE
while (( ${#SANE} > 0 ))
do
echo ${SANE^^[aeiouthsz]}
read SANE
done
exit 0
I do hope this survives the submission process.
Yes, I don’t know if I made a typo or WordPress ate my square brackets. We fight WordPress to get code in posts all the time, so I may have had an HTML entity that I overaggressively deleted.
Bash is a great scripting language for one primary reason: I can log into any Linux box and my entire IDE is at my fingertips as long as my shell is set to /bin/bash.
Oh sure, it’s not python. Who cares? I have no dependencies to worry about and if I stick to the basis of sed, grep, awk, tr, expr, and the like, I don’t have to worry about installing however many programs just so I can use one function.
I’m one of those loonies who loves bash scripting because I can almost always bend it to my will. And when the tasks you need to automate are already in Linux, why go to the trouble of writing a python script that just does os calls to bash when I can just use bash to begin with.
Bash is far more powerful and useful than the haters think.
And hey, at least it’s not PERL! (runs for cover)
I used to write shell scripts in the Android environment. One idea I had was a program that tests its own integrity with the md5sum applet, but this creates a paradox.
A script like this:
#!/system/bin/sh
check="$(busybox md5sum $0|busybox awk '{print $1}')"
echo "md5 of $0 is $check"
if [ "$check" == "36e77f5fed92460796df627f6dd1d0ab" ] # this is the paradox: the hash has to be written into the very script ($0) that is being hashed
then
echo "this message never show"
else
echo "md5 not equal"
kill -9 $$
fi
What was your goal @Lbreak? because that’s not a good way to check md5sums for anything :)
My goal is that people cannot rewrite my script. that is by testing MD5.
but i can’t.
Check out SHC. It’s a compiler for shell scripts:
https://en.wikipedia.org/wiki/Shc_(shell_script_compiler)
thanks @geocrasher
😝
“But let’s stick with bash-isms for handling strings.”
Some of those things aren’t Bash-isms, actually.
Cf. https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02
These days, you can (and should) simply use ShellCheck.
Syntax highlighting usually sucks.
Great article–I’ve been writing shell scripts for many years and a LOT of what is covered in this article I’ve been unawares of…So, dumb question: any good reference sites for BASH? I have tried to read the official GNU doc and it has a ton of obscura but I find it unparseable by my old brain.
The last reference on BASH one needs.
Bash Considered Pointless (Or: Pointless Bashing):
https://blogstrapping.com/2013.271.13.19.30/
I’d even say: The days of shells are over. Tear the scripting part out of the shell and concentrate the rest on being a text mode user interface. For all other stuff use *readable* and *safe* languages!
http://shell.cfaj.ca/?2004-05-22_shell_websites
One of my favourite uses for BASH strings is to replace commands like dirname and basename. So one can use "${var%/*}" instead of $(dirname "$var") and "${var##*/}" instead of $(basename "$var"). And the string replacement on variables, i.e. "${var/old/new}", has also saved me countless calls to sed.
Seeing all the idiotic replies … Word of advice! Do not use the Internet as your personal diary. Your suffering with the world is not anywhere as noteworthy as that of Anne Frank.
This article has more broken code than I’ve seen in a long time.
Unquoted variable expansion leads to broken scripts.
For example: echo ${NAME^} will strip leading and trailing whitespace from $NAME and will reduce multiple consecutive whitespace characters with a single space. It should be: echo “echo ${NAME^}”
That should be:
echo “${NAME}”
or
echo “$NAME” # the braces do nothing