Gawking Text Files

Some tools in a toolbox are versatile. You can use a screwdriver as a pry bar to open a paint can, for example. I’ve even hammered a tack in with a screwdriver handle even though you probably shouldn’t. But a chainsaw isn’t that versatile. It only cuts. But man does it cut!

aukAWK is a chainsaw for processing text files line-by-line (and the GNU version is known as GAWK). That’s a pretty common case. It is even more common if you produce a text file from a spreadsheet or work with other kinds of text files. AWK has some serious limitations, but so do chainsaws. They are still super useful. Although AWK sounds like a penguin-like bird (see right), that’s an auk. Sounds the same, but spelled differently. AWK is actually an acronym of the original author’s names.

If you know C and you grok regular expressions, then you can learn AWK in about 5 minutes. If you only know C, go read up on regular expressions and come back. Five minutes later you will know AWK. If you are running Linux, you probably already have GAWK installed and can run it using the alias awk. If you are running Windows, you might consider installing Cygwin, although there are pure Windows versions available. If you just want to play in a browser, try webawk.

Continue reading “Gawking Text Files”