Linux-Fu: Making AWK A Bit Easier

awk is a kind of Swiss Army knife for text files. However, some of its limitations are often a bit annoying. I’ve used a simple set of functions to make awk a bit better, although I will warn you: it does require GNU extensions to awk. That is, you must use gawk and not other versions. Your system probably maps /usr/bin/awk to something and that something might be gawk. But it could also be mawk or some other flavor. If you use a Debian-based distro, update-alternatives is your friend here. But for the purposes of this post, I’m going to assume you are using gawk.

By the end of the post, you’ll see how to use my awk add-on functions to split up a line into fields even when there is no single character to separate all fields. In addition, you’ll be able to refer to the fields using names you decide. You won’t have to remember that $2 is the time field. You’ll say Fields_fields["time"] instead.

The Problem

awk does a lot of common work for you when you use it to process text files. It reads files a record at a time. Normally, a record is a single line. Then it splits the line on fields using whitespace, or some other choice of field separators. You can write code that manipulates the line or individual fields. This default behavior is great, especially since you can change the end of record character and the field separator. A surprising number of files fit this sort of format.

Until, of course, they don’t. If you have data coming from a data logging instrument or some database, it could be formatted in a variety of ways. Some fields might have structured data with a variety of separators. This isn’t a deal-breaker. Since you can get at the whole line, you can do almost anything you want, but the logic is harder and the whole point to using awk is to make things easier.

For example, suppose you had a file from a data recorder that had an eight-digit serial number, followed by a six-character tag, and then two floating point numbers separated by colons. The pattern might look like

^([0-9]{8})([a-zA-Z0-9]{6})([-+.0-9]+),([-+.0-9]+)$

This would be hard to handle with the conventional field splitting and you’d normally just write code to split everything apart.

Continue reading “Linux-Fu: Making AWK A Bit Easier”

Linux Fu: Simple SSH File Sharing

If you have more than one Linux computer, you probably use ssh all the time. It is a great tool, but I’ve always found one thing about it strange. Despite having file transfer capabilities in the form of scp and sftp, there is no way to move a file back or forth between the local and remote hosts without starting a new program on the local machine or logging in from the remote machine back to the local machine.

That last bit is a real problem since you often access a server from behind a firewall or a NAT router with an ephemeral IP address, so it can’t reconnect to you anyway. It would be nice to hit the escape character, select a local or remote file, and teleport it across the  interface, all from inside a single ssh session.

I didn’t quite get to that goal, but I did get pretty close. I’ll show you a script that can automatically mount a remote directory on the local machine. You’ll need sshfs on the local machine, but no changes on the remote machine where you may not be able to install software. With a little more work, and if your client has an ssh server running, you can mount a local directory on the remote machine, too. You won’t need to worry about your IP address or port blocking. If you can log into the remote machine, you are good.

Combined, this got me me very close to my goal. I can be working in a shell on either side and have access to read or write files on the other side. I just have to set it up carefully. Continue reading “Linux Fu: Simple SSH File Sharing”

Linux Fu: Literate Regular Expressions

Regular expressions — the things you feed to programs like grep — are a bit like riding a bike. It seems impossible until you learn to do it, and then it’s easy. Part of their bad reputation is because they use a very concise and abbreviated syntax that alarms people. To help people who don’t use regular expressions every day, I created a tool that lets you write them in something a little closer to plain English. Actually, I’ve written several versions of this over the years, but this incarnation that targets grep is the latest. Unlike some previous versions, this time I did it all using Bash.

Those who don’t know regular expressions might freak out when they see something like:

[0-9]{5}(-[0-9]{4})?

How long does it take to figure out what that does? What if you could write that in a more literate way? For example:

digit repeat 5 \

start_group \

   - digit repeat 4 \

end_group optional

Not as fast to type, sure. But you can probably deduce what it does: it reads US Zipcodes.

I’ve found that some of the most popular tools I’ve created over the years are ones that I don’t need myself. I’m sure you’ve had that experience, too. You know how to operate a computer, but you create a menu system for people who don’t and they love it. That’s how it is with this tool. You might not need it, but there’s a good chance you know someone who does. Along the way, the code uses some interesting features of Bash, so even if you don’t want to be verbose with your regular expressions, you might pick up a trick or two.

Continue reading “Linux Fu: Literate Regular Expressions”

Linux Fu: Moving /usr

Linux has changed. Originally inspired by Unix, there were certain well understood but not well enforced rules that everyone understood. Programs did small things and used pipes to communicate. X Windows servers didn’t always run on your local machine. Nothing in /usr contributed to booting up the system.

These days, we have systemd controlling everything. If you run Chrome on one display, it is locked to that display and it really wants that to be the local video card. And moving /usr to another partition will easily prevent you from booting up, unless you take precautions. I moved /usr and I lived to tell about it. If you ever need to do it, you’ll want to hear my story.

A lot of people are critical of systemd — including me — but really it isn’t systemd’s fault. It is the loss of these principles as we get more programmers and many of them are influenced by other systems where things work differently. I’m not just ranting, though. I recently had an experience that brought all this to mind and, along the way, I learned a few things about the modern state of the boot process. The story starts with a friend giving me an Intel Compute Stick. But the problems I had were not specific to that hardware, but rather how modern Linux distributions manage their start-up process.

Continue reading “Linux Fu: Moving /usr”

Linux-Fu: Your Own Dynamic DNS

It is a problem as old as the Internet. You want to access your computer remotely, but it is behind a router that randomly gets different IP addresses. Or maybe it is your laptop and it winds up in different locations with, again, different IP addresses. There are many ways to solve this problem and some of them are better than others.

A lot of routers can report their IP address to a dynamic DNS server. That used to be great, but now it seems like many of them hound you to upgrade or constantly renew so you can see their ads. Some of them disappear, too. If your router vendor supplies one, that might be a good choice, until you change routers, of course. OpenWRT supports many such services and there are many lists of common services.

However, if you have a single public accessible computer, for example a Web server or even a cloud instance, and you are running your own DNS server, you really don’t need one of those services. I’m going to show you how I do it with an accessible Linux server running Bind. This is a common setup, but if you have a different system you might have to adapt a bit.

There are many ways to set up dynamic DNS if you are willing to have a great deal of structure on both sides. Most of these depend on setting up a secret key to allow for DNS updates and some sort of script that calls nsupdate or having the DHCP server do it. The problem is, I have a lot of client computers and many are set up differently. I wanted a system where the only thing needed on the client side was ssh. All the infrastructure remains on the DNS server.

Continue reading “Linux-Fu: Your Own Dynamic DNS”

Linux-Fu: One At A Time, Please! Critical Sections In Bash Scripts

You normally think of a critical section — that is, a piece of a program that excludes other programs from using a resource — as a pretty advanced technique. You certainly don’t often think of them as part of shell scripting but it turns out they are surprisingly useful for certain scripts. Most often, a critical section is protecting some system resource like a shared memory location, but there are cases where a shell script needs similar protection. Luckily, it is really easy to add critical sections to shell scripts, and I’ll show you how.

Sometimes Scripts Need to Be Selfish

One very common case is where you want a script to run exactly one time. If the same script runs again while the original is active, you want to exit after possibly printing a message. Another common case is when you are updating some file and you need undisturbed access while making the change.

That was actually the case that got me thinking about this. I have a script — may be the subject of a future Linux-Fu — that provides dynamic DNS by altering a configuration file for the DNS server. If two copies of the script run at the same time, it is important that only one of them does modifications. The second copy can run after the first is totally complete.

Continue reading “Linux-Fu: One At A Time, Please! Critical Sections In Bash Scripts”

Linux Fu: Remote Execution Made Easy

If you have SSH and a few other tools set up, it is pretty easy to log into another machine and run a few programs. This could be handy when you are using a machine that might not have a lot of memory or processing power and you have access to a bigger machine somewhere on the network. For example, suppose you want to reencode some video on a box you use as a media server but it would go much faster on your giant server with a dozen cores and 32 GB of RAM.

Remote Execution

However, there are a few problems with that scenario. First, you might not have the software on the remote machine. Even if you do, it might not be the version you expect or have all the same configuration as your local copy. Then there’s the file problem. the input file should come from your local file system and you’d like the output to wind up there, too. These aren’t insurmountable, of course. You could install the program on the remote box and copy your files back and forth manually. Or you can use Outrun.

There are a few limitations, though. You do need Outrun on both machines and both machines have to have the same CPU architecture. Sadly, that means you can’t use this to easily run jobs on your x86-64 PC from a Raspberry Pi. You’ll need root access to the remote machine, too. The system also depends on having the FUSE file system libraries set up.

Continue reading “Linux Fu: Remote Execution Made Easy”