Linux Fu: Where’s That Darn File?

Disk storage has exploded in the last 40 years. These days, even a terabyte drive is considered small. There is one downside, though. The more stuff you have, the harder it is to find it. Linux provides numerous tools to find files when you can’t remember their name. Each has plusses and minuses, and choosing between them is often difficult.

Definitions

Different tools work differently to find files. There are several ways you might look for a file:

  1. Find a file if you know its name but not its location.
  2. Find a file when you know some part of its name.
  3. Find a file that contains something.
  4. Find a file with certain attributes (e.g., larger than 100 kB)

You might combine these, too. For example, it is reasonable to query all PDF files created in the last week that are larger than 100 kB.

There are plenty of different types of attributes. Some file systems support tags, too. So, you might have a PERSONAL tag to mark files that apply to you personally. Unfortunately, tool support for tags is somewhat lacking, as you’ll see later.

Another key point is how up-to-date your search results are. If you sift through terabytes of files for each search, that will be slow. If you keep an index, that’s fast, but the index will quickly be out of date. Do you periodically refresh the index? Do you watch the entire file system for changes and then update the index? Different tools do it differently. Continue reading “Linux Fu: Where’s That Darn File?”

Linux Fu: Name That Tune

If you aren’t old enough to remember, the title of this post refers to an old game show where contestants would try to name a tune using the fewest possible notes. What can we say? Entertainment options were sparse before the Internet. However, using audio fingerprinting, computers are very good at pulling this off. The real problem is having a substantial library of fingerprints to compare with. You can probably already do this with your phone, and now you can do it with your Linux computer.

In all fairness, your computer isn’t doing the actual work. In fact, SongRec — the program in question — is just a client for Shazam, a service that can identify many songs. While this is mildly interesting if you use a Linux desktop, we could also see using the same technique with a Raspberry Pi to get some interesting projects. For example, imagine identifying a song playing and adjusting mood lighting to match. A robot that could display song information could be the hit of a nerdy party.

Continue reading “Linux Fu: Name That Tune”

Linux Fu: Preprocessing Beyond Code

If you glanced at the title and thought, “I don’t care — I don’t write C code,” then hang on a minute. While it is true that C has a preprocessor and you can notoriously do strange and — depending on your point of view — horrible or wonderful things with it, there are actually other options and you don’t have to use any of them with a C program. You can actually use the C preprocessor with almost any kind of text file. And it’s not the only preprocessor you can abuse this way. For example, the m4 preprocessor is wildly complex, vastly underused, and can handle C source code or anything else you care to send to it.

Definitions

I’ll define a preprocessor as a program that transforms its input file into an output file, reacting to commands that are probably embedded in the file itself. Most often, that output is then sent to some other program to do the “real” work. That covers cpp, the C preprocessor. It also covers things like sed. Honestly, you can easily create custom preprocessors using C, awk, Python, Perl, or any other programming language. There are many other standard programs that you could think of as preprocessors, for example, tr. However, one of the most powerful is made to preprocess complex input files called m4. For some reason — maybe because of its complexity — you don’t see much m4 in the wild.

Continue reading “Linux Fu: Preprocessing Beyond Code”

Linux Fu: Easy Kernel Debugging

It used to be that building the Linux kernel was not easy. Testing and debugging were even worse. Nowadays, it is reasonably easy to build a custom kernel and test or debug it using virtualization. But if you still find it daunting, try [deepseagirl’s] script to download, configure, build, and debug the kernel.

The Python program takes command line arguments so you can select a kernel version and different operations. The script can download the source, patch the configuration, build the kernel, and then package it into a Debian package you can boot under qemu. From there, you can test and even debug with gdb. No risk of hosing your everyday system and no need to understand how to configure everything to run.

Continue reading “Linux Fu: Easy Kernel Debugging”

Linux Fu: Customizing Printf

When it comes to programming in C and, sometimes, C++, the printf function is a jack-of-all-trades. It does a nice job of quickly writing output, but it can also do surprisingly intricate formatting. For debugging, it is a quick way to dump some data. But what if you have data that printf can’t format? Sure, you can just write a function to pick things apart into things printf knows about. But if you are using the GNU C library, you can also extend printf to use custom specifications. It isn’t that hard, and it makes using custom data types easier.

An Example

Suppose you are writing a program that studies coin flips. Even numbers are considered tails, and odd numbers are heads. Of course, you could just print out the number or even mask off the least significant bit and print that. But what fun is that?

Here’s a very simple example of using our new printf specifier “%H”:

printf("%H %H %H %H\n",1,2,3,4);
printf("%1H %1H\n",0,1);

When you have a width specification of 1 (like you do in the second line) the output will be H or T. If you have anything else, the output will be HEADS or TAILS.

Continue reading “Linux Fu: Customizing Printf”

Linux Fu: Deep Git Rebasing

If you spend much time helping people with word processor programs, you’ll find that many people don’t really use much of the product. They type, change fonts, save, and print. But cross-references? Indexing? Largely, those parts of the program go unused. I’ve noticed the same thing with Git. We all use it constantly. But do we? You clone a repo. Work on it. Maybe switch branches and create a pull request. That’s about 80% of what you want to do under normal circumstances. But what if you want to do something out of the ordinary? Git is very flexible, but you do have to know the magic incantations.

For example, suppose you mess up a commit message — we never do that, of course, but just pretend. Or you accidentally added a file you didn’t want in the commit. Git has some very useful ways to deal with situations like this, especially the interactive rebase.

Continue reading “Linux Fu: Deep Git Rebasing”

Linux Fu: Build A Better Ls

Ask someone to name all the things they can find in a room. Only a few will mention air. Ask a Linux command line user about programs they use and they may well forget to mention ls. Like air, it is seemingly invisible since it is so everpresent. But is it the best it can be? Sure, you can use environment variables and aliases to make it work a little nicer, but, in fact, it is much the same ls we have used for decades. But there have always been moves to make better ls programs. One of them, exa, was recently deprecated in favor of one of its forks, eza.

One thing we liked about eza is that it is a single file. No strange installation. No multiple files to coordinate. Put it on your path, and you are done. So installation is easy, but why should you install it?

Continue reading “Linux Fu: Build A Better Ls”