Historically, one of the nice things about Unix and Linux is that everything is a file, and files are just sequences of characters. Of course, in modern practice, not everything is a file, and there is a proliferation of files with some imposed structure. However, if you’ve ever worked on old systems where file access was by the block, you’ll appreciate Unix-like files. Classic tools like awk, sed, and grep work with this idea: files are just characters. But this sometimes causes problems. That’s the motivation behind a tool called Miller, and I think it deserves more attention because, for certain tasks, it is a lifesaver.
The Problem
Consider trying to process a comma-delimited file, known as a CSV file. There are a lot of variations to this type of file. Here’s one that defines two “columns.” I’ve deliberately used different line formats as a test, but most often, you get one format for the entire file:
Slot,String
A,"Hello"
"B",Howdy
"C","Hello Hackaday"
"D","""Madam, I'm Adam,"" he said."
E 100,With some spaces!
X,"With a comma, or two, even"
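To see why a file like this trips up character-oriented tools, here is a minimal sketch in Python (not Miller itself) that compares splitting each line on commas with parsing through the standard library's csv module. The sample data assumes one record per line, and the helper names naive_split and csv_parse are made up for this illustration.

```python
import csv
import io

# The sample file from above, assuming one record per line.
SAMPLE = '''Slot,String
A,"Hello"
"B",Howdy
"C","Hello Hackaday"
"D","""Madam, I'm Adam,"" he said."
E 100,With some spaces!
X,"With a comma, or two, even"
'''


def naive_split(text):
    """Treat the file as plain characters: cut every line at every comma."""
    return [line.split(",") for line in text.splitlines()]


def csv_parse(text):
    """Parse with a CSV-aware reader that understands quoting rules."""
    return list(csv.reader(io.StringIO(text)))


naive = naive_split(SAMPLE)
parsed = csv_parse(SAMPLE)

# Show the field count from each approach next to the properly parsed row.
for n, p in zip(naive, parsed):
    print(len(n), len(p), p)
```

The naive split yields four fields for the "D" and X rows because it cuts at commas inside the quotes, and it leaves the quote characters in place. The CSV-aware reader returns two fields for every row. That gap is the kind of problem a format-aware tool is meant to close.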
Continue reading “Linux Fu: Miller The Killer Makes CSV No Pest”



The tool relies on a kernel module, and is coded primarily in C, with some assembly code used to measure performance as accurately as possible. It’s capable of reporting everything from core frequencies to details on hyper-threading and turbo boost operation. Other performance reports include information on instructions per cycle or instructions per second, and of course, all the thermal monitoring data you could ask for. It all runs in the terminal, which helps keep overheads low.


