Linux Fu: Watch That Filesystem

June 7, 2018

The UNIX Way™ is to cobble together different, single-purpose programs to get the effect you want, for instance in a Bash script that you run by typing its name into the command line. But sometimes you want the system to react to changes in the system without your intervention. For example, you might like to watch a directory and kick off some program automatically when a file appears from a completed FTP transaction, without having to sit there and refresh the directory yourself.

The simple but ugly way to do this just scans the directory periodically. Here’s a really dumb shell script:

#!/bin/bash
while true
 do
   for I in `ls`
    do cat $I; rm $I
   done
 sleep 10
done

Just for an example, I dump the file to the console and remove it, but in real life, you’d do something more interesting. This is really not a good script because it wakes up every ten seconds whether or not anything has changed, and it just isn’t a very elegant solution. (If you think I should use for I in *, try doing that in an empty directory and you’ll see why I use the ls command instead.)
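If you do poll, a glob can stand in for ls as long as you check each match before touching it: in an empty directory the pattern comes back unexpanded, and a -f test quietly skips it. What follows is only a sketch of that idea, not the script above. The function names process_dir and poll_dir are made up for illustration, and the ten-second default simply mirrors the sleep in the original loop.

```shell
#!/bin/bash
# Sketch only: process_dir and poll_dir are illustrative names.

# Dump and delete every regular file currently in directory $1.
process_dir() {
    local dir="$1"
    local f
    for f in "$dir"/*
    do
        # In an empty directory the glob stays as a literal "*", which is
        # not a file, so this test skips it. It also skips subdirectories.
        [ -f "$f" ] || continue
        cat -- "$f" && rm -- "$f"
    done
}

# Scan directory $1 forever, pausing $2 seconds (default 10) between scans.
poll_dir() {
    while true
    do
        process_dir "$1"
        sleep "${2:-10}"
    done
}
```

Calling poll_dir . at the end of the script would start the loop in the current directory. Quoting "$f" also keeps file names with spaces in one piece, which the ls version can't do.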

Increase Elegance

Honestly, you want something more elegant, right? Modern kernels (2.6.13 and later) have filesystem notifications in the form of an interface called inotify. You can use these calls programmatically with the sys/inotify.h header file. There is also a set of command-line programs you can install, usually packaged as inotify-tools.

One of those tools is inotifywait and it makes for a nicer script. For example:

#!/bin/bash
while true
 do
   if FN=$(inotifywait -e close_write,moved_to --format %f .)
   then
    cat "$FN"
    rm "$FN"
   fi
 done

That’s better, I think. It doesn’t wake up frequently, only when something has changed. I figure any sane program putting something in the directory will either open the file for writing and close it, or it will move it. Either way will work and the %f tells the command to report the file name. There are other events you can wait for as well, of course.

If you are wondering why the move case is necessary, think about how most text editors and network download software work. Usually, a new file doesn’t have the final name until it is complete. For example, Chrome will download the file test.txt as test.txt.crdownload or something like that. Only when the file is done will it rename (move) the file to test.txt.
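The same trick works for your own scripts when they are the ones dropping files into a watched directory. This is a minimal sketch, assuming a staging directory on the same filesystem as the watched one so that mv is a plain rename. The name deliver and its three arguments are invented for the example.

```shell
#!/bin/bash
# Sketch only: deliver is an illustrative name.
# Usage: deliver SOURCE_FILE STAGING_DIR WATCHED_DIR
# Assumes STAGING_DIR and WATCHED_DIR are on the same filesystem.
deliver() {
    local src="$1"
    local staging="$2"
    local watched="$3"
    local name
    name=$(basename -- "$src")
    # Do the slow copy where nobody is watching...
    cp -- "$src" "$staging/$name" || return 1
    # ...then rename the finished file into the watched directory.
    mv -- "$staging/$name" "$watched/$name"
}
```

Because the copy happens in the staging directory, a watcher on the target directory should only see a moved_to event for a file that is already complete.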

If you want to try the command without a script so you can see the effect, just open up two terminal windows.

In one terminal, issue the inotifywait command. Don’t forget the period at the end, which tells it to monitor the current directory. Then, in the other terminal, create a file in the same directory. The name of the file will appear in the first terminal and the program will exit. The script just takes advantage of this behavior to set the FN variable, takes action, and then relaunches inotifywait. You can ask the program not to quit (the -m option), by the way, but that makes scripting a little more difficult. However, it also removes the problem of a file changing while you are doing your processing.
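For the keep-running variant, inotifywait's monitor option (-m) prints one line per event instead of exiting. Here's a hedged sketch of how the loop might look in that case. The names handle_events and watch_dir are hypothetical, and the file handling is the same cat-and-remove placeholder as before.

```shell
#!/bin/bash
# Sketch only: handle_events and watch_dir are illustrative names.

# Read one file name per line on stdin and process it inside directory $1.
handle_events() {
    local dir="$1"
    local name
    while IFS= read -r name
    do
        # The file may already be gone by the time we get to it.
        [ -f "$dir/$name" ] || continue
        cat -- "$dir/$name"
        rm -- "$dir/$name"
    done
}

# Watch directory $1 without exiting after each event (-m) and with the
# startup chatter suppressed (-q), feeding file names to handle_events.
watch_dir() {
    inotifywait -q -m -e close_write,moved_to --format '%f' "$1" |
        handle_events "$1"
}
```

Running watch_dir . would start watching the current directory. Since a single inotifywait process stays alive, events that arrive while a file is being processed wait in the pipe rather than slipping past between relaunches.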

The other command-line tool, inotifywatch, also outputs file change events, but it watches for a certain amount of time and then gives you a summary of changes. I won’t talk about it any further. If you think you need that capability, you can read the man page.

A New Cron

The script is still less than ideal, though. Presumably, a system might have lots of different directories it wants to monitor. You really don’t want to repeat this script, or a variation of it, for each case.

There is another program for that, called incron (you will almost surely have to install this one). The incron program is like cron but instead of time-based events, the events are based on file notifications. Once you install it, you will probably have to change /etc/incron.allow and /etc/incron.deny if you want to actually use it, especially as a normal user.

Suppose you want to run a script when a file appears in the hexfiles directory. You can use the command incrontab -e to edit your incron table. The format is very picky (it wants spaces, not tabs, for example). Here’s a line from the file that will do the job:

/home/alw/Downloads/hexfiles IN_CLOSE_WRITE,IN_MOVED_TO /home/alw/bin/program_cpu $@/$#

The $@/$# at the end provides a full path to the file affected. You can also grab the event flags as text ($%) or as a number ($&). You can monitor all the usual events and also set options to do things like not dereference symbolic links. You can find it all in the incron man pages.
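On the receiving end, the program named in the table just sees those expansions as ordinary arguments. As an illustration only (this is not the program_cpu from the line above), here is a hypothetical handler that assumes the incrontab line passes $@/$# as the first argument and $% as the second.

```shell
#!/bin/bash
# Hypothetical incron handler. Assumed incrontab command field:
#   /path/to/this_script $@/$# $%

# Report the event and the size of the file it refers to.
handle_new_file() {
    local path="$1"
    local event="${2:-unknown}"
    if [ ! -f "$path" ]
    then
        echo "skipping $path: not a regular file" >&2
        return 1
    fi
    # The arithmetic expansion strips any padding some wc versions add.
    echo "$event: $(basename -- "$path") ($(( $(wc -c < "$path") )) bytes)"
}

# Only act when arguments were supplied, as they are when incron runs the script.
if [ "$#" -gt 0 ]
then
    handle_new_file "$@"
fi
```

A real handler would replace the echo with whatever processing the file needs. The check for a regular file is there because the file can be renamed or removed between the event and the moment the script runs.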

GUI

I’m not a big fan of GUI editors, but I know I’m in the minority. If you like, there’s a Java-based incrontab editor available. There isn’t much documentation, but you can import your incrontab, if it exists, from /var/spool/incron/your_user_id. It offers a form that builds the incron table line for you.

You can find the system files in /etc/incron.d, usually. All the locations can be set by the /etc/incron.conf file, so if you aren’t sure where to look or you want to change the location for the table files, start there.

Go Forth and Watch

Using incron is quite elegant. A system program does all the waiting and our script only runs when necessary. It is easy to look and see all the things you have notifications set for. You can do a lot with these tools, and not just in the embedded space. How are you going to use them?

31 thoughts on “Linux Fu: Watch That Filesystem”

Perry Harrington says:

June 7, 2018 at 10:08 am

It looks like something converted your backticks to single quotes in the inotifywait example. It’s better form to use $(command) instead of backticks, and it won’t get mutilated by your CMS.

1. Al Williams says:
  
  June 7, 2018 at 10:11 am
  
  Ah. WordPress. I’ll fix when I’m back at my desk.
  
  1. Luke says:
    
    June 7, 2018 at 10:34 am
    
    It’s also converting the tack before the e into some similar but incompatible character.
    
q2dg (@q2dg) says:

June 7, 2018 at 10:23 am

Incron is deprecated. Try systemd’s path units.

1. Al Williams says:
  
  June 7, 2018 at 11:28 am
  
  Oh my. We are going to start the systemd vs non-systemd comment thread….
  
  1. Mark Lamb says:
    
    June 7, 2018 at 11:53 am
    
    Real Admins don’t care about systemd vs. initd. Real Admins have their own init script (often perl) as an easier and more flexible alternative to all those tetchy little standard files in /etc. All your static IPs, mount points, etc, everything in one handy file.
    
    There’s actual uses for such a hack, but there’s really good reasons not to, especially on multiuser, production systems that others will have to support.
    
  2. eni says:
    
    June 8, 2018 at 2:41 am
    
    There is nothing to discuss anymore. Every major Linux distro uses systemd now. And incron has been unmaintained for years. There are alternatives to incron and systemd-path, but most of them are not in the repos of your distro.
    
2. Charlie says:
  
  June 7, 2018 at 1:09 pm
  
  Or use gidget. http://www.typinganimal.net/code/gidget/
  
  Where I work gidget handles thousands of file transfer triggered events a day.
  
3. anon says:
  
  June 8, 2018 at 11:56 am
  
  I know I should not feed the troll.
  But you know some people are having fun with init scripts and are running without systemD. I use OpenRC, So if any solution involves installing systemD and sh*tloads of dependencies, it is NOT a solution. Especially for an embedded system.
  
detuur says:

June 7, 2018 at 10:41 am

For the love of $DEITY, don’t parse the output of ls. In case anyone sees the first script and thinks to themselves “it’s dirty, but it works”, you’re wrong. Read this: https://mywiki.wooledge.org/ParsingLs . Long story short: you can easily fix the globbing behaviour in an empty directory by doing the `shopt -s nullglob` command at the beginning of your script. BUT DON’T PARSE LS — EVER!

1. Al Williams says:
  
  June 7, 2018 at 11:28 am
  
  Well it was an anti-example.
  
  1. djsmiley2k says:
    
    June 8, 2018 at 2:50 am
    
    Doesn’t make it right though :/
    
  2. Mike Rogers says:
    
    June 8, 2018 at 9:09 am
    
    Right?
    
  3. Zach says:
    
    June 8, 2018 at 1:17 pm
    
    Should read “Thanks. You further support my anti-example.” Should be no offence taken there, you did a fine job.
    
yeti says:

June 7, 2018 at 12:09 pm

“I’m not a big fan of GUI editors,”
+1

“but I know I’m in the minority.”
…sssssssssht… don’t mention it too often or all the hipsters will crowd shell providers and IRC!

BobbyMac99 says:

June 7, 2018 at 1:22 pm

Scripts that scan a directory for new files need to have time/date checking on them and some other type of “GO” indicator, or secondary target directories. Example: I want to move a transaction file to a disaster mirror and apply the transaction log to keep it current. The big problem is you need to make sure you aren’t writing to the file and are done with it totally. This can be solved by using 2 directories and renaming the file to the target directory when ready – then you know it’s safe to use. Also, recursion can occur if you run the same job on multiple machines for workload balancing. I set up a whole system for a similar case where a record comes down from ERP software, and you need to apply it to a reporting SQL database, and of course that gets applied to the disaster backup. It can be a real pain, especially for things like applying updates, where you need to take down jobs and start them in the correct order. Timing is obviously a big factor – you need to write it to be entered at any time of the process without backfiring.

user says:

June 7, 2018 at 2:05 pm

To periodically run a program, you can use the “watch” program. For example, to run ls every 2 seconds, run this:
“watch ls”

1. GekkePrutser says:
  
  June 8, 2018 at 8:56 am
  
  Yes that’s what I do! I’m surprised it took so long to come up.
  
  Before I found ‘watch’ (which is not posix but gnu so not available by default on systems like macOS), I used to use a simple bash loop with a delay.
  
Daniel Matthews says:

June 7, 2018 at 2:31 pm

The comments here are my idea of positive diversity. Thanks everyone, it is very helpful to have all that information and different perspectives, put so logically.

David L Norris (@DavidLNorris) says:

June 7, 2018 at 3:55 pm

fswatch on Linux, macOS, BSD, and Windows would be a portable way to implement this same functionality. Just replace inotifywait with fswatch and modify the arguments appropriately in your script. fswatch is in ports for BSD and macOS, available via Debian and Ubuntu repos, and you’ll have to use Google to find a Windows build. fswatch will use the operating system’s native filesystem monitor: inotify on Linux (inotify is way too slow for rapidly changing files), FSEvents on macOS (no known limitations), kqueue on BSD (limited number of files it can watch), ReadDirectoryChangesW on Windows (only reads entire directories; you have to figure out which files changed on your own), “File Events Notification API ” on Solaris (no known limitations), and “poll” which stats in a loop on any POSIX compatible system (usable but non-performant). i.e., if rapid file change notifications on lots of files are what you need then Solaris is likely the best option. But then that is the kind of workload it was designed for.

ian says:

June 7, 2018 at 4:22 pm

Quite a few of these methods (e.g. inotify) don’t work on shared directories where the remote system updates the files, so be careful and test, test, test!

1. Mike R says:
  
  June 10, 2018 at 5:01 pm
  
  I’m noticing that inotifywait doesn’t seem to notice if I move/copy files to the directory over sftp. Seems to work with everything else I’ve used, though. :/
  
omegacs says:

June 7, 2018 at 7:52 pm

My main beef with inotify is that it doesn’t have any provisions for a global wildcard. It should of course be root only and security-restricted, but Crashplan and other backup applications that try to keep up with file changes are currently forced to request a watch on every single file in the system. This mandates that the backup program trudge through the entire filesystem upon boot to initiate the watches, then maintain them along with the associated kernel overhead (which is significant, when watching millions of files).

Mike R says:

June 8, 2018 at 10:48 am

Weird, when I run the inotifywait command from the article, I get this:
Couldn’t watch –e: No such file or directory

1. Mike Rogers says:
  
  June 8, 2018 at 10:50 am
  
  I should add Debian 9.4, I suppose.
  
  1. Mike Rogers says:
    
    June 8, 2018 at 10:52 am
    
    Nevermind! I typed it myself and it worked. Must be from your “make it display right in wordpress” hijinks. :D
    
rasz_pl says:

June 8, 2018 at 11:22 am

Just be warned, it doesn’t work the way you hope it works:
http://wingolog.org/archives/2018/05/21/correct-or-inotify-pick-one

Splud says:

June 8, 2018 at 11:46 am

inotify is quite handy. I employed it several years ago when hired to track down an active intrusion on a university server which had entirely too many users (and professors who delegated their accounts to assistants, and very likely did not change credentials between) – isolating the host to fix things wasn’t an option. They had WordPress too, which was a major pathway. By the time I came onboard, the intruders had many backdoors on the system, and all it took was one of them to springboard more. As I identified backdoors and disabled them, I’d add them to an inotify process I’d written so that I could more directly associate inbound traffic and see where else they’d head after finding something missing.

1. Mike Rogers says:
  
  June 8, 2018 at 12:13 pm
  
  “Whack-a-mole”-ing a compromised WordPress install is a super PITA. I tried that one time, but ended up just wiping and reverting to a backup (thankfully), then updating.
  
Ben Hamilton says:

June 15, 2020 at 7:09 pm

Al Williams, thanks for writing this. It is the most elegant solution of the ones I’ve researched. CentOS 7 packages for incron are still being maintained, btw. Last updated in March, 2019.

Randy says:

June 13, 2022 at 12:13 am

Apparently incron is deprecated in some recent distros, due to not being updated in several years, while bugs remain.
The alternative that remains seems to be systemd Path units, which at first glance seems a big step backward in ease of use.
