Linux Fu: Send In The (Cloud) Clones

Storing data “in the cloud” — even if it is your own server — is all the rage. But many cloud solutions require you to access your files in a clumsy way using a web browser. One day, operating systems will incorporate generic cloud storage just like any other file system. But by using two tools, rclone and sshfs, you can nearly accomplish this today with a little one-time setup. There are a few limitations, but, generally, it works quite well.

It is a story as old as computing. There’s something new. Using it is exotic and requires special techniques. Then it becomes just another part of the operating system. If you go back far enough, programmers had to pull specific records from mass storage like tapes, drums, or disks and deblock data. Now you just open a file or a database. Cameras, printers, audio, and even networking once were special devices that are now commonplace. If you use Windows, for example, OneDrive is well-supported. But if you use another service, you may or may not have an easy option to just access your files as a first-class file system.

The rclone program is the Swiss Army knife of cloud storage services. Despite its name, it doesn’t have to synchronize a local file store to a remote service, although it can do that. The program works with a dizzying array of cloud storage providers and it can do simple operations like listing and copying files. It can also synchronize, as you’d expect. However, it also has an experimental FUSE filesystem that lets you mount a remote service — with varying degrees of success.

What’s Supported?

If you don’t like using someone like Google or Amazon, you can host your own cloud. In that case, you can probably use sshfs to mount a remote directory over ssh, although rclone can also do that. There are also cloud services you can self-host like OwnCloud and NextCloud. A Raspberry Pi running Docker can easily stand up one of these in a few minutes and rclone can handle these, too.

The project claims to support 33 types of systems. Some of those just serve local files but, by any count, there are at least 30. The big players like Google, Box, Dropbox, and Amazon are there. There are variations for things like Google Drive vs Google Photos. Some of the protocols are generic like SFTP, HTTP, and WebDAV, so they will work with multiple providers. Then there are lesser-known names like Tardigrade, Mega, and Hubic.

Each system has its idiosyncrasies, of course. Some file systems are case-sensitive and some are not. Some support modification time recording, but others don’t. Some are read-only and some do not support duplicate files. You can mount most of the filesystems and there are also meta systems that can show files from multiple remotes (e.g., Google Drive and Dropbox together) and other special ones that can cache another remote or split up large files.

How Does It Work?

When you set up rclone, you use the program to configure one or more remotes. The program stores the setup in ~/.config/rclone/rclone.conf although you rarely need to edit that file. Instead, you run rclone config.

From there you can see any remotes you already have and edit them or you can define new ones. Each backend provider has a slightly different setup, but you’ll generally have to provide some sort of login credentials. In many cases, the program will launch a web browser to authenticate you or allow you to grant permission for rclone to access the service.
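
Once configuration is done, a script can ask rclone which remotes it knows about. The sketch below wraps rclone listremotes, which prints one configured remote per line with a trailing colon. The helper name have_remote is made up for this example.

```shell
# have_remote NAME: succeed if NAME is a configured rclone remote.
# Accepts the name with or without the trailing colon.
have_remote() {
    # Strip any trailing colon, then add exactly one back, so both
    # "HaDFiles" and "HaDFiles:" match the "HaDFiles:" line rclone prints.
    rclone listremotes | grep -qxF -- "${1%:}:"
}
```

You might use it as have_remote HaDFiles || rclone config, so the interactive setup only starts when the remote is missing.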

Once you have a remote, you can use it with rclone. Suppose you have a Google Drive and you’ve made a remote named HaDFiles: that points to that drive. You could use commands like:

rclone ls HaDFiles:
rclone ls HaDFiles:/pending_files    # directory name, not file wildcard!
rclone copy ~/myfile.txt HaDFiles:/notes    # destination is a directory, not a file name
rclone copy HaDFiles:/notes/myfile.txt ~/
rclone sync ~/schedule HaDFiles:/schedules

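The copy command is more like a synchronization. It copies files from a source path into a destination path, and it always treats the destination as a directory. If the source is a directory, rclone copies its contents, not the directory itself. In other words, consider this command: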

rclone copy /A/B/C/d.txt remote:/X/Y/Z

This copies d.txt from /A/B/C to /X/Y/Z. It won’t copy a file that already exists on the other side unless rclone sees a difference in size, modification time, or hash, indicating the file changed. There is also a move command, as well as delete, mkdir, rmdir, and all the other things you would expect. The sync command updates the destination to match the source, deleting destination files if necessary, but not vice versa.
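
Two related habits are worth sketching here. The copyto command copies a single file to an exact destination path, which is what you want when the remote copy should have a different name. The --dry-run flag reports what a command would do without doing it, which is handy before a sync. The function names below are made up for illustration.

```shell
# put_as LOCAL_FILE REMOTE_PATH: copy one file to an exact remote path.
# Unlike "rclone copy", copyto treats the destination as a file name.
put_as() {
    rclone copyto "$1" "$2"
}

# preview_sync SRC DEST: show what sync would copy or delete at DEST
# without changing anything.
preview_sync() {
    rclone sync --dry-run "$1" "$2"
}
```

For example, put_as ~/myfile.txt HaDFiles:/notes/todo.txt renames the file on the way up, and preview_sync ~/schedule HaDFiles:/schedules lists the changes a real sync would make.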

However, what we are interested in is the mount command. On the face of it, it is simple:

rclone mount remote:/ /some_local_mount_point

Caveats

There are a few problems, though. First, the performance of some of the filesystems is pretty poor and could be even worse if you have a slow connection. This is especially bad if you have tools like a file index program (e.g., baloo) or a backup program that walks your entire file system. The best thing to do is to exclude these mount points from those programs.

Hitting the remote filesystem can be inefficient so rclone will cache file attributes for a short period of time. If a file changed on the remote side, you could get stale data and that could be bad for your data. It also caches directories, so if you are using this with multiple users, be sure to read the documentation.

You also can’t write randomly into files by default. This stops some programs like word processors from working. You can pass --vfs-cache-mode with an argument to have rclone cache files locally, which may help. There’s no free lunch, though. If you set the cache mode to full, all file operations will work, but you risk rclone not being able to move the complete file over to the remote later which, again, isn’t good for your data integrity.
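
These caching trade-offs map onto a handful of mount options. This sketch assumes the four documented --vfs-cache-mode values (off, minimal, writes, and full) and uses --dir-cache-time to shorten how long directory listings are cached. The function name and the 30s figure are illustrative, not recommendations.

```shell
# cloud_mount REMOTE DIR [MODE]: mount REMOTE on DIR in the background.
# MODE defaults to "writes", which buffers files opened for writing on disk.
cloud_mount() {
    mode="${3:-writes}"
    case "$mode" in
        off|minimal|writes|full) ;;
        *)
            echo "unknown cache mode: $mode" >&2
            return 1
            ;;
    esac
    # A short directory cache helps when other users change the remote.
    rclone mount --vfs-cache-mode "$mode" --dir-cache-time 30s "$1" "$2" &
}
```

A call such as cloud_mount HaDFiles:/ ~/cloud/googledrive full would match the mode used later in this article, while leaving the third argument off picks the more conservative default.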

Problems

If you don’t mind manually setting up things, it is really just this simple. Run a mount command, probably specifying a cache mode, and you are done. However, I wanted to mount the cloud all the time and that leads to some problems.

You can set up rclone to run as a systemd service, but that didn’t work well for me. Just putting my commands in my login profile seemed to work better. But there were two problems. First, it was wasteful to run the mount command every time I started a login shell, even if the mount was already there. Second, sometimes the network connection would drop and leave the mounted directory in some kind of zombie state. You couldn’t remount, but you also couldn’t get any files out.

The Script

The answer to my problems? Create a simple script.

#!/bin/bash
# rclonemount: mount an rclone remote unless it is already mounted

# error checking
if [ $# -ne 2 ]
then
   echo "Usage: rclonemount volume mount_point" >&2
   exit 1
fi

if ! command -v rclone >/dev/null # check we have rclone
then
   echo "Can't find rclone, exiting." >&2
   exit 3
fi

VOL="$1"
DIR="$2"

# Check if getting something out of the dir fails
# if so, maybe a stale mount so try to unmount
# (done before the directory test because a dead FUSE mount can fail that test, too)
if ! ls "$DIR/." >/dev/null 2>&1
then
   fusermount -u "$DIR"
fi

if [ ! -d "$DIR" ]
then
   echo "Mount point $DIR does not exist." >&2
   exit 2
fi

# See if directory appears to be mounted. If so, we are done
# (-F matches the path literally instead of as a regular expression)
if grep -qF -- "$DIR" /etc/mtab
then
   echo "$VOL Mounted"
else # if not, mount it
   echo "Mounting $VOL"
   rclone mount --vfs-cache-mode full "$VOL"/ "$DIR" & # run in background
fi
exit 0  # we tried

In my shell startup, I simply call this script once for each remote and mount them to things like ~/cloud/googledrive or ~/cloud/dropbox. You could also run the script as a user service for systemd, of course.
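
In a startup file, the per-remote calls can be collapsed into a small loop. This is only a sketch: it assumes the script is installed on your PATH as rclonemount (the name in its usage message), that the mount directories already exist, and that your remotes are named googledrive: and dropbox:. Substitute your own names.

```shell
# mount_clouds BASE REMOTE...: mount each remote under BASE/<name>.
# Example: mount_clouds ~/cloud googledrive: dropbox:
mount_clouds() {
    base="$1"
    shift
    for remote in "$@"; do
        # Drop the trailing colon to get the directory name.
        rclonemount "$remote" "$base/${remote%:}"
    done
}
```

Because the script already returns quickly when a mount is present, calling this from every login shell costs little.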

There is one caveat. One day, one of your remotes will fail to mount. You have to remember you probably need to run the configuration again to reauthorize the connection to the remote service or change the password if you recently changed it. The error messages won’t make that clear.
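
One way to make that failure more obvious is to test the remote with a cheap listing before mounting. The sketch below uses rclone lsd and, if it fails, suggests rclone config reconnect, which repeats the authorization step for remotes that use a browser login. For other backends you would run rclone config instead. The function name is hypothetical.

```shell
# check_remote REMOTE: try a cheap listing and suggest a fix if it fails.
check_remote() {
    if rclone lsd "$1" >/dev/null 2>&1; then
        echo "$1 is reachable"
    else
        echo "$1 failed; try: rclone config reconnect $1" >&2
        return 1
    fi
}
```

A failed listing can also mean the network is down, so treat the hint as a first thing to try rather than a diagnosis.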

You can use the same general script with sshfs instead of rclone, and rclone can mount over SSH, too. Pick your poison.
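
For the sshfs route, the mount line looks a little different. This is a minimal sketch with made-up function names. Unlike rclone mount, sshfs puts itself in the background by default, so there is no trailing ampersand, and the reconnect option asks it to re-establish a dropped connection.

```shell
# ssh_mount USER@HOST:PATH DIR: mount a remote directory over SSH.
ssh_mount() {
    sshfs -o reconnect "$1" "$2"
}

# cloud_umount DIR: unmount a FUSE mount (sshfs or rclone) as a regular user.
cloud_umount() {
    fusermount -u "$1"
}
```

Something like ssh_mount me@myserver:/srv/files ~/cloud/server would do the job, with the host and paths here being placeholders.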

Head in the Clouds

I have my own WebDAV server and having it simply look like a directory on all my machines is really handy. I’ll admit that I enjoy having Google Photos mapped to my filesystem, too. The scope of rclone is very impressive and it seems to have kept up with the various changes that the remote services seem to make every few months that often break tools like these. Overall, it is a good tool to have in your Linux box.

I haven’t tried it, but apparently rclone will also work on other platforms including BSD, macOS, and even Windows, where it looks like it mounts a drive letter. If you try it, let us know!

17 thoughts on “Linux Fu: Send In The (Cloud) Clones”

    1. Still use NFS, but in the home environment. Tis how the Linux laptops, desktops connect to the home Linux server. I still don’t get why people ‘trust’ their files to the cloud and probably pay for the privilege too…. When a simple backup and off site backup will do.

      1. Uhm, no, it won’t. That is, it might work just fine for some purposes, including YOUR purposes. But it certainly does not work to allow me to access the same core group of files from my home desktop Linux box, my home desktop Windows machine, my work Mac laptop (whether using it at home or in the office, when that was still a thing,) my iPhone or my iPad. (I don’t use an Android device but the files would be available there as well if I did.) Edit on one device, the changes are instantly available on any other device. As a writer, that is incredibly convenient. It’s equally convenient for things like grocery shopping lists.

        In my case, I have a rented Linux VM sitting in a server farm somewhere running the open source Nextcloud service (which I think is easier and far superior to the method listed here for most purposes.) Most people are not geeks, however, and are going to use one of the commercial offerings because they simply offer features and capabilities that NFS and offsite backup do not.

  1. I haven’t used, but looks interesting. I use Syncthing. Its greatest advantage is the Android app that lets me sync my mobile. The app can use Tor, so with a hidden service on my PC I can sync quite securely even when I am not at home.

  2. rclone vs rsync

    They are very similar at heart, however rsync supports sending binary diffs which rclone doesn’t, and rclone supports a whole range of cloud providers which rsync doesn’t. orschiro: Why does Rclone support the local file system if that is already offered by Rsync? Rclone copies between remotes.

  3. I really hate cloud and SaaS. Apparently programmers love it because it’s a steady job, and big companies save a lot by outsourcing, but it’s terrible for home users, freelancers (so glad I’m not an independent contractor anymore!), and small companies, where unreliable WiFi is common and $10 a month subscriptions are a measurable percentage of your spare cash at the end of the month.

    Cloud does exactly two things well. Off-site backup and Sync. Both of which can be done just as well by offline-first software.

    1. You forgot about netflix and other online services that use the cloud extensively for capacity management. Nobody gets “slashdotted” any more, they just fire up some more docker instances on AWS.

    2. There is a silver lining in that the proliferation of cloud software means that your OS doesn’t matter much anymore. Maybe this is what finally took away the >95% market share Windows had in the 90s. Now turning EVERYTHING into subscription software, that gets my blood boiling….

  4. The web GUI is very easy to use too, was quick to set up a local mount of the Gdrive backup of what is on my Android phone. As soon as there is new content generated by my phone and Gdrive syncs then I also have it on my Linux desktop. From the phone side of things everything can be accessed using Ghost Commander.

    1. I use SyncThing to keep some directories together where I want local access. For example, pictures on my phone and my computers. Configuration files. But rclone is great when I just want to browse my entire file system from another computer. I don’t want a whole copy. I just want to mount it.
