Linux Fu: Send In The (Cloud) Clones

Storing data “in the cloud” — even if it is your own server — is all the rage. But many cloud solutions require you to access your files in a clumsy way using a web browser. One day, operating systems will incorporate generic cloud storage just like any other file system. But by using two tools, rclone and sshfs, you can nearly accomplish this today with a little one-time setup. There are a few limitations, but, generally, it works quite well.

It is a story as old as computing. There’s something new. Using it is exotic and requires special techniques. Then it becomes just another part of the operating system. If you go back far enough, programmers had to pull specific records from mass storage like tapes, drums, or disks and deblock data. Now you just open a file or a database. Cameras, printers, audio, and even networking once were special devices that are now commonplace. If you use Windows, for example, OneDrive is well-supported. But if you use another service, you may or may not have an easy option to just access your files as a first-class file system.

The rclone program is the Swiss Army knife of cloud storage services. Despite its name, it doesn’t have to synchronize a local file store to a remote service, although it can do that. The program works with a dizzying array of cloud storage providers and it can do simple operations like listing and copying files. It can also synchronize, as you’d expect. However, it also has an experimental FUSE filesystem that lets you mount a remote service — with varying degrees of success.

What’s Supported?

If you don’t like using a provider like Google or Amazon, you can host your own cloud. In that case, you can probably use sshfs to mount a remote directory over ssh, although rclone can also do that. There are also cloud services you can self-host, like ownCloud and Nextcloud. A Raspberry Pi running Docker can stand up one of these in a few minutes, and rclone can handle them, too.
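As a rough sketch, assuming you want Nextcloud and use the official Docker image, something like this gets a test instance going (the container name, port, and volume name are just illustrative choices):

docker run -d --name nextcloud -p 8080:80 -v nextcloud_data:/var/www/html nextcloud
# then point a browser at http://<your-pi>:8080 to finish setup; rclone can talk to it over WebDAV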

The project claims to support 33 types of systems, although a few of those really just serve local files; by any count, though, there are at least 30 genuine cloud backends. The big players like Google, Box, Dropbox, and Amazon are there. There are variations for things like Google Drive vs Google Photos. Some of the protocols are generic, like SFTP, HTTP, and WebDAV, so they will work with multiple providers. Then there are lesser-known names like Tardigrade, Mega, and Hubic.

Each system has its idiosyncrasies, of course. Some filesystems are case-sensitive and some are not. Some record modification times, but others don’t. Some are read-only, and some do not support duplicate files. You can mount most of the filesystems, and there are also meta systems that can show files from multiple remotes (e.g., Google Drive and Dropbox together) and other special ones that can cache another remote or split up large files.

How Does It Work?

When you set up rclone, you use the program to configure one or more remotes. The program stores the setup in ~/.config/rclone/rclone.conf, although you rarely need to edit that file directly. Instead, you run rclone config.

From there you can see any remotes you already have and edit them, or you can define new ones. Each backend provider has a slightly different setup, but you’ll generally have to provide some sort of login credentials. In many cases, the program will launch a web browser so you can authenticate and grant rclone permission to access the service.
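As a rough illustration, a Google Drive remote set up this way ends up as a short stanza in the config file. The remote name below reuses the hypothetical HaDFiles example, and the values shown in the comments are only placeholders that the browser-based authorization fills in for you:

rclone config                       # interactive: choose n) New remote, pick a name and a backend
cat ~/.config/rclone/rclone.conf    # the result looks roughly like this:
# [HaDFiles]
# type = drive
# scope = drive
# token = {"access_token":"...","refresh_token":"...","expiry":"..."}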

Once you have a remote, you can use it with rclone. Suppose you have a Google Drive and you’ve made a remote named HaDFiles: that points to that drive. You could use commands like:

rclone ls HaDFiles:
rclone ls HaDFiles:/pending_files    # directory name, not file wildcard!
rclone copy ~/myfile.txt HaDFiles:/notes
rclone copy HaDFiles:/notes/myfile.txt ~/
rclone sync ~/schedule HaDFiles:/schedules

The copy is more like a synchronization: the file is copied from one path to another, and the destination is always a directory, so you don’t copy to a file name. In other words, consider this command:

rclone copy /A/B/C/d.txt remote:/X/Y/Z

This copies d.txt from /A/B/C into /X/Y/Z. It won’t copy a file that already exists on the other side unless rclone decides it has changed (by size, modification time, or hash, depending on the backend). There is also a move command, as well as delete, mkdir, rmdir, and all the other things you would expect, as shown below. The sync command updates the destination to match the source, but not vice versa.
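The other commands follow the same pattern. These examples reuse the hypothetical HaDFiles: remote, and the paths are made up for illustration:

rclone move ~/outbox HaDFiles:/archive       # copy, then delete the source files
rclone delete HaDFiles:/archive/old          # delete the files under a remote path
rclone mkdir HaDFiles:/projects              # create a directory on the remote
rclone rmdir HaDFiles:/projects              # remove an empty remote directory
rclone sync ~/schedule HaDFiles:/schedules   # make the destination match the source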

However, what we are interested in is the mount command. On the face of it, it is simple:

rclone mount remote:/ /some_local_mount_point
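By default the mount stays in the foreground until you stop it, so you either background it or pass rclone’s --daemon flag; since it is a FUSE mount, you detach it the usual way:

rclone mount remote:/ /some_local_mount_point --daemon   # or append & to background it yourself
fusermount -u /some_local_mount_point                    # unmount when you are done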

Caveats

There are a few problems, though. First, the performance of some of the filesystems is pretty poor and could be even worse if you have a slow connection. This is especially bad if you have tools like a file index program (e.g., baloo) or a backup program that walks your entire file system. The best thing to do is to exclude these mount points from those programs.
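On a KDE system, for example, something along these lines should keep baloo out of a cloud mount (the ~/cloud path is just an example, and the exact balooctl syntax may vary by version):

balooctl config add excludeFolders $HOME/cloud   # tell the indexer to skip the cloud mounts
balooctl restart                                 # restart indexing with the new configuration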

Hitting the remote filesystem for every operation would be inefficient, so rclone caches file attributes for a short period of time. If a file changes on the remote side during that window, you can end up working with stale information, and that can be bad for your data. It also caches directory listings, so if you use a remote from multiple machines or users, be sure to read the documentation.
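The knobs for this on rclone mount are the attribute and directory cache timeouts; shorter values mean fresher metadata at the cost of more round trips to the remote (the remote and mount point here are just examples):

rclone mount HaDFiles:/ ~/cloud/googledrive --attr-timeout 1s --dir-cache-time 30s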

You also can’t write randomly into the middle of files by default, which stops some programs, like word processors, from working. You can pass --vfs-cache-mode with an argument to make rclone cache the file locally, which may help. There’s no free lunch, though. If you set the cache mode to full, all file operations will work, but you risk rclone not being able to move the complete file over to the remote later, which, again, isn’t good for your data integrity.
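For example, assuming the same HaDFiles: remote, the writes mode caches only files opened for writing while reads still stream from the remote, and full caches everything and uploads changed files after they are closed:

rclone mount HaDFiles:/ ~/cloud/googledrive --vfs-cache-mode writes
rclone mount HaDFiles:/ ~/cloud/googledrive --vfs-cache-mode full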

Problems

If you don’t mind setting things up manually, it really is just that simple. Run a mount command, probably specifying a cache mode, and you are done. However, I wanted the cloud mounted all the time, and that leads to some problems.

You can set up rclone to run as a systemd service, but that didn’t work well for me. Just putting my commands in my login profile seemed to work better, but there were two problems. First, it was wasteful to call rclone every time I ran a login shell, even if the mount was already there. Second, sometimes the network connection would drop and the mounted directory would be left in a sort of zombie state: you couldn’t remount it, but you also couldn’t get any files out of it.

The Script

The answer to my problems? Create a simple script.

#!/bin/bash
# error checking
if [ $# != 2 ]
then
   cat <<EOF
Usage: rclonemount volume mount_point
EOF
   exit 1
fi

if ! which rclone >/dev/null   # check we have rclone
then
   echo "Can't find rclone, exiting."
   exit 3
fi

if [ ! -d "$2" ]
then
   echo "Mount point $2 does not exist."
   exit 2
fi

VOL="$1"
DIR="$2"

# Check if getting something out of the dir fails
# if so, maybe a stale mount, so try to unmount
if ! ls "$DIR/." >/dev/null
then
   fusermount -u "$DIR"
fi
# See if directory appears to be mounted. If so, we are done
if grep "$DIR" /etc/mtab >/dev/null
then
   echo "$VOL" mounted
else # if not, mount it
   echo Mounting "$VOL"
   rclone mount --vfs-cache-mode full "$VOL"/ "$DIR" &   # run in background
fi
exit 0   # we tried

In my shell startup, I simply call this script once for each remote and mount them to things like ~/cloud/googledrive or ~/cloud/dropbox. You could also run the script as a user service for systemd, of course.
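Assuming you save the script as rclonemount somewhere on your PATH, the startup lines are just calls like these (HaDFiles: is the example remote from earlier; the dropbox: remote name is made up):

rclonemount HaDFiles: ~/cloud/googledrive
rclonemount dropbox: ~/cloud/dropbox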

There is one caveat. One day, one of your remotes will fail to mount. When that happens, remember that you probably need to run rclone config again to reauthorize the connection to the remote service, or to update the stored password if you recently changed it. The error messages won’t make that clear.

You can use the same general script with sshfs instead of rclone, and rclone can mount over SSH, too. Pick your poison.
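For a plain ssh server, the equivalent mounts look something like this; the host, user, and paths are placeholders, and the sftp remote is one you would define first with rclone config:

sshfs me@myserver:/home/me/files ~/cloud/myserver                               # classic sshfs
rclone mount myserver:/home/me/files ~/cloud/myserver --vfs-cache-mode writes   # rclone's sftp backend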

Head in the Clouds

I have my own WebDAV server, and having it simply look like a directory on all my machines is really handy. I’ll admit that I enjoy having Google Photos mapped into my filesystem, too. The scope of rclone is very impressive, and it seems to have kept up with the changes the remote services make every few months that often break tools like this. Overall, it is a good tool to have in your Linux box.

I haven’t tried it, but apparently rclone also works on other platforms, including BSD, macOS, and even Windows, where it looks like it mounts a drive letter. If you try it, let us know!
