Related to the server build, what’s the current state of the art for user isolation in something like a backup system? Chroots are a “keep nice people separated” solution, VMs are heavy, but it seems as though something like a namespace-backed chroot would work fairly well to keep users apart from one another, especially if there are no sudo-type tools in the isolated space.
Is there anything in particular I’ve missed on this in the 10 years since I last really messed with this sort of thing?
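Roughly what I have in mind, as a sketch (jail path and user name are placeholders):

```
# one mount/PID/IPC namespace per user wrapped around a plain chroot;
# keep the jail free of setuid binaries so there's nothing to escalate with
unshare --mount --pid --ipc --fork \
    chroot /srv/jails/alice /bin/sh
```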
If you’re just talking about file storage, I believe it’s possible to set things up so that a user has SFTP access without shell access. Files and folders are available without giving the user a shell of any kind, if I understand it correctly.
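That’s the usual sshd_config arrangement, something like this (the group name is an assumption):

```
# /etc/ssh/sshd_config: SFTP only, chrooted to the user's home directory
Match Group sftponly
    ChrootDirectory %h
    ForceCommand internal-sftp
    AllowTcpForwarding no
    X11Forwarding no
```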
The problem with SFTP-only is that it eliminates the ability to use rsync. rsync is awesome for incremental backups, and I plan to have a hardlink-based rotation system that keeps old backups available for a while (think daily backups for a week, weekly backups for a month, monthly backups for a year).
I do something like that for my backups. I haven’t bothered to automate it yet (though I should) since, as you say, rsync is awesome and makes this sort of thing really easy.
I just mount an encrypted disk to /mnt/backup_drive and do approximately:
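Something along these lines (from memory, so treat it as a sketch; the key part is rsync with --link-dest against the previous dated snapshot):

```
# new dated snapshot; --link-dest hardlinks anything unchanged against the previous one
rsync -aHAX --delete \
    --link-dest=/mnt/backup_drive/2019_05_31_home_"$USER" \
    /home/"$USER"/ \
    /mnt/backup_drive/"$(date +%Y_%m_%d)"_home_"$USER"/
```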
Where 2019_05_31_home_$USER is the previous backup created the same way.
Off-site-ness/backup-redundancy:
For me this consists of rotating which of the backup drives I’m using periodically and leaving one or two of them with friends or family members, so there are always a couple of reasonably recent backups in different physical locations (but never in the “cloud”).
Yeah, something like that. I’ve done the work before (can’t find the scripts and probably don’t have rights to them anyway; it was 15 years ago) to automate that sort of rotation on a daily basis. It was just a mid-day cron job (backups ran at night) that shuffled stuff around by date and hardlinked everything.
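The shuffling doesn’t need to be fancy; a sketch of that kind of mid-day job (paths, names, and retention windows are all assumptions):

```
#!/bin/sh
BACKUPS=/mnt/backup_drive
TODAY=$(date +%Y_%m_%d)

# on Sundays, promote the daily to a weekly copy (hardlinks, so almost no extra space)
if [ "$(date +%u)" -eq 7 ]; then
    cp -al "$BACKUPS/daily_$TODAY" "$BACKUPS/weekly_$TODAY"
fi

# prune: dailies older than a week, weeklies older than roughly a month
find "$BACKUPS" -maxdepth 1 -name 'daily_*'  -mtime +7  -exec rm -rf {} +
find "$BACKUPS" -maxdepth 1 -name 'weekly_*' -mtime +31 -exec rm -rf {} +
```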
I’m debating going crazy and using zfs compression to help. It supports dedup as well, if I want to throw a ton of CPU and RAM at it, which might actually be useful for “small changes in a big file” (think Outlook PST files - they’ll be different, but the bulk of them is the same).
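Both are just per-dataset properties if I go that route; a sketch, with the pool/dataset name made up:

```
# lz4 compression is cheap and usually a net win on mixed backup data
zfs set compression=lz4 tank/backups

# dedup is the expensive one: the dedup table wants a lot of RAM to stay fast
zfs set dedup=on tank/backups
```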
I’m not terribly familiar with ZFS - how would that work for remote backups? I’m planning to base the system around rsync because it works across all major platforms, Windows included.
Saving snapshots – Use the zfs send and zfs receive commands to send and receive a ZFS snapshot. You can save incremental changes between snapshots, but you cannot restore files individually. You must restore the entire file system snapshot. These commands do not provide a complete backup solution for saving your ZFS data.
Remote replication – Use the zfs send and zfs receive commands to copy a file system from one system to another system. This process is different from a traditional volume management product that might mirror devices across a WAN. No special configuration or hardware is required. The advantage of replicating a ZFS file system is that you can re-create a file system on a storage pool on another system, and specify different levels of configuration for the newly created pool, such as RAID-Z, but with identical file system data.
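In practice that pattern looks roughly like this (host, pool, and snapshot names are made up):

```
# snapshot the filesystem, then send only the delta since the previous snapshot
zfs snapshot tank/home@2019_06_30
zfs send -i tank/home@2019_05_31 tank/home@2019_06_30 | \
    ssh backuphost zfs receive backuppool/home
```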
ZFS is CDDL licensed - and is under Oracle control these days. One good reason to avoid it even if technically it has some nice things (no thanks to Oracle, but rather Sun, who originally developed it). Red Hat’s GFS has similar technologies, but on the whole those are only useful in a large-scale NAS which provides backing stores for MQs, DBs, etc. Rsync and any ordinary file system work perfectly well (and with similar performance) for everything else.

The main benefit of ZFS/GFS/similar transactional filesystems is that you can take snapshots that are then static during the backup, and the diffs of the snapshots to each other and to the current state are all easily identifiable as entities themselves, making it very easy to synchronize while the FS is hot, and to do various tricks such as dynamic resizing and moving of storage containers around. If you do that a lot on a hot FS that needs to still be reliable, they make some sense.

But I would definitely not use them just to try to get some slightly more convenient rsync-like behaviour, as there’s not really a significant gain for remote backups in most cases (I know whereof I speak; I used to design and support globally distributed filesystems for large firms).
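For reference, the snapshot-then-sync trick is just something like this (dataset and path names are placeholders):

```
# the snapshot is immutable, so rsync sees a consistent tree even while the FS is hot
zfs snapshot tank/home@nightly
rsync -a /tank/home/.zfs/snapshot/nightly/ /mnt/backup_drive/home/
```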
The only real advantage of zfs for my use case would be file compression, but… I don’t know how compressible most people’s backup data is. Photos and such certainly aren’t, and a lot of other files are more compressed than they used to be as well. I hadn’t realized it was under Oracle’s control. Yuck.
Looking at the compression ratios per dataset, I actually get some of the best compression on “backup”, which is rsync copies of server(s) and Time Machine files; on movies and music I don’t get much at all.
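Those numbers are just the per-dataset compressratio property, e.g. (pool name is made up):

```
# read-only property reporting the achieved compression for each dataset
zfs get -r compressratio tank
```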
The main way Oracle “controls” ZFS is by steadfastly refusing to relicense it so that it could be included in the Linux kernel (the CDDL is incompatible with the GPL but compatible with the *BSD licenses) - I’m not even sure they do much of the development anymore.
I do agree that if you do NOT already use ZFS, a machine dedicated to backups is not the place to start.