I have been hosting Nextcloud (and previously Owncloud) for more than 8 years, both for myself and for customers. During that time, backups have saved me countless times and I have continuously optimized my approach to them. In this series, I'll share what I've learned and the resulting solution for Nextcloud backups, which, at least for me, leaves nothing to be desired.

This first part covers the considerations behind my backup design. Stay tuned for the next article to learn how to implement a backup process that fulfills the requirements discussed here.

What actually makes a great backup?

Before jumping into setting up our backup process, let's take a second to think about what backups are actually supposed to achieve.

The main purpose of backups

Backups are our last line of defense against data loss. Therefore, in order to figure out whether they serve their purpose, we have to think about the causes of data loss that we need to take into account.

Data loss by accident

8 years of hosting experience definitely means one thing: I have fucked up more than once. Be it a corrupted copy during an update, a misconfigured Nextcloud client or a lost encryption key. It might be embarrassing, but pretending that those things don't happen would just be plain dangerous. And it might not even be us - users can accidentally delete their own important data as well. Luckily, if we play our cards right and have strong backups, we can walk away from such situations with nothing worse than a black eye.

Data loss from hardware failure

A manufacturing issue, a power surge or just normal disk degradation - hardware failures are just a matter of time when running a server. And once they affect our disks, we are at risk of data loss. While we can (and definitely should!) protect ourselves from some cases of disk failure with disk health monitoring and RAID setups, we can't rule them out entirely. A failing power supply, in particular, can damage multiple disks at once. So we want backups that cannot be affected by hardware failures in our server.

Data loss from natural disaster or theft

Wherever our server is located, there's probably a nonzero chance of "natural" disasters like fire, flood, or similar. And then, depending on how well we're able to protect our server physically, there's also the chance of vandalism or theft. Mine is sitting in my flat, for example, so that's definitely a scenario I need to prepare for. Backups are part of the solution here (accompanied, of course, by strong disk encryption and information security).

Data loss caused by malicious actors

So, now we're talking about the other kind of malicious actor - not the kind that breaks down your front door and leaves with your server under their arm, but attackers who compromise our server by exploiting security vulnerabilities in our setup. As demonstrated by the xz utils backdoor, the Log4Shell vulnerability in a popular Java logging library or the regreSSHion vulnerability in OpenSSH, it's not possible to reliably rule out that our server will be vulnerable at some point. A compromised server can be abused in many ways - harvesting credentials to impersonate users, sending spam mail, collecting or leaking valuable data - and all of those are bad.

However, there's one specific attack that caused an estimated 40-50 billion dollars in damages in 2024 and can be countered with backups. I am, of course, talking about ransomware, where attackers encrypt our data and try to blackmail us into paying a ransom to get it back. If we set up our backups with this scenario in mind, we can simply purge our server, set it up from scratch and restore a backup - annoying, but by far better than paying the ransom. For completeness's sake, though, I want to mention that ransomware attacks often also include the risk of leaking sensitive data, which backups can't protect us from.

Reliable Recovery Process

Of course, any backup is only as good as our ability to restore it. So when establishing a backup process, knowing exactly how to restore from it is mandatory - as is ensuring we will still have that knowledge when we actually need it.

Regular backups + retention

We want a satisfactory backup frequency as well as a sufficient retention period. The former defines the maximum time span we can lose (the time since the last backup), while the latter defines how much time we have to realize that something is wrong with our data before losing the chance to correct it. Taken together, they define the absolute number of backups we want to keep. In a naive approach, where disk space scales linearly with the number of backups, this quickly becomes a tradeoff between expensive storage and recoverability. However, I will present options to keep many backups with minimal storage costs.
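
To get a feeling for the numbers: tiered retention schemes, which many backup tools offer as "keep N daily/weekly/monthly backups" options, cover a long time window with surprisingly few backups. Here's a minimal sketch, with hypothetical policy numbers:

```python
from datetime import date, timedelta

# Hypothetical policy numbers: how many of the most recent daily,
# weekly and monthly backups to keep.
KEEP_DAILY, KEEP_WEEKLY, KEEP_MONTHLY = 7, 4, 12

def retained(backups):
    """Return the backup dates kept under the tiered retention policy."""
    backups = sorted(backups, reverse=True)  # newest first
    daily = backups[:KEEP_DAILY]
    weekly, monthly = [], []
    seen_weeks, seen_months = set(), set()
    for d in backups:
        week = d.isocalendar()[:2]  # (ISO year, ISO week)
        month = (d.year, d.month)
        # Since we iterate newest-first, the first backup we see in a
        # given week/month is the newest one of that week/month.
        if week not in seen_weeks and len(weekly) < KEEP_WEEKLY:
            seen_weeks.add(week)
            weekly.append(d)
        if month not in seen_months and len(monthly) < KEEP_MONTHLY:
            seen_months.add(month)
            monthly.append(d)
    return sorted(set(daily + weekly + monthly))

# One backup per day for a year: 365 backups in the naive approach,
# but only about 20 under the tiered policy - still spanning 12 months.
year = [date(2024, 1, 1) + timedelta(days=i) for i in range(365)]
print(f"{len(retained(year))} of {len(year)} backups retained")
```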

Low storage costs

As mentioned before, we ideally want to keep many backups, so we need to keep an eye on our storage costs.

Zero downtime backups

As we want regular backups, we need backups that don't require service downtime (if we back up more than once per day, we would have a hard time explaining to our users why the service keeps becoming unavailable).
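
How to do that depends on the setup, and my own approach follows in the next part. Just as an illustration, one common technique is to back up from a short-lived filesystem snapshot instead of the live data. A minimal sketch, assuming the data lives on an LVM volume - all volume, mount point and function names below are hypothetical:

```python
import subprocess

def backup_files(path):
    # Placeholder: hand the snapshot path to your backup tool of choice.
    print(f"backing up {path}")

def zero_downtime_backup():
    # Create a copy-on-write snapshot of the (hypothetical) data volume;
    # the live service keeps running while we work on the frozen copy.
    subprocess.run(["lvcreate", "--snapshot", "--name", "nc-snap",
                    "--size", "5G", "/dev/vg0/nextcloud-data"], check=True)
    try:
        # The mount point is assumed to exist.
        subprocess.run(["mount", "/dev/vg0/nc-snap", "/mnt/nc-snap"],
                       check=True)
        try:
            # The snapshot is a consistent point-in-time view of the
            # files, so we can take our time without blocking users.
            backup_files("/mnt/nc-snap")
        finally:
            subprocess.run(["umount", "/mnt/nc-snap"], check=True)
    finally:
        subprocess.run(["lvremove", "--yes", "/dev/vg0/nc-snap"],
                       check=True)
```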

Functional backups

When we need our backups, they should be present, complete and working, so we need ways to ensure that. This might sound trivial, but right after a data loss is an inconvenient time to notice that your backup process has had an issue - and that is exactly when we will notice it if we don't take precautions.
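
One simple precaution, as an illustration: an automated freshness check that raises an alert when the newest backup is missing or overdue. A minimal sketch, with hypothetical paths, file naming and threshold:

```python
import sys
import time
from pathlib import Path

# Hypothetical backup location and naming scheme.
BACKUP_DIR = Path("/mnt/backups/nextcloud")
MAX_AGE_HOURS = 26  # a daily backup job plus some slack

backups = sorted(BACKUP_DIR.glob("*.tar.zst"),
                 key=lambda p: p.stat().st_mtime)
if not backups:
    sys.exit("ALERT: no backups found at all")

age_hours = (time.time() - backups[-1].stat().st_mtime) / 3600
if age_hours > MAX_AGE_HOURS:
    sys.exit(f"ALERT: newest backup is {age_hours:.1f} hours old")

print(f"OK: newest backup is {age_hours:.1f} hours old")
```

Of course, a check like this only proves that backups are present and fresh; to know they actually work, there is no way around periodic test restores.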

Don't leave the back door open

Having backups should not increase our risk of data leaks. Therefore, we need to make sure that the backups are at least as secure as the service data itself.

Summary

Alright, let's wrap up our goals:

- Cover all causes of data loss: accidents, hardware failure, natural disaster or theft, and malicious actors like ransomware
- A reliable, well-practiced recovery process
- Regular backups with a sufficient retention period
- Low storage costs
- Zero downtime during backup creation
- Backups that are verified to be present, complete and working
- Backups at least as secure as the service data itself

That's certainly a lot of boxes to tick. A detailed walkthrough of my approach to accomplishing all of this will be the topic of the next part in this series.

You are welcome to subscribe to this blog to make sure you won't miss it. :)