How do you effectively backup your high capacity (20+ TB) local NAS?
from NekoKoneko@lemmy.world to selfhosted@lemmy.world on 26 Feb 07:26
https://lemmy.world/post/43604046

I have a 56 TB local Unraid NAS that is parity protected against single drive failure, and while I think a single drive failing and being parity recovered covers data loss 95% of the time, I’m always concerned about two drives failing or a site-/system-wide disaster that takes out the whole NAS.

For other larger local hosters who are smarter and more prepared, what do you do? Do you sync it off site? How do you deal with cost and bandwidth needs if so? What other backup strategies do you use?

(Sorry if this standard scenario has been discussed - searching didn’t turn up anything.)

#selfhosted

Shadow@lemmy.ca on 26 Feb 07:29 next collapse

I don’t. Of my 120tb, I only care about the 4tb of personal data and I push that to a cloud backup. The rest can just be downloaded again.

NekoKoneko@lemmy.world on 26 Feb 07:32 next collapse

Do you have logs or software that keeps track of what you need to redownload? A big stress for me with that method is remembering or keeping track of what was lost when neither I nor any software can even see the filesystem anymore.

Sibbo@sopuli.xyz on 26 Feb 07:34 next collapse

If you can’t remember what you lost, did you really need it to begin with?

Unless it’s personal memories of course.

NekoKoneko@lemmy.world on 26 Feb 07:49 next collapse

For me, I have a bad memory. I might remember a childhood movie (a nickname I give to special Linux ISOs) that I hadn’t even thought of for 10 years and track down a copy, sometimes excavating obscure sources - that can be hours of one-time inspiration and work, repeated many times over. Having a complete list is a good helper, but a full backup of course is best.

Onomatopoeia@lemmy.cafe on 26 Feb 08:16 collapse

I can’t remember the name of an Excel spreadsheet I created years ago, which has continually matured through lots of changes. I often have to search for it among the many I have for different purposes.

Trusting your memory is a naive, amateur approach.

frongt@lemmy.zip on 26 Feb 08:29 next collapse

So you do remember that you have several frequently-used spreadsheets.

ExcessShiv@lemmy.dbzer0.com on 26 Feb 09:54 next collapse

The key here being that you actually remember the file exists, because it’s important. Some other random spreadsheet you don’t even remember exists, because you haven’t needed it since forever, is probably not all that important to back up.

If you lose something without ever realizing you lost it, it was not important, so there would be no reason to make a backup.

cenzorrll@piefed.ca on 26 Feb 11:01 next collapse

You put that with everything else similar into a folder, which is backed up. Mine is called “Files”. If there’s something in there that I don’t need backed up, it still gets backed up. If there’s something very large in there that I don’t need backed up, it gets removed in one of my “oh shit these backups are huge” purges.

three@lemmy.zip on 26 Feb 14:37 next collapse

Psst, you missed the point and need to re-read the thread.

a_non_monotonic_function@lemmy.world on 26 Feb 15:29 collapse

If the spreadsheet is important it sounds like it would be part of the 4 TB that was backed up.

kurotora@lemmy.world on 26 Feb 07:40 next collapse

In my case, for Linux ISOs, I only need to log in to my usual private trackers and re-download my leeched torrents. For more niche content, like old-school TV shows in my local language, I would rely on the community. For even more niche content, like tankoubons only available at the time on DD services, I have a specific backup job, but it relies on the same backup provider that I’m using for personal data.

Also, as it’s important to remind everyone: you must encrypt your backups no matter where you store them.
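A minimal sketch of that advice (the directory name, filename, and passphrase below are made-up placeholders): piping tar straight into gpg means an unencrypted archive never touches disk.

```shell
set -e

# Demo in a throwaway directory; in real use, point tar at the data you
# want backed up and use a passphrase from your password manager.
work=$(mktemp -d)
mkdir -p "$work/docs"
echo "tax stuff" > "$work/docs/2024.txt"

# Compress and symmetrically encrypt in one pipeline.
tar -C "$work" -cz docs \
  | gpg --batch --yes --pinentry-mode loopback \
        --passphrase "use-a-real-passphrase" \
        --symmetric --cipher-algo AES256 \
        -o "$work/docs.tar.gz.gpg"

# Restore test: decrypt and unpack into a scratch dir to verify the
# round trip before trusting the archive.
mkdir "$work/restore"
gpg --batch --yes --pinentry-mode loopback \
    --passphrase "use-a-real-passphrase" \
    --decrypt "$work/docs.tar.gz.gpg" \
  | tar -C "$work/restore" -xz
```

The same pattern works in front of any upload tool; dedicated backup tools like restic, borg, or Kopia do the encryption (plus deduplication) for you.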

ShortN0te@lemmy.ml on 26 Feb 07:42 next collapse

That should be part of the backup configuration. You select what you back up in the backup tool of your choice. When you lose your array, you then download that stuff again.

i_stole_ur_taco@lemmy.ca on 26 Feb 08:20 next collapse

Set up a job to write the file names of everything in your file system to a text file and make sure that text file gets backed up. I did that on my Unraid server for years in lieu of fully backing up the whole array.

whyNotSquirrel@sh.itjust.works on 26 Feb 08:37 next collapse

servarr* and jellyfin are managing my movies and tv-shows

BakedCatboy@lemmy.ml on 26 Feb 08:55 next collapse

My *arrstack DBs are part of my backed up portion, so they’ll remember what I have downloaded in my non-backed up portion.

NekoKoneko@lemmy.world on 26 Feb 09:07 collapse

That’s a great point.

tal@lemmy.today on 26 Feb 09:56 collapse

I don’t know of a pre-wrapped utility to do that, but assuming that this is a Linux system, here’s a simple bash script that’d do it.

#!/bin/bash

# Set this.  Path to a new, not-yet-existing directory that will retain a copy
# of a list of your files.  You probably don't actually want this in /tmp, or
# it'll be wiped on reboot.

file_list_location=/tmp/storage-history

# Set this.  Path to location with files that you want to monitor.

path_to_monitor=path-to-monitor

# If the file list location doesn't yet exist, create it.  Init with an
# explicit branch name, since newer git may default to "main".
if [[ ! -d "$file_list_location" ]]; then
    mkdir -p "$file_list_location"
    git -C "$file_list_location" init -b master
fi

# In case someone's checked out things at a different time.  This fails
# harmlessly on the very first run, before any commit exists.
git -C "$file_list_location" checkout master 2>/dev/null || true

find "$path_to_monitor" | sort > "$file_list_location/files.txt"
git -C "$file_list_location" add files.txt
# The commit exits nonzero if nothing changed since the last run; don't
# let that abort the script (or generate cron mail).
git -C "$file_list_location" commit -m "Updated file list for $(date)" || true

That’ll drop a text file at /tmp/storage-history/files.txt with a list of the files at that location, and create a git repo at /tmp/storage-history that will contain a history of that file.

When your drive array kerplodes or something, your files.txt file will probably become empty if the mount goes away, but you’ll have a git repository containing a full history of your list of files, so you can go back to a list of the files there as they existed at any historical date.

Run that script nightly out of your crontab or something ($ crontab -e to edit your crontab).

As the script says, you need to choose a file_list_location (not /tmp, since that’ll be wiped on reboot), and set path_to_monitor to wherever the tree of files is that you want to keep track of (like, /mnt/file_array or whatever).

You could save a bit of space by adding a line at the end to remove the current files.txt after generating the current git commit, if you want. The next run will just regenerate files.txt anyway, and you can use git to regenerate a copy of the file for any historical day you want. If you’re not familiar with git: $ git log to find the hashref for a given day, then $ git checkout <hashref> to move to where things were on that day.

EDIT: Moved the git checkout up.

NekoKoneko@lemmy.world on 26 Feb 10:22 collapse

That’s incredibly helpful and informative, a great read. Thanks so much!

zorflieg@lemmy.world on 26 Feb 12:58 collapse

Abefinder/NeoFinder is great for cataloging, but it costs money. If you do a limited backup, it’s good to know what you had. I use tape formatted to LTFS, and NeoFinder catalogs both the source and the finished tape.

hendrik@palaver.p3x.de on 26 Feb 07:45 next collapse

I follow a similar strategy. I back up my important stuff. And I’m gonna have to re-rip my DVD collection and redownload the Linux ISOs in the unlikely case the RAID falls apart. That massively cuts down on the amount of storage needed.

givesomefucks@lemmy.world on 26 Feb 07:54 next collapse

I only care about the 4tb of personal data and I push that to a cloud backup

I have doubles of the data. Some of 'em. That way I know I have a pristine one in backup. Then I can use it, it gets corrupted, I don’t care.

Actually, I have triples of the W2s. I have triples, right? If I don’t, the other stuff’s not true.

See, the W2s the one I have triples of. Oh, no, actually, I also have triples of the kids photos, too. But just those two. And your dad and I are the same age, and I’m rich and I have triples of the W2s and the kids photos.

Triples makes it safe.

Triples is best.

www.youtube.com/watch?v=8Inf1Yz_fgk

NekoKoneko@lemmy.world on 26 Feb 10:26 collapse

Bob Odenkirk has never steered us wrong, thanks. I downloaded three copies of this from YouTube in case I forget.

BakedCatboy@lemmy.ml on 26 Feb 08:54 collapse

Same here, ~30TB currently, but my personal artifacts portion is only like 2TB, which is very affordable with rsync.net. It conveniently has an alert setting for when less than X KB has changed in Y days. (I have my Synology set up to spit out daily security reports to meet that amount, so even if I don’t change anything myself I won’t get bugged.)

OR3X@lemmy.world on 26 Feb 07:37 next collapse

So you have 56TB of total storage, but how much of that 56TB is actually used? Take the amount of storage used and add 10-12% to that figure. Now you create a new NAS (preferably off-site) with that amount of storage and that becomes your backup target. Take an initial backup (locally if possible to speed up the process) and then you can use something like rsync to create incremental backups going forward. This is the method I’ve used and so far it has worked out well. I target 10-12% more than the amount of used storage for my backup capacity because my storage use grows reasonably slowly. If your usage grows faster you might want to increase your “buffer” a little more so that you’re not having to constantly add drives to your backup target.

NekoKoneko@lemmy.world on 26 Feb 07:46 collapse

Yeah, this is certainly a viable “brute-force”-ish option. While I have 56, I’m only using 26 or so. But I’d actually be hesitant to do anything less than a full-capacity mirror because I do expect to eventually use it all (and more - adding drives to Unraid).

I’ve balked because of cost and upkeep (maintaining the same capacity, additional chances for drive failure, two separate sites I need physical access to with a high bandwidth connection), so I admit I was hoping I was missing an easier option.

OR3X@lemmy.world on 26 Feb 08:09 collapse

I mean, if you want a full mirror, rolling your own backup target is going to be the cheapest option even with the current high price of hardware. Other options are cloud storage, or another medium like tape. Cloud storage is of course an ongoing cost, which rules it out for me, not to mention privacy concerns. There are certain “cold storage” options from cloud storage hosts which are considerably cheaper, but they have limitations on how the data can be accessed and how often. The tape route is possible, but it’s not really viable for home use due to the high upfront cost of the drives. Outside of that, backing up a subset of your storage as others have suggested is the only other option. Creating viable backups without breaking the bank is a challenge as old as computers, unfortunately.

Yorick@piefed.social on 26 Feb 07:37 next collapse

I have two 500GB SSDs in RAID1 for important data, TrueNAS apps, etc., then 32TB total in RAIDZ1 for large datasets that don’t need speed (movies, TV shows, music, pictures, archives, …).

If I have a complete NAS failure, a remote backup of the SSDs and boot drive (made weekly via rsync to a friend’s NAS) can be used in a new system, and my torrent app has the list and magnets of all torrents stored on the SSD so it can re-download them.

originalucifer@moist.catsweat.com on 26 Feb 07:42 next collapse

entire nas (~24TB used) is replicated to another nas in another building (2 actually). i like having 3 copies.

iamthetot@piefed.ca on 26 Feb 07:50 next collapse

The stuff that I actually care about is automatically backed up twice, once to a simple external drive on site and once to a cloud. The cloud rotates between the most recent backups so it never takes up more than 1TB compressed, while the local external keeps backups for much longer (something like 6TB at a time).

Decronym@lemmy.decronym.xyz on 26 Feb 07:51 next collapse

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:

Fewer Letters More Letters
Git Popular version control system, primarily for code
HTTP Hypertext Transfer Protocol, the Web
HTTPS HTTP over SSL
NAS Network-Attached Storage
RAID Redundant Array of Independent Disks for mass storage
SSD Solid State Drive mass storage
SSL Secure Sockets Layer, for transparent encryption
VNC Virtual Network Computing for remote desktop access
VPN Virtual Private Network
ZFS Solaris/Linux filesystem focusing on data integrity

[Thread #119 for this comm, first seen 26th Feb 2026, 15:51] [FAQ] [Full list] [Contact] [Source code]

MentalEdge@sopuli.xyz on 26 Feb 08:04 next collapse

Recently helped someone get set up with Backblaze B2 using Kopia, which turned out fairly affordable. It compresses and deduplicates, leading to very little storage use, and it encrypts so that Backblaze can’t read the data.

Kopia connects to it directly. To restore, you just install Kopia again and enter the same connection credentials to access the backup repository.

My personal solution is a second NAS off-site, which periodically wakes up and connects to mine via VPN, during that window Kopia is set to update my backups.

Kopia figures out what parts of the filesystem have changed very quickly, and only those changes are transferred during each update.
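For anyone curious, the setup is only a few commands. The bucket name, keys, and paths below are placeholders, so treat this as a sketch rather than a recipe:

```shell
# Create an encrypted, deduplicated repository inside a B2 bucket.
# Kopia prompts for a repository password that only you know.
kopia repository create b2 \
  --bucket=my-backup-bucket \
  --key-id=YOUR_B2_KEY_ID \
  --key=YOUR_B2_APPLICATION_KEY

# Snapshot a directory; repeat runs upload only changed chunks.
kopia snapshot create /mnt/user/important

# Disaster recovery: on a fresh machine, reconnect and restore.
kopia repository connect b2 --bucket=my-backup-bucket \
  --key-id=YOUR_B2_KEY_ID --key=YOUR_B2_APPLICATION_KEY
kopia snapshot list
kopia snapshot restore SNAPSHOT_ID /restore/target
```

Because the repository password never leaves your machine, Backblaze only ever sees encrypted chunks.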

NekoKoneko@lemmy.world on 26 Feb 09:05 collapse

The Backblaze option is something I’ve seriously considered.

Any reason this person didn’t go with the $99/year personal backup plan? It says “unlimited” and it is for my household only, but maybe I’m missing something about how difficult it is to set up on Unraid or other NAS software. B2’s $6/TB/mo rate would put me at $150/mo, which is not great.

MentalEdge@sopuli.xyz on 26 Feb 10:04 next collapse

They only needed about 500GB.

And Personal is for desktop systems. You have to use Backblaze’s macOS/Windows desktop application, and the setup is not zero-knowledge on Backblaze’s part. They literally advertise being able to ship you your files on a physical device if need be.

Which some people are ok with, but not what most of us would want.

FreedomAdvocate@lemmy.net.au on 26 Feb 13:09 collapse

You can ship encrypted files you know……?

MentalEdge@sopuli.xyz on 26 Feb 13:18 collapse

Yes. That’s not mutually exclusive with Backblaze having access to your backups.

FreedomAdvocate@lemmy.net.au on 27 Feb 00:56 collapse

Them having access to them is irrelevant if they’re encrypted. What’s the issue?

MentalEdge@sopuli.xyz on 27 Feb 01:02 collapse

You can do that with B2. Just use an application to upload that encrypts as it uploads.

The only way to achieve the same on the backup plan (because you have to use their desktop app) is to always have your entire system encrypted and never decrypt anything while the desktop app is performing a backup.

Did you not read what I said? You use their app, which copies files from your system as-is. Ensuring it never grabs a cleartext file is not practical.

FreedomAdvocate@lemmy.net.au on 27 Feb 01:23 collapse

That doesn’t mean it’s not encrypted on their servers……

MentalEdge@sopuli.xyz on 27 Feb 01:38 collapse

Also doesn’t mean it is. Or in a way where only you can decrypt it.

The chain of custody is unclear either way. You’re not in control.

FreedomAdvocate@lemmy.net.au on 27 Feb 09:27 collapse

It’s pretty clear actually - all data is encrypted at rest on their servers. They specifically say so.

www.backblaze.com/cloud-storage/security

backblaze.com/…/how-to-make-strong-encryption-eas…

MentalEdge@sopuli.xyz on 27 Feb 09:59 collapse

No shit. But encryption isn’t the same as zero-knowledge, where by the time they handle the data in any way whatsoever, it’s already encrypted, by you.

Do you not know what zero-knowledge means? Or are you so focused on my mentioning they’ll ship data to you physically that what I actually said went over your head?

From the page you just linked:

  1. Implement encryption transparently so users don’t have to deal with it

  2. Allow users to change their password without re-encrypting their data

  3. In business environments, allow IT access to data without the user’s password

It’s not zero-knowledge!

FreedomAdvocate@lemmy.net.au on 27 Feb 12:05 collapse

That’s really not an issue though.

MentalEdge@sopuli.xyz on 27 Feb 12:25 collapse

Yeah. It’s almost like I literally said that in my second comment.

Which some people are ok with, but not what most of us would want.

What gap in my knowledge are you trying to fill here?

I didn’t even mention encryption in my second comment. Just that their backup plan isn’t zero-knowledge.

FreedomAdvocate@lemmy.net.au on 27 Feb 15:21 collapse

not what most of us want

Strongly disagree.

MentalEdge@sopuli.xyz on 27 Feb 15:27 collapse

With what?

That self hosting admins on lemmy probably care about their backups not being accessible to third parties?

I don’t think you can claim that they wouldn’t.

You can claim that YOU don’t mind. But that’s a sample size of one. And I’m not denying there are people who don’t care.

I just don’t think they’re the type to be self-hosting in the first place.

And that still doesn’t answer why the fuck you set out on this series of “well achuallys”?

It seems to me, you’re still looking for something to correct me on.

FreedomAdvocate@lemmy.net.au on 27 Feb 15:46 collapse

Define “accessible” here. They’re encrypted ……

Being able to download an encrypted file is not the same as being able to download it and unencrypt it, which they can’t do.

MentalEdge@sopuli.xyz on 27 Feb 16:21 collapse

Sure they can. How else do they enable providing access to the content without the user password?

The data is secured against unauthorized access, but unlike zero-knowledge setups where the chain of custody is fully within user control, the user is not the only one authorized. And even if you are supposed to be, you cannot ensure that you actually are.

OF-FUCKING-COURSE the physical drives, and network traffic are encrypted. That’s how you prevent unauthorized physical access or sniffing of data in-flight. That’s nothing special.

But encryption is not some kind of magic thing that just automatically means anyone who shouldn’t have access to the data, doesn’t.

For that to actually be the case, you need solid opsec and known chain of custody. Ways of doing things that means the data stays encrypted end-to-end.

The personal backup plan doesn’t have that.

FreedomAdvocate@lemmy.net.au on 28 Feb 14:35 collapse

Where do they provide access to the content without the user password?

MentalEdge@sopuli.xyz on 28 Feb 14:54 collapse

4

Explain to me how they couldn’t. Without simply stating “it’s encrypted”.

On the B2 plan you can use open source solutions like Kopia, and literally look at the code, to KNOW that data is encrypted on your system with keys only you have, before Backblaze ever sees it.

Explain to me, how the personal plan using their closed source application achieves the same.

Linking to a page where they say “it’s secure” is not sufficient. Elaborate. In detail. To at least an equal extent I already have.

FreedomAdvocate@lemmy.net.au on 01 Mar 22:02 collapse

So your whole point is that you shouldn’t trust one of the biggest cloud backup companies on the planet when they say that your data is encrypted, with no proof that they’re telling lies…and you’re asking me to prove that they’re telling the truth?

The onus is on you to prove that they’re telling lies, not on me to prove what they say is true.

They say this about computer backup on one of the pages I linked earlier:

Computer Backup Encryption

Data is encrypted on your computer—during transmission and while stored. Block unauthorized users from accessing your data by using a Personal Encryption Key (PEK) or use a 2048-bit public/private key to secure a symmetric AES-128 key. Data is transferred via HTTPS. Enhance your protection with two-factor verification via a TOTP (Time-based One Time Password).

Is that all a lie? Based on what?

MentalEdge@sopuli.xyz on 01 Mar 22:15 collapse

No.

I’m saying 99.999999999999999999999999999999999999999999999999999999% ≠ 100%

For some people that’s close enough. For some of us it’s not.

Prove otherwise. I dare you. I’m done putting in effort explaining the obvious to you. Your turn.

FreedomAdvocate@lemmy.net.au on 02 Mar 04:40 collapse

So being encrypted before transmission and at rest isn’t enough simply because someone at backblaze can send the encrypted files out to you on a HDD…

lol

MentalEdge@sopuli.xyz on 02 Mar 06:04 collapse

Nice ragebait.

If you genuinely still think that was my point in its entirety, you are truly obtuse.

Scrollone@feddit.it on 26 Feb 10:05 collapse

You can’t use the $99/year plan for that. The authorized client only works as a desktop application on Windows and macOS.

Brkdncr@lemmy.world on 26 Feb 08:08 next collapse

Backup to 2nd nas.

Important stuff gets backed up to cloud storage. Whatever is cheapest.

In my case Synology c2 cloud was cheapest.

raicon@lemmy.world on 26 Feb 23:21 collapse

c2 seems expensive, I would go with hetzner storage box + restic

Brkdncr@lemmy.world on 27 Feb 06:41 collapse

It offers some other features like hybrid access to data. If my NAS isn’t available I can access it from their cloud. There are also some identity services.

ClickyMcTicker@hachyderm.io on 26 Feb 08:17 next collapse

@NekoKoneko A second system in a secure location with minimal redundancy that can *PULL* backup data from your production environment at a rate fast enough to keep up.

In my case, production is a many node Ceph cluster with flash storage and my backup is a single server loaded with big hard drives in a locked room at work (with approval). Both my house and work have fiber. The backup server pulls data from my production cluster on a regular basis using rsnapshot. It does use RAIDZ1 so I can run my hard drives until they fail without losing backups, but especially because it would take a massive amount of time to rebuild the backup server should I need to do so from scratch.
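A pull setup like that with rsnapshot is mostly just a few lines in the backup server’s config. The hostnames and paths below are hypothetical, and note that rsnapshot requires literal tab characters between fields:

```
# /etc/rsnapshot.conf on the backup server (fields separated by real tabs)
snapshot_root	/mnt/backup/snapshots/

# How many rotations to keep at each level.
retain	daily	7
retain	weekly	4
retain	monthly	6

# Pull from production over ssh with a key-restricted account.
backup	backup@prod-cluster:/mnt/cephfs/	prod/
```

Because rsnapshot hardlinks unchanged files between rotations, each dated snapshot browses like a full copy while only storing deltas.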

If you have a large catalog of Linux ISOs downloaded via torrent, I might recommend keeping a (backed up) folder containing the old torrent files, that way you can just download them from the source again should you lose everything. Let the community be your backup on those.

NekoKoneko@lemmy.world on 26 Feb 09:07 collapse

Thank you. I think the “folder of torrent files” you and others have said is probably a good failsafe anyway.

I assume the pull requirement is to offload the process resource use as much as possible to the backup system?

Bishma@discuss.tchncs.de on 26 Feb 08:20 next collapse

Like others, I have a 2 tier system.

About 2TB of my (Synology) NAS is critical files. Those get sent via Hyperbackup to cloud storage on at least a weekly basis, some daily. I have them broken up into multiple tasks with staggered schedules so it never has much to do on any given day.

The other 16TB gets synced (again with Hyperbackup, but not as a scheduled backup task) to a 20TB external drive roughly once per quarter. Then that drive lives in the closet of a family member.

Konraddo@lemmy.world on 26 Feb 08:22 next collapse

Similar to most responses, I back up whatever I created myself, not what was shared by someone or downloaded from somewhere. I care about pictures that I took, documents, financial records, etc., which don’t take up much space at all.

worhui@lemmy.world on 26 Feb 10:05 next collapse

LTO tape. But I only have 15TB.

It quickly becomes cost effective when you actually need the data to be safe, and it’s far easier to have off-site backups. I have never had a problem, but I like having an offline backup. Most of the time my data is static, so I am only backing up project files and changes for the most part.

If you have 40+ TB of dynamic data I can’t help there.

Edit: I buy used drives that are usually two generations old, so I got LTO-5 drives when LTO-7 was new. The used drives may be less reliable, but they can be 1/10th the price of the newest ones.

Treczoks@lemmy.world on 26 Feb 10:07 next collapse

As someone who has experienced double failure twice in my lifetime, I seriously recommend doing backups.

The problem is that at this size the only serious backup solution is another HDD array. A tape robot or WORM-drive library is probably out of budget.

irmadlad@lemmy.world on 26 Feb 10:26 next collapse

I’m not sure if I qualify as a ‘larger local hoster’, but I would go through your 20 TB and decide what really is important enough to back up in case the wheels fall off. Linux ISOs can be re-downloaded, although it would take a bit of time. The things that can’t be readily re-downloaded, such as my music collection that I have been accumulating for decades, converted to FLAC, and meticulously tagged, are among my priorities to back up. Pictures, business documents, and personal documents can’t be re-downloaded either, so those go on the ‘must back up’ list… and so on. Just cull out what is and isn’t replaceable. I would bet that once you do that, your 20 TB will be a bit more slim, and you’re not trying to push 20TB up the pipe to a cloud backup.

I use BackBlaze’s Personal, unlimited tier for $99 USD per year, which is a pretty sweet deal. One thing about Backblaze to remember is that the drives being backed up must be physically connected to the PC doing the backup/uploading. I get around that because I have a hot swap bay on my main PC, but there are other methods and software that will masquerade your NAS or other as a physically connected drive.

countstex@feddit.dk on 26 Feb 10:40 next collapse

I use Backblaze too; started with the personal backup, but swapped to the B2 solution as it was supported by my NAS. The cost of the actual storage isn’t much; most of the cost is in access. So for data that doesn’t alter much it worked out just as cheap, and it was easier to do things that way.

irmadlad@lemmy.world on 26 Feb 11:40 next collapse

and easier to do things that way.

I’m cheap and my labor is free. LOL But you do have a point.

FreedomAdvocate@lemmy.net.au on 26 Feb 13:04 collapse

The cost of B2 storage is very high, what are you talking about? USD$6 per terabyte per month would be like $4k a year for me.

cmnybo@discuss.tchncs.de on 26 Feb 12:36 collapse

Backblaze personal doesn’t support Linux or BSD, so it would be useless for a NAS.

irmadlad@lemmy.world on 26 Feb 13:41 collapse

There are many ways to skin the cat. Here’s just one:

This Docker container runs the Backblaze personal backup client via WINE, so that you can back up your files with the separation and portability capabilities of Docker on Linux.

It runs the Backblaze client and starts a virtual X server and a VNC server with Web GUI, so that you can interact with it.

github.com/…/backblaze-personal-wine-container

There are also other apps that will ‘fool’, for lack of a better word, Backblaze into thinking a NAS drive is physically connected.

WhyJiffie@sh.itjust.works on 27 Feb 08:42 collapse

better would be something that can just eat a zfs send stream, but I guess for an emergency it’s fine. but I would still want to encrypt everything somehow.

FreedomAdvocate@lemmy.net.au on 26 Feb 13:00 next collapse

I switched to a DAS for my storage and use backblaze to back up all 50TB+. I couldn’t find a cost effective way to do it with a NAS.

danielquinn@lemmy.ca on 26 Feb 13:42 next collapse

Honestly, I’d buy six external 20TB drives and make two copies of your data on them (three drives each), then leave them somewhere-safe-but-not-at-home. If you have friends or family able to store them, that’d do, but a safety deposit box is also good.

If you want to make frequent updates to your backups, you could patch them into a Raspberry Pi and put it on Tailscale, then just rsync changes regularly. That of course means that wherever you’re storing the backup needs room for such a setup.
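The rsync step is basically a one-liner you’d drop into cron; the Tailscale hostname (backup-pi) and paths here are made up:

```shell
# Mirror the important share to the Pi over the tailnet. -a preserves
# metadata, -z compresses in transit, --delete mirrors removals, and
# --partial lets interrupted large transfers resume. After the initial
# seeding, only deltas are sent.
rsync -az --delete --partial \
  /mnt/user/important/ \
  backup-pi:/mnt/backupdisk/important/
```

Because Tailscale handles the connectivity, neither end needs port forwarding or a static IP.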

I often wonder why there isn’t a sort of collective backup sharing thing going on amongst self hosters. A sort of “I’ll host your backups if you host mine” sort of thing. Better than paying a cloud provider at any rate.

Joelk111@lemmy.world on 26 Feb 15:37 collapse

That NAS software company Linus (of Linus Tech Tips) funded has a feature for this planned I think.

An open-source standalone implementation would be dope as hell. Sure, it’d mean you’d need to double your NAS capacity (as you’d have to provide as much storage as you use), but that’s way easier than building a second NAS and storing/maintaining it somewhere else, or constantly paying for and managing a cloud backup.

WhyJiffie@sh.itjust.works on 27 Feb 08:47 collapse

such a system would need a strict time limit for restoration after the catastrophe. Otherwise leeching would be too easy.

Joelk111@lemmy.world on 27 Feb 09:06 collapse

That’s an incredibly good point. Bad actors are the worst. Some ideas:

  • Maybe you’d need to contribute your storage capacity +10% (or more), to account for your and others’ downtime during disasters.
  • A time limit after disasters would be necessary. It’s difficult to think of a proper time limit though, as even a month might not be enough time if your entire house burns down.
  • Maybe a payment system could be set up where, if your server doesn’t ping for a week, your credit card is automatically charged (after pinging you with many emails). Sure, that’d suck, but it’d be better than losing your data, and cheaper overall than paying for cloud backups. I’m not sure where that money would go. Maybe distributed to those who didn’t experience a disaster, or maybe to the software project, though that would mean people are profiting from a disaster. Maybe it could go to a charity of your choice or something.

Definitely a difficult problem to solve. I’m sure people smarter than me have ideas beyond mine.

WhyJiffie@sh.itjust.works on 27 Feb 10:21 collapse

A time limit after disasters would be necessary. It’s difficult to think of a proper time limit though, as even a month might not be enough time if your entire house burns down.

and also accounting for low bandwidth connections… what’s more, some shitty providers even have monthly data caps

Maybe a payment system could be set up to where, if your server doesn’t ping for a week, your credit card is automatically charged (after pinging you with many emails).

yeah, that would be almost a necessary feature. being able to hold on to the backup when you really can’t restore.

billwashere@lemmy.world on 26 Feb 13:53 next collapse

With another large NAS.

Cyber@feddit.uk on 26 Feb 15:12 collapse

In a different location

billwashere@lemmy.world on 26 Feb 16:46 collapse

Well I personally have about 50tb, with one local copy and one remote copy but I’m very lucky to have access to old enterprise storage.

kaotic@lemmy.world on 26 Feb 14:11 next collapse

Backblaze offers unlimited data on a single computer, $99/year.

There might be some fine print that excludes your setup but might be worth investigating.

www.backblaze.com/cloud-backup/pricing

unit327@lemmy.zip on 26 Feb 14:31 next collapse

only windows (maybe mac)

Joelk111@lemmy.world on 26 Feb 15:32 next collapse

Yeah, people have done workarounds and stuff to get their entire NAS backed up but those seemed sketchy and bad when I looked into it.

osanna@lemmy.vg on 27 Feb 02:12 collapse

If you break their TOS, you’ll likely lose your data. So… be careful. Mind you, I haven’t read their TOS, so I don’t know if those workarounds are breaking it.

irmadlad@lemmy.world on 26 Feb 16:16 collapse

Wine or there is a Docker container that runs the Backblaze client.

Mister_Hangman@lemmy.world on 26 Feb 18:56 collapse

Oh shit.

unit327@lemmy.zip on 26 Feb 14:40 next collapse

I use the AWS S3 Glacier Deep Archive storage class, about $0.001 per GB per month. But your upload bandwidth really matters in this case; I only have a subset of the most important things backed up this way, otherwise it would take months just to upload a single backup. Using rclone sync instead of uploading the whole thing each time helps, but you still have to get that first upload done somehow…

I have complicated system where:

  • borgmatic backups happen daily, locally
  • those backups are stored on a btrfs subvolume
  • a python script will make a read-only snapshot of that volume once a week
  • the snapshot is synced to s3 using rclone with --checksum --no-update-modtime
  • once the upload is complete the btrfs snapshot is deleted

I’ve also set up encryption in rclone so that all the data is encrypted and unreadable by AWS.
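
The weekly snapshot-and-sync steps above could be scripted roughly like this; a sketch only, where the subvolume paths, snapshot naming, and the rclone remote name are all assumptions:

```python
"""Sketch of the weekly snapshot-and-sync step: freeze the backup
subvolume in a read-only btrfs snapshot, upload it, delete the snapshot.
All paths and the remote name are made up."""
import datetime
import subprocess

BACKUP_VOL = "/mnt/backups"    # btrfs subvolume holding the borgmatic backups
SNAP_DIR = "/mnt/.snapshots"   # where read-only snapshots are created
REMOTE = "s3crypt:backups"     # rclone crypt remote wrapping the S3 bucket

def build_commands(today: datetime.date) -> list[list[str]]:
    snap = f"{SNAP_DIR}/weekly-{today.isoformat()}"
    return [
        # read-only snapshot so the upload sees a frozen view of the backups
        ["btrfs", "subvolume", "snapshot", "-r", BACKUP_VOL, snap],
        # compare by checksum; don't rewrite objects just for mtime changes
        ["rclone", "sync", "--checksum", "--no-update-modtime", snap, REMOTE],
        # drop the snapshot once the upload has finished
        ["btrfs", "subvolume", "delete", snap],
    ]

def run_weekly_sync() -> None:
    for cmd in build_commands(datetime.date.today()):
        subprocess.run(cmd, check=True)  # abort the chain on any failure
```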

quick_snail@feddit.nl on 26 Feb 20:47 next collapse

Don’t do this. It’s a god damn nightmare to delete

unit327@lemmy.zip on 27 Feb 14:48 collapse

How so? I can easily just delete the whole s3 bucket.

quick_snail@feddit.nl on 27 Feb 19:57 collapse

Maybe I’m thinking of glacier. It took months trying to delete that.

CucumberFetish@lemmy.dbzer0.com on 27 Feb 08:07 collapse

It is cheap as long as you never need to restore. Downloading data out of S3 costs a lot: OP asked about 56 TB of storage, for which retrieval would cost about $4.7k.

aws.amazon.com/s3/pricing/ under data transfer

unit327@lemmy.zip on 27 Feb 14:45 collapse

I’m aware, but I myself have < 3 TB, and if I actually need to restore I’ll be happy to pay. It’s my “backup of last resort”; I keep other backups on site and, infrequently, on a portable HDD offsite.

Cyber@feddit.uk on 26 Feb 15:21 next collapse

What are your recovery needs?

It’s ok to take 6 months to back up to a cloud provider, but do you need all your data recovered in a short period of time? If so, cloud isn’t the solution; you’d need a duplicate set of drives nearby (but not close enough to be hit by the same flood, fire, etc.).

But, if you’re ok waiting for the data to download again (and check the storage provider costs for that specific scenario), then your main factor is how much data changes after that initial 1st upload.

NekoKoneko@lemmy.world on 28 Feb 14:33 collapse

Sorry. Shortly after posting this and the initial QA I left for a trip.

I could definitely wait those time periods for a first backup and a restore, since I assume it’ll be a once-in-10-years situation at worst. Data changes after the first upload should be slow enough to keep up with.

Cyber@feddit.uk on 01 Mar 22:54 collapse

No worries, I don’t have a time limit on responses 😉

But… it took something like ~3 days to get an initial backup done.

Then ~3 years later I was at a different provider doing the same thing.

What I did do differently was to split the data into different backup pools (i.e. photos, music, work, etc.) rather than one monolithic pool… that’ll make a difference.

NekoKoneko@lemmy.world on 02 Mar 01:50 collapse

That does make sense - it also matches how I have currently separated files, so it’s a valuable idea. Thanks!

GenderNeutralBro@lemmy.sdf.org on 26 Feb 16:06 next collapse

You’ll think I’m crazy, and you’re not wrong, but: sneakernet.

Every time I run the numbers on cloud providers, I’m stuck with one conclusion: shit’s expensive. Way more expensive than the cost of a few hard drives when calculated over the life expectancy of those drives.

So I use hard drives. I periodically copy everything to external, encrypted drives. Then I put those drives in a safe place off-site.
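
One rotation of that routine can be scripted; a sketch assuming a LUKS-encrypted external partition, with the device path, mapper name, mount point, and source directory all made up:

```python
"""Sketch of one sneakernet rotation: unlock a LUKS-encrypted external
drive, mirror onto it, lock it again. Run as root; all names are
assumptions."""
import subprocess

DEVICE = "/dev/sdx1"    # the external drive's encrypted partition
MAPPER = "offsite"      # name for the unlocked device mapping
MOUNT = "/mnt/offsite"
SOURCE = "/mnt/user"    # what to mirror onto the drive

def rotation_cmds() -> list[list[str]]:
    return [
        ["cryptsetup", "open", DEVICE, MAPPER],  # prompts for the passphrase
        ["mount", f"/dev/mapper/{MAPPER}", MOUNT],
        # -a preserves metadata; --delete makes the drive an exact mirror
        ["rsync", "-a", "--delete", f"{SOURCE}/", f"{MOUNT}/"],
        ["umount", MOUNT],
        ["cryptsetup", "close", MAPPER],
    ]

def run_rotation() -> None:
    for cmd in rotation_cmds():
        subprocess.run(cmd, check=True)
```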

On top of that, I run much leaner and more frequent backups of more dynamic and important data. I offload those smaller backups to cloud services. Over the years I’ve picked up a number of lifetime cloud storage subscriptions from not-too-shady companies, mostly from Black Friday sales. I’ve already gotten my money’s worth out of most of them and it doesn’t look like they’re going to fold anytime soon. There are a lot of shady companies out there so you should be skeptical when you see “lifetime” sales, but every now and then a legit deal pops up.

I will also confess that a lot of my data is not truly backed up at all. If it’s something I could realistically recreate or redownload, I don’t bother spending much of my own time and money backing it up unless it’s, like, really really important to me. Yes, it will be a pain in the ass when shit eventually hits the fan. It’s a calculated risk.

I am watching this thread with great interest, hoping to be swayed into something more modern and robust.

MightyLordJason@lemmy.world on 26 Feb 18:34 next collapse

Sneakernet crew here too. My work offsite backup is in my backpack. Few times per week I do a sync which takes a few minutes and take it home again. (The sync archives old versions of files and the drive is encrypted.)

We tried several cloud-based solutions and they were all rather expensive or just plain hard to run to completion or both.

irmadlad@lemmy.world on 26 Feb 18:55 collapse

That is old-old-school. It works tho. You have to be a bit scheduled about it, to encompass current and future important data. IIRC AWS built a 100-petabyte storage unit in a truck (Snowmobile) to do basically the same thing, just in much larger amounts.

tommij@lemmy.world on 26 Feb 16:19 next collapse

Zfs send. Done
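
In practice that’s an incremental snapshot stream piped into `zfs recv` on the backup host; a sketch, where the dataset names, snapshot labels, and SSH target are assumptions:

```python
"""Minimal sketch of `zfs send` replication: stream only the delta
between two snapshots to a receiving host over SSH. Dataset names and
the host are made up."""
import subprocess

def send_cmd(dataset: str, prev_snap: str, new_snap: str) -> list[str]:
    # -i sends only the blocks that changed between the two snapshots
    return ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]

def recv_cmd(host: str, target: str) -> list[str]:
    # -F rolls the target back to the last common snapshot before receiving
    return ["ssh", host, "zfs", "recv", "-F", target]

def replicate(dataset: str, prev_snap: str, new_snap: str,
              host: str, target: str) -> None:
    sender = subprocess.Popen(send_cmd(dataset, prev_snap, new_snap),
                              stdout=subprocess.PIPE)
    subprocess.run(recv_cmd(host, target), stdin=sender.stdout, check=True)
    sender.stdout.close()
    if sender.wait() != 0:
        raise RuntimeError("zfs send failed")
```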

dmention7@midwest.social on 26 Feb 17:53 next collapse

Personally I deal with it by prioritizing the data.

I have about the same total size Unraid NAS as you, but the vast majority is downloaded or ripped media that would be annoying to replace, but not disastrous.

My personal photos, videos and other documents which are irreplaceable only make up a few TB, which is pretty manageable to maintain true local and cloud backups of.

Not sure if that helps at all in your situation.

Burninator05@lemmy.world on 26 Feb 21:21 collapse

I have the data that I actually care about in a RAIDZ1 array with a hot standby, and it is synced to the cloud. The rest (the vast majority) is in a RAIDZ5. If I lose it, I “lose” it. It’s recoverable if I decide I want it again.

Mister_Hangman@lemmy.world on 26 Feb 18:58 next collapse

Definitely following this

quick_snail@feddit.nl on 26 Feb 20:46 next collapse

Tape or backblaze

randombullet@programming.dev on 27 Feb 01:49 next collapse

I have 3 main NASes

78 TB (52 TB usable) hot storage, RAIDZ1

160 TB (120 TB usable) warm storage, RAIDZ2

48 TB (24 TB usable) off-site, ZFS mirror

I rsync every day from hot to off site.
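
That daily push boils down to a single rsync invocation from cron; a sketch with made-up paths and host name:

```python
"""Sketch of a daily hot-to-offsite push as one rsync call. The source
path, SSH host, and destination are assumptions."""
import subprocess

def daily_sync_cmd(src: str, host: str, dest: str) -> list[str]:
    return [
        "rsync",
        "-aH",        # archive mode, preserve hard links
        "--delete",   # keep the mirror exact
        "--partial",  # resume interrupted large files
        f"{src}/",    # trailing slash: copy contents, not the dir itself
        f"{host}:{dest}/",
    ]

def daily_sync() -> None:
    subprocess.run(daily_sync_cmd("/mnt/hot", "offsite-host", "/mnt/backup"),
                   check=True)
```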

And once a month I turn on my warm storage and sync it.

Warm and hot storage is at the same location.

Off site storage is with a family friend who I trust. Data isn’t encrypted aside from in transit. That’s something else I’d like to mess with later.

Core vital data is sprinkled around different continents with about 10TB. I have 2 nodes in 2 countries for vital data. These are with family.

I think I have 5 total servers.

Cost is a lot obviously, but pieced together over several years.

The world will end before my data gets destroyed.

modus@lemmy.world on 27 Feb 08:03 collapse

But would your data survive a nearby gamma-ray burst?

Omgpwnies@lemmy.world on 27 Feb 08:42 collapse

Amateurs not keeping at least one backup off-planet SMH

modus@lemmy.world on 27 Feb 13:05 collapse

I put a QNAP on the ISS. Expensive, but I sleep soundly.

PieMePlenty@lemmy.world on 27 Feb 05:07 next collapse

Not all data is equal. I back up the things I absolutely cannot lose and yolo everything else. My love for this hobby does not extend to buying racks of hard drives.

zatanas@lemmy.zip on 27 Feb 09:24 next collapse

True words of wisdom here from a self hosting perspective.

Zetta@mander.xyz on 27 Feb 11:51 collapse

Same, my Unraid server is over 40 TB but I only have ~1.5 TB of critical data: my Immich photos and some files. I have an on-site and an off-site Raspberry Pi, each with a 4 TB NVMe SSD, for nightly backups.

INeedMana@piefed.zip on 27 Feb 05:26 next collapse

I’ve been following this post since the first comment.

And I have just put together my own 1 TB RAID1 NAS. I didn’t think 1 TB would serve me forever, more like “a good start”.

But the numbers I’ve been seeing in here… you guys are nuts 😆

lightnsfw@reddthat.com on 27 Feb 06:18 next collapse

I don’t for media. I have 2 parity drives and that’s it. I’d like to do some kind of off site mirror but I haven’t had time to figure it out and buying enough storage to do that is expensive.

My actual data for like taxes and stuff is backed up to my server and backblaze.

Batman@lemmy.world on 27 Feb 06:52 next collapse

I’ve started using k8up to save my photos and config to an encrypted restic repo in an S3 bucket. I’m having a lot of trouble backing up my SQL DB though; it’s not as easy as they make it sound.
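
A common workaround for the DB problem is to dump the database to a flat file on a schedule and let the backup tool pick up the dump, since copying live database files rarely yields a consistent backup. A sketch assuming PostgreSQL, with the database name and dump path made up:

```python
"""Sketch of a pre-backup database dump, so the backup tool only ever
sees a consistent flat file rather than live DB pages. Database name and
dump path are assumptions."""
import subprocess

def dump_cmd(database: str, outfile: str) -> list[str]:
    # custom format is compressed and restorable with pg_restore
    return ["pg_dump", "--format=custom", f"--file={outfile}", database]

def run_pre_backup_dump() -> None:
    subprocess.run(dump_cmd("appdb", "/backup/appdb.dump"), check=True)
```

k8up also documents a mechanism for running a pre-backup command like this inside the application pod; check its current docs for the exact setup.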

sefra1@lemmy.zip on 27 Feb 09:40 next collapse

Well, first: while RAID is great, it’s not a replacement for backups. RAID is mostly useful when uptime is imperative, but it does not protect against user error, software errors, filesystem corruption, ransomware, or a power surge killing the entire array.

Since uptime isn’t an issue on my home NAS, instead of parity I simply keep cold backups, which (supposedly) I plug in from time to time to scrub the filesystems.

If an online drive dies I can simply restore it from backup and accept the downtime. For my anime I have just a single backup, but for my most important files I have 2 backups in case one fails. (Unfortunately both onsite.)

On the other hand, for a client of mine whose server’s uptime is imperative, in addition to RAID I have 2 automatic daily backups (ideally one should be offsite but isn’t; at least they are on different floors of the same building).

lepinkainen@lemmy.world on 27 Feb 12:27 next collapse

A second offsite NAS (my old one) with the same capacity for the larger files

Backblaze B2 and a Hetzner storage box for Really Important stuff.

Rooster326@programming.dev on 27 Feb 17:35 collapse

Okay Mr. Money Bags

lepinkainen@lemmy.world on 28 Feb 12:29 collapse

It’s literally a Raspberry pi 3B+ and a USB hard drive in a plastic storage box at my parents house 😅

trk@aussie.zone on 27 Feb 15:44 next collapse

I have a 120TB unraid server at home, and a 40TB unraid server at work. Both use 2 x parity disks.

The critical work stuff backs up to home, and the critical home stuff backs up to work.

The media is disposable.

Both servers then back up to CrashPlan on separate accounts - work uses the Australian server on a business account, home uses the US server on a personal account.

I figure I should be safe unless Australia and the US are nuked simultaneously… At which point my data integrity is probably not the most pressing issue.

JaddedFauceet@lemmy.world on 28 Feb 06:56 collapse

why is your work stuff at home and why is your personal stuff at work ಠ_ಠ

trk@aussie.zone on 28 Feb 14:45 collapse

Yeah I guess it probably makes more sense when it’s my business… Maybe not if you’re an employee at some corporate randomly hosting backups of your dog photos.

clif@lemmy.world on 28 Feb 15:39 collapse

I dunno. At a big company they probably won’t notice an extra TB of storage cost… so long as you’re discreet with the transfers.

ShawiniganHandshake@sh.itjust.works on 27 Feb 15:47 collapse

For me, I only back up data I can’t replace, which is a small subset of the capacity of my NAS. Personal data like photos, password manager databases, personal documents, etc. get locally encrypted, then synced to a cloud storage provider. I have my encryption keys stored in a location that’s automatically synced to various personal devices and one off-site location maintained by a trusted party. I have the backups and encryption key sync configured to keep n old versions of the files (where the value of n depends on how critical the file is).

Incremental synchronization really keeps the bandwidth and storage costs down and the amount of data I am backing up makes file level backup a very reasonable option.
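
With restic, that “keep n old versions, where n depends on how critical the file is” policy maps onto the `forget` command; a sketch, where the repo path and the tag-to-n mapping are assumptions:

```python
"""Sketch of per-criticality version retention via restic's `forget`
command. The repo path and the tag-to-count mapping are made up."""

RETENTION = {"critical": 30, "normal": 7}  # snapshot versions kept per tag

def forget_cmd(repo: str, tag: str) -> list[str]:
    n = RETENTION[tag]
    # --keep-last keeps the newest n snapshots carrying this tag;
    # --prune actually frees the space in the repo
    return ["restic", "-r", repo, "forget", "--tag", tag,
            f"--keep-last={n}", "--prune"]
```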

If I wanted to back up everything, I would set up a second system off-site and run backups over a secure tunnel.