• 0 Posts
  • 5 Comments
Joined 6 months ago
Cake day: February 5th, 2025



  • I want to write this in a separate post because I see many questionable suggestions:

    Your scenario does not allow for a simple rsync / ZFS copy, because those only work for 1:many, meaning one “true” copy that gets replicated a couple of times.

    As I understand it, you have a many:many scenario, where any location can access and upload new data. So if two locations changed the same file on the same day, what do you do? many:many data storage is a hard problem, and because of this a simple solution unfortunately won’t work. A lot of research has gone into this for hyperscalers such as AWS, GCP, Azure etc. They all basically came to the same solution: distributed, quorum-based storage systems with a unified interface. Everyone accesses the “same” interface, and under the hood the data gets replicated 3 times. So it basically turns the problem back into 1:many, with the advantages of many:many.
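To make the quorum idea a bit more concrete, here is a minimal toy sketch in Python. It is not how any particular hyperscaler or product implements it; the names `Replica` and `QuorumStore` and the quorum sizes (3 replicas, 2 acknowledgements for writes and reads) are assumptions picked for illustration. The point is only that a write needs a majority, so two locations can never both "win" with conflicting versions while cut off from each other.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One storage location (hypothetical, in-memory only)."""
    name: str
    online: bool = True
    store: dict = field(default_factory=dict)  # key -> (version, value)


class QuorumStore:
    """Toy unified interface in front of N replicas."""

    def __init__(self, replicas, write_quorum=2, read_quorum=2):
        n = len(replicas)
        # W + R > N: every read set overlaps every successful write set.
        # 2W > N: two writes can never succeed on disjoint sets of replicas.
        if write_quorum + read_quorum <= n or 2 * write_quorum <= n:
            raise ValueError("quorum sizes do not guarantee overlap")
        self.replicas = replicas
        self.w = write_quorum
        self.r = read_quorum

    def put(self, key, value):
        reachable = [r for r in self.replicas if r.online]
        if len(reachable) < self.w:
            raise RuntimeError("write rejected: quorum not reachable")
        # New version is one higher than the newest version any reachable replica has.
        newest = max(r.store.get(key, (0, None))[0] for r in reachable)
        for r in reachable:
            r.store[key] = (newest + 1, value)
        return newest + 1

    def get(self, key):
        reachable = [r for r in self.replicas if r.online]
        if len(reachable) < self.r:
            raise RuntimeError("read rejected: quorum not reachable")
        answers = [r.store[key] for r in reachable[: self.r] if key in r.store]
        if not answers:
            raise KeyError(key)
        # The highest version among the answers is the latest successful write.
        return max(answers, key=lambda a: a[0])[1]


home, office, parents = Replica("home"), Replica("office"), Replica("parents")
store = QuorumStore([home, office, parents])

store.put("photo.jpg", "original")
office.online = False                       # office drops off the network
store.put("photo.jpg", "edited at parents")
office.online, home.online = True, False    # office is back, but stale
print(store.get("photo.jpg"))               # prints "edited at parents"
```

A real system additionally repairs the stale replica in the background and has to deal with concurrent writers, which this sketch ignores on purpose.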


  • So I think this can be achieved in different levels of complexity.

    First of all, you may want to look into ZFS, because it lets you have multiple “partitions” (datasets) that all have access to the entire free space of the device or devices, meaning you won’t need two separate drives. Or you probably want multiple smaller and cheaper devices combined into one pool, because that can be cheaper and more fault tolerant.

    You also need some way to actually access the data. You have not shared how that is supposed to work: SMB, NFS, etc. In either case you need software that can do that. There are various options.

    Then, you probably want to create some form of overlay network. This makes it possible for the individual devices to talk to each other like they are in the same LAN. You could use tailscale/headscale for this. If you have static public IPs you can probably skip that and build your own mesh using wireguard (spoiler: that’s what tailscale does anyway).

    Then, the syncing. You can try to use syncthing for this, but I am not sure it will work well in this scenario.

    The better solution is to use a distributed storage system like garage for this, but that requires some technical expertise. https://garagehq.deuxfleurs.fr/

    Garage would actually allow you to, for example, store only two copies, so with three locations you would actually gain some storage space. Or you stay with the 3x replication factor. Anyway, garage is an object store, which backup software will absolutely support, but there is no easy NFS/SMB. So your smart TV, vanilla Windows or whatever will not be able to access it. Plus side: it’s the only software you need, no ZFS required.
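As a rough back-of-the-envelope for the space gain, here is a small Python sketch. The function name `usable_capacity_tb` and the 4 TB per location are made-up example values, and it assumes all locations are the same size and data spreads evenly; with unequal sizes, treat the result as an upper bound.

```python
def usable_capacity_tb(site_capacities_tb, replication_factor):
    """Usable space when every object is stored `replication_factor` times.

    Assumes equally sized sites and an even spread of data (an idealisation).
    """
    if not 1 <= replication_factor <= len(site_capacities_tb):
        raise ValueError("need between 1 and len(sites) copies")
    return sum(site_capacities_tb) / replication_factor


sites = [4, 4, 4]  # three locations with 4 TB each (example values)

print(usable_capacity_tb(sites, 3))  # 4.0 -> every location holds everything
print(usable_capacity_tb(sites, 2))  # 6.0 -> two copies, 50% more usable space
```

The trade-off is that with two copies you can only lose one location at a time without risking data.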

    Overall it’s a pretty tricky thing that will require some ongoing management. There is no super easy way to set this up.