New ask Hacker News story: Ask HN: Theory of Backups

3 by Xen9 | 2 comments on Hacker News.
Most discussions of backups focus on consumer-level matters. Buried underneath the superficial and the common sense, there seems to exist a theory of how to do backups well. I've found two elements upon which a better theory of rotations and other details (e.g. hash verification scheduling, number of different devices) can be built.

The first is the Tower of Hanoi scheduling scheme, which we will abbreviate TOH. The second is the Incremental-Differential-Full backups concept, which we will abbreviate IDF. The best available resource seems to be the Acronis website's illustrated docs: https://ift.tt/XjQAnMG. I request that you, in good faith, ignore that Acronis is a company selling commercial Windows software; you are free to post better links in the comments if you find better info elsewhere.

We end up with a scheme we can call IDF-TOH. In it we have three types of backups:

- Incremental, at L_0 ("Level A" in the linked resource), the most frequent level, capturing only changes made since the last backup.
- Differential, at each level strictly between L_0 and L_n, capturing changes made since the last full backup.
- Full, at L_n, the least frequent level, capturing the whole system to be backed up.

So now, what can we do? At least the following directions could be taken in further developing a Theory of Backups:

0 (backup scheduling): The frequencies can be chosen in many ways, and I am not sure which one is optimal. Tower of Hanoi gives every level L_a, where a belongs to the closed interval 0 ... n, a backup interval that scales as 2^a. The Frame-Stewart algorithm (for Tower of Hanoi with more than three pegs) may or may not be of any use here.

1 (rotations): IDF-TOH does not address the problem of rotations. I.e., if you make a backup that corrupts your previous data, and then repeat the mistake, you get into trouble quickly.

It is also noteworthy that certain media may better fit certain levels in IDF-TOH and in future schemes. For example, adrian_b wrote three days before this post: "...
Of all the optical discs that have been available commercially, those with the longest archival time were the pressed CD-ROM with gold mirrors, where the only degradation mechanism is the depolymerization of the polycarbonate, which could make them fragile, but when kept at reasonable temperatures and humidities that should require many centuries..." Consequently, these would be best for the Full backups, while solid-state drives may work for the Incremental ones.

2 (perfecting IDF): The IDF scheme may not be perfect either and can probably be refined.

3 (hashing): Verifying the backups matters and should be part of a complete scheme.

---

This may not be valuable for all businesses, but most individuals already using rsync or borg would probably prefer the best available scheme if it reduces the probability of accidental data loss at minimal effort. Translating the best possible scheme into a configuration program with a humane interface is an undertaking of its own.
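
To make the IDF-TOH idea above concrete, here is a minimal Python sketch of how a session number could be mapped to a level and a backup type. It is only an illustration under assumptions of mine: sessions are numbered from 1, the level is taken from the number of trailing zero bits in the session number (the usual "ruler" pattern of Tower of Hanoi rotation) capped at the top level, and differential backups sit on the levels strictly between L_0 and L_n. The names toh_level, backup_type and schedule are hypothetical and not taken from Acronis or any other tool.

```python
def toh_level(session: int, top_level: int) -> int:
    """Return the level index (0 = most frequent) for a 1-based session number."""
    if session < 1:
        raise ValueError("session numbers start at 1")
    if top_level < 0:
        raise ValueError("top_level must be >= 0")
    # session & -session isolates the lowest set bit; its position is the
    # count of trailing zero bits, giving the pattern 0, 1, 0, 2, 0, 1, 0, 3, ...
    trailing_zeros = (session & -session).bit_length() - 1
    # Assumption: anything above the top level is folded into the top level.
    return min(trailing_zeros, top_level)


def backup_type(level: int, top_level: int) -> str:
    """Map a level to a backup type under the IDF-TOH assumption stated above."""
    if level == top_level:
        return "full"
    if level == 0:
        return "incremental"
    return "differential"


def schedule(sessions: int, top_level: int) -> list[tuple[int, int, str]]:
    """List (session, level, type) for sessions 1..sessions."""
    return [
        (s, toh_level(s, top_level), backup_type(toh_level(s, top_level), top_level))
        for s in range(1, sessions + 1)
    ]


if __name__ == "__main__":
    # Four levels, L_0 .. L_3, over sixteen sessions.
    for session, level, kind in schedule(16, 3):
        print(f"session {session:2d}: L_{level} {kind}")
```

With four levels this puts an incremental on every odd session, differentials on sessions 2, 4, 6, 10, 12, 14 and a full backup on sessions 8 and 16. A real tool would presumably also force the very first run to be a full backup, since an incremental or differential needs something to compare against; the sketch leaves that out.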
