
Tips on an electronic data backup system for individuals

Personal insights into the practicalities of personal data backup

This article was completed with the help of www.DeepL.com/Translator.

Electronic data (hereinafter referred to as “Data”) is stored on electronic media and can only be read and written by electronic devices, which means that additional equipment may need to be purchased to meet the need for its storage and processing.

In this article, self-created terms that have no standard definition are marked with square brackets so that readers can tell them apart (example: [self-created term]). Terms that have already been introduced are quoted with backquotes (e.g., `terminology`). This article summarizes my reflections on losing four and a half months of work data through stupidity and accident. The content is only my personal opinion; discussion is welcome.

#Data classification based on importance

Data is classified into three levels by importance (that is, by how severe its loss would be).

#Level 1

  • Examples:

    1. Encryption key:
      • Used to decrypt large blocks of encrypted data; once the encrypted data is lost, the corresponding key is useless too.
      • That is, in use it is interdependent with the data it encrypts.
    2. Seed of TOTP (Time-based One-Time Password):
      • The seed alone is enough to generate the 6-digit TOTP codes, so it does not depend on other data (such as the copy already entered into the authenticator app, which cannot be extracted).
    3. Platform accounts, software activation serial numbers, platform login licences, and similar “credentials”:
      • Equivalent to an encryption key: they are the credentials for accessing a chunk of data (on a platform).
      • The “encrypted” data they correspond to is usually not kept by the individual (platform services) or is easy to obtain again (paid software).
      • That is, they do not depend on the data they “encrypt”.
  • Features:

    1. Small size: usually no more than 10 MB, or even 1 MB, in total. This suits multi-copy backup, because the data is easy to save on several storage media.
    2. High value: confirmed loss of this type of data usually means huge, irreparable losses. (Luckily, platforms usually let you recover passwords.)
    3. You do not want them made public (hence preservation method 1), nor do you want to lose them completely (hence preservation method 2).
  • Preservation methods:

    1. Encrypted preservation.
    2. Multi-copy backup, on dedicated local storage devices (mobile hard drives, home NAS) and in cloud storage.
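
To illustrate why a TOTP seed is self-sufficient, here is a minimal sketch of the RFC 6238 algorithm (HMAC-SHA1 variant) using only the Python standard library. The function name `totp` and its parameters are my own choices for illustration; a real authenticator app may use a different hash function or digit count.

```python
import base64
import hashlib
import hmac
import struct


def totp(seed_base32: str, unix_time: int, digits: int = 6, step: int = 30) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) from a Base32-encoded seed."""
    # Seeds are often shown without Base32 padding, so restore it first.
    padded = seed_base32 + "=" * (-len(seed_base32) % 8)
    key = base64.b32decode(padded, casefold=True)
    counter = unix_time // step  # number of elapsed time steps
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)
```

Nothing but the seed and the current time goes in, so a backed-up seed can regenerate the same codes on any device.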

#Level 2

  • Examples:

    1. Personal Information:

      • Often referred to as “personal privacy”: data generated by personal experience, or containing information that identifies a particular individual, mostly in the form of documents and multimedia.
      • Only the parts you can control are of interest here. As for what is uploaded externally or generated on external platforms, it is best to control it at the source, i.e. always upload as little information as possible, and choose information that is least tied to your real identity. (Believe me, following this principle will not affect your online social life.)
      • Examples of [personal information]: “personal information summaries” such as job resumes; videos or photos that show your face (even a low-resolution, non-frontal one); records of communication with others on the Internet (chat logs, emails, documents sent by others that contain information about you, etc.).
      • The difference between “personal information” and “credentials” lies in file size and content. “Credentials” are like keys, while “personal information” is the treasure chest.
    2. Other information that you want to keep secret, which varies from person to person:

      • For example, information that is not “personal information” in the narrow sense, but still falls under “personal privacy” in the broader sense (such as data related to personal interests).
      • All data that you do not want to be made public or disclosed (whether or not it would be appropriate to publish it, you have a natural right to keep it private as you wish).
  • Features:

    1. Variable size that grows over time, depending on individual interests, personality, and career. For most people the volume can be kept within 500 GB (fashionistas who love photography may still find even this generous estimate too small).

    2. Variable value, but these data are important to you, which is reason enough to back them up. (Bad guys and good friends who are interested in you are usually interested in this information as well.)

  • Preservation methods:

    • Encrypted preservation (the key used for encryption can be saved in [Level 1 Data]).
    • At least one full local backup, and optional cloud backup.

#Level 3

  • Examples:

    1. General data accumulated on the hard disk while using electronic devices, including programs, operation records, documents, etc. Whether it is encrypted does not matter, but in general you do not want to lose it.
  • Features:

    1. Very large, but you only need to buy additional storage with a capacity that suits your needs (including potential needs for the next few years).
    2. Lower in value, but worth the money and effort to insure.
  • Preservation methods:

    • At least one full local backup.
    • Also consider periodically moving large files that just sit on the hard drive and are rarely used to a dedicated storage device (such as a mobile hard disk).
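
Finding the large, rarely used files mentioned in the last point can be automated. The sketch below is one possible approach, assuming that "rarely used" can be approximated by the file's last modification time; the function name and thresholds are hypothetical.

```python
import os
import time
from pathlib import Path


def find_archive_candidates(root, min_size_bytes, min_idle_days, now=None):
    """Return (path, size) pairs for large files not modified recently, largest first."""
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86400  # 86400 seconds per day
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                info = path.stat()
            except OSError:
                continue  # unreadable or vanished file: skip it
            if info.st_size >= min_size_bytes and info.st_mtime <= cutoff:
                candidates.append((path, info.st_size))
    return sorted(candidates, key=lambda item: item[1], reverse=True)
```

For example, `find_archive_candidates("D:/", 1 << 30, 365)` would list files of at least 1 GiB that have not been modified for a year. The function only reports candidates; moving them stays a manual decision.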

#Encryption

Let me recommend some tools! I have been using VeraCrypt for several years for all encryption-related operations. It is open source, multi-platform, and full-featured, and it has recently become active again after a few years of stagnation!

What could be better than a long-lasting and still active open-source software project? VeraCrypt can encrypt an entire drive or create a strongly encrypted file container (a “file-type drive”). When decrypted and mounted, the container appears as an accessible disk; when unmounted, it is a portable file that can be backed up like any other single file. Perfect!

A word of warning: do not get careless with VeraCrypt’s file container feature. I once tried BitLocker recklessly, without reading the documentation and without making a backup, and stored the BitLocker backup key inside a file container, forgetting that the container file itself was stored on the drive encrypted with BitLocker. I had locked the key inside the chest and…… could never open the chest again.
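
The “key locked inside the chest” mistake is a circular dependency, and it can be checked mechanically before you encrypt anything. The sketch below is only an illustration: the `requires` mapping and all item names are hypothetical, and an item with no listed requirements is assumed to be directly available (memorized or written on paper).

```python
def is_recoverable(item, requires, _visiting=None):
    """Return True if `item` can be reached without depending on itself.

    requires maps an item to the list of things needed to access it.
    Items with no entry (or an empty list) are assumed directly available.
    """
    visiting = set() if _visiting is None else _visiting
    if item in visiting:
        return False  # the item needs itself: the key is inside its own chest
    visiting.add(item)
    ok = all(is_recoverable(dep, requires, visiting) for dep in requires.get(item, []))
    visiting.discard(item)
    return ok


# My accident, modeled with hypothetical names.
accident = {
    "bitlocker_drive": ["bitlocker_backup_key"],
    "bitlocker_backup_key": ["veracrypt_container"],
    "veracrypt_container": ["veracrypt_passphrase", "bitlocker_drive"],
    "veracrypt_passphrase": [],  # memorized
}
```

Here `is_recoverable("bitlocker_drive", accident)` returns `False`. Keeping a second copy of the backup key outside the encrypted drive (modeled as `"bitlocker_backup_key": []`) breaks the cycle.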

For Windows Professional users, Microsoft BitLocker is also a handy tool. As commercial-grade software, it is easy to use and has Microsoft’s ongoing support. However, it can only encrypt whole drives, so it is not very convenient in combination with cloud backup. A drive encrypted with BitLocker must be unlocked manually before backup, or set to unlock automatically after boot (login). As a side effect, the files are readable in unencrypted form during the backup, so the security requirements for the environment in which you back up are somewhat stricter.

Sorry, I don’t know much about the common software for Mac and Linux, so I cannot introduce the platform-specific tools for them. VeraCrypt is multi-platform, so you may want to try it first.

In particular, for [Level 1 data], encrypted storage means encrypting a separate drive that contains the [Level 1 data] with a passphrase (password) that a human can remember. This requires that you remember at least one “master key”, which decrypts the data that contains the other keys. I use a VeraCrypt file container to store this information, which is primitive, but I stubbornly believe it is “more secure”. There are many other programs on the market that encrypt and back up such keys directly, such as 1Password. Even modern browsers such as Firefox and Chrome have built-in managers for web passwords. You can also buy a small hardware device such as a YubiKey and carry your keys with you like a key ring. You will find one that suits your taste.

#Backup

First, identify the data that needs to be backed up. For example, I have one set of [Level 1 data], two sets of [Level 2 data] (personal confidential documents and work confidential documents), and five sets of [Level 3 data] (home desktop, home laptop, work laptop, cell phone, tablet). The cell phone and tablet are actually backed up to the [Level 3 data] area of the computer’s hard drive. Their contents are a mix of Levels 1 to 3, so my solution is to encrypt their backups (treating them as [Level 2 data]) but store these backups inside the computer’s [Level 3 data] backups.
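
Such an inventory can be written down as data and checked against the preservation methods of each level. This is a minimal sketch: the `DataSet` class, the `POLICY` table, and the reading of “multi-copy” as at least two copies are my own assumptions for illustration.

```python
from dataclasses import dataclass, field

# Rules paraphrased from the three levels above.
# min_copies=2 for Level 1 is an assumed reading of "multi-copy".
POLICY = {
    1: {"encrypt": True, "min_copies": 2},
    2: {"encrypt": True, "min_copies": 1},
    3: {"encrypt": False, "min_copies": 1},
}


@dataclass
class DataSet:
    name: str
    level: int
    encrypted: bool
    backup_locations: list = field(default_factory=list)


def violations(data_set):
    """Return human-readable problems for one data set (empty list if none)."""
    rule = POLICY[data_set.level]
    problems = []
    if rule["encrypt"] and not data_set.encrypted:
        problems.append(f"{data_set.name}: Level {data_set.level} data must be encrypted")
    if len(data_set.backup_locations) < rule["min_copies"]:
        problems.append(
            f"{data_set.name}: needs at least {rule['min_copies']} backup copies, "
            f"has {len(data_set.backup_locations)}"
        )
    return problems
```

A phone backup that mixes levels would be entered with `level=2`, matching the treatment described above.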

#Backup storage location

Because data is stored on physical electronic media, when choosing backup locations I do not count redundant copies on the same storage device. I aim for hardware-level isolation, at least at the level of separate hard drives, regardless of geographic location. A mobile hard drive in a desk drawer, a NAS on the floor, or a server cluster of a cloud storage service: the difference between these options is the geographical location where the data is stored.

For example, I somewhat dislike cloud storage services and have no need to transfer large amounts of data between multiple devices (subjective and objective reasons), so I do not use cloud storage, even though it is objectively affordable and convenient. I also did not set up a home NAS. I only bought two large-capacity mobile hard drives and regularly back up [Level 2 data] and [Level 3 data] to both, keeping one in a cabinet at the company and one at home. Roughly speaking, this counts as off-site disaster recovery and can tolerate a fire or a man-made disaster. As for the [Level 1 data], it is saved on the desktop at home, the laptop at home, the two mobile hard drives, and the company laptop, for a total of five copies.

After this accident, I decided to upload [Level 1 data] and [Level 2 data] to Microsoft OneDrive as well. More backups are better anyway, and since the data is encrypted and not very large, the upload is quick.

PS: It is better to keep [Level 1 data] in every backup location that holds [Level 2 data], because the two are interdependent. Think of it as having more keys than treasure chests: a key is placed next to each chest, and a “master key” protects the keys themselves. (People like me who use standalone encryption software also need to keep the portable version of that software together with the [Level 1 data] and [Level 2 data].)
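
Copying a small encrypted container to several locations is easy to script, and a checksum confirms that every copy is intact. The following is a minimal sketch using the Python standard library; the function names and directory layout are hypothetical, and it assumes every destination (for example, a mounted mobile hard drive) is reachable as a normal directory.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source, destination_dirs):
    """Copy `source` into each directory; return {copy_path: checksum_matches}."""
    source = Path(source)
    expected = sha256_of(source)
    results = {}
    for directory in destination_dirs:
        target_dir = Path(directory)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        results[str(target)] = sha256_of(target) == expected
    return results
```

Any `False` value in the result means that copy should not be trusted as a backup.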

#Backup frequency

This was the biggest problem in my backup system exposed by the man-made disaster I experienced this time: backups were not frequent enough, specifically for the company laptop. I decided to set the backup frequency to:

  • Biweekly backups to the local mobile hard drives at home and at work (backing up the cell phone and tablet at the same time).
  • The two mobile hard drives at home and at work are swapped monthly, so that each ends up holding a full backup.
  • [Level 1 data] and [Level 2 data] are set up on the computers for automatic OneDrive syncing on Windows, and uploaded after each change.
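
Since my failure was forgetting, a small reminder check helps more than good intentions. The sketch below assumes “biweekly” means every 14 days and “monthly” means every 30 days; the task names and the `overdue_tasks` function are hypothetical.

```python
from datetime import datetime, timedelta

# Intervals from the schedule above; 14 and 30 days are assumed readings
# of "biweekly" and "monthly".
INTERVALS = {
    "local_backup": timedelta(days=14),
    "drive_swap": timedelta(days=30),
}


def overdue_tasks(last_done, now):
    """Return the names of tasks whose interval has elapsed, sorted by name.

    last_done maps a task name to the datetime it was last completed.
    A task that was never done counts as overdue.
    """
    overdue = []
    for task, interval in INTERVALS.items():
        last = last_done.get(task)
        if last is None or now - last >= interval:
            overdue.append(task)
    return sorted(overdue)
```

Run at login or from a scheduled job, this turns “months without a backup” into a visible warning.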

#Problems with (automated) synchronization between backups

This issue again shows the advantage of network drives. In my experience, OneDrive on Windows can back up in real time, and you can use this to keep backups consistent across networked devices. If you don’t want to use a network drive service, you can build your own home NAS. As long as the storage devices are connected to the network, there are ready-made solutions for automated data synchronization. For example, Windows 10 comes with a “backup” feature that can synchronize at directory-level granularity as often as once every 10 minutes. Devices that cannot be networked, like my two mobile HDDs, I think of as the “tape warehouse” of commercial solutions: large, inexpensive, stable storage media (compared to SSDs) that cannot be read or written over the network. They can only be updated manually; the best you can do is plug them into a computer on schedule and leave the rest to a backup tool that only needs to be configured once.
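
For the offline drives, the “configure once” tool can be as simple as a one-way mirror script. This is a minimal sketch, not a replacement for a real backup program: the `mirror` function is hypothetical, it never deletes anything from the backup, and it assumes the drive is already mounted.

```python
import filecmp
import shutil
from pathlib import Path


def mirror(source_root, backup_root):
    """Copy new or changed files to backup_root; return the copied relative paths."""
    source_root, backup_root = Path(source_root), Path(backup_root)
    copied = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = backup_root / rel
        # shallow=True treats files with identical size and modification time
        # as equal, and falls back to comparing contents otherwise.
        if dst.exists() and filecmp.cmp(src, dst, shallow=True):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 preserves the modification time
        copied.append(rel.as_posix())
    return copied
```

Because unchanged files are skipped, repeated runs only cost as much as what changed since the last time the drive was plugged in.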

#Postscript

Originally, my backup frequency was once a month. A busy work schedule caused me to go months without backing up my work laptop. I’m still thankful for the accident, because false security is more dangerous than having no insurance. Another insight, based on SRE theory, is to back up before important operations; and if you cannot do the backup first, you should stop the important operation, so that the process “fails safe”.
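
The “fail safe” rule can be expressed as a small wrapper that refuses to start the operation when the backup step fails. This is only a sketch of the idea; `run_with_backup` and `BackupFailed` are hypothetical names, and `backup` and `operation` stand for whatever functions you actually use.

```python
class BackupFailed(RuntimeError):
    """Raised when the pre-operation backup could not be completed."""


def run_with_backup(backup, operation):
    """Run `operation` only after `backup` has succeeded."""
    try:
        backup()
    except Exception as exc:
        # Fail safe: stop here instead of running without a backup.
        raise BackupFailed("backup failed, operation not started") from exc
    return operation()
```

The important property is the default: when anything goes wrong with the backup, nothing else happens.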

Updated 2023-03-25