Backup Basics Part 2: Demystifying Backup Media

Date: Jun 30, 2006 By Ryan Faas.
In part two of a three-part series on backup basics, Ryan Faas continues to demystify backup options for new technicians and server/systems administrators. This time, the topic is choosing the media in which to store your backups. Find out the pros and cons of tape, hard drives, and RAID arrays; using network storage; and archiving using CDs or DVDs.

Part 1 of this Backup Basics series covered the common types of backups included with most backup tools—as well as ways to support workstation backups in a network environment. This article focuses on the types of storage devices that you can use to actually store your backups. Like the previous article, it is written with server backups in mind, but it does also apply to individual computer backups.

In theory, you can store backups on any type of disk or media. However, for practical purposes, network backups are typically stored on tape, on fixed media (hard drives or RAID arrays), or on network storage. You might also want to consider optical media (CDs or, increasingly, DVDs) for archived data or for data that does not change frequently, such as backups of workstation disk image files.

Tape

Digital tape is probably the oldest choice for backup media, and it is still a popular option today. Tape drives use magnetic tape (similar to that used by an audio or video cassette) to store data. The data is stored magnetically (as in a hard drive or a floppy), but it is laid down in sequential sectors along the length of the tape rather than across a platter or disc. Tape has a significantly lower cost per gigabyte of storage than hard drives. The sequential layout, however, means that the tape drive must rewind and fast forward to different segments of a tape to retrieve data, making tape much slower than a hard drive at locating and retrieving specific files.

The slowdown due to locating data on a tape isn’t typically an issue when writing backup files, because new data is usually appended to the end of the tape. The tape’s catalog file (which is typically stored separately from the tape itself) identifies which preexisting data has been superseded by the newer data as it is added. This leads to some wasted space on the tape (older data is usually ignored rather than overwritten or erased), and it makes tape backups dependent on the catalog file. If the catalog file is lost or damaged, however, most backup applications that support tape media can attempt to re-create it from the data stored on the tape.
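The relationship between an append-only tape and its catalog can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Tape` class, the `backup`, `restore`, and `rebuild_catalog` functions, and the file names are all hypothetical, and real backup applications use their own tape and catalog formats.

```python
from dataclasses import dataclass, field


@dataclass
class Tape:
    """A toy append-only tape: records can only be added at the end."""
    records: list = field(default_factory=list)  # (path, data) tuples in write order

    def append(self, path: str, data: bytes) -> int:
        """Write a record at the end of the tape and return its position."""
        self.records.append((path, data))
        return len(self.records) - 1


def backup(tape: Tape, catalog: dict, path: str, data: bytes) -> None:
    """Append a new copy and point the catalog at it; older copies stay on tape."""
    catalog[path] = tape.append(path, data)


def restore(tape: Tape, catalog: dict, path: str) -> bytes:
    """Return the most recent copy of a file, as located by the catalog."""
    return tape.records[catalog[path]][1]


def rebuild_catalog(tape: Tape) -> dict:
    """Re-create a lost catalog by scanning the tape from start to end."""
    catalog = {}
    for position, (path, _data) in enumerate(tape.records):
        catalog[path] = position  # later records supersede earlier ones
    return catalog


tape, catalog = Tape(), {}
backup(tape, catalog, "report.doc", b"draft 1")
backup(tape, catalog, "notes.txt", b"meeting notes")
backup(tape, catalog, "report.doc", b"draft 2")  # supersedes the record at position 0

# Superseded records still occupy space on the tape.
wasted = len(tape.records) - len(catalog)
```

In this model the first copy of report.doc is never erased; the catalog simply stops pointing at it. Rebuilding the catalog requires reading the whole tape in order, which hints at why a lost catalog is slow to recover from.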

Tape media cannot be interacted with directly from most operating systems (that is, if you put a tape into a tape drive, most computers won’t mount the tape as a volume allowing you to browse, open, or save files to it). A backup application is required to interact with the tape drive and the media, so your choice of drive type and media is limited to what is supported by your backup application of choice.

Tape is also physically more fragile than other media types. As with an audiocassette or VHS tape, if you open the cassette, you can pull the tape out and physically damage it. If exposed to significant heat, it will melt. And, like all magnetic media, the data on it can be damaged by exposure to strong electric or magnetic fields.

All this doesn’t mean that tape isn’t a viable option. It does, however, mean that you need to plan for the physical security of tape backups accordingly. It also means that maintaining multiple physical copies or sets of tape backups is often a good idea.

One of the big advantages of tape as a backup medium, beyond its cost, is that you can easily add tapes to a backup set as the amount of data you need to back up grows (for large organizations, there are tape autoloaders that can support backups that span multiple tapes). You can also create multiple sets of tape backup for fault tolerance. And tapes are extremely portable, making them a very easy choice for providing offsite storage of backup sets.

If you use tape media, you should plan to periodically recycle tapes and eventually replace them. Again, like audio and videotape, the quality of the data stored on a tape and eventually the quality of the tape itself will degrade over time. Because tapes store data digitally, this degradation isn’t as significant as with a VHS tape, but it can cause a problem over moderate to long periods of time (exact timeframes vary depending on the type and manufacturer of the tape). Recycling a tape is the process of erasing the tape and then using it for backup again. Periodically recycling tapes can improve their usable lifespan. You will also need to periodically clean the heads of the tape drive mechanism with a head-cleaning tape, both to ensure accurate function and to prolong the lifespan of individual tapes. Recommendations on head cleaning and tape recycling or replacement will vary depending on the type of drive and media used.

Tape drive mechanisms, and the tapes that they use, come in a variety of formats. Some mechanisms offer increased performance; others offer greater reliability. Some offer higher performance at a cost of wear and tear on the tapes and the drive heads. You can find out more about specific tape drive types and models here.

Fixed Media Drives

The term fixed media drive refers to hard drives used as backup devices, although it can also refer to drive arrays containing multiple hard drives. Fixed media drives offer significant speed improvements over tape as a backup medium. Also, because they are recognized directly by the operating system, they are compatible with all backup software, including homemade solutions (such as backup scripts) and manual copying of files.
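As an illustration of the kind of homemade backup script mentioned above, here is a minimal Python sketch that copies new or changed files to a backup drive mounted as an ordinary volume. The function name `copy_changed_files` and the directory names are hypothetical, and the demonstration uses a temporary directory to stand in for both the data and the backup volume. A production script would also need logging, error handling, and a policy for deleted files.

```python
import shutil
import tempfile
from pathlib import Path


def copy_changed_files(source: Path, destination: Path) -> list:
    """Copy files that are missing from, or newer than, the backup copy."""
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        dst = destination / src.relative_to(source)
        if dst.exists() and dst.stat().st_mtime_ns >= src.stat().st_mtime_ns:
            continue  # the backup copy is already up to date
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(dst)
    return copied


# Demonstration with temporary directories standing in for real volumes.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp, "data")
    destination = Path(tmp, "backup_volume")
    source.mkdir()
    (source / "notes.txt").write_text("hello")
    first_run = copy_changed_files(source, destination)   # copies notes.txt
    second_run = copy_changed_files(source, destination)  # nothing has changed
```

Because the backup drive appears as a normal file system, nothing here is specific to backup hardware; the same approach would not work with a tape drive, which needs a backup application to talk to it.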

It is common to create RAID arrays when using fixed media for backup. A RAID array is a set of multiple physical hard drives tied together to behave as a single drive. RAID arrays come in various types and levels and can be used to increase performance as well as redundancy. The simplest RAID array, known as a mirrored array, ties two hard drives together as mirrors of each other. Each file copied to the drive as it appears to the computer is actually copied to both physical hard drives, providing redundancy if one drive fails. Other options include creating striped arrays (in which multiple drives each contain only a piece of the file for faster drive performance) and striped arrays with parity (which stripe data across multiple drives, but include parity information that can be used to regenerate lost data if a single drive fails).
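To show how parity lets an array regenerate lost data, the following Python sketch stripes a small piece of data across three simulated drives, stores an XOR parity block on a fourth, and then rebuilds a missing block. It is a simplified illustration, not a description of any particular RAID implementation: the function names are hypothetical, the parity sits on one dedicated drive, and real arrays work on fixed-size blocks at the device level and often distribute parity across the drives.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_with_parity(data: bytes, data_drives: int):
    """Split data into one block per data drive, plus one parity block."""
    block_size = -(-len(data) // data_drives)  # ceiling division
    padded = data.ljust(block_size * data_drives, b"\0")
    blocks = [padded[i * block_size:(i + 1) * block_size] for i in range(data_drives)]
    return blocks + [xor_blocks(blocks)]


def rebuild_missing(drives):
    """Regenerate the single block marked None from the surviving blocks."""
    missing = drives.index(None)
    survivors = [block for block in drives if block is not None]
    rebuilt = list(drives)
    rebuilt[missing] = xor_blocks(survivors)
    return rebuilt


drives = stripe_with_parity(b"backup data!", data_drives=3)
failed = list(drives)
failed[1] = None  # simulate the loss of one physical drive
recovered = rebuild_missing(failed)
```

XOR-ing the surviving blocks (data and parity together) yields exactly the missing block, which is why this scheme tolerates the failure of any single drive but not two at once.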

Using a mirrored or striped-with-parity RAID array, you can easily create redundant backup drives. If one physical drive fails, the backup is still available (either in direct form on the mirrored drive or in a form that can be reconstructed). Thus, there are significant advantages in reliability over tape backups. Even without using an array structure, hard drives are generally far less fragile than tapes and don’t carry any special concerns about handling or storage.

Typically, when using fixed media drives (particularly in the case of a drive array), you will connect them directly to the servers that they will be backing up. Although this removes much concern about physically storing them, it does limit their use as an offsite backup option.

If you want to use fixed media drives as offsite backup devices, particularly in a small organization, you might be tempted to consider some of the smaller portable hard drives on the market. Several FireWire drives can be powered directly from the FireWire bus and have little additional bulk or weight over tape drives. You might want to consider these drives as your primary backup drive or as supplemental drives to provide offsite options, perhaps making offsite backups on a less frequent basis. The downside of this approach is that it might prove more difficult to integrate into an automated backup approach than other options.

Network Storage

Network storage covers a number of possibilities. Typically, it refers to designating one file server as a backup server, onto which backups from other servers (and potentially workstations) are stored. This server should be configured with a great deal of fault tolerance in its design (typically using RAID arrays for storage). Depending on your network’s size and budget constraints, it should also have a second server that serves as a backup server for the backup server (or at least have a separate backup strategy utilizing fixed media or tape). If your network spans multiple locations, you might want to consider building a backup server at each location and mirroring or synchronizing the contents of the servers to each other, which provides both fault tolerance and a method of maintaining offsite backups. In the case of particularly large and active networks, you might also find that storage area network technology plays an important role in your backup planning (although discussion of such technology is beyond the scope of this series).

A backup server can function as simply a server hosting a share point to be used for storing backups or can be implemented using a client-server backup tool. In the first case, backup software running on the servers (or workstations) using the backup server simply stores files on it, accessing it like any other mounted disk or share point. In the second, the backup service running on the backup server will actively query and interact with a backup client tool installed on the server or workstation to be backed up. This removes some of the overhead of managing the backup process from the server being backed up and it enables centralized management of the backup process for multiple servers and/or workstations (as discussed in Part 1 of this series).

If you are in a small or medium-sized organization, you might be able to use a network attached storage device, such as a Snap drive, as a network storage solution instead of a full-blown server. These devices, which are essentially a hard drive with a very basic operating system that enables them to host share points using varying protocols, often cost significantly less than implementing a full server. The trade-off is that, unlike a traditional server, they offer nothing beyond basic network storage space, although if you are simply looking for a place to store backups, this might not be a concern. These devices typically cannot be formatted as a part of a RAID array.

Finally, network storage can consist of using an outside company for storage over a secure Internet connection. This is often appealing if your company or school is big enough to have high-speed Internet connectivity and to require reliable offsite backups, but doesn’t have the resources to maintain adequate backups on its own. Several companies specialize in such services, and many ISPs can also simply provide storage space. This option solves issues such as maintaining offsite or multisite backups but raises some security concerns to be worked out with the company you choose. It also offers you the option of maintaining your own onsite backup server and using the outside service for a second-tier backup.

Whichever method you select for network backup, remember that you are likely to be transferring very large amounts of data over your network as it is backed up. You will want to automate the backup processes so they run when network use is minimal. In some cases, complete and incremental backups might be too much for your network to handle even overnight; if so, you will likely need to plan your backup strategy using mostly differential backups.
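A rough calculation can tell you whether a given backup is likely to fit into an overnight window. The Python sketch below is a back-of-the-envelope estimate only: the function names are hypothetical, the figures in the examples are made up, and the default `efficiency` of 0.7 is an assumed allowance for protocol overhead and competing traffic that you should replace with measurements from your own network.

```python
def transfer_hours(data_gb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move data_gb gigabytes over a link_mbps link."""
    megabits = data_gb * 8 * 1000  # 1 GB = 8,000 megabits in decimal units
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


def fits_window(data_gb: float, link_mbps: float, window_hours: float,
                efficiency: float = 0.7) -> bool:
    """Return True if the estimated transfer finishes within the backup window."""
    return transfer_hours(data_gb, link_mbps, efficiency) <= window_hours


# A hypothetical 500 GB complete backup over a 100 Mbps network.
full_backup_hours = transfer_hours(500, 100)

# A hypothetical 20 GB backup of changed files only, over the same network.
changed_files_hours = transfer_hours(20, 100)
```

With these assumed numbers, the complete backup needs close to 16 hours and would not fit an 8-hour overnight window, while the backup of changed files finishes in well under an hour.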

Optical Media

Optical media are probably the most durable backup media. Recordable CDs and DVDs are sturdier than tape or hard drives. The data stored on them is not affected by electrical or magnetic disruptions, and they have a higher tolerance for heat. Estimates for the life of such media can extend into the decades with proper care.

That does not mean that optical media have no limitations. They are slower than fixed media (and, depending on the specific technology, also slower than tape media). They also have distinct capacity limits, which are much lower than those of fixed media and tape. And although rewritable CDs and DVDs exist, they are not as readily reusable as fixed media. These limitations mean that for server and network backups, optical media are generally not considered viable options. However, optical media are well suited to data archiving. If you are removing data from your network but still need to retain a permanent record of it (data pertaining to graduating student work or archive copies of workstation image files, for example), placing that data on a CD or DVD, where it can be easily accessed if needed and where degradation is not a concern, is an excellent option.
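The capacity limits mentioned above are easy to quantify when planning an archive. The short Python sketch below estimates how many discs an archive would span, assuming the nominal capacities of a 700 MB CD-R and a 4.7 GB single-layer DVD; the function name and the 20 GB archive size are hypothetical, and usable capacity is somewhat lower in practice once file system overhead is counted.

```python
CD_MB = 700     # nominal CD-R capacity in megabytes
DVD_MB = 4700   # nominal single-layer DVD capacity in megabytes


def discs_needed(archive_mb: int, disc_mb: int) -> int:
    """Return how many discs an archive spans, rounding up to whole discs."""
    return -(-archive_mb // disc_mb)  # ceiling division on integers


archive_mb = 20_000  # a hypothetical 20 GB archive
cds = discs_needed(archive_mb, CD_MB)
dvds = discs_needed(archive_mb, DVD_MB)
```

For this example the archive spans 29 CDs or 5 DVDs, which illustrates why optical media suit occasional archiving far better than routine server backups.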