Linux Systems Administrators - Backups

Systems Administration Chapter 12: Backups
Page 292
Chapter 12
Backups
Like most of those who study history, he (Napoleon III) learned from the mistakes of
the past how to make new ones.
A.J.P. Taylor.
Introduction
This is THE MOST IMPORTANT responsibility of the Systems Administrator.
Backups MUST be made of all the data on the system. It is inevitable that equipment
will fail and that users will "accidentally" delete files. There should be a safety net so
that important information can be recovered.
It isn't just users who accidentally delete files.
A friend of mine who was once the Systems Administrator of a UNIX machine (and
shall remain nameless, but is now a respected Academic at CQU), committed one of
the great no-no's of UNIX Administration.
Early on in his career, he was carefully removing numerous old files for some
obscure reason when he entered commands resembling the following (he was
logged in as root at the time):

cd / usr/user/panea     (notice the mistake)
rm -r *
The first command contained a typing mistake (the extra space) which meant that
instead of changing to the directory /usr/user/panea, he was now in the /
directory.
The second command says: delete everything in the current directory and any
directories below it. Result: a great many files removed.
The moral of this story is that everyone makes mistakes. Root users, normal users,
hardware and software all make mistakes, break down or have faults. This means you
must keep backups of any system.
Other resources
Other resources which discuss backups and related information include:
· HOW-TOs: the Linux ADSM Mini-Howto
· The LAME guide's chapter on backup and restore procedures
· The Linux Systems Administrators Guide's chapter (10) on backups
Backups aren't enough
Making sure that backups are made at your site isn't enough. Backups aren't any good
if you can't restore the information they contain. You must have some sort of plan to
recover the data. That plan should take into account all sorts of scenarios. Recovery
planning is not covered to any great extent in this text. That doesn't mean it isn't
important.
Characteristics of a good backup strategy
Backup strategies change from site to site. What works on one machine may not be
possible on another. There is no standard backup strategy. There are however a
number of characteristics that need to be considered including:
· ease of use
· time efficiency
· ease of restoring files
· ability to verify backups
· tolerance of faulty media
· portability to a range of machines
Ease of use
If backups are easy to use, you will use them. AUTOMATE!! It should be as easy as
placing a tape in a drive, typing a command and waiting for it to complete. In fact,
you probably shouldn't even have to enter the command; it should be run
automatically.
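As a minimal sketch of such an automated backup, consider the following shell script. All the paths are illustrative assumptions: on a real system BACKUP_SRC would be something like /home and BACKUP_DEV a tape drive such as /dev/st0.

```shell
#!/bin/sh
# Hypothetical nightly backup sketch.  Every path below is an
# assumption for illustration, not a site default.
BACKUP_SRC=/tmp/demo_home          # stand-in for /home
BACKUP_DEV=/tmp/demo_backup.tar    # stand-in for a tape such as /dev/st0
LOGFILE=/tmp/demo_backup.log

# Create a little demo data so the sketch is self-contained.
mkdir -p "$BACKUP_SRC"
echo "important data" > "$BACKUP_SRC/report.txt"

# tar is the transport here; -c creates the archive, -f names the "tape".
if tar -cf "$BACKUP_DEV" -C /tmp demo_home 2>>"$LOGFILE"
then
    echo "backup of $BACKUP_SRC completed" >> "$LOGFILE"
else
    echo "backup of $BACKUP_SRC FAILED" >> "$LOGFILE"
fi
```

Run from cron, a script like this needs no operator at all; the log file is what you read the next morning.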
When backups are too much work
At many large computing sites, operators are employed to perform low-level tasks like
looking after backups. Looking after backups generally involves obtaining a blank
tape, labelling it, placing it in the tape drive, waiting for the information to be stored
on the tape and then storing it away.
A true story that is told by an experienced Systems Administrator is about an operator
who thought backups took too long to perform. To solve this problem the operator
decided backups finished much quicker if you didn't bother putting the tape in the tape
drive. You just labelled the blank tape and placed it in storage.
This is all quite alright as long as you don't want to retrieve anything from the
backups.
Time efficiency
Aim for a balance that minimises the amount of operator, real and CPU time taken to
carry out a backup and to restore files. The typical trade-off is that a quick backup
implies a longer time to restore files. Keep in mind that you will, in general, perform
more backups than restores.
On some large sites, particular backup strategies fail because there aren’t enough
hours in a day. Backups scheduled to occur every 24 hours fail because the previous
backup still hasn't finished. This obviously occurs at sites which have large disks.
Ease of restoring files
The reason for doing backups is so you can get information back. You will have to be
able to restore information ranging from a single file to an entire file system. You
need to know on which media the required file is and you need to be able to get to it
quickly.
This means that you will need to maintain a table of contents and label media
carefully.

Ability to verify backups
YOU MUST VERIFY YOUR BACKUPS. The safest method is that once the backup
is complete, read the information back from the media and compare it with the
information stored on the disk. If it isn’t the same then the backup is not correct.
Well, that is a nice theory, but it rarely works in practice. This method is only valid if
the information on the disk hasn't changed since the backup started, which means the
file system cannot be used by users while the backup or the verification is being
performed. Keeping a file system unused for that amount of time is often not an
option.
Other quicker methods include:
· restoring a random selection of files from the start, middle and end of the backup.
If these particular files are retrieved correctly, the assumption is that all of the files
are valid.
· creating a table of contents during the backup; afterwards, reading the contents of
the tape and comparing the two.

These methods also do not always work. Under some conditions and with some
commands, the two methods will not guarantee that your backup is correct.
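The table-of-contents approach can be sketched with tar as follows; all file names are illustrative assumptions:

```shell
#!/bin/sh
# Sketch of verifying a tar backup via a table of contents.
# Paths are illustrative assumptions.
SRC=/tmp/verify_src
ARCHIVE=/tmp/verify.tar

mkdir -p "$SRC"
echo "payload" > "$SRC/data.txt"

# 1. Make the backup, recording a table of contents as we go (-v).
tar -cvf "$ARCHIVE" -C /tmp verify_src > /tmp/toc.backup

# 2. Later, read the contents back off the "tape".
tar -tf "$ARCHIVE" > /tmp/toc.tape

# 3. Compare the two lists; any difference marks the backup as suspect.
if diff /tmp/toc.backup /tmp/toc.tape >/dev/null
then
    echo "table of contents matches"
else
    echo "backup suspect: contents differ"
fi
```

Note that matching names prove only that the files are present, not that their contents are intact; GNU tar's -d (--diff) option goes further and compares the archive against the file system itself.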
Tolerance of faulty media
A backup strategy should be able to handle:
· faults in the media
· physical dangers

There are situations where it is important that:
· there exist at least two full backup copies of the system
· at least one of those copies is stored at another site

Consider the following situation:
A site has one set of full backups stored on tapes. They are currently performing
another full backup of the system onto the same tapes. What happens if the backup
system, happily churning away, gets about halfway through and then crashes (the
power goes off, the tape drive fails, etc.)? This could result in both the tape and the
disk drive being corrupted. Always maintain duplicate copies of full backups.
An example of the importance of storing backups off-site was the Pauls ice-cream
factory in Brisbane. The factory is located right on the riverbank, and in the early
1970s Brisbane suffered a major flood. The Pauls computer room was in the basement
of the factory and was completely washed out. All the backups were kept in the
computer room.
Portability to a range of platforms
There may be situations where the data stored on backups must be retrieved onto a
different type of machine. The ability for backups to be portable to different types of
machine is often an important characteristic.
For example:
The computer currently being used by a company is the last in its line. The
manufacturer is bankrupt and no one else uses the machine. Due to unforeseen
circumstances, the machine burns to the ground. The Systems Administrator has
recent backups available and they contain essential data for this business. How are
the backups to be used to reconstruct the system?
Considerations for a backup strategy
Apart from the above characteristics, factors that may affect the type of backup
strategy implemented will include:
· the available commands
The characteristics of the available commands limit what can be done.
· available hardware
The capacity of the backup media to be used also limits how backups are
performed. In particular, how much information can the media hold?
· maximum expected size of file systems
The amount of information required to be backed up, and whether or not the
combination of the available software and hardware can handle it. A suggestion is
that individual file systems should never contain more information than can fit
easily onto the backup media.
· importance of the data
The more important the data is, the more important that it be backed up regularly
and safely.
· level of data modification
The more data being created and modified, the more often it should be backed up.
For example, the directories /bin and /usr/bin will hardly ever change, so they
rarely need backing up. On the other hand, directories under /home are likely to
change drastically every day.
The components of backups
There are basically three components to a backup strategy:
· scheduler
Decides when the backup is performed.
· transport
The command that moves the backup from the disks to the backup media.
· media
The actual physical device on which the backup is stored.
Scheduler
The scheduler is the component that decides when backups should be performed and
how much should be backed up. The scheduler could be the root user or a program,
usually cron (discussed in a later chapter).
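If cron is the scheduler, handing a backup over to it amounts to a crontab entry along the following lines; the time and the script name are illustrative assumptions:

```
# minute hour day-of-month month day-of-week  command
# Run a hypothetical backup script at 2:00am every day.
0 2 * * *  /usr/local/sbin/do_backup
```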
The amount of information that the scheduler backs up can have the following
categories:
· full backups
All the information on the entire system is backed up. This is the safest type but
also the most expensive in machine and operator time and the amount of media
required.
· partial backups
Only the busier and more important file systems are backed up. One example of a
partial backup might include configuration files (like /etc/passwd), user home
directories and the mail and news spool directories. The reasoning is that these
files change the most and are the most important to keep track of. In most
instances, this can still take substantial resources to perform.
· incremental backups
Only those files that have been modified since the last backup are backed up. This
method requires fewer resources, but a large number of incremental backups makes
it more difficult to locate the particular version of a file you may want.
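One simple way to sketch an incremental backup with standard tools is to keep a timestamp file and archive only what has changed since it was touched; the timestamp file and paths below are illustrative assumptions:

```shell
#!/bin/sh
# Sketch of an incremental backup using find(1) and tar(1).
# Paths are illustrative assumptions.
SRC=/tmp/incr_src
STAMP=/tmp/incr.stamp
ARCHIVE=/tmp/incr.tar

mkdir -p "$SRC"
echo "old" > "$SRC/old.txt"

# Pretend a full backup ran now: record the time.
touch "$STAMP"
sleep 1

# A file modified after the last backup...
echo "new" > "$SRC/new.txt"

# ...is the only thing the incremental backup needs to pick up.
find "$SRC" -type f -newer "$STAMP" -print > /tmp/incr.list
tar -cf "$ARCHIVE" -T /tmp/incr.list
```

GNU tar also has a built-in mechanism for this (the --listed-incremental option), which keeps the state file for you.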
Transport
The transport is a program that is responsible for placing the backed-up data onto the
media. There are quite a number of different programs that can be used as transports.
Some of the standard UNIX transport programs are examined later in this chapter.
There are two basic mechanisms that are used by transport programs to obtain the
information from the disk:
· image

· through the file system

Image transports
An image transport program bypasses the file system and reads the information
straight off the disk using the raw device file. To do this, the transport program needs
to understand how the information is structured on the disk. This means that transport
programs are linked very closely to exact file systems, since different file systems
structure information differently.
Once read off the disk, the data is written byte by byte onto tape. This method
generally means that backups are quicker than the "file by file" method. However,
restoration of individual files generally takes much more time. Transport programs
that use this method include dd, volcopy and dump.
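As a sketch of the idea, dd simply copies raw bytes and neither knows nor cares what file system they belong to. A small file stands in for the raw device here; on a real system the input would be something like /dev/sda1 and the output a tape device such as /dev/st0 (both assumptions for illustration):

```shell
#!/bin/sh
# dd reads raw blocks below the file system layer.  A 16 KB file
# stands in for the raw disk device so the sketch is self-contained.
dd if=/dev/zero of=/tmp/fake_disk bs=1024 count=16 2>/dev/null

# "Back up" the device image, block for block.
dd if=/tmp/fake_disk of=/tmp/disk_image bs=1024 2>/dev/null

# The copy is byte-for-byte identical to the original.
cmp /tmp/fake_disk /tmp/disk_image && echo "images identical"
```

Restoring one file from such an image means restoring the whole image first, which is why single-file restores are slow with image transports.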
File by file
Commands performing backups using this method use the system calls provided by
the operating system to read the information. Since almost any UNIX system uses the
same system calls, a transport program that uses the file by file method (and the data
it saves) is more portable.
File by file backups generally take more time, but it is generally easier to restore
individual files. Commands that use this method include tar and cpio.
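The ease of single-file restoration is visible in tar: one file can be pulled out of the archive by name, without reconstructing anything else. The paths below are illustrative assumptions:

```shell
#!/bin/sh
# Sketch: file-by-file transports make single-file restores easy.
# Paths are illustrative assumptions.
mkdir -p /tmp/ff_src
echo "keep me" > /tmp/ff_src/notes.txt
echo "other"   > /tmp/ff_src/other.txt

# Back up the directory through the file system with tar.
tar -cf /tmp/ff.tar -C /tmp ff_src

# Simulate an accidental deletion, then restore just that one file.
rm /tmp/ff_src/notes.txt
tar -xf /tmp/ff.tar -C /tmp ff_src/notes.txt
cat /tmp/ff_src/notes.txt
```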
Backing up FAT, NTFS and ext3 file systems
If you are like most people using this text, then chances are that your Linux computer
contains both FAT or NTFS and ext3 file systems. The FAT/NTFS file systems will be
used by the version of Windows you were originally running, while the ext3 file
systems will be those used by Linux.
Of course, being the trainee computing professional you are, backups of your personal
computer are performed regularly. It would probably be useful to be able to back up
both the FAT/NTFS and ext3 file systems at the same time, without having to
switch operating systems. Remember that ext3 is backwards-compatible with ext2, so
any programs or utilities that work with ext2 will continue to work with ext3.
Well, doing this from Windows isn't going to work: Windows still doesn't read the
ext2/ext3 file systems. (Actually, with the addition of extra file system drivers,
Windows can read and write ext2/ext3. However, these drivers are quite young, and
further development is required before they are robust enough to trust your backups
to.) So you will have to do it from Linux. It is also worth noting that Linux's support
for NTFS is pretty weak: currently, NTFS partitions can only be mounted read-only
on Linux, and most distributions do not include support as standard. It's also
interesting to note that Linux does not take heed of NTFS permissions either.
Which type of transport do you use for this: image or file by file?
Well, here's a little excerpt from the manual page for the dump command, one of the
image transports available on Linux:
It might be considered a bug that this version of dump can only
handle ext2 filesystems. Specifically, it does not work with FAT
filesystems.
If you think about it, this shortcoming is kind of obvious. The dump command does
not use the kernel file system code; it is an image transport. This means it must know
everything about the file system it is going to back up: how directories are structured,
how the data blocks for files are stored on the disk, how file metadata (for example
permissions and file owners) is stored, and much more.
The people who wrote dump built this information into the command. They didn't
include any information about the FAT or NTFS file systems, so dump can't back up
these file systems.
File by file transports, on the other hand, can quite happily back up any file system
which you can mount on a Linux machine. In this situation the virtual file system
takes care of all the differences, and the file-by-file transport is none the wiser.
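As a sketch: once a FAT partition has been mounted (e.g. with mount -t vfat /dev/hda1 /mnt/dos; the device and mount point are assumptions), tar sees it as an ordinary directory. A plain directory stands in for the mount point below so the sketch is self-contained:

```shell
#!/bin/sh
# Sketch: a file-by-file transport backing up a "mounted FAT partition".
# On a real system the mount point would come from something like:
#     mount -t vfat /dev/hda1 /mnt/dos
# A plain directory stands in for it here.
MOUNTPOINT=/tmp/fake_dos
mkdir -p "$MOUNTPOINT"
echo "dos data" > "$MOUNTPOINT/AUTOEXEC.BAT"

# tar goes through the VFS, so the on-disk format is irrelevant to it.
tar -cf /tmp/dos.tar -C /tmp fake_dos
tar -tf /tmp/dos.tar
```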
