Data Archiving
Data archiving is the process of chronologically storing data, based on company's policies. There are many attributes which may be considered when creating an archive policy, but usually, the age of files is the main criteria. Simply putting aside the tapes containing the last backup does not mean you have an archive. An archive is a system, and that system must be readily available to users in a "near-line" mode, and must be searchable, either by users, or for Compliance reasons. These conditions obviously are not met by those tapes.
Data archiving can be categorized by target as follows:
- - File system archiving or file archiving (or file-based archiving), when the archiving targets are for example Network shares (users, departmental).
- - Email Archiving, when archiving the messages stored in the email servers databases.
A simple archiving system will archive older data to a single storage tier designated to host the archive. An enterprise data archiving system will store archived data on different storage tiers, according to Data Retention policies.
The archiving system may also contain an online archive, and an offline archive. The online archive can also be sub-divided into a faster (disk based) archive, containing newer data, and a slower, tape library based archive, containing older data. The tapes containing very old data are removed form the tape library and stored offsite, into the vault containing the offline archive.
Data archiving allows for much better management of disk space. Older items are moved from the more expensive primary storage and replaced by very small size stubs (4k for example), considerable space being saved on the primary storage. The items are moved to less expensive, SATA disks storage, and then to tape. Thus, data archiving is lowering the costs of storage.
Besides providing better disk storage management, data archiving is also necessary for regulatory compliance. The data archiving system must allow for quick search and retrieval of archived items for electronic discovery.