This post appeared in slightly different form in the Connecticut Digital Archive blog on February 2, 2013.
People new to digital archives (and, more frequently, funding stakeholders and certain IT managers) often ask about the difference between preservation and backup. The question goes something like this: “If I have backups of my files, and can restore them if something happens to my computer (or CD, or portable hard drive), then isn’t my data preserved?”
It is a good question, but it is often answered either too simply (“Backup is NOT preservation”) or with an explanation so detailed that only an archivist can understand it. Here we attempt to explain digital preservation in everyday terms, or at least as everyday as we can get and still be archivists.
Digital preservation seeks to guarantee the integrity of, and long-term access to, digital information resources. Preserving Digital Information, the 1996 report of the Task Force on Archiving Digital Information, identified five attributes of what it called digital integrity. Integrity was defined as the attributes that give digital resources a distinct identity. These attributes are:
- Content
- Fixity
- Reference
- Provenance
- Context
These five attributes became the foundation for what developed into digital preservation. Paul Conway later very succinctly explained these attributes as “formatted and structured bits (content) ‘frozen’ as discrete objects (fixity) in a predictable location (reference) with a documented chain of custody (provenance) and linkages to related objects (context).”
But while these aspects together may ensure a digital resource’s integrity, they do not necessarily ensure its preservation. Digital preservation comes from the addition of time and preservation actions to the five attributes of integrity.
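To make one of these attributes concrete, fixity is commonly checked by recording a checksum when an object enters the repository and recomputing it later. The sketch below is only an illustration, not a method described in the report: the function names `sha256_of` and `fixity_intact` are hypothetical, and SHA-256 is an assumed choice of algorithm.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read 64 KiB at a time so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_intact(path: Path, recorded_digest: str) -> bool:
    """Compare the file's current digest with the one recorded earlier."""
    return sha256_of(path) == recorded_digest
```

A check like this only reports whether the bits have changed. It says nothing about whether the file can still be opened, understood, or found, which is why fixity is one attribute among five and why integrity alone is not preservation.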
Today the term “Digital Curation” is commonly used to identify the activities surrounding maintaining digital information resources over time. These activities take place within a context of stewardship that makes appraisal decisions based on judgments about the value of information resources over time. Data curators or modern archivists, like their analog predecessors, continually review the collections in their care and make decisions about what to do with them in terms of access, description, reformatting, disposition and the like.
The Digital Curation Centre’s Lifecycle Model illustrates the cyclical concepts and activities involved in digital curation.
According to the DCC, data archiving (or digital curation) both preserves and adds value to data. For example:
- Selection decisions affect which data are kept in the long term, and therefore which data are accessible to users
- Ingest and preservation action can lead to the addition of administrative metadata which describes the curation chain
- Data can be transformed into new formats
- Data are placed in a wider context in terms of their long-term management through, for example, the addition of annotations or developing relationships with other datasets
(See more at: http://www.dcc.ac.uk/resources/curation-lifecycle-model)
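As a rough illustration of the second and third points, the sketch below shows how a format migration might add administrative metadata describing the curation chain. It is a minimal example under stated assumptions: the class names `CurationEvent` and `DigitalObject`, their fields, and the sample identifier and formats are all hypothetical and do not come from the DCC model or any metadata standard.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CurationEvent:
    """One entry in an object's curation chain (hypothetical structure)."""
    action: str
    agent: str
    detail: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


@dataclass
class DigitalObject:
    """A digital object plus the administrative metadata it accumulates."""
    identifier: str
    file_format: str
    events: list = field(default_factory=list)

    def record(self, action: str, agent: str, detail: str) -> CurationEvent:
        """Append an event describing what was done, by whom, and when."""
        event = CurationEvent(action, agent, detail)
        self.events.append(event)
        return event

    def migrate(self, new_format: str, agent: str) -> CurationEvent:
        """Change the recorded format and log the migration as an event."""
        old_format = self.file_format
        self.file_format = new_format
        return self.record("format migration", agent, f"{old_format} -> {new_format}")


# Illustrative usage with made-up values.
obj = DigitalObject("example-0001", "WordPerfect 5.1")
obj.record("ingest", "archivist", "accepted into repository")
obj.migrate("PDF/A-1b", "archivist")
```

After these calls the object carries a record of both its ingest and its migration, which is the kind of information a plain backup copy would not capture.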
While backup strategies are important to ensure the preservation of the bitstreams and can ensure some or all of the five facets of integrity, digital curation adds value and makes the preserved data usable and useful beyond their original purposes. Data backup can ensure recovery of digital information resources in forms and structures consistent with their original creation; digital curation supports the preservation and reuse of those resources for future uses. Backup, disaster recovery, and digital curation are mutually supporting activities, and all are essential in a well-run digital repository.