The Yale University Library is collaborating with Preservica, a world leader in digital preservation technology, to enable the preservation of nearly one petabyte* of its unique and valuable digital content. This includes both ‘born-digital’ content such as emails, websites, word-processing documents and spreadsheets, and digitized versions of original physical materials.
“Our goal is to create a sustainable infrastructure to ensure long-term access to our digital collections,” commented Euan Cochrane, the Yale Library’s Digital Preservation Manager. “We have nearly a petabyte of highly unique and valuable digital content, which we anticipate will grow by tens of terabytes next year and at an exponential rate over coming years. Beyond our existing preservation efforts, we needed to get a digital preservation system in place to handle our plans to scale.”
Preservica was chosen for its extensible architecture, which allows for scaling and for connecting with other systems as technology evolves. Its ability to migrate content between file formats after ingest, and its ease of storage management, were also important factors. A further benefit is the ability to prove the provenance and authenticity of the original digital content: once an item enters the system, its entire provenance and history can be tracked.
To begin using Preservica, the Yale Library is launching a pilot ‘ingest process’ that will load a 60-terabyte collection of master files through an automated workflow. This will be followed by digitized audio-visual material and born-digital materials from the Beinecke Rare Book and Manuscript Library and the Manuscripts and Archives Department. The born-digital collections include email correspondence, drafts of poetry and prose, drafts for Sesame Street skits, digital photographs, and many other items of significance to researchers.
Metadata will be synchronized automatically between Preservica and ArchivesSpace, a web-based archives information management system already in use at the library, providing a single coherent view of both physical and digital artifacts.
Cochrane is positive about the project’s future. “Having Preservica in place is really exciting because we are now able to widen our scope to include more complex objects and entire new archives, and we can ensure that our unique digital collections are accessible and useable for future generations.”
Preservica CEO Jon Tilbury is enthusiastic about the collaboration. “This is a great opportunity to work with a world-renowned educational institution and to preserve objects of significant historical importance. The Yale Library’s Digital Preservation Services team has always been at the forefront of technology development and application for digital preservation, and we are delighted to be part of this dedicated program.”
*One petabyte is equal to 1,000,000 gigabytes, or around 1,250,000 CDs.
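
For readers who want to check the footnote's arithmetic, here is a minimal sketch in Python. It assumes decimal (SI) units, where one gigabyte is 10^9 bytes. The footnote does not say which disc capacity it uses, so the 700 MB and 800 MB capacities below are assumptions, and the function names are illustrative only.

```python
# Decimal (SI) units, matching the footnote's 1,000,000 GB figure.
BYTES_PER_GB = 10**9
BYTES_PER_PB = 10**15

# Assumed disc capacities; the footnote does not state which one it uses.
CD_700_MB = 700 * 10**6
CD_800_MB = 800 * 10**6


def gigabytes_per_petabyte() -> int:
    """Number of decimal gigabytes in one decimal petabyte."""
    return BYTES_PER_PB // BYTES_PER_GB


def discs_needed(total_bytes: int, disc_bytes: int) -> int:
    """Whole discs required to hold total_bytes, rounding up."""
    return -(-total_bytes // disc_bytes)


print(gigabytes_per_petabyte())                # 1000000
print(discs_needed(BYTES_PER_PB, CD_800_MB))   # 1250000
print(discs_needed(BYTES_PER_PB, CD_700_MB))   # 1428572
```

Under these assumptions, the figure of around 1,250,000 CDs corresponds to discs of about 800 MB each. With the more common 700 MB disc, the count would be closer to 1.43 million.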