Thursday, September 30, 2010

Designing Storage Architectures for Digital Preservation - Day One, Part Two

The second session of the first day featured technologists from higher education who either operate large archives, or who build systems for operating an archive.

Cory Snavely (University of Michigan, HathiTrust) gave a brief overview of HathiTrust, a repository of digital content shared by many of the Big Ten schools and a few other partners.

Brad McLean (DuraSpace) reported on DuraCloud and results from the initial pilot partners.  (ICPSR is part of the current pilot, but was not a member of the original, smaller pilot program.)  He noted these concerns about using the cloud for digital preservation:
  1. Some services (such as Amazon's S3) have limits on the size of objects (files)
  2. Bandwidth limits on a per-server basis can impede function and performance
  3. Large files are troublesome
  4. Performance across the cloud can vary widely
  5. (File) naming matters; some storage services limit the type of characters in a name
Brad reiterated a comment made by several others:  A standard for checksums would be good to have.
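
To make two of these points concrete (large files, and the value of agreeing on a checksum), here is a minimal sketch in Python. It is only an illustration and not anything presented at the meeting: the choice of SHA-256, the 1 MB chunk size, the function names, and the allow-list of "safe" characters are all assumptions. Real storage services publish their own naming rules and checksum conventions.

```python
import hashlib
import re

# Assumption: a conservative allow-list (letters, digits, '.', '_', '-').
# Individual storage services define their own rules; check theirs.
PORTABLE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks.

    Reading in chunks keeps memory use flat, which matters for large files.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_portable_name(name):
    """Return True if the name is non-empty and uses only allow-listed characters."""
    return PORTABLE_NAME.fullmatch(name) is not None
```

If both the depositor and the storage provider compute the same kind of digest, a copy can be verified after transfer by comparing two short strings, regardless of where the file ends up.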

Matt Schulz (MetaArchive) updated us on the MetaArchive, including a current partnership with Chronopolis.

David Minor (San Diego Supercomputer Center) updated us on the Chronopolis project.  David noted that SDSC is reimplementing its data center, and described three levels of storage in its future architecture:
  1. High-performance storage for scratch content
  2. Traditional filesystem storage
  3. Archival storage
The follow-on discussion covered the right type of interface for accessing content in archival storage (POSIX, RESTful, object-oriented, etc.) and the trade-off between using long-lived media and systems for digital preservation versus taking advantage of advances in technology by using short-lived media and systems.  David Rosenthal also reminded everyone that we "... cannot test large systems for zero media failures."

I'll write up my notes from Day Two early next week.
