Tuesday, September 8, 2009

ICPSR: Then and Now: Archival Storage

Archival Storage, the OAIS function responsible for storing and retrieving content, was built on DLT IV tapes at ICPSR in 2002. Files that we wanted to keep indefinitely were moved to a pair of DLT tapes; one copy was retained at ICPSR, and the other was stored at an off-site location in Ann Arbor, Michigan.

Unfortunately, we also had a large number of tapes in older formats: IBM 3480 cartridge and 9-track. Again there were two copies, but in this case both were off-site.

As you might expect with an off-line system such as this, it was very expensive to retrieve any item from Archival Storage. If the requestor was a little fuzzy about the exact item of interest, that added to the cost. There was no good way to browse the holdings, and retrieval time was measured in days, not minutes.

Today we've moved the master copy of each file from tape to disk, and we replicate each file off-site using a variety of techniques, such as rsync and the Storage Resource Broker Srsync utility. We still keep a copy on tape, but instead of DLT IV we're using LTO-3 tape, which is ten times as dense. This gives us more copies in more locations, and a high degree of confidence that the copies are synchronized.
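
One way to back up that confidence is to compare checksums between the master copy and a replica. The sketch below is only an illustration and not a description of ICPSR's actual tooling: the function names (file_digest, tree_digests, compare_copies), the choice of SHA-256, and the assumption that both copies are reachable as local directory trees are all hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_copies(master_root, replica_root):
    """Report files missing from the replica, extra in it, or differing."""
    master = tree_digests(master_root)
    replica = tree_digests(replica_root)
    return {
        "missing": sorted(set(master) - set(replica)),
        "extra": sorted(set(replica) - set(master)),
        "changed": sorted(
            name
            for name in set(master) & set(replica)
            if master[name] != replica[name]
        ),
    }
```

Calling compare_copies on two hypothetical mount points, say compare_copies("/archive/master", "/archive/replica"), returns three lists; if all three are empty, the two trees hold the same files with the same contents.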

The next step in Archival Storage is a move away from file-based solutions to object-based solutions. We've been evaluating Fedora as a possible storage platform for social science datasets and documentation, and the results are very promising so far.
