Friday, July 20, 2012
Amazon's loss is SDSC's gain
Unfortunately, we have bonded these particular volumes together to form a virtual RAID. And this RAID is used as a single multi-TB filesystem, which is much bigger than fsck can handle. So we are kind of stuck.
One option would be to newfs the big filesystem and move the several TBs of content back into AWS, but that would be very slow. And if there is another power outage......
So instead we called up our pals at Duracloud and asked them if they could help us enable replication of our content to a second provider. (The first provider is - ironically - AWS. But their S3 service, not their EC2/EBS service.) They said they'd be happy to help, and, in fact, they will start replicating our content later this same week. (Now that's service!)
The new copy of our content will now be replicated in...... SDSC's storage cloud. This really brings us full circle at ICPSR since our very first off-site archival copy was stored at SDSC. Back then (like in 2008) it was stored in their Storage Resource Broker (SRB) system, and we used a set of command-line utilities to sync content between ICPSR and SDSC.
The SRB stuff was kind of clunky for us, especially given our large number of files, our sometimes large files (>2GB), and our sometimes poorly named files (e.g., control characters in file names). Our content then moved into Chronopolis from SRB, and then at the end of the demonstration project, we asked SDSC to dispose of the copy they had. But now it is coming back......