Technology at ICPSR: Fedora at ICPSR

We've been spending a lot of time this month at ICPSR working with the Fedora digital repository software. A lot of people appear to have downloaded it (the Fedora Commons web site reports over 25,000 downloads), but the number of organizations listed as using it is still small by comparison: fewer than 200. I suspect that some of those listed organizations, like ICPSR, are using Fedora in sandbox environments rather than in production.

For those not familiar with Fedora, the name is an acronym for Flexible Extensible Digital Object Repository Architecture. The software was originally developed at Cornell in the late 1990s and is freely available today. After exploring the system for a few weeks, we'd certainly agree that the name is apt.

One, this really is architecture-level software, sometimes called middleware. It isn't a finished content management system (CMS) like Drupal or Plone, where one performs a quick install and then starts adding content. The software does come with a few demonstration objects already loaded and a handful of basic mechanisms to browse and display the content, but ultimately one needs a "stack" on top of Fedora to use the system the way it was intended. This means building your own stack or using one of the ready-made ones, such as Fez, Muradora, or Islandora. In addition to test-driving Fedora, we've also been looking at these stacks.

Two, the system forces one to think seriously about the precise nature and shape of one's digital content. What are the atomic objects, and what are the higher-level digital object molecules they form? What is actually a relationship between two objects versus an attribute of a single object? What is the essential content that should be preserved, and what is merely a derivative? If one has a bunch of image-oriented content, there are lots of good examples available for how one might organize the objects; if one has a bunch of social science data, the examples aren't as applicable (but are instructive nonetheless). This is indeed extremely flexible and extensible stuff.
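To make those modeling questions a little more concrete, here is a minimal sketch in Python of how one might jot down a candidate content model before committing to it. It is not Fedora's API, and it is not how we have modeled our own content; the class names, field names, identifiers, and the "isPartOf" predicate are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Datastream:
    """One piece of content held by an object (hypothetical structure)."""
    ds_id: str
    mime_type: str
    derivative: bool = False  # True if it can be regenerated from other content


@dataclass
class DigitalObject:
    """An 'atomic' object: an identifier, attributes, content, and links."""
    pid: str
    attributes: dict = field(default_factory=dict)     # facts about this object alone
    datastreams: list = field(default_factory=list)    # the content itself
    relationships: list = field(default_factory=list)  # (predicate, target pid) pairs

    def relate(self, predicate, target):
        # A relationship points at another object; an attribute does not.
        self.relationships.append((predicate, target.pid))

    def preservation_streams(self):
        # Essential content is whatever is not marked as a derivative.
        return [ds.ds_id for ds in self.datastreams if not ds.derivative]


# A made-up study (the "molecule") and one of its files (an "atom").
study = DigitalObject(pid="demo:study-1", attributes={"title": "Example survey"})
data_file = DigitalObject(
    pid="demo:file-1",
    datastreams=[
        Datastream("RAW_DATA", "text/plain"),
        Datastream("CODEBOOK_PDF", "application/pdf", derivative=True),
    ],
)
data_file.relate("isPartOf", study)
```

Even a toy like this forces the decisions described above: the title is an attribute of the study, membership in the study is a relationship, and the codebook PDF is flagged as a derivative rather than as content to be preserved in its own right.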

One small complaint I have is with the name. When Thorny Staples from the Fedora Commons visited us in the fall, he told us the story of how the Cornell folks got the name first and how they reached a compromise with the Red Hat guys when Red Hat created its Fedora distribution of Linux. The problem for folks like me is that the shared name makes it very hard to find web pages and information about the Fedora repository software without some heroic efforts with search engines. For example, I use Google Alerts to collect information about items of interest, and my query for Fedora is longer than the rest of my Google Alerts queries combined. It also turns out that there is a pretty popular college football coach named Fedora!

Up next: Short summaries of the three Fedora-related projects underway today.

Friday, April 10, 2009