Monday, December 5, 2011

Collaborators, not depositors

ICPSR should stop accepting deposits.

Instead ICPSR should be recruiting collaborators.

To be sure, ICPSR receives a great deal of its content from US Government agencies that have decided to outsource the digital preservation of their content to a trustworthy repository like ICPSR.  In these cases the relevant contract, grant, or inter-agency agreement makes it clear what content will be coming to ICPSR to be curated and preserved.  Sometimes the agency has little interest in depositing the content itself ("Isn't that what we pay you for?"), and so the formal act of depositing falls to the ICPSR staff anyway.

However, we also receive a considerable volume of content through our web portal where the depositor is external.  Sometimes we have worked hard to acquire the content, and the deposit is one milestone on a very long road, but other times the content comes to us unsolicited.  (I like to call these "drive-by deposits.")

In some cases the depositor is quite eager and able to help ICPSR with much of the curation work:  drafting rich descriptive metadata; organizing survey data and documentation into coherent groups; packaging other types of content into logical bundles (such as with our Publication-Related Archive); and reviewing the data for possible disclosure risks.  Depositors may have access to resources like graduate students who can help with these tasks, and if the depositor is also the data producer, then he or she has valuable, unique insight into the data and documentation.  Unfortunately, ICPSR is not well poised to tap into that expertise and those resources.

What would it take to get there?

ICPSR could separate the transactional step of submitting content (i.e., file upload concurrent with signature) from the iterative step of preparing metadata applicable to the submitted content.  In fact, one could even prepare metadata well before the submission transaction if the data producer had the interest and resources to prepare that information, but was not quite ready to share the data yet.  And, it would be equally permissible to submit the data for preservation and sharing, and then build the metadata slowly during the weeks and months following the upload.
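To make the separation concrete, here is a minimal sketch in Python of how the two steps might be modeled independently. Everything in it is hypothetical: the class names (`MetadataRecord`, `Submission`, `Study`) and their fields are invented for illustration and do not describe any actual ICPSR system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class MetadataRecord:
    """Descriptive metadata; can be revised before or after submission."""
    title: str
    abstract: str = ""
    revisions: int = 0

    def revise(self, **changes: str) -> None:
        # Only the fields modeled in this sketch may be changed.
        for name, value in changes.items():
            if name not in ("title", "abstract"):
                raise AttributeError(f"unknown metadata field: {name}")
            setattr(self, name, value)
        self.revisions += 1


@dataclass
class Submission:
    """The one-time transaction: file upload together with a signature."""
    files: List[str]
    signed_by: str
    submitted_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


@dataclass
class Study:
    """Ties the iterative metadata work to the (optional, later) submission."""
    metadata: MetadataRecord
    submission: Optional[Submission] = None

    def submit(self, files: List[str], signed_by: str) -> None:
        if self.submission is not None:
            raise ValueError("content has already been submitted")
        self.submission = Submission(list(files), signed_by)


# Metadata can be drafted well before any data is shared...
study = Study(MetadataRecord(title="Working title"))
study.metadata.revise(abstract="First draft of the abstract.")

# ...the submission is a single short transaction...
study.submit(["survey.csv", "codebook.pdf"], signed_by="A. Producer")

# ...and the metadata can keep improving in the weeks that follow.
study.metadata.revise(title="Final title")
```

The point of the design is simply that `submit` never touches the metadata and `revise` never depends on a submission existing, so either can happen first.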

If the data producer could also export the metadata in machine-actionable formats (say, DDI XML for content that maps well to the classic "study" object that ICPSR has curated and preserved for decades), then there may be additional value to the producer. And the structure that comes with an XML schema like DDI might itself help the producer think about and organize the documentation, even for his or her own use.
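As a rough illustration of what such an export might look like, the sketch below builds a very small study-level record using element names from DDI Codebook (`codeBook`, `stdyDscr`, `citation`, `titl`, and so on) with Python's standard `xml.etree.ElementTree`. The function name `study_to_ddi` and its parameters are hypothetical, the record is far from complete, and the output has not been validated against the DDI schema; treat it as a sketch of the idea, not a conforming export.

```python
import xml.etree.ElementTree as ET

# Namespace assumed here for DDI Codebook 2.5; check the version you target.
DDI_NS = "ddi:codebook:2_5"


def study_to_ddi(title, producer, abstract, keywords):
    """Return a minimal, unvalidated DDI-Codebook-style XML string."""
    ET.register_namespace("", DDI_NS)  # serialize with a default namespace

    def q(tag):
        # Qualify a tag name with the DDI namespace.
        return f"{{{DDI_NS}}}{tag}"

    root = ET.Element(q("codeBook"))
    stdy = ET.SubElement(root, q("stdyDscr"))

    # Citation: title and responsible party.
    citation = ET.SubElement(stdy, q("citation"))
    titl_stmt = ET.SubElement(citation, q("titlStmt"))
    ET.SubElement(titl_stmt, q("titl")).text = title
    rsp_stmt = ET.SubElement(citation, q("rspStmt"))
    ET.SubElement(rsp_stmt, q("AuthEnty")).text = producer

    # Study information: keywords and abstract.
    info = ET.SubElement(stdy, q("stdyInfo"))
    subject = ET.SubElement(info, q("subject"))
    for kw in keywords:
        ET.SubElement(subject, q("keyword")).text = kw
    ET.SubElement(info, q("abstract")).text = abstract

    return ET.tostring(root, encoding="unicode")


print(study_to_ddi(
    title="Example Household Survey",
    producer="A. Producer",
    abstract="An invented study used only to show the structure.",
    keywords=["households", "income"],
))
```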

In this world the ICPSR deposit system becomes a much shorter, much simpler web application.  The ICPSR data management infrastructure would need to be opened up -- but with serious access controls -- so that content providers could access, create, and revise their own documentation and metadata.  But the best thing about this world is that ICPSR gains a lot of collaborators, some of whom would be quite eager to work with us, I think.
