Friday, January 28, 2011

TRAC: B2.10: Processing deposits

B2.10 Repository has a documented process for testing understandability of the information content and bringing the information content up to the agreed level of understandability.

If Content Information or Preservation Description Information (PDI) is not directly usable by the current application tools of the designated community(ies), the repository needs to have a defined process for giving it usable form or for making additional Representation Information available (see B3.2).

Repositories that share the burden of ensuring that adequate metadata or documentation is captured or generated to meet a required degree of understandability can implement any number of procedures to address this requirement. Such repositories typically have a narrowly defined designated community, such as a particular science discipline.

Evidence: Retention of individuals with the discipline expertise; periodic assembly of designated or outside community members to evaluate and identify additional required metadata.

Disclaimer:  I'm not sure that I fully understand this TRAC requirement, and my sense is that it is one of the few where I (as the "tech guy") might get a pass.  But here goes....

This requirement seems to be asking the question:  Are your customers and clients able to use the content you are making available?  I think there are two different answers to this question.

One answer is an emphatic, Yes!  At the aggregate level it seems clear that the content preserved and disseminated by ICPSR is useful to the community.  If it weren't, presumably we would see a rapid erosion in the number of members and the number of datasets downloaded, and the general disuse of ICPSR as a resource for social science research.  Why come get our stuff if it isn't useful to the community?

Another answer is the more equivocal, Probably.  At the micro level of a particular study, it is easy to imagine that we have some content which is both accessed infrequently and where the metadata is somewhat sketchy.  For example, imagine a study first processed by ICPSR in the 1970s, and which, for whatever reason, does not have modern "ready to go" formats or even modern setups for the common stat packages.  It would still have data available (plain ASCII, maybe even in card format), and it would have some sort of associated codebook, even if only one in plain text format.  Clearly this type of content would be much less usable - perhaps even unusable - by our clients.

That scenario raises the question:  Is there a problem to solve?  If ICPSR wants to serve its membership well over the long term, what is the right strategy for handling content which may have little value to the community (at least today)?

1 comment:

  1. If you haven't done so already, you could check the interpretation of this requirement in the reports on Portico's and HathiTrust's compliance with TRAC requirements.

    Clearly, this is one aspect of identifying & preserving the "significant properties" of digital objects, and regularly revisiting & evaluating your success in doing so. The metadata & documentation required for ensuring understandability will change as time goes by: the understandability of a data set in card format will be quite different for communities of researchers of 1961, 2011, and 2061.

    The question you raise is important & points to the need for regular reappraisal of the value of content: the understandability of that data set in card format to a research community of 2061 might be pretty low, not just because of changes in technologies and the level of understanding of archaic technologies & methods, but more generally because of changes in what constitutes knowledge: i.e., no one cares enough about it any more to make the effort to understand it fully.

