Technology at ICPSR: TRAC: B6.7: Generating complete DIPs

B6.7 Repository can demonstrate that the process that generates the requested digital
object(s) (i.e., DIP) is completed in relation to the request.

If a user expects a set, the user should get the whole set. If the user expects a file, the user should get the whole file. If the user’s request cannot be satisfied, the user should be told this; for instance, resource shortages may mean a valid request cannot be satisfied.

Acceptable scenarios include:

The user receives the complete DIP asked for and it is clear to the user that this has happened.
The user is told that the request cannot be satisfied.
Part of the request cannot be satisfied, the user receives a DIP containing the elements that can be provided, and the system makes clear that the request is only partially satisfied.

Unacceptable scenarios include:

The request can only be partially satisfied and a partial DIP is generated, but it is not clear to the user that it is partial.
The request is delayed indefinitely because something it requires, such as access to a particular AIP, is not available, but the user is not notified nor is there any indication as to when the conflict will be resolved.
The user is told the request cannot be satisfied, implying nothing can be delivered, but actually receives a DIP, and is left unsure of its validity or completeness.
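One way to picture the acceptable scenarios is a delivery step that always reports one of three explicit outcomes, so a partial DIP can never pass for a complete one. The sketch below is purely illustrative and does not describe ICPSR's actual system; the names `Status`, `DipResult`, and `build_dip` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    COMPLETE = "complete"        # everything requested was delivered
    PARTIAL = "partial"          # some items delivered, some missing
    UNSATISFIED = "unsatisfied"  # nothing could be delivered


@dataclass(frozen=True)
class DipResult:
    status: Status
    delivered: tuple
    missing: tuple
    message: str  # what the user is told about the outcome


def build_dip(requested, available):
    """Compare the requested items against what can actually be delivered.

    Hypothetical helper: `requested` is a list of item names and
    `available` is a set of item names the repository can provide.
    """
    delivered = tuple(item for item in requested if item in available)
    missing = tuple(item for item in requested if item not in available)

    if not missing:
        return DipResult(Status.COMPLETE, delivered, missing,
                         "All requested items were delivered.")
    if not delivered:
        # Nothing is shipped when the user is told the request failed.
        return DipResult(Status.UNSATISFIED, (), missing,
                         "The request cannot be satisfied.")
    return DipResult(Status.PARTIAL, delivered, missing,
                     "Partial delivery; missing: " + ", ".join(missing))
```

The point of the design is that the status and the list of missing items travel with the DIP itself, which rules out the unacceptable cases of a silent partial delivery or a "cannot be satisfied" message accompanied by files anyway.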

Evidence: System design documents; work instructions (if DIPs involve manual processing); process walkthroughs; logs of orders and DIP production; test accesses to verify delivery of appropriate digital objects.

My sense is that one of ICPSR's strengths is its delivery system for downloading packages of social science research data. Content goes through a fairly rigorous quality assurance process, and we make the content available in the most common open-and-serve formats.

Also, I know that we regularly spend resources and staff time updating the oldest content, fixing it up so that it is easier to use. For example, when we first started making content available as SAS, SPSS, and Stata files, and gave web site visitors the opportunity to select just the format they wanted, we ran into problems with some of the older content. My recollection (somewhat fuzzy now) is that there were cases where studies were organized in odd ways, and one could have the same content spread across several datasets/parts, but in different formats. This could then lead to very weird behavior: if someone picked a single format (e.g., SAS), the download could end up with mysterious "holes" in it.
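The "holes" problem can be sketched as a simple coverage check, assuming each part of a study is available in only some formats. This is a hypothetical illustration rather than how ICPSR's delivery system works; the function name `find_format_holes` and the sample study layout are made up for the example.

```python
def find_format_holes(parts, chosen_format):
    """Return the names of parts that have no file in the chosen format.

    `parts` maps a part name to the set of formats available for it
    (a hypothetical structure used only for this illustration).
    """
    return sorted(
        name for name, formats in parts.items()
        if chosen_format not in formats
    )


# A made-up study whose parts were released in inconsistent formats.
study = {
    "Part 1": {"SAS", "SPSS", "Stata"},
    "Part 2": {"SPSS"},
    "Part 3": {"SAS", "Stata"},
}

holes = find_format_holes(study, "SAS")
if holes:
    print("SAS download would be missing: " + ", ".join(holes))
```

A check like this could run before a format-specific download is offered, so the visitor is either warned about the gap or steered to a format that covers every part.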

Because our DIPs are generated by a human, and reviewed before we place them on the web site for delivery, we should be delivering complete, correct DIPs. Certainly there is no evidence that the content people are downloading is flawed or incomplete on a routine basis (e.g., data without a codebook).

Friday, August 19, 2011