Felicia LeClere, the Director of our Data Sharing for Demographic Research project, and I collaborated on an NIH Challenge Grant proposal to explore cloud computing technologies as a mechanism to deliver restricted-access data. Despite fierce competition (there were nearly 18,000 proposals submitted), ICPSR learned recently that our proposal was accepted, and that the NIH will fund the project. Felicia is the PI for the grant and will provide overall leadership, while my team will take the lead on execution and deliverables. In addition to our own colleagues at ICPSR, we're also working with partners at the RAND Corporation and at the University of Michigan Survey Research Center to test and evaluate the system. The title of the project is Exploring New Methods for Protecting and Distributing Confidential Research Data.
From the proposal:
In this project, the Inter-university Consortium for Political and Social Research and partners at the RAND Corporation and the Survey Research Center at the University of Michigan will build and test a data storage and dissemination system for confidential data, which obviates the need for users to build and secure their own computing environments. Recent advances in public utility (or “cloud”) computing now make it feasible to provision powerful, secure data analysis platforms on-demand. We will leverage these advances to build a system which collects “system configuration” information from analysts using a simple web interface, and then produces a custom computing environment for each confidential data contract holder. Each custom system will secure the data storage and usage environment in accordance with the confidentiality requirements of each data file. When the analysis has been completed, this custom system will be fed into a “virtual shredder” before final disposal. This prototype data dissemination system will be tested for (1) system functionality (i.e., does it remove the usual barriers to data access?); (2) storage and computing security (i.e., does it keep the data secure?); and (3) usability (i.e., is the entire system easier to use?). Contract holders of two major data systems (the Panel Study of Income Dynamics and the Los Angeles Family and Neighborhood Study) will be recruited to assess both the user interface and the analytic flexibility of the new customized computing environments.
This is a very exciting opportunity for ICPSR to continue its exploration and evaluation of public computing clouds for enabling research. If our test is successful, this may also be another delivery mechanism that we add to our upcoming Restricted-access data Contracting System (RCS), where researchers apply online.
I'll be working on the technology portion of the grant, of course, and so will Steve Burling, a member of the ICPSR technology team. Steve has been leading most of our cloud computing efforts over the past year, and has acquired a lot of experience with Amazon's services during that time. To complement what Steve brings to the table, we'll also be posting a fairly senior position: someone who brings solid expertise with Windows systems and who also has recent experience with one of the public computing clouds. That job will appear on the University of Michigan central job site, but I'll post a link to it here too once it goes live.
It will be very, very early in the project, but I'm hoping to talk about it in a preliminary way at the upcoming Coalition for Networked Information Fall 2009 Membership Meeting. I hope to see some of you there!