Wednesday, July 20, 2011

ICPSR's Secure Data Environment (SDE) - The Network

By the end of 2009 ICPSR's data network looked very much like it had in 1999.  It consisted of a single virtual local area network (VLAN) that was home to a handful of IPv4 address blocks.  The number of blocks had grown over the decade as ICPSR hosted more equipment in its machine room, such as servers running Stanford's LOCKSS software and Harvard's DataVerse Network (DVN) system.  Also, as the ICPSR Summer Program expanded, so did the number of guest machines and lab machines, and this too drove the acquisition of more network blocks.

The blocks were in public IPv4 space, and therefore, in principle, any machine on ICPSR's VLAN could reach any location on the Internet, and vice versa.  In practice some simple devices, such as printers and network switches, used private IPv4 address space, routed only within the University of Michigan.  This is a fairly common practice, of course, both to conserve IPv4 address space and to give systems some protection from network-based attacks.
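To make the public/private distinction concrete, here is a minimal Python sketch that checks whether an address falls in private IPv4 space.  It assumes "private" means the three RFC 1918 blocks; the post does not say which private ranges were actually in use, and the function name is_rfc1918 and the sample addresses are purely illustrative.

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
# Assumption: this is what "private IPv4 address space" refers to.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the IPv4 address lies inside an RFC 1918 block."""
    ip = ipaddress.IPv4Address(address)
    return any(ip in block for block in RFC1918_BLOCKS)


if __name__ == "__main__":
    # Illustrative addresses only, not ICPSR's real ones.
    for addr in ("10.1.2.3", "172.16.0.1", "172.32.0.1", "8.8.8.8"):
        kind = "private" if is_rfc1918(addr) else "public"
        print(f"{addr}: {kind}")
```

A device with a private address cannot be reached directly from the wider Internet, which is where the modest protection mentioned above comes from.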

At that time we also made use of simple Cisco access-list rules which acted as a very primitive firewall.  The campus network administrators did this for us, but somewhat grudgingly, since it was a non-standard practice for them and made ICPSR's data networking equipment more difficult to manage.  It was less than ideal for us too, since we didn't have regular access to the data networking switches and routers, and so never knew exactly how they were configured at any given time.
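For readers unfamiliar with access lists, the general idea is that rules are checked in order, the first matching rule decides, and anything that matches no rule is denied.  The Python sketch below models that behavior only; it is not Cisco syntax, and the Rule class, the evaluate function, and the sample rules are all hypothetical rather than anything ICPSR actually ran.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    action: str               # "permit" or "deny"
    source: str               # source block in CIDR notation
    dest_port: Optional[int]  # None means "any port"


def evaluate(rules: List[Rule], source_ip: str, dest_port: int) -> str:
    """Return the action of the first matching rule; deny if none match."""
    ip = ipaddress.ip_address(source_ip)
    for rule in rules:
        in_block = ip in ipaddress.ip_network(rule.source)
        port_ok = rule.dest_port is None or rule.dest_port == dest_port
        if in_block and port_ok:
            return rule.action
    return "deny"  # implicit deny at the end of the list


# Hypothetical rules: SSH only from 10.0.0.0/8, everything else open.
EXAMPLE_RULES = [
    Rule("permit", "10.0.0.0/8", 22),
    Rule("deny", "0.0.0.0/0", 22),
    Rule("permit", "0.0.0.0/0", None),
]

if __name__ == "__main__":
    print(evaluate(EXAMPLE_RULES, "10.1.2.3", 22))
    print(evaluate(EXAMPLE_RULES, "8.8.8.8", 22))
    print(evaluate(EXAMPLE_RULES, "8.8.8.8", 80))
```

Because order matters, a list like this gets harder to reason about as it grows, which is part of why it was awkward to have it maintained on equipment we could not inspect ourselves.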

So in a nutshell we had a very flat, very basic, and very open network.

This all changed in early 2010 when we started using a new product/service available from the campus network administrators called the Virtual Firewall (VFW).  This is based upon a Check Point product which (I believe) is often used by commercial network providers who resell network blocks to smaller companies.  Within the University of Michigan it is used by departments and organizations like ICPSR who would like all of the benefits of having a firewall, but who lack the resources and expertise to manage all of the infrastructure.  In many ways it is the "cloud version" of a firewall, giving one the tools to manage access controls, but without the expense of managing the physical firewall itself.  This has been an outstanding service.



In addition to using the new VFW we also partitioned our network into four (and later seven!) VLANs:
  1. Public
  2. Semi-Private
  3. Private
  4. Virtual desktops
  5. Virtual Summer Program
  6. Virtual Data Enclave
  7. Virtual Testing and Evaluation
I'm going to skip discussion of the last three VLANs for now to focus on the first four.

The Public VLAN uses public address space and is home to all of our public-facing infrastructure.  For example, it is the home of our production web server, our authoritative DNS server, and special-purpose machines running LOCKSS, DVN, etc.  Access into and out of this VLAN is relatively open, but we do restrict access to certain protocols for certain machines.

The Semi-Private VLAN uses private address space and is home to all of our non-public, but non-sensitive systems such as desktop computers, printers, and so on.  We make relatively light use of the VFW for this VLAN, and outbound access uses NAT so that people can reach the Internet.  One of our two EMC NS-120 NAS units also resides on this VLAN.

The Private VLAN also uses private address space, and it contains all of our internal data management and archival storage systems.  Our second EMC NS-120 NAS holds this content.  This VLAN is heavily controlled via the VFW, and both inbound and outbound access are heavily restricted.

Finally, we use a separate VLAN for a pool of virtual workstations that our data managers use to "process" research data and documentation.  Like the Private VLAN, this VLAN makes extensive use of the VFW for access control.  In many ways its access controls are similar to those of the Private VLAN, but we have found it useful to use two different network segments, one for the individual virtual workstations and one for the back-end systems.
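The four VLANs described above can be summarized as a small table in code.  This is only a sketch for illustration: the Vlan class, the field names, and the one-word policy labels ("selective", "light", "heavy") are my own shorthand for the descriptions above, not settings taken from the VFW, and the address space of the virtual desktop VLAN is left as None because it is not stated here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Vlan:
    name: str
    address_space: Optional[str]  # "public", "private", or None if not stated
    vfw_usage: str                # shorthand label, not a real VFW setting


VLANS = [
    Vlan("Public", "public", "selective"),       # relatively open, some protocol limits
    Vlan("Semi-Private", "private", "light"),    # outbound access via NAT
    Vlan("Private", "private", "heavy"),         # inbound and outbound restricted
    Vlan("Virtual desktops", None, "heavy"),     # controls similar to Private
]


def heavily_restricted(vlans: List[Vlan]) -> List[str]:
    """Names of the VLANs whose access is tightly controlled by the VFW."""
    return [v.name for v in vlans if v.vfw_usage == "heavy"]


if __name__ == "__main__":
    print(heavily_restricted(VLANS))
```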
