Some of you probably noticed that we had a rough weekend with the web site. We first saw trouble around 4pm EDT on Saturday. After troubleshooting and investigation left us unsure of the root cause, we failed over to our replica around 5pm EDT. We then ran off the replica overnight and through the following morning.
The big breakthrough came around 1:30pm Sunday, when we isolated the cause. From there it took only a few minutes to correct the problem, test the solution, and finally roll service back to the production site. As with any longer outage, this one pointed out a bunch of small but important changes to make in procedures and documentation.
My apologies if you happened to be using our web site late afternoon on Saturday; that was certainly the roughest time.