According to this January 2011 "System Safety Program Plan," operational "reliability is provided for the ROCC [Metro's nerve center] systems by a back-up computer, which automatically activates if the primary control computer malfunctions."
There was no sign of any automatic, redundant system functioning this weekend, and operations for an entire metropolitan subway system were halted twice, once for nearly an hour.
According to Metro, "the computer problem affected an information management system that allows controllers in Metro's Rail Operations Control Center to see where trains are on a dynamic map and to remotely control switches."
Such an event is called a code 34, and many riders reported hearing that phrase throughout the system last night.
"code 34" is in effect on @wmata right now. Entire system is at a standstill... does anyone know what a code 34 means? @unsuckdcmetro
— Ohmygoshi (@Ohmygoshi) July 15, 2012
Recently, there have been at least two documented code 34 events: one this past March and another in October 2010. Metro sources tell me there have been more.
The weekend computer failure has some Metro workers scratching their heads because Metro recently built a back-up system in Landover, costing millions of dollars, for bus and rail OCC systems. It is unclear whether that particular back-up system is meant for cases like this weekend's, for a destructive event such as a fire, or both.
The Landover back-ups were installed after the authority's inspector general criticized Metro for lacking IT contingency plans in a September 2010 internal audit.
One would think a fundamental requirement of any back-up system would be to avoid the need to completely stop operations.
Metro needs to explain to riders whether its back-up systems work and, if they do, why they aren't good enough to prevent the entire system from shutting down twice in less than a day.
It should be noted that while the weekend's events are alarming, sources confirm Dan Stessel's comments to WTOP that "the signal system, the system that keeps trains properly spaced from each other, did remain operational. Those systems were up and running at all times."
Other items:
Accountability lacking in Metro's IT department (Examiner)