For two hours before the BlackBerry email outage two weeks ago, the Zenprise BlackBerry service monitoring system in place at the County of Alameda (Cal.) data center in Oakland, the seat of the 1.5 million-population county, "gave us some indications connectivity was intermittent," senior server engineer Paul Hinsberg told me just a few minutes ago.
During the two-hour interval from the first report to the system-wide shutdown, the test was performed at least once every half hour by means of a command sent from Alameda's BlackBerry Enterprise Server (BES) software to the BlackBerry network.
"It (BES) runs a command to check and see if it (BES) can communicate with (BlackBerry-maker Research In Motion)'s resources," Hinsberg adds. Hinsberg told me that throughout the two-hour period between the first bug reports and the crash, the "confidence levels" Zenprise scored for the likelihood of successful connections fluctuated between "'yes, you can connect'" and "'no, you can't.'"
The engineer adds that as the afternoon progressed, the confidence levels Zenprise scored for successful BlackBerry network connections gradually descended from Very High-This Is Fixed to "'I am completely sure I cannot connect.'"
Despite these frustrations, Hinsberg tells me he is willing to cut BlackBerry engineers more than a bit of slack. "I would think they (BlackBerry) may have realized (there was a problem), and they started to get calls, but they were committed to solving the problems as rapidly as possible," Hinsberg says.
And as to the software upgrade that was at the heart of the outage? "They probably didn't test the software (thoroughly enough) and probably (pursued the upgrade) too fast," Hinsberg hypothesizes. "When you (upgrade too fast), you wind up cutting corners, and saying 'this looks OK.'"