Ever get the feeling that you're not being told everything? Last night, for example, a number of UK Internet users noticed their connections going slow or stopping altogether, along with other problems affecting email, Web and ISP servers. This was due to transatlantic fibre TAT-14 giving up the ghost just off the coast of France (as a veteran of the cross-Channel ferry, I know how it feels). Although it had a back-up loop, this had gone wrong a few days previously – and as fixing either fault needs a ship to pop out, scoop the stricken cable from the seabed and knot it together with duct tape, it'll be a while. But, um, two faults within a couple of weeks? Not what you want from a high-reliability system.
At roughly the same time, ZoneAlarm's client software went a bit loopy. Its auto-update function decided to call home and check for new stuff, but instead got stuck in a loop showering the Internet with DNS requests. This bug turned ZA into a denial-of-service zombie, bringing ISP DNS servers to a creaking halt.
Just to add more fun to the mix, we looked at the London Internet Exchange status page, to see what the statistics said for the period in question. There was indeed a slight drop at 4 p.m., the time that the cable lost its photons, but between midnight and 3 a.m. there was what looked like a complete outage. Midnight? Weren't all the problems fixed by then?
"Ah, yes," said the nice lady at LINX. "We were just changing a card. No problems really, just the stats server didn't get its information." So the other murmurs we heard -- that the switch fabric in the exchange was getting its knickers in a digital twist -- were presumably just put about by engineers tired and shagged out after a night spent wrestling with multiple problems and infinitely irate users.