BlackBerry outage day three: RIM explains what went wrong

"A service disruption of this nature is not acceptable to us or to you," says RIM's UK chief...
Written by Natasha Lomas, Contributor and Nick Heath, Contributor

BlackBerry users continue to have problems with the service as RIM explains the cause of the outage. Photo: Natasha Lomas/silicon.com

BlackBerry users are continuing to report problems with the service, marking the third day of web and email issues for users of RIM's smartphones.

Yesterday's BlackBerry service problems, which affected users of the smartphones in Europe, South America and India, were caused by a core switch failure, according to RIM. Speaking to silicon.com today, Rory O'Neill, RIM's VP of software and services, gave further details about the outage.

The outage has been caused by the failure of a core switch on the network that connects a major RIM datacentre in Slough to other RIM datacentres worldwide, O'Neill said. "On Monday we thought we had identified the root cause - a core switch that connects our datacentres worldwide - and what we did yesterday was to identify that and change the components and the infrastructure, and brought the BlackBerry service back again overnight.

"Overnight it was performing very well but it became evident when we loaded the traffic back on that unfortunately the service and components weren't behaving as we wanted them to or should have been behaving.

"The teams have now identified another fix - they have taken the service down to try and remedy that fix and are in the process of reloading content and connections so we can get the datacentres up and running in the ways that we need to."

O'Neill said RIM engineers had "isolated the core component parts that are causing the issue" and are working to fix it "as quickly as possible".

RIM can't yet say when the problems will be resolved, according to O'Neill, but he said the company will update customers through various channels, including Twitter and Facebook, as soon as it has a timeframe for a fix.

At RIM's annual customer conference, taking place in London today, the company's UK MD, Stephen Bates, gave a statement on the outage. "We thought we had got to the root cause on Monday but we did not," he said, adding: "We are dealing with over 20PB of data every month so you can imagine the disruption we are trying to resolve."

"We have a world-class team of engineers working on this night and day," added Bates. "A service disruption of this nature is not acceptable to us or to you."

RIM's O'Neill said the company is "not ready" to discuss compensation for affected customers, saying the company's "core focus is getting the service available again". But he added: "We will think about that as soon as we get the service up and running."

"Right now every professional in our datacentres in network and software engineering is 100 per cent focused on getting the service up and running."

O'Neill added that the BlackBerry network architecture "has been designed at its core to be fully resilient".

"Unfortunately in this case, particularly because it's a connectivity switch between the datacentres, the resilience architecture hasn't performed as desired or planned."

"We are running an infrastructure that supports 70 million customers, seven million of those in the UK, processing over 20PB a month. It's a complex architecture that has in ordinary circumstances full redundancy and full resilience," he added.

Independent telecoms analyst Ian Fogg said the outages are a nightmare for RIM - which uses services such as BlackBerry Messenger (BBM) and BlackBerry email to differentiate its offering from rival mobile platforms such as Apple's iOS and Google's Android.

"RIM is in danger of becoming its own worst enemy if it is unable to reliably operate the communication services that have differentiated it. BBM is the reason many young consumers stay with BlackBerry. If it doesn't work, they will leave RIM," Fogg blogged.

BlackBerry users in EMEA first reported problems with email and browsing the web on Monday.

RIM later reported that services had returned to normal - however a second disruption soon struck users in EMEA, South America and India.
