HPE SAN behind ATO outage was designed for performance instead of stability

When 12 drives failed, they took down a system of over 800 disk drives, and a taxation agency with it.
Written by Chris Duckett, Contributor

The Australian Taxation Office (ATO) has told Senate Estimates that when it hands down its report into the outage it suffered in December 2016, the finger of blame will be pointed squarely at the storage solution provided by Hewlett Packard Enterprise (HPE).

"The turnkey service of data storage as per the 3PAR SAN provided by HPE failed us," Commissioner of Taxation Chris Jordan said.

Jordan said the report due to be released next week will state the fibre cables used were not "optimally fitted", the drives used in the SAN had software bugs that made data inaccessible, and monitoring features were not turned on.

"The SAN design and configuration meant we had an overemphasis on performance features rather than stability or resilience -- a relatively small disk drive failure had a large impact -- only 12 of some 800 disk drives failed, but they impacted most ATO systems," Jordan said.

The Commissioner also admitted it took longer than it should have to restore the SAN, because the recovery tools were kept on the failed SAN.

Discussions on a settlement are complete, with the ATO recovering "key costs" and receiving "additional and higher grade IT equipment".

The storage hardware in question was upgraded in November 2015 by HPE and was seen by the ATO to be "state-of-the-art" at the time, with the ATO noting last year it was "basically the same" hardware used by other large clients of HPE.

Earlier this month the ATO admitted the outage has caused it to roll back its use of application whitelisting.

"Whilst the ATO was fully compliant in November 2016 with whitelisting our Windows based servers, our current levels of compliance have been impacted by the ATO's recent SAN outages," it said in a submission to the Joint Committee of Public Accounts and Audit.

"In support of the full restoration and remediation program, whitelisting on a range of servers needed to be disabled and re-enabled as the restoration progresses. We have plans in place to progressively re-enable whitelisting in coming months taking into account tax time activities."

The hardware issue also pushed back the ATO's patch cycles, and whitelisting is set to be in place again by June.

The ATO also said this morning the rollout of its Single Touch Payroll over the coming years will allow for better data matching with Centrelink by providing data every payday, rather than at the end of the year.

"As I understand it, sometimes people that get benefits, [when they] get a job, keep getting the benefits. They assume the information is already being exchanged by the tax office and Centrelink because someone has taken tax out of their pay, they don't know it's at the end of the year Centrelink do the check and say: 'You owe us $5,000'," he said.

"Single Touch Payroll will enable us and Centrelink to exchange information fortnightly."

The embattled automated debt recovery system used by Centrelink is based on a 27-year-old process, which contained an error that miscalculated a recipient's income by deriving their fortnightly pay from their annual salary, rather than taking a cumulative 26-week snapshot of what the individual was actually paid.
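
To see why averaging an annual figure can mislead, consider a minimal sketch in Python. It is an illustration only, not Centrelink's actual code: the function name, the recipient, and the dollar amounts are all hypothetical, and the sketch simply spreads a yearly total evenly across 26 fortnights and compares that with what was actually earned in each one.

```python
def averaged_fortnightly(annual_income, fortnights=26):
    """Spread an annual figure evenly across every fortnight (hypothetical helper)."""
    return [annual_income / fortnights] * fortnights


# Hypothetical recipient: earned $2,000 in each of 13 fortnights, then nothing for 13.
actual = [2000.0] * 13 + [0.0] * 13

# Averaging only sees the yearly total, not when the money was earned.
averaged = averaged_fortnightly(sum(actual))

# Fortnights where nothing was earned, yet the averaged figure shows income.
phantom = [
    i for i, (paid, avg) in enumerate(zip(actual, averaged))
    if paid == 0 and avg > 0
]

print(averaged[0])   # 1000.0 attributed to every fortnight
print(len(phantom))  # 13 fortnights with income that was never earned
```

In this made-up case, the averaged view attributes $1,000 to each of the 13 fortnights in which the person earned nothing, which is the kind of mismatch that could make a correctly paid benefit look like a debt.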

From July 1, the welfare agency intends to expand the program to examine a range of incomes including interest on bank accounts, investment properties, share dividends, and rental incomes.

"It's very likely the main people who will be caught up in this expansion of the system will be old age pensioners," Labor Senator Murray Watt said at a Senate inquiry earlier this month.

"No one has been able to convince this inquiry the system has been running so smoothly that we aren't going to see a whole bunch of new problems emerge on July 1."

The Department of Human Services has said the new measures are expected to save AU$980 million over three years.
