NAB: more than just an outage?

Commentary: It's hard to escape the impression that NAB's system issues starting at the end of last month weren't just another "Severity 1" incident for the bank.

A problem that impacts other banks, leading to a highly embarrassing public apology by the CEO? Problems stretching out into the best part of a month now? Sounds pretty drastic.

The key question here for me is what is really going on from a technical perspective. So far, all we really know is that somebody loaded the wrong piece of code. According to the AustralianIT:

It is understood someone with access to NAB's mainframe systems, either an internal staff member or one of the bank's IT outsourcing partners in India, inadvertently bypassed a piece of code that checks BSBs against addresses. This happened during the batch transaction cycle and disrupted the bank's ability to process the files.

However, speaking with technology sector insiders over the past few weeks, it's plain that many are surprised that a simple bit of rogue code was able to cause so many problems at the bank.

It's understandable, after all, that an issue would halt a batch transaction; large organisations like banks, the Australian Taxation Office and Centrelink often pause batch processing for short periods to deal with other things going on. It's often an ongoing battle to keep the batch jobs up to date. However, what remains a mystery is why the rogue code appears to have affected NAB's ability to process batches in general, rather than just causing a short-term issue.

Now there's been a fair degree of hype around the issue. I don't personally agree with IBRS's Jorn Bettin, who made some fairly overblown statements about likely future disasters at other banks because of their crumbling core banking platforms. Frankly, these systems have been stable for decades, and banks aren't gradually migrating from them because of some kind of instability; it's more the long-term cost and flexibility benefits that are spurring them to do so.

No, what the NAB situation feels like to me is a classic case of long-term underinvestment.

The NAB's technology management team currently does not have as strong a seat at the executive table as its counterparts at the Commonwealth Bank and Westpac, which have Michael Harte and Bob McKinnon respectively. Instead, its technology leadership structure has remained unclear for more than a year since its high-profile chief information officer Michelle Tredenick was ousted by incoming CEO Cameron Clyne.

And it wasn't just Tredenick who departed the bank last year; her second in command, Craig Bright, also left around the same time.

With Tredenick gone, it was new chief information officer Adam Bennett who took the technology reins at the NAB. However, Bennett has never had a high public profile, and in practice it appears to have been the bank's chief executive of Group Business Services, Gavin Slater, who has had overarching responsibility for the bank's technology operations.

Slater's an accountant by training and a CFO and general manager by trade; he appears to have little experience in technology. Yet his oversight of the bank's tech team comes at a time when the NAB has been going through major technology initiatives: offshoring, a toe dipped in the water on core banking transformation, and more.

To me, it sounds like the NAB has been under-investing in its technology operation for some time (the normal strategy when a CFO-type oversees a technology department) and that this month's events have exposed the weakness of such a strategy. It feels like the underinvestment is going on in several areas: governance, which is usually the problem in these cases, and possibly even the application layer or infrastructure.

The bank's problems over the past month smack of what happens when you squash a cockroach in your bathroom. You might have fixed the immediate problem, but you know there is probably an absolute nest of the buggers hiding just below the floorboards, in places where you haven't looked for years, messing with stuff that you have always assumed will "just work".

In this case, NAB's backup systems should have "just worked" to resolve its ongoing problems. They should have rolled back whatever code was introduced and then restarted processes. The fact that they couldn't do that easily is a massive cause for concern.

In addition, NAB's high-profile IT problems could not have come at a worse time for the bank. Not only are the nation's politicians already riled up about what they see as unjustified interest rate rises and thus paying more attention right now, but both Westpac and CommBank currently have a lot to crow about when it comes to the performance of their technology systems.

The pair have been at pains over the past year to demonstrate that they have virtually eliminated the sort of "Severity 1" incidents from their technology operations that the NAB is experiencing, continually, right now.

One wonders how long it will be before NAB's board starts to realise the depth of the hole the bank has dug for itself.

The Westpac experience has delivered Australia's banking sector a stern lesson when it comes to reducing Severity 1 incidents in banks. You can cut down the problems but it will take time, serious investment, strong leadership and a commitment to change. I'm not sure that the NAB has all the elements it needs right now; it may take a change of CEO for the bank to understand that, as it did at CommBank and Westpac before.
