
How to Make Banking Outages a Thing of the Past 


By Yaniv Valik, VP Product Management & Customer Success, Continuity Software 

Institutions from government to universities are no longer what they once were in the public mind, but banks have fared worse in the court of public opinion than most, according to a Gallup poll. The 2016 poll showed that over the course of a decade, while confidence in the military, the media, and even organized religion had fallen, confidence in banks had dropped 22% – more than double the drop for any other type of institution.

Already struggling to maintain their stature in the public eye, banks cannot afford any additional hits to their good names and to the confidence their clients have in them – so one would think that they would do everything in their power to ensure that customers have 24/7 access to their funds and avoid service interruptions. And indeed, banks do put a lot of effort into ensuring access – but somehow, service outages do occur. And when they do, dismay, disbelief, and eventually anger ensue. 

Just ask the folks at HSBC UK. Earlier this year, the bank’s online and mobile services went down due to “technical problems,” with the services offline for days. Angry customers took to social media with pleas, puns, and protest: 

“Our rent’s overdue. But I can’t check into it because the HSBC website is down.” 

“HSBC (or ‘How Simple Became Complicated’) still has its online banking platform down.” 

“How a biz as big and wealthy as @HSBC_UK can have internet banking down for 2 days is beyond belief.” 

There are no public statistics on whether irate customers closed their accounts and moved to a competitor because of this event. But no bank should take comfort if they did not, because that most likely means customers have simply thrown up their hands in frustration and don’t expect any better from another bank.

One would assume that a bank has done everything it could to avoid such outages. No doubt these banks buy top-of-the-line systems to ensure service continuity, yet outages still happen. Why?

Blame it in part on the pace of technological change. To keep up, banks need to continue to push the envelope with new services and implement new systems. But are those systems compatible with the existing ones? Will a glitch in one affect the others? And can those glitches be worked out so that services stay up?

The answer to that latter question is often “we hope so” – and that’s an answer that even the IT department will give. It’s actually the only honest answer; there are likely thousands of files on bank servers that control the configuration of services. Obviously, examining these configurations manually is next to impossible for an IT team – even one with hundreds of members. And running a test environment is usually not enough; the test environment will likely not include all the legacy systems and infrastructure layers that the new services must integrate with, leaving these interdependencies untested until the new system goes live.  
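To make the scale of that problem concrete, here is a minimal, purely illustrative sketch of the kind of cross-server comparison an IT team would otherwise have to do by hand: collect a snapshot of the configuration files from each node in a cluster and flag any file that is missing or differs between supposedly identical hosts. The host names and file paths are hypothetical, and this is not a description of any particular product.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def fingerprint(path: Path) -> str:
    """Return a short hash of a configuration file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()[:12]


def find_divergent_configs(host_dirs, relative_paths):
    """Compare the same config files across hosts and report mismatches.

    host_dirs maps a host name to a local snapshot of its config tree
    (gathered, for example, by a nightly inventory job); relative_paths
    lists files that are expected to be identical on every host.
    """
    findings = []
    for rel in relative_paths:
        variants = defaultdict(list)  # file hash -> hosts with that variant
        for host, root in host_dirs.items():
            candidate = root / rel
            if not candidate.exists():
                variants["<missing>"].append(host)
            else:
                variants[fingerprint(candidate)].append(host)
        if len(variants) > 1:  # more than one variant means drift
            findings.append((rel, dict(variants)))
    return findings


if __name__ == "__main__":
    # Hypothetical snapshot locations and file names, for illustration only.
    hosts = {
        "app-node-1": Path("/var/config-snapshots/app-node-1"),
        "app-node-2": Path("/var/config-snapshots/app-node-2"),
    }
    for rel, variants in find_divergent_configs(hosts, ["etc/db/pool.conf"]):
        print(f"DRIFT in {rel}: {variants}")
```

Even a script like this only scratches the surface: it says nothing about whether a setting is correct, only whether hosts disagree, which is why manual review cannot scale to thousands of files spread across dozens of infrastructure layers.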

These configuration issues are not necessarily even due to “errors” or bugs – they are often outgrowths of the normal function of systems. While IT infrastructures have been growing massively in complexity and scale, the tools at the disposal of IT teams have not kept pace. And as the rate of change in these environments continues to escalate with newer technologies and rapid release cycles, it’s no wonder IT teams are struggling to ensure all facets of the infrastructure are risk-free and configured according to industry best practices. 

In fact, a recent study by the University of Chicago, which examined the roots of service outages at online companies, determined that the majority – nearly 300 of the 500-some instances examined – were due to “unknown” factors. “Unknown” could mean anything: misconfiguration, malware, incompatible software, and so on. The point is that it is unknown, and that is dangerous for any organization that needs to ensure continuity of services – especially organizations like banks, which, subject to regulation and public ire, are in a more sensitive position than most other service businesses.

Detecting these unknowns at the scale and complexity common in today’s environments requires a different approach: a quality-assurance mindset and processes designed to proactively identify misconfigurations and risks. It requires system-wide visibility and the ability to proactively examine and analyze the thousands of things that could go wrong across all layers of the infrastructure, alerting IT personnel to potential risks so they can be addressed before they adversely impact service availability and performance.

An automated big-data system that constantly crawls a company’s IT infrastructure – checking dependencies, determining how resources are allocated, and proactively detecting problems – could help IT teams ensure full availability. When a change occurs that could cause a service disruption, the system alerts IT personnel, pointing out where the problem is and what needs to be done. The issues are highly visible, and remediation procedures are available to allow for a quick resolution before the business is affected.
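In very rough outline, such a system might run a library of rules encoding known failure modes and best practices against the configuration facts collected from each host, with any rule that fires producing an alert and a suggested remediation. The sketch below is not a description of any vendor’s product; the rule names, thresholds, and fact keys are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    host: str
    risk: str
    remediation: str


# Each rule inspects the collected "facts" for one host and returns findings.
# These two rules are purely illustrative; a production system would carry
# hundreds of checks derived from vendor guidance and industry best practices.
def check_replication_lag(host: str, facts: Dict) -> List[Finding]:
    lag = facts.get("db_replication_lag_seconds", 0)
    if lag > 300:
        return [Finding(host, f"database replica is {lag}s behind the primary",
                        "investigate the replication link before relying on failover")]
    return []


def check_cluster_node_count(host: str, facts: Dict) -> List[Finding]:
    if facts.get("cluster_nodes_online", 0) < facts.get("cluster_nodes_expected", 0):
        return [Finding(host, "cluster has fewer nodes online than configured",
                        "restore the missing node or update the expected topology")]
    return []


RULES: List[Callable[[str, Dict], List[Finding]]] = [
    check_replication_lag,
    check_cluster_node_count,
]


def scan(inventory: Dict[str, Dict]) -> List[Finding]:
    """Run every rule against every host's collected facts."""
    findings: List[Finding] = []
    for host, facts in inventory.items():
        for rule in RULES:
            findings.extend(rule(host, facts))
    return findings


if __name__ == "__main__":
    # Invented facts for a single host, standing in for data a collector would gather.
    inventory = {
        "core-db-1": {"db_replication_lag_seconds": 1800,
                      "cluster_nodes_online": 1,
                      "cluster_nodes_expected": 2},
    }
    for f in scan(inventory):
        print(f"[ALERT] {f.host}: {f.risk} -> {f.remediation}")
```

The value lies less in any single rule than in running all of them continuously across every layer, so that a risky change is flagged and remediated before it turns into an outage.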

Of course, there will still be issues for IT teams to contend with, ranging from hacking and security threats to regulatory and compliance requirements. At least, though, banks will know what they are dealing with, as opposed to searching in the dark – vulnerable to the vagaries of unknown IT issues, as well as to the ire of customers. 

 


2 comments

Wayne Sadin January 14, 2017 - 8:49 pm

The problem with IT at financial services firms can be traced to Technical Debt: core systems bought or developed decades ago as ‘systems of record’ and then not adequately maintained over the years are unable to meet the demands of today’s ‘systems of engagement.’

No CEO or Board would allow a decades-old factory to go without updating: they would understand that you can’t compete with yesterday’s capital asset. But given the abysmal level of IT understanding on the part of CEOs and Board members and the pressure put on CIOs to do more with less, it’s no wonder that systems’ maintenance can be put on the back burner without risk alarms going off.

As long as the demands placed on the systems are similar to those contemplated by the designers, lack of long-term maintenance may not be a serious issue. But banks today must change, and change quickly. And so systems are bent to the breaking point, hurriedly, which can lead to outages and other breakdowns.

This article should be a wake-up call for Bank Boards: be sure you have a Digital Director who can ask the right questions–about Technical Debt and other IT risks and opportunities–and understand the answers.

Alvaro Bueno January 17, 2017 - 3:23 pm

IMO the financial industry builds part of its IT infrastructure on old, large mainframes, while the rest is made up of modern distributed systems. Those older systems run transactions with unparalleled robustness and have been the best fit for the existing channels for more than 20 years.

However, the new channels need modern infrastructures that, even when properly updated, involve great complexity because of the enormous number of pieces that make them up.

An interruption of a service as critical as internet banking for two days can only be due to a lack of planning, a lack of foresight, and overconfidence.
