Is multi-cloud failover really the go-to resilience strategy for banking?
I have long followed Lydia Leong’s advice, and I have no doubt that she’s exposed to a wider market than I am, but her piece on multi-cloud failover slightly surprised me. Not its main premise, that multi-cloud failover is the wrong place to focus resilience efforts, which is hard to disagree with. However, I didn’t perceive the Bank of England’s advice for outsourcing as encouraging firms in financial services to focus on multi-cloud failover, nor did I see the Reuters article to which she refers in that light.
In fact I was entirely, perhaps naively, optimistic about the Bank of England’s statement, as I felt it set out a defined demarcation of responsibilities between the organisations that consume cloud, the cloud providers that increasingly support our national infrastructure, and the regulators, who need to take a much more active role.
The Bank of England’s stance
True, the statement maintained that “the regulated industries have primary responsibility for managing risks stemming from their outsourcing and third party dependencies”. That’s not a surprise or shock, and it’s hard to see how you can argue that this shouldn’t be the case without affecting an organisation’s independence in choosing its outsourcing and third party arrangements.
It’s also fair to say that the vast majority of organisations’ services are complex, interconnected and increasingly dependent on one or a few cloud providers, especially if you look at the total supply chain. That’s the hybrid reality that exists at most large organisations today, and it’s true that the blast radius from the failure of a cloud is increasing. So it’s no surprise that regulated organisations, like those in financial services, are increasingly thinking about an “exit strategy”, i.e. how quickly, in the event of a vendor outage, they could switch to an alternative cloud provider or in-house back-up to avoid disruption to customers.
However, I don’t think the BoE statement necessarily pointed organisations towards relying on multi-cloud failover. To me, it highlighted the need for organisations to consider it as a possibility across their outsourcing plans and third party vendors. Organisations are now reliant on the availability of these third party vendors, who are also cloud consumers, as well as on their own direct cloud consumption.
Resilience isn’t just in the hands of the cloud consumer
What was more interesting to me was the Bank of England’s recognition that the services of a few large cloud providers are so ubiquitous across many banks that “a glitch at one cloud company could bring down key services across multiple banks and countries, leaving customers unable to make payments or access services, and undermine confidence in the financial system”. The BoE went on to say that “additional policy measures, some requiring legislative change, are likely to be needed to mitigate the financial stability risks stemming from concentration in the provision of some third-party services”. Of particular interest was the suggestion that measures should include “an ability to designate some third parties as 'critical', meaning they would be required to meet 'resilience' standards which would be regularly tested”. I took that to mean that this was a step towards thinking about cloud providers as part of the (UK) Critical National Infrastructure and making the problem not that of a single organisation but a national concern. This seems sensible, given cloud solutions are beginning to underpin a great deal of the 13 CNI sectors.
The BoE statement was a nod towards suggesting that resilience wasn’t something that they expected the regulated organisations, as consumers, to tackle alone. Instead, they were indicating that regulators need to appreciate the critical role our cloud providers play across all our national infrastructure, and how difficult and complicated multi-cloud failover is. In doing so, the statement challenges the regulators to engage more with these vital cloud providers and act more on behalf of industry.
The role of the regulator
The area of Lydia’s argument I was most uncomfortable with was her assertion that “Regulators, risk managers, and plenty of IT management largely think of AWS, Azure, etc. as monolithic entities, where “the cloud” can just break for them, and then kaboom, everything is dead everywhere worldwide”.
For me, the Bank of England’s statement showed that they know that it’s not that simple. Instead the solution has to be baked deep into an organisation’s resilience and outsourcing strategy and deep into in-house and procured services. The BoE also knows that resilience issues are unlikely to be solved by the largest cloud providers working alone. Even if a cloud provider wanted to tackle vendor resilience (and I will bet that, considering the impact of recent large provider outages, there are people within all of the large cloud providers talking about this), ensuring that the solution is best for UK national infrastructure is not their sole focus.
It also made me feel that the Bank of England understands its role in setting the agenda, and that regulation, or the threat of it, is an effective way to make big, complex national problems important and immediate to consumer and supplier alike. It’s the role of the regulator to spearhead the response to hard, thorny challenges like third-party vendor resilience.
A holistic approach to resilience
The key point, on which I totally agree with Lydia Leong, is that multi-cloud failover shouldn’t be the primary problem that individual organisations are focussing on. A sensible approach would be to consider it as just one component of an overall resilience and risk management strategy. Therefore I read the Bank of England’s statement, which ties it to the organisation’s third party and outsourcing strategy, as a positive step toward recognising that a holistic approach to resilience is what’s needed. I also saw recognition that the challenge posed by the increasing criticality of the cloud providers to our national success is a much more collective one that can’t be met by individual organisations alone.
It’s also worth noting that resilience regulations like those from the BoE and the EU’s Digital Operational Resilience Act are not solely focused on the role of cloud(s) within financial services organisations. Digital transformation has accelerated cloud adoption - the Boston Consulting Group estimates that by 2025, up to 60% of consumer-facing applications and more than 30% of core business applications will be running on public clouds. But these figures also show that for many large banking institutions there are some workloads or technologies that cannot or will not be migrated, which only adds to the complexity of technology estates. Regulators are therefore also concerned with how these environments operate alongside each other in the event of instability.
So, and mostly due to compelling voices such as Lydia’s, I think that the Bank of England’s statement, and the actions of other regulators, show a greater awareness of the complexity of multi-cloud failover, the need for it to be part of a broader resilience strategy, and the need for a collective approach between organisations, cloud providers and regulators.