Society is changing, and so are the demands on IT departments and CIOs across industries. Our clients and the users of our systems have higher expectations, driven by our experience as consumers. Google is never down, so why should our banking or payment system be down for maintenance or experience an outage? Availability requirements above 99.9 percent are becoming the new norm in our hyper-connected world.
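To put that figure in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration, assuming a 365-day year; the function name `allowed_downtime_hours` is made up for this example and does not refer to any real tool.

```python
# Hypothetical helper: yearly downtime allowed by an availability target.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (assumption)


def allowed_downtime_hours(availability_percent: float) -> float:
    """Return the maximum downtime per year, in hours, for a given target."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(target)
    print(f"{target}% availability allows about {hours:.2f} hours of downtime a year")
```

At 99.9 percent, that works out to roughly 8.76 hours of downtime a year, planned maintenance included.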
For some organizations, outsourcing critical system operations is an emerging trend; for others, it is already the norm. But how can we navigate the world of clouds, platforms as a service and other service offerings? In the world of disaster recovery, leveraging outsourcing wisely can help strengthen resiliency at a lower cost. Of course, there is no one-size-fits-all solution.
Outsourcing allows companies to focus more on their core business and eliminates the need to develop and support in-house expertise in non-essential areas. In my view, it is also about cost savings: bringing the best value to the business as a technology provider. Running hardware that is secured, continuously patched and reliable is table stakes. Nobody gets extra points for it anymore.
Luckily, most IT departments design disaster recovery for disasters on a smaller scale than massive earthquakes or floods. Even so, we still need initial and ongoing risk assessments, rock-solid business continuity and disaster recovery plans, and stress testing. They are crucial to the success of outsourced operations. In our IT world, a major power outage or a failure of cooling systems is a disaster. At Payments Canada, as we continue on our journey to excellence and continuous operations, we want to make our testing progressively more difficult so that it better replicates the stressful environment of a real crisis.
When considering outsourcing, you can’t just sign a contract and assume the vendor will take care of everything. Remember that outsourcing is a partnership that takes effort and continuous communication. You can outsource responsibility, but the accountability stays with you. Before and during the outsourcing process, you need to consider the business landscape, anticipated regulatory expectations and the technological challenges facing third parties. The vulnerability of outsourcing partners also needs to be examined. In the financial markets, systems are so interconnected that if one of ours goes down, it can have ripple effects. The weakest link in the chain really matters.
To avoid pitfalls, outsourcing partners need to understand, through continuous education, that while we may be small from an invoicing perspective, Payments Canada’s systems are critical infrastructure and we are huge from a reputation and risk perspective.
Payments Canada’s Large Value Transfer System (LVTS) has been designated by the Bank of Canada as systemically important to the Canadian economy, and it clears and settles more than $196 billion on average each day. Disaster recovery is therefore crucial to ensure that operations aren’t wiped out by a single event, such as a blackout or natural disaster. If our systems did not work, the Canadian economy would be impacted. Trades in securities, major acquisition deals and things like payrolls would be affected.
Payments Canada has developed a Resilience and Security Strategy that focuses on near-continuous operations, policy and process improvements, managing interdependencies and an approach to cyber security. These enhancements are essential for us to meet our legislative mandate and public policy objectives like safety and soundness.
A secondary data site that is a significant distance away from the main hub is needed to provide a buffer in case of an emergency. It acts as an insurance policy: a business will likely never have to run from its secondary site, but the site is needed to ensure the stability of our systems. It is a challenge for us as CIOs to convince our boards to make major investments in something that may never be used. Luckily, my experience has been that board directors are increasingly aware of this and want to manage risks for the organization.
While we have a solid track record in resilience and security, Payments Canada is constantly investing in and enhancing our business continuity planning and disaster recovery. Even now, we are in the process of enhancing our disaster recovery capabilities to respond to an evolving landscape with greater integration, interdependencies and cyber risks. Remember, if you do not test or use your secondary site regularly, there will always be a hesitation to use it in a real failure. You have to have absolute confidence in your disaster recovery procedures so you feel comfortable invoking them when you are under a lot of pressure.
In our organization, we have embarked on a major modernization journey and are developing a more modern architecture that supports near-continuous operations in the event of a crisis. If one data centre fails, a second system will seamlessly take over operations. Under our regulatory requirements, in the event of any emergency or disaster, Payments Canada needs to restore operations within two hours.
More and more businesses are moving to this model, in which a secondary data centre is set up with mirrored capabilities. This secondary data centre must be far enough away from the primary site that it would not be affected by a large-scale natural disaster.
Some businesses have a third data centre as part of their disaster recovery strategy so they can recover from data corruption. A third site with delayed data replication is an effective solution. The only other option for a company without this capability is to restore data from the last non-corrupted backup, which usually causes data loss and an extended outage. Disaster recovery could be viewed as an onion with multi-layered resiliency. A third data centre could be another layer that limits residual risk and helps meet a more demanding Recovery Point Objective (RPO).
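The idea behind delayed replication can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a description of any vendor’s product or of Payments Canada’s systems: the class `DelayedReplica`, its methods and the one-hour delay are all hypothetical, and timestamps are plain seconds.

```python
from dataclasses import dataclass, field


@dataclass
class DelayedReplica:
    """Hypothetical third site that applies changes only after a fixed delay."""
    delay_seconds: float
    applied: dict = field(default_factory=dict)   # state visible at the third site
    pending: list = field(default_factory=list)   # queued (timestamp, key, value)

    def receive(self, timestamp: float, key: str, value) -> None:
        """Queue a change shipped from the primary site."""
        self.pending.append((timestamp, key, value))

    def apply_due(self, now: float) -> None:
        """Apply only the changes that are at least delay_seconds old."""
        still_pending = []
        for ts, key, value in self.pending:
            if now - ts >= self.delay_seconds:
                self.applied[key] = value
            else:
                still_pending.append((ts, key, value))
        self.pending = still_pending

    def discard_since(self, corruption_time: float) -> None:
        """Drop queued changes made at or after the moment of corruption."""
        self.pending = [c for c in self.pending if c[0] < corruption_time]


replica = DelayedReplica(delay_seconds=3600)   # one-hour delay (assumed)
replica.receive(0, "balance", 100)             # good change
replica.receive(3000, "balance", -999)         # corrupted change
replica.apply_due(now=3600)                    # only the good change is old enough
replica.discard_since(corruption_time=3000)    # corruption caught inside the window
replica.apply_due(now=10000)
print(replica.applied)
```

Because the corrupted change was still waiting in the queue when the problem was detected, the third site keeps the last good value. The trade-off is that the delay has to be longer than the time it takes to notice the corruption.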
Payments Canada is considering this for the future as another level of security to ensure our systems have near-continuous operations. Outsourcing is an efficient way to run a third data centre. In our case, as we are small in size, running our own data centres, cooling systems and backup power would quadruple our operating costs. As Payments Canada leads and facilitates the modernization of Canada’s payments systems, we are improving and replacing legacy infrastructure, and we have set a Risk Appetite target of best-in-class resiliency.