A trusted disaster recovery plan does more than assure credit unions that their data will be safe should a system failure occur. Federal Financial Institutions Examination Council examination guidelines also require financial institutions to have one in place.
For the $96 million, Coulee Dam, Wash.-based Coulee Dam Federal Credit Union, building and testing a solid plan was a team effort between the credit union and IT-Lifeline, a Liberty Lake, Wash.-based provider of disaster recovery and testing solutions.
Prior to partnering with IT-Lifeline in 2008, Coulee Dam FCU received data vaulting services from a company located thousands of miles away in Texas, said Shannon Burge, vice president of technology for Coulee Dam FCU. The credit union runs Harland Financial Solutions’ UltraData Enterprise Core on an in-house IBM AIX platform, along with a number of ancillary systems including Exchange email, file and print servers and two domain controllers.
“The examiners began to focus more on disaster recovery after Hurricane Katrina, and in the event of a disaster, we would have had to recover in Texas,” Burge said. “We wanted a local provider, and IT-Lifeline is only a two-hour drive from us. It’s feasible to get there, but it’s also far enough away for the weather to differ from ours.”
With a better-suited disaster recovery vendor in place, Coulee Dam FCU formed a testing plan designed to comply with FFIEC guidelines and to ensure system recovery within hours, Burge said.
Regulators require a written disaster recovery management plan and evidence of annual critical environment testing. Burge said while the annual test can be a mock test in which key personnel lay out a disaster scenario, discuss their next steps, and identify and address gaps, Coulee Dam FCU chose to physically recover its critical environment every year.
For each of the five annual tests Coulee Dam FCU has conducted since 2008, the credit union and IT-Lifeline set up a private cloud, into which credit union employees recovered the previous night’s backup data from its core system and three or four of its servers. Coulee Dam FCU recovers a different combination of servers in each test.
“The testing process has become much shorter over the years, and I come away from it feeling secure,” Burge said. “I know that if we had to declare a disaster, I’d be very comfortable.”
IT-Lifeline CEO Matt Gerber said Coulee Dam FCU’s process of testing its core system annually and rotating servers before each test, as opposed to testing its core system and all of its servers every year, is an example of a disaster recovery best practice.
“Credit unions need to first identify the different areas of their IT operations,” Gerber said. “The regulators do not want them to test all of their servers. They need to determine which is the most critical, test three or four of those at a time, and rotate them every year.”
The most common cause of an IT disaster is hardware failure, which Gerber said accounts for around 90% of the incidents he has seen. For example, if a cluster of four servers supports 25 applications and one of the four servers dies, it may take down the other three servers and all 25 applications with it. He said disasters can also result from the accidental deletion of a file or folder and, more rarely, from an external event such as a fire at a data center.
The biggest mistake credit unions make in the disaster planning process, Gerber said, is neglecting to conduct annual tests.
“Some will attempt to make a recovery for the first time in two years, and it will take much longer [than it would have after one year], because in that time period, systems change and people come and go,” he said. “What [IT-Lifeline] does is help financial institutions plan for, document and report annual tests so they can give the evidence to their examiner. With those documents, they’re ensuring they have a much higher probability of recovering in the event of a disaster.”
When Coulee Dam FCU conducts its next test in June, the credit union will use an IT-Lifeline solution with better virtual machine recovery capabilities. Eighty-five percent of Coulee Dam FCU’s environment is virtualized, Gerber and Burge said. Additionally, Burge said the credit union would like to begin testing once every six months.
“We want to have that cadence, that rhythm, in our testing, so if something happens, we’re prepared,” Burge added.