Your business is hit with a ransomware attack. Or your ecommerce site crashes. Your legacy system stops working. Or maybe your latest software release has a major bug. These are just some of the problems that ecommerce, technology and other companies experience at one time or another.
The issue is not whether a problem – or crisis – occurs, but how your company handles it when it does. Manage the problem poorly and you risk losing customers, or worse. Handle a crisis promptly and professionally and you can fend off a public relations disaster, and might even gain new customers.
So what steps can businesses take to mitigate and effectively manage an IT-related crisis? Here are eight suggestions.
1. Stay calm, prioritize and don’t point fingers
“During a crisis like a site outage, which has a high level of visibility, it is absolutely critical for senior IT leadership to remain calm and focused,” says Jake Bennett, CTO, POP. “Everyone on the team is under an extreme level of stress during an outage, and this can lead to mistakes. However, you need the team to perform their best at that very moment.
“As a leader, by simply being a balanced, reassuring presence, you can calm nerves and make sure the team stays focused on resolving the issue,” he explains. “Conversely, if you appear to be spinning out of control, you’ll take your team down with you and it will take much longer to resolve the issue.”
Similarly, “micromanaging your team when the pressure is on will lead to disaster,” says David Cox, CEO, LiquidVPN. “A better approach is to hold a 10- or 15-minute meeting [to lay out the problem(s) and what needs to be done], break up the work into sections, assign work to each team member and keep them focused on their tasks.”
Finally, don’t look for excuses. “If you start making excuses about what caused the issue or pointing fingers, this will erode confidence,” says Eric Hobbs, CEO, Technology Associates. “There will be plenty of time for a postmortem later.”
2. Have both an incident response plan and a disaster recovery plan in place
“Handling a security crisis can often come down to preparation,” says Ron Winward, security evangelist, Radware. “Even if you don’t have a security budget, you can still plan for what you will do if you encounter a security problem. Understand who needs to be notified, both internally and externally, as well as who will be involved in your response. Then practice it. Those first few minutes and hours will be critical to how you fare under duress.”
Similarly, “when it comes to common network problems, like servers going down, it pays to have a recovery plan in place before it happens,” says Jacob Beckstead, marketing manager, Bailey’s Moving and Storage. The plan “should be detailed enough to follow step by step, but broad enough to allow room for improvisation, because even a well-laid plan always needs changing in the moment, depending on the specific situation.”
3. Take snapshots regularly (at least once a week)
“If you’re hit with viruses, ransomware or data corruption, rely on snapshots to restore your data, rolling back to several minutes, hours or a day prior,” says Kevin Liebl, vice president of marketing, Zadara Storage. “With snapshots, your Recovery Time Objective (RTO) and Recovery Point Objective (RPO) depend on how you have set up your automated snapshot process. Depending on how frequently you set snapshot levels, you may not be able to restore everything, but at least you’ll have a fast path to partial restoration.”
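To make the link between snapshot frequency and recovery point concrete, here is a minimal Python sketch. It assumes you have a list of snapshot timestamps and know roughly when the incident began. The function names and the hourly schedule are hypothetical, chosen only for illustration, and do not reflect any particular storage product.

```python
from datetime import datetime, timedelta

def latest_clean_snapshot(snapshots, incident_time):
    """Return the most recent snapshot taken strictly before the incident, or None."""
    candidates = [s for s in snapshots if s < incident_time]
    return max(candidates) if candidates else None

def data_loss_window(snapshots, incident_time):
    """Return how much work is lost by rolling back to the latest clean snapshot."""
    snapshot = latest_clean_snapshot(snapshots, incident_time)
    return None if snapshot is None else incident_time - snapshot

# Hypothetical schedule: one snapshot per hour, starting at midnight.
start = datetime(2024, 1, 1, 0, 0)
hourly = [start + timedelta(hours=h) for h in range(24)]

# Hypothetical incident: corruption first noticed at 14:40.
incident = datetime(2024, 1, 1, 14, 40)

restore_point = latest_clean_snapshot(hourly, incident)
lost = data_loss_window(hourly, incident)
```

With this schedule, the restore point is the 14:00 snapshot and 40 minutes of changes are lost. The worst case is always just under one snapshot interval, which is why a weekly schedule can leave up to a week of data unrecoverable.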
4. Have a failover option
“If your cloud-based system has an outage, and you have set up the proper failover architecture, you can simply redirect your applications and/or data to an alternate cloud service provider,” says Liebl. “Having a multi-cloud architecture is very smart to prevent single points of failure. Alternatively, many cloud-based architects recommend a hybrid approach where critical data is synchronously mirrored between the cloud and on-premises storage. This way, you can failover between the cloud and an on-premises copy of your data.”
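The failover idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production design: the endpoint names are hypothetical, and `is_healthy` stands in for whatever health check your environment actually provides.

```python
from typing import Callable, Iterable, Optional

def pick_endpoint(
    endpoints: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint, in priority order, that passes its health check."""
    for endpoint in endpoints:
        try:
            if is_healthy(endpoint):
                return endpoint
        except Exception:
            # A health check that raises is treated the same as a failed check.
            continue
    return None

# Hypothetical priority order: primary cloud, alternate cloud, on-premises copy.
priority = ["primary-cloud", "alternate-cloud", "on-premises"]

# Hypothetical health status, simulating an outage at the primary provider.
status = {"primary-cloud": False, "alternate-cloud": True, "on-premises": True}

active = pick_endpoint(priority, lambda name: status[name])
```

Here `active` is `"alternate-cloud"`, because the primary provider fails its check. Keeping the priority list as plain data makes it easy to rehearse a failover before a real outage forces one.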
5. Involve the PR/media team – and legal (if necessary)
“The importance of involving [public or media relations professionals] early on when an issue arises and has the potential to become a crisis is imperative for mitigating damage to a company’s brand,” says Kimberly Nissen, president, Public Relations Society of America, Philadelphia Chapter. Your PR team “should be among the first to know of a potential incident or breach because the earlier [it] is aware of a situation, the more time [it has] to collect the facts and work with the legal department to prepare a public-facing statement.
“Having such a statement on standby should the incident become public is crucial,” she states. “In cases such as cybersecurity incidents, during which attackers may want to take credit for their work by broadcasting their own statements on social platforms, it is especially important that the [PR] team is ready to monitor what’s being said about the company and the situation to respond effectively, if and when appropriate. This cross-silos collaboration is essential for mitigating damage to the company’s reputation.”
6. Immediately notify customers
“If your email server gets hacked and your entire customer list is sent spam from your company email address, quickly send a global email to let your customers know,” says management consultant Amy Cooper Hakim. “Make sure that the subject line is clear and direct. Write something like: ‘Please do not open an email from us with [x] in the subject line.’ Then, in the body of the message, own up to the error. Apologize and take responsibility (even though it is not your fault). Customers expect mistakes to happen. It is your job to wow them on recovery.”
Similarly, post messages on your social media accounts (e.g., Twitter, Facebook, LinkedIn) letting people know you are experiencing a problem but are handling it – and will keep people posted.
7. Manage user/customer expectations
“Managing expectations is the key to ensuring that a problem isn’t compounded by users perpetually asking questions,” says Beckstead. So it’s important for companies to provide customers/users with “a brief explanation of the problem, in layman’s terms but specific enough to give them an idea of what’s wrong,” he says. “That way it doesn’t look like you [are] trying to cover something up or avoid the problem and creates some empathy from people who may not understand completely, but know you’re working on something to fix it.”
It’s also important to let people know “when a fix may be in place, overestimating (by 10 percent) to be safe. If the problem spans multiple days… update users every 24 hours or so to let them know that you’re working on it. This way you aren’t barraged.”
Finally, says Beckstead, “send an all-clear message when the problem is resolved.”
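The arithmetic behind that advice is simple enough to sketch. The Python below assumes a hypothetical outage with a 50-hour internal repair estimate. It pads the estimate by 10 percent and lists the 24-hour status updates that fall before the padded deadline. The function names and dates are illustrative only.

```python
from datetime import datetime, timedelta

def padded_eta(start, estimate, padding_percent=10):
    """Return the fix time to announce: the internal estimate plus a safety margin."""
    return start + estimate * (100 + padding_percent) / 100

def update_times(start, eta, interval=timedelta(hours=24)):
    """Return the times of the periodic status updates due before the announced ETA."""
    times = []
    next_update = start + interval
    while next_update < eta:
        times.append(next_update)
        next_update += interval
    return times

# Hypothetical outage: begins at 09:00, with an internal estimate of 50 hours.
outage_start = datetime(2024, 1, 1, 9, 0)
eta = padded_eta(outage_start, timedelta(hours=50))
updates = update_times(outage_start, eta)
```

A 50-hour estimate becomes a 55-hour public commitment, so the announced fix time is 16:00 on January 3, with interim updates due at 09:00 on January 2 and January 3. The all-clear message then goes out once the fix is confirmed.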
8. Conduct a postmortem
“Once an IT outage issue has been resolved, it’s important to immediately conduct a blameless postmortem to analyze what happened,” says Rachel Obstler, vice president, product, PagerDuty. “Use this time to evaluate what worked well in your incident response process and what didn’t, as well as ways you can fire-proof your system for incidents of this nature in the future… [and] perfect and streamline your incident resolution process [going forward].”