IT disasters come in many flavors. Here are some top examples from the past 15 years.
Going Through the Motions
The most embarrassing failure we ever cleaned up involved a high-end backup tape system that a well-known IT company had set up for a local marketing company. The system had been in place for five years, and the staff member in charge of backups had religiously changed the tapes every day for that entire time. One day a server failed and they needed the previous day's backup, only to find that there was nothing on the tape. We were called in and discovered that all seven tapes in the rotation had their “write-protect” tab set, so nothing could ever be written to them.
This was a multi-level failure. The backup system was reporting in its logs that it couldn’t write to the tapes, but no one had told the person in charge of backups to check the logs, and no one had set up the logs to be emailed to anyone. Further, they had not instituted a quarterly check to verify that a backup could actually be restored from the tapes. Tapes also need regular replacement, because they wear out. Of course, at this company the tapes were far from worn out; in fact, they had never been used at all in five years. Those were five years in which management thought they had robust backup protection.
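For readers who want to see what those two missing checks might look like, here is a minimal Python sketch. It is an illustration under stated assumptions, not what this client's backup software provided: the failure keywords, function names, and sample log lines are all hypothetical, and real backup products word their logs differently. The first function flags suspicious log lines so they can be emailed or reviewed; the second compares a restored file against the original by checksum, which is the heart of a periodic test restore.

```python
import hashlib
from pathlib import Path

# Hypothetical keywords; adjust to the wording your backup software actually logs.
FAILURE_MARKERS = ("write-protected", "cannot write", "write error", "failed")


def find_backup_failures(log_text: str) -> list[str]:
    """Return the log lines that contain any failure marker (case-insensitive)."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(marker in line.lower() for marker in FAILURE_MARKERS)
    ]


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches_source(source: Path, restored: Path) -> bool:
    """True if a file restored from backup is byte-identical to the original."""
    return sha256_of(source) == sha256_of(restored)
```

Run nightly against the backup log, the first check would have surfaced the write failures on day one rather than in year five. The second only means something if the restored copy really comes off the backup media, which is exactly the test this client never ran.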
We got a phone call from a local medical device company early on a Sunday morning. There had been a break-in at their office on Friday night, and the thieves had stolen ten computers, ten phones, one phone server, one camera system server and their business server. Every shred of technology that their business depended on was gone. Ironically and luckily, the thieves had ejected the backup tape from the server and left it sitting where the server had been.
The owner had been unable to reach their IT company and asked us to get them going again. We immediately brought out ten computers, a phone system and a server. We recovered their latest backup from the tape and had them fully operational by the start of business Monday morning. We worked with their insurance company to get their claim filed and ultimately set up a new system for them with both on-site and off-site backups, among many other improvements.
Heat Your Office with Bitcoin
We got a call from a large law firm that complained its PCs were constantly slow and its in-house IT department was unable to figure out the cause. We started an investigation, but it seemed the in-house IT staff was surreptitiously trying to hamper it. Ultimately, we figured out that the IT staff was running a Bitcoin mining scheme on over 600 computers and servers. Interestingly, at the same time the firm had been trying to figure out why its electric bills had spiked over the previous several months. It’s impressive how much power 600 computers running maxed out 24/7 will use, and how much heat they produce, which forces the AC system to run at full capacity too.
Running out of Space
We had a new client who complained that they were constantly adding server storage and new servers, all of which quickly ran out of space. Unsurprisingly, their IT staff was unhelpful in getting to the source of a fairly obvious and nefarious scheme. The staff were running an illegal movie-sharing website on the company's servers and diverting resources away from the line-of-business applications so they could share questionable content. Eliminating this activity not only sped up operations drastically, it also removed a legal liability from the company and allowed four servers to be mothballed. The company also saved the cost of three IT staff who were clearly providing a negative return on investment.
Where There is Smoke
One of our customers had a fire in their building overnight. I had never experienced the aftermath of a fire before and was stunned that while the fire itself had not done a lot of damage, the smoke and the water used to extinguish it had. You could open a filing cabinet in a part of the building well away from the fire, pull out two pieces of paper, and find soot in between them. There was soot in and on everything. The next day we set up a network in an unaffected part of their building and brought in temporary computers, a phone system, and servers to get them going again. We coordinated with AT&T to have their data and phone lines brought in at another point in the building. We spent many nights and weekends bringing the disaster down to a manageable level so they could keep the business going.
All of this is to say: make sure your IT company and staff have thought through all of the ways that the loss of your data or systems could affect your ability to stay in business after an unforeseen problem.