The economics of IT are always going to be a pain point. Sadly, penny-pinching on IT spending can result in some pretty creative issues. This is just a small Friday rant from work, so read at your own expense!
Today we had a web server D drive fill up (the drive with our data), which caused some errors to start occurring on that server. The drive filled up because the log files weren’t getting cleaned up. We didn’t get alerts because our web servers run on such small disks that we were getting constant reminders about low disk space, so we turned the alerts off since no one would pony up for more space. *
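For what it's worth, one alternative to switching the alerts off entirely is to lower the thresholds so a chronically tight disk only makes noise when it is really about to fall over. Below is a minimal Python sketch of that idea, not what we actually run: the function names and the 10% and 3% thresholds are made up for illustration.

```python
import shutil


def classify(free_pct, warn_pct=10.0, crit_pct=3.0):
    """Map a free-space percentage to an alert level (thresholds are assumptions)."""
    if free_pct < crit_pct:
        return "critical"
    if free_pct < warn_pct:
        return "warning"
    return "ok"


def disk_alert_level(path, warn_pct=10.0, crit_pct=3.0):
    """Return (level, free_pct) for the volume that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    free_pct = usage.free / usage.total * 100
    return classify(free_pct, warn_pct, crit_pct), free_pct
```

On a box that always hovers around 8% free, paging only on "critical" would at least keep the one alert that matters.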
The log files weren’t getting cleaned up because a separate web log processing server’s disk was full and couldn’t pull the logs in anymore. That disk filled up because a) no one wants to make a policy on how long to keep log files or how important they are, so they are kept forever, and b) no one wants to look at the criticality of the server and assign a dollar value, which can then be used to offset costs for more storage. So it stays with the disks it has.
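The frustrating part is how little code a retention policy needs once someone actually picks a number. Here is a rough Python sketch under stated assumptions: the 90-day figure, the `*.log` pattern, and the function names are all hypothetical, since the real retention period is exactly the decision nobody has made.

```python
import time
from pathlib import Path

# Assumed value for illustration only; the real number is a policy decision.
RETENTION_DAYS = 90


def expired_logs(log_dir, retention_days=RETENTION_DAYS, now=None):
    """Return *.log files in log_dir last modified more than retention_days ago."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # seconds per day
    return sorted(
        p for p in Path(log_dir).glob("*.log")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def purge_logs(log_dir, retention_days=RETENTION_DAYS, dry_run=True):
    """Delete expired logs and return (files, bytes freed).

    With dry_run=True (the default) nothing is deleted; the result only
    reports what would be removed.
    """
    victims = expired_logs(log_dir, retention_days)
    freed = sum(p.stat().st_size for p in victims)
    if not dry_run:
        for p in victims:
            p.unlink()
    return victims, freed
```

Running it with `dry_run=True` first gives a number for how much space a given retention period would free, which is at least something concrete to bring to a budget conversation.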
So a non-critical system that can’t get more storage due to penny-pinching caused an intermittent production outage on a system that itself is running on fumes because no one wants to put out for more storage. Capacity planning and budget submissions are one thing, but as much as we do them, the exec/business side continues to say “No thanks,” to the expense.
Ugh! I understand this can be a way to go for companies, kind of a JIT of disk storage, but it really, really helps to be up front with that policy so IT staff doesn’t have to constantly work in a “worry/told you before” sort of mode. It’s just not important until it brings down production and clients notice. Sounds awfully similar to security!
* I love the little side risk to this practice. Developers can easily enough put out code on their own that fills the disks and causes all the web servers to die in production. And even if intent isn’t there, we do run the risk that someone will accidentally publish something large that effects a DoS.