Server uptimes have long been a badge of honour in operations departments, and working with both Linux and Windows server teams over the years has given the Linux contingent plenty of chances to laugh at and deride the sometimes weekly Windows server reboots. Aside from the OS, what impact does a long uptime have on the hardware? I found out recently.

It started with alarms going off on my mobile phone one Monday morning before I left the house for work – a UPS battery failure. Not a great thing to come into first thing. At least it wasn’t over a weekend. Good, had a spare battery, happy about that. Having spares helps reduce the concern and panic of bringing back machines.

Had the place to myself, being early, so started the exciting process of shutting down servers. I have to say the quiet of the server room with no flashing lights is a scary thing. Then to the battery replacement: UPS off, battery out, new battery in, UPS back on. All good. Servers all starting up. Yeah. Ah, all but one, and my heart sinks. OK, perhaps I knocked the power cord when replacing the battery. No. The light on the PSU at the back of the server is lit. Oh dear. Cover off. No fan briefly turning at power on. I know that pressing the power switch more than once won’t make a difference, but I try anyway, frantically, with different ways of pressing it. That machine had been up over 400 days, but today it died.

Autopsy time – PSU or motherboard. Out of the rack. First, a PSU swap for a spare, and to my great relief I got lucky. A dead PSU. Spares… thank you. Back in the rack, and then all is up and running at last.

Lucky that time. In the past I’ve had machines with redundant power supplies, for example. But I remember one boss saying “Redundant everything? Why spend that money?” Well, because of things like this, I suppose. You just never know what may fail and when, and it is usually at the most inconvenient time. Ironically, that particular machine did eventually die from the VRM failing, and I couldn’t get a redundant one of those. Most often you do in fact win, but sometimes you lose.

How lucky are you feeling today?

Kev





Switching A Server Off – A Close Call