It's been a bit of a shit day today. The people who look after our building decided to do some work on the air conditioning systems. Normally this wouldn’t be a problem, except they didn’t tell us what they were doing, and the only air con that seemed to be affected was in our main server room!!
It started to get mightily hot in there, and at one point it got up to 38C. We had various fans in there, but nothing seemed to be getting rid of the hot air. The best thing seemed to be getting a long foil tube, putting one end over a large fan situated at the back of the racks, and running the other end out of the door. This got rid of a lot of heat until the portable air con units turned up.
We managed to get away with very few initial problems: the WatchGuard firewall overheated, switched itself off and lost its config, various servers lost disks, and everything was reporting heat warnings.
The only complete failure was a Dell PowerEdge 750 server that was running websites for our editorial system. It went down and refused to power up again while it was still in the rack. We moved its services onto another server, and when things got a bit quieter we took the server out of the rack. When we got inside it, we found a section of the main fan had broken off completely, and fan blades had been tossed all over the server!
So we got the fan out of the housing, and it looked as though the bearing had gone, leaving the fan loose in the casing. Presumably the blades had hit something, and that is how they snapped off.
We then looked in Dell IT Assistant, and one of its last temperature warnings was at 88C. No wonder the CPU was still too hot to touch a couple of hours later.
So let this be a lesson: always make sure your air con and its backup units are in working order!