[Dec. 3rd, 2004 | 02:10 pm]
The Veritable TechNinja
Wheeeeeeeee, cold medicine. Head's all cloudy, can't concentrate.
To make things worse, the power was out on one side of my building today, the side with all the servers and network hardware in it. I deliberately parked on that side so I could take the elevators up (the ones on the other side of the building; ours caught fire, see previous entries), and lo and behold, they don't work. Neither does the electronic lock on the door. So I walk up the stairs with my balance all screwed up and my head all cloudy, and meet up with the deskside techs. They take me back down to the server room, saying all the servers are down. Of course, that door is badge-controlled as well, and we can't open it either. While we're standing there, the power comes back up, and I badge my way in.
None of the servers would come back up: everything was plugged into a building UPS, which hadn't kicked back in yet. There are signs on the front of the thing that say "don't touch, call the building engineer", so the deskside guys call up and ask him to get the building UPS working. He comes by after a while, flips some switches in another room, and the whole platform spins up and comes back to life. Surprisingly, everything was fine once the routers came back up: no dead drives, no SAN init failures.
Or so I thought. The NAS server is not mine; it's administered directly by a dedicated third-level group, so I didn't check up on it. I called them to let them know they could bring it back up, and they said that they ran a "health test" which worked. Too bad they use Kerberos to authenticate with the domain controller, and their NTP settings were wrong. With the clock off, Kerberos was trying to auth users with timed-out keys, so nothing worked except security that was local to the server (i.e. the third-level guys' admin IDs), which is why they didn't think anything was wrong. I had to call them once I heard about it; they did at least manage to fix it eventually. Why, oh why didn't I just skip today, too?
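For anyone wondering why a wrong clock breaks domain logins while local admin IDs keep working: Kerberos only accepts a request if the two clocks agree to within a small tolerance. Here's a rough Python sketch of that check, the kind of thing a "health test" would need to include to catch this. It is not the NAS vendor's actual test; the function name `within_skew` and the 20-minute drift are made up for illustration, and the 5-minute tolerance is the common default, assumed here.

```python
from datetime import datetime, timedelta, timezone

# Assumption: 5 minutes is the usual default tolerance; real sites may differ.
MAX_SKEW = timedelta(minutes=5)


def within_skew(local_time, reference_time, max_skew=MAX_SKEW):
    """Return True if the two clocks agree closely enough for Kerberos."""
    return abs(local_time - reference_time) <= max_skew


# Hypothetical numbers: the domain controller has the right time,
# and the NAS comes back from the outage 20 minutes slow.
dc_time = datetime(2004, 12, 3, 14, 10, tzinfo=timezone.utc)
nas_time = dc_time - timedelta(minutes=20)

print(within_skew(nas_time, dc_time))  # False: domain auth would be rejected
print(within_skew(dc_time + timedelta(minutes=2), dc_time))  # True
```

A local login never compares clocks with the domain controller, which would explain why a test run with local admin IDs came back clean.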