Xyon (xyon) wrote,

Power Loss = Fail

So I happened to notice today that my phone claimed it hadn't been able to sync mail from home since 1:30pm. That's not unusual when I'm at work: the phone uses the wireless there, which blocks outgoing non-proxy connections and won't proxy IMAP. But it also didn't sync while I was away from work for 2 hours (on a 1 hour sync), even when forced. Okay, now I'm thinking that my IP changed and my dynamic DNS hasn't propagated yet. No worries.

When I get home I notice the stove clock blinking. I walk back down to my laptop, and notice that the tuner is on standby, which is a pretty good visual indicator that I lost power.

> ping marduk (my desktop machine, the only exposed SSH)
[times out]
> ping girru (mail)
[times out]
> ping ninatta (HTPC, I can see the power light on it from here, making sure network is working)
> ping nisaba (RAID)
[times out]
> ping nabu (HTTP)

So clearly I lost power long enough for the UPS upstairs and the UPS in the garage to fail, and for some reason only 2 out of the 5 machines on a UPS came back (the HTPC isn't protected, and I didn't bother pinging NAT/DNS since it was obviously working).
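
The host-by-host pinging above could be rolled into a single sweep. What follows is only a rough sketch in Python, not something from the original setup: the hostnames are the ones in this post, while `ping_once`, `sweep`, and `report` are made-up names for illustration. It assumes the names resolve from wherever it runs and that a system `ping` command is on the path. The exit code is only a rough signal of reachability.

```python
import platform
import subprocess

# Hostnames as they appear in the post.
HOSTS = ["marduk", "girru", "ninatta", "nisaba", "nabu"]


def ping_once(host, timeout_s=2):
    """Send one echo request; True if the ping command exits with 0."""
    # Windows ping takes -n for the count, most other systems take -c.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # No answer in time, or no ping command available.
        return False
    return result.returncode == 0


def sweep(hosts, probe=ping_once):
    """Map each host to True (answered) or False (no answer)."""
    return {host: probe(host) for host in hosts}


def report(status):
    """Render one line per host in the style of the log above."""
    return "\n".join(
        f"{host}: {'up' if up else 'times out'}" for host, up in status.items()
    )
```

Running `print(report(sweep(HOSTS)))` prints one line per machine. The `probe` argument is there so the sweep can be tried out with a stand-in function instead of real pings.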

> press soft power on girru
[lights light, it makes noise]
> press soft power on nisaba
[lights light, it makes noise, shake it a little, it makes less noise (now the correct amount)]
> go upstairs, press soft power on marduk
> press soft power on marduk
> toggle hard power on marduk
> turn off hard power on marduk, let capacitors drain, turn on hard power on marduk
[power comes on for 0.1 seconds, turns off again]
> unplug all power connectors internally, try again (a few times)
[max powered on state was 0.5 seconds, average 0.1]
> swap out power supply
[nothing, but who knows if that PSU is any good or not]
> put old power supply back
[0.1 seconds again]
> unplug front panel connections one by one while toggling hard power
> consider marduk to be dead, walk downstairs to look at pending "ping -t nisaba"
[still timing out]
> ping girru
[still timing out]
> hook up maintenance monitor on girru
[performing startup, albeit slowly]
> hook up maintenance monitor on nisaba
[it's angry I haven't run fsck in 288 days, slowly doing a fsck on the RAID array]
> get mad at girru, Ctrl+Alt+Del
[going so slowly that after 18 seconds of "shutting down lo0" I...]
> hit hard reboot on girru
[boots at normal speed, loads RAID1, sees that the mirror is inconsistent, starts rebuilding, proceeds to boot slowly]
> write LJ post
[and be sad]

At some point I should figure out what's dead on marduk. Sadly, I was about to move some of its components to a new computer, so I guess I'll have to buy more parts than I thought.
