Power Outage

I had a two-day power outage and wanted to document everything that went wrong.
No UPS, No Redundancy, No Backups
Firstly, I don't have a UPS for my server rack, and things could have been far more disastrous for my NAS, which currently has neither redundancy nor backups.
Luckily the NAS and all of its disks survived, but I don't want to keep betting on a coin flip with every power outage, so I need to invest in a backup NAS as well as a UPS.
During the outage I thought about how to fit both a UPS and an additional NAS in my already full 12U rack. I've decided to get one more Unifi Toolless Mini Rack and try a triple-decker mount for a total of 18U of space, even if it is officially unsupported.
I still haven't decided on the UPS model or the specs of the NAS, and I'll continue to explore the build choices over the next few weeks.
Getting a second UNAS also doesn't sound too unreasonable, as it's actually incredible value for what you're getting. Still listing out some choices.
Power Recovery, but startup issues
I ran into a few huge issues after the power was restored as well:
Docker will not start on my Raspberry Pi cluster
I had to manually power cycle the cluster before Docker would actually start.

The Feb 25 02:47:48 timestamp is from right before the power outage, and what seems to have happened first was an internet service outage.

It seems the "network is unreachable" logs are then followed by service-start-limit-hit, which aligns with the timestamp from the systemctl logs.
Why this state didn't get cleared automatically once power came back after the prolonged outage, and why I had to manually power cycle the Raspberry Pis to get Docker running again, is still a mystery to me.
Whether that is a Docker quirk or a configuration I need to change is something I'll need to look into later.
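One thing worth trying before reaching for the power button, as a sketch rather than a confirmed fix: systemd records start-limit-hit as the unit's Result, and `systemctl reset-failed` clears the failed state along with the start rate limit counter. The helper below assumes the unit is named docker.service and that the script runs with enough privileges; the function names are made up for illustration.

```python
import subprocess


def unit_result(unit, run=subprocess.run):
    """Return systemd's Result property for a unit, e.g. 'start-limit-hit'."""
    proc = run(
        ["systemctl", "show", unit, "--property=Result", "--value"],
        capture_output=True,
        text=True,
        check=False,
    )
    return proc.stdout.strip()


def recover_if_start_limited(unit="docker.service", run=subprocess.run):
    """If the unit gave up because of the start limit, reset it and start it.

    Returns True when a reset and start were attempted, False otherwise.
    The `run` parameter exists only so the logic can be exercised without
    touching a real systemd (an assumption made for this sketch).
    """
    if unit_result(unit, run) != "start-limit-hit":
        return False
    # reset-failed clears the failed state and the start rate limit counter.
    run(["systemctl", "reset-failed", unit], check=True)
    run(["systemctl", "start", unit], check=True)
    return True
```

Calling `recover_if_start_limited()` from a timer or a boot-time script would only act when the unit is actually stuck in that state. Whether this addresses what happened on the Pis is still an open question.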
Windows Server does not automatically power on
This one I should've known. My current Windows-based server uses a consumer-grade motherboard and chassis, meaning that you need to press the power button to power it on.
I'll explore Wake-on-LAN, which my motherboard supports.
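For reference, a Wake-on-LAN "magic packet" is just 6 bytes of 0xFF followed by the target's MAC address repeated 16 times, usually sent as a UDP broadcast. Here is a minimal sketch of sending one from another machine on the LAN (a Raspberry Pi, for example). The MAC address, broadcast address, and port 9 are placeholder assumptions, and Wake-on-LAN still has to be enabled in the motherboard firmware and on the network adapter.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet for a MAC like 'AA:BB:CC:DD:EE:FF'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # 6 bytes of 0xFF, then the MAC repeated 16 times (102 bytes total).
    return b"\xff" * 6 + mac_bytes * 16


def send_wake_on_lan(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP on the local network."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Something like `send_wake_on_lan("AA:BB:CC:DD:EE:FF")` with the server's real MAC would then be the whole "press the power button" step.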
DDNS issues
Long story short, my DDNS update script had some errors, and wasn't properly updating my Route53 records with new public IPs. Luckily my DDNS script is really simple, as it is just an AWS CLI call to Route53. I updated the input parameters so that the calls no longer fail, and my server was finally back online.
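To illustrate the shape of such an update, here is a hedged sketch that wraps the `aws route53 change-resource-record-sets` call and fails loudly instead of silently. It is not my actual script: the hosted zone ID, record name, TTL, and the use of checkip.amazonaws.com to find the public IP are all assumptions made for the example.

```python
import ipaddress
import json
import subprocess
import urllib.request


def get_public_ip(url="https://checkip.amazonaws.com"):
    """Ask an external service for the current public IPv4 address."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode().strip()


def build_change_batch(record_name, ip, ttl=300):
    """Build the Route53 change batch that upserts a single A record."""
    ipaddress.IPv4Address(ip)  # raises ValueError if the IP is malformed
    return {
        "Comment": "DDNS update",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": ip}],
                },
            }
        ],
    }


def update_record(hosted_zone_id, record_name, ip, run=subprocess.run):
    """Call the AWS CLI; check=True makes a failed call raise an exception."""
    cmd = [
        "aws", "route53", "change-resource-record-sets",
        "--hosted-zone-id", hosted_zone_id,
        "--change-batch", json.dumps(build_change_batch(record_name, ip)),
    ]
    run(cmd, check=True)
    return cmd
```

A cron job could then call `update_record("ZEXAMPLE123", "home.example.com.", get_public_ip())`, where both the zone ID and the record name are placeholders. Because a failed CLI call raises, a broken update shows up in the job's output instead of going unnoticed.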
Big Learnings, What's next?
Well, obviously I need more rack space. I also need a UPS, backup internet, a backup server, and much more in place so that I don't have to fear losing critical data in a sudden, unplanned power outage.
Time to start budgeting...