Automating the shutdown of APC UPS devices


[Ishan Karve] works in some bizarro world where the building management demands that all servers and Uninterruptible Power Supplies be shut down at the end of each evening. While inconceivable to most systems admins, he has no recourse but to comply. This means that his employees need to turn things off before they leave for the day, and since they often work up to 15 hours a day, waiting for a Windows server to shut down seems like an eternity.

Being the good manager he is, [Ishan] decided to build a device that handles the clean shutdown of their servers and UPS for them. An Arduino board serves as the brains of the device, communicating with and issuing shutdown commands to the UPS over a serial port. The Arduino is also connected to the office network, enabling it to send ARP requests to the servers in order to determine when they have completely shut down for the day. In order to protect against an accidental shutdown due to network connectivity issues, [Ishan] added an RTC module to the mix so that the Arduino does not issue shutdown commands until at least 8 pm.
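The decision logic described above can be sketched in a few lines. This is only an illustration in Python, not [Ishan]'s Arduino code: the function name, the server names, and the idea of passing in a map of "is this server still answering" flags are all assumptions made for the example. The real device gathers that information with ARP requests and talks to the UPS over serial, neither of which is modeled here.

```python
from datetime import datetime, time
from typing import Mapping

# 8 pm cutoff, as described in the article.
CUTOFF = time(20, 0)


def should_shut_down_ups(
    now: datetime,
    servers_responding: Mapping[str, bool],
    cutoff: time = CUTOFF,
) -> bool:
    """Decide whether it is safe to tell the UPS to power off.

    Hypothetical helper: `servers_responding` maps a server name to
    True if it still answered the last probe (ARP on the real device).
    """
    # Before the cutoff, silence on the network is more likely a
    # connectivity problem than a deliberate shutdown, so do nothing.
    if now.time() < cutoff:
        return False
    # With no servers configured there is nothing to confirm, so stay on.
    if not servers_responding:
        return False
    # Only shut the UPS down once every server has stopped answering.
    return not any(servers_responding.values())


if __name__ == "__main__":
    status = {"fileserver": False, "mailserver": False}  # made-up names
    print(should_shut_down_ups(datetime(2011, 6, 1, 20, 30), status))
```

Note that this simple comparison treats anything after midnight as "before 8 pm", so a sketch like this would never fire in the small hours; whether that matches the real device is not stated in the article.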

Instead of waiting around for Windows to do its thing, [Ishan’s] employees can take off once they start the server shutdown process, knowing that they are totally compliant with their landlord’s crazy requests.

Comments

  1. Arihia Ngaronoa says:

    Cool

  2. polossatik says:

    from the blog: “…knows the time of the day and shutdowns the servers only after 2000 hrs …”

    tip: 20:00 hrs is 8.00 pm

  3. ciper says:

    What in the heck? I don’t comment on this site but this time I have to show my shock that such a thing exists. The building management must be a bunch of 80 year old women or someone with severe OCD who has to check if the coffee maker is unplugged 15 times before leaving the house

    What is the purpose? Are they trying to prevent fires? I’d be more worried about wall warts than any UPS or purpose built server (assuming the “servers” aren’t desktops shoved into a closet)

    • Dax says:

      It’s probably to nickel and dime the cost of running the place.

      If they don’t serve anything out of the building then there isn’t much need to run the servers at night anyways, and I can’t imagine they would with those rules in place.

    • fartface says:

      It’s called “idiot CTO” syndrome.

      I’ve seen it some places. One requires we reboot the firewall nightly to get the “internet dust” out of it.

    • jaded says:

      Actually, if you ever visited India, you’d be amazed at the crap people jury rig to deliver electricity. I saw an extension cord tied to the tip of a long pole that was jammed in the median of a main road so that they could get power to the shop on the other side of the street. Line distribution transformer safety (when it exists) consists of up to three metal guard bars shielding the buss bars, and optionally a sign saying “danger high voltage”. And these buss bars and transformers are at shoulder height, not up on a pole!

      Fear of electrical fires would be a very natural and real fear over there. And if you’re a landlord who has suffered many losses because previous renters damaged your property, you’d suspect that your tenants are always trying to cheat you. You’ll probably only trust the electrical meter that they’re not using electricity, and not trust their word.

      What seems crazy and unreasonable to us would be just part of everyday life over there. Wiring up a hack to handle it seems like a perfectly appropriate response.

  4. Wrf says:

    Many years ago, we were building a tv station in the tropics and management demanded everything – including sitcom – was off overnight !!?
    Of course by 0900 the next day when HVAC had settled in there was condensation running down the walls and racks !

  5. Brandon says:

    The UPS devices seem kind of dumb if they don’t need their servers up 24/7. Although I suppose it allows the server to shutdown properly if there is a power outage, preventing damage to it.

    It still seems kind of silly to shut down the servers every day. Oh well.

  6. vonskippy says:

    Who’s dumber? The landlord or the tenants for putting up with the illogical rules?

  7. Andrew says:

    Yeah, this is actually really bad for the server hardware. It’s designed to be powered up and left running – startup and shutdown are the hardest times on pieces of hardware. Hard drives especially – you can have a HDD working perfectly fine for years, but when you power cycle it, it’ll refuse to come back up due to some issue that only presents itself in the powerup sequence.

    Building management here are goddamn morons. Since he doesn’t have an ethernet-enabled UPS, I’m presuming it’s a rather small one (even my SU1400 has ethernet) so I’d just be bolting extra batteries to it to keep the servers running overnight…

    This is stupid.

    • Andrew says:

      So I just read the article. It’s a SMART-UPS 2200.

      Just buy an ethernet management card for it, ffs. It has this built in, and far more reliably. You can even schedule shutdowns. All you have to do is go to the webUI, click one button and it will issue a shutdown command, wait for system shutdown and power itself off. Get back in the morning, tap the button on the UPS, everything powers up. You could even schedule automatic powerup 15 minutes before you arrive.

      Idiocy.

    • ejonesss says:

      this reminds me of the production of who shot mr burns from the simpsons where matt groening’s mac drive was failing and he had to leave the mac on and 1 day someone cut the power and the drive failed and he had to pay thousands to recover the drive.

    • lwatcdr says:

      That is an old myth. Powering up and down a computer once a day does no real harm to the computer.
      There are some actual benefits to doing this.
      1. It saves power. If you only use the computers and servers x hours a day why pay to power them 24 hours a day 7 days a week?
      2. Protection. Not from hacking but from power transients. If your computers and the UPS are powered off then you have even less chance of getting hit by a surge.
      3. Cooling bills. If the servers are not running then they do not need to be cooled. In a hot climate that can help a good bit.

      Unless you are running jobs overnight why bother with keeping a machine running overnight doing nothing but spinning its drives and wasting power.

      • pod says:

        exactly.
        I’m sure all these people complaining with shutting things down are from the USA.
        It seems that to some people over there trying to save on energy bills is some kind of communism, and we all know how America is obsessed over it

      • phil c. says:

        However, it CAN be a problem if you normally leave things up…for a long time, say, years…and power-cycle the equipment. I’ve seen things that ran for multiple thousands of hours with no problems die when the power was switched off and on again. The inrush current can kill weak, old components like capacitors that are drying up.

      • lwatcdr says:

        @pod actually I am from the US and was talking about the power savings so that statement was both inaccurate as well as inflammatory and bigoted. It has nothing to do with nationality but with old wives’ tales. Back when computers used tubes, core memory, even down to the transistor and bipolar logic chips, system operators would keep systems on all the time. Back when they used those old techs and point to point wire wrapping it made a lot of sense. The thermal stress of heating up and cooling down played havoc on tubes and even early solid state systems. Today it is all a myth. Unless the board has a problem like a bad solder joint, power cycling once a day is harmless. It will in fact make things like hard drives last longer since you will have fewer hours of the spindle spinning on the bearings and the head moving back and forth, unless you have them spin down to save power anyway. Even the idea that power “rushes” in is just kind of odd to hear today. Again, that had more to do with old style power supplies and the power surge caused by the tubes all coming up at once than anything to do with a modern system with a switching power supply.
        I will bet that some of the folk complaining about it are from the EU as well as the US because they were taught to fear power cycling by their elders.

      • Saul Goode says:

        “That is an old myth. Powering up and down a computer once a day does no real harm to the computer.”

        I whole-heartedly disagree. Anything using electrical currents is also going to have thermal issues to deal with. When a machine is booting and all the traces on the circuitry are warming back up the physical properties of those traces will change. It is a well known constant that physical materials are affected by thermal conditions.

        What I’m trying to say is that when you hit the power-on button, all those motherboard traces are going to start warming up and as they do they are going to get slightly bigger. Repeated on-off cycles can, over the course of time, cause the traces to pull away from the substrate. Same goes for where chip pins are soldered to the traces. Enough power-cycles and the chip pins can start to pull away.

        I’ve always been one to leave my systems running and I whole-heartedly believe this is one of the main contributing factors to why I have so few hardware issues.

      • lwatcdr says:

        @Saul Goode
        When was the last time you saw a server motherboard fail from a trace pulling away?
        And just how hot is your system running? Think about it: how warm do the traces get? The only ones that are really moving much current are the power traces!
        So which do you think will fail first? The traces or the bushings on the fans and the bearings on the hard drives left to spin 24/7 for no good reason?
        That is the heart of a myth. People saying things without any proof to back it up.
        I turn off my computers at night every night. In the 25+ years of using PCs I have had 3 hardware failures. 1 was caused by a direct lightning strike that fried not just the computer but the surge suppressor that it was connected to. A hard drive failed in warranty, and a fuse blew on my old Commodore 64.
        I also worked as a system operator on an IBM System 38. We did a nightly shutdown and restart as part of our end of day. It also never failed.
        On a modern system there just isn’t enough thermal stress to matter. 3.3 volts and a few milliamps just isn’t going to cause enough thermal strain to worry about and the power planes are sized to deal with it.

  8. N0LKK says:

    Obviating any fire risks was mentioned. I wonder if Ishan’s employer is complying with a demand from their insurance carrier? Personally if I were a business owner I wouldn’t be consuming electrical power or placing wear and tear on equipment needlessly. This would be a great low cost project to that end.

  9. Jon says:

    I use APC UPSes with Windows Servers a lot at my work — This guy should’ve just bothered with the Arduino/APC thing, because the software that communicates with the server that the UPS is shut down will auto-start the shutdown process for the server when [x]% of battery life is left. That part of the equation already existed, no need to bother networking it in. Just start the chain reaction by cutting the UPS off from wall power.

  10. skuhl says:

    I am a Systems Admin, I can’t count the number of times I have experienced some sort of hardware failure during a reboot, especially on older machines.

  11. skuhl says:

    Additionally, in my 12 years of running 100s of servers and 1000s of workstations, I have yet to have one start a fire. Now laptops and wall warts… that’s a whole different story.

  12. Andrew says:

    Use Linux, nut and cron.

  13. ChrisE says:

    Nice. Nothing like thermal cycling to kill equipment quickly.

  14. Anon says:

    This sort of shit seems to be more common than you’d think.

    There’s a guy in my local DC that powers his server down every night, so that people can’t ‘hack his stuff’ overnight. Completely ignoring google indexing, and the fact that no one can view his site when they need to :|

    • HackJack says:

      That’s not stupid, just poorly executed. My dedi servers all have a cronjob that restarts the server every 2 minutes. With the time it takes readers to read a page and browse between pages, it’s VERY rare that they ever even notice the server has been restarted. However these 2 min restarts completely thwarts hackers from being able to get into the server because they can’t stay connected long enough to do any damage.

      • Anon says:

        That is incredibly stupid. How you get any remote admin done is beyond me.

      • YouAreAnIdiot says:

        You sir, are an idiot…

      • phil c. says:

        If you don’t want it hacked, don’t connect it to the internet. Simple. If you want something that’s actually useful, take the necessary precautions then cross your fingers and monitor it wisely.

      • Saul Goode says:

        pretty sure this ranks as one of the worst ideas I have ever heard.

        Are you a professional IT person? If so may I please have the contact information of your employer so that I may suggest to them that they start accepting applications for your replacement?

      • Bogdan says:

        I assume you are talking about the server process restarting, not rebooting the whole computer, right?

      • Bogdan says:

        I don’t think thermal stress is that much of a problem.
        I mean, look at the temperature difference between a loaded and unloaded CPU. Now compare that with the difference between an OFF CPU and an unloaded one.
        Here is an example for my laptop:
        CPU OFF(laptop off) 25 deg C
        CPU idle: 41 deg C
        CPU 100%: 72 deg.

        There is a greater temperature swing during functioning, and this happens so many times as i use the computer. My guess: most of the times temp swing is not an issue.

        Also, this is from the Google study about hard drive failure: “Our results find that for drives aged up to two years, this is true, there is no significant correlation between failures and high power cycles count. But for drives 3 years and older, higher power cycle counts can increase the absolute failure rate by over 2%. We believe this is due more to our population mix than to aging effects. Moreover, this correlation could be the effect (not the cause) of troubled machines that require many repair iterations and thus many power cycles to be fixed.” see here: http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/disk_failures.pdf

      • mkanoap says:

        Congratulations on your successful haul with your excellent trolling.

  15. Anon says:

    anybody thought of saving power? thats not really a bad thing…

    id love to see a calculation of higher production energy cost due to more hardware wasted vs. saving energy. anyone know if there is a study about that? but i bet the ecological footprint for the production is low vs. the energy wasted at night.

  16. Linker3000 says:

    Our two corporate sites have VMWare servers, each with a Linux guest running some scripts based around the Open Source APCUPSD UPS monitoring app. In the event of a power outage, or if manually invoked, the scripts instruct VMWare to issue ACPI shutdown commands to all guests (Windows and Linux) as required and then shut itself down.

    As the VMWare servers have HP ILO remote access boards, we can also power up the VMWare servers through a remote VPN link.

    Not that we power down our servers overnight – but we could if we wanted AND have them power up again later. Total solution cost to us was just the time taken to install the Open Source apps and develop the scripts.

  17. networknut says:

    And who/what shuts down the network after the servers have shutdown?

  18. Chris says:

    They probably do this because energy is much more expensive per kWh in India than in the States (and many other places). One reason datacenters are returning to the States. In NC, it costs about $3 per day to keep a server running 24/7. In India, that goes up to about $6-$8 per day. Time = money. It is all about the money. If they don’t need the servers on at night, there is no problem with doing this at all.
    IF the reason is simply energy costs, they should really look into virtualization.

  19. Bogdan says:

    Why not simply measure the consumption of the UPS, if it is below a certain threshold for a certain time(5 minutes) it disconnects the load.

  20. Mr-Midnight says:

    Working in a company with +10.000 pc’s I once suggested to shut them down overnite. I was informed that the pros outweigh the cons.

    The bootup time employees would sit through would be more expensive than letting them run. I suggested to pre-boot them 30 minutes before employees would come into work but maintaining the powerscheme was too much work.

    • bogdan says:

      i’ve seen this before, people arguing booting takes too much time to be worth it…
      true, 14 hours of idle computer electricity costs less than 5 minutes of somebodys time, but it is still a huge energy saving for such a high number of computers.

  21. Rich @ APC UPS says:

    Pragmatic solutions like this are really helpful. You did a good job explaining and used a decent UPS to start.

  22. Regardless of the idiocy, automating the shut down is the best solution for a crazy policy.
