The default setup of the port is to run "apcupsd --kill-on-powerfail". This means that when the UPS power is exhausted, apcupsd sends a signal to the UPS to hibernate and initiates a system shutdown. Thus the complete shutdown sequence must fit within the UPS grace period.
While this may be tolerated on workstations, it is completely unusable on busy servers, especially those running multiple virtual machines in bhyve, because the UPS grace period may not be long enough for all the VMs to shut down correctly. Thus we have a race condition and cannot predict whether there will be enough time for the daemons/VMs to stop before the host is powered off.
Ideally, apcupsd should wait for all the shutdown scripts to complete, and only then should it send the hibernate signal to the UPS. This means that:
1. apcupsd should be started without the --kill-on-powerfail option, perhaps without any options at all, or with the --term-on-powerfail option. This is easy to implement: just alter the default apcupsd_flags.
2. The "apcupsd --killpower" command should be called somewhere near the very end of the shutdown sequence, on the condition that /var/run/powerfail exists.
I cannot currently suggest where to put "apcupsd --killpower" in /usr/local/etc/rc.d/ so that it is executed last.
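For illustration, the condition in point 2 could be wrapped in a small sh function. This is only a sketch: the function name `killpower_if_powerfail` and the two overridable variables are made up for this example, and /usr/local/sbin/apcupsd is assumed to be where the port installs the binary. Where to call it from is still the open question.

```sh
#!/bin/sh
# Sketch only: ask the UPS to cut power if, and only if, the powerfail
# flag file exists. POWERFAIL_FLAG and APCUPSD are hypothetical
# overridable variables introduced for this example.
: "${POWERFAIL_FLAG:=/var/run/powerfail}"
: "${APCUPSD:=/usr/local/sbin/apcupsd}"

killpower_if_powerfail() {
    # No flag file means this is an ordinary shutdown: leave the UPS alone.
    [ -f "$POWERFAIL_FLAG" ] || return 0
    "$APCUPSD" --killpower
}
```

The function does nothing when sourced; it would have to be called explicitly as one of the last steps of the shutdown sequence.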
I think that the way apcupsd is run on FreeBSD should be seriously reconsidered.
Thanks for your report.
About your point 2:
"2. The "apcupsd --killpower" command should be called somewhere near
the very end of the shutdown sequence, on the condition that
Maybe you can customize it with apccontrol and your own script.
And about removing the --kill-on-powerfail option, well, that can be tricky.
We are talking about a behavior that is present for 17 years in this port:
What I suggest is to improve pkg-message with a warning and considerations
about this, and maybe a custom sample script for your point 2 and then increase
the PORTVERSION to force the update and to inform all users.
After a while, we can think about changing this behavior.
That sounds reasonable?
And thank you for pointing this out.
(In reply to Danilo G. Baio from comment #1)
> Maybe you can customize it with apccontrol and your own script.
I'm afraid I cannot, and I'll explain why it's a problem.
If apcupsd has been started with --kill-on-powerfail, it first sends the hibernate signal to the UPS and then starts the doshutdown procedure from apccontrol. No matter what you put into doshutdown/apccontrol, you are already at the mercy of the UPS. It won't wait for your scripts and customizations.
That's why I'm suggesting that doshutdown and the killing of the UPS should be separated.
> a behavior that is present for 17 years
17 years is a very long time for software; maybe it's time for a revolution? No custom script will work as long as --kill-on-powerfail is there.
(In reply to Victor Sudakov from comment #2)
you can if you insert apcupsd_flags="" in your rc.conf
(In reply to Danilo G. Baio from comment #3)
> you can if you insert apcupsd_flags="" in your rc.conf
Ah, then yes, of course, if I change the whole logic. For the present, I suggest the following in rc.conf:
and the following at the end of /etc/rc.shutdown:
test -f /var/run/powerfail && apcupsd --killpower
But I feel that modifying rc.shutdown is ugly.
What say you?
(In reply to Victor Sudakov from comment #4)
you should take a look at the apccontrol script, you can customize each command, there is an explanation there
(In reply to Danilo G. Baio from comment #5)
> you should take a look at the apccontrol script, you can customize each command, there is an explanation there
I have already studied the script and its man page, but I have no idea what can be changed *there* to get rid of the race condition between shutting down daemons and hibernating the UPS. What do you suggest changing there?
To me, the apccontrol script is fine; after doshutdown is called, we just need to wait for all daemons to stop, and only then send the kill signal to the UPS.
If you read the man page for the apccontrol script, you will see the following:
killpower apcupsd does not normally generate this event. Instead, it
is invoked directly from the system halt script as
'apccontrol killpower' because the killpower event needs to
be performed as late in system shutdown as possible.
That's exactly what I have been talking about the whole time: we should shift the killpower as late as possible, but how do we implement this on FreeBSD?
I don't want you to change apccontrol, I'm talking about customizing the commands:
Create a script `doshutdown` in /usr/local/etc/apcupsd.
stop other main daemons
check for whatever you want
/sbin/shutdown -h now "apcupsd initiated shutdown, bhyve is safe..."
you can even call `apccontrol killpower` in it if you want.
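A minimal sketch of what such a custom script might contain follows. `stop_guests` is a hypothetical placeholder for whatever stops your VMs, and `SHUTDOWN_CMD` is an overridable variable added here only so the logic can be dry-run. The return value of 99 relies on the convention, as I read the apccontrol script, that an event script exiting with 99 suppresses apccontrol's built-in action for that event.

```sh
#!/bin/sh
# Sketch of the body of a custom /usr/local/etc/apcupsd/doshutdown script.
# stop_guests is a hypothetical placeholder; SHUTDOWN_CMD exists only so
# that the logic can be exercised without halting the machine.
: "${SHUTDOWN_CMD:=/sbin/shutdown}"

stop_guests() {
    # Placeholder: stop VMs and other slow daemons here and wait for them.
    :
}

doshutdown() {
    stop_guests
    "$SHUTDOWN_CMD" -h now "apcupsd initiated shutdown"
    # 99 asks apccontrol to skip its own default doshutdown action.
    return 99
}
```

In a real script the last lines would call `doshutdown` and exit with its status; they are left out here so that the sketch has no side effects when sourced.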
(In reply to Danilo G. Baio from comment #7)
> Create a script `doshutdown` in /usr/local/etc/apcupsd.
Oh, this may be a solution. Thank you. I'll try it out.
> you can even call `apccontrol killpower` in it if you want.
This would probably require starting apcupsd initially with --term-on-powerfail instead of --kill-on-powerfail, right?
I suggest you remove all the flags, create this custom script to stop your VMs and insert the `apccontrol killpower` command at the end, just before the `shutdown`.
(In reply to Danilo G. Baio from comment #9)
> I suggest you remove all the flags,
This means an instance of apcupsd will remain running during shutdown, while another instance of apcupsd will be trying to kill the UPS. Are you sure they will not conflict for access to the UPS?
> insert `apccontrol killpower` command at the end, just before the `shutdown`.
But `apccontrol killpower` is a NOOP, all actions are commented out in apccontrol.
(In reply to Victor Sudakov from comment #10)
You are right, `apccontrol killall` is commented out, but you can use `apcupsd --killpower` (with -f config-file) directly in the custom doshutdown.
I didn't know about the two instances of apcupsd, but I think it's another reason for you to use custom commands.
Do some tests and let us know, we can add some hints about this in the pkg-message, and add some custom commands as examples.
(In reply to Danilo G. Baio from comment #11)
> you can use `apcupsd --killpower` (with -f config-file) directly in the custom doshutdown.
Before or after "shutdown -h", in your opinion?
On the other hand, the more I think of this custom doshutdown solution, the less I like it. You know why? Because in fact it does not eliminate the race between the system shutdown procedure and the UPS grace time.
The correct way would be to place `apcupsd --killpower` somewhere in the shutdown procedure itself, as close to the end thereof as possible.
The good news is that `apcupsd --killpower` tests whether /var/run/powerfail exists and refuses to kill power if it does not, so there is not even a need for an additional check.
(In reply to Danilo G. Baio from comment #11)
> Do some tests and let us know,
My conclusion in Russian is now here: https://victor-sudakov.dreamwidth.org/470378.html
An English translation will be ready shortly, maybe tonight.
My final conclusion. The default way apcupsd is started by the port (with --kill-on-powerfail) is wrong. You cannot rely on the UPS grace shutdown delay even with the default setting of rcshutdown_timeout="90", while using virtual machines necessitates making rcshutdown_timeout even longer. A race between the UPS grace delay and the FreeBSD shutdown procedure must be eradicated.
The only feasible way of doing it is:
1. Starting apcupsd with apcupsd_flags="--term-on-powerfail" so that the daemon exits once it has started the doshutdown procedure and does not send any commands to the UPS at this stage.
2. Putting the line
"test -f /var/run/powerfail && /usr/local/sbin/apcupsd --hibernate"
or
"test -f /var/run/powerfail && /usr/local/sbin/apcupsd --power-off"
(depending on the desired behaviour of the UPS after the mains is restored) at the very end of the /etc/rc.shutdown script, after the "Insert other shutdown procedures here" line.
Thus the shutdown procedure started by "apcupsd --term-on-powerfail" can proceed at its own tempo, with "apcupsd --hibernate" being called when all the daemons and VMs have been safely shut down.
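Put together as an rc.conf fragment, the settings involved might look like the following. The timeout of 300 seconds is an arbitrary illustrative value, not a recommendation, and apcupsd_enable is assumed to be set already on a system that runs the port.

```sh
# /etc/rc.conf (fragment, sketch only)
apcupsd_enable="YES"
# Let the daemon exit after starting doshutdown, without touching the UPS.
apcupsd_flags="--term-on-powerfail"
# Assumed value for illustration: longer than the default of 90 seconds,
# chosen to cover the slowest VM shutdown on your host.
rcshutdown_timeout="300"
```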
(In reply to Victor Sudakov from comment #14)
Thank you Victor.
I want to insert this in the pkg-message:
By default, apcupsd starts with the `--kill-on-powerfail` parameter.
Please read its man page, and if this is not the behavior you want,
change it accordingly.
On some systems where the shutdown can take a while (as in a bhyve environment),
you may want to change this behavior as follows:
Set apcupsd_flags="--term-on-powerfail" in your /etc/rc.conf[.local].
Add this to /etc/rc.shutdown, after the "Insert other shutdown procedures here" line:
test -f /var/run/powerfail && /usr/local/sbin/apcupsd --hibernate
or
test -f /var/run/powerfail && /usr/local/sbin/apcupsd --power-off
Do you want to add something more?
(In reply to Danilo G. Baio from comment #15)
Seems fine to me.
I hope it will be the first step to changing the way apcupsd is integrated into the system, maybe inspired by Gentoo.
A commit references this bug:
Date: Sat May 11 18:30:41 UTC 2019
New revision: 501281
sysutils/apcupsd: Improve pkg-message
In some systems where the shutdown can take a while (like in bhyve
environment), users may want to change the default shutdown behavior.
Reported by: Victor Sudakov <firstname.lastname@example.org>
The new procedure still sucks. I had a power failure today and again found my bhyve guests with "/ was not properly dismounted".
We need someone really smart to figure out where the race is. I'm not smart enough to solve the case.
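One way to narrow down where the time goes might be to leave timestamped breadcrumbs at a few points of the shutdown sequence and read them after the next power failure. The sketch below is only a debugging aid under assumptions: `trace_mark` and `TRACE_LOG` are names invented for this example, and the log path is arbitrary.

```sh
#!/bin/sh
# Sketch only: timestamped breadcrumbs for tracing how far shutdown got.
# TRACE_LOG and trace_mark are hypothetical names for this example.
: "${TRACE_LOG:=/var/log/shutdown-trace.log}"

trace_mark() {
    # Append "<epoch seconds> <message>" and flush to disk right away,
    # since power may be cut at any moment.
    printf '%s %s\n' "$(date +%s)" "$*" >> "$TRACE_LOG"
    sync
}
```

Calling, for example, `trace_mark "guests stopped"` after the VMs are down and `trace_mark "before killpower"` just before the UPS command would show whether the host lost power between the two.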