The Manager2 runtime component is responsible for managing the specified executables. Typically it is responsible for starting all Ratatosk executables on a host and for associated periodic tasks, such as sending files to shore. Specifically, it has functionality for:
It is configured using a YAML input file with the following format:
ManagedProcess
: A list of processes to manage. Each element of ManagedProcess has the following fields:
PrgName
: The application name. The path must be included if the application is not on the search path of the environment.
Arguments
: A YAML array with the process arguments, e.g. '[/home/sintef/vesseldeployments/config/myvessel/modbus.yml, 2]'
RestartPeriod
: If this optional argument is given, the application will be restarted after the specified number of seconds if the return value indicates success. If unset, the application will not be restarted after successful execution.
DelayOnError
: If this optional argument is set, the application will be restarted after the specified number of seconds if the return value indicates an error, the application fails to launch, or some other error is detected. If this argument is not given, the application will not be restarted if it fails.
DelayOnErrorLong
: If this argument is set in addition to DelayOnError, a schedule for restarting the application on successive failures will be used. After a number of successive failed restarts with a (short) delay of DelayOnError, a (longer) delay of DelayOnErrorLong will be used. After this delay, the restart schedule will again run a number of attempts using DelayOnError, and so on. See also MaxRestartsOnError and ResetFailCounterAfter.
MaxRestartsOnError
: This optional field limits the number of restarts on unsuccessful execution. If DelayOnError is set and DelayOnErrorLong is not set, the number of restarts will be limited to MaxRestartsOnError, if set. If both DelayOnError and DelayOnErrorLong are set, MaxRestartsOnError specifies the number of restarts using DelayOnError for each restart using DelayOnErrorLong. In this case, a default value of three will be used if MaxRestartsOnError is unset.
ResetFailCounterAfter
: This optional field specifies how long the application must run without returning an error code or otherwise failing before it is considered to have started successfully. After the application has been running for this duration, the restart schedule is reset, and a subsequent error will trigger a restart following the reset schedule. If unset, a default value of three seconds will be used. For applications that need longer to, e.g., set up communication with remote peers, it is recommended to increase this value.
MaxQuietPeriod
: If this optional field is set, the manager will monitor the application's output to standard out and check every MaxQuietPeriod whether there has been any output within this period. If there has been no output, the application will be terminated immediately and subsequently restarted according to the restart-on-error schedule. Note that a minimum period of two seconds will be used!
WatchdogPeriod
: Run a watchdog periodically with the specified period (in seconds). The watchdog will try to detect whether the application has somehow terminated without triggering handling of the failure. Typically, this would mean that no notification was sent to the exit handler. If the watchdog detects a failure, the process will be restarted according to the aforementioned restart-on-error schedule. Note that if MaxQuietPeriod is set, the watchdog period specified here will be overridden by MaxQuietPeriod.

An annotated example input file is shown below.
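Separately from the example input file, the interplay between DelayOnError, DelayOnErrorLong and MaxRestartsOnError may be easier to follow as a small Python sketch. This is not the Manager2 implementation. The function name `next_restart_delay`, the `fail_count` parameter and the exact counting of attempts are assumptions made for illustration, based only on the field descriptions above.

```python
from typing import Optional

# Documented default for MaxRestartsOnError when both delays are set.
DEFAULT_MAX_RESTARTS = 3


def next_restart_delay(
    fail_count: int,
    delay_on_error: Optional[float] = None,
    delay_on_error_long: Optional[float] = None,
    max_restarts_on_error: Optional[int] = None,
) -> Optional[float]:
    """Return the delay (seconds) before the next restart, or None for no restart.

    fail_count is a hypothetical counter of consecutive failures (1 for the
    first failure). A caller would reset it once the application has run for
    ResetFailCounterAfter seconds without failing.
    """
    if fail_count < 1:
        raise ValueError("fail_count must be at least 1")

    # Without DelayOnError, a failed application is not restarted.
    # (Assumption: DelayOnErrorLong alone has no effect.)
    if delay_on_error is None:
        return None

    # Only the short delay is configured: restart with it, optionally
    # limited to MaxRestartsOnError restarts.
    if delay_on_error_long is None:
        if max_restarts_on_error is not None and fail_count > max_restarts_on_error:
            return None
        return delay_on_error

    # Both delays are configured: N short delays, then one long delay, repeated.
    short_attempts = (
        DEFAULT_MAX_RESTARTS if max_restarts_on_error is None else max_restarts_on_error
    )
    position = (fail_count - 1) % (short_attempts + 1)
    return delay_on_error if position < short_attempts else delay_on_error_long


# Eight consecutive failures with DelayOnError=5 and DelayOnErrorLong=600.
schedule = [next_restart_delay(k, 5, 600) for k in range(1, 9)]
print(schedule)  # [5, 5, 5, 600, 5, 5, 5, 600]
```

With only DelayOnError set and MaxRestartsOnError set to 2, the same function returns the short delay for the first two failures and `None` afterwards, which corresponds to the manager giving up on restarts.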