keepalive(1M)
keepalive -- monitor and respawn processes and daemons
Synopsis
keepalive [-i] [-t interval] [-P node] [-S node]
Description
The keepalive daemon monitors processes and daemons that are registered
with it using the spawndaemon(1M) utility. When a registered process or
daemon fails, keepalive logs the event using syslog(3G), and normally
executes a restart script to restart the process/daemon. For a complete
description of the process monitoring and restart features available with
keepalive, refer to spawndaemon(1M).
keepalive is more versatile than
init(1M) for respawning processes.
Advantages of using keepalive include:
- Provides a command-line interface (spawndaemon) with many
process/daemon restart options.
- Monitors real-time processes; it is completely signal driven.
- Monitors daemons.
- Restarts a process/daemon on any node in the cluster with a
user-specified node selection policy.
keepalive, which is started by
init(1M), uses a memory-mapped
data file, /etc/keepalive.d/keepalive.data (referred to as the
monitored process table), to track the state of
processes/daemons it is monitoring. Restart keepalive in any script
invoked by
init(1M) or
shutdown(1M).
To kill keepalive permanently, you must
execute spawndaemon with the -Q option and edit
/etc/inittab to remove the keepalive entries.
keepalive uses syslog to log messages. The ident string is keepalive, and
the facility string is LOG_DAEMON. This information is provided for
customizing syslogd(1M) configuration.
The keepalive daemon uses standard shell scripts to restart
processes/daemons that have terminated. These scripts can have any file
name, but must be stored in the /etc/keepalive.d directory. They must be
owned by user root and group root, with permissions set to 0755 so that
no one other than the owner can write to them.
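The requirements above can be checked before a script is registered. The following Python sketch is not part of the keepalive distribution; the function name check_restart_script and its parameters are hypothetical, and it assumes that user root and group root both have the numeric ID 0, which is conventional but not stated in this page.

```python
import os
import stat

KEEPALIVE_DIR = "/etc/keepalive.d"  # directory named in this page


def check_restart_script(path, required_dir=KEEPALIVE_DIR, uid=0, gid=0):
    """Return a list of problems found with a restart script (empty if none).

    uid and gid default to 0 on the assumption that root is ID 0.
    """
    problems = []
    real_path = os.path.realpath(path)
    # The script must live directly in the keepalive directory.
    if os.path.dirname(real_path) != os.path.realpath(required_dir):
        problems.append("not stored in " + required_dir)
    info = os.stat(real_path)
    if info.st_uid != uid:
        problems.append("owner uid is %d, expected %d" % (info.st_uid, uid))
    if info.st_gid != gid:
        problems.append("group gid is %d, expected %d" % (info.st_gid, gid))
    # Compare only the permission bits, ignoring the file-type bits.
    mode = stat.S_IMODE(info.st_mode)
    if mode != 0o755:
        problems.append("mode is %04o, expected 0755" % mode)
    return problems
```

An empty list means the script meets all three requirements. Any other result names each requirement that is not met.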
If the executable for the process/daemon resides on a remote system,
you must have root access to start or restart the process. The remote file
system must be shared or exported with root permissions enabled. Root
permissions must also be enabled on the mount point, the automount table,
or the /etc/vfstab file.
Any process/daemon started by keepalive has its stdout and stderr
redirected to a logging file called
/var/log/keepalive/daemon_basename.process_id. The process/daemon should
call fflush(3S) on stdout and stderr to force buffered output to be
written to the logging file. stdin for the process/daemon is redirected
to / (root), causing any attempt to read from stdin to fail.
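As an illustration of the naming and flushing described above, here is a minimal Python sketch. It assumes that daemon_basename means the base name of the daemon's executable and that process_id is its decimal process ID; the helper names keepalive_log_path and log_line are hypothetical. The flush() call is the Python counterpart of fflush(3S).

```python
import os
import sys

LOG_DIR = "/var/log/keepalive"  # directory named in this page


def keepalive_log_path(executable, pid, log_dir=LOG_DIR):
    """Build the expected logging file name for a monitored process.

    Assumes the name is <basename of executable>.<decimal pid>.
    """
    return os.path.join(log_dir, "%s.%d" % (os.path.basename(executable), pid))


def log_line(message):
    """Write one line to stdout and flush it immediately.

    Without the flush, output redirected to a file may stay in the
    process's buffer and be lost if the process dies unexpectedly.
    """
    sys.stdout.write(message + "\n")
    sys.stdout.flush()
```

A monitored daemon written in Python could call log_line("starting") at each point where the message needs to reach the logging file promptly.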
Files under /var/log/keepalive are not allowed to accumulate.
Unregistering a process/daemon causes its logging file to be deleted. If a
process/daemon is restarted, the logging file of the previous instance of the
process/daemon is deleted. If keepalive is started with the -i
option or is shut down, all logging files not in use are deleted.
If you remove the memory-mapped data file (monitored process table) belonging
to the keepalive
daemon, you should also shut down the keepalive daemon with the
spawndaemon -Q command. To shut down keepalive, but leave
the monitored process table intact, send the keepalive daemon a
SIGTERM signal so it performs a controlled exit.
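The SIGTERM approach can be sketched in Python as follows. This page does not say how to find the process ID of the keepalive daemon, so the sketch takes it as an argument that the caller must obtain by other means; the function name request_controlled_exit is hypothetical.

```python
import os
import signal


def request_controlled_exit(pid):
    """Send SIGTERM to the process with the given ID.

    Returns True if the signal was sent, False if no such process exists.
    """
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        return False
    return True
```

Signalling a daemon that runs as root raises PermissionError unless the caller also has root privileges. The sketch lets that exception propagate.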
Options
The keepalive command uses the following options and arguments:
- -i
  Removes the memory-mapped data file, /etc/keepalive.d/keepalive.data,
  before starting the keepalive daemon. Using the -i option causes the
  keepalive daemon to start up with a clean monitored process table, such
  that it is not monitoring any processes.
- -t interval
  Defines the time in seconds that keepalive uses as a polling interval
  in rare cases where keepalive must use polling (such as when a call to
  fork(2) fails due to a node resource problem). The default value is 5.
  Note that if a fork fails on a node, keepalive tries other nodes in the
  node set for the process/daemon being (re)started. See spawndaemon(1M)
  for details.
- -P node
  Specifies the primary node on which the keepalive process is executed.
  The keepalive process is pinned on the specified node and cannot be
  migrated by using the load_leveld(1) utility.
- -S node
  Specifies the secondary node on which the keepalive process is executed
  when the primary node is unavailable. The keepalive process is pinned
  on the specified node and cannot be migrated by using the
  load_leveld(1) utility.
If one of the specified nodes is down or invalid, keepalive logs
a warning and continues execution on another available node.
A keepalive process is always pinned on the node on which it is
running, regardless of whether or not the user specifies a node. If a
primary or secondary node is not specified on the command line, the
system chooses any available node.
Files
- /dev/keepalivecfg
  Named pipe for receiving commands.
- /etc/keepalive.d
  Directory for process/daemon restart scripts.
- /etc/keepalive.d/keepalive.data
  The keepalive memory-mapped data file (monitored process table)
  for tracking the state of processes/daemons being monitored.
- /var/log/keepalive
  Directory containing files into which any process/daemon started by
  keepalive has its stdout and stderr redirected.
References
fflush(3S),
init(1M),
load_leveld(1),
spawndaemon(1M),
syslogd(1M),
syslog(3G),
vfstab(4)
15 August 2001
Copyright 2001 Compaq Computer Corporation
Cluster-Tools Version 0.5.8