libsigwatch.a is a library of routines to provide simple signal watching for Fortran programs. This allows a minimal level of control of a running program from outside it, for example to tell it to checkpoint itself on receipt of a signal.
Version 1.0, 2011 February 2.
The project home page is http://purl.org/nxg/dist/libsigwatch, and it is hosted at Bitbucket, where you can find downloads.
If you find any bugs, please report them at the Bitbucket issue tracker.
It is often useful to have some simple signal handling in larger Fortran programs, for example to handle the INT (interrupt) signal generated by ^C and have the program shut itself down cleanly, or to handle one of the user signals USR1 or USR2 and have the program checkpoint itself in case it crashes at some later stage. However, signal handling is tricky in Fortran (because the system calls a registered signal handler with its argument passed by value, whereas Fortran routines expect their arguments by reference), so this library provides functions to make it easier.
On Unix, there is a smallish set of signals which may be sent to a running process, most of which the process can either catch or ignore. For example, the INT signal is sent to a process by pressing the interrupt character (usually ^C), HUP is sent when the controlling terminal disconnects, and KILL can be sent either by hand or by the system when it is forcing processes to die. The default action of both the INT and HUP signals is to terminate the process. The KILL signal is one of those which cannot be caught or ignored, but always has its effect. There are also two signals, called USR1 and USR2, which have no predefined meaning and are provided for user convenience; their default action is also to terminate the process, so a program should arrange to catch them before they are sent.
Each signal has a numeric value -- for example HUP is 1 and KILL is 9 -- and after finding a process's PID with the ps(1) command, you can send signals to it with the kill(1) command:
% kill -HUP <pid>
or
% kill -1 <pid>
Signals thus provide a limited mechanism for communicating with a running program. A useful way to use this is to have the program watch for signal USR1, say, and check for it by calling the getlastsignal function at the end of each loop iteration. If this returns a non-zero response, you might make your program checkpoint itself -- save its state for later restart -- in case the program crashes or has to be stopped for some reason.
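The polling pattern just described might look like the following minimal sketch. It uses only the watchsignalname and getlastsignal functions described later in this document; the savestate subroutine is a hypothetical stand-in for whatever checkpointing your program really does, and the response value 1 is an arbitrary choice. As in the demo program further on, sleep is a compiler extension rather than standard Fortran.

```fortran
      program ckptdemo

      implicit none

      integer i
      integer status
      integer watchsignalname
      integer getlastsignal

*   Ask for response 1 (an arbitrary choice) when USR1 arrives
      status = watchsignalname("USR1", 1)
      if (status .lt. 0) then
         write (*,'("could not watch USR1")')
         stop
      endif

      do i = 1, 100
*      stand-in for one step of real work
         call sleep(1)
*      poll once per iteration; non-zero means USR1 arrived
         if (getlastsignal() .ne. 0) then
            call savestate(i)
         endif
      enddo

      end

*   Hypothetical checkpoint routine: a real program would write
*   its state to a file here
      subroutine savestate(step)
      implicit none
      integer step
      write (*,'("checkpoint at step ",i4)') step
      end
```

With a program like this running, sending it USR1 with kill(1) from another terminal should trigger one checkpoint at the end of the current iteration, after which the loop carries on.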
For more details about signals, see the man pages for signal(3) or signal(7), depending on your platform.
A program prepares to receive signals by calling one of the watchsignalname or watchsignal functions, and calls getlastsignal at any point to retrieve the last signal which was sent to the process.
The arguments to watchsignalname are signame, a character string containing the name of the signal to watch for, and response, an integer which will be returned by getlastsignal after the specified signal has been caught. The signal names which the function recognises are those most likely to be useful, namely HUP, INT, USR1 and USR2.
The integer response is the number which will subsequently be returned by getlastsignal, after this signal is caught. If this response is passed as -1, the signal number associated with this name is what will be returned. Note that, although both HUP and INT have generally fixed numbers, the numbers associated with signals USR1 and USR2 are different on different Unix variants.
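As an illustration of the -1 response, the sketch below watches both user signals and prints whatever number getlastsignal hands back. The program name, the 30-second polling loop and the output formats are illustrative choices, not part of the library.

```fortran
      program usrnum

      implicit none

      integer i
      integer status
      integer sig
      integer watchsignalname
      integer getlastsignal

*   A response of -1 asks for the signal number itself
      status = watchsignalname("USR1", -1)
      write (*,'("watchsignalname USR1:",i2)') status
      status = watchsignalname("USR2", -1)
      write (*,'("watchsignalname USR2:",i2)') status

      do i = 1, 30
         call sleep(1)
         sig = getlastsignal()
         if (sig .ne. 0) then
*         the number printed depends on the platform
            write (*,'("caught signal number ",i3)') sig
         endif
      enddo

      end
```

Because the numbers vary between platforms, a program that needs to tell USR1 from USR2 portably may find it simpler to pass two distinct responses of its own choosing instead of -1.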
If you need to catch another signal for some reason (make sure you understand the default behaviour of the given signal first, however) you can give that signal as a number to the watchsignal function, and when that signal is later caught, the corresponding number is what will be returned by getlastsignal.
The getlastsignal function returns the response associated with the last signal which was caught, or zero if no signal has been caught so far, or since the last call to getlastsignal. That is, any caught signal is returned only once.
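The once-only behaviour can be checked with a small self-signalling program, sketched here under some assumptions. The kill subroutine and getpid function are GNU Fortran extensions, not part of libsigwatch, and other compilers may not provide them; the response value 99 is arbitrary. Going by the description above, the first call should report 99 and the second should report zero.

```fortran
      program onceonly

      implicit none

      integer status
      integer first
      integer second
      integer watchsignalname
      integer getlastsignal
      integer getpid

*   Stop if the watch failed: an unwatched HUP would otherwise
*   terminate this process
      status = watchsignalname("HUP", 99)
      if (status .lt. 0) stop

*   Send HUP (signal 1) to this process.  KILL and GETPID are
*   GNU Fortran extensions, not part of libsigwatch.
      call kill(getpid(), 1)

*   Per the description above, the first call reports the caught
*   signal and the second finds nothing new
      first = getlastsignal()
      second = getlastsignal()
      write (*,'("first call: ",i2)') first
      write (*,'("second call:",i2)') second

      end
```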
The installed signal handler does not re-throw the signal after it has caught it; this would defeat the purpose of this library for those signals, such as HUP and INT, for which the default action is to kill the process. Also, there is no way to tell if the signal was received by being re-thrown by another handler, installed after this one. If all of this matters to you, then this library cannot reasonably help you, and you have no hope but to learn to love the sigaction(2) manpage.
When installing the handler, these functions replace any previous signal handler. If that was a non-default one (for example, one put there by an MPI environment) this could potentially change the behaviour of your program in an unhelpful fashion. To warn you of this, these functions return +1 in this case; this is a success return value, but also a warning that you should understand what that previous signal handler was doing there.
The sigwatchversion function returns the version number of the library, as an integer formed from the version number by major_version * 1000 + minor_version, so that the version number 1.2, for example, would be returned as integer 1002.
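The encoding can be undone with integer division and a remainder, as in this short sketch. It assumes that sigwatchversion takes no arguments, which the description above implies but does not state; the variable names are illustrative.

```fortran
      program showver

      implicit none

      integer v
      integer major
      integer minor
      integer sigwatchversion

*   Assumes sigwatchversion takes no arguments
      v = sigwatchversion()

*   Undo major_version * 1000 + minor_version
      major = v / 1000
      minor = mod(v, 1000)

      write (*,'("major version:",i4)') major
      write (*,'("minor version:",i4)') minor

      end
```

For the value 1002 given as an example above, the integer division gives 1 and the remainder gives 2.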
Both watchsignalname and watchsignal return 0 if the signal watching was installed successfully, and -1 if there was an error. If there was a non-default signal handler already installed, it is replaced, but the routine returns 1 to warn you of this.
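A caller might distinguish the three return values along these lines. This is only a sketch: the messages and the choice of INT with response 2 are illustrative, and what to do when a previous handler has been replaced is for your program to decide.

```fortran
      program chkstat

      implicit none

      integer status
      integer watchsignalname

      status = watchsignalname("INT", 2)

      if (status .eq. 0) then
         write (*,'("watching INT")')
      else if (status .eq. 1) then
*      success, but a non-default handler was replaced
         write (*,'("watching INT; replaced existing handler")')
      else
*      -1: the watch was not installed
         write (*,'("failed to watch INT")')
         stop
      endif

      end
```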
The function getlastsignal returns the response associated with the last signal caught, or zero if there has been no signal caught since the last time this function was invoked.
The following Fortran program shows the library in use.
      program sigs

      implicit none

      integer i
      integer status

      integer watchsignal
      integer watchsignalname
      integer getlastsignal

*   watch for signal 10 (which is USR1 on this platform)
      status = watchsignal(10)
      write(*,'("watchsignal 10:",i2)') status
*   watch for HUP, too
      status = watchsignalname("HUP", 99)
      write(*,'("watchsignal HUP:",i2)') status

      do i=1,10
         call sleep(1)
         write (*,'("lastsig=", i2)') getlastsignal()
      enddo

      end
Then you can use the library like this:
% g77 -o libsigwatch-demo libsigwatch-demo.f -lsigwatch
% ./libsigwatch-demo &   # start in the background ($! now has the PID)
[1] 15131
watchsignal 10: 0
watchsignal HUP: 0
% lastsig= 0
lastsig= 0
lastsig= 0
kill -HUP $!   # send the HUP signal to the process
lastsig=99     # saw it!
% lastsig= 0
...
You can also link against just sigwatch.o if necessary.
Download the distribution from Bitbucket.
To configure, build and install, just use:
% ./configure
% make
% make install
That will install the software into /usr/local. If you want it to go somewhere else, then (as usual with ./configure), specify the alternative location as the argument to configure's --prefix option. See ./configure --help for more details.
This software is copyright 2003, 2005, 2011, Norman Gray. It is free software, released under the terms of the GNU General Public Licence.