I'm testing the ServerIOTimeout setting in plugin.xml (with
WAS 5.1.1.5 and IHS2). With this setting I want to force a failover to
another cluster member within a fixed time interval (110 seconds).
I simulate a hanging AppServer on AIX 5.2 by sending the process a
kill -STOP. This suspends the process: serverstatus.sh doesn't respond,
but the process is still "active" (the pid file exists, and ps -ef shows
the process). Unfortunately, the plugin keeps sending requests to this
hanging server. With a full load test this causes the HTTP-servers to
reach their upper limit of 600 active clients so that the application
is totally unreachable. These limits are reached after several
minutes: more than the 110 seconds specified in the ServerIOTimeout
parameter.
After ten minutes I had to restart the HTTP servers (and kill -9 the
hanging process). These are the actions I am trying to avoid.
Can somebody explain this behaviour? Is kill -STOP a good way of
testing a hanging AppServer?
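
To make clear what I mean by "hanging", here is a rough Python sketch of the same kind of test. It is only an illustration, not my actual setup: the child process is a hypothetical stand-in for the AppServer JVM, and `suspend_and_probe` is a name I made up. It sends SIGSTOP (the same signal as kill -STOP) and checks that the process is stopped but still present in the process table.

```python
import os
import signal
import subprocess
import sys


def suspend_and_probe():
    """Suspend a throwaway child process and report how it looks from outside."""
    # Hypothetical stand-in for the AppServer: a child that only sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    try:
        # Same signal that `kill -STOP <pid>` sends from the shell.
        os.kill(child.pid, signal.SIGSTOP)

        # Block until the kernel reports a state change; WUNTRACED makes
        # waitpid return for a stopped (not only an exited) child.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)

        # poll() returns None while the process has not exited, which is
        # why tools such as ps still list it.
        still_present = child.poll() is None
        return stopped, still_present
    finally:
        # Clean up: resume the child, then terminate and reap it.
        os.kill(child.pid, signal.SIGCONT)
        child.kill()
        child.wait()


if __name__ == "__main__":
    print(suspend_and_probe())
```

A suspended process keeps its pid and its open sockets, which matches what I see with the pid file and ps -ef.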

Tests with the ConnectTimeout parameter of the plugin were successful
and behaved as expected, but this timeout only
applies to TCP/IP connection timeouts... See: http://www-1.ibm.com/support/docview...id=swg21219808
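
My assumption about why a connect timeout cannot catch this kind of hang is sketched below in Python. This is not the plugin's code, and `probe_unresponsive_server` is a made-up name. The listener never calls accept(), which I am using as a stand-in for a suspended server: the operating system still completes the TCP handshake, so the connect succeeds, and only a timeout on the read notices that no response arrives.

```python
import socket


def probe_unresponsive_server(connect_timeout=2.0, read_timeout=1.0):
    """Connect to a listener that never answers; report what times out."""
    # Stand-in for a suspended server: it listens but never calls accept().
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(5)
    port = listener.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # The kernel completes the handshake from the listen backlog,
        # so the connect succeeds even though nobody will ever answer.
        client.settimeout(connect_timeout)
        client.connect(("127.0.0.1", port))
        connected = True

        client.sendall(b"GET / HTTP/1.0\r\n\r\n")

        # Only a timeout on the read detects the missing response.
        client.settimeout(read_timeout)
        try:
            client.recv(1024)
            read_timed_out = False
        except socket.timeout:
            read_timed_out = True
        return connected, read_timed_out
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print(probe_unresponsive_server())
```

If the plugin behaves the same way, a successful ConnectTimeout test says nothing about a suspended server, and it would be the I/O timeout that has to trigger the failover.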