Problem accessing application on WAS 6.1. from IHS. - Websphere

  1. Problem accessing application on WAS 6.1. from IHS.

    Hi,

    I need some quick insight on a problem that I am currently facing.

    The environment is like this:

    3 application servers (WAS ND v6.1 / AIX), managed by a dmgr. One of the application servers resides on the same server as the dmgr. All of the application servers are clustered.

    Dmgr01 - 10.2.1.7
    AppSrv01 - 10.2.1.7
    AppSrv02 - 10.2.1.8
    AppSrv03 - 10.2.1.10

    2 web servers (IHS / AIX) - 10.2.1.3 / 10.2.1.4

    Applications have been deployed on the cluster, and normally I can access them through the web servers with no problem. Tested with snoop, the requests are directed to each application server in turn on a round-robin basis.

    Now here's the problem: at times, I can't access the application through the web servers. I tried accessing it directly on WAS, and only AppSrv01 and AppSrv02 responded. AppSrv03 ended up 'loading' forever. AppSrv03's status is up and started and port 9082 is accessible, but the application just won't load. What also bothers me is that the web servers are supposed to direct requests to the other available application servers, but since they failed to load the application, it seems the web servers keep hitting AppSrv03.
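
    One way to tell "the port accepts connections" apart from "the application actually answers" is to time the two stages separately. The following is only a sketch under assumptions: Python 3 on a machine that can reach the application servers, plain HTTP on the target port, and snoop mapped to /snoop (typical for the default application, but an assumption here). The probe() helper and its stage names are made up for illustration.

```python
import socket
import sys
import time


def probe(host, port, path="/snoop", timeout=5.0):
    """Send one plain HTTP GET and report how far it got.

    Returns (stage, seconds). The stage names are illustrative only.
    """
    start = time.monotonic()
    try:
        # Stage 1: TCP connect. Fails on refused or unroutable targets.
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return ("connect-failed", time.monotonic() - start)
    try:
        # Stage 2: send a request and wait for the first bytes of a reply.
        sock.settimeout(timeout)
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        first = sock.recv(1024)
    except OSError:
        # Connected, but nothing came back within the timeout.
        return ("connected-no-response", time.monotonic() - start)
    finally:
        sock.close()
    if not first:
        return ("connected-closed-early", time.monotonic() - start)
    return ("responded", time.monotonic() - start)


if __name__ == "__main__":
    # Usage: python probe.py host:port [host:port ...]
    for target in sys.argv[1:]:
        host, sep, port = target.rpartition(":")
        if not sep or not port.isdigit():
            continue  # skip anything that is not host:port
        stage, seconds = probe(host, int(port))
        print(f"{target}: {stage} after {seconds:.1f}s")
```

    Run against 10.2.1.10:9082, a result of connected-no-response would match the 'loading forever' symptom, while connect-failed would point more towards the network or the listener.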



    I tried to ping AppSrv03, and the result is consistent (though I kinda remember not getting any response from it once or twice).

    One more piece of information about AppSrv03: the server was previously used as a load balancer (Edge). I just checked today and noticed that a second IP is aliased to the network interface en2. Since there are 2 IP addresses attached to the NIC, the routing table is also affected. The last few times I configured Edge servers, I remember editing the routing table because adding an IP alias to the network interface (not sure whether lo0 or enx) led to some network problems. Could this contribute to the problem that I'm facing?
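
    To see at a glance which interfaces carry more than one IPv4 address, the output of ifconfig -a can be saved to a file and summarised. This is a rough sketch: it assumes the BSD-style layout (an unindented "name: flags=..." line followed by indented "inet <address> ..." lines), and the function names are hypothetical.

```python
import sys


def addresses_by_interface(ifconfig_text):
    """Map interface name -> list of IPv4 addresses in `ifconfig -a` style text."""
    result = {}
    current = None
    for line in ifconfig_text.splitlines():
        if line and not line[0].isspace() and ":" in line:
            # Unindented line such as "en2: flags=..." starts a new interface.
            current = line.split(":", 1)[0]
            result[current] = []
        elif current is not None:
            fields = line.split()
            # Only plain "inet" lines count; "inet6" lines are ignored.
            if len(fields) >= 2 and fields[0] == "inet":
                result[current].append(fields[1])
    return result


def aliased(ifconfig_text):
    """Return only the interfaces that have more than one IPv4 address."""
    found = addresses_by_interface(ifconfig_text)
    return {name: addrs for name, addrs in found.items() if len(addrs) > 1}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: ifconfig -a > ifconfig.txt ; python aliases.py ifconfig.txt
    with open(sys.argv[1]) as handle:
        for name, addrs in aliased(handle.read()).items():
            print(f"{name}: {', '.join(addrs)}")
```

    Any interface it prints is worth comparing against the routing table (netstat -rn) to see which address outgoing replies are actually sourced from.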



    I have checked the web servers' logs. error_log has recorded numerous instances of the following warnings / errors:

    child process 3759 did not exit, sending SIGTERM
    child process 3759 did not exit, sending SIGKILL

    and also "SIGHUP received. Attempting to restart", but I think that one is due to the rotatelogs that I just configured.

    QUESTION: Why is the application intermittently inaccessible from the web servers (and even directly on one of the application servers) when all of the application servers are up and running? I have reviewed plugin-cfg.xml and httpd.conf, and both look fine to me.

    Ideas? Any advice is highly appreciated.

    Thanks in advance.

    P.S. I don't have any logs with me at the moment, or I would attach them for reference.

    P.P.S. I'm posting this on both the WAS and IHS forums, since I think it fits both.

  2. Re: Problem accessing application on WAS 6.1. from IHS.

    On Jun 25, 5:44 am, gadi...@gmail.com wrote:
    > QUESTION: Why is the application intermittently inaccessible from the web servers (and even directly on one of the application servers) when all of the application servers are up and running? I have reviewed plugin-cfg.xml and httpd.conf, and both look fine to me.

    Hey,
    Is there any firewall between IHS and WAS? Have you tried looking
    at the firewall log? Maybe some packets are being rejected because of a
    firewall rule.
    I had a similar problem in the past: an application server whose
    hostname and IP address were changed after installation. We fixed the
    config files, but we still got access issues much like the ones you
    describe. We ran a network monitor to check the TCP packets and found
    that IHS was trying to connect to the old IP address of the application
    server. So I suggest you check the network traffic between IHS and WAS
    as well.

    Best regards,

  3. Re: Problem accessing application on WAS 6.1. from IHS.

    gadidot@gmail.com wrote:
    > QUESTION: Why is the application intermittently inaccessible from the web servers (and even directly on one of the application servers) when all of the application servers are up and running? I have reviewed plugin-cfg.xml and httpd.conf, and both look fine to me.

    Take a look in the plugin log files and see if AppSrv03 is ever marked
    down. Sounds like you may have a network issue between IHS and AppSrv03,
    and if it's at the TCP stack layer, a request can take a long time to
    fail: several OSes take 30-60 seconds to fail a socket open. But if
    that happens, the plugin should mark that server down.
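
    Searching the plugin log for mark-down events can be scripted. A minimal sketch follows, with loud assumptions: the exact wording of the log messages varies by plug-in version, so the pattern below (any line containing "marking" or "marked" followed by "down") is a guess rather than the plug-in's documented format, and the optional server filter is just a substring to match against whatever name the log uses for the cluster member.

```python
import re
import sys

# Assumption: mark-down events contain "marking"/"marked" ... "down".
# Adjust the pattern to whatever your plug-in log actually prints.
MARK_DOWN = re.compile(r"\bmark(?:ed|ing)\b.*\bdown\b", re.IGNORECASE)


def mark_down_lines(lines, server=None):
    """Return log lines that look like mark-down events.

    If `server` is given, keep only lines that also mention it
    (case-insensitive substring match).
    """
    hits = []
    for line in lines:
        if not MARK_DOWN.search(line):
            continue
        if server is not None and server.lower() not in line.lower():
            continue
        hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python markdown_scan.py /path/to/http_plugin.log [server-name]
    wanted = sys.argv[2] if len(sys.argv) > 2 else None
    with open(sys.argv[1], errors="replace") as log:
        for hit in mark_down_lines(log, server=wanted):
            print(hit)
```

    If this finds nothing for AppSrv03 while requests are hanging, that would suggest the connection is being accepted and the stall happens later, which a connect failure alone would not explain.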

    Ken

  4. Re: Problem accessing application on WAS 6.1. from IHS.

    Thanks for the reply, I'll check the logs.

    Correct me if I'm wrong: assuming there is a network problem between IHS and AppSrv03, the web servers will keep trying to send the request to AppSrv03 until it fails, and because of that delay the application will be inaccessible from the web servers for a certain period of time. The request was not diverted to the other 2 application servers because this is not a case where the web servers can immediately detect that AppSrv03 is down.

    Thanks again.

  5. Re: Problem accessing application on WAS 6.1. from IHS.

    gadidot@gmail.com wrote:
    > Correct me if I'm wrong: assuming there is a network problem between IHS and AppSrv03, the web servers will keep trying to send the request to AppSrv03 until it fails, and because of that delay the application will be inaccessible from the web servers for a certain period of time. The request was not diverted to the other 2 application servers because this is not a case where the web servers can immediately detect that AppSrv03 is down.

    Essentially correct. Note that the "requests" we're talking about are
    all first-time requests that do not have an existing session.
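
    The behaviour discussed here can be illustrated with a toy model. This is a deliberately simplified sketch and not the plug-in's actual algorithm: it handles sessionless requests one at a time, treats a bad member as one whose connect attempt times out, and borrows the names connect_timeout and retry_interval from the ConnectTimeout and RetryInterval settings in plugin-cfg.xml. The classes and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    reachable: bool = True
    down_until: float = 0.0  # simulated time when a retry is allowed again


class Cluster:
    def __init__(self, members, connect_timeout=30.0, retry_interval=60.0):
        self.members = members
        self.connect_timeout = connect_timeout
        self.retry_interval = retry_interval
        self.next_index = 0  # round-robin position

    def route(self, now):
        """Route one sessionless request arriving at simulated time `now`.

        Returns (member name or None, seconds lost on failed connects).
        """
        waited = 0.0
        for _ in range(len(self.members)):
            member = self.members[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.members)
            if member.down_until > now + waited:
                continue  # still marked down: skipped at no cost
            if member.reachable:
                return member.name, waited
            # Connect attempt hangs until the timeout, then mark it down.
            waited += self.connect_timeout
            member.down_until = now + waited + self.retry_interval
        return None, waited


if __name__ == "__main__":
    cluster = Cluster([
        Member("AppSrv01"),
        Member("AppSrv02"),
        Member("AppSrv03", reachable=False),
    ])
    for t in (0, 1, 2, 3, 4, 100, 101):
        name, waited = cluster.route(t)
        print(f"t={t:>3}s -> {name} (stalled {waited:.0f}s)")
```

    In this model only the request that lands on AppSrv03 while it is not marked down pays the connect timeout, and it pays again each time the retry interval expires. With many concurrent first-time requests, each one that reaches AppSrv03 before the mark-down takes effect would stall in the same way, which the sequential model does not show.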

    Ken
