Uploaded image for project: 'Application Server 7'
  1. Application Server 7
  2. AS7-4260

Occurences of "Exception acquiring ownership of X (via SharedLocalYieldingClusterLockManager)"

    Details

    • Similar Issues:
      Show 10 results 

      Description

      There are often occurrences of "Exception acquiring ownership of X":

      During shutdown, using dist:

      13:09:28,290 WARN  [org.apache.catalina.session.ManagerBase] (ContainerBackgroundProcessor[StandardEngine[jboss.web]]) JBAS018080: Standard expiration of session qLQps+CQJxEi6pJTBSzis0g3 failed; switching to a brute force cleanup. Problem is JBAS018060: Exception acquiring ownership of qLQps+CQJxEi6pJTBSzis0g3
      

      During server crash, using repl:

      12:55:49,556 INFO  [org.jboss.as.clustering.impl.CoreGroupCommunicationService.ejb] (Incoming-2,null) JBAS010248: New cluster view for partition ejb: 6 (org.jboss.as.clustering.impl.CoreGroupCommunicationService$GroupView@327579f6 delta: -1, merge: false)
      12:55:49,557 INFO  [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-2,null) ISPN000094: Received new cluster view: [perf21/ejb|6] [perf21/ejb, perf20/ejb, perf18/ejb]
      12:56:31,694 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[default-host]] (ajp-perf20-10.16.90.58-8009-211) Exception sending request initialized lifecycle event to listener instance of class org.jboss.weld.servlet.WeldListener: java.lang.RuntimeException: JBAS018060: Exception acquiring ownership of MvwC8KGWTuKPjD1-JjLW7URZ
      	at org.jboss.as.web.session.ClusteredSession.acquireSessionOwnership(ClusteredSession.java:528) [jboss-as-web-7.1.2.Final-SNAPSHOT.jar:7.1.2.Final-SNAPSHOT]
      	at org.jboss.as.web.session.ClusteredSession.expire(ClusteredSession.java:1260) [jboss-as-web-7.1.2.Final-SNAPSHOT.jar:7.1.2.Final-SNAPSHOT]
      	at org.jboss.as.web.session.ClusteredSession.isValid(ClusteredSession.java:1208) [jboss-as-web-7.1.2.Final-SNAPSHOT.jar:7.1.2.Final-SNAPSHOT]
      	at org.jboss.as.web.session.ClusteredSession.isValid(ClusteredSession.java:598) [jboss-as-web-7.1.2.Final-SNAPSHOT.jar:7.1.2.Final-SNAPSHOT]
      	at org.jboss.as.web.session.DistributableSessionManager.add(DistributableSessionManager.java:917) [jboss-as-web-7.1.2.Final-SNAPSHOT.jar:7.1.2.Final-SNAPSHOT]
      	at org.jboss.as.web.session.DistributableSessionManager.loadSession(DistributableSessionManager.java:1407) [jboss-as-web-7.1.2.Final-SNAPSHOT.jar:7.1.2.Final-SNAPSHOT]
      	at org.jboss.as.web.session.DistributableSessionManager.findSession(DistributableSessionManager.java:686) [jboss-as-web-7.1.2.Final-SNAPSHOT.jar:7.1.2.Final-SNAPSHOT]
      	at org.jboss.as.web.session.DistributableSessionManager.findSession(DistributableSessionManager.java:85) [jboss-as-web-7.1.2.Final-SNAPSHOT.jar:7.1.2.Final-SNAPSHOT]
      	at org.apache.catalina.connector.Request.doGetSession(Request.java:2618) [jbossweb-7.0.13.Final.jar:]
      	at org.apache.catalina.connector.Request.getSession(Request.java:2375) [jbossweb-7.0.13.Final.jar:]
      	at org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:841) [jbossweb-7.0.13.Final.jar:]
      	at org.jboss.weld.context.beanstore.http.LazySessionBeanStore.getSession(LazySessionBeanStore.java:72) [weld-core-1.1.5.AS71.Final.jar:2012-02-10 15:31]
      	at org.jboss.weld.context.beanstore.http.LazySessionBeanStore.<init>(LazySessionBeanStore.java:58) [weld-core-1.1.5.AS71.Final.jar:2012-02-10 15:31]
      	at org.jboss.weld.context.http.HttpSessionContextImpl.associate(HttpSessionContextImpl.java:31) [weld-core-1.1.5.AS71.Final.jar:2012-02-10 15:31]
      	at org.jboss.weld.context.http.HttpSessionContextImpl.associate(HttpSessionContextImpl.java:16) [weld-core-1.1.5.AS71.Final.jar:2012-02-10 15:31]
      	at org.jboss.weld.servlet.WeldListener.requestInitialized(WeldListener.java:134) [weld-core-1.1.5.AS71.Final.jar:2012-02-10 15:31]
      	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143) [jbossweb-7.0.13.Final.jar:]
      	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) [jbossweb-7.0.13.Final.jar:]
      	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) [jbossweb-7.0.13.Final.jar:]
      	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368) [jbossweb-7.0.13.Final.jar:]
      	at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:505) [jbossweb-7.0.13.Final.jar:]
      	at org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:445) [jbossweb-7.0.13.Final.jar:]
      	at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:930) [jbossweb-7.0.13.Final.jar:]
      	at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_30]
      Caused by: org.jboss.as.clustering.lock.TimeoutException: JBAS010223: Cannot acquire lock //default-host//clusterbench/MvwC8KGWTuKPjD1-JjLW7URZ from cluster
      	at org.jboss.as.clustering.lock.SharedLocalYieldingClusterLockManager.lock(SharedLocalYieldingClusterLockManager.java:439)
      	at org.jboss.as.clustering.web.infinispan.DistributedCacheManager.acquireSessionOwnership(DistributedCacheManager.java:372)
      	at org.jboss.as.web.session.ClusteredSession.acquireSessionOwnership(ClusteredSession.java:520) [jboss-as-web-7.1.2.Final-SNAPSHOT.jar:7.1.2.Final-SNAPSHOT]
      	... 23 more
      

        Gliffy Diagrams

        1. view-perf18.log
          2 kB
          Richard Achmatowicz
        2. view-perf19.log
          55 kB
          Richard Achmatowicz
        3. view-perf20.log
          10 kB
          Richard Achmatowicz
        4. view-perf21.log
          13 kB
          Richard Achmatowicz
        1. SessionTravel.gif
          12 kB

          Issue Links

            Activity

            Show
            rhusar Radoslav Husar added a comment - http://hudson.qa.jboss.com/hudson/view/EAP6/view/EAP6-Failover/job/eap-6x-failover-http-session-jvmkill-repl-sync/15/ http://hudson.qa.jboss.com/hudson/view/EAP6/view/EAP6-Failover/job/eap-6x-failover-http-session-shutdown-dist-async/10/
            Hide
            rhusar Radoslav Husar added a comment -

            Here is my theory: the cluster has already received a new cluster view after a node crashed – however the sync repl was still trying to acquire the lock from a crashed node. The fix would then be to cancel getting lock from a crash node when a new view is accepted.

            The implications of this are that the sessions trying to get sync repl lock timeout or reply after the timeout when a node crashed even though FD detected it earlier.

            I might be totally wrong here

            Show
            rhusar Radoslav Husar added a comment - Here is my theory: the cluster has already received a new cluster view after a node crashed – however the sync repl was still trying to acquire the lock from a crashed node. The fix would then be to cancel getting lock from a crash node when a new view is accepted. The implications of this are that the sessions trying to get sync repl lock timeout or reply after the timeout when a node crashed even though FD detected it earlier. I might be totally wrong here
            Hide
            pferraro Paul Ferraro added a comment -

            Actually, it looks like some fishy is going on...
            Looking at the stacktrace, the session as fetched from the distributed cache is invalid - meaning it's timestamp is older than the maxInactiveInterval, hence the call to ClusteredSession.expire(...).
            So, my question is, does your test attempt to access a session that would have been expired? Even though this condition should not cause a TimeoutException, I want to make sure there isn't something more fundamental that's broken.

            Show
            pferraro Paul Ferraro added a comment - Actually, it looks like some fishy is going on... Looking at the stacktrace, the session as fetched from the distributed cache is invalid - meaning it's timestamp is older than the maxInactiveInterval, hence the call to ClusteredSession.expire(...). So, my question is, does your test attempt to access a session that would have been expired? Even though this condition should not cause a TimeoutException, I want to make sure there isn't something more fundamental that's broken.
            Hide
            rhusar Radoslav Husar added a comment -

            Well, with all the time-outs going on there could have been a retry – I should rerun. Either way wouldn't this result in something different?

            Show
            rhusar Radoslav Husar added a comment - Well, with all the time-outs going on there could have been a retry – I should rerun. Either way wouldn't this result in something different?
            Hide
            rhusar Radoslav Husar added a comment -

            This is still being looked up, I will rerun with latest build and update.

            Show
            rhusar Radoslav Husar added a comment - This is still being looked up, I will rerun with latest build and update.
            Hide
            rachmato Richard Achmatowicz added a comment -

            I am actively looking at this. From what I can tell, after failure of the lock holder, the node to which the request failed over can't get the lock, resulting in the exception. It may be that the lock state is not being updated in a timely manner, post failure. Going to look at dumping the lock states.

            Please let me know if you still see these exceptions, Rado, on the new build.

            Show
            rachmato Richard Achmatowicz added a comment - I am actively looking at this. From what I can tell, after failure of the lock holder, the node to which the request failed over can't get the lock, resulting in the exception. It may be that the lock state is not being updated in a timely manner, post failure. Going to look at dumping the lock states. Please let me know if you still see these exceptions, Rado, on the new build.
            Hide
            rhusar Radoslav Husar added a comment -

            Richard I am still seeing this in EAP ER5 (~2 weeks old AS7).

            REPL and SYNC replication.

            https://hudson.qa.jboss.com/hudson/job/eap-6x-failover-http-cdi-shutdown-dist-sync/6/console-perf19/

            [JBossINF] 19:14:01,091 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[default-host]] (ajp-perf19-10.16.90.56-8009-12) Exception sending request initialized lifecycle event to listener instance of class org.jboss.weld.servlet.WeldListener: java.lang.RuntimeException: JBAS018060: Exception acquiring ownership of e9nOvXjom8IIp9ZcmFiQVZh8
            [JBossINF] 	at org.jboss.as.web.session.ClusteredSession.acquireSessionOwnership(ClusteredSession.java:528) [jboss-as-web-7.1.1.Final-redhat-1.jar:7.1.1.Final-redhat-1]
            [JBossINF] 	at org.jboss.as.web.session.ClusteredSession.access(ClusteredSession.java:496) [jboss-as-web-7.1.1.Final-redhat-1.jar:7.1.1.Final-redhat-1]
            [JBossINF] 	at org.apache.catalina.connector.Request.doGetSession(Request.java:2625) [jbossweb-7.0.14.Final-redhat-0-todo.jar:]
            [JBossINF] 	at org.apache.catalina.connector.Request.getSession(Request.java:2375) [jbossweb-7.0.14.Final-redhat-0-todo.jar:]
            [JBossINF] 	at org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:841) [jbossweb-7.0.14.Final-redhat-0-todo.jar:]
            [JBossINF] 	at org.jboss.weld.context.beanstore.http.LazySessionBeanStore.getSession(LazySessionBeanStore.java:72) [weld-core-1.1.6.Final-redhat-0-todo.jar:2012-03-21 18:52]
            [JBossINF] 	at org.jboss.weld.context.beanstore.http.LazySessionBeanStore.<init>(LazySessionBeanStore.java:58) [weld-core-1.1.6.Final-redhat-0-todo.jar:2012-03-21 18:52]
            [JBossINF] 	at org.jboss.weld.context.http.HttpSessionContextImpl.associate(HttpSessionContextImpl.java:31) [weld-core-1.1.6.Final-redhat-0-todo.jar:2012-03-21 18:52]
            [JBossINF] 	at org.jboss.weld.context.http.HttpSessionContextImpl.associate(HttpSessionContextImpl.java:16) [weld-core-1.1.6.Final-redhat-0-todo.jar:2012-03-21 18:52]
            [JBossINF] 	at org.jboss.weld.servlet.WeldListener.requestInitialized(WeldListener.java:134) [weld-core-1.1.6.Final-redhat-0-todo.jar:2012-03-21 18:52]
            [JBossINF] 	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143) [jbossweb-7.0.14.Final-redhat-0-todo.jar:]
            [JBossINF] 	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) [jbossweb-7.0.14.Final-redhat-0-todo.jar:]
            [JBossINF] 	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) [jbossweb-7.0.14.Final-redhat-0-todo.jar:]
            [JBossINF] 	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368) [jbossweb-7.0.14.Final-redhat-0-todo.jar:]
            [JBossINF] 	at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:505) [jbossweb-7.0.14.Final-redhat-0-todo.jar:]
            [JBossINF] 	at org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:445) [jbossweb-7.0.14.Final-redhat-0-todo.jar:]
            [JBossINF] 	at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:931) [jbossweb-7.0.14.Final-redhat-0-todo.jar:]
            [JBossINF] 	at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_30]
            [JBossINF] Caused by: org.jboss.as.clustering.lock.TimeoutException: JBAS010213: Cannot acquire lock //default-host//clusterbench/e9nOvXjom8IIp9ZcmFiQVZh8 from cluster
            [JBossINF] 	at org.jboss.as.clustering.lock.SharedLocalYieldingClusterLockManager.lock(SharedLocalYieldingClusterLockManager.java:439)
            [JBossINF] 	at org.jboss.as.clustering.web.infinispan.DistributedCacheManager.acquireSessionOwnership(DistributedCacheManager.java:371)
            [JBossINF] 	at org.jboss.as.web.session.ClusteredSession.acquireSessionOwnership(ClusteredSession.java:520) [jboss-as-web-7.1.1.Final-redhat-1.jar:7.1.1.Final-redhat-1]
            [JBossINF] 	... 17 more
            

            Show
            rhusar Radoslav Husar added a comment - Richard I am still seeing this in EAP ER5 (~2 weeks old AS7). REPL and SYNC replication. https://hudson.qa.jboss.com/hudson/job/eap-6x-failover-http-cdi-shutdown-dist-sync/6/console-perf19/ [JBossINF] 19:14:01,091 ERROR [org.apache.catalina.core.ContainerBase.[jboss.web].[default-host]] (ajp-perf19-10.16.90.56-8009-12) Exception sending request initialized lifecycle event to listener instance of class org.jboss.weld.servlet.WeldListener: java.lang.RuntimeException: JBAS018060: Exception acquiring ownership of e9nOvXjom8IIp9ZcmFiQVZh8 [JBossINF] at org.jboss.as.web.session.ClusteredSession.acquireSessionOwnership(ClusteredSession.java:528) [jboss-as-web-7.1.1.Final-redhat-1.jar:7.1.1.Final-redhat-1] [JBossINF] at org.jboss.as.web.session.ClusteredSession.access(ClusteredSession.java:496) [jboss-as-web-7.1.1.Final-redhat-1.jar:7.1.1.Final-redhat-1] [JBossINF] at org.apache.catalina.connector.Request.doGetSession(Request.java:2625) [jbossweb-7.0.14.Final-redhat-0-todo.jar:] [JBossINF] at org.apache.catalina.connector.Request.getSession(Request.java:2375) [jbossweb-7.0.14.Final-redhat-0-todo.jar:] [JBossINF] at org.apache.catalina.connector.RequestFacade.getSession(RequestFacade.java:841) [jbossweb-7.0.14.Final-redhat-0-todo.jar:] [JBossINF] at org.jboss.weld.context.beanstore.http.LazySessionBeanStore.getSession(LazySessionBeanStore.java:72) [weld-core-1.1.6.Final-redhat-0-todo.jar:2012-03-21 18:52] [JBossINF] at org.jboss.weld.context.beanstore.http.LazySessionBeanStore.<init>(LazySessionBeanStore.java:58) [weld-core-1.1.6.Final-redhat-0-todo.jar:2012-03-21 18:52] [JBossINF] at org.jboss.weld.context.http.HttpSessionContextImpl.associate(HttpSessionContextImpl.java:31) [weld-core-1.1.6.Final-redhat-0-todo.jar:2012-03-21 18:52] [JBossINF] at org.jboss.weld.context.http.HttpSessionContextImpl.associate(HttpSessionContextImpl.java:16) [weld-core-1.1.6.Final-redhat-0-todo.jar:2012-03-21 18:52] [JBossINF] at org.jboss.weld.servlet.WeldListener.requestInitialized(WeldListener.java:134) [weld-core-1.1.6.Final-redhat-0-todo.jar:2012-03-21 18:52] [JBossINF] at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:143) [jbossweb-7.0.14.Final-redhat-0-todo.jar:] [JBossINF] at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) [jbossweb-7.0.14.Final-redhat-0-todo.jar:] [JBossINF] at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) [jbossweb-7.0.14.Final-redhat-0-todo.jar:] [JBossINF] at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368) [jbossweb-7.0.14.Final-redhat-0-todo.jar:] [JBossINF] at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:505) [jbossweb-7.0.14.Final-redhat-0-todo.jar:] [JBossINF] at org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:445) [jbossweb-7.0.14.Final-redhat-0-todo.jar:] [JBossINF] at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:931) [jbossweb-7.0.14.Final-redhat-0-todo.jar:] [JBossINF] at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_30] [JBossINF] Caused by: org.jboss.as.clustering.lock.TimeoutException: JBAS010213: Cannot acquire lock //default-host//clusterbench/e9nOvXjom8IIp9ZcmFiQVZh8 from cluster [JBossINF] at org.jboss.as.clustering.lock.SharedLocalYieldingClusterLockManager.lock(SharedLocalYieldingClusterLockManager.java:439) [JBossINF] at org.jboss.as.clustering.web.infinispan.DistributedCacheManager.acquireSessionOwnership(DistributedCacheManager.java:371) [JBossINF] at org.jboss.as.web.session.ClusteredSession.acquireSessionOwnership(ClusteredSession.java:520) [jboss-as-web-7.1.1.Final-redhat-1.jar:7.1.1.Final-redhat-1] [JBossINF] ... 17 more
            Hide
            sarah.g Sarah Gerweck added a comment - - edited

            Any progress on this one?

            At my workplace we are using JBoss 7.1.1 and I am starting to get pressure to move to another container because of this issue. We have had to turn off clustering completely.

            Our QE team is convinced that this is not simply caused by a failed node. We have tried carefully stopping all JBoss nodes in our cluster, then bringing them back one at a time. As users bounce between hosts, they randomly trigger this error and get failures. (Our environment does not have strict session affinity, so users can bounce around.)

            Show
            sarah.g Sarah Gerweck added a comment - - edited Any progress on this one? At my workplace we are using JBoss 7.1.1 and I am starting to get pressure to move to another container because of this issue. We have had to turn off clustering completely. Our QE team is convinced that this is not simply caused by a failed node. We have tried carefully stopping all JBoss nodes in our cluster, then bringing them back one at a time. As users bounce between hosts, they randomly trigger this error and get failures. (Our environment does not have strict session affinity, so users can bounce around.)
            Hide
            rachmato Richard Achmatowicz added a comment -

            Sarah

            We're actively working on it. I'll keep you posted.

            Show
            rachmato Richard Achmatowicz added a comment - Sarah We're actively working on it. I'll keep you posted.
            Hide
            rachmato Richard Achmatowicz added a comment - - edited

            Some work in progress. I added some logging to JvmRouteValve, ClusteredValve and SharedLocalYieldingClusterLockManagerSupport in order to seee which session related requests are coming into each node and (ii) what is happening when the global lock is being acquired (which triggers the exception).

            I did a run of the session replication test, processed the logs and ended up with four logs which track the activity of a single session which triggers the exception.

            I have attached the logs as view-perf18, view-perf19, view-perf20 and view-perf21.

            I have also created a small graphic showing the relationships between hosts, views received, failures and which hosts are receiving requests involving that session. The graphic is called sessionTravel.gif. What is clear is that in view 10, both perf18 and pef19 are trying to acquire the session ownership.

            There are some issues with the JVMRoute processing which i'm still looking into. What is odd is that two requests involving the same session are being processed by different hosts.

            Show
            rachmato Richard Achmatowicz added a comment - - edited Some work in progress. I added some logging to JvmRouteValve, ClusteredValve and SharedLocalYieldingClusterLockManagerSupport in order to seee which session related requests are coming into each node and (ii) what is happening when the global lock is being acquired (which triggers the exception). I did a run of the session replication test, processed the logs and ended up with four logs which track the activity of a single session which triggers the exception. I have attached the logs as view-perf18, view-perf19, view-perf20 and view-perf21. I have also created a small graphic showing the relationships between hosts, views received, failures and which hosts are receiving requests involving that session. The graphic is called sessionTravel.gif. What is clear is that in view 10, both perf18 and pef19 are trying to acquire the session ownership. There are some issues with the JVMRoute processing which i'm still looking into. What is odd is that two requests involving the same session are being processed by different hosts.
            Hide
            pferraro Paul Ferraro added a comment -

            The issue with requests getting routed to different nodes using DIST mode has been identified (and fixed) here: AS7-4560
            The fact that this issue is present using REPL_SYNC mode still requires investigation.

            Show
            pferraro Paul Ferraro added a comment - The issue with requests getting routed to different nodes using DIST mode has been identified (and fixed) here: AS7-4560 The fact that this issue is present using REPL_SYNC mode still requires investigation.
            Hide
            rachmato Richard Achmatowicz added a comment -

            I've re-tested both repl-sync and dist-async with the latest AS7 and I no longer see instances of this exception occurring.

            Show
            rachmato Richard Achmatowicz added a comment - I've re-tested both repl-sync and dist-async with the latest AS7 and I no longer see instances of this exception occurring.
            Hide
            pferraro Paul Ferraro added a comment -

            If Rado can confirm this, let's close it as no longer an issue. I'm not terribly surprised as there have been many changes that likely addressed the underlying problem.

            Show
            pferraro Paul Ferraro added a comment - If Rado can confirm this, let's close it as no longer an issue. I'm not terribly surprised as there have been many changes that likely addressed the underlying problem.
            Show
            rhusar Radoslav Husar added a comment - This is still an issue in latest snapshot: https://hudson.qa.jboss.com/hudson/view/EAP6/view/EAP6-Failover/job/eap-6x-failover-http-session-shutdown-dist-async/13/console-perf18/
            Hide
            rachmato Richard Achmatowicz added a comment -

            Still looking into this one. What I can see happening is as follows:
            1. when a failover occurs, a new jvmRoute is being computed by looking up the cache to get the DistributionManager (ii) finding out which addresses own the session key (iii) use the JvmRouteRegistry to look up the new jvmRoute for the key, using the address of the key
            2. the jvmRoute is then adjusted and the request continues processing on the node which it arrived at
            3. when the request is returned, the load balancer then uses the new jvmRoute to process the request

            For example:

            11:49:46,418 TRACE [org.jboss.as.web.session.JvmRouteValve] (ajp-perf18-10.16.90.54-8009-361) checkJvmRoute(): check if need to re-route based on JvmRoute. Session id: mdbFou-te-0qzdaY1T6FXyaw.perf21 jvmRoute: perf18
            11:49:46,418 TRACE [org.jboss.as.web.session.JvmRouteValve] (ajp-perf18-10.16.90.54-8009-361) handleJvmRoute(): We have detected a failover with different jvmRoute. old one: perf21, new one: perf18. Will reset the session id.
            ...
            11:49:46,420 TRACE [org.jboss.as.web.session.JvmRouteValve] (ajp-perf18-10.16.90.54-8009-361) resetSessionId(): changed catalina session to= [mdbFou-te-0qzdaY1T6FXyaw.perf19] old one= [mdbFou-te-0qzdaY1T6FXyaw.perf21]
            

            In the above, the request arrives at perf18, the jvmRoute assigned is perf19 (based on key ownership), but the request processing is completed at perf18 until the request is returned. In processing at perf18, perf18 will have to take the global lock. If for some reason, the global lock were not being released, then the next time the request was processed by perf19, it would not be able to obtain the lock and the exception would occur. This would at least explain how two requests involving the same session were arising.

            Unfortunately, when I add TRACE logging to try to see more of what is going on, the exceptions do not occur.

            Show
            rachmato Richard Achmatowicz added a comment - Still looking into this one. What I can see happening is as follows: 1. when a failover occurs, a new jvmRoute is being computed by looking up the cache to get the DistributionManager (ii) finding out which addresses own the session key (iii) use the JvmRouteRegistry to look up the new jvmRoute for the key, using the address of the key 2. the jvmRoute is then adjusted and the request continues processing on the node which it arrived at 3. when the request is returned, the load balancer then uses the new jvmRoute to process the request For example: 11:49:46,418 TRACE [org.jboss.as.web.session.JvmRouteValve] (ajp-perf18-10.16.90.54-8009-361) checkJvmRoute(): check if need to re-route based on JvmRoute. Session id: mdbFou-te-0qzdaY1T6FXyaw.perf21 jvmRoute: perf18 11:49:46,418 TRACE [org.jboss.as.web.session.JvmRouteValve] (ajp-perf18-10.16.90.54-8009-361) handleJvmRoute(): We have detected a failover with different jvmRoute. old one: perf21, new one: perf18. Will reset the session id. ... 11:49:46,420 TRACE [org.jboss.as.web.session.JvmRouteValve] (ajp-perf18-10.16.90.54-8009-361) resetSessionId(): changed catalina session to= [mdbFou-te-0qzdaY1T6FXyaw.perf19] old one= [mdbFou-te-0qzdaY1T6FXyaw.perf21] In the above, the request arrives at perf18, the jvmRoute assigned is perf19 (based on key ownership), but the request processing is completed at perf18 until the request is returned. In processing at perf18, perf18 will have to take the global lock. If for some reason, the global lock were not being released, then the next time the request was processed by perf19, it would not be able to obtain the lock and the exception would occur. This would at least explain how two requests involving the same session were arising. Unfortunately, when I add TRACE logging to try to see more of what is going on, the exceptions do not occur.
            Hide
            elcody02 Marc Grimme added a comment -

            Any news on that bug?
            Is it likely to be fixed with 7.1.3?

            Thanks for you help.

            Show
            elcody02 Marc Grimme added a comment - Any news on that bug? Is it likely to be fixed with 7.1.3? Thanks for you help.
            Hide
            maksymg Maksym Gryevtsov added a comment -
            Show
            maksymg Maksym Gryevtsov added a comment - Will https://community.jboss.org/thread/201663?tstart=30 help? Thanks
            Hide
            vishalkumar375 vishal kumar added a comment -

            Hi,

            We are facing the exactly same issue, can you please help in finding out the route cause.

            We are having two jboss in standalone cluster , they are giving this particular issue while using the application.

            Thanks

            Show
            vishalkumar375 vishal kumar added a comment - Hi, We are facing the exactly same issue, can you please help in finding out the route cause. We are having two jboss in standalone cluster , they are giving this particular issue while using the application. Thanks
            Hide
            ctomc Tomaz Cerar added a comment -

            vishal kumar so upgrade to newer version that has this fixed.

            Show
            ctomc Tomaz Cerar added a comment - vishal kumar so upgrade to newer version that has this fixed.

              People

              • Assignee:
                pferraro Paul Ferraro
                Reporter:
                rhusar Radoslav Husar
              • Votes:
                6 Vote for this issue
                Watchers:
                10 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Development