Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Done
    • Affects Version/s: 2.3 SP1
    • Fix Version/s: 2.4
    • Labels:
      None
    • Estimated Difficulty:
      Medium

      Description

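      The stack dump shows a deadlock: http-192.168.5.2-8180-4 and IncomingPacketHandler each hold a lock the other needs, and http-192.168.5.2-8180-5 is stuck behind them.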
      Name: http-192.168.5.2-8180-5
      State: BLOCKED on java.lang.Object@2afbf7 owned by: http-192.168.5.2-8180-4
      Total blocked: 35,522 Total waited: 401

      Stack trace:
      java.lang.Object.wait(Native Method)
      org.jgroups.protocols.FC.handleDownMessage(FC.java:360)
      org.jgroups.protocols.FC.down(FC.java:300)
      org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
      org.jgroups.protocols.FC.receiveDownEvent(FC.java:294)
      org.jgroups.stack.Protocol.passDown(Protocol.java:551)
      org.jgroups.protocols.FRAG2.down(FRAG2.java:167)
      org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
      org.jgroups.stack.Protocol.passDown(Protocol.java:551)
      org.jgroups.protocols.pbcast.STATE_TRANSFER.down(STATE_TRANSFER.java:276)
      org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
      org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:385)
      org.jgroups.JChannel.down(JChannel.java:1175)
      org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:776)
      org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.passDown(MessageDispatcher.java:753)
      org.jgroups.blocks.RequestCorrelator.sendRequest(RequestCorrelator.java:286)
      org.jgroups.blocks.GroupRequest.doExecute(GroupRequest.java:439)
      org.jgroups.blocks.GroupRequest.execute(GroupRequest.java:189)
      org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:424)
      org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:178)
      org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:4160)
      org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:4114)
      org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:4215)
      org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:110)
      org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:88)
      org.jboss.cache.interceptors.ReplicationInterceptor.handleReplicatedMethod(ReplicationInterceptor.java:114)
      org.jboss.cache.interceptors.ReplicationInterceptor.invoke(ReplicationInterceptor.java:83)
      org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
      org.jboss.cache.interceptors.TxInterceptor.handleNonTxMethod(TxInterceptor.java:345)
      org.jboss.cache.interceptors.TxInterceptor.invoke(TxInterceptor.java:156)
      org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
      org.jboss.cache.interceptors.CacheMgmtInterceptor.invoke(CacheMgmtInterceptor.java:167)
      org.jboss.cache.TreeCache.invokeMethod(TreeCache.java:5520)
      org.jboss.cache.TreeCache.put(TreeCache.java:3601)
      sun.reflect.GeneratedMethodAccessor122.invoke(Unknown Source)
      sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      java.lang.reflect.Method.invoke(Method.java:585)
      org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
      org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
      org.jboss.mx.server.Invocation.invoke(Invocation.java:86)
      org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
      org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
      org.jboss.system.server.jmx.LazyMBeanServer.invoke(LazyMBeanServer.java:279)
      org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:210)
      $Proxy58.put(Unknown Source)
      org.jboss.web.tomcat.tc5.session.JBossCacheWrapper.put(JBossCacheWrapper.java:141)
      org.jboss.web.tomcat.tc5.session.JBossCacheService.putSession(JBossCacheService.java:315)
      org.jboss.web.tomcat.tc5.session.JBossCacheClusteredSession.processSessionRepl(JBossCacheClusteredSession.java:122)
      org.jboss.web.tomcat.tc5.session.JBossCacheManager.processSessionRepl(JBossCacheManager.java:1094)
      org.jboss.web.tomcat.tc5.session.JBossCacheManager.storeSession(JBossCacheManager.java:649)
      org.jboss.web.tomcat.tc5.session.InstantSnapshotManager.snapshot(InstantSnapshotManager.java:49)
      org.jboss.web.tomcat.tc5.session.ClusteredSessionValve.invoke(ClusteredSessionValve.java:98)
      org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:74)
      org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
      org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
      org.jboss.web.tomcat.tc5.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:153)
      org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
      org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
      org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
      org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
      org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
      org.apache.tomcat.util.net.MasterSlaveWorkerThread.run(MasterSlaveWorkerThread.java:112)
      java.lang.Thread.run(Thread.java:595)

      Name: http-192.168.5.2-8180-4
      State: BLOCKED on org.jgroups.protocols.UNICAST$Entry@13e6987 owned by: IncomingPacketHandler (channel=Tomcat-Cluster)
      Total blocked: 39,178 Total waited: 490

      Stack trace:
      org.jgroups.protocols.UNICAST.down(UNICAST.java:264)
      org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
      org.jgroups.stack.Protocol.passDown(Protocol.java:551)
      org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:283)
      org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
      org.jgroups.stack.Protocol.passDown(Protocol.java:551)
      org.jgroups.protocols.pbcast.GMS.down(GMS.java:809)
      org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
      org.jgroups.stack.Protocol.passDown(Protocol.java:551)
      org.jgroups.protocols.FC.sendCreditRequest(FC.java:528)
      org.jgroups.protocols.FC.handleDownMessage(FC.java:365)
      org.jgroups.protocols.FC.down(FC.java:300)
      org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
      org.jgroups.protocols.FC.receiveDownEvent(FC.java:294)
      org.jgroups.stack.Protocol.passDown(Protocol.java:551)
      org.jgroups.protocols.FRAG2.down(FRAG2.java:167)
      org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
      org.jgroups.stack.Protocol.passDown(Protocol.java:551)
      org.jgroups.protocols.pbcast.STATE_TRANSFER.down(STATE_TRANSFER.java:276)
      org.jgroups.stack.Protocol.receiveDownEvent(Protocol.java:517)
      org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:385)
      org.jgroups.JChannel.down(JChannel.java:1175)
      org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.down(MessageDispatcher.java:776)
      org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.passDown(MessageDispatcher.java:753)
      org.jgroups.blocks.RequestCorrelator.sendRequest(RequestCorrelator.java:286)
      org.jgroups.blocks.GroupRequest.doExecute(GroupRequest.java:439)
      org.jgroups.blocks.GroupRequest.execute(GroupRequest.java:189)
      org.jgroups.blocks.MessageDispatcher.castMessage(MessageDispatcher.java:424)
      org.jgroups.blocks.RpcDispatcher.callRemoteMethods(RpcDispatcher.java:178)
      org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:4160)
      org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:4114)
      org.jboss.cache.TreeCache.callRemoteMethods(TreeCache.java:4215)
      org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:110)
      org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:88)
      org.jboss.cache.interceptors.ReplicationInterceptor.handleReplicatedMethod(ReplicationInterceptor.java:114)
      org.jboss.cache.interceptors.ReplicationInterceptor.invoke(ReplicationInterceptor.java:83)
      org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
      org.jboss.cache.interceptors.TxInterceptor.handleNonTxMethod(TxInterceptor.java:345)
      org.jboss.cache.interceptors.TxInterceptor.invoke(TxInterceptor.java:156)
      org.jboss.cache.interceptors.Interceptor.invoke(Interceptor.java:68)
      org.jboss.cache.interceptors.CacheMgmtInterceptor.invoke(CacheMgmtInterceptor.java:167)
      org.jboss.cache.TreeCache.invokeMethod(TreeCache.java:5520)
      org.jboss.cache.TreeCache.put(TreeCache.java:3601)
      sun.reflect.GeneratedMethodAccessor122.invoke(Unknown Source)
      sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      java.lang.reflect.Method.invoke(Method.java:585)
      org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
      org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
      org.jboss.mx.server.Invocation.invoke(Invocation.java:86)
      org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
      org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
      org.jboss.system.server.jmx.LazyMBeanServer.invoke(LazyMBeanServer.java:279)
      org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:210)
      $Proxy58.put(Unknown Source)
      org.jboss.web.tomcat.tc5.session.JBossCacheWrapper.put(JBossCacheWrapper.java:141)
      org.jboss.web.tomcat.tc5.session.JBossCacheService.putSession(JBossCacheService.java:315)
      org.jboss.web.tomcat.tc5.session.JBossCacheClusteredSession.processSessionRepl(JBossCacheClusteredSession.java:122)
      org.jboss.web.tomcat.tc5.session.JBossCacheManager.processSessionRepl(JBossCacheManager.java:1094)
      org.jboss.web.tomcat.tc5.session.JBossCacheManager.storeSession(JBossCacheManager.java:649)
      org.jboss.web.tomcat.tc5.session.InstantSnapshotManager.snapshot(InstantSnapshotManager.java:49)
      org.jboss.web.tomcat.tc5.session.ClusteredSessionValve.invoke(ClusteredSessionValve.java:98)
      org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:74)
      org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
      org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
      org.jboss.web.tomcat.tc5.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:153)
      org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
      org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
      org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
      org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
      org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
      org.apache.tomcat.util.net.MasterSlaveWorkerThread.run(MasterSlaveWorkerThread.java:112)
      java.lang.Thread.run(Thread.java:595)

      Name: IncomingPacketHandler (channel=Tomcat-Cluster)
      State: BLOCKED on java.lang.Object@2afbf7 owned by: http-192.168.5.2-8180-4
      Total blocked: 85,549 Total waited: 18,813

      Stack trace:
      org.jgroups.protocols.FC.handleCredit(FC.java:462)
      org.jgroups.protocols.FC.up(FC.java:319)
      org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
      org.jgroups.stack.Protocol.passUp(Protocol.java:538)
      org.jgroups.protocols.pbcast.GMS.up(GMS.java:753)
      org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
      org.jgroups.protocols.pbcast.GMS.receiveUpEvent(GMS.java:773)
      org.jgroups.stack.Protocol.passUp(Protocol.java:538)
      org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:258)
      org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
      org.jgroups.stack.Protocol.passUp(Protocol.java:538)
      org.jgroups.protocols.UNICAST.handleDataReceived(UNICAST.java:471)
      org.jgroups.protocols.UNICAST.up(UNICAST.java:206)
      org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
      org.jgroups.stack.Protocol.passUp(Protocol.java:538)
      org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:569)
      org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
      org.jgroups.stack.Protocol.passUp(Protocol.java:538)
      org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:185)
      org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
      org.jgroups.stack.Protocol.passUp(Protocol.java:538)
      org.jgroups.protocols.FD.up(FD.java:274)
      org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
      org.jgroups.stack.Protocol.passUp(Protocol.java:538)
      org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:303)
      org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
      org.jgroups.stack.Protocol.passUp(Protocol.java:538)
      org.jgroups.protocols.MERGE2.up(MERGE2.java:163)
      org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
      org.jgroups.stack.Protocol.passUp(Protocol.java:538)
      org.jgroups.protocols.Discovery.up(Discovery.java:225)
      org.jgroups.stack.Protocol.receiveUpEvent(Protocol.java:488)
      org.jgroups.stack.Protocol.passUp(Protocol.java:538)
      org.jgroups.protocols.TP.handleIncomingMessage(TP.java:918)
      org.jgroups.protocols.TP.handleIncomingPacket(TP.java:860)
      org.jgroups.protocols.TP.access$200(TP.java:45)
      org.jgroups.protocols.TP$IncomingPacketHandler.run(TP.java:1294)
      java.lang.Thread.run(Thread.java:595)

        Activity

        Bela Ban
        added a comment -

        The deadlock was fixed by

        • converting synchronized(mutex) into an EDU.oswego.cs.dl.util.concurrent CondVar and ReentrantLock
        • releasing the lock while sending the credit request and re-acquiring it afterwards

        The code in FC looks a bit ugly, but with JDK 5's java.util.concurrent.locks.ReentrantLock/Condition, the lock() and unlock() calls will no longer have to be guarded by a try-catch clause, because they are uninterruptible.
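
The two steps above can be sketched with the JDK 5 classes the comment mentions. This is a minimal illustration, not the actual FC code: `CreditGate`, `CreditRequester`, `acquire`, `replenish`, `available` and the 500 ms re-request interval are all hypothetical names and values chosen for the example. The point it shows is that the sender drops the lock before calling down the stack, so a receiver thread that needs the same lock to add credits is never blocked by a sender that is itself stuck further down.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical, simplified sender-side credit gate; NOT the actual FC code. */
class CreditGate {
    /** Hypothetical stand-in for passing a credit request down the protocol stack. */
    interface CreditRequester { void requestCredits(); }

    private final ReentrantLock lock = new ReentrantLock();
    private final Condition creditsAvailable = lock.newCondition();
    private final CreditRequester requester;
    private long credits;

    CreditGate(long initialCredits, CreditRequester requester) {
        this.credits = initialCredits;
        this.requester = requester;
    }

    /** Blocks until 'needed' credits are available, then consumes them. */
    void acquire(long needed) throws InterruptedException {
        lock.lock();
        try {
            while (credits < needed) {
                // Release the lock before calling down the stack, so the
                // receiver thread can take it in replenish() meanwhile.
                lock.unlock();
                try {
                    requester.requestCredits();
                } finally {
                    lock.lock(); // re-acquire; lock() is uninterruptible
                }
                // Credits may have arrived while the lock was released.
                if (credits < needed) {
                    // Assumed interval: wake up and re-request after 500 ms.
                    creditsAvailable.await(500, TimeUnit.MILLISECONDS);
                }
            }
            credits -= needed;
        } finally {
            lock.unlock();
        }
    }

    /** Called by the receiver thread when a credit message arrives. */
    void replenish(long amount) {
        lock.lock();
        try {
            credits += amount;
            creditsAvailable.signalAll();
        } finally {
            lock.unlock();
        }
    }

    long available() {
        lock.lock();
        try {
            return credits;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final CreditGate[] gate = new CreditGate[1];
        // The requester waits for another thread that needs the gate's lock.
        // This would hang if acquire() still held the lock during the call.
        gate[0] = new CreditGate(0, () -> {
            Thread receiver = new Thread(() -> gate[0].replenish(10));
            receiver.start();
            try {
                receiver.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        gate[0].acquire(4);
        System.out.println("credits left: " + gate[0].available());
    }
}
```

In the sketch, the `main` method makes the credit request depend on a second thread that must take the same lock, which loosely mirrors how IncomingPacketHandler needed the FC mutex in the dump above. The unlock/lock pair around the callback is what lets that thread proceed.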

        Bela Ban
        added a comment -

        Tested on both a local laptop and the ATL cluster lab with 80M messages; no deadlocks occurred.


          People

          • Assignee:
            Bela Ban
            Reporter:
            Bela Ban
          • Votes:
            0
            Watchers:
            0