Details
-
Bug
-
Resolution: Done
-
Major
-
5.3.0.Final
-
None
Description
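ReplTotalOrderStateTransferFunctional1PcTest.testSTWithThirdWritingTxCache intermittently times out. It appears to deadlock because a VersionedPrepareCommand and the REBALANCE_START command are delivered in a different order on two nodes. NodeL-3115 processes REBALANCE_START first, so the transaction waits for the state transfer latch, while NodeK-31445 processes the prepare first, so state transfer waits for the transaction latch: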
01:27:31,740 TRACE (Incoming-2,ISPN,NodeL-3115:) [CommandAwareRpcDispatcher] Attempting to execute non-CacheRpcCommand command: CacheTopologyControlCommand{cache=repl-to-1pc-nbst, type=REBALANCE_START, sender=NodeK-31445, joinInfo=null, topologyId=3, currentCH=ReplicatedConsistentHash{members=[NodeK-31445, NodeL-3115], numSegments=1, primaryOwners=[0]}, pendingCH=ReplicatedConsistentHash{members=[NodeK-31445, NodeL-3115, NodeM-36388], numSegments=1, primaryOwners=[0]}, throwable=null, viewId=2} [sender=NodeK-31445]
01:27:31,740 TRACE (Incoming-1,ISPN,NodeK-31445:) [CommandAwareRpcDispatcher] Attempting to execute command: VersionedPrepareCommand {modifications=[PutKeyValueCommand{key=test12, value=12, flags=null, putIfAbsent=false, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, versionsSeen={test12=null}, gtx=GlobalTransaction:<NodeL-3115>:22544:local, cacheName='repl-to-1pc-nbst'} [sender=NodeL-3115]
01:27:31,740 TRACE (Incoming-3,ISPN,NodeL-3115:) [CommandAwareRpcDispatcher] Attempting to execute command: VersionedPrepareCommand {modifications=[PutKeyValueCommand{key=test12, value=12, flags=null, putIfAbsent=false, metadata=EmbeddedMetadata{version=null}, successful=true}], onePhaseCommit=false, versionsSeen={test12=null}, gtx=GlobalTransaction:<NodeL-3115>:22544:local, cacheName='repl-to-1pc-nbst'} [sender=NodeL-3115]
01:27:31,741 TRACE (Incoming-3,ISPN,NodeL-3115:) [TotalOrderManager] Transaction [NodeL-3115:22544] will wait for [TotalOrderLatchImpl{latch=java.util.concurrent.CountDownLatch@5868f0ca[Count = 1], name='StateTransfer-3'}] and locked [test12]
01:27:31,741 TRACE (Incoming-2,ISPN,NodeK-31445:) [CommandAwareRpcDispatcher] Attempting to execute non-CacheRpcCommand command: CacheTopologyControlCommand{cache=repl-to-1pc-nbst, type=REBALANCE_START, sender=NodeK-31445, joinInfo=null, topologyId=3, currentCH=ReplicatedConsistentHash{members=[NodeK-31445, NodeL-3115], numSegments=1, primaryOwners=[0]}, pendingCH=ReplicatedConsistentHash{members=[NodeK-31445, NodeL-3115, NodeM-36388], numSegments=1, primaryOwners=[0]}, throwable=null, viewId=2} [sender=NodeK-31445]
01:27:31,741 TRACE (Incoming-2,ISPN,NodeK-31445:repl-to-1pc-nbst) [TotalOrderManager] State Transfer start. It will wait for [TotalOrderLatchImpl{latch=java.util.concurrent.CountDownLatch@51988b1e[Count = 1], name='NodeL-3115:22544'}]
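The suspected circular wait can be illustrated with a small toy model. This is not Infinispan code: the class DeliveryOrderSketch and its methods are hypothetical, and the model assumes that on each node a command blocks on the latch of the command delivered just before it, and that a command only completes once it has been processed on every node. Under those assumptions, opposite delivery orders on two nodes produce a cycle in the wait-for graph.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of the wait-for relation; not part of Infinispan.
class DeliveryOrderSketch {

    // Assumption: on every node, each command waits for the command
    // delivered immediately before it on that node.
    static Map<String, List<String>> waitsFor(Map<String, List<String>> orderPerNode) {
        Map<String, List<String>> edges = new HashMap<>();
        for (List<String> order : orderPerNode.values()) {
            for (int i = 1; i < order.size(); i++) {
                edges.computeIfAbsent(order.get(i), k -> new ArrayList<>())
                     .add(order.get(i - 1));
            }
        }
        return edges;
    }

    // Depth-first search for a cycle in the wait-for graph.
    static boolean hasCycle(Map<String, List<String>> edges) {
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String start : edges.keySet()) {
            if (visit(start, edges, visited, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String cmd, Map<String, List<String>> edges,
                                 Set<String> visited, Set<String> onPath) {
        if (onPath.contains(cmd)) {
            return true;            // reached a command already on the current path
        }
        if (!visited.add(cmd)) {
            return false;           // already fully explored, no cycle through it
        }
        onPath.add(cmd);
        for (String next : edges.getOrDefault(cmd, List.of())) {
            if (visit(next, edges, visited, onPath)) {
                return true;
            }
        }
        onPath.remove(cmd);
        return false;
    }

    static boolean deadlocks(Map<String, List<String>> orderPerNode) {
        return hasCycle(waitsFor(orderPerNode));
    }

    public static void main(String[] args) {
        // Delivery orders as read from the TRACE log in this report.
        Map<String, List<String>> observed = Map.of(
            "NodeL-3115", List.of("REBALANCE_START", "VersionedPrepareCommand"),
            "NodeK-31445", List.of("VersionedPrepareCommand", "REBALANCE_START"));
        System.out.println("circular wait: " + deadlocks(observed));
    }
}
```

With the same delivery order on both nodes, the model reports no cycle, which is the behaviour total order delivery is expected to guarantee.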