Uploaded image for project: 'Infinispan'
  1. Infinispan
  2. ISPN-5885

NonTotalOrderPerCacheInvocationHandlerImpl should release locks

    XMLWordPrintable

Details

    Description

      Traditionally, the locking interceptor is the one that releases key locks. But after the ISPN-2849 fix, the locks are acquired by NonTotalOrderPerCacheInvocationHandlerImpl before the interceptor chain even starts executing. If there's an exception between the lock acquisition and the locking interceptor, the locks will never be released.

      There is currently one situation where I have reproduced this, in ThreeNodesReplicatedSplitAndMergeTest.testSplitAndMerge0. Node C is split in a partition by itself, and then it is merged with the AB partition. After the merge, JGroups sometimes replays the put command broadcasted in the AB partition on C. C's cache is still in degraded mode, so the write fails, but its lock is never released.

      12:53:41,088 INFO  (testng-ThreeNodesReplicatedSplitAndMergeTest:[]) [JGroupsTransport] ISPN000093: Received new, MERGED cluster view for channel ISPN: MergeView::[NodeA-44116|10] (3) [NodeA-44116, NodeB-58097, NodeC-3920], 2 subgroups: [NodeA-44116|8] (2) [NodeA-44116, NodeB-58097], [NodeC-3920|9] (1) [NodeC-3920]
      12:53:42,143 TRACE (OOB-1,NodeC-3920:[]) [GlobalInboundInvocationHandler] Attempting to execute CacheRpcCommand: SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}} [sender=NodeB-58097]
      12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [DefaultLockManager] Lock key=MagicKey#null{ced8b1f2@NodeC-3920/9} for owner=CommandUUID{address=NodeB-58097, id=48899}. timeout=10000 (MILLISECONDS)
      12:53:42,144 TRACE (OOB-1,NodeC-3920:[]) [InfinispanLock] LockPlaceHolder{lockState=ACQUIRED, owner=CommandUUID{address=NodeB-58097, id=48899}} successfully acquired the lock.
      12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true} and InvocationContext [org.infinispan.context.SingleKeyNonTxInvocationContext@67bb1b50]
      12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Checking availability for key=MagicKey#null{ced8b1f2@NodeC-3920/9}, status=DEGRADED_MODE
      12:53:42,144 TRACE (remote-thread-NodeC-p36993-t5:[]) [PartitionHandlingManagerImpl] Partition is in DEGRADED_MODE mode, access is not allowed for key MagicKey#null{ced8b1f2@NodeC-3920/9}
      12:53:42,144 ERROR (remote-thread-NodeC-p36993-t5:[]) [InvocationContextInterceptor] ISPN000136: Execution error
      org.infinispan.partitionhandling.AvailabilityException: ISPN000306: Key 'MagicKey#null{ced8b1f2@NodeC-3920/9}' is not available. Not all owners are in this partition
      12:53:42,144 WARN  (remote-thread-NodeC-p36993-t5:[]) [NonTotalOrderPerCacheInboundInvocationHandler] ISPN000071: Caught exception when handling command SingleRpcCommand{cacheName='___defaultcache', command=PutKeyValueCommand{key=MagicKey#null{ced8b1f2@NodeC-3920/9}, value=v22, flags=null, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}}
      

      Of course, knowing that C is in degraded mode, perhaps we should not try to acquire the lock at all. But then we'd move even more responsibility into the invocation handler. There's also the possibility of having exceptions caused by topology changes or programming errors, so a solution that handles any exception would be preferable.

      Attachments

        Activity

          People

            pruivo@redhat.com Pedro Ruivo
            dberinde@redhat.com Dan Berindei (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: