Uploaded image for project: 'Infinispan'
  1. Infinispan
  2. ISPN-10366

ScatteredStateConsumerImpl sets segment state to OWNED before applying values

    XMLWordPrintable

Details

    • Bug
    • Resolution: Obsolete
    • Major
    • None
    • 10.0.0.Beta3, 9.4.15.Final
    • Core, Test Suite
    • DataGrid Sprint #30

    Description

      ScatteredStateConsumerImpl uses InboundTransferTask only to request keys, then after it received all the keys of a segment it changes the segment state to VALUE_TRANSFER and starts an asynchronous request to fetch the values and replace the RemoteMetadata entries with real entries.

      ScatteredStateConsumerImpl.chunkCounter is supposed to delay the state transfer end and the segment state change to OWNED, but in rare occasions this doesn't happen.

      This happened in PrefetchTest.testPrefetch12 while running the test suite with taskset -c 1-2:

      21:54:43,304 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Received new topology for cache ___defaultcache, isRebalance = true, isMember = true, topology = CacheTopology{id=9, phase=TRANSITORY, rebalanceId=5, currentCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=false, owners = (2)[Test-NodeA-39104: 1, Test-NodeC-3746: 0]}, pendingCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=true, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}, unionCH=PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=false, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}, actualMembers=[Test-NodeA-39104, Test-NodeC-3746], persistentUUIDs=[f58e0a9a-dd4e-429a-8464-da64bf001d4e, 1471096f-c59a-4dc9-8f4d-31fbf399a2aa]}
      21:54:43,305 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[StateRequest-___defaultcache]) [ScatteredStateConsumerImpl] Requesting keys for segments {0} from Test-NodeA-39104
      21:54:43,313 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Topology update processed, stateTransferTopologyId = 9, startRebalance = true, pending CH = PartitionerConsistentHash:ScatteredConsistentHash{ns=1, rebalanced=true, owners = (2)[Test-NodeA-39104: 0, Test-NodeC-3746: 1]}
      21:54:43,313 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateTransferLockImpl] Signalling transaction data received for topology 9
      21:54:43,313 TRACE (remote-thread-Test-NodeC-p69905-t2:[]) [TrianglePerCacheInboundInvocationHandler] Calling perform() on StateResponseCommand{cache=___defaultcache, pushTransfer=false, stateChunks=[StateChunk{segmentId=0, cacheEntries=1, isLastChunk=true}], origin=Test-NodeA-39104, topologyId=9, applyState=true}
      21:54:43,313 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Applying new state chunk for segment 0 of cache ___defaultcache from node Test-NodeA-39104: received 1 cache entries
      21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredVersionManagerImpl] Finished transfer for segment 0 = KEY_TRANSFER -> VALUE_TRANSFER
      21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredVersionManagerImpl] Node Test-NodeC-3746, segment 0 has all keys in, expects value transfer
      21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Requesting values from segments {0}, for in-memory keys
      21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Retrieving values, chunk counter is 1
      21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [JGroupsTransport] Test-NodeC-3746 sending request 11 to Test-NodeA-39104: ClusteredGetAllCommand{keys=[key], flags=[SKIP_OWNERSHIP_CHECK], topologyId=9}
      21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Invalidating versions on Test-NodeC-3746, chunk counter incremented to 2
      21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Versions invalidated on Test-NodeC-3746, chunk counter decremented to 1
      21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Removing inbound transfers from node {0} for segments Test-NodeA-39104
      21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [ScatteredStateConsumerImpl] Inbound transfer removed, chunk counter is 1
      21:54:43,314 TRACE (stateTransferExecutor-thread-Test-NodeC-p69908-t6:[]) [StateConsumerImpl] Latch 0
      21:54:43,315 TRACE (jgroups-7,Test-NodeC-3746:[]) [JGroupsTransport] Test-NodeC-3746 received response for request 11 from Test-NodeA-39104: SuccessfulResponse([MetadataImmortalCacheValue {value=v0, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=7, version=1}}}])
      21:54:43,316 TRACE (jgroups-7,Test-NodeC-3746:[]) [BlockingInterceptor] Command blocking before completion of PutKeyValueCommand{key=key, value=v0, flags=[CACHE_MODE_LOCAL, SKIP_REMOTE_LOOKUP, PUT_FOR_STATE_TRANSFER, SKIP_SHARED_CACHE_STORE, SKIP_OWNERSHIP_CHECK, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], commandInvocationId=CommandInvocation:Test-NodeC-3746:121294, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=InternalMetadataImpl{actual=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=7, version=1}}, created=-1, lastUsed=-1}, successful=true, topologyId=-1}
      21:54:43,316 TRACE (remote-thread-Test-NodeC-p69905-t2:[___defaultcache]) [StateConsumerImpl] After applying the received state the data container of cache ___defaultcache has 1 keys
      21:54:43,316 TRACE (remote-thread-Test-NodeC-p69905-t2:[___defaultcache]) [StateConsumerImpl] Segments not received yet for cache ___defaultcache: {}
      21:54:43,316 DEBUG (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Finished receiving of segments for cache ___defaultcache for topology 9.
      21:54:43,316 DEBUG (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [ScatteredVersionManagerImpl] Node Test-NodeC-3746 received values for all segments in topology 9
      21:54:43,316 TRACE (transport-thread-Test-NodeC-p69907-t5:[Topology-___defaultcache]) [StateConsumerImpl] Stop keeping track of changed keys for state transfer in topology 9
      

      The test then starts a put operation and expects it to prefetch the previous value, but because the segment is OWNED, the RemoteMetadata is ignored:

      21:54:43,316 TRACE (ForkThread-1,Test:[]) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=key, value=v1, flags=[], commandInvocationId=CommandInvocation:Test-NodeC-3746:121295, putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true, topologyId=-1} and InvocationContext [SingleKeyNonTxInvocationContext{isLocked=false, key=null, cacheEntry=null, origin=null, lockOwner=CommandInvocation:Test-NodeC-3746:121295}]
      21:54:43,316 TRACE (ForkThread-1,Test:[]) [EntryFactoryImpl] Retrieved from container MetadataImmortalCacheEntry{key=key, value=null, metadata=RemoteMetadata{address=Test-NodeA-39104, version=1}}
      21:54:43,316 TRACE (ForkThread-1,Test:[]) [ScatteredDistributionInterceptor] Committing entry RepeatableReadEntry(108d175b){key=key, value=v1, isCreated=false, isChanged=true, isRemoved=false, isExpired=false, skipLookup=true, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=SimpleClusteredVersion{topologyId=9, version=1}}}, replaced MetadataImmortalCacheEntry{key=key, value=null, metadata=RemoteMetadata{address=Test-NodeA-39104, version=1}}
      21:54:53,316 ERROR (testng-Test:[]) [TestSuiteProgress] Test failed: org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch12
      org.infinispan.test.TestException: java.util.concurrent.TimeoutException
      	at org.infinispan.util.ControlledRpcManager.uncheckedGet(ControlledRpcManager.java:259) ~[test-classes/:?]
      	at org.infinispan.util.ControlledRpcManager.expectCommand(ControlledRpcManager.java:124) ~[test-classes/:?]
      	at org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch(PrefetchTest.java:110) ~[test-classes/:?]
      	at org.infinispan.scattered.statetransfer.PrefetchTest.testPrefetch12(PrefetchTest.java:67) ~[test-classes/:?]
      

      On a related note, StateConsumerImpl.applyState(pushTransfer=true) initializes a CountDownLatch(stateChunks.size()), but doesn't actually count down if stateChunk.getCacheEntries() == null, potentially hanging state transfer until it times out.

      Attachments

        Activity

          People

            rvansa1@redhat.com Radim Vansa (Inactive)
            dberinde@redhat.com Dan Berindei (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: