Infinispan / ISPN-8092

Scattered cache state transfer misses segments

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Minor
    • Fix Version/s: 9.1.1.Final
    • Affects Version/s: 9.1.0.Final
    • Component/s: Core
    • Labels: None

    Description

      I noticed this in the pull request for ISPN-7997, which uses a ControlledConsistentHashFactory to make the stream tests more predictable.

      For simplicity, I used 3 segments, and the ownership is as follows:

      • With a full cluster ABC, A owns segment 0, B owns segment 1, and C owns segment 2.
      • With a smaller cluster (A, AB, or AC), A owns all the segments.
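
The ownership rule above can be written down as a small lookup. This is only a sketch of the mapping described in the two bullets, not Infinispan code: the function name `primary_owners` and the single-letter node names are made up for illustration, and the real test configures this through ControlledConsistentHashFactory.

```python
# Hypothetical model of the test's segment ownership (3 segments, nodes A, B, C).
# It mirrors the two bullets above and does not call any Infinispan API.
FULL_CLUSTER = frozenset("ABC")


def primary_owners(members):
    """Return {segment: owner} for the rebalanced consistent hash."""
    members = frozenset(members)
    if members == FULL_CLUSTER:
        # Full cluster: one segment per node.
        return {0: "A", 1: "B", 2: "C"}
    if "A" not in members or not members <= FULL_CLUSTER:
        # The description only covers clusters A, AB, AC and ABC.
        raise ValueError("unsupported membership: %s" % sorted(members))
    # Any smaller cluster that contains A: A owns every segment.
    return {0: "A", 1: "A", 2: "A"}


print(primary_owners("ABC"))
print(primary_owners("AC"))
```

Note that this is the ownership after a rebalance. Right after B is killed, the log below shows an intermediate, not yet rebalanced hash in which A owns segments 0 and 1 while C still owns segment 2.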

      ScatteredStreamIteratorTest.verifyNodeLeavesAfterSendingBackSomeData[SCATTERED_SYNC, tx=false] kills node B, and A immediately becomes the owner of segment 1. Then the rebalance starts and A requests segment 2 from node C (see the START_KEYS_TRANSFER request in the log below), but A never tries to fetch the segment 1 entries that were backed up on node C.

      17:35:09,897 DEBUG (remote-thread-test-NodeA-p2-t5:[testCache]) [ClusterTopologyManagerImpl] Updating cluster-wide current topology for cache testCache, topology = CacheTopology{id=7, rebalanceId=4, currentCH=ScatteredConsistentHash{ns=3, rebalanced=true, owners = (3)[test-NodeA-59810: 1, test-NodeB-37315: 1, test-NodeC-50539: 1]}, pendingCH=null, unionCH=null, phase=NO_REBALANCE, actualMembers=[test-NodeA-59810, test-NodeB-37315, test-NodeC-50539], persistentUUIDs=[6118c3ba-840e-4838-a0cf-1165d3d5ec4b, 38cc2bd9-0a21-4020-97ab-909a32506fa1, 6a4f1a13-0fbb-4f92-867e-64068d574d4d]}, availability mode = AVAILABLE
      
      17:35:10,974 DEBUG (remote-thread-test-NodeA-p2-t5:[testCache]) [ClusterTopologyManagerImpl] Updating cluster-wide current topology for cache testCache, topology = CacheTopology{id=8, rebalanceId=4, currentCH=ScatteredConsistentHash{ns=3, rebalanced=false, owners = (2)[test-NodeA-59810: 2, test-NodeC-50539: 1]}, pendingCH=null, unionCH=null, phase=NO_REBALANCE, actualMembers=[test-NodeA-59810, test-NodeC-50539], persistentUUIDs=[6118c3ba-840e-4838-a0cf-1165d3d5ec4b, 6a4f1a13-0fbb-4f92-867e-64068d574d4d]}, availability mode = AVAILABLE
      17:35:10,975 TRACE (transport-thread-test-NodeA-p4-t2:[Topology-testCache]) [StateConsumerImpl] On cache testCache we have: new segments: [0, 1]; old segments: [0]
      17:35:10,975 TRACE (transport-thread-test-NodeA-p4-t2:[Topology-testCache]) [StateConsumerImpl] On cache testCache we have: added segments: {1}; removed segments: {}
      17:35:10,975 TRACE (transport-thread-test-NodeA-p4-t2:[Topology-testCache]) [StateConsumerImpl] This is not a rebalance, not doing anything...
      
      17:35:10,976 INFO  (remote-thread-test-NodeA-p2-t5:[testCache]) [CLUSTER] ISPN000310: Starting cluster-wide rebalance for cache testCache, topology CacheTopology{id=9, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=3, rebalanced=false, owners = (2)[test-NodeA-59810: 2, test-NodeC-50539: 1]}, pendingCH=ScatteredConsistentHash{ns=3, rebalanced=true, owners = (2)[test-NodeA-59810: 3, test-NodeC-50539: 0]}, unionCH=null, phase=TRANSITORY, actualMembers=[test-NodeA-59810, test-NodeC-50539], persistentUUIDs=[6118c3ba-840e-4838-a0cf-1165d3d5ec4b, 6a4f1a13-0fbb-4f92-867e-64068d574d4d]}
      17:35:10,977 TRACE (transport-thread-test-NodeA-p4-t3:[Topology-testCache]) [CacheTopology] Current consistent hash's routing table: 0: 0, 1: 0, 2: 1
      17:35:10,977 TRACE (transport-thread-test-NodeA-p4-t3:[Topology-testCache]) [CacheTopology] Pending consistent hash's routing table: 0: 0, 1: 0, 2: 0
      17:35:10,978 TRACE (transport-thread-test-NodeA-p4-t3:[Topology-testCache]) [ScatteredVersionManager] Node will transfer value for topology 9
      17:35:10,978 TRACE (transport-thread-test-NodeA-p4-t3:[Topology-testCache]) [StateConsumerImpl] On cache testCache we have: new segments: [0, 1, 2]; old segments: [0, 1]
      17:35:10,978 TRACE (transport-thread-test-NodeA-p4-t3:[Topology-testCache]) [StateConsumerImpl] On cache testCache we have: added segments: {2}; removed segments: {}
      17:35:10,979 TRACE (transport-thread-test-NodeA-p4-t3:[Topology-testCache]) [JGroupsTransport] test-NodeA-59810 sending request 9 to all: StateRequestCommand{cache=testCache, origin=test-NodeA-59810, type=CONFIRM_REVOKED_SEGMENTS, topologyId=9, segments=null}
      17:35:10,989 TRACE (transport-thread-test-NodeA-p4-t3:[Topology-testCache]) [StateProviderImpl] Segments to replicate and invalidate: [0, 1]
      17:35:10,989 TRACE (transport-thread-test-NodeA-p4-t1:[]) [OutboundTransferTask] Sending last chunk to node test-NodeC-50539 containing 0 cache entries from segments [0, 1]
      17:35:10,989 TRACE (stateTransferExecutor-thread-test-NodeA-p7-t1:[StateRequest-testCache]) [RpcManagerImpl] test-NodeA-59810 invoking StateRequestCommand{cache=testCache, origin=test-NodeA-59810, type=START_KEYS_TRANSFER, topologyId=9, segments={2}} to recipient list [test-NodeC-50539] with options RpcOptions{timeout=240000, unit=MILLISECONDS, deliverOrder=NONE, responseFilter=null, responseMode=SYNCHRONOUS_IGNORE_LEAVERS}
      
      17:35:11,017 INFO  (transport-thread-test-NodeA-p4-t4:[testCache]) [CLUSTER] ISPN000336: Finished cluster-wide rebalance for cache testCache, topology id = 9
      17:35:11,017 DEBUG (transport-thread-test-NodeA-p4-t4:[testCache]) [ClusterTopologyManagerImpl] Updating cluster-wide current topology for cache testCache, topology = CacheTopology{id=10, rebalanceId=5, currentCH=ScatteredConsistentHash{ns=3, rebalanced=true, owners = (2)[test-NodeA-59810: 3, test-NodeC-50539: 0]}, pendingCH=null, unionCH=null, phase=NO_REBALANCE, actualMembers=[test-NodeA-59810, test-NodeC-50539], persistentUUIDs=[6118c3ba-840e-4838-a0cf-1165d3d5ec4b, 6a4f1a13-0fbb-4f92-867e-64068d574d4d]}, availability mode = AVAILABLE
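
The sequence in the log can be replayed with a simplified model to show why segment 1 falls through. This is a sketch of the behaviour inferred from the StateConsumerImpl trace lines, not the actual implementation: the function names and the `(topology_id, is_rebalance, owned_segments)` tuples are assumptions made for illustration.

```python
# Simplified model of node A's view of topologies 7, 8 and 9 from the log.
# Assumption: added segments are only requested when the update is a rebalance.

def segment_diff(old, new):
    """Return (added, removed) segments between two ownership lists."""
    return set(new) - set(old), set(old) - set(new)


def segments_requested(updates):
    """Replay topology updates and collect the segments the node asks for."""
    requested = set()
    owned = updates[0][2]
    for _topology_id, is_rebalance, new_owned in updates[1:]:
        added, _removed = segment_diff(owned, new_owned)
        if is_rebalance:
            requested |= added
        # Otherwise: "This is not a rebalance, not doing anything..."
        owned = new_owned
    return requested


NODE_A_UPDATES = [
    (7, False, [0]),        # full cluster ABC
    (8, False, [0, 1]),     # B left, A takes over segment 1 without a rebalance
    (9, True, [0, 1, 2]),   # rebalance, A also becomes owner of segment 2
]

requested = segments_requested(NODE_A_UPDATES)
gained = set(NODE_A_UPDATES[-1][2]) - set(NODE_A_UPDATES[0][2])
missed = gained - requested

print("requested:", sorted(requested))  # requested: [2]
print("missed:", sorted(missed))        # missed: [1]
```

In this model the diff is always taken against the previous topology, so a segment gained in a non-rebalance update (segment 1 in topology 8) is no longer "added" by the time the rebalance arrives, which matches the `added segments: {2}` line for topology 9.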
      
      
      

          People

            Assignee: Dan Berindei (Inactive), dberinde@redhat.com
            Reporter: Dan Berindei (Inactive), dberinde@redhat.com