Infinispan / ISPN-7135

ClusterListenerReplTest.testPrimaryOwnerGoesDownBeforeBackupRaisesEvent random failures


Details

    Description

      The test assumes that, because the primary owner is killed before the backups have updated the entry, the clustered listener will receive only a single CACHE_ENTRY_CREATED event.

      However, the crash and the subsequent rebalance require 3 topology updates, so the command can be retried a second time, after the backup has already applied the new value. In the log below, NodeDS retries the put with topology 7 and the new primary NodeDU forwards it to the backup (also NodeDS), which applies the entry; the command then hits topology 8, fails with an OutdatedTopologyException, and is retried once more. As a result, the clustered listener on NodeDS receives both a CACHE_ENTRY_CREATED event and a CACHE_ENTRY_MODIFIED event.

      10:35:47,429 TRACE (ForkThread-2,Test:[]) [BaseDistributionInterceptor] Command topology id is 5, current topology id is 5, successful? true
      10:35:47,429 TRACE (ForkThread-2,Test:[]) [BaseDistributionInterceptor] I'm not the primary owner, so sending the command to the primary owner(Test-NodeDT-11194) in order to be forwarded
      10:35:47,496 DEBUG (remote-thread-Test-NodeDS-p31418-t6:[cluster-listener]) [ClusterTopologyManagerImpl] Updating cluster-wide current topology for cache cluster-listener, topology = CacheTopology{id=6, rebalanceId=3, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeDS-11906, Test-NodeDU-38945], persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb, cd827dd6-62d9-4f78-a28e-4d10dc77a862]}, availability mode = AVAILABLE
      10:35:47,502 TRACE (transport-thread-Test-NodeDS-p31420-t1:[Topology-cluster-listener]) [StateConsumerImpl] Received new topology for cache cluster-listener, isRebalance = false, isMember = true, topology = CacheTopology{id=6, rebalanceId=3, currentCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeDS-11906, Test-NodeDU-38945], persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb, cd827dd6-62d9-4f78-a28e-4d10dc77a862]}
      10:35:47,514 INFO  (remote-thread-Test-NodeDS-p31418-t6:[cluster-listener]) [CLUSTER] ISPN000310: Starting cluster-wide rebalance for cache cluster-listener, topology CacheTopology{id=7, rebalanceId=4, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]}, pendingCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 134, Test-NodeDU-38945: 122]}, unionCH=null, actualMembers=[Test-NodeDS-11906, Test-NodeDU-38945], persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb, cd827dd6-62d9-4f78-a28e-4d10dc77a862]}
      10:35:47,526 TRACE (transport-thread-Test-NodeDS-p31420-t2:[Topology-cluster-listener]) [StateConsumerImpl] Received new topology for cache cluster-listener, isRebalance = true, isMember = true, topology = CacheTopology{id=7, rebalanceId=4, currentCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]}, pendingCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 134, Test-NodeDU-38945: 122]}, unionCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]}, actualMembers=[Test-NodeDS-11906, Test-NodeDU-38945], persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb, cd827dd6-62d9-4f78-a28e-4d10dc77a862]}
      10:35:47,527 TRACE (transport-thread-Test-NodeDU-p31480-t5:[Topology-cluster-listener]) [StateConsumerImpl] Received new topology for cache cluster-listener, isRebalance = false, isMember = true, topology = CacheTopology{id=6, rebalanceId=3, currentCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeDS-11906, Test-NodeDU-38945], persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb, cd827dd6-62d9-4f78-a28e-4d10dc77a862]}
      *** 10:35:47,555 TRACE (remote-thread-Test-NodeDS-p31418-t6:[]) [StateTransferInterceptor] Retrying command because of topology change, current topology is 7: PutKeyValueCommand{key=MagicKey{1088/5C01D0B5/73@Test-NodeDT-11194}, value=first-value, flags=[], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}
      10:35:47,555 TRACE (remote-thread-Test-NodeDS-p31418-t6:[]) [BaseDistributionInterceptor] I'm not the primary owner, so sending the command to the primary owner(Test-NodeDU-38945) in order to be forwarded
      10:35:47,558 TRACE (transport-thread-Test-NodeDU-p31480-t5:[Topology-cluster-listener]) [StateConsumerImpl] Received new topology for cache cluster-listener, isRebalance = true, isMember = true, topology = CacheTopology{id=7, rebalanceId=4, currentCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]}, pendingCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 134, Test-NodeDU-38945: 122]}, unionCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 133, Test-NodeDU-38945: 123]}, actualMembers=[Test-NodeDS-11906, Test-NodeDU-38945], persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb, cd827dd6-62d9-4f78-a28e-4d10dc77a862]}
      10:35:47,600 DEBUG (remote-thread-Test-NodeDS-p31418-t5:[cluster-listener]) [ClusterTopologyManagerImpl] Updating cluster-wide current topology for cache cluster-listener, topology = CacheTopology{id=8, rebalanceId=4, currentCH=ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 134, Test-NodeDU-38945: 122]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeDS-11906, Test-NodeDU-38945], persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb, cd827dd6-62d9-4f78-a28e-4d10dc77a862]}, availability mode = AVAILABLE
      10:35:47,603 TRACE (remote-thread-Test-NodeDS-p31418-t6:[]) [ReadCommittedEntry] Updating entry (key=MagicKey{1088/5C01D0B5/73@Test-NodeDT-11194} removed=false valid=true changed=true created=true value=first-value metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, providedMetadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null})
      10:35:47,603 TRACE (transport-thread-Test-NodeDS-p31420-t6:[Topology-cluster-listener]) [StateConsumerImpl] Received new topology for cache cluster-listener, isRebalance = false, isMember = true, topology = CacheTopology{id=8, rebalanceId=4, currentCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 134, Test-NodeDU-38945: 122]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeDS-11906, Test-NodeDU-38945], persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb, cd827dd6-62d9-4f78-a28e-4d10dc77a862]}
      10:35:47,611 DEBUG (remote-thread-Test-NodeDS-p31418-t6:[]) [Test] Adding new cluster event EventImpl{type=CACHE_ENTRY_CREATED, pre=false, cache=Cache 'cluster-listener'@Test-NodeDS-11906, key=MagicKey{1088/5C01D0B5/73@Test-NodeDT-11194}, value=first-value, oldValue=null, transaction=null, originLocal=false, transactionSuccessful=false, entries=null, created=false}
      10:35:47,614 TRACE (transport-thread-Test-NodeDU-p31480-t1:[Topology-cluster-listener]) [StateConsumerImpl] Received new topology for cache cluster-listener, isRebalance = false, isMember = true, topology = CacheTopology{id=8, rebalanceId=4, currentCH=PartitionerConsistentHash:ReplicatedConsistentHash{ns = 256, owners = (2)[Test-NodeDS-11906: 134, Test-NodeDU-38945: 122]}, pendingCH=null, unionCH=null, actualMembers=[Test-NodeDS-11906, Test-NodeDU-38945], persistentUUIDs=[7ddc513a-2a94-4dc5-9918-7fb90a6510bb, cd827dd6-62d9-4f78-a28e-4d10dc77a862]}
      10:35:47,660 TRACE (OOB-1,Test-NodeDS-11906:[]) [AsyncInvocationStage] Exception from invocation handler
      org.infinispan.statetransfer.OutdatedTopologyException: Cache topology changed while the command was executing: expected 7, got 8
      *** 10:35:47,661 TRACE (OOB-1,Test-NodeDS-11906:[]) [StateTransferInterceptor] Retrying command because of topology change, current topology is 8: PutKeyValueCommand{key=MagicKey{1088/5C01D0B5/73@Test-NodeDT-11194}, value=first-value, flags=[COMMAND_RETRY], putIfAbsent=false, valueMatcher=MATCH_ALWAYS, metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, successful=true}
      10:35:47,661 TRACE (OOB-1,Test-NodeDS-11906:[]) [BaseDistributionInterceptor] I'm not the primary owner, so sending the command to the primary owner(Test-NodeDU-38945) in order to be forwarded
      10:35:47,745 TRACE (remote-thread-Test-NodeDS-p31418-t6:[]) [ReadCommittedEntry] Updating entry (key=MagicKey{1088/5C01D0B5/73@Test-NodeDT-11194} removed=false valid=true changed=true created=false value=first-value metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, providedMetadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null})
      10:35:47,746 DEBUG (remote-thread-Test-NodeDS-p31418-t6:[]) [Test] Adding new cluster event EventImpl{type=CACHE_ENTRY_MODIFIED, pre=false, cache=Cache 'cluster-listener'@Test-NodeDS-11906, key=MagicKey{1088/5C01D0B5/73@Test-NodeDT-11194}, value=first-value, oldValue=first-value, transaction=null, originLocal=false, transactionSuccessful=false, entries=null, created=false}
      10:35:47,771 TRACE (OOB-2,Test-NodeDU-38945:[]) [ReadCommittedEntry] Updating entry (key=MagicKey{1088/5C01D0B5/73@Test-NodeDT-11194} removed=false valid=true changed=true created=true value=first-value metadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null}, providedMetadata=EmbeddedExpirableMetadata{lifespan=-1, maxIdle=-1, version=null})
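      For illustration only (not the actual test code; the class and method names below are made up), here is a minimal clustered-listener sketch that records every event it receives. In the scenario above, such a listener ends up holding both a CACHE_ENTRY_CREATED and a CACHE_ENTRY_MODIFIED event for the same key after the retried put, which is exactly what breaks an assertion expecting a single CREATED event:

      import java.util.List;
      import java.util.concurrent.CopyOnWriteArrayList;

      import org.infinispan.Cache;
      import org.infinispan.notifications.Listener;
      import org.infinispan.notifications.cachelistener.annotation.CacheEntryCreated;
      import org.infinispan.notifications.cachelistener.annotation.CacheEntryModified;
      import org.infinispan.notifications.cachelistener.event.CacheEntryEvent;

      // Hypothetical listener, roughly equivalent to what the test registers.
      @Listener(clustered = true)
      public class RecordingClusterListener {

         // Clustered listeners deliver only post (isPre() == false) events,
         // all on the node where the listener was registered.
         private final List<CacheEntryEvent<Object, Object>> events = new CopyOnWriteArrayList<>();

         @CacheEntryCreated
         @CacheEntryModified
         public void onCacheEvent(CacheEntryEvent<Object, Object> event) {
            events.add(event);
         }

         public List<CacheEntryEvent<Object, Object>> getEvents() {
            return events;
         }

         // Registration is the standard Cache.addListener() call.
         public static RecordingClusterListener attachTo(Cache<Object, Object> cache) {
            RecordingClusterListener listener = new RecordingClusterListener();
            cache.addListener(listener);
            return listener;
         }
      }

      With the double retry shown in the log, getEvents() holds two entries for the same MagicKey (CREATED, then MODIFIED with the same value) rather than the single CREATED event the test expects.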
      


          People

            wburns@redhat.com Will Burns
            dberinde@redhat.com Dan Berindei (Inactive)
