Uploaded image for project: 'Infinispan'
  1. Infinispan
  2. ISPN-10093

PersistenceManagerImpl stop deadlock with topology update

    Details

      Description

      DistSyncStoreNotSharedTest.clearContent hanged in CI recently:

      "testng-DistSyncStoreNotSharedTest" #16 prio=5 os_prio=0 cpu=11511.26ms elapsed=435.14s tid=0x00007fdb710b6000 nid=0x3222 waiting on condition  [0x00007fdb352d3000]
         java.lang.Thread.State: WAITING (parking)
      	at jdk.internal.misc.Unsafe.park(java.base@11/Native Method)
      	- parking to wait for  <0x00000000c8a22450> (a java.util.concurrent.Semaphore$NonfairSync)
      	at java.util.concurrent.locks.LockSupport.park(java.base@11/LockSupport.java:194)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11/AbstractQueuedSynchronizer.java:885)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(java.base@11/AbstractQueuedSynchronizer.java:1009)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(java.base@11/AbstractQueuedSynchronizer.java:1324)
      	at java.util.concurrent.Semaphore.acquireUninterruptibly(java.base@11/Semaphore.java:504)
      	at org.infinispan.persistence.manager.PersistenceManagerImpl.stop(PersistenceManagerImpl.java:222)
      	at jdk.internal.reflect.GeneratedMethodAccessor72.invoke(Unknown Source)
      	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11/DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(java.base@11/Method.java:566)
      	at org.infinispan.commons.util.SecurityActions.lambda$invokeAccessibly$0(SecurityActions.java:79)
      	at org.infinispan.commons.util.SecurityActions$$Lambda$237/0x0000000100661c40.run(Unknown Source)
      	at org.infinispan.commons.util.SecurityActions.doPrivileged(SecurityActions.java:71)
      	at org.infinispan.commons.util.SecurityActions.invokeAccessibly(SecurityActions.java:76)
      	at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:181)
      	at org.infinispan.factories.impl.BasicComponentRegistryImpl.performStop(BasicComponentRegistryImpl.java:601)
      	at org.infinispan.factories.impl.BasicComponentRegistryImpl.stopWrapper(BasicComponentRegistryImpl.java:590)
      	at org.infinispan.factories.impl.BasicComponentRegistryImpl.stop(BasicComponentRegistryImpl.java:461)
      	at org.infinispan.factories.AbstractComponentRegistry.internalStop(AbstractComponentRegistry.java:431)
      	at org.infinispan.factories.AbstractComponentRegistry.stop(AbstractComponentRegistry.java:366)
      	at org.infinispan.cache.impl.CacheImpl.performImmediateShutdown(CacheImpl.java:1160)
      	at org.infinispan.cache.impl.CacheImpl.stop(CacheImpl.java:1125)
      	at org.infinispan.cache.impl.AbstractDelegatingCache.stop(AbstractDelegatingCache.java:521)
      	at org.infinispan.manager.DefaultCacheManager.terminate(DefaultCacheManager.java:747)
      	at org.infinispan.manager.DefaultCacheManager.stopCaches(DefaultCacheManager.java:799)
      	at org.infinispan.manager.DefaultCacheManager.stop(DefaultCacheManager.java:775)
      	at org.infinispan.test.TestingUtil.killCacheManagers(TestingUtil.java:846)
      	at org.infinispan.test.MultipleCacheManagersTest.clearContent(MultipleCacheManagersTest.java:158)
      
      "persistence-thread-DistSyncStoreNotSharedTest-NodeB-p16432-t1" #53654 daemon prio=5 os_prio=0 cpu=1.26ms elapsed=301.93s tid=0x00007fdb3c3d8000 nid=0x8ef waiting on condition  [0x00007fdb00055000]
         java.lang.Thread.State: WAITING (parking)
      	at jdk.internal.misc.Unsafe.park(java.base@11/Native Method)
      	- parking to wait for  <0x00000000c8b1fb88> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
      	at java.util.concurrent.locks.LockSupport.park(java.base@11/LockSupport.java:194)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11/AbstractQueuedSynchronizer.java:885)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(java.base@11/AbstractQueuedSynchronizer.java:1009)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(java.base@11/AbstractQueuedSynchronizer.java:1324)
      	at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(java.base@11/ReentrantReadWriteLock.java:738)
      	at org.infinispan.persistence.manager.PersistenceManagerImpl.pollStoreAvailability(PersistenceManagerImpl.java:196)
      	at org.infinispan.persistence.manager.PersistenceManagerImpl$$Lambda$492/0x00000001007fb440.run(Unknown Source)
      	at java.util.concurrent.Executors$RunnableAdapter.call(java.base@11/Executors.java:515)
      	at java.util.concurrent.FutureTask.runAndReset(java.base@11/FutureTask.java:305)
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(java.base@11/ScheduledThreadPoolExecutor.java:305)
      
      "transport-thread-DistSyncStoreNotSharedTest-NodeB-p16424-t5" #53646 daemon prio=5 os_prio=0 cpu=3.15ms elapsed=301.94s tid=0x00007fdb2007a000 nid=0x8e8 waiting on condition  [0x00007fdb0b406000]
         java.lang.Thread.State: WAITING (parking)
      	at jdk.internal.misc.Unsafe.park(java.base@11/Native Method)
      	- parking to wait for  <0x00000000c8d2abb0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
      	at java.util.concurrent.locks.LockSupport.park(java.base@11/LockSupport.java:194)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(java.base@11/AbstractQueuedSynchronizer.java:2081)
      	at io.reactivex.internal.operators.flowable.BlockingFlowableIterable$BlockingFlowableIterator.hasNext(BlockingFlowableIterable.java:94)
      	at io.reactivex.Flowable.blockingForEach(Flowable.java:5682)
      	at org.infinispan.statetransfer.StateConsumerImpl.removeStaleData(StateConsumerImpl.java:1011)
      	at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:453)
      	at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:202)
      	at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:58)
      	at org.infinispan.statetransfer.StateTransferManagerImpl$1.updateConsistentHash(StateTransferManagerImpl.java:114)
      	at org.infinispan.topology.LocalTopologyManagerImpl.resetLocalTopologyBeforeRebalance(LocalTopologyManagerImpl.java:437)
      	at org.infinispan.topology.LocalTopologyManagerImpl.doHandleRebalance(LocalTopologyManagerImpl.java:519)
      	- locked <0x00000000c8b30b30> (a org.infinispan.topology.LocalCacheStatus)
      	at org.infinispan.topology.LocalTopologyManagerImpl.lambda$handleRebalance$3(LocalTopologyManagerImpl.java:484)
      	at org.infinispan.topology.LocalTopologyManagerImpl$$Lambda$574/0x000000010089a040.run(Unknown Source)
      	at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:175)

      Full thread dump

      Somehow the producer thread for the transport-thread iteration is blocked, but without waiting for the persistence mutex. Maybe it's waiting for a topology? Not sure if it's relevant, but the last test to run was testClearWithFlag, so the data container was empty and the store had 5 entries.

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

                People

                • Assignee:
                  william.burns Will Burns
                  Reporter:
                  dan.berindei Dan Berindei
                • Votes:
                  0 Vote for this issue
                  Watchers:
                  3 Start watching this issue

                  Dates

                  • Created:
                    Updated: