  1. WildFly
  2. WFLY-1966

Hangs in DomainControllerMigrationTestCase

Details

    • Bug
    • Resolution: Done
    • Major
    • 8.0.0.CR1
    • None
    • Management
    • None

    Description

      DomainControllerMigrationTestCase has recently been hanging intermittently.

      I will attach thread dumps from a recent hang.

      The test driver is hanging here:

      "main" prio=10 tid=0xb7605c00 nid=0x7656 in Object.wait() [0xb7786000]
        java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xa99833f8> (a org.jboss.as.protocol.mgmt.ActiveOperationSupport$ActiveOperationImpl)
        at java.lang.Object.wait(Object.java:503)
        at org.jboss.threads.AsyncFutureTask.await(AsyncFutureTask.java:192)
        - eliminated <0xa99833f8> (a org.jboss.as.protocol.mgmt.ActiveOperationSupport$ActiveOperationImpl)
        at org.jboss.threads.AsyncFutureTask.get(AsyncFutureTask.java:266)
        - locked <0xa99833f8> (a org.jboss.as.protocol.mgmt.ActiveOperationSupport$ActiveOperationImpl)
        at org.jboss.as.controller.client.impl.AbstractDelegatingAsyncFuture.get(AbstractDelegatingAsyncFuture.java:100)
        at org.jboss.as.controller.client.impl.AbstractModelControllerClient.executeForResult(AbstractModelControllerClient.java:127)
        at org.jboss.as.controller.client.impl.AbstractModelControllerClient.execute(AbstractModelControllerClient.java:76)
        at org.jboss.as.controller.client.helpers.domain.impl.DomainClientImpl.execute(DomainClientImpl.java:86)
        at org.jboss.as.test.integration.domain.management.util.DomainLifecycleUtil.executeForResult(DomainLifecycleUtil.java:580)
        at org.jboss.as.test.integration.domain.DomainControllerMigrationTestCase.testDCFailover(DomainControllerMigrationTestCase.java:206)

      The master HC process is stuck here:

      "management-handler-thread - 2" prio=10 tid=0x8be50400 nid=0x1a92 waiting on condition [0x87604000]
        java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0xaa032fd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at org.jboss.as.controller.remote.BlockingQueueOperationListener.retrievePreparedOperation(BlockingQueueOperationListener.java:84)
        at org.jboss.as.domain.controller.operations.coordination.DomainSlaveHandler.execute(DomainSlaveHandler.java:117)
        at org.jboss.as.controller.AbstractOperationContext.executeStep(AbstractOperationContext.java:608)
        at org.jboss.as.controller.AbstractOperationContext.doCompleteStep(AbstractOperationContext.java:486)
        at org.jboss.as.controller.AbstractOperationContext.completeStepInternal(AbstractOperationContext.java:275)
        at org.jboss.as.controller.AbstractOperationContext.executeOperation(AbstractOperationContext.java:270)
        at org.jboss.as.controller.ModelControllerImpl.internalExecute(ModelControllerImpl.java:257)
        at org.jboss.as.controller.ModelControllerImpl.execute(ModelControllerImpl.java:142)
        at org.jboss.as.controller.remote.ModelControllerClientOperationHandler$ExecuteRequestHandler.doExecute(ModelControllerClientOperationHandler.java:205)
        at org.jboss.as.controller.remote.ModelControllerClientOperationHandler$ExecuteRequestHandler.access$300(ModelControllerClientOperationHandler.java:110)
        at org.jboss.as.controller.remote.ModelControllerClientOperationHandler$ExecuteRequestHandler$1$2.run(ModelControllerClientOperationHandler.java:157)
        at org.jboss.as.controller.remote.ModelControllerClientOperationHandler$ExecuteRequestHandler$1$2.run(ModelControllerClientOperationHandler.java:153)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.jboss.as.controller.remote.ModelControllerClientOperationHandler$ExecuteRequestHandler$1.execute(ModelControllerClientOperationHandler.java:153)
        at org.jboss.as.protocol.mgmt.AbstractMessageHandler$2$1.doExecute(AbstractMessageHandler.java:296)
        at org.jboss.as.protocol.mgmt.AbstractMessageHandler$AsyncTaskRunner.run(AbstractMessageHandler.java:518)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
        at org.jboss.threads.JBossThread.run(JBossThread.java:122)

      I see nothing relevant in the other thread dumps.

      The master HC is blocking waiting for a response from the slave HC.

      Just before the client makes the request that triggers this, it has reloaded the slave HC. The test is not robust in how it checks that the slave is ready before continuing: it merely connects directly to the slave and checks that the slave's root resource can be read. That check can succeed relatively early in boot, even before the slave has registered with the master.
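
A more robust readiness check would poll, with a deadline, for a condition that only holds once the slave has registered with the master. The following is a minimal sketch using only the JDK, not the code in the test suite: `ReadinessPoller` and `waitUntil` are hypothetical names, and the actual check (for example, asking the master whether it can see the slave host) is left to the caller as a `BooleanSupplier`.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of the WildFly test suite.
final class ReadinessPoller {

    private ReadinessPoller() {
    }

    /**
     * Repeatedly evaluates the check until it returns true or the timeout elapses.
     *
     * @return true if the check succeeded in time, false if the deadline passed first
     */
    static boolean waitUntil(BooleanSupplier check, long timeout, TimeUnit unit, long pollIntervalMillis)
            throws InterruptedException {
        final long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (true) {
            if (check.getAsBoolean()) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            // Sleep for the poll interval, but never past the deadline.
            long remainingMillis = Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
            Thread.sleep(Math.min(pollIntervalMillis, remainingMillis));
        }
    }
}
```

In the test, the supplier would need to query the master rather than the slave, since reading the slave's own root resource is exactly the check that succeeds too early.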

      Fixing the test may prevent the hangs, but it would likely paper over whatever underlying problem is causing them. A client invoking multi-host operations while a slave is booting and registering is a use case that must not hang.
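
The master's thread dump shows an unbounded `LinkedBlockingQueue.take()`, which never returns if the slave's response never arrives. As an illustration only, and not the fix applied for this issue, a bounded wait can be sketched with `BlockingQueue.poll(timeout, unit)`. `BoundedResponseWait` and `awaitResponse` are hypothetical names, and the choice of timeout is an assumption left to the caller.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper illustrating a bounded wait; not WildFly code.
final class BoundedResponseWait {

    private BoundedResponseWait() {
    }

    /**
     * Waits for the next queued response, failing with a TimeoutException
     * instead of blocking forever when nothing arrives in time.
     */
    static <T> T awaitResponse(BlockingQueue<T> responses, long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        // poll(timeout, unit) returns null on timeout, unlike take(), which blocks indefinitely.
        T response = responses.poll(timeout, unit);
        if (response == null) {
            throw new TimeoutException("No response within " + timeout + " " + unit);
        }
        return response;
    }
}
```

A timeout turns the hang into a reportable failure, but it does not explain why the slave's response goes missing during registration, so the underlying problem would still need to be found.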

          People

            darran.lofthouse@redhat.com Darran Lofthouse
            bstansbe@redhat.com Brian Stansberry