JBoss Enterprise Application Platform
JBEAP-13404

Failing Netty Transport Upgrade in JBoss EAP, Resulting in Blocked JGroups Discovery Threads and OutOfMemoryErrors

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Rejected
    • Affects Version/s: 7.0.z.GA
    • Fix Version/s: None
    • Component/s: ActiveMQ
    • Labels:
    • Target Release:
    • Steps to Reproduce:
      Reproducer in progress
    • Release Notes Text:
      Issue was due to a missing configuration property "ssl-enabled" on the connector. Closing.
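
      For reference, a minimal sketch of the kind of connector definition involved (the connector name, socket binding, and endpoint below are hypothetical; the key point from the resolution is the "ssl-enabled" parameter):

      ```xml
      <!-- messaging-activemq subsystem: an http-connector pointing at an
           SSL-terminated endpoint must declare ssl-enabled, otherwise the
           Netty HTTP upgrade handshake cannot complete. -->
      <http-connector name="http-connector" socket-binding="https" endpoint="http-acceptor">
          <param name="ssl-enabled" value="true"/>
      </http-connector>
      ```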

      Description

      In a multinode JBoss EAP / Artemis cluster (more than two nodes), after a cluster update, the JGroups discovery threads become blocked waiting for a lock held by the ActiveMQ server threads, which are repeatedly trying to upgrade the Netty connection and failing. JGroups messages pile up in the receiver, with the eventual result that the container fails with an OutOfMemoryError. [edit] This happens quickly in clustered configurations with more than one server per host controller, but can also happen in clusters of more than two nodes with one server per controller.[/edit] It is not detected as a deadlock, because the server thread is in a timed wait rather than blocked, but it re-acquires and holds the lock for as long as the connection upgrade keeps failing. Relevant stacks look like:

      "Thread-8 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$2@d85ed25-1490599274)" #154 prio=5 os_prio=0 tid=0x00007f1c6469e000 nid=0x3a5e waiting on condition [0x00007f1c1fa63000]
         java.lang.Thread.State: TIMED_WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x00000007862028c0> (a java.util.concurrent.CountDownLatch$Sync)
      	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
      	at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
      	at org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector$HttpUpgradeHandler.awaitHandshake(NettyConnector.java:765)
      	at org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector.createConnection(NettyConnector.java:664)
      	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.openTransportConnection(ClientSessionFactoryImpl.java:1009)
      	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.createTransportConnection(ClientSessionFactoryImpl.java:1051)
      	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.establishNewConnection(ClientSessionFactoryImpl.java:1230)
      	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.getConnection(ClientSessionFactoryImpl.java:867)
      	- locked <0x000000078074b620> (a java.lang.Object)
      	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.getConnectionWithRetry(ClientSessionFactoryImpl.java:769)
      	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.connect(ClientSessionFactoryImpl.java:238)
      	at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:760)
      	- locked <0x0000000772f8f010> (a org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl)
      	at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:617)
      	at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:598)
      	at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl$4.run(ServerLocatorImpl.java:562)
      	at org.apache.activemq.artemis.utils.OrderedExecutorFactory$OrderedExecutor$ExecutorTask.run(OrderedExecutorFactory.java:103)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      	at java.lang.Thread.run(Thread.java:748)
      
         Locked ownable synchronizers:
      	- <0x0000000772f92810> (a java.util.concurrent.ThreadPoolExecutor$Worker)
      
      "activemq-discovery-group-thread-dg-group1" #153 daemon prio=5 os_prio=0 tid=0x00007f1c6469c000 nid=0x3a5d waiting for monitor entry [0x00007f1c1fb65000]
         java.lang.Thread.State: BLOCKED (on object monitor)
      	at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connectorsChanged(ServerLocatorImpl.java:1431)
      	- waiting to lock <0x0000000772f8f010> (a org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl)
      	at org.apache.activemq.artemis.core.cluster.DiscoveryGroup.callListeners(DiscoveryGroup.java:355)
      	at org.apache.activemq.artemis.core.cluster.DiscoveryGroup.access$500(DiscoveryGroup.java:49)
      	at org.apache.activemq.artemis.core.cluster.DiscoveryGroup$DiscoveryRunnable.run(DiscoveryGroup.java:323)
      	at java.lang.Thread.run(Thread.java:748)
      
         Locked ownable synchronizers:
      	- None
      

      It is unclear why the connection upgrade fails in this configuration. I was suspicious of the misspelled "httpPpgradeEndpoint" constant used in a header, but it appears this way in both the Artemis and WildFly code, so both sides agree on the header name.
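
      The blocking pattern in the stack traces above can be reduced to a small sketch (the class, thread names, and timings below are hypothetical illustrations, not EAP code): one thread parks on a latch in a retry loop while holding a monitor, so the other thread is reported as BLOCKED rather than deadlocked.

      ```java
      import java.util.concurrent.CountDownLatch;
      import java.util.concurrent.TimeUnit;

      public class UpgradeStallSketch {
          // Stands in for the ServerLocatorImpl instance both threads contend on.
          static final Object locatorLock = new Object();
          // A handshake latch that is never counted down, like the failing
          // HTTP upgrade awaited in NettyConnector$HttpUpgradeHandler.
          static final CountDownLatch handshake = new CountDownLatch(1);

          /** Returns the discovery thread's state while the server thread holds the lock. */
          static Thread.State run() throws InterruptedException {
              Thread server = new Thread(() -> {
                  synchronized (locatorLock) {
                      for (int attempt = 0; attempt < 3; attempt++) {
                          try {
                              // TIMED_WAITING inside the monitor: no deadlock is
                              // reported, but the lock stays held across retries.
                              handshake.await(200, TimeUnit.MILLISECONDS);
                          } catch (InterruptedException e) {
                              Thread.currentThread().interrupt();
                              return;
                          }
                      }
                  }
              }, "activemq-server");

              Thread discovery = new Thread(() -> {
                  synchronized (locatorLock) {
                      // connectorsChanged() would run here once the lock is free.
                  }
              }, "discovery-group");

              server.start();
              Thread.sleep(100);  // let the server thread take the lock first
              discovery.start();
              Thread.sleep(100);  // give the discovery thread time to block
              Thread.State state = discovery.getState(); // BLOCKED (on object monitor)
              server.join();
              discovery.join();
              return state;
          }

          public static void main(String[] args) throws InterruptedException {
              System.out.println("discovery thread: " + run());
          }
      }
      ```

      In the real cluster the latch eventually times out and the retry loop starts again, so the discovery thread starves indefinitely while JGroups messages accumulate.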

              People

              • Assignee:
                hawkinsds Duane Hawkins
                Reporter:
                hawkinsds Duane Hawkins
              • Votes: 0
                Watchers: 2
