JBoss A-MQ / ENTMQ-1977

ActiveMQ NIO Worker thread throws many warnings "java.io.IOException: Connection timed out"

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Major
    • None
    • JBoss A-MQ 6.2.1
    • broker

      We have seen many "Connection timed out" warnings on some brokers. Here is one example:

      2016-06-28 08:05:43,452 [ActiveMQ NIO Worker 3] WARN Transport - Transport Connection to: tcp://xxx.xxx.xxx.xxx:50245 failed: java.io.IOException: Connection timed out
      

      We see thousands of messages like this every day, from different machines.
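To see which client machines account for most of these warnings, the broker log can be tallied by remote address. The following is a minimal sketch, assuming the log lines have exactly the format of the example above; the class name `TimeoutLogCounter` and the method `countByHost` are hypothetical helpers, not part of ActiveMQ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class TimeoutLogCounter {
    // Matches the transport failure message shown in the example log line.
    // Group 1 is the remote host, group 2 the remote port.
    private static final Pattern TIMED_OUT = Pattern.compile(
            "Transport Connection to: tcp://([^:\\s]+):(\\d+) failed: "
            + "java\\.io\\.IOException: Connection timed out");

    /** Counts "Connection timed out" transport failures per remote host. */
    public static Map<String, Integer> countByHost(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String line : lines) {
            Matcher m = TIMED_OUT.matcher(line);
            if (m.find()) {
                String host = m.group(1);
                Integer old = counts.get(host);
                counts.put(host, old == null ? 1 : old + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                // The example line from this report (address masked as in the original).
                "2016-06-28 08:05:43,452 [ActiveMQ NIO Worker 3] WARN Transport - "
                + "Transport Connection to: tcp://xxx.xxx.xxx.xxx:50245 failed: "
                + "java.io.IOException: Connection timed out",
                // A placeholder for any non-matching line; it is ignored.
                "some unrelated line");
        System.out.println(countByHost(sample));
    }
}
```

In practice the lines would be read from the broker log file rather than from a hard-coded list. Grouping by host rather than by host and port is a choice made here because the client port changes with every connection.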

      After changing the log4j configuration to log these failures at DEBUG level, we captured stack traces. They all look the same:

      2016-09-16 07:56:48,969 [ActiveMQ NIO Worker 10] DEBUG Transport - Transport Connection to: tcp://xxx.xxx.xxx.xx:43797 failed: java.io.IOException: Connection timed out
      java.io.IOException: Connection timed out
      	at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
      	at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
      	at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
      	at sun.nio.ch.IOUtil.read(IOUtil.java:197)
      	at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
      	at org.apache.activemq.transport.stomp.StompNIOTransport.serviceRead(StompNIOTransport.java:94)
      	at org.apache.activemq.transport.stomp.StompNIOTransport.access$000(StompNIOTransport.java:44)
      	at org.apache.activemq.transport.stomp.StompNIOTransport$1.onSelect(StompNIOTransport.java:69)
      	at org.apache.activemq.transport.nio.SelectorSelection.onSelect(SelectorSelection.java:97)
      	at org.apache.activemq.transport.nio.SelectorWorker$1.run(SelectorWorker.java:119)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      	at java.lang.Thread.run(Thread.java:745)
      

      The "IOException: Connection timed out" error occurs when the TCP retransmission limit is reached, for example because of packet loss. A retransmission timeout (RTO) occurs when the sender receives no acknowledgment for a TCP segment within the defined timeout and has to resend it. On our Linux machines the TCP parameters tcp_retries1 and tcp_retries2 have their default values of 3 and 15. The default tcp_retries2 value of 15 corresponds to roughly 900 seconds, which is a long time for retransmissions to remain unacknowledged.
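The 900-second figure can be reproduced with a back-of-the-envelope calculation. This is a minimal sketch, assuming the retransmission timeout starts at 200 ms and doubles after each unanswered retransmission up to a cap of 120 s (the usual Linux TCP_RTO_MIN and TCP_RTO_MAX values). The real initial RTO depends on the measured round-trip time, so actual timeouts will differ. The class and method names are illustrative only.

```java
public final class RtoEstimate {
    /**
     * Estimates how long TCP keeps retrying before giving up on a connection.
     * Assumption: the RTO starts at initialRtoMillis and doubles after every
     * unanswered retransmission, capped at maxRtoMillis.
     */
    public static long totalTimeoutMillis(int retries, long initialRtoMillis, long maxRtoMillis) {
        long total = 0;
        long rto = initialRtoMillis;
        // "retries" retransmissions mean retries + 1 waiting periods: one before
        // each retransmission and a final one before the connection is dropped.
        for (int i = 0; i <= retries; i++) {
            total += rto;
            rto = Math.min(rto * 2, maxRtoMillis);
        }
        return total;
    }

    public static void main(String[] args) {
        // tcp_retries2 = 15, assumed RTO range 200 ms .. 120 s
        long millis = totalTimeoutMillis(15, 200, 120_000);
        System.out.println("tcp_retries2=15 -> about " + millis / 1000.0 + " s");
    }
}
```

Under these assumptions, 15 retries add up to 924.6 seconds, which is consistent with the estimate of about 900 seconds given above.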

      Here is the java version we used:

      # java -version
      java version "1.7.0_79"
      Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
      Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
      

      There were a few Oracle JDK bugs related to ThreadPoolExecutor, but all of them had been fixed in JDK 1.7.

      We are using standalone ActiveMQ, version activemq-5.11.0.redhat-621107 (equivalent to JBoss A-MQ 6.2.1 R2).

      We suspected this JIRA could be the root cause:
      https://issues.jboss.org/browse/ENTMQ-1931

      We tried increasing "corePoolSize" to 100 so that the NIO thread pool created 100 threads up front. Unfortunately, it did not help: we still saw thousands of the warning messages in the broker log, even though the number of broker threads nearly doubled after the broker was restarted, which shows that the setting was in effect.
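For context, here is a minimal sketch of what a core pool size means in java.util.concurrent.ThreadPoolExecutor, the executor class visible in the stack trace above. It does not reproduce the broker's NIO selector pool, and whether the broker prestarts its threads is an assumption this report does not confirm. The class name `CorePoolSizeDemo` is hypothetical.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public final class CorePoolSizeDemo {
    /**
     * Returns the pool size observed right after construction and again
     * after explicitly prestarting all core threads.
     */
    public static int[] observePoolSizes(int corePoolSize) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                corePoolSize, corePoolSize, 10, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
        try {
            // A plain ThreadPoolExecutor creates core threads lazily, as tasks arrive.
            int beforePrestart = pool.getPoolSize();
            // Prestarting creates all core threads immediately; they then sit idle.
            pool.prestartAllCoreThreads();
            int afterPrestart = pool.getPoolSize();
            return new int[] {beforePrestart, afterPrestart};
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] sizes = observePoolSizes(100);
        System.out.println("before prestart: " + sizes[0] + ", after prestart: " + sizes[1]);
    }
}
```

A larger core pool only changes how many worker threads are available to service reads. It does not affect how the operating system decides that a TCP connection has timed out, which may be why the change had no visible effect on the warnings.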

      Any idea what is causing so many "Connection timed out" warning messages?

            Assignee: Unassigned
            Reporter: Joe Luo (rhn-support-qluo)