AMQ Broker / ENTMQBR-576

Broker having issues handling many durable messages - ConcurrentModificationException


Details

    Steps to Reproduce:
      1. start the broker
      2. create a durable queue
      3. use an amqp client to send many durable messages to that queue
        1. EITHER use the dtests QE client:
          $ ./aac5_sender.py  --broker-url "admin:admin@<broker_ip>:5672/test_node_many_durable_msg_queue" --log-msgs dict --count 400 --msg-content "durable-msg-7D6ZulAeprEcUCG54192" --msg-durable True
          
        2. OR modify one of the examples, e.g. simple_send.py (from python-qpid-proton-docs), to send durable messages, and run the client:
          40a41
          >             msg.durable = True
          
          $ python /usr/share/proton-0.16.0/examples/python/simple_send.py -m 400 -a "admin:admin@<broker_ip>:5672/test_node_many_durable_msg_queue"
          
      4. observe that the client cannot finish its execution and appears to hang
      5. the broker accepts only a few messages from the batch

    Description

      The following tests from the QE acceptance test suite recently started to fail with ER17:
      • JAMQNodeAac000Tests/test_node_many_durable_msg_queue (qpid-cpp)
      • JAMQNodeAac555Tests/test_node_many_durable_msg_queue (python Proton)
      • JAMQNodeAacbbbTests/test_node_many_durable_msg_queue (nodejs)
      The Java client is not affected.

      It looks like the broker has started having trouble handling many durable messages sent by AMQP senders (transient messages are handled fine). The broker apparently cannot keep up with the message rate: if a tiny delay is added between sends, it handles the load well. As a result, the client never finishes its execution. The issue is easily reproduced with the example clients (see steps to reproduce).

      In roughly half of the runs a message is rejected by the broker; in the other half it is not. When no message is rejected, the broker debug log contains nothing useful, but when the broker rejects a message, the following concurrency error is logged:

      14:33:17,164 INFO  [org.apache.activemq.artemis] AMQ241001: HTTP Server started at http://0.0.0.0:8161
      14:33:17,164 INFO  [org.apache.activemq.artemis] AMQ241002: Artemis Jolokia REST API available at http://0.0.0.0:8161/jolokia
      14:39:00,737 WARN  [org.apache.activemq.artemis.protocol.amqp.proton.ProtonServerReceiverContext] null: java.util.ConcurrentModificationException
              at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901) [rt.jar:1.8.0_121]
              at java.util.ArrayList$Itr.next(ArrayList.java:851) [rt.jar:1.8.0_121]
              at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.processRoute(PostOfficeImpl.java:1084) [artemis-server-2.0.0.amq-700002-redhat-1.jar:2.0.0.amq-700002-redhat-1]
              at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.route(PostOfficeImpl.java:784) [artemis-server-2.0.0.amq-700002-redhat-1.jar:2.0.0.amq-700002-redhat-1]
              at org.apache.activemq.artemis.core.postoffice.impl.PostOfficeImpl.route(PostOfficeImpl.java:689) [artemis-server-2.0.0.amq-700002-redhat-1.jar:2.0.0.amq-700002-redhat-1]
              at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.doSend(ServerSessionImpl.java:1662) [artemis-server-2.0.0.amq-700002-redhat-1.jar:2.0.0.amq-700002-redhat-1]
              at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.send(ServerSessionImpl.java:1346) [artemis-server-2.0.0.amq-700002-redhat-1.jar:2.0.0.amq-700002-redhat-1]
              at org.apache.activemq.artemis.protocol.amqp.broker.AMQPSessionCallback.serverSend(AMQPSessionCallback.java:397) [artemis-amqp-protocol-2.0.0.amq-700002-redhat-1.jar:]
              at org.apache.activemq.artemis.protocol.amqp.broker.AMQPSessionCallback.serverSend(AMQPSessionCallback.java:375) [artemis-amqp-protocol-2.0.0.amq-700002-redhat-1.jar:]
              at org.apache.activemq.artemis.protocol.amqp.proton.ProtonServerReceiverContext.onMessage(ProtonServerReceiverContext.java:168) [artemis-amqp-protocol-2.0.0.amq-700002-redhat-1.jar:]
              at org.apache.activemq.artemis.protocol.amqp.proton.AMQPConnectionContext$LocalListener.onDelivery(AMQPConnectionContext.java:427) [artemis-amqp-protocol-2.0.0.amq-700002-redhat-1.jar:]
              at org.apache.activemq.artemis.protocol.amqp.proton.handler.Events.dispatch(Events.java:97) [artemis-amqp-protocol-2.0.0.amq-700002-redhat-1.jar:]
              at org.apache.activemq.artemis.protocol.amqp.proton.handler.ProtonHandler.dispatch(ProtonHandler.java:345) [artemis-amqp-protocol-2.0.0.amq-700002-redhat-1.jar:]
              at org.apache.activemq.artemis.protocol.amqp.proton.handler.ProtonHandler.flush(ProtonHandler.java:266) [artemis-amqp-protocol-2.0.0.amq-700002-redhat-1.jar:]
              at org.apache.activemq.artemis.protocol.amqp.proton.AMQPConnectionContext.flush(AMQPConnectionContext.java:138) [artemis-amqp-protocol-2.0.0.amq-700002-redhat-1.jar:]
              at org.apache.activemq.artemis.protocol.amqp.broker.AMQPSessionCallback$3.done(AMQPSessionCallback.java:407) [artemis-amqp-protocol-2.0.0.amq-700002-redhat-1.jar:]
              at org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl$1.run(OperationContextImpl.java:236) [artemis-server-2.0.0.amq-700002-redhat-1.jar:2.0.0.amq-700002-redhat-1]
              at org.apache.activemq.artemis.utils.OrderedExecutorFactory$OrderedExecutor$ExecutorTask.run(OrderedExecutorFactory.java:101) [artemis-commons-2.0.0.amq-700002-redhat-1.jar:2.0.0.amq-700002-redhat-1]
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_121]
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_121]
              at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_121]
      14:40:00,935 WARN  [org.apache.activemq.artemis.core.client] AMQ212037: Connection failure has been detected: AMQ119014: Did not receive data from null within the 60,000ms connection TTL. The connection will now be closed. [code=CONNECTION_TIMEDOUT]
      14:40:00,935 WARN  [org.apache.activemq.artemis.core.server] AMQ222061: Client connection failed, clearing up resources for session e9a7d831-0a4d-11e7-b937-3497f69c64ad
      14:40:00,935 WARN  [org.apache.activemq.artemis.core.server] AMQ222107: Cleared up resources for session e9a7d831-0a4d-11e7-b937-3497f69c64ad
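      For context, java.util.ConcurrentModificationException is thrown by ArrayList's fail-fast iterator whenever the list is structurally modified while an iteration is in progress; the PostOfficeImpl.processRoute frame above suggests a binding list is being mutated concurrently with routing. A minimal standalone sketch of that failure mode (hypothetical names, not broker code), contrasted with a snapshot-based list such as CopyOnWriteArrayList, which does not fail:

      ```java
      import java.util.ArrayList;
      import java.util.Arrays;
      import java.util.ConcurrentModificationException;
      import java.util.List;
      import java.util.concurrent.CopyOnWriteArrayList;

      public class CmeDemo {

          // Iterate the list while appending to it; report whether the
          // iterator's fail-fast check fired.
          static boolean throwsCmeWhenMutated(List<String> bindings) {
              try {
                  for (String binding : bindings) {  // uses the list's iterator
                      bindings.add("extra-binding"); // structural modification mid-iteration
                  }
                  return false;                      // snapshot iterators finish normally
              } catch (ConcurrentModificationException e) {
                  return true;                       // fail-fast iterator detected the change
              }
          }

          public static void main(String[] args) {
              System.out.println("ArrayList threw CME: "
                      + throwsCmeWhenMutated(new ArrayList<>(Arrays.asList("q1", "q2"))));
              System.out.println("CopyOnWriteArrayList threw CME: "
                      + throwsCmeWhenMutated(new CopyOnWriteArrayList<>(Arrays.asList("q1", "q2"))));
          }
      }
      ```

      This only illustrates the exception class itself; whether the broker's fix belongs in PostOfficeImpl or in the AMQP layer is for the developers to determine.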
      

      We have tested multiple hardware configurations, so this does not appear to be a hardware-related issue. Multiple JDKs and broker versions were also tested and do not appear to be related. Upstream Artemis 2.0 and 2.1 suffer from this issue as well (apache-artemis-2.0.0-20170310.031206-53-bin.zip, apache-artemis-2.1.0-20170316.031150-4-bin.zip).

      I wondered whether flow control might be involved, but given that this worked well in the past, I doubt it. This is a regression from ER16, where everything worked well.

            People

              mtaylor1@redhat.com Martyn Taylor (Inactive)
              pematous Petr Matousek
              Michal Toth Michal Toth
