  1. JBoss A-MQ
  2. ENTMQ-2178

Messages Fail to Page In from JDBC Datasource On Broker Restart


Details

    • Workaround

      Only a full JVM restart or failover seems to work around the issue.

    • Steps to Reproduce

      Details to come, but in a nutshell:

      1. Configure a broker with a JDBC datasource (we used DBCP pooling).
      2. Introduce some latency in the datasource (done here with a stored procedure that adds delays, plus triggers that fire the procedure on inserts or updates).
      3. Start a producer and a consumer on a queue.
      4. Enable the triggers for the delay procedure to force an internal broker restart.
      5. Disable the triggers and watch for lingering queue counts after the consumers reconnect and consume the remaining messages.
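
The check in step 5 could be automated over JMX. The following is only a minimal sketch: it assumes the broker exposes the standard ActiveMQ 5.x queue MBean (attribute `QueueSize`, operation `browse()`), and that the JMX service URL, broker name, and queue name are supplied as arguments. The class name and both helper methods are hypothetical, and a secured Karaf JMX endpoint would additionally need credentials in the connection environment, which this sketch omits.

```java
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

class LingeringCountCheck {

    // Builds the ObjectName of a queue MBean using the ActiveMQ 5.8+ naming scheme.
    static ObjectName queueName(String broker, String queue) {
        try {
            return new ObjectName("org.apache.activemq:type=Broker,brokerName=" + broker
                    + ",destinationType=Queue,destinationName=" + queue);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Heuristic (an assumption, not broker logic): the queue reports a depth,
    // yet a browse returns nothing.
    static boolean hasUnbrowsableMessages(long queueSize, int browsedCount) {
        return queueSize > 0 && browsedCount == 0;
    }

    // Usage: java LingeringCountCheck <jmx-service-url> <broker-name> <queue-name>
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(args[0]);
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName queue = queueName(args[1], args[2]);

            long size = ((Number) mbs.getAttribute(queue, "QueueSize")).longValue();
            CompositeData[] browsed =
                    (CompositeData[]) mbs.invoke(queue, "browse", new Object[0], new String[0]);

            System.out.println("QueueSize=" + size + ", browsable=" + browsed.length);
            if (hasUnbrowsableMessages(size, browsed.length)) {
                System.out.println("Queue depth is non-zero but no messages are browsable.");
            }
        }
    }
}
```

Run it once the consumers have reconnected and had time to drain the queue. While messages are still being dispatched, a non-zero depth with an empty browse result may be transient rather than a symptom of this issue.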


    Description

      Upon an internal broker restart (triggered by a JDBC IOException), we see that not all messages are successfully paged into the queue. The queue count accurately reflects the count for the destination in the JDBC source; however, none of the messages are browseable and the messages do not appear to be loaded into the cursor.

      In this instance, one "stuck" message was queryable in the database and was counted in the queue depth, but it was not consumed by the consumer and was not browseable via Hawtio:

      2018-02-16 15:08:49,158 | DEBUG | ce[amq02] Task-4 | Queue                            | che.activemq.broker.region.Queue 1926 | 162 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-630310 | queue://TEST.INBOUND_QUEUE, subscriptions=1, memory=0%, size=1, pending=0 toPageIn: 1, force:false, Inflight: 0, pagedInMessages.size 0, pagedInPendingDispatch.size 0, enqueueCount: 19, dequeueCount: 23, memUsage:0, maxPageSize:200
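
A debug line like the one above can be checked mechanically when scanning large logs. The following is a minimal sketch that pulls the counters out of such a line and flags the pattern seen here: a non-zero size and toPageIn with nothing paged in, pending, or in flight. The class name, the label strings, and the "looks stuck" heuristic are assumptions made for illustration; they are not part of ActiveMQ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QueueDebugLine {

    // Returns the integer that immediately follows the given label,
    // for example "size=" or "toPageIn: ".
    static long field(String line, String label) {
        Matcher m = Pattern.compile(Pattern.quote(label) + "(\\d+)").matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("field not found: " + label);
        }
        return Long.parseLong(m.group(1));
    }

    // Heuristic (an assumption): messages are counted and waiting to be paged in,
    // but none are paged in, pending, or in flight.
    static boolean looksStuck(String line) {
        return field(line, "size=") > 0
                && field(line, "toPageIn: ") > 0
                && field(line, "pending=") == 0
                && field(line, "Inflight: ") == 0
                && field(line, "pagedInMessages.size ") == 0;
    }

    public static void main(String[] args) {
        String line = "queue://TEST.INBOUND_QUEUE, subscriptions=1, memory=0%, size=1, "
                + "pending=0 toPageIn: 1, force:false, Inflight: 0, pagedInMessages.size 0, "
                + "pagedInPendingDispatch.size 0, enqueueCount: 19, dequeueCount: 23, "
                + "memUsage:0, maxPageSize:200";
        System.out.println("looksStuck=" + looksStuck(line));
    }
}
```

The label strings include their separators because the log line mixes `=`, `:`, and a plain space between names and values.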
      

      We could send and receive other messages, but the message in question remained unbrowseable while still counting against the queue depth. This issue occurred after an IOException restarted the broker, with the same broker obtaining the lock following the restart. We were able to reproduce the issue several times. Upon subsequent internal restarts, some of the "stuck" messages were paged in and consumed, but in the case above one orphaned message remained.

      Restarting consumers had no effect, but a JVM restart on the broker resulted in the messages all being paged in and consumed.

      In the reproducer environment, there is a single network connection to another broker, but the queue depth reported there for the same queue was 0.

      Attachments

        1. 1000-msg-one-master-slave-1.tar.bz2
          25.96 MB
        2. 1000-msg-one-master-slave-2.tar.bz2
          26.97 MB
        3. 200-msgs-catch-npe-1.tar.bz2
          18.58 MB
        4. 200-msgs-catch-npe-2.tar.bz2
          22.62 MB
        5. 200msgs-node1.tar.bz2
          24.72 MB
        6. 200-msgs-no-delete-delay-1.tar.bz2
          18.49 MB
        7. 200msgs-xtra-instrumentation-1.tar.bz2
          22.47 MB
        8. 200msgs-xtra-instrumentation-2.tar.bz2
          23.07 MB
        9. count-off-by-1-200msgs.tar.bz2
          20.28 MB
        10. fixed-delays.tar.bz2
          28.95 MB
        11. LOG.txt
          319 kB
        12. mq-fabric-1.2.0.redhat-630-SNAPSHOT.jar
          82 kB
        13. multinode-amq1.tar.bz2
          2.27 MB
        14. multinode-amq2.tar.bz2
          5.33 MB
        15. node1-restarts.tar.bz2
          6.26 MB
        16. node2-restarts.tar.bz2
          2.15 MB
        17. reproducer-jvm-restarts-no-heap.tar.bz2
          2.71 MB
        18. reproducer-no-jvm-restarts-wo-new-instrumentation.tar.bz2
          24.01 MB
        19. reproducer-xtra-instrumentation.tar.bz2
          19.04 MB

            People

              gtully@redhat.com Gary Tully
              rhn-support-dhawkins Duane Hawkins