FUSE Message Broker
  MB-668

Message Broker Stops Dispatching from Queues

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Done
    • Affects Version/s: 5.3.0.3-fuse
    • Fix Version/s: None
    • Component/s: broker
    • Labels:
      None
    • Environment:
      AIX 6.1

      Description

      The application has two queues, used for processing persistent messages. There is an inbound queue and an outbound queue, both with multiple consumers using selectors.
      The queue depth never gets beyond 2-3 messages, until dispatching suddenly stops.

            Activity

            Jack Britton added a comment -

            The change in maxPageSize was done via JMX after the hung condition manifested itself (Rob suggested it during troubleshooting). The only other thing we do with JMX is monitor, so I don't think the SiteScope tools would be doing a copy or move?

            Gary Tully added a comment -

            That explains why the broker got into the non-recoverable state it was in, but it does not help us understand the original root cause. What information do we have from before the JMX maxPageSize operation was invoked?

            Jack Britton added a comment -

            Just what was in the submitted logs. This really is an issue here, as everything is monitored with JMX, and if there is a chance that the monitoring is somehow stopping messaging then we need to figure that out.

            Gary Tully added a comment -

            We can address the issue with concurrent access to maxPageSize via JMX and dispatch in the 5.4 release, but from your comments it looks like this is not the root cause. I have opened http://fusesource.com/issues/browse/MB-706 to track this for 5.4.

            To do more analysis we need some more thread dumps and logs from a broker that gets into this state, or a test case that can reproduce the hung scenario.
            I need to do one more pass of the existing logs to ensure there is nothing useful being overlooked.

            Gary Tully added a comment -

            Jeff, yes, that looks like the sort of information we need: essentially a collection of periodic thread dumps from the broker when the hang occurs.

            Jack, can you try to pull together the information, logs and thread dumps that are relevant to a given event? There seems to be information scattered across a bunch of JIRA issues. We need to make sure we are analyzing the correct, latest information.
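
            For illustration only, a minimal sketch of what "periodic thread dumps" could look like when taken from inside a JVM with the standard java.lang.management API. The class name ThreadDumpCollector, the method names, and the count and interval values are hypothetical and not part of this issue. The sketch dumps the threads of the JVM it runs in, so it assumes the code is running in the same process as the broker. On a Unix JVM, sending the process a SIGQUIT (kill -3) is a common way to get an equivalent dump from outside.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: collects a series of thread dumps from the current JVM.
class ThreadDumpCollector {

    // Captures one dump of all live threads, including lock information
    // where the JVM supports it.
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(true, true)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            if (info.getLockName() != null) {
                // The lock this thread is blocked or waiting on, if any.
                out.append("    waiting on ").append(info.getLockName()).append('\n');
            }
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    // Takes `count` dumps, sleeping `intervalMillis` between consecutive dumps.
    // Stops early if the calling thread is interrupted.
    static List<String> collect(int count, long intervalMillis) {
        List<String> dumps = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            dumps.add(dump());
            if (i < count - 1) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return dumps;
    }

    public static void main(String[] args) {
        // Assumed values: three dumps, five seconds apart.
        for (String d : collect(3, 5000)) {
            System.out.println("=== thread dump ===");
            System.out.println(d);
        }
    }
}
```

            Comparing consecutive dumps shows which threads stay parked on the same lock across samples, which is usually the interesting signal when dispatch has stopped.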


              People

              • Assignee:
                Gary Tully
                Reporter:
                Rob Davies
              • Votes:
                1
                Watchers:
                5
