FUSE Message Broker
  MB-515

Interaction between Message Groups and Message Selectors freezes JMS Consumers

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Done
    • Affects Version/s: 5.3.0.3-fuse
    • Fix Version/s: 5.3.0.4-fuse
    • Component/s: None
    • Labels:
      None
    • Environment:
      ActiveMQ 5.3.0.3-fuse, JDK 1.5

      Description

      Interaction between Message Groups (JMSXGroupID) and Message Selectors freezes JMS Consumers.

      MAVEN_OPTS='-Xmx256m -Dcom.sun.management.jmxremote'

      To run: mvn clean test

      The test case will start an embedded broker that stores data within the target dir, so mvn clean will delete the broker's files to allow for clean runs.

      You see the failure when the Camel Throughput logger output stops, indicating that the JMS Consumers have stopped. The Enqueue Count, however, is still growing within the embedded broker.

      If within src/main/resources/routers.xml you set MessageCreator's numPartitions property to 1 (only 1 message selector) or numMessageGroups to 0 (no JMSXGroupID header), then the JMS Consumer(s) will continue forever.


        Activity

        Dejan Bosanac
        added a comment -

        Hi,

        I think there might be a misuse of message groups in this example. Here's what happens with this test case:

        When you set a message group on a message, the consumer that consumes the first message from that group is "associated" with the group. From that point on, all messages from that group will be sent to that particular consumer. Now, the consumer also has a selector on it, so that will further filter the messages sent to it.

        If the group count is small (1 to 10), it can happen that the first consumer gets "assigned" to all of the groups because of the prefetch size.

        With a large group count (>1000), the groups will be shared between the two consumers, but again, only a portion of those messages will actually be consumed.
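
        The assignment rule described above can be sketched as a toy model (hypothetical names, not broker code): the first consumer to receive a message from a group becomes its owner, and the owner's selector still filters afterwards, so a group can end up owned by a consumer that will never accept some of its messages.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Toy model (hypothetical names, not broker code) of the dispatch rule above:
// the first consumer to receive a message from a group "owns" the group, and
// the owner's selector is applied on top of that ownership.
class GroupDispatchSketch {
    // groupId -> owning consumer id
    static final Map<String, String> owners = new HashMap<>();

    // Returns the consumer the message goes to, or null if the owning
    // consumer's selector rejects it (the message then sits unconsumed).
    static String dispatch(String groupId, int partition,
                           Map<String, Predicate<Integer>> selectors) {
        String owner = owners.get(groupId);
        if (owner == null) {
            // First contact: assign the group to the first matching consumer
            // (a real broker's choice is driven by prefetch order instead).
            for (Map.Entry<String, Predicate<Integer>> e : selectors.entrySet()) {
                if (e.getValue().test(partition)) {
                    owner = e.getKey();
                    owners.put(groupId, owner);
                    break;
                }
            }
        }
        if (owner == null) return null;
        // Ownership wins first; the owner's selector filters after that.
        return selectors.get(owner).test(partition) ? owner : null;
    }
}
```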

        What is the purpose of message groups in this scenario? Maybe we can find some alternative way to achieve the requested behavior.

        Regards
        Dejan

        Dejan Bosanac
        added a comment -

        I guess the alternative would be to first check message selectors and then divide those consumers into groups. Is that the desired behavior?

        Scott Cranton
        added a comment -

        The core problem is to take 600 msgs/sec from one queue and route them to 10 independent store brokers; each partition broker will have 1000 store connections, each with their own store-specific queue.

        The message header PartitionID is directly correlated to the store broker. The message header StoreID is directly correlated to a store consumer / store queue. The JMSXGroupID is set to the store id, as we must guarantee that messages are delivered in publish order to a store without any dependence on message content.

        The message groups are partitioned into clean subsets - e.g. stores (message groups) 1-1000 are in Partition 0; stores 1001-2000 are in Partition 1; ... There will ultimately be ~10,000 message groups / consumers spread over 10 non-overlapping partition sets.
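
        The store-to-partition mapping described above is simple integer arithmetic; a minimal sketch (hypothetical helper names, assuming 1000 stores per partition as stated):

```java
// Sketch of the partitioning scheme described above (hypothetical helper, not
// project code): 1000 stores per partition, so store ids 1-1000 map to
// partition 0, 1001-2000 to partition 1, and so on.
class PartitionSketch {
    static final int STORES_PER_PARTITION = 1000;

    static int partitionFor(int storeId) {
        return (storeId - 1) / STORES_PER_PARTITION;
    }
}
```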

        I started down the path of combining selectors and groups based on earlier tests (and at the suggestion of Rob Davies), where message groups with competing consumers alone weren't delivering the required throughput.

        Dejan Bosanac
        added a comment -

        Hi Scott,

        Thanks for the info. I created a pure-AMQ test case and there is definitely something wrong with this behavior. I hope I can provide a fix soon.

        Regards
        Dejan

        Dejan Bosanac
        added a comment -

        Hi Scott,

        I played a bit more with your test case and got it working. Here's what's happening:

        1. First of all, in order to have message groups with selectors working fine, we have to "cluster" the message groups by partition id. Otherwise we can end up with the following situation: we send a message with group id 5 and partition 0, which will assign a consumer to group id 5. If you later send a message with group id 5 and partition 1, the consumer for partition 1 will not be able to consume the message, since the group is assigned to the other consumer (because of the group id). I modified your MessageCreator to set group ids such as 0-5 and 1-5 so we can distinguish the groups for these two consumers.
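
        The composite ids above (0-5, 1-5) can be built by prefixing the store id with the partition id; a minimal sketch with hypothetical names:

```java
// Sketch of the "clustered" group-id scheme described above: prefix the store
// id with the partition id so that the same store number in different
// partitions yields distinct message groups (hypothetical helper names).
class GroupIdSketch {
    static String groupId(int partitionId, int storeId) {
        return partitionId + "-" + storeId;
    }
}
```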

        2. This will solve the problem for a small number of message groups, but with a large number of message groups they will block again. The problem is that, by default, the message-group-to-consumer association is kept in a map that uses the hash value of the group to save some memory. This works fine with a small number of groups, but in our use case (a large number of similarly named groups), there is a big chance these hash values will overlap and some groups will be left unassigned. To fix this you should use another message group map implementation. You can set it with the following policy entry:

        						
<amq:messageGroupMapFactory>
    <amq:simpleMessageGroupMapFactory/>
</amq:messageGroupMapFactory>
        

        With this modification, your test should run fine even with a large number of message groups.
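
        The collision problem described above is a pigeonhole argument: a hash-bucketed map can track at most as many distinct group assignments as it has buckets. A rough illustration (the bucket count of 1024 and the modulo scheme here are assumptions for the sketch, not ActiveMQ's exact implementation):

```java
import java.util.HashSet;
import java.util.Set;

// Illustration (not ActiveMQ's actual code) of why a hashed group map can leave
// groups unassigned: once there are more group names than buckets, distinct
// groups necessarily collide on a bucket. The bucket count (1024) and the
// modulo scheme are assumptions for the sake of the sketch.
class HashBucketSketch {
    static final int BUCKETS = 1024;

    // Map a group id to a bucket the way a hash-bucketed map might.
    static int bucketFor(String groupId) {
        return (groupId.hashCode() & Integer.MAX_VALUE) % BUCKETS;
    }

    // Count distinct buckets used by group ids "0-1" .. "0-n".
    static int distinctBuckets(int n) {
        Set<Integer> used = new HashSet<>();
        for (int i = 1; i <= n; i++) {
            used.add(bucketFor("0-" + i));
        }
        return used.size();
    }
}
```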

        I attached a test case with described modifications (mb515.zip). Please let me know if it works for you.

        Pedro Neveu
        added a comment - edited

        Hi Scott,

        I've tried Dejan's (Thanks Dejan) suggestion and it seems to work for me.

        Pedro

        Dejan Bosanac
        added a comment -

        I see that corresponding DEV issue has been closed, so I'm closing this one as well. Please reopen if any further work is needed for this issue.


          People

          • Assignee:
            Hiram Chirino
          • Reporter:
            Pedro Neveu
          • Votes:
            0
          • Watchers:
            1

            Dates

            • Created:
              Updated:
              Resolved: