  1. JBoss A-MQ
  2. ENTMQ-827

Broker restart caused LevelDB to throw floods of WARN messages and a CPU spike when using durable subs under load with MQTT transport

Details

    • Bug
    • Resolution: Done
    • Major
    • JBoss A-MQ 6.2
    • None
    • None
    • None

    Description

      The test scenario is much the same as in the related JIRA, ENTMQ-802.

      Producers: 800 MQTT clients connect to the broker with cleanSession=false, create ~40 subscriptions each, and start publishing 1.5KB messages at a rate of about one message every 30 seconds. The publishers run on a local Ubuntu box connected to the Internet through Ethernet on our company network.

      Consumers: 5 MQTT clients connect to the broker, create ~10 subscriptions each, matching all messages published by the producers, and very rarely publish a larger message (10KB) addressed to a randomly chosen producer.

      In this test, LevelDB was configured as the broker's persistence adapter.

      Test steps:

        1. Start the broker.
        2. Start a consumer and connect it to the broker.
        3. Start three publishers.
        4. Let the test run.
        5. Start the three publishers a second time.
        6. Let the test run.
        7. Start the three publishers a third time.
        8. Let the test run.
        9. Stop the broker.
        10. Start the broker.
        11. Start the consumer and connect it to the broker again.

      After the three test runs, but before the broker was restarted, everything looked OK. Broker memory usage was fairly high, staying above 6G, but that is understandable given the heavy load and the large number of durable subs and destinations. GC behaviour also looked OK (no full GC cycles, only short young-generation GC cycles).

      However, after the broker was restarted and the consumer was brought up to connect to it, the following log entries flooded the log file and filled it up very quickly:

        2014-10-08 10:35:54,183 | WARN | Thread-39 | LevelDBClient | .activemq.leveldb.util.Log$class 75 | 121 - org.apache.activemq.activemq-osgi - 5.9.0.redhat-610379 | Invalid log position: <n>

      where <n> kept changing, taking values such as 0, 1, 2, 3...

      The consumer was stuck and could not connect to the broker (the broker was probably hanging). The broker's CPU usage also went through the roof and stayed at around 360%.

      In the other case, the log entry was slightly different:

        2014-09-30 08:42:51,355 | WARN | Thread-39 | LevelDBClient | .activemq.leveldb.util.Log$class 75 | 121 - org.apache.activemq.activemq-osgi - 5.9.0.redhat-610379 | invalid: logRefDecrement: 0
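
      To gauge how large the flood is when reproducing this, the WARN entries can be counted per message type. The following is only a rough sketch: the function name summarize_leveldb_warnings is hypothetical, and the pipe-delimited layout (timestamp, level, thread, logger, location, bundle, message) is assumed from the two sample lines above rather than from any documented log format.

```python
import re
from collections import Counter


def summarize_leveldb_warnings(lines):
    """Count LevelDBClient WARN entries, grouped by message type.

    Assumes the 7-field pipe-delimited layout seen in the report; a
    message that itself contains '|' would be truncated by this split.
    """
    counts = Counter()
    for line in lines:
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 7:
            continue  # not a log line in the assumed layout
        level, logger, message = fields[1], fields[3], fields[-1]
        if level != "WARN" or logger != "LevelDBClient":
            continue
        # Collapse the changing number so all positions share one key.
        counts[re.sub(r"\d+", "<n>", message)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py < broker.log
    for message, count in summarize_leveldb_warnings(sys.stdin).most_common():
        print(count, message)
```

      Running it against the broker log before and after the restart should make it easy to see whether the "Invalid log position" entries appear only after the restart, as observed here.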

          People

            dejanbosanac Dejan Bosanac
            rhn-support-qluo Joe Luo
            Votes: 0
            Watchers: 6