FUSE Message Broker · MB-883

KahaDB Chunk stream does not exist when broker under stress

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 5.5.0-fuse-00-00
    • Component/s: None
    • Labels: None

      Description

      There appears to be a race condition in ActiveMQ. We're still working on the test case, but this is what I can tell you:

      2011-04-07 08:04:07,295 [Queue:loadq-4  ] ERROR FilePendingMessageCursor       - I/O error
      java.io.EOFException: Chunk stream does not exist at page: 198
      	at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:454)
      	at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:431)
      	at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:428)
      	at org.apache.kahadb.page.Transaction.load(Transaction.java:404)
      	at org.apache.kahadb.page.Transaction.load(Transaction.java:361)
      	at org.apache.activemq.store.kahadb.plist.PList.loadEntry(PList.java:468)
      	at org.apache.activemq.store.kahadb.plist.PList.getNext(PList.java:483)
      	at org.apache.activemq.store.kahadb.plist.PList$10.execute(PList.java:312)
      	at org.apache.kahadb.page.Transaction.execute(Transaction.java:728)
      	at org.apache.activemq.store.kahadb.plist.PList.getNext(PList.java:310)
      	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.next(FilePendingMessageCursor.java:500)
      	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.next(FilePendingMessageCursor.java:473)
      	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.next(FilePendingMessageCursor.java:293)
      	at org.apache.activemq.broker.region.Queue.doPageInForDispatch(Queue.java:1714)
      	at org.apache.activemq.broker.region.Queue.pageInMessages(Queue.java:1932)
      	at org.apache.activemq.broker.region.Queue.iterate(Queue.java:1440)
      	at org.apache.activemq.thread.DedicatedTaskRunner.runTask(DedicatedTaskRunner.java:104)
      	at org.apache.activemq.thread.DedicatedTaskRunner$1.run(DedicatedTaskRunner.java:42)

      This scenario occurs on a really fast machine (e.g. 8 CPUs) when running Hiram's Stomp Benchmark against the latest 5.5 release. Unfortunately, the error does not always occur, only sometimes.

      Gary believes the error occurs as a result of the file cursor hitting the 1 GB limit set in the config and starting to write to disk. On Vader, I can come close:

      ERROR | Exception on periodic cleanup: java.io.EOFException: Chunk stream does not exist at page: 8490
      java.io.EOFException: Chunk stream does not exist at page: 8490
        at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:454)
        at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:431)
        at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:428)
        at org.apache.kahadb.page.Transaction.load(Transaction.java:404)
        at org.apache.kahadb.page.Transaction.load(Transaction.java:361)
        at org.apache.activemq.store.kahadb.plist.PList.loadEntry(PList.java:464)
        at org.apache.activemq.store.kahadb.plist.PList.getNext(PList.java:479)
        at org.apache.activemq.store.kahadb.plist.PList$10.execute(PList.java:309)
        at org.apache.kahadb.page.Transaction.execute(Transaction.java:728)
        at org.apache.activemq.store.kahadb.plist.PList.getNext(PList.java:307)
        at org.apache.activemq.store.kahadb.plist.PListStore.run(PListStore.java:331)
        at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
      Exception in thread "ActiveMQ Broker[localhost] Scheduler" java.util.ConcurrentModificationException
        at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
        at java.util.HashMap$ValueIterator.next(HashMap.java:822)
        at org.apache.activemq.store.kahadb.plist.PListStore.run(PListStore.java:327)
        at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
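
      For context, the 1 GB limit referred to above is presumably the broker's cursor memory setting under systemUsage: once a pending-message cursor crosses its share of memoryUsage, FilePendingMessageCursor starts spilling messages to the temp store (the KahaDB PList seen in the traces). A minimal sketch of that kind of configuration — the limits and broker name below are illustrative, not the actual settings from this report:

      ```xml
      <broker xmlns="http://activemq.apache.org/schema/core" brokerName="localhost">
        <systemUsage>
          <systemUsage>
            <!-- Once cursors exceed their share of this limit, messages
                 spill to the temp store (KahaDB PList) on disk. -->
            <memoryUsage>
              <memoryUsage limit="1 gb"/>
            </memoryUsage>
            <!-- Disk space available to the temp store the cursor spills into. -->
            <tempUsage>
              <tempUsage limit="10 gb"/>
            </tempUsage>
          </systemUsage>
        </systemUsage>
      </broker>
      ```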

      The java.util.ConcurrentModificationException seems to happen somewhere in the topic fan test. I can usually reproduce it fairly consistently by starting with an empty data directory (rm -rf data) and restarting the broker. I'm just not sure where in that test it happens yet, and it's not the KahaDB error. If I find anything more, I will let you know.
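
      The ConcurrentModificationException in PListStore.run looks like classic fail-fast HashMap behaviour: the periodic cleanup task iterates the store's value collection while another thread adds or removes a PList. A minimal single-threaded sketch of the same failure mode (class and queue names are illustrative; in the broker the mutation comes from a second thread, which is why the exception is intermittent):

      ```java
      import java.util.ConcurrentModificationException;
      import java.util.HashMap;
      import java.util.Map;

      public class CmeDemo {

          // Returns true when iteration fails with a
          // ConcurrentModificationException, mirroring the fail-fast HashMap
          // iterator that the periodic cleanup trips over when the map of
          // PLists is mutated while being walked.
          static boolean demonstrateCme() {
              Map<String, String> lists = new HashMap<>();
              lists.put("loadq-1", "a");
              lists.put("loadq-2", "b");
              try {
                  for (String ignored : lists.values()) {
                      // Structural modification during iteration: the
                      // iterator's modCount check throws on the next next().
                      lists.put("loadq-3", "c");
                  }
              } catch (ConcurrentModificationException e) {
                  return true;
              }
              return false;
          }

          public static void main(String[] args) {
              System.out.println("CME observed: " + demonstrateCme());
          }
      }
      ```

      The single-threaded version just makes the fail-fast check deterministic; under the broker's timer thread plus producer threads, the same check fires only when the timings collide.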


              People

              • Assignee: garytully (Gary Tully)
              • Reporter: sjavurek (Susan Javurek)
              • Votes: 0
              • Watchers: 0
