FUSE Message Broker / MB-883

KahaDB: Chunk stream does not exist when broker under stress

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 5.5.0-fuse-00-00
    • Component/s: None
    • Labels: None

      Description

      There appears to be a race condition in ActiveMQ. We're still working on a test case, but this is what I can tell you:

      2011-04-07 08:04:07,295 [Queue:loadq-4  ] ERROR FilePendingMessageCursor       - I/O error
      java.io.EOFException: Chunk stream does not exist at page: 198
      	at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:454)
      	at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:431)
      	at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:428)
      	at org.apache.kahadb.page.Transaction.load(Transaction.java:404)
      	at org.apache.kahadb.page.Transaction.load(Transaction.java:361)
      	at org.apache.activemq.store.kahadb.plist.PList.loadEntry(PList.java:468)
      	at org.apache.activemq.store.kahadb.plist.PList.getNext(PList.java:483)
      	at org.apache.activemq.store.kahadb.plist.PList$10.execute(PList.java:312)
      	at org.apache.kahadb.page.Transaction.execute(Transaction.java:728)
      	at org.apache.activemq.store.kahadb.plist.PList.getNext(PList.java:310)
      	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.next(FilePendingMessageCursor.java:500)
      	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.next(FilePendingMessageCursor.java:473)
      	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.next(FilePendingMessageCursor.java:293)
      	at org.apache.activemq.broker.region.Queue.doPageInForDispatch(Queue.java:1714)
      	at org.apache.activemq.broker.region.Queue.pageInMessages(Queue.java:1932)
      	at org.apache.activemq.broker.region.Queue.iterate(Queue.java:1440)
      	at org.apache.activemq.thread.DedicatedTaskRunner.runTask(DedicatedTaskRunner.java:104)
      	at org.apache.activemq.thread.DedicatedTaskRunner$1.run(DedicatedTaskRunner.java:42)
      

      This scenario occurs on a really fast machine, e.g. one with 8 CPUs, when running Hiram's Stomp benchmark against the latest 5.5 release. Unfortunately, the error does not always occur, only sometimes.

      Gary believes the error occurs because the file cursor hits the 1 GB limit set in the config and starts writing to disk (a configuration sketch along these lines follows the log below). On Vader, I can come close:

      	
      ERROR | Exception on periodic cleanup: java.io.EOFException: Chunk stream does not exist at page: 8490
      java.io.EOFException: Chunk stream does not exist at page: 8490
        at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:454)
        at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:431)
        at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:428)
        at org.apache.kahadb.page.Transaction.load(Transaction.java:404)
        at org.apache.kahadb.page.Transaction.load(Transaction.java:361)
        at org.apache.activemq.store.kahadb.plist.PList.loadEntry(PList.java:464)
        at org.apache.activemq.store.kahadb.plist.PList.getNext(PList.java:479)
        at org.apache.activemq.store.kahadb.plist.PList$10.execute(PList.java:309)
        at org.apache.kahadb.page.Transaction.execute(Transaction.java:728)
        at org.apache.activemq.store.kahadb.plist.PList.getNext(PList.java:307)
        at org.apache.activemq.store.kahadb.plist.PListStore.run(PListStore.java:331)
        at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
      Exception in thread "ActiveMQ Broker[localhost] Scheduler" java.util.ConcurrentModificationException
        at java.util.HashMap$HashIterator.nextEntry(HashMap.java:793)
        at java.util.HashMap$ValueIterator.next(HashMap.java:822)
        at org.apache.activemq.store.kahadb.plist.PListStore.run(PListStore.java:327)
        at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
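
      For reference, a broker set up roughly along these lines (a 1 GB memory limit plus the file-based pending queue cursor) is what makes the cursor spill pending messages into the temp store once the limit is reached. This is a minimal sketch of such a setup, not the exact benchmark configuration; the broker name, connector address, and queue pattern are just examples:

      import org.apache.activemq.broker.BrokerService;
      import org.apache.activemq.broker.region.policy.FilePendingQueueMessageStoragePolicy;
      import org.apache.activemq.broker.region.policy.PolicyEntry;
      import org.apache.activemq.broker.region.policy.PolicyMap;

      public class SpillingBrokerSketch {
          public static void main(String[] args) throws Exception {
              BrokerService broker = new BrokerService();
              broker.setBrokerName("localhost");

              // Cap broker memory at 1 GB; once cursors hit this limit they
              // start writing pending messages to the PList temp store on disk.
              broker.getSystemUsage().getMemoryUsage().setLimit(1024L * 1024 * 1024);

              // Use the file-based pending cursor for all queues, matching the
              // FilePendingMessageCursor seen in the stack traces above.
              PolicyEntry entry = new PolicyEntry();
              entry.setQueue(">");
              entry.setPendingQueuePolicy(new FilePendingQueueMessageStoragePolicy());
              PolicyMap policyMap = new PolicyMap();
              policyMap.setDefaultEntry(entry);
              broker.setDestinationPolicy(policyMap);

              broker.addConnector("tcp://0.0.0.0:61616");
              broker.start();
              broker.waitUntilStopped();
          }
      }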
      

      The java.util.ConcurrentModificationException seems to happen somewhere in the topic fan test. I can usually reproduce it fairly consistently by starting with an empty data directory (rm -rf data) and restarting the broker. I'm just not sure where in that test yet, and it's not the KahaDB error. If I find anything more, I will let you know.
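
      To illustrate that second failure mode: the periodic cleanup in PListStore.run appears to iterate a plain HashMap of PLists while another thread can still add or remove entries, and HashMap's fail-fast iterator throws ConcurrentModificationException in exactly that situation. A minimal standalone sketch of the same pattern (not broker code, names are made up):

      import java.util.ArrayList;
      import java.util.HashMap;
      import java.util.Map;

      public class CmeSketch {
          public static void main(String[] args) {
              Map<String, String> lists = new HashMap<>();
              lists.put("loadq-1", "plist-1");
              lists.put("loadq-2", "plist-2");

              // Structurally modifying the map while iterating its values
              // (here in the same thread for brevity; in the broker it would be
              // a concurrent thread) trips HashMap's fail-fast iterator.
              try {
                  for (String name : lists.values()) {
                      lists.put("loadq-3", "plist-3");
                  }
              } catch (java.util.ConcurrentModificationException expected) {
                  System.out.println("Got the expected CME: " + expected);
              }

              // One common way to avoid it: iterate over a snapshot instead.
              for (String name : new ArrayList<>(lists.values())) {
                  lists.put("loadq-4", "plist-4");
              }
              System.out.println("Snapshot iteration completed without CME");
          }
      }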

        Activity

        Gary Tully
        added a comment - fixed at apache by https://issues.apache.org/jira/browse/AMQ-3325

        Gary Tully
        added a comment - fix now on fuse branch.

          People

          • Assignee: Gary Tully
          • Reporter: Susan Javurek
          • Votes: 0
          • Watchers: 0
