FUSE Message Broker
  1. FUSE Message Broker
  2. MB-1199

NPE exception in kahadb after index issue

    Details

    • Type: Bug Bug
    • Status: Resolved Resolved
    • Priority: Major Major
    • Resolution: Done
    • Affects Version/s: 5.5.1-fuse-04-01
    • Fix Version/s: 5.5.1-fuse-10-16
    • Component/s: broker
    • Labels:
      None
    • Similar Issues:
      Show 10 results 

      Description

      2012-07-24 07:11:19,476 [n.ch] Scheduler] ERROR Queue                          - Problem retrieving message for browse
      java.util.NoSuchElementException: Could not locate data file /var/lib/activemq/data/activemq_gridmsg104.cern.ch/tmp_storage/db-1093.log
              at org.apache.activemq.store.kahadb.plist.PList$PListIterator.next(PList.java:214)
              at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.next(FilePendingMessageCursor.java:508)
              at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.next(FilePendingMessageCursor.java:492)
              at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.next(FilePendingMessageCursor.java:305)
              at org.apache.activemq.broker.region.Queue.doBrowse(Queue.java:1053)
              at org.apache.activemq.broker.region.Queue.expireMessages(Queue.java:804)
              at org.apache.activemq.broker.region.Queue.access$100(Queue.java:91)
              at org.apache.activemq.broker.region.Queue$2.run(Queue.java:131)
              at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33)
              at java.util.TimerThread.mainLoop(Timer.java:512)
              at java.util.TimerThread.run(Timer.java:462)
      Caused by: java.io.IOException: Could not locate data file /var/lib/activemq/data/activemq_gridmsg104.cern.ch/tmp_storage/db-1093.log
              at org.apache.kahadb.journal.Journal.getDataFile(Journal.java:350)
              at org.apache.kahadb.journal.Journal.read(Journal.java:597)
              at org.apache.activemq.store.kahadb.plist.PListStore.getPayload(PListStore.java:394)
              at org.apache.activemq.store.kahadb.plist.PList$PListIterator.next(PList.java:212)
              ... 10 more
      
      

      After a restart, we now see NPE related to KahaDB:

      2012-07-24 11:39:05,287 [n.ch] Scheduler] ERROR Queue                          - Problem retrieving message for browse
      java.lang.NullPointerException
      	at org.apache.kahadb.index.ListNode.access$200(ListNode.java:37)
      	at org.apache.kahadb.index.ListNode$NodeMarshaller.writePayload(ListNode.java:265)
      	at org.apache.kahadb.index.ListNode$NodeMarshaller.writePayload(ListNode.java:255)
      	at org.apache.kahadb.page.Transaction.store(Transaction.java:248)
      	at org.apache.kahadb.index.ListIndex.storeNode(ListIndex.java:326)
      	at org.apache.kahadb.index.ListNode.store(ListNode.java:348)
      	at org.apache.kahadb.index.ListNode.access$500(ListNode.java:37)
      	at org.apache.kahadb.index.ListNode$ListIterator.remove(ListNode.java:227)
      	at org.apache.activemq.store.kahadb.plist.PList$PListIterator$1.execute(PList.java:228)
      	at org.apache.kahadb.page.Transaction.execute(Transaction.java:765)
      	at org.apache.activemq.store.kahadb.plist.PList$PListIterator.remove(PList.java:225)
      	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.remove(FilePendingMessageCursor.java:517)
      	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.remove(FilePendingMessageCursor.java:321)
      	at org.apache.activemq.broker.region.Queue.doBrowse(Queue.java:1061)
      	at org.apache.activemq.broker.region.Queue.expireMessages(Queue.java:804)
      	at org.apache.activemq.broker.region.Queue.access$100(Queue.java:91)
      	at org.apache.activemq.broker.region.Queue$2.run(Queue.java:131)
      	at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33)
      	at java.util.TimerThread.mainLoop(Timer.java:512)
      	at java.util.TimerThread.run(Timer.java:462)
      
      

      Seems to be a variant of https://issues.apache.org/jira/browse/AMQ-3434 and Gary put together a test case which we believe represents the problem.

        Activity

        Hide
        Gary Tully
        added a comment -

        I have a test case that runs for <5mins and always shows a problem, but the symptoms vary (From iterators reading too much to chunk stream not exist to NPE). I am getting closer to the root cause, but don't yet have it narrowed doen completely.

        Show
        Gary Tully
        added a comment - I have a test case that runs for <5mins and always shows a problem, but the symptoms vary (From iterators reading too much to chunk stream not exist to NPE). I am getting closer to the root cause, but don't yet have it narrowed doen completely.
        Hide
        Joe Luo
        added a comment -

        Sometimes, following stack trace can also be seen:

        2012-08-08 20:49:47,918 [n.ch] Scheduler] ERROR Queue - Problem retrieving message for browse
        java.util.NoSuchElementException: Chunk stream does not exist, page: 75281 is marked free
        at org.apache.kahadb.index.ListNode$ListIterator.getFromNextNode(ListNode.java:165)
        at org.apache.kahadb.index.ListNode$ListIterator.hasNext(ListNode.java:176)
        at org.apache.activemq.store.kahadb.plist.PList$PListIterator.hasNext(PList.java:204)
        at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.hasNext(FilePendingMessageCursor.java:503)
        at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.hasNext(FilePendingMessageCursor.java:297)
        at org.apache.activemq.broker.region.Queue.doBrowse(Queue.java:1052)
        at org.apache.activemq.broker.region.Queue.expireMessages(Queue.java:804)
        at org.apache.activemq.broker.region.Queue.access$100(Queue.java:91)
        at org.apache.activemq.broker.region.Queue$2.run(Queue.java:131)
        at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
        Caused by: java.io.EOFException: Chunk stream does not exist, page: 75281 is marked free
        at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:455)
        at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:432)
        at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:429)
        at org.apache.kahadb.page.Transaction.load(Transaction.java:405)
        at org.apache.kahadb.page.Transaction.load(Transaction.java:362)
        at org.apache.kahadb.index.ListIndex.loadNode(ListIndex.java:306)

        Show
        Joe Luo
        added a comment - Sometimes, following stack trace can also be seen: 2012-08-08 20:49:47,918 [n.ch] Scheduler] ERROR Queue - Problem retrieving message for browse java.util.NoSuchElementException: Chunk stream does not exist, page: 75281 is marked free at org.apache.kahadb.index.ListNode$ListIterator.getFromNextNode(ListNode.java:165) at org.apache.kahadb.index.ListNode$ListIterator.hasNext(ListNode.java:176) at org.apache.activemq.store.kahadb.plist.PList$PListIterator.hasNext(PList.java:204) at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor$DiskIterator.hasNext(FilePendingMessageCursor.java:503) at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.hasNext(FilePendingMessageCursor.java:297) at org.apache.activemq.broker.region.Queue.doBrowse(Queue.java:1052) at org.apache.activemq.broker.region.Queue.expireMessages(Queue.java:804) at org.apache.activemq.broker.region.Queue.access$100(Queue.java:91) at org.apache.activemq.broker.region.Queue$2.run(Queue.java:131) at org.apache.activemq.thread.SchedulerTimerTask.run(SchedulerTimerTask.java:33) at java.util.TimerThread.mainLoop(Timer.java:512) at java.util.TimerThread.run(Timer.java:462) Caused by: java.io.EOFException: Chunk stream does not exist, page: 75281 is marked free at org.apache.kahadb.page.Transaction$2.readPage(Transaction.java:455) at org.apache.kahadb.page.Transaction$2.<init>(Transaction.java:432) at org.apache.kahadb.page.Transaction.openInputStream(Transaction.java:429) at org.apache.kahadb.page.Transaction.load(Transaction.java:405) at org.apache.kahadb.page.Transaction.load(Transaction.java:362) at org.apache.kahadb.index.ListIndex.loadNode(ListIndex.java:306)
        Hide
        Gary Tully
        added a comment -

        An additional test that breaks and can demonstrate the problem:
        org.apache.activemq.store.kahadb.plist.PListTest#testConcurrentAddIterate

        Show
        Gary Tully
        added a comment - An additional test that breaks and can demonstrate the problem: org.apache.activemq.store.kahadb.plist.PListTest#testConcurrentAddIterate
        Hide
        Gary Tully
        added a comment -

        great news on this one, looks like the problem is in reading and seek. Top find tim.

        Show
        Gary Tully
        added a comment - great news on this one, looks like the problem is in reading and seek. Top find tim.
        Hide
        Susan Javurek
        added a comment -

        Hi Tim,

        It looks like a new snapshot built last night:

        apache-activemq-5.5.1.fuse-20120822.002833-49-bin.tar.gz.md5 Wed Aug 22 09:07:15 UTC 2012 32

        Will it be in there or do you think we need to wait for the next one?

        Susan

        Show
        Susan Javurek
        added a comment - Hi Tim, It looks like a new snapshot built last night: apache-activemq-5.5.1.fuse-20120822.002833-49-bin.tar.gz.md5 Wed Aug 22 09:07:15 UTC 2012 32 Will it be in there or do you think we need to wait for the next one? Susan

          People

          • Assignee:
            Timothy Bish
            Reporter:
            Susan Javurek
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved: