  1. JBoss A-MQ
  2. ENTMQ-147

Kahadb error during SAN failover delayed write - Allow kahaDB to recover in a similar manner as the JDBC store using the IOExceptionHandler

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 7.1.0.fuse-046
    • Fix Version/s: JBoss A-MQ 6.1
    • Component/s: broker
    • Labels: None

      Description

      When kahaDB is used with a SAN, a SAN failover can cause the broker to terminate. The failover itself is seamless; however, on failback there is a 2-3 second delay during which writes are blocked, and the broker terminates. With the JDBC datastore, a similar situation can be handled by using the IOExceptionHandler. With kahaDB, however, adding the same IOExceptionHandler prevents the broker from terminating, but kahaDB retains an invalid index.

      INFO ActiveMQ JMS Message Broker (Broker1, ID:macbookpro-251a.home-56915-1328715089252-0:1) started
      INFO jetty-7.1.6.v20100715
      INFO ActiveMQ WebConsole initialized.
      INFO Initializing Spring FrameworkServlet 'dispatcher'
      INFO ActiveMQ Console at http://0.0.0.0:8161/admin
      INFO ActiveMQ Web Demos at http://0.0.0.0:8161/demo
      INFO RESTful file access application at http://0.0.0.0:8161/fileserver
      INFO FUSE Web Console at http://0.0.0.0:8161/console
      INFO Started SelectChannelConnector@0.0.0.0:8161
      ERROR KahaDB failed to store to Journal
      java.io.SyncFailedException: sync failed
      at java.io.FileDescriptor.sync(Native Method)
      at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
      at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
      INFO Ignoring IO exception, java.io.SyncFailedException: sync failed
      java.io.SyncFailedException: sync failed
      at java.io.FileDescriptor.sync(Native Method)
      at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
      at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
      ERROR Checkpoint failed
      java.io.SyncFailedException: sync failed
      at java.io.FileDescriptor.sync(Native Method)
      at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
      at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
      INFO Ignoring IO exception, java.io.SyncFailedException: sync failed
      java.io.SyncFailedException: sync failed
      at java.io.FileDescriptor.sync(Native Method)
      at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
      at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
      ERROR KahaDB failed to store to Journal
      java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
      at java.io.RandomAccessFile.open(Native Method)
      at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
      at org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
      at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
      at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
      INFO Ignoring IO exception, java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
      java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
      at java.io.RandomAccessFile.open(Native Method)
      at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
      at org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
      at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
      at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
      ERROR KahaDB failed to store to Journal
      java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
      at java.io.RandomAccessFile.open(Native Method)
      at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
      at org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
      at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
      at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
      INFO Ignoring IO exception, java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
      java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
      at java.io.RandomAccessFile.open(Native Method)
      at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
      at org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
      at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
      at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
      WARN Transport failed: java.io.EOFException
      WARN Transport failed: java.io.EOFException
      INFO KahaDB: Recovering checkpoint thread after death
      ERROR Checkpoint failed
      java.io.IOException: Input/output error
      at java.io.RandomAccessFile.write(Native Method)
      at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
      at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
      at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
      at org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
      at org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
      at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
      at org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
      at org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
      INFO Ignoring IO exception, java.io.IOException: Input/output error
      java.io.IOException: Input/output error
      at java.io.RandomAccessFile.write(Native Method)
      at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
      at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
      at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
      at org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
      at org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
      at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
      at org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
      at org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
      INFO KahaDB: Recovering checkpoint thread after death
      ERROR Checkpoint failed
      java.io.IOException: Input/output error
      at java.io.RandomAccessFile.write(Native Method)
      at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
      at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
      at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
      at org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
      at org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
      at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
      at org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
      at org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
      INFO Ignoring IO exception, java.io.IOException: Input/output error
      java.io.IOException: Input/output error
      at java.io.RandomAccessFile.write(Native Method)
      at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
      at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
      at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
      at org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
      at org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
      at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
      at org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
      at org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
      WARN Transport failed: java.io.EOFException
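
      The log above shows the handler ignoring each IO exception while the store keeps failing. A minimal Java sketch of the alternative the title asks for (pause client traffic, retry the write until storage is back, then resume) might look like the following. This is an illustration only, not ActiveMQ or KahaDB code: the class PauseAndResumeSketch, the StorageWrite interface, the Connectors class, and the method writeWithRecovery are all hypothetical names introduced here, and the retry count and sleep period are arbitrary assumptions.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative sketch only; none of these types exist in ActiveMQ. */
public class PauseAndResumeSketch {

    /** Hypothetical stand-in for a single journal or index write. */
    public interface StorageWrite {
        void run() throws IOException;
    }

    /** Hypothetical stand-in for the broker's transport connectors. */
    public static class Connectors {
        private volatile boolean accepting = true;

        public void stop() { accepting = false; }
        public void start() { accepting = true; }
        public boolean isAccepting() { return accepting; }
    }

    /**
     * Runs the write, retrying on IOException. While the storage is failing,
     * the connectors stay stopped so no new client work is accepted.
     * Returns the number of attempts used; rethrows the last IOException
     * if every attempt fails (connectors then remain stopped).
     */
    public static int writeWithRecovery(StorageWrite write, Connectors connectors,
                                        int maxAttempts, long sleepMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                write.run();
                connectors.start();      // storage is usable again: resume traffic
                return attempt;
            } catch (IOException e) {
                last = e;
                connectors.stop();       // block new work while storage is down
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);  // wait before checking storage again
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulate a failback window: the first two syncs fail, then storage returns.
        AtomicInteger calls = new AtomicInteger();
        Connectors connectors = new Connectors();
        int attempts = writeWithRecovery(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new IOException("sync failed");
            }
        }, connectors, 5, 100L);
        System.out.println("attempts=" + attempts + " accepting=" + connectors.isAccepting());
    }
}
```

      The point of the sketch is the ordering: the write is not acknowledged or skipped while storage is unavailable, so the index is never advanced past data that did not reach the journal. Whether the actual fix works this way is not stated in this issue.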


          Activity

          dbosanac Dejan Bosanac added a comment - I pushed a fix for https://issues.apache.org/jira/browse/AMQ-3725 to Apache trunk. There's also a snapshot to be tested https://repository.apache.org/content/repositories/snapshots/org/apache/activemq/apache-activemq/5.10-SNAPSHOT/apache-activemq-5.10-20131101.033855-13-bin.tar.gz
          dbosanac Dejan Bosanac added a comment -

          I think so. It's an old one


            People

            • Assignee: dbosanac Dejan Bosanac
            • Reporter: jsherman Jason Sherman
            • Votes: 0
            • Watchers: 3
