Details

Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: JBoss A-MQ 6.2.1
Affects Version/s: JBoss A-MQ 6.2.1
Component/s: master-slave
Labels:
- ER4

Steps to Reproduce:
It is reproducible even when only one instance is deployed:

1) start A-MQ with persistent storage on shared NFSv4
2) block connection between A-MQ and NFS server (e.g. using iptables – see below)
3) wait until A-MQ reports failure in the broker shutdown procedure in the log
4) reestablish connection between A-MQ and NFS server
5) manually restart broker bundle
6) the broker does not start because it claims that the database is locked

iptables commands executed on the machine where A-MQ is running:

sudo iptables -A INPUT -s $NFS_SERVER_IP -j DROP
sudo iptables -A OUTPUT -d $NFS_SERVER_IP -j DROP

SFDC Cases Counter:
SFDC Cases Links:

Description

Two instances (A and B) are deployed in a shared file system master/slave configuration (A is master, B is slave). When I simulate a network failure between the master and the NFS server, B becomes master and A starts its shutdown procedure. A's shutdown procedure throws exceptions related to I/O errors (see the attached log file), since the kahaDB folder on the shared NFS is unreachable, and A does shut down.

However, when I then stop broker B (which is currently master), re-establish the connection between A and the NFS server, and manually restart the broker on A, A claims that the DB is locked and its broker never starts. This is especially bad when the broker is restarted automatically by setting restartAllowed="true" in activemq.xml. The only way to start the broker on A successfully is to stop Fuse entirely and start it again.

With debug logging enabled for the SharedFileLocker class, it reports:

08:42:19,559 | DEBUG | AMQ-2-thread-1   | SharedFileLocker                 | 140 - org.apache.activemq.activemq-osgi - 5.11.0.redhat-621032 | Database /mnt/nfs/fuse-shared/standaloneFaframTest/lock is locked... waiting 10 seconds for the database to be unlocked. Reason: java.io.IOException: File '/mnt/nfs/fuse-shared/standaloneFaframTest/lock' could not be locked as lock is already held for this jvm.

So it looks as if A still holds the lock, but that should not be possible, since instance B was master in the meantime.
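
The "already held for this jvm" wording suggests that the check fails inside A's own JVM rather than on the NFS server. As a rough illustration of that kind of behaviour, the sketch below uses plain java.nio file locks. It is not ActiveMQ's code, and the class and method names are hypothetical. java.nio tracks file locks per JVM, so a second lock attempt on the same file from the same JVM is rejected until the first lock is released:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of ActiveMQ.
public class SameJvmLockDemo {

    private static FileChannel open(Path lockFile) throws IOException {
        return FileChannel.open(lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
    }

    // True if a second lock attempt from this JVM is rejected while the first lock is still held.
    static boolean secondAttemptRejected(Path lockFile) throws IOException {
        try (FileChannel first = open(lockFile); FileChannel second = open(lockFile)) {
            FileLock held = first.tryLock();
            if (held == null) {
                throw new IOException("lock is held by another process");
            }
            try {
                second.tryLock(); // same JVM, same file: expected to throw
                return false;
            } catch (OverlappingFileLockException e) {
                return true;
            } finally {
                held.release();
            }
        }
    }

    // True if the file can be locked again once the earlier lock has been released.
    static boolean relockSucceedsAfterRelease(Path lockFile) throws IOException {
        try (FileChannel first = open(lockFile)) {
            FileLock lock = first.tryLock();
            if (lock == null) {
                throw new IOException("lock is held by another process");
            }
            lock.release();
        }
        try (FileChannel again = open(lockFile)) {
            // Closing the channel at the end of this block releases the new lock.
            return again.tryLock() != null;
        }
    }

    public static void main(String[] args) throws IOException {
        Path lockFile = Files.createTempFile("demo", ".lock");
        try {
            System.out.println("second attempt rejected: " + secondAttemptRejected(lockFile));
            System.out.println("relock after release:    " + relockSucceedsAfterRelease(lockFile));
        } finally {
            Files.deleteIfExists(lockFile);
        }
    }
}
```

If the broker's own lock bookkeeping is not cleared when its shutdown fails with I/O errors, a restart inside the same JVM would presumably be refused in a similar way. That would be consistent with a full restart of Fuse being the only workaround.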

Persistent storage configuration:

        <persistenceAdapter>
                <kahaDB directory="/mnt/nfs/fuse-shared/standaloneFaframTest/" lockKeepAlivePeriod="2000">
                        <locker>
                                <shared-file-locker lockAcquireSleepInterval="10000" />
                        </locker>
                </kahaDB>
        </persistenceAdapter>

Attachments

fuse.log
65 kB
2015/10/05 9:11 AM

People

Assignee: Gary Tully

Reporter: Jakub Knetl (Inactive)

Votes: 0

Watchers: 2

Dates

Created: 2015/10/05 9:14 AM

Updated: 2015/11/06 10:51 AM

Resolved: 2015/10/14 10:38 AM