Details

Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: 6.0.0.Final
Component/s: Clustering
Labels: None

Forum Reference:
http://community.jboss.org/message/612174
Steps to Reproduce:

Extract jboss 6.0.0 on 2 boxes on a network.
Deploy a singleton in the server/all/deploy-hasingleton/ directories.
Start jboss running with the 'all' profile on both boxes.
Wait for them to cluster.
Check singleton is only running on one of the boxes.
Kill the network (pull the cable or "iptables -A INPUT -s <ipaddress> -j DROP")
Wait until singleton has started up on 2nd box.
Enable the network (plug in cable or "iptables -F")
Watch as both singletons remain running.

Description

We've been running JBoss 6.0.0 clustered across 2 boxes with a number of HA Singletons. A brief network outage caused the cluster to split and the HA Singletons to start up on the second box. After the network issues were resolved, the JBoss instances correctly re-clustered, but the HA Singletons remained running on both boxes.
I believe that the HA Singletons should have stopped automatically after the merge, with only those on the master node starting back up.

I've finally tracked the issue down to common/lib/jboss-ha-server-core.jar from the source code at
http://grepcode.com/snapshot/repository.jboss.org/nexus/content/repositories/releases/org.jboss.cluster/jboss-ha-server-core/1.0.0.Final

The bug is in the file:
org/jboss/ha/core/framework/server/DistributedReplicantManagerImpl.java

In the method:
/**
 * Add a replicant to the replicants map.
 * @param key replicant key name
 * @param nodeName name of the node that adds this replicant
 * @param replicant Serialized representation of the replica
 * @return true, if this replicant was newly added to the map, false otherwise
 */
protected boolean addReplicant(String key, String nodeName, Serializable replicant)
{
   ConcurrentMap<String, Serializable> map = new ConcurrentHashMap<String, Serializable>();
   ConcurrentMap<String, Serializable> existingMap = this.replicants.putIfAbsent(key, map);
   return (((existingMap != null) ? existingMap : map).put(nodeName, replicant) != null);
}

The last line of the method should be changed to:
return (((existingMap != null) ? existingMap : map).put(nodeName, replicant) == null);

addReplicant() should return true if the replicant wasn't previously in the map, which is exactly the case when Map.put() returns null (put() returns the previous value for the key, or null if there was none). As far as I can tell, the return value of this method is only checked when merging a split cluster.
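To make the expected behaviour concrete, here is a minimal standalone sketch of the method with the proposed "== null" check. The class name AddReplicantSketch, the key and node names, and the declared type of the replicants field are assumptions made for illustration; this is not the actual DistributedReplicantManagerImpl class.

```java
import java.io.Serializable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class AddReplicantSketch {
    // Assumed shape of the replicants field: key -> (node name -> replicant).
    private final ConcurrentMap<String, ConcurrentMap<String, Serializable>> replicants =
            new ConcurrentHashMap<String, ConcurrentMap<String, Serializable>>();

    // Proposed fix: return true only when nodeName had no previous entry under key.
    public boolean addReplicant(String key, String nodeName, Serializable replicant) {
        ConcurrentMap<String, Serializable> map = new ConcurrentHashMap<String, Serializable>();
        ConcurrentMap<String, Serializable> existingMap = this.replicants.putIfAbsent(key, map);
        // put() returns the previous value, or null if the node was not yet registered.
        return ((existingMap != null) ? existingMap : map).put(nodeName, replicant) == null;
    }

    public static void main(String[] args) {
        AddReplicantSketch drm = new AddReplicantSketch();
        // Hypothetical key and node names, for illustration only.
        System.out.println(drm.addReplicant("HASingleton", "nodeA", "stub-1")); // true: newly added
        System.out.println(drm.addReplicant("HASingleton", "nodeA", "stub-2")); // false: replaced existing entry
        System.out.println(drm.addReplicant("HASingleton", "nodeB", "stub-1")); // true: new node for same key
    }
}
```

With the original "!= null" check, the three calls would print false, true, false instead, i.e. the opposite of what the Javadoc promises.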

Probably affects JBoss 6.1.0 - not sure about 7.X.X though.

People

Assignee: Paul Ferraro

Reporter: Robert Hayward (Inactive)

Dates

Created: 2011/12/08 10:35 AM

Updated: 2011/12/08 10:35 AM