JGroups / JGRP-2375

Discovery: concurrent discovery doesn't work

Details

    • Bug
    • Resolution: Done
    • Major
    • 4.1.5
    • None
    • None

    Description

      If num_discovery_runs is greater than 1, startup of a member sometimes blocks.

      The stack trace below shows the main thread still RUNNABLE inside ConcurrentSkipListSet.add(), called from Discovery.findMembers(). This indicates a problem with the comparison of Task (Future) elements in the ConcurrentSkipListSet.

      Solution: replace the set with an ArrayList. The futures need neither sorting nor duplicate elimination, because the collection is only ever added to or cleared.

      "main" #1 prio=5 os_prio=31 tid=0x00007ff886001800 nid=0x1c03 runnable [0x000070000f96f000]
         java.lang.Thread.State: RUNNABLE
      	at java.util.concurrent.ConcurrentSkipListMap.findPredecessor(ConcurrentSkipListMap.java:684)
      	at java.util.concurrent.ConcurrentSkipListMap.doPut(ConcurrentSkipListMap.java:823)
      	at java.util.concurrent.ConcurrentSkipListMap.putIfAbsent(ConcurrentSkipListMap.java:1979)
      	at java.util.concurrent.ConcurrentSkipListSet.add(ConcurrentSkipListSet.java:241)
      	at org.jgroups.protocols.Discovery.findMembers(Discovery.java:235)
      	at org.jgroups.protocols.Discovery.down(Discovery.java:380)
      	at org.jgroups.protocols.MERGE3.down(MERGE3.java:278)
      	at org.jgroups.protocols.FD_SOCK.down(FD_SOCK.java:377)
      	at org.jgroups.protocols.FD_ALL.down(FD_ALL.java:235)
      	at org.jgroups.protocols.VERIFY_SUSPECT.down(VERIFY_SUSPECT.java:102)
      	at org.jgroups.protocols.BARRIER.down(BARRIER.java:136)
      	at org.jgroups.protocols.pbcast.NAKACK2.down(NAKACK2.java:553)
      	at org.jgroups.protocols.UNICAST3.down(UNICAST3.java:581)
      	at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:347)
      	at org.jgroups.protocols.pbcast.ClientGmsImpl.joinInternal(ClientGmsImpl.java:72)
      	at org.jgroups.protocols.pbcast.ClientGmsImpl.join(ClientGmsImpl.java:40)
      	at org.jgroups.protocols.pbcast.GMS.down(GMS.java:1044)
      	at org.jgroups.protocols.FlowControl.down(FlowControl.java:295)
      	at org.jgroups.protocols.FlowControl.down(FlowControl.java:295)
      	at org.jgroups.protocols.FRAG2.down(FRAG2.java:141)
      	at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:928)
      	at org.jgroups.JChannel.down(JChannel.java:627)
      	at org.jgroups.JChannel._connect(JChannel.java:855)
      	at org.jgroups.JChannel.connect(JChannel.java:352)
      	- locked <0x000000079e04cfa0> (a org.jgroups.JChannel)
      	at org.jgroups.JChannel.connect(JChannel.java:343)
      	- locked <0x000000079e04cfa0> (a org.jgroups.JChannel)
      	at org.jgroups.tests.bla6.start(bla6.java:41)
      	at org.jgroups.tests.bla6.main(bla6.java:54)
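
A minimal sketch of the proposed fix, assuming the futures are only ever appended and cleared, as the description states. The class name DiscoveryFutures and its methods are hypothetical and are not taken from the JGroups source. The synchronized methods are likewise an assumption, added so that a plain ArrayList stays safe if several discovery runs register futures concurrently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Future;

// Hypothetical helper, not JGroups code: tracks discovery futures in a plain list.
final class DiscoveryFutures {
    // ArrayList never compares its elements, so adding a Future cannot
    // end up in a comparison-driven search as ConcurrentSkipListSet.add() does.
    private final List<Future<?>> futures = new ArrayList<>();

    // Append a future; duplicates are kept, which is harmless for add/clear usage.
    synchronized void add(Future<?> future) {
        futures.add(future);
    }

    // Number of futures currently tracked.
    synchronized int size() {
        return futures.size();
    }

    // Cancel every tracked future, then empty the list.
    synchronized void cancelAndClear() {
        for (Future<?> future : futures) {
            future.cancel(true);
        }
        futures.clear();
    }
}
```

Because the list is never searched or ordered, each add is a simple append and does not depend on how the Future implementation defines compareTo().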
      

          People

            Assignee: Bela Ban (rhn-engineering-bban)
            Reporter: Bela Ban (rhn-engineering-bban)