JGroups / JGRP-2715

ViewChange does not work using stack tcp.xml

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • 5.2.17
    • 5.0.0.Final
    • False
    • None
    • False
    • Important

      I have downloaded jgroups-5.0.0.Final.jar. I am running SUSE SLES 15 Service Pack 4 with Oracle JDK 17.0.6.

      In my application the ViewChange mechanism (members joining a common cluster) is not working. I am using four JChannels with the TCP stack shipped with JGroups (tcp.xml):

      <config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns="urn:org:jgroups"
              xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/jgroups.xsd"
              version="5.0.0.Final">
          <TCP bind_addr="192.168.100.4"
               bind_port="8100"
               external_addr="192.168.100.4"
               external_port="8100"
               thread_pool.min_threads="0"
               thread_pool.max_threads="200"
               thread_pool.keep_alive_time="30000"/>

          <TCPPING async_discovery="true"
                   initial_hosts="${jgroups.tcpping.initial_hosts:192.168.100.4[8100],192.168.100.3[8100],192.168.100.100[8100]}"
                   return_entire_cache="${jgroups.tcpping.return_entire_cache:false}"
                   port_range="${jgroups.tcp.port_range:0}"/>
          <MERGE3  min_interval="10000"
                   max_interval="30000"/>
          <FD_SOCK/>
          <FD_ALL3 timeout="40000" interval="5000" />
          <VERIFY_SUSPECT timeout="1500"  />
          <BARRIER />
          <pbcast.NAKACK2 use_mcast_xmit="false" />
          <UNICAST3 />
          <pbcast.STABLE desired_avg_gossip="50000"
                         max_bytes="4M"/>
          <pbcast.GMS print_local_addr="true" join_timeout="2000"/>
          <UFC max_credits="2M"
               min_threshold="0.4"/>
          <MFC max_credits="2M"
               min_threshold="0.4"/>
          <FRAG2 frag_size="60K"  />
          <!--RSVP resend_interval="2000" timeout="10000"/-->
          <pbcast.STATE_TRANSFER/>
      </config>
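      TCPPING takes `initial_hosts` as a comma-separated list of `host[port]` entries, as in the default value above. The stack shown here pins everything to port 8100, while the log output further down shows the four channels bound to ports 8100, 8200, 8300 and 8400, so each channel presumably runs with its own port. The following is a minimal sketch in plain Java of building a matching list per channel. The class and method names are hypothetical and not part of JGroups, and feeding the result into the `jgroups.tcpping.initial_hosts` property assumes the placeholder in the config above is what the application actually relies on.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not part of JGroups): formats a TCPPING initial_hosts value.
final class InitialHosts {
    private InitialHosts() {}

    // Returns "host1[port],host2[port],..." for the given hosts and a single port.
    static String forPort(List<String> hosts, int port) {
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return hosts.stream()
                .map(host -> host + "[" + port + "]")
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Host addresses taken from the report; ports as they appear in the log.
        List<String> hosts = Arrays.asList("192.168.100.4", "192.168.100.3");
        for (int port : new int[] {8100, 8200, 8300, 8400}) {
            System.out.println(port + " -> " + forPort(hosts, port));
        }
    }
}
```

      With `port_range` left at 0, as in the config above, only the listed ports are used for discovery, so a list that names a different port than the one a peer is bound to would not reach that peer.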

      My application is started on two computers with the IP addresses 192.168.100.3 and 192.168.100.4.
      During process startup the following messages appear and no ViewChange takes place. Each process remains in its own cluster instead of joining one common cluster (per JChannel).

      Jul 13, 2023 10:26:59 AM org.jgroups.JChannel setAddress
      INFO: local_addr: 6e233599-a5c7-aff2-b3c1-84856779ecc6, name: REF-SESSION

      -------------------------------------------------------------------
      GMS: address=REF-SESSION, cluster=SessionCluster, physical address=192.168.100.4:8300
      -------------------------------------------------------------------
      Jul 13, 2023 10:27:01 AM org.jgroups.protocols.pbcast.ClientGmsImpl joinInternal
      INFO: REF-SESSION: no members discovered after 2003 ms: creating cluster as coordinator
      Jul 13, 2023 10:27:01 AM org.jgroups.JChannel setAddress
      INFO: local_addr: fdf302a8-067a-7a76-8721-8783bd26148a, name: REF-SIM

      -------------------------------------------------------------------
      GMS: address=REF-SIM, cluster=SimCluster, physical address=192.168.100.4:8400
      -------------------------------------------------------------------
      Jul 13, 2023 10:27:03 AM org.jgroups.protocols.pbcast.ClientGmsImpl joinInternal
      INFO: REF-SIM: no members discovered after 2000 ms: creating cluster as coordinator
      Jul 13, 2023 10:27:03 AM org.jgroups.JChannel setAddress
      INFO: local_addr: a95fe0ac-6931-0b54-d562-919a5db6053e, name: REF-CHAT
      Jul 13, 2023 10:27:03 AM org.jgroups.JChannel setAddress
      INFO: local_addr: 50302a0e-1fc5-42cf-165a-dd426f3ee77d, name: REF-FILE

      -------------------------------------------------------------------
      GMS: address=REF-FILE, cluster=FileCluster, physical address=192.168.100.4:8200
      -------------------------------------------------------------------

      -------------------------------------------------------------------
      GMS: address=REF-CHAT, cluster=ChatCluster, physical address=192.168.100.4:8100
      -------------------------------------------------------------------
      Jul 13, 2023 10:27:05 AM org.jgroups.protocols.pbcast.ClientGmsImpl joinInternal
      INFO: REF-FILE: no members discovered after 2000 ms: creating cluster as coordinator
      Jul 13, 2023 10:27:05 AM org.jgroups.protocols.pbcast.ClientGmsImpl joinInternal
      INFO: REF-CHAT: no members discovered after 2000 ms: creating cluster as coordinator
      Jul 13, 2023 10:27:25 AM org.jgroups.blocks.cs.TcpServer$Acceptor run
      WARNING: JGRP000006: failed accepting connection from peer Socket[addr=/192.168.100.3,port=53639,localport=8400]: java.net.SocketTimeoutException: Read timed out
      Jul 13, 2023 10:28:11 AM org.jgroups.blocks.cs.TcpServer$Acceptor run
      WARNING: JGRP000006: failed accepting connection from peer Socket[addr=/192.168.100.3,port=34393,localport=8300]: java.net.SocketTimeoutException: Read timed out
      Jul 13, 2023 10:28:15 AM org.jgroups.blocks.cs.TcpServer$Acceptor run
      WARNING: JGRP000006: failed accepting connection from peer Socket[addr=/192.168.100.3,port=55029,localport=8100]: java.net.SocketTimeoutException: Read timed out
      Jul 13, 2023 10:28:15 AM org.jgroups.blocks.cs.TcpServer$Acceptor run
      WARNING: JGRP000006: failed accepting connection from peer Socket[addr=/192.168.100.3,port=43787,localport=8200]: java.net.SocketTimeoutException: Read timed out
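      To compare the ports each channel actually bound to with the ports configured for discovery, the GMS banner lines in the output above can be collected into a cluster-to-address map. This is a minimal sketch in plain Java; the class name is hypothetical, and the pattern assumes the banner format exactly as printed above, with an IPv4 physical address.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of JGroups): extracts cluster names and
// physical addresses from GMS banner lines in a startup log.
final class GmsBanners {
    // Matches e.g. "GMS: address=REF-SIM, cluster=SimCluster, physical address=192.168.100.4:8400"
    private static final Pattern BANNER = Pattern.compile(
            "GMS: address=([^,\\s]+), cluster=([^,\\s]+), physical address=([^:\\s]+):(\\d+)");

    private GmsBanners() {}

    // Returns cluster name -> "ip:port" in the order the banners appear in the log.
    static Map<String, String> clusterToPhysicalAddress(String log) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = BANNER.matcher(log);
        while (m.find()) {
            result.put(m.group(2), m.group(3) + ":" + m.group(4));
        }
        return result;
    }

    public static void main(String[] args) {
        String log =
                "GMS: address=REF-SESSION, cluster=SessionCluster, physical address=192.168.100.4:8300\n"
              + "GMS: address=REF-SIM, cluster=SimCluster, physical address=192.168.100.4:8400\n";
        clusterToPhysicalAddress(log)
                .forEach((cluster, addr) -> System.out.println(cluster + " -> " + addr));
    }
}
```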

      When I start the org.jgroups.demos.Draw application with the TCP stack shown above, the same error messages are displayed, but unlike in my application the ViewChange mechanism eventually works.

      Please help me solve the problem and get rid of the error messages.
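      The warnings above only show connections arriving from 192.168.100.3, so one additional check is whether 192.168.100.4 can open TCP connections in the other direction on each channel port. The following is a minimal sketch using only `java.net.Socket`; the class name and the 2000 ms timeout are assumptions, and a successful connect only shows that the port is reachable, not that the JGroups handshake completes.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic (not part of JGroups): plain TCP reachability probe.
final class TcpProbe {
    private TcpProbe() {}

    // Returns true if a TCP connection to host:port is established within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, unreachable hosts and connect timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Peer address from the report; ports as they appear in the log.
        String peer = "192.168.100.3";
        for (int port : new int[] {8100, 8200, 8300, 8400}) {
            System.out.println(peer + ":" + port + " reachable: " + canConnect(peer, port, 2000));
        }
    }
}
```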

            rhn-engineering-bban Bela Ban
            radelnfun Andreas Feltkamp (Inactive)