JGroups / JGRP-2266

RouterStubManager.run() endless reconnect loop burning a CPU

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version: 4.0.12
    • Affects Version: 4.0.11
    • None

    Description

      RouterStubManager.run() tries in a loop to reconnect all stubs that are currently not connected. When, for whatever reason, it is not possible to connect one of these stubs, the method spins in an endless loop and burns a CPU.

      E.g. sometimes the VPN tunnel is down or one of the TCPGOSSIP hosts is down.

      No idea if it is really required to loop here, but at least it should do a Thread.yield() or sleep() between attempts. As this run() method is called periodically, it should not be required to loop endlessly here, should it? Maybe only loop e.g. three times and then give up?
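
      A minimal sketch of the bounded retry suggested above, not the actual JGroups code or the fix that was applied. The class BoundedReconnect, the method reconnect() and its parameters are hypothetical names invented for illustration; a stub is modeled as a plain value and the connection attempt as a predicate that returns true on success.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names exist in JGroups.
class BoundedReconnect {

    /**
     * Makes at most maxAttempts passes over the pending stubs, sleeping
     * between passes instead of spinning. Returns the stubs that are
     * still disconnected so that a later periodic run can retry them.
     */
    static <T> List<T> reconnect(List<T> pending, Predicate<T> connect,
                                 int maxAttempts, long sleepMillis)
            throws InterruptedException {
        List<T> remaining = new ArrayList<>(pending);
        for (int attempt = 1; attempt <= maxAttempts && !remaining.isEmpty(); attempt++) {
            // Try each remaining stub once; drop those that connected.
            remaining.removeIf(connect);
            // Back off before the next pass so the thread does not burn a CPU.
            if (!remaining.isEmpty() && attempt < maxAttempts)
                Thread.sleep(sleepMillis);
        }
        return remaining;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> stubs = List.of("router-a:12001", "router-b:12001");
        // Assumption for the demo: only router-a is reachable.
        List<String> left = reconnect(stubs, s -> s.startsWith("router-a"), 3, 100);
        System.out.println("still disconnected: " + left);
    }
}
```

      With three attempts and a short sleep, an unreachable host costs a few connection attempts per periodic run rather than a continuously busy thread.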

      As all the nodes in the cluster are iMac workstations or dedicated Linux render slaves, burning a CPU is very annoying. The CPU should rather be spent on the Blender render jobs or on the interactive work people are doing on their iMacs. (JGroups is used here to distribute render jobs within the cluster.)

          People

            rhn-engineering-bban Bela Ban
            emmeran.seehuber Emmeran Seehuber (Inactive)