JGroups / JGRP-1401

RELAY2: messages lost when local site master crashes

Details

    • Type: Feature Request
    • Resolution: Done
    • Priority: Major
    • Fix Version: 3.2

    Description

      When we have sites LON={A,B,C} and SFO={X,Y,Z}, if C wants to send a unicast message to the site master of SFO (X), but the local site master (A) leaves or crashes and B hasn't taken over yet, the message will be lost.

      The proposed fix is to forward the message to the next coordinator if the current coordinator leaves or dies.

      A FORWARD_TO_COORD protocol was developed, which handles this task. RELAY2 checks at startup if FORWARD_TO_COORD is present and uses a FORWARD event to tell that protocol to forward a message to the current coordinator. If the protocol is not present, a simple unicast will be sent (unreliably).
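
      The startup check described above can be sketched as follows. This is a toy model in plain Java, not RELAY2 code: the class `RelayRoutingSketch`, the `Path` enum, and the representation of the stack as a list of protocol names are all assumptions made for illustration.

```java
import java.util.List;

/** Toy model of the startup check; every name here is hypothetical, not JGroups API. */
class RelayRoutingSketch {
    /** The two ways a message can reach the local site master. */
    enum Path { FORWARD_EVENT, PLAIN_UNICAST }

    // Decided once at startup, mirroring the check described in the issue.
    private final boolean forwarderPresent;

    RelayRoutingSketch(List<String> stackProtocols) {
        this.forwarderPresent = stackProtocols.contains("FORWARD_TO_COORD");
    }

    /** Reliable forwarding if the protocol is in the stack, else an unreliable unicast. */
    Path pathToSiteMaster() {
        return forwarderPresent ? Path.FORWARD_EVENT : Path.PLAIN_UNICAST;
    }

    public static void main(String[] args) {
        System.out.println(new RelayRoutingSketch(
            List.of("UDP", "FORWARD_TO_COORD", "RELAY2")).pathToSiteMaster());
        System.out.println(new RelayRoutingSketch(
            List.of("UDP", "RELAY2")).pathToSiteMaster());
    }
}
```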

      FORWARD_TO_COORD sends a message M to the current coord and removes M from its pending messages once an ack has been received. If a view change indicates that the old coord left, it resends all pending messages to the new coord, and does so again on every later coord change until each message is acked. In the extreme case, everyone but the sender dies and M is sent to the sender itself.
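
      The ack-and-resend behaviour can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the FORWARD_TO_COORD implementation: `ForwardSketch` and its methods are hypothetical, the coordinator is assumed to be the first member of the view, and a list of strings stands in for the network.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of forward-to-coordinator with acks; all names are hypothetical. */
class ForwardSketch {
    private String coord;        // current coordinator (assumed: first member of the view)
    private long nextId = 1;
    // Unacked messages by id, kept in send order so resends preserve ordering.
    private final Map<Long, String> pending = new LinkedHashMap<>();
    // Stands in for the network: each entry is "<message id>-><target member>".
    final List<String> wire = new ArrayList<>();

    ForwardSketch(List<String> view) {
        this.coord = view.get(0);
    }

    /** Sends the payload to the current coordinator and keeps it until acked. */
    long forward(String payload) {
        long id = nextId++;
        pending.put(id, payload);
        wire.add(id + "->" + coord);
        return id;
    }

    /** The coordinator acked message id, so it no longer needs to be resent. */
    void onAck(long id) {
        pending.remove(id);
    }

    /** On a new view, resend everything still pending if the coordinator changed. */
    void onViewChange(List<String> view) {
        String newCoord = view.get(0);
        if (newCoord.equals(coord)) return;
        coord = newCoord;
        for (long id : pending.keySet())
            wire.add(id + "->" + coord);
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        // Site LON = {A,B,C}; this instance plays the role of sender C.
        ForwardSketch c = new ForwardSketch(List.of("A", "B", "C"));
        long m = c.forward("unicast for SFO site master");
        c.onViewChange(List.of("B", "C"));   // A crashed before acking: resend to B
        c.onViewChange(List.of("C"));        // B crashed too: C sends to itself
        c.onAck(m);
        System.out.println(c.wire + " pending=" + c.pendingCount());
    }
}
```

      Keeping the pending messages in insertion order is a design choice of this sketch, so that resends after a coord change go out in the order they were originally sent.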

            People

              Assignee: Bela Ban (rhn-engineering-bban)
              Reporter: Bela Ban (rhn-engineering-bban)