  1. JGroups
  2. JGRP-1401

RELAY2: messages lost when local site master crashes

    Details

    • Type: Feature Request
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 3.2
    • Labels:
      None

      Description

      When we have sites LON={A,B,C} and SFO={X,Y,Z}, if C wants to send a unicast message to the site master of SFO (X), but the local site master (A) leaves or crashes and B hasn't taken over yet, the message will be lost.

      The proposed fix is to forward the message to the next coordinator if the current coordinator leaves or dies.

      A FORWARD_TO_COORD protocol was developed, which handles this task. RELAY2 checks at startup if FORWARD_TO_COORD is present and uses a FORWARD event to tell that protocol to forward a message to the current coordinator. If the protocol is not present, a simple unicast will be sent (unreliably).
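
      The startup check and fallback described above can be sketched roughly as follows. This is a simplified, self-contained model and not the RELAY2 source: `RelaySketch`, `Forwarder`, `Unicast` and `sendToSiteMaster` are hypothetical names, and passing `null` stands in for "FORWARD_TO_COORD was not found in the stack".

```java
// Hypothetical model of the RELAY2 behavior described above; not JGroups code.
class RelaySketch {
    /** Stand-in for handing a message to FORWARD_TO_COORD via a FORWARD event. */
    interface Forwarder { void forward(String payload); }

    /** Stand-in for a plain (unreliable) unicast to a given member. */
    interface Unicast { void send(String dest, String payload); }

    private final Forwarder forwarder; // null if the protocol is absent (assumption)
    private final Unicast unicast;

    // The lookup is done once, at startup, and the result is kept.
    RelaySketch(Forwarder forwarderOrNull, Unicast unicast) {
        this.forwarder = forwarderOrNull;
        this.unicast = unicast;
    }

    /** True if messages to the local site master survive a coordinator change. */
    boolean reliable() { return forwarder != null; }

    void sendToSiteMaster(String currentCoord, String payload) {
        if (forwarder != null)
            forwarder.forward(payload);          // forwarded, resent on coordinator change
        else
            unicast.send(currentCoord, payload); // simple unicast, may be lost
    }
}
```

      With a forwarder present, the relay never needs to know who the coordinator is at send time; without one, the message goes to whichever member is coordinator at that moment and is lost if that member crashes.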

      FORWARD_TO_COORD sends a message M to the current coord and removes M when an ack has been received. If there is a view change, indicating the old coord left, it resends all pending messages, and so on. The extreme case would be that everyone but the sender dies and then M would be sent to the sender itself.
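
      A minimal sketch of the send/ack/resend cycle described above, written as a standalone model rather than the actual protocol class. `ForwardToCoordSketch`, `Sender`, `onAck` and `onViewChange` are hypothetical names, members are plain strings, and the sketch assumes the first member of a view is the coordinator.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model of forward-to-coordinator; not the JGroups class.
class ForwardToCoordSketch {
    /** Transport stand-in: the caller decides what "sending" means. */
    interface Sender { void send(String dest, long id, String payload); }

    private final Sender sender;
    // Messages sent but not yet acked, kept in send order.
    private final Map<Long, String> pending = new LinkedHashMap<>();
    private String coord;
    private long nextId = 1;

    ForwardToCoordSketch(List<String> initialView, Sender sender) {
        this.sender = sender;
        this.coord = initialView.get(0); // assumption: first member is the coordinator
    }

    /** Sends the payload to the current coordinator and remembers it until acked. */
    long forward(String payload) {
        long id = nextId++;
        pending.put(id, payload);
        sender.send(coord, id, payload);
        return id;
    }

    /** The coordinator acknowledged message id: stop tracking it. */
    void onAck(long id) { pending.remove(id); }

    /** If the coordinator changed, resend every pending message to the new one. */
    void onViewChange(List<String> newView) {
        String newCoord = newView.get(0);
        if (newCoord.equals(coord)) return; // same coordinator, nothing to resend
        coord = newCoord;
        for (Map.Entry<Long, String> e : pending.entrySet())
            sender.send(coord, e.getKey(), e.getValue());
    }

    int pendingCount() { return pending.size(); }
}
```

      Using the LON example: if C forwards M while the view is {A,B,C}, then sees views {B,C} and {C} without an ack, M is sent to A, then B, then C itself, which is the extreme case mentioned above. A receiver in such a scheme may see M more than once, so a real implementation would need to handle duplicates; the sketch does not.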

                People

                • Assignee:
                  belaban Bela Ban
                  Reporter:
                  belaban Bela Ban
                • Votes:
                  1
                  Watchers:
                  2
