JGroups / JGRP-1452

SEQUENCER goes wrong when members fail simultaneously

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version/s: 3.1
    • Affects Version/s: 3.0.9
    • None

    Description

      Consider the case where current view is [A, B, C, D], and A and B both die more or less simultaneously.

      C will now try to broadcast the new view [C, D]. But if SEQUENCER is in the stack this goes wrong: SEQUENCER on C doesn't yet know that it is coordinator and tries to forward to either A or B. The change of view gets stuck.

      The problem looks to be in handleSuspect(). It assumes that there is at most one suspect, removes that member from the list of members, and takes whoever is then left as the new coordinator. With A and B both gone, removing only one of them still leaves a dead member as the presumed coordinator, which is the case just described.
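
A minimal sketch of how per-suspect handling can go wrong. This is not the JGroups source: the class and method names are hypothetical, plain strings stand in for addresses, and it assumes each suspicion is evaluated against the unchanged member list with the first remaining member taken as coordinator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the single-suspect logic described above; not actual JGroups code.
public class SingleSuspectSketch {

    /** Removes one suspect from a copy of the member list and returns the first remaining member. */
    public static String coordinatorAfterSuspect(List<String> members, String suspect) {
        List<String> remaining = new ArrayList<>(members);
        remaining.remove(suspect);
        return remaining.isEmpty() ? null : remaining.get(0);
    }

    public static void main(String[] args) {
        List<String> view = List.of("A", "B", "C", "D");
        // Assumption: each suspicion is handled on its own, against the same member list.
        System.out.println("suspect A -> " + coordinatorAfterSuspect(view, "A"));
        System.out.println("suspect B -> " + coordinatorAfterSuspect(view, "B"));
    }
}
```

Under these assumptions, suspecting A selects B and suspecting B selects A, so neither event on its own leads C to conclude that it is the coordinator.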

      IMHO it's a mistake for SEQUENCER to duplicate the work that the GMS layer already does when it computes the new view. I'm currently trying a fix that removes handleSuspect() from SEQUENCER altogether and instead has it pay attention to TMP_VIEW events. So far this seems to be working.
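
The idea of taking the coordinator from the view rather than from suspicions can be sketched roughly as follows. This is an illustration only, not the actual fix: ViewDrivenSketch, onView and isCoordinator are hypothetical names, strings stand in for addresses, and the JGroups event API is not used. It relies on the JGroups convention that the first member of a view is the coordinator.

```java
import java.util.List;

// Hypothetical sketch: the coordinator is derived only from views handed over by the
// membership layer (temporary or final), never from individual suspect events.
public class ViewDrivenSketch {
    private final String localAddress;
    private volatile String coordinator;

    public ViewDrivenSketch(String localAddress) {
        this.localAddress = localAddress;
    }

    /** Called for every view received; an empty view leaves the coordinator unchanged. */
    public void onView(List<String> members) {
        if (members.isEmpty()) {
            return;
        }
        coordinator = members.get(0); // first member of the view is the coordinator
    }

    public boolean isCoordinator() {
        return localAddress.equals(coordinator);
    }

    public static void main(String[] args) {
        ViewDrivenSketch c = new ViewDrivenSketch("C");
        c.onView(List.of("A", "B", "C", "D"));
        System.out.println("old view, C is coordinator: " + c.isCoordinator());
        c.onView(List.of("C", "D")); // A and B failed together; one view covers both
        System.out.println("new view, C is coordinator: " + c.isCoordinator());
    }
}
```

Because the new view already reflects both failures, the number of members that died at once makes no difference to the result.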

      David

          People

            Assignee: Bela Ban (rhn-engineering-bban)
            Reporter: David Hotham (dimbleby, inactive)