JGroups / JGRP-1498

Deadlock in SEQUENCER

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version: 3.2
    • Affects Version: 3.1
    • None

    Description

      I've just seen a deadlock in SEQUENCER, which I think occurs as follows:

      • Start of day. coord=null, ack_mode=true, flushing=false.
      • First view arrives. We call handleViewChange(), and start a Flusher.
      • On the Flusher thread, flush() gets as far as waiting to acquire the send_lock, but doesn't yet have it.
      • Meanwhile on Thread 2, the application tries broadcasting a message. This gets as far as the trace "forwarding my-address::1 to coord null", but does not yet enter forwardToCoord().
      • Now on Thread 3, a second view arrives. handleViewChange() finds that coord_changed is true, and calls stopFlusher(). This sets flushing=false.
      • Now Thread 2 resumes. Since flushing=false and ack_mode=true, forwardToCoord() proceeds and acquires the send_lock.
      • Thread 2 then loops without making progress: forward() always drops the message because coord is null, so the send_lock is never released.
      • So the Flusher thread can never acquire the send_lock, and the Flusher can't exit.
      • And so Thread 3 is stuck too, in stopFlusher().
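
      The end state of this sequence can be reproduced in isolation. What follows is only a minimal sketch, not the SEQUENCER code: the class SendLockStallSketch, the method flusherStuck() and the escape flag are hypothetical names invented for illustration. The sender thread stands in for Thread 2 (holding the lock while retrying against a null coord), the flusher thread stands in for the Flusher, and a bounded join() stands in for stopFlusher() so that the demo terminates instead of hanging. For simplicity the sender takes the lock before the flusher starts, which gives the same end state as the interleaving described above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

public class SendLockStallSketch {

    /** Returns true if the flusher is still blocked on the lock after waitMillis. */
    public static boolean flusherStuck(long waitMillis) {
        ReentrantLock sendLock = new ReentrantLock();
        // Stays null for the whole run, like coord in the report.
        AtomicReference<String> coord = new AtomicReference<>(null);
        // Hypothetical escape hatch so the demo can clean up; the report describes no such exit.
        AtomicBoolean escape = new AtomicBoolean(false);
        CountDownLatch senderHasLock = new CountDownLatch(1);

        // Stand-in for Thread 2: takes the lock, then retries while coord is null.
        Thread sender = new Thread(() -> {
            sendLock.lock();
            try {
                senderHasLock.countDown();
                while (coord.get() == null && !escape.get()) {
                    try {
                        Thread.sleep(5); // each "retry" drops the message and tries again
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            } finally {
                sendLock.unlock();
            }
        }, "sender");

        // Stand-in for the Flusher: it needs the same lock before it can finish.
        Thread flusher = new Thread(() -> {
            sendLock.lock();
            sendLock.unlock();
        }, "flusher");

        try {
            sender.start();
            senderHasLock.await();       // make sure the sender owns the lock first
            flusher.start();

            // Stand-in for stopFlusher(): wait for the flusher, but only for a bounded time.
            flusher.join(waitMillis);
            boolean stuck = flusher.isAlive();

            // Clean up: let the sender leave its loop so both threads can exit.
            escape.set(true);
            sender.join();
            flusher.join();
            return stuck;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flusher stuck: " + flusherStuck(200));
    }
}
```

      With an unbounded join() and no escape flag, the caller would wait forever, which corresponds to Thread 3 being stuck in stopFlusher().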

          People

            rhn-engineering-bban Bela Ban
            dimbleby David Hotham (Inactive)