  1. JGroups
  2. JGRP-1742

BARRIER: minimize closing time

Details

    • Enhancement
    • Resolution: Done
    • Major
    • 3.5
    • None
    • None

    Description

      During a state transfer, BARRIER.up() waits until all incoming threads (delivering messages to the application) are done, and blocks further incoming messages. This is done to get the digest and the state.
      However, during the block, the following messages are not sent up:

      • Views (!)
      • STABLE messages, which trigger retransmissions

      This is bad, so we should try to minimize the time BARRIER is closed. This can be done with JGRP-1352.
      However, we could also do the following:

      • A state request is received
      • Close BARRIER and flush all pending threads. This ensures that any message which updated the digest also updated the application state
      • Get the digest D
      • Open BARRIER. Messages will now be delivered and thus applied to the state
      • Get the application state S
      • When done, return D and S to the state requester
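
The steps above can be sketched in Java as follows. This is a minimal illustration, not the JGroups BARRIER code: the class `StateProviderSketch`, its methods `deliver` and `handleStateRequest`, and the modelling of both digest and state as "highest seqno seen from a single sender" are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch for illustration; this is not the JGroups BARRIER code.
class StateProviderSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private boolean closed;   // true while the barrier is closed
    private int inFlight;     // threads currently delivering a message

    // Simplification: one sender; digest and state are both "highest seqno seen".
    private final AtomicLong digest = new AtomicLong();
    private final AtomicLong state = new AtomicLong();

    /** Delivers a message; blocks at the barrier while it is closed. */
    void deliver(long seqno) {
        lock.lock();
        try {
            while (closed) changed.awaitUninterruptibly();
            inFlight++;
        } finally {
            lock.unlock();
        }
        try {
            digest.accumulateAndGet(seqno, Math::max); // digest is updated first
            state.accumulateAndGet(seqno, Math::max);  // then the application state
        } finally {
            lock.lock();
            try {
                inFlight--;
                changed.signalAll();
            } finally {
                lock.unlock();
            }
        }
    }

    /** Handles a state request and returns {D, S}. */
    long[] handleStateRequest() {
        long d;
        lock.lock();
        try {
            closed = true;                                        // close BARRIER
            while (inFlight > 0) changed.awaitUninterruptibly();  // flush pending threads
            d = digest.get();                                     // get the digest D
            closed = false;                                       // open BARRIER
            changed.signalAll();
        } finally {
            lock.unlock();
        }
        long s = state.get();       // get the state S while messages flow again
        return new long[] {d, s};   // in this model S is never behind D
    }

    public static void main(String[] args) {
        StateProviderSketch p = new StateProviderSketch();
        for (long n = 1; n <= 5; n++) p.deliver(n);
        long[] ds = p.handleStateRequest();
        System.out.println("D=" + ds[0] + ", S=" + ds[1]);
    }
}
```

The barrier is held closed only for the flush and the digest read. Reading the state happens after it is reopened, so a large state no longer keeps views and STABLE messages from being delivered.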

      The difference from JGRP-1352 is that we don't queue messages during state transfer. How does this work? It is critical to ensure that all messages which updated the digest D also updated the state S, or else messages present in D but not in S would never be retransmitted. However, if there are more messages in S than in D, this is not an issue, as they will simply be retransmitted and applied again.
      Example:

      • BARRIER is closed and pending threads are flushed
      • Digest D is (only for a given member P) 5, state S is 5 as well
      • Now we open BARRIER
      • P sends a few more messages (6, 7 and 8)
      • The digest is now 8, but the copy we have is still 5
      • State S is 8
      • We return D=5 and S=8
      • The state requester closes BARRIER and sets its digest to 5 and its state to 8
      • Since the digest is only 5 for P, the state requester asks P for retransmission of messages 6, 7 and 8
      • Messages 6, 7 and 8 from P are received and applied to the state
      • The assumption here is that if messages 6, 7 and 8 are applied twice, the state doesn't change (idempotency). This should be the case with Infinispan.
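
The example can be replayed as a small single-threaded Java simulation. Everything in it is hypothetical: the class `DigestStateExample`, the idempotent `apply` operation (message n stores n under the key "key-n"), and the list standing in for P's retransmission buffer are assumptions chosen to mirror the numbers above, not JGroups or Infinispan APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of the example; all names are illustrative.
class DigestStateExample {
    /** Idempotent update: message n stores n under "key-n". */
    static void apply(Map<String, Long> state, long seqno) {
        state.put("key-" + seqno, seqno);
    }

    /** Returns {D, size of S, requester's final digest, 1 if states match else 0}. */
    static long[] run() {
        List<Long> sentByP = new ArrayList<>();             // stands in for P's retransmission buffer
        Map<String, Long> providerState = new TreeMap<>();
        long providerDigest = 0;

        // Messages 1..5 are delivered, then BARRIER is closed and flushed.
        for (long n = 1; n <= 5; n++) {
            sentByP.add(n);
            apply(providerState, n);
            providerDigest = n;
        }
        long d = providerDigest;                            // copy of the digest: D = 5

        // BARRIER is open again: P sends 6, 7 and 8, which are applied to the state.
        for (long n = 6; n <= 8; n++) {
            sentByP.add(n);
            apply(providerState, n);
            providerDigest = n;
        }
        Map<String, Long> s = new TreeMap<>(providerState); // S already reflects 8

        // The state requester installs D and S, then asks P for everything above D.
        long requesterDigest = d;
        Map<String, Long> requesterState = new TreeMap<>(s);
        for (long n : sentByP) {
            if (n > d) {
                apply(requesterState, n);                   // 6, 7 and 8 applied a second time
                requesterDigest = n;
            }
        }
        boolean same = requesterState.equals(providerState);
        return new long[] {d, s.size(), requesterDigest, same ? 1 : 0};
    }

    public static void main(String[] args) {
        long[] r = run();
        System.out.println("D=" + r[0] + ", |S|=" + r[1]
                + ", final digest=" + r[2] + ", states match=" + (r[3] == 1));
    }
}
```

Here `run()` returns {5, 8, 8, 1}: the requester ends with the same state as the provider even though messages 6, 7 and 8 were applied twice. With a non-idempotent `apply` (for instance, incrementing a counter) the two states would diverge, which is why the idempotency assumption matters.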

      The advantage of this approach over JGRP-1352 is that we don't need to queue messages for a long time if the state is large.

            People

              rhn-engineering-bban Bela Ban