  1. JGroups
  2. JGRP-126

UNICAST drops message if first==false

    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • 2.2.9
    • 2.2.8
    • None

      In UNICAST.up():

      case UnicastHeader.DATA: // received regular message
          handleDataReceived(src, hdr.seqno, hdr.first, msg);
          sendAck(src, hdr.seqno);
          break;

      An ACK is sent unconditionally for every DATA message received, even if the message is discarded rather than added to the received_msgs table. Example:

      Sender S sends unicast messages m1 (first=true) and m2 (first=false) to receiver R. S retransmits m1 and m2 until it receives ACKs for both. Assume the transport drops the initial transmission of m1, so R receives m2 first and then m1 (retransmitted by S). The following code handles unicast messages:

      synchronized(connections) {
          entry=(Entry)connections.get(sender);
          if(entry == null) {
              entry=new Entry();
              connections.put(sender, entry);
          }
      }

      synchronized(entry) {
          if(entry.received_msgs == null) {
              if(first)
                  entry.received_msgs=new AckReceiverWindow(seqno);
              else {
                  if(operational) {
                      if(warn)
                          log.warn(sender + "#" + seqno + " is not tagged as the first message sent by " + sender + "; however, the table for received messages from " + sender + " is null. We probably " + "haven't received the first message from " + sender + ". Discarding message: " + msg.toString() + ", headers (excluding UnicastHeader): " + msg.getHeaders());
                      return;
                  }
              }
          }
      }

      if(entry.received_msgs != null) {
          entry.received_msgs.add(seqno, msg);

      So m2 is received first, but because it is not tagged as first, it is discarded. However, because an ACK is sent for m2 unconditionally, S removes m2 from its retransmission table.
      R then receives m1, which is tagged as first, so it is added to the received_msgs table. S receives the ACK for m1 and removes m1 from its retransmission table as well.
      Now S sends m3, which is the only message in its retransmission table.
      R receives m3 and adds it to its received_msgs table. R has delivered m1, but it cannot deliver m3 because m2 is still missing. S will never retransmit m2, because R sent an ACK for m2 although it never added m2 to its received_msgs table.
      Result: R waits forever for m2, which blocks all subsequent unicast traffic between S and R.
      Severity: this is a critical bug because it can stop all unicast traffic between two members. Fortunately, the chances of it occurring are slim.
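
The m1/m2/m3 sequence described in this report can be replayed with a small standalone model. This is only a sketch: `UnicastAckDemo`, `Receiver`, `transmit` and `replay` are hypothetical names invented for illustration and are not part of JGroups. The receiver window is a plain `TreeMap` standing in for `AckReceiverWindow`, and the sender's retransmission table is a `TreeSet` of unacknowledged seqnos. The `ackOnlyIfAccepted` flag shows one possible remedy (acknowledge only what was actually stored); it is an assumption and not necessarily the change that resolved this issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class UnicastAckDemo {

    /** Hypothetical stand-in for R's per-sender state (not the JGroups Entry class). */
    static final class Receiver {
        private TreeMap<Long, String> window;   // stays null until a first=true message arrives
        private long nextToDeliver;
        final List<String> delivered = new ArrayList<>();

        /** Returns false if the message was discarded, true if it was stored or is a duplicate. */
        boolean receive(long seqno, boolean first, String payload) {
            if (window == null) {
                if (!first)
                    return false;               // same situation as the "Discarding message" branch
                window = new TreeMap<>();
                nextToDeliver = seqno;
            }
            if (seqno >= nextToDeliver)
                window.putIfAbsent(seqno, payload);
            // Deliver strictly in seqno order; stop at the first gap.
            while (window.containsKey(nextToDeliver)) {
                delivered.add(window.remove(nextToDeliver));
                nextToDeliver++;
            }
            return true;
        }
    }

    /** One transmission of seqno from S to R, followed by R's ACK decision. */
    static void transmit(Receiver r, Set<Long> unacked, long seqno, boolean ackOnlyIfAccepted) {
        boolean accepted = r.receive(seqno, seqno == 1, "m" + seqno);
        if (accepted || !ackOnlyIfAccepted)
            unacked.remove(seqno);              // an ACK clears S's retransmission table entry
    }

    /** Replays the scenario and returns what R delivered to the application. */
    public static List<String> replay(boolean ackOnlyIfAccepted) {
        Receiver r = new Receiver();
        Set<Long> unacked = new TreeSet<>();    // S's retransmission table

        unacked.add(1L);                        // m1 is sent, but the transport drops it
        unacked.add(2L);
        transmit(r, unacked, 2, ackOnlyIfAccepted);   // m2 arrives before m1

        for (long seqno : new ArrayList<>(unacked))   // S retransmits whatever is still unacked
            transmit(r, unacked, seqno, ackOnlyIfAccepted);

        unacked.add(3L);                        // S sends m3
        transmit(r, unacked, 3, ackOnlyIfAccepted);

        for (long seqno : new ArrayList<>(unacked))   // further retransmission round
            transmit(r, unacked, seqno, ackOnlyIfAccepted);

        return r.delivered;
    }

    public static void main(String[] args) {
        System.out.println("unconditional ACK:    " + replay(false));
        System.out.println("ACK only if accepted: " + replay(true));
    }
}
```

With the unconditional ACK, the model delivers only m1: m3 sits in the window behind the gap left by m2, and S has nothing left to retransmit. With the conditional ACK, m2 stays in S's retransmission table until R has created its window, so m1, m2 and m3 are all delivered in order.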

            rhn-engineering-bban Bela Ban