JGroups / JGRP-1604

SEQUENCER problems and slowness

    Details

    • Type: Enhancement
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 3.3
    • Labels: None

      Description

      On high volumes of messages (e.g. MPerf), SEQUENCER runs out of memory.

      The culprit is the forward-table, which is unbounded: if messages are added faster than they can be forwarded to the coordinator (sequencer), received back as a multicast, and subsequently removed from the forward-table, memory keeps growing.

      This also makes SEQUENCER slow.

      SOLUTION:

      • Look into bounding the forward-table, e.g. define the max number of bytes allowed in the table at any given time (see the sketch after this list)
      • When a message is to be added, its length is checked (Message.getLength())
        • If the length is 0, or length plus the accumulated bytes is less than the max: add the message, else block
      • A received message (sent by self) causes the entry to be removed from the forward-table and its length to be subtracted from the accumulated bytes (unblocking potentially blocked threads)
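
      A minimal sketch of the proposed byte-bounded table, assuming a plain monitor for the blocking; the class and field names (BoundedForwardTable, max_bytes) are illustrative, not actual JGroups API:

      {code:java}
      import java.util.concurrent.ConcurrentSkipListMap;
      import org.jgroups.Message;

      public class BoundedForwardTable {
          protected final ConcurrentSkipListMap<Long,Message> map=new ConcurrentSkipListMap<>();
          protected final long max_bytes;   // upper bound on accumulated payload bytes
          protected long accumulated_bytes; // guarded by the monitor of this instance

          public BoundedForwardTable(long max_bytes) {this.max_bytes=max_bytes;}

          /** Blocks until the message's payload fits under max_bytes, then adds it */
          public synchronized void add(long seqno, Message msg) throws InterruptedException {
              int len=msg.getLength();
              while(len > 0 && accumulated_bytes + len > max_bytes) // 0-length messages are always admitted
                  wait();
              accumulated_bytes+=len;
              map.put(seqno, msg);
          }

          /** Called when the multicast sent by self is received; unblocks waiting adders */
          public synchronized void remove(long seqno) {
              Message msg=map.remove(seqno);
              if(msg != null) {
                  accumulated_bytes-=msg.getLength();
                  notifyAll();
              }
          }
      }
      {code}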

      We should also look at the delivery table and see if we can make it simpler, e.g. by running a gathering round when a new sequencer is chosen, to determine the highest seqnos of all members.

            Activity

            Bela Ban added a comment -

            The same problem is present in FORWARD_TO_COORD's forward-table.

            Bela Ban added a comment -

            Also check the performance of UnicastTestRpc, which doesn't use multicasts, as it's also slow with SEQUENCER on the stack!

            Bela Ban added a comment -

            The culprit is delivery_table and canDeliver(), which calls size() on the ConcurrentSkipListSet: size() has a cost linear in the number of elements in the set, and it is called on every delivery.

            Suggested SOLUTION:

            1. Make canDeliver() fast, e.g. use a linked hashmap with seqnos as keys for delivery_table (see the sketch below)
            2. (Independent of 1) Make additions to the forward-table blocking, so we can't run out of memory when many threads are adding messages to it. The blocking should be based on the number of bytes in the forward-table (Message.getLength())
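
            A hedged sketch of point 1, using a size-bounded LinkedHashMap so that both the duplicate check and the trimming are O(1); the cap MAX_SIZE and the class name DeliveryTable are assumptions for the example, not the actual SEQUENCER code:

            {code:java}
            import java.util.LinkedHashMap;
            import java.util.Map;

            public class DeliveryTable {
                protected static final int MAX_SIZE=2000; // assumed cap on retained seqnos

                // Insertion-ordered map; the oldest (lowest) seqno is evicted once the cap
                // is exceeded. Unlike ConcurrentSkipListSet, size() here is O(1).
                protected final Map<Long,Boolean> delivered=
                  new LinkedHashMap<Long,Boolean>(16, 0.75f, false) {
                    @Override protected boolean removeEldestEntry(Map.Entry<Long,Boolean> eldest) {
                        return size() > MAX_SIZE;
                    }
                };

                /** Returns true if seqno hasn't been delivered yet, recording it as delivered */
                public synchronized boolean canDeliver(long seqno) {
                    return delivered.putIfAbsent(seqno, Boolean.TRUE) == null;
                }
            }
            {code}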
            Bela Ban added a comment -

            I found out that having the coordinator (sequencer) broadcast its messages directly, while other members had to forward theirs to the coord, was starving the other members' messages: the coordinator always preferred sending its own.
            I changed this so that the coord now forwards messages to itself, giving fairer scheduling; MPerf shows ca. 50MB/sec/node, compared to 20 before!
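
            An illustrative sketch of this change, under assumed method names (send, forwardToCoord) rather than the actual SEQUENCER code:

            {code:java}
            import org.jgroups.Message;

            public abstract class FairSequencerSketch {
                protected boolean is_coord;

                /** All senders, the coordinator included, use the same forward path */
                public void send(Message msg) {
                    // Before the fix, the coordinator broadcast its own messages here
                    // directly, letting its traffic starve messages forwarded by the
                    // other members.
                    forwardToCoord(msg); // after the fix: the coord forwards to itself, too
                }

                /** The coordinator assigns a seqno and multicasts; own and forwarded
                 *  messages now compete on equal terms in the forward queue */
                protected abstract void forwardToCoord(Message msg);
            }
            {code}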

            Bela Ban added a comment -

            We don't need the blocking on the forward-table; memory consumption always stayed OK with 10 sender threads. It can be added later if necessary. However, the change to add messages to the forward-table as Message objects rather than as serialized byte[] buffers probably helped more, as we now store a message only once (the same message is referenced by both the forward-table and UNICAST(2,3)).
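
            A hedged illustration of that change; the ForwardTable class below is an assumption for the example, not the actual code:

            {code:java}
            import java.util.concurrent.ConcurrentSkipListMap;
            import org.jgroups.Message;

            public class ForwardTable {
                // Before: a Map<Long,byte[]> meant serializing every message into a buffer,
                // duplicating the payload already held by UNICAST(2,3)'s retransmission table.
                // After: storing the Message reference itself avoids the serialization and the copy.
                protected final ConcurrentSkipListMap<Long,Message> forward_table=new ConcurrentSkipListMap<>();

                public void add(long seqno, Message msg) {forward_table.put(seqno, msg);}

                public Message remove(long seqno)        {return forward_table.remove(seqno);}
            }
            {code}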


              People

              • Assignee: Bela Ban
              • Reporter: Bela Ban
              • Votes: 0
              • Watchers: 1
