  1. JGroups
  2. JGRP-1817

OverlappingMergeTest testSameCreatorDifferentIDs fails to create correct merged view

Details

    • Bug
    • Resolution: Won't Do
    • Major
    • 3.2.14
    • 3.2.13
    • None

    Description

      This test does the following:

      • creates three channels A, B and C
      • injects views
        A: {A|5 A}, B: {A|6 A,B}, C: {A|7 A,B,C}
      • calls MERGE.sendMergeSolicitation() on channel A to simulate a run of the periodic task MERGE.findSubgroupsTask, which should find the views of all reachable members, check whether any of them differ and, if they do, prepare and send a MERGE event up to GMS
      • checks that all channels have the final view of size 3
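
      The check described in the third step can be sketched as follows. This is only an illustration of the logic, not the MERGE2 code: the class MergeCheckSketch and its methods are hypothetical, and views are reduced to plain view-id strings keyed by member name.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names exist in JGroups.
public class MergeCheckSketch {

    /** Collects the distinct view ids reported in the discovery responses. */
    public static Set<String> distinctViewIds(Map<String, String> viewIdByMember) {
        return new LinkedHashSet<>(viewIdByMember.values());
    }

    /** A merge is needed when the responders report more than one view id. */
    public static boolean mergeNeeded(Map<String, String> viewIdByMember) {
        return distinctViewIds(viewIdByMember).size() > 1;
    }

    /** Returns the expected members that sent no discovery response. */
    public static Set<String> missingResponders(List<String> expected,
                                                Map<String, String> viewIdByMember) {
        Set<String> missing = new LinkedHashSet<>(expected);
        missing.removeAll(viewIdByMember.keySet());
        return missing;
    }

    public static void main(String[] args) {
        // Views injected by the test: A has [A|5], B has [A|6], C has [A|7].
        Map<String, String> responses = new LinkedHashMap<>();
        responses.put("A", "[A|5]");
        responses.put("B", "[A|6]");
        responses.put("C", "[A|7]");

        System.out.println("distinct view ids: " + distinctViewIds(responses));
        System.out.println("merge needed: " + mergeNeeded(responses));
        System.out.println("missing responders: "
                + missingResponders(List.of("A", "B", "C"), responses));
    }
}
```

      Removing the entry for C from the map reproduces the situation in the failing run: a merge is still triggered because [A|5] and [A|6] differ, but C is reported as a missing responder and so would not be among the merge participants.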

      The test fails intermittently but frequently on RHEL, with the same failure each time:

      -------------------------------------------------------------------
      GMS: address=A, cluster=OverlappingMergeTest, physical address=10.16.95.7:27215
      -------------------------------------------------------------------
      -------------------------------------------------------------------
      GMS: address=B, cluster=OverlappingMergeTest, physical address=10.16.95.7:27216
      -------------------------------------------------------------------
      -------------------------------------------------------------------
      GMS: address=C, cluster=OverlappingMergeTest, physical address=10.16.95.7:27217
      -------------------------------------------------------------------
      
      ------------- testSameCreatorDifferentIDs -----------
      [A] view=[A|5] [A]
      [B] view=[A|6] [A, B]
      [C] view=[A|7] [A, B, C]
      
      A's view: [A|5] [A]
      B's view: [A|6] [A, B]
      C's view: [A|7] [A, B, C]
      Enabling TRACE debugging for GMS, MERGE2 and Discovery
      
      ==== triggering merge solicitation ====:
      212534 [TRACE] TCPPING: - A: sending discovery request to 10.16.95.7:27216
      212537 [TRACE] TCPPING: - A: sending discovery request to 10.16.95.7:27218
      212538 [TRACE] TCPPING: - A: sending discovery request to 10.16.95.7:27217
      215538 [TRACE] TCPPING: - A: discovery took 3004 ms: responses: 1 total (1 servers (0 coord), 0 clients)
      215539 [TRACE] MERGE2: - Discovery results:
      [B]: view_id=[A|6] ([A|6] [A, B])
      [A]: view_id=[A|5] ([A|5] [A])
      215539 [DEBUG] MERGE2: - A found different views : [A|5], [A|6]; sending up MERGE event with merge participants [B, A].
      Discovery results:
      [B]: coord=A
      [A]: coord=A
      
      ==== checking views after merge ====:
      ....................Disabling TRACE debugging for GMS, MERGE2 and Discovery
      
      A's view: [A|7] [A, B]
      B's view: [A|7] [A, B]
      C's view: [A|7] [A, B, C]
      

      Whenever this test fails, it is the discovery phase that fails to find the complete set of views: instead of finding views for channels A, B and C, it finds views only for channels A and B.

      Also, the discovery requests are sent to host:port combinations which are offset by 1. For example, in the case above, the host:port combinations of the channels are 10.16.95.7:27215, 10.16.95.7:27216, and 10.16.95.7:27217, but the pings go out to 10.16.95.7:27216, 10.16.95.7:27217, and 10.16.95.7:27218. Not sure if this is significant, as it still covers the channels B and C.
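
      The port comparison above can be checked mechanically. The following is a small sketch using only the port numbers from the log; the class PingTargetSketch and its methods are hypothetical helpers, not part of JGroups or the test.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: compares the ports bound by the channels
// with the ports that received discovery requests in the log above.
public class PingTargetSketch {

    /** Ports bound by a channel that received no discovery request. */
    public static Set<Integer> notPinged(Set<Integer> channelPorts, Set<Integer> pingedPorts) {
        Set<Integer> result = new TreeSet<>(channelPorts);
        result.removeAll(pingedPorts);
        return result;
    }

    /** Ports that received a discovery request but are not bound by any channel. */
    public static Set<Integer> pingedButUnbound(Set<Integer> channelPorts, Set<Integer> pingedPorts) {
        Set<Integer> result = new TreeSet<>(pingedPorts);
        result.removeAll(channelPorts);
        return result;
    }

    public static void main(String[] args) {
        // Physical addresses of A, B and C, taken from the GMS banners in the log.
        Set<Integer> channelPorts = Set.of(27215, 27216, 27217);
        // Targets of the TCPPING discovery requests sent by A.
        Set<Integer> pingedPorts = Set.of(27216, 27218, 27217);

        System.out.println("bound but not pinged: " + notPinged(channelPorts, pingedPorts));
        System.out.println("pinged but unbound: " + pingedButUnbound(channelPorts, pingedPorts));
    }
}
```

      With the values from this run, the only bound port that is not pinged is 27215, which belongs to A, the sender of the requests, and the only pinged port that no channel is bound to is 27218. B and C are both among the targets, so the offset alone does not account for the missing response from C.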

            People

              rhn-engineering-bban Bela Ban
              rachmato@redhat.com Richard Achmatowicz