Details

    • Type: Bug
    • Status: Resolved (View Workflow)
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 4.0.11, 3.6.18
    • Labels:
      None
    • Steps to Reproduce:
      Hide

      Run the specified tests a lot of times until the test fails.

      Show
      Run the specified tests a lot of times until the test fails.

      Description

      As discussed in the mailing list we have issues where locks from the central lock protocol stay locked forever when the coordinator of the cluster disconnects. We can reproduce this with the attached ClusterSplitLockTest.java. Its a race condition and we need to run the test a lot of times (sometimes > 20) before we encounter a failure.

      What we think is happening:

      In a three node cluster (node A, B and C where node A is the coordinator) unlock requests from B and/or C can be missed when node A leaves and B and/or C don't have the new view installed yet. When, for example, node B takes over coordination it creates the lock table based on the back-ups. Lets say node C has locked the lock with name 'lockX'. Node C performs an unlock of 'lockX' just after node A (gracefully) leaves and sends the unlock request to node A since node C doesn't have the correct view installed yet. Node B has recreated the lock table where 'lockX' is locked by Node C. Node C doesn't resend the unlock request so 'lockX' gets locked forever.

      Attached is the testng test we wrote and the output of a test failure.

        Gliffy Diagrams

          Attachments

            Issue Links

              Activity

                People

                • Assignee:
                  belaban Bela Ban
                  Reporter:
                  bramklg Bram Klein Gunnewiek
                • Votes:
                  1 Vote for this issue
                  Watchers:
                  7 Start watching this issue

                  Dates

                  • Created:
                    Updated:
                    Resolved: