  1. Cloud Enablement
  2. CLOUD-2261

[EAP][XA][Recovery][NFS] split lock is broken after a minute of network partition

    Description

    When a network partition separates an NFS persistent volume from a particular EAP pod, the split lock becomes available, so the migration pod starts an EAP recovery on that split. If the partition ends at that moment, multiple EAP processes may end up writing to the same split directory.

    Finished Migration Check cycle, pausing for 30 seconds before resuming
    Finished Migration Check cycle, pausing for 30 seconds before resuming
    Process has terminated abnorminally, forcing a termination check
    Attempting to migrate directory: (/opt/eap/standalone/partitioned_data/split-1)
    Waiting for grace period to expire, remaining timeout is 29 seconds
    Attempting to obtain lock for directory: (/opt/eap/standalone/partitioned_data/split-1)
    Successfully locked directory: (/opt/eap/standalone/partitioned_data/split-1)
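
    The race can be illustrated with a toy model. This is only a sketch under stated assumptions: it treats the split lock as a lease that lapses when the holder cannot renew it, and the 60-second lease is a guess taken from the "after a minute" in the summary. The names LeaseLock and simulate are hypothetical and do not come from the EAP image or the migration script.

```python
class LeaseLock:
    """Toy lease-based lock (assumption: the real split lock behaves like a lease)."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.holder = None
        self.expires_at = 0.0

    def acquire(self, who, now):
        # The lock is free if nobody holds it or the holder's lease has lapsed.
        if self.holder is None or now >= self.expires_at:
            self.holder = who
            self.expires_at = now + self.lease_seconds
            return True
        return False


def simulate(partition_seconds, lease_seconds=60):
    """Return the set of processes that believe they may write to split-1."""
    lock = LeaseLock(lease_seconds)
    writers = set()

    # The EAP pod takes the lock at t=0 and starts using the split directory.
    if lock.acquire("eap-pod", now=0):
        writers.add("eap-pod")

    # During the partition the EAP pod cannot renew its lease.
    # The migration pod then tries to lock the same split.
    if lock.acquire("migration-pod", now=partition_seconds):
        writers.add("migration-pod")

    # The partition heals. The EAP pod never re-checked the lock,
    # so it stays in the set and keeps writing.
    return writers


if __name__ == "__main__":
    print(sorted(simulate(30)))
    print(sorted(simulate(61)))
```

    In this model a partition shorter than the lease leaves a single writer, while a longer one leaves both the EAP pod and the migration pod believing they own the split, which matches the situation described above.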
    

        Attachments

        1. Race.png
          25 kB
          Thomas Jenkinson

              People

              • Assignee:
                ochaloup Ondrej Chaloupka
                Reporter:
                maschmid Marek Schmidt
                Tester:
                Tomas Remes