RHEL-27219: resource move simulation doesn't match reality

    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Normal
    • Version: rhel-9.4
    • Component: pacemaker
    • Labels: sst_high_availability, ssg_filesystems_storage_and_HA

      spetros@redhat.com asked me to help him with failures coming from 'pcs resource move' in RHEL 9. The command moves a resource within a cluster without leaving a constraint behind. It achieves that by creating a constraint to move the resource, waiting for pacemaker to move it, and then removing the constraint. To make sure the resource will not move back right after that, pcs runs crm_simulate first.
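For illustration, the location constraint that 'crm_resource --move' injects into the CIB can be built like this. This is a minimal Python sketch, not pcs or pacemaker code; the helper name is made up, while the attribute values and the cli-ban-<resource>-on-<node> id scheme match the diff shown later in this report.

```python
import xml.etree.ElementTree as ET

def ban_constraint(resource: str, node: str) -> ET.Element:
    """Build the 'cli-ban' location constraint that 'crm_resource --move'
    adds to the CIB (illustrative helper, not part of pcs or pacemaker)."""
    return ET.Element("rsc_location", {
        "id": f"cli-ban-{resource}-on-{node}",
        "rsc": resource,
        "role": "Started",
        "node": node,
        "score": "-INFINITY",  # ban the resource from its current node
    })

constraint = ban_constraint("virtualip", "rh93-node1")
print(ET.tostring(constraint).decode())
```

Removing this single element again (what 'crm_resource --clear' does) is what the second diff below expresses.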

      This is the simulation procedure in detail:

      • Get two copies of the current CIB (let's call them cib-original and cib-move) and save them to files.
      • Apply 'crm_resource --move --resource virtualip' to one copy, cib-move. This modifies cib-move.
      • Run crm_diff to get a diff between cib-original and cib-move. This diff adds a move constraint and looks like this:
      <diff format="2">
        <change operation="create"
            path="/cib/configuration/constraints" position="2"
        >
          <rsc_location id="cli-ban-virtualip-on-rh93-node1"
           rsc="virtualip" role="Started" node="rh93-node1" score="-INFINITY"
          />
        </change>
      </diff>
      • Copy cib-move to cib-clear.
      • Apply 'crm_resource --clear --resource virtualip' to cib-clear. This modifies cib-clear.
      • Run crm_diff to get a diff between cib-move and cib-clear. This diff removes the move constraint and looks like this:
      <diff format="2">
        <change operation="delete"
            path="/cib/configuration/constraints/rsc_location[@id=&apos;cli-ban-virtualip-on-rh93-node1&apos;]"
        />
      </diff>
      • Simulate the effect of applying cib-move. This is done by running 'crm_simulate --simulate --save-output /tmp/tmpaq7k3px9.pcs --save-graph /tmp/tmpitczauet.pcs --xml-pipe' and writing cib-move to stdin of crm_simulate. The simulation produces a new CIB; let's call it cib-after-move.
      • Verify that the resource actually moved.
      • Apply the diff removing the constraint to cib-after-move and simulate applying the resulting CIB by means of crm_simulate.
      • Verify that the resource didn't move back.
      • If everything was ok, move the resource for real.
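The crm_diff output quoted above uses pacemaker's format=2 XML diff schema. As a concrete illustration of its shape, here is a small self-contained Python sketch (not pcs code) that parses such a diff and lists its changes; the DIFF string is the first diff from this report, and the function name is made up.

```python
import xml.etree.ElementTree as ET

# A format=2 diff as produced by crm_diff (taken verbatim from this report):
# a single change creating the cli-ban location constraint.
DIFF = """\
<diff format="2">
  <change operation="create"
      path="/cib/configuration/constraints" position="2">
    <rsc_location id="cli-ban-virtualip-on-rh93-node1"
        rsc="virtualip" role="Started" node="rh93-node1" score="-INFINITY"/>
  </change>
</diff>
"""

def list_changes(diff_xml: str):
    """Return (operation, path) pairs for each <change> in a format=2 diff."""
    root = ET.fromstring(diff_xml)
    return [(c.get("operation"), c.get("path")) for c in root.findall("change")]

print(list_changes(DIFF))
# [('create', '/cib/configuration/constraints')]
```

The delete diff shown above has the same structure, with operation="delete" and an XPath-style path identifying the constraint to remove.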

      Now to the problem Sergei ran into. In his scenario, the simulation ends with a failure, as pacemaker says the resource would move back after removing the move constraint. However, when the move is done manually for real ('crm_resource --move --resource virtualip', wait for it to move, 'crm_resource --clear --resource virtualip'), the resource stays on the node it moved to.

      The cluster configuration is:

      • 3-node cluster with SBD enabled (I don't think SBD itself makes a difference here, but it enables fencing without a stonith resource, and the lack of stonith resources may make a difference).
      • One vIP resource: pcs resource create virtualip ocf:pacemaker:Dummy
      • One mssql clone resource: pcs resource create ag_cluster ocf:pacemaker:Stateful promotable notify=true
      • a colocation: pcs constraint colocation add Started virtualip with Promoted ag_cluster-clone INFINITY
      • an ordering: pcs constraint order promote ag_cluster-clone then start virtualip
      • no cluster properties except the defaults generated automatically by pacemaker
      • resource stickiness not set, but setting it to the default value of 1 makes no difference

      This is the output of a simulation of adding the move constraint:

      Current cluster status:
        * Node List:
          * Online: [ rh93-node1 rh93-node2 rh93-node3 ]
      
        * Full List of Resources:
          * virtualip (ocf:pacemaker:Dummy):   Started rh93-node1
          * Clone Set: ag_cluster-clone [ag_cluster] (promotable):
            * Promoted: [ rh93-node1 ]
            * Unpromoted: [ rh93-node2 rh93-node3 ]
      
      Transition Summary:
        * Move       virtualip        (          rh93-node1 -> rh93-node3 )
        * Demote     ag_cluster:0     ( Promoted -> Unpromoted rh93-node1 )
        * Promote    ag_cluster:1     ( Unpromoted -> Promoted rh93-node3 )
      
      Executing Cluster Transition:
        * Resource action: virtualip       stop on rh93-node1
        * Resource action: ag_cluster      cancel=10000 on rh93-node1
        * Resource action: ag_cluster      cancel=11000 on rh93-node3
        * Pseudo action:   ag_cluster-clone_pre_notify_demote_0
        * Resource action: ag_cluster      notify on rh93-node1
        * Resource action: ag_cluster      notify on rh93-node3
        * Resource action: ag_cluster      notify on rh93-node2
        * Pseudo action:   ag_cluster-clone_confirmed-pre_notify_demote_0
        * Pseudo action:   ag_cluster-clone_demote_0
        * Resource action: ag_cluster      demote on rh93-node1
        * Pseudo action:   ag_cluster-clone_demoted_0
        * Pseudo action:   ag_cluster-clone_post_notify_demoted_0
        * Resource action: ag_cluster      notify on rh93-node1
        * Resource action: ag_cluster      notify on rh93-node3
        * Resource action: ag_cluster      notify on rh93-node2
        * Pseudo action:   ag_cluster-clone_confirmed-post_notify_demoted_0
        * Pseudo action:   ag_cluster-clone_pre_notify_promote_0
        * Resource action: ag_cluster      notify on rh93-node1
        * Resource action: ag_cluster      notify on rh93-node3
        * Resource action: ag_cluster      notify on rh93-node2
        * Pseudo action:   ag_cluster-clone_confirmed-pre_notify_promote_0
        * Pseudo action:   ag_cluster-clone_promote_0
        * Resource action: ag_cluster      promote on rh93-node3
        * Pseudo action:   ag_cluster-clone_promoted_0
        * Pseudo action:   ag_cluster-clone_post_notify_promoted_0
        * Resource action: ag_cluster      notify on rh93-node1
        * Resource action: ag_cluster      notify on rh93-node3
        * Resource action: ag_cluster      notify on rh93-node2
        * Pseudo action:   ag_cluster-clone_confirmed-post_notify_promoted_0
        * Resource action: virtualip       start on rh93-node3
        * Resource action: ag_cluster      monitor=11000 on rh93-node1
        * Resource action: ag_cluster      monitor=10000 on rh93-node3
        * Resource action: virtualip       monitor=10000 on rh93-node3
      
      Revised Cluster Status:
        * Node List:
          * Online: [ rh93-node1 rh93-node2 rh93-node3 ]
      
        * Full List of Resources:
          * virtualip (ocf:pacemaker:Dummy):   Started rh93-node3
          * Clone Set: ag_cluster-clone [ag_cluster] (promotable):
            * Promoted: [ rh93-node3 ]
            * Unpromoted: [ rh93-node1 rh93-node2 ]

      As you can see, virtualip would move, which is OK.

      However, after removing the move constraint, the simulation suggests the resource would move back:

      Current cluster status:
        * Node List:
          * Online: [ rh93-node1 rh93-node2 rh93-node3 ]
      
        * Full List of Resources:
          * virtualip (ocf:pacemaker:Dummy):   Started rh93-node3
          * Clone Set: ag_cluster-clone [ag_cluster] (promotable):
            * Promoted: [ rh93-node3 ]
            * Unpromoted: [ rh93-node1 rh93-node2 ]
      
      Transition Summary:
        * Move       virtualip        (          rh93-node3 -> rh93-node1 )
        * Promote    ag_cluster:0     ( Unpromoted -> Promoted rh93-node1 )
        * Demote     ag_cluster:1     ( Promoted -> Unpromoted rh93-node3 )
      
      Executing Cluster Transition:
        * Resource action: virtualip       stop on rh93-node3
        * Resource action: ag_cluster      cancel=11000 on rh93-node1
        * Resource action: ag_cluster      cancel=10000 on rh93-node3
        * Pseudo action:   ag_cluster-clone_pre_notify_demote_0
        * Resource action: ag_cluster      notify on rh93-node1
        * Resource action: ag_cluster      notify on rh93-node3
        * Resource action: ag_cluster      notify on rh93-node2
        * Pseudo action:   ag_cluster-clone_confirmed-pre_notify_demote_0
        * Pseudo action:   ag_cluster-clone_demote_0
        * Resource action: ag_cluster      demote on rh93-node3
        * Pseudo action:   ag_cluster-clone_demoted_0
        * Pseudo action:   ag_cluster-clone_post_notify_demoted_0
        * Resource action: ag_cluster      notify on rh93-node1
        * Resource action: ag_cluster      notify on rh93-node3
        * Resource action: ag_cluster      notify on rh93-node2
        * Pseudo action:   ag_cluster-clone_confirmed-post_notify_demoted_0
        * Pseudo action:   ag_cluster-clone_pre_notify_promote_0
        * Resource action: ag_cluster      notify on rh93-node1
        * Resource action: ag_cluster      notify on rh93-node3
        * Resource action: ag_cluster      notify on rh93-node2
        * Pseudo action:   ag_cluster-clone_confirmed-pre_notify_promote_0
        * Pseudo action:   ag_cluster-clone_promote_0
        * Resource action: ag_cluster      promote on rh93-node1
        * Pseudo action:   ag_cluster-clone_promoted_0
        * Pseudo action:   ag_cluster-clone_post_notify_promoted_0
        * Resource action: ag_cluster      notify on rh93-node1
        * Resource action: ag_cluster      notify on rh93-node3
        * Resource action: ag_cluster      notify on rh93-node2
        * Pseudo action:   ag_cluster-clone_confirmed-post_notify_promoted_0
        * Resource action: virtualip       start on rh93-node1
        * Resource action: ag_cluster      monitor=10000 on rh93-node1
        * Resource action: ag_cluster      monitor=11000 on rh93-node3
        * Resource action: virtualip       monitor=10000 on rh93-node1
      
      Revised Cluster Status:
        * Node List:
          * Online: [ rh93-node1 rh93-node2 rh93-node3 ]
      
        * Full List of Resources:
          * virtualip (ocf:pacemaker:Dummy):   Started rh93-node1
          * Clone Set: ag_cluster-clone [ag_cluster] (promotable):
            * Promoted: [ rh93-node1 ]
            * Unpromoted: [ rh93-node2 rh93-node3 ]


      Now the question is: in the simulation, the moved resource moves back to its original node after the move constraint is removed, while in reality it stays on its new node. How does that happen? And can anything be done about it to make the simulation and reality match?

      This is easily and reliably reproducible with the configuration described above on any RHEL 9. Running 'pcs resource move' with --debug gives you all the CIBs and simulation outputs. For the purposes of writing this mail, I used pcs-0.11.7-1.el9.x86_64 and pacemaker-2.1.7-4.el9.x86_64.

            Kenneth Gaillot (kgaillot@redhat.com)
            Tomas Jelinek (tojeline@redhat.com)
            Cluster QE