Infinispan / ISPN-3140

JMX operation to suppress state transfer

      This feature request is to expose a JMX operation on each node that suppresses state transfer for a period of time. The corresponding flag would be false by default (i.e. state transfer stays enabled unless suppressed).

      The use case of this flag would be to ease bringing down (and up) a cluster for maintenance work. A typical workflow would be:

      1) Shut down application requests to the data grid
      2) Suppress state transfer on all nodes via JMX
      3) Bring down all nodes
      4) Perform maintenance work
      5) Bring up nodes, one at a time. As each node comes up, disable state transfer for the node via JMX.
      6) Once all nodes are up, enable state transfer for each node again via JMX
      7) Allow application requests to reach the grid again.

      The purpose of this is to allow a smooth and fast shutdown and startup, and to remove the risk of OOM errors when bringing a grid down.

      This is a small but useful subset of full manual state transfer as defined in ISPN-1394.
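
      As a rough sketch, suppressing state transfer from a standard JMX client might look like the following. The ObjectName, component, and attribute name ("RebalancingEnabled" on a LocalTopologyManager component) are assumptions for illustration; the names that finally ship may differ.

      import javax.management.Attribute;
      import javax.management.MBeanServerConnection;
      import javax.management.ObjectName;
      import javax.management.remote.JMXConnector;
      import javax.management.remote.JMXConnectorFactory;
      import javax.management.remote.JMXServiceURL;

      public class SuppressStateTransfer {
          public static void main(String[] args) throws Exception {
              // Host and port are placeholders for a real node.
              JMXServiceURL url = new JMXServiceURL(
                      "service:jmx:rmi:///jndi/rmi://node1:9999/jmxrmi");
              try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                  MBeanServerConnection mbs = connector.getMBeanServerConnection();

                  // Assumed ObjectName; not confirmed by this issue.
                  ObjectName topology = new ObjectName(
                          "org.infinispan:type=CacheManager,name=\"DefaultCacheManager\","
                                  + "component=LocalTopologyManager");

                  // Step 2 of the workflow above: suppress state transfer on this
                  // node before bringing the cluster down.
                  mbs.setAttribute(topology,
                          new Attribute("RebalancingEnabled", Boolean.FALSE));
              }
          }
      }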

            RH Bugzilla Integration added a comment - Anna Manukyan <amanukya@redhat.com> changed the Status of bug 974402 from ON_QA to VERIFIED

            RH Bugzilla Integration added a comment - Anna Manukyan <amanukya@redhat.com> made a comment on bug 974402 Verified! Thanks a lot.

            RH Bugzilla Integration added a comment - Tristan Tarrant <ttarrant@redhat.com> changed the Status of bug 974402 from MODIFIED to ON_QA

            RH Bugzilla Integration added a comment - Tristan Tarrant <ttarrant@redhat.com> changed the Status of bug 974402 from ASSIGNED to MODIFIED

            RH Bugzilla Integration added a comment - Anna Manukyan <amanukya@redhat.com> changed the Status of bug 974402 from ON_QA to ASSIGNED

            RH Bugzilla Integration added a comment - Anna Manukyan <amanukya@redhat.com> made a comment on bug 974402 Tested for ER1 and the issue described above still appears.

            RH Bugzilla Integration added a comment - Tristan Tarrant <ttarrant@redhat.com> changed the Status of bug 974402 from MODIFIED to ON_QA

            RH Bugzilla Integration added a comment - Tristan Tarrant <ttarrant@redhat.com> changed the Status of bug 974402 from ASSIGNED to MODIFIED

            RH Bugzilla Integration added a comment - Anna Manukyan <amanukya@redhat.com> changed the Status of bug 974402 from ON_QA to ASSIGNED

            RH Bugzilla Integration added a comment - Tristan Tarrant <ttarrant@redhat.com> changed the Status of bug 974402 from MODIFIED to ON_QA

            RH Bugzilla Integration added a comment - Tristan Tarrant <ttarrant@redhat.com> changed the Status of bug 974402 from ASSIGNED to MODIFIED

            RH Bugzilla Integration added a comment - Tristan Tarrant <ttarrant@redhat.com> made a comment on bug 974402 The HotRod topology cache MUST not be configured by hand, but only by using the <topology-state-transfer> configuration element. ISPN-3373 adds support for this.

            RH Bugzilla Integration added a comment - Tristan Tarrant <ttarrant@redhat.com> made a comment on bug 974402 Anna, disabling

            RH Bugzilla Integration added a comment - Anna Manukyan <amanukya@redhat.com> changed the Status of bug 974402 from ON_QA to ASSIGNED

            RH Bugzilla Integration added a comment - Michal Linhard <mlinhard@redhat.com> made a comment on bug 974402 Anna, can you please verify this? (You did this for the patch.)

            RH Bugzilla Integration added a comment - Tristan Tarrant <ttarrant@redhat.com> changed the Status of bug 974402 from MODIFIED to ON_QA

            RH Bugzilla Integration added a comment - Tristan Tarrant <ttarrant@redhat.com> changed the Status of bug 974402 from NEW to MODIFIED

            RH Bugzilla Integration added a comment - Tristan Tarrant <ttarrant@redhat.com> made a comment on bug 974402 Resolved upstream

            Dan Berindei (Inactive) added a comment - We should broadcast (to all the members) the suspend request so that when the coordinator dies, the new coordinator would pick it up and it wouldn't start the rebalance.
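
            A minimal sketch of that idea, with hypothetical names (Transport, SuspendRebalanceCommand, RebalancePolicy are illustrative, not Infinispan API): the suspend request is broadcast so every member holds a copy of the flag, and a newly elected coordinator checks its own copy before starting a rebalance.

            // All names are hypothetical; sketch of broadcasting the suspend flag.
            interface Transport {
                void broadcast(SuspendRebalanceCommand cmd) throws Exception;
            }

            record SuspendRebalanceCommand(boolean suspended) {}

            class RebalancePolicy {
                private volatile boolean rebalancingSuspended;

                // Invoked via JMX on any node: broadcast rather than set locally.
                void suspendRebalancing(Transport transport) throws Exception {
                    transport.broadcast(new SuspendRebalanceCommand(true));
                }

                // Every member (coordinator or not) applies the command.
                void onSuspendRebalanceCommand(SuspendRebalanceCommand cmd) {
                    rebalancingSuspended = cmd.suspended();
                }

                // A new coordinator consults its own copy, so the flag survives failover.
                boolean shouldStartRebalance() {
                    return !rebalancingSuspended;
                }
            }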

            Adrian Nistor (Inactive) added a comment - Integrated in master. Thanks!

            Tristan Tarrant added a comment - This needs to have a "Server" counterpart, so that it can be exposed via the Server RHQ plugin (which doesn't use JMX, but DMR)
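
            For the DMR side, a hedged sketch of writing such an attribute through the native management API follows; the management port, resource address, and attribute name are guesses for illustration, since the issue does not specify them.

            import org.jboss.as.controller.client.ModelControllerClient;
            import org.jboss.dmr.ModelNode;

            public class SuppressViaDmr {
                public static void main(String[] args) throws Exception {
                    // Host and management port are placeholders.
                    try (ModelControllerClient client =
                                 ModelControllerClient.Factory.create("localhost", 9999)) {
                        ModelNode op = new ModelNode();
                        op.get("operation").set("write-attribute");
                        // Assumed resource address and attribute name.
                        ModelNode address = op.get("address");
                        address.add("subsystem", "infinispan");
                        address.add("cache-container", "clustered");
                        op.get("name").set("rebalancing");
                        op.get("value").set(false);

                        ModelNode result = client.execute(op);
                        System.out.println(result.get("outcome").asString());
                    }
                }
            }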

            Adrian Nistor (Inactive) added a comment - As pointed out by Dennis Reed ( http://markmail.org/message/al7elzaqme5jri22 ) it makes sense to forward the jmx operation from any member to the coordinator to make it easier for the user.
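
            One way that forwarding could look, sketched with hypothetical names (Address, ClusterTransport, SetRebalancingCommand are illustrative): whichever node receives the JMX call either applies it locally (if it is the coordinator) or relays it to the coordinator.

            interface Address {}

            record SetRebalancingCommand(boolean enabled) {}

            interface ClusterTransport {
                boolean isCoordinator();
                Address getCoordinator();
                void sendTo(Address target, SetRebalancingCommand cmd) throws Exception;
            }

            class TopologyManagement {
                private final ClusterTransport transport;

                TopologyManagement(ClusterTransport transport) {
                    this.transport = transport;
                }

                // JMX entry point, valid on any node.
                void setRebalancingEnabled(boolean enabled) throws Exception {
                    if (transport.isCoordinator()) {
                        applyOnCoordinator(enabled);
                    } else {
                        // Relay to the coordinator so the admin does not
                        // have to locate it first.
                        transport.sendTo(transport.getCoordinator(),
                                new SetRebalancingCommand(enabled));
                    }
                }

                private void applyOnCoordinator(boolean enabled) {
                    // Update the ClusterTopologyManager state here.
                }
            }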

            Dan Berindei (Inactive) added a comment (edited)

            Pasted from Adrian's message (http://markmail.org/message/ns7aojy7v7su2t7p):

            1. Add a JMX-writable attribute (or operation?) to ClusterTopologyManager (name it suppressRehashing?) that is false by default but should also be configurable via API or XML. While this attribute is true, the ClusterTopologyManager queues all join/leave/exclude (see below) requests and does not execute them on the spot as would normally happen. [...] When it is set back to false, all queued operations (except the ones that cancel each other out) are executed. The setter should be synchronous, so that setting it back to false does not return until the queue is empty and all rehashing has been processed.

            2. We add a JMX operation excludeNodes(list of addresses) to ClusterTopologyManager. [...] This operation removes the node from the topology (almost as if it left) and forces a rebalance. The node is still present in the current CH but not in the pending CH; it basically disowns all of its data, which is then transferred to other (non-excluded) nodes. At the end of the rebalance the node is removed from the topology for good and can be shut down without losing data. Note that if suppressRehashing==true, excludeNodes(..) just queues the nodes for later removal. We can batch multiple such exclusions and then re-activate the rehashing.

            The parts that need to be implemented are written in italic above. Everything else is already there.

            excludeNodes is a way of achieving a soft shutdown and should be used only if we care about preserving data in the extreme case where the nodes are the last/single owners. We can just kill the node directly if we do not care about its data.

            suppressRehashing is a way of achieving some kind of batching of topology changes. This should speed up state transfer a lot because it avoids a lot of pointless reshuffling of data segments when we have many successive joiners/leavers.

            So what happens if the current coordinator dies for whatever reason? The new one will take control without any knowledge of the existing rehash queue or the previous value of the suppressRehashing attribute, so it will just get the current cache membership status from all members of the current view and proceed with the rehashing as usual. If the user does not want this, he can set a default value of true for suppressRehashing. The admin now has to interact via JMX with the new coordinator, but that's not as bad as the alternative where all the nodes are involved in this JMX scheme; I think having only the coordinator involved is a plus.

            We're actually going to implement only point 1 now, and point 2 will be a separate issue (or perhaps as a part of ISPN-1394).
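
            A minimal sketch of point 1, under assumed names (suppressRehashing as the JMX-writable flag, one Runnable per deferred topology update); coalescing of join/leave pairs that cancel each other out is omitted for brevity.

            import java.util.ArrayDeque;
            import java.util.Queue;

            class RehashSuppressingTopologyManager {
                private final Queue<Runnable> pending = new ArrayDeque<>();
                private boolean suppressRehashing; // false by default, as proposed

                // Called for each join/leave; defers the rebalance while suppressed.
                synchronized void onTopologyEvent(Runnable topologyUpdate) {
                    if (suppressRehashing) {
                        pending.add(topologyUpdate);
                    } else {
                        topologyUpdate.run();
                    }
                }

                // JMX-writable attribute. The setter is synchronous: returning from
                // setSuppressRehashing(false) means every queued update has run.
                synchronized void setSuppressRehashing(boolean suppress) {
                    this.suppressRehashing = suppress;
                    if (!suppress) {
                        Runnable update;
                        while ((update = pending.poll()) != null) {
                            update.run();
                        }
                    }
                }
            }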

            Mircea Markus (Inactive) added a comment - Where would the data in this cluster be persisted during the shutdown? Simpler with a shared cache store; each cache persisting locally would complicate things a bit.

              Dan Berindei (Inactive) (dberinde@redhat.com)
              Manik Surtani (Inactive) (manik_jira)
              Archiver: Amol Dongare (rhn-support-adongare)