FUSE Message Broker / MB-1156

Add support for a lease-based lock to the JDBC persistence adapter.

    Details

    • Type: Enhancement
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 5.5.1-fuse-07-11
    • Component/s: broker
    • Labels:
      None
    • Environment:
      Fuse MB 5.5.x

      Description

      Add support for a lease-based lock to the JDBC persistence adapter.

      The current transaction-based locking mechanism works well for a single broker but
      is difficult to configure correctly in a master/slave scenario.

      The two main problems are:

      1) You need to combine an IOExceptionHandler with the lockKeepAlivePeriod
      to get the desired master/slave behavior, which makes the current solution difficult
      to configure.

      2) It would be preferable for a master broker to stay alive when it loses its connection
      to the DB, but shut down its transports and then go into retry mode to reacquire the lock.
      In reacquire mode, it is effectively a slave with its transports shut down.

      This is currently possible using an IOExceptionHandler, but the problem
      is that once the broker refreshes its connection, it doesn't refresh the lock,
      so there are edge conditions where the slave can incorrectly acquire
      the lock after a failover (and mixed configs between master and slave are
      required to work around the issue).

      This enhancement adds support for a lease-based lock, which would allow
      us to simplify the master/slave JDBC config.

      Master Behavior:

      When the first broker starts, it acquires a lease for a time slice and renews it periodically.

      When the master broker:
      + Terminates - it automatically releases the lease.

      + Crashes - the slave detects that the lease has expired and, if it can
      acquire the lease, becomes the new master.

      + Network Glitch - once the master detects that the connection is gone (so timeouts would
      need to be configured via broker thread or JDBC driver timeouts), it goes into retry mode. This
      would shut down the transport connectors until it can reacquire the lease.
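
The three cases above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: the class name `LeaseSketch` and its methods are hypothetical, times are passed in as plain millisecond values, and a real broker would keep the lease in a database row rather than in memory.

```java
// Illustrative in-memory model of a lease. Hypothetical names, not broker code.
class LeaseSketch {
    private String holder;        // broker currently holding the lease, or null
    private long expiresAtMillis; // time at which the current lease lapses

    /** Try to take the lease; succeeds if it is free, expired, or already ours. */
    synchronized boolean tryAcquire(String brokerId, long nowMillis, long durationMillis) {
        boolean free = holder == null || nowMillis >= expiresAtMillis;
        if (free || holder.equals(brokerId)) {
            holder = brokerId;
            expiresAtMillis = nowMillis + durationMillis;
            return true;
        }
        return false;
    }

    /** Renew only while we still hold an unexpired lease; false means "go to retry mode". */
    synchronized boolean renew(String brokerId, long nowMillis, long durationMillis) {
        if (brokerId.equals(holder) && nowMillis < expiresAtMillis) {
            expiresAtMillis = nowMillis + durationMillis;
            return true;
        }
        return false;
    }

    /** Graceful termination: give the lease up immediately. */
    synchronized void release(String brokerId) {
        if (brokerId.equals(holder)) {
            holder = null;
        }
    }
}
```

In this model a crash is simply the absence of further `renew` calls, so the slave's next `tryAcquire` after the expiry time succeeds. A failed `renew` corresponds to the master entering retry mode with its transports shut down.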

      Benefits:

      In this manner, both graceful and ungraceful broker process terminations are detected. We also
      avoid the transaction log overhead associated with the current mechanism.

      Downside:

      A lease-based approach would add lease renewal traffic between the broker and the DB. The slaves
      will need to periodically try to acquire the lease, and the master will need to periodically renew
      its lease. This traffic is not expected to be significant.

      The solution will require clocks to be synced between master and slave for reliable operation. It would be nice
      if brokers were able to detect/protect against out-of-sync peers.

      Config:

      1) lease_ping_time - amount of time between lease renewals

      2) lease_reap_time - if the broker doesn't renew within lease_reap_time,
      the DB lock is released and open to a new master. lease_reap_time should
      be larger than lease_ping_time.
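
The relationship between the two settings could be checked as in the sketch below. The class `LeaseConfigSketch` is hypothetical, and the issue does not state units, so milliseconds are assumed here.

```java
// Hypothetical holder for the two proposed settings; units assumed to be milliseconds.
final class LeaseConfigSketch {
    final long leasePingTimeMillis; // lease_ping_time: interval between renewals
    final long leaseReapTimeMillis; // lease_reap_time: lease lapses if not renewed within this

    LeaseConfigSketch(long leasePingTimeMillis, long leaseReapTimeMillis) {
        if (leasePingTimeMillis <= 0) {
            throw new IllegalArgumentException("lease_ping_time must be positive");
        }
        if (leaseReapTimeMillis <= leasePingTimeMillis) {
            throw new IllegalArgumentException("lease_reap_time must be larger than lease_ping_time");
        }
        this.leasePingTimeMillis = leasePingTimeMillis;
        this.leaseReapTimeMillis = leaseReapTimeMillis;
    }

    /** Time left on the lease at the moment the next renewal is due. */
    long slackMillis() {
        return leaseReapTimeMillis - leasePingTimeMillis;
    }
}
```

A larger slack tolerates slower renewals but delays failover after a crash, since the slave has to wait for the full lease_reap_time to elapse.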

      Nice to have:

      + activemq-admin command to tell you the current master
      + activemq-admin command to do a soft-failover (force expiration of current lease) ?

        Issue Links

          Activity

          Gary Tully
          added a comment -

          https://issues.apache.org/jira/browse/AMQ-3654 shares a common problem that a lease-based lock can help with

          Gary Tully
          added a comment (edited)

          Additional LeaseLocker on the 5.5.1 branch.

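          <!-- Pauses/resumes the transport connectors on DB-related IO exceptions -->
          <ioExceptionHandler>
              <jDBCIOExceptionHandler/>
          </ioExceptionHandler>

          <persistenceAdapter>
              <!-- The lease is renewed every lockKeepAlivePeriod and extended by lockAcquireSleepInterval -->
              <jdbcPersistenceAdapter lockKeepAlivePeriod="1000" lockAcquireSleepInterval="2000">
                  <databaseLocker>
                      <lease-database-locker/>
                  </databaseLocker>
              </jdbcPersistenceAdapter>
          </persistenceAdapter>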

          The IOExceptionHandler will pause/resume the transport connectors on any IO exception related to DB access. This is important because transport restart is gated on a successful keepAlive, which avoids contending masters if the lease expires before the DB comes back online.
          The lease-based lock is acquired by blocking at start and retained by renewing it every lockKeepAlivePeriod. On each renewal, the lease is extended by the lockAcquireSleepInterval, so in theory the master is always (lockAcquireSleepInterval - lockKeepAlivePeriod) ahead of the slave w.r.t. the lease. With the config above, that margin is 2000 - 1000 = 1000 ms.
          The lease is dropped on normal shutdown.
          If the broker system clock is not in sync with the DB, setting maxAllowableDiffFromDBTime > 0 will adjust the lease duration when the skew exceeds the absolute maxAllowableDiffFromDBTime value, allowing the DB to dictate the UTC basis for the lease.
          There is no support for moving from a master state back to a slave. If the lease is lost, the master will exit.
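
The clock skew rule described in this comment can be illustrated as follows. This is a sketch of the described behavior, not the LeaseLocker source: the class `LeaseClockSketch`, its method names, and the choice to pass both clock readings in as arguments are all assumptions made for the example.

```java
// Illustrative only: models the skew rule from the comment, not the actual locker code.
final class LeaseClockSketch {
    /**
     * Offset to add to the broker clock. Zero while the skew is within tolerance,
     * or when the check is disabled (tolerance <= 0); otherwise the full
     * difference, so that DB time dictates the basis for the lease.
     */
    static long clockAdjustment(long brokerNowMillis, long dbNowMillis, long maxAllowableDiffMillis) {
        long diff = dbNowMillis - brokerNowMillis;
        if (maxAllowableDiffMillis > 0 && Math.abs(diff) > maxAllowableDiffMillis) {
            return diff;
        }
        return 0L;
    }

    /** Lease expiry written on a renewal: adjusted "now" plus lockAcquireSleepInterval. */
    static long leaseExpiry(long brokerNowMillis, long adjustmentMillis, long lockAcquireSleepIntervalMillis) {
        return brokerNowMillis + adjustmentMillis + lockAcquireSleepIntervalMillis;
    }
}
```

For example, with a broker clock 3000 ms behind the DB and a tolerance of 500 ms, the adjustment is 3000 ms, so the expiry is computed on the DB's time base rather than the broker's.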


            People

            • Assignee:
              Gary Tully
              Reporter:
              Dave Stanley
            • Votes:
              0
              Watchers:
              0

              Dates

              • Created:
                Updated:
                Resolved: