Infinispan / ISPN-2904

Race condition in cache startup causes state transfer timeout


Details

    • Type: Bug
    • Resolution: Obsolete
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: 5.1.7.Final
    • Component/s: State Transfer
    • Labels: None
    • Steps to reproduce: In EAP 6.0.1 domain mode, redeploy a clustered application multiple times until the issue occurs.

    Description

      When multiple caches start at the same time (as happens during an EAP domain mode deployment), one cache can time out during state transfer and abort its startup.

      The cause is a race condition: the master node accepts incoming requests before it has finished starting, so it cannot process them.

      As a result, the other node's REQUEST_JOIN is ignored, and the joining node eventually times out.
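The race can be illustrated with a minimal sketch. This is not Infinispan's code or its actual fix; `StartupGate`, `markStarted`, and `handle` are hypothetical names chosen for illustration. It shows one way an inbound handler could hold requests until local startup has finished, and fail loudly if it has not, rather than accepting a request and silently dropping it.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical illustration only; not part of Infinispan.
public class StartupGate {
    private final CountDownLatch started = new CountDownLatch(1);

    // Called once by the startup thread after all components are running.
    public void markStarted() {
        started.countDown();
    }

    // Runs the handler only after startup has completed. If startup does not
    // complete within the given wait, the request is rejected with an exception
    // so the sender sees a failure instead of waiting for its own timeout.
    public <T> T handle(Supplier<T> handler, long timeout, TimeUnit unit)
            throws InterruptedException {
        if (!started.await(timeout, unit)) {
            throw new IllegalStateException("Node still starting; request not processed");
        }
        return handler.get();
    }
}
```

With a gate like this, a REQUEST_JOIN that arrives during startup would either be processed once startup completes or be rejected explicitly, so the joining node could retry.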

      [node1]
      10:47:23,390 TRACE [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (ServerService Thread Pool -- 65) dests=[master:server-two/web], command=CacheViewControlCommand{cache=repl, type=REQUEST_JOIN, sender=master:server-one/web, newViewId=0, newMembers=null, oldViewId=0, oldMembers=null}, mode=SYNCHRONOUS_IGNORE_LEAVERS, timeout=60000
      10:47:23,396 TRACE [org.jgroups.protocols.TCP] (ServerService Thread Pool -- 65) sending msg to master:server-two/web, src=master:server-one/web, headers are RequestCorrelator: id=200, type=REQ, id=7, rsp_expected=true, RSVP: REQ(4), UNICAST2: DATA, seqno=27, TCP: [channel_name=web]
      ...
      10:48:23,404 ERROR [org.jboss.msc.service.fail] (ServerService Thread Pool -- 65) MSC000001: Failed to start service jboss.infinispan.web.repl: org.jboss.msc.service.StartException in service jboss.infinispan.web.repl: org.infinispan.CacheException: Unable to invoke method public void org.infinispan.statetransfer.BaseStateTransferManagerImpl.waitForJoinToComplete() throws java.lang.InterruptedException on object of type ReplicatedStateTransferManagerImpl

      [node2]
      10:47:23,352 TRACE [org.infinispan.factories.GlobalComponentRegistry] (MSC service thread 1-6) Registering component Component{instance=org.infinispan.marshall.jboss.ExternalizerTable@3f9c437d, name=org.infinispan.marshall.jboss.ExternalizerTable} under name org.infinispan.marshall.jboss.ExternalizerTable
      ...
      10:47:23,397 TRACE [org.jgroups.protocols.TCP] (OOB-19,null) received [dst: master:server-two/web, src: master:server-one/web (4 headers), size=54 bytes, flags=OOB|DONT_BUNDLE|RSVP], headers are RequestCorrelator: id=200, type=REQ, id=7, rsp_expected=true, RSVP: REQ(4), UNICAST2: DATA, seqno=27, TCP: [channel_name=web]
      10:47:23,398 TRACE [org.jgroups.blocks.RequestCorrelator] (OOB-19,null) calling (org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher) with request 7
      10:47:23,398 TRACE [org.infinispan.marshall.jboss.ExternalizerTable] (OOB-19,null) Either the marshaller has stopped or hasn't started. Read externalizers are not properly populated: {}
      10:47:23,398 TRACE [org.infinispan.marshall.jboss.ExternalizerTable] (OOB-19,null) Cache manager is shutting down and type (id=74) cannot be resolved (thread not interrupted)
      10:47:23,400 TRACE [org.jgroups.blocks.RequestCorrelator] (OOB-19,null) sending rsp for 7 to master:server-one/web
      ...
      10:47:23,522 TRACE [org.infinispan.factories.GlobalComponentRegistry] (ServerService Thread Pool -- 64) Invoking start method public void org.infinispan.marshall.jboss.ExternalizerTable.start() on component org.infinispan.marshall.jboss.ExternalizerTable
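The timing in the two excerpts can be checked with a few lines of Java. The sketch below only does arithmetic on timestamps quoted in the logs above; the class and method names (`LogGap`, `millisBetween`) are illustrative. It computes how long before `ExternalizerTable.start()` node2 received the request, and how long node1 waited between sending the request and reporting the failure, which can be compared with `timeout=60000`.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Illustrative helper; the timestamps are copied from the log excerpts above.
public class LogGap {
    // Log timestamps use a comma before the milliseconds, e.g. 10:47:23,397.
    private static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("HH:mm:ss,SSS");

    // Returns the number of milliseconds from one log timestamp to another.
    static long millisBetween(String from, String to) {
        return Duration.between(LocalTime.parse(from, FMT),
                                LocalTime.parse(to, FMT)).toMillis();
    }

    public static void main(String[] args) {
        // node2: request received vs. ExternalizerTable.start() invoked
        System.out.println(millisBetween("10:47:23,397", "10:47:23,522"));
        // node1: REQUEST_JOIN sent vs. service start failure
        System.out.println(millisBetween("10:47:23,390", "10:48:23,404"));
    }
}
```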


          People

            Assignee: Mircea Markus (Inactive)
            Reporter: Dennis Reed
            Votes: 0
            Watchers: 5
