Uploaded image for project: 'JBoss Enterprise Application Platform 4 and 5'
  1. JBoss Enterprise Application Platform 4 and 5
  2. JBPAPP-10388

CLONE - Messages are hidden after restart when ACKs are done out of order on paging

    XMLWordPrintable

Details

    • Bug
    • Resolution: Done
    • Critical
    • EAP_EWP 5.2.0 ER5
    • EAP_EWP 5.2.0 ER4
    • HornetQ
    • None
    • Not Required
    • NEW

    Description

      Say you have messages 1, 2, 3, 4 and 5 on Page1.

      You acked 2, 3 and 4

      Pagin will look for the first acked message to start the flow and will hide message 1.

      This was discovered on the following test scenario:

      The MDB had random waits and multiple consumers on the queue, as a result we could get on this scenario described before.

      Test scenario:

      1. Configure two EAP 6 servers - "mdb" and "jms" server. "mdb" server has configured HornetQ resource adapter to connect to remote "jms" server.
      2. Deploy "InQueue" and "OutQueue" to jms server
      3. Send 500 messages to "InQueue"
      4. Deploy MDB to mdb server. MDB reads messages from InQueue and sends to OutQueue in XA transaction so it:
      starts an XA transaction
      dequeues a single message from InQueue on jms server
      enqueues the same message to OutQueue on jms server
      commits the transaction
      5. During processing of messages in step 4. kill (kill -9 ...) jms server and restart it.
      6. Wait until MDB processes all the messages.
      7. Receive messages from "OutQueue"

      Result:
      Sometimes happens that after step 7. there is lost message. There are warnings on mdb server:

      14:33:14,035 WARN  [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016037: Could not find new XAResource to use for recovering non-serializable XAResource XAResourceRecord < resource:null, txid:< formatId=131077, gtrid_length=30, bqual_length=36, tx_uid=0:ffff0a2203bf:3eed75fd:5093cb0d:680, node_name=20, branch_uid=0:ffff0a2203bf:3eed75fd:5093cb0d:681, subordinatenodename=null, eis_name=unknown eis name >, heuristic: TwoPhaseOutcome.FINISH_OK com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord@6fbdd27a >
      14:33:14,036 WARN  [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016038: No XAResource to recover < formatId=131077, gtrid_length=30, bqual_length=36, tx_uid=0:ffff0a2203bf:3eed75fd:5093cb0d:680, node_name=20, branch_uid=0:ffff0a2203bf:3eed75fd:5093cb0d:681, subordinatenodename=null, eis_name=unknown eis name >
      14:33:14,111 WARN  [org.hornetq.jms.server.recovery.HornetQXAResourceWrapper] (Periodic Recovery) null: javax.transaction.xa.XAException
      	at org.hornetq.core.client.impl.ClientSessionImpl.commit(ClientSessionImpl.java:1330) [hornetq-core-2.2.22.Final-redhat-1.jar:2.2.22.SNAPSHOT (HQ_2_2_22_AS7_Final, 122)]
      	at org.hornetq.core.client.impl.DelegatingSession.commit(DelegatingSession.java:163) [hornetq-core-2.2.22.Final-redhat-1.jar:2.2.22.SNAPSHOT (HQ_2_2_22_AS7_Final, 122)]
      	at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.commit(HornetQXAResourceWrapper.java:110) [hornetq-jms-2.2.22.Final-redhat-1.jar:2.2.22.SNAPSHOT (HQ_2_2_22_AS7_Final, 122)]
      	at com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord.topLevelCommit(XAResourceRecord.java:451) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.arjuna.coordinator.BasicAction.doCommit(BasicAction.java:2755) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.arjuna.coordinator.BasicAction.doCommit(BasicAction.java:2671) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.arjuna.coordinator.BasicAction.phase2Commit(BasicAction.java:1806) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.arjuna.recovery.RecoverAtomicAction.replayPhase2(RecoverAtomicAction.java:71) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule.doRecoverTransaction(AtomicActionRecoveryModule.java:152) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule.processTransactionsStatus(AtomicActionRecoveryModule.java:251) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule.periodicWorkSecondPass(AtomicActionRecoveryModule.java:109) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:789) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:371) [jbossjts-4.16.6.Final-redhat-1.jar:]
      
      14:33:14,114 WARN  [com.arjuna.ats.jta] (Periodic Recovery) ARJUNA016036: commit on < formatId=131077, gtrid_length=30, bqual_length=36, tx_uid=0:ffff0a2203bf:3eed75fd:5093cb0d:670, node_name=20, branch_uid=0:ffff0a2203bf:3eed75fd:5093cb0d:671, subordinatenodename=null, eis_name=unknown eis name > (HornetQXAResourceWrapper [serverLocator=null, csf=null, delegate=null, xaRecoveryConfigs=[XARecoveryConfig [transportConfiguration = [org-hornetq-core-remoting-impl-netty-NettyConnectorFactory?port=5445&host=localhost], discoveryConfiguration = null, username=null, password=null]], instance=811298853]) failed with exception $XAException.XAER_NOTA: javax.transaction.xa.XAException
      	at org.hornetq.core.client.impl.ClientSessionImpl.commit(ClientSessionImpl.java:1330) [hornetq-core-2.2.22.Final-redhat-1.jar:2.2.22.SNAPSHOT (HQ_2_2_22_AS7_Final, 122)]
      	at org.hornetq.core.client.impl.DelegatingSession.commit(DelegatingSession.java:163) [hornetq-core-2.2.22.Final-redhat-1.jar:2.2.22.SNAPSHOT (HQ_2_2_22_AS7_Final, 122)]
      	at org.hornetq.jms.server.recovery.HornetQXAResourceWrapper.commit(HornetQXAResourceWrapper.java:110) [hornetq-jms-2.2.22.Final-redhat-1.jar:2.2.22.SNAPSHOT (HQ_2_2_22_AS7_Final, 122)]
      	at com.arjuna.ats.internal.jta.resources.arjunacore.XAResourceRecord.topLevelCommit(XAResourceRecord.java:451) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.arjuna.coordinator.BasicAction.doCommit(BasicAction.java:2755) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.arjuna.coordinator.BasicAction.doCommit(BasicAction.java:2671) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.arjuna.coordinator.BasicAction.phase2Commit(BasicAction.java:1806) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.arjuna.recovery.RecoverAtomicAction.replayPhase2(RecoverAtomicAction.java:71) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule.doRecoverTransaction(AtomicActionRecoveryModule.java:152) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule.processTransactionsStatus(AtomicActionRecoveryModule.java:251) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.internal.arjuna.recovery.AtomicActionRecoveryModule.periodicWorkSecondPass(AtomicActionRecoveryModule.java:109) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.doWorkInternal(PeriodicRecovery.java:789) [jbossjts-4.16.6.Final-redhat-1.jar:]
      	at com.arjuna.ats.internal.arjuna.recovery.PeriodicRecovery.run(PeriodicRecovery.java:371) [jbossjts-4.16.6.Final-redhat-1.jar:]
      

      Lost message is never delivered and seem to be stuck on jms server. HQ journal shows that there is one message but it's not in any queue - hq-journal-with-lost-message.log

      Attached reproducer2.zip:
      1. Download and unzip reproducer2.zip. Following scripts run in unzipped directory.
      2. Prepare servers - "sh prepare.sh"
      3. Start jms server - "sh start-server1.sh localhost"
      4. Start producer and send 500 messages to InQueue - "sh start-producer.sh localhost 500"
      5. Start mdb server - "sh start-server2.sh <some_other_ip>"
      6. When mdb server is processing messages, kill (kill -9 ...) jms server and start again by "sh start-server1.sh <some_other_ip>"
      7. Receive messages from OutQueue and check warnings in mdb servers log - "sh start-consumer.sh localhost".

      This issue relates to JBPAPP-10305. There is the same reproducer.

      Attachments

        Issue Links

          Activity

            People

              csuconic@redhat.com Clebert Suconic
              mnovak1@redhat.com Miroslav Novak
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved: