  1. JBoss Enterprise Application Platform 4 and 5
  2. JBPAPP-6153

One missing message - failover on server shutdown with MDB, chain backup topology

Details

    Description

      Hi Clebert.

      I executed eap51-hornetq-failover-chain-mdb-shutdown [1] three times and one run failed: the final consumer received only 2499 messages instead of the expected 2500. I ran this job many times today and the failure occurs intermittently; I could not find the reason for it.

      The MDB uses container-managed transactions; it consumes a message from InQueue and produces a message into OutQueue. The source code is available online [3].

      When the test fails, the MDB is invoked only 2499 times; see [count].

      I simplified my template to skip the MDB deployment and to verify that the message producer works correctly. See Hudson job [2]: there are no failures. The diff of the templates is available too [diff].

      The problem seems to be in JCA, because all 2500 messages are generated but onMessage is sometimes invoked only 2499 times.

      Logs: serverA [4], serverB [5], producers and consumers [6]. messaging-18 is the deploy host, messaging-19 is serverA, messaging-20 is serverB and messaging-21 is for producers and consumers. Tested on the CR3 HornetQ release + EAP 5.1.0 GA + TS from JBPAPP-5834.

      Scenario description:

      • Servers A and B are prepared to use the chain backup topology and the MDB is deployed. The HQ configuration is online [7]; I only changed <failover-on-shutdown> to true.
      • Servers A and B are started
      • Main failover producer is started (uses InQueue, 80 ms wait before a new message is generated)
      • 12 producers are started to generate load (use TradeBrokerQueue, 80 ms wait before a new message is generated)
      • Server shutdown is invoked
      • Main failover producer and 12 load producers continue to work
      • Consumers are started after the producers finish their part; they verify that no message is lost
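
      The final verification step can be sketched roughly as follows. This is only an illustration, not the test suite's actual verifier: it assumes each message carries a unique identifier (for example a sequence number set by the producer), which is not stated in this report, and the name find_lost_and_duplicated is hypothetical.

```python
from collections import Counter

def find_lost_and_duplicated(sent_ids, received_ids):
    """Compare produced and consumed message identifiers.

    Assumption: every message has a unique id assigned by the producer.
    Returns (lost, duplicated) as two sorted lists.
    """
    received = Counter(received_ids)
    # Ids that were sent but never reached the final consumer.
    lost = sorted(set(sent_ids) - set(received))
    # Ids that the final consumer saw more than once.
    duplicated = sorted(i for i, n in received.items() if n > 1)
    return lost, duplicated
```

      A check of this kind would show which message went missing, not only that the total is short by one.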

      After the test I started serverA and serverB manually: there was no message in InQueue, OutQueue or DLQ on either server, and no pending prepared transaction on either server.

      Clebert, do you have any suggestion as to what could cause the message loss?
      Could you please check the logs and configuration?

      [1] http://hudson.qa.jboss.com/hudson/view/EAP5-HornetQ-perf/job/eap51-hornetq-failover-chain-mdb-shutdown/
      [2] http://hudson.qa.jboss.com/hudson/view/EAP5-HornetQ-perf/job/eap51-hornetq-failover-chain-mdb-shutdown-simulate/
      [3] https://svn.devel.redhat.com/repos/jboss-qa/load-testing/apps/mdb-hornetq-test/trunk/ha-mdb/src/main/java/org/jboss/qa/mdbhornetqtest/CopyMsgTransactionMDB.java
      [4] http://hudson.qa.jboss.com/hudson/view/EAP5-HornetQ-perf/job/eap51-hornetq-failover-chain-mdb-shutdown/28/console-messaging-19/
      [5] http://hudson.qa.jboss.com/hudson/view/EAP5-HornetQ-perf/job/eap51-hornetq-failover-chain-mdb-shutdown/28/console-messaging-20/
      [6] http://hudson.qa.jboss.com/hudson/view/EAP5-HornetQ-perf/job/eap51-hornetq-failover-chain-mdb-shutdown/28/console-messaging-21/
      [7] https://svn.devel.redhat.com/repos/jboss-qa/load-testing/etc/eap-51/hornetq/failover-chain

      [count]

      [rsvoboda@rosta-ntb ~]$ cat server-01 | grep 'Received TextMessage' | wc -l
      52
      [rsvoboda@rosta-ntb ~]$ cat server-02 | grep 'Received TextMessage' | wc -l
      2447
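
      The same tally can be scripted so that the shortfall is reported directly. A minimal sketch, assuming the server logs are plain text files and that every onMessage call logs one line containing 'Received TextMessage' (as the grep above implies); the function names are illustrative only.

```python
MARKER = "Received TextMessage"  # same fixed string as the grep in [count]

def count_marker(lines, marker=MARKER):
    """Count lines containing marker, like `grep marker | wc -l`."""
    return sum(1 for line in lines if marker in line)

def shortfall(per_server_counts, expected):
    """Expected message count minus the invocations seen on all servers."""
    return expected - sum(per_server_counts)

def tally_logs(paths, expected):
    """Count marker lines in each log file and return (counts, shortfall)."""
    counts = []
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as log:
            counts.append(count_marker(log))
    return counts, shortfall(counts, expected)
```

      With the counts above, shortfall([52, 2447], 2500) gives 1, the single missing onMessage invocation.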
      

      [diff]

      [rsvoboda@rosta-ntb templates]$ diff -u failoverMdbSlsbChainTemplate.sf failoverMdbSlsbChainSimulationTemplate.sf
      --- failoverMdbSlsbChainTemplate.sf	2011-02-18 10:26:01.972545195 +0100
      +++ failoverMdbSlsbChainSimulationTemplate.sf	2011-03-22 14:27:27.701897124 +0100
      @@ -101,8 +101,9 @@
           copyMdbTestApps extends BashShellScript {
               processName "copyMdbTestApps";
               cmd [
      -        "pwd",
      -        ("cp " ++ ROOT:sfConfig:mdb-hornetq-testDir ++ "/ha-mdb/target/*.jar " ++ deployPath)
      +        "pwd"
      +//        "pwd",
      +//        ("cp " ++ ROOT:sfConfig:mdb-hornetq-testDir ++ "/ha-mdb/target/*.jar " ++ deployPath)
               ];
           }
       }
      @@ -111,8 +112,9 @@
           copySlsbTestApps extends BashShellScript {
               processName "copySlsbTestApps";
               cmd [
      -        "pwd",
      -        ("cp " ++ ROOT:sfConfig:mdb-hornetq-testDir ++ "/ha-slsb/ear/target/*.ear " ++ deployPath)
      +        "pwd"
      +//        "pwd",
      +//        ("cp " ++ ROOT:sfConfig:mdb-hornetq-testDir ++ "/ha-slsb/ear/target/*.ear " ++ deployPath)
               ];
           }
       }
      @@ -427,7 +429,8 @@
                                   //verifier no duplications on outputQueue
                                   receiver extends SimpleReceiver {
                                       warmupTime 15000;  // waitBeforeVerification
      -                                destination            ROOT:sfConfig:destinationOut;
      +//                                destination            ROOT:sfConfig:destinationOut;
      +                                destination            ROOT:sfConfig:destinationIn;
                                       messagesCount          ROOT:sfConfig:messagesCount;
                                       hostAndPort ( (IF (ROOT:sfConfig:testFailback) THEN  ROOT:sfConfig:SERVER_NODE1 ELSE  ROOT:sfConfig:SERVER_NODE2 FI) ++ ":1099" );
                                       aggregator LAZY clientStats:receiverAggregator;
      

          People

            csuconic@redhat.com Clebert Suconic
            rsvoboda@redhat.com Rostislav Svoboda