JGroups / JGRP-979

TCP DataOutputStream.flush() hang

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • 2.6.10.merge
    • 2.6.10.merge
    • None

      The JGroups cluster consists of 2 nodes. It uses 2 JChannels: one for configuration and another for data transfer. Occasionally, at random, a server hangs while trying to send a message to the cluster. I think the main cause is this thread:

      "Timer-3,tcp,81.19.94.71:7800" daemon prio=10 tid=0x00007f155c7a1400 nid=0x3eec runnable [0x0000000042f92000..0x0000000042f92c80]
      java.lang.Thread.State: RUNNABLE
      at java.net.SocketOutputStream.socketWrite0(Native Method)
      at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
      at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
      at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
      at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
      - locked <0x00007f15688678f0> (a java.io.BufferedOutputStream)
      at java.io.DataOutputStream.flush(DataOutputStream.java:123)
      at org.jgroups.blocks.BasicConnectionTable$Connection.doSend(BasicConnectionTable.java:546)
      at org.jgroups.blocks.BasicConnectionTable$Connection._send(BasicConnectionTable.java:522)
      at org.jgroups.blocks.BasicConnectionTable$Connection.send(BasicConnectionTable.java:506)
      at org.jgroups.blocks.BasicConnectionTable.send(BasicConnectionTable.java:322)
      at org.jgroups.protocols.TCP.send(TCP.java:55)
      at org.jgroups.protocols.BasicTCP.sendToSingleMember(BasicTCP.java:219)
      at org.jgroups.protocols.BasicTCP.sendToAllMembers(BasicTCP.java:204)
      at org.jgroups.protocols.TP.doSend(TP.java:1486)
      at org.jgroups.protocols.TP.access$2500(TP.java:49)
      at org.jgroups.protocols.TP$Bundler.sendBundledMessages(TP.java:2059)
      at org.jgroups.protocols.TP$Bundler.access$2900(TP.java:1951)
      at org.jgroups.protocols.TP$Bundler$BundlingTimer.run(TP.java:2088)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
      at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
      at java.util.concurrent.FutureTask.run(FutureTask.java:166)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
      at java.lang.Thread.run(Thread.java:636)
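
      For context, a thread parked in a native socket write during flush() can be reproduced outside JGroups. The sketch below is not taken from JGroups and does not claim to be the cause of this issue. It assumes one plausible scenario: the peer accepts the connection but never reads, so the TCP buffers fill up and the blocking write cannot complete. The class name BlockedWriteDemo and the method writerBlocks are made up for illustration.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo, not JGroups code: shows a blocking write that never returns
// because the receiving side of the connection does not drain its socket.
class BlockedWriteDemo {

    /** Returns true if the writer thread is still stuck in write()/flush() after waitMillis. */
    static boolean writerBlocks(long waitMillis) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {   // accepted side deliberately never reads

            Thread writer = new Thread(() -> {
                try {
                    OutputStream out = new BufferedOutputStream(client.getOutputStream());
                    byte[] chunk = new byte[64 * 1024];
                    while (true) {
                        out.write(chunk);
                        out.flush();            // blocks once the send/receive buffers are full
                    }
                } catch (IOException e) {
                    // Reached when the sockets are closed at the end of the try block.
                }
            });
            writer.setDaemon(true);
            writer.start();

            writer.join(waitMillis);            // give the writer time to fill the buffers
            return writer.isAlive();            // still alive means it is stuck in the write
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("writer still blocked: " + writerBlocks(2000));
    }
}
```

      A thread blocked like this inside native code is typically reported as RUNNABLE in a thread dump, which matches the state shown above, although that alone does not prove the same scenario applies here.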

      I took several thread dumps over 30 minutes, and this thread was in this state in every dump.
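
      The same check (does a thread show an identical stack in every sample?) can be approximated from inside the JVM. The following is a minimal sketch under stated assumptions, not the procedure used for this report: the class name StuckThreadCheck and its method are hypothetical, and the sampling uses only the standard Thread.getStackTrace() call.

```java
import java.util.Arrays;

// Hypothetical helper: compares repeated stack samples of one thread.
class StuckThreadCheck {

    /**
     * Samples the target thread's stack 'samples' times, 'intervalMillis' apart,
     * and returns true only if every sample is non-empty and equal to the first.
     */
    static boolean sameStackInEverySample(Thread target, int samples, long intervalMillis)
            throws InterruptedException {
        StackTraceElement[] first = target.getStackTrace();
        if (first.length == 0) {
            return false;   // thread has not started or has already terminated
        }
        for (int i = 1; i < samples; i++) {
            Thread.sleep(intervalMillis);
            if (!Arrays.equals(first, target.getStackTrace())) {
                return false;   // the thread moved on between samples
            }
        }
        return true;
    }
}
```

      An unchanged stack is only a hint: an idle thread waiting for work looks the same, so the frames still have to be read to decide whether the thread is really hung.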

      See the full thread dump in the attachment.

      P.S. The log file is from a run on OpenJDK, but switching to the Sun JDK results in the same errors.
      $ java -version
      java version "1.6.0_0"
      IcedTea6 1.3.1 (6b12-0ubuntu6.4) Runtime Environment (build 1.6.0_0-b12)
      OpenJDK 64-Bit Server VM (build 1.6.0_0-b12, mixed mode)

        1. flush-tcp.xml
          2 kB
        2. nohup_2_node_cluster.tar.gz
          80 kB
        3. nohup.out
          579 kB
        4. nohup1.out
          685 kB

            rhn-engineering-bban Bela Ban
            bulat_jira Bulat Nigmatullin (Inactive)