Infinispan / ISPN-1890

Unable to create new native thread when running core/ testsuite


      Description

      Suddenly, running core/ testsuite only with -Xmx768m -XX:MaxPermSize=256M, I'm seeing:

      Test suite progress: tests succeeded: 1695, failed: 4, skipped: 0.
      Exception in thread "ConnectionMap.Acceptor,null,null" java.lang.OutOfMemoryError: unable to create new native thread
      	at java.lang.Thread.start0(Native Method)
      	at java.lang.Thread.start(Thread.java:658)
      	at org.jgroups.blocks.TCPConnectionMap$TCPConnection$ConnectionPeerReceiver.start(TCPConnectionMap.java:568)
      	at org.jgroups.blocks.TCPConnectionMap$TCPConnection.start(TCPConnectionMap.java:411)
      	at org.jgroups.blocks.TCPConnectionMap$TCPConnection.access$600(TCPConnectionMap.java:354)
      	at org.jgroups.blocks.TCPConnectionMap$ConnectionAcceptor.run(TCPConnectionMap.java:259)
      	at java.lang.Thread.run(Thread.java:680)
      Exception in thread "ConnectionMap.Acceptor,null,null" java.lang.OutOfMemoryError: unable to create new native thread
      	at java.lang.Thread.start0(Native Method)
      	at java.lang.Thread.start(Thread.java:658)
      	at org.jgroups.blocks.TCPConnectionMap$TCPConnection$ConnectionPeerReceiver.start(TCPConnectionMap.java:568)
      	at org.jgroups.blocks.TCPConnectionMap$TCPConnection.start(TCPConnectionMap.java:411)
      	at org.jgroups.blocks.TCPConnectionMap$TCPConnection.access$600(TCPConnectionMap.java:354)
      	at org.jgroups.blocks.TCPConnectionMap$ConnectionAcceptor.run(TCPConnectionMap.java:259)
      	at java.lang.Thread.run(Thread.java:680)
      Exception in thread "ConnectionMap.Acceptor,null,null" java.lang.OutOfMemoryError: unable to create new native thread
      	at java.lang.Thread.start0(Native Method)
      	at java.lang.Thread.start(Thread.java:658)
      	at org.jgroups.blocks.TCPConnectionMap$TCPConnection$ConnectionPeerReceiver.start(TCPConnectionMap.java:568)
      	at org.jgroups.blocks.TCPConnectionMap$TCPConnection.start(TCPConnectionMap.java:411)
      	at org.jgroups.blocks.TCPConnectionMap$TCPConnection.access$600(TCPConnectionMap.java:354)
      	at org.jgroups.blocks.TCPConnectionMap$ConnectionAcceptor.run(TCPConnectionMap.java:259)
      	at java.lang.Thread.run(Thread.java:680)
      [testng-TopologyAwareDistAsyncFuncTest] Test testPutIfAbsentFromNonOwner(org.infinispan.distribution.topologyaware.TopologyAwareDistAsyncFuncTest) succeeded.
      Test suite progress: tests succeeded: 1696, failed: 4, skipped: 0.
      [testng-TopologyAwareDistAsyncFuncTest] Test testRemoveFromNonOwner(org.infinispan.distribution.topologyaware.TopologyAwareDistAsyncFuncTest) succeeded.
      Test suite progress: tests succeeded: 1697, failed: 4, skipped: 0.
      [testng-TopologyAwareDistAsyncFuncTest] Test testReplaceFromNonOwner(org.infinispan.distribution.topologyaware.TopologyAwareDistAsyncFuncTest) succeeded.
      Test suite progress: tests succeeded: 1698, failed: 4, skipped: 0.
      Exception in thread "CacheViewTrigger,TopologyAwareDistAsyncFuncTest-NodeA-38647" java.lang.OutOfMemoryError: unable to create new native thread
      	at java.lang.Thread.start0(Native Method)
      	at java.lang.Thread.start(Thread.java:658)
      	at java.util.concurrent.ThreadPoolExecutor.addIfUnderMaximumPoolSize(ThreadPoolExecutor.java:727)
      	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:657)
      	at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:92)
      	at org.infinispan.cacheviews.CacheViewsManagerImpl$ViewTriggerThread.run(CacheViewsManagerImpl.java:848)

      See attached thread dump.

        1. output.log
          2.44 MB
          Galder Zamarreño

          Activity

          Scott Marlow added a comment -

          Does anyone have a workaround? This happens every time I try to build on Linux with Oracle JDK 1.6.0_26 or 1.6.0_31. I also tried "MAVEN_OPTS=-Xmx1024m -Xms512m -XX:MaxPermSize=128m".

          It sounds like a leak to me but I'm not sure why I'm the only one hitting this (Galder only experienced this once and it went away).

          Sanne Grinovero added a comment -

          This is not a new problem.
          The OutOfMemoryError happens because GC is not allowed to run, because the test suite runs more threads than the default settings in /etc/security/limits.conf allow.

          Edit the file to add something like

          /etc/security/limits.conf

          sanne	soft	nofile	16384
          sanne	hard	nofile	16384
          sanne	soft	nproc	16384
          sanne	hard	nproc	16384
          

          (where sanne is my username)

          Scott Marlow added a comment -

          I'm still having the same problems (I set the above settings for smarlow and rebooted to be sure they took effect). I don't think it's critical though; it just would have been nice if I could have gotten past this.

          Dan Berindei added a comment -

          Although it's an OutOfMemoryError, this one is a little different as it's not affected by `-Xmx` and `-XX:MaxPermSize`.
          Instead it's affected by the maximum number of processes on Linux (`ulimit -u`) and by the amount of available virtual memory (number of threads * `-Xss`). See this forum thread for a longer discussion: https://community.jboss.org/thread/164955

          The default `ulimit -u` value is 1024, and we're creating way more than that during the core test suite (I counted 1591 when I ran it on my machine). So this error is most likely to be caused by a small `ulimit -u`.

          We probably have a thread leak as well: while running the test suite I saw an awful lot of KeyAffinityService threads alive at the same time.
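
The two quantities mentioned above (live thread count, and threads * `-Xss`) can be inspected from inside the JVM. The following is a minimal sketch, not code from Infinispan: the class name `ThreadBudget` is hypothetical, and the 512 KB stack size used in `main` is an assumption, since the real default depends on platform and JVM flags.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper, for illustration only.
final class ThreadBudget {
    /** Number of live threads (daemon and non-daemon) in this JVM. */
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** Highest live-thread count observed since JVM start (or last reset). */
    static int peakThreads() {
        return ManagementFactory.getThreadMXBean().getPeakThreadCount();
    }

    /** Rough upper bound on virtual memory reserved for stacks: threads * stack size. */
    static long stackReservationBytes(int threads, long stackSizeBytes) {
        return (long) threads * stackSizeBytes;
    }

    public static void main(String[] args) {
        long assumedXss = 512L * 1024; // assumption: -Xss512k; check your JVM's actual value
        int live = liveThreads();
        System.out.printf("live=%d peak=%d approxStackReservation=%d MB%n",
                live, peakThreads(),
                stackReservationBytes(live, assumedXss) / (1024 * 1024));
    }
}
```

Comparing the peak count against the output of `ulimit -u` in the same shell that launches the build shows how close a run gets to the per-user limit. Note that the limit applies to all of the user's processes and threads, not only to this JVM.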

          Sanne Grinovero added a comment -

          Dan, "ulimit -u" is the same as the soft nproc value I've suggested above; Scott tried as well with 512M of MaxPermSize as that's where the stack is allocated.
          It still doesn't build for him, yet it works fine for me with the same settings.

          Dan Berindei added a comment -

          Sanne, I should have mentioned that it's the same thing. But I talked to Scott and he actually didn't have it set, so I guess it's always good to check with ulimit rather than trust that the limits.conf settings have been applied correctly.

          I just wanted to make it clear, in case someone else gets here wondering about this error, that it's usually about the number of processes (on Linux) or the native memory (on 32-bit Windows, since threads will have a bigger stack by default) - and never about the heap settings.

          Sanne Grinovero added a comment -

          I'm seeing around 3000 threads in my test suite run; most of them belong to the KeyAffinityService, as Dan also pointed out.
          They are org.infinispan.affinity.KeyAffinityServiceImpl$KeyGeneratorWorker threads, waiting.
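
One way to see which kind of thread dominates a run is to group live threads by name. This is a minimal sketch under stated assumptions: the class `ThreadCensus` and its digit-collapsing rule are hypothetical, it is not how the counts above were obtained, and it uses Java 8 syntax (the thread discusses JDK 1.6, where a plain loop over a `HashMap` would be needed).

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, for illustration only.
final class ThreadCensus {
    /** Collapses names such as "pool-3-thread-12" to "pool-#-thread-#". */
    static String normalize(String threadName) {
        return threadName.replaceAll("\\d+", "#");
    }

    /** Counts live threads per normalized name, sorted by name. */
    static Map<String, Integer> countByName() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(normalize(t.getName()), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        countByName().forEach((name, n) -> System.out.println(n + "\t" + name));
    }
}
```

Grouping by name only helps when threads carry descriptive names. Threads created by an unnamed pool all collapse into `pool-#-thread-#`.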

          Sanne Grinovero added a comment -

          BTW the thread pool used by the KeyGeneratorWorker doesn't have a descriptive name assigned; it would be nice to have its threads properly named too.
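
Naming pool threads is usually done by passing a `ThreadFactory` to the executor. The sketch below shows the general technique only: `NamedThreadFactory`, the `KeyGeneratorWorker` prefix, and the daemon flag are assumptions for illustration, not the change that was made in Infinispan.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory, for illustration only.
final class NamedThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    NamedThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        // Each thread gets "<prefix>-<n>", so it is identifiable in a thread dump.
        Thread t = new Thread(r, prefix + "-" + counter.getAndIncrement());
        t.setDaemon(true); // assumption: worker threads should not block JVM exit
        return t;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool =
                Executors.newFixedThreadPool(2, new NamedThreadFactory("KeyGeneratorWorker"));
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            System.out.println(pool.submit(task).get());
        } finally {
            pool.shutdown(); // always release pool threads, or they leak across tests
        }
    }
}
```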

          Mircea Markus added a comment -

          The test suite is now down to approx. 1.1k threads from 2k+ threads. The thread leaks were also removed, and logging was added to dump the current system threads in case an OOM happens.
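
Dumping all threads when thread creation fails can be sketched as follows. This is an illustration of the idea, not the logging that was added to the test suite: `ThreadDumper` and `startOrDump` are hypothetical names, and writing to `System.err` stands in for whatever logger the project uses.

```java
import java.util.Map;

// Hypothetical helper, for illustration only.
final class ThreadDumper {
    /** Formats the name, state and stack of every live thread. */
    static String dumpAllThreads() {
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        StringBuilder sb = new StringBuilder();
        sb.append("Live threads: ").append(traces.size()).append('\n');
        for (Map.Entry<Thread, StackTraceElement[]> e : traces.entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=")
              .append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("\tat ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    /** Starts the thread; if the JVM cannot create it, dumps all threads and rethrows. */
    static void startOrDump(Thread t) {
        try {
            t.start();
        } catch (OutOfMemoryError oom) {
            // Building the dump needs only heap, which is usually still available here.
            System.err.println(dumpAllThreads());
            throw oom;
        }
    }
}
```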

          Radoslav Husar added a comment -

          BTW I have just updated the doc with info on the security limits: https://docs.jboss.org/author/display/ISPN/Contributing+-+The+test+suite


            People

            • Assignee:
              Mircea Markus
              Reporter:
              Galder Zamarreño