  1. Agroal
  2. AG-145

Active waiting deadlock in StampedCopyOnWriteArrayList


Details

    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • Fix Version/s: 1.9
    • Affects Version/s: 1.8
    • None
    • None

    Description

      While using the Agroal connection pool, we discovered a rare deadlock that keeps some threads at 100% CPU. The deadlock occurs in the StampedCopyOnWriteArrayList class when more than one thread tries to remove the same object.

       

      A simple JUnit reproducer (it hits the timeout nearly every time on my machine):

       

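      import java.util.ArrayList;
      import java.util.List;
      import java.util.concurrent.ExecutionException;
      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;
      import java.util.concurrent.Future;
      import java.util.concurrent.TimeUnit;
      import java.util.concurrent.TimeoutException;

      import io.agroal.pool.util.StampedCopyOnWriteArrayList;
      import org.junit.jupiter.api.Test; // assumption: JUnit 5; with JUnit 4 use org.junit.Test

      @Test
      public void testThis() {
          ExecutorService service = Executors.newFixedThreadPool(10);
          StampedCopyOnWriteArrayList<Object> list = new StampedCopyOnWriteArrayList<>(Object.class);
          Object o = new Object();
          list.add(new Object());
          list.add(new Object());
          list.add(new Object());
          list.add(new Object());
          list.add(o);
          list.add(new Object());
          // Ten tasks race to remove the same element from the list.
          List<Future<?>> futureList = new ArrayList<>(10);
          for (int i = 0; i < 10; i++) {
              futureList.add(service.submit(() -> {
                  list.remove(o);
                  System.out.println("Removed success!");
              }));
          }
          try {
              for (Future<?> f : futureList) {
                  try {
                      // A task that has not finished after 10 seconds is treated as stuck.
                      f.get(10000, TimeUnit.MILLISECONDS);
                  } catch (InterruptedException | ExecutionException e) {
                      e.printStackTrace();
                  } catch (TimeoutException e) {
                      System.out.println("Seems like we have a deadlock!");
                  }
              }
          } finally {
              // Release idle pool threads; workers stuck in the busy-wait may keep running.
              service.shutdownNow();
          }
      }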
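The report does not show the internals of StampedCopyOnWriteArrayList, so the following is only a sketch of one way such active waiting can arise. It assumes (not confirmed by this report) that remove holds a read lock on a java.util.concurrent.locks.StampedLock and retries an upgrade to the write lock. With StampedLock, tryConvertToWriteLock returns 0 while any other reader still holds the lock, so two readers that each keep retrying without releasing their read lock would spin forever. The class name ReadToWriteUpgradeSketch and the helper canUpgrade are hypothetical.

```java
import java.util.concurrent.locks.StampedLock;

public class ReadToWriteUpgradeSketch {

    // Acquires `readers` read locks on a fresh StampedLock, then tries to
    // upgrade the first one. Returns true if the upgrade succeeds.
    // Read stamps are not tied to a thread, so one thread can hold several,
    // which keeps this demonstration deterministic.
    static boolean canUpgrade(int readers) {
        StampedLock lock = new StampedLock();
        long[] stamps = new long[readers];
        for (int i = 0; i < readers; i++) {
            stamps[i] = lock.readLock();
        }
        // Returns 0 when the write lock is not available, i.e. when
        // another read lock is still held.
        long writeStamp = lock.tryConvertToWriteLock(stamps[0]);
        return writeStamp != 0L;
        // The lock is local and discarded here, so nothing is unlocked.
    }

    public static void main(String[] args) {
        System.out.println("sole reader can upgrade: " + canUpgrade(1));        // true
        System.out.println("one of two readers can upgrade: " + canUpgrade(2)); // false
    }
}
```

If the list does work this way, a retry loop that never gives up its read lock cannot make progress once a second remover holds a read lock too, which would match the 100% CPU symptom described above.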
      

       

      In our application, the deadlock appears to occur when Agroal tries to flush a connection because of this configuration parameter:

      <property name="hibernate.agroal.maxLifetime_m">60</property>

      If another thread that is using this connection calls session.close at the same time, the same code in the ConnectionPool class can end up being called twice. The parameter goes through the following path:

       

      The parallel session.close call does not find a checked_out connection and tries to flush it instead, so two threads end up in the deadlock.
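As an illustration of how two racing callers can be kept from removing the same entry, here is a minimal sketch in which each thread must first win an atomic state transition before it touches the list. This is not Agroal's code or its actual fix: the class SingleRemoverSketch, the Handler type, the State values, and the use of a plain CopyOnWriteArrayList as a stand-in for the pool's list are all assumptions made for the example.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

public class SingleRemoverSketch {

    enum State { CHECKED_OUT, FLUSH } // hypothetical states

    static final class Handler {
        private final AtomicReference<State> state = new AtomicReference<>(State.CHECKED_OUT);

        // Only one caller can win the CHECKED_OUT -> FLUSH transition.
        boolean claimForRemoval() {
            return state.compareAndSet(State.CHECKED_OUT, State.FLUSH);
        }
    }

    // Lets `threads` tasks race to remove one handler and returns how many
    // of them actually called remove().
    static int countRemovals(int threads) {
        List<Handler> pool = new CopyOnWriteArrayList<>();
        Handler handler = new Handler();
        pool.add(handler);
        AtomicInteger removals = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService service = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            service.execute(() -> {
                try {
                    start.await(); // release all tasks at once
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (handler.claimForRemoval()) {
                    pool.remove(handler);
                    removals.incrementAndGet();
                }
            });
        }
        start.countDown();
        service.shutdown();
        try {
            if (!service.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return removals.get();
    }

    public static void main(String[] args) {
        System.out.println("removals performed: " + countRemovals(10)); // 1
    }
}
```

The compare-and-set makes the losing callers skip the removal entirely, so the list never sees two concurrent remove calls for the same element, regardless of how the list itself handles that case.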

       

      Kind regards,

      Rene

          People

            Assignee: Luis Barreiro (lbarreiro-1)
            Reporter: Rene Böing (rene.boeing@d-velop.de, inactive)