  1. Red Hat 3scale API Management
  2. THREESCALE-10974

How are database connections handled when a unicorn worker process is killed after a timeout?

Details

    • Task
    • Resolution: Unresolved
    • Critical
    • None
    • 2.13.2 GA
    • System

    Description

      If system-app is experiencing slow response times from the database (let's say Postgres) and those responses cause the unicorn worker processes to time out persistently for many hours, how should the connections be handled?

      From the code, it looks like we are in fact killing any previous connection before we fork the new child process. Is that a correct understanding?

      If yes, then I assume each worker process will have a pool of 5 connections, as defined by RAILS_MAX_THREADS, right?

      So for 1 replica of system-app with 1 CPU share available, can we expect 10 connections per container, for a total of 30?
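
      The arithmetic behind those figures can be sketched as below. This is only an illustration: the pool size of 5 comes from the RAILS_MAX_THREADS assumption above, while the 2 workers per container and the 3 containers are inferred from the "10" and "30" figures and are not confirmed. The method name `max_db_connections` is hypothetical, and the result is an upper bound, since a pool size is a limit rather than a count of connections actually open.

```ruby
# Upper bound on database connections held by unicorn workers.
# All inputs are assumptions taken or inferred from the question, not
# values read from a real 3scale deployment.
def max_db_connections(containers:, workers_per_container:, pool_size:)
  containers * workers_per_container * pool_size
end

# One container: 2 workers (inferred) x pool of 5 (RAILS_MAX_THREADS, assumed).
per_container = max_db_connections(containers: 1, workers_per_container: 2, pool_size: 5)

# Three containers (inferred from the total of 30 in the question).
total = max_db_connections(containers: 3, workers_per_container: 2, pool_size: 5)

puts "per container: #{per_container}, total: #{total}"
```

      If monitoring shows well over this bound, the extra connections come either from other components or from connections that were never released, which is what the rest of this question is about.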

      Which other 3scale pods open connections against the database? Sidekiq and zync, maybe? I assumed these would do so via system by way of the APIs, but according to THREESCALE-10157 zync does indeed open connections, and potentially too many of them.

      The question as stated in the summary is more about how those connections are handled if a worker process is killed.
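
      One relevant detail: the unicorn documentation says that workers exceeding `timeout` are forcibly killed with SIGKILL. A process killed that way cannot run any cleanup code, so it never gets the chance to close its database connection itself. The sketch below illustrates only that signal behaviour and is not 3scale or unicorn code: the `at_exit` handler writing a marker file is a hypothetical stand-in for "close the connection on the way out", and `cleanup_ran_after?` is a made-up helper name. It compares a child that receives SIGTERM with one that receives SIGKILL.

```ruby
require "tmpdir"

# Forks a child that registers an at_exit cleanup handler, sends it
# `signal`, and reports whether the cleanup handler got to run.
def cleanup_ran_after?(signal)
  Dir.mktmpdir do |dir|
    marker = File.join(dir, "cleanup-ran")
    ready_r, ready_w = IO.pipe

    pid = fork do
      ready_r.close
      # Hypothetical stand-in for closing a database connection on exit.
      at_exit { File.write(marker, "closed") }
      ready_w.write("x") # tell the parent the handler is registered
      ready_w.close
      sleep 30           # stand-in for a request stuck on a slow query
    end

    ready_w.close
    ready_r.read(1)      # block until the child is ready
    ready_r.close
    Process.kill(signal, pid)
    Process.wait(pid)
    File.exist?(marker)
  end
end

puts "TERM: cleanup ran? #{cleanup_ran_after?('TERM')}"
puts "KILL: cleanup ran? #{cleanup_ran_after?('KILL')}"
```

      With SIGKILL the only thing that closes the connection is the operating system tearing down the dead process's socket. When the database server notices that and frees the slot is a separate matter, and depends on the Postgres and network configuration rather than on anything system-app does.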

      In this scenario there were many persistent timeouts, and after some hours the following errors started to appear in the master and provider containers where those timeouts were happening:

      PG::ConnectionBad (FATAL:  remaining connection slots are reserved for non-replication superuser connections
      )
      

      From the monitoring it was possible to see that over 100 connections were open concurrently and that the maximum query time for some of them was over 90s, which is more than double the timeout setting in system-app.

      Why would those connections still be idle?

      Is that expected, and if so, is the correct solution simply to increase the max_connections setting on the database?

          People

            Assignee: Unassigned
            Reporter: Kevin Price (rhn-support-keprice)