Uploaded image for project: 'Debezium'
  1. Debezium
  2. DBZ-277

Memory leak and excessive CPU usage when using materialized views

    XMLWordPrintable

Details

    • Bug
    • Resolution: Done
    • Major
    • 0.5.1
    • 0.5
    • postgresql-connector
    • None
    • Hide

      Monitor some table in a postgres database.

      Create a view in the same database as follows (the idea here is that the contents of the view changes every time you access it, generating lots of WAL from the materialized view):
      CREATE VIEW public.generate_wal AS
      SELECT s.i,
      random() r1, random() r2, random() r3, random() r4
      FROM generate_series(1, 100000) s;

      Make a materialized view out of it:
      CREATE MATERIALIZED VIEW mat_wal AS SELECT * FROM generate_wal;

      Add an index so concurrent refresh is possible:
      CREATE UNIQUE INDEX ON mat_wal;

      Refresh the materialized view:
      REFRESH MATERIALIZED VIEW CONCURRENTLY mat_wal;

      Observe the high CPU usage in the Kafka connect task, and enjoy how long it goes on for!

      Do it a second time:
      REFRESH MATERIALIZED VIEW CONCURRENTLY mat_wal;

      With the default Kafka connect heap size of 256MB this is already enough to exhaust the heap. With a 1G heap size you just need to be patient and keep doing this - eventually you will run out of heap.

      Probably there's some connection between the excessive CPU usage/processing time and the leak.

      Show
      Monitor some table in a postgres database. Create a view in the same database as follows (the idea here is that the contents of the view changes every time you access it, generating lots of WAL from the materialized view): CREATE VIEW public.generate_wal AS SELECT s.i, random() r1, random() r2, random() r3, random() r4 FROM generate_series(1, 100000) s ; Make a materialized view out of it: CREATE MATERIALIZED VIEW mat_wal AS SELECT * FROM generate_wal; Add an index so concurrent refresh is possible: CREATE UNIQUE INDEX ON mat_wal ; Refresh the materialized view: REFRESH MATERIALIZED VIEW CONCURRENTLY mat_wal; Observe the high CPU usage in the Kafka connect task, and enjoy how long it goes on for! Do it a second time: REFRESH MATERIALIZED VIEW CONCURRENTLY mat_wal; With the default Kafka connect heap size of 256MB this is already enough to exhaust the heap. With a 1G heap size you just need to be patient and keep doing this - eventually you will run out of heap. Probably there's some connection between the excessive CPU usage/processing time and the leak.

    Description

      Doing REFRESH MATERIALIZED VIEW CONCURRENTLY on a materialized view in the same PostgreSQL database as a table being monitored by debezium results in excessive CPU usage in the record stream producer thread, and leaks java heap: even with a 1G heap, refreshing a huge materialized view with many changes results in java.lang.OutOfMemoryError being thrown.

      Attachments

        Activity

          People

            Unassigned Unassigned
            duncan.sands Duncan Sands (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: