DBZ-1247: Ability to specify batch size during snapshot

      Description

      The io.debezium.connector.mongodb.Replicator class does not set a batch size on the MongoDB cursor it opens during a snapshot, so the connector has no way to limit how much memory each fetched batch takes:

      try (MongoCursor<Document> cursor = docCollection.find().iterator()) {
          while (running.get() && cursor.hasNext()) {
              Document doc = cursor.next();
              logger.trace("Found existing doc in {}: {}", collectionId, doc);
              counter += factory.recordObject(collectionId, doc, timestamp);
          }
      }
      

      If no batch size is specified, the MongoDB server chooses one itself.

      I propose adding the following option:

      Property | Default | Description
      documents.fetch.size | 0 | Positive integer value that specifies the maximum number of documents that should be read in one go from each collection while taking a snapshot. The connector will read the collection contents in multiple batches of this size. Defaults to 0, which indicates that the server chooses an appropriate fetch size.
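
The proposal could be wired in roughly as follows. This is only a sketch under stated assumptions, not the connector's actual code: the class name SnapshotFetchSize and its parse and apply helpers are hypothetical, and rejecting negative values is an assumption the issue does not state. The only real API it relies on is FindIterable.batchSize(int) from the MongoDB Java driver, which is passed in as a function so the sketch compiles without the driver on the classpath.

```java
import java.util.Map;
import java.util.function.BiFunction;

/** Hypothetical helper: reads the proposed option and applies it only when it is positive. */
final class SnapshotFetchSize {

    // Property name and default value as proposed in this issue.
    static final String PROPERTY = "documents.fetch.size";
    static final int DEFAULT = 0;

    private SnapshotFetchSize() {
    }

    /** Returns the configured fetch size, or 0 (server decides) when the property is absent or blank. */
    static int parse(Map<String, String> props) {
        String raw = props.get(PROPERTY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT;
        }
        int value = Integer.parseInt(raw.trim());
        if (value < 0) {
            // Assumption: negative values are treated as a configuration error.
            throw new IllegalArgumentException(PROPERTY + " must be >= 0, got " + value);
        }
        return value;
    }

    /**
     * Applies the fetch size to a query object. A value of 0 leaves the query untouched,
     * so the server keeps choosing the batch size, which matches the current behavior.
     */
    static <Q> Q apply(Q query, int fetchSize, BiFunction<Q, Integer, Q> batchSizeSetter) {
        return fetchSize > 0 ? batchSizeSetter.apply(query, fetchSize) : query;
    }
}
```

With the driver available, the snapshot loop quoted above might then open its cursor with SnapshotFetchSize.apply(docCollection.find(), fetchSize, FindIterable::batchSize).iterator() instead of docCollection.find().iterator(), leaving the rest of the loop unchanged.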

      People

      Assignee: Andrey Pustovetov (jchipmunk)
      Reporter: Andrey Pustovetov (jchipmunk)