Debezium / DBZ-438

Mongo initial sync misses records with initial.sync.max.threads > 1

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 0.6.1
    • Fix Version/s: 0.6.2
    • Component/s: mongodb-connector
    • Labels:
      None
    • Environment:

      Confluent 3.3.0

    • Steps to Reproduce:

      1. Set initial.sync.max.threads = 2.
      2. Prepare at least 2 collections with ~100k documents each.
      3. Perform an initial sync (not from the oplog).
      4. Compare the document counts in Kafka vs. MongoDB.
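
For step 1, a minimal sketch of a connector registration payload, built in Python so it can be printed as JSON. Only initial.sync.max.threads = 2 comes from this report. The connector name, the host string, and the logical name are placeholders, and the other property names are the usual MongoDB connector settings as I understand them, so they should be checked against the documentation for the version in use.

```python
import json

# Sketch of a connector registration payload for the reproduction.
# Only "initial.sync.max.threads": "2" is taken from the report; every
# other value below is a placeholder chosen for illustration.
connector = {
    "name": "mongodb-initial-sync-repro",  # hypothetical connector name
    "config": {
        "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
        "mongodb.hosts": "rs0/localhost:27017",  # placeholder replica set and host
        "mongodb.name": "repro",  # placeholder logical name
        "initial.sync.max.threads": "2",  # the setting that triggers the bug
    },
}

# Serialize to JSON, the form a Kafka Connect worker accepts.
payload = json.dumps(connector, indent=2)
print(payload)
```

Changing the value to "1" gives the single-threaded configuration, which serves as the baseline for comparison.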


      Description

      With the MongoDB source connector (0.6.1), setting initial.sync.max.threads > 1 causes the initial sync to miss records. For example, with two collections of ~100k documents each and initial.sync.max.threads = 2, each collection ends up missing roughly 100-200 seemingly random records. I ran this about 10 times and got different results every time. With 1 thread this never happens: in 10 out of 10 runs all the records are present in Kafka.

      The log says "Initial sync of 2 collections with a total of 153958 documents completed", which is the correct number of documents, but fewer of them arrive in Kafka. I verified this both by sinking the topic to a database and by manually consuming the topic with a simple Java app.
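
The count check described above could be sketched roughly as follows. This is an illustration only: the helper name count_shortfalls and the sample counts are hypothetical, and how the per-collection counts are obtained (a database sink or a consumer) is left out.

```python
def count_shortfalls(expected, observed):
    """Return {collection: missing_count} for every collection that has
    fewer records on the Kafka side than in the source database."""
    shortfalls = {}
    for name, want in expected.items():
        got = observed.get(name, 0)  # a collection absent from Kafka counts as 0
        if got < want:
            shortfalls[name] = want - got
    return shortfalls


# Hypothetical counts for illustration, not taken from the attached logs.
mongo_counts = {"collection_a": 1000, "collection_b": 500}
kafka_counts = {"collection_a": 990, "collection_b": 500}
print(count_shortfalls(mongo_counts, kafka_counts))
```

An empty result means every collection arrived in full, which is what the single-threaded runs produced.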

      I have attached TRACE logs for one such initial sync: the collection dmng.di.lineItems was missing 74 items in Kafka, and dmng.di.campaigns was missing 82. The missing items do not appear to be special in any way:

      Missing campaigns: 50239, 34748, 10166, 52861, 9180, 39537, 59604, 42647, 50283, 37489, 49078, 45638, 58653, 48533, 65921, 49864, 65417, 57489, 35396, 56547, 60663, 22442, 50267, 63244, 35335, 42521, 50047, 57263, 43247, 56583, 37669, 5070, 36857, 746, 38295, 64632, 46097, 58290, 43101, 66570, 47884, 35066, 56010, 42644, 31153, 35262, 45912, 39197, 4902, 10075, 30217, 1286, 30390, 57432, 33197, 39248, 30040, 59966, 52472, 5286, 59298, 43824, 636, 54293, 52248, 53722, 56300, 9005, 41935, 42420, 42867, 49397, 52535, 5527, 34921, 30408, 27456, 31040, 45856, 3411, 50599, 27847

      Missing lineItems: 12040, 29435, 17756, 17613, 7835, 19604, 18043, 31458, 15623, 21181, 243, 21371, 15402, 9384, 55312, 21416, 12510, 31115, 13131, 19635, 28350, 10067, 20745, 11877, 124, 23294, 27521, 27423, 18998, 8205, 27713, 2423, 1727, 14831, 17405, 18836, 28905, 25557, 19417, 11301, 2774, 13705, 2182, 12850, 33264, 9905, 24407, 27335, 19516, 55691, 6719, 2513, 28304, 13315, 22872, 7879, 6715, 13461, 7919, 31353, 7426, 9628, 28988, 13207, 8754, 273, 17757, 242, 27215, 19730, 10287, 20700, 11534, 19592
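
Lists like the two above can be produced with a plain set difference between the IDs in the source collection and the IDs seen in the topic. A minimal sketch, assuming both sides have already been read into Python lists of integer IDs; the helper name find_missing_ids and the sample IDs are hypothetical.

```python
def find_missing_ids(source_ids, sink_ids):
    """Return the sorted IDs that are in the source but never reached the sink."""
    seen = set(sink_ids)  # set membership keeps the check fast for ~100k IDs
    return sorted(i for i in set(source_ids) if i not in seen)


# Small made-up example; real input would be the IDs read from MongoDB
# and the IDs consumed from the Kafka topic.
print(find_missing_ids([1, 2, 3, 4, 5], [1, 3, 5]))
```

Running this after several syncs and comparing the results makes it easy to confirm that the set of missing IDs changes from run to run.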


              People

              • Assignee: Jiri Pechanec (jpechanec)
              • Reporter: Saulius Valatka (saulius_vl)