Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core-library
    • Labels: None

Description

The idea is to export an id uniquely identifying a given change event, based on its position in the source DB's log (e.g. the binlog offset in the case of MySQL). This would be a single field, e.g. derived by hashing all the (connector-specific) offset attributes. A single field that consumers can handle without knowing the connector-specific details allows for duplicate detection on their side. In particular, a sink connector will be able to ignore duplicates by using INSERT queries that result in a no-op if there already is a record on the sink with the same unique event id.
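
A minimal sketch of how such an id could be derived, assuming the connector-specific offset attributes are available as a plain map (class and method names below are illustrative, not an existing Debezium API):

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;
    import java.util.Map;
    import java.util.TreeMap;

    public class EventIdSketch {

        // Hash all offset attributes (e.g. binlog file, position, row for MySQL)
        // into one opaque, connector-agnostic identifier.
        public static String eventId(Map<String, ?> sourceOffset) {
            try {
                MessageDigest digest = MessageDigest.getInstance("SHA-256");
                // Sort the keys so the id is stable regardless of map iteration order.
                for (String key : new TreeMap<>(sourceOffset).keySet()) {
                    digest.update(key.getBytes(StandardCharsets.UTF_8));
                    digest.update(String.valueOf(sourceOffset.get(key)).getBytes(StandardCharsets.UTF_8));
                }
                StringBuilder hex = new StringBuilder();
                for (byte b : digest.digest()) {
                    hex.append(String.format("%02x", b));
                }
                return hex.toString();
            }
            catch (NoSuchAlgorithmException e) {
                throw new IllegalStateException(e);
            }
        }
    }

For a MySQL event this would hash e.g. the binlog file name, position and row number into a single string, so consumers never need to know which offset attributes a particular connector uses.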

Note that – unlike basing duplicate detection in sink connectors on Kafka topic offsets – deriving this information from the actual offset in the source DB ensures that duplicates can also be detected after a Debezium connector restart, as the same event exported a second time will have the same event id.

This attribute could go into the "source" structure or alternatively be conveyed as a header property.
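
Either way, a sink connector could resolve the id without any connector-specific knowledge and use it for a no-op insert. A rough sketch, assuming a header or source field named "eventId", a table "applied_events" and a PostgreSQL target for the ON CONFLICT clause (all of these are assumptions for the example, not part of this proposal):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;

    import org.apache.kafka.connect.data.Struct;
    import org.apache.kafka.connect.header.Header;
    import org.apache.kafka.connect.sink.SinkRecord;

    public class IdempotentSinkSketch {

        // Look up the unique event id, preferring a header over the "source" block.
        static String eventId(SinkRecord record) {
            Header header = record.headers().lastWithName("eventId");
            if (header != null) {
                return header.value().toString();
            }
            Struct value = (Struct) record.value();
            return value.getStruct("source").getString("eventId");
        }

        // Insert that turns into a no-op if the same event id was already applied.
        static void apply(Connection connection, SinkRecord record) throws SQLException {
            try (PreparedStatement stmt = connection.prepareStatement(
                    "INSERT INTO applied_events (event_id) VALUES (?) ON CONFLICT (event_id) DO NOTHING")) {
                stmt.setString(1, eventId(record));
                stmt.executeUpdate();
            }
        }
    }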

People

    • Assignee: Unassigned
    • Reporter: Gunnar Morling (gunnar.morling)
    • Votes: 0
    • Watchers: 1
