Details

Type: Bug
Resolution: Done
Priority: Critical
Fix Version/s: 0.9.0.Beta2
Affects Version/s: 0.8.3.Final
Component/s: mysql-connector
Labels: None


Description

Let's say we have two MySQL servers in a standard active-passive high-availability setup. If the current master node fails, automation promotes the passive instance to be the new master, and it continues to serve live traffic. Debezium connects to the master node as well.

Starting point:
Server A (current master)
uuid: abc
gtids: abc:1-100

Server B (slave)
uuid: dfg
gtids: abc:1-100 (replicating from master)

Debezium also connects to the master, so it has
gtids: abc:1-100

Now assume the master node fails and failover is triggered:

Server B (automation promotes it to new master)
uuid: dfg
gtids: abc:1-100, dfg:1-20

Server A (becomes slave, starts replication from B)
uuid: abc
gtids: abc:1-100, dfg:1-20

Debezium after job restart:
gtids: abc:1-100, dfg:1-20

Debezium gets a connection reset error. On job restart it successfully connects to the new master (Server B), finds the new GTID channel (dfg), merges it into its existing offsets, and resumes streaming.
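The merge described above can be illustrated with a minimal Python sketch. This is not Debezium's actual implementation: the function names (`parse_gtid_set`, `format_gtid_set`, `merge_on_restart`) are hypothetical, and the short UUIDs are the placeholders used in this example. Channels already present in the stored offsets keep their stored position, and a channel seen only on the server is adopted at the server's executed position.

```python
def parse_gtid_set(text):
    """Parse 'abc:1-100, dfg:1-20' into {'abc': [(1, 100)], 'dfg': [(1, 20)]}."""
    channels = {}
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *ranges = part.split(":")
        intervals = []
        for r in ranges:
            start, _, end = r.strip().partition("-")
            # A single transaction such as '5' is the interval (5, 5).
            intervals.append((int(start), int(end or start)))
        channels[uuid.strip()] = intervals
    return channels


def format_gtid_set(channels):
    """Render the dictionary form back into a GTID set string."""
    return ",".join(
        uuid + ":" + ":".join(f"{s}-{e}" if s != e else str(s) for s, e in ivs)
        for uuid, ivs in channels.items()
    )


def merge_on_restart(stored, server_executed):
    """Hypothetical merge: keep stored channels, adopt unknown ones as-is."""
    merged = dict(stored)
    for uuid, intervals in server_executed.items():
        if uuid not in merged:
            # New channel: everything the server has executed on it is
            # treated as already processed, so those events are never read.
            merged[uuid] = intervals
    return merged


stored = parse_gtid_set("abc:1-100")
server = parse_gtid_set("abc:1-100, dfg:1-20")
print(format_gtid_set(merge_on_restart(stored, server)))  # abc:1-100,dfg:1-20
```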

This works, but there is a timing issue.

When Debezium encounters a new GTID channel, it starts reading that channel from the MySQL server's latest gtid_executed position. So if the MySQL failover completes faster than Debezium detects the failure and restarts the job, the live data that arrived at the new master on the new GTID channel in the meantime (dfg:1-20 in our example) is never processed by Debezium. In our infrastructure this can mean several minutes of lost data, because Debezium startup takes some time with large schemas.

What do you think about an option to specify what Debezium should do when it encounters a new GTID channel: take the latest executed position and continue from there, or take the earliest available position on the server? The default could remain "latest", but in our case "earliest" would solve the problem of lost data changes on failover. "Earliest" could be the channel's gtid_purged value or, if nothing has been purged, position 1.
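The difference between the two proposed modes can be sketched as follows. This is an illustration only, assuming each channel is a single contiguous range starting at 1; the function name `first_transaction_to_read` and its parameters are hypothetical and do not reflect how Debezium implements the option.

```python
def first_transaction_to_read(mode, executed_end, purged_end=0):
    """Return the first transaction number read on a newly seen GTID channel.

    executed_end: highest transaction in gtid_executed for the channel
    purged_end:   highest transaction in gtid_purged for the channel,
                  or 0 if nothing has been purged
    """
    if mode == "latest":
        # Skip everything the server has already executed on this channel.
        return executed_end + 1
    if mode == "earliest":
        # Start right after the purged range, or at 1 if nothing was purged.
        return purged_end + 1
    raise ValueError(f"unknown mode: {mode!r}")


# Scenario from this report: the new master has executed dfg:1-20
# before Debezium reconnects, and nothing has been purged yet.
print(first_transaction_to_read("latest", executed_end=20))    # 21
print(first_transaction_to_read("earliest", executed_end=20))  # 1
# If dfg:1-5 had already been purged, "earliest" would start at 6.
print(first_transaction_to_read("earliest", executed_end=20, purged_end=5))  # 6
```

With "latest", transactions dfg:1-20 are skipped; with "earliest", they are read as long as the binlogs containing them are still available on the server.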

Attachments


Screen Shot 2018-09-27 at 22.48.03.png
141 kB
2018/09/27 3:48 PM

Issue Links

is related to

DBZ-1705 Default `gtid.new.channel.position` to earliest

Closed

People

Assignee: Unassigned

Reporter: Eero Koplimets (Inactive)

Votes: 0

Watchers: 3

Dates

Created: 2018/09/27 3:24 AM

Updated: 2020/01/15 7:03 AM

Resolved: 2018/12/12 10:45 AM