[DBZ-607] Implement a CDC connector for Apache Cassandra

Type: Feature Request
Resolution: Done
Priority: Major
Fix Version/s: 0.10.0.Beta4
Affects Version/s: None
Component/s: cassandra-connector
Labels: None

Are there any plans to enable Cassandra support?

Jiri Pechanec added a comment - 2019/08/16 4:12 AM

Released

Gunnar Morling added a comment - 2019/06/28 5:19 AM

Thanks a lot for sending that first PR, jgao54; super-happy to see it!

Hey abrarsheikh, welcome and nice seeing you here. Thanks for sharing those slides, they are really insightful. Looking forward to collaborating with you on this, too!

Abrar Ahmed Sheikh (Inactive) added a comment - 2019/06/27 2:16 PM

Sharing my slides from the DataStax conference on a possible design pattern for solving this problem: https://www.slideshare.net/AbrarSheikh1/streaming-data-from-cassandra-into-kafka. jgao54, thanks for picking this up; looking forward to collaborating.

Gunnar Morling added a comment - 2019/06/14 2:54 AM

Hey jgao54, I took the liberty of assigning this one to you.

Gunnar Morling added a comment - 2019/04/29 6:54 AM

Hey jgao54, criccomini, I just stumbled upon this one. I'd suggest using it as the hub for any related discussion.

Tony Tony (Inactive) added a comment - 2018/02/09 4:53 AM

From the documentation, it doesn't look like CDC requires any fewer configuration / preparation steps. For instance, CDC has to be enabled per table using "WITH cdc=true" and also in the cassandra.yaml file.

The sidecar process has to be installed on the actual Cassandra node to monitor the CDC directory for new files (it would be interesting to know whether there is a way to flush the memtable more frequently). There are also some caveats when ingesting CDC data, because it may contain replica data, so we need to figure out how and where to de-duplicate.
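Editorial sketch, not part of the original comment: the two concerns above (picking up new files in the CDC directory, and dropping mutations that arrive once per replica) could look roughly like the following. Everything here is an assumption made for illustration: the helper names `new_segments` and `ReplicaDeduplicator`, the `.log` file suffix filter, and the choice of (table, partition key, write timestamp) as the de-duplication key are hypothetical, and the de-duplication step is assumed to run somewhere that sees events from all replicas rather than on a single node.

```python
import os
from collections import OrderedDict


def new_segments(cdc_dir, seen):
    """Return paths of commit log files in cdc_dir not returned before.

    `seen` is a set of file names owned by the caller; it is updated in place.
    The ".log" suffix filter is an assumption for this sketch.
    """
    fresh = sorted(
        name for name in os.listdir(cdc_dir)
        if name.endswith(".log") and name not in seen
    )
    seen.update(fresh)
    return [os.path.join(cdc_dir, name) for name in fresh]


class ReplicaDeduplicator:
    """Remember recently seen mutation keys in a bounded, LRU-style cache."""

    def __init__(self, capacity=100_000):
        self._seen = OrderedDict()
        self._capacity = capacity

    def is_new(self, table, partition_key, write_timestamp):
        # Hypothetical identity of a mutation: the same write reported by
        # several replicas is assumed to share all three values.
        key = (table, partition_key, write_timestamp)
        if key in self._seen:
            self._seen.move_to_end(key)
            return False
        self._seen[key] = True
        if len(self._seen) > self._capacity:
            self._seen.popitem(last=False)  # evict the oldest key
        return True
```

A polling loop would call `new_segments` periodically, parse each returned file, and forward only the mutations for which `is_new` returns true. The bounded cache trades completeness for memory: a duplicate that arrives after its key has been evicted would still be forwarded.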

Gunnar Morling added a comment - 2018/02/09 4:37 AM

Yeah, I had a quick look at this a while ago to better understand the options and I also concluded that 1) would be the way to go. Triggers are quite invasive and need special installation/configuration for each captured table which isn't desirable.

IIRC, the CDC support requires installing some component on the actual Cassandra host, which would be a bit different from the existing connectors, where we essentially use the DB's client API to connect to the server. So we'd have to figure out a way to get the changes from that other service into our connector (which runs within the Kafka Connect process).

Tony Tony (Inactive) added a comment - 2018/02/09 4:32 AM

gunnar.morling

Let's discuss here how we can attack the problem. I need to have a chat with my company's legal team to see how I can contribute.

I see at least 2 options here:

1. Cassandra CDC. Seems like the best option to go for, although we can't claim near real-time replication here due to the memtable flush mechanism.
2. Cassandra trigger. Because it runs before the operation even completes, I wouldn't call it a reliable mechanism for replication, although it is much faster (near real-time) compared to #1.

Gunnar Morling added a comment - 2018/02/09 4:24 AM

Hi yandooo, it's one idea on our long-term roadmap, but so far we haven't seen many requests for it. I'd personally like to have Cassandra support, but really it's a question of priorities and the capacity we have. If you'd like to see it done sooner rather than later, your best chance would be to contribute a connector yourself. If you're interested, let me know and we can have a discussion on how to approach it.

Assignee: Joy Gao (Inactive)
Reporter: Tony Tony (Inactive)
Votes: 0
Watchers: 4
Created: 2018/02/09 3:48 AM
Updated: 2023/01/24 3:42 PM
Resolved: 2019/07/18 8:06 PM
