  1. AMQ Streams
  2. ENTMQST-4332

Investigate KRaft observer as replacement for zk watches in TO


    • Type: Task
    • Resolution: Done
    • Priority: Major
    • 2.5.0.GA
    • None
    • topic-operator
    • None

      The Topic Operator currently uses watches on znodes as a notification mechanism for changes to topics in Kafka. KRaft doesn't offer equivalent functionality. This issue is to investigate in detail the feasibility of using a KRaft observer to do this.

      The basic idea would be to use the RaftManager class from Apache Kafka (not a publicly supported API) to give the TO visibility of committed changes to the __cluster_metadata log. We would therefore depend on internal details of the metadata records Kafka uses. Currently those would be TopicRecord, PartitionRecord, ConfigRecord, PartitionChangeRecord and RemoveTopicRecord. We'd be on the hook for adapting to any future changes to these records (such as switching ConfigRecord to support topic ids, or adding new records that affect topics).
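
      To make the dependency on record details concrete, here is a minimal sketch of mapping observed records to the topic they affect. The record types below are simplified, hypothetical stand-ins declared locally, not Kafka's real generated classes (for instance, the resource type is modelled as a String here), and TopicRecordFilter, BrokerRecord and affectedTopic are names invented for illustration. It assumes that ConfigRecord identifies a topic by name while the other records identify it by topic id, so an id-to-name map has to be maintained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class TopicRecordFilter {
    // Simplified, hypothetical stand-ins for Kafka's internal metadata records.
    public interface MetadataRecord {}
    public record TopicRecord(String topicId, String name) implements MetadataRecord {}
    public record PartitionRecord(String topicId, int partitionId) implements MetadataRecord {}
    public record PartitionChangeRecord(String topicId, int partitionId) implements MetadataRecord {}
    public record RemoveTopicRecord(String topicId) implements MetadataRecord {}
    public record ConfigRecord(String resourceType, String resourceName,
                               String name, String value) implements MetadataRecord {}
    // Stand-in for any record type that does not concern topics.
    public record BrokerRecord(int brokerId) implements MetadataRecord {}

    // Topic id -> topic name, built up from the TopicRecords seen so far.
    private final Map<String, String> namesById = new HashMap<>();

    /** Returns the name of the topic a record affects, or empty if it is not topic-related. */
    public Optional<String> affectedTopic(MetadataRecord r) {
        if (r instanceof TopicRecord topic) {
            namesById.put(topic.topicId(), topic.name());
            return Optional.of(topic.name());
        }
        if (r instanceof PartitionRecord part) {
            return Optional.ofNullable(namesById.get(part.topicId()));
        }
        if (r instanceof PartitionChangeRecord change) {
            return Optional.ofNullable(namesById.get(change.topicId()));
        }
        if (r instanceof RemoveTopicRecord removed) {
            // Forget the mapping once the topic is gone.
            return Optional.ofNullable(namesById.remove(removed.topicId()));
        }
        if (r instanceof ConfigRecord config && "TOPIC".equals(config.resourceType())) {
            // Config records name the topic directly rather than using its id.
            return Optional.of(config.resourceName());
        }
        return Optional.empty();
    }
}
```

      Any new record type that affected topics would need another branch here, which is the maintenance burden described above.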

      Option 1

      We could just use this as a pure notification mechanism, i.e. once we knew a topic had been changed in Kafka we could use an Admin client to get the "official" TopicDescription/ConfigResource. This is how the current watch-based functionality works. But there's the possibility of races, because we cannot be sure that the broker which receives the MetadataRequest or DescribeConfigsRequest has actually fetched the metadata which we've observed via the RaftManager.
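
      The race can be illustrated with a small simulation. This is only a sketch under stated assumptions: MetadataLog, Broker and onTopicChanged are hypothetical names, the "log" tracks just the partition count of a single topic, and the broker is modelled as answering from whatever prefix of the log it has fetched so far. No real Kafka classes are used.

```java
import java.util.Map;
import java.util.TreeMap;

public class NotificationRaceSketch {
    /** Hypothetical committed metadata log for one topic: offset -> partition count. */
    public static final class MetadataLog {
        private final TreeMap<Long, Integer> partitionsAtOffset = new TreeMap<>();

        public void commit(long offset, int partitions) {
            partitionsAtOffset.put(offset, partitions);
        }

        /** The topic's partition count as seen by a reader that has fetched up to the given offset. */
        Integer viewAt(long offset) {
            Map.Entry<Long, Integer> e = partitionsAtOffset.floorEntry(offset);
            return e == null ? null : e.getValue();
        }
    }

    /** Hypothetical broker that answers describe calls from the log prefix it has fetched. */
    public static final class Broker {
        private final MetadataLog log;
        private long fetchedUpTo = -1;

        public Broker(MetadataLog log) {
            this.log = log;
        }

        public void fetchUpTo(long offset) {
            fetchedUpTo = offset;
        }

        public Integer describePartitions() {
            return log.viewAt(fetchedUpTo);
        }
    }

    /**
     * Option 1 handler: the observed record is only a trigger. In this sketch the
     * describe call carries no offset, so the observed offset goes unused and a
     * stale answer cannot be told apart from a fresh one.
     */
    public static Integer onTopicChanged(long observedOffset, Broker broker) {
        return broker.describePartitions();
    }
}
```

      For example, if the observer sees a change committed at offset 20 while the queried broker has only fetched up to offset 10, onTopicChanged returns the partition count from offset 10.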

      Option 2

      Alternatively we could depend on the metadata records more deeply, essentially having our own metadata cache within the TO and dispensing with using the Admin client for such queries. This would avoid such race conditions entirely. It would also give us access to the __cluster_metadata offset for each record, which would function like a logical version/epoch/generation for the topic. That would allow for extra improvements (for example, we could avoid the 3-way diff if the Kubernetes and Kafka generations hadn't changed).
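
      A minimal sketch of the generation idea follows. It assumes the TO records, per topic, the latest __cluster_metadata offset it has observed and a generation number for the Kubernetes resource, and skips reconciliation when neither has moved since the last successful reconcile. TopicGenerationCache and its methods are hypothetical names, and how the Kubernetes-side generation is obtained is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class TopicGenerationCache {
    // Topic name -> highest metadata log offset observed for that topic.
    private final Map<String, Long> kafkaGeneration = new HashMap<>();
    // Generations that were current at the last successful reconciliation.
    private final Map<String, Long> reconciledKafka = new HashMap<>();
    private final Map<String, Long> reconciledKube = new HashMap<>();

    /** Called for every observed metadata record that affects the topic. */
    public void onMetadataRecord(String topic, long offset) {
        kafkaGeneration.merge(topic, offset, Long::max);
    }

    /** True if either side has changed since the last reconciliation (or none has happened yet). */
    public boolean needsReconcile(String topic, long kubeGeneration) {
        boolean kafkaChanged =
                !Objects.equals(kafkaGeneration.get(topic), reconciledKafka.get(topic));
        boolean kubeChanged =
                !Objects.equals(kubeGeneration, reconciledKube.get(topic));
        return kafkaChanged || kubeChanged;
    }

    /** Remembers the generations that a completed reconciliation was based on. */
    public void markReconciled(String topic, long kubeGeneration) {
        reconciledKafka.put(topic, kafkaGeneration.get(topic));
        reconciledKube.put(topic, kubeGeneration);
    }
}
```

      With this shape, the 3-way diff would only run when needsReconcile returns true.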

       

            Unassigned Unassigned
            tbentley-1 Tom Bentley
            Lukas Kral Lukas Kral