[DBZ-812] Handle changes of table definitions and tables created while streaming

          Jiri Pechanec added a comment - Released

          Gunnar Morling added a comment - Done with reviewing the PR, LGTM overall. Going to merge it now. It'd still be good to have some guidance in the docs on how to deal with schema changes, especially the caveat that values from newly added columns won't be contained in CDC messages until a new capture instance has been created.

          Gunnar Morling added a comment - A list of things to cover in the docs as part of this change:

          • Describe steps to be taken in case of DDL changes (create new capture instance etc.)
          • ...
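
          As a rough illustration of the "create new capture instance" step mentioned above, the following Java sketch only builds the T-SQL statements an operator might run after a DDL change. It assumes SQL Server's `sys.sp_cdc_enable_table` and `sys.sp_cdc_disable_table` procedures; the class name, the method names, and the table and capture instance names (`dbo.customers`, `dbo_customers`, `dbo_customers_v2`) are made up for the example and do not come from the PR.

```java
public class CaptureInstanceSql {

    // Wraps a value in a T-SQL N'...' literal, doubling embedded single quotes.
    static String literal(String value) {
        return "N'" + value.replace("'", "''") + "'";
    }

    // Statement that creates an additional capture instance for a table
    // after its definition has changed (hypothetical helper, not Debezium API).
    static String enableSql(String schema, String table, String captureInstance) {
        return "EXEC sys.sp_cdc_enable_table"
                + " @source_schema = " + literal(schema)
                + ", @source_name = " + literal(table)
                + ", @role_name = NULL"
                + ", @capture_instance = " + literal(captureInstance);
    }

    // Statement that drops the old capture instance once it is no longer read.
    static String disableSql(String schema, String table, String captureInstance) {
        return "EXEC sys.sp_cdc_disable_table"
                + " @source_schema = " + literal(schema)
                + ", @source_name = " + literal(table)
                + ", @capture_instance = " + literal(captureInstance);
    }

    public static void main(String[] args) {
        // Assumed order: ALTER TABLE first, then enable the new instance,
        // then disable the old one after the connector has switched over.
        System.out.println(enableSql("dbo", "customers", "dbo_customers_v2"));
        System.out.println(disableSql("dbo", "customers", "dbo_customers"));
    }
}
```

          SQL Server allows at most two capture instances per table, so the old instance has to be disabled before a further schema change can be handled the same way.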

          Grzegorz Kołakowski (Inactive) added a comment (edited) - jpechane I see your point. However, I think this approach can be used for the time being, until a DDL parser is implemented. In my opinion it's better to have an alpha release with schema evolution support with a limitation than to have no support or an incorrect schema (all fields nullable). In addition, I do not consider this limitation to be problematic.

          Jiri Pechanec added a comment - grzegorz.kolakowski I originally took the schema from `getTableSchemaFromTable`. The problem is what happens if you introduce two changes in a row, i.e. whether we manage to read the schema in time. OTOH we can think about it as a limitation. As the operator is responsible for consistency via switching the capture instances, we might make it a limitation that the operator has to switch to another consistent schema state. In that case we would not need DDL. But this is something that definitely needs more thinking.

          Grzegorz Kołakowski (Inactive) added a comment - jpechane Can't we execute `sp_columns` on the original table? There is a NULLABLE column available.

          Jiri Pechanec added a comment - grzegorz.kolakowski Hi, yes, for the new tables we will get the schema from the original table. The problem is with ALTER. There is no way we could detect a change of column nullability other than from the parser.

          Grzegorz Kołakowski (Inactive) added a comment (edited) - jpechane No, I just saw your discussion on Gitter. I do not fully understand why a DDL parser is necessary.

          I just wonder whether we can obtain the schema from the original table directly (call `getTableSchemaFromTable` instead of `getTableSchemaFromChangeTable`)? I would assume that when Debezium detects that a new capture instance has been created, the schema of the newly created change table corresponds to the original table's schema.

          Jiri Pechanec added a comment - grzegorz.kolakowski Unfortunately we came to the same conclusion. Right now we are thinking about taking all information from the change table and nullability from the DDL table using the parser.
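
          The combination described above (column list from the change table, nullability from a second source) could look roughly like the sketch below. It is only an illustration: `Column` and `merge` are hypothetical names, not Debezium classes, and where the nullability map comes from (a DDL parser, or `sp_columns` on the source table) is deliberately left open. The case of two schema changes in a row is not handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NullabilityMerge {

    // Minimal stand-in for a column definition; not a Debezium type.
    static final class Column {
        final String name;
        final String type;
        final boolean optional;

        Column(String name, String type, boolean optional) {
            this.name = name;
            this.type = type;
            this.optional = optional;
        }
    }

    // Keeps name and type from the change table and overrides the optional
    // flag with the value from the second source when one is known.
    static List<Column> merge(List<Column> changeTableColumns, Map<String, Boolean> sourceNullable) {
        List<Column> merged = new ArrayList<>();
        for (Column c : changeTableColumns) {
            boolean optional = sourceNullable.getOrDefault(c.name, c.optional);
            merged.add(new Column(c.name, c.type, optional));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Columns as read from a change table: everything looks nullable.
        List<Column> fromChangeTable = new ArrayList<>();
        fromChangeTable.add(new Column("id", "int", true));
        fromChangeTable.add(new Column("email", "varchar", true));

        // Nullability as reported by the second source (assumed values).
        Map<String, Boolean> nullable = new LinkedHashMap<>();
        nullable.put("id", false);

        for (Column c : merge(fromChangeTable, nullable)) {
            System.out.println(c.name + " " + c.type + " optional=" + c.optional);
        }
    }
}
```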

          Grzegorz Kołakowski (Inactive) added a comment - After the schema change, all columns become DEFAULT VALUE NULL. I see that a DDL parser is needed to address the issue. Is there any other way around it?

            jpechane Jiri Pechanec
            gunnar.morling Gunnar Morling