- Bug
- Resolution: Unresolved
- Major
- None
- 0.8.3.Final
- None
If an exception is thrown during task execution, the task gets stuck in the RUNNING state. Only after 30 minutes does it change to the FAILED state. During that period we cannot detect the failure and restart the task.
Expected result:
The task changes its state to FAILED immediately after the exception is thrown.
Actual result:
The task changes its state to FAILED 30 minutes after the exception is thrown.
Some investigation on this issue:
1. PostgresConnectorTask.commit() is called from the Kafka Connect code just after the exception is thrown.
2. PostgresConnectorTask.commit() is blocked in the RecordsStreamProducer.commit() call.
3. RecordsStreamProducer.commit() is waiting for a lock held by RecordsStreamProducer.streamChanges(): the actual contention is in org.postgresql.core.v3.CopyDualImpl, writeToCopy vs. readFromCopy.
4. Connect thread stacks taken just after the exception are in the "blocked.txt" attachment.
5. Connect thread stacks taken 30 minutes after the exception are in the "unblocked.txt" attachment.
6. Connect logs are in the "connect_log.txt" attachment (the exception was thrown at 12:50, and the task failed only at 13:20).
7. The problem is reproduced in 100% of test runs (see Steps to Reproduce).
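To make the contention in point 3 easier to reason about, here is a minimal toy model of it in plain Java. It is only a sketch under assumptions: the class `BlockedCommitSketch`, its fields, and the `demo()` helper are hypothetical names, and the code does not use or reproduce the PostgreSQL driver or Debezium. It only shows the general pattern where a reader holds a shared lock while blocked waiting for data, so a writer that needs the same lock cannot proceed until the reader returns.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class BlockedCommitSketch {
    // Hypothetical stand-in for the lock contended by readFromCopy and writeToCopy.
    private final ReentrantLock copyLock = new ReentrantLock();
    private final CountDownLatch readerHoldsLock = new CountDownLatch(1);
    private final CountDownLatch dataArrived = new CountDownLatch(1);

    // Plays the role of streamChanges(): holds the lock while waiting for data.
    void streamChanges() throws InterruptedException {
        copyLock.lock();
        try {
            readerHoldsLock.countDown();
            dataArrived.await(); // blocks like a read with no data and no timeout
        } finally {
            copyLock.unlock();
        }
    }

    // Plays the role of commit(): needs the same lock in order to write.
    // Returns true only if the lock could be acquired within the timeout.
    boolean commit(long timeoutMillis) throws InterruptedException {
        if (!copyLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false;
        }
        try {
            return true; // the write would happen here
        } finally {
            copyLock.unlock();
        }
    }

    // Runs the scenario: result[0] is commit() while the reader is blocked,
    // result[1] is commit() after the reader has released the lock.
    static boolean[] demo() {
        BlockedCommitSketch sketch = new BlockedCommitSketch();
        Thread reader = new Thread(() -> {
            try {
                sketch.streamChanges();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reader.setDaemon(true);
        reader.start();
        try {
            sketch.readerHoldsLock.await();
            boolean whileBlocked = sketch.commit(200);
            sketch.dataArrived.countDown(); // simulate data arriving, reader returns
            reader.join();
            boolean afterRelease = sketch.commit(200);
            return new boolean[] {whileBlocked, afterRelease};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("commit while reader blocked: " + result[0]);
        System.out.println("commit after reader released: " + result[1]);
    }
}
```

In this model the writer uses a timed `tryLock`, so it gives up instead of hanging. Whether a timed wait, or a timeout on the blocking read, is a suitable fix for the connector itself is an open question that this sketch does not answer.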
Need help guys!
- relates to
- DBZ-1025 Connector remains in RUNNING state after exception during snapshotting
- Open