We have seen many cases where, due to database or network issues, a Quartz trigger fails more than 5 times and is then deleted. This can cause the workflow to abort abnormally.
We have even seen cases where the Quartz trigger for a process instance is deleted but the process instance is left in the reserved state. The process instance then sits idle, which impacts the business SLAs.
I think we need a better mechanism for handling triggers after 5 consecutive failures than deleting them and aborting the process. It would also be nice to be able to configure the maximum failure count and the retry wait time.
It would be great if these triggers could be put into an error state instead of being deleted, with a mechanism provided to recover the failed triggers.
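
To make the request concrete, here is a rough sketch of the behaviour I have in mind. It is plain Java and does not use the Quartz API. All names in it (RetryPolicy, TriggerRegistry, maxFailures, retryWait, recover) are hypothetical and only illustrate the idea: a configurable failure limit and wait time, triggers parked in an error state instead of being deleted, and an explicit recovery call.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: none of these types exist in Quartz or the engine.
final class TriggerRetrySketch {

    enum State { NORMAL, ERROR }

    // Hypothetical configuration: maximum consecutive failures and wait between retries.
    static final class RetryPolicy {
        final int maxFailures;
        final Duration retryWait;

        RetryPolicy(int maxFailures, Duration retryWait) {
            if (maxFailures < 1) {
                throw new IllegalArgumentException("maxFailures must be >= 1");
            }
            this.maxFailures = maxFailures;
            this.retryWait = retryWait;
        }
    }

    // Per-trigger bookkeeping, keyed by process instance id in the registry below.
    static final class TriggerRecord {
        int consecutiveFailures;
        State state = State.NORMAL;
    }

    static final class TriggerRegistry {
        private final RetryPolicy policy;
        private final Map<String, TriggerRecord> triggers = new LinkedHashMap<>();

        TriggerRegistry(RetryPolicy policy) {
            this.policy = policy;
        }

        void register(String id) {
            triggers.put(id, new TriggerRecord());
        }

        // Records one failed firing. Returns true if the trigger should be retried
        // after policy.retryWait, false if it has been parked in the ERROR state.
        boolean recordFailure(String id) {
            TriggerRecord r = require(id);
            r.consecutiveFailures++;
            if (r.consecutiveFailures >= policy.maxFailures) {
                r.state = State.ERROR; // keep the trigger; do not delete it
            }
            return r.state == State.NORMAL;
        }

        // A successful firing resets the consecutive-failure counter.
        void recordSuccess(String id) {
            require(id).consecutiveFailures = 0;
        }

        // Lists triggers waiting for recovery.
        List<String> failedTriggers() {
            List<String> failed = new ArrayList<>();
            for (Map.Entry<String, TriggerRecord> e : triggers.entrySet()) {
                if (e.getValue().state == State.ERROR) {
                    failed.add(e.getKey());
                }
            }
            return failed;
        }

        // Recovery: put a parked trigger back into service with a fresh counter.
        void recover(String id) {
            TriggerRecord r = require(id);
            r.consecutiveFailures = 0;
            r.state = State.NORMAL;
        }

        State stateOf(String id) {
            return require(id).state;
        }

        private TriggerRecord require(String id) {
            TriggerRecord r = triggers.get(id);
            if (r == null) {
                throw new IllegalArgumentException("unknown trigger: " + id);
            }
            return r;
        }
    }

    public static void main(String[] args) {
        TriggerRegistry registry =
                new TriggerRegistry(new RetryPolicy(5, Duration.ofSeconds(30)));
        registry.register("process-instance-1");
        for (int i = 0; i < 5; i++) {
            registry.recordFailure("process-instance-1");
        }
        System.out.println("Failed triggers: " + registry.failedTriggers());
        registry.recover("process-instance-1");
        System.out.println("State after recovery: " + registry.stateOf("process-instance-1"));
    }
}
```

The point of the sketch is that the trigger record survives the failure limit, so an administrator (or an automated job) can list the failed triggers and recover them once the database or network issue is resolved, rather than finding the process instance stuck with no trigger at all.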