Details
-
Bug
-
Resolution: Not a Bug
-
Major
-
None
-
None
-
As explained in Ondra's comment, this issue is not a bug but a limitation of the current implementation.
-
Description
While testing transaction recovery on OpenShift, I see that scaling down a client pod that holds an in-doubt transaction does not succeed when a server pod that participates in the transaction is unreachable.
Scenario:
ejb client (app tx-client, pod tx-client-0):
- EJB business method
- lookup remote EJB
- enlist XA resource 1 to transaction
- enlist XA resource 2 to transaction
- call remote EJB
ejb server (app tx-server, pod tx-server-0):
- EJB business method
- enlist XA resource 1 to transaction
- enlist XA resource 2 to transaction
testTxStatelessServerSecondPrepareJvmHaltScaleDownClient
Test workflow:
- an XA resource on the ejb server crashes the JVM on the tx-server pod
- the label "wildfly.org/operated-by-headless" of the server pod is changed, which makes the server unreachable
- tx-server pod is scaled down
- tx-client pod is scaled down
Result: the scale-down of the client pod hangs. Operator log:
{"level":"info","ts":1570636701.0314589,"logger":"wildflyserver_controller","msg":"Scaling down statefulset by verification if pods are clean by recovery","StatefulSet.Namespace":"msimka-namespace","StatefulSet.Name":"tx-client"}
{"level":"info","ts":1570636701.0314867,"logger":"wildflyserver_controller","msg":"Statefulset was not scaled to the desired replica size 0 (current StatefulSet size: 1). Transaction recovery scaledown process has not cleaned all pods. Please, check status of the WildflyServer tx-client","StatefulSet.Namespace":"msimka-namespace","StatefulSet.Name":"tx-client"}
{"level":"info","ts":1570636703.5918324,"logger":"wildflyserver_controller","msg":"Reconciling WildFlyServer","Request.Namespace":"msimka-namespace","Request.Name":"tx-client"}
{"level":"info","ts":1570636703.5919785,"logger":"wildlfyserver_resources","msg":"Getting resource","WildFlyServer.Namespace":"msimka-namespace","WildFlyServer.Name":"tx-client","Resource.Name":"tx-client"}
{"level":"info","ts":1570636703.5920458,"logger":"wildlfyserver_resources","msg":"Got resource","WildFlyServer.Namespace":"msimka-namespace","WildFlyServer.Name":"tx-client","Resource.Name":"tx-client"}
{"level":"info","ts":1570636703.5921679,"logger":"wildflyserver_controller","msg":"Transaction recovery scaledown processing","Request.Namespace":"msimka-namespace","Request.Name":"tx-client","Pod Name":"tx-client-0","IP Address":"10.128.1.34","Pod State":"SCALING_DOWN_RECOVERY_DIRTY","Pod Phase":"Running"}
{"level":"info","ts":1570636703.5922475,"logger":"wildflyserver_controller","msg":"Recovery properties at pod were already defined. Skipping server restart.","Request.Namespace":"msimka-namespace","Request.Name":"tx-client","Pod Name":"tx-client-0"}
{"level":"info","ts":1570636703.5971646,"logger":"wildflyserver_controller","msg":"Executing recovery scan at tx-client-0","Request.Namespace":"msimka-namespace","Request.Name":"tx-client","Pod IP":"10.128.1.34","Recovery port":4712}
{"level":"info","ts":1570636709.162107,"logger":"wildflyserver_controller","msg":"Executing recovery scan at tx-client-0","Request.Namespace":"msimka-namespace","Request.Name":"tx-client","Pod IP":"10.128.1.34","Recovery port":4712}
{"level":"info","ts":1570636714.7187033,"logger":"wildflyserver_controller","msg":"In-doubt transactions in object store","Request.Namespace":"msimka-namespace","Request.Name":"tx-client","Pod Name":"tx-client-0","Message":"WildFly Transaction Client data dir is not empty and scaling down of the pod 'tx-client-0' will be retried.Wildfly Transacton Client data dir path '/opt/eap/standalone/data/ejb-xa-recovery', output listing: 20005_00000000000000000000ffff0a80012251f788695d9e026b0000001374782d636c69656e742d30_00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000\n"}
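
For triage, the operator log above can be filtered for the entries that block the scale-down. The following is only a sketch: the helper names `in_doubt_entries` and `trailing_ascii` are hypothetical, the sample entries are shortened copies of the log lines above, and it assumes the log is available as one JSON object per line. The layout of the record file name is not documented in this report; the second helper relies only on the observation that the middle segment of the name ends in bytes that decode as printable ASCII.

```python
import json

IN_DOUBT_MSG = "In-doubt transactions in object store"


def in_doubt_entries(log_lines):
    """Yield (pod name, message) for log entries that report in-doubt transactions."""
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log entry
        if entry.get("msg") == IN_DOUBT_MSG:
            yield entry.get("Pod Name"), entry.get("Message", "")


def trailing_ascii(hex_segment):
    """Return the trailing run of printable ASCII bytes of a hex string."""
    chars = []
    for byte in reversed(bytes.fromhex(hex_segment)):
        if not 0x20 <= byte <= 0x7E:
            break  # stop at the first non-printable byte
        chars.append(chr(byte))
    return "".join(reversed(chars))


# Shortened sample entries modelled on the log above (not the full original lines).
sample_log = [
    json.dumps({
        "level": "info",
        "logger": "wildflyserver_controller",
        "msg": "Executing recovery scan at tx-client-0",
        "Pod IP": "10.128.1.34",
        "Recovery port": 4712,
    }),
    json.dumps({
        "level": "info",
        "logger": "wildflyserver_controller",
        "msg": IN_DOUBT_MSG,
        "Pod Name": "tx-client-0",
        "Message": "WildFly Transaction Client data dir is not empty",
    }),
]

# Middle segment of the record file name listed in the last log entry.
record_middle = (
    "00000000000000000000ffff0a80012251f788695d9e026b"
    "0000001374782d636c69656e742d30"
)

if __name__ == "__main__":
    for pod, message in in_doubt_entries(sample_log):
        print(pod, "->", message)
    print(trailing_ascii(record_middle))
```

For the record listed in this report, the trailing bytes of the middle segment decode to the pod name tx-client-0, which suggests the leftover record in the ejb-xa-recovery directory was created by the client pod itself.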