  1. WildFly Core
  2. WFCORE-4519

Slave Host Controller deployment repository is cleaned after a full deployment replacement

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 9.0.1.Final
    • Fix Version/s: 10.0.0.Beta1
    • Component/s: Management
    • Labels:
      None
    • Environment:

      Domain mode, slave HC with deployments in its server groups

    • Workaround:
      Workaround Exists
    • Workaround Description:

      To avoid hitting the issue, use the server-group:replace-deployment operation instead of deploy --force to update content in the server groups. For example:

      [domain@localhost:9990 /] deploy /applications/test-application.war --name=test-application-v2.war --disabled
      
      [domain@localhost:9990 /] /server-group=main-server-group:replace-deployment(name=test-application-v2.war, runtime-name=test-application.war, to-replace=test-application.war)
      

      Description

      In domain mode, there is a cleanup task that removes obsolete content from the deployment repository of each process (DC, slave HC, and servers). By default, this task is executed every five minutes.

      On each execution, the task checks whether any content is obsolete. Obsolete content is marked first and deleted only on the next execution of the task.

      Deployment content is considered obsolete in a slave HC if there are no references to it, that is, if no server group has this deployment configured.

      The issue is that the deployment handler that replaces the deployment content in a slave HC does not add a reference to the new content when there are affected server groups.

      As a consequence, the cleanup task can delete the slave HC content. If this happens while the servers are starting, they can fail to start with the following error:

      2019-06-12 08:51:32,813 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0013: Operation ("add") failed - address: ([("deployment" => "test-application.war")]) - failure description: "WFLYSRV0137: No deployment content with hash b1fb3b872b3490bbdbd152bd082791b1f170397d is available in the deployment content repository for deployment 'test-application.war'. This is a fatal boot error. To correct the problem, either restart with the --admin-only switch set and use the CLI to install the missing content or remove it from the configuration, or remove the deployment from the xml configuration file and restart."
      2019-06-12 08:51:32,817 FATAL [org.jboss.as.server] (Controller Boot Thread) WFLYSRV0056: Server boot has failed in an unrecoverable manner; exiting. See previous messages for details.
      2019-06-12 08:51:32,833 INFO  [org.jboss.as] (MSC service thread 1-4) WFLYSRV0050: WildFly Full 17.0.0.Final-SNAPSHOT (WildFly Core 9.0.1.Final-SNAPSHOT) stopped in 5ms
      

      The issue is difficult to hit because it is the server that requests the required files from the slave HC. To reproduce it, the timing must coincide: the server requests a deployment file from its HC, the HC still has this file in its deployment repository but marked as obsolete, and the cleanup task removes the file before the HC sends it to the server.
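
The mark-then-delete behaviour and the missing reference described above can be illustrated with a small, self-contained Java model. This is only a sketch under stated assumptions: the class ContentRepositorySketch, its methods, and the hash names are hypothetical and do not correspond to WildFly Core classes. It also does not model the timing between the server request and the cleanup task; it only shows that content without a reference disappears after two cleanup runs.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model for illustration only; these are not WildFly Core classes.
public class ContentRepositorySketch {

    // Content hash -> server groups that reference it.
    private final Map<String, Set<String>> references = new HashMap<>();
    // Hashes that the previous cleanup run found unreferenced.
    private final Set<String> markedObsolete = new HashSet<>();

    public void addContent(String hash) {
        references.putIfAbsent(hash, new HashSet<>());
    }

    public void addReference(String hash, String serverGroup) {
        references.computeIfAbsent(hash, h -> new HashSet<>()).add(serverGroup);
    }

    public void removeReference(String hash, String serverGroup) {
        Set<String> groups = references.get(hash);
        if (groups != null) {
            groups.remove(serverGroup);
        }
    }

    public boolean hasContent(String hash) {
        return references.containsKey(hash);
    }

    // Replaces the content used by a server group. Passing addReference=false
    // models the defect described in this issue (assumption of this sketch).
    public void replaceContent(String oldHash, String newHash,
                               String serverGroup, boolean addReference) {
        addContent(newHash);
        removeReference(oldHash, serverGroup);
        if (addReference) {
            addReference(newHash, serverGroup);
        }
    }

    // One run of the cleanup task: delete what the previous run marked and is
    // still unreferenced, then mark whatever is unreferenced now.
    public void cleanUp() {
        for (String hash : markedObsolete) {
            Set<String> groups = references.get(hash);
            if (groups != null && groups.isEmpty()) {
                references.remove(hash);
            }
        }
        markedObsolete.clear();
        for (Map.Entry<String, Set<String>> entry : references.entrySet()) {
            if (entry.getValue().isEmpty()) {
                markedObsolete.add(entry.getKey());
            }
        }
    }

    public static void main(String[] args) {
        ContentRepositorySketch slave = new ContentRepositorySketch();
        slave.addReference("hash-v1", "main-server-group");
        // Replace the content without registering a reference to the new hash.
        slave.replaceContent("hash-v1", "hash-v2", "main-server-group", false);
        slave.cleanUp(); // first run: hash-v2 is only marked as obsolete
        slave.cleanUp(); // second run: hash-v2 is deleted
        System.out.println("hash-v2 present: " + slave.hasContent("hash-v2"));
    }
}
```

Running main prints "hash-v2 present: false". Calling replaceContent with addReference set to true keeps hash-v2 in the model across any number of cleanup runs. In this simplified model, that corresponds to the handler adding a reference to the new content.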

              People

              • Assignee:
                yersan Yeray Borges
                Reporter:
                yersan Yeray Borges