mod_cluster / MODCLUSTER-311

mod_manager doesn't handle multiple virtualhosts per node

    Details

    • Type: Bug
    • Status: Resolved (View Workflow)
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 1.2.1.Final
    • Fix Version/s: 1.2.2.Final
    • Component/s: None
    • Labels: None
    • Environment: RedHat EL 6.2, httpd-2.2.15-15.el6

      Description

      Hi,

      I was experimenting with mod_cluster and JBoss AS 7.1 configured with multiple virtual hosts.

      My simple test consisted of a single node (AS instance) with 2 virtual hosts (site01 and site02) and 2 applications, each deployed on one of the two vhosts.

      I noticed that mod_manager was inserting the aliases of the 2 JBoss vhosts into the same virtual host (same vhost id):

      balancer: [1] Name: balancer01 Sticky: 1 [JSESSIONID]/[jsessionid] remove: 0 force: 0 Timeout: 0 maxAttempts: 1
      node: [1:1],Balancer: balancer01,JVMRoute: bf3c1d57-ed66-38b4-838d-0cba532b6737,LBGroup: [],Host: 192.168.122.21,Port: 8259,Type: ajp,flushpackets: 0,flushwait: 10,ping: 10,smax: 1,ttl: 60,timeout: 0
      host: 1 [site01] vhost: 1 node: 1
      host: 2 [site02] vhost: 1 node: 1
      context: 1 [/context01] vhost: 1 node: 1 status: 1
      context: 2 [/context02] vhost: 1 node: 1 status: 1
      

      Now, looking at the mod_manager.c code, I noticed that inside process_appl_cmd, if the first alias name (I assume the aliases always come in order, and that the first one provided in the ENABLE-APP MCMP command is always the JBoss vhost default-name) doesn't exist in the hoststatsmem table, a new host is created with a fixed vhost id of 1 (as the comment says):

      host = read_host(hoststatsmem, &hostinfo);
      if (host == NULL) {
          int vid = 1; /* XXX: That is not really the right value, but that works most time */

      I tried to fix this by calculating the first available vhost id (see the first part of the patch attached below).

      From my tests this seems to work (I tried deploying and undeploying various apps on different hosts and contexts). This also suggests that the logic inside mod_proxy_cluster is right: it correctly chooses the balancer and sends the request to the backend only if the requested context is defined inside the requested vhost.

      balancer: [1] Name: balancer01 Sticky: 1 [JSESSIONID]/[jsessionid] remove: 0 force: 0 Timeout: 0 maxAttempts: 1
      node: [1:1],Balancer: balancer01,JVMRoute: bf3c1d57-ed66-38b4-838d-0cba532b6737,LBGroup: [],Host: 192.168.122.21,Port: 8259,Type: ajp,flushpackets: 0,flushwait: 10,ping: 10,smax: 1,ttl: 60,timeout: 0
      host: 1 [site02] vhost: 1 node: 1
      host: 2 [site01] vhost: 2 node: 1
      context: 1 [/context01] vhost: 2 node: 1 status: 1
      context: 2 [/context02] vhost: 1 node: 1 status: 1
      

      Then I tried adding some aliases to the JBoss virtual hosts. On ENABLE it worked. During REMOVE, however, only the vhost default-name (the first Alias in the MCMP command) was removed, keeping the other aliases and thus the vhost (and causing problems during a later ENABLE, as it created another virtual host for the first alias only).

      On ENABLE:

      balancer: [1] Name: balancer01 Sticky: 1 [JSESSIONID]/[jsessionid] remove: 0 force: 0 Timeout: 0 maxAttempts: 1
      node: [1:1],Balancer: balancer01,JVMRoute: bf3c1d57-ed66-38b4-838d-0cba532b6737,LBGroup: [],Host: 192.168.122.21,Port: 8259,Type: ajp,flushpackets: 0,flushwait: 10,ping: 10,smax: 1,ttl: 60,timeout: 0
      host: 1 [site01] vhost: 1 node: 1
      host: 2 [site01alias01] vhost: 1 node: 1
      host: 3 [site02] vhost: 2 node: 1
      context: 1 [/context01] vhost: 1 node: 1 status: 1
      context: 2 [/context02] vhost: 2 node: 1 status: 1
      

      On REMOVE:

      balancer: [1] Name: balancer01 Sticky: 1 [JSESSIONID]/[jsessionid] remove: 0 force: 0 Timeout: 0 maxAttempts: 1
      node: [1:1],Balancer: balancer01,JVMRoute: bf3c1d57-ed66-38b4-838d-0cba532b6737,LBGroup: [],Host: 192.168.122.21,Port: 8259,Type: ajp,flushpackets: 0,flushwait: 10,ping: 10,smax: 1,ttl: 60,timeout: 0
      host: 2 [site01alias01] vhost: 1 node: 1
      host: 3 [site02] vhost: 2 node: 1
      context: 2 [/context02] vhost: 2 node: 1 status: 1
      

      To fix this, again inside process_appl_cmd, I noticed that only the first host was being removed, so I modified the code to remove all the hosts of that node with that vhost id.

      This is the patch I made to try to fix it:

      Index: mod_manager.c
      ===================================================================
      --- mod_manager.c	(revision 840)
      +++ mod_manager.c	(working copy)
      @@ -1341,10 +1341,26 @@
           hostinfo.id = 0;
           host = read_host(hoststatsmem, &hostinfo);
           if (host == NULL) {
      -        int vid = 1; /* XXX: That is not really the right value, but that works most time */
      +        
               /* If REMOVE ignores it */
               if (status == REMOVE)
                   return NULL;
      +
      +        /* Find the first available vhost id */
      +        /* XXX: This can be racy if another request from the same node comes in the middle */
      +        int vid = 1;
      +        int size = loc_get_max_size_host();
      +        int *id = apr_palloc(r->pool, sizeof(int) * size);
      +        size = get_ids_used_host(hoststatsmem, id);
      +        for (i=0; i<size; i++) {
      +            hostinfo_t *ou;
      +            if (get_host(hoststatsmem, &ou, id[i]) != APR_SUCCESS)
      +                continue;
      +
       +            if (ou->vhost == vid && ou->node == node->mess.id)
       +                vid++;
      +        }
      +
               /* If the Host doesn't exist yet create it */
               if (insert_update_hosts(hoststatsmem, vhost->host, node->mess.id, vid) != APR_SUCCESS) {
                   *errtype = TYPEMEM;
      @@ -1384,7 +1400,18 @@
               }
               if (i==size) {
                   hostinfo.id = host->id;
      -            remove_host(hoststatsmem, &hostinfo);
      +
      +            int size = loc_get_max_size_host();
      +            int *id = apr_palloc(r->pool, sizeof(int) * size);
      +            size = get_ids_used_host(hoststatsmem, id);
      +            for (i=0; i<size; i++) {
      +                 hostinfo_t *ou;
      +
      +                 if (get_host(hoststatsmem, &ou, id[i]) != APR_SUCCESS)
      +                     continue;
      +                 if(ou->vhost == host->vhost && ou->node == node->mess.id)
      +                     remove_host(hoststatsmem, ou);
      +            }
               }
           } else if (status == STOPPED) {
               /* insert_update_contexts in fact makes that vhost->context corresponds only to the first context... */
      

      As discussed on the forum, some concurrency problems may happen during ENABLE. This can probably only cause trouble if the same node issues multiple concurrent ENABLE-APP commands (I don't know whether this can happen on the AS side).

              People

              • Assignee: jfclere Jean-Frederic Clere
              • Reporter: simone.gotti Simone Gotti