Uploaded image for project: 'ModeShape'
  1. ModeShape
  2. MODE-858

First node of two-node cluster doesn't get ClusteredObservationBus change notifications from second node

    XMLWordPrintable

Details

    • Bug
    • Resolution: Done
    • Critical
    • 2.2.0.Final
    • 2.1.0.Final
    • Clustering
    • None

    Description

      When using ModeShape 2.1.0.Final with two applications configured for JcrEngine clustering, the application that is started first doesn't receive repository change notifications from the application that is started second.

      The problem appears to be that when the second node is started, the ClusteredObservationBus#Receiver.viewAccepted(View) method is only called for the application (node) that was started first. Since viewAccepted(View) is not called for the second node, the value of multipleAddressesInCluster remains false. Consequently, when repository changes are made by the second node and its ClusteredObservationBus.notify(Changes) is called, the following code-block is executed:

      if (!multipleAddressesInCluster.get()) {
      // We are in clustered mode, but there is only one participant in the cluster (us).
      // So short-circuit the cluster and just notify the local observers ...
      if (!observers.isEmpty()) {
      observers.broadcast(changes);
      if (LOGGER.isTraceEnabled()) {
      LOGGER.trace("Received on cluster '

      {0}

      '

      {1}

      changes in source '

      {2}

      ' made by

      {3}

      from process '

      {4}

      ' at

      {5}

      ",
      getClusterName(),
      changes.getChangeRequests().size(),
      changes.getSourceName(),
      changes.getUserName(),
      changes.getProcessId(),
      changes.getTimestamp());
      }
      }
      return;
      }

      As can be seen in the logging from the second node below, the first node isn't notified of changes made by the second node:

      SECOND NODE LOG:

      At startup...
      2010-08-18 15:58:22,347 [main] DEBUG org.jgroups.conf.ClassConfigurator - Using jg-magic-map.xml as magic number file and jg-protocol-ids.xml for protocol IDs
      2010-08-18 15:58:22,564 [main] DEBUG org.jgroups.protocols.pbcast.GMS - changed role to org.jgroups.protocols.pbcast.ClientGmsImpl
      2010-08-18 15:58:23,020 [main] DEBUG org.jgroups.stack.Configurator - set property UDP.bind_addr to default value /192.168.10.11
      2010-08-18 15:58:23,020 [main] DEBUG org.jgroups.stack.Configurator - set property UDP.diagnostics_addr to default value /224.0.75.75
      2010-08-18 15:58:23,022 [main] DEBUG org.jgroups.stack.Configurator - set property FD_SOCK.bind_addr to default value /192.168.10.11
      2010-08-18 15:58:23,023 [main] DEBUG org.jgroups.stack.Configurator - set property VERIFY_SUSPECT.bind_addr to default value /192.168.10.11
      2010-08-18 15:58:23,033 [main] DEBUG org.jgroups.protocols.FRAG2 - received CONFIG event:

      {bind_addr=/192.168.10.11}
      2010-08-18 15:58:23,104 [main] DEBUG org.jgroups.protocols.UDP - creating sockets
      2010-08-18 15:58:23,104 [main] DEBUG org.jgroups.protocols.UDP - sockets will use interface 192.168.10.11
      2010-08-18 15:58:23,106 [main] DEBUG org.jgroups.protocols.UDP - socket information:
      , mcast_addr=224.10.10.10:5555, bind_addr=/192.168.10.11, ttl=2
      sock: bound to 192.168.10.11:64604, receive buffer size=64000, send buffer size=100000
      mcast_sock: bound to 192.168.10.11:5555, send buffer size=100000, receive buffer size=500000
      -------------------------------------------------------------------
      GMS: address=workpc-39877, cluster=project1-modeshape-cluster, physical address=192.168.10.11:64604
      -------------------------------------------------------------------
      2010-08-18 15:58:23,280 [main] DEBUG org.jgroups.protocols.UDP - created unicast receiver thread
      2010-08-18 15:58:23,280 [main] DEBUG org.jgroups.protocols.UDP - created multicast receiver thread
      2010-08-18 15:58:23,309 [main] DEBUG org.jgroups.protocols.pbcast.GMS - initial_mbrs are [own_addr=workpc-8420, view id=[workpc-8420|0], is_server=true, is_coord=true, logical_name=workpc-8420, physical_addrs=192.168.10.11:64603]
      2010-08-18 15:58:23,309 [main] DEBUG org.jgroups.protocols.pbcast.GMS - election results: {workpc-8420=1}
      2010-08-18 15:58:23,309 [main] DEBUG org.jgroups.protocols.pbcast.GMS - sending handleJoin(workpc-39877) to workpc-8420
      2010-08-18 15:58:23,424 [main] DEBUG org.jgroups.protocols.pbcast.NAKACK -
      [setDigest()]
      existing digest: []
      new digest: workpc-39877: [0 : 0], workpc-8420: [0 : 3 (2)]
      resulting digest: workpc-39877: [0 : 0 (0)], workpc-8420: [0 : 3 (3)]
      2010-08-18 15:58:23,430 [main] DEBUG org.jgroups.protocols.pbcast.GMS - [workpc-39877]: JoinRsp=[workpc-8420|1] [workpc-8420, workpc-39877] [size=2]
      2010-08-18 15:58:23,430 [main] DEBUG org.jgroups.protocols.pbcast.GMS - new_view=[workpc-8420|1] [workpc-8420, workpc-39877]
      2010-08-18 15:58:23,430 [main] DEBUG org.jgroups.protocols.pbcast.GMS - workpc-39877: view is [workpc-8420|1] [workpc-8420, workpc-39877]
      2010-08-18 15:58:23,433 [main] DEBUG org.jgroups.protocols.FD_SOCK - VIEW_CHANGE received: [workpc-8420, workpc-39877]
      2010-08-18 15:58:23,437 [FD_SOCK pinger,project1-modeshape-cluster,workpc-39877] DEBUG org.jgroups.protocols.FD_SOCK - ping_dest is workpc-8420, pingable_mbrs=[workpc-8420, workpc-39877]
      2010-08-18 15:58:23,455 [main] DEBUG org.jgroups.protocols.pbcast.GMS - workpc-39877:
      2010-08-18 15:58:23,473 [main] INFO org.springframework.extensions.jcr.modeshape.ModeShapeSessionFactory - Repositories started
      2010-08-18 15:58:23,531 [main] TRACE org.modeshape.clustering.ClusteredObservationBus - Received on cluster 'project1-modeshape-cluster' 5 changes in source 'ModeShape Configuration Repository' made by from process 'b1ce0d2f-f39e-4d44-8223-311f561210a0' at 2010-08-18T19:58:23.483Z

      After making repository changes...
      2010-08-18 16:05:22,643 [JCR ContentRepository-2-thread-2] TRACE org.modeshape.clustering.ClusteredObservationBus - Received on cluster 'project1-modeshape-cluster' 2 changes in source 'ContentRepository' made by anonymous from process 'b1ce0d2f-f39e-4d44-8223-311f561210a0' at 2010-08-18T20:05:22.512Z
      ...
      2010-08-18 16:05:23,520 [Timer-1,project1-modeshape-cluster,workpc-39877] DEBUG org.jgroups.protocols.FD - sending are-you-alive msg to workpc-8420 (own address=workpc-39877)


      However, if the first node makes repository changes, the logging below shows it notifies the second node correctly:

      FIRST NODE LOG:

      At startup...
      2010-08-18 16:17:11,847 [main] DEBUG org.jgroups.conf.ClassConfigurator - Using jg-magic-map.xml as magic number file and jg-protocol-ids.xml for protocol IDs
      2010-08-18 16:17:12,048 [main] DEBUG org.jgroups.protocols.pbcast.GMS - changed role to org.jgroups.protocols.pbcast.ClientGmsImpl
      2010-08-18 16:17:12,581 [main] DEBUG org.jgroups.stack.Configurator - set property UDP.bind_addr to default value /192.168.10.11
      2010-08-18 16:17:12,581 [main] DEBUG org.jgroups.stack.Configurator - set property UDP.diagnostics_addr to default value /224.0.75.75
      2010-08-18 16:17:12,582 [main] DEBUG org.jgroups.stack.Configurator - set property FD_SOCK.bind_addr to default value /192.168.10.11
      2010-08-18 16:17:12,583 [main] DEBUG org.jgroups.stack.Configurator - set property VERIFY_SUSPECT.bind_addr to default value /192.168.10.11
      2010-08-18 16:17:12,590 [main] DEBUG org.jgroups.protocols.FRAG2 - received CONFIG event: {bind_addr=/192.168.10.11}

      2010-08-18 16:17:12,668 [main] DEBUG org.jgroups.protocols.UDP - creating sockets
      2010-08-18 16:17:12,668 [main] DEBUG org.jgroups.protocols.UDP - sockets will use interface 192.168.10.11
      2010-08-18 16:17:12,670 [main] DEBUG org.jgroups.protocols.UDP - socket information:
      , mcast_addr=224.10.10.10:5555, bind_addr=/192.168.10.11, ttl=2
      sock: bound to 192.168.10.11:64110, receive buffer size=64000, send buffer size=100000
      mcast_sock: bound to 192.168.10.11:5555, send buffer size=100000, receive buffer size=500000

      -------------------------------------------------------------------
      GMS: address=workpc-27756, cluster=project1-modeshape-cluster, physical address=192.168.10.11:64110
      -------------------------------------------------------------------
      2010-08-18 16:17:12,864 [main] DEBUG org.jgroups.protocols.UDP - created unicast receiver thread
      2010-08-18 16:17:12,864 [main] DEBUG org.jgroups.protocols.UDP - created multicast receiver thread
      2010-08-18 16:17:14,868 [main] DEBUG org.jgroups.protocols.pbcast.GMS - initial_mbrs are []
      2010-08-18 16:17:14,868 [main] DEBUG org.jgroups.protocols.pbcast.GMS - no initial members discovered: creating group as first member
      2010-08-18 16:17:14,880 [main] DEBUG org.jgroups.protocols.pbcast.NAKACK -
      [setDigest()]
      existing digest: []
      new digest: workpc-27756: [0 : 0 (0)]
      resulting digest: workpc-27756: [0 : 0 (0)]
      2010-08-18 16:17:14,881 [main] DEBUG org.jgroups.protocols.pbcast.GMS - workpc-27756: view is [workpc-27756|0] [workpc-27756]
      2010-08-18 16:17:14,883 [main] DEBUG org.jgroups.protocols.FD_SOCK - VIEW_CHANGE received: [workpc-27756]
      2010-08-18 16:17:14,887 [main] DEBUG org.jgroups.protocols.pbcast.GMS - workpc-27756:
      2010-08-18 16:17:14,888 [main] DEBUG org.jgroups.protocols.pbcast.GMS - created group (first member). My view is [workpc-27756|0], impl is org.jgroups.protocols.pbcast.CoordGmsImpl
      2010-08-18 16:17:14,908 [main] INFO org.springframework.extensions.jcr.modeshape.ModeShapeSessionFactory - Repositories started
      2010-08-18 16:17:14,947 [main] TRACE org.modeshape.clustering.ClusteredObservationBus - Received on cluster 'project1-modeshape-cluster' 5 changes in source 'ModeShape Configuration Repository' made by from process '6167d47a-c014-4e91-b01b-d205d68d5c96' at 2010-08-18T20:17:14.916Z

      After the second node is started...
      2010-08-18 16:20:48,273 [ViewHandler,project1-modeshape-cluster,workpc-27756] DEBUG org.jgroups.protocols.pbcast.GMS - new=[workpc-14711], suspected=[], leaving=[], new view: [workpc-27756|1] [workpc-27756, workpc-14711]
      2010-08-18 16:20:48,273 [ViewHandler,project1-modeshape-cluster,workpc-27756] DEBUG org.jgroups.protocols.pbcast.STABLE - suspending message garbage collection
      2010-08-18 16:20:48,276 [ViewHandler,project1-modeshape-cluster,workpc-27756] DEBUG org.jgroups.protocols.pbcast.STABLE - resume task started, max_suspend_time=33000
      2010-08-18 16:20:48,278 [Incoming-1,project1-modeshape-cluster,workpc-27756] DEBUG org.jgroups.protocols.pbcast.NAKACK -
      [setDigest()]
      existing digest: workpc-27756: [3 : 7 (7)]
      new digest: workpc-27756: [3 : 6 (6)], workpc-14711: [0 : 0]
      resulting digest: workpc-27756: [3 : 7 (7)], workpc-14711: [0 : 0 (0)]
      2010-08-18 16:20:48,278 [Incoming-1,project1-modeshape-cluster,workpc-27756] DEBUG org.jgroups.protocols.pbcast.GMS - workpc-27756: view is [workpc-27756|1] [workpc-27756, workpc-14711]
      2010-08-18 16:20:48,283 [Incoming-1,project1-modeshape-cluster,workpc-27756] DEBUG org.jgroups.protocols.FD_SOCK - VIEW_CHANGE received: [workpc-27756, workpc-14711]
      2010-08-18 16:20:48,284 [FD_SOCK pinger,project1-modeshape-cluster,workpc-27756] DEBUG org.jgroups.protocols.FD_SOCK - ping_dest is workpc-14711, pingable_mbrs=[workpc-27756, workpc-14711]
      2010-08-18 16:20:48,285 [Incoming-1,project1-modeshape-cluster,workpc-27756] TRACE org.modeshape.clustering.ClusteredObservationBus - Members of 'project1-modeshape-cluster' cluster have changed: [workpc-27756|1] [workpc-27756, workpc-14711]
      2010-08-18 16:20:48,285 [Incoming-1,project1-modeshape-cluster,workpc-27756] DEBUG org.modeshape.clustering.ClusteredObservationBus - There are now multiple members of cluster 'project1-modeshape-cluster'; changes will be propagated throughout the cluster
      2010-08-18 16:20:48,330 [ViewHandler,project1-modeshape-cluster,workpc-27756] DEBUG org.jgroups.protocols.pbcast.STABLE - resuming message garbage collection
      2010-08-18 16:20:58,286 [Timer-2,project1-modeshape-cluster,workpc-27756] DEBUG org.jgroups.protocols.FD - sending are-you-alive msg to workpc-14711 (own address=workpc-27756)

      After making repository changes in the first node...
      2010-08-18 16:24:19,282 [JCR ContentRepository-2-thread-3] TRACE org.modeshape.clustering.ClusteredObservationBus - Sending to cluster 'project1-modeshape-cluster' 2 changes in source 'ContentRepository' made by anonymous from process '6167d47a-c014-4e91-b01b-d205d68d5c96' at 2010-08-18T20:24:19.227Z
      ...
      2010-08-18 16:24:28,314 [Timer-4,project1-modeshape-cluster,workpc-27756] DEBUG org.jgroups.protocols.FD - sending are-you-alive msg to workpc-14711 (own address=workpc-27756)

      It appears that either the ClusteredObservationBus of a node joining a cluster needs to be aware of the number of cluster nodes so it can set the value of multipleAddressesInCluster appropriately, or a mechanism is needed to trigger the new node's ClusteredObservationBus#Receiver.viewAccepted(View) method.

      This issue does not show up in ClusteredObservationBusTest.shouldAllowNotifyToBeCalledAfterStartWithMultipleMembersAndShouldSendMessageToJGroups() since multipleAddressesInCluster is programmatically set to true for that test.

      CONFIGURATION DETAILS

      Each cluster node has the same repository configuration XML:

      <?xml version="1.0" encoding="UTF-8"?>
      <configuration xmlns:mode="http://www.modeshape.org/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0">
      <mode:clustering clusterName="project1-modeshape-cluster">
      <configuration>
      <![CDATA[
      <config xmlns="urn:org:jgroups"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/JGroups-2.8.xsd">
      <UDP
      mcast_addr="${jgroups.udp.mcast_addr:224.10.10.10}"
      mcast_port="${jgroups.udp.mcast_port:5555}"
      discard_incompatible_packets="true"
      max_bundle_size="60000"
      max_bundle_timeout="30"
      ip_ttl="${jgroups.udp.ip_ttl:2}"
      enable_bundling="true"
      thread_pool.enabled="true"
      thread_pool.min_threads="1"
      thread_pool.max_threads="25"
      thread_pool.keep_alive_time="5000"
      thread_pool.queue_enabled="false"
      thread_pool.queue_max_size="100"
      thread_pool.rejection_policy="Run"
      oob_thread_pool.enabled="true"
      oob_thread_pool.min_threads="1"
      oob_thread_pool.max_threads="8"
      oob_thread_pool.keep_alive_time="5000"
      oob_thread_pool.queue_enabled="false"
      oob_thread_pool.queue_max_size="100"
      oob_thread_pool.rejection_policy="Run"/>
      <PING timeout="2000" num_initial_members="3"/>
      <MERGE2 max_interval="30000" min_interval="10000"/>
      <FD_SOCK/>
      <FD timeout="10000" max_tries="5"/>
      <VERIFY_SUSPECT timeout="1500"/>
      <BARRIER/>
      <pbcast.NAKACK use_mcast_xmit="false" gc_lag="0" retransmit_timeout="300,600,1200,2400,4800" discard_delivered_msgs="true"/>
      <UNICAST timeout="300,600,1200,2400,3600"/>
      <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000" max_bytes="400000"/>
      <VIEW_SYNC avg_send_interval="60000"/>
      <pbcast.GMS print_local_addr="true" join_timeout="3000" view_bundling="true"/>
      <FC max_credits="20000000" min_threshold="0.10"/>
      <FRAG2 frag_size="30000"/>
      <pbcast.STATE_TRANSFER/>
      </config>
      ]]>
      </configuration>
      </mode:clustering>
      <mode:sources jcr:primaryType="nt:unstructured">
      <mode:source jcr:name="ContentRepository" mode:classname="org.modeshape.connector.store.jpa.JpaSource"
      mode:model="Simple"
      mode:dialect="org.hibernate.dialect.MySQL5InnoDBDialect"
      mode:driverClassName="com.mysql.jdbc.Driver"
      mode:username="*****"
      mode:password="*****"
      mode:url="jdbc:mysql://localhost:3306/modeshape?autoReconnect=true"
      mode:maximumConnectionsInPool="1"
      mode:referentialIntegrityEnforced="true"
      mode:largeValueSizeInBytes="1024"
      mode:retryLimit="3"
      mode:compressData="false"
      mode:showSql="false"
      mode:autoGenerateSchema="update"
      mode:creatingWorkspacesAllowed="true"
      mode:defaultWorkspaceName="default">
      <mode:predefinedWorkspaceNames>default</mode:predefinedWorkspaceNames>
      <mode:predefinedWorkspaceNames>system</mode:predefinedWorkspaceNames>
      </mode:source>
      </mode:sources>
      <mode:sequencers/>
      <mode:mimeTypeDetectors>
      <mode:mimeTypeDetector jcr:name="Detector">
      <mode:description>Standard extension-based MIME type detector</mode:description>
      <mode:classname>org.modeshape.graph.mimetype.ExtensionBasedMimeTypeDetector</mode:classname>
      <mode:classpath></mode:classpath>
      <mode:classpath></mode:classpath>
      </mode:mimeTypeDetector>
      </mode:mimeTypeDetectors>
      <mode:repositories>
      <mode:repository jcr:name="ContentRepository">
      <mode:source>ContentRepository</mode:source>
      <mode:options jcr:primaryType="mode:options">
      <systemSourceName jcr:primaryType="mode:option" mode:value="system@ContentRepository"/>
      </mode:options>
      </mode:repository>
      </mode:repositories>
      </configuration>

      Attachments

        Activity

          People

            rhauch Randall Hauch (Inactive)
            yellowdart726 Greg Larson (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: