Details

Type: Bug
Resolution: Won't Do
Priority: Major
Fix Version/s: None
Affects Version/s: jboss-fuse-6.2, jboss-fuse-6.3
Component/s: Fabric8 v1
Labels: None

Steps to Reproduce:

There are various ways to reproduce this problem; the simplest is probably the following.

1. Install Fuse 6.3
2. Create a fabric and an ssh container
3. Ensure the ssh container is stopped
4. Change to the ssh container's installation directory
5. Run

$ ./bin/start && ./bin/start

Note that multiple JVMs are instantiated. Only one will be providing any sort of service – the other will be waiting for a lock. Check this using, for example, "netstat -anp".

6. Check the process IDs in instances/instances.properties. Note that the process ID recorded for the recently-started container may be incorrect – it may be the ID of the instance that is waiting for a lock, not the instance that is providing a service.

7. Try to shut down the container using Fabric8 tools, or by running ./bin/shutdown. Most likely this will fail, and the state of the container will remain ambiguous.

Another approach is to use the Hawtio console, and hit "start" on the container many times in quick succession. Or, if the container is running, hit "stop" many times in quick succession. Problems are harder to reproduce this way – it might take several attempts.
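
The check in steps 5 and 6 can be scripted roughly as follows. This is only a sketch: the function names are hypothetical, the instances.properties layout ("item.<n>.pid = <number>") is an assumption about the file format, and the parsing assumes the column layout printed by Linux "netstat -anp" (state in column 6, "PID/program" in column 7).

```shell
#!/bin/sh
# Print every PID recorded in an instances.properties file.
# ASSUMPTION: entries look like "item.<n>.pid = <number>".
recorded_pids() {
    sed -n 's/^item\.[0-9][0-9]*\.pid[[:space:]]*=[[:space:]]*\([0-9][0-9]*\)[[:space:]]*$/\1/p' "$1"
}

# Read "netstat -anp"-style output on stdin and print the unique PIDs
# that own at least one socket in the LISTEN state.
# ASSUMPTION: column 6 is the state and column 7 is "PID/program".
listening_pids() {
    awk '$6 == "LISTEN" && $7 ~ /^[0-9]+\// { split($7, a, "/"); print a[1] }' | sort -u
}

# Succeed if the given PID owns a listening socket according to the
# netstat output supplied on stdin.
pid_is_serving() {
    listening_pids | grep -qx "$1"
}
```

For example, "netstat -anp | pid_is_serving 4321" would fail if 4321 is the recorded PID but that JVM is only waiting for the lock and has no listening sockets. Note that netstat only shows PIDs for processes the invoking user is allowed to inspect, so this may need to run as the container's user or as root.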


Description

There are various ways in which a large Fabric8 installation with multiple administrators can be put into an indeterminate state, if administrators attempt to carry out particular actions in very quick succession. For example, if two administrators try to start a container at the same time, or a single administrator is heavy-handed on the "Start" button in the Hawtio console, we can end up with multiple instances of the same container running concurrently.

Although Karaf's built-in locking prevents multiple instances of a container from all providing a service at once, Fabric8 has no infrastructure for managing, or even recovering from, a scenario where the same container has multiple running instances. We can readily end up in a situation where the container status is reported differently by different tools, and containers can no longer be managed.

Issue Links

is related to

ENTESB-9057 bin/client command does not work if a DefaultJDBCLock thread is hang

Closed

relates to

ENTESB-11955 Zookeeper cluster management is not 100% reliable

Closed

ENTESB-11987 [Fabric] instance.properties is not in sync with Zookeeper entries

Closed

People

Assignee: Unassigned

Reporter: Kevin Boone

Votes: 0

Watchers: 3

Dates

Created: 2019/10/07 11:04 AM

Updated: 2023/03/22 8:06 AM

Resolved: 2019/10/11 5:57 AM