Details

Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: 4.0.5
Affects Version/s: 4.0
Labels: None

Workaround Description:

Add this to JDBC_PING:

@Override
public void stop() {
    super.stop();
    // On shutdown, the coordinator clears every entry for this cluster.
    if (is_coord)
        removeAll(cluster_name);
}

Steps to Reproduce:

Using the attached jdbc-ping.xml file:
1. Start up a cluster of 3 nodes.
2. kill -9 the coordinator.
3. Attempt to start a new node.

Using the attached file-ping.xml file:
1. Start up a cluster of 3 nodes.
2. kill -9 the coordinator.
3. Start a new node successfully.



Description

FILE_PING and JDBC_PING have different behavior when a cluster's coordinator stops.

With FILE_PING, the coordinator deletes the whole cluster's file when it shuts down.

JDBC_PING does not do this, which reveals a flaw in how nodes are handled on shutdown.

When I added my own logging to the source of both classes, I observed that each one continuously rewrites all of the members to the database or file, because write() is called very frequently.

—

Current behavior:

GIVEN a cluster of JDBC_PING registered nodes
WHEN a node shuts down
THEN it removes itself from the database table, AND the coordinator almost immediately re-adds the shut-down member to the table because the List<PingData> sent to write() still contains it

GIVEN a cluster of JDBC_PING registered nodes has only the coordinator left
WHEN the coordinator shuts down
THEN the coordinator removes itself from the database and, because there is no coordinator left, the database shows a list of only the 'members' with no coordinator

GIVEN a cluster of JDBC_PING registered nodes
WHEN the coordinator shuts down or crashes and does not have time to remove itself from the database
THEN the next node to start will never finish negotiating membership with the cluster because a phantom coordinator still exists (see attachment: stuck_starting_up.log)
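The three scenarios above can be modelled with a small in-memory sketch. This is not JGroups code: the class PingTableSketch and its methods are hypothetical stand-ins for the discovery table, the coordinator's periodic write(), a node's self-removal, and the removeAll() call from the workaround. It only illustrates why a stale view re-adds a departed member and why a killed coordinator leaves a phantom row behind.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the discovery table; not JGroups code.
class PingTableSketch {
    // address -> "is coordinator" flag, one row per registered member
    private final Map<String, Boolean> rows = new LinkedHashMap<>();

    // Models the coordinator's periodic write(): upserts a row for every member of its view.
    void write(List<String> view, String coordinator) {
        for (String member : view)
            rows.put(member, member.equals(coordinator));
    }

    // Models a single node deleting its own row on graceful shutdown.
    void remove(String address) { rows.remove(address); }

    // Models the workaround: the coordinator clears every row when it stops.
    void removeAll() { rows.clear(); }

    List<String> members() { return new ArrayList<>(rows.keySet()); }

    // True if the table lists a coordinator that is no longer alive.
    boolean hasPhantomCoordinator(List<String> alive) {
        for (Map.Entry<String, Boolean> e : rows.entrySet())
            if (e.getValue() && !alive.contains(e.getKey()))
                return true;
        return false;
    }

    public static void main(String[] args) {
        PingTableSketch table = new PingTableSketch();
        List<String> view = List.of("A", "B", "C");   // A is the coordinator
        table.write(view, "A");

        // A member leaves, but the coordinator rewrites its stale view.
        table.remove("C");
        table.write(view, "A");
        System.out.println("after C leaves: " + table.members());

        // A is killed with kill -9, so its row is never deleted.
        System.out.println("phantom coordinator: " + table.hasPhantomCoordinator(List.of("B")));

        // Workaround path: a coordinator that stops gracefully clears the table.
        table.removeAll();
        System.out.println("after removeAll: " + table.members());
    }
}
```

In this model removeAll() only runs on a graceful stop; a process killed with kill -9 never reaches it, which is the case the third scenario describes.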

—

I expected the behavior of JDBC_PING and FILE_PING to be consistent.

Attachments

file-ping.xml, 2 kB, 2017/06/26 2:03 PM
jdbc-ping.xml, 3 kB, 2017/06/26 2:03 PM
stuck_starting_up.log, 28 kB, 2017/06/26 2:02 PM

People

Assignee: Bela Ban

Reporter: Douglas Adams (Inactive)

Votes: 0

Watchers: 2

Dates

Created: 2017/06/26 3:01 PM

Updated: 2017/07/26 9:20 AM

Resolved: 2017/07/26 9:20 AM