Infinispan / ISPN-3918

Inconsistent view of the cache with putIfAbsent in a non-tx cache during state transfer


    • Type: Bug
    • Resolution: Obsolete
    • Priority: Major
    • None
    • 6.0.0.Final
    • Core, State Transfer

      In a non-tx cache, sometimes it's possible for a get(k) to return null even though a previous putIfAbsent(k, v) returned a non-null value and the only concurrent operations on the cache are concurrent putIfAbsent calls.

      Say [B, A, C] are the owners of k (C just joined)
      1. A starts a putIfAbsent(k, v1) command, sends it to B
      2. B forwards the command to A and C
      3. C writes k=v1
      4. C becomes the primary owner of k (owners are now [C, A])
      5. A/B see the new topology before committing and throw an OutdatedTopologyException
      6. A retries the command, sends it to C
      7. C forwards the command to A, which writes k=v1
      8. C doesn't have to update the entry, returns null

      If, between steps 3 and 7, another thread on A starts a putIfAbsent(k, v2) command, the command will fail and return v1 (because the primary owner already has a value). However, a subsequent get(k) command will return null, because A is an owner and doesn't have the value.
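
The property violated here can be stated as a small check. The sketch below is written against java.util.concurrent.ConcurrentMap, which Infinispan's Cache interface extends; the class name PutIfAbsentInvariant and the ConcurrentHashMap stand-in are assumptions for illustration only. A local ConcurrentHashMap never fails the check, so it only becomes meaningful when pointed at a distributed non-tx cache while a rebalance is in progress.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper, not part of Infinispan.
final class PutIfAbsentInvariant {

    /**
     * Returns true if the invariant held for this call: when putIfAbsent
     * reports an existing value, an immediate get on the same key must not
     * return null (assuming nothing removes entries concurrently).
     */
    static <K, V> boolean holds(ConcurrentMap<K, V> cache, K key, V value) {
        V previous = cache.putIfAbsent(key, value);
        if (previous == null) {
            return true; // our own put succeeded, nothing to verify
        }
        return cache.get(key) != null;
    }

    public static void main(String[] args) {
        // Stand-in for a real cache; an Infinispan Cache could be passed instead.
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        System.out.println(holds(cache, "k", "v1")); // true
        System.out.println(holds(cache, "k", "v2")); // true
    }
}
```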


            Pedro Ruivo added a comment -

            If the primary replies to the originator with a regular message, the reply will be ordered with the backup command, which solves the problem when the topology is stable.
            If the topology changes while the putIfAbsent is in progress, we would need versioning to keep track of which command invocation ID generated which version (and return value), in order to decide whether the command should be handled or not.
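
One way to picture the invocation ID idea is an owner that remembers the outcome of each command it has already applied and replays that outcome on a retry. This is only a rough sketch under assumptions: the class InvocationTrackingOwner, its String invocation IDs, and the in-memory maps are hypothetical and do not reflect any actual Infinispan implementation, and it ignores how such records would be transferred between owners.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical single-owner model, not Infinispan code.
final class InvocationTrackingOwner {
    private final Map<String, String> data = new HashMap<>();
    // invocation id -> return value recorded when the command was first applied
    // (Optional.empty() stands for a null return, i.e. a successful put)
    private final Map<String, Optional<String>> applied = new HashMap<>();

    synchronized String putIfAbsent(String invocationId, String key, String value) {
        Optional<String> recorded = applied.get(invocationId);
        if (recorded != null) {
            // Retry of a command we already handled: replay the original result
            // instead of evaluating the condition against our own earlier write.
            return recorded.orElse(null);
        }
        String previous = data.putIfAbsent(key, value);
        applied.put(invocationId, Optional.ofNullable(previous));
        return previous;
    }

    synchronized String get(String key) {
        return data.get(key);
    }

    public static void main(String[] args) {
        InvocationTrackingOwner owner = new InvocationTrackingOwner();
        System.out.println(owner.putIfAbsent("cmd-1", "k", "v1")); // null
        System.out.println(owner.putIfAbsent("cmd-1", "k", "v1")); // null (retry)
        System.out.println(owner.putIfAbsent("cmd-2", "k", "v2")); // v1
    }
}
```

Without the record, the retry of cmd-1 would see its own earlier write and report v1 as if another caller had won.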


            Dan Berindei (Inactive) added a comment -

            Indeed, fixed.

            Radim Vansa (Inactive) added a comment -

            Dan, at point 29, the DC on B should contain v3, shouldn't it? Not that it would change much...

            Dan Berindei (Inactive) added a comment - edited

            I found another failure mode while trying to work around this issue in NonTxPutIfAbsentDuringLeaveStressTest (I was also trying to merge it with NonTxPutIfAbsentDuringJoinStressTest into a NonTxPutIfAbsentDuringRebalanceStressTest, so the stack traces in the attached log NonTxPutIfAbsentDuringRebalanceStressTest.testPutIfAbsentDuringJoin_1.log.gz don't match master).

            Say owners(k) = AB in topology t, and owners(k) = BA in topology t+1. In the following scenario, putIfAbsent fails in thread C-app1 and C-app2 with different values:

            1. C-app1: start putIfAbsent(k, v1)
            2. C-app1: send putIfAbsent(k, v1) to A (primary)
            3. A-remote1: receive putIfAbsent(k, v1)
            4. A-remote1: send backup request for putIfAbsent(k, v1) to B
            5. A-remote1: write k=v1
            6. C-remote1: receive null for putIfAbsent(k, v1)
            7. C-app2: start putIfAbsent(k, v2)
            8. C-app2: send putIfAbsent(k, v2) to A (primary)
            9. A-remote2: receive putIfAbsent(k, v2)
            10. A-remote2: read v1 from data container, fail the command
            11. C-remote2: receive v1 for putIfAbsent(k, v2)
            12. C-app2: return v1 (put was unsuccessful)
            13. B-remote1: install topology t+1, B is now primary owner
            14. B-remote1: receive backup request for putIfAbsent(k, v1)
            15. B-remote1: check topology, send OutdatedTopologyException ack to C
            16. C-remote3: install topology t+1
            17. C-app3: start putIfAbsent(k, v3)
            18. C-app3: send putIfAbsent(k, v3) to B (primary)
            19. B-remote3: receive putIfAbsent(k, v3)
            20. B-remote3: send backup request for putIfAbsent(k, v3) to A
            21. B-remote3: write k=v3
            22. C-remote3: receive null for putIfAbsent(k, v3)
            23. A-remote3: receive backup request for putIfAbsent(k, v3)
            24. A-remote3: write k=v3
            25. A-remote3: send backup ack for putIfAbsent(k, v3) to C
            26. C-remote3: receive backup ack for putIfAbsent(k, v3)
            27. C-app3: return null (put was successful)
            28. B-remote1: retry, B is now primary owner
            29. B-remote1: read v3 from data container, fail the command
            30. C-remote1: receive v3 for putIfAbsent(k, v1)
            31. C-app1: return v3 (put was unsuccessful)

            I think both this scenario and the previous one are worse than the initial report of seeing null, because we're not respecting the READ_COMMITTED isolation level (a transaction is seeing a value that was never committed).
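
The interleaving above can be replayed deterministically with two plain maps standing in for the data containers on A and B. This is a simplified model under assumptions: the class RebalanceReplay is hypothetical, topology checks and messaging are reduced to the order of map operations, and backup writes are modeled as unconditional puts.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical replay of the 31-step scenario, not Infinispan code.
final class RebalanceReplay {

    /** Returns what each application thread on C observed, plus the final value. */
    static Map<String, String> replay() {
        Map<String, String> a = new HashMap<>(); // data container on A
        Map<String, String> b = new HashMap<>(); // data container on B
        Map<String, String> seen = new LinkedHashMap<>();

        // Steps 1-6: A (primary in topology t) writes k=v1; backup to B is in flight.
        a.putIfAbsent("k", "v1");
        // Steps 7-12: C-app2 is rejected because A already holds v1.
        seen.put("C-app2", a.putIfAbsent("k", "v2"));
        // Steps 13-15: B installs t+1 and rejects the v1 backup; b stays empty.
        // Steps 17-27: C-app3 succeeds on B (new primary); A applies the backup.
        seen.put("C-app3", b.putIfAbsent("k", "v3"));
        a.put("k", "v3"); // backup write overwrites the uncommitted v1
        // Steps 28-31: the retry of putIfAbsent(k, v1) on B now sees v3.
        seen.put("C-app1", b.putIfAbsent("k", "v1"));
        seen.put("final", a.get("k"));
        return seen;
    }

    public static void main(String[] args) {
        // C-app2 observed v1, although putIfAbsent(k, v1) never succeeded.
        System.out.println(replay()); // {C-app2=v1, C-app3=null, C-app1=v3, final=v3}
    }
}
```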


            Dan Berindei (Inactive) added a comment -

            The situation is even worse when the topology changes: a get after a failed putIfAbsent can return not only null, but also a completely different value.

            Say owners(k) = AB, and there is a topology change but the owners of k stay the same. In the following scenario, thread B-app3 first sees putIfAbsent(k, v3) = v2, and then get(k) = v1 (B-appX means application thread X on B, and A-remoteX means remote thread X on A):

            1. B-app1: start putIfAbsent(k, v1)
            2. B-app1: send putIfAbsent(k, v1) to A (primary)
            3. A-remote1: receive putIfAbsent(k, v1)
            4. A-remote1: send backup request for putIfAbsent(k, v1) to B
            5. B-remote1: receive backup request for putIfAbsent(k, v1)
            6. B-remote1: write k=v1
            7. B-remote1: send backup ack for putIfAbsent(k, v1)
            8. A-remote1: check topology, send OutdatedTopologyException back to B
            9. A-app2: start putIfAbsent(k, v2)
            10. A-app2: send backup request for putIfAbsent(k, v2) to B
            11. A-app2: write k=v2
            12. B-app3: start putIfAbsent(k, v3)
            13. B-app3: send putIfAbsent(k, v3) to A (primary)
            14. A-remote3: receive putIfAbsent(k, v3)
            15. A-remote3: read v2 from data container, fail the command
            16. B-remote3: receive v2 for putIfAbsent(k, v3)
            17. B-app3: return v2 for putIfAbsent(k, v3) (put was unsuccessful)
            18. B-app3: start get(k)
            19. B-app3: read v1 from local container
            20. B-app3: return v1 for get(k)
            21. B-remote2: receive putIfAbsent(k, v2) backup command
            22. B-remote2: write k=v2
            23. B-remote2: send backup ack for putIfAbsent(k, v2)
            24. A-remote2: receive backup ack for putIfAbsent(k, v2)
            25. A-app2: return null for putIfAbsent(k, v2) (put was successful)
            26. B-remote1: receive OutdatedTopologyException for putIfAbsent(k, v1)
            27. B-remote1: retry, send putIfAbsent(k, v1) to A (primary)
            28. A-remote1: receive putIfAbsent(k, v1)
            29. A-remote1: read v2 from data container, fail the command
            30. B-remote1: receive v2 for putIfAbsent(k, v1)
            31. B-app1: return v2 for putIfAbsent(k, v1) (put was unsuccessful)
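
The scenario with unchanged owners can be replayed the same way, with two maps standing in for the data containers on A (primary) and B (backup). As before, this is a simplified sketch under assumptions: the class StaleBackupReplay is hypothetical, and the network and topology checks are reduced to the order in which the maps are touched.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical replay of the scenario where owners(k) = AB throughout.
final class StaleBackupReplay {

    /** Returns { B-app3 putIfAbsent result, B-app3 get result, B-app1 result, final value on B }. */
    static String[] replay() {
        Map<String, String> a = new HashMap<>(); // data container on A (primary)
        Map<String, String> b = new HashMap<>(); // data container on B (backup)

        // Steps 1-7: the backup write for putIfAbsent(k, v1) lands on B.
        b.put("k", "v1");
        // Step 8: A rejects the command with OutdatedTopologyException and writes nothing.
        // Steps 9-11: A-app2 writes k=v2 on A; its backup request to B is still in flight.
        a.putIfAbsent("k", "v2");
        // Steps 12-17: B-app3 is rejected by the primary, which reports v2.
        String app3Put = a.putIfAbsent("k", "v3");
        // Steps 18-20: B is an owner, so get(k) is served locally and returns v1.
        String app3Get = b.get("k");
        // Steps 21-25: the delayed backup for v2 is finally applied on B.
        b.put("k", "v2");
        // Steps 26-31: the retry of putIfAbsent(k, v1) fails against v2.
        String app1 = a.putIfAbsent("k", "v1");
        return new String[] { app3Put, app3Get, app1, b.get("k") };
    }

    public static void main(String[] args) {
        String[] r = replay();
        // The same thread sees v2 from putIfAbsent and then v1 from get.
        System.out.println(String.join(", ", r)); // v2, v1, v2, v2
    }
}
```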

            Radim Vansa (Inactive) added a comment -

            This situation can happen with the triangle algorithm even without any topology change: as the primary does not hold the lock during replication, a second putIfAbsent may fail before the first putIfAbsent is executed on the backup.
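
The window described here can be modeled with a primary that writes locally and queues the backup write instead of waiting for it. This is a toy model under assumptions: the class TriangleWindow, its in-memory queue, and the explicit deliverBackups() step are hypothetical and only imitate the ordering, not the actual triangle implementation.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical model of a primary that does not hold the lock during replication.
final class TriangleWindow {
    final Map<String, String> primary = new HashMap<>();
    final Map<String, String> backup = new HashMap<>();
    private final Queue<Runnable> inFlight = new ArrayDeque<>();

    /** Applies the write on the primary and queues the backup write for later delivery. */
    String putIfAbsent(String key, String value) {
        String previous = primary.putIfAbsent(key, value);
        if (previous == null) {
            inFlight.add(() -> backup.put(key, value));
        }
        return previous;
    }

    /** Delivers all queued backup writes, as if the network had caught up. */
    void deliverBackups() {
        Runnable write;
        while ((write = inFlight.poll()) != null) {
            write.run();
        }
    }

    public static void main(String[] args) {
        TriangleWindow cluster = new TriangleWindow();
        System.out.println(cluster.putIfAbsent("k", "v1")); // null
        System.out.println(cluster.putIfAbsent("k", "v2")); // v1
        System.out.println(cluster.backup.get("k"));        // null (backup not applied yet)
        cluster.deliverBackups();
        System.out.println(cluster.backup.get("k"));        // v1
    }
}
```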

            Dan Berindei (Inactive) added a comment -

            FORCE_WRITE_LOCK doesn't work in non-tx caches.

            Radim Vansa (Inactive) added a comment -

            A possible workaround is to use the FORCE_WRITE_LOCK flag for the get() operation.

            Dan Berindei (Inactive) added a comment -

            This is causing a random failure in NonTxPutIfAbsentDuringJoinStressTest.

              Assignee: Unassigned
              Reporter: Dan Berindei (Inactive)
              Archiver: Amol Dongare
