As in this example, why can we not use atomic operation imageAtomicCompSwap
to ensure exclusive access between different PS invocations?
If you are using atomic operations to lock access to a pixel, you are relying on one aspect of relative order: that all threads will eventually make forward progress. That is, you assume that any thread spinning on a lock will not starve the thread that holds the lock of its execution resources, so that the thread holding the lock will eventually make forward progress and release it.
But since the relative order of execution is undefined, there is no guarantee of any of that. And therefore, your code cannot work. Any code which relies on any aspect of ordering between the invocations of a single shader stage cannot work (unless there are specific guarantees in place).
This is precisely why ARB_fragment_shader_interlock exists.
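For comparison, here is a minimal sketch of what a per-pixel critical section could look like with that extension. The image name colorImage, its binding, its format and the blend performed inside the critical section are assumptions made up for illustration; they are not taken from the code in the question.

```glsl
#version 450
#extension GL_ARB_fragment_shader_interlock : require

// Ask for mutual exclusion per pixel, entered in primitive order.
layout(pixel_interlock_ordered) in;

// Hypothetical image that overlapping fragments read, modify and write.
layout(binding = 0, rgba8) uniform coherent image2D colorImage;

void main()
{
    ivec2 p = ivec2(gl_FragCoord.xy);

    // Only one invocation covering this pixel executes the code between
    // begin and end at a time. Both calls must appear in main(), exactly
    // once each, and outside any flow control.
    beginInvocationInterlockARB();

    vec4 dst = imageLoad(colorImage, p);
    imageStore(colorImage, p, mix(dst, vec4(1.0, 0.0, 0.0, 1.0), 0.5));

    endInvocationInterlockARB();
}
```

There is no spinning here at all: the implementation decides how invocations wait, so the shader does not depend on any forward progress guarantee.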
That being said, even if there were guarantees of forward progress, your code would still be broken.
You use a non-atomic operation to release the lock. You should be using an atomic operation that sets the value, such as imageAtomicExchange.
Plus, as others have pointed out, you need to continue to spin if the return value from the atomic compare/swap is not zero. Remember: all atomic functions return the original value from the image. So if the original value it atomically read is not 0, then it compared false and you don't have the lock.
Even with both of those fixes, your code will still be undefined behavior by the spec. But it's more likely to work.
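As a rough sketch of the two corrections (keep spinning until the compare/swap returns 0, and release with an atomic operation), something like the following could be used. The names lockImage and dataImage, the bindings and the work done inside the critical section are hypothetical. The loop is written with a flag rather than as a bare busy-wait, which is a commonly used shape for this kind of lock; that is a choice of this sketch, not something the spec requires or guarantees to work.

```glsl
#version 450

// Hypothetical per-pixel lock: 0 = free, 1 = held.
layout(binding = 0, r32ui) uniform coherent uimage2D lockImage;
// Hypothetical data protected by the lock.
layout(binding = 1, r32ui) uniform coherent uimage2D dataImage;

layout(location = 0) out vec4 fragColor;

void main()
{
    ivec2 p = ivec2(gl_FragCoord.xy);
    bool done = false;

    while (!done)
    {
        // imageAtomicCompSwap returns the ORIGINAL value in the image.
        // Only a return value of 0 means the swap happened and this
        // invocation now holds the lock; anything else means try again.
        if (imageAtomicCompSwap(lockImage, p, 0u, 1u) == 0u)
        {
            // Critical section (illustrative read-modify-write).
            uint v = imageLoad(dataImage, p).x;
            imageStore(dataImage, p, uvec4(v + 1u));

            // Make the writes above visible before the lock is released.
            memoryBarrier();

            // Release with an atomic operation, not a plain imageStore.
            imageAtomicExchange(lockImage, p, 0u);
            done = true;
        }
    }

    fragColor = vec4(1.0);
}
```

Nothing in this sketch adds a forward progress guarantee. It only removes the two bugs described above.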