I was just reading slides from a talk that are available here:
http://www.nwcpp.org/Downloads/2008/Memory_Fences.pdf
On page 16, the speaker asserts that the x86 memory model lets the relaxed std::atomic orderings shown on that slide be implemented with no extra instructions. That code would then carry no performance penalty (assuming the atomic types have atomic store instructions); the only requirement is that the compiler not reorder the operations.
My questions:
1. Is this person correct?
Yes, plain loads have acquire semantics and plain stores have release semantics on x86. Note that the ordering constraints on the data variable are actually superfluous and could be replaced with std::memory_order_relaxed: the operations on ready are sufficient to guarantee the ordering.
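To make that concrete, here is a minimal sketch of the data/ready pattern with the data accesses relaxed and only the ready accesses carrying release/acquire. The variable names follow the slide; the value 42 and the run_once helper are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> data{0};
std::atomic<bool> ready{false};

void producer() {
    // Relaxed is enough here: the release store below orders this write.
    data.store(42, std::memory_order_relaxed);
    ready.store(true, std::memory_order_release);
}

int consumer() {
    // Spin until the acquire load observes the release store to ready.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    // Relaxed is enough here: the acquire load above orders this read.
    return data.load(std::memory_order_relaxed);
}

// Hypothetical helper: runs one producer/consumer pair and returns
// the value the consumer read from data.
int run_once() {
    data.store(0, std::memory_order_relaxed);
    ready.store(false, std::memory_order_relaxed);
    int seen = 0;
    std::thread c([&seen] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    return seen;
}
```

Once the consumer sees ready == true, the release/acquire pair on ready guarantees it also sees the earlier write to data, so run_once() should return 42.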
2. Would your library for Visual Studio indeed implement these as no-ops?
some_atomic.load(std::memory_order_acquire) does just drop through to a simple load instruction, and some_atomic.store(std::memory_order_release) drops through to a simple store instruction. However, each of these instructions is wrapped in compiler directives to ensure that the compiler doesn't reorder it in a way that could break the desired semantics.
3. How does your library force the compiler not to reorder the stores?
By using appropriate compiler directives in the code. For Visual Studio 2008, _ReadWriteBarrier is used.
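As a rough illustration of the idea (this is not the library's actual source), a plain store or load can be bracketed with compiler-only barriers along these lines. The COMPILER_BARRIER macro and the function names are hypothetical. On compilers other than Visual Studio the sketch substitutes std::atomic_signal_fence, which likewise restricts compiler reordering without emitting a fence instruction. The technique relies on the x86 hardware ordering described above and is not portable to other architectures.

```cpp
#include <atomic>
#include <cassert>

#if defined(_MSC_VER)
#include <intrin.h>
// Visual Studio intrinsic: blocks compiler reordering, emits no instruction.
#define COMPILER_BARRIER() _ReadWriteBarrier()
#else
// Assumed stand-in for other compilers: a compiler-only fence.
#define COMPILER_BARRIER() std::atomic_signal_fence(std::memory_order_seq_cst)
#endif

// Hypothetical x86-only release store: a plain store the compiler may not
// move other memory accesses across.
inline void store_release_x86(volatile int* target, int value) {
    COMPILER_BARRIER();
    *target = value;
    COMPILER_BARRIER();
}

// Hypothetical x86-only acquire load: a plain load the compiler may not
// move other memory accesses across.
inline int load_acquire_x86(const volatile int* source) {
    COMPILER_BARRIER();
    int value = *source;
    COMPILER_BARRIER();
    return value;
}

// Single-threaded round trip, just to show the calling convention.
inline int roundtrip_demo() {
    volatile int slot = 0;
    store_release_x86(&slot, 7);
    int seen = load_acquire_x86(&slot);
    assert(seen == 7);
    return seen;
}
```

The barriers cost nothing at run time; they only constrain what the optimizer may do around the load or store.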
Anthony