cppreference.com gets acquire/release instructions wrong

It is quite well-known that the memory operations and fences provided by the C/C++11 languages are pretty complicated and confusing. For instance, my OOPSLA 2015 paper (joint with Batty, Beckmann, and Donaldson) explains that a proposed implementation of these instructions for next-generation AMD graphics cards is faulty because the designers misunderstood how they worked. And back in 2011, Batty et al. showed that (an early draft of) the C++11 language failed to enforce a desirable property called ‘sequential consistency for data-race-free programs’, as a result of a subtlety in the way some of these memory operations were being defined.

Today I found another example of the confusing arising from memory operations and fences in C/C++11. The cppreference.com website is a long-established, well-regarded, and widely-used reference for C and C++ programmers, but its description of how the “acquire/release” memory operations and fences work is quite wrong.

The website claims that acquire loads cannot be reordered with subsequent reads:

Screen Shot 2016-08-11 at 11.40.35.png

and that release stores cannot be reordered with preceding writes. It also claims that a store that follows a release fence cannot be reordered with any write that precedes the fence:

Screen Shot 2016-08-11 at 12.44.05.png

This is incorrect (or more precisely, it is not the whole truth). The truth, according to the official C/C++11 language specifications, is that

  • acquire loads cannot be reordered with any subsequent memory accesses (not just reads), and
  • release stores cannot be reordered with any preceding memory accesses (not just writes).

Similarly,

  • release fences prevent stores that follow the fence being reordered with any memory access that precedes the fence (not just writes), and
  • acquire fences prevent loads that precede the fence being reordered with any memory access that follows the fence.

Here is an example that demonstrates the different interpretations of acquire loads. Suppose we have two threads:

int main() {
  atomic_int x=0; atomic_int y=0;
  {{{ {
    // Thread 1:
    int r0 = atomic_load_explicit(&x, memory_order_relaxed); //a
    atomic_store_explicit(&y, 1, memory_order_release);      //b
  } ||| {
    // Thread 2:
    int r1 = atomic_load_explicit(&y, memory_order_acquire); //c
    atomic_store_explicit(&x, 1, memory_order_relaxed);      //d
  } }}}
}

Can we obtain r0 = 1 and r1 = 1 at the end? According to the official standard, the answer is no. This can be confirmed by copying the code above into the CppMem tool (which enumerates all the behaviours of a small C/C++ program that are allowed by the standard).

But if we follow the guidelines from cppreference.com and allow the acquire load (c) to be reordered with the subsequent relaxed store (d), then it is straightforward to obtain r0 = r1 = 1, simply by executing the instructions in the following order: d, a, b, c.

According to cppreference.com, acquire/release instructions are less powerful than they are in the official standard. This is problematic for programmers, because they might end up using stronger instructions than they really need, which may make their programs less efficient. It’s even more problematic for compiler-writers, because if they just follow the interpretation on cppreference.com, they might end up providing acquire/release instructions that are weaker than programmers expect, which means that correctly-written programs could give the wrong results when they are executed.

Advertisements

1 thought on “cppreference.com gets acquire/release instructions wrong”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

w

Connecting to %s