Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
241 views
in Technique[技术] by (71.8m points)

c++ - Why GCC does not use LOAD(without fence) and STORE+SFENCE for Sequential Consistency?

Here are four approaches to make Sequential Consistency in x86/x86_64:

  1. LOAD(without fence) and STORE+MFENCE
  2. LOAD(without fence) and LOCK XCHG
  3. MFENCE+LOAD and STORE(without fence)
  4. LOCK XADD(0) and STORE(without fence)

As it is written here: http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html

C/C++11 Operation x86 implementation

  • Load Seq_Cst: MOV (from memory)
  • Store Seq Cst: (LOCK) XCHG // alternative: MOV (into memory),MFENCE

Note: there is an alternative mapping of C/C++11 to x86, which instead of locking (or fencing) the Seq Cst store locks/fences the Seq Cst load:

  • Load Seq_Cst: LOCK XADD(0) // alternative: MFENCE,MOV (from memory)
  • Store Seq Cst: MOV (into memory)

GCC 4.8.2(GDB in x86_64) uses first(1) approach for C++11-std::memory_order_seq_cst, i.e. LOAD(without fence) and STORE+MFENCE:

std::atomic<int> a;
int temp = 0;
a.store(temp, std::memory_order_seq_cst);
0x4613e8  <+0x0058>         mov    0x38(%rsp),%eax
0x4613ec  <+0x005c>         mov    %eax,0x20(%rsp)
0x4613f0  <+0x0060>         mfence

As we know, that MFENCE = LFENCE+SFENCE. Then this code we can rewrite to this: LOAD(without fence) and STORE+LFENCE+SFENCE

Questions:

  1. Why do we need not to use LFENCE here before LOAD, and need to use LFENCE after STORE (because LFENCE make sense only before LOAD!)?
  2. Why GCC does not use approach: LOAD(without fence) and STORE+SFENCE for std::memory_order_seq_cst?
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Consider the following code:

#include <atomic>
#include <cstring>

std::atomic<int> a;
char b[64];

void seq() {
  /*
    movl    $0, a(%rip)
    mfence
  */
  int temp = 0;
  a.store(temp, std::memory_order_seq_cst);
}

void rel() {
  /*
    movl    $0, a(%rip)
   */
  int temp = 0;
  a.store(temp, std::memory_order_relaxed);
}

With respect to the atomic variable "a", seq() and rel() are both ordered and atomic on the x86 architecture because:

  1. mov is an atomic instruction
  2. mov is a legacy instruction and Intel promises ordered memory semantics for legacy instructions to be compatible with old processors that always used ordered memory semantics.

No fence is required to store a constant value into an atomic variable. The fences are there because std::memory_order_seq_cst implies that all memory is synchronized, not only the memory that holds the atomic variable.

The effect can be demonstrated by the following set and get functions:

void set(const char *s) {
  strcpy(b, s);
  int temp = 0;
  a.store(temp, std::memory_order_seq_cst);
}

const char *get() {
  int temp = 0;
  a.store(temp, std::memory_order_seq_cst);
  return b;
}

strcpy is a library function that might use newer sse instructions if such are available in runtime. Since sse instructions were not available in old processors there is no requirement on backwards compatibility and memory order is undefined. Thus the result of a strcpy in one thread might not be directly visible in other threads.

The set and get functions above uses an atomic value to enforce memory synchronization so that the result of strcpy becomes visible in other threads. Now the fences matters, but the order of them inside the call to atomic::store is not significant since the fences are not needed internally in atomic::store.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...