Tuesday, 15 April 2014

c++ - Is VC++ still broken Sequentially-Consistent-wise?



I watched (most of) Herb Sutter's atomic<> Weapons video, and I wanted to test the "conditional lock" sample with a loop inside it. Apparently, although (if I understand correctly) the C++11 standard says the example below should work and be sequentially consistent, it is not.

Before you read on, my question is: is this correct? Is the compiler broken? Is my code broken, that is, do I have a race condition here that I missed? How do I bypass this?

I tried it on 3 different versions of Visual C++: VC10 Professional, VC11 Professional and VC12 Express (== Visual Studio 2013 Desktop Express).

Below is the code I used for Visual Studio 2013. For the other versions I used boost instead of std, but the idea is the same.

    #include <chrono>
    #include <iostream>
    #include <mutex>
    #include <thread>

    int a = 0;     // shared global, protected by m
    std::mutex m;

    void other()
    {
        std::lock_guard<std::mutex> l(m);
        std::this_thread::sleep_for(std::chrono::milliseconds(2));
        a = 999999;
        std::this_thread::sleep_for(std::chrono::seconds(2));
        std::cout << a << "\n";
    }

    int main(int argc, char* argv[])
    {
        // With no command line arguments, main() must never touch a.
        bool work = (argc > 1);

        if (work)
        {
            m.lock();
        }
        std::thread th(other);
        for (int i = 0; i < 100000000; ++i)
        {
            if (i % 7 == 3)
            {
                if (work)
                {
                    ++a;
                }
            }
        }
        if (work)
        {
            std::cout << a << "\n";
            m.unlock();
        }
        th.join();
    }

To summarize the idea of the code: the global variable a is protected by the global mutex m. Assuming there are no command line arguments (argc==1), the thread that runs other() is the only one that is supposed to access the global variable a.

The correct output of the program is to print 999999.

However, because of the compiler's loop optimization (using a register for the in-loop increments and copying the value back to a at the end of the loop), a is modified by the generated assembly even though it is not supposed to be.
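The effect of that transformation can be reproduced deterministically in a single thread. The following is only a hand-written model, not compiler output: `loop_as_written`, `loop_as_compiled` and `other_thread_step` are hypothetical names, and the reduced iteration count and the point at which the "other thread" runs (i == 50) are assumptions made for illustration.

```cpp
#include <functional>

int a = 0;  // shared global, as in the question

// The loop as written in the source: a is touched only when work is true.
void loop_as_written(bool work, const std::function<void()>& other_thread_step)
{
    for (int i = 0; i < 100; ++i)
    {
        if (i == 50) other_thread_step();  // stand-in for the other thread running mid-loop
        if (i % 7 == 3)
        {
            if (work) ++a;
        }
    }
}

// Model of the optimized loop: a is cached in a local ("register")
// and stored back unconditionally once the loop is done.
void loop_as_compiled(bool work, const std::function<void()>& other_thread_step)
{
    int reg = a;                           // load a once, before the loop
    for (int i = 0; i < 100; ++i)
    {
        if (i == 50) other_thread_step();
        if (i % 7 == 3)
        {
            if (work) ++reg;               // increment the cached copy only
        }
    }
    a = reg;                               // unconditional store, even when work is false
}
```

Calling `loop_as_compiled(false, [] { a = 999999; })` with a initially 0 leaves a at 0, because the stale cached value overwrites the other writer's store; `loop_as_written` with the same arguments leaves a at 999999.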

This happened in all 3 VC versions, although for this code example in VC12 I had to plant some calls to sleep() to make it break.

Here's some of the assembly code (the address of a in this run is 0x00f65498):

Loop initialization: the value of a is copied into edi

        27:     for (int i = 0; i < 100000000; ++i)
    00f61543  xor         esi,esi
    00f61545  mov         edi,dword ptr ds:[0f65498h]
    00f6154b  jmp         main+0c0h (0f61550h)
    00f6154d  lea         ecx,[ecx]
        28:     {
        29:         if (i % 7 == 3)

The increment happens inside the condition, and after the loop the register is copied back to the location of a unconditionally

        30:         {
        31:             if (work)
    00f61572  mov         al,byte ptr [esp+1bh]
    00f61576  jne         main+0edh (0f6157dh)
    00f61578  test        al,al
    00f6157a  je          main+0edh (0f6157dh)
        32:             {
        33:                 ++a;
    00f6157c  inc         edi
        27:     for (int i = 0; i < 100000000; ++i)
    00f6157d  inc         esi
    00f6157e  cmp         esi,5f5e100h
    00f61584  jl          main+0c0h (0f61550h)
        32:             {
        33:                 ++a;
    00f61586  mov         dword ptr ds:[0f65498h],edi
        34:             }

And the output of the program is 0.

The 'volatile' keyword will prevent that kind of optimization. That's what it's for: every use of 'a' will be read or written exactly as shown, and won't be moved into a different order relative to other volatile variables.
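A minimal sketch of that suggestion, assuming only the declaration of a changes. `counting_loop` and its `iterations` parameter are hypothetical names introduced here so the loop can be tried in isolation. Note that in standard C++ volatile is not a synchronization primitive, so this sketch only aims to stop the register caching; the mutex from the question would still be needed to protect a.

```cpp
volatile int a = 0;  // every read and write of a must now be performed as written

// Same loop shape as in the question, with the iteration count as a parameter.
void counting_loop(bool work, int iterations)
{
    for (int i = 0; i < iterations; ++i)
    {
        if (i % 7 == 3)
        {
            if (work)
            {
                a = a + 1;  // spelled out because ++ on a volatile is deprecated in C++20
            }
        }
    }
}
```

With work set to false the abstract machine performs no access to a at all, so a conforming compiler has no volatile access to emit and a value written elsewhere is left alone.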

The implementation of the mutex should include compiler-specific instructions to cause a "fence" at that point, telling the optimizer not to reorder instructions across that boundary. Since the implementation is not from the compiler vendor, maybe that's left out? I've never checked.

Since 'a' is global, I would think the compiler would be more careful with it. But VS10 doesn't know about threads, so it won't consider that other threads will use it. Since the optimizer grasps the entire loop execution, it knows that functions called from within the loop won't touch 'a', and that's enough for it.

I'm not sure what the new standard says about thread visibility of global variables other than volatile. That is, is there a rule that would prevent that optimization (even though the function can be grasped all the way down, so the compiler knows other functions don't use the global, it must assume that other threads can)?

I suggest trying the newer compiler with the compiler-provided std::mutex, and checking what the C++ standard and current drafts say about that. I think the above should help you know what to look for.

—john

c++ multithreading visual-c++ concurrency compiler-optimization
