Sorry, I should have been a lot clearer about what I meant. Let me rephrase: no matter how "clever" a technique is, and whether or not it ultimately guarantees safe access to shared state under every circumstance, any time the same memory location is read or written concurrently by two separate threads, TSAN will flag it as a violation. My question was whether atomic operations (e.g. atomic_load / atomic_fetch_add / atomic_fetch_sub) would ever trigger TSAN. They clearly touch a shared memory location simultaneously, but their very nature is to do so predictably. My guess is that they shouldn't upset TSAN, but if you google the related keywords you end up reading about whether false positives are possible (perhaps only in previous versions of TSAN?).
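(For anyone who wants to try it themselves, the kind of probe I have in mind is below – plain C11 <stdatomic.h> with made-up names. As far as I understand, TSAN intercepts the C11/C++11 atomic builtins, so this should come back clean under -fsanitize=thread.)

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int counter;                 // shared, but only ever touched atomically

static void *bump(void *unused) {
    for (int i = 0; i < 100000; i++) {
        atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
    }
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, bump, NULL);
    pthread_create(&b, NULL, bump, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("%d\n", atomic_load(&counter));  // 200000, and (I believe) no TSAN report
    return 0;
}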
Since this thread is turning out more interesting than I had hoped for, it goes without saying that Quinn's advice is sound as always: every "clever" technique is a difficult bet, not just that the code is safe today, but that it will remain safe in light of evolving compiler technology. Since his first reply I have already guilt-added some os_unfair_lock lock/unlock pairs so that TSAN keeps watching over a few functions I had previously disabled it on.
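(The shape of what I added is roughly the following – the lock and the field are invented names; the point is simply that every read and write of the shared state now goes through the same os_unfair_lock, so TSAN has something to reason about.)

#include <os/lock.h>

static os_unfair_lock _statsLock = OS_UNFAIR_LOCK_INIT;  // invented name
static int64_t _frameCount;                               // invented shared state

static void IncrementFrameCount(void) {
    os_unfair_lock_lock(&_statsLock);
    _frameCount += 1;
    os_unfair_lock_unlock(&_statsLock);
}

static int64_t CurrentFrameCount(void) {
    os_unfair_lock_lock(&_statsLock);
    int64_t value = _frameCount;
    os_unfair_lock_unlock(&_statsLock);
    return value;
}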
My use of @synchronized() in that particular example is made with full awareness of its poor performance: does it matter how bad the implementation is when, in most cases, you pay the price only once? In some parts of my codebase, relying on the syntactic sugar seems fair. We use GCD queues, os_unfair_lock, atomic_int and pthreads elsewhere, as appropriate. I assume (wrongly?) that @synchronized uses NSRecursiveLock, or something similar to it, behind the scenes. Would that be the reason it is slow? With so much else in the ObjC runtime having been optimized to the extreme, one wonders why this particular hole was left alone.
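(For what it's worth, my understanding – worth checking against the open-source objc runtime – is that @synchronized(obj) is lowered roughly as sketched below: the cost is the two runtime calls plus a side-table lookup of a recursive lock keyed on the object's address, rather than NSRecursiveLock itself.)

// Rough sketch of what the compiler emits for @synchronized(self) { ... };
// the real expansion also handles exceptions thrown from the body.
objc_sync_enter(self);      // find/create the recursive lock associated with self
@try {
    // ... protected code ...
} @finally {
    objc_sync_exit(self);   // unlock it, even if the body threw
}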
Another area where TSAN is clearly upset at my codebase is the more "clever" techniques that rely on the atomicity of pointer-sized assignments on both Intel and ARM 64-bit architectures. The reason you can get away with unguarded reads in (probably all of) these techniques is that the bit pattern of a pointer-sized, properly aligned value will never contain a partially written sequence of bits: you read either all of the bits from the last completed write or all of the bits from the previous value, never a mix produced by an "in-flight" CPU instruction. You can have some fun with this – again with the noble goal of avoiding any and all locking primitives in your most common access pattern. To be clear: this liberally defined "atomicity" of a CPU instruction is a very, very thin foundation on which to be clever. But there are legitimate cases where trying to squeeze out some extra performance makes sense for us, and I assume for other multi-threaded, graphics-intensive code aiming for the highest FPS possible. My original question/hope was indeed one that makes sense in my domain: can one selectively disable TSAN on a statement-by-statement basis, or is one forced to snooze it for an entire function body? It seems that, right now, only the blunt option is available.
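(The middle ground I have been experimenting with, shown below with invented names, is to express those pointer-sized accesses as relaxed C11 atomics – as far as I can tell they compile down to the same single load/store instruction on x86-64 and arm64, but the race becomes defined behaviour that TSAN understands. Where I truly want TSAN out of the way, Clang's function-level attribute is the only granularity I have found, i.e. exactly the blunt option I was complaining about.)

#include <stdatomic.h>

static _Atomic(void *) _latestFrame;   // invented shared pointer

static void PublishFrame(void *frame) {
    // Caveat: relaxed ordering only guarantees the pointer itself isn't torn;
    // if readers dereference into freshly written contents, use release/acquire.
    atomic_store_explicit(&_latestFrame, frame, memory_order_relaxed);
}

static void *LatestFrame(void) {
    return atomic_load_explicit(&_latestFrame, memory_order_relaxed);
}

// The only suppression granularity I have found so far is per function:
__attribute__((no_sanitize("thread")))
static void HotPathThatTSANWouldFlag(void) {
    // ... unguarded pointer-sized reads/writes ...
}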
@endecotp: I assume std::call_once() uses the same technique and therefore has the same excellent performance as dispatch_once. I am not against it, but I would rather keep the sources Obj-C only. Why would anyone prefer @synchronized over dispatch_once? I sometimes do, when performance isn't an issue, so that the code is easier to read. Syntactic sugar is very sweet, and one wonders why they stopped improving Obj-C in this regard. Anyone who follows Swift language "evolution" will readily notice that they entertain language-wide changes for almost any scenario they come across. The line of where to stop adding syntactic sugar is rarely drawn, and each release adds more scenarios where you write less code (this trend troubles me in profound ways, but I like it :-)). There is something to dislike about dispatch_once and std::call_once() too... that pesky statically allocated token that sits there and – if you leave it at file scope – has the potential to be reused accidentally, be it through an auto-complete typo or by carelessly copying/pasting code. My preference is to scope the dispatch_once token within the body of a block defined locally to the function. This involves C macros, though. For example, if you want a lazily allocated, static/global variable:
#define UNPACK(...) __VA_ARGS__
#define ONCE(body) (^typeof((body)) { \
    static typeof((body)) value; \
    static dispatch_once_t once; \
    dispatch_once(&once, ^{ value = (body); }); \
    return value; \
})()
/// You would only use the following macro. The above two are just to support the pattern
#define Once(...) ONCE(UNPACK(__VA_ARGS__))
The goal is that the dispatch_once_t variable is static and, crucially, starts as 0 when your process is loaded, yet it is invisible outside the scope of the locally defined block. No variable name "leaks" outside that scope, and the resulting code (e.g. someVar = Once([[MyClass alloc] init])) can be copied/pasted at will with no fear of accidentally reusing the same static token. (Again, in the hope I'm not overlooking something horrible...)
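(In case it helps anyone reading along, usage ends up looking like this – the class and method names are invented:)

- (MyHelper *)sharedHelper {
    // The dispatch_once_t lives inside the block generated by ONCE(), so
    // copying/pasting this line elsewhere cannot reuse the same token.
    return Once([[MyHelper alloc] init]);
}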