nv50/ir: do not perform global membar for shared memory
author Samuel Pitoiset <samuel.pitoiset@gmail.com>
Mon, 24 Oct 2016 19:41:11 +0000 (21:41 +0200)
committer Samuel Pitoiset <samuel.pitoiset@gmail.com>
Mon, 24 Oct 2016 20:51:54 +0000 (22:51 +0200)
commit 6dbb8d12a8b78769b9803884fad5f0d9923023bc
tree 3716f196fa05256bdbb652d641fc76cec6545a8e
parent eed605a473554575305e1bf10c3641761a85feb9

Shared memory is local to a CTA, so we should only wait for
prior memory writes to become visible to other threads in
the same CTA, not at the global level. This should speed up
compute shaders that use shared memory.
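
The scope choice described above can be sketched roughly as follows.
This is a standalone illustration, not the code from this commit: the
MembarFlag bits, the MembarScope enum and pickMembarScope() are
hypothetical names standing in for the TGSI barrier mask and the
codegen membar sub-operation, and the bit values are made up.

```cpp
#include <cstdint>

// Hypothetical barrier flags (illustrative values, not the real TGSI mask).
enum MembarFlag : uint32_t {
    MEMBAR_SHADER_BUFFER = 1u << 0,
    MEMBAR_ATOMIC_BUFFER = 1u << 1,
    MEMBAR_SHADER_IMAGE  = 1u << 2,
    MEMBAR_SHARED        = 1u << 3,
};

// Hypothetical scope of the emitted memory barrier.
enum class MembarScope { CTA, Global };

// Pick the narrowest scope that still covers every requested memory type.
// Shared memory is only visible inside one CTA, so a barrier that asks for
// nothing beyond shared memory does not need to wait at global level.
inline MembarScope pickMembarScope(uint32_t flags)
{
    const uint32_t ctaLocal = MEMBAR_SHARED;
    if ((flags & ~ctaLocal) == 0)
        return MembarScope::CTA;    // no globally visible memory involved
    return MembarScope::Global;     // buffers or images require a global wait
}
```

With this sketch, a barrier requesting only MEMBAR_SHARED maps to the
CTA scope, while any mask that also includes buffer or image bits
still falls back to the global scope.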

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp