radeonsi/gfx10: fix ds.ordered.add intrinsic for compute-based culling