draw: finally optimize bool clip mask generation
author Roland Scheidegger <sroland@vmware.com>
Sat, 12 Nov 2016 21:46:58 +0000 (22:46 +0100)
committer Roland Scheidegger <sroland@vmware.com>
Fri, 18 Nov 2016 00:25:21 +0000 (01:25 +0100)
commit 5ec3a7333fd77698610755d51e42094376e11d01
tree 6ba951e3cea248704ecb2ca252f28b3c11cea59b
parent b16f06fd0593099aad74775a41cf74d4c09c3f6a

lp_build_any_true_range is just what we need. It will only produce
optimal code with SSE4.1 (ptest + set), but even without it the code on
64-bit x86 is still better than before (1 unpack, 2 movq + or + set).
On 32-bit x86 it's going to be roughly the same as before.
While here, also make the clip mask a "real" 8-bit boolean. This cuts one
instruction and, more importantly, makes it similar to ordinary booleans.
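
For illustration only, here is a scalar C model of the reduction described
above on the 64-bit x86 path without SSE4.1: the two 64-bit halves of a
4 x 32-bit clip mask vector are ORed together and compared against zero.
The helper name any_clip_mask_set and the 4-lane width are assumptions made
for this sketch. The real code emits LLVM IR through
lp_build_any_true_range and does not look like this.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of draw or gallivm.
 * Returns 1 if any of the four 32-bit clip masks has a bit set, else 0.
 * The result is an 8-bit boolean, mirroring the "real" boolean mentioned
 * in the commit message.
 */
static uint8_t any_clip_mask_set(const uint32_t masks[4])
{
    uint64_t lo, hi;

    /* View the 128-bit vector as two 64-bit halves (the two movq). */
    memcpy(&lo, &masks[0], sizeof lo);
    memcpy(&hi, &masks[2], sizeof hi);

    /* One OR plus one compare against zero (the or + set). */
    return (uint8_t)((lo | hi) != 0);
}
```

Only the zero/non-zero result matters here, so the model does not depend on
byte order. With SSE4.1 the same question can be answered by a single ptest
on the whole vector followed by a set instruction.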

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
src/gallium/auxiliary/draw/draw_llvm.c
src/gallium/auxiliary/draw/draw_llvm.h
src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline_llvm.c