gallivm: support avx512 (16x32) in interleave2_half
author George Kyriazis <george.kyriazis@intel.com>
Wed, 17 Jan 2018 00:06:34 +0000 (18:06 -0600)
committer George Kyriazis <george.kyriazis@intel.com>
Thu, 18 Jan 2018 23:07:06 +0000 (17:07 -0600)
commit f76ca91ae07040fe661ecb215b2e6bf43dc16283
tree b3ff2cc2019aed05641c671b5c2bec3cb1a0726c
parent 9e6efdd1776cb76386d8aa774926c77aa2db7804

lp_build_interleave2_half did not interleave avx512-style 16-wide
vectors correctly.

This path is hit in the swr driver with a 16-wide vertex shader. The
function is called from lp_build_transpose_aos when texel fetches return
data that must be transposed to one component per output register.

Special-case the post-load swizzle operations for avx512 16x32 (16-wide
32-bit values) so that we move the xyzw components correctly to the outputs.
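
As an illustration only, the sketch below computes shuffle indices for a
16x32 interleave that works independently within each 128-bit lane. It
assumes AVX-style unpack lo/hi semantics; the helper name
interleave2_half_indices_16x32 is hypothetical and is not taken from the
patch, and the actual shuffle in lp_bld_pack.c may be built differently.

```c
#include <assert.h>

#define WIDE       16   /* 16 elements of 32 bits = 512 bits */
#define LANE_ELEMS  4   /* 32-bit elements per 128-bit lane */

/*
 * Hypothetical helper, not part of gallivm.
 *
 * Fill idx[] with shuffle indices into the concatenation (a, b):
 * 0..15 select from a, 16..31 select from b.
 * lo_hi == 0 interleaves the low half of every 128-bit lane,
 * lo_hi == 1 interleaves the high half.
 */
static void
interleave2_half_indices_16x32(unsigned lo_hi, unsigned idx[WIDE])
{
   assert(lo_hi <= 1);

   for (unsigned i = 0; i < WIDE; i += 2) {
      unsigned lane = i / LANE_ELEMS;        /* 128-bit lane of the output */
      unsigned pair = (i % LANE_ELEMS) / 2;  /* pair 0 or 1 inside the lane */
      unsigned src  = lane * LANE_ELEMS + lo_hi * 2 + pair;

      idx[i]     = src;         /* element taken from a */
      idx[i + 1] = src + WIDE;  /* matching element taken from b */
   }
}
```

Under these assumptions, lo_hi == 0 selects a0 b0 a1 b1 a4 b4 a5 b5 a8 b8
a9 b9 a12 b12 a13 b13, and lo_hi == 1 selects a2 b2 a3 b3 a6 b6 a7 b7 a10
b10 a11 b11 a14 b14 a15 b15.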

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
src/gallium/auxiliary/gallivm/lp_bld_pack.c