Optimize vpsubusw compared to 0 into vpcmpleuw or vpcmpnleuw [PR96906]
author liuhongt <hongtao.liu@intel.com>
Mon, 30 Nov 2020 05:27:16 +0000 (13:27 +0800)
committer liuhongt <hongtao.liu@intel.com>
Thu, 3 Dec 2020 05:42:39 +0000 (13:42 +0800)
commit 70310982492071f98eacdac0747521769b0f0328
tree 1b8f4e168b25ff13331f63c9d8592966cd4c9cc7
parent 35c4c67e6c534ef3d6ba7a7752ab7e0fbc91755b

For signed comparisons, the new split handles the cases that are eq or
neq to 0. For unsigned comparisons, it additionally handles the cases
that are le or gt to 0 (equivalent to eq or neq to 0). The eq case is
transformed to leu and the neq case to gtu.
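
The rewrite relies on a per-lane identity: an unsigned saturating
subtraction a - b is 0 exactly when a <= b, and nonzero exactly when
a > b. The following scalar C model is only an illustrative sketch of
that identity, not code from the patch; the helper names sat_sub_u16,
rewrite_agrees and count_mismatches are hypothetical, and the sampling
strides are arbitrary.

```c
#include <stdint.h>

/* Scalar model of one 16-bit vpsubusw lane: unsigned saturating a - b. */
uint16_t sat_sub_u16(uint16_t a, uint16_t b)
{
    return a > b ? (uint16_t)(a - b) : 0;
}

/* Returns 1 if both rewrites give the same answer as the original
   subtract-then-compare-with-zero sequence for this pair of lanes. */
int rewrite_agrees(uint16_t a, uint16_t b)
{
    uint16_t d = sat_sub_u16(a, b);
    int eq_ok  = (d == 0) == (a <= b);  /* eq  to 0  ->  leu */
    int neq_ok = (d != 0) == (a > b);   /* neq to 0  ->  gtu */
    return eq_ok && neq_ok;
}

/* Counts disagreements over a sample of pairs, including the a == b
   boundary and its two neighbours for every sampled a. */
unsigned count_mismatches(void)
{
    unsigned bad = 0;
    for (uint32_t a = 0; a <= UINT16_MAX; a += 251) {
        for (uint32_t b = 0; b <= UINT16_MAX; b += 127)
            bad += !rewrite_agrees((uint16_t)a, (uint16_t)b);
        bad += !rewrite_agrees((uint16_t)a, (uint16_t)a);
        bad += !rewrite_agrees((uint16_t)a, (uint16_t)(a + 1)); /* wraps at max */
        bad += !rewrite_agrees((uint16_t)a, (uint16_t)(a - 1)); /* wraps at 0 */
    }
    return bad;
}
```

In this model a corresponds to %xmm0 and b to %xmm1 in the assembly
sequences shown in this message.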

i.e. for -mavx512bw -mavx512vl, transform the eq case code from

vpsubusw        %xmm1, %xmm0, %xmm0
vpxor   %xmm1, %xmm1, %xmm1
vpcmpeqw  %xmm1, %xmm0, %k0
to
vpcmpleuw       %xmm1, %xmm0, %k0

i.e. for -mavx512bw -mavx512vl, transform the neq case code from

vpsubusw        %xmm1, %xmm0, %xmm0
vpxor   %xmm1, %xmm1, %xmm1
vpcmpneqw  %xmm1, %xmm0, %k0
to
vpcmpnleuw       %xmm1, %xmm0, %k0

gcc/ChangeLog
PR target/96906
* config/i386/sse.md
(<avx512>_ucmp<mode>3<mask_scalar_merge_name>): Add a new
define_split after this insn.

gcc/testsuite/ChangeLog

* gcc.target/i386/avx512bw-pr96906-1.c: New test.
* gcc.target/i386/pr96906-1.c: Add -mno-avx512f.
gcc/config/i386/sse.md
gcc/testsuite/gcc.target/i386/avx512bw-pr96906-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/pr96906-1.c