Optimize memory broadcast for constant vector under AVX512.
author liuhongt <hongtao.liu@intel.com>
Wed, 8 Jul 2020 09:14:36 +0000 (17:14 +0800)
committer liuhongt <hongtao.liu@intel.com>
Thu, 3 Sep 2020 08:10:45 +0000 (16:10 +0800)
commit 433734126996b6fc4fc99b594421510f928a7bb9
tree 3709ee6cb49463d9a7dc483903d79cba3c2546bb
parent 8bd5530bfa136663f1fa79e9a1d3932b5adf15bd

For a constant vector whose elements all duplicate a single value, there
is no need to put the whole vector in the constant pool; use an embedded
broadcast of the scalar instead.
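
To illustrate the kind of source this targets, here is a minimal sketch
written with the GCC/Clang generic vector extension. The type name
v16sf, the function names, and the constant 101.0f are hypothetical and
are not taken from the tests added by this commit. Every lane of the
constant operand holds the same value, so with AVX512F enabled a
compiler containing this change would be expected to be able to read
the scalar through an embedded-broadcast memory operand instead of
loading a full 64-byte vector from the constant pool.

```c
/* 16 single-precision lanes = 512 bits (GCC/Clang vector extension).  */
typedef float v16sf __attribute__ ((vector_size (64)));

/* Hypothetical example: add one constant to every lane.  The scalar
   101.0f is implicitly splatted across all 16 lanes, so the constant
   operand is a vector with a single duplicated value.  */
void
add_splat (v16sf *dst, const v16sf *src)
{
  *dst = *src + 101.0f;
}

/* Returns 1 if every lane of the result equals its input plus 101.  */
int
check_add_splat (void)
{
  v16sf in, out;

  for (int i = 0; i < 16; i++)
    in[i] = (float) i;

  add_splat (&out, &in);

  for (int i = 0; i < 16; i++)
    if (out[i] != (float) i + 101.0f)
      return 0;
  return 1;
}
```

The sketch passes vectors by pointer so that it also builds and runs on
targets without 512-bit registers; the generated code differs per
target, but the result is the same.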

2020-07-09  Hongtao Liu  <hongtao.liu@intel.com>

gcc/ChangeLog:

PR target/87767
* config/i386/i386-features.c
(replace_constant_pool_with_broadcast): New function.
(constant_pool_broadcast): Ditto.
(class pass_constant_pool_broadcast): New pass.
(make_pass_constant_pool_broadcast): Ditto.
(remove_partial_avx_dependency): Call
replace_constant_pool_with_broadcast under TARGET_AVX512F; this
saves compile time when both the rpad and cpb passes are
available.
(remove_partial_avx_dependency_gate): New function.
(class pass_remove_partial_avx_dependency::gate): Call
remove_partial_avx_dependency_gate.
* config/i386/i386-passes.def: Insert new pass after combine.
* config/i386/i386-protos.h
(make_pass_constant_pool_broadcast): Declare.
* config/i386/sse.md (*avx512dq_mul<mode>3<mask_name>_bcst):
New define_insn.
(*avx512f_mul<mode>3<mask_name>_bcst): Ditto.
* config/i386/avx512fintrin.h (_mm512_set1_ps,
_mm512_set1_pd, _mm512_set1_epi32, _mm512_set1_epi64): Adjusted.

gcc/testsuite/ChangeLog:

PR target/87767
* gcc.target/i386/avx2-broadcast-pr87767-1.c: New test.
* gcc.target/i386/avx512f-broadcast-pr87767-1.c: New test.
* gcc.target/i386/avx512f-broadcast-pr87767-2.c: New test.
* gcc.target/i386/avx512f-broadcast-pr87767-3.c: New test.
* gcc.target/i386/avx512f-broadcast-pr87767-4.c: New test.
* gcc.target/i386/avx512f-broadcast-pr87767-5.c: New test.
* gcc.target/i386/avx512f-broadcast-pr87767-6.c: New test.
* gcc.target/i386/avx512f-broadcast-pr87767-7.c: New test.
* gcc.target/i386/avx512vl-broadcast-pr87767-1.c: New test.
* gcc.target/i386/avx512vl-broadcast-pr87767-2.c: New test.
* gcc.target/i386/avx512vl-broadcast-pr87767-3.c: New test.
* gcc.target/i386/avx512vl-broadcast-pr87767-4.c: New test.
* gcc.target/i386/avx512vl-broadcast-pr87767-5.c: New test.
* gcc.target/i386/avx512vl-broadcast-pr87767-6.c: New test.
19 files changed:
gcc/config/i386/avx512fintrin.h
gcc/config/i386/i386-features.c
gcc/config/i386/i386-passes.def
gcc/config/i386/i386-protos.h
gcc/config/i386/sse.md
gcc/testsuite/gcc.target/i386/avx2-broadcast-pr87767-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-broadcast-pr87767-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-broadcast-pr87767-2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-broadcast-pr87767-3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-broadcast-pr87767-4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-broadcast-pr87767-5.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-broadcast-pr87767-6.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-broadcast-pr87767-7.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512vl-broadcast-pr87767-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512vl-broadcast-pr87767-2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512vl-broadcast-pr87767-3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512vl-broadcast-pr87767-4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512vl-broadcast-pr87767-5.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512vl-broadcast-pr87767-6.c [new file with mode: 0644]