Enable AVX512 memory broadcast for FMSUB
authorH.J. Lu <hongjiu.lu@intel.com>
Sun, 21 Oct 2018 20:24:50 +0000 (20:24 +0000)
committerH.J. Lu <hjl@gcc.gnu.org>
Sun, 21 Oct 2018 20:24:50 +0000 (13:24 -0700)
commitfe7f972d6ecc1f1df34f15615b7e3dea6f39e564
treee1eb0e9ceeeaffb0ac1c74cebe434dcc0f271985
parent88c08ac43c47bb5d21734be744df913dd568d108
Enable AVX512 memory broadcast for FMSUB

Many AVX512 vector operations can broadcast from a scalar memory source.
This patch enables memory broadcast for FMSUB operations.  In order to
support AVX512 memory broadcast for FMSUB, FMSUB builtin functions are
also added, instead of passing the negated value to FMA builtin functions.

gcc/

PR target/72782
* config/i386/avx512fintrin.h (_mm512_fmsub_round_pd): Use
__builtin_ia32_vfmsubpd512_mask.
(_mm512_mask_fmsub_round_pd): Likewise.
(_mm512_fmsub_pd): Likewise.
(_mm512_mask_fmsub_pd): Likewise.
(_mm512_maskz_fmsub_round_pd): Use
__builtin_ia32_vfmsubpd512_maskz.
(_mm512_maskz_fmsub_pd): Likewise.
(_mm512_fmsub_round_ps): Use __builtin_ia32_vfmsubps512_mask.
(_mm512_mask_fmsub_round_ps): Likewise.
(_mm512_fmsub_ps): Likewise.
(_mm512_mask_fmsub_ps): Likewise.
(_mm512_maskz_fmsub_round_ps): Use
__builtin_ia32_vfmsubps512_maskz.
(_mm512_maskz_fmsub_ps): Likewise.
* config/i386/avx512vlintrin.h (_mm256_mask_fmsub_pd): Use
__builtin_ia32_vfmsubpd256_mask.
(_mm256_maskz_fmsub_pd): Use __builtin_ia32_vfmsubpd256_maskz.
(_mm_mask_fmsub_pd): Use __builtin_ia32_vfmaddpd128_mask
(_mm_maskz_fmsub_pd): Use __builtin_ia32_vfmsubpd128_maskz.
(_mm256_mask_fmsub_ps): Use __builtin_ia32_vfmsubps256_mask.
(_mm256_mask_fmsub_ps): Use __builtin_ia32_vfmsubps256_mask.
(_mm256_maskz_fmsub_ps): Use __builtin_ia32_vfmsubps256_maskz.
(_mm_mask_fmsub_ps): Use __builtin_ia32_vfmsubps128_mask.
(_mm_maskz_fmsub_ps): Use __builtin_ia32_vfmsubps128_maskz.
* config/i386/fmaintrin.h (_mm_fmsub_pd): Use
__builtin_ia32_vfmsubpd.
(_mm256_fmsub_pd): Use __builtin_ia32_vfmsubpd256.
(_mm_fmsub_ps): Use __builtin_ia32_vfmsubps.
(_mm256_fmsub_ps): Use __builtin_ia32_vfmsubps256.
(_mm_fmsub_sd): Use __builtin_ia32_vfmsubsd3.
(_mm_fmsub_ss): Use __builtin_ia32_vfmsubss3.
* config/i386/i386-builtin.def: Add
__builtin_ia32_vfmsubpd256_mask,
__builtin_ia32_vfmsubpd256_maskz,
__builtin_ia32_vfmsubpd128_mask,
__builtin_ia32_vfmsubpd128_maskz,
__builtin_ia32_vfmsubps256_mask,
__builtin_ia32_vfmsubps256_maskz,
__builtin_ia32_vfmsubps128_mask,
__builtin_ia32_vfmsubps128_maskz,
__builtin_ia32_vfmsubpd512_mask,
__builtin_ia32_vfmsubpd512_maskz,
__builtin_ia32_vfmsubps512_mask,
__builtin_ia32_vfmsubps512_maskz, __builtin_ia32_vfmsubss3,
__builtin_ia32_vfmsubsd3, __builtin_ia32_vfmsubps,
__builtin_ia32_vfmsubpd, __builtin_ia32_vfmsubps256 and.
__builtin_ia32_vfmsubpd256.
* config/i386/sse.md (fma4i_fmsub_<mode>): New.
(<avx512>_fmsub_<mode>_maskz<round_expand_name>): Likewise.
(*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_1):
Likewise.
(*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_2):
Likewise.
(*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_3):
Likewise.
(fmai_vmfmsub_<mode><round_name>): Likewise.

gcc/testsuite/

PR target/72782
* gcc.target/i386/avx512f-fmsub-df-zmm-1.c: New test.
* gcc.target/i386/avx512f-fmsub-sf-zmm-1.c: Likewise.
* gcc.target/i386/avx512f-fmsub-sf-zmm-2.c: Likewise.
* gcc.target/i386/avx512f-fmsub-sf-zmm-3.c: Likewise.
* gcc.target/i386/avx512f-fmsub-sf-zmm-4.c: Likewise.
* gcc.target/i386/avx512f-fmsub-sf-zmm-5.c: Likewise.
* gcc.target/i386/avx512f-fmsub-sf-zmm-6.c: Likewise.
* gcc.target/i386/avx512f-fmsub-sf-zmm-7.c: Likewise.
* gcc.target/i386/avx512f-fmsub-sf-zmm-8.c: Likewise.
* gcc.target/i386/avx512vl-fmsub-sf-xmm-1.c: Likewise.
* gcc.target/i386/avx512vl-fmsub-sf-ymm-1.c: Likewise.

From-SVN: r265356
18 files changed:
gcc/ChangeLog
gcc/config/i386/avx512fintrin.h
gcc/config/i386/avx512vlintrin.h
gcc/config/i386/fmaintrin.h
gcc/config/i386/i386-builtin.def
gcc/config/i386/sse.md
gcc/testsuite/ChangeLog
gcc/testsuite/gcc.target/i386/avx512f-fmsub-df-zmm-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-fmsub-sf-zmm-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-fmsub-sf-zmm-2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-fmsub-sf-zmm-3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-fmsub-sf-zmm-4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-fmsub-sf-zmm-5.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-fmsub-sf-zmm-6.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-fmsub-sf-zmm-7.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512f-fmsub-sf-zmm-8.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512vl-fmsub-sf-xmm-1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/i386/avx512vl-fmsub-sf-ymm-1.c [new file with mode: 0644]