[AArch64] Support for LDP/STP of Q-registers
author Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Wed, 20 Jun 2018 08:57:17 +0000 (08:57 +0000)
committer Kyrylo Tkachov <ktkachov@gcc.gnu.org>
Wed, 20 Jun 2018 08:57:17 +0000 (08:57 +0000)
commit 9f5361c8cac181dbc79b7302d7241a61e0ce2386
tree 199ba2d587d84ee6fc93df8134cce47cb4793499
parent de840bde88a7719fa022f1de6f4e407bf5b4c8a8

This patch adds support for generating LDPs and STPs of Q-registers.
This allows for more compact code generation and makes better use of the ISA.
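
As a rough illustration of the kind of source this targets, here is a minimal
sketch. The names `v4si`, `copy_pair` and `copy_pair_check` are made up for
this example and are not taken from the new test files. The function does two
adjacent 16-byte loads and two adjacent 16-byte stores, which is the shape
that could be merged into a single LDP and a single STP of Q-registers.
Whether the pairs are actually formed depends on the compiler version, the
optimization level and the selected tuning.

```c
#include <assert.h>
#include <string.h>

/* A 128-bit vector of four ints, written with the GCC vector extension so
   the snippet also builds on non-AArch64 hosts.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Two adjacent 16-byte loads followed by two adjacent 16-byte stores:
   the shape that could be merged into one LDP and one STP of Q-registers.  */
void
copy_pair (v4si *restrict dst, const v4si *restrict src)
{
  v4si a = src[0];
  v4si b = src[1];
  dst[0] = a;
  dst[1] = b;
}

/* Portable check that the copy itself is correct.  It says nothing about
   which instructions the compiler chose.  Returns 1 on success.  */
int
copy_pair_check (void)
{
  v4si src[2] = { { 1, 2, 3, 4 }, { 5, 6, 7, 8 } };
  v4si dst[2];
  int ok;

  memset (dst, 0, sizeof dst);
  copy_pair (dst, src);
  ok = memcmp (dst, src, sizeof src) == 0;
  assert (ok);
  return ok;
}
```

To see what a given compiler does with it, compile `copy_pair` for AArch64 with
optimization enabled and inspect the assembly for `ldp`/`stp` with `q`
operands.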

It's implemented in a straightforward way by allowing 16-byte modes in the
sched-fusion machinery and adding appropriate peepholes in aarch64-ldpstp.md
as well as the patterns themselves in aarch64-simd.md.

It adds a new no_ldp_stp_qregs tuning flag.
I use it to stop the peepholes in aarch64-ldpstp.md from merging the
operations together into PARALLELs, and to restrict the sched-fusion check
that brings such loads and stores together. This is enough to avoid forming
the pairs when the tuning flag is set.
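
The gating described above can be modelled with a toy predicate. This is only
a sketch of the idea, not the code in aarch64.c: `TUNE_NO_LDP_STP_QREGS` and
`pair_allowed` are hypothetical names, and passing other access sizes through
unchanged is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the new tuning flag; the real flag is defined
   in aarch64-tuning-flags.def and is not reproduced here.  */
enum { TUNE_NO_LDP_STP_QREGS = 1u << 0 };

/* Toy version of the gate: a pair of 16-byte accesses may be formed only
   when the tuning flag is clear.  Other sizes are passed through untouched
   (an assumption of this sketch).  */
bool
pair_allowed (unsigned access_bytes, unsigned tune_flags)
{
  assert (access_bytes != 0);
  if (access_bytes == 16)
    return (tune_flags & TUNE_NO_LDP_STP_QREGS) == 0;
  return true;
}
```

Applying the same predicate at both places, the peepholes and the sched-fusion
check, is what keeps Q-register pairs from forming when the flag is set.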

I saw no performance effect beyond noise on SPEC2017 on Cortex-A72 or Cortex-A53.

        * config/aarch64/aarch64-tuning-flags.def (no_ldp_stp_qregs): New.
        * config/aarch64/aarch64.c (xgene1_tunings): Add
        AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS to tune_flags.
        (aarch64_mode_valid_for_sched_fusion_p):
        Allow 16-byte modes.
        (aarch64_classify_address): Allow 16-byte modes for load_store_pair_p.
        * config/aarch64/aarch64-ldpstp.md: Add peepholes for LDP STP of
        128-bit modes.
        * config/aarch64/aarch64-simd.md (load_pair<VQ:mode><VQ2:mode>):
        New pattern.
        (vec_store_pair<VQ:mode><VQ2:mode>): Likewise.
        * config/aarch64/iterators.md (VQ2): New mode iterator.

        * gcc.target/aarch64/ldp_stp_q.c: New test.
        * gcc.target/aarch64/stp_vec_128_1.c: Likewise.
        * gcc.target/aarch64/ldp_stp_q_disable.c: Likewise.

From-SVN: r261796
gcc/ChangeLog
gcc/config/aarch64/aarch64-ldpstp.md
gcc/config/aarch64/aarch64-simd.md
gcc/config/aarch64/aarch64-tuning-flags.def
gcc/config/aarch64/aarch64.c
gcc/config/aarch64/iterators.md
gcc/testsuite/ChangeLog
gcc/testsuite/gcc.target/aarch64/ldp_stp_q.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/ldp_stp_q_disable.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/stp_vec_128_1.c [new file with mode: 0644]