[AArch64] Rework SVE INC/DEC handling
authorRichard Sandiford <richard.sandiford@arm.com>
Thu, 15 Aug 2019 08:47:25 +0000 (08:47 +0000)
committerRichard Sandiford <rsandifo@gcc.gnu.org>
Thu, 15 Aug 2019 08:47:25 +0000 (08:47 +0000)
commit0fdc30bcf56d7b46122d7e67d61b56c0a198f3b3
tree2586c74f745eb5efeec23294c29e156dfd1d7e9b
parentd7a09c445a475a95559e8b9f29eb06ad92effa91
[AArch64] Rework SVE INC/DEC handling

The scalar addition patterns allowed all the VL constants that
ADDVL and ADDPL allow, but wrote the instructions as INC or DEC
if possible (i.e. adding or subtracting a number of elements * [1, 16]
when the source and target registers the same).  That works for the
cases that the autovectoriser needs, but there are a few constants
that INC and DEC can handle but ADDPL and ADDVL can't.  E.g.:

        inch    x0, all, mul #9

is not a multiple of the number of bytes in an SVE register, and so
can't use ADDVL.  It represents 36 times the number of bytes in an
SVE predicate, putting it outside the range of ADDPL.

This patch therefore adds separate alternatives for INC and DEC,
tied to a new Uai constraint.  It also adds an explicit "scalar"
or "vector" to the function names, to avoid a clash with the
existing support for vector INC and DEC.

2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
* config/aarch64/aarch64-protos.h
(aarch64_sve_scalar_inc_dec_immediate_p): Declare.
(aarch64_sve_inc_dec_immediate_p): Rename to...
(aarch64_sve_vector_inc_dec_immediate_p): ...this.
(aarch64_output_sve_addvl_addpl): Take a single rtx argument.
(aarch64_output_sve_scalar_inc_dec): Declare.
(aarch64_output_sve_inc_dec_immediate): Rename to...
(aarch64_output_sve_vector_inc_dec): ...this.
* config/aarch64/aarch64.c (aarch64_sve_scalar_inc_dec_immediate_p)
(aarch64_output_sve_scalar_inc_dec): New functions.
(aarch64_output_sve_addvl_addpl): Remove the base and offset
arguments.  Only handle true ADDVL and ADDPL instructions;
don't emit an INC or DEC.
(aarch64_sve_inc_dec_immediate_p): Rename to...
(aarch64_sve_vector_inc_dec_immediate_p): ...this.
(aarch64_output_sve_inc_dec_immediate): Rename to...
(aarch64_output_sve_vector_inc_dec): ...this.  Update call to
aarch64_sve_vector_inc_dec_immediate_p.
* config/aarch64/predicates.md (aarch64_sve_scalar_inc_dec_immediate)
(aarch64_sve_plus_immediate): New predicates.
(aarch64_pluslong_operand): Accept aarch64_sve_plus_immediate
rather than aarch64_sve_addvl_addpl_immediate.
(aarch64_sve_inc_dec_immediate): Rename to...
(aarch64_sve_vector_inc_dec_immediate): ...this.  Update call to
aarch64_sve_vector_inc_dec_immediate_p.
(aarch64_sve_add_operand): Update accordingly.
* config/aarch64/constraints.md (Uai): New constraint.
(vsi): Update call to aarch64_sve_vector_inc_dec_immediate_p.
* config/aarch64/aarch64.md (add<GPI:mode>3): Don't force the second
operand into a register if it satisfies aarch64_sve_plus_immediate.
(*add<GPI:mode>3_aarch64, *add<GPI:mode>3_poly_1): Add an alternative
for Uai.  Update calls to aarch64_output_sve_addvl_addpl.
* config/aarch64/aarch64-sve.md (add<mode>3): Call
aarch64_output_sve_vector_inc_dec instead of
aarch64_output_sve_inc_dec_immediate.

From-SVN: r274518
gcc/ChangeLog
gcc/config/aarch64/aarch64-protos.h
gcc/config/aarch64/aarch64-sve.md
gcc/config/aarch64/aarch64.c
gcc/config/aarch64/aarch64.md
gcc/config/aarch64/constraints.md
gcc/config/aarch64/predicates.md