[AArch64] Use SVE binary immediate instructions for conditional arithmetic
author Richard Sandiford <richard.sandiford@arm.com>
Thu, 15 Aug 2019 08:18:03 +0000 (08:18 +0000)
committer Richard Sandiford <rsandifo@gcc.gnu.org>
Thu, 15 Aug 2019 08:18:03 +0000 (08:18 +0000)
This patch lets us use the immediate forms of FADD, FSUB, FSUBR,
FMUL, FMAXNM and FMINNM for conditional arithmetic.  (We already
use them for normal unconditional arithmetic.)
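
As a rough illustration of what "conditional arithmetic" means here, the
sketch below has the same shape as the loops in the new cond_fadd_1.c test.
The function name cond_add_half is hypothetical and not part of the patch.
When compiled for SVE with -O2 -ftree-vectorize, that test expects the
addition in such a loop to become a single predicated FADD with a #0.5
immediate, with no separate constant load or SEL.

```c
#include <stdint.h>

/* Illustrative only: modelled on the DEF_LOOP macro in cond_fadd_1.c.
   The name cond_add_half is an assumption made for this sketch.  */
void
cond_add_half (float *restrict x, const float *restrict y,
               const int32_t *restrict pred, int n)
{
  for (int i = 0; i < n; ++i)
    /* Add 0.5 only where the predicate element is not 1;
       otherwise pass y[i] through unchanged.  */
    x[i] = pred[i] != 1 ? y[i] + 0.5f : y[i];
}
```

The inactive lanes take the value of the first input, which is the case
handled by the new *cond_add<mode>_2_const pattern.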

2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>

gcc/
* config/aarch64/aarch64.c (aarch64_print_vector_float_operand):
Print 2.0 naturally.
(aarch64_sve_float_mul_immediate_p): Return true for 2.0.
* config/aarch64/predicates.md
(aarch64_sve_float_negated_arith_immediate): New predicate,
renamed from aarch64_sve_float_arith_with_sub_immediate.
(aarch64_sve_float_arith_with_sub_immediate): Test for both
positive and negative constants.
(aarch64_sve_float_arith_with_sub_operand): Redefine as a register
or an aarch64_sve_float_arith_with_sub_immediate.
* config/aarch64/constraints.md (vsN): Use
aarch64_sve_float_negated_arith_immediate.
* config/aarch64/iterators.md (SVE_COND_FP_BINARY_I1): New int
iterator.
(sve_pred_fp_rhs2_immediate): New int attribute.
* config/aarch64/aarch64-sve.md
(cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>): Use
sve_pred_fp_rhs1_operand and sve_pred_fp_rhs2_operand.
(*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_2_const)
(*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_any_const)
(*cond_add<SVE_F:mode>_2_const, *cond_add<SVE_F:mode>_any_const)
(*cond_sub<mode>_3_const, *cond_sub<mode>_any_const): New patterns.

gcc/testsuite/
* gcc.target/aarch64/sve/cond_fadd_1.c: New test.
* gcc.target/aarch64/sve/cond_fadd_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_4_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_1.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_4_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_1.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_4_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_1.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_4_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_1.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_4_run.c: Likewise.

Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
From-SVN: r274508
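
The immediate ranges involved can be summarised with a small model. This is
only a sketch under stated assumptions, not the code in aarch64.c: the
function and enum names are hypothetical, the accepted values are taken from
the comments added to aarch64-sve.md (#0.0 and #1.0 for FMAXNM/FMINNM, #0.5
and #2.0 for FMUL) and from the cond_fadd tests (#0.5 and #1.0 for FADD, with
negative constants emitted as FSUB of the positive value), and signed zeros
are not modelled.

```c
/* Hypothetical model of which constants get an immediate form.
   None of these names exist in GCC.  */
enum imm_form
{
  IMM_NONE,     /* Constant must be loaded into a register.  */
  IMM_DIRECT,   /* Use the constant as-is, e.g. fadd ..., #0.5.  */
  IMM_NEGATED   /* Use the negated constant, e.g. fsub ..., #0.5.  */
};

/* Conditional addition of C: 0.5 and 1.0 map to FADD, -0.5 and -1.0
   map to FSUB of the positive value, anything else has no immediate.  */
enum imm_form
add_immediate_form (double c)
{
  if (c == 0.5 || c == 1.0)
    return IMM_DIRECT;
  if (c == -0.5 || c == -1.0)
    return IMM_NEGATED;
  return IMM_NONE;
}

/* Conditional multiplication: 0.5 and, after this patch, 2.0.  */
int
mul_immediate_p (double c)
{
  return c == 0.5 || c == 2.0;
}

/* Conditional FMAXNM/FMINNM: 0.0 and 1.0 (signed zero not modelled).  */
int
maxmin_immediate_p (double c)
{
  return c == 0.0 || c == 1.0;
}
```

Under this model, adding 2.0 or -2.0 falls back to a register operand, which
matches the FMOV plus register-form FADD that the cond_fadd tests scan for.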

47 files changed:
gcc/ChangeLog
gcc/config/aarch64/aarch64-sve.md
gcc/config/aarch64/aarch64.c
gcc/config/aarch64/constraints.md
gcc/config/aarch64/iterators.md
gcc/config/aarch64/predicates.md
gcc/testsuite/ChangeLog
gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_1_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_2_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_3_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_4_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_1_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_2_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_3_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_4_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_1_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_2_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_3_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_4_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_1_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_2_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_3_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_4_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_1_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_2_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_3.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_3_run.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_4_run.c [new file with mode: 0644]

index 593e8fb6e625b73e51303ac5e82f99d666fc3c41..52ab8e5d3702811a13a6764046345ed6e150bac8 100644 (file)
@@ -1,3 +1,29 @@
+2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
+           Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
+
+       * config/aarch64/aarch64.c (aarch64_print_vector_float_operand):
+       Print 2.0 naturally.
+       (aarch64_sve_float_mul_immediate_p): Return true for 2.0.
+       * config/aarch64/predicates.md
+       (aarch64_sve_float_negated_arith_immediate): New predicate,
+       renamed from aarch64_sve_float_arith_with_sub_immediate.
+       (aarch64_sve_float_arith_with_sub_immediate): Test for both
+       positive and negative constants.
+       (aarch64_sve_float_arith_with_sub_operand): Redefine as a register
+       or an aarch64_sve_float_arith_with_sub_immediate.
+       * config/aarch64/constraints.md (vsN): Use
+       aarch64_sve_float_negated_arith_immediate.
+       * config/aarch64/iterators.md (SVE_COND_FP_BINARY_I1): New int
+       iterator.
+       (sve_pred_fp_rhs2_immediate): New int attribute.
+       * config/aarch64/aarch64-sve.md
+       (cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>): Use
+       sve_pred_fp_rhs1_operand and sve_pred_fp_rhs2_operand.
+       (*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_2_const)
+       (*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_any_const)
+       (*cond_add<SVE_F:mode>_2_const, *cond_add<SVE_F:mode>_any_const)
+       (*cond_sub<mode>_3_const, *cond_sub<mode>_any_const): New patterns.
+
 2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
            Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
 
index b1e2f2434aeb06e69c90f92f386067074d3f2fde..d43ce521a799bf3385f0ca1ffd14599b18c27be3 100644 (file)
 ;; ---- [FP] General binary arithmetic corresponding to unspecs
 ;; -------------------------------------------------------------------------
 ;; Includes merging forms of:
-;; - FADD
+;; - FADD    (constant forms handled in the "Addition" section)
 ;; - FDIV
 ;; - FDIVR
-;; - FMAXNM
-;; - FMINNM
-;; - FMUL
-;; - FSUB
-;; - FSUBR
+;; - FMAXNM  (including #0.0 and #1.0)
+;; - FMINNM  (including #0.0 and #1.0)
+;; - FMUL    (including #0.5 and #2.0)
+;; - FSUB    (constant forms handled in the "Addition" section)
+;; - FSUBR   (constant forms handled in the "Subtraction" section)
 ;; -------------------------------------------------------------------------
 
 ;; Unpredicated floating-point binary operations.
           (unspec:SVE_F
             [(match_dup 1)
              (const_int SVE_STRICT_GP)
-             (match_operand:SVE_F 2 "register_operand")
-             (match_operand:SVE_F 3 "register_operand")]
+             (match_operand:SVE_F 2 "<sve_pred_fp_rhs1_operand>")
+             (match_operand:SVE_F 3 "<sve_pred_fp_rhs2_operand>")]
             SVE_COND_FP_BINARY)
           (match_operand:SVE_F 4 "aarch64_simd_reg_or_zero")]
          UNSPEC_SEL))]
   [(set_attr "movprfx" "*,yes")]
 )
 
+;; Same for operations that take a 1-bit constant.
+(define_insn_and_rewrite "*cond_<optab><mode>_2_const"
+  [(set (match_operand:SVE_F 0 "register_operand" "=w, ?w")
+       (unspec:SVE_F
+         [(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
+          (unspec:SVE_F
+            [(match_operand 4)
+             (match_operand:SI 5 "aarch64_sve_gp_strictness")
+             (match_operand:SVE_F 2 "register_operand" "0, w")
+             (match_operand:SVE_F 3 "<sve_pred_fp_rhs2_immediate>")]
+            SVE_COND_FP_BINARY_I1)
+          (match_dup 2)]
+         UNSPEC_SEL))]
+  "TARGET_SVE && aarch64_sve_pred_dominates_p (&operands[4], operands[1])"
+  "@
+   <sve_fp_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
+   movprfx\t%0, %2\;<sve_fp_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3"
+  "&& !rtx_equal_p (operands[1], operands[4])"
+  {
+    operands[4] = copy_rtx (operands[1]);
+  }
+  [(set_attr "movprfx" "*,yes")]
+)
+
 ;; Predicated floating-point operations, merging with the second input.
 (define_insn_and_rewrite "*cond_<optab><mode>_3"
   [(set (match_operand:SVE_F 0 "register_operand" "=w, ?&w")
   [(set_attr "movprfx" "yes")]
 )
 
+;; Same for operations that take a 1-bit constant.
+(define_insn_and_rewrite "*cond_<optab><mode>_any_const"
+  [(set (match_operand:SVE_F 0 "register_operand" "=w, w, ?w")
+       (unspec:SVE_F
+         [(match_operand:<VPRED> 1 "register_operand" "Upl, Upl, Upl")
+          (unspec:SVE_F
+            [(match_operand 5)
+             (match_operand:SI 6 "aarch64_sve_gp_strictness")
+             (match_operand:SVE_F 2 "register_operand" "w, w, w")
+             (match_operand:SVE_F 3 "<sve_pred_fp_rhs2_immediate>")]
+            SVE_COND_FP_BINARY_I1)
+          (match_operand:SVE_F 4 "aarch64_simd_reg_or_zero" "Dz, 0, w")]
+         UNSPEC_SEL))]
+  "TARGET_SVE
+   && !rtx_equal_p (operands[2], operands[4])
+   && aarch64_sve_pred_dominates_p (&operands[5], operands[1])"
+  "@
+   movprfx\t%0.<Vetype>, %1/z, %2.<Vetype>\;<sve_fp_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
+   movprfx\t%0.<Vetype>, %1/m, %2.<Vetype>\;<sve_fp_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
+   #"
+  "&& 1"
+  {
+    if (reload_completed
+        && register_operand (operands[4], <MODE>mode)
+        && !rtx_equal_p (operands[0], operands[4]))
+      {
+       emit_insn (gen_vcond_mask_<mode><vpred> (operands[0], operands[2],
+                                                operands[4], operands[1]));
+       operands[4] = operands[2] = operands[0];
+      }
+    else if (!rtx_equal_p (operands[1], operands[5]))
+      operands[5] = copy_rtx (operands[1]);
+    else
+      FAIL;
+  }
+  [(set_attr "movprfx" "yes")]
+)
+
 ;; -------------------------------------------------------------------------
 ;; ---- [FP] Addition
 ;; -------------------------------------------------------------------------
   [(set (match_dup 0) (plus:SVE_F (match_dup 2) (match_dup 3)))]
 )
 
-;; Merging forms are handled through SVE_COND_FP_BINARY.
+;; Predicated floating-point addition of a constant, merging with the
+;; first input.
+(define_insn_and_rewrite "*cond_add<mode>_2_const"
+  [(set (match_operand:SVE_F 0 "register_operand" "=w, w, ?w, ?w")
+       (unspec:SVE_F
+         [(match_operand:<VPRED> 1 "register_operand" "Upl, Upl, Upl, Upl")
+          (unspec:SVE_F
+            [(match_operand 4)
+             (match_operand:SI 5 "aarch64_sve_gp_strictness")
+             (match_operand:SVE_F 2 "register_operand" "0, 0, w, w")
+             (match_operand:SVE_F 3 "aarch64_sve_float_arith_with_sub_immediate" "vsA, vsN, vsA, vsN")]
+            UNSPEC_COND_FADD)
+          (match_dup 2)]
+         UNSPEC_SEL))]
+  "TARGET_SVE && aarch64_sve_pred_dominates_p (&operands[4], operands[1])"
+  "@
+   fadd\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
+   fsub\t%0.<Vetype>, %1/m, %0.<Vetype>, #%N3
+   movprfx\t%0, %2\;fadd\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
+   movprfx\t%0, %2\;fsub\t%0.<Vetype>, %1/m, %0.<Vetype>, #%N3"
+  "&& !rtx_equal_p (operands[1], operands[4])"
+  {
+    operands[4] = copy_rtx (operands[1]);
+  }
+  [(set_attr "movprfx" "*,*,yes,yes")]
+)
+
+;; Predicated floating-point addition of a constant, merging with an
+;; independent value.
+(define_insn_and_rewrite "*cond_add<mode>_any_const"
+  [(set (match_operand:SVE_F 0 "register_operand" "=w, w, w, w, ?w, ?w")
+       (unspec:SVE_F
+         [(match_operand:<VPRED> 1 "register_operand" "Upl, Upl, Upl, Upl, Upl, Upl")
+          (unspec:SVE_F
+            [(match_operand 5)
+             (match_operand:SI 6 "aarch64_sve_gp_strictness")
+             (match_operand:SVE_F 2 "register_operand" "w, w, w, w, w, w")
+             (match_operand:SVE_F 3 "aarch64_sve_float_arith_with_sub_immediate" "vsA, vsN, vsA, vsN, vsA, vsN")]
+            UNSPEC_COND_FADD)
+          (match_operand:SVE_F 4 "aarch64_simd_reg_or_zero" "Dz, Dz, 0, 0, w, w")]
+         UNSPEC_SEL))]
+  "TARGET_SVE
+   && !rtx_equal_p (operands[2], operands[4])
+   && aarch64_sve_pred_dominates_p (&operands[5], operands[1])"
+  "@
+   movprfx\t%0.<Vetype>, %1/z, %2.<Vetype>\;fadd\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
+   movprfx\t%0.<Vetype>, %1/z, %2.<Vetype>\;fsub\t%0.<Vetype>, %1/m, %0.<Vetype>, #%N3
+   movprfx\t%0.<Vetype>, %1/m, %2.<Vetype>\;fadd\t%0.<Vetype>, %1/m, %0.<Vetype>, #%3
+   movprfx\t%0.<Vetype>, %1/m, %2.<Vetype>\;fsub\t%0.<Vetype>, %1/m, %0.<Vetype>, #%N3
+   #
+   #"
+  "&& 1"
+  {
+    if (reload_completed
+        && register_operand (operands[4], <MODE>mode)
+        && !rtx_equal_p (operands[0], operands[4]))
+      {
+       emit_insn (gen_vcond_mask_<mode><vpred> (operands[0], operands[2],
+                                                operands[4], operands[1]));
+       operands[4] = operands[2] = operands[0];
+      }
+    else if (!rtx_equal_p (operands[1], operands[5]))
+      operands[5] = copy_rtx (operands[1]);
+    else
+      FAIL;
+  }
+  [(set_attr "movprfx" "yes")]
+)
+
+;; Register merging forms are handled through SVE_COND_FP_BINARY.
 
 ;; -------------------------------------------------------------------------
 ;; ---- [FP] Subtraction
   [(set (match_dup 0) (minus:SVE_F (match_dup 2) (match_dup 3)))]
 )
 
-;; Merging forms are handled through SVE_COND_FP_BINARY.
+;; Predicated floating-point subtraction from a constant, merging with the
+;; second input.
+(define_insn_and_rewrite "*cond_sub<mode>_3_const"
+  [(set (match_operand:SVE_F 0 "register_operand" "=w, ?w")
+       (unspec:SVE_F
+         [(match_operand:<VPRED> 1 "register_operand" "Upl, Upl")
+          (unspec:SVE_F
+            [(match_operand 4)
+             (match_operand:SI 5 "aarch64_sve_gp_strictness")
+             (match_operand:SVE_F 2 "aarch64_sve_float_arith_immediate")
+             (match_operand:SVE_F 3 "register_operand" "0, w")]
+            UNSPEC_COND_FSUB)
+          (match_dup 3)]
+         UNSPEC_SEL))]
+  "TARGET_SVE && aarch64_sve_pred_dominates_p (&operands[4], operands[1])"
+  "@
+   fsubr\t%0.<Vetype>, %1/m, %0.<Vetype>, #%2
+   movprfx\t%0, %3\;fsubr\t%0.<Vetype>, %1/m, %0.<Vetype>, #%2"
+  "&& !rtx_equal_p (operands[1], operands[4])"
+  {
+    operands[4] = copy_rtx (operands[1]);
+  }
+  [(set_attr "movprfx" "*,yes")]
+)
+
+;; Predicated floating-point subtraction from a constant, merging with an
+;; independent value.
+(define_insn_and_rewrite "*cond_sub<mode>_any_const"
+  [(set (match_operand:SVE_F 0 "register_operand" "=w, w, ?w")
+       (unspec:SVE_F
+         [(match_operand:<VPRED> 1 "register_operand" "Upl, Upl, Upl")
+          (unspec:SVE_F
+            [(match_operand 5)
+             (match_operand:SI 6 "aarch64_sve_gp_strictness")
+             (match_operand:SVE_F 2 "aarch64_sve_float_arith_immediate")
+             (match_operand:SVE_F 3 "register_operand" "w, w, w")]
+            UNSPEC_COND_FSUB)
+          (match_operand:SVE_F 4 "aarch64_simd_reg_or_zero" "Dz, 0, w")]
+         UNSPEC_SEL))]
+  "TARGET_SVE
+   && !rtx_equal_p (operands[3], operands[4])
+   && aarch64_sve_pred_dominates_p (&operands[5], operands[1])"
+  "@
+   movprfx\t%0.<Vetype>, %1/z, %3.<Vetype>\;fsubr\t%0.<Vetype>, %1/m, %0.<Vetype>, #%2
+   movprfx\t%0.<Vetype>, %1/m, %3.<Vetype>\;fsubr\t%0.<Vetype>, %1/m, %0.<Vetype>, #%2
+   #"
+  "&& 1"
+  {
+    if (reload_completed
+        && register_operand (operands[4], <MODE>mode)
+        && !rtx_equal_p (operands[0], operands[4]))
+      {
+       emit_insn (gen_vcond_mask_<mode><vpred> (operands[0], operands[3],
+                                                operands[4], operands[1]));
+       operands[4] = operands[3] = operands[0];
+      }
+    else if (!rtx_equal_p (operands[1], operands[5]))
+      operands[5] = copy_rtx (operands[1]);
+    else
+      FAIL;
+  }
+  [(set_attr "movprfx" "yes")]
+)
+
+;; Register merging forms are handled through SVE_COND_FP_BINARY.
 
 ;; -------------------------------------------------------------------------
 ;; ---- [FP] Absolute difference
   [(set (match_dup 0) (mult:SVE_F (match_dup 2) (match_dup 3)))]
 )
 
-;; Merging forms are handled through SVE_COND_FP_BINARY.
+;; Merging forms are handled through SVE_COND_FP_BINARY and
+;; SVE_COND_FP_BINARY_I1.
 
 ;; -------------------------------------------------------------------------
 ;; ---- [FP] Binary logical operations
   [(set_attr "movprfx" "*,*,yes,yes")]
 )
 
-;; Merging forms are handled through SVE_COND_FP_BINARY.
+;; Merging forms are handled through SVE_COND_FP_BINARY and
+;; SVE_COND_FP_BINARY_I1.
 
 ;; -------------------------------------------------------------------------
 ;; ---- [PRED] Binary logical operations
index 8e392257be92dd543b2c3f6c096fb15c2fc040b7..81a267bde54372d999f4851d537d6885f9abd4ce 100644 (file)
@@ -8289,6 +8289,8 @@ aarch64_print_vector_float_operand (FILE *f, rtx x, bool negate)
      fixed form in the assembly syntax.  */
   if (real_equal (&r, &dconst0))
     asm_fprintf (f, "0.0");
+  else if (real_equal (&r, &dconst2))
+    asm_fprintf (f, "2.0");
   else if (real_equal (&r, &dconst1))
     asm_fprintf (f, "1.0");
   else if (real_equal (&r, &dconsthalf))
@@ -15205,11 +15207,10 @@ aarch64_sve_float_mul_immediate_p (rtx x)
 {
   rtx elt;
 
-  /* GCC will never generate a multiply with an immediate of 2, so there is no
-     point testing for it (even though it is a valid constant).  */
   return (const_vec_duplicate_p (x, &elt)
          && GET_CODE (elt) == CONST_DOUBLE
-         && real_equal (CONST_DOUBLE_REAL_VALUE (elt), &dconsthalf));
+         && (real_equal (CONST_DOUBLE_REAL_VALUE (elt), &dconsthalf)
+             || real_equal (CONST_DOUBLE_REAL_VALUE (elt), &dconst2)));
 }
 
 /* Return true if replicating VAL32 is a valid 2-byte or 4-byte immediate
index 28734b46009a09d60a0c34b58e1ac73471cc3f97..bb2fe8eb1f08f95a942a03ac796039cf2b520108 100644 (file)
 (define_constraint "vsN"
   "@internal
    A constraint that matches the negative of vsA"
- (match_operand 0 "aarch64_sve_float_arith_with_sub_immediate"))
+ (match_operand 0 "aarch64_sve_float_negated_arith_immediate"))
index 81dddb7b3a3cc7aac5ea217154193ec82277212a..31878434c29daf1409cf51607260f2e8b0485933 100644 (file)
                                         UNSPEC_COND_FMUL
                                         UNSPEC_COND_FSUB])
 
+(define_int_iterator SVE_COND_FP_BINARY_I1 [UNSPEC_COND_FMAXNM
+                                           UNSPEC_COND_FMINNM
+                                           UNSPEC_COND_FMUL])
+
 (define_int_iterator SVE_COND_FP_BINARY_REG [UNSPEC_COND_FDIV])
 
 ;; Floating-point max/min operations that correspond to optabs,
    (UNSPEC_COND_FMINNM "aarch64_sve_float_maxmin_operand")
    (UNSPEC_COND_FMUL "aarch64_sve_float_mul_operand")
    (UNSPEC_COND_FSUB "register_operand")])
+
+;; Likewise for immediates only.
+(define_int_attr sve_pred_fp_rhs2_immediate
+  [(UNSPEC_COND_FMAXNM "aarch64_sve_float_maxmin_immediate")
+   (UNSPEC_COND_FMINNM "aarch64_sve_float_maxmin_immediate")
+   (UNSPEC_COND_FMUL "aarch64_sve_float_mul_immediate")])
index 1a47708c327d1a86cba5c21e62f3b21680d30c58..98f28f5ff8b0162bc9347291ff92ff631bfd5651 100644 (file)
   (and (match_code "const,const_vector")
        (match_test "aarch64_sve_float_arith_immediate_p (op, false)")))
 
-(define_predicate "aarch64_sve_float_arith_with_sub_immediate"
+(define_predicate "aarch64_sve_float_negated_arith_immediate"
   (and (match_code "const,const_vector")
        (match_test "aarch64_sve_float_arith_immediate_p (op, true)")))
 
+(define_predicate "aarch64_sve_float_arith_with_sub_immediate"
+  (ior (match_operand 0 "aarch64_sve_float_arith_immediate")
+       (match_operand 0 "aarch64_sve_float_negated_arith_immediate")))
+
 (define_predicate "aarch64_sve_float_mul_immediate"
   (and (match_code "const,const_vector")
        (match_test "aarch64_sve_float_mul_immediate_p (op)")))
        (match_operand 0 "aarch64_sve_float_arith_immediate")))
 
 (define_predicate "aarch64_sve_float_arith_with_sub_operand"
-  (ior (match_operand 0 "aarch64_sve_float_arith_operand")
+  (ior (match_operand 0 "register_operand")
        (match_operand 0 "aarch64_sve_float_arith_with_sub_immediate")))
 
 (define_predicate "aarch64_sve_float_mul_operand"
index 62d5369724db6c7b0393a08b88ad0c9c50e9deff..aad8e04c32ca396b7f682368469a58798ca212be 100644 (file)
@@ -1,3 +1,47 @@
+2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
+           Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
+
+       * gcc.target/aarch64/sve/cond_fadd_1.c: New test.
+       * gcc.target/aarch64/sve/cond_fadd_1_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fadd_2.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fadd_2_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fadd_3.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fadd_3_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fadd_4.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fadd_4_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fsubr_1.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fsubr_1_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fsubr_2.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fsubr_2_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fsubr_3.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fsubr_3_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fsubr_4.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fsubr_4_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmaxnm_1.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmaxnm_1_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmaxnm_2.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmaxnm_2_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmaxnm_3.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmaxnm_3_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmaxnm_4.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmaxnm_4_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fminnm_1.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fminnm_1_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fminnm_2.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fminnm_2_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fminnm_3.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fminnm_3_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fminnm_4.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fminnm_4_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmul_1.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmul_1_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmul_2.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmul_2_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmul_3.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmul_3_run.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmul_4.c: Likewise.
+       * gcc.target/aarch64/sve/cond_fmul_4_run.c: Likewise.
+
 2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
            Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_1.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_1.c
new file mode 100644 (file)
index 0000000..d103e1f
--- /dev/null
@@ -0,0 +1,62 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include <stdint.h>
+
+#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST)         \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                       TYPE *__restrict y,             \
+                       PRED_TYPE *__restrict pred,     \
+                       int n)                          \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                                \
+      x[i] = pred[i] != 1 ? y[i] + (TYPE) CONST : y[i];        \
+  }
+
+#define TEST_TYPE(T, TYPE, PRED_TYPE) \
+  T (TYPE, PRED_TYPE, half, 0.5) \
+  T (TYPE, PRED_TYPE, one, 1.0) \
+  T (TYPE, PRED_TYPE, two, 2.0) \
+  T (TYPE, PRED_TYPE, minus_half, -0.5) \
+  T (TYPE, PRED_TYPE, minus_one, -1.0) \
+  T (TYPE, PRED_TYPE, minus_two, -2.0)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, _Float16, int16_t) \
+  TEST_TYPE (T, float, int32_t) \
+  TEST_TYPE (T, double, int64_t)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #-2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_1_run.c
new file mode 100644 (file)
index 0000000..956ae14
--- /dev/null
@@ -0,0 +1,32 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include "cond_fadd_1.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST)                                \
+  {                                                                    \
+    TYPE x[N], y[N];                                                   \
+    PRED_TYPE pred[N];                                                 \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       y[i] = i * i;                                                   \
+       pred[i] = i % 3;                                                \
+      }                                                                        \
+    test_##TYPE##_##NAME (x, y, pred, N);                              \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       TYPE expected = i % 3 != 1 ? y[i] + (TYPE) CONST : y[i];        \
+       if (x[i] != expected)                                           \
+         __builtin_abort ();                                           \
+       asm volatile ("" ::: "memory");                                 \
+      }                                                                        \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_2.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_2.c
new file mode 100644 (file)
index 0000000..b7d02f4
--- /dev/null
@@ -0,0 +1,56 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include <stdint.h>
+
+#define DEF_LOOP(TYPE, NAME, CONST)                    \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                       TYPE *__restrict y,             \
+                       TYPE *__restrict z,             \
+                       int n)                          \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                                \
+      x[i] = y[i] < 8 ? z[i] + (TYPE) CONST : y[i];    \
+  }
+
+#define TEST_TYPE(T, TYPE) \
+  T (TYPE, half, 0.5) \
+  T (TYPE, one, 1.0) \
+  T (TYPE, two, 2.0) \
+  T (TYPE, minus_half, -0.5) \
+  T (TYPE, minus_one, -1.0) \
+  T (TYPE, minus_two, -2.0)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, float) \
+  TEST_TYPE (T, double)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
+
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 6 } } */
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 6 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_2_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_2_run.c
new file mode 100644
index 0000000..debf395
--- /dev/null
@@ -0,0 +1,31 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include "cond_fadd_2.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, NAME, CONST)                                   \
+  {                                                                    \
+    TYPE x[N], y[N], z[N];                                             \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       y[i] = i % 13;                                                  \
+       z[i] = i * i;                                                   \
+      }                                                                        \
+    test_##TYPE##_##NAME (x, y, z, N);                                 \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       TYPE expected = y[i] < 8 ? z[i] + (TYPE) CONST : y[i];          \
+       if (x[i] != expected)                                           \
+         __builtin_abort ();                                           \
+       asm volatile ("" ::: "memory");                                 \
+      }                                                                        \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_3.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_3.c
new file mode 100644
index 0000000..aec0e5a
--- /dev/null
@@ -0,0 +1,65 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include <stdint.h>
+
+#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST)         \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                       TYPE *__restrict y,             \
+                       PRED_TYPE *__restrict pred,     \
+                       int n)                          \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                                \
+      x[i] = pred[i] != 1 ? y[i] + (TYPE) CONST : 4;   \
+  }
+
+#define TEST_TYPE(T, TYPE, PRED_TYPE) \
+  T (TYPE, PRED_TYPE, half, 0.5) \
+  T (TYPE, PRED_TYPE, one, 1.0) \
+  T (TYPE, PRED_TYPE, two, 2.0) \
+  T (TYPE, PRED_TYPE, minus_half, -0.5) \
+  T (TYPE, PRED_TYPE, minus_one, -1.0) \
+  T (TYPE, PRED_TYPE, minus_two, -2.0)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, _Float16, int16_t) \
+  TEST_TYPE (T, float, int32_t) \
+  TEST_TYPE (T, double, int64_t)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #-2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
+
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 6 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s, p[0-7], z[0-9]+\.s, z[0-9]+\.s\n} 6 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d, p[0-7], z[0-9]+\.d, z[0-9]+\.d\n} 6 } } */
+
+/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_3_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_3_run.c
new file mode 100644
index 0000000..d5268c5
--- /dev/null
@@ -0,0 +1,32 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include "cond_fadd_3.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST)                        \
+  {                                                            \
+    TYPE x[N], y[N];                                           \
+    PRED_TYPE pred[N];                                         \
+    for (int i = 0; i < N; ++i)                                        \
+      {                                                                \
+       y[i] = i * i;                                           \
+       pred[i] = i % 3;                                        \
+      }                                                                \
+    test_##TYPE##_##NAME (x, y, pred, N);                      \
+    for (int i = 0; i < N; ++i)                                        \
+      {                                                                \
+       TYPE expected = i % 3 != 1 ? y[i] + (TYPE) CONST : 4;   \
+       if (x[i] != expected)                                   \
+         __builtin_abort ();                                   \
+       asm volatile ("" ::: "memory");                         \
+      }                                                                \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_4.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_4.c
new file mode 100644
index 0000000..bb276c1
--- /dev/null
@@ -0,0 +1,64 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include <stdint.h>
+
+#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST)         \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                       TYPE *__restrict y,             \
+                       PRED_TYPE *__restrict pred,     \
+                       int n)                          \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                                \
+      x[i] = pred[i] != 1 ? y[i] + (TYPE) CONST : 0;   \
+  }
+
+#define TEST_TYPE(T, TYPE, PRED_TYPE) \
+  T (TYPE, PRED_TYPE, half, 0.5) \
+  T (TYPE, PRED_TYPE, one, 1.0) \
+  T (TYPE, PRED_TYPE, two, 2.0) \
+  T (TYPE, PRED_TYPE, minus_half, -0.5) \
+  T (TYPE, PRED_TYPE, minus_one, -1.0) \
+  T (TYPE, PRED_TYPE, minus_two, -2.0)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, _Float16, int16_t) \
+  TEST_TYPE (T, float, int32_t) \
+  TEST_TYPE (T, double, int64_t)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #-2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
+/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
+
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 6 } } */
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 6 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_4_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fadd_4_run.c
new file mode 100644
index 0000000..4ea8be6
--- /dev/null
@@ -0,0 +1,32 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include "cond_fadd_4.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST)                                \
+  {                                                                    \
+    TYPE x[N], y[N];                                                   \
+    PRED_TYPE pred[N];                                                 \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       y[i] = i * i;                                                   \
+       pred[i] = i % 3;                                                \
+      }                                                                        \
+    test_##TYPE##_##NAME (x, y, pred, N);                              \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       TYPE expected = i % 3 != 1 ? y[i] + (TYPE) CONST : 0;           \
+       if (x[i] != expected)                                           \
+         __builtin_abort ();                                           \
+       asm volatile ("" ::: "memory");                                 \
+      }                                                                        \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_1.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_1.c
new file mode 100644
index 0000000..d0db090
--- /dev/null
@@ -0,0 +1,55 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#include <stdint.h>
+
+#ifndef FN
+#define FN(X) __builtin_fmax##X
+#endif
+
+#define DEF_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST)     \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                       TYPE *__restrict y,             \
+                       PRED_TYPE *__restrict pred,     \
+                       int n)                          \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                                \
+      x[i] = pred[i] != 1 ? FN (y[i], CONST) : y[i];   \
+  }
+
+#define TEST_TYPE(T, FN, TYPE, PRED_TYPE) \
+  T (FN, TYPE, PRED_TYPE, zero, 0) \
+  T (FN, TYPE, PRED_TYPE, one, 1) \
+  T (FN, TYPE, PRED_TYPE, two, 2)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, FN (f16), _Float16, int16_t) \
+  TEST_TYPE (T, FN (f32), float, int32_t) \
+  TEST_TYPE (T, FN (f64), double, int64_t)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_1_run.c
new file mode 100644
index 0000000..00a3c41
--- /dev/null
@@ -0,0 +1,32 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#include "cond_fmaxnm_1.c"
+
+#define N 99
+
+#define TEST_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST)                    \
+  {                                                                    \
+    TYPE x[N], y[N];                                                   \
+    PRED_TYPE pred[N];                                                 \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       y[i] = i * i;                                                   \
+       pred[i] = i % 3;                                                \
+      }                                                                        \
+    test_##TYPE##_##NAME (x, y, pred, N);                              \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       TYPE expected = i % 3 != 1 ? FN (y[i], CONST) : y[i];           \
+       if (x[i] != expected)                                           \
+         __builtin_abort ();                                           \
+       asm volatile ("" ::: "memory");                                 \
+      }                                                                        \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_2.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_2.c
new file mode 100644
index 0000000..0b535d1
--- /dev/null
@@ -0,0 +1,48 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#include <stdint.h>
+
+#ifndef FN
+#define FN(X) __builtin_fmax##X
+#endif
+
+#define DEF_LOOP(FN, TYPE, NAME, CONST)                        \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                       TYPE *__restrict y,             \
+                       TYPE *__restrict z,             \
+                       int n)                          \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                                \
+      x[i] = y[i] < 8 ? FN (z[i], CONST) : y[i];       \
+  }
+
+#define TEST_TYPE(T, FN, TYPE) \
+  T (FN, TYPE, zero, 0) \
+  T (FN, TYPE, one, 1) \
+  T (FN, TYPE, two, 2)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, FN (f32), float) \
+  TEST_TYPE (T, FN (f64), double)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_2_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_2_run.c
new file mode 100644
index 0000000..9eb4d80
--- /dev/null
@@ -0,0 +1,31 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#include "cond_fmaxnm_2.c"
+
+#define N 99
+
+#define TEST_LOOP(FN, TYPE, NAME, CONST)                               \
+  {                                                                    \
+    TYPE x[N], y[N], z[N];                                             \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       y[i] = i % 13;                                                  \
+       z[i] = i * i;                                                   \
+      }                                                                        \
+    test_##TYPE##_##NAME (x, y, z, N);                                 \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       TYPE expected = y[i] < 8 ? FN (z[i], CONST) : y[i];             \
+       if (x[i] != expected)                                           \
+         __builtin_abort ();                                           \
+       asm volatile ("" ::: "memory");                                 \
+      }                                                                        \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_3.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_3.c
new file mode 100644
index 0000000..741f8f6
--- /dev/null
@@ -0,0 +1,54 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#include <stdint.h>
+
+#ifndef FN
+#define FN(X) __builtin_fmax##X
+#endif
+
+#define DEF_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST)     \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                       TYPE *__restrict y,             \
+                       PRED_TYPE *__restrict pred,     \
+                       int n)                          \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                                \
+      x[i] = pred[i] != 1 ? FN (y[i], CONST) : 4;      \
+  }
+
+#define TEST_TYPE(T, FN, TYPE, PRED_TYPE) \
+  T (FN, TYPE, PRED_TYPE, zero, 0) \
+  T (FN, TYPE, PRED_TYPE, one, 1) \
+  T (FN, TYPE, PRED_TYPE, two, 2)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, FN (f16), _Float16, int16_t) \
+  TEST_TYPE (T, FN (f32), float, int32_t) \
+  TEST_TYPE (T, FN (f64), double, int64_t)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s, p[0-7], z[0-9]+\.s, z[0-9]+\.s\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d, p[0-7], z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */
+
+/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_3_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_3_run.c
new file mode 100644
index 0000000..4aac75f
--- /dev/null
@@ -0,0 +1,32 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#include "cond_fmaxnm_3.c"
+
+#define N 99
+
+#define TEST_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST)            \
+  {                                                            \
+    TYPE x[N], y[N];                                           \
+    PRED_TYPE pred[N];                                         \
+    for (int i = 0; i < N; ++i)                                        \
+      {                                                                \
+       y[i] = i * i;                                           \
+       pred[i] = i % 3;                                        \
+      }                                                                \
+    test_##TYPE##_##NAME (x, y, pred, N);                      \
+    for (int i = 0; i < N; ++i)                                        \
+      {                                                                \
+       TYPE expected = i % 3 != 1 ? FN (y[i], CONST) : 4;      \
+       if (x[i] != expected)                                   \
+         __builtin_abort ();                                   \
+       asm volatile ("" ::: "memory");                         \
+      }                                                                \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_4.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_4.c
new file mode 100644
index 0000000..83a53c7
--- /dev/null
@@ -0,0 +1,53 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#include <stdint.h>
+
+#ifndef FN
+#define FN(X) __builtin_fmax##X
+#endif
+
+#define DEF_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST)     \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                       TYPE *__restrict y,             \
+                       PRED_TYPE *__restrict pred,     \
+                       int n)                          \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                                \
+      x[i] = pred[i] != 1 ? FN (y[i], CONST) : 0;      \
+  }
+
+#define TEST_TYPE(T, FN, TYPE, PRED_TYPE) \
+  T (FN, TYPE, PRED_TYPE, zero, 0) \
+  T (FN, TYPE, PRED_TYPE, one, 1) \
+  T (FN, TYPE, PRED_TYPE, two, 2)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, FN (f16), _Float16, int16_t) \
+  TEST_TYPE (T, FN (f32), float, int32_t) \
+  TEST_TYPE (T, FN (f64), double, int64_t)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_4_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmaxnm_4_run.c
new file mode 100644
index 0000000..e1d9043
--- /dev/null
@@ -0,0 +1,32 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#include "cond_fmaxnm_4.c"
+
+#define N 99
+
+#define TEST_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST)                    \
+  {                                                                    \
+    TYPE x[N], y[N];                                                   \
+    PRED_TYPE pred[N];                                                 \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       y[i] = i * i;                                                   \
+       pred[i] = i % 3;                                                \
+      }                                                                        \
+    test_##TYPE##_##NAME (x, y, pred, N);                              \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       TYPE expected = i % 3 != 1 ? FN (y[i], CONST) : 0;              \
+       if (x[i] != expected)                                           \
+         __builtin_abort ();                                           \
+       asm volatile ("" ::: "memory");                                 \
+      }                                                                        \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_1.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_1.c
new file mode 100644
index 0000000..d667b20
--- /dev/null
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#define FN(X) __builtin_fmin##X
+#include "cond_fmaxnm_1.c"
+
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_1_run.c
new file mode 100644
index 0000000..5df2ff8
--- /dev/null
@@ -0,0 +1,5 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#define FN(X) __builtin_fmin##X
+#include "cond_fmaxnm_1_run.c"
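Note on the cond_fminnm_* wrappers: each one defines FN and then includes the matching cond_fmaxnm_* source, so the same loops are compiled with fmin instead of fmax. The sketch below shows that override pattern in isolation. It assumes the shared file guards its default with #ifndef FN (that file is not part of this section), and sketch_min_*, sketch_max_* and apply_* are hypothetical stand-ins for the builtins, not names from the patch.

```c
/* Hypothetical stand-ins for __builtin_fmax* / __builtin_fmin*.  */
static inline float sketch_max_f32 (float a, float b) { return a > b ? a : b; }
static inline float sketch_min_f32 (float a, float b) { return a < b ? a : b; }
static inline double sketch_max_f64 (double a, double b) { return a > b ? a : b; }
static inline double sketch_min_f64 (double a, double b) { return a < b ? a : b; }

/* What a wrapper file does before the #include: pick the "min" family.  */
#define FN(X) sketch_min_##X

/* What the shared file is assumed to do: only supply a default.  */
#ifndef FN
#define FN(X) sketch_max_##X
#endif

/* FN (f32) pastes to sketch_min_f32 here because the wrapper got in first.  */
static float apply_f32 (float a, float b) { return FN (f32) (a, b); }
static double apply_f64 (double a, double b) { return FN (f64) (a, b); }
```

With the first #define removed, the same two functions would select the "max" helpers instead, which is how one source can serve both test families.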
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_2.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_2.c
new file mode 100644
index 0000000..d66a84b
--- /dev/null
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#define FN(X) __builtin_fmin##X
+#include "cond_fmaxnm_2.c"
+
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_2_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_2_run.c
new file mode 100644
index 0000000..79a98bb
--- /dev/null
@@ -0,0 +1,5 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#define FN(X) __builtin_fmin##X
+#include "cond_fmaxnm_2_run.c"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_3.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_3.c
new file mode 100644
index 0000000..d39dd18
--- /dev/null
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#define FN(X) __builtin_fmin##X
+#include "cond_fmaxnm_3.c"
+
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s, p[0-7], z[0-9]+\.s, z[0-9]+\.s\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d, p[0-7], z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */
+
+/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_3_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_3_run.c
new file mode 100644
index 0000000..ca1a047
--- /dev/null
@@ -0,0 +1,5 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#define FN(X) __builtin_fmin##X
+#include "cond_fmaxnm_3_run.c"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_4.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_4.c
new file mode 100644
index 0000000..fff6fdd
--- /dev/null
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#define FN(X) __builtin_fmin##X
+#include "cond_fmaxnm_4.c"
+
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_4_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fminnm_4_run.c
new file mode 100644
index 0000000..b945d04
--- /dev/null
@@ -0,0 +1,5 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
+
+#define FN(X) __builtin_fmin##X
+#include "cond_fmaxnm_4_run.c"
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_1.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_1.c
new file mode 100644
index 0000000..ce417ed
--- /dev/null
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include <stdint.h>
+
+#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST)         \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                       TYPE *__restrict y,             \
+                       PRED_TYPE *__restrict pred,     \
+                       int n)                          \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                                \
+      x[i] = pred[i] != 1 ? y[i] * (TYPE) CONST : y[i];        \
+  }
+
+#define TEST_TYPE(T, TYPE, PRED_TYPE) \
+  T (TYPE, PRED_TYPE, half, 0.5) \
+  T (TYPE, PRED_TYPE, two, 2.0) \
+  T (TYPE, PRED_TYPE, four, 4.0)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, _Float16, int16_t) \
+  TEST_TYPE (T, float, int32_t) \
+  TEST_TYPE (T, double, int64_t)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #4\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
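Note on the constants: the scan-assembler directives in the cond_fminnm, cond_fmul and cond_fsubr tests expect an immediate operand only for some constants and an fmov into a register for the rest. The sketch below just records that expectation as read off the directives in this patch; the enum and function names are hypothetical, and it is not how the compiler's predicates are written.

```c
/* Operations whose expectations are visible in this part of the patch.  */
enum sve_cond_op { OP_FMINNM, OP_FMUL, OP_FSUBR };

/* Return nonzero if the tests expect constant C to be encoded as an
   immediate for OP, zero if they expect an fmov plus the register form.
   The plain == comparison does not distinguish -0.0 from 0.0.  */
static int
expects_immediate_form (enum sve_cond_op op, double c)
{
  switch (op)
    {
    case OP_FMINNM:
      return c == 0.0 || c == 1.0;   /* 2.0 goes through fmov.  */
    case OP_FMUL:
      return c == 0.5 || c == 2.0;   /* 4.0 goes through fmov.  */
    case OP_FSUBR:
      return c == 0.5 || c == 1.0;   /* 2.0 goes through fmov.  */
    }
  return 0;
}
```

This is why each test uses three constants: two that should hit the immediate pattern and one that should fall back to the register pattern.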
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_1_run.c
new file mode 100644
index 0000000..9ca5b50
--- /dev/null
@@ -0,0 +1,32 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include "cond_fmul_1.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST)                                \
+  {                                                                    \
+    TYPE x[N], y[N];                                                   \
+    PRED_TYPE pred[N];                                                 \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       y[i] = i * i;                                                   \
+       pred[i] = i % 3;                                                \
+      }                                                                        \
+    test_##TYPE##_##NAME (x, y, pred, N);                              \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       TYPE expected = i % 3 != 1 ? y[i] * (TYPE) CONST : y[i];        \
+       if (x[i] != expected)                                           \
+         __builtin_abort ();                                           \
+       asm volatile ("" ::: "memory");                                 \
+      }                                                                        \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
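Note on the run tests: they expand the same TEST_ALL list twice, once with DEF_LOOP to define one function per type and constant, and once with TEST_LOOP to call and check each of them. The sketch below is a cut-down, host-portable version of that scheme. It drops _Float16, counts mismatches instead of calling __builtin_abort, and CHECK_LOOP and count_mismatches are hypothetical names introduced for illustration.

```c
#include <stdint.h>

#define N 99

/* First expansion: define test_<TYPE>_<NAME> for every list entry.  */
#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST)                  \
  static void                                                   \
  test_##TYPE##_##NAME (TYPE *x, const TYPE *y,                 \
                        const PRED_TYPE *pred, int n)           \
  {                                                             \
    for (int i = 0; i < n; ++i)                                 \
      x[i] = pred[i] != 1 ? y[i] * (TYPE) CONST : y[i];         \
  }

#define TEST_TYPE(T, TYPE, PRED_TYPE) \
  T (TYPE, PRED_TYPE, half, 0.5) \
  T (TYPE, PRED_TYPE, two, 2.0) \
  T (TYPE, PRED_TYPE, four, 4.0)

#define TEST_ALL(T) \
  TEST_TYPE (T, float, int32_t) \
  TEST_TYPE (T, double, int64_t)

TEST_ALL (DEF_LOOP)

/* Second expansion: call each function and compare with a scalar
   recomputation of the same expression.  */
#define CHECK_LOOP(TYPE, PRED_TYPE, NAME, CONST)                        \
  {                                                                     \
    TYPE x[N], y[N];                                                    \
    PRED_TYPE pred[N];                                                  \
    for (int i = 0; i < N; ++i)                                         \
      {                                                                 \
        y[i] = i * i;                                                   \
        pred[i] = i % 3;                                                \
      }                                                                 \
    test_##TYPE##_##NAME (x, y, pred, N);                               \
    for (int i = 0; i < N; ++i)                                         \
      {                                                                 \
        TYPE expected = i % 3 != 1 ? y[i] * (TYPE) CONST : y[i];        \
        if (x[i] != expected)                                           \
          ++mismatches;                                                 \
      }                                                                 \
  }

static int
count_mismatches (void)
{
  int mismatches = 0;
  TEST_ALL (CHECK_LOOP)
  return mismatches;
}
```

Keeping the list in one macro means a new constant or type only has to be added in one place for both the compile test and the run test to pick it up.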
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_2.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_2.c
new file mode 100644
index 0000000..cbf9d13
--- /dev/null
@@ -0,0 +1,44 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include <stdint.h>
+
+#define DEF_LOOP(TYPE, NAME, CONST)                    \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                       TYPE *__restrict y,             \
+                       TYPE *__restrict z,             \
+                       int n)                          \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                                \
+      x[i] = y[i] < 8 ? z[i] * (TYPE) CONST : y[i];    \
+  }
+
+#define TEST_TYPE(T, TYPE) \
+  T (TYPE, half, 0.5) \
+  T (TYPE, two, 2.0) \
+  T (TYPE, four, 4.0)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, float) \
+  TEST_TYPE (T, double)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_2_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_2_run.c
new file mode 100644
index 0000000..44b283b
--- /dev/null
@@ -0,0 +1,31 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include "cond_fmul_2.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, NAME, CONST)                                   \
+  {                                                                    \
+    TYPE x[N], y[N], z[N];                                             \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       y[i] = i % 13;                                                  \
+       z[i] = i * i;                                                   \
+      }                                                                        \
+    test_##TYPE##_##NAME (x, y, z, N);                                 \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       TYPE expected = y[i] < 8 ? z[i] * (TYPE) CONST : y[i];          \
+       if (x[i] != expected)                                           \
+         __builtin_abort ();                                           \
+       asm volatile ("" ::: "memory");                                 \
+      }                                                                        \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_3.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_3.c
new file mode 100644
index 0000000..4da147e
--- /dev/null
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include <stdint.h>
+
+#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST)         \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                       TYPE *__restrict y,             \
+                       PRED_TYPE *__restrict pred,     \
+                       int n)                          \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                                \
+      x[i] = pred[i] != 1 ? y[i] * (TYPE) CONST : 8;   \
+  }
+
+#define TEST_TYPE(T, TYPE, PRED_TYPE) \
+  T (TYPE, PRED_TYPE, half, 0.5) \
+  T (TYPE, PRED_TYPE, two, 2.0) \
+  T (TYPE, PRED_TYPE, four, 4.0)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, _Float16, int16_t) \
+  TEST_TYPE (T, float, int32_t) \
+  TEST_TYPE (T, double, int64_t)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #4\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s, p[0-7], z[0-9]+\.s, z[0-9]+\.s\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d, p[0-7], z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */
+
+/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_3_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_3_run.c
new file mode 100644
index 0000000..9b81d43
--- /dev/null
@@ -0,0 +1,32 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include "cond_fmul_3.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST)                        \
+  {                                                            \
+    TYPE x[N], y[N];                                           \
+    PRED_TYPE pred[N];                                         \
+    for (int i = 0; i < N; ++i)                                        \
+      {                                                                \
+       y[i] = i * i;                                           \
+       pred[i] = i % 3;                                        \
+      }                                                                \
+    test_##TYPE##_##NAME (x, y, pred, N);                      \
+    for (int i = 0; i < N; ++i)                                        \
+      {                                                                \
+       TYPE expected = i % 3 != 1 ? y[i] * (TYPE) CONST : 8;   \
+       if (x[i] != expected)                                   \
+         __builtin_abort ();                                   \
+       asm volatile ("" ::: "memory");                         \
+      }                                                                \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_4.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_4.c
new file mode 100644
index 0000000..c4fdb2b
--- /dev/null
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include <stdint.h>
+
+#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST)         \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                       TYPE *__restrict y,             \
+                       PRED_TYPE *__restrict pred,     \
+                       int n)                          \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                                \
+      x[i] = pred[i] != 1 ? y[i] * (TYPE) CONST : 0;   \
+  }
+
+#define TEST_TYPE(T, TYPE, PRED_TYPE) \
+  T (TYPE, PRED_TYPE, half, 0.5) \
+  T (TYPE, PRED_TYPE, two, 2.0) \
+  T (TYPE, PRED_TYPE, four, 4.0)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, _Float16, int16_t) \
+  TEST_TYPE (T, float, int32_t) \
+  TEST_TYPE (T, double, int64_t)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #4\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
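Note on the _1 to _4 variants: the four cond_fmul tests differ only in the value given to lanes where the condition is false, and the scan-assembler directives expect a different code shape for each. The per-lane sketch below restates the four source expressions side by side; the function names are hypothetical, and the comments only repeat what the directives in each test check for.

```c
/* _1: else-value is the multiplicand itself.
   The test expects no movprfx and no sel.  */
static float
mul_else_same (float y, int pred, float c)
{
  return pred != 1 ? y * c : y;
}

/* _2: else-value is a different input from the multiplicand.
   The test expects a merging movprfx (p/m).  */
static float
mul_else_other (float y, float z, float c)
{
  return y < 8 ? z * c : y;
}

/* _3: else-value is a nonzero constant.
   The test expects a sel and no movprfx.  */
static float
mul_else_const (float y, int pred, float c)
{
  return pred != 1 ? y * c : 8;
}

/* _4: else-value is zero.
   The test expects a zeroing movprfx (p/z) and no sel.  */
static float
mul_else_zero (float y, int pred, float c)
{
  return pred != 1 ? y * c : 0;
}
```

The cond_fminnm and cond_fsubr tests follow the same numbering, with only the arithmetic expression changed.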
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_4_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fmul_4_run.c
new file mode 100644
index 0000000..b93e031
--- /dev/null
@@ -0,0 +1,32 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include "cond_fmul_4.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST)                                \
+  {                                                                    \
+    TYPE x[N], y[N];                                                   \
+    PRED_TYPE pred[N];                                                 \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       y[i] = i * i;                                                   \
+       pred[i] = i % 3;                                                \
+      }                                                                        \
+    test_##TYPE##_##NAME (x, y, pred, N);                              \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       TYPE expected = i % 3 != 1 ? y[i] * (TYPE) CONST : 0;           \
+       if (x[i] != expected)                                           \
+         __builtin_abort ();                                           \
+       asm volatile ("" ::: "memory");                                 \
+      }                                                                        \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_1.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_1.c
new file mode 100644
index 0000000..8e7172a
--- /dev/null
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include <stdint.h>
+
+#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST)         \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                       TYPE *__restrict y,             \
+                       PRED_TYPE *__restrict pred,     \
+                       int n)                          \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                                \
+      x[i] = pred[i] != 1 ? (TYPE) CONST - y[i] : y[i];        \
+  }
+
+#define TEST_TYPE(T, TYPE, PRED_TYPE) \
+  T (TYPE, PRED_TYPE, half, 0.5) \
+  T (TYPE, PRED_TYPE, one, 1.0) \
+  T (TYPE, PRED_TYPE, two, 2.0)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, _Float16, int16_t) \
+  TEST_TYPE (T, float, int32_t) \
+  TEST_TYPE (T, double, int64_t)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
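Note on FSUBR: the loop computes CONST - y[i], so the constant is the minuend and the vector element is the subtrahend. A merging predicated FSUB would compute the operands the other way round, which is why the test looks for the reversed form. The per-lane sketch below models both under that reading; fsub_lane and fsubr_lane are hypothetical helper names.

```c
/* Merging predicated subtract: active lanes get op1 - op2,
   inactive lanes keep op1 unchanged.  */
static float
fsub_lane (int active, float op1, float op2)
{
  return active ? op1 - op2 : op1;
}

/* Merging predicated reversed subtract: active lanes get op2 - op1,
   inactive lanes keep op1 unchanged.  */
static float
fsubr_lane (int active, float op1, float op2)
{
  return active ? op2 - op1 : op1;
}

/* The cond_fsubr_1 loop body, written in terms of the model above:
   x[i] = pred[i] != 1 ? CONST - y[i] : y[i].  */
static float
cond_fsubr_1_lane (float y, int pred, float c)
{
  return fsubr_lane (pred != 1, y, c);
}
```

Because the inactive lanes keep the first operand, and the first operand is y[i], the _1 form needs neither a movprfx nor a sel.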
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_1_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_1_run.c
new file mode 100644
index 0000000..61ffac4
--- /dev/null
@@ -0,0 +1,32 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include "cond_fsubr_1.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST)                                \
+  {                                                                    \
+    TYPE x[N], y[N];                                                   \
+    PRED_TYPE pred[N];                                                 \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       y[i] = i * i;                                                   \
+       pred[i] = i % 3;                                                \
+      }                                                                        \
+    test_##TYPE##_##NAME (x, y, pred, N);                              \
+    for (int i = 0; i < N; ++i)                                                \
+      {                                                                        \
+       TYPE expected = i % 3 != 1 ? (TYPE) CONST - y[i] : y[i];        \
+       if (x[i] != expected)                                           \
+         __builtin_abort ();                                           \
+       asm volatile ("" ::: "memory");                                 \
+      }                                                                        \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_2.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_2.c
new file mode 100644
index 0000000..6d2efde
--- /dev/null
@@ -0,0 +1,44 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include <stdint.h>
+
+#define DEF_LOOP(TYPE, NAME, CONST)                    \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                       TYPE *__restrict y,             \
+                       TYPE *__restrict z,             \
+                       int n)                          \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                                \
+      x[i] = y[i] < 8 ? (TYPE) CONST - z[i] : y[i];    \
+  }
+
+#define TEST_TYPE(T, TYPE) \
+  T (TYPE, half, 0.5) \
+  T (TYPE, one, 1.0) \
+  T (TYPE, two, 2.0)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, float) \
+  TEST_TYPE (T, double)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_2_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_2_run.c
new file mode 100644
index 0000000..1b25392
--- /dev/null
@@ -0,0 +1,31 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include "cond_fsubr_2.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, NAME, CONST)                                   \
+  {                                                                    \
+    TYPE x[N], y[N], z[N];                                             \
+    for (int i = 0; i < N; ++i)                                        \
+      {                                                                \
+        y[i] = i % 13;                                                 \
+        z[i] = i * i;                                                  \
+      }                                                                \
+    test_##TYPE##_##NAME (x, y, z, N);                                 \
+    for (int i = 0; i < N; ++i)                                        \
+      {                                                                \
+        TYPE expected = y[i] < 8 ? (TYPE) CONST - z[i] : y[i];         \
+        if (x[i] != expected)                                          \
+          __builtin_abort ();                                          \
+        asm volatile ("" ::: "memory");                                \
+      }                                                                \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_3.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_3.c
new file mode 100644
index 0000000..328af57
--- /dev/null
@@ -0,0 +1,50 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include <stdint.h>
+
+#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST)         \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                        TYPE *__restrict y,            \
+                        PRED_TYPE *__restrict pred,    \
+                        int n)                         \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                        \
+      x[i] = pred[i] != 1 ? (TYPE) CONST - y[i] : 4;   \
+  }
+
+#define TEST_TYPE(T, TYPE, PRED_TYPE) \
+  T (TYPE, PRED_TYPE, half, 0.5) \
+  T (TYPE, PRED_TYPE, one, 1.0) \
+  T (TYPE, PRED_TYPE, two, 2.0)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, _Float16, int16_t) \
+  TEST_TYPE (T, float, int32_t) \
+  TEST_TYPE (T, double, int64_t)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s, p[0-7], z[0-9]+\.s, z[0-9]+\.s\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d, p[0-7], z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */
+
+/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_3_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_3_run.c
new file mode 100644
index 0000000..8978287
--- /dev/null
@@ -0,0 +1,32 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include "cond_fsubr_3.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST)                \
+  {                                                            \
+    TYPE x[N], y[N];                                           \
+    PRED_TYPE pred[N];                                         \
+    for (int i = 0; i < N; ++i)                                \
+      {                                                        \
+        y[i] = i * i;                                          \
+        pred[i] = i % 3;                                       \
+      }                                                        \
+    test_##TYPE##_##NAME (x, y, pred, N);                      \
+    for (int i = 0; i < N; ++i)                                \
+      {                                                        \
+        TYPE expected = i % 3 != 1 ? (TYPE) CONST - y[i] : 4;  \
+        if (x[i] != expected)                                  \
+          __builtin_abort ();                                  \
+        asm volatile ("" ::: "memory");                        \
+      }                                                        \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_4.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_4.c
new file mode 100644
index 0000000..1d420b1
--- /dev/null
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include <stdint.h>
+
+#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST)         \
+  void __attribute__ ((noipa))                         \
+  test_##TYPE##_##NAME (TYPE *__restrict x,            \
+                        TYPE *__restrict y,            \
+                        PRED_TYPE *__restrict pred,    \
+                        int n)                         \
+  {                                                    \
+    for (int i = 0; i < n; ++i)                        \
+      x[i] = pred[i] != 1 ? (TYPE) CONST - y[i] : 0;   \
+  }
+
+#define TEST_TYPE(T, TYPE, PRED_TYPE) \
+  T (TYPE, PRED_TYPE, half, 0.5) \
+  T (TYPE, PRED_TYPE, one, 1.0) \
+  T (TYPE, PRED_TYPE, two, 2.0)
+
+#define TEST_ALL(T) \
+  TEST_TYPE (T, _Float16, int16_t) \
+  TEST_TYPE (T, float, int32_t) \
+  TEST_TYPE (T, double, int64_t)
+
+TEST_ALL (DEF_LOOP)
+
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
+/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */
+/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */
+
+/* { dg-final { scan-assembler-not {\tmov\tz} } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_4_run.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_fsubr_4_run.c
new file mode 100644
index 0000000..2cb3409
--- /dev/null
@@ -0,0 +1,32 @@
+/* { dg-do run { target aarch64_sve_hw } } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+
+#include "cond_fsubr_4.c"
+
+#define N 99
+
+#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST)                        \
+  {                                                                    \
+    TYPE x[N], y[N];                                                   \
+    PRED_TYPE pred[N];                                                 \
+    for (int i = 0; i < N; ++i)                                        \
+      {                                                                \
+        y[i] = i * i;                                                  \
+        pred[i] = i % 3;                                               \
+      }                                                                \
+    test_##TYPE##_##NAME (x, y, pred, N);                              \
+    for (int i = 0; i < N; ++i)                                        \
+      {                                                                \
+        TYPE expected = i % 3 != 1 ? (TYPE) CONST - y[i] : 0;          \
+        if (x[i] != expected)                                          \
+          __builtin_abort ();                                          \
+        asm volatile ("" ::: "memory");                                \
+      }                                                                \
+  }
+
+int
+main (void)
+{
+  TEST_ALL (TEST_LOOP)
+  return 0;
+}