This patch adds support to vectorize sum of absolute differences (SAD_EXPR)
author Alejandro Martinez <alejandro.martinezvicente@arm.com>
Tue, 7 May 2019 16:34:20 +0000 (16:34 +0000)
committer Alejandro Martinez <alejandro@gcc.gnu.org>
Tue, 7 May 2019 16:34:20 +0000 (16:34 +0000)
commit a9fad8fe6c84de272f2a56d462e67d53c9f4a73d
tree 9f0ff9561477c22ee099d09330ce742f5830d9a3
parent 0a59215131c02dee4c8829f93d1ee678647614da
This patch adds support for vectorizing sums of absolute differences
(SAD_EXPR) using SVE.

Given this input code:

#include <stdint.h>

int
sum_abs (uint8_t *restrict x, uint8_t *restrict y, int n)
{
  int sum = 0;

  for (int i = 0; i < n; i++)
    {
      sum += __builtin_abs (x[i] - y[i]);
    }

  return sum;
}

The resulting SVE code is:

0000000000000000 <sum_abs>:
   0: 7100005f  cmp w2, #0x0
   4: 5400026d  b.le 50 <sum_abs+0x50>
   8: d2800003  mov x3, #0x0                    // #0
   c: 93407c42  sxtw x2, w2
  10: 2538c002  mov z2.b, #0
  14: 25221fe0  whilelo p0.b, xzr, x2
  18: 2538c023  mov z3.b, #1
  1c: 2518e3e1  ptrue p1.b
  20: a4034000  ld1b {z0.b}, p0/z, [x0, x3]
  24: a4034021  ld1b {z1.b}, p0/z, [x1, x3]
  28: 0430e3e3  incb x3
  2c: 0520c021  sel z1.b, p0, z1.b, z0.b
  30: 25221c60  whilelo p0.b, x3, x2
  34: 040d0420  uabd z0.b, p1/m, z0.b, z1.b
  38: 44830402  udot z2.s, z0.b, z3.b
  3c: 54ffff21  b.ne 20 <sum_abs+0x20>  // b.any
  40: 2598e3e0  ptrue p0.s
  44: 04812042  uaddv d2, p0, z2.s
  48: 1e260040  fmov w0, s2
  4c: d65f03c0  ret
  50: 1e2703e2  fmov s2, wzr
  54: 1e260040  fmov w0, s2
  58: d65f03c0  ret

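Notice how udot is used inside a fully masked loop: the sel instruction
replaces the inactive lanes of the second operand with the corresponding
lanes of the first, so uabd yields zero for those lanes and they add nothing
to the udot accumulation.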
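
The loop above can be modelled lane by lane in plain C.  This is only an
illustrative sketch, not code from the patch: the names sum_abs_ref and
sum_abs_model are hypothetical, and VL assumes a 16-byte vector, whereas real
SVE hardware chooses its own vector length.  Comments name the instruction
each step stands for.

```c
#include <stdint.h>
#include <stdlib.h>

#define VL 16 /* Assumed vector length in bytes (illustration only).  */

/* Scalar reference: the same computation as the input loop.  */
static int
sum_abs_ref (const uint8_t *x, const uint8_t *y, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += abs (x[i] - y[i]);
  return sum;
}

/* Lane-by-lane model of the fully masked SVE loop.  */
static int
sum_abs_model (const uint8_t *x, const uint8_t *y, int n)
{
  uint32_t acc[VL / 4] = { 0 }; /* z2.s: one 32-bit lane per 4 bytes.  */

  for (int base = 0; base < n; base += VL) /* incb + whilelo + b.any  */
    for (int lane = 0; lane < VL; lane++)
      {
        int active = base + lane < n;              /* whilelo predicate  */
        uint8_t a = active ? x[base + lane] : 0;   /* ld1b, zeroing load  */
        uint8_t b = active ? y[base + lane] : 0;   /* ld1b, zeroing load  */
        b = active ? b : a;                        /* sel: inactive b = a  */
        uint8_t d = a > b ? a - b : b - a;         /* uabd  */
        acc[lane / 4] += (uint32_t) d * 1;         /* udot with a vector of 1s  */
      }

  uint32_t sum = 0;
  for (int k = 0; k < VL / 4; k++) /* uaddv: reduce across lanes.  */
    sum += acc[k];
  return (int) sum;
}
```

Because inactive lanes always produce a difference of zero, the model agrees
with the scalar loop for any n, including values that are not a multiple of
VL.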

gcc/ChangeLog:

2019-05-07  Alejandro Martinez  <alejandro.martinezvicente@arm.com>

* config/aarch64/aarch64-sve.md (<su>abd<mode>_3): New define_expand.
(aarch64_<su>abd<mode>_3): Likewise.
(*aarch64_<su>abd<mode>_3): New define_insn.
(<sur>sad<vsi2qi>): New define_expand.
* config/aarch64/iterators.md: Added MAX_OPP attribute.
* tree-vect-loop.c (use_mask_by_cond_expr_p): Add SAD_EXPR.
(build_vect_cond_expr): Likewise.

gcc/testsuite/ChangeLog:

2019-05-07  Alejandro Martinez  <alejandro.martinezvicente@arm.com>

* gcc.target/aarch64/sve/sad_1.c: New test for sum of absolute
differences.

From-SVN: r270975
gcc/ChangeLog
gcc/config/aarch64/aarch64-sve.md
gcc/config/aarch64/iterators.md
gcc/testsuite/ChangeLog
gcc/testsuite/gcc.target/aarch64/sve/sad_1.c [new file with mode: 0644]
gcc/tree-vect-loop.c