git.libre-soc.org Git - gcc.git/commit

author	Kyrylo Tkachov <kyrylo.tkachov@arm.com>
	Mon, 3 Jun 2019 11:20:58 +0000 (11:20 +0000)
committer	Kyrylo Tkachov <ktkachov@gcc.gnu.org>
	Mon, 3 Jun 2019 11:20:58 +0000 (11:20 +0000)
commit	72215009a9f9827397a4eb74e9341b2b7dc658df
tree	85c9597bd0985e8be2de5f8dfbbcce8493abad31
parent	c89503d957f13f7f0a5eeeab1326048c455d9533

[AArch64] Emit TARGET_DOTPROD-specific sequence for <us>sadv16qi

Wilco pointed out that when the Dot Product instructions are available we can use them
to generate an even more efficient expansion for the [us]sadv16qi optab.
Instead of the current:
        uabdl2  v0.8h, v1.16b, v2.16b
        uabal   v0.8h, v1.8b, v2.8b
        uadalp  v3.4s, v0.8h

we can generate:
      (1)  movi    v4.16b, 1
      (2)  uabd    v0.16b, v1.16b, v2.16b
      (3)  udot    v3.4s, v0.16b, v4.16b

Instruction (1) can be CSEd across multiple such expansions and even hoisted out of loops,
so when this sequence appears frequently back-to-back (as in x264_r) each sum effectively costs only
2 instructions. The UDOT instruction also performs the byte-to-word accumulation in one step, which lets us
use the much simpler UABD instruction before it.

This makes it a shorter and lower-latency sequence overall for targets that support it.
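
As a rough illustration of why the two sequences are interchangeable for a sum-of-absolute-differences
reduction, here is a scalar C model of the unsigned case. It is only a sketch and not code from this patch:
the function names (usad_widening, usad_dotprod, lane_sum, check_models_agree) are hypothetical, and the
lane semantics are modelled from the architectural descriptions of the instructions. Note that the two
sequences distribute the partial sums across the four 32-bit lanes differently; only the total over all
lanes is expected to match, which is what a reduction needs.

```c
#include <assert.h>
#include <stdint.h>

/* Absolute difference of two unsigned bytes; the result always fits in 8 bits. */
uint8_t absdiff_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a > b ? a - b : b - a);
}

/* Model of the existing sequence: uabdl2, uabal, uadalp. */
void usad_widening(uint32_t acc[4], const uint8_t a[16], const uint8_t b[16])
{
    uint16_t t[8];
    for (int i = 0; i < 8; i++)   /* uabdl2: differences of the high halves, widened */
        t[i] = absdiff_u8(a[8 + i], b[8 + i]);
    for (int i = 0; i < 8; i++)   /* uabal: accumulate differences of the low halves */
        t[i] = (uint16_t)(t[i] + absdiff_u8(a[i], b[i]));
    for (int j = 0; j < 4; j++)   /* uadalp: pairwise add into the 32-bit lanes */
        acc[j] += (uint32_t)t[2 * j] + t[2 * j + 1];
}

/* Model of the dot-product sequence: vector of ones, uabd, udot. */
void usad_dotprod(uint32_t acc[4], const uint8_t a[16], const uint8_t b[16])
{
    uint8_t ones[16], d[16];
    for (int i = 0; i < 16; i++)  /* (1) all-ones byte vector */
        ones[i] = 1;
    for (int i = 0; i < 16; i++)  /* (2) uabd: byte-wise absolute difference */
        d[i] = absdiff_u8(a[i], b[i]);
    for (int j = 0; j < 4; j++)   /* (3) udot: 4-byte dot product per 32-bit lane */
        for (int k = 0; k < 4; k++)
            acc[j] += (uint32_t)d[4 * j + k] * ones[4 * j + k];
}

/* Total over the four accumulator lanes, i.e. the final reduction. */
uint32_t lane_sum(const uint32_t acc[4])
{
    return acc[0] + acc[1] + acc[2] + acc[3];
}

/* Self-check on arbitrary inputs: both models must give the same total. */
void check_models_agree(void)
{
    uint8_t a[16], b[16];
    for (int i = 0; i < 16; i++) {
        a[i] = (uint8_t)(i * 17);
        b[i] = (uint8_t)(255 - i * 3);
    }
    uint32_t x[4] = {1, 2, 3, 4}, y[4] = {1, 2, 3, 4};
    usad_widening(x, a, b);
    usad_dotprod(y, a, b);
    assert(lane_sum(x) == lane_sum(y));
}
```

In this model the multiplication by the all-ones vector is what turns UDOT into a plain widening
accumulate, which is why the cheap UABD can replace the widening UABDL2/UABAL pair.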

* config/aarch64/iterators.md (MAX_OPP): New code attr.
* config/aarch64/aarch64-simd.md (*aarch64_<su>abd<mode>_3): Rename to...
(aarch64_<su>abd<mode>_3): ... This.
(<sur>sadv16qi): Add TARGET_DOTPROD expansion.

* gcc.target/aarch64/ssadv16qi.c: Add +nodotprod to pragma.
* gcc.target/aarch64/usadv16qi.c: Likewise.
* gcc.target/aarch64/ssadv16qi-dotprod.c: New test.
* gcc.target/aarch64/usadv16qi-dotprod.c: Likewise.

From-SVN: r271863

gcc/ChangeLog
gcc/config/aarch64/aarch64-simd.md
gcc/testsuite/ChangeLog
gcc/testsuite/gcc.target/aarch64/ssadv16qi-dotprod.c	[new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/ssadv16qi.c
gcc/testsuite/gcc.target/aarch64/usadv16qi-dotprod.c	[new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/usadv16qi.c