amdgcn: Improve FP division accuracy
author Julian Brown <julian@codesourcery.com>
Mon, 30 Nov 2020 19:10:04 +0000 (11:10 -0800)
committer Julian Brown <julian@codesourcery.com>
Wed, 13 Jan 2021 00:46:01 +0000 (16:46 -0800)
commit c8812bac8ee39f73ea881e4f6260acf5590b4190
tree 705063c5b9410785f858502398a8516c6bc78630
parent abb3993e49c04bd40e42f196f55785cc3fd81682

GCN has a reciprocal-approximation instruction but no hardware divide
instruction. This patch changes the open-coded reciprocal
approximation/Newton-Raphson refinement steps to use fused multiply-add
instructions, which is necessary to obtain a properly rounded reciprocal,
and adds further refinement steps so that the full division result is
also correctly rounded.

The patterns in question are still guarded by a flag_reciprocal_math
condition, and do not yet support denormals.

2021-01-13  Julian Brown  <julian@codesourcery.com>

gcc/
	* config/gcn/gcn-valu.md (recip<mode>2<exec>, recip<mode>2): Use unspec
	for reciprocal-approximation instructions.
	(div<mode>3): Use fused multiply-accumulate operations for reciprocal
	refinement and division result.
	* config/gcn/gcn.md (UNSPEC_RCP): New unspec constant.

gcc/testsuite/
	* gcc.target/gcn/fpdiv.c: New test.
gcc/config/gcn/gcn-valu.md
gcc/config/gcn/gcn.md
gcc/testsuite/gcc.target/gcn/fpdiv.c [new file with mode: 0644]