nir: Fix wrong sign in lower_rcp
author Ruslan Kabatsayev <b7.10110111@gmail.com>
Sat, 11 May 2019 11:04:36 +0000 (14:04 +0300)
committer Kenneth Graunke <kenneth@whitecape.org>
Sat, 11 May 2019 16:25:22 +0000 (09:25 -0700)
The nested fma calls were supposed to implement

x_new = x + x * (1 - x*src),

but the current code is instead equivalent to

x_new = x - x * (1 - x*src).
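
Expanding the nested fused multiply-adds makes the sign difference explicit. The sketch below assumes exact arithmetic (rounding is ignored), writes fma(a, b, c) = a*b + c, uses s for src, and introduces the residual e = 1 - x*s purely for illustration; none of these symbols appear in the code itself.

```latex
\begin{align*}
\text{old: } \operatorname{fma}(x,\ \operatorname{fma}(x, s, -1),\ x)
  &= x\,(x s - 1) + x = x - x\,(1 - x s), \\
\text{new: } \operatorname{fma}(-x,\ \operatorname{fma}(x, s, -1),\ x)
  &= -x\,(x s - 1) + x = x + x\,(1 - x s). \\[4pt]
\text{With } e = 1 - x s:\quad
  e_{\text{old}}' &= 1 - (1 - e)^2 = 2e - e^2, \\
  e_{\text{new}}' &= 1 - (1 - e)(1 + e) = e^2.
\end{align*}
```

Under these assumptions the intended step squares the residual, while the old expression roughly doubles it when e is small.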

As a result, the Newton-Raphson steps don't improve precision at all.
This patch fixes the problem by negating the first multiplicand of the
outer fma.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110435
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
src/compiler/nir/nir_lower_double_ops.c

index 863046e65c7e9b5ef9800437b494b35e696e4eb1..18fe08c7d5dac15e49a23250a86d6aeae6d8f6ea 100644
@@ -142,8 +142,8 @@ lower_rcp(nir_builder *b, nir_ssa_def *src)
     * See https://en.wikipedia.org/wiki/Division_algorithm for more details.
     */
 
-   ra = nir_ffma(b, ra, nir_ffma(b, ra, src, nir_imm_double(b, -1)), ra);
-   ra = nir_ffma(b, ra, nir_ffma(b, ra, src, nir_imm_double(b, -1)), ra);
+   ra = nir_ffma(b, nir_fneg(b, ra), nir_ffma(b, ra, src, nir_imm_double(b, -1)), ra);
+   ra = nir_ffma(b, nir_fneg(b, ra), nir_ffma(b, ra, src, nir_imm_double(b, -1)), ra);
 
    return fix_inv_result(b, ra, src, new_exp);
 }