From: Ian Romanick
Date: Tue, 3 Mar 2020 02:50:44 +0000 (-0800)
Subject: soft-fp64/fsat: Micro-optimize x >= 1 test
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=8178fa88763a321cb5df853ee219884c2a7eedcc;p=mesa.git

soft-fp64/fsat: Micro-optimize x >= 1 test

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 841590 -> 841332 (-0.03%)
instructions in affected programs: 121957 -> 121699 (-0.21%)
helped: 7
HURT: 0
helped stats (abs) min: 15 max: 54 x̄: 36.86 x̃: 41
helped stats (rel) min: 0.16% max: 0.33% x̄: 0.23% x̃: 0.18%
95% mean confidence interval for instructions value: -49.73 -23.98
95% mean confidence interval for instructions %-change: -0.29% -0.16%
Instructions are helped.

total cycles in shared programs: 6926828 -> 6923967 (-0.04%)
cycles in affected programs: 1038569 -> 1035708 (-0.28%)
helped: 7
HURT: 0
helped stats (abs) min: 128 max: 616 x̄: 408.71 x̃: 446
helped stats (rel) min: 0.18% max: 0.44% x̄: 0.29% x̃: 0.22%
95% mean confidence interval for cycles value: -571.72 -245.70
95% mean confidence interval for cycles %-change: -0.38% -0.19%
Cycles are helped.

Reviewed-by: Matt Turner
Part-of:
---

diff --git a/src/compiler/glsl/float64.glsl b/src/compiler/glsl/float64.glsl
index e7a7e7860fc..c83e1aa8c97 100644
--- a/src/compiler/glsl/float64.glsl
+++ b/src/compiler/glsl/float64.glsl
@@ -264,7 +264,25 @@ __fsat64(uint64_t __a)
    if (__is_nan(__a) || int(a.y) < 0)
       return 0ul;
 
-   if (!__flt64_nonnan(__a, 0x3FF0000000000000ul /* 1.0 */))
+   /* IEEE 754 floating point numbers are specifically designed so that, with
+    * two exceptions, values can be compared by bit-casting to signed integers
+    * with the same number of bits.
+    *
+    * From https://en.wikipedia.org/wiki/IEEE_754-1985#Comparing_floating-point_numbers:
+    *
+    *    When comparing as 2's-complement integers: If the sign bits differ,
+    *    the negative number precedes the positive number, so 2's complement
+    *    gives the correct result (except that negative zero and positive zero
+    *    should be considered equal). If both values are positive, the 2's
+    *    complement comparison again gives the correct result. Otherwise (two
+    *    negative numbers), the correct FP ordering is the opposite of the 2's
+    *    complement ordering.
+    *
+    * We know that both values are not negative, and we know that at least one
+    * value is not zero. Therefore, we can just use the 2's complement
+    * comparison ordering.
+    */
+   if (ilt64(0x3FF00000, 0x00000000, a.y, a.x))
       return 0x3FF0000000000000ul;
 
    return __a;
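
The reasoning in the new comment can be checked outside the shader compiler. The following is a minimal C sketch, not the Mesa implementation: `double_to_bits`, `bits_to_double`, `ilt64_sketch`, and `fsat64_sketch` are hypothetical names introduced here, and `ilt64_sketch` assumes that `ilt64(a_hi, a_lo, b_hi, b_lo)` in the patch means a signed 64-bit "a < b" on values split into 32-bit halves. It follows the same control flow as the patched `__fsat64`: NaN and negative inputs are rejected first, after which an integer comparison against the bit pattern of 1.0 is enough.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit pattern of 1.0 in IEEE 754 binary64, as used in the patch. */
#define ONE_BITS UINT64_C(0x3FF0000000000000)

/* Hypothetical helper: reinterpret a double as its raw 64-bit pattern. */
static uint64_t double_to_bits(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Hypothetical helper: reinterpret a raw 64-bit pattern as a double. */
static double bits_to_double(uint64_t u)
{
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

/* Assumed semantics of ilt64: signed 64-bit "a < b" on (hi, lo) halves.
 * Flipping the sign bit of the high words turns the signed comparison
 * into an unsigned one without relying on narrowing conversions. */
static int ilt64_sketch(uint32_t a_hi, uint32_t a_lo,
                        uint32_t b_hi, uint32_t b_lo)
{
    uint32_t a_key = a_hi ^ 0x80000000u;
    uint32_t b_key = b_hi ^ 0x80000000u;
    return a_key < b_key || (a_hi == b_hi && a_lo < b_lo);
}

/* Sketch of saturate-to-[0, 1] operating on raw binary64 bits. */
static uint64_t fsat64_sketch(uint64_t a)
{
    uint32_t hi = (uint32_t)(a >> 32);
    uint32_t lo = (uint32_t)a;

    /* NaN: exponent all ones and a non-zero mantissa. */
    int is_nan = (a & UINT64_C(0x7FFFFFFFFFFFFFFF)) >
                 UINT64_C(0x7FF0000000000000);

    /* NaN and anything with the sign bit set (including -0.0) map to +0. */
    if (is_nan || (hi & 0x80000000u) != 0)
        return 0;

    /* Both operands are now non-negative, so integer order matches
     * floating-point order: 1.0 < a means a must be clamped to 1.0. */
    if (ilt64_sketch(0x3FF00000u, 0x00000000u, hi, lo))
        return ONE_BITS;

    return a;
}
```

Calling `fsat64_sketch(double_to_bits(x))` and converting the result back with `bits_to_double` gives the clamped value. Note that the comparison is a strict `1.0 < a`, so an input of exactly 1.0 falls through to the final return and is passed back unchanged, which produces the same result as clamping it.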