mesa.git
4 years agosoft-fp64/fneg: Don't treat NaN specially
Ian Romanick [Tue, 3 Mar 2020 02:31:01 +0000 (18:31 -0800)]
soft-fp64/fneg: Don't treat NaN specially

__fabs64 doesn't do anything special, and the value is still NaN
regardless of the value of the MSB.  In a strict sense, it's possible
that both functions should set the "signal" bit.

lts on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 844558 -> 843005 (-0.18%)
instructions in affected programs: 725975 -> 724422 (-0.21%)
helped: 53
HURT: 4
helped stats (abs) min: 1 max: 313 x̄: 29.87 x̃: 21
helped stats (rel) min: 0.01% max: 0.94% x̄: 0.30% x̃: 0.22%
HURT stats (abs)   min: 4 max: 11 x̄: 7.50 x̃: 7
HURT stats (rel)   min: 0.03% max: 0.09% x̄: 0.05% x̃: 0.04%
95% mean confidence interval for instructions value: -39.02 -15.47
95% mean confidence interval for instructions %-change: -0.34% -0.21%
Instructions are helped.

total cycles in shared programs: 6962024 -> 6944998 (-0.24%)
cycles in affected programs: 6185470 -> 6168444 (-0.28%)
helped: 59
HURT: 0
helped stats (abs) min: 64 max: 2863 x̄: 288.58 x̃: 208
helped stats (rel) min: 0.11% max: 0.87% x̄: 0.33% x̃: 0.27%
95% mean confidence interval for cycles value: -387.15 -190.00
95% mean confidence interval for cycles %-change: -0.38% -0.28%
Cycles are helped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>

4 years agosoft-fp64: Store sign value as 0 or 0x80000000
Ian Romanick [Tue, 3 Mar 2020 00:37:58 +0000 (16:37 -0800)]
soft-fp64: Store sign value as 0 or 0x80000000

...instead of 0 or 1.  Many places the sign bit is extracted, then later
put back in the same position.  This saves some left-shift operations.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 848106 -> 844558 (-0.42%)
instructions in affected programs: 833480 -> 829932 (-0.43%)
helped: 106
HURT: 1
helped stats (abs) min: 1 max: 995 x̄: 33.48 x̃: 12
helped stats (rel) min: 0.15% max: 2.20% x̄: 0.60% x̃: 0.35%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: <.01% max: <.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for instructions value: -51.88 -14.43
95% mean confidence interval for instructions %-change: -0.71% -0.47%
Instructions are helped.

total cycles in shared programs: 6969125 -> 6962024 (-0.10%)
cycles in affected programs: 6717689 -> 6710588 (-0.11%)
helped: 78
HURT: 7
helped stats (abs) min: 2 max: 2083 x̄: 110.27 x̃: 56
helped stats (rel) min: <.01% max: 0.30% x̄: 0.11% x̃: 0.11%
HURT stats (abs)   min: 2 max: 1340 x̄: 214.29 x̃: 4
HURT stats (rel)   min: 0.01% max: 0.71% x̄: 0.13% x̃: 0.02%
95% mean confidence interval for cycles value: -144.02 -23.06
95% mean confidence interval for cycles %-change: -0.12% -0.07%
Cycles are helped.

total spills in shared programs: 814 -> 759 (-6.76%)
spills in affected programs: 814 -> 759 (-6.76%)
helped: 2
HURT: 1

total fills in shared programs: 2488 -> 2412 (-3.05%)
fills in affected programs: 2488 -> 2412 (-3.05%)
helped: 2
HURT: 1

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>

4 years agosoft-fp64: Pick a single idiom for treating sign value as a Boolean
Ian Romanick [Tue, 3 Mar 2020 00:33:16 +0000 (16:33 -0800)]
soft-fp64: Pick a single idiom for treating sign value as a Boolean

Replace all of the bool(qSign) with qSign != 0u.  Remove unnecessary
parenthesis from around most of the existing qSign != 0u.

This dramatically simplifies the next commit.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 848109 -> 848106 (<.01%)
instructions in affected programs: 53 -> 50 (-5.66%)
helped: 1
HURT: 0

total cycles in shared programs: 6969145 -> 6969125 (<.01%)
cycles in affected programs: 396 -> 376 (-5.05%)
helped: 1
HURT: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>

4 years agosoft-fp64: Simplify __countLeadingZeros32 function
Ian Romanick [Tue, 3 Mar 2020 00:43:11 +0000 (16:43 -0800)]
soft-fp64: Simplify __countLeadingZeros32 function

findMSB returns -1 for an input of zero, so 31 - findMSB(a) is
sufficient on its own.

There's only one user of findMSB in shader-db, and it does not match
this pattern.

TODO: Add a pattern in the backend code generator that emits 31 -
nir_op_ufind_msb(a) as if it were nir_op_uclz.  That should save a couple
instructions.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 859509 -> 848109 (-1.33%)
instructions in affected programs: 841058 -> 829658 (-1.36%)
helped: 97
HURT: 0
helped stats (abs) min: 3 max: 1161 x̄: 117.53 x̃: 72
helped stats (rel) min: 0.98% max: 6.74% x̄: 1.70% x̃: 1.35%
95% mean confidence interval for instructions value: -147.21 -87.84
95% mean confidence interval for instructions %-change: -1.94% -1.46%
Instructions are helped.

total cycles in shared programs: 7072275 -> 6969145 (-1.46%)
cycles in affected programs: 6955767 -> 6852637 (-1.48%)
helped: 97
HURT: 0
helped stats (abs) min: 32 max: 10900 x̄: 1063.20 x̃: 560
helped stats (rel) min: 1.18% max: 7.58% x̄: 1.84% x̃: 1.45%
95% mean confidence interval for cycles value: -1339.43 -786.96
95% mean confidence interval for cycles %-change: -2.11% -1.57%
Cycles are helped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>

4 years agosoft-fp64: Don't open-code umulExtended
Ian Romanick [Tue, 3 Mar 2020 20:26:37 +0000 (12:26 -0800)]
soft-fp64: Don't open-code umulExtended

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 928859 -> 859509 (-7.47%)
instructions in affected programs: 866293 -> 796943 (-8.01%)
helped: 76
HURT: 0
helped stats (abs) min: 75 max: 8042 x̄: 912.50 x̃: 688
helped stats (rel) min: 5.35% max: 21.02% x̄: 10.35% x̃: 7.58%
95% mean confidence interval for instructions value: -1138.37 -686.63
95% mean confidence interval for instructions %-change: -11.69% -9.00%
Instructions are helped.

total cycles in shared programs: 7272912 -> 7072275 (-2.76%)
cycles in affected programs: 6763486 -> 6562849 (-2.97%)
helped: 76
HURT: 0
helped stats (abs) min: 214 max: 30136 x̄: 2639.96 x̃: 1923
helped stats (rel) min: 1.75% max: 9.20% x̄: 4.04% x̃: 2.41%
95% mean confidence interval for cycles value: -3455.29 -1824.63
95% mean confidence interval for cycles %-change: -4.69% -3.39%
Cycles are helped.

total spills in shared programs: 817 -> 814 (-0.37%)
spills in affected programs: 791 -> 788 (-0.38%)
helped: 2
HURT: 0

total fills in shared programs: 2438 -> 2488 (2.05%)
fills in affected programs: 2392 -> 2442 (2.09%)
helped: 0
HURT: 2

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>

4 years agosoft-fp64/b2f: Reimplement using bitwise logic ops
Ian Romanick [Thu, 5 Mar 2020 00:53:36 +0000 (16:53 -0800)]
soft-fp64/b2f: Reimplement using bitwise logic ops

This doesn't help a lot of shaders, but it helps those few a LOT.

This could also be implemented using bcsel.  That version is very
slightly worse because the generated SEL instruction wants to have two
immediate sources, so one of them usually needs an extra MOV instruction
to load.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 929619 -> 928859 (-0.08%)
instructions in affected programs: 1651 -> 891 (-46.03%)
helped: 8
HURT: 0
helped stats (abs) min: 38 max: 152 x̄: 95.00 x̃: 95
helped stats (rel) min: 42.70% max: 86.36% x̄: 49.88% x̃: 44.66%
95% mean confidence interval for instructions value: -132.97 -57.03
95% mean confidence interval for instructions %-change: -62.28% -37.49%
Instructions are helped.

total cycles in shared programs: 7280180 -> 7272912 (-0.10%)
cycles in affected programs: 12960 -> 5692 (-56.08%)
helped: 8
HURT: 0
helped stats (abs) min: 352 max: 1456 x̄: 908.50 x̃: 910
helped stats (rel) min: 52.45% max: 91.19% x̄: 59.24% x̃: 55.15%
95% mean confidence interval for cycles value: -1274.03 -542.97
95% mean confidence interval for cycles %-change: -70.06% -48.41%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>

4 years agonir/algebraic: Simplify a contradiction that can occur in __flt64_nonnan
Ian Romanick [Fri, 6 Mar 2020 02:03:35 +0000 (18:03 -0800)]
nir/algebraic: Simplify a contradiction that can occur in __flt64_nonnan

The pattern is added to opt_algebraic because, for example, comparisons
with constant 0.0 will produce (a1 < 0).

Even with a pass that optimized Boolean expressions, I think this would
be very difficult to automatically recognize and optimize.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 933054 -> 929619 (-0.37%)
instructions in affected programs: 784041 -> 780606 (-0.44%)
helped: 59
HURT: 0
helped stats (abs) min: 2 max: 213 x̄: 58.22 x̃: 44
helped stats (rel) min: 0.02% max: 2.51% x̄: 0.72% x̃: 0.46%
95% mean confidence interval for instructions value: -70.80 -45.64
95% mean confidence interval for instructions %-change: -0.92% -0.53%
Instructions are helped.

total cycles in shared programs: 7304712 -> 7280180 (-0.34%)
cycles in affected programs: 7176260 -> 7151728 (-0.34%)
helped: 92
HURT: 0
helped stats (abs) min: 8 max: 1414 x̄: 266.65 x̃: 166
helped stats (rel) min: 0.04% max: 2.34% x̄: 0.43% x̃: 0.22%
95% mean confidence interval for cycles value: -333.05 -200.26
95% mean confidence interval for cycles %-change: -0.54% -0.31%
Cycles are helped.

Regular shader-db changes:

No changes on any Intel platform.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>

4 years agonir/algebraic: Constant reassociation for bitwise operations too
Ian Romanick [Sat, 7 Mar 2020 00:05:27 +0000 (16:05 -0800)]
nir/algebraic: Constant reassociation for bitwise operations too

Like 5886cd79a0e, but for iand, ior, and ixor.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake
total instructions in shared programs: 903108 -> 902830 (-0.03%)
instructions in affected programs: 654910 -> 654632 (-0.04%)
helped: 31
HURT: 5
helped stats (abs) min: 2 max: 31 x̄: 9.58 x̃: 7
helped stats (rel) min: 0.01% max: 0.23% x̄: 0.06% x̃: 0.04%
HURT stats (abs)   min: 1 max: 10 x̄: 3.80 x̃: 3
HURT stats (rel)   min: 0.01% max: 0.10% x̄: 0.03% x̃: 0.02%
95% mean confidence interval for instructions value: -10.55 -4.89
95% mean confidence interval for instructions %-change: -0.07% -0.03%
Instructions are helped.

total cycles in shared programs: 7059681 -> 7058006 (-0.02%)
cycles in affected programs: 5081309 -> 5079634 (-0.03%)
helped: 33
HURT: 12
helped stats (abs) min: 1 max: 444 x̄: 60.91 x̃: 18
helped stats (rel) min: <.01% max: 2.17% x̄: 0.25% x̃: 0.05%
HURT stats (abs)   min: 1 max: 288 x̄: 27.92 x̃: 2
HURT stats (rel)   min: <.01% max: 1.00% x̄: 0.23% x̃: 0.02%
95% mean confidence interval for cycles value: -68.32 -6.12
95% mean confidence interval for cycles %-change: -0.28% 0.03%
Inconclusive result (%-change mean confidence interval includes 0).

Ice Lake
total instructions in shared programs: 895384 -> 895159 (-0.03%)
instructions in affected programs: 658678 -> 658453 (-0.03%)
helped: 37
HURT: 0
helped stats (abs) min: 3 max: 16 x̄: 6.08 x̃: 4
helped stats (rel) min: <.01% max: 0.07% x̄: 0.04% x̃: 0.04%
95% mean confidence interval for instructions value: -7.46 -4.70
95% mean confidence interval for instructions %-change: -0.04% -0.03%
Instructions are helped.

total cycles in shared programs: 7092224 -> 7091195 (-0.01%)
cycles in affected programs: 5221666 -> 5220637 (-0.02%)
helped: 35
HURT: 11
helped stats (abs) min: 1 max: 247 x̄: 43.46 x̃: 12
helped stats (rel) min: <.01% max: 2.17% x̄: 0.23% x̃: 0.05%
HURT stats (abs)   min: 2 max: 432 x̄: 44.73 x̃: 5
HURT stats (rel)   min: <.01% max: 1.00% x̄: 0.25% x̃: 0.02%
95% mean confidence interval for cycles value: -49.00 4.26
95% mean confidence interval for cycles %-change: -0.27% 0.03%
Inconclusive result (value mean confidence interval includes 0).

Regular shader-db results:

All Haswell+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611408 -> 17611398 (<.01%)
instructions in affected programs: 1648 -> 1638 (-0.61%)
helped: 2
HURT: 0

total cycles in shared programs: 338366148 -> 338366124 (<.01%)
cycles in affected programs: 124048 -> 124024 (-0.02%)
helped: 2
HURT: 0

No changes on any earlier Intel platforms.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>

4 years agonir/algebraic: Generalize some and-of-shift-right patterns [v2]
Ian Romanick [Sat, 7 Mar 2020 00:23:29 +0000 (16:23 -0800)]
nir/algebraic: Generalize some and-of-shift-right patterns [v2]

Generalizes some of the patterns from 76289fbfa84a and 905ff8619824.  In
particular, some of the soft-fp64 code generates (a & 0x7fffffff) << 1
when constant 0.0 is compared (flt or feq).

v2: Reduce the set of added patterns to those that actually help
something.  This reduces the size of the state transition tables by
about 29k.  Suggested by Jason.  Remove the existing patterns that this
commit subsumes.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake
total instructions in shared programs: 903171 -> 903108 (<.01%)
instructions in affected programs: 635903 -> 635840 (<.01%)
helped: 25
HURT: 11
helped stats (abs) min: 1 max: 16 x̄: 5.04 x̃: 3
helped stats (rel) min: <.01% max: 0.15% x̄: 0.04% x̃: 0.03%
HURT stats (abs)   min: 2 max: 14 x̄: 5.73 x̃: 5
HURT stats (rel)   min: <.01% max: 0.11% x̄: 0.04% x̃: 0.02%
95% mean confidence interval for instructions value: -3.91 0.41
95% mean confidence interval for instructions %-change: -0.03% <.01%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 7059527 -> 7059681 (<.01%)
cycles in affected programs: 5249401 -> 5249555 (<.01%)
helped: 41
HURT: 9
helped stats (abs) min: 2 max: 76 x̄: 11.90 x̃: 10
helped stats (rel) min: <.01% max: 11.86% x̄: 0.99% x̃: 0.01%
HURT stats (abs)   min: 2 max: 380 x̄: 71.33 x̃: 12
HURT stats (rel)   min: <.01% max: 0.22% x̄: 0.04% x̃: 0.01%
95% mean confidence interval for cycles value: -14.93 21.09
95% mean confidence interval for cycles %-change: -1.40% -0.20%
Inconclusive result (value mean confidence interval includes 0).

Ice Lake
total instructions in shared programs: 895506 -> 895384 (-0.01%)
instructions in affected programs: 658800 -> 658678 (-0.02%)
helped: 37
HURT: 0
helped stats (abs) min: 2 max: 8 x̄: 3.30 x̃: 2
helped stats (rel) min: <.01% max: 0.03% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for instructions value: -4.00 -2.59
95% mean confidence interval for instructions %-change: -0.02% -0.02%
Instructions are helped.

total cycles in shared programs: 7092748 -> 7092224 (<.01%)
cycles in affected programs: 5272008 -> 5271484 (<.01%)
helped: 36
HURT: 14
helped stats (abs) min: 2 max: 440 x̄: 21.67 x̃: 8
helped stats (rel) min: <.01% max: 11.86% x̄: 1.12% x̃: 0.02%
HURT stats (abs)   min: 2 max: 122 x̄: 18.29 x̃: 6
HURT stats (rel)   min: <.01% max: 0.07% x̄: 0.01% x̃: <.01%
95% mean confidence interval for cycles value: -29.24 8.28
95% mean confidence interval for cycles %-change: -1.40% -0.21%
Inconclusive result (value mean confidence interval includes 0).

Regular shader-db results:

All Haswell+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611489 -> 17611408 (<.01%)
instructions in affected programs: 21188 -> 21107 (-0.38%)
helped: 23
HURT: 1
helped stats (abs) min: 1 max: 16 x̄: 3.78 x̃: 3
helped stats (rel) min: 0.03% max: 5.82% x̄: 1.13% x̃: 0.85%
HURT stats (abs)   min: 6 max: 6 x̄: 6.00 x̃: 6
HURT stats (rel)   min: 0.60% max: 0.60% x̄: 0.60% x̃: 0.60%
95% mean confidence interval for instructions value: -5.27 -1.48
95% mean confidence interval for instructions %-change: -1.70% -0.42%
Instructions are helped.

total cycles in shared programs: 338418502 -> 338366148 (-0.02%)
cycles in affected programs: 2289052 -> 2236698 (-2.29%)
helped: 18
HURT: 3
helped stats (abs) min: 4 max: 18000 x̄: 2909.67 x̃: 38
helped stats (rel) min: 0.09% max: 4.07% x̄: 0.96% x̃: 0.43%
HURT stats (abs)   min: 2 max: 14 x̄: 6.67 x̃: 4
HURT stats (rel)   min: 0.22% max: 1.13% x̄: 0.66% x̃: 0.64%
95% mean confidence interval for cycles value: -5204.00 217.91
95% mean confidence interval for cycles %-change: -1.31% -0.14%
Inconclusive result (value mean confidence interval includes 0).

Ivy Bridge
total instructions in shared programs: 11875617 -> 11875615 (<.01%)
instructions in affected programs: 1339 -> 1337 (-0.15%)
helped: 2
HURT: 0

No changes on any earlier Intel platforms.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>

4 years agonir/algebraic: optimize ior(ine(a, 0), ine(b, 0)) to ine(ior(a, b), 0)
Ian Romanick [Thu, 31 Oct 2019 00:41:41 +0000 (17:41 -0700)]
nir/algebraic: optimize ior(ine(a, 0), ine(b, 0)) to ine(ior(a, b), 0)

Like 70f9e2589e6b.  Also scrub the unnecessary size qualifier in both
replacement patterns.

This occurs in a handful of places in the soft-fp64 code, and that is
the primary reason for the change.

Perhaps the patterns that generate umin should be conditioned on
something, but I'm not sure what.  lower_bitops might cover the cases
that matter, but it seems ugly.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 936505 -> 933388 (-0.33%)
instructions in affected programs: 925719 -> 922602 (-0.34%)
helped: 154
HURT: 1
helped stats (abs) min: 1 max: 211 x̄: 35.45 x̃: 16
helped stats (rel) min: 0.34% max: 9.30% x̄: 2.28% x̃: 0.96%
HURT stats (abs)   min: 2342 max: 2342 x̄: 2342.00 x̃: 2342
HURT stats (rel)   min: 2.28% max: 2.28% x̄: 2.28% x̃: 2.28%
95% mean confidence interval for instructions value: -51.21 10.99
95% mean confidence interval for instructions %-change: -2.61% -1.89%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 7323502 -> 7306184 (-0.24%)
cycles in affected programs: 7220376 -> 7203058 (-0.24%)
helped: 126
HURT: 1
helped stats (abs) min: 2 max: 946 x̄: 159.10 x̃: 95
helped stats (rel) min: 0.01% max: 9.62% x̄: 0.80% x̃: 0.37%
HURT stats (abs)   min: 2728 max: 2728 x̄: 2728.00 x̃: 2728
HURT stats (rel)   min: 0.37% max: 0.37% x̄: 0.37% x̃: 0.37%
95% mean confidence interval for cycles value: -192.07 -80.66
95% mean confidence interval for cycles %-change: -1.07% -0.51%
Cycles are helped.

total spills in shared programs: 635 -> 817 (28.66%)
spills in affected programs: 635 -> 817 (28.66%)
helped: 0
HURT: 3

total fills in shared programs: 2065 -> 2438 (18.06%)
fills in affected programs: 2019 -> 2392 (18.47%)
helped: 0
HURT: 2

Regular shader-db results:

All Haswell+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611506 -> 17611489 (<.01%)
instructions in affected programs: 33442 -> 33425 (-0.05%)
helped: 32
HURT: 6
helped stats (abs) min: 1 max: 6 x̄: 1.69 x̃: 1
helped stats (rel) min: 0.08% max: 1.90% x̄: 0.27% x̃: 0.11%
HURT stats (abs)   min: 1 max: 15 x̄: 6.17 x̃: 5
HURT stats (rel)   min: 0.09% max: 1.50% x̄: 0.65% x̃: 0.55%
95% mean confidence interval for instructions value: -1.70 0.80
95% mean confidence interval for instructions %-change: -0.30% 0.05%
Inconclusive result (value mean confidence interval includes 0).

total cycles in shared programs: 338419218 -> 338418502 (<.01%)
cycles in affected programs: 385795 -> 385079 (-0.19%)
helped: 42
HURT: 3
helped stats (abs) min: 2 max: 192 x̄: 24.57 x̃: 16
helped stats (rel) min: 0.04% max: 2.09% x̄: 0.33% x̃: 0.22%
HURT stats (abs)   min: 64 max: 164 x̄: 105.33 x̃: 88
HURT stats (rel)   min: 0.77% max: 1.58% x̄: 1.09% x̃: 0.93%
95% mean confidence interval for cycles value: -29.76 -2.06
95% mean confidence interval for cycles %-change: -0.40% -0.07%
Cycles are helped.

Ivy Bridge and Sandy Bridge had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11875620 -> 11875617 (<.01%)
instructions in affected programs: 421 -> 418 (-0.71%)
helped: 2
HURT: 0

total cycles in shared programs: 178245336 -> 178245326 (<.01%)
cycles in affected programs: 3425 -> 3415 (-0.29%)
helped: 2
HURT: 0

No changes on Gen4 or Gen5.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>

4 years agonir/algebraic: Simplify logic to detect sign of an integer
Ian Romanick [Tue, 3 Mar 2020 18:51:59 +0000 (10:51 -0800)]
nir/algebraic: Simplify logic to detect sign of an integer

This occurs in a handful of places in the soft-fp64 code, and that is
the primary reason for the change.

v2: Fix a typo in a comment.  Noticed by Matt.  Copy the correct fp64
shader-db results to the commit message.  I realized that I used
accidentally used the results from the next commit.

Results on the 308 shaders extracted from the fp64 portion of the OpenGL
CTS:

Tiger Lake and Ice Lake had similar results. (Tiger Lake shown)
total instructions in shared programs: 906235 -> 906149 (<.01%)
instructions in affected programs: 353966 -> 353880 (-0.02%)
helped: 31
HURT: 2
helped stats (abs) min: 1 max: 8 x̄: 3.03 x̃: 3
helped stats (rel) min: 0.01% max: 1.59% x̄: 0.10% x̃: 0.04%
HURT stats (abs)   min: 3 max: 5 x̄: 4.00 x̃: 4
HURT stats (rel)   min: 0.02% max: 0.02% x̄: 0.02% x̃: 0.02%
95% mean confidence interval for instructions value: -3.51 -1.70
95% mean confidence interval for instructions %-change: -0.19% <.01%
Inconclusive result (%-change mean confidence interval includes 0).

total cycles in shared programs: 7076552 -> 7076173 (<.01%)
cycles in affected programs: 2878361 -> 2877982 (-0.01%)
helped: 37
HURT: 2
helped stats (abs) min: 2 max: 48 x̄: 10.81 x̃: 6
helped stats (rel) min: <.01% max: 2.17% x̄: 0.47% x̃: 0.01%
HURT stats (abs)   min: 1 max: 20 x̄: 10.50 x̃: 10
HURT stats (rel)   min: <.01% max: 0.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -13.96 -5.48
95% mean confidence interval for cycles %-change: -0.72% -0.16%
Cycles are helped.

total fills in shared programs: 2064 -> 2065 (0.05%)
fills in affected programs: 45 -> 46 (2.22%)
helped: 0
HURT: 1

Regular shader-db results:

All Gen7+ platforms had similar results. (Tiger Lake shown)
total instructions in shared programs: 17611530 -> 17611506 (<.01%)
instructions in affected programs: 5934 -> 5910 (-0.40%)
helped: 10
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 2.40 x̃: 2
helped stats (rel) min: 0.14% max: 1.24% x̄: 0.47% x̃: 0.34%
95% mean confidence interval for instructions value: -3.53 -1.27
95% mean confidence interval for instructions %-change: -0.78% -0.17%
Instructions are helped.

total cycles in shared programs: 338419178 -> 338419218 (<.01%)
cycles in affected programs: 19244 -> 19284 (0.21%)
helped: 4
HURT: 2
helped stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.05% max: 0.11% x̄: 0.08% x̃: 0.08%
HURT stats (abs)   min: 26 max: 26 x̄: 26.00 x̃: 26
HURT stats (rel)   min: 1.20% max: 1.20% x̄: 1.20% x̃: 1.20%
95% mean confidence interval for cycles value: -9.08 22.41
95% mean confidence interval for cycles %-change: -0.35% 1.04%
Inconclusive result (value mean confidence interval includes 0).

No changes on any earlier Intel platform.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4142>

4 years agost/mesa: disallow deferred flush if there are multiple contexts
Pierre-Eric Pelloux-Prayer [Mon, 16 Mar 2020 14:10:23 +0000 (15:10 +0100)]
st/mesa: disallow deferred flush if there are multiple contexts

u_threaded can hang in these situation, with one context waiting on a
deferred fence from the other context.
But the other context isn't flushing its pending work (because it's waiting
for more work to pushed) so everything is stuck.

Fixes: d17b35e671a ("gallium: add PIPE_FLUSH_DEFERRED")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1430
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4213>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4213>

4 years agoanv: Use isl_drm_modifier_get_default_aux_state()
Chad Versace [Thu, 6 Feb 2020 01:50:12 +0000 (17:50 -0800)]
anv: Use isl_drm_modifier_get_default_aux_state()

Use it in anv_layout_to_aux_state().

Refactor only. No change in behavior.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3881>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3881>

4 years agointel/isl: Don't align linear images to 64K on Gen12+
Jason Ekstrand [Wed, 4 Mar 2020 17:09:50 +0000 (11:09 -0600)]
intel/isl: Don't align linear images to 64K on Gen12+

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4048>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4048>

4 years agoradv: fix random depth range unrestricted failures due to a cache issue
Samuel Pitoiset [Tue, 17 Mar 2020 14:32:45 +0000 (15:32 +0100)]
radv: fix random depth range unrestricted failures due to a cache issue

The shader module name is used to compute the pipeline key. The
driver used to load the wrong pipelines because the shader names
were similar.

This should fix random failures of
dEQP-VK.pipeline.depth_range_unrestricted.*

Fixes: f11ea226664 ("radv: fix a performance regression with graphics depth/stencil clears")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>

4 years agoturnip: Do gathering xfb info after nir_remove_dead_variables
Hyunjun Ko [Tue, 17 Mar 2020 03:57:03 +0000 (03:57 +0000)]
turnip: Do gathering xfb info after nir_remove_dead_variables

So we could align stream outputs correctly even if unused in/outs are
removed.

Fixes:
  dEQP-VK.transform_feedback.fuzz.random_vertex.scalar_types.*
  dEQP-VK.transform_feedback.fuzz.random_vertex.vector_types.*

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4207>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4207>

4 years agoturnip: Fix wrong assignment of xfb output's offset.
Hyunjun Ko [Tue, 17 Mar 2020 03:50:59 +0000 (03:50 +0000)]
turnip: Fix wrong assignment of xfb output's offset.

Should be divided by 4 so we could calculate the offset correctly in
tu6_setup_streamout.

Fixes: 2a1d6b81ed54971d33e83b7f5545da096b13b043
Related: 374406a7c420d266f920461f904864a94dc1b8c8

Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4207>

4 years agointel/decoder: don't consider header fields past dword0
Lionel Landwerlin [Tue, 10 Mar 2020 15:49:30 +0000 (17:49 +0200)]
intel/decoder: don't consider header fields past dword0

v2: use ULL

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4134>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4134>

4 years agolima: decode depth/stencil write bits in RSW
Vasily Khoruzhick [Sun, 15 Mar 2020 19:09:30 +0000 (12:09 -0700)]
lima: decode depth/stencil write bits in RSW

Now that we know the bits that are responsible for enabling depth/stencil
writes in shader we can decode them properly.

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>

4 years agolima: implement zsbuf reload
Icenowy Zheng [Thu, 6 Feb 2020 16:53:46 +0000 (00:53 +0800)]
lima: implement zsbuf reload

Fragment shader can write depth and stencil if we set necessary flags
in RSW. In addition to that we need to use special format for Z24S8.
Original format is apparently Z24X8 since we can't sample stencil in GLES2.
This new format also seems to use several components for storing depth
since we saw r != g != b when sampling with this format.

[vasily: - initialize clear->depth to 0xffffff if we reload depth, just
           like blob does. Reloading doesn't work otherwise
         - use single bitmap for reload type]

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>

4 years agolima: disable Z16 format
Vasily Khoruzhick [Sat, 14 Mar 2020 23:33:00 +0000 (16:33 -0700)]
lima: disable Z16 format

Unfortunately we don't know how to reload Z16 buffers yet and blob
is using Z24 for dEQP tests that need depth reload.

Reviewed-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4197>

4 years agogallium/util: Switch util_float_to_half to _mesa_float_to_half()'s impl.
Eric Anholt [Thu, 27 Jun 2019 23:04:42 +0000 (16:04 -0700)]
gallium/util: Switch util_float_to_half to _mesa_float_to_half()'s impl.

The util_float_to_half() implementation was much smaller, but when trying
to switch _mesa_float_to_half to it, many testcases
(dEQP-VK.spirv_assembly.instruction.graphics.opquantize.*,
piglit.spec.arb_shading_language_packing.*packhalf2x16) start failing on
Intel.  Replace the broken impl so that people don't have to debug it
later.

Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3699>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3699>

4 years agoamd/llvm: Fix divergent descriptor regressions with radeonsi.
Bas Nieuwenhuizen [Fri, 13 Mar 2020 19:48:27 +0000 (20:48 +0100)]
amd/llvm: Fix divergent descriptor regressions with radeonsi.

piglit/bin/arb_bindless_texture-limit -auto -fbo:
  Needed to deal with non-NULL dynamic_index without deref in tex instructions.

piglit/bin/shader_runner tests/spec/arb_bindless_texture/execution/images/multiple-resident-images-reading.shader_test -auto:
  Need to deal with non-deref images in enter_waterfall_imae.

Fixes: b83c9aca4a5 "amd/llvm: Fix divergent descriptor indexing. (v3)"
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>

4 years agogallium: fix build with latest meson and gcc10
Dave Airlie [Tue, 17 Mar 2020 20:15:06 +0000 (06:15 +1000)]
gallium: fix build with latest meson and gcc10

In Fedora 32 build was failing with meson-0.53.2-1.git88e40c7.fc32
and gcc-10.0.1-0.9.fc32.x86_64.

Worked with meson-0.53.1-1 and same gcc.

/usr/bin/ld: src/gallium/state_trackers/dri/libdri.a(dri2.c.o): in function `dri2_interop_export_object':
/home/airlied/devel/mesa/mesa/build/../src/gallium/state_trackers/dri/dri2.c:1813: undefined reference to `st_finalize_texture'
/usr/bin/ld: src/gallium/state_trackers/dri/libdri.a(dri_screen.c.o): in function `dri_init_screen_helper':
/home/airlied/devel/mesa/mesa/build/../src/gallium/state_trackers/dri/dri_screen.c:580: undefined reference to `st_gl_api_create'

Moving this around seems to fix it.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4220>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4220>

4 years agoac: don't set old denormals flags with LLVM >= 11
Marek Olšák [Fri, 13 Mar 2020 23:49:59 +0000 (19:49 -0400)]
ac: don't set old denormals flags with LLVM >= 11

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>

4 years agoac: set new LLVM denormal flags
Marek Olšák [Mon, 6 Jan 2020 20:27:15 +0000 (15:27 -0500)]
ac: set new LLVM denormal flags

See: https://reviews.llvm.org/D71358

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>

4 years agoac: unify denorm setting enforcement
Marek Olšák [Thu, 23 Jan 2020 20:52:01 +0000 (15:52 -0500)]
ac: unify denorm setting enforcement

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>

4 years agogallium/u_vbuf: simplify the first if statement in u_vbuf_upload_buffers
Marek Olšák [Wed, 11 Mar 2020 23:28:44 +0000 (19:28 -0400)]
gallium/u_vbuf: simplify the first if statement in u_vbuf_upload_buffers

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4153>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4153>

4 years agogallium/u_threaded: don't sync the thread for all unsychronized mappings
Marek Olšák [Wed, 11 Mar 2020 21:27:29 +0000 (17:27 -0400)]
gallium/u_threaded: don't sync the thread for all unsychronized mappings

This was missing for the READ case. This improves glBegin/End performance.
(vbo maps with WRITE | READ | UNSYCHRONIZED)

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4153>

4 years agofreedreno/a5xx: Fix min-vs-mag filtering decisions on non-mipmap tex.
Eric Anholt [Wed, 11 Mar 2020 23:39:04 +0000 (16:39 -0700)]
freedreno/a5xx: Fix min-vs-mag filtering decisions on non-mipmap tex.

This a port of 3338d6e5f8b5 ("freedreno/a3xx: Mostly fix min-vs-mag
filtering decisions on non-mipmap tex.")

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4177>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4177>

4 years agoci: Enable testing GLES2-3 on a530 (Dragonboard 820c).
Eric Anholt [Tue, 3 Mar 2020 22:38:09 +0000 (14:38 -0800)]
ci: Enable testing GLES2-3 on a530 (Dragonboard 820c).

Following on from the db410c conversion to baremetal testing, reuse the
same scripts in the same rack to run 7 db820c boards (#4/8 is failing in
the bootloader for unknown reasons).

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4177>

4 years agoci: Enable ccaching of CMake builds as well.
Eric Anholt [Wed, 11 Mar 2020 18:11:19 +0000 (11:11 -0700)]
ci: Enable ccaching of CMake builds as well.

They ignore $PATH for unknown reasons, so you have to force the ccache
wrapping yourself.

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4099>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4099>

4 years agoci: Enable ccache in the container builds.
Eric Anholt [Fri, 6 Mar 2020 21:23:20 +0000 (13:23 -0800)]
ci: Enable ccache in the container builds.

This should reduce our container rebuild times, particularly on the
40-minute ARM build (which is split across only 2 runners and thus likely
to have a hot cache) when working on updating containers.

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4099>

4 years agoci: Update the ci-templates commit.
Eric Anholt [Fri, 6 Mar 2020 21:23:20 +0000 (13:23 -0800)]
ci: Update the ci-templates commit.

There has been a big rename of variables in the upstream repo to make it
clear what's being handed to ci-templates.

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4099>

4 years agoanv: Do an end-of-pipe sync before updating AUX table entries
Jason Ekstrand [Tue, 17 Mar 2020 03:58:53 +0000 (22:58 -0500)]
anv: Do an end-of-pipe sync before updating AUX table entries

We've found in GL that an actual end-of-pipe sync is required before
invalidating the aux tables and that a simple CS stall is insufficient.
If we're about to modify the actual AUX table entries from the GPU, we
should definitely make sure it's stopped dead before we do so.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4206>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4206>

4 years agointel/blorp: Plumb the stage through blorp upload_shader
Caio Marcelo de Oliveira Filho [Thu, 12 Mar 2020 21:27:13 +0000 (14:27 -0700)]
intel/blorp: Plumb the stage through blorp upload_shader

Vulkan uses that for its own upload function -- even though for BLORP
it doesn't really currently care.  Neither Iris and i965 makes use of
it at the moment.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4170>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4170>

4 years agozink: zero out zink_render_pass_state
Duncan Hopkins [Thu, 12 Mar 2020 16:45:39 +0000 (16:45 +0000)]
zink: zero out zink_render_pass_state

Since zink_render_pass_state is used as a hash-key, the entire struct gets
compared. This means we don't want any uninitialized padding in there, or
else we risk getting false negatives. This has led to issues on macOS builds.

So let's zero out the struct before we start filling it out.

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4212>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4212>

4 years agoradv/gfx10: fix required ballot size with VK_EXT_subgroup_size_control
Samuel Pitoiset [Mon, 16 Mar 2020 17:44:18 +0000 (18:44 +0100)]
radv/gfx10: fix required ballot size with VK_EXT_subgroup_size_control

If compute shaders require a specific subgroup size (ie. Wave32),
we have to use the correct ballot size.

Fixes dEQP-VK.subgroups.ballot_other.compute.*_requiredsubgroupSize.

Fixes: fb07fd4e6cb ("radv: implement VK_EXT_subgroup_size_control")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>

4 years agoradv/gfx10: fix required subgroup size with VK_EXT_subgroup_size_control
Samuel Pitoiset [Mon, 16 Mar 2020 16:29:33 +0000 (17:29 +0100)]
radv/gfx10: fix required subgroup size with VK_EXT_subgroup_size_control

If compute shaders require a specific subgroup size (ie. Wave32),
we have to return the correct one.

Fixes dEQP-VK.subgroups.size_control.compute.required_subgroup_size_*.

Fixes: fb07fd4e6cb ("radv: implement VK_EXT_subgroup_size_control")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>

4 years agoradv: only inject implicit subpass dependencies if necessary
Samuel Pitoiset [Tue, 17 Mar 2020 08:45:50 +0000 (09:45 +0100)]
radv: only inject implicit subpass dependencies if necessary

The Vulkan 1.2.134 spec update clarified when implicit subpass
dependencies should be injected by the driver. They only make
sense if automatic layout transitions are performed.

This should fix a performance regression with RPCS3 (although
they added a workaround for RADV since the regression has been found).

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2502
Fixes: e60de085473 ("radv: handle missing implicit subpass dependencies")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4210>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4210>

4 years agogitlab-ci: Enable more Gallium drivers in meson-i386 job
Michel Dänzer [Thu, 12 Mar 2020 11:31:05 +0000 (12:31 +0100)]
gitlab-ci: Enable more Gallium drivers in meson-i386 job

These are the ones which can be enabled with the current x86_build
docker image and which build without warnings.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4166>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4166>

4 years agollvmpipe: Use uintptr_t for pointer values
Michel Dänzer [Thu, 12 Mar 2020 14:03:20 +0000 (15:03 +0100)]
llvmpipe: Use uintptr_t for pointer values

Instead of uint64_t. Fixes potentially writing beyond the end of the
handles pointer array on 32-bit architectures (and copying all 0s
instead of the computed pointer values to the array on big endian
ones).

Corresponding compiler warning:

../src/gallium/drivers/llvmpipe/lp_state_cs.c: In function ‘llvmpipe_set_global_binding’:
../src/gallium/drivers/llvmpipe/lp_state_cs.c:1312:12: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
 1312 |       va = (uint64_t)((char *)lp_res->data + offset);
      |            ^

Fixes: 264663d55d32 "gallivm/llvmpipe: add support for global
                     operations."

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4166>

4 years agogitlab-ci: Move classic driver testing to a new meson-classic job
Michel Dänzer [Thu, 12 Mar 2020 11:29:40 +0000 (12:29 +0100)]
gitlab-ci: Move classic driver testing to a new meson-classic job

The motivation is to allow llvmpipe to be enabled instead in the
meson-i386 job.

v2: (Eric Engestrom)
* Rename meson-main job to meson-gallium
* Remove stale comment above meson-i386 job

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4166>

4 years agogitlab-ci: Fold scons-swr job into scons job
Michel Dänzer [Thu, 12 Mar 2020 11:13:44 +0000 (12:13 +0100)]
gitlab-ci: Fold scons-swr job into scons job

Should be fast enough.

Acked-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4166>

4 years agotu: Fix border color with compute shaders
Connor Abbott [Mon, 16 Mar 2020 14:23:44 +0000 (15:23 +0100)]
tu: Fix border color with compute shaders

I wasn't able to find any CTS tests that used compute shaders with
samplers and set a border color, so I hacked one of the tests included
with amber:

https://gist.github.com/cwabbott0/e72f0ed8259b84ed6bf3920c68fefee6

The register was found via looking at dumps of the Vulkan blob, and
setting it fixes this test.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4204>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4204>

4 years agogitlab-ci: Don't use buster-backports packages by default for x86_build
Michel Dänzer [Tue, 17 Mar 2020 08:34:51 +0000 (09:34 +0100)]
gitlab-ci: Don't use buster-backports packages by default for x86_build

The backports repository can be temporarily inconsistent between
architectures, which can break the docker image build.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4209>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4209>

4 years agoci: Drop the git dependency in tracie
Rohan Garg [Fri, 28 Feb 2020 12:48:53 +0000 (13:48 +0100)]
ci: Drop the git dependency in tracie

Instead of using git, use python and the Gitlab API
to fetch traces. This helps us slim down our ramdisks
in preparation for integrating trace replay on LAVA
devices.

Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4000>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4000>

4 years agogitlab-ci: Use surfaceless platform also for apitrace
Tomeu Vizoso [Fri, 6 Mar 2020 09:09:58 +0000 (10:09 +0100)]
gitlab-ci: Use surfaceless platform also for apitrace

In preparation for using apitrace to replay traces in LAVA jobs, build a
newer waffle so apitrace can use the surfaceless EGL platform.

As things were before this commit, Xvfb would have been needed in the
LAVA images.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4000>

4 years agogitlab-ci: Update renderdoc
Tomeu Vizoso [Wed, 11 Mar 2020 07:54:22 +0000 (08:54 +0100)]
gitlab-ci: Update renderdoc

Get closer to upstream to avoid accumulating changes.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4000>

4 years agolima/gpir: fix crash in schedule_insert_ready_list()
Vasily Khoruzhick [Tue, 10 Mar 2020 08:53:57 +0000 (01:53 -0700)]
lima/gpir: fix crash in schedule_insert_ready_list()

Fix crash if node is already at position we want. Otherwise we remove
it from list (and list->prev becomes NULL) and then we dereference list->prev
in list_addtail()

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4126>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4126>

4 years agolima/gpir: add better lowering for ftrunc
Vasily Khoruzhick [Tue, 10 Mar 2020 08:30:14 +0000 (01:30 -0700)]
lima/gpir: add better lowering for ftrunc

GP doesn't support ftrunc natively and unfortunately one in generic
opt_algebraic is not GP-friendly either. Introduce our own lowering
that utilizes fsign() that GP supports:
ftrunc(a) = fmul(fsign(a), ffloor(fmax(a, -a)))

Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4126>

4 years agolima/gpir: kill dead writes to regs in DCE
Vasily Khoruzhick [Tue, 10 Mar 2020 02:34:43 +0000 (19:34 -0700)]
lima/gpir: kill dead writes to regs in DCE

Writes to regs that are never read will confuse regalloc since they
are never live and don't conflict with any regs. Kill them to prevent
overwriting another live reg.

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4125>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4125>

4 years agolima/gpir: Optimize nots created from branch lowering
Connor Abbott [Fri, 4 Oct 2019 14:16:52 +0000 (10:16 -0400)]
lima/gpir: Optimize nots created from branch lowering

We also add a DCE pass to cleanup the result of this pass, which turns
out to also be necessary to cleanup the result of nir->gpir in some
cases that we didn't hit until the next commit.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4125>

4 years agolima/gpir: Optimize conditional break/continue
Connor Abbott [Thu, 3 Oct 2019 19:24:02 +0000 (15:24 -0400)]
lima/gpir: Optimize conditional break/continue

Optimize the result of a conditional break/continue. In NIR something
like:

loop {
   ...
   if (cond)
      continue;

would get lowered to:

block_0:
...
block_1:
branch_cond !cond block_3
block_2:
branch_uncond block_0
block_3:
...

We recognize the conditional branch skipping over the unconditional
branch, and turn it into:

block_0:
...
block_1:
branch_cond cond block_0
block_2:
block_3:

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4125>

4 years agolima/gpir: Make lima_gpir_node_insert_child() useful
Connor Abbott [Thu, 3 Oct 2019 19:19:40 +0000 (15:19 -0400)]
lima/gpir: Make lima_gpir_node_insert_child() useful

We weren't using this function before. The name is confusing, but it
changes the child while also fixing up the dependence link, if you don't
have access to it already. Or at least, I think that's what the
intention is, and what we'll need to change the branch condition in the
next commit. Adding a dependency between the new and old source doesn't
make any sense for this, and we also need to change the actual source.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4125>

4 years agopanfrost: Fix gnu-empty-initializer error.
Vinson Lee [Sun, 15 Mar 2020 01:57:05 +0000 (18:57 -0700)]
panfrost: Fix gnu-empty-initializer error.

../src/gallium/drivers/panfrost/pan_cmdstream.c:1553:54: error: use of GNU empty initializer extension [-Werror,-Wgnu-empty-initializer]
        union mali_attr varyings[PIPE_MAX_ATTRIBS] = { };
                                                     ^

Fixes: 836686daf36c ("panfrost: Move panfrost_emit_varying_descriptor() to pan_cmdstream.c")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4198>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4198>

4 years agoaco: fix operand order for LS VGPR init bug workaround
Rhys Perry [Mon, 16 Mar 2020 17:11:16 +0000 (17:11 +0000)]
aco: fix operand order for LS VGPR init bug workaround

Fixes: a952bf3946 ('aco: Fix LS VGPR init bug on affected hardware.')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4201>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4201>

4 years agoaco: fix instruction encoding for LS VGPR init bug workaround
Rhys Perry [Mon, 16 Mar 2020 13:47:55 +0000 (13:47 +0000)]
aco: fix instruction encoding for LS VGPR init bug workaround

Fixes: a952bf3946 ('aco: Fix LS VGPR init bug on affected hardware.')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4201>

4 years agoaco: set late kill for v_interp_p1_f32 for some APUs
Rhys Perry [Fri, 21 Feb 2020 18:53:19 +0000 (18:53 +0000)]
aco: set late kill for v_interp_p1_f32 for some APUs

Apparently needed for Stoney Ridge, Kabini and Mullins APUs.

gfx702 also has 16-bank LDS and https://llvm.org/docs/AMDGPUUsage.html
lists some dGPUs under there. Those GPUs seem to be Hawaii actually
(gfx701) and we don't seem to have gotten any interpolation related bugs
reported with them so far.

The late kill flag was tested by running pipeline-db with
ACO_DEBUG=validatera while setting late kill for SMEM buffer loads,
emit_vop2_instruction() and texture instructions. I also tested with
just setting the flag for v_interp_p1_f32.

As far as I know, the only other thing we have to consider for 16-bank LDS
is something to do with 16-bit interpolation. We don't do that yet.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914>

4 years agoaco: add a late kill flag
Rhys Perry [Fri, 21 Feb 2020 15:46:39 +0000 (15:46 +0000)]
aco: add a late kill flag

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914>

4 years agoaco: move some register demand helpers into aco_live_var_analysis.cpp
Rhys Perry [Fri, 21 Feb 2020 20:14:03 +0000 (20:14 +0000)]
aco: move some register demand helpers into aco_live_var_analysis.cpp

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914>

4 years agoradv/sqtt: handle thread trace capture in sqtt_QueuePresentKHR()
Samuel Pitoiset [Fri, 13 Mar 2020 09:39:41 +0000 (10:39 +0100)]
radv/sqtt: handle thread trace capture in sqtt_QueuePresentKHR()

To avoid wasting CPU cycles when thread trace is not enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4180>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4180>

4 years agoanv: Push UBO ranges relative to the start of the binding
Jason Ekstrand [Fri, 13 Mar 2020 20:30:41 +0000 (15:30 -0500)]
anv: Push UBO ranges relative to the start of the binding

There was a disconnect between anv_nir_compute_push_layout and the code
which sets up the push_ubo_sizes array.  The NIR code we emit checks
relative to the start of the bound UBO range so that, if we end up with
a vector which straddles the start of the push range, we can perform the
bounds check without risking overflow issues.  The code which sets up
the push_ubo_sizes, on the other hand, assumed it was relative to the
start of the push range.  Somehow, this didn't get get caught by any of
the available tests.

Fixes: e03f9652801 "anv: Bounds-check pushed UBOs when ..."
Closes: #2623
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4195>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4195>

4 years agoanv: Fix the comparison in an assert
Jason Ekstrand [Fri, 13 Mar 2020 17:05:25 +0000 (12:05 -0500)]
anv: Fix the comparison in an assert

Fixes: e03f9652801 "anv: Bounds-check pushed UBOs when ..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4195>

4 years agogitlab-ci: bump Vulkan CTS to 1.2.1.0
Samuel Pitoiset [Thu, 5 Mar 2020 17:10:14 +0000 (18:10 +0100)]
gitlab-ci: bump Vulkan CTS to 1.2.1.0

Vulkan CTS 1.1.6.0 is quite old.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4179>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4179>

4 years agogitlab-ci: do not set the number of deqp-parallel jobs for RADV CTS
Samuel Pitoiset [Thu, 5 Mar 2020 14:09:38 +0000 (15:09 +0100)]
gitlab-ci: do not set the number of deqp-parallel jobs for RADV CTS

Let's the runner uses the maximum number of jobs to speedup CTS.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4179>

4 years agogitlab-ci: allow deqp-runner to use the maximum number of jobs
Samuel Pitoiset [Thu, 5 Mar 2020 14:20:34 +0000 (15:20 +0100)]
gitlab-ci: allow deqp-runner to use the maximum number of jobs

if $DEQP_PARALLEL is not set, it will use the maximum number of
jobs instead of 1.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4179>

4 years agogitlab-ci: remove useless 'patch' package in the VK test image
Samuel Pitoiset [Thu, 5 Mar 2020 12:45:56 +0000 (13:45 +0100)]
gitlab-ci: remove useless 'patch' package in the VK test image

It was copied from the GL test image but it's actually unused.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4179>

4 years agotu: Rewrite border color handling
Connor Abbott [Thu, 12 Mar 2020 11:39:16 +0000 (12:39 +0100)]
tu: Rewrite border color handling

Emit a single table of all possible Vulkan border colors up front, and
then index into it using the Vulkan enum directly. In fact this seems to
be the entire point of separating out border colors in the first place.

In addition to being simpler and having less CPU overhead, and fixing
cases where more than one sampler uses border color, this paves the way
for bindless samplers because the existing approach isn't great for
bindless.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4200>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4200>

4 years agomeson: Avoid duplicate symbols.
Jose Fonseca [Tue, 10 Mar 2020 17:26:35 +0000 (17:26 +0000)]
meson: Avoid duplicate symbols.

All the stubs in src/compiler/glsl/glcpp/pp_standalone_scaffolding.c
are duplicate symbols.  They should only be used as replacement for
Mesa functions when building glcpp and glsl standalone compilers, but
in fact they are getting linked with Mesa.

This change fixes this by moving the standalone stubs to a
libglcpp_standalone target, that's only linked with the glcpp/glsl
tools.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4186>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4186>

4 years agoRevert "ci: Remove T820 from CI temporarily"
Neil Armstrong [Wed, 4 Mar 2020 13:39:20 +0000 (14:39 +0100)]
Revert "ci: Remove T820 from CI temporarily"

This reverts commit 089c8f0b8da86a05bde8359c84085e0b795abf17.

Our office changes are finished and power is now stable in our lab
for T820 CI to run again.

Cc: Daniel Stone <daniels@collabora.com>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4057>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4057>

4 years agogitlab-ci/lava: fix handling of lava tags
Neil Armstrong [Thu, 5 Mar 2020 14:19:15 +0000 (15:19 +0100)]
gitlab-ci/lava:  fix handling of lava tags

The lava tags was a python array not it's a gitlab CI string,
slit the string with periods in the jinja2 template to avoid having
the following tags :

tags:
 - p
 - a
 - n
 - f
 - r
 - o
 - s
 - t

instead of :
tags:
 - panfrost

Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4057>

4 years agoiris: allow compression conditionally for images on gen12
Tapani Pälli [Thu, 12 Mar 2020 06:24:09 +0000 (08:24 +0200)]
iris: allow compression conditionally for images on gen12

With this change, amount of resolves happening with deqp-gles31
(--deqp-case=*load_store*) drops ~50%.

v2: use iris_image_view_get_format to get the format,
    get devinfo from context instead of passing it (Nanley)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>

4 years agoisl: allow compression for storage images on gen12+
Tapani Pälli [Fri, 6 Mar 2020 07:27:13 +0000 (09:27 +0200)]
isl: allow compression for storage images on gen12+

This is done to be able to use ISL_AUX_USAGE_CCS_E with images.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>

4 years agoiris: determine aux usage during predraw and state setup
Tapani Pälli [Thu, 12 Mar 2020 11:43:14 +0000 (13:43 +0200)]
iris: determine aux usage during predraw and state setup

Patch changes surface state setup to alloc/fill states for all possible
usages for image resource on gen12. Also predraw and binding table
population is changed to determine correct aux usage with the new
iris_image_view_aux_usage.

v2: alloc always all states independent on current image
    aux state on gen >= 12 , code cleanups (Nanley)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>

4 years agoiris: move existing image format fallback as a helper function
Tapani Pälli [Thu, 12 Mar 2020 06:16:28 +0000 (08:16 +0200)]
iris: move existing image format fallback as a helper function

Patch adds a helper function for determining image format and changes
iris_set_shader_images to use it.

v2: pass iris_context instead of pipe one, rename function,
    code cleanup (Nanley)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>

4 years agoiris: provide dummy iris_image_view_aux_usage
Tapani Pälli [Fri, 6 Mar 2020 07:17:35 +0000 (09:17 +0200)]
iris: provide dummy iris_image_view_aux_usage

Similar to iris_resource_texture_aux_usage this function will
determine proper aux_usage for image, now it will default to
ISL_AUX_USAGE_NONE.

v2: drop gen_device_info parameter, rename function (Nanley)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>

4 years agointel/compiler: detect if atomic load store operations are used
Tapani Pälli [Fri, 6 Mar 2020 06:59:16 +0000 (08:59 +0200)]
intel/compiler: detect if atomic load store operations are used

Patch adds a new arg and modifies existing calls from i965, anv
pass NULL but iris stores this information for later use.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>

4 years agoiris: use the images_used mask in resolve pass
Tapani Pälli [Wed, 19 Feb 2020 06:34:51 +0000 (08:34 +0200)]
iris: use the images_used mask in resolve pass

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>

4 years agonir/glsl: gather bitmask of images used by program
Tapani Pälli [Wed, 19 Feb 2020 06:33:04 +0000 (08:33 +0200)]
nir/glsl: gather bitmask of images used by program

In a similar fashion as commit f5c7df4dc95 does for textures.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>

4 years agost/mesa: Fix signed integer overflow when using util_throttle_memory_usage
Danylo Piliaiev [Fri, 13 Mar 2020 15:10:08 +0000 (17:10 +0200)]
st/mesa: Fix signed integer overflow when using util_throttle_memory_usage

../src/mesa/state_tracker/st_cb_texture.c:1719:57: runtime error: signed integer overflow: 203489280 * 16 cannot be represented in type 'int'

Fixes: 21ca322e637291b89a445159fc45b8dbf638e6c9
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4185>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4185>

4 years agoisl: Avoid EXPECT_DEATH in unit tests
Matt Turner [Thu, 12 Mar 2020 21:44:46 +0000 (14:44 -0700)]
isl: Avoid EXPECT_DEATH in unit tests

EXPECT_DEATH works by forking the process and letting the forked process
fail with an assertion. This process is evidently incredibly expensive,
taking ~30 seconds to run the whole isl_aux_info_test on a 2.8GHz
Skylake. Annoyingly all of the (expected) assertion failures also leaves
lots of messages in dmesg and potentially generates lots of coredumps.

Instead, avoid the expense of fork/exec by redefining assert() and
unreachable() in the code we're testing to return a unit-test-only
value. With this patch, the test takes ~1ms.

Also, while modifying the EXPECT_EQ() calls, reverse the arguments so
that the expected value comes first, as is intended. Otherwise gtest
failure messages don't make much sense.

Fixes: https://gitlab.freedesktop.org/mesa/mesa/issues/2567
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4174>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4174>

4 years agogallium/swr: use ElementCount type arguments for getSplat()
Jan Zielinski [Fri, 13 Mar 2020 19:00:39 +0000 (20:00 +0100)]
gallium/swr: use ElementCount type arguments for getSplat()

Reviewed-by: Alok Hota <alok.hota@intel.com>
In LLVM11, ConstantVector::getSplat() function definition
has changed and the first function argument has now ElementCount type.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4188>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4188>

4 years agoetnaviv: enable shareable shaders
Christian Gmeiner [Fri, 6 Mar 2020 21:44:42 +0000 (22:44 +0100)]
etnaviv: enable shareable shaders

We are not using any pctx reference in the shader so it seems fine
to enable this cap.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4095>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4095>

4 years agoetnaviv: get rid of etna_spec in etna_context
Christian Gmeiner [Fri, 6 Mar 2020 22:23:47 +0000 (23:23 +0100)]
etnaviv: get rid of etna_spec in etna_context

There is no need to have a complete copy of etna_spec - just
reference the one and only from etna_screen.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4095>

4 years agoanv: Dump push ranges via VK_KHR_pipeline_executable_properties
Jason Ekstrand [Thu, 12 Mar 2020 23:05:31 +0000 (18:05 -0500)]
anv: Dump push ranges via VK_KHR_pipeline_executable_properties

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4173>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4173>

4 years agoaco: don't stop scheduling at exports
Rhys Perry [Tue, 11 Feb 2020 16:55:39 +0000 (16:55 +0000)]
aco: don't stop scheduling at exports

This allows us to move v_cvt_pkrtz_f16_f32 instructions upwards, improving
schedules and (somewhat unintentionally) moving the exports slightly
closer together.

Totals from affected shaders:
SGPRS: 1030224 -> 1030248 (0.00 %)
VGPRS: 794080 -> 794392 (0.04 %)
Spilled SGPRs: 127117 -> 127117 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 89028152 -> 89032312 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 65252 -> 65219 (-0.05 %)
SMEM score: 843808.00 -> 843918.00 (0.01 %)
VMEM score: 5331687.00 -> 5397802.00 (1.24 %)
SMEM clauses: 567659 -> 567655 (-0.00 %)
VMEM clauses: 290715 -> 290716 (0.00 %)
Instructions: 17143219 -> 17144259 (0.01 %)
Cycles: 1098442808 -> 1098446968 (0.00 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>

4 years agoaco: allow barriers to be skipped during scheduling
Rhys Perry [Tue, 11 Feb 2020 14:15:32 +0000 (14:15 +0000)]
aco: allow barriers to be skipped during scheduling

Much better scheduling apparently in 160 shaders

Totals from affected shaders:
SGPRS: 6272 -> 6344 (1.15 %)
VGPRS: 4832 -> 4844 (0.25 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 467192 -> 467428 (0.05 %) bytes
LDS: 459 -> 459 (0.00 %) blocks
Max Waves: 1407 -> 1409 (0.14 %)
SMEM score: 9309.00 -> 11216.00 (20.49 %)
VMEM score: 26679.00 -> 33652.00 (26.14 %)
SMEM clauses: 1817 -> 1776 (-2.26 %)
VMEM clauses: 2286 -> 2288 (0.09 %)
Instructions: 86537 -> 86596 (0.07 %)
Cycles: 676260 -> 676568 (0.05 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>

4 years agoaco: add helpers for ensuring correct ordering while scheduling
Rhys Perry [Thu, 7 Nov 2019 14:48:51 +0000 (14:48 +0000)]
aco: add helpers for ensuring correct ordering while scheduling

Pipeline-db changes in 721 shaders.

Totals from affected shaders:
SGPRS: 42336 -> 42656 (0.76 %)
VGPRS: 38368 -> 38636 (0.70 %)
Spilled SGPRs: 11967 -> 11967 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 5268088 -> 5269840 (0.03 %) bytes
LDS: 1069 -> 1069 (0.00 %) blocks
Max Waves: 4473 -> 4447 (-0.58 %)
SMEM score: 41155.00 -> 41826.00 (1.63 %)
VMEM score: 146339.00 -> 147471.00 (0.77 %)
SMEM clauses: 24434 -> 24535 (0.41 %)
VMEM clauses: 16637 -> 16592 (-0.27 %)
Instructions: 996037 -> 996388 (0.04 %)
Cycles: 76476112 -> 75281416 (-1.56 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>

4 years agoaco: add helpers for moving instructions for scheduling
Rhys Perry [Wed, 6 Nov 2019 16:38:57 +0000 (16:38 +0000)]
aco: add helpers for moving instructions for scheduling

No pipeline-db changes

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>

4 years agoradv: add llvm_compiler_shader() helper
Samuel Pitoiset [Thu, 12 Mar 2020 13:49:55 +0000 (14:49 +0100)]
radv: add llvm_compiler_shader() helper

To match aco_compile_shader().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>

4 years agoradv: remove unnecessary LLVM includes
Samuel Pitoiset [Thu, 12 Mar 2020 13:29:54 +0000 (14:29 +0100)]
radv: remove unnecessary LLVM includes

They are already included from src/amd/llvm.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>

4 years agoradv: remove radv_shader_variant::aco_used
Samuel Pitoiset [Thu, 12 Mar 2020 13:24:49 +0000 (14:24 +0100)]
radv: remove radv_shader_variant::aco_used

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>

4 years agoradv: cleanup occurences of use_aco everywhere
Samuel Pitoiset [Thu, 12 Mar 2020 13:20:05 +0000 (14:20 +0100)]
radv: cleanup occurences of use_aco everywhere

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>

4 years agoglsl: do not crash if string literal is used outside of #include/#line
Danylo Piliaiev [Wed, 11 Mar 2020 13:29:12 +0000 (15:29 +0200)]
glsl: do not crash if string literal is used outside of #include/#line

Fixes: 67b32190f3c953c5b7091d76ddeff95c0cbfb439
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2619
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4146>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4146>

4 years agoanv: Remove duplicate code in anv_cmd_buffer_bind_descriptor_set
Caio Marcelo de Oliveira Filho [Thu, 5 Mar 2020 20:33:05 +0000 (12:33 -0800)]
anv: Remove duplicate code in anv_cmd_buffer_bind_descriptor_set

Also use a single condition statement instead of two.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>

4 years agoanv: Reduce compute pipeline batch_data size
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 23:42:33 +0000 (15:42 -0800)]
anv: Reduce compute pipeline batch_data size

The batch associated with the compute pipeline only needs room for a
MEDIA_VFE_STATE. So this patch moves the batch_data to each pipeline
struct and cap the one in compute pipeline.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>

4 years agoanv: Split graphics and compute bits from anv_pipeline
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 23:31:50 +0000 (15:31 -0800)]
anv: Split graphics and compute bits from anv_pipeline

Add two new structs that use the anv_pipeline as base.  Changed all
functions that work on a specific pipeline to use the corresponding
struct.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>

4 years agoanv: Use a separate field in the pipeline for compute shader
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 21:43:39 +0000 (13:43 -0800)]
anv: Use a separate field in the pipeline for compute shader

This is a preparation for splitting the compute and graphics pipelines
into separate structs.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>

4 years agoanv: Decouple flush_descriptor_sets() from pipeline struct
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 21:21:45 +0000 (13:21 -0800)]
anv: Decouple flush_descriptor_sets() from pipeline struct

Explicitly pass the active stages and the array (and size) of shaders
to be processed.  This will make easy to store only the shaders needed
for each pipeline.

The active stages can be identified by a non-NULL shader in the
shaders array, so stop using it and keep track of the flushed stages
as iteration happens.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>