Ian Romanick [Tue, 4 Jun 2019 19:16:55 +0000 (12:16 -0700)]
intel/compiler: Treat b32csel as potentially producing a Boolean result for resolve analysis
If the 2nd and 3rd source are both Boolean values, we can potentially
avoid a resolve by only resolving the result of the b32csel.
No changes on any Gen6+ Intel platform.
v2: Use ?: instead of cast from bool to unsigned. Suggested by Caio.
Iron Lake
total instructions in shared programs:
8142729 ->
8142677 (<.01%)
instructions in affected programs: 12890 -> 12838 (-0.40%)
helped: 26
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.25% max: 0.74% x̄: 0.45% x̃: 0.38%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -0.52% -0.39%
Instructions are helped.
total cycles in shared programs:
188549632 ->
188549394 (<.01%)
cycles in affected programs: 60754 -> 60516 (-0.39%)
helped: 25
HURT: 1
helped stats (abs) min: 2 max: 26 x̄: 9.92 x̃: 8
helped stats (rel) min: 0.07% max: 2.23% x̄: 0.59% x̃: 0.27%
HURT stats (abs) min: 10 max: 10 x̄: 10.00 x̃: 10
HURT stats (rel) min: 0.70% max: 0.70% x̄: 0.70% x̃: 0.70%
95% mean confidence interval for cycles value: -12.91 -5.40
95% mean confidence interval for cycles %-change: -0.84% -0.23%
Cycles are helped.
GM45
total instructions in shared programs:
5013119 ->
5013093 (<.01%)
instructions in affected programs: 6764 -> 6738 (-0.38%)
helped: 13
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.24% max: 0.68% x̄: 0.43% x̃: 0.36%
95% mean confidence interval for instructions value: -2.00 -2.00
95% mean confidence interval for instructions %-change: -0.52% -0.34%
Instructions are helped.
total cycles in shared programs:
128977804 ->
128977700 (<.01%)
cycles in affected programs: 37738 -> 37634 (-0.28%)
helped: 13
HURT: 0
helped stats (abs) min: 8 max: 8 x̄: 8.00 x̃: 8
helped stats (rel) min: 0.18% max: 0.46% x̄: 0.30% x̃: 0.26%
95% mean confidence interval for cycles value: -8.00 -8.00
95% mean confidence interval for cycles %-change: -0.36% -0.24%
Cycles are helped.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick [Tue, 21 May 2019 00:25:01 +0000 (17:25 -0700)]
intel/fs: Improve discard_if code generation
Previously we would blindly emit an sequence like:
mov(1) f0.1<1>UW g1.14<0,1,0>UW
...
cmp.l.f0(16) g7<1>F g5<8,8,1>F 0x41700000F /* 15F */
(+f0.1) cmp.z.f0.1(16) null<1>D g7<8,8,1>D 0D
The first move sets the flags based on the initial execution mask.
Later discard sequences contain a predicated compare that can only
remove more SIMD channels. Often times the only user of the result from
the first compare is the second compare. Instead, generate a sequence
like
mov(1) f0.1<1>UW g1.14<0,1,0>UW
...
cmp.l.f0(16) g7<1>F g5<8,8,1>F 0x41700000F /* 15F */
(+f0.1) cmp.ge.f0.1(8) null<1>F g5<8,8,1>F 0x41700000F /* 15F */
If the results stored in g7 and f0.0 are not used, the comparison will
be eliminated. This removes an instruction and potentially reduces
register pressure.
v2: Major re-write of the commit message (including fixing the assembly
code). Suggested by Matt.
All Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs:
17224434 ->
17198659 (-0.15%)
instructions in affected programs:
2908125 ->
2882350 (-0.89%)
helped: 18891
HURT: 5
helped stats (abs) min: 1 max: 12 x̄: 1.38 x̃: 1
helped stats (rel) min: 0.03% max: 25.00% x̄: 1.76% x̃: 1.02%
HURT stats (abs) min: 9 max: 105 x̄: 51.40 x̃: 35
HURT stats (rel) min: 0.43% max: 4.92% x̄: 2.34% x̃: 1.56%
95% mean confidence interval for instructions value: -1.39 -1.34
95% mean confidence interval for instructions %-change: -1.79% -1.73%
Instructions are helped.
total cycles in shared programs:
361468458 ->
361170679 (-0.08%)
cycles in affected programs:
38470116 ->
38172337 (-0.77%)
helped: 16202
HURT: 1456
helped stats (abs) min: 1 max: 4473 x̄: 26.24 x̃: 18
helped stats (rel) min: <.01% max: 28.44% x̄: 2.90% x̃: 2.18%
HURT stats (abs) min: 1 max: 5982 x̄: 87.51 x̃: 28
HURT stats (rel) min: <.01% max: 51.29% x̄: 5.48% x̃: 1.64%
95% mean confidence interval for cycles value: -18.24 -15.49
95% mean confidence interval for cycles %-change: -2.26% -2.14%
Cycles are helped.
total spills in shared programs: 12147 -> 12176 (0.24%)
spills in affected programs: 175 -> 204 (16.57%)
helped: 8
HURT: 5
total fills in shared programs: 25262 -> 25292 (0.12%)
fills in affected programs: 269 -> 299 (11.15%)
helped: 8
HURT: 5
Haswell
total instructions in shared programs:
13530316 ->
13502647 (-0.20%)
instructions in affected programs:
2507824 ->
2480155 (-1.10%)
helped: 18859
HURT: 10
helped stats (abs) min: 1 max: 12 x̄: 1.48 x̃: 1
helped stats (rel) min: 0.03% max: 27.78% x̄: 2.38% x̃: 1.41%
HURT stats (abs) min: 5 max: 39 x̄: 25.70 x̃: 31
HURT stats (rel) min: 0.22% max: 1.66% x̄: 1.09% x̃: 1.31%
95% mean confidence interval for instructions value: -1.49 -1.44
95% mean confidence interval for instructions %-change: -2.42% -2.34%
Instructions are helped.
total cycles in shared programs:
377865412 ->
377639034 (-0.06%)
cycles in affected programs:
40169572 ->
39943194 (-0.56%)
helped: 15550
HURT: 1938
helped stats (abs) min: 1 max: 2482 x̄: 25.67 x̃: 18
helped stats (rel) min: <.01% max: 37.77% x̄: 3.00% x̃: 2.25%
HURT stats (abs) min: 1 max: 4862 x̄: 89.17 x̃: 35
HURT stats (rel) min: <.01% max: 67.67% x̄: 6.16% x̃: 2.75%
95% mean confidence interval for cycles value: -14.42 -11.47
95% mean confidence interval for cycles %-change: -2.05% -1.91%
Cycles are helped.
total spills in shared programs: 26769 -> 26814 (0.17%)
spills in affected programs: 826 -> 871 (5.45%)
helped: 9
HURT: 10
total fills in shared programs: 38383 -> 38425 (0.11%)
fills in affected programs: 834 -> 876 (5.04%)
helped: 9
HURT: 10
LOST: 5
GAINED: 10
Ivy Bridge
total instructions in shared programs:
12079250 ->
12044139 (-0.29%)
instructions in affected programs:
2409680 ->
2374569 (-1.46%)
helped: 16135
HURT: 0
helped stats (abs) min: 1 max: 23 x̄: 2.18 x̃: 2
helped stats (rel) min: 0.07% max: 37.50% x̄: 2.72% x̃: 1.68%
95% mean confidence interval for instructions value: -2.21 -2.14
95% mean confidence interval for instructions %-change: -2.76% -2.67%
Instructions are helped.
total cycles in shared programs:
180116747 ->
179900405 (-0.12%)
cycles in affected programs:
25439823 ->
25223481 (-0.85%)
helped: 13817
HURT: 1499
helped stats (abs) min: 1 max: 1886 x̄: 26.40 x̃: 18
helped stats (rel) min: <.01% max: 38.84% x̄: 2.57% x̃: 1.97%
HURT stats (abs) min: 1 max: 3684 x̄: 98.99 x̃: 52
HURT stats (rel) min: <.01% max: 97.01% x̄: 6.37% x̃: 3.42%
95% mean confidence interval for cycles value: -15.68 -12.57
95% mean confidence interval for cycles %-change: -1.77% -1.63%
Cycles are helped.
LOST: 8
GAINED: 10
Sandy Bridge
total instructions in shared programs:
10878990 ->
10863659 (-0.14%)
instructions in affected programs:
1806702 ->
1791371 (-0.85%)
helped: 13023
HURT: 0
helped stats (abs) min: 1 max: 5 x̄: 1.18 x̃: 1
helped stats (rel) min: 0.07% max: 13.79% x̄: 1.65% x̃: 1.10%
95% mean confidence interval for instructions value: -1.18 -1.17
95% mean confidence interval for instructions %-change: -1.68% -1.62%
Instructions are helped.
total cycles in shared programs:
154082878 ->
153862810 (-0.14%)
cycles in affected programs:
20199374 ->
19979306 (-1.09%)
helped: 12048
HURT: 510
helped stats (abs) min: 1 max: 323 x̄: 20.57 x̃: 18
helped stats (rel) min: 0.03% max: 17.78% x̄: 2.05% x̃: 1.52%
HURT stats (abs) min: 1 max: 448 x̄: 54.39 x̃: 16
HURT stats (rel) min: 0.02% max: 37.98% x̄: 4.13% x̃: 1.17%
95% mean confidence interval for cycles value: -17.97 -17.08
95% mean confidence interval for cycles %-change: -1.84% -1.75%
Cycles are helped.
LOST: 1
GAINED: 0
Iron Lake
total instructions in shared programs:
8155075 ->
8142729 (-0.15%)
instructions in affected programs: 949495 -> 937149 (-1.30%)
helped: 5810
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.12 x̃: 2
helped stats (rel) min: 0.10% max: 16.67% x̄: 2.53% x̃: 1.85%
95% mean confidence interval for instructions value: -2.14 -2.11
95% mean confidence interval for instructions %-change: -2.59% -2.48%
Instructions are helped.
total cycles in shared programs:
188584610 ->
188549632 (-0.02%)
cycles in affected programs:
17274446 ->
17239468 (-0.20%)
helped: 3881
HURT: 90
helped stats (abs) min: 2 max: 168 x̄: 9.08 x̃: 6
helped stats (rel) min: <.01% max: 23.53% x̄: 0.83% x̃: 0.30%
HURT stats (abs) min: 2 max: 10 x̄: 2.80 x̃: 2
HURT stats (rel) min: <.01% max: 0.60% x̄: 0.10% x̃: 0.07%
95% mean confidence interval for cycles value: -9.35 -8.27
95% mean confidence interval for cycles %-change: -0.85% -0.77%
Cycles are helped.
GM45
total instructions in shared programs:
5019308 ->
5013119 (-0.12%)
instructions in affected programs: 489028 -> 482839 (-1.27%)
helped: 2912
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 2.13 x̃: 2
helped stats (rel) min: 0.10% max: 16.67% x̄: 2.46% x̃: 1.81%
95% mean confidence interval for instructions value: -2.14 -2.11
95% mean confidence interval for instructions %-change: -2.54% -2.39%
Instructions are helped.
total cycles in shared programs:
129002592 ->
128977804 (-0.02%)
cycles in affected programs:
12669152 ->
12644364 (-0.20%)
helped: 2759
HURT: 37
helped stats (abs) min: 2 max: 168 x̄: 9.03 x̃: 4
helped stats (rel) min: <.01% max: 21.43% x̄: 0.75% x̃: 0.31%
HURT stats (abs) min: 2 max: 10 x̄: 3.62 x̃: 4
HURT stats (rel) min: <.01% max: 0.41% x̄: 0.10% x̃: 0.04%
95% mean confidence interval for cycles value: -9.53 -8.20
95% mean confidence interval for cycles %-change: -0.79% -0.70%
Cycles are helped.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick [Tue, 21 May 2019 19:09:42 +0000 (12:09 -0700)]
intel/fs: Add need_dest parameter to fs_visitor::nir_emit_alu
This is the same as the need_dest parameter to
prepare_alu_destination_and_sources. This allows us to not change the
register that is expected to hold an result if an instruction is
re-emitted. This is particularly a problem if the re-emitted
instruction is a partial write. A later patch will use this feature.
No shader-db changes on any Intel platform.
v2: Don't do the Boolean resolve when there is no destination. If the
ALU instruction didn't write a register, there's nothing to resolve.
This replaces an earlier patch "intel/fs: Allocate dummy destination
register when need_dest is false".
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick [Wed, 22 May 2019 19:32:03 +0000 (12:32 -0700)]
intel/fs: Allow cmod propagation across reads and writes of different flags
This also helps a later patch (intel/fs: Improve discard_if code
generation) on about 200 shaders.
v2: Document that other instruction sequences are also valid in
subtract_merge_with_compare_intervening_mismatch_flag_write. Suggested
by Caio.
All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs:
17224438 ->
17224434 (<.01%)
instructions in affected programs: 296 -> 292 (-1.35%)
helped: 4
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.99% max: 1.92% x̄: 1.43% x̃: 1.40%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -2.04% -0.81%
Instructions are helped.
total cycles in shared programs:
361468455 ->
361468458 (<.01%)
cycles in affected programs: 2862 -> 2865 (0.10%)
helped: 2
HURT: 2
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.24% max: 0.39% x̄: 0.31% x̃: 0.31%
HURT stats (abs) min: 3 max: 4 x̄: 3.50 x̃: 3
HURT stats (rel) min: 0.32% max: 0.70% x̄: 0.51% x̃: 0.51%
95% mean confidence interval for cycles value: -4.34 5.84
95% mean confidence interval for cycles %-change: -0.70% 0.90%
Inconclusive result (value mean confidence interval includes 0).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick [Wed, 22 May 2019 17:18:06 +0000 (10:18 -0700)]
intel/fs: Fix flag_subreg handling in cmod propagation
There were two errors. First, the pass could propagate conditional
modifiers from an instruction that writes on flag register to an
instruction that writes a different flag register. For example,
cmp.nz.f0.0(16) null:F, vgrf6:F, vgrf5:F
cmp.nz.f0.1(16) null:F, vgrf6:F, vgrf5:F
could be come
cmp.nz.f0.0(16) null:F, vgrf6:F, vgrf5:F
Second, if an instruction writes f0.1 has it's condition propagated, the
modified instruction will incorrectly write flag f0.0. For example,
linterp(16) vgrf6:F, g2:F, attr0:F
cmp.z.f0.1(16) null:F, vgrf6:F, vgrf5:F
(-f0.1) discard_jump(16) (null):UD
could become
linterp.z.f0.0(16) vgrf6:F, g2:F, attr0:F
(-f0.1) discard_jump(16) (null):UD
None of these cases will occur currently. The only time we use f0.1 is
for generating discard intrinsics. In all those cases, we generate a
squence like:
cmp.nz.f0.0(16) vgrf7:F, vgrf6:F, vgrf5:F
(+f0.1) cmp.z(16) null:D, vgrf7:D, 0d
(-f0.1) discard_jump(16) (null):UD
Due to the mixed types and incompatible conditions, this sequence would
never see any cmod propagation. The next patch will change this.
No shader-db changes on any Intel platform.
v2: Fix typo in comment in test case subtract_delete_compare_other_flag.
Noticed by Caio.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick [Wed, 22 May 2019 18:06:19 +0000 (11:06 -0700)]
intel/fs: Add missing tests for cmod_propagate_not
Tests like this should have been added in
4467040cb65 ("i965/fs:
Propagate conditional modifiers from not instructions").
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Wed, 5 Jun 2019 06:19:22 +0000 (23:19 -0700)]
i965: Allow signed/unsigned integer conversions in miptree up/download
BLORP now handles this so there's no reason to fall back.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Wed, 5 Jun 2019 06:18:45 +0000 (23:18 -0700)]
intel/blorp: Handle SINT/UINT clamping on blits.
This patch makes blorp_blit handle SINT<->UINT blit value clamping.
After reading the source's integer data (which is expanded to 32-bit),
we either IMAX with 0 (for SINT -> UINT, to clamp negative numbers) or
UMIN with (1 << 31) - 1 (for UINT -> SINT, to clamp positive numbers
outside of the representable range).
Such blits are not allowed by the OpenGL or Vulkan APIs directly:
The Vulkan 1.1 spec for vkCmdBlitImage says:
"Integer formats can only be converted to other integer formats with
the same signedness."
The GL 4.5 spec for glBlitFramebuffer says:
"An INVALID_OPERATION error is generated if format conversions are
not supported, which occurs under any of the following conditions:
[...]
* The read buffer contains unsigned integer values and any draw
buffer does not contain unsigned integer values.
* The read buffer contains signed integer values and any draw buffer
does not contain signed integer values."
However, they are useful for other operations, such as texture upload
and download, which typically are implemented via blorp_blit(). i965
has code to fall back in this case (which the next commit will delete),
and Gallium expects blit() to handle this case for texture upload.
Fixes the following tests on iris:
- GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels
- GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo
- GTF-GL46.gtf32.GL3Tests.packed_pixels.packed_pixels_pixelstore
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Caio Marcelo de Oliveira Filho [Thu, 30 May 2019 23:55:19 +0000 (16:55 -0700)]
anv/pipeline: Move lowering of nir_var_mem_global later
This let deref optimizations apply to globals before lowering them.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Fri, 17 May 2019 05:41:13 +0000 (22:41 -0700)]
st/nir: Don't use GLSL IR's MOD_TO_FLOOR lowering when using NIR.
Both GLSL IR and NIR perform the same mod -> floor lowering for 32-bit
types. But nir_lower_double_ops is slightly more defensive against
lowered drcp precision loss, and handles mod(x, x) = 0 directly. This
works well...assuming nir_lower_double_ops actually gets an fmod op to
lower in the first place.
The previous patches enabled NIR-based lowering for the remaining
drivers, so we can stop using the GLSL IR lowering when using NIR.
Fixes KHR-GL45.gpu_shader_fp64.builtin.mod_dvec[234] on iris.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Kenneth Graunke [Mon, 3 Jun 2019 20:40:05 +0000 (13:40 -0700)]
radeonsi: Enable NIR's lower_fmod option.
Currently, st/mesa is always calling the GLSL IR lower_instructions()
pass with MOD_TO_FLOOR set, so mod operations will be lowered before
ever reaching NIR. This enables the same lowering at the NIR level,
which will let me shut off the GLSL IR path for NIR-based drivers.
The AMD NIR backend also has code to handle fmod, so we could
potentially skip this and still be fine. I don't have an opinion
on that.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Kenneth Graunke [Tue, 4 Jun 2019 06:06:49 +0000 (23:06 -0700)]
vc4: Enable NIR's lower_fmod option.
Currently, st/mesa is always calling the GLSL IR lower_instructions()
pass with MOD_TO_FLOOR set, so mod operations will be lowered before
ever reaching NIR. This enables the same lowering at the NIR level,
which will let me shut off the GLSL IR path for NIR-based drivers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Mon, 3 Jun 2019 20:36:56 +0000 (13:36 -0700)]
v3d: Enable NIR's lower_fmod option.
Currently, st/mesa is always calling the GLSL IR lower_instructions()
pass with MOD_TO_FLOOR set, so mod operations will be lowered before
ever reaching NIR. This enables the same lowering at the NIR level,
which will let me shut off the GLSL IR path for NIR-based drivers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Mon, 3 Jun 2019 20:18:55 +0000 (13:18 -0700)]
nir: Combine lower_fmod16/32 back into a single lower_fmod.
We originally had a single lower_fmod option. In commit
2ab2d2e5, Sam
split 32 and 64-bit lowering into separate flags, with the rationale
that some drivers might want different options there. This left 16-bit
unhandled, so Iago added a lower_fmod16 option in commit
ca31df6f.
Now that lower_fmod64 is gone (in favor of nir_lower_doubles and
nir_lower_dmod), we re-combine lower_fmod16 and lower_fmod32 into a
single lower_fmod flag again. I'm not aware of any hardware which
need lowering for one bitsize and not the other.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Kenneth Graunke [Mon, 3 Jun 2019 20:15:49 +0000 (13:15 -0700)]
nir: Drop lower_fmod64 option.
nir_lower_doubles offers a wide variety of fp64 lowering, including
lowering fmod@64. The version there also better handles imprecisions
due to lowered frcp@64. Let's consolidate on one version.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Kenneth Graunke [Mon, 3 Jun 2019 18:54:21 +0000 (11:54 -0700)]
panfrost: Switch to nir_lower_doubles instead of lower_fmod64.
I don't think panfrost actually does doubles yet, but it at least
claims to support PIPE_CAP_DOUBLES, so at least pretend to switch
to the new lowering.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Kenneth Graunke [Mon, 3 Jun 2019 18:43:38 +0000 (11:43 -0700)]
nouveau: Use nir_lower_doubles instead of lower_fmod64 on nvc0.
We currently have two duplicate mechanisms for lowering fmod@64.
One is a nir_opt_algebraic rule keyed off of options->lower_fmod64,
and the other is nir_lower_doubles, which offers a full gamut of
fp64 lowering. The latter works slightly better in some corner cases,
so I'm trying to eliminate lower_fmod64 and drop the redundancy.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Kenneth Graunke [Mon, 3 Jun 2019 18:41:37 +0000 (11:41 -0700)]
gallium: Drop lower_fmod64 from drivers that don't support doubles.
Neither freedreno nor nv50 expose PIPE_CAP_DOUBLES, so there's no
fmod64 to be lowered.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Dylan Baker [Wed, 5 Jun 2019 23:42:36 +0000 (16:42 -0700)]
docs: update calendar, and news item and link release notes for 19.0.6
Dylan Baker [Wed, 5 Jun 2019 23:37:20 +0000 (16:37 -0700)]
docs: Add SHA256 sums for 19.0.6
Dylan Baker [Wed, 5 Jun 2019 23:32:35 +0000 (16:32 -0700)]
docs: Add relnotes for 19.0.6
Erik Faye-Lund [Mon, 3 Jun 2019 15:53:13 +0000 (17:53 +0200)]
docs: add day of month to all news-entries
This makes it easier to batch-convert them to other structured
markup-formats.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 4 Jun 2019 14:15:45 +0000 (16:15 +0200)]
docs: add MD5 checksums for 9.2.2 files
These checksums were obtained by downloading the releases from
ftp://ftp.freedesktop.org/pub/mesa/older-versions/9.x/9.2.2/ and
running md5sum on them.
Hopefully the server wasn't compromised since release.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 4 Jun 2019 14:08:27 +0000 (16:08 +0200)]
docs: use pre-block for showing commit-note
Having a single-item list for this seems odd. Let's just use a pre-block
in stead.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 4 Jun 2019 13:42:46 +0000 (15:42 +0200)]
docs: switch to definition list and code-tags
A definition list is a better semantic match for what this list is
supposed to convey, so let's use that instead. And while we're at it,
let's add some code-tags around filenames, as they stand a bit more out
that way.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 4 Jun 2019 11:03:57 +0000 (13:03 +0200)]
docs: combine headings
This is more in line with how we mark-up other definition lists, and
avoids portability issues with other markup-formats.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 4 Jun 2019 11:11:59 +0000 (13:11 +0200)]
docs: more code-tags in llvmpipe.html
This makes the article a bit easier to read.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 4 Jun 2019 10:50:23 +0000 (12:50 +0200)]
docs: use more code-tags in envvars.html
This wraps code, identifiers, values and paths in code-tags, which makes
them appear in a monospace-font for readability.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 4 Jun 2019 10:19:51 +0000 (12:19 +0200)]
docs: use code-tags for envvars and options
This makes it a bit easier to tell what's what.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 4 Jun 2019 09:26:40 +0000 (11:26 +0200)]
docs: use dl instead of ul
A HTML definition-list is more semantically strong than just some
unordered list, and renders a bit cleaner by default. So let's use that
instead.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 4 Jun 2019 10:26:34 +0000 (12:26 +0200)]
docs: remove pointlessly repeated list
The examples listed above are exactly the same ones are we're about to
list, so let's just keep the list that defines what they do.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Wed, 8 May 2019 09:29:31 +0000 (11:29 +0200)]
docs: remove stray whitespace
There's some stray whitespace in these files that doesn't do anything
useful. Let's get rid of if.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 4 Jun 2019 07:38:25 +0000 (09:38 +0200)]
docs: use proper links instead of code-tags
These links are a bit odd in that the URLs are simply placed in
code-tags. This makes them harder to work with. Let's use proper
links instead.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Mon, 3 Jun 2019 17:23:07 +0000 (19:23 +0200)]
docs: update doxygen-links
One of these URLs are dead these days, and the other one forwards to the
current one, doxygen.nl. Let's get these links up to date.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Mon, 3 Jun 2019 16:54:05 +0000 (18:54 +0200)]
docs: remove some noisy spacing in pre-blocks
These newlines caused the blocks to have trailing newlines in them,
which renders a bit noisily.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Mon, 3 Jun 2019 16:50:41 +0000 (18:50 +0200)]
docs: improve quoting slightly
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Mon, 3 Jun 2019 16:30:23 +0000 (18:30 +0200)]
docs: do not use br-tag for non-significant breaks
According to the W3C, we shouldn't use the br-tag unless the line-break
is part of the content:
https://www.w3.org/TR/2011/WD-html5-author-
20110809/the-br-element.html
All of these instances are for non-content usage, and is as such technically
out-of-spec. So let's either remove them, or split paragraphs, based on
how related the content are.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Mon, 3 Jun 2019 16:26:38 +0000 (18:26 +0200)]
docs: remove pointless line-break
Line-breaks at the end of a paragraph doesn't do anything useful,
so let's just get rid of it.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Mon, 3 Jun 2019 16:24:26 +0000 (18:24 +0200)]
docs: remove pointless trailing hard-breaks
Line-break at the end of an article is quite pointless, and doesn't do
much to increase the readability. Let's get rid of them.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 28 May 2019 12:07:48 +0000 (14:07 +0200)]
docs: rewrite paragraph to be free-form
These half-way structured sections are needlessly problematic to
translate cleanly to other markup-languages, so let's just make this
into a free-form paragraph instead.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 28 May 2019 12:03:08 +0000 (14:03 +0200)]
docs: use h4 instead of free-standing paragraphs and br-tags
This makes this document a bit more structured, which is generally
considered a good thing for HTML. It will also translate a bit better
into other markup-formats.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 28 May 2019 11:57:42 +0000 (13:57 +0200)]
docs: slightly reword paragraph and tweak markup
This makes this paragraph a bit easier to digest.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 28 May 2019 11:53:03 +0000 (13:53 +0200)]
docs: remove stray space in code-block
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 28 May 2019 11:48:15 +0000 (13:48 +0200)]
docs: remove some pointless spacing
The different headers and header-sizes already convey the hierarchical
structure of this document, the unusual spacing arguably just looks a
bit inconsistent with the rest of the site. Let's remove it; it looks
fine without it, and will translate better to other markup languages.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 28 May 2019 11:14:03 +0000 (13:14 +0200)]
docs: add more more code-tags
It's easier to read function-names, file-names and other
"machine"-related strings if they are formatted in a monospace font. So
let's mark these up with code-tags.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 28 May 2019 11:34:34 +0000 (13:34 +0200)]
docs: use code instead of tt-tag
The tt-tag has been removed from HTML5, so let's normalize this to
code-tags intead. This just makes things a bit more consistent, as we've
mixed these left and right so far anyway.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 28 May 2019 11:20:29 +0000 (13:20 +0200)]
docs: use paragraph instead of double newlines
This is a bit more semantically clean in HTML, and makes us keep
content and presentation a bit more separated.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Mon, 3 Jun 2019 17:07:50 +0000 (19:07 +0200)]
docs: use verbatim .plan quote
This quote is now verbatim, as archived here:
https://github.com/ESWAT/john-carmack-plan-archive/blob/master/by_year/johnc_plan_1999.txt
This makes it look a bit more consistent with the following news-entry,
and makes things IMO a bit more clear.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Alyssa Rosenzweig [Wed, 5 Jun 2019 18:51:16 +0000 (18:51 +0000)]
panfrost/midgard: Verify SSA claims when pipelining
The pipeline register creation algorithm is only valid for SSA indices;
NIR registers and such cannot be pipelined without more complex
analysis. However, there are the ocassional class of "liars" -- indices
that claim to be SSA but are not. This occurs in the blend shader
prologue, for example. Detect this and just bail quietly for now.
Eventually we need to rewrite the blend shader prologue to occur in NIR
anyway (which would mitigate the issue), but that's more involved and
depends on a better understanding of pixel formats in blend shaders (for
non-RGBA8888/UNORM cases).
Fixes some blend shader regressions.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 5 Jun 2019 18:26:29 +0000 (18:26 +0000)]
panfrost/midgard: Don't assign var locations ourselves
This piece of code was cargo-culted from the ir3 standalone compiler and
made sense when we were a standalone compiler ourselves. Unfortunately,
for the online compiler, mesa/st *already handles this for us* and if we
duplicate it here, we're duplicating it *incorrectly*. So just delete
these lines and fix a heck of a lot of tests.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tomeu Vizoso [Tue, 14 May 2019 15:28:17 +0000 (17:28 +0200)]
panfrost: Reload framebuffer contents if there's no clear
If by flush time the client hasn't submitted a clear, add jobs for
reloading the framebuffer contents as the first draw in the frame.
This is required by programs such as Weston which don't do clears and
rely on the previous contents of the framebuffer being there.
Reloading the whole framebuffer on every frame without regards to what
is needed or what is going to be covered is very inefficient, but future
work will introduce support for damage regions and partial updates so we
know what needs to be actually reloaded.
Fixes quite a few tests in dEQP-EGL.functional.buffer_age.*.
[Alyssa: The context is that tilers do an implicit glClear() on every
frame, whether you asked them to or not. If you want a clear, this is
very efficient. But if you don't, you have to explicitly blit the
backbuffer back into tile memory, accomplished by a dummy texturing
draw. This patch generates that draw via u_blitter, although we could do
a bit better ourselves by eliding the vertex job. This fixes "black
rectangles in Weston/sway" as well as "video not displaying when UI
visible in mpv"]
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Thu, 23 May 2019 03:01:32 +0000 (03:01 +0000)]
panfrost: Don't flip scanout
The mesa/st flips the viewport, so we respect that rather than
trying to flip the framebuffer itself and ignoring the viewport and
using a messy heuristic.
However, this brings an underlying disagreement about the interpretation
of winding order to light. The blob uses a different strategy than Mesa
for handling viewport Y flipping, so the meanings of the winding order
bit are flipped for it. To keep things clean on our end, we rename to
explicitly use Gallium (rather than flipped OpenGL) conventions.
Fixes upside-down Xwayland/egl windows.
v2: Adjust lowering configuration to correctly flip gl_PointCoord.y and
gl_FragCoord.y. v1 was R-b'd by Tomeu, but then retracted due to these
regressions which are not fixed.
Suggested-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Sort-of-reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Timur Kristóf [Fri, 31 May 2019 16:43:20 +0000 (18:43 +0200)]
st/nine: Use tgsi_to_nir when preferred IR is NIR.
This patch allows nine to read the preferred IR from pipe caps and use
NIR when that is preferred by the driver, by calling tgsi_to_nir. Also
adds some debug options that allow overriding it.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
Lionel Landwerlin [Wed, 5 Jun 2019 08:20:23 +0000 (11:20 +0300)]
intel/perf: improve dynamic loading config detection
We're currently trying to detect dynamic loading config support by
trying to remove to test config (hard coded in the i915 driver) and
checking we get ENOENT.
This can fail if the test config was updated in Mesa but not yet in
i915.
A better way to do this is to pick an invalid ID and check for ENOENT.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 4 Jun 2019 23:23:17 +0000 (18:23 -0500)]
intel/nir: Take nir_shader*s in brw_nir_link_shaders
Since NIR_PASS no longer swaps out the NIR pointer when NIR_TEST_* is
enabled, we can just take a single pointer and not a pointer to pointer.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 4 Jun 2019 23:19:06 +0000 (18:19 -0500)]
intel/nir: Stop returning the shader from helpers
Now that NIR_TEST_* doesn't swap the shader out from under us, it's
sufficient to just modify the shader rather than having to return in
case we're testing serialization or cloning.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 4 Jun 2019 22:50:22 +0000 (17:50 -0500)]
nir: Don't replace the nir_shader when NIR_TEST_SERIALIZE=1
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108957
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Jason Ekstrand [Tue, 4 Jun 2019 22:48:33 +0000 (17:48 -0500)]
nir: Don't replace the nir_shader when NIR_TEST_CLONE=1
Instead, we add a new helper which stomps one nir_shader and replaces it
with another. The new helper effectively just changes which pointer
gets used for the base nir_shader. It should be 99% as good at testing
cloning but without requiring that everything handle having the shader
swapped out from under it constantly.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108957
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Caio Marcelo de Oliveira Filho [Wed, 5 Jun 2019 05:55:13 +0000 (22:55 -0700)]
iris: Only recompile CS when needed
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Lionel Landwerlin [Wed, 5 Jun 2019 08:49:06 +0000 (11:49 +0300)]
intel/perf: fix EuThreadsCount value in performance equations
EuThreadsCount is supposed to be the number of threads per EU, not the
total number of threads in the whole device.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 1fc7b951278428 ("i965: Add Gen8+ INTEL_performance_query support")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Mark Janes [Wed, 5 Jun 2019 17:49:32 +0000 (10:49 -0700)]
intel/tools: use C99 print conversion specifier for 32 bit builds
Fixes formatting errors for 32 bit compilations, eg:
error: format ‘%lx’ expects argument of type ‘long unsigned int’,
but argument 5 has type ‘uint64_t’ {aka ‘long long unsigned int’}
[-Werror=format=]
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Samuel Pitoiset [Mon, 20 May 2019 08:28:03 +0000 (10:28 +0200)]
radv: use only one descriptor in the fmask expand pass
This removes one useless SMEM load operations which pointed to
the same descriptor anyway.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 20 May 2019 08:28:02 +0000 (10:28 +0200)]
radv: set ACCESS_NON_READABLE on the fmask expand pass output image
The driver will emit GLC=1.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 20 May 2019 08:28:01 +0000 (10:28 +0200)]
radv: remove one useless image type in the fmask expand shader
Both input and output images use the same type.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Kristian H. Kristensen [Tue, 4 Jun 2019 22:15:40 +0000 (15:15 -0700)]
freedreno/ir3: Extend debug helpers to support TCS/TES/GS
Reviewed-by: Rob Clark <robdclark@gmail.com>
Kristian H. Kristensen [Mon, 3 Jun 2019 21:25:39 +0000 (14:25 -0700)]
freedreno/a6xx: Use VALIDREG in next_regid() helper
Reviewed-by: Rob Clark <robdclark@gmail.com>
Kristian H. Kristensen [Tue, 4 Jun 2019 20:38:33 +0000 (13:38 -0700)]
freedreno/a6xx: Remove dead code from a5xx
Reviewed-by: Rob Clark <robdclark@gmail.com>
Kristian H. Kristensen [Mon, 3 Jun 2019 20:58:11 +0000 (13:58 -0700)]
freedreno/ir3: Generalize ir3_shader_disasm()
Use a helper function to get the sysval/attribute/varying/output name
and make the disam debug output independent of shader stage.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Alyssa Rosenzweig [Wed, 5 Jun 2019 14:48:57 +0000 (14:48 +0000)]
panfrost/midgard: Always break up fragment writeout
In a fragment shader, r0 is written out with a special branch sequence.
r0 is not a real register here, but essentially a pipeline register --
as such, it needs to be written out in full and on time, with hanging
dependencies in the bundle. Otherwise, we break up the bundle, which
costs an extra ALU cycle and adds a move.
When the scheduler ran last thing, we could do this analysis within the
scheduler. Now that RA can run after scheduling, that's no longer valid,
so we remove the analysis and always break it up (at a performance
penalty). Future work can add a post-RA/post-schedule pass to merge
writeout blocks if possible. It's a bit of a low-priority next to fixing
conformance regressions, of course.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 5 Jun 2019 14:33:42 +0000 (14:33 +0000)]
panfrost/midgard: Fix cubemap regression
Fixes: 2d9802233 ("panfrost/midgard: Extend RA to non-vec4 sources")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Deepak Rawat [Wed, 5 Jun 2019 17:46:47 +0000 (10:46 -0700)]
winsys/drm: Fix out of scope variable usage
In this particular instance, struct member were used outside of the
block where it was defined. Fix this by moving the definition outside of
block.
Signed-off-by: Deepak Rawat <drawat@vmware.com>
Fixes: 569f83898768 ("winsys/svga: Add support for new surface ioctl, multisample pattern")
Reviewed-by: Brian Paul <brianp@vmware.com>
Alyssa Rosenzweig [Wed, 5 Jun 2019 15:12:58 +0000 (15:12 +0000)]
panfrost/midgard: Lower integer division
We use the shared nir_lower_idiv pass to lower integer division, fixing
144 dEQP tests. This pass was not applied in the past due to breakage
from iabs fixed earlier in the series.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-By: Ryan Houdek <Sonicadvance1@gmail.com>
Alyssa Rosenzweig [Wed, 5 Jun 2019 15:24:51 +0000 (15:24 +0000)]
panfrost/midgard: Fix 1-arg ALU memory corruption
Certain ops that only take one argument have an imaginary "zero"
constant for their second argument. For instance, conversions:
i2f [dest], [source], #0
Memory corruption meant that #0 was instead random noise. For some ops,
that doesn't matter (manifested as abnormally large code size and poor
scheduling due to extra constants in random places). But for others,
where a 1-op is emulated by a 2-op with an implicit 0 second argument,
that broke things.
Fixes iabs (emulated by iabsdiff).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-By: Ryan Houdek <Sonicadvance1@gmail.com>
Alyssa Rosenzweig [Wed, 5 Jun 2019 15:18:35 +0000 (15:18 +0000)]
panfrost/midgard: Add a bunch of new ALU ops
These ops are used to accelerate various functions exposed in OpenCL.
This commit only includes the routine additions to the table. They are
not wired through the compiler; rather, they are just here to keep a
reference for the disassembler.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-By: Ryan Houdek <Sonicadvance1@gmail.com>
Emil Velikov [Thu, 16 May 2019 17:01:40 +0000 (18:01 +0100)]
egl: add EGL_platform_device support
This new 'platform' is added by default with no guards.
It is effectively a copy of the surfaceless one, with updated function
names and brand new probe function.
Due to the reuse, some of the ifdef HAVE_SURFACELESS_PLATFORM guards
have been dropped.
A worthy mention are the changes in _egFindDisplay, since the original
and dup'd fd are required, we make use of the plat_opt argument.
Note that no hacks for eglGetDisplay are added - the API works only with
the eglGetPlatformDisplay* API.
v2:
- s/_eglCompareDeviceDisplay/_eglSameDeviceDisplay/ (Eric)
- let ^^ return bool (Eric)
- fixup meson build, move files() further up (Eric)
- copy from plat. surfaceless w/o the visual cleanups
- close and free when destroying the dpy
- sprinkle a few _eglDeviceSupports
- split fd handling into separate function
- use directly the render node if no FD is given (Mathias)
v3:
- s/dpy/disp/g
- drop swap_buffers* callbacks
- drop loader_set_logger()
- drop local define
- re-introduce _eglGetDRMDeviceRenderNode()
- EGL_WARN on ForceSoftware with HW device - continue using the HW device
- bail out for "EGL_MESA_device_software" until it's fixed
- wire-up the Android build
v4:
- use new style _eglFindDisplay()
- split hw vs sw code paths
- don't close the internal fd (already handled in FiniDisplay())
- make swrast work (bit hacky bit will do for now)
- Android for real, drop autotools
- Correct HW + LIBGL_ALWAYS_SOFTWARE check
- use the dri2_create_drawable() helper
v5:
- enhance comment around fd checks (Mathias)
- rebase for dri2_init_surface() changes
Cc: Mathias Fröhlich <Mathias.Froehlich@gmx.net>
Acked-by: Marek Olšák <marek.olsak@amd.com> (v4)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Thu, 16 May 2019 17:01:39 +0000 (18:01 +0100)]
egl: keep the software device at the end of the list
By default, the user is likely to pick the first device so it should
not be the least performant (aka software) one.
v2: Drop odd comment (Marek)
Suggested-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Thu, 16 May 2019 17:01:38 +0000 (18:01 +0100)]
egl/dri: flesh out and use dri2_create_drawable()
Wrap the loader->createNewDrawable() dance into a helper and use it
throughout the codebase.
This addresses a cases like surfaceless (SL) on swrast (SL on kms_swrast
is fine) where we'd attempt using the wrong driver and crash out.
v2: fixup quirky GBM (Mathias)
v3: fixup GBM for real (Marek)
Cc: mesa-stable@lists.freedesktop.org
Cc: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com> (v2)
Signed-off-by: Marek Olšák <marek.olsak@amd.com> (v2)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Thu, 16 May 2019 17:01:37 +0000 (18:01 +0100)]
egl: fold X11 attrib handling like other platforms
Since we no longer need special handling for X11, refactor the code to
follow the style used by all other platforms.
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Adam Jackson [Thu, 16 May 2019 17:01:36 +0000 (18:01 +0100)]
egl: remove Options::Platform handling
The full set of attributes is already handled with previous patches.
Thus all this is not dead code.
v2 (Emil) - split from a larger patch.
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Adam Jackson [Thu, 16 May 2019 17:01:35 +0000 (18:01 +0100)]
egl/x11: pick the user requested screen
At the moment the user will pass the screen number via attribs, yet we
would throw that away. Reason being that the int *screen passed to
xcb_connect() is output only.
v2 (Emil):
- split from a larger patch
- use xcb_connect() returned screen, as fallback
- use helper function only as needed
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Adam Jackson [Thu, 16 May 2019 17:01:34 +0000 (18:01 +0100)]
egl: handle the full attrib list in display::options
Earlier spec is vague, although EGL 1.5 makes it clear:
Multiple calls made to eglGetPlatformDisplay with the same
parameters will return the same EGLDisplay handle.
With this commit we store and compare the full attrib list.
v2 (Emil):
- Split into separate patches
- Use EGLBoolean over int masked as such
- Don't return free'd pointed on calloc failure
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Thu, 16 May 2019 17:01:33 +0000 (18:01 +0100)]
egl: flesh out a _eglNumAttribs() helper
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Krzysztof Raszkowski [Fri, 31 May 2019 11:33:32 +0000 (13:33 +0200)]
swr: fix support for GL_ARB_copy_image extension
This commit fix support and adjusts the capabilities
returned by the SWR driver and the documentation
to correctly report the GL_ARB_copy_image extension.
Reviewed-by: Alok Hota <alok.hota@intel.com>
Guido Günther [Mon, 3 Jun 2019 09:12:02 +0000 (11:12 +0200)]
etnaviv: etnaviv_bo_cache_test: Use /dev/dri/renderD128 by default
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Mon, 3 Jun 2019 09:12:02 +0000 (11:12 +0200)]
build: Build etnaviv drm tests
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Mon, 3 Jun 2019 09:12:02 +0000 (11:12 +0200)]
etnaviv: drm tests: Use mesa header locations
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Mon, 3 Jun 2019 09:12:01 +0000 (11:12 +0200)]
etnaviv: Add libdrm tests as of
922d92994267743266024ecceb734ce0ebbca808
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Fri, 31 May 2019 12:35:08 +0000 (14:35 +0200)]
build: Build etnaviv drm
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Fri, 31 May 2019 12:35:08 +0000 (14:35 +0200)]
etnaviv: gallium: Use internal etnaviv_drmif.h
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Fri, 31 May 2019 12:35:08 +0000 (14:35 +0200)]
etnaviv: drm: s/bo_del/_etna_bo_del/
This avoids a conflict with freedreno's bo_del().
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Fri, 31 May 2019 12:35:08 +0000 (14:35 +0200)]
etnaviv: drm: s/table_lock/etna_table_lock/
This avoids a conflict with freedreno's table_lock
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Fri, 31 May 2019 12:35:08 +0000 (14:35 +0200)]
etnaviv: drm: Move uapi header
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Fri, 31 May 2019 12:35:08 +0000 (14:35 +0200)]
etnaviv: drm: Drop excessive debugging in perfmon
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Fri, 31 May 2019 12:35:08 +0000 (14:35 +0200)]
entaviv: drm: Don't use drmMsg()
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Fri, 31 May 2019 12:35:08 +0000 (14:35 +0200)]
etnaviv: drm: Use _mesa_hash_table instead of drmHash
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Fri, 31 May 2019 12:35:06 +0000 (14:35 +0200)]
etnaviv: drm: Use mesa's ARRAY_SIZE
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Fri, 31 May 2019 12:35:06 +0000 (14:35 +0200)]
etnaviv: drm: Use mesa's os_m{un,}map
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Fri, 31 May 2019 12:35:06 +0000 (14:35 +0200)]
etnaviv: drm: Use mesa's atomic definitions
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Fri, 31 May 2019 12:35:06 +0000 (14:35 +0200)]
etnaviv: drm: Drop drm_{public,private}
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Guido Günther [Fri, 31 May 2019 12:35:06 +0000 (14:35 +0200)]
etnaviv: drm: Drop inexistent headers
Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>