Elie Tournier [Thu, 23 May 2019 16:16:18 +0000 (17:16 +0100)]
nir/algebraic: i2f(f2i()) -> trunc()
total instructions in shared programs:
12840968 ->
12840784 (<.01%)
instructions in affected programs: 17886 -> 17702 (-1.03%)
helped: 77
HURT: 0
total cycles in shared programs:
302508917 ->
302505592 (<.01%)
cycles in affected programs: 249964 -> 246639 (-1.33%)
helped: 70
HURT: 7
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/948>
Eric Anholt [Mon, 6 Jan 2020 21:09:25 +0000 (13:09 -0800)]
i965: Reuse the new core glsl_count_dword_slots().
The only difference I could see was treating interfaces like structs.
Maintain that case.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
Eric Anholt [Mon, 6 Jan 2020 20:00:57 +0000 (12:00 -0800)]
mesa/st: Move the dword slot counting function to glsl_types as well.
To implement NIR-to-TGSI, we need to be able to get the size of the
uniform variable for the TGSI declaration, not just the
.driver_location. With its location in mesa/st, drivers couldn't link
to it from nir-to-tgsi.
This feels like a common enough function to want, so let's share it in
the core compiler.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
Eric Anholt [Mon, 6 Jan 2020 21:19:30 +0000 (13:19 -0800)]
mesa/prog: Reuse count_vec4_slots() from ir_to_mesa.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
Eric Anholt [Mon, 6 Jan 2020 19:37:38 +0000 (11:37 -0800)]
mesa/st: Move the vec4 type size function into core GLSL types.
The only bit that gallium varied on was handling of bindless. We can
retain previous behavior for count_attribute_slots() by passing in
"true" (though I suspect this is just giving a silly answer to a silly
question), and delete our recursive function from mesa/st.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
Eric Anholt [Mon, 6 Jan 2020 19:47:03 +0000 (11:47 -0800)]
mesa/st: Deduplicate the NIR uniform lowering code.
Just a little refactor as I go looking at the type size functions.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3297>
Marek Olšák [Sat, 11 Jan 2020 02:19:46 +0000 (21:19 -0500)]
radeonsi: move PS LLVM code into si_shader_llvm_ps.c
This is an attempt to clean up si_shader.c.
v2: don't move code that is not specific to LLVM
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1)
Marek Olšák [Sat, 11 Jan 2020 01:25:28 +0000 (20:25 -0500)]
radeonsi: remove always constant ballot_mask_bits from si_llvm_context_init
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sat, 11 Jan 2020 01:22:47 +0000 (20:22 -0500)]
radeonsi: fold si_create_function into si_llvm_create_func
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Sat, 11 Jan 2020 01:13:54 +0000 (20:13 -0500)]
radeonsi: rename si_shader_create -> si_create_shader_variant for clarity
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 10 Jan 2020 23:26:01 +0000 (18:26 -0500)]
radeonsi: rename si_compile_tgsi_main -> si_build_main_function
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 10 Jan 2020 23:22:27 +0000 (18:22 -0500)]
radeonsi: clean up si_shader_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 10 Jan 2020 23:09:04 +0000 (18:09 -0500)]
radeonsi: merge si_tessctrl_info into si_shader_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 10 Jan 2020 23:06:56 +0000 (18:06 -0500)]
radeonsi: fork tgsi_shader_info and tgsi_tessctrl_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 10 Jan 2020 22:04:33 +0000 (17:04 -0500)]
radeonsi: rename si_shader_info -> si_shader_binary_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 10 Jan 2020 21:57:17 +0000 (16:57 -0500)]
radeonsi: remove TGSI from comments
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 10 Jan 2020 21:50:41 +0000 (16:50 -0500)]
radeonsi: rename DBG_NO_TGSI -> DBG_NO_NIR
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 10 Jan 2020 21:47:04 +0000 (16:47 -0500)]
radeonsi: don't adjust depth and stencil PS output locations
this was for compatibility with TGSI
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Caio Marcelo de Oliveira Filho [Sat, 11 Jan 2020 07:52:30 +0000 (23:52 -0800)]
nir: Add missing nir_var_mem_global to various passes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3322>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3322>
Caio Marcelo de Oliveira Filho [Wed, 8 Jan 2020 21:30:43 +0000 (13:30 -0800)]
spirv: Handle PhysicalStorageBuffer in memory barriers
PhysicalStorageBuffer is lowered to nir_var_mem_global, and
SPIR-V 1.5rev1 in section "3.25. Memory Semantics <id>" says
UniformMemory
Apply the memory-ordering constraints to StorageBuffer,
PhysicalStorageBuffer, or Uniform Storage Class memory.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3322>
Caio Marcelo de Oliveira Filho [Wed, 8 Jan 2020 21:25:59 +0000 (13:25 -0800)]
spirv: Drop EXT for PhysicalStorageBuffer symbols
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3322>
Timur Kristóf [Tue, 19 Nov 2019 12:29:54 +0000 (13:29 +0100)]
aco: Flip s_cbranch / s_cselect to optimize out an s_not if possible.
When possible, get rid of an s_not when all it does is invert the SCC,
and its successor s_cbranch / s_cselect can be inverted instead.
Also modify some parts of instruction_selection to take advantage of
this feature.
Example:
s2: %3900, s1: %3899:scc = s_andn2_b64 %0:exec, %406
s2: %3902 = s_cselect_b64 -1, 0, %3900:scc
s2: %407, s1: %3903:scc = s_not_b64 %3902
s2: %3906, s1: %3905:scc = s_and_b64 %407, %0:exec
p_cbranch_z %3905:scc
Can now be optimized to:
s2: %3900, s1: %3899:scc = s_andn2_b64 %0:exec, %406
p_cbranch_nz %3900:scc
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Timur Kristóf [Fri, 3 Jan 2020 09:30:04 +0000 (10:30 +0100)]
aco: Optimize out s_and with exec, when used on uniform bitwise values.
Previously all booleans needed an s_and with exec when they were turned
into a scalar condition. However, this is not needed for uniform booleans.
v2 by Daniel Schürmann:
- Make the code more readable
v3 by Timur Kristóf:
- Fix regressions, make it work in wave32 mode
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Timur Kristóf [Tue, 19 Nov 2019 12:38:34 +0000 (13:38 +0100)]
aco: Don't skip combine_instruction when definitions[1] is used.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Timur Kristóf [Mon, 18 Nov 2019 15:11:19 +0000 (16:11 +0100)]
aco: Allow optimizing vote_all and nir_op_iand.
By adding an extra instruction, we can replace the operands of
the s_cselect_b64, which allows it to get picked up by the
optimizer when it looks for uniform booleans.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Timur Kristóf [Wed, 13 Nov 2019 10:14:51 +0000 (11:14 +0100)]
aco: Implement 64-bit constant propagation.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Alyssa Rosenzweig [Fri, 10 Jan 2020 18:12:35 +0000 (13:12 -0500)]
panfrost: Fix linear depth textures
As pointed out by Boris, what we were calling PAN_LINEAR depth textures
was in fact u-interleaved tiled (!), but we never noticed since we
flipped the flag used for sampling, leading to all sorts of fun bugs
when attempting to directly acess depth textures from the CPU. Which
begs the question -- if what we called LINEAR was tiled, how do we
actually render linear depth textures? It turns out the flags for AFBC
form a mali_block_format 2-bit code just like their render-target
counterparts, so we can render to any of the above.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Boris Brezillon <boris.brezillon@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3393>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3393>
Jason Ekstrand [Fri, 10 Jan 2020 19:16:25 +0000 (13:16 -0600)]
vulkan/wsi: Add a driconf option to force WSI to advertise BGRA8_UNORM first
The Aztec Ruins benchmark just grabs the first format in the list and
SRGB causes it to render washed out. With this workaround, it renders
the same as OpenGL.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3350>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3350>
Caio Marcelo de Oliveira Filho [Mon, 13 Jan 2020 23:48:12 +0000 (15:48 -0800)]
intel/fs: Only use SLM fence in compute shaders
Fixes: b390ff35170 ("intel/fs: Add support for SLM fence in Gen11")
Fixes: e142061399c ("intel/fs: Implement scoped_memory_barrier")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Marek Olšák [Tue, 14 Jan 2020 02:10:09 +0000 (21:10 -0500)]
radeonsi: actually enable VBOs in user SGPRs
Fixes: 363b4027fcb - radeonsi: put up to 5 VBO descriptors into user SGPRs
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 14 Jan 2020 02:09:35 +0000 (21:09 -0500)]
radeonsi: fix assertion and other failures in si_emit_graphics_shader_pointers
The assertion was failing.
Fixes: 363b4027fcb - radeonsi: put up to 5 VBO descriptors into user SGPRs
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Rhys Perry [Tue, 10 Dec 2019 16:20:56 +0000 (16:20 +0000)]
nir/algebraic: a & ~(a >> 31) -> imax(a, 0)
Found in some Doom shaders
Totals from affected shaders:
SGPRS: 30056 -> 30064 (0.03 %)
VGPRS: 28024 -> 28024 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size:
4278648 ->
4270852 (-0.18 %) bytes
Max Waves: 1476 -> 1476 (0.00 %)
Instructions: 835287 -> 833338 (-0.23 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3089>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3089>
Marco Felsch [Thu, 5 Dec 2019 16:04:11 +0000 (17:04 +0100)]
etnaviv: Fix assert when try to accumulate an invalid fd
Check if it is a valid fd before merging it to the context's fd.
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3381>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3381>
Afonso Bordado [Tue, 31 Dec 2019 15:08:49 +0000 (15:08 +0000)]
pan/midgard: Fix midgard_compile.h includes
We now use enum mali_format which is defined in panfrost-job.h
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3243>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3243>
Lionel Landwerlin [Tue, 14 Jan 2020 14:10:21 +0000 (16:10 +0200)]
anv: only use VkSamplerCreateInfo::compareOp if enabled
The spec says nothing about the validity of the compareOp field when
compareEnable is false.
v2: use vulkan enum to pick default value (Caio)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2350
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3387>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3387>
Rhys Perry [Mon, 14 Oct 2019 16:15:04 +0000 (17:15 +0100)]
nir/sink,nir/move: move/sink nir_op_mov
Can uncover opportunities to move other instructions. This can increase
register usage, but that doesn't seem to actually happen.
This optimizes a pattern of a load_per_vertex_input followed by several
moves and then a store_output in a different block.
v2: add nir_move_copies to make it optional
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Acked-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>
Rhys Perry [Mon, 14 Oct 2019 16:15:37 +0000 (17:15 +0100)]
nir/sink,nir/move: move/sink load_per_vertex_input
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2420>
Tomeu Vizoso [Tue, 17 Dec 2019 10:50:14 +0000 (11:50 +0100)]
gitlab-ci: Consolidate container and build stages for LAVA
Use the normal build job to also prepare the artifacts for LAVA jobs.
For that, the build container needs to also build the test suites,
kernel, ramdisk, etc.
Then the build job will place the just-built Mesa in the ramdisk and the
test job can generate a LAVA job and point to those artifacts.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3295>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3295>
Rhys Perry [Fri, 13 Dec 2019 16:17:21 +0000 (16:17 +0000)]
aco: add integer min/max to can_swap_operands
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Wed, 11 Dec 2019 16:10:26 +0000 (16:10 +0000)]
aco: improve readfirstlane after uniform LDS loads
Totals from affected shaders:
SGPRS: 976 -> 968 (-0.82 %)
VGPRS: 580 -> 584 (0.69 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 106032 -> 103076 (-2.79 %) bytes
Max Waves: 237 -> 237 (0.00 %)
Instructions: 19452 -> 18740 (-3.66 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Mon, 9 Dec 2019 21:20:10 +0000 (21:20 +0000)]
aco: replace extract_vector with copies
Helps a small number of small shaders with situations like this:
a = p_create_vector ...
b = p_extract_vector a, 3
and copy propagation can't be done
Totals from affected shaders:
SGPRS: 14304 -> 14416 (0.78 %)
VGPRS: 8716 -> 6592 (-24.37 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 184664 -> 176888 (-4.21 %) bytes
Max Waves: 6260 -> 6260 (0.00 %)
Instructions: 35561 -> 33617 (-5.47 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Tue, 10 Dec 2019 19:03:42 +0000 (19:03 +0000)]
aco: allow input modifiers on v_cndmask_b32
Totals from affected shaders:
SGPRS: 594099 -> 594019 (-0.01 %)
VGPRS: 441016 -> 441124 (0.02 %)
Spilled SGPRs: 101 -> 101 (0.00 %)
Spilled VGPRs: 18 -> 18 (0.00 %)
Code Size:
30266652 ->
30125256 (-0.47 %) bytes
Max Waves: 67044 -> 67057 (0.02 %)
Instructions:
5753097 ->
5726607 (-0.46 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Wed, 4 Dec 2019 19:28:43 +0000 (19:28 +0000)]
aco: don't move literal to reg when making an instruction VOP3 on GFX10
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 163398 -> 163398 (0.00 %)
VGPRS: 143820 -> 143820 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size:
13065744 ->
13044308 (-0.16 %) bytes
Max Waves: 18921 -> 18921 (0.00 %)
Instructions:
2514644 ->
2509285 (-0.21 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Fri, 22 Nov 2019 20:32:11 +0000 (20:32 +0000)]
aco: add min(-max(), ) and max(-min(), ) optimization
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Fri, 22 Nov 2019 17:50:29 +0000 (17:50 +0000)]
aco: improve clamp optimization
Not sure why it checked the use count, it doesn't apply the constants.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 269409 -> 269745 (0.12 %)
VGPRS: 238120 -> 238132 (0.01 %)
Spilled SGPRs: 305 -> 305 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size:
22908584 ->
22904672 (-0.02 %) bytes
Max Waves: 20217 -> 20217 (0.00 %)
Instructions:
4275312 ->
4263869 (-0.27 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 155409 -> 155233 (-0.11 %)
VGPRS: 153072 -> 153072 (0.00 %)
Spilled SGPRs: 269 -> 269 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size:
14650824 ->
14650396 (-0.00 %) bytes
Max Waves: 9609 -> 9609 (0.00 %)
Instructions:
2762802 ->
2755517 (-0.26 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Fri, 22 Nov 2019 20:58:59 +0000 (20:58 +0000)]
aco: fix clamp optimization
We can't do the optimization if there are neg/abs in-between.
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Fri, 22 Nov 2019 15:18:38 +0000 (15:18 +0000)]
aco: improve creation of v_madmk_f32/v_madak_f32
Using needs_vop3 check was flawed because it would only combine the
literal if the first operand is the literal. If the second or third
operand is the literal, then needs_vop3 will be true and the literal will
not be combined.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 782051 -> 782051 (0.00 %)
VGPRS: 630048 -> 630048 (0.00 %)
Spilled SGPRs: 195 -> 195 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size:
54743740 ->
54585548 (-0.29 %) bytes
Max Waves: 67340 -> 67340 (0.00 %)
Instructions:
10182030 ->
10182030 (0.00 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 701990 -> 699590 (-0.34 %)
VGPRS: 566632 -> 566784 (0.03 %)
Spilled SGPRs: 218 -> 218 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size:
49173564 ->
49007856 (-0.34 %) bytes
Max Waves: 59650 -> 59612 (-0.06 %)
Instructions:
9315135 ->
9293330 (-0.23 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Wed, 20 Nov 2019 16:42:17 +0000 (16:42 +0000)]
aco: take advantage of GFX10's constant bus limit and VOP3 literals
pipeline-db (Navi):
Totals from affected shaders:
SGPRS:
2397159 ->
2392494 (-0.19 %)
VGPRS:
1756036 ->
1753920 (-0.12 %)
Spilled SGPRs: 461 -> 470 (1.95 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size:
110287304 ->
109946304 (-0.31 %) bytes
Max Waves: 318341 -> 318475 (0.04 %)
Instructions:
21019327 ->
20533618 (-2.31 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 0 -> 0 (0.00 %)
VGPRS: 0 -> 0 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 0 -> 0 (0.00 %) bytes
Max Waves: 0 -> 0 (0.00 %)
Instructions: 0 -> 0 (0.00 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Fri, 22 Nov 2019 13:47:54 +0000 (13:47 +0000)]
aco: allow an extra SGPR with multiple uses to be applied to VOP3
This is in a separate patch from the apply_sgprs() rewrite so that the
rewrite can be more easily tested.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 3056 -> 3056 (0.00 %)
VGPRS: 1632 -> 1632 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 156468 -> 156304 (-0.10 %) bytes
Max Waves: 288 -> 288 (0.00 %)
Instructions: 29510 -> 29469 (-0.14 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 2984 -> 2984 (0.00 %)
VGPRS: 1616 -> 1616 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 156132 -> 155968 (-0.11 %) bytes
Max Waves: 289 -> 289 (0.00 %)
Instructions: 29426 -> 29385 (-0.14 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Fri, 22 Nov 2019 13:47:08 +0000 (13:47 +0000)]
aco: allow applying two sgprs to an instruction
We could create VALU instructions which read two sgprs, but only if isel
created an instruction which already read one of them.
This change is in a separate patch from the apply_sgprs() rewrite so that
it can be tested if the rewrite affected anything.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 216 -> 216 (0.00 %)
VGPRS: 64 -> 64 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1756 -> 1708 (-2.73 %) bytes
Max Waves: 120 -> 120 (0.00 %)
Instructions: 312 -> 300 (-3.85 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 216 -> 216 (0.00 %)
VGPRS: 64 -> 64 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1784 -> 1736 (-2.69 %) bytes
Max Waves: 120 -> 120 (0.00 %)
Instructions: 319 -> 307 (-3.76 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Fri, 22 Nov 2019 14:17:27 +0000 (14:17 +0000)]
aco: follow through temporary when merging tests into constant comparisons
This can happen with v_mov_b32(s_mov_b32(literal))
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 632 -> 632 (0.00 %)
VGPRS: 492 -> 492 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 77488 -> 76928 (-0.72 %) bytes
Max Waves: 67 -> 67 (0.00 %)
Instructions: 14426 -> 14332 (-0.65 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 632 -> 632 (0.00 %)
VGPRS: 492 -> 492 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 77512 -> 76952 (-0.72 %) bytes
Max Waves: 67 -> 67 (0.00 %)
Instructions: 14432 -> 14338 (-0.65 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Fri, 22 Nov 2019 14:34:24 +0000 (14:34 +0000)]
aco: be more careful with literals in combine_salu_{n2,lshl_add}
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Fri, 22 Nov 2019 14:50:41 +0000 (14:50 +0000)]
aco: add check_vop3_operands()
This will be useful when taking advantage of GFX10 features.
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Fri, 22 Nov 2019 14:55:25 +0000 (14:55 +0000)]
aco: rewrite apply_sgprs()
This will make it easier to apply two different sgprs (for GFX10) or apply
the same sgpr twice (just remove the break).
No pipeline-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Fri, 22 Nov 2019 13:43:39 +0000 (13:43 +0000)]
aco: rewrite literal combining
Should make taking advantage of GFX10's increased constant bus limit and
VOP3 literals easier.
No pipeline-db changes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Fri, 22 Nov 2019 15:00:04 +0000 (15:00 +0000)]
aco: improve can_use_VOP3()
No pipeline-db changes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Wed, 20 Nov 2019 16:31:43 +0000 (16:31 +0000)]
aco: combine two sgprs into a VALU if they're the same
This was supposed to be done before but it wasn't done correctly and
everywhere.
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 784680 -> 786128 (0.18 %)
VGPRS: 574012 -> 573892 (-0.02 %)
Spilled SGPRs: 461 -> 461 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size:
45477088 ->
45478172 (0.00 %) bytes
Max Waves: 81294 -> 81277 (-0.02 %)
Instructions:
8657970 ->
8622483 (-0.41 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 780664 -> 782072 (0.18 %)
VGPRS: 573880 -> 573760 (-0.02 %)
Spilled SGPRs: 629 -> 629 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size:
45445244 ->
45448340 (0.01 %) bytes
Max Waves: 81178 -> 81161 (-0.02 %)
Instructions:
8649902 ->
8614918 (-0.40 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Wed, 20 Nov 2019 19:09:25 +0000 (19:09 +0000)]
aco: apply literals to split mads
Removing the return is also needed to apply literals to mads (which can be
done on GFX10).
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 368787 -> 367555 (-0.33 %)
VGPRS: 312436 -> 312448 (0.00 %)
Spilled SGPRs: 461 -> 461 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size:
26113388 ->
26098260 (-0.06 %) bytes
Max Waves: 35982 -> 35982 (0.00 %)
Instructions:
5038670 ->
5028941 (-0.19 %)
pipeline-db (Vega):
Totals from affected shaders:
SGPRS: 369843 -> 368659 (-0.32 %)
VGPRS: 317224 -> 317196 (-0.01 %)
Spilled SGPRs: 629 -> 629 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size:
26310540 ->
26295156 (-0.06 %) bytes
Max Waves: 36324 -> 36326 (0.01 %)
Instructions:
5073957 ->
5064164 (-0.19 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Mon, 25 Nov 2019 16:12:44 +0000 (16:12 +0000)]
aco: update IR validator
GFX10 increased the constant bus limit and allowed literals on VOP3
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2883>
Rhys Perry [Tue, 15 Oct 2019 15:46:02 +0000 (16:46 +0100)]
nir/lower_gs_intrinsics: add option for per-stream counts
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2422>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2422>
Rhys Perry [Mon, 14 Oct 2019 16:03:07 +0000 (17:03 +0100)]
nir/divergence: handle load_primitive_id in GS
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2323>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2323>
Erik Faye-Lund [Tue, 24 Sep 2019 13:51:13 +0000 (15:51 +0200)]
mesa/st: use float literals
This removes a warning on MSVC.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Erik Faye-Lund [Mon, 23 Sep 2019 11:36:45 +0000 (13:36 +0200)]
gallium: fix a warning
On some platforms (like Win64), unsigned long is 32-bit, so the first
cast doesn't do anything, and the compiler complains about an implicit
cast to a smaller type. So let's cast to an uintptr_t instead first,
as that's large enough on all platforms.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Erik Faye-Lund [Fri, 20 Sep 2019 14:07:47 +0000 (16:07 +0200)]
st/wgl: eliminate implicit cast warning
I get warnings on MSVC for these implicit casts. Let's use explicit
casts instead.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Erik Faye-Lund [Fri, 20 Sep 2019 14:04:06 +0000 (16:04 +0200)]
util: initialize float-array with float-literals
We currently initialize this float-array with double-literals. Some
compilers generate warnings for this, so let's switch these to
float-literals instead.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Lionel Landwerlin [Mon, 13 Jan 2020 15:50:36 +0000 (17:50 +0200)]
anv: Implement Gen12 workaround for non pipelined state
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>
Lionel Landwerlin [Mon, 13 Jan 2020 15:50:06 +0000 (17:50 +0200)]
iris: Implement Gen12 workaround for non pipelined state
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3365>
Vasily Khoruzhick [Sun, 12 Jan 2020 06:23:36 +0000 (22:23 -0800)]
lima: add new findings to texture descriptor
Lower 8 bits of unknown_1_3 seems to be min_lod,
rest of 4 bits + miplevels are max_lod and min_mipfilter seems to be
lod bias. All are in fixed format with 4 bit integer and 4 bit fraction,
lod_bias also has sign bit.
Blob also seems to do some magic with lod_bias if min filter is nearest --
it adds 0.5 to lod_bias in this case. Same story when all filters are
nearest and mipmapping is enabled, but in this case it subtracts 1/16
from lod_bias.
Fixes 134 dEQP tests in dEQP-GLES2.functional.texture.*
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3359>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3359>
Kenneth Graunke [Tue, 17 Dec 2019 08:51:20 +0000 (00:51 -0800)]
intel: Use similar brand strings to the Windows drivers
This updates our product name strings to match the ones reported
by the Windows driver, which is typically the marketing name.
We retain a platform abbreviation and GT level in parenthesis so that
we're able to distinguish similar parts more easily, helping us better
understand at a glance which GPU a bug reporter has.
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>
Kenneth Graunke [Tue, 17 Dec 2019 10:57:55 +0000 (02:57 -0800)]
iris: Simplify iris_get_renderer_string()
We use gen_get_device_name() instead of PCI ID list munging.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>
Kenneth Graunke [Tue, 17 Dec 2019 09:00:14 +0000 (01:00 -0800)]
i965: Simplify brw_get_renderer_string()
This stops using driGetRendererString() in favor of a simple snprintf().
This should have the same functionality on 64-bit systems, but drops
a "x86/MMX/SSE2" suffix on 32-bit systems. (People shouldn't be using
the GL_RENDERER string to check for CPU features...)
We also use gen_get_device_name() instead of PCI ID list munging.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3371>
Kenneth Graunke [Tue, 14 Jan 2020 01:35:35 +0000 (17:35 -0800)]
Revert "nir: assert that nir_lower_tex runs after lowering derefs"
This reverts commit
4cda61f11e922fb5914ae73d22cc0c495abf0377 for now,
as it appears to break i965 CI (32,000+ failures). Rob and I suspect
we need to do the equivalent of
1c6a2efa06e9bb5914f4557118930fc61065a467
on i965 - we are doing nir_lower_tex and brw_nir_lower_resources in the
wrong order and that's likely triggering this condition. Once we fix
that, we should put this patch back.
Erik Faye-Lund [Thu, 19 Dec 2019 09:17:14 +0000 (10:17 +0100)]
zink: fixup initialization of operand_mask / num_extra_operands
This doesn't change behavior, but makes the code a bit easier to read.
Both values are zero, but I somehow swapped the logical meaning of them
when initializing.
Eric Anholt [Mon, 13 Jan 2020 21:06:01 +0000 (13:06 -0800)]
mesa: Fix detection of invalidating both depth and stencil.
Fixes an extra 1024x1024x4 MSAA Z/S store on WebGL fishtank on cheza.
Reported-by: Dave Airlie <airlied@redhat.com>
Fixes: db2ae5112106 ("mesa: Skip partial InvalidateFramebuffer of packed depth/stencil.")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3370>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3370>
Rob Clark [Mon, 13 Jan 2020 19:36:19 +0000 (11:36 -0800)]
mesa/st: lower samplers before nir_lower_tex
Fixes incorrect lowering of YUV samplers when there are non-yuv
samplers.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3368>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3368>
Rob Clark [Mon, 13 Jan 2020 19:34:53 +0000 (11:34 -0800)]
nir: assert that nir_lower_tex runs after lowering derefs
It isn't going to do the right thing, because texture_index/
sampler_index defaults to zero.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3368>
Gurchetan Singh [Thu, 15 Aug 2019 01:09:28 +0000 (18:09 -0700)]
i965: support EXT_EGL_image_storage
i965 can support this.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Gurchetan Singh [Thu, 7 Nov 2019 02:02:37 +0000 (18:02 -0800)]
i965: refactor intel_image_target_texture_2d
intel_image_target_texture_tex_storage can reuse much of this
code.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Gurchetan Singh [Wed, 21 Aug 2019 22:07:28 +0000 (15:07 -0700)]
i965: track if image is created by a dmabuf
Will be used by EXT_EGL_image_storage later.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Gurchetan Singh [Thu, 7 Nov 2019 01:18:13 +0000 (17:18 -0800)]
dri_util: add driImageFormatToSizedInternalGLFormat function
This is needed to implement the EXT_EGL_image_storage spec:
"If <target> is GL_TEXTURE_2D, then the resultant texture must have a
sized internal format which is colorspace and size compatible with the
dma-buf. If the GL is unable to determine such a format, the error
INVALID_OPERATION is generated."
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Gurchetan Singh [Wed, 14 Aug 2019 22:16:04 +0000 (15:16 -0700)]
glapi / teximage: implement EGLImageTargetTexStorageEXT
Check various parts of the EXT_EGL_image_storage spec, and add a
new vfunc for drivers implementing it.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Gurchetan Singh [Wed, 14 Aug 2019 01:16:41 +0000 (18:16 -0700)]
teximage: split out helper from EGLImageTargetTexture2DOES
The major differences between EXT_EGL_image_storage and
EGLImageTargetTexture2DOES are:
(1) The texture target is made immutable
(2) EXT_EGL_image_storage supports non-2D targets.
We can reuse EGLImageTargetTexture2D and FreeTextureImageBuffer
for (1) pretty easily. For (2), let's just not support the
complicated targets. Let's reuse aspects of the
EGLImageTargetTexture2DOES implementation.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Jason Ekstrand [Mon, 13 Jan 2020 19:49:57 +0000 (13:49 -0600)]
anv: Memset array properties
This is probably better than possibly leaving those bytes uninitialized
even if the app will theoretically not use them.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369>
Jason Ekstrand [Mon, 13 Jan 2020 18:55:41 +0000 (12:55 -0600)]
anv: Don't over-advertise descriptor indexing features
We should only advertise sub-features if we advertise the extension.
Fixes: 6e230d7607f "anv: Implement VK_EXT_descriptor_indexing"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3369>
Jason Ekstrand [Fri, 10 Jan 2020 21:30:02 +0000 (15:30 -0600)]
intel/blorp: Fill out all the dwords of MI_ATOMIC
This makes us valgrind clean again.
Fixes: 9175c7058efb "intel/blorp: Make blorp update the clear color..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3366>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3366>
Tomeu Vizoso [Mon, 13 Jan 2020 10:47:58 +0000 (11:47 +0100)]
gitlab-ci: Upgrade kernel for LAVA jobs to v5.5-rc5
Some fixes got in that should prevent hangs in lima jobs.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3363>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3363>
Daniel Schürmann [Fri, 10 Jan 2020 16:19:40 +0000 (17:19 +0100)]
aco: fix unconditional demote_to_helper
This patch fixes an out-of-bounds access on p_exit_early
and binds the exec register to the correct operand.
Fixes: 2ea9e59e8d976ec77800d2a20645087b96d1e241 ('aco: move s_andn2_b64 instructions out of the p_discard_if')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3347>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3347>
Marek Olšák [Thu, 9 Jan 2020 21:41:13 +0000 (16:41 -0500)]
radeonsi: don't enable VBOs in user SGPRs if compute-based culling can be used
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 7 Jan 2020 23:23:53 +0000 (18:23 -0500)]
radeonsi: put up to 5 VBO descriptors into user SGPRs
gfx6-8: 1 VBO descriptor in user SGPRs
gfx9-10: 5 VBO descriptors in user SGPRs
We no longer pull up to 5 VBO descriptors from GTT when SDMA is disabled.
Totals from affected shaders:
SGPRS:
1110528 ->
1170528 (5.40 %)
VGPRS: 952896 -> 951936 (-0.10 %)
Spilled SGPRs: 83 -> 61 (-26.51 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size:
23766296 ->
22843920 (-3.88 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 179344 -> 179344 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Wed, 8 Jan 2020 20:52:44 +0000 (15:52 -0500)]
ac,radeonsi: increase the maximum number of shader args and return values
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Wed, 8 Jan 2020 00:45:01 +0000 (19:45 -0500)]
radeonsi: simplify si_set_vertex_buffers
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 7 Jan 2020 23:16:59 +0000 (18:16 -0500)]
radeonsi: don't allow draw calls with uninitialized VS inputs
These always hang, because vertex buffer descriptors are not set up.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 7 Jan 2020 23:10:38 +0000 (18:10 -0500)]
radeonsi: add si_context::num_vertex_elements
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 7 Jan 2020 23:06:14 +0000 (18:06 -0500)]
radeonsi: rename desc_list_byte_size -> vb_desc_list_alloc_size
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Lionel Landwerlin [Tue, 26 Nov 2019 15:53:09 +0000 (17:53 +0200)]
anv: set stencil layout for input attachments
If an input attachment has a stencil format, we need to set this.
v2: Fish out VkAttachmentReferenceStencilLayoutKHR from
VkAttachmentReference2KHR::pNext (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: c1c346f16673 ("anv: implement VK_KHR_separate_depth_stencil_layouts")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2891>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2891>
Jason Ekstrand [Mon, 13 Jan 2020 18:20:48 +0000 (12:20 -0600)]
anv: Drop an unused variable
Jason Ekstrand [Wed, 8 Jan 2020 01:22:13 +0000 (19:22 -0600)]
nir/lower_atomics_to_ssbo: Also lower barriers
This is more correct for a pass which is supposed to completely lower
away atomic counters. It also lets us stop supporting atomic counter
barriers in most of the drivers.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
Jason Ekstrand [Tue, 7 Jan 2020 20:54:26 +0000 (14:54 -0600)]
nir: Rename nir_intrinsic_barrier to control_barrier
This is a more explicit name now that we don't want it to be doing any
memory barrier stuff for us.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
Jason Ekstrand [Tue, 7 Jan 2020 20:58:45 +0000 (14:58 -0600)]
intel/nir: Stop adding redundant barriers
Now that both GLSL and SPIR-V are adding shared and tcs_patch barriers
(as appropreate) prior to the nir_intrinsic_barrier, we don't need to do
it ourselves in the back-end. This reverts commit
26e950a5de01564e3b5f2148ae994454ae5205fe.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
Jason Ekstrand [Tue, 7 Jan 2020 20:40:53 +0000 (14:40 -0600)]
nir/glsl: Emit memory barriers as part of barrier()
The GLSL barrier() intrinsic does an implicit shared memory barrier in
compute shaders and an implicit TCS patch output barrier in tessellation
control shaders. We'd like NIR's barrier intrinsic to just be a control
flow barrier and not have memory implications. To satisfy this, we need
to add an extra memory barrier in front of each nir_intrinsic_barrier.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>