mesa.git
5 years agopanfrost: Promote uniform registers late
Alyssa Rosenzweig [Tue, 16 Jul 2019 21:10:08 +0000 (14:10 -0700)]
panfrost: Promote uniform registers late

Rather than creating either a load or a uniform register read with a
fixed beginning offset, we always create a load and then promote to a
uniform register later. This will allow us to promote in a register
pressure aware manner.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Call scheduler/RA in a loop
Alyssa Rosenzweig [Tue, 16 Jul 2019 16:45:11 +0000 (09:45 -0700)]
pan/midgard: Call scheduler/RA in a loop

This will allow us to insert instructions as a result of register
allocation, permitting spilling to be implemented. As a side effect,
with the assert commented out this would fix a bunch of glamor crashes
(due to RA failures) so MATE becomes useable.

Ideally we'll have scheduling or RA actually sorted out before the
branch point but if not this gives us a one-line out to get X working...

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Remove custom register selection callback
Alyssa Rosenzweig [Tue, 16 Jul 2019 16:16:39 +0000 (09:16 -0700)]
pan/midgard: Remove custom register selection callback

What we have is equivalent to the default callback; let's use that.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agoradv: fix crash in vkCmdClearAttachments with unused attachment
Samuel Pitoiset [Mon, 22 Jul 2019 08:12:48 +0000 (10:12 +0200)]
radv: fix crash in vkCmdClearAttachments with unused attachment

depth_stencil_attachment and/or ds_resolve attachment can be NULL.

This fixes crashes with
dEQP-VK.renderpass.suballocation.unused_clear_attachments.*

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoi965: free object labels when deleting
Sergii Romantsov [Wed, 17 Jul 2019 15:59:28 +0000 (18:59 +0300)]
i965: free object labels when deleting

Some leaks detected with GL_KHR_debug on i965.

CC: Timothy Arceri <t_arceri@yahoo.com.au>
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoradv/gfx10: update descriptors for inline uniform blocks
Samuel Pitoiset [Thu, 18 Jul 2019 13:51:33 +0000 (15:51 +0200)]
radv/gfx10: update descriptors for inline uniform blocks

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: emit the GS NGG prologue before the nested barrier
Samuel Pitoiset [Thu, 18 Jul 2019 13:51:32 +0000 (15:51 +0200)]
radv/gfx10: emit the GS NGG prologue before the nested barrier

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: do not allocate space for the ZPASS_DONE bug
Samuel Pitoiset [Thu, 18 Jul 2019 13:51:31 +0000 (15:51 +0200)]
radv/gfx10: do not allocate space for the ZPASS_DONE bug

GFX10 isn't affected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: do not set ELEMENT_SIZE for buffer descriptors
Samuel Pitoiset [Thu, 18 Jul 2019 13:51:30 +0000 (15:51 +0200)]
radv/gfx10: do not set ELEMENT_SIZE for buffer descriptors

This field doesn't exist.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: clean up fill_geom_tess_rings()
Samuel Pitoiset [Thu, 18 Jul 2019 13:51:29 +0000 (15:51 +0200)]
radv: clean up fill_geom_tess_rings()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: change a bunch of >= GFX9 to == GFX9
Samuel Pitoiset [Thu, 18 Jul 2019 13:51:28 +0000 (15:51 +0200)]
radv: change a bunch of >= GFX9 to == GFX9

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoac/nir: do not clamp shadow reference on GFX10
Samuel Pitoiset [Thu, 18 Jul 2019 13:51:27 +0000 (15:51 +0200)]
ac/nir: do not clamp shadow reference on GFX10

RadeonSI only uses Z32_FLOAT_CLAMP for upgraded depth textures
on GFX10 and RADV doesn't promotes Z16 or Z24.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: move nir_opt_conditional_discard out of optimization loop
Daniel Schürmann [Sat, 20 Jul 2019 17:21:14 +0000 (19:21 +0200)]
radv: move nir_opt_conditional_discard out of optimization loop

This late optimization pass is only affected by nir_opt_if() and handles all cases
in a single pass. It's enough to call it once after the optimization loop.
No changes on vkpipeline-db.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agov3d: fill logicop_func in the fragment shader key when precompiling shaders
Iago Toral Quiroga [Fri, 19 Jul 2019 07:54:54 +0000 (09:54 +0200)]
v3d: fill logicop_func in the fragment shader key when precompiling shaders

Since logicop_func 0 is PIPE_LOGIOP_CLEAR, we were trigger lowerinng
of logic ops on precompiled shaders, which we don't want to do. Also, this
had the side effect of making shader-db crash, as during this lowering we
would try to read the color format swizzle information from the fragment shader
key that we don't populate in precompiled shaders because right now we only
need it when logic operations are enabled.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: Avoid scheduling an instruction that stalls waiting for SFU retval
Jose Maria Casanova Crespo [Tue, 9 Jul 2019 17:23:25 +0000 (19:23 +0200)]
v3d: Avoid scheduling an instruction that stalls waiting for SFU retval

If we detect that a scheduling candidate will stall because having a
register source that is the written by the SFU unit in the previous
instruction we reduce its priority so any non stalling operation would
be chosen.

The latency of SFU operations is defined as 2. So they would be scheduled
earlier if other candidates have the same priority.

Finally we won't merge instructions that stall to a previously chosen one.
As the result of the previous one would be waiting for an extra cycle.

Although shader-db result show that instruction are hurt with an increase
of 0.35% the sum of instructions + stalls is reduced a 0.52%. And
the total of sfu-stalls is reduced a 63.51%. It implies also a small
increase in the max-temps metric because of scheduling earlier SFU
operations.

total instructions in shared programs: 9102719 -> 9117851 (0.17%)
instructions in affected programs: 4324628 -> 4339760 (0.35%)
helped: 4162
HURT: 12128
helped stats (abs) min: 1 max: 10 x̄: 1.28 x̃: 1
helped stats (rel) min: 0.09% max: 4.76% x̄: 0.66% x̃: 0.51%
HURT stats (abs)   min: 1 max: 27 x̄: 1.69 x̃: 1
HURT stats (rel)   min: 0.05% max: 7.69% x̄: 0.87% x̃: 0.68%
95% mean confidence interval for instructions value: 0.90 0.96
95% mean confidence interval for instructions %-change: 0.47% 0.50%
Instructions are HURT.

total max-temps in shared programs: 1327728 -> 1327812 (<.01%)
max-temps in affected programs: 4730 -> 4814 (1.78%)
helped: 61
HURT: 134
helped stats (abs) min: 1 max: 2 x̄: 1.08 x̃: 1
helped stats (rel) min: 2.70% max: 13.33% x̄: 4.89% x̃: 4.17%
HURT stats (abs)   min: 1 max: 3 x̄: 1.12 x̃: 1
HURT stats (rel)   min: 1.54% max: 20.00% x̄: 6.10% x̃: 5.26%
95% mean confidence interval for max-temps value: 0.28 0.58
95% mean confidence interval for max-temps %-change: 1.80% 3.52%
Max-temps are HURT.

total sfu-stalls in shared programs: 99551 -> 36324 (-63.51%)
sfu-stalls in affected programs: 95029 -> 31802 (-66.53%)
helped: 25882
HURT: 0
helped stats (abs) min: 1 max: 27 x̄: 2.44 x̃: 2
helped stats (rel) min: 5.26% max: 100.00% x̄: 79.86% x̃: 100.00%
95% mean confidence interval for sfu-stalls value: -2.47 -2.42
95% mean confidence interval for sfu-stalls %-change: -80.18% -79.54%
Sfu-stalls are helped.

total inst-and-stalls in shared programs: 9202270 -> 9154175 (-0.52%)
inst-and-stalls in affected programs: 5618516 -> 5570421 (-0.86%)
helped: 22728
HURT: 855
helped stats (abs) min: 1 max: 31 x̄: 2.16 x̃: 1
helped stats (rel) min: 0.07% max: 16.67% x̄: 1.14% x̃: 0.92%
HURT stats (abs)   min: 1 max: 5 x̄: 1.25 x̃: 1
HURT stats (rel)   min: 0.12% max: 5.26% x̄: 1.24% x̃: 0.86%
95% mean confidence interval for inst-and-stalls value: -2.07 -2.01
95% mean confidence interval for inst-and-stalls %-change: -1.07% -1.05%
Inst-and-stalls are helped.

v2: Rename v3d_qpu_generates_sfu_stalls to v3d_qpu_instr_is_sfu (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: add shader-db stat to count SFU stalls
Jose Maria Casanova Crespo [Tue, 2 Jul 2019 16:31:09 +0000 (18:31 +0200)]
v3d: add shader-db stat to count SFU stalls

SFU operations have a latency of 2 cicles, so if their results
are used in the following cycle to a SFU instruction, the GPU
stalls for an extra cycle until the result is available.

This adds the number of stalls to the shader-db debug mode and
sum of instruction + stalls to evaluate optimizations to schedule
instructions that avoid generating sfu-stalls.

v2: Rename v3d_qpu_generates_sfu_stalls to v3d_qpu_instr_is_sfu (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoradv: replace memset()+strcpy() with snprintf()
Eric Engestrom [Mon, 8 Jul 2019 16:14:28 +0000 (17:14 +0100)]
radv: replace memset()+strcpy() with snprintf()

Just like the next line :)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: drop unnecessary memset() before snprintf()
Eric Engestrom [Mon, 8 Jul 2019 16:12:39 +0000 (17:12 +0100)]
radv: drop unnecessary memset() before snprintf()

snprintf() always terminates the string.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: Fix uninitialized warning.
Bas Nieuwenhuizen [Thu, 18 Jul 2019 22:00:03 +0000 (00:00 +0200)]
radv:  Fix uninitialized warning.

For es_vgpr_comp_cnt.

Fixes: 795adbbadd4 "radv/gfx10: Add pipeline state support for tess."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agovirgl: fix a sync issue in virgl_buffer_transfer_extend
Chia-I Wu [Tue, 16 Jul 2019 23:48:03 +0000 (16:48 -0700)]
virgl: fix a sync issue in virgl_buffer_transfer_extend

In virgl_buffer_transfer_extend, when no flush is needed, it tries
to extend a previously queued transfer instead if it can find one.
Comparing to virgl_resource_transfer_prepare, it fails to check if
the resource is busy.

The existence of a previously queued transfer normally implies that
the resource is not busy, maybe except for when the transfer is
PIPE_TRANSFER_UNSYNCHRONIZED.  Rather than burdening us with a
lengthy comment, and potential concerns over breaking it as the
transfer code evolves, this commit makes the valid_buffer_range
check the only condition to take the fast path.

In real world, we hit the fast path almost only because of the
valid_buffer_range check.  In micro benchmarks, the condition should
always be true, otherwise the benchmarks are not very representative
of meaningful workloads.  I think this fix is justified.

The recent change to PIPE_TRANSFER_MAP_DIRECTLY usage disables the
fast path.  This commit re-enables it as well.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agovirgl: rework virgl_transfer_queue_extend
Chia-I Wu [Wed, 17 Jul 2019 00:11:55 +0000 (17:11 -0700)]
virgl: rework virgl_transfer_queue_extend

Do not take a transfer and do the memcpy.  Add a _buffer suffix to
the function name to make it clear that it is only for buffers.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agovirgl: fix virgl_buffer_transfer_extend
Chia-I Wu [Wed, 10 Jul 2019 07:33:29 +0000 (00:33 -0700)]
virgl: fix virgl_buffer_transfer_extend

Without setting hw_res, virgl_transfer_queue_extend never finds a
match and always returns NULL.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agoradeonsi: initialize scissor registers etc. without clear state
Marek Olšák [Wed, 17 Jul 2019 03:17:38 +0000 (23:17 -0400)]
radeonsi: initialize scissor registers etc. without clear state

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi: return success from vi_dcc_clear_level to simplify callers
Marek Olšák [Wed, 17 Jul 2019 01:30:57 +0000 (21:30 -0400)]
radeonsi: return success from vi_dcc_clear_level to simplify callers

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi: fix compute-based culling regression in 1ce52c1e373
Marek Olšák [Tue, 16 Jul 2019 23:04:48 +0000 (19:04 -0400)]
radeonsi: fix compute-based culling regression in 1ce52c1e373

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: fix VGT_PRIMITIVE_TYPE programming
Marek Olšák [Tue, 16 Jul 2019 17:23:17 +0000 (13:23 -0400)]
radeonsi/gfx10: fix VGT_PRIMITIVE_TYPE programming

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: enable Wave32 for vertex, geometry, and tessellation shaders
Marek Olšák [Fri, 12 Jul 2019 19:55:33 +0000 (15:55 -0400)]
radeonsi/gfx10: enable Wave32 for vertex, geometry, and tessellation shaders

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: add debug options to enable/disable Wave32
Marek Olšák [Fri, 12 Jul 2019 19:45:33 +0000 (15:45 -0400)]
radeonsi/gfx10: add debug options to enable/disable Wave32

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: add as_ngg variant for TES as ES to select Wave32/64
Marek Olšák [Fri, 12 Jul 2019 23:49:30 +0000 (19:49 -0400)]
radeonsi/gfx10: add as_ngg variant for TES as ES to select Wave32/64

Legacy GS has to use Wave64, so TES before GS has to use Wave64 too.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: implement Wave32
Marek Olšák [Fri, 12 Jul 2019 21:37:29 +0000 (17:37 -0400)]
radeonsi/gfx10: implement Wave32

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: use 32-bit wavemasks for Wave32
Marek Olšák [Tue, 16 Jul 2019 04:55:46 +0000 (00:55 -0400)]
radeonsi/gfx10: use 32-bit wavemasks for Wave32

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoac: create the LLVM builder in ac_llvm_context_init
Marek Olšák [Fri, 12 Jul 2019 21:35:39 +0000 (17:35 -0400)]
ac: create the LLVM builder in ac_llvm_context_init

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoac: create the LLVM module for Wave32 or Wave64 in ac_llvm_context_init
Marek Olšák [Fri, 12 Jul 2019 21:32:18 +0000 (17:32 -0400)]
ac: create the LLVM module for Wave32 or Wave64 in ac_llvm_context_init

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoac/rtld: add support for Wave32
Marek Olšák [Fri, 12 Jul 2019 21:20:36 +0000 (17:20 -0400)]
ac/rtld: add support for Wave32

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoac: add Wave32 LLVM target machine
Marek Olšák [Fri, 12 Jul 2019 21:14:11 +0000 (17:14 -0400)]
ac: add Wave32 LLVM target machine

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoac: initial Wave32 support in LLVM build helpers
Marek Olšák [Fri, 12 Jul 2019 21:12:17 +0000 (17:12 -0400)]
ac: initial Wave32 support in LLVM build helpers

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi: assume that selector != NULL for compute shaders
Marek Olšák [Tue, 16 Jul 2019 02:00:05 +0000 (22:00 -0400)]
radeonsi: assume that selector != NULL for compute shaders

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi: remove what appears to be legacy compute code
Marek Olšák [Tue, 16 Jul 2019 01:55:43 +0000 (21:55 -0400)]
radeonsi: remove what appears to be legacy compute code

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi: remove si_program::use_code_object_v2
Marek Olšák [Tue, 16 Jul 2019 01:49:30 +0000 (21:49 -0400)]
radeonsi: remove si_program::use_code_object_v2

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi: add si_shader_selector into si_compute
Marek Olšák [Tue, 16 Jul 2019 01:39:22 +0000 (21:39 -0400)]
radeonsi: add si_shader_selector into si_compute

Now we can assume that shader->selector is always set.
This will simplify some code.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi: set threadgroup size to 0 for threadgroups with only 1 wave
Marek Olšák [Fri, 12 Jul 2019 21:22:30 +0000 (17:22 -0400)]
radeonsi: set threadgroup size to 0 for threadgroups with only 1 wave

This has no effect on Wave64.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: set as_ngg for GS prolog
Marek Olšák [Fri, 12 Jul 2019 21:26:24 +0000 (17:26 -0400)]
radeonsi/gfx10: set as_ngg for GS prolog

as_ngg is required by Wave32.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: remove the disable_ngg option
Marek Olšák [Fri, 12 Jul 2019 19:31:14 +0000 (15:31 -0400)]
radeonsi/gfx10: remove the disable_ngg option

because legacy VS hangs.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: combine hw edgeflags with user edgeflags for correct behavior
Marek Olšák [Sat, 6 Jul 2019 04:12:26 +0000 (00:12 -0400)]
radeonsi/gfx10: combine hw edgeflags with user edgeflags for correct behavior

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: deduplicate code for esvert_lds_size
Marek Olšák [Sat, 6 Jul 2019 04:11:36 +0000 (00:11 -0400)]
radeonsi/gfx10: deduplicate code for esvert_lds_size

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: simplify a streamout loop in gfx10_emit_ngg_epilogue
Marek Olšák [Sat, 6 Jul 2019 03:32:36 +0000 (23:32 -0400)]
radeonsi/gfx10: simplify a streamout loop in gfx10_emit_ngg_epilogue

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: don't use MALLOC for outputs
Marek Olšák [Sat, 6 Jul 2019 03:22:33 +0000 (23:22 -0400)]
radeonsi/gfx10: don't use MALLOC for outputs

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: clean up ESGS ring size computation
Marek Olšák [Sat, 6 Jul 2019 02:19:47 +0000 (22:19 -0400)]
radeonsi/gfx10: clean up ESGS ring size computation

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: fix unnecessary LDS overallocation for NGG GS
Marek Olšák [Sat, 6 Jul 2019 02:12:36 +0000 (22:12 -0400)]
radeonsi/gfx10: fix unnecessary LDS overallocation for NGG GS

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: don't compile the GS copy shader if it's 100% not needed
Marek Olšák [Sat, 6 Jul 2019 01:19:41 +0000 (21:19 -0400)]
radeonsi/gfx10: don't compile the GS copy shader if it's 100% not needed

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: set GE_CTNL.PACKET_TO_ONE_PA for NGG
Marek Olšák [Sat, 6 Jul 2019 01:06:04 +0000 (21:06 -0400)]
radeonsi/gfx10: set GE_CTNL.PACKET_TO_ONE_PA for NGG

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: update a tunable max_es_verts_base for NGG
Marek Olšák [Fri, 5 Jul 2019 21:53:47 +0000 (17:53 -0400)]
radeonsi/gfx10: update a tunable max_es_verts_base for NGG

We have to fix the computation so as not to break quads.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: implement ARB_post_depth_coverage
Marek Olšák [Fri, 5 Jul 2019 21:30:08 +0000 (17:30 -0400)]
radeonsi/gfx10: implement ARB_post_depth_coverage

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi: fix leaked compute shader NIR
Marek Olšák [Tue, 16 Jul 2019 04:08:27 +0000 (00:08 -0400)]
radeonsi: fix leaked compute shader NIR

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi: save the enable_nir option in the shader cache correctly
Marek Olšák [Fri, 12 Jul 2019 19:42:44 +0000 (15:42 -0400)]
radeonsi: save the enable_nir option in the shader cache correctly

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi/gfx10: enable SDMA
Marek Olšák [Sat, 13 Jul 2019 00:21:01 +0000 (20:21 -0400)]
radeonsi/gfx10: enable SDMA

no changes since gfx9 for buffers

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoac: use llvm.amdgcn.writelane
Marek Olšák [Tue, 16 Jul 2019 05:07:49 +0000 (01:07 -0400)]
ac: use llvm.amdgcn.writelane

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoac: fix shader clock on LLVM 9
Marek Olšák [Tue, 16 Jul 2019 03:42:35 +0000 (23:42 -0400)]
ac: fix shader clock on LLVM 9

Probably relevant commit:

commit dd32dc3f72ec99b1794d62c74d2beb3b60468d50
Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
Date:   Tue Jul 9 03:10:18 2019 +0000

    [AMDGPU] Always use s_memtime for readcyclecounter

    Differential Revision: https://reviews.llvm.org/D64369

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365431 91177308-0d34-0410-b5e6-96231b3b80d8

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeon/vcn: adding engine type for new fw interface
Boyuan Zhang [Wed, 15 May 2019 19:05:21 +0000 (15:05 -0400)]
radeon/vcn: adding engine type for new fw interface

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradeonsi: use the correct buffer size in si_vid_clear_buffer
Marek Olšák [Mon, 8 Jul 2019 18:57:42 +0000 (14:57 -0400)]
radeonsi: use the correct buffer size in si_vid_clear_buffer

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agomesa: add EXT_dsa glEnabledIndexedEXT
Pierre-Eric Pelloux-Prayer [Fri, 26 Apr 2019 14:50:57 +0000 (16:50 +0200)]
mesa: add EXT_dsa glEnabledIndexedEXT

The implementation uses _mesa_ActiveTexture to change the active texture unit and
then reset it.

It causes an unnecessary _NEW_TEXTURE_STATE but:
  - adding an index argument to _mesa_set_enable causes a lot of changes (~140 callers)
  - enable_texture (called by _mesa_set_enable) might cause a _NEW_TEXTURE_STATE
    anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: add EXT_dsa glGetTextureLevelParameter*vEXT functions
Pierre-Eric Pelloux-Prayer [Mon, 20 May 2019 12:12:54 +0000 (14:12 +0200)]
mesa: add EXT_dsa glGetTextureLevelParameter*vEXT functions

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: add EXT_dsa gl(Copy)Texture(Sub)Image1D/2D/3DEXT functions
Pierre-Eric Pelloux-Prayer [Fri, 26 Apr 2019 14:50:31 +0000 (16:50 +0200)]
mesa: add EXT_dsa gl(Copy)Texture(Sub)Image1D/2D/3DEXT functions

Added functions:
- glTextureImage1DEXT
- glTextureImage2DEXT
- glTextureImage3DEXT
- glTextureSubImage1DEXT
- glTextureSubImage3DEXT
- glCopyTextureImage1DEXT
- glCopyTextureImage2DEXT
- glCopyTextureSubImage1DEXT
- glCopyTextureSubImage2DEXT
- glCopyTextureSubImage3DEXT
- glGetTextureImageEXT

All but the last one can be compiled in a display list.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: move lookup_texture_ext_dsa up in teximage.c
Pierre-Eric Pelloux-Prayer [Tue, 2 Jul 2019 09:33:36 +0000 (11:33 +0200)]
mesa: move lookup_texture_ext_dsa up in teximage.c

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: pass gl_texture_object as arg to not depend on state
Pierre-Eric Pelloux-Prayer [Tue, 2 Jul 2019 09:32:06 +0000 (11:32 +0200)]
mesa: pass gl_texture_object as arg to not depend on state

This will allow to use the same functions for EXT_dsa implementation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: refactor get_texture_image to remove duplicate code
Pierre-Eric Pelloux-Prayer [Tue, 2 Jul 2019 08:59:21 +0000 (10:59 +0200)]
mesa: refactor get_texture_image to remove duplicate code

Move shared code in a new function (_get_texture_image) and use it instead
of duplicating the same lines.
Will be also used by the EXT_dsa functions (GetTextureImageEXT and GetMultiTexImageEXT).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agopipe-loader: use radeonsi for MM if amdgpu dri is used
Jeremy Newton [Wed, 10 Jul 2019 14:23:53 +0000 (10:23 -0400)]
pipe-loader: use radeonsi for MM if amdgpu dri is used

The amdgpu dri is used for the closed source AMD driver. Since this driver
does not implement multimedia, we fall back to radeonsi in mesa to do
multimedia. This corrects the dri driver name for when it is set to amdgpu.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> (v1)
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
5 years agoegl: drop incorrect pkg-config file for glvnd
Eric Engestrom [Thu, 4 Jul 2019 13:48:43 +0000 (14:48 +0100)]
egl: drop incorrect pkg-config file for glvnd

With b01524fff05eef66e8cd ("meson: don't build libGLES*.so with GLVND")
we dropped the incorrect pkg-config files for GLES*.

Since then, the glvnd issue of its missing files has become painfully
apparent, since it break the build for everyone using glvnd.

NVIDIA has had a fix for a few years now, but has yet to accept it:
https://github.com/NVIDIA/libglvnd/pull/86

Since the breakage is already there, let's clean up everything on our side
while we wait for NVIDIA to accept the fix.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agodocs: simplify `Fixes:` git command
Eric Engestrom [Fri, 19 Jul 2019 20:27:44 +0000 (21:27 +0100)]
docs: simplify `Fixes:` git command

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agomesa/tests: add missing dep_thread
Eric Engestrom [Fri, 19 Jul 2019 14:00:35 +0000 (15:00 +0100)]
mesa/tests: add missing dep_thread

Fixes: f8c27c277585141f2d27 ("state_tracker: Move the format test out to be an actual unit test.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Vinson Lee <vlee@freedesktop.org>
5 years agoutil: drop strncat(), strcmp(), strncmp(), snprintf() & vsnprintf() MSVC fallbacks
Eric Engestrom [Thu, 4 Jul 2019 15:13:34 +0000 (16:13 +0100)]
util: drop strncat(), strcmp(), strncmp(), snprintf() & vsnprintf() MSVC fallbacks

It would seem MSVC>=2015 is now C99-compliant wrt these functions:
strncat:   https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/strncat-strncat-l-wcsncat-wcsncat-l-mbsncat-mbsncat-l?view=vs-2017
strcmp:    https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/strcmp-wcscmp-mbscmp?view=vs-2017
strncmp:   https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/strncmp-wcsncmp-mbsncmp-mbsncmp-l?view=vs-2017
snprintf:  https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/snprintf-snprintf-snprintf-l-snwprintf-snwprintf-l?view=vs-2017
vsnprintf: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/vsnprintf-vsnprintf-vsnprintf-l-vsnwprintf-vsnwprintf-l?view=vs-2017

Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoutil: use standard name for vsnprintf()
Eric Engestrom [Tue, 20 Nov 2018 12:02:36 +0000 (12:02 +0000)]
util: use standard name for vsnprintf()

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoutil: use standard name for snprintf()
Eric Engestrom [Tue, 20 Nov 2018 11:59:28 +0000 (11:59 +0000)]
util: use standard name for snprintf()

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoutil: use standard name for vasprintf()
Eric Engestrom [Tue, 20 Nov 2018 11:55:55 +0000 (11:55 +0000)]
util: use standard name for vasprintf()

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoutil: use standard name for sprintf()
Eric Engestrom [Tue, 20 Nov 2018 11:55:00 +0000 (11:55 +0000)]
util: use standard name for sprintf()

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoutil: use standard name for strcmp()
Eric Engestrom [Tue, 20 Nov 2018 11:49:52 +0000 (11:49 +0000)]
util: use standard name for strcmp()

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoutil: use standard name for strcasecmp()
Eric Engestrom [Tue, 20 Nov 2018 11:39:28 +0000 (11:39 +0000)]
util: use standard name for strcasecmp()

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoutil: use standard name for strncmp()
Eric Engestrom [Tue, 20 Nov 2018 11:48:38 +0000 (11:48 +0000)]
util: use standard name for strncmp()

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoutil: use standard name for strncat()
Eric Engestrom [Tue, 20 Nov 2018 11:47:06 +0000 (11:47 +0000)]
util: use standard name for strncat()

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoutil: use standard name for strdup()
Eric Engestrom [Tue, 20 Nov 2018 11:42:14 +0000 (11:42 +0000)]
util: use standard name for strdup()

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoutil: use standard name for strchrnul()
Eric Engestrom [Tue, 20 Nov 2018 11:24:55 +0000 (11:24 +0000)]
util: use standard name for strchrnul()

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoutil: drop unused vsprintf() wrapper
Eric Engestrom [Tue, 20 Nov 2018 11:57:03 +0000 (11:57 +0000)]
util: drop unused vsprintf() wrapper

Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoutil: drop unused strchr() wrapper
Eric Engestrom [Tue, 20 Nov 2018 11:51:35 +0000 (11:51 +0000)]
util: drop unused strchr() wrapper

Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoutil: drop unused strstr() wrapper
Eric Engestrom [Tue, 20 Nov 2018 11:43:57 +0000 (11:43 +0000)]
util: drop unused strstr() wrapper

Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agonir: Only rematerialize comparisons with all SSA sources
Jason Ekstrand [Fri, 19 Jul 2019 18:07:39 +0000 (13:07 -0500)]
nir: Only rematerialize comparisons with all SSA sources

Otherwise, you may end up moving a register read and that could result
in an incorrect shader.  This commit fixes a rendering issue in Elite:
Dangerous.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111152
Fixes: 3ee2e84c60 "nir: Rematerialize compare instructions"
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agospirv: Fix order of barriers in SpvOpControlBarrier
Daniel Schürmann [Thu, 18 Jul 2019 18:48:14 +0000 (20:48 +0200)]
spirv: Fix order of barriers in SpvOpControlBarrier

Semantically, the memory barrier has to come first to wait
for the completion of pending memory requests.
Afterwards, the workgroups can be synchronized.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir: use a switch when printing intrinsic indices
Caio Marcelo de Oliveira Filho [Thu, 7 Mar 2019 19:07:04 +0000 (11:07 -0800)]
nir: use a switch when printing intrinsic indices

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
5 years agonir/algebraic: mark a few comparison simplifications as precise
Rhys Perry [Mon, 1 Jul 2019 14:49:40 +0000 (15:49 +0100)]
nir/algebraic: mark a few comparison simplifications as precise

No vkpipeline-db changes found.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reveiewed-by: Alyssa Rosenzweig alyssa.rosenzweig@collabora.com
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agonir/algebraic: optimize contradictory iand operands
Rhys Perry [Fri, 28 Jun 2019 15:13:04 +0000 (16:13 +0100)]
nir/algebraic: optimize contradictory iand operands

Some of these were found in a few GTAV, Rise of the Tomb Raider and
Shadow of the Tomb Raider shaders.

Results from vkpipeline-db run with ACO:
Totals from affected shaders:
SGPRS: 376 -> 376 (0.00 %)
VGPRS: 220 -> 220 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 13492 -> 11560 (-14.32 %) bytes
LDS: 6 -> 6 (0.00 %) blocks
Max Waves: 69 -> 69 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

v2: use False instead of 0

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reveiewed-by: Alyssa Rosenzweig alyssa.rosenzweig@collabora.com
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agolima/ppir: handle all node types in ppir_node_replace_child
Erico Nunes [Tue, 16 Jul 2019 23:31:01 +0000 (01:31 +0200)]
lima/ppir: handle all node types in ppir_node_replace_child

ppir_node_replace_child is used by the const lowering routine in ppir.
All types need to be handled here, otherwise the src node is not updated
properly when one of the lowered nodes is a const, which results in, for
example, regalloc not assigning registers correctly.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/ppir: branch regalloc fixes
Erico Nunes [Tue, 16 Jul 2019 23:30:55 +0000 (01:30 +0200)]
lima/ppir: branch regalloc fixes

The branch instruction has sources which must be handled in src handling
paths so that regalloc assigns registers to them properly.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agomain: Destroy static hash table
Yevhenii Kolesnikov [Thu, 18 Jul 2019 14:38:48 +0000 (17:38 +0300)]
main: Destroy static hash table

format_array_format_table has a static lifetime - it will be destroyed
by an atexit handler.

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoradv: reset the window scissor with no clear state.
Dave Airlie [Thu, 18 Jul 2019 01:19:11 +0000 (11:19 +1000)]
radv: reset the window scissor with no clear state.

If we don't have clear state (which gfx10 doesn't currently)
we will fix to reset the scissor. AMDVLK will leave it set
to something else.

Marek also has this fix for radeonsi pending.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradv: fix crash in shader tracing.
Dave Airlie [Thu, 18 Jul 2019 00:44:10 +0000 (10:44 +1000)]
radv: fix crash in shader tracing.

Enabling tracing, and then having a vmfault, can leads to a segfault
before we print out the traces, as if a meta shader is executing
and we don't have the NIR for it.

Just pass the stage and give back a default.

Fixes: 9b9ccee4d64 ("radv: take LDS into account for compute shader occupancy stats")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoiris: change last_vue_stage() to look at uncompiled shaders
Timothy Arceri [Fri, 28 Jun 2019 01:13:11 +0000 (11:13 +1000)]
iris: change last_vue_stage() to look at uncompiled shaders

This allows us to find the last vue stage before we have compiled
the shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/lower_clip: add support for geometry shaders
Timothy Arceri [Thu, 27 Jun 2019 04:20:37 +0000 (14:20 +1000)]
nir/lower_clip: add support for geometry shaders

This will be used to enabled compat profile support for geometry
shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/lower_clip: add lower_clip_outputs() helper
Timothy Arceri [Fri, 28 Jun 2019 00:50:15 +0000 (10:50 +1000)]
nir/lower_clip: add lower_clip_outputs() helper

This will be reused in the following patch to add support for clip
vertex lowering in geometry shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/lower_clip: add create_clipdist_vars() helper
Timothy Arceri [Fri, 28 Jun 2019 00:35:11 +0000 (10:35 +1000)]
nir/lower_clip: add create_clipdist_vars() helper

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/lower_clip: add a find_clipvertex_and_position_outputs() helper
Timothy Arceri [Fri, 28 Jun 2019 00:10:28 +0000 (10:10 +1000)]
nir/lower_clip: add a find_clipvertex_and_position_outputs() helper

This will allow code sharing in a following patch that adds support
for lowering in geometry shaders. It also allows us to exit early
if there is no lowering to do which allows a small code tidy up.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agopanfrost: Set rt_count
Alyssa Rosenzweig [Thu, 18 Jul 2019 19:43:39 +0000 (12:43 -0700)]
panfrost: Set rt_count

This doesn't quite work yet, but it illustrates how MRT is implemented
in the MFBD: rt_count is set appropriately based on the number of render
targets, while additional render target descriptors are appended on with
an index variable in them (not quite decoded since there's some aspects
we don't understand there, but conceptually this should be right).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>