mesa.git
4 years agovulkan/overlay: add queue present timing measurement
Lionel Landwerlin [Mon, 8 Jul 2019 13:03:14 +0000 (16:03 +0300)]
vulkan/overlay: add queue present timing measurement

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
4 years agoradv/gfx10: Enable tess.
Bas Nieuwenhuizen [Mon, 8 Jul 2019 21:50:09 +0000 (23:50 +0200)]
radv/gfx10: Enable tess.

Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agoradv/gfx10: Add pipeline state support for tess.
Bas Nieuwenhuizen [Mon, 8 Jul 2019 21:44:32 +0000 (23:44 +0200)]
radv/gfx10: Add pipeline state support for tess.

Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agoradv/gfx10: Only set HW edge flags with gs & tess disabled.
Bas Nieuwenhuizen [Mon, 8 Jul 2019 21:43:34 +0000 (23:43 +0200)]
radv/gfx10: Only set HW edge flags with gs & tess disabled.

Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agoradv/gfx10: Add tess eval ngg shader support.
Bas Nieuwenhuizen [Mon, 8 Jul 2019 21:42:45 +0000 (23:42 +0200)]
radv/gfx10: Add tess eval ngg shader support.

Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agoradv: Use correct gs_out with tessellation.
Bas Nieuwenhuizen [Mon, 8 Jul 2019 21:18:55 +0000 (23:18 +0200)]
radv: Use correct gs_out with tessellation.

We should use the primitives output by the TES in that case.

There is always a separate TES if there is no GS.

Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agoradv/gfx10: Use correct count of max_offchip_buffers.
Bas Nieuwenhuizen [Mon, 8 Jul 2019 21:18:28 +0000 (23:18 +0200)]
radv/gfx10: Use correct count of max_offchip_buffers.

Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agoradv/gfx10: Load global pointers in correct userdata registers for hs/gs.
Bas Nieuwenhuizen [Tue, 9 Jul 2019 00:56:10 +0000 (02:56 +0200)]
radv/gfx10: Load global pointers in correct userdata registers for hs/gs.

Fixes: cfaad5e3cad "radv/gfx10: implement radv_emit_global_shader_pointers()"
Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agoradeonsi: update function name in comment
Timothy Arceri [Mon, 8 Jul 2019 00:59:46 +0000 (10:59 +1000)]
radeonsi: update function name in comment

This was missed in 2361558eb71d

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agor600: remove query/apply_opaque_metadata callbacks
Timothy Arceri [Mon, 8 Jul 2019 00:52:45 +0000 (10:52 +1000)]
r600: remove query/apply_opaque_metadata callbacks

Theses seem to have been radeonsi specific callbacks that are no
longer needed now that these drivers no longer share this code
path.

These callbacks were removed from radeonsi in c0d44fe0e91c.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agovulkan/overlay: fix crash on freeing NULL command buffer
Lionel Landwerlin [Mon, 8 Jul 2019 13:00:59 +0000 (16:00 +0300)]
vulkan/overlay: fix crash on freeing NULL command buffer

It is legal to call vkFreeCommandBuffers() on NULL command buffers.

This fix requires eb41ce1b012f24 ("util/hash_table: Properly handle
the NULL key in hash_table_u64").

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4438188f492e1f ("vulkan/overlay: record stats in command buffers and accumulate on exec/submit")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agovulkan: bump headers & registry to 1.1.114
Lionel Landwerlin [Mon, 8 Jul 2019 07:30:50 +0000 (10:30 +0300)]
vulkan: bump headers & registry to 1.1.114

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
4 years agoradv: only use specialised 3D meta paths on GFX9.
Dave Airlie [Mon, 8 Jul 2019 19:08:09 +0000 (05:08 +1000)]
radv: only use specialised 3D meta paths on GFX9.

GFX10 appears to act like GFX8 here.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agomesa: Set minimum possible GLSL version
Ian Romanick [Sat, 9 Mar 2019 07:50:29 +0000 (23:50 -0800)]
mesa: Set minimum possible GLSL version

Set the absolute minimum possible GLSL version.  API_OPENGL_CORE can
mean an OpenGL 3.0 forward-compatible context, so that implies a minimum
possible version of 1.30.  Otherwise, the minimum possible version 1.20.
Since Mesa unconditionally advertises GL_ARB_shading_language_100 and
GL_ARB_shader_objects, every driver has GLSL 1.20... even if they don't
advertise any extensions to enable any shader stages (e.g.,
GL_ARB_vertex_shader).

Converts about 2,500 piglit tests from crash to skip on NV18.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109524
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110955
Cc: mesa-stable@lists.freedesktop.org
4 years agoanv: Set maxComputeSharedMemorySize to 64k
Caio Marcelo de Oliveira Filho [Mon, 8 Jul 2019 17:36:59 +0000 (10:36 -0700)]
anv: Set maxComputeSharedMemorySize to 64k

This value is supported since gen7.  See also 8514c75a26e "i965: Set
compute shader shared memory max to 64k".

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
4 years agointel/vec4: Delete vec4_visitor::emit_lrp
Ian Romanick [Thu, 6 Jun 2019 18:00:40 +0000 (11:00 -0700)]
intel/vec4: Delete vec4_visitor::emit_lrp

Effectivley unused since dd7135d55d5 ("intel/compiler: Use the flrp
lowering pass for all stages on Gen4 and Gen5").  I had intended to
remove this code as part of that series, but I forgot.

Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agonir: Allow nir_ssa_alu_instr_src_components to operate on non-SSA destinations
Ian Romanick [Fri, 7 Jun 2019 15:35:51 +0000 (08:35 -0700)]
nir: Allow nir_ssa_alu_instr_src_components to operate on non-SSA destinations

Existing users only operate on instructions with SSA destinations.  Some
later patches add new direct calls and indirect calls (via existing NIR
functions) on instructions after going out of SSA.  At the very least,
these calls are added by:

intel/vec4: Try to emit a VF source in try_immediate_source
intel/vec4: Try to emit a single load for multiple 3-src instruction operands

The first commit adds direct calls, and the second adds calls via
nir_alu_srcs_equal and nir_alu_srcs_negative_equal.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agonir: Handle swizzle in nir_alu_srcs_negative_equal
Ian Romanick [Mon, 10 Jun 2019 22:05:14 +0000 (15:05 -0700)]
nir: Handle swizzle in nir_alu_srcs_negative_equal

When I added this function, I was not sure if swizzles of immediate
values were a thing that occurred in NIR.  The only existing user of
these functions is the partial redundancy elimination for compares.
Since comparison instructions are inherently scalar, this does not
occur.

However, a couple later patches, "nir/algebraic: Recognize
open-coded flrp(-1, 1, a) and flrp(1, -1, a)" combined with "intel/vec4:
Try to emit a single load for multiple 3-src instruction operands",
collaborate to create a few thousand instances.

No shader-db changes on any Intel platform.

v2: Handle the swizzle in nir_alu_srcs_negative_equal and leave
nir_const_value_negative_equal unchanged.  Suggested by Jason.

v3: Correctly handle write masks.  Add note (and assertion) that the
caller is responsible for various compatibility checks.  The single
existing caller only calls this for combinations of scalar fadd and
float comparison instructions, so all of the requirements are met.  A
later patch (intel/vec4: Try to emit a single load for multiple 3-src
instruction operands) will call this for sources of the same
instruction, so all of the requirements are met.

v4: Add unit test for nir_opt_comparison_pre that is fixed by this
commit.

Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agonir: nir_const_value_negative_equal compares one value at a time
Ian Romanick [Thu, 13 Jun 2019 21:06:55 +0000 (14:06 -0700)]
nir: nir_const_value_negative_equal compares one value at a time

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agonir: Port some const_value_negative_equal tests to alu_src_negative_equal
Ian Romanick [Thu, 13 Jun 2019 20:55:30 +0000 (13:55 -0700)]
nir: Port some const_value_negative_equal tests to alu_src_negative_equal

The next commit will make the existing tests irrelevant.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
4 years agonir: Pass fully qualified type to nir_const_value_negative_equal
Ian Romanick [Thu, 13 Jun 2019 19:59:29 +0000 (12:59 -0700)]
nir: Pass fully qualified type to nir_const_value_negative_equal

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agonir: Use nir_src_bit_size instead of alu1->dest.dest.ssa.bit_size
Ian Romanick [Thu, 13 Jun 2019 21:19:11 +0000 (14:19 -0700)]
nir: Use nir_src_bit_size instead of alu1->dest.dest.ssa.bit_size

This is important because, for example nir_op_fne has
dest.dest.ssa.bit_size == 1, but the source operands can be 16-, 32-, or
64-bits.  Fixing this helps partial redundancy elimination for compares
in a few more shaders.

v2: Add unit tests for nir_opt_comparison_pre that are fixed by this
commit.

All Intel platforms had similar results.
total instructions in shared programs: 17179408 -> 17179081 (<.01%)
instructions in affected programs: 43958 -> 43631 (-0.74%)
helped: 118
HURT: 2
helped stats (abs) min: 1 max: 5 x̄: 2.87 x̃: 2
helped stats (rel) min: 0.06% max: 4.12% x̄: 1.19% x̃: 0.81%
HURT stats (abs)   min: 6 max: 6 x̄: 6.00 x̃: 6
HURT stats (rel)   min: 5.83% max: 6.06% x̄: 5.94% x̃: 5.94%
95% mean confidence interval for instructions value: -3.08 -2.37
95% mean confidence interval for instructions %-change: -1.30% -0.85%
Instructions are helped.

total cycles in shared programs: 360959066 -> 360942386 (<.01%)
cycles in affected programs: 774274 -> 757594 (-2.15%)
helped: 111
HURT: 4
helped stats (abs) min: 1 max: 1591 x̄: 169.49 x̃: 36
helped stats (rel) min: <.01% max: 24.43% x̄: 8.86% x̃: 2.24%
HURT stats (abs)   min: 1 max: 2068 x̄: 533.25 x̃: 32
HURT stats (rel)   min: 0.02% max: 5.10% x̄: 3.06% x̃: 3.56%
95% mean confidence interval for cycles value: -200.61 -89.47
95% mean confidence interval for cycles %-change: -10.32% -6.58%
Cycles are helped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> [v1]
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Fixes: be1cc3552bc ("nir: Add nir_const_value_negative_equal")
4 years agointel/vec4: Reswizzle VF immediates too
Ian Romanick [Wed, 12 Jun 2019 20:19:25 +0000 (13:19 -0700)]
intel/vec4: Reswizzle VF immediates too

Previously, an instruction like

mul(8) vgrf29.xy:F, vgrf25.yxxx:F, [-1F, 1F, 0F, 0F]

would get rewritten as

mul(8) vgrf0.yz:F, vgrf25.yyxx:F, [-1F, 1F, 0F, 0F]

The latter does not produce the correct result.  The VF immediate in the
second should be either [-1F, -1F, 1F, 1F] or [0F, -1F, 1F, 0F].  This
commit produces the former.

Fixes: 1ee1d8ab468 ("i965/vec4: Reswizzle sources when necessary.")
Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agonir: Add unit tests for nir_opt_comparison_pre
Ian Romanick [Mon, 17 Jun 2019 23:27:37 +0000 (16:27 -0700)]
nir: Add unit tests for nir_opt_comparison_pre

Each tests has a comment with the expected before and after NIR.  The
tests don't actually check this.  The tests only check whether or not
the optimization pass reported progress.  I couldn't think of a robust,
future-proof way to check the before and after code.

Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agoanv: disable repacking for compression for applicable gen
Dongwon Kim [Thu, 27 Jun 2019 16:54:37 +0000 (09:54 -0700)]
anv: disable repacking for compression for applicable gen

set bit15 (Disable Repacking for Compression) of CACHE_MODE_0 register
if the gen attribute, 'disable_ccs_repack' is set.

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
4 years agoiris: disable repacking for compression for applicable gen
Dongwon Kim [Thu, 27 Jun 2019 16:54:36 +0000 (09:54 -0700)]
iris: disable repacking for compression for applicable gen

set bit15 (Disable Repacking for Compression) of CACHE_MODE_0 register
if the gen attribute, 'disable_ccs_repack' is set.

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
4 years agoi965: disable repacking for compression for applicable gen
Dongwon Kim [Thu, 27 Jun 2019 16:54:35 +0000 (09:54 -0700)]
i965: disable repacking for compression for applicable gen

set bit15 (Disable Repacking for Compression) of CACHE_MODE_0 register
if the gen attribute, 'disable_ccs_repack' is set.

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
4 years agointel: add disable_ccs_repack to gen_device_info
Dongwon Kim [Thu, 27 Jun 2019 16:54:34 +0000 (09:54 -0700)]
intel: add disable_ccs_repack to gen_device_info

add a new attribute, 'disable_ccs_repack' to gen_device info, which
indicates whether repacking of components in certain pixel formats
before compression needs to be disabled to keep the compatibility
with decompression capability of display controller (gen11+)

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
4 years agointel/genxml: correct bit fields in CACHE_MODE_0 reg for gen11
Dongwon Kim [Thu, 27 Jun 2019 16:54:33 +0000 (09:54 -0700)]
intel/genxml: correct bit fields in CACHE_MODE_0 reg for gen11

correct bit fields information of CACHE_MODE_0 reg in current gen11.xml

Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
4 years agonir: print ptr_stride for deref_casts
Caio Marcelo de Oliveira Filho [Wed, 3 Jul 2019 19:10:43 +0000 (12:10 -0700)]
nir: print ptr_stride for deref_casts

Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agoanv: Advertise VK_EXT_shader_demote_to_helper_invocation
Caio Marcelo de Oliveira Filho [Sat, 8 Jun 2019 00:37:38 +0000 (17:37 -0700)]
anv: Advertise VK_EXT_shader_demote_to_helper_invocation

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
4 years agospirv: Implement SPV_EXT_demote_to_helper_invocation
Caio Marcelo de Oliveira Filho [Sat, 8 Jun 2019 06:08:04 +0000 (23:08 -0700)]
spirv: Implement SPV_EXT_demote_to_helper_invocation

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
4 years agospirv: Update the headers from latest Khronos master
Caio Marcelo de Oliveira Filho [Fri, 7 Jun 2019 23:41:04 +0000 (16:41 -0700)]
spirv: Update the headers from latest Khronos master

This corresponds to 29c11140baaf9f7fdaa39a583672c556bf1795a1 in
https://github.com/KhronosGroup/SPIRV-Headers.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
4 years agointel/fs: Implement "demote to helper invocation"
Caio Marcelo de Oliveira Filho [Sat, 8 Jun 2019 06:06:27 +0000 (23:06 -0700)]
intel/fs: Implement "demote to helper invocation"

The "demote" intrinsic works like "discard" but don't change the
control flow, allowing derivative operations to work.  This is the
semantics of D3D discard.

The "is_helper_invocation" intrinsic will return true for helper
invocations -- both the ones that started as helpers and the ones that
where demoted.  This is needed to avoid changing the behavior of
gl_HelperInvocation which is an input (so not expected to change
during shader execution).

v2: Emit the discard jump and comment why it is safe.  (Jason)
    Rework the is_helper_invocation() that was stomping f0.1.  (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
4 years agonir: Add demote and is_helper_invocation intrinsics
Caio Marcelo de Oliveira Filho [Sat, 8 Jun 2019 00:29:05 +0000 (17:29 -0700)]
nir: Add demote and is_helper_invocation intrinsics

From SPV_EXT_demote_to_helper_invocation.  Demote will be implemented
as a variant of discard, so mark uses_discard if it is used.

v2: Add CAN_ELIMINATE flag to the new intrinsic.  (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
4 years agoradv: do not emit VGT_FLUSH on GFX10
Samuel Pitoiset [Mon, 8 Jul 2019 11:45:08 +0000 (13:45 +0200)]
radv: do not emit VGT_FLUSH on GFX10

We don't need it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoac/nir: Remove now-unused interp_deref handling
Connor Abbott [Fri, 17 May 2019 14:31:17 +0000 (16:31 +0200)]
ac/nir: Remove now-unused interp_deref handling

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
4 years agoradeonsi/nir: Use NIR barycentric intrinsics
Connor Abbott [Tue, 14 May 2019 11:29:52 +0000 (13:29 +0200)]
radeonsi/nir: Use NIR barycentric intrinsics

This is simpler than radv, since the driver_location is already assigned
for us.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
4 years agoradeonsi/nir: Delete unreachable code
Connor Abbott [Mon, 13 May 2019 14:28:58 +0000 (16:28 +0200)]
radeonsi/nir: Delete unreachable code

We always get gl_FragCoord as a system value, not a varying, so this is
never hit. We already set PIXEL_CENTER_INTEGER elsewhere.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
4 years agocompiler: Add color system value
Connor Abbott [Mon, 27 May 2019 15:48:42 +0000 (17:48 +0200)]
compiler: Add color system value

This is nice to have with radeonsi, where color varyings are handled
specially to avoid recompiles.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
4 years agoradv: Use NIR barycentric intrinsics
Connor Abbott [Fri, 10 May 2019 08:44:20 +0000 (10:44 +0200)]
radv: Use NIR barycentric intrinsics

We have to add a few lowering to deal with things that used to be dealt
with inline when creating inputs. We also move the code that fills out
the radv_shader_variant_info struct for linking purposes to
radv_shader.c, as it's no longer tied to the NIR->LLVM lowering.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoac/nir: Implement barycentric intrinsics
Connor Abbott [Mon, 13 May 2019 08:55:07 +0000 (10:55 +0200)]
ac/nir: Implement barycentric intrinsics

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
4 years agointel/nir: Extract add_const_offset_to_base
Connor Abbott [Tue, 14 May 2019 10:10:11 +0000 (12:10 +0200)]
intel/nir: Extract add_const_offset_to_base

Pretty much every driver using nir_lower_io_to_temporaries followed by
nir_lower_io is going to want this. In particular, radv and radeonsi in
the next commits.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agonir/lower_io_to_temporaries: Handle interpolation intrinsics
Connor Abbott [Wed, 15 May 2019 16:48:25 +0000 (18:48 +0200)]
nir/lower_io_to_temporaries: Handle interpolation intrinsics

These weren't properly supported. This does pretty much the same thing
that the radv code did.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agonir: Avoid coalescing vars created by lower_io_to_temporaries
Connor Abbott [Wed, 15 May 2019 10:49:29 +0000 (12:49 +0200)]
nir: Avoid coalescing vars created by lower_io_to_temporaries

Right now nir_copy_prop_vars is effectively undoing
nir_lower_io_to_temporaries for inputs by propagating the original
variable through the copy created in lower_io_to_temporaries. A
theoretical variable coalescing pass would have the same issue with
output variables, although that doesn't exist yet. To fix this, add a
new bit to nir_variable, and disable copy propagation when it's set.

This doesn't seem to affect any drivers now, probably since since no one
uses lower_io_to_temporaries for inputs as well as copy_prop_vars, but
it will fix radv once we flip on lower_io_to_temporaries for fs inputs.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agonir: Return correct size in nir_assign_io_var_locations()
Connor Abbott [Fri, 17 May 2019 12:56:45 +0000 (14:56 +0200)]
nir: Return correct size in nir_assign_io_var_locations()

It was double-counting cases where multiple variables were assigned to
the same slot, and not handling the case where the last variable is a
compact variable.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agonir: Handle compact variables when assigning i/o locations
Connor Abbott [Tue, 14 May 2019 12:08:46 +0000 (14:08 +0200)]
nir: Handle compact variables when assigning i/o locations

These are used in Vulkan for clip/cull distances, instead of the GLSL
lowering when the clip/cull arrays are shared.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agonir: Move st_nir_assign_var_locations() to common code
Connor Abbott [Fri, 10 May 2019 08:18:12 +0000 (10:18 +0200)]
nir: Move st_nir_assign_var_locations() to common code

It isn't really doing anything Gallium-specific, and it's needed for
handling component packing, overlapping, etc.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
4 years agoradv: Make FragCoord a sysval
Connor Abbott [Mon, 13 May 2019 13:39:54 +0000 (15:39 +0200)]
radv: Make FragCoord a sysval

load_fragcoord is already handled in common code for radeonsi, so we
don't need to do anything to handle it. However, there were some passes
creating NIR with the varying, so we switch them over to the sysval. In
the case of nir_lower_input_attachments which is used by both radv and
anv, we add handling for both until intel switches to using a sysval.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agospirv: Add an option for making FragCoord a sysval
Connor Abbott [Mon, 13 May 2019 13:32:26 +0000 (15:32 +0200)]
spirv: Add an option for making FragCoord a sysval

On AMD, FragCoord should be a sysval because it is handled separately
from all the other inputs. We were already doing this in radeonsi, but
we weren't doing it with radv. It'll be much more annoying to handle
VARYING_SLOT_POS in fragment shaders when we let NIR lower FS inputs for
us, so here we add an option so that radv can get it as a system value.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv: Lower input attachments in NIR.
Daniel Schürmann [Fri, 5 Apr 2019 09:01:39 +0000 (11:01 +0200)]
radv: Lower input attachments in NIR.

v2 (Connor)
- Fix warning in release mode using MAYBE_UNUSED

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv: Implement nir_intrinsic_load_layer_id().
Daniel Schürmann [Fri, 5 Apr 2019 08:52:31 +0000 (10:52 +0200)]
radv: Implement nir_intrinsic_load_layer_id().

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
4 years agoanv,nir: Move lower_input_attachments pass from ANV to NIR.
Daniel Schürmann [Wed, 3 Apr 2019 15:29:20 +0000 (17:29 +0200)]
anv,nir: Move lower_input_attachments pass from ANV to NIR.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: don't emit PFP packets on ME.
Dave Airlie [Mon, 8 Jul 2019 06:53:27 +0000 (16:53 +1000)]
radv/gfx10: don't emit PFP packets on ME.

This was done for all previous GPUs.

This fixes Talos Principle launch hangs.

Fixes: 7e43022e8c8 (radv/gfx10: add gfx10_cs_emit_cache_flush)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoac: select the GFX ring when halting waves with UMR on GFX10
Samuel Pitoiset [Sun, 7 Jul 2019 17:32:29 +0000 (19:32 +0200)]
ac: select the GFX ring when halting waves with UMR on GFX10

GFX10 has two rings, so UMR want to know which one to halt.
Select the first one by default.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: Move NGG output handling outside of giant if-statement.
Bas Nieuwenhuizen [Sun, 7 Jul 2019 23:19:55 +0000 (01:19 +0200)]
radv/gfx10: Move NGG output handling outside of giant if-statement.

In merged shaders we put a big if around each shader, so both stages
can have a different number of threads. However, the NGG output code
still needs to run if the first shader is not executed.

This can happen when there are more gs threads than vs/es threads, or
when there are 0 es/vs threads (why? no clue).

Fixes: ee21bd7440c "radv/gfx10: implement NGG support (VS only)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agoradv: Actually use VK formats for the format table.
Bas Nieuwenhuizen [Sun, 7 Jul 2019 21:03:33 +0000 (23:03 +0200)]
radv: Actually use VK formats for the format table.

No ETC2 or ASTC on navi so nothing to add.

Fixes: 3dc5ec5d167 "radv/gfx10: generate gfx10_format_table.h"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agoanv: fix VkExternalBufferProperties for host allocation
Chia-I Wu [Sat, 6 Jul 2019 19:12:51 +0000 (12:12 -0700)]
anv: fix VkExternalBufferProperties for host allocation

It was reported as unsupported previously.  It should be importable
and is compatible with itself.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Fixes: 69cc6272fbc199 ("anv: Implement VK_EXT_external_memory_host")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: fix VkExternalBufferProperties for unsupported handles
Chia-I Wu [Sat, 6 Jul 2019 19:02:51 +0000 (12:02 -0700)]
anv: fix VkExternalBufferProperties for unsupported handles

compatibleHandleTypes must include the queried handle type.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoradv: Handle cmask being disallowed by addrlib.
Bas Nieuwenhuizen [Sun, 7 Jul 2019 19:24:17 +0000 (21:24 +0200)]
radv: Handle cmask being disallowed by addrlib.

alignment=0 does weird things with align64.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agoradv/gfx10: enable support for NAVI10, NAVI12 and NAVI14
Samuel Pitoiset [Tue, 25 Jun 2019 06:21:37 +0000 (08:21 +0200)]
radv/gfx10: enable support for NAVI10, NAVI12 and NAVI14

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: Use GS rectlist when needed.
Bas Nieuwenhuizen [Sat, 6 Jul 2019 10:30:31 +0000 (12:30 +0200)]
radv/gfx10: Use GS rectlist when needed.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
4 years agoradv/gfx10: implement NGG support (VS only)
Samuel Pitoiset [Fri, 5 Jul 2019 06:33:06 +0000 (08:33 +0200)]
radv/gfx10: implement NGG support (VS only)

This needs to be cleaned up a bit, and it probably contains
missing stuff and/or bugs.

This doesn't fix the "half of the triangles" issue.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv: Combine vs and tes output keys parts.
Bas Nieuwenhuizen [Sat, 6 Jul 2019 21:24:07 +0000 (23:24 +0200)]
radv: Combine vs and tes output keys parts.

That way the same deref is valid for both shader stages.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
4 years agoradv/gfx10: Use new uconfig reg index packet for GFX10+.
Bas Nieuwenhuizen [Sat, 6 Jul 2019 12:48:58 +0000 (14:48 +0200)]
radv/gfx10: Use new uconfig reg index packet for GFX10+.

Otherwise the hardware/firmware seems to not set the registers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
4 years agoradv/gfx10: Set MEM_ORDERED flags on shaders.
Bas Nieuwenhuizen [Sat, 6 Jul 2019 10:31:25 +0000 (12:31 +0200)]
radv/gfx10: Set MEM_ORDERED flags on shaders.

Scattered because depending on stage they are at offset 24/25/27/30.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
4 years agoradv/gfx10: emit GE_CNTL instead of IA_MULTI_VGT_PARAM for legacy mode
Samuel Pitoiset [Wed, 26 Jun 2019 07:42:35 +0000 (09:42 +0200)]
radv/gfx10: emit GE_CNTL instead of IA_MULTI_VGT_PARAM for legacy mode

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: double the number of tessellation offchip buffers per SE
Samuel Pitoiset [Wed, 26 Jun 2019 07:09:33 +0000 (09:09 +0200)]
radv/gfx10: double the number of tessellation offchip buffers per SE

Each gfx10 shader engine corresponds to two gfx9 shader engines, so scale
the number of offchip buffers accordingly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: require LLVM 9+
Samuel Pitoiset [Tue, 25 Jun 2019 06:21:15 +0000 (08:21 +0200)]
radv/gfx10: require LLVM 9+

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: disable geometry and tessellation shaders
Samuel Pitoiset [Fri, 5 Jul 2019 07:08:04 +0000 (09:08 +0200)]
radv/gfx10: disable geometry and tessellation shaders

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: disable binning
Samuel Pitoiset [Tue, 25 Jun 2019 09:50:13 +0000 (11:50 +0200)]
radv/gfx10: disable binning

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: disable CLEAR_STATE
Samuel Pitoiset [Tue, 25 Jun 2019 09:48:27 +0000 (11:48 +0200)]
radv/gfx10: disable CLEAR_STATE

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: disable VK_EXT_transform_feedback
Samuel Pitoiset [Tue, 25 Jun 2019 13:48:06 +0000 (15:48 +0200)]
radv/gfx10: disable VK_EXT_transform_feedback

It requires a bunch of work, so disable for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: set user data base registers
Samuel Pitoiset [Tue, 25 Jun 2019 15:45:25 +0000 (17:45 +0200)]
radv/gfx10: set user data base registers

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: add gfx10_cs_emit_cache_flush
Samuel Pitoiset [Tue, 25 Jun 2019 15:19:11 +0000 (17:19 +0200)]
radv/gfx10: add gfx10_cs_emit_cache_flush

The cache flush logic on GFX10 is quite different and it's
implemented with a new function.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: set the DCC constant encoding flag
Samuel Pitoiset [Tue, 25 Jun 2019 14:18:40 +0000 (16:18 +0200)]
radv/gfx10: set the DCC constant encoding flag

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: do not declare streamout SGPRS
Samuel Pitoiset [Fri, 5 Jul 2019 16:14:24 +0000 (18:14 +0200)]
radv/gfx10: do not declare streamout SGPRS

Streamout is completely different on GFX10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: do not set stream output shader config
Samuel Pitoiset [Tue, 25 Jun 2019 13:45:20 +0000 (15:45 +0200)]
radv/gfx10: do not set stream output shader config

Transform feedback is really different on GFX10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: emit VGT_VERTEX_REUSE_BLOCK_CNTL during gfx initialization
Samuel Pitoiset [Tue, 25 Jun 2019 13:38:44 +0000 (15:38 +0200)]
radv/gfx10: emit VGT_VERTEX_REUSE_BLOCK_CNTL during gfx initialization

The value doesn't need to be updated for tess.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: update shader-related fields in si_emit_graphics()
Samuel Pitoiset [Tue, 25 Jun 2019 12:48:08 +0000 (14:48 +0200)]
radv/gfx10: update shader-related fields in si_emit_graphics()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: implement si_emit_compute()
Samuel Pitoiset [Tue, 25 Jun 2019 12:42:06 +0000 (14:42 +0200)]
radv/gfx10: implement si_emit_compute()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: mask DCC tile swizzle by alignment
Samuel Pitoiset [Tue, 25 Jun 2019 12:28:10 +0000 (14:28 +0200)]
radv/gfx10: mask DCC tile swizzle by alignment

DCC alignment can be less than the alignment of the main surface. In that
case, the DCC tile swizzle needs to be masked accordingly. Should have no
impact on pre-gfx10.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: initialize GE_{MAX,MIN}_VTX_INDX/INDX_OFFSET
Samuel Pitoiset [Tue, 25 Jun 2019 12:13:36 +0000 (14:13 +0200)]
radv/gfx10: initialize GE_{MAX,MIN}_VTX_INDX/INDX_OFFSET

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: implement radv_flush_vertex_descriptors() change
Samuel Pitoiset [Fri, 5 Jul 2019 14:09:03 +0000 (16:09 +0200)]
radv/gfx10: implement radv_flush_vertex_descriptors() change

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: implement fill_geom_tess_rings()
Samuel Pitoiset [Tue, 25 Jun 2019 12:10:42 +0000 (14:10 +0200)]
radv/gfx10: implement fill_geom_tess_rings()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: implement radv_CmdBindDescriptorSets()
Samuel Pitoiset [Tue, 25 Jun 2019 12:02:52 +0000 (14:02 +0200)]
radv/gfx10: implement radv_CmdBindDescriptorSets()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: implement write_buffer_descriptor()
Samuel Pitoiset [Tue, 25 Jun 2019 12:00:04 +0000 (14:00 +0200)]
radv/gfx10: implement write_buffer_descriptor()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: use the correct register for image descriptor dumping
Samuel Pitoiset [Tue, 25 Jun 2019 11:56:22 +0000 (13:56 +0200)]
radv/gfx10: use the correct register for image descriptor dumping

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: implement radv_pipeline_generate_hw_hs()
Samuel Pitoiset [Tue, 25 Jun 2019 11:17:51 +0000 (13:17 +0200)]
radv/gfx10: implement radv_pipeline_generate_hw_hs()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: implement radv_fill_shader_variant()
Samuel Pitoiset [Tue, 25 Jun 2019 11:33:03 +0000 (13:33 +0200)]
radv/gfx10: implement radv_fill_shader_variant()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: implement radv_pipeline_generate_geometry_shader()
Samuel Pitoiset [Tue, 25 Jun 2019 11:25:32 +0000 (13:25 +0200)]
radv/gfx10: implement radv_pipeline_generate_geometry_shader()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: implement radv_init_sampler()
Samuel Pitoiset [Tue, 25 Jun 2019 10:32:01 +0000 (12:32 +0200)]
radv/gfx10: implement radv_init_sampler()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: fix PS exports for SPI_SHADER_32_AR
Samuel Pitoiset [Tue, 25 Jun 2019 10:16:39 +0000 (12:16 +0200)]
radv/gfx10: fix PS exports for SPI_SHADER_32_AR

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: implement radv_get_device_name()
Samuel Pitoiset [Tue, 25 Jun 2019 10:11:42 +0000 (12:11 +0200)]
radv/gfx10: implement radv_get_device_name()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: set RADV_FORCE_FAMILY
Samuel Pitoiset [Tue, 25 Jun 2019 10:09:41 +0000 (12:09 +0200)]
radv/gfx10: set RADV_FORCE_FAMILY

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: fix a possible hang with exp pos0 with done=0 and exec=0
Samuel Pitoiset [Tue, 25 Jun 2019 10:05:35 +0000 (12:05 +0200)]
radv/gfx10: fix a possible hang with exp pos0 with done=0 and exec=0

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: set PA_SC_TILE_STEERING_OVERRIDE
Samuel Pitoiset [Tue, 25 Jun 2019 10:01:40 +0000 (12:01 +0200)]
radv/gfx10: set PA_SC_TILE_STEERING_OVERRIDE

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: set cache control registers
Samuel Pitoiset [Tue, 25 Jun 2019 09:59:53 +0000 (11:59 +0200)]
radv/gfx10: set cache control registers

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: set llvm_has_working_vgpr_indexing
Samuel Pitoiset [Tue, 25 Jun 2019 09:24:48 +0000 (11:24 +0200)]
radv/gfx10: set llvm_has_working_vgpr_indexing

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: update DB_DFSM_CONTROL register
Samuel Pitoiset [Tue, 25 Jun 2019 08:53:17 +0000 (10:53 +0200)]
radv/gfx10: update DB_DFSM_CONTROL register

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>