mesa.git
6 years ago spirv: Rework logging
Jason Ekstrand [Wed, 16 Aug 2017 23:04:08 +0000 (16:04 -0700)]
spirv: Rework logging

This commit reworks the way that logging works in SPIR-V to provide
richer and more detailed logging infrastructure.  This commit contains
several improvements over the old mechanism:

 1) Log messages are now more detailed.  They contain the SPIR-V byte
    offset as well as source language information from OpSource and
    OpLine.

 2) There is now a logging callback mechanism so that errors can get
    propagated to the client through debug callback extensions.
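
As a rough illustration of the two points above, here is a minimal sketch of
a parser that tags each message with the SPIR-V byte offset and hands it to a
client callback.  Every name in it (struct parser, parser_log, log_func,
remember_message) is hypothetical and is not taken from the Mesa sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* All names in this sketch are hypothetical, not Mesa's real API. */
enum log_level { LOG_INFO, LOG_WARN, LOG_ERROR };

typedef void (*log_func)(void *priv, enum log_level level,
                         size_t spirv_offset, const char *message);

struct parser {
    const uint32_t *words;   /* start of the SPIR-V binary */
    size_t word_count;
    const char *file;        /* as recorded from OpSource/OpLine, may be NULL */
    int line;
    log_func log;            /* client callback, may be NULL */
    void *log_priv;
};

/* Report a message about the word at `at`, tagged with its byte offset into
 * the binary and, when known, the source file and line. */
static void parser_log(const struct parser *p, enum log_level level,
                       const uint32_t *at, const char *text)
{
    assert(at >= p->words && at <= p->words + p->word_count);
    size_t offset = (size_t)(at - p->words) * sizeof(uint32_t);

    char buf[256];
    if (p->file)
        snprintf(buf, sizeof(buf), "%s (%s:%d)", text, p->file, p->line);
    else
        snprintf(buf, sizeof(buf), "%s", text);

    if (p->log)
        p->log(p->log_priv, level, offset, buf);
    else
        fprintf(stderr, "SPIR-V offset %zu: %s\n", offset, buf);
}

/* Example client callback: remembers the last message it was handed. */
struct last_message { size_t offset; enum log_level level; char text[256]; };

static void remember_message(void *priv, enum log_level level,
                             size_t spirv_offset, const char *message)
{
    struct last_message *m = priv;
    m->offset = spirv_offset;
    m->level = level;
    snprintf(m->text, sizeof(m->text), "%s", message);
}
```

A client would set log and log_priv before parsing; with no callback set, the
sketch falls back to stderr.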

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
6 years ago spirv: Re-arrange vtn_builder initialization
Jason Ekstrand [Wed, 16 Aug 2017 23:10:23 +0000 (16:10 -0700)]
spirv: Re-arrange vtn_builder initialization

This simply moves allocating the vtn_builder and initializing it to the
very beginning before we even parse the header.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
6 years ago spirv: Parent the nir_shader to the builder while building
Jason Ekstrand [Wed, 16 Aug 2017 15:43:08 +0000 (08:43 -0700)]
spirv: Parent the nir_shader to the builder while building

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
6 years ago freedreno: mark stencil buffer valid too in case of z32x24s8
Rob Clark [Mon, 4 Dec 2017 16:01:52 +0000 (11:01 -0500)]
freedreno: mark stencil buffer valid too in case of z32x24s8

The separate stencil buffer was not also getting marked as valid if
written by a draw/clear, resulting in gmem2mem getting skipped.  Move
this into fd_batch_resource_used() which also handles the separate
stencil case.

Also fix restore_buffers typo.

Fixes: 4ab6ab80365 freedreno: avoid mem2gmem for invalidated buffers
Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: remove use of u_transfer
Rob Clark [Mon, 4 Dec 2017 14:02:07 +0000 (09:02 -0500)]
freedreno: remove use of u_transfer

Freedreno doesn't treat buffers and images differently, so its use was
kind of pointless.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: add -Wno-packed-bitfield-compat for meson build
Eric Engestrom [Mon, 4 Dec 2017 13:40:54 +0000 (08:40 -0500)]
freedreno: add -Wno-packed-bitfield-compat for meson build

Otherwise we get a huge amount of warning spam from instr-a2xx.h.  gcc has
no way to know that freedreno was never built with a gcc version old
enough for the bugs in old gcc to matter ;-)

Reported-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
[added commit message]
Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago glsl: don't run intrastage array validation when the interface type is not an array
Samuel Iglesias Gonsálvez [Thu, 9 Nov 2017 10:15:03 +0000 (11:15 +0100)]
glsl: don't run intrastage array validation when the interface type is not an array

We validate that the definitions of an interface block array type match.
However, the function could previously be called when a non-array
interface block had different type definitions (for example, when the
precision qualifier differs in a GLSL ES shader, we would create two
different types), and it would return invalid because both definitions
are non-arrays.

We fix this by specifying that at least one definition should be an
array to call the validation.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago glsl/es: precision qualifier doesn't need to match in UBOs
Samuel Iglesias Gonsálvez [Thu, 9 Nov 2017 08:58:25 +0000 (09:58 +0100)]
glsl/es: precision qualifier doesn't need to match in UBOs

They might mismatch due to the two shaders using different GLSL
versions, and that's ok in desktop GL. In ES, precision qualifiers
don't need to match.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago nvc0/ir: Properly lower 64-bit shifts when the shift value is >32
Pierre Moreau [Sun, 3 Dec 2017 20:28:57 +0000 (21:28 +0100)]
nvc0/ir: Properly lower 64-bit shifts when the shift value is >32

Fixes: 61d7676df77 "nvc0/ir: add support for 64-bit shift lowering on SM20/SM30"
Fixes fs-shift-scalar-by-scalar.shader_test from piglit for the current
set-up:

uniform int64_t ival -0x7dfcfefbdf6536ff # bit pattern: 0x82030104209ac901
uniform uint64_t uval 0x1400000085010203
uniform int shl 36
uniform int shr 36
uniform int64_t iexpected_shl 0x09ac901000000000
uniform int64_t iexpected_shr -0x7dfcff0 # bit pattern: 0xfffffffff8203010
uniform uint64_t uexpected_shl 0x5010203000000000
uniform uint64_t uexpected_shr 0x0000000001400000
draw rect ortho 12 0 4 4
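
For readers unfamiliar with the lowering, the following is a generic sketch
of how 64-bit shifts can be built from 32-bit halves, including the case
where the shift amount is 32 or more.  It is not the nvc0 codegen; the
helper names (pair32, shl64, shr64, asr32) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, as on hardware without native
 * 64-bit shifts.  All names here are illustrative only. */
struct pair32 { uint32_t lo, hi; };

static struct pair32 split64(uint64_t v)
{
    struct pair32 p = { (uint32_t)v, (uint32_t)(v >> 32) };
    return p;
}

static uint64_t join64(struct pair32 p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* Arithmetic right shift of a 32-bit word using only unsigned operations;
 * valid for s in [0, 31]. */
static uint32_t asr32(uint32_t x, unsigned s)
{
    uint32_t fill = (x & 0x80000000u) ? ~(0xffffffffu >> s) : 0;
    return (x >> s) | fill;
}

static struct pair32 shl64(struct pair32 v, unsigned s)
{
    assert(s < 64);
    struct pair32 r;
    if (s == 0)
        return v;
    if (s >= 32) {
        /* Everything comes from the low word; the low result is zero. */
        r.lo = 0;
        r.hi = v.lo << (s - 32);
    } else {
        r.lo = v.lo << s;
        r.hi = (v.hi << s) | (v.lo >> (32 - s));
    }
    return r;
}

static struct pair32 shr64(struct pair32 v, unsigned s, int is_signed)
{
    assert(s < 64);
    struct pair32 r;
    if (s == 0)
        return v;
    if (s >= 32) {
        /* Everything comes from the high word; the high result is either
         * zero or a copy of the sign bit. */
        r.lo = is_signed ? asr32(v.hi, s - 32) : v.hi >> (s - 32);
        r.hi = (is_signed && (v.hi & 0x80000000u)) ? 0xffffffffu : 0;
    } else {
        r.lo = (v.lo >> s) | (v.hi << (32 - s));
        r.hi = is_signed ? asr32(v.hi, s) : v.hi >> s;
    }
    return r;
}
```

Feeding the uniforms listed above through these helpers with a shift of 36
reproduces the expected bit patterns.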

Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years ago glsl: Match order of gl_LightSourceParameters elements.
Fabian Bieler [Thu, 23 Nov 2017 20:48:00 +0000 (13:48 -0700)]
glsl: Match order of gl_LightSourceParameters elements.

spotExponent and spotCosCutoff were swapped in the
gl_builtin_uniform_element struct.
Now the order matches across gl_builtin_uniform_element,
glsl_struct_field and the spec.

Reviewed-by: Brian Paul <brianp@vmware.com>
6 years ago glsl: Fix gl_NormalScale.
Fabian Bieler [Thu, 23 Nov 2017 20:48:00 +0000 (13:48 -0700)]
glsl: Fix gl_NormalScale.

GLSL shaders can access the normal scale factor with the built-in
gl_NormalScale.  Mesa's modelspace lighting optimization uses a different
normal scale factor than defined in the spec.  We have to take care not
to use this factor for gl_NormalScale.

Mesa already defines two separate states: state.normalScale and
state.internal.normalScale.  The first is used by the glsl compiler
while the latter is used by the fixed function T&L pipeline.  Previously
the only difference was some component swizzling.  With this commit
state.normalScale always uses the normal scale factor for eyespace
lighting.

Reviewed-by: Brian Paul <brianp@vmware.com>
6 years ago st/glsl_to_nir/radeonsi: enable gs support for nir backend
Timothy Arceri [Fri, 10 Nov 2017 10:33:37 +0000 (21:33 +1100)]
st/glsl_to_nir/radeonsi: enable gs support for nir backend

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago ac: add si_nir_load_input_gs() to the abi
Timothy Arceri [Wed, 8 Nov 2017 03:20:23 +0000 (14:20 +1100)]
ac: add si_nir_load_input_gs() to the abi

V2: make use of driver_location and don't expose NIR to the ABI.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago ac: move build_varying_gather_values() to ac_llvm_build.h and expose
Timothy Arceri [Fri, 10 Nov 2017 02:55:48 +0000 (13:55 +1100)]
ac: move build_varying_gather_values() to ac_llvm_build.h and expose

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago ac: add basic nir -> llvm type helper
Timothy Arceri [Fri, 10 Nov 2017 02:47:50 +0000 (13:47 +1100)]
ac: add basic nir -> llvm type helper

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago radeonsi: create si_llvm_load_input_gs()
Timothy Arceri [Tue, 7 Nov 2017 10:13:37 +0000 (21:13 +1100)]
radeonsi: create si_llvm_load_input_gs()

This creates a common function that can be shared by the tgsi
and nir backends.

v2: use LLVMBuildBitCast() directly

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago radeonsi: pass llvm type to lds_load()
Timothy Arceri [Tue, 7 Nov 2017 10:42:10 +0000 (21:42 +1100)]
radeonsi: pass llvm type to lds_load()

v2: use LLVMBuildBitCast() directly

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago radeonsi: add llvm_type_is_64bit() helper
Timothy Arceri [Tue, 7 Nov 2017 10:41:27 +0000 (21:41 +1100)]
radeonsi: add llvm_type_is_64bit() helper

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago radeonsi: pass llvm type to si_llvm_emit_fetch_64bit()
Timothy Arceri [Tue, 7 Nov 2017 10:27:18 +0000 (21:27 +1100)]
radeonsi: pass llvm type to si_llvm_emit_fetch_64bit()

v2: use LLVMBuildBitCast() directly

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago radeonsi: add nir support for gs epilogue
Timothy Arceri [Thu, 9 Nov 2017 02:34:37 +0000 (13:34 +1100)]
radeonsi: add nir support for gs epilogue

v2: add emit_gs_epilogue() helper function to reduce duplication.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago radeonsi: add nir support for es epilogue
Timothy Arceri [Mon, 6 Nov 2017 11:35:50 +0000 (22:35 +1100)]
radeonsi: add nir support for es epilogue

v2: make use of existing si_tgsi_emit_epilogue()

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago radeonsi: add nir support for ls epilogue
Timothy Arceri [Thu, 2 Nov 2017 04:02:55 +0000 (15:02 +1100)]
radeonsi: add nir support for ls epilogue

v2: make use of existing si_tgsi_emit_epilogue()

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago st/glsl_to_nir: add gs support to st_nir_assign_var_locations()
Timothy Arceri [Wed, 22 Nov 2017 06:37:32 +0000 (17:37 +1100)]
st/glsl_to_nir: add gs support to st_nir_assign_var_locations()

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago st/glsl_to_nir: use nir_lower_io_arrays_to_elements() to lower arrays
Timothy Arceri [Wed, 15 Nov 2017 03:36:22 +0000 (14:36 +1100)]
st/glsl_to_nir: use nir_lower_io_arrays_to_elements() to lower arrays

This pass is more fully featured: it supports geom and tess shaders as
well as interpolation intrinsics.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago nir: allow builtin arrays to be lowered
Timothy Arceri [Wed, 15 Nov 2017 03:30:22 +0000 (14:30 +1100)]
nir: allow builtin arrays to be lowered

Gallium's NIR drivers expect this to be done.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago nir: add array lowering function that assumes there are no indirects
Timothy Arceri [Wed, 15 Nov 2017 03:28:01 +0000 (14:28 +1100)]
nir: add array lowering function that assumes there are no indirects

The gallium glsl->nir pass currently lowers away all indirects on both inputs
and outputs. This function allows us to lower vs inputs and fs outputs, and
also to lower things one stage at a time, as we don't need to worry about
indirects on the other side of the shader's interface.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago radv: enable nir varying array splitting
Timothy Arceri [Mon, 30 Oct 2017 00:58:52 +0000 (11:58 +1100)]
radv: enable nir varying array splitting

Acked-by: Dave Airlie <airlied@redhat.com>
6 years ago st/glsl_to_nir: enable NIR link time opts
Timothy Arceri [Mon, 20 Nov 2017 06:20:35 +0000 (17:20 +1100)]
st/glsl_to_nir: enable NIR link time opts

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago radeonsi/nir: add support for packed inputs
Timothy Arceri [Fri, 17 Nov 2017 06:04:22 +0000 (17:04 +1100)]
radeonsi/nir: add support for packed inputs

Because NIR can create non vec4 variables when implementing component
packing we need to make sure not to reprocess the same slot again.

Also we can drop the fs_attr_idx counter and just use driver_location.
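
The "don't reprocess the same slot" idea can be sketched with a bitmask keyed
on driver_location.  This is only an illustration under assumed names
(struct input_var, declare_inputs); it is not the radeonsi code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an input variable after component packing:
 * several variables may share one vec4 slot (driver_location). */
struct input_var {
    unsigned driver_location;
    unsigned num_components;
};

/* Visit each slot once, however many packed variables live in it.
 * Returns the number of distinct slots that were declared. */
static unsigned declare_inputs(const struct input_var *vars, size_t count)
{
    uint64_t processed = 0;   /* one bit per slot already handled */
    unsigned declared = 0;

    for (size_t i = 0; i < count; i++) {
        assert(vars[i].driver_location < 64);
        uint64_t bit = UINT64_C(1) << vars[i].driver_location;
        if (processed & bit)
            continue;         /* slot already set up by an earlier variable */
        processed |= bit;
        declared++;           /* a real backend would emit the load here */
    }
    return declared;
}
```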

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago st/glsl_to_nir: move some calls out of st_glsl_to_nir_post_opts()
Timothy Arceri [Fri, 17 Nov 2017 06:45:32 +0000 (17:45 +1100)]
st/glsl_to_nir: move some calls out of st_glsl_to_nir_post_opts()

NIR component packing will be inserted between these calls and the
calling of st_glsl_to_nir_post_opts().

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago st/glsl_to_nir: call some lowering passes earlier
Timothy Arceri [Tue, 14 Nov 2017 01:56:20 +0000 (12:56 +1100)]
st/glsl_to_nir: call some lowering passes earlier

This is required so that we can enable NIR linking optimisations.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago st/glsl_to_nir: add basic NIR opt loop helper
Timothy Arceri [Mon, 13 Nov 2017 23:13:58 +0000 (10:13 +1100)]
st/glsl_to_nir: add basic NIR opt loop helper

We need to be able to do these NIR opts in the state tracker
rather than the driver in order for the NIR linking opts to
be useful.
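
An opt loop of this kind is usually a fixed-point iteration: run every pass,
and repeat while any pass reports progress.  The sketch below shows that
shape with toy passes; struct shader, run_opt_loop and both passes are
hypothetical and do not correspond to real NIR passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct shader { int instr_count; };   /* hypothetical stand-in for the IR */

typedef bool (*opt_pass)(struct shader *s);

/* Run all passes repeatedly until a full round makes no progress.
 * Returns the number of rounds executed. */
static unsigned run_opt_loop(struct shader *s, const opt_pass *passes,
                             size_t count)
{
    assert(s && passes);
    unsigned rounds = 0;
    bool progress;

    do {
        progress = false;
        for (size_t i = 0; i < count; i++)
            progress |= passes[i](s);   /* every pass runs in every round */
        rounds++;
    } while (progress);

    return rounds;
}

/* Toy passes: each pretends to delete instructions and reports progress. */
static bool remove_pairs(struct shader *s)
{
    if (s->instr_count >= 10) {
        s->instr_count -= 2;
        return true;
    }
    return false;
}

static bool remove_odd_one(struct shader *s)
{
    if (s->instr_count % 2) {
        s->instr_count--;
        return true;
    }
    return false;
}
```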

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago st/glsl_to_nir: make st_glsl_to_nir() static
Timothy Arceri [Mon, 13 Nov 2017 23:06:47 +0000 (10:06 +1100)]
st/glsl_to_nir: make st_glsl_to_nir() static

Here we also move the extern C functions to the bottom of the file.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago st/glsl_to_nir: split the st_glsl_to_nir() function in two
Timothy Arceri [Mon, 13 Nov 2017 22:15:54 +0000 (09:15 +1100)]
st/glsl_to_nir: split the st_glsl_to_nir() function in two

We want to be able to generate NIR then apply NIR optimisations.
Once the optimisations are done we can then apply the new post opt
function which assigns uniforms etc based on the optimised IR.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago st/glsl_to_nir: create set_st_program() helper
Timothy Arceri [Mon, 13 Nov 2017 22:03:45 +0000 (09:03 +1100)]
st/glsl_to_nir: create set_st_program() helper

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago st/glsl: move nir linking loop to new function st_link_nir()
Timothy Arceri [Thu, 9 Nov 2017 06:32:08 +0000 (17:32 +1100)]
st/glsl: move nir linking loop to new function st_link_nir()

This will allow us to refactor linking and include some nir link
time optimisations.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago nir: fix support for scalar arrays in nir_lower_io_types()
Timothy Arceri [Fri, 17 Nov 2017 03:27:27 +0000 (14:27 +1100)]
nir: fix support for scalar arrays in nir_lower_io_types()

This was just recreating the same vector type we already had and
hitting an assert for scalars.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago st/glsl_to_nir: add st_nir_assign_var_locations() helper
Timothy Arceri [Tue, 21 Nov 2017 00:53:58 +0000 (11:53 +1100)]
st/glsl_to_nir: add st_nir_assign_var_locations() helper

This avoids packed varyings being assigned different driver locations.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago radv: enable nir component packing
Timothy Arceri [Wed, 18 Oct 2017 02:48:41 +0000 (13:48 +1100)]
radv: enable nir component packing

SaschaWillems Vulkan demo tessellation:

~4000fps -> ~4600fps

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years ago nir: add varying component packing helpers
Timothy Arceri [Wed, 18 Oct 2017 08:40:06 +0000 (19:40 +1100)]
nir: add varying component packing helpers

v2: update shader info input/output masks when packing components
v3: make sure interpolation loc matches, this is required for the
    radeonsi NIR backend.
v4: 33dca36f4f28 fixed nir_gather_info to update outputs_read
    correctly; make sure we also adjust it correctly when
    packing components.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)
6 years ago nir: add varying array splitting pass
Timothy Arceri [Mon, 23 Oct 2017 04:51:29 +0000 (15:51 +1100)]
nir: add varying array splitting pass

V2:
 - fix matrix support, non-array matrices were being skipped in v1

v3:
 - handle lowering of tcs output loads correctly
 - correctly mark indirect locations for either in or out not both
   when processing a stage.
 - use nir_src_copy() when lowering stores.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago freedreno/ir3: relax barriers
Rob Clark [Sun, 3 Dec 2017 16:50:09 +0000 (11:50 -0500)]
freedreno/ir3: relax barriers

Instructions with no barrier_class can move wrt. an EVERYTHING barrier.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno/ir3: all mem instructions have WAR hazard
Rob Clark [Sun, 3 Dec 2017 16:48:56 +0000 (11:48 -0500)]
freedreno/ir3: all mem instructions have WAR hazard

It isn't just load instructions that have a write-after-read hazard.

Fixes stk gaussian blur compute shaders.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: add debug option to force emulated indirect
Rob Clark [Wed, 29 Nov 2017 20:06:39 +0000 (15:06 -0500)]
freedreno: add debug option to force emulated indirect

Useful mostly for debugging indirect draw.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: also mark draw-indirect buffer as read
Rob Clark [Wed, 29 Nov 2017 14:04:08 +0000 (09:04 -0500)]
freedreno: also mark draw-indirect buffer as read

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: small cleanups
Rob Clark [Mon, 20 Nov 2017 20:33:54 +0000 (15:33 -0500)]
freedreno: small cleanups

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: avoid unnecessary batch flush
Rob Clark [Sun, 19 Nov 2017 21:45:04 +0000 (16:45 -0500)]
freedreno: avoid unnecessary batch flush

In some cases we can end up trying to add a write dependency on ourself,
which shouldn't trigger a flush.

Avoids an extra couple of flushes per frame in stk.
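
A minimal sketch of the check, assuming a resource remembers the batch that
last wrote it.  The types and functions below (struct batch, struct resource,
batch_write_resource) are simplified placeholders, not the freedreno ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified placeholders, not the real freedreno structures. */
struct batch { int flush_count; };
struct resource { struct batch *write_batch; };

static void batch_flush(struct batch *b)
{
    b->flush_count++;
}

/* Record that batch `b` writes `rsc`.  A pending writer must be flushed
 * first, unless that writer is `b` itself: depending on ourselves is not a
 * reason to flush.  Returns true if another batch had to be flushed. */
static bool batch_write_resource(struct batch *b, struct resource *rsc)
{
    assert(b && rsc);
    struct batch *prev = rsc->write_batch;
    bool flushed = false;

    if (prev && prev != b) {
        batch_flush(prev);
        flushed = true;
    }
    rsc->write_batch = b;
    return flushed;
}
```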

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: avoid mem2gmem for invalidated buffers
Rob Clark [Sun, 19 Nov 2017 17:50:50 +0000 (12:50 -0500)]
freedreno: avoid mem2gmem for invalidated buffers

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: deferred flush support
Rob Clark [Sun, 19 Nov 2017 16:42:25 +0000 (11:42 -0500)]
freedreno: deferred flush support

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: rework fence tracking
Rob Clark [Sun, 19 Nov 2017 15:36:19 +0000 (10:36 -0500)]
freedreno: rework fence tracking

ctx->last_fence isn't such a terribly clever idea, if batches can be
flushed out of order.  Instead, each batch now holds a fence, which is
created before the batch is flushed (useful for next patch), that later
gets populated after the batch is actually flushed.
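
The per-batch fence idea might look roughly like the sketch below: the fence
object exists as soon as the batch does, so it can be handed out early, and
it is only filled in once the batch is really flushed.  All names are
hypothetical; a real driver would also reference-count the fence so that it
can outlive its batch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified types; not the freedreno structures. */
struct fence {
    bool submitted;       /* false until the owning batch is flushed */
    uint32_t timestamp;   /* only meaningful once submitted */
};

struct batch {
    struct fence *fence;  /* created together with the batch */
};

static struct batch *batch_create(void)
{
    struct batch *b = calloc(1, sizeof(*b));
    if (!b)
        return NULL;
    b->fence = calloc(1, sizeof(*b->fence));
    if (!b->fence) {
        free(b);
        return NULL;
    }
    return b;
}

/* Flushing populates the fence that callers may already be holding. */
static void batch_flush(struct batch *b, uint32_t kernel_timestamp)
{
    assert(b && b->fence && !b->fence->submitted);
    b->fence->timestamp = kernel_timestamp;
    b->fence->submitted = true;
}

static void batch_destroy(struct batch *b)
{
    if (b) {
        free(b->fence);
        free(b);
    }
}
```

Because each batch owns its fence, flushing batches out of order does not
confuse which fence belongs to which work.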

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno: proper locking for iterating dependent batches
Rob Clark [Mon, 20 Nov 2017 14:52:04 +0000 (09:52 -0500)]
freedreno: proper locking for iterating dependent batches

In transfer_map(), when we need to flush batches that read from a
resource, we should be holding screen->lock to guard against race
conditions.  Somehow deferred flush seems to make this existing
race more obvious.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago freedreno/a5xx: correct max_indicies for indirect draws
Rob Clark [Wed, 22 Nov 2017 14:45:28 +0000 (09:45 -0500)]
freedreno/a5xx: correct max_indicies for indirect draws

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years ago spirv: Convert the supported_extensions struct to spirv_options
Jason Ekstrand [Thu, 19 Oct 2017 00:28:19 +0000 (17:28 -0700)]
spirv: Convert the supported_extensions struct to spirv_options

This is a bit more general and lets us pass additional options into the
spirv_to_nir pass beyond what capabilities we support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
6 years ago spirv: Only emit functions which are actually used
Jason Ekstrand [Thu, 19 Oct 2017 17:11:22 +0000 (10:11 -0700)]
spirv: Only emit functions which are actually used

Instead of emitting absolutely everything, just emit the few functions
that are actually referenced in some way by the entrypoint.  This should
save us quite a bit of time when handed large shader modules containing
many entrypoints.
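
One common way to find "functions referenced by the entrypoint" is a worklist
walk over the call graph.  The sketch below assumes the call graph is
available as an adjacency matrix; struct module and mark_reachable are
invented for this example and are not the spirv_to_nir data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_FUNCS 32

/* Hypothetical module: calls[f][g] is true when function f calls g. */
struct module {
    unsigned func_count;
    bool calls[MAX_FUNCS][MAX_FUNCS];
};

/* Mark every function reachable from `entry` and return how many there
 * are.  Only functions with used[i] == true would need to be emitted. */
static unsigned mark_reachable(const struct module *m, unsigned entry,
                               bool used[MAX_FUNCS])
{
    assert(m->func_count <= MAX_FUNCS && entry < m->func_count);

    unsigned worklist[MAX_FUNCS];
    unsigned pending = 0, found = 0;

    memset(used, 0, MAX_FUNCS * sizeof(bool));
    used[entry] = true;
    worklist[pending++] = entry;

    while (pending) {
        unsigned f = worklist[--pending];
        found++;
        for (unsigned g = 0; g < m->func_count; g++) {
            if (m->calls[f][g] && !used[g]) {
                used[g] = true;   /* each function is queued at most once */
                worklist[pending++] = g;
            }
        }
    }
    return found;
}
```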

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
6 years ago spirv: Drop the impl field from vtn_builder
Jason Ekstrand [Thu, 19 Oct 2017 16:56:22 +0000 (09:56 -0700)]
spirv: Drop the impl field from vtn_builder

We have a nir_builder and it has an impl field.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
6 years ago i965: Serialize nir later in the linking process
Jordan Justen [Fri, 1 Dec 2017 01:48:57 +0000 (17:48 -0800)]
i965: Serialize nir later in the linking process

Fixes MESA_GLSL=cache_fb with piglit
tests/spec/glsl-1.50/execution/geometry/clip-distance-vs-gs-out.shader_test

Fixes: 0610a624a12 i965/link: Serialize program to nir after linking for shader cache
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103988
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years ago configure: avoid testing for negative compiler options
Marc Dietrich [Wed, 29 Nov 2017 21:25:05 +0000 (22:25 +0100)]
configure: avoid testing for negative compiler options

gcc seems to always accept unsupported negative compiler warning options:

echo "int i;" | gcc -c -xc -Wno-bob - # no error
echo "int i;" | gcc -c -xc -Walice -  # unsupported compiler option

Inverting the options fixes the tests.

V2: fix options in meson build

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Marc Dietrich <marvin24@gmx.de>
6 years ago broadcom/vc4: Use a single-entry cached last_hindex value.
Eric Anholt [Wed, 29 Nov 2017 00:17:16 +0000 (16:17 -0800)]
broadcom/vc4: Use a single-entry cached last_hindex value.

Since almost all BOs will be in one CL at a time, this cache will almost
always hit except for the first usage of the BO in each CL.

This didn't show up as statistically significant on the minetest trace
(n=340), but if I lop off the throttled lobe of the bimodal distribution,
it very clearly does (0.74731% +/- 0.162093%, n=269).
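
A single-entry cache of this kind can be sketched as follows: each BO
remembers the index it last had in a handle table, and a lookup first checks
whether that slot still holds the BO's handle before falling back to a
linear search.  The structures and the cl_hindex name are simplified
inventions for illustration, not the vc4 driver's own.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_HANDLES 64

/* Simplified, hypothetical structures. */
struct bo {
    uint32_t handle;
    uint32_t last_hindex;   /* index this BO had the last time it was used */
};

struct cl {
    uint32_t handles[MAX_HANDLES];
    uint32_t count;
    uint32_t scans;         /* how often the slow path ran (for the demo) */
};

/* Return the index of `bo` in the CL's handle table, adding it if needed. */
static uint32_t cl_hindex(struct cl *cl, struct bo *bo)
{
    /* Fast path: the cached index is still valid for this CL. */
    if (bo->last_hindex < cl->count &&
        cl->handles[bo->last_hindex] == bo->handle)
        return bo->last_hindex;

    /* Slow path: linear search, then append if the BO is new to this CL. */
    cl->scans++;
    uint32_t i;
    for (i = 0; i < cl->count; i++) {
        if (cl->handles[i] == bo->handle)
            break;
    }
    if (i == cl->count) {
        assert(cl->count < MAX_HANDLES);
        cl->handles[cl->count++] = bo->handle;
    }
    bo->last_hindex = i;
    return i;
}
```

Only the first use of a BO in a given CL takes the slow path; later uses hit
the cached index.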

6 years ago broadcom/vc4: Decompose single QUADs to a TRIANGLE_FAN.
Eric Anholt [Sat, 25 Nov 2017 06:34:12 +0000 (22:34 -0800)]
broadcom/vc4: Decompose single QUADs to a TRIANGLE_FAN.

No significant difference in the minetest replay, but it should reduce
overhead by not requiring that we write quad indices to index buffers that
we repeatedly re-upload (and making the draw packet smaller, as well).

Over the course of the series the actual game seems to be up by 1-2 fps.

6 years ago broadcom/vc4: Use the new enum functionality of the XML to decode better.
Eric Anholt [Sat, 25 Nov 2017 06:20:21 +0000 (22:20 -0800)]
broadcom/vc4: Use the new enum functionality of the XML to decode better.

6 years ago broadcom/vc4: Skip emitting redundant VC4_PACKET_GEM_HANDLES.
Eric Anholt [Sat, 25 Nov 2017 06:15:28 +0000 (22:15 -0800)]
broadcom/vc4: Skip emitting redundant VC4_PACKET_GEM_HANDLES.

Now that there's only one user of it, it's pretty obvious how to avoid
emitting redundant ones.  This should save a bunch of kernel validation
overhead.

No statistically significant difference on the minetest trace I was looking
at (n=169), but the maximum FPS is up by .3%

6 years ago broadcom/vc4: Simplify the relocation handling for index buffers.
Eric Anholt [Sat, 25 Nov 2017 06:11:11 +0000 (22:11 -0800)]
broadcom/vc4: Simplify the relocation handling for index buffers.

Originally there was CL code for handling various relocations back when I
had relocs for the TSDA/TA buffers.  Now that the kernel handles those
entirely on its own, I can inline that code into the one place using it.

6 years ago broadcom/vc4: Fix handling of GFXH-515 workaround with a start vertex count.
Eric Anholt [Sat, 25 Nov 2017 05:40:50 +0000 (21:40 -0800)]
broadcom/vc4: Fix handling of GFXH-515 workaround with a start vertex count.

We failed to take the start into account for how many vertices to draw in
this round, so we would end up decrementing count below 0, which as an
unsigned number meant we would loop until the CLs soon ran out of space.

When I wrote the code I was thinking about how to use the previously
emitted shader state (no index bias baked into the elements) by emitting
up to 65535 and then only re-emitting with bias for the second round, but
that doesn't work if the start is over 65535.  Instead, just delay
emitting shader state until we get into the drawarrays GFXH-515 loop and
always bake the bias in when we're doing the workaround.
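
To show the underflow hazard in isolation, here is a sketch of a draw-
splitting loop whose remaining count can never wrap.  It only models the
chunking; the shader-state and index-bias handling described above is left
out.  The function names and the overlap parameter (vertices shared between
consecutive chunks, e.g. for strips) are assumptions of this example.

```c
#include <assert.h>
#include <stdint.h>

typedef void (*emit_draw_fn)(void *priv, uint32_t start, uint32_t count);

/* Split a draw of `count` vertices beginning at `start` into chunks of at
 * most `max_verts` vertices.  Returns the number of draws emitted. */
static unsigned split_draw(uint32_t start, uint32_t count,
                           uint32_t max_verts, uint32_t overlap,
                           emit_draw_fn emit, void *priv)
{
    assert(max_verts > overlap);
    unsigned draws = 0;

    while (count) {
        uint32_t this_count = count;
        uint32_t step = count;

        if (this_count > max_verts) {
            this_count = max_verts;
            step = max_verts - overlap;
        }
        if (emit)
            emit(priv, start, this_count);
        draws++;

        start += step;
        count -= step;   /* step <= count here, so this cannot wrap */
    }
    return draws;
}

/* Example callback that records the most recent draw. */
struct last_draw { uint32_t start, count; };

static void record_draw(void *priv, uint32_t start, uint32_t count)
{
    struct last_draw *d = priv;
    d->start = start;
    d->count = count;
}
```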

6 years ago broadcom/vc4: Fix the scaling factor for the GFXH-515 workaround.
Eric Anholt [Fri, 1 Dec 2017 23:29:05 +0000 (15:29 -0800)]
broadcom/vc4: Fix the scaling factor for the GFXH-515 workaround.

For triangle strips, we step by max_verts - 2.

6 years ago meson: use dep_thread instead of dependency('threads') in freedreno
Dylan Baker [Wed, 29 Nov 2017 00:49:02 +0000 (16:49 -0800)]
meson: use dep_thread instead of dependency('threads') in freedreno

They are the same thing, but this is more consistent with the rest of
the project.

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago meson: Add lmsensors support
Dylan Baker [Wed, 29 Nov 2017 00:42:37 +0000 (16:42 -0800)]
meson: Add lmsensors support

v2: - Make -Dlmsensors=false work
    - Simplify auto and true cases

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago meson: Add support for gallium extra hud
Dylan Baker [Wed, 29 Nov 2017 00:31:06 +0000 (16:31 -0800)]
meson: Add support for gallium extra hud

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago glx: Prepare driFetchDrawable for no-config contexts
Adam Jackson [Tue, 14 Nov 2017 20:13:05 +0000 (15:13 -0500)]
glx: Prepare driFetchDrawable for no-config contexts

When we look up the DRI drawable state we need to associate an fbconfig
with the drawable. With GLX_EXT_no_config_context we can no longer infer
that from the context and must instead query the server.

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
6 years ago glx: Use __glXSendError instead of open-coding it
Adam Jackson [Tue, 14 Nov 2017 20:13:02 +0000 (15:13 -0500)]
glx: Use __glXSendError instead of open-coding it

This also fixes a bug: the error path through MakeCurrent didn't
translate the error code by the extension's error base.

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
6 years ago glx: Simplify some dummy vtable interactions
Adam Jackson [Tue, 14 Nov 2017 20:13:01 +0000 (15:13 -0500)]
glx: Simplify some dummy vtable interactions

The dummy vtable has these slots as NULL already, no need to check for
the dummy context explicitly.

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
6 years ago docs/release-calendar: update and extend
Emil Velikov [Wed, 29 Nov 2017 18:16:38 +0000 (18:16 +0000)]
docs/release-calendar: update and extend

v2: Missing td tag, add Andres + Juan for 17.2.8 and 17.3.3

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
6 years ago docs/specs: annotate MESA_set_3dfx_mode as obsolete
Emil Velikov [Wed, 29 Nov 2017 15:21:03 +0000 (15:21 +0000)]
docs/specs: annotate MESA_set_3dfx_mode as obsolete

Aimed to work with Glide, which hasn't been a thing in over 10 years.
There are no drivers that implement it, so annotate it as obsolete

v2: Move the extension to OLD/

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Adam Jackson <ajax@redhat.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
6 years ago xlib: remove dummy GLX_MESA_set_3dfx_mode implementation
Emil Velikov [Wed, 29 Nov 2017 15:15:19 +0000 (15:15 +0000)]
xlib: remove dummy GLX_MESA_set_3dfx_mode implementation

The implementation is a simple 'return EGL_FALSE'. Stop pretending and
simply remove it.

Note: the removal of XMesa API is fine, since there haven't been any
users for it in years.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
6 years ago docs/specs: annotate MESA_agp_offset as obsolete
Emil Velikov [Wed, 29 Nov 2017 15:09:01 +0000 (15:09 +0000)]
docs/specs: annotate MESA_agp_offset as obsolete

No Mesa driver has implemented the extension in ages. Seemingly non Mesa
drivers don't implement it either.

As mentioned by Ian, the extension is effectively superseded by
ARB_vertex_buffer_object.

v2: Move the extension to OLD/

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com> (v1)
Reviewed-by: Adam Jackson <ajax@redhat.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
6 years ago xlib: remove empty GLX_MESA_agp_offset stubs
Emil Velikov [Wed, 29 Nov 2017 14:46:26 +0000 (14:46 +0000)]
xlib: remove empty GLX_MESA_agp_offset stubs

The extension was never implemented and seemingly never will be.
The DRI based libGL dropped support for it over 10 years ago.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
6 years ago xlib: remove empty GLX_NV_vertex_array_range stubs
Emil Velikov [Wed, 29 Nov 2017 14:32:36 +0000 (14:32 +0000)]
xlib: remove empty GLX_NV_vertex_array_range stubs

The extension was never implemented and seemingly never will be.
The DRI based libGL dropped support for it over 10 years ago.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
6 years ago i965/gen10: Change the order of PIPE_CONTROL and load register.
Rafael Antognolli [Wed, 8 Nov 2017 19:39:52 +0000 (11:39 -0800)]
i965/gen10: Change the order of PIPE_CONTROL and load register.

I believe the workaround describes that the MI_LOAD_REGISTER_IMM should
come right after the 3DSTATE_SAMPLE_PATTERN.

This fixes GPU hangs in the i965 initial state batchbuffer when running
some Piglit tests with always_flush_batch=true.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years ago intel/compiler: Implement WaClearTDRRegBeforeEOTForNonPS.
Rafael Antognolli [Fri, 6 Oct 2017 18:41:54 +0000 (11:41 -0700)]
intel/compiler: Implement WaClearTDRRegBeforeEOTForNonPS.

The bspec describes:

   "WA: Clear tdr register before send EOT in all non-PS shader kernels

   mov(8) tdr0:ud 0x0:ud {NoMask}"

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years ago i965/gen10: emit 3DSTATE_MULTISAMPLE more often.
Rafael Antognolli [Mon, 2 Oct 2017 18:06:05 +0000 (11:06 -0700)]
i965/gen10: emit 3DSTATE_MULTISAMPLE more often.

On CNL, we see multiple multisample failures on piglit tests. By
emitting this extra state, though not documented in the bspec, those
failures seem to go away.

This workaround could be removed if we ever find out a better solution,
but it should be good enough for now.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years ago meson: install khrplatform header for EGL as well as GLES
Dylan Baker [Thu, 30 Nov 2017 18:39:29 +0000 (10:39 -0800)]
meson: install khrplatform header for EGL as well as GLES

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago meson: install dri internal header
Dylan Baker [Thu, 30 Nov 2017 18:37:11 +0000 (10:37 -0800)]
meson: install dri internal header

Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago i965: Disable regular fast-clears (CCS_D) on gen9+
Jason Ekstrand [Thu, 30 Nov 2017 00:22:42 +0000 (16:22 -0800)]
i965: Disable regular fast-clears (CCS_D) on gen9+

This partially reverts commit 3e57e9494c2279580ad6a83ab8c065d01e7e634e
which caused a bunch of GPU hangs on several Source titles.  To date, we
have no clue why these hangs are actually happening.  This undoes the
final effect of 3e57e9494c227 and gets us back to not hanging.  Tested
with Team Fortress 2.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102435
Fixes: 3e57e9494c2279580ad6a83ab8c065d01e7e634e
Cc: mesa-stable@lists.freedesktop.org
6 years ago egl/x11: Remove unneeded free() on always null string
Vadym Shovkoplias [Fri, 1 Dec 2017 15:08:53 +0000 (17:08 +0200)]
egl/x11: Remove unneeded free() on always null string

In this condition the dri2_dpy->driver_name string is always
NULL, so the call to free() is useless.
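
As background: the C standard defines free(NULL) as a no-op, which is
why a free() on a pointer known to be NULL can simply be dropped. A
minimal sketch of that behaviour follows; `struct display` and
`release_driver_name` are hypothetical stand-ins for illustration, not
the actual egl/x11 code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the display struct; not the Mesa definition. */
struct display {
    char *driver_name;
};

/* Frees the name if present and resets the pointer.
 * free(NULL) is defined by the C standard to do nothing, so calling this
 * on a display whose driver_name is already NULL is harmless but pointless. */
static void release_driver_name(struct display *dpy)
{
    assert(dpy != NULL);
    free(dpy->driver_name);
    dpy->driver_name = NULL;
}
```

On a code path where the pointer is provably NULL, the free() line
contributes nothing and can be removed without changing behaviour.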

Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago gallium/hud: use #ifdef to test for macro existence
Eric Engestrom [Wed, 29 Nov 2017 14:19:26 +0000 (14:19 +0000)]
gallium/hud: use #ifdef to test for macro existence

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago amd: remove always-true BRAHMA_BUILD define
Eric Engestrom [Fri, 24 Nov 2017 16:23:03 +0000 (16:23 +0000)]
amd: remove always-true BRAHMA_BUILD define

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years ago glx/dri3: Remove unused deviceName variable
Vadym Shovkoplias [Fri, 1 Dec 2017 11:23:02 +0000 (13:23 +0200)]
glx/dri3: Remove unused deviceName variable

The deviceName string is declared, assigned and freed but never
actually used in the dri3_create_screen() function.

Fixes: 2d94601582e ("Add DRI3+Present loader")
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago swr/scons: Fix intermittent build failure
George Kyriazis [Thu, 30 Nov 2017 20:24:39 +0000 (14:24 -0600)]
swr/scons: Fix intermittent build failure

gen_rasterizer*.cpp depends on gen_ar_eventhandler.hpp.
Account for new dependency.

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
6 years ago radv: only reset command buffers when the allocation fails
Samuel Pitoiset [Thu, 30 Nov 2017 21:23:37 +0000 (22:23 +0100)]
radv: only reset command buffers when the allocation fails

   "vkAllocateCommandBuffers can be used to create multiple command
    buffers. If the creation of any of those command buffers fails, the
    implementation must destroy all successfully created command buffer
    objects from this command, set all entries of the pCommandBuffers
    array to NULL and return the error."

This has been suggested by gabriel@system.is.
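
The all-or-nothing behaviour quoted above can be sketched as follows.
This is a generic illustration, not the radv implementation:
`struct cmd_buffer`, `create_one`, `allocate_all` and the
`alloc_count`/`fail_at` failure-injection variables are all
hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object type standing in for a command buffer. */
struct cmd_buffer {
    int id;
};

/* Hypothetical failure injection: create_one() returns NULL once
 * `fail_at` objects have been created (negative means never fail). */
static int alloc_count = 0;
static int fail_at = -1;

static struct cmd_buffer *create_one(void)
{
    if (fail_at >= 0 && alloc_count >= fail_at)
        return NULL;
    struct cmd_buffer *cb = malloc(sizeof(*cb));
    if (cb)
        cb->id = alloc_count++;
    return cb;
}

/* All-or-nothing allocation: on any failure, destroy everything created
 * so far, set every output entry to NULL and return an error (-1 here). */
static int allocate_all(struct cmd_buffer **out, size_t count)
{
    assert(out != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        out[i] = create_one();
        if (out[i] == NULL) {
            for (size_t j = 0; j < i; j++)
                free(out[j]);
            for (size_t j = 0; j < count; j++)
                out[j] = NULL;
            return -1;
        }
    }
    return 0;
}
```

The output array is only cleared on the failure path; a successful
call leaves every entry pointing at a live object.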

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years ago radv: do not dump meta shaders with RADV_DEBUG=shaders
Samuel Pitoiset [Thu, 30 Nov 2017 21:16:09 +0000 (22:16 +0100)]
radv: do not dump meta shaders with RADV_DEBUG=shaders

It's really annoying and pollutes the output, especially
when a bunch of non-meta shaders are compiled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years ago r600: add ARB_shader_storage_buffer_object support (v3)
Dave Airlie [Fri, 3 Nov 2017 00:15:38 +0000 (10:15 +1000)]
r600: add ARB_shader_storage_buffer_object support (v3)

This just builds on the image support. Evergreen only has ssbo
for fragment and compute, not for other stages.

v2: handle images and ssbo in the same shader properly (Ilia)
v3: fix RESQ on buffers,
    fix missing atom emit
    fix first element offset
    use R32 format
    write separate buffer rat store path.
(from running deqp gles3.1 tests)

Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years ago r600/cayman: looks like cmpxchg moved to Z
Dave Airlie [Mon, 27 Nov 2017 06:39:49 +0000 (06:39 +0000)]
r600/cayman: looks like cmpxchg moved to Z

On cayman it appears the cmp component is now in Z.

Fixes:
arb_shader_image_load_store-dead-fragments on cayman.

Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years ago r600/shader: fix 64->32 conversions
Dave Airlie [Mon, 27 Nov 2017 02:07:45 +0000 (02:07 +0000)]
r600/shader: fix 64->32 conversions

These didn't handle the TGSI properly at all. This fixes
them to use the common path for 64->32 and then adds the 32->int
conversion at the end.

Fixes:
generated_tests/spec/arb_gpu_shader_fp64/execution/conversion/*

Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years ago radv: do not allocate CMASK or DCC for small surfaces
Samuel Pitoiset [Wed, 29 Nov 2017 13:48:32 +0000 (14:48 +0100)]
radv: do not allocate CMASK or DCC for small surfaces

The idea is ported from RadeonSI, but using a 512x512 threshold
instead of 256x256 seems slightly better. This improves dota2
performance by 2%.
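
A minimal sketch of such a size check, for illustration only. The
function name `wants_compression_metadata` and the macro
`SMALL_SURFACE_PIXELS` are hypothetical, and comparing the pixel area
inclusively (rather than testing each dimension separately) is an
assumption of this sketch, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed threshold: surfaces of at most 512x512 pixels count as "small".
 * Comparing the pixel area, and treating exactly 512x512 as small, are
 * assumptions of this sketch. */
#define SMALL_SURFACE_PIXELS (512u * 512u)

/* Returns true when a surface is large enough that allocating compression
 * metadata (CMASK or DCC in the text above) is considered worthwhile. */
static bool wants_compression_metadata(uint32_t width, uint32_t height)
{
    assert(width > 0 && height > 0);
    /* Widen before multiplying so large dimensions cannot overflow. */
    uint64_t pixels = (uint64_t)width * height;
    return pixels > SMALL_SURFACE_PIXELS;
}
```

The trade-off behind a threshold like this is that metadata surfaces
carry a fixed overhead that small render targets may not repay.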

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
6 years ago radv: do not set DISABLE_LSB_CEIL on GFX9
Samuel Pitoiset [Thu, 30 Nov 2017 19:58:29 +0000 (20:58 +0100)]
radv: do not set DISABLE_LSB_CEIL on GFX9

The state no longer exists on GFX9.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years ago radv: remove set but unnecessary radv_color_buffer_info::micro_tile_mode
Samuel Pitoiset [Thu, 30 Nov 2017 13:32:58 +0000 (14:32 +0100)]
radv: remove set but unnecessary radv_color_buffer_info::micro_tile_mode

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years ago radv: do not store gfx9_epitch in radv_color_buffer_info
Samuel Pitoiset [Thu, 30 Nov 2017 13:32:57 +0000 (14:32 +0100)]
radv: do not store gfx9_epitch in radv_color_buffer_info

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years ago meson: fix glxext.h install
Dylan Baker [Wed, 29 Nov 2017 19:18:52 +0000 (11:18 -0800)]
meson: fix glxext.h install

Another typo: the glext.h header was being installed instead.

Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago meson: fix GLES3/gl31.h install
Dylan Baker [Wed, 29 Nov 2017 19:16:59 +0000 (11:16 -0800)]
meson: fix GLES3/gl31.h install

This is a typo; gl32.h was being installed twice.

Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
6 years ago ac/surface: always compute DCC info when DCC is possible on GFX9
Marek Olšák [Thu, 30 Nov 2017 01:14:18 +0000 (02:14 +0100)]
ac/surface: always compute DCC info when DCC is possible on GFX9

The same code for VI doesn't check for scanout either.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
6 years ago radeonsi/gfx9: fix importing shared textures with DCC
Marek Olšák [Thu, 30 Nov 2017 01:16:29 +0000 (02:16 +0100)]
radeonsi/gfx9: fix importing shared textures with DCC

VI has 11 dwords at least. GFX9 has 10 dwords.

Cc: 17.2 17.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>