mesa.git
4 years agonir/samplers: don't zero samplers_used/txf.
Dave Airlie [Fri, 29 Nov 2019 00:56:05 +0000 (10:56 +1000)]
nir/samplers: don't zero samplers_used/txf.

This allows this pass to be run multiple times and the results are
just or'ed together.

It fixes one test on llvmpipe nir, and regresses none.

Suggested by Kenneth

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
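
A minimal sketch of the accumulate-instead-of-reset idea from the commit above, not the actual pass. The struct, its fields and gather_samplers() are hypothetical names for illustration; the only thing taken from the commit message is that repeated runs OR their results together instead of zeroing the masks first.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the shader info masks named in the commit. */
struct shader_info_sketch {
   uint32_t samplers_used; /* bit i set: sampler i is referenced */
   uint32_t samplers_txf;  /* bit i set: sampler i is used by a texel fetch */
};

/* One run of the gathering pass over a list of sampler indices.
 * The masks are deliberately not reset here, so each run only adds bits. */
void
gather_samplers(struct shader_info_sketch *info,
                const unsigned *indices, const int *is_txf, unsigned count)
{
   for (unsigned i = 0; i < count; i++) {
      assert(indices[i] < 32);
      info->samplers_used |= UINT32_C(1) << indices[i];
      if (is_txf[i])
         info->samplers_txf |= UINT32_C(1) << indices[i];
   }
}
```

Because nothing is cleared on entry, calling gather_samplers() twice on the same struct leaves the union of both runs in the masks. The caller is responsible for zero-initializing the struct once.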
4 years agoaco: drop useless lowering of deref operations for shared memory
Samuel Pitoiset [Thu, 7 Nov 2019 21:34:20 +0000 (22:34 +0100)]
aco: drop useless lowering of deref operations for shared memory

Moved to RADV. No pipeline-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agoradv,ac/nir: lower deref operations for shared memory
Samuel Pitoiset [Thu, 7 Nov 2019 14:56:35 +0000 (15:56 +0100)]
radv,ac/nir: lower deref operations for shared memory

This shouldn't introduce any functional changes for RadeonSI
when NIR is enabled because these operations are already lowered.

pipeline-db (NAVI10/LLVM):
SGPRS: 9043 -> 9051 (0.09 %)
VGPRS: 7272 -> 7292 (0.28 %)
Code Size: 638892 -> 621628 (-2.70 %) bytes
LDS: 1333 -> 1331 (-0.15 %) blocks
Max Waves: 1614 -> 1608 (-0.37 %)

Found this while glancing at some F1 2019 shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoaco: fix a couple of value numbering issues
Daniel Schürmann [Fri, 29 Nov 2019 15:47:13 +0000 (16:47 +0100)]
aco: fix a couple of value numbering issues

Fixes: 3a20ef4a3299fddc886f9d5908d8b3952dd63a54 'aco: refactor value numbering'
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: don't split live-ranges of linear VGPRs
Daniel Schürmann [Fri, 29 Nov 2019 15:43:24 +0000 (16:43 +0100)]
aco: don't split live-ranges of linear VGPRs

Fixes: 93c8ebfa780ebd1495095e794731881aef29e7d3 'aco: Initial commit of independent AMD compiler'
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
4 years agoaco: implement global atomics
Rhys Perry [Wed, 27 Nov 2019 16:51:10 +0000 (16:51 +0000)]
aco: implement global atomics

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agoaco: improve FLAT/GLOBAL scheduling
Rhys Perry [Wed, 27 Nov 2019 17:27:36 +0000 (17:27 +0000)]
aco: improve FLAT/GLOBAL scheduling

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agoaco: don't enable store_global for helper invocations
Rhys Perry [Wed, 27 Nov 2019 17:15:54 +0000 (17:15 +0000)]
aco: don't enable store_global for helper invocations

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agoaco: fix SADDR with FLAT on GFX10
Rhys Perry [Wed, 27 Nov 2019 17:08:27 +0000 (17:08 +0000)]
aco: fix SADDR with FLAT on GFX10

The reference guide is incorrect and SADDR is actually used with FLAT on
GFX10.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agoaco: fix assembly of FLAT/GLOBAL atomics
Rhys Perry [Wed, 27 Nov 2019 17:06:10 +0000 (17:06 +0000)]
aco: fix assembly of FLAT/GLOBAL atomics

They can take both a definition and data operand

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agoaco: fix GFX10 opcodes for some global/flat atomics
Rhys Perry [Tue, 26 Nov 2019 21:06:35 +0000 (21:06 +0000)]
aco: fix GFX10 opcodes for some global/flat atomics

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agoaco: improve WAR hazard workaround with >64bit stores
Rhys Perry [Wed, 27 Nov 2019 17:23:02 +0000 (17:23 +0000)]
aco: improve WAR hazard workaround with >64bit stores

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agoaco: add v_nop inbetween exec write and VMEM/DS/FLAT
Rhys Perry [Wed, 27 Nov 2019 17:20:15 +0000 (17:20 +0000)]
aco: add v_nop inbetween exec write and VMEM/DS/FLAT

LLVM and the proprietary compiler seem to do this

Fixes: b01847bd9 ("aco/gfx10: Fix mitigation of VMEMtoScalarWriteHazard.")
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agoaco: fix incorrect cast in parse_wait_instr()
Rhys Perry [Wed, 27 Nov 2019 17:11:58 +0000 (17:11 +0000)]
aco: fix incorrect cast in parse_wait_instr()

s_waitcnt is SOPP, not SOPK

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agoaco: fix i2i64
Rhys Perry [Wed, 27 Nov 2019 17:24:23 +0000 (17:24 +0000)]
aco: fix i2i64

Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agoaco: propagate p_wqm on an image_sample's coordinate p_create_vector
Rhys Perry [Thu, 28 Nov 2019 15:29:40 +0000 (15:29 +0000)]
aco: propagate p_wqm on an image_sample's coordinate p_create_vector

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2156
Fixes: 93c8ebfa780 ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
4 years agoetnaviv: remove dead code
Christian Gmeiner [Fri, 29 Nov 2019 14:15:27 +0000 (15:15 +0100)]
etnaviv: remove dead code

ptiled is always NULL so the if statement is useless.

CoverityID: 1415572
Fixes: b9627765303 ("etnaviv: rework compatible render base")
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
4 years agoetnaviv: handle integer case for GENERIC_ATTRIB_SCALE
Christian Gmeiner [Sat, 19 Oct 2019 17:12:53 +0000 (19:12 +0200)]
etnaviv: handle integer case for GENERIC_ATTRIB_SCALE

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
4 years agoetnaviv: fix R10G10B10A2 vertex format entries
Christian Gmeiner [Sat, 19 Oct 2019 16:48:35 +0000 (18:48 +0200)]
etnaviv: fix R10G10B10A2 vertex format entries

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
4 years agoetnaviv: use NORMALIZE_SIGN_EXTEND
Christian Gmeiner [Wed, 16 Oct 2019 04:31:17 +0000 (06:31 +0200)]
etnaviv: use NORMALIZE_SIGN_EXTEND

The blob driver does something like this for all vertex formats:

if (normalize) {
   if (OPENGL_ES30)
      val = VIVS_FE_VERTEX_ELEMENT_CONFIG_NORMALIZE_SIGN_EXTEND;
   else
      val = VIVS_FE_VERTEX_ELEMENT_CONFIG_NORMALIZE_ON;
} else {
   val = VIVS_FE_VERTEX_ELEMENT_CONFIG_NORMALIZE_OFF;
}

As there is no way to get to that information in gallium we always
assume OPENGL_ES30.

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
4 years agoetnaviv: fix integer vertex formats
Christian Gmeiner [Sun, 13 Oct 2019 08:54:48 +0000 (10:54 +0200)]
etnaviv: fix integer vertex formats

Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
4 years agoi965: update Makefile.sources for perf changes
Jonathan Gray [Thu, 28 Nov 2019 05:57:23 +0000 (16:57 +1100)]
i965: update Makefile.sources for perf changes

brw_performance_query_metrics.h was removed in
134e750e16bfc53480e0bba6f0ae3e1d2a7fb87c and
brw_performance_query.h was removed in
8ae6667992ccca41d08884d863b8aeb22a4c4e65

remove reference to these files from Makefile.sources

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Fixes: 134e750e16bfc53480e0 ("i965: extract performance query metrics")
Fixes: 8ae6667992ccca41d088 ("intel/perf: move query_object into perf")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
4 years agoscons: Bump C standard to gnu11 on macOS 10.15.
Vinson Lee [Thu, 28 Nov 2019 08:05:13 +0000 (00:05 -0800)]
scons: Bump C standard to gnu11 on macOS 10.15.

Fix build error on macOS 10.15 Catalina.

src/util/u_queue.c:179:7: error: implicit declaration of function 'timespec_get' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
      timespec_get(&ts, TIME_UTC);
      ^

timespec_get needs C11 starting with macOS 10.15.

/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/time.h
   #if (__DARWIN_C_LEVEL >= __DARWIN_C_FULL) && \
           ((defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L) || \
           (defined(__cplusplus) && __cplusplus >= 201703L))
   /* ISO/IEC 9899:201x 7.27.2.5 The timespec_get function */
   #define TIME_UTC 1 /* time elapsed since epoch */
   __API_AVAILABLE(macosx(10.15), ios(13.0), tvos(13.0), watchos(6.0))
   int timespec_get(struct timespec *ts, int base);
   #endif

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric@engestrom.ch>
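
For illustration, a small sketch of calling timespec_get() the way the failing line in u_queue.c does. current_utc_time() is a hypothetical wrapper, not a Mesa function, and the block assumes the compiler is run in C11 or later mode (for example -std=gnu11), as the commit requires.

```c
#include <assert.h>
#include <time.h>

/* Fills *ts with the current UTC time. Returns 1 on success, 0 on failure.
 * Requires a C11 (or later) compilation mode so that <time.h> declares
 * timespec_get() and TIME_UTC. */
int
current_utc_time(struct timespec *ts)
{
   assert(ts != NULL);
   /* timespec_get() returns its base argument on success and 0 on failure. */
   return timespec_get(ts, TIME_UTC) == TIME_UTC;
}
```

Under -std=c99 the declaration is hidden by the header guard quoted above, which is what produces the implicit-declaration error.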
4 years agopanfrost: Make sure we reset the damage region of RTs at flush time
Boris Brezillon [Thu, 14 Nov 2019 08:35:27 +0000 (09:35 +0100)]
panfrost: Make sure we reset the damage region of RTs at flush time

We must reset the damage info of our render targets here even though a
damage reset normally happens when the DRI layer swaps buffers. That's
because there can be implicit flushes the GL app is not aware of, and
those might impact the damage region: if part of the damaged portion
is drawn during those implicit flushes, you have to reload those areas
before next draws are pushed, and since the driver can't easily know
what's been modified by the draws it flushed, the easiest solution is
to reload everything.

Reported-by: Carsten Haitzler <raster@rasterman.com>
Fixes: 65ae86b85422 ("panfrost: Add support for KHR_partial_update()")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agogallium: Fix the ->set_damage_region() implementation
Boris Brezillon [Fri, 8 Nov 2019 23:02:54 +0000 (00:02 +0100)]
gallium: Fix the ->set_damage_region() implementation

BACK_LEFT attachment can be outdated when the user calls
KHR_partial_update() (->lastStamp != ->texture_stamp), leading to a
damage region update on the wrong pipe_resource object.
Let's delay the ->set_damage_region() call until the attachments are
updated when we're in that case.

Reported-by: Carsten Haitzler <raster@rasterman.com>
Fixes: 492ffbed63a2 ("st/dri2: Implement DRI2bufferDamageExtension")
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
4 years agozink: silence coverity error
Erik Faye-Lund [Wed, 27 Nov 2019 16:46:29 +0000 (17:46 +0100)]
zink: silence coverity error

Coverity doesn't know that we always have coordinates if we have lod. To
avoid annoying errors, let's just zero-initialize this.

CoverityID: 1455202
Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agozink: error-check right variable
Erik Faye-Lund [Wed, 27 Nov 2019 16:44:05 +0000 (17:44 +0100)]
zink: error-check right variable

That's not the value we just allocated...

CoverityID: 1455177
Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agozink: avoid NULL-deref
Erik Faye-Lund [Wed, 27 Nov 2019 16:38:53 +0000 (17:38 +0100)]
zink: avoid NULL-deref

Same story as the previous two commits; these functions dereference the
memory they are pointed at. We can't do that.

CoverityID: 1455180
Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agozink: avoid NULL-deref
Erik Faye-Lund [Wed, 27 Nov 2019 16:34:08 +0000 (17:34 +0100)]
zink: avoid NULL-deref

Similar to the previous commit, pipe_resource_reference also dereferences
the memory pointed at. Let's avoid it.

CoverityID: 1455198
Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agozink: avoid NULL-deref
Erik Faye-Lund [Wed, 27 Nov 2019 16:17:08 +0000 (17:17 +0100)]
zink: avoid NULL-deref

zink_render_pass_reference will dereference the memory 'dst' points at,
which can't really go well. All we want to do here is to increase the
reference-count, so let's use a different helper for that instead.

CoverityID: 1455200
Reviewed-by: Dave Airlie <airlied@redhat.com>
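
The problem described above can be sketched with a generic refcounted type. None of these names are zink's; struct refcounted, object_reference() and object_ref() are hypothetical, and a real helper would also destroy the object when the count reaches zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted object, not zink's actual type. */
struct refcounted {
   int refcount;
};

/* "Reference" helper in the usual style: it reads the old value of *dst in
 * order to drop it, so *dst must be either NULL or a valid object. Calling
 * it on a slot that holds uninitialized memory is the bug described above. */
void
object_reference(struct refcounted **dst, struct refcounted *src)
{
   assert(dst != NULL);
   struct refcounted *old = *dst;   /* dereferences what dst points at */
   if (src)
      src->refcount++;
   if (old)
      old->refcount--;              /* a real helper would free at zero */
   *dst = src;
}

/* Only takes a reference. Safe to use when the destination slot has not
 * been initialized yet, because the slot is never read. */
struct refcounted *
object_ref(struct refcounted *obj)
{
   assert(obj != NULL);
   obj->refcount++;
   return obj;
}
```

With this split, freshly allocated storage is filled with `slot = object_ref(obj)`, while object_reference() is reserved for slots that already hold NULL or a live object.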
4 years agozink: handle calloc-failure
Erik Faye-Lund [Wed, 27 Nov 2019 16:31:28 +0000 (17:31 +0100)]
zink: handle calloc-failure

In case we fail to allocate the context, we should notice and fail
gracefully.

CoverityID: 1455193
Reviewed-by: Dave Airlie <airlied@redhat.com>
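
A minimal sketch of the "notice and fail gracefully" pattern, under stated assumptions: struct context_sketch and context_create() are hypothetical, and the allocator is passed in as a parameter only so that the failure path can be exercised. The real driver calls its allocator directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical context type used for illustration only. */
struct context_sketch {
   int initialized;
};

typedef void *(*calloc_fn)(size_t nmemb, size_t size);

/* Allocates and initializes a context. Returns NULL if allocation fails,
 * without touching the (missing) memory. */
struct context_sketch *
context_create(calloc_fn alloc)
{
   assert(alloc != NULL);
   struct context_sketch *ctx = alloc(1, sizeof(*ctx));
   if (!ctx)
      return NULL;
   ctx->initialized = 1;
   return ctx;
}

/* Allocator that always fails, to simulate an out-of-memory condition. */
void *
failing_calloc(size_t nmemb, size_t size)
{
   (void)nmemb;
   (void)size;
   return NULL;
}
```

Calling context_create(failing_calloc) returns NULL, while context_create(calloc) returns an initialized object that the caller releases with free().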
4 years agozink: do not try to destroy NULL-fence
Erik Faye-Lund [Wed, 27 Nov 2019 16:22:24 +0000 (17:22 +0100)]
zink: do not try to destroy NULL-fence

destroy_fence doesn't handle NULL-pointers gracefully. So let's avoid
hitting that code-path, by simply returning NULL early here instead.

CoverityID: 1455179
Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agozink: delete query rather than allocating a new one
Erik Faye-Lund [Wed, 27 Nov 2019 15:30:29 +0000 (16:30 +0100)]
zink: delete query rather than allocating a new one

It seems I had some fat fingers when writing this function, and I
accidentally ended up allocating a new query and immediately trying to
delete an uninitialized pool instead of just deleting the pool of the
query that was passed.

CoverityID: 1455196
Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agozink: fix crash when restoring sampler-states
Erik Faye-Lund [Thu, 28 Nov 2019 17:22:24 +0000 (18:22 +0100)]
zink: fix crash when restoring sampler-states

When I changed to heap-allocated sampler-objects, I missed the code-path
that restores sampler-states after the blitter; it needs an array of
pointers, not an array of VkSampler objects to behave.

This fixes spec@arb_texture_cube_map@copyteximage for me.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 5ea787950f6 ("zink: heap-allocate samplers objects")
Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agozink: reject invalid sample-counts
Erik Faye-Lund [Thu, 28 Nov 2019 17:41:30 +0000 (18:41 +0100)]
zink: reject invalid sample-counts

Vulkan only allows power-of-two sample counts. We already kinda checked
for this, but forgot to validate the result in the end. Let's check the
result and error properly.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
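
As an illustration of the check described above, a power-of-two test for sample counts could look like the sketch below. sample_count_is_valid() is a hypothetical helper, not the function zink uses; the upper bound of 64 corresponds to VK_SAMPLE_COUNT_64_BIT, the largest VkSampleCountFlagBits value.

```c
#include <stdbool.h>

/* Returns true if nr_samples is a sample count Vulkan can express:
 * a power of two between 1 and 64. */
bool
sample_count_is_valid(unsigned nr_samples)
{
   if (nr_samples == 0 || nr_samples > 64)
      return false;
   /* A power of two has exactly one bit set, so clearing the lowest set
    * bit must leave zero. */
   return (nr_samples & (nr_samples - 1)) == 0;
}
```

A caller would reject the request when this returns false. Whether the device actually supports a given valid count is a separate query and is not covered here.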
4 years agozink: use true/false instead of TRUE/FALSE
Erik Faye-Lund [Thu, 28 Nov 2019 17:41:05 +0000 (18:41 +0100)]
zink: use true/false instead of TRUE/FALSE

Reviewed-by: Dave Airlie <airlied@redhat.com>
4 years agost/mesa: unmap pbo after updating cache
Erik Faye-Lund [Thu, 28 Nov 2019 16:51:20 +0000 (17:51 +0100)]
st/mesa: unmap pbo after updating cache

Unmapping first leads to accessing an invalid pointer. So let's switch
these lines around.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
4 years agopanfrost: Fix gnu-empty-initializer build errors.
Vinson Lee [Thu, 28 Nov 2019 07:37:00 +0000 (23:37 -0800)]
panfrost: Fix gnu-empty-initializer build errors.

Fixes: a24d6fbae60c ("meson: Add -Werror=gnu-empty-initializer to MSVC compat args")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agodocs: update source code repository documentation
Timothy Arceri [Thu, 28 Nov 2019 04:26:34 +0000 (15:26 +1100)]
docs: update source code repository documentation

This drops all the old documentation around applying for push access.

Also this removes the documentation stating that you can push
directly to mesa rather than using merge requests.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1969
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
4 years agoradv: Fix timeline semaphore refcounting.
Bas Nieuwenhuizen [Fri, 22 Nov 2019 00:51:36 +0000 (01:51 +0100)]
radv: Fix timeline semaphore refcounting.

Was totally broken ...

Removed two if(point) {} because point is always non-NULL and we
were already relying on that for counting, since we NULL our
references to semaphores without an active point earlier.

Fixes: 4aa75bb3bdd "radv: Add wait-before-submit support for timelines."
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2137
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
4 years agowinsys/amdgpu: avoid double simple_mtx_unlock()
Jonathan Gray [Thu, 28 Nov 2019 05:56:30 +0000 (16:56 +1100)]
winsys/amdgpu: avoid double simple_mtx_unlock()

Calling pthread_mutex_unlock() on an unlocked mutex is documented by POSIX
as undefined behaviour.  On OpenBSD pthread_mutex_unlock() will call
abort(3) if this happens.

This occurs in amdgpu_winsys_create() after
cb446dc0fa5c68f681108f4613560543aa4cf553
winsys/amdgpu: Add amdgpu_screen_winsys

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Cc: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
4 years agoutil/driconfig: print ATTENTION if MESA_DEBUG=silent is not set
Marek Olšák [Tue, 26 Nov 2019 01:05:47 +0000 (20:05 -0500)]
util/driconfig: print ATTENTION if MESA_DEBUG=silent is not set

unix-bytebenchmark refuses to run if the driver prints ATTENTION to stderr.

Acked-by: Eric Engestrom <eric@engestrom.ch>
4 years agoglsl: handle max uniform limits with lower_const_arrays_to_uniforms
Tapani Pälli [Fri, 8 Nov 2019 06:17:17 +0000 (08:17 +0200)]
glsl: handle max uniform limits with lower_const_arrays_to_uniforms

Fixes arb_tessellation_shader-large-uniforms Piglit test.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
4 years agoradv: Unify max_descriptor_set_size.
Bas Nieuwenhuizen [Wed, 27 Nov 2019 23:36:24 +0000 (00:36 +0100)]
radv: Unify max_descriptor_set_size.

They were out of sync. Besides syncing, let's ensure they never diverge
again.

Fixes: 8d2654a4197 "radv: Support VK_EXT_inline_uniform_block."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
4 years agoamd/llvm: Refactor ac_build_scan.
Bas Nieuwenhuizen [Wed, 27 Nov 2019 22:33:59 +0000 (23:33 +0100)]
amd/llvm: Refactor ac_build_scan.

Split out the logic for exclusive scans into a separate function
that makes clear what it does instead of having this opaque 60
line if.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
4 years agoradv: add more constants to avoid using magic numbers
Samuel Pitoiset [Tue, 26 Nov 2019 07:32:02 +0000 (08:32 +0100)]
radv: add more constants to avoid using magic numbers

Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoac/llvm: convert src operands to pointers if necessary
Samuel Pitoiset [Wed, 27 Nov 2019 14:32:45 +0000 (15:32 +0100)]
ac/llvm: convert src operands to pointers if necessary

To avoid generating invalid LLVM IR when both operands don't have
the same type. This might happen when performing pointer comparisons
with SPIRV 1.4.

Fixes invalid LLVM IR for:
dEQP-VK.spirv_assembly.instruction.spirv1p4.opptrequal.variable_pointers_ssbo_equal
dEQP-VK.spirv_assembly.instruction.spirv1p4.opptrnotequal.variable_pointers_ssbo_not_equal

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agollvmpipe: add initial nir support
Dave Airlie [Thu, 5 Sep 2019 05:49:25 +0000 (15:49 +1000)]
llvmpipe: add initial nir support

This adds the hooks between llvmpipe and the gallivm NIR
code, for compute and fragment shaders.

NIR support is hidden behind LP_DEBUG=nir for now until
all the integration issues are solved.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
4 years agogallivm: add swizzle support where one channel isn't defined.
Dave Airlie [Thu, 5 Sep 2019 05:41:05 +0000 (15:41 +1000)]
gallivm: add swizzle support where one channel isn't defined.

NIR doesn't always define all output channels.
This relies on outputs being memset to 0.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
4 years agogallium: add nir lowering passes for the draw pipe stages. (v2)
Dave Airlie [Thu, 5 Sep 2019 05:47:39 +0000 (15:47 +1000)]
gallium: add nir lowering passes for the draw pipe stages. (v2)

This transforms the NIR shaders like the TGSI transforms worked.

v2: fix some nir info requirements, use 32-bit bools

Acked-by: Roland Scheidegger <sroland@vmware.com>
4 years agodraw: add nir info gathering and building support
Dave Airlie [Thu, 5 Sep 2019 05:47:19 +0000 (15:47 +1000)]
draw: add nir info gathering and building support

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
4 years agogallivm: add nir->llvm translation (v2)
Dave Airlie [Thu, 5 Sep 2019 05:46:31 +0000 (15:46 +1000)]
gallivm: add nir->llvm translation (v2)

This adds the initial implementation of the NIR->LLVM conversion
for llvmpipe NIR support.

v2: lower bool to int32 in nir not llvm

Acked-by: Roland Scheidegger <sroland@vmware.com>
4 years agogallivm: add selection for non-32 bit types
Dave Airlie [Thu, 5 Sep 2019 05:43:38 +0000 (15:43 +1000)]
gallivm: add selection for non-32 bit types

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
4 years agogallivm: add cttz wrapper
Dave Airlie [Wed, 20 Nov 2019 01:44:22 +0000 (11:44 +1000)]
gallivm: add cttz wrapper

this will be used to write find_lsb support

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
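
To show how a count-trailing-zeros primitive maps onto find_lsb, here is a scalar sketch in plain C. It is not the gallivm wrapper: find_lsb32() is a hypothetical name, and __builtin_ctz is the GCC/Clang builtin rather than the LLVM cttz intrinsic the commit wraps. The zero case follows GLSL findLSB(), which returns -1 when no bit is set.

```c
#include <stdint.h>

/* Index of the least significant set bit, or -1 if value is zero. */
int
find_lsb32(uint32_t value)
{
   /* __builtin_ctz is undefined for 0, so handle that case explicitly. */
   if (value == 0)
      return -1;
   return __builtin_ctz(value);
}
```

The explicit zero check is the part that a bare cttz does not provide, which is why find_lsb needs a small amount of logic on top of the wrapper.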
4 years agogallivm: add popcount intrinsic wrapper
Dave Airlie [Mon, 28 Oct 2019 04:21:43 +0000 (14:21 +1000)]
gallivm: add popcount intrinsic wrapper

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
4 years agogallivm: nir->tgsi info convertor (v2)
Dave Airlie [Thu, 5 Sep 2019 05:32:21 +0000 (15:32 +1000)]
gallivm: nir->tgsi info convertor (v2)

This is a port of the old radeonsi code to be used for llvmpipe NIR support.

Once we remove TGSI support from llvmpipe (I can dream? :-), then
we should be able to refine most of this down and remove it.

v2: port to later radeonsi code for vertex inputs and sampler/io parsing.

Acked-by: Roland Scheidegger <sroland@vmware.com>
4 years agogallivm: split out the flow control ir to a common file.
Dave Airlie [Thu, 5 Sep 2019 05:34:46 +0000 (15:34 +1000)]
gallivm: split out the flow control ir to a common file.

We can share a bunch of flow control handling between NIR and TGSI.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
4 years agoradeonsi: enable SPIR-V and GL 4.6 for NIR
Marek Olšák [Wed, 6 Nov 2019 23:03:30 +0000 (18:03 -0500)]
radeonsi: enable SPIR-V and GL 4.6 for NIR

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agoradeonsi/nir: support interface output types to fix SPIR-V xfb piglits
Marek Olšák [Thu, 7 Nov 2019 01:50:26 +0000 (20:50 -0500)]
radeonsi/nir: support interface output types to fix SPIR-V xfb piglits

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agoradeonsi/nir: fix location_frac handling for TCS outputs
Marek Olšák [Thu, 7 Nov 2019 01:19:17 +0000 (20:19 -0500)]
radeonsi/nir: fix location_frac handling for TCS outputs

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agoradeonsi/nir: don't rely on data.patch for tess factors
Marek Olšák [Thu, 7 Nov 2019 01:18:23 +0000 (20:18 -0500)]
radeonsi/nir: don't rely on data.patch for tess factors

GLCTS SPIR-V tests have this issue.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agoradeonsi/nir: validate is_patch because SPIR-V doesn't set it for tess factors
Marek Olšák [Thu, 7 Nov 2019 01:12:40 +0000 (20:12 -0500)]
radeonsi/nir: validate is_patch because SPIR-V doesn't set it for tess factors

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agoradeonsi: simplify get_tcs_tes_buffer_address_from_generic_indices
Marek Olšák [Thu, 7 Nov 2019 00:48:34 +0000 (19:48 -0500)]
radeonsi: simplify get_tcs_tes_buffer_address_from_generic_indices

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agoradeonsi: simplify the interface of get_dw_address_from_generic_indices
Marek Olšák [Thu, 7 Nov 2019 00:40:23 +0000 (19:40 -0500)]
radeonsi: simplify the interface of get_dw_address_from_generic_indices

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agoradeonsi/nir: implement subgroup system values for SPIR-V
Marek Olšák [Thu, 7 Nov 2019 00:06:09 +0000 (19:06 -0500)]
radeonsi/nir: implement subgroup system values for SPIR-V

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agoac/nir: don't rely on data.patch for tess factors
Marek Olšák [Thu, 7 Nov 2019 01:19:10 +0000 (20:19 -0500)]
ac/nir: don't rely on data.patch for tess factors

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
4 years agodrirc: Set vs_position_always_invariant for Shadow of Mordor on Intel
Kenneth Graunke [Fri, 22 Nov 2019 09:37:02 +0000 (01:37 -0800)]
drirc: Set vs_position_always_invariant for Shadow of Mordor on Intel

When drawing the main character in Shadow of Mordor, the game appears
to draw Talion with one vertex shader, and the Wraith with another.
If the compiler optimizes those in different ways which lead to slight
imprecisions, then the resulting positions may not line up, leading to
Z-fighting occurring as the game decides which of the two are in front.

brw_nir_opt_peephole_ffma looks at usages of multiply adds across the
entire shader, and may make different decisions between the two, leading
to such imprecisions and Z-fighting.  This started happening recently
after a NIR change to eliminate unnecessary MOVs (7025dbe7), but that
change simply exposed the existing problem.

Improves performance on Skylake GT4e by 1.22945% +/- 0.398672% (n=3),
likely due to the fixed rendering.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1985
Fixes: 7025dbe794b ("nir: Skip emitting no-op movs from the builder.")
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
4 years agodriconf, glsl: Add a vs_position_always_invariant option
Kenneth Graunke [Fri, 22 Nov 2019 00:11:15 +0000 (16:11 -0800)]
driconf, glsl: Add a vs_position_always_invariant option

Many applications use multi-pass rendering and require their vertex
shader position to be computed the same way each time.  Optimizations
may consider, say, fusing a multiply-add based on global usage of an
expression in a shader.  But a second shader with the same expression
may have different code, causing that optimization to make the other
choice the second time around.

The correct solution is for applications to mark their VS outputs
'invariant', indicating they need multiple shaders to compute that
output in the same manner.  However, most applications fail to do so.

So, we add a new driconf option - vs_position_always_invariant - which
forces the gl_Position output in vertex shaders to be marked invariant.

Fixes: 7025dbe794b ("nir: Skip emitting no-op movs from the builder.")
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
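
The effect of fusing a multiply-add in one shader but not in another can be reproduced on the CPU. The sketch below is an illustration only: both function names are hypothetical, and the fused variant is emulated in double precision rather than using a hardware FMA. For the inputs in the usage note every double-precision step is exact, so the emulation matches a true fused multiply-add there.

```c
/* a*b + c with the product rounded to float before the add,
 * as an unfused multiply followed by an add would do. */
float
mul_then_add(float a, float b, float c)
{
   volatile float product = a * b;   /* volatile forces rounding to float */
   return product + c;
}

/* a*b + c without rounding the product to float first. The product of two
 * floats is exact in double; the sum is rounded to double and then to
 * float. */
float
fused_mul_add(float a, float b, float c)
{
   return (float)((double)a * (double)b + (double)c);
}
```

With a = b = 1 + 2^-12 and c = -(1 + 2^-11), the exact product is 1 + 2^-11 + 2^-24. Rounding it to float drops the 2^-24 term, so mul_then_add() returns 0, while fused_mul_add() returns 2^-24. Two shaders that compute the same position expression in these two ways would disagree in the same manner, which is the mismatch the option avoids by forcing gl_Position to be invariant.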
4 years agoturnip: Disable timestamp queries for now.
Eric Anholt [Wed, 27 Nov 2019 00:16:05 +0000 (16:16 -0800)]
turnip: Disable timestamp queries for now.

They're not implemented, and not critical to bring up immediately.  Avoids
failures in the CTS when nothing gets written to the query.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agofreedreno/perfcntrs/fdperf: add missing a2xx case in select_counter
Jonathan Marek [Wed, 27 Nov 2019 15:46:22 +0000 (10:46 -0500)]
freedreno/perfcntrs/fdperf: add missing a2xx case in select_counter

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno/perfcntrs/fdperf: add missing a20x compatible
Jonathan Marek [Wed, 27 Nov 2019 15:45:41 +0000 (10:45 -0500)]
freedreno/perfcntrs/fdperf: add missing a20x compatible

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno/perfcntrs/fdperf: fix u64 print on 32-bit builds
Jonathan Marek [Wed, 27 Nov 2019 15:44:57 +0000 (10:44 -0500)]
freedreno/perfcntrs/fdperf: fix u64 print on 32-bit builds

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno/perfcntrs: add a2xx MH counters
Jonathan Marek [Wed, 27 Nov 2019 15:40:59 +0000 (10:40 -0500)]
freedreno/perfcntrs: add a2xx MH counters

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno/registers: add missing MH perfcounter enum for a2xx
Jonathan Marek [Wed, 27 Nov 2019 15:38:14 +0000 (10:38 -0500)]
freedreno/registers: add missing MH perfcounter enum for a2xx

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agogitlab-ci: Put HTML summary in artifacts for failed piglit jobs
Michel Dänzer [Mon, 25 Nov 2019 17:42:10 +0000 (18:42 +0100)]
gitlab-ci: Put HTML summary in artifacts for failed piglit jobs

This will make it easier to look at details of failed / skipped tests.

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agogitlab-ci: Stop storing piglit test results as JUnit
Michel Dänzer [Tue, 26 Nov 2019 15:27:07 +0000 (16:27 +0100)]
gitlab-ci: Stop storing piglit test results as JUnit

Since we're not reporting test results as JUnit anymore, we can use the
default JSON format.

This affects how test results are summarized, so update the reference files
accordingly.

Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agogitlab-ci: Stop reporting piglit test results via JUnit
Michel Dänzer [Tue, 26 Nov 2019 16:44:49 +0000 (17:44 +0100)]
gitlab-ci: Stop reporting piglit test results via JUnit

It was basically useless in this form, and processing the JUnit data in
the GitLab backend was pretty expensive.

Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agov3d: fix indirect BO allocation for uniforms
Iago Toral Quiroga [Tue, 26 Nov 2019 14:28:52 +0000 (15:28 +0100)]
v3d: fix indirect BO allocation for uniforms

We were always ensuring a minimum size of 4 bytes for uniforms
for the case where we don't have any, to account for hardware pre-fetching
of the uniform stream. However, pre-fetching could also lead to
out-of-bounds reads when we have read the last uniform in the stream, so we
probably want to have the extra 4 bytes to prevent the kernel from
observing invalid memory accesses when the uniform stream sits right at
the end of a page.

This seems to fix MMU exceptions reported with a Linux 5.4 kernel.

Credit goes to Phil Elwell for identifying the problem and narrowing
it down to memory accesses in the uniform stream.

Reported-by: Phil Elwell <phil@raspberrypi.org>
Tested-by: Phil Elwell <phil@raspberrypi.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
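
The sizing change described above can be sketched as two small functions. This is an assumption-laden illustration, not the v3d code: the function names are hypothetical, and it assumes each uniform occupies 4 bytes in the stream.

```c
#include <stdint.h>

/* Old behaviour as described in the commit: only guarantee a minimum of
 * 4 bytes, which covers the "no uniforms" case but nothing else. */
uint32_t
uniform_stream_size_min4(uint32_t num_uniforms)
{
   uint32_t size = num_uniforms * 4;
   return size < 4 ? 4 : size;
}

/* Padded behaviour: always reserve 4 extra bytes after the last uniform so
 * a prefetch past the end still lands inside the allocation. */
uint32_t
uniform_stream_size_padded(uint32_t num_uniforms)
{
   return num_uniforms * 4 + 4;
}
```

Both variants return 4 bytes for an empty stream. They differ once uniforms are present: three uniforms give 12 bytes with the minimum-only rule and 16 bytes with padding.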
4 years agoradv: enable VK_KHR_shader_subgroup_extended_types on GFX10
Samuel Pitoiset [Mon, 25 Nov 2019 15:53:54 +0000 (16:53 +0100)]
radv: enable VK_KHR_shader_subgroup_extended_types on GFX10

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoac: add 8-bit and 16-bit supports to ac_build_permlane16()
Samuel Pitoiset [Mon, 25 Nov 2019 16:02:44 +0000 (17:02 +0100)]
ac: add 8-bit and 16-bit supports to ac_build_permlane16()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: fix implementation of exclusive scans
Samuel Pitoiset [Fri, 23 Aug 2019 15:53:05 +0000 (17:53 +0200)]
radv/gfx10: fix implementation of exclusive scans

This implementation is loosely based on ROCm.
https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl

This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive* on GFX10.
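
For reference, an exclusive scan gives each lane the reduction of all
lower lanes, with lane 0 receiving the identity. The scalar sketch below
shows that semantics for addition only. It is not the GFX10 lane-permute
implementation, and the function name is hypothetical:

```C
#include <stddef.h>
#include <stdint.h>

/* Exclusive add-scan: out[i] is the sum of in[0..i-1], and out[0] is
 * the additive identity (0). */
static void
exclusive_add_scan(const uint32_t *in, uint32_t *out, size_t lanes)
{
   uint32_t acc = 0;
   for (size_t i = 0; i < lanes; i++) {
      out[i] = acc;   /* value accumulated from the lower lanes only */
      acc += in[i];
   }
}
```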

Fixes: 227c29a80de ("amd/common/gfx10: implement scan & reduce operations")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv: fix enabling sample shading with SampleID/SamplePosition
Samuel Pitoiset [Tue, 26 Nov 2019 17:29:00 +0000 (18:29 +0100)]
radv: fix enabling sample shading with SampleID/SamplePosition

When a fragment shader includes an input variable decorated with
SampleId or SamplePosition, sample shading should be enabled
because minSampleShadingFactor is expected to be 1.0.
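
A sketch of the rule, assuming a hypothetical fs_info struct that
records which built-ins the fragment shader reads. None of these names
come from RADV:

```C
#include <stdbool.h>

/* Hypothetical summary of the fragment shader's inputs. */
struct fs_info {
   bool reads_sample_id;
   bool reads_sample_pos;
};

/* Returns the minimum sample shading factor to use.  Reading SampleId
 * or SamplePosition forces per-sample shading (factor 1.0) regardless
 * of the pipeline's own setting. */
static float
effective_min_sample_shading(const struct fs_info *fs,
                             bool sample_shading_enable,
                             float min_sample_shading)
{
   if (fs->reads_sample_id || fs->reads_sample_pos)
      return 1.0f;
   return sample_shading_enable ? min_sample_shading : 0.0f;
}
```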

Cc: 19.2, 19.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoturnip: fix integer render targets
Jonathan Marek [Tue, 19 Nov 2019 04:01:18 +0000 (23:01 -0500)]
turnip: fix integer render targets

Add missing required bits.  Fixes at least:

dEQP-VK.pipeline.render_to_image.dedicated_allocation.1d.small.r16g16_sint_d24_unorm_s8_uint
dEQP-VK.pipeline.render_to_image.dedicated_allocation.2d.mipmap.r16g16_sint_d24_unorm_s8_uint
dEQP-VK.renderpass.dedicated_allocation.attachment.4.401
dEQP-VK.renderpass2.suballocation.formats.r16_uint.load.draw
dEQP-VK.synchronization.op.single_queue.barrier.write_draw_read_copy_image_to_buffer.image_128x128_r16_uint

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agoanv: Push constants are relative to dynamic state on IVB
Jason Ekstrand [Mon, 25 Nov 2019 17:05:42 +0000 (11:05 -0600)]
anv: Push constants are relative to dynamic state on IVB

Fixes: aecde2351 "anv: Pre-compute push ranges for graphics pipelines"
Closes: #2136
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agomeson: Add -Werror=gnu-empty-initializer to MSVC compat args
Dylan Baker [Thu, 21 Nov 2019 17:11:45 +0000 (09:11 -0800)]
meson: Add -Werror=gnu-empty-initializer to MSVC compat args

Only clang has this argument (at least as of clang 8 and gcc 9), which
errors when using the gcc empty initializer syntax in C:

```C
struct foo f = {};
```

GCC has a warning for this, but only when using -Wpedantic, which is a
lot of noise to lose useful warnings in.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
4 years agogallium/auxiliary: Fix uses of gnu struct = {} extension
Dylan Baker [Thu, 21 Nov 2019 17:50:27 +0000 (09:50 -0800)]
gallium/auxiliary: Fix uses of gnu struct = {} extension

Most of these will never actually be compiled on Windows, but in the
interest of being able to make using struct foo = {}; an error and
avoiding breaking Windows, removing a handful of safe uses seems like a
good trade-off.
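
One portable replacement is the ISO C form with an explicit first
initializer. This is only a sketch: the struct and function are
hypothetical, and the actual changes in this commit may differ:

```C
#include <stddef.h>

/* Hypothetical struct, for illustration only. */
struct foo {
   int a;
   float b;
   void *p;
};

static struct foo
make_zeroed_foo(void)
{
   /* ISO C: initializing the first member explicitly leaves every
    * remaining member zero-initialized, with no GNU extension needed. */
   struct foo f = {0};
   return f;
}
```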

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
4 years agost/mesa: add st_variant base class to simplify code for shader variants
Marek Olšák [Mon, 25 Nov 2019 22:58:45 +0000 (17:58 -0500)]
st/mesa: add st_variant base class to simplify code for shader variants

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
4 years agost/mesa: don't use ** in the st_nir_link_shaders signature
Marek Olšák [Sat, 23 Nov 2019 00:31:46 +0000 (19:31 -0500)]
st/mesa: don't use ** in the st_nir_link_shaders signature

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
4 years agost/mesa: simplify looping over linked shaders when linking NIR
Marek Olšák [Tue, 19 Nov 2019 22:30:03 +0000 (17:30 -0500)]
st/mesa: simplify looping over linked shaders when linking NIR

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
4 years agost/mesa: propagate gl_PatchVerticesIn from TCS to TES before linking for NIR
Marek Olšák [Wed, 13 Nov 2019 04:48:02 +0000 (23:48 -0500)]
st/mesa: propagate gl_PatchVerticesIn from TCS to TES before linking for NIR

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
4 years agost/mesa: don't call ProgramStringNotify in glsl_to_nir
Marek Olšák [Wed, 13 Nov 2019 04:46:37 +0000 (23:46 -0500)]
st/mesa: don't call ProgramStringNotify in glsl_to_nir

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
4 years agost/mesa: don't use redundant stp->state.ir.nir
Marek Olšák [Thu, 21 Nov 2019 00:18:21 +0000 (19:18 -0500)]
st/mesa: don't use redundant stp->state.ir.nir

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
4 years agost/mesa: don't serialize all streamout state if there are no SO outputs
Marek Olšák [Mon, 25 Nov 2019 22:01:42 +0000 (17:01 -0500)]
st/mesa: don't serialize all streamout state if there are no SO outputs

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
4 years agoiris: Disable VF cache partial address workaround on Gen11+
Kenneth Graunke [Mon, 25 Nov 2019 18:04:38 +0000 (10:04 -0800)]
iris: Disable VF cache partial address workaround on Gen11+

The vertex cache uses the full 48-bit address on Gen11+.  See the
documentation for 3DSTATE_VERTEX_BUFFERS, which describes the
workaround and lists it as pre-Icelake.
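
A sketch of the resulting check, under the assumption that the pre-Gen11
cache only considers the low 32 bits of the address, so two bindings
that differ in the upper bits can alias. The helper name and parameters
are hypothetical and are not iris code:

```C
#include <stdbool.h>
#include <stdint.h>

/* Returns true when rebinding a vertex buffer from old_addr to new_addr
 * needs a VF cache invalidate.  Assumption: before Gen11 the cache key
 * uses only the low 32 address bits, so a change in the upper bits can
 * alias.  Gen11+ uses the full 48-bit address and needs nothing. */
static bool
vf_cache_needs_invalidate(int gen, uint64_t old_addr, uint64_t new_addr)
{
   if (gen >= 11)
      return false;
   return (old_addr >> 32) != (new_addr >> 32);
}
```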

Interestingly, the docs don't mention index buffers as needing a
workaround at all.  So either we've been overzealous, or the docs
never got updated to record that.  Which begs the question of whether
the issue there was fixed, if there was one...

Cuts 40% of the PIPE_CONTROLs from Civilization VI's benchmark; it
appears to improve performance by about 1-2% on Icelake 8x8 (not
frequency locked).

4 years agofreedreno: switch to layout helper
Rob Clark [Sun, 5 May 2019 17:59:37 +0000 (10:59 -0700)]
freedreno: switch to layout helper

The slices table and most of the other layout fields in the
freedreno_resource move into fdl_layout.

v2: Changes by anholt to not have duplicate fields, which was introducing
    a surprising behavior change in resource layout (using the
    level_linear helper before the setup of the shadowed fields)

Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno/a6xx: Log the tiling mode in resource layout debug.
Eric Anholt [Thu, 21 Nov 2019 04:54:27 +0000 (20:54 -0800)]
freedreno/a6xx: Log the tiling mode in resource layout debug.

This was important for figuring out what went wrong with the layout
refactor.

Acked-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno: Convert the slice struct to the new resource header.
Eric Anholt [Wed, 20 Nov 2019 20:40:25 +0000 (12:40 -0800)]
freedreno: Convert the slice struct to the new resource header.

This gets the worst of the sed required for shared resource layout out of
the way.  The texture layout comment is dropped now that we're referencing
the shared header, which has a more complete description.

Acked-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno: Introduce a resource layout header.
Eric Anholt [Wed, 20 Nov 2019 20:28:43 +0000 (12:28 -0800)]
freedreno: Introduce a resource layout header.

This will be used for sharing resource layout code between freedreno and
tu.  Mostly copied from a commit by Rob, with a new location and the slice
struct renamed for consistency.

Acked-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno: Introduce a fd_resource_tile_mode() helper.
Eric Anholt [Wed, 20 Nov 2019 21:17:27 +0000 (13:17 -0800)]
freedreno: Introduce a fd_resource_tile_mode() helper.

Multiple places were doing the same thing to get the tile mode of a level,
so refactor it out.  This will make the shared resource helper transition
cleaner.

Acked-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno: Introduce a fd_resource_layer_stride() helper.
Eric Anholt [Wed, 20 Nov 2019 20:55:56 +0000 (12:55 -0800)]
freedreno: Introduce a fd_resource_layer_stride() helper.

This factors out a bit of duplicated code, but will also make the shared
resource layout transition process clearer.
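
The shape of such a helper might look like the sketch below. The demo_*
structs and their fields are hypothetical stand-ins, not the freedreno
definitions:

```C
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-miplevel slice: size of one layer at that level. */
struct demo_slice {
   uint32_t size0;
};

/* Hypothetical resource with two possible array layouts. */
struct demo_resource {
   bool layer_first;      /* true: whole layers are stored back to back */
   uint32_t layer_size;   /* size of a full layer when layer_first is set */
   struct demo_slice slices[4];
};

/* One place to answer "how far apart are two layers at this level?",
 * instead of repeating the conditional at every call site. */
static inline uint32_t
demo_layer_stride(const struct demo_resource *rsc, unsigned level)
{
   return rsc->layer_first ? rsc->layer_size : rsc->slices[level].size0;
}
```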

Acked-by: Rob Clark <robdclark@chromium.org>