mesa.git
Alexandros Frantzis [Wed, 12 Jun 2019 07:30:26 +0000 (10:30 +0300)]
virgl: Use virgl_resource_cache in the drm winsys

Replace the cache implementation in the drm winsys with
virgl_resource_cache.

Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Alexandros Frantzis [Wed, 12 Jun 2019 07:29:32 +0000 (10:29 +0300)]
virgl: Introduce virgl_resource_cache

Introduce a resource cache implementation that can be used by any virgl
winsys backend.

Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Haihao Xiang [Wed, 12 Jun 2019 08:28:58 +0000 (16:28 +0800)]
i965: support UYVY for external import only

It is similar to YUYV.

Fixes: 165e704719b85c ("i965/i915: Add UYVY as the supported format")
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Neil Roberts [Mon, 22 Apr 2019 14:33:38 +0000 (16:33 +0200)]
glsl: Set default precision on record members

Record types have their own slot to store the precision for each
member in glsl_struct_field. Previously if the member didn’t have an
explicit precision qualifier this was being left as
GLSL_PRECISION_NONE. This patch makes it take into account the type’s
default precision qualifier like it does for regular variables in
apply_type_qualifier_to_variable.

This has the additional benefit of correctly reporting an error when a
float type is used in a struct without declaring the default type.

Reviewed-by: Eric Anholt <eric@anholt.net>
Neil Roberts [Tue, 23 Apr 2019 14:52:36 +0000 (16:52 +0200)]
glsl/linker: Make precision matching optional in intrastage_match

Confusingly, this function is used to match interstage interfaces as
well as intrastage ones. In the interstage case it needs to avoid
comparing the precisions. This patch adds a parameter to specify
whether to take the precision into account or not so that it can be
used for both cases.

Reviewed-by: Eric Anholt <eric@anholt.net>
Neil Roberts [Tue, 23 Apr 2019 13:19:35 +0000 (15:19 +0200)]
glsl/linker: Don’t check precision for shader interface

On GLES, the interface between vertex and fragment shaders doesn’t
need to have matching precision.

Section 4.3.10 of the GLSL ES 3.00 spec:

“The type of vertex outputs and fragment inputs with the same name
 must match, otherwise the link command will fail. The precision does
 not need to match.”

Reviewed-by: Eric Anholt <eric@anholt.net>
Neil Roberts [Wed, 24 Apr 2019 10:28:51 +0000 (12:28 +0200)]
compiler/types: Making comparing record precision optional

On GLES, the interface between vertex and fragment shaders doesn’t
need to have matching precision. This adds an extra argument to
glsl_types::record_compare to disable the precision comparison. This
will later be used for the shader interface check.

In order to make this work this patch also adds a helper function to
recursively compare types while ignoring the precision.

v2: Call record_compare from within compare_no_precision to avoid
    duplicating code (Eric Anholt).

Reviewed-by: Eric Anholt <eric@anholt.net>
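
The recursive comparison described in "compiler/types: Making comparing record precision optional" can be sketched roughly as follows. This is a minimal illustration in C, not Mesa's code: the `example_type` and `example_field` structs, the enums, and `example_type_compare` are hypothetical names, and field names are not compared in order to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified type model for illustration only. */
enum example_precision { EX_PRECISION_NONE, EX_PRECISION_HIGH, EX_PRECISION_MEDIUM };
enum example_base { EX_BASE_FLOAT, EX_BASE_INT, EX_BASE_STRUCT };

struct example_type;

struct example_field {
   const struct example_type *type;
   enum example_precision precision; /* per-member precision slot */
};

struct example_type {
   enum example_base base;
   unsigned num_fields;               /* 0 for non-struct types */
   const struct example_field *fields;
};

/* Recursively compare two types. When match_precision is false, the
 * per-member precision is ignored at every nesting level. */
static bool
example_type_compare(const struct example_type *a,
                     const struct example_type *b,
                     bool match_precision)
{
   assert(a != NULL && b != NULL);

   if (a == b)
      return true;
   if (a->base != b->base || a->num_fields != b->num_fields)
      return false;

   for (unsigned i = 0; i < a->num_fields; i++) {
      if (match_precision &&
          a->fields[i].precision != b->fields[i].precision)
         return false;
      if (!example_type_compare(a->fields[i].type, b->fields[i].type,
                                match_precision))
         return false;
   }
   return true;
}
```

With a single entry point taking a flag, the precision-aware and precision-agnostic paths share one traversal, which is the duplication the v2 note asks to avoid.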
Lucas Stach [Fri, 31 May 2019 16:05:12 +0000 (18:05 +0200)]
etnaviv: fix some pm query issues

The offsets to read the query results were off by one, which caused the
counters to report bogus increasing values.

Also the counter result is u32, so we need to initialize the query type
to reflect that.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Iago Toral Quiroga [Thu, 13 Jun 2019 09:38:45 +0000 (11:38 +0200)]
v3d: do not setup execute flags for else block in uniform control flow

Either all channels execute the 'then' block, in which case all
channels jump directly to the 'endif' block at the end of the
'then' block, or all channels execute the 'else' block (so no
execution masking is necessary).

Shader-db results:

total instructions in shared programs: 9119238 -> 9117550 (-0.02%)
instructions in affected programs: 401252 -> 399564 (-0.42%)
helped: 855
HURT: 77

total uniforms in shared programs: 3022622 -> 3022605 (<.01%)
uniforms in affected programs: 3566 -> 3549 (-0.48%)
helped: 17
HURT: 0

total max-temps in shared programs: 1327762 -> 1327774 (<.01%)
max-temps in affected programs: 619 -> 631 (1.94%)
helped: 2
HURT: 15

Reviewed-by: Eric Anholt <eric@anholt.net>
Iago Toral Quiroga [Wed, 12 Jun 2019 11:57:03 +0000 (13:57 +0200)]
nir: detect more dynamically uniform expressions

Shader-db results for v3d:

total instructions in shared programs: 9132728 -> 9119238 (-0.15%)
instructions in affected programs: 596886 -> 583396 (-2.26%)
helped: 1118
HURT: 224

total threads in shared programs: 234298 -> 234308 (<.01%)
threads in affected programs: 10 -> 20 (100.00%)
helped: 5
HURT: 0

total uniforms in shared programs: 3022949 -> 3022622 (-0.01%)
uniforms in affected programs: 29163 -> 28836 (-1.12%)
helped: 108
HURT: 37

total max-temps in shared programs: 1328030 -> 1327762 (-0.02%)
max-temps in affected programs: 10097 -> 9829 (-2.65%)
helped: 263
HURT: 15

total spills in shared programs: 3793 -> 3777 (-0.42%)
spills in affected programs: 432 -> 416 (-3.70%)
helped: 16
HURT: 0

total fills in shared programs: 4380 -> 4266 (-2.60%)
fills in affected programs: 828 -> 714 (-13.77%)
helped: 16
HURT: 0

Reviewed-by: Eric Anholt <eric@anholt.net>
Tapani Pälli [Thu, 13 Jun 2019 09:58:04 +0000 (12:58 +0300)]
ir3: initialize progress false before ir3_nir_lower_imul

Removes a compiler warning about an uninitialized variable.

Fixes: c02ffd2700c "ir3: Use the new NIR lowering pass for integer multiplication"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rob Clark <robclark@gmail.com>
Reviewed-by: Eduardo Lima <elima@igalia.com>
Boris Brezillon [Thu, 13 Jun 2019 12:56:02 +0000 (14:56 +0200)]
panfrost: Fix general purpose varying handling

When both the fragment and vertex shaders point to the same varying
location they expect to share the same varying slot.
Make sure vertex and fragment varyings pointing to the same loc have
->src_offset set to the same value.

[Alyssa: In addition to a patch implementing txs, this fixes GALLIUM_HUD on
Panfrost]

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Marek Olšák [Fri, 7 Jun 2019 20:22:29 +0000 (16:22 -0400)]
ac/registers: use better names for disambiguated definitions

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Fri, 7 Jun 2019 19:56:42 +0000 (15:56 -0400)]
ac/registers: remove deprecated/inapplicable definitions

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Caio Marcelo de Oliveira Filho [Wed, 12 Jun 2019 03:07:32 +0000 (20:07 -0700)]
iris: Enable INTEL_shader_atomic_float_minmax

Supported only for gen >= 9.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Caio Marcelo de Oliveira Filho [Wed, 12 Jun 2019 03:06:41 +0000 (20:06 -0700)]
gallium: Add PIPE_CAP_ATOMIC_FLOAT_MINMAX

Used to enable INTEL_shader_atomic_float_minmax.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Rob Clark [Tue, 11 Jun 2019 12:16:21 +0000 (05:16 -0700)]
freedreno/a6xx: fix MAX_INDICES

Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Wed, 12 Jun 2019 18:24:55 +0000 (11:24 -0700)]
freedreno/blitter: remove dead code

The src/dst format is overridden from the pipe_blit_info, so this
logic just serves to confuse the reader.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Wed, 12 Jun 2019 20:36:18 +0000 (13:36 -0700)]
freedreno: turn staging cube into 2d-array

We may only need a subset of the layers, and otherwise we
trigger an assert in util_max_layer().

Signed-off-by: Rob Clark <robdclark@chromium.org>
Tomeu Vizoso [Thu, 13 Jun 2019 13:00:35 +0000 (15:00 +0200)]
panfrost: ci: Exclude some tests from results

These are tests that regressed in RK3288 but still pass on RK3399.

So that we still have a CI we can rely on, add them to the flip-flop list
for now.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Tomeu Vizoso [Thu, 13 Jun 2019 10:08:38 +0000 (12:08 +0200)]
panfrost: ci: Update test expectations

Some tests got fixed since the last update, but also some regressions
crept in.

To keep the CI green, add the regressions to the expected failures.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Connor Abbott [Thu, 13 Jun 2019 14:48:41 +0000 (16:48 +0200)]
nir: Don't manually index intrinsic index enum

This fixes a rebase fail in ea51275e07b, and prevents it from happening
again. There's no reason to do this manually.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
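
As an illustration of the point made in "nir: Don't manually index intrinsic index enum", the sketch below lets the compiler number the enumerators, so inserting or reordering an entry during a rebase cannot leave two entries with the same value. The enum and all of its member names are hypothetical and do not reproduce NIR's actual definitions.

```c
#include <assert.h>

/* No explicit "= N" values: the compiler assigns 0, 1, 2, ... in order. */
typedef enum {
   EXAMPLE_INDEX_BASE,
   EXAMPLE_INDEX_WRITE_MASK,
   EXAMPLE_INDEX_STREAM_ID,
   EXAMPLE_INDEX_ALIGN,
   EXAMPLE_NUM_INDEX_FLAGS, /* always one past the last real entry */
} example_index_flag;

/* Designated initializers keep the table in sync with the enum even if
 * the enumerators are reordered. */
static const char *const example_index_names[EXAMPLE_NUM_INDEX_FLAGS] = {
   [EXAMPLE_INDEX_BASE] = "base",
   [EXAMPLE_INDEX_WRITE_MASK] = "write_mask",
   [EXAMPLE_INDEX_STREAM_ID] = "stream_id",
   [EXAMPLE_INDEX_ALIGN] = "align",
};

static const char *
example_index_name(example_index_flag flag)
{
   assert(flag < EXAMPLE_NUM_INDEX_FLAGS);
   return example_index_names[flag];
}
```

The trailing count enumerator also updates itself, so arrays sized by it never need a manual edit.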
Erik Faye-Lund [Tue, 11 Jun 2019 09:28:11 +0000 (11:28 +0200)]
docs: work around broken altsoftware.com link

altsoftware.com seems to no longer be around, and is currently being
held by a domain squatter. Let's link to waybackmachine instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 11 Jun 2019 08:11:31 +0000 (10:11 +0200)]
docs: work around broken dsbox.com link

dsbox.com now forwards to haystax.com, which is technically unrelated to
this link. Let's link to waybackmachine instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 11 Jun 2019 08:05:51 +0000 (10:05 +0200)]
docs: work around broken sgi.com links

sgi.com now forwards to hpe.com, which is technically unrelated to these
links. Let's link to waybackmachine instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 11 Jun 2019 08:20:12 +0000 (10:20 +0200)]
docs: update link to OpenGL FAQ

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 11 Jun 2019 08:18:39 +0000 (10:18 +0200)]
docs: update link to the Linux OpenGL ABI

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 11 Jun 2019 08:16:32 +0000 (10:16 +0200)]
docs: update link to glw

GLW currently lives on GitLab; the cgit page is just a mirror.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Tue, 11 Jun 2019 07:32:08 +0000 (09:32 +0200)]
docs: fixup link-target

Just a couple of lines above, we have this exact same link, but this
time with a leading "www.". Let's match that.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Mon, 10 Jun 2019 18:57:37 +0000 (20:57 +0200)]
docs: eliminate another stale autoconf-reference

Meson is what should tell you about these issues, not the configure
script. We no longer have that.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Mon, 10 Jun 2019 18:55:16 +0000 (20:55 +0200)]
docs: replace autoconf with meson

We no longer have an autoconf build-system to maintain, but we do have a
meson build-system. So let's mention that instead.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Mon, 10 Jun 2019 18:52:16 +0000 (20:52 +0200)]
docs: update required packages

Automake and libtool are no longer required to build, instead we need
meson and ninja-build.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Thu, 6 Jun 2019 09:43:34 +0000 (11:43 +0200)]
docs: remove pointless haiku-comment

The only build system that doesn't support Haiku is `Android.mk`,
which doesn't support most other platforms either, so there is
no need to single it out.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Thu, 6 Jun 2019 08:33:09 +0000 (10:33 +0200)]
docs: fixup typo

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Daniel Schürmann [Wed, 9 May 2018 18:43:16 +0000 (20:43 +0200)]
radv: enable AMD_shader_ballot with RADV_PERFTEST_SHADER_BALLOT ('shader_ballot')

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Daniel Schürmann [Wed, 9 May 2018 18:42:09 +0000 (20:42 +0200)]
amd/common: add support for AMD_shader_ballot functions

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Daniel Schürmann [Wed, 9 May 2018 18:41:23 +0000 (20:41 +0200)]
spirv/nir: add support for AMD_shader_ballot and Groups capability

This commit also renames existing AMD capabilities:
 - gcn_shader -> amd_gcn_shader
 - trinary_minmax -> amd_trinary_minmax

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Daniel Schürmann [Wed, 9 May 2018 18:37:24 +0000 (20:37 +0200)]
nir: add intrinsics for AMD_shader_ballot

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Daniel Schürmann [Thu, 8 Mar 2018 15:25:20 +0000 (16:25 +0100)]
radv: enable shader_subgroup_vote & shader_subgroup_ballot extensions

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Daniel Schürmann [Fri, 9 Mar 2018 09:55:15 +0000 (10:55 +0100)]
nir/spirv: add support for the SubgroupBallotKHR SPIR-V capability

This capability is required for the VK_EXT_shader_subgroup_ballot extension.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Daniel Schürmann [Fri, 9 Mar 2018 09:27:20 +0000 (10:27 +0100)]
nir/spirv: add support for the SubgroupVoteKHR SPIR-V capability

This capability is required for the VK_EXT_shader_subgroup_vote extension.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Alejandro Piñeiro [Wed, 12 Jun 2019 12:49:05 +0000 (14:49 +0200)]
v3d: fix checking twice auf flag

This seems to be a copy-and-paste error; it should check for auf/muf.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110902
Fixes: 8f065596d22ab000c53f "v3d: Add an optimization pass for redundant flags updates."
Reviewed-by: Eric Anholt <eric@anholt.net>
Samuel Pitoiset [Thu, 13 Jun 2019 08:52:02 +0000 (10:52 +0200)]
radv: flush and invalidate CB before resetting query pools on GFX9

We have to emit a CACHE_FLUSH_AND_INV_TS_EVENT to be sure all
prior GPU work is done. While we are at it, also flush and
invalidate DB.

This fixes the following CTS (when the small hint is disabled):
dEQP-VK.query_pool.statistics_query.reset_before_copy.*

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bas Nieuwenhuizen [Wed, 5 Jun 2019 12:34:23 +0000 (14:34 +0200)]
vl: Always enable drm winsys.

The dri2 winsys also uses libdrm (and you can only enable dri3 if
you enable dri2), and the drm winsys only requires libdrm.

So if any winsys is enabled you can also enable the drm winsys, and
since we always want at least one winsys we can always enable it.

I removed the check for the drm platform for VA and OMX since they
do not care anymore. Since we still check for one of r600g, nouveau
or radeonsi, we are guaranteed to still only enable it by default
in a configuration that requires libdrm anyway. So for people using
va=auto, we don't suddenly start requiring libdrm where we did not
before.

This supersedes "vl: Enable DRM by default.", which I pushed, but
rolled back because it used dep_libdrm before its definition.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Bas Nieuwenhuizen [Wed, 12 Jun 2019 22:57:16 +0000 (00:57 +0200)]
radv: Always disable DCC on shareable images.

We do not want it for performance reasons. We always have to disable DCC
when transferring to an external queue.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Wed, 12 Jun 2019 22:52:18 +0000 (00:52 +0200)]
radv: Skip transitions coming from external queue.

Transitions to external queue should do the transition & make sure
it works on all queues.

Fixes: 8ebc7dcb59a "radv: Allow fast clears with concurrent queue mask for some layouts."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Mateusz Krzak [Fri, 7 Jun 2019 22:33:30 +0000 (00:33 +0200)]
lima/ppir: change offset type to int

Offset doesn't need to be 64-bit. This fixes a compilation error
with 64-bit off_t.

Fixes: af0de6b9 lima/ppir: implement discard and discard_if
Suggested-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Mateusz Krzak <kszaquitto@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
Chia-I Wu [Wed, 15 May 2019 22:38:49 +0000 (15:38 -0700)]
virgl: virgl_transfer should own its virgl_resource

We should avoid having potentially dangling pointers to
pipe_resources in general.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Chia-I Wu [Wed, 15 May 2019 22:46:40 +0000 (15:46 -0700)]
virgl: pass virgl_context to transfer create/destroy

A pipe_transfer is a context object.  It is fine for the
constructor/destructor to have access to the context.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Chia-I Wu [Wed, 15 May 2019 22:52:34 +0000 (15:52 -0700)]
virgl: init transfer queue from virgl_context

A pipe_transfer is a context object.  It is fine for
virgl_transfer_queue to have access to the context.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Chia-I Wu [Wed, 15 May 2019 23:01:02 +0000 (16:01 -0700)]
virgl: clean up virgl_transfer_queue.h

Add header guard and forward declare structs.  Move virgl_resource.h
inclusion to the C file.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Nicolai Hähnle [Mon, 29 Apr 2019 14:02:29 +0000 (16:02 +0200)]
radeonsi: add radeonsi_debug_disassembly option

This dumps disassembly to the pipe_debug_callback together with shader
stats.

Can be used together with shader-db to get full disassembly of all shaders
in the database.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Mon, 29 Apr 2019 13:59:22 +0000 (15:59 +0200)]
radeonsi: fix line splitting in si_shader_dump_assembly

Compute the count since the start of the current line instead of the
count since the start of the disassembly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
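
The line-splitting fix in "radeonsi: fix line splitting in si_shader_dump_assembly" can be illustrated with a small sketch. It is not the radeonsi code: `example_split_lines` and its parameters are hypothetical, and the sketch only records line lengths instead of printing anything. The key detail is that each length is measured from the start of the current line.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Record the length of each '\n'-separated line in text[0..size).
 * Returns the number of lines written to lens (at most max_lines). */
static size_t
example_split_lines(const char *text, size_t size,
                    size_t *lens, size_t max_lines)
{
   assert(text != NULL && lens != NULL);

   size_t count = 0;
   size_t line_start = 0;

   while (line_start < size && count < max_lines) {
      const char *line = text + line_start;
      const char *nl = memchr(line, '\n', size - line_start);

      /* Measure from the start of this line. Using (nl - text) here
       * would count from the start of the whole buffer instead, which
       * is the kind of mistake the commit describes. */
      size_t len = nl ? (size_t)(nl - line) : size - line_start;

      lens[count++] = len;
      line_start += len + 1; /* skip the line and its newline */
   }
   return count;
}
```

For the first line both computations agree, so a bug of this kind only shows up from the second line onward.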
Nicolai Hähnle [Sat, 4 May 2019 10:37:36 +0000 (12:37 +0200)]
radeonsi: raise the alignment of LDS memory for compute shaders

This implies that the memory will always be at address 0, which allows
LLVM to generate slightly better code.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sat, 4 May 2019 10:34:52 +0000 (12:34 +0200)]
radeonsi: use an explicit symbol for the LSHS LDS memory

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sat, 4 May 2019 10:18:07 +0000 (12:18 +0200)]
radeonsi: rename lds_{load,store} to lshs_lds_{load,store}

These functions are now only used in LS/HS shaders (both separate and
merged).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Sat, 4 May 2019 10:11:08 +0000 (12:11 +0200)]
radeonsi/gfx9: declare LDS ESGS ring as an explicit symbol on LLVM >= 9

This will make it easier to use LDS for other purposes in geometry
shaders in the future.

The lifetime of the esgs_ring variable is as follows:
- declared as [0 x i32] while compiling shader parts or monolithic shaders
- just before uploading, gfx9_get_gs_info computes (among other things)
  the final ESGS ring size (this depends on both the ES and the GS shader)
- during upload, the "esgs_ring" symbol is given to ac_rtld as a shared
  LDS symbol, which will lead to correctly laying out the LDS including
  other LDS objects that may be defined in the future
- si_shader_gs uses shader->config.lds_size as the LDS size

This change depends on the LLVM changes for emitting LDS symbols into
the ELF file.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 3 May 2019 19:18:51 +0000 (21:18 +0200)]
amd/rtld: layout and relocate LDS symbols

Upcoming changes to LLVM will emit LDS objects as symbols in the ELF
symbol table, with relocations that will be resolved with this change.

Callers will also be able to define LDS symbols that are shared between
shader parts. This will be used by radeonsi for the ESGS ring in gfx9+
merged shaders.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 30 Nov 2018 10:50:07 +0000 (11:50 +0100)]
radeonsi: cleanup some #includes

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 4 Apr 2019 09:49:52 +0000 (11:49 +0200)]
amd/common: use ARRAY_SIZE for the LLVM command line options

This is more convenient for changing it around during debug.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
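
A brief sketch of why "amd/common: use ARRAY_SIZE for the LLVM command line options" makes the list easier to change: the element count follows the initializer automatically. The macro definition below is the common C idiom, written out so the example is self-contained. The option strings are placeholders, not real LLVM flags, and `example_fill_argv` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Placeholder strings only; these are NOT real LLVM options. */
static const char *const example_options[] = {
   "example-program",
   "-example-option-a",
   "-example-option-b",
};

/* Copy the options into argv and return how many were copied. Adding or
 * removing an entry above needs no change here. */
static size_t
example_fill_argv(const char **argv, size_t max)
{
   assert(max >= ARRAY_SIZE(example_options));

   for (size_t i = 0; i < ARRAY_SIZE(example_options); i++)
      argv[i] = example_options[i];
   return ARRAY_SIZE(example_options);
}
```

Without the macro, a hard-coded count has to be edited every time an option is added or commented out while debugging.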
Nicolai Hähnle [Fri, 3 May 2019 17:15:52 +0000 (19:15 +0200)]
radeonsi: inline si_shader_binary_read_config into its only caller

Since it can only be used for reading the config of an individual,
non-combined shader, it is not very reusable anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 22 May 2018 14:14:16 +0000 (16:14 +0200)]
radeonsi: use the new run-time linker for shaders

v2:
- fix a memory leak

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 6 Dec 2018 11:43:44 +0000 (12:43 +0100)]
radeonsi: don't declare pointers to static strings

The compiler should be able to optimize them away, but still. There's
no point in declaring those as pointers, and if the compiler *doesn't*
optimize them away, they add unnecessary load-time relocations.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 28 Nov 2018 11:46:45 +0000 (12:46 +0100)]
amd/common: add ac_compile_module_to_elf

A new variant of ac_compile_module_to_binary that allows us to
keep the entire ELF around.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 6 Dec 2018 11:35:36 +0000 (12:35 +0100)]
radeonsi: dump shader binary buffer contents

Help identify bugs related to corruption of shaders in memory,
or errors in shader upload / rtld.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Thu, 17 May 2018 16:17:07 +0000 (18:17 +0200)]
radeonsi: return bool from si_shader_binary_upload

We didn't really use error codes anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 28 Nov 2018 10:32:01 +0000 (11:32 +0100)]
radeonsi: let si_shader_create return a boolean

We didn't really use error codes anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 9 May 2018 14:38:33 +0000 (16:38 +0200)]
radeonsi: use ac_shader_config

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Fri, 4 May 2018 14:00:35 +0000 (16:00 +0200)]
amd/common: add a more powerful runtime linker

Using an explicit linker instead of just concatenating .text
sections will allow us to start using .rodata sections and
explicit descriptions of data on LDS that is shared between
stages.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Caio Marcelo de Oliveira Filho [Mon, 10 Jun 2019 21:23:34 +0000 (14:23 -0700)]
i965: Fix INTEL_DEBUG=bat

Use hash_table_u64 instead of hash_table directly, since the former
will also handle the special keys (deleted and freed) and allow use of
the whole u64 space.

Fixes a crash in INTEL_DEBUG=bat when using a key with value 0 -- the
current value for a freed key.

Fixes: b38dab101ca "util/hash_table: Assert that keys are not reserved pointers"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Caio Marcelo de Oliveira Filho [Mon, 10 Jun 2019 19:10:54 +0000 (12:10 -0700)]
util/hash_table: Properly handle the NULL key in hash_table_u64

The hash_table_u64 should support any uint64_t as input.  It does
special handling for the "deleted" key, storing the data in the table
itself; do the same for the "freed" key.

Fixes: b38dab101ca "util/hash_table: Assert that keys are not reserved pointers"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
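
The idea in "util/hash_table: Properly handle the NULL key in hash_table_u64" is to keep the data for reserved key values outside the slot array. A rough sketch under stated assumptions follows. This is not Mesa's hash_table_u64: the struct, the function names, the fixed table size, and the value 1 for the "deleted" marker are all assumptions made for illustration. Only the value 0 for the "freed" key comes from the text above. Removal is omitted to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_FREED_KEY   UINT64_C(0) /* marks an empty slot */
#define EXAMPLE_DELETED_KEY UINT64_C(1) /* assumed value for this sketch */
#define EXAMPLE_NUM_SLOTS   16u

struct example_u64_map {
   uint64_t keys[EXAMPLE_NUM_SLOTS];
   void *values[EXAMPLE_NUM_SLOTS];
   /* Data for the two keys that the slot array cannot represent. */
   void *freed_key_data;
   void *deleted_key_data;
};

static bool
example_u64_map_insert(struct example_u64_map *map, uint64_t key, void *data)
{
   assert(data != NULL); /* NULL is used to mean "absent" */

   if (key == EXAMPLE_FREED_KEY) {
      map->freed_key_data = data;
      return true;
   }
   if (key == EXAMPLE_DELETED_KEY) {
      map->deleted_key_data = data;
      return true;
   }
   /* Linear probing over the ordinary slots. */
   for (unsigned i = 0; i < EXAMPLE_NUM_SLOTS; i++) {
      unsigned slot = (unsigned)((key + i) % EXAMPLE_NUM_SLOTS);
      if (map->keys[slot] == key || map->keys[slot] == EXAMPLE_FREED_KEY) {
         map->keys[slot] = key;
         map->values[slot] = data;
         return true;
      }
   }
   return false; /* table full */
}

static void *
example_u64_map_search(const struct example_u64_map *map, uint64_t key)
{
   if (key == EXAMPLE_FREED_KEY)
      return map->freed_key_data;
   if (key == EXAMPLE_DELETED_KEY)
      return map->deleted_key_data;

   for (unsigned i = 0; i < EXAMPLE_NUM_SLOTS; i++) {
      unsigned slot = (unsigned)((key + i) % EXAMPLE_NUM_SLOTS);
      if (map->keys[slot] == key)
         return map->values[slot];
      if (map->keys[slot] == EXAMPLE_FREED_KEY)
         return NULL;
   }
   return NULL;
}
```

Because the reserved values never reach the slot array, callers can use any uint64_t as a key, including 0.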
Nicolai Hähnle [Wed, 23 May 2018 19:52:26 +0000 (21:52 +0200)]
amd/common: clarify ac_shader_binary::lds_size

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Tue, 22 May 2018 11:29:27 +0000 (13:29 +0200)]
amd/common: extract ac_parse_shader_binary_config

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Mon, 13 May 2019 14:58:08 +0000 (16:58 +0200)]
u_dynarray: turn util_dynarray_{grow, resize} into element-oriented macros

The main motivation for this change is API ergonomics: most operations
on dynarrays are really on elements, not on bytes, so it's weird to have
grow and resize as the odd operations out.

The secondary motivation is memory safety. Users of the old byte-oriented
functions would often multiply a number of elements with the element size,
which could overflow, and checking for overflow is tedious.

With this change, we only need to implement the overflow checks once.
The checks are cheap: since eltsize is a compile-time constant and the
functions should be inlined, they only add a single comparison and an
unlikely branch.

v2:
- ensure operations are no-op when allocation fails
- in util_dynarray_clone, call resize_bytes with a compile-time constant element size

v3:
- fix iris, lima, panfrost

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
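
The overflow check described in "u_dynarray: turn util_dynarray_{grow, resize} into element-oriented macros" can be sketched as a single helper. This is a hedged illustration, not the u_dynarray implementation: `example_grown_size` and its parameters are hypothetical. It shows how one division by the element size replaces a multiplication that could wrap around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compute cur_bytes + nelts * eltsize without overflow.
 * On success, store the result in *out and return true.
 * On overflow, leave *out untouched and return false. */
static bool
example_grown_size(size_t cur_bytes, size_t nelts, size_t eltsize,
                   size_t *out)
{
   assert(eltsize > 0 && out != NULL);

   /* nelts * eltsize fits in the remaining space exactly when
    * nelts <= (SIZE_MAX - cur_bytes) / eltsize. When eltsize is a
    * compile-time constant and the call is inlined, this is one
    * comparison against a precomputable bound. */
   if (nelts > (SIZE_MAX - cur_bytes) / eltsize)
      return false;

   *out = cur_bytes + nelts * eltsize;
   return true;
}
```

A caller passing an element count and `sizeof(type)` gets the check for free instead of repeating it at every call site.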
Nicolai Hähnle [Mon, 13 May 2019 14:58:07 +0000 (16:58 +0200)]
u_dynarray: return 0 on realloc failure and ensure no-op

We're not very good at handling out-of-memory conditions in general, but
this change at least gives the caller the option of handling it gracefully
and without memory leaks.

This happens to fix an error in out-of-memory handling in i965, which has
the following code in brw_bufmgr.c:

      node = util_dynarray_grow(vma_list, sizeof(struct vma_bucket_node));
      if (unlikely(!node))
         return 0ull;

Previously, allocation failure for util_dynarray_grow wouldn't actually
return NULL when the dynarray was previously non-empty.

v2:
- make util_dynarray_ensure_cap a no-op on failure, add MUST_CHECK attribute
- simplify the new capacity calculation: aside from avoiding a useless loop
  when newcap is very large, this also avoids an infinite loop when newcap
  is larger than 1 << 31

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
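
The two v2 points in "u_dynarray: return 0 on realloc failure and ensure no-op" can be illustrated together: a failed reallocation leaves the buffer untouched, and the new capacity is computed without a doubling loop. The sketch below is an assumption-laden illustration, not the u_dynarray code; `example_buffer` and `example_buffer_ensure_cap` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct example_buffer {
   void *data;
   size_t capacity; /* bytes allocated */
};

/* Make sure at least newcap bytes are allocated. On failure, return
 * false and leave buf exactly as it was, so the caller can recover. */
static bool
example_buffer_ensure_cap(struct example_buffer *buf, size_t newcap)
{
   assert(buf != NULL);

   if (newcap <= buf->capacity)
      return true;

   /* Take the larger of the request and twice the old capacity. There
    * is no loop, so a huge request cannot spin forever. */
   size_t doubled =
      buf->capacity <= SIZE_MAX / 2 ? buf->capacity * 2 : SIZE_MAX;
   size_t target = newcap > doubled ? newcap : doubled;

   /* Keep the old pointer until realloc has succeeded; assigning the
    * result straight to buf->data would leak the old block on failure. */
   void *data = realloc(buf->data, target);
   if (!data)
      return false;

   buf->data = data;
   buf->capacity = target;
   return true;
}
```

A caller can then test the return value, in the same spirit as the `if (unlikely(!node))` check quoted from brw_bufmgr.c.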
Nicolai Hähnle [Fri, 3 May 2019 16:11:27 +0000 (18:11 +0200)]
freedreno: use util_dynarray_clear instead of util_dynarray_resize(_, 0)

This is more expressive and simplifies a subsequent change.

v2:
- fix one more call-site after rebase

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Alyssa Rosenzweig [Tue, 11 Jun 2019 17:24:57 +0000 (10:24 -0700)]
panfrost/midgard: Differentiate vertex/fragment texture tags

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Tue, 11 Jun 2019 16:55:18 +0000 (09:55 -0700)]
panfrost/midgard: Assert on unknown texture source

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/midgard: Set minimal swizzle on texture input
Alyssa Rosenzweig [Tue, 11 Jun 2019 16:54:22 +0000 (09:54 -0700)]
panfrost/midgard: Set minimal swizzle on texture input

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/midgard: Lower texture projectors
Alyssa Rosenzweig [Tue, 11 Jun 2019 16:51:29 +0000 (09:51 -0700)]
panfrost/midgard: Lower texture projectors

We do have native support for perspective division on the load/store
unit, but this is for the future, something ideally we would select
generally, not just for textures. Meanwhile, flipping on projector
lowering works now.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/midgard: Implement txl
Alyssa Rosenzweig [Tue, 11 Jun 2019 16:43:08 +0000 (09:43 -0700)]
panfrost/midgard: Implement txl

This follows the txb implementation, but requires an adjustment to how
the cont/last flags are set.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/midgard: Implement txb op
Alyssa Rosenzweig [Tue, 11 Jun 2019 16:23:05 +0000 (09:23 -0700)]
panfrost/midgard: Implement txb op

We refactor the main tex handling to fit a bias argument in as well.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Unify bind_vs/fs_state
Alyssa Rosenzweig [Tue, 4 Jun 2019 23:48:17 +0000 (23:48 +0000)]
panfrost: Unify bind_vs/fs_state

This replaces bind_vs/fs_state calls with a unified bind_shader_state
call, removing a great deal of duplicated logic related to variant
selection.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Add panfrost_job_type_for_pipe helper
Alyssa Rosenzweig [Tue, 4 Jun 2019 23:47:35 +0000 (23:47 +0000)]
panfrost: Add panfrost_job_type_for_pipe helper

This logic is repeated in a bunch of places and will only grow worse as
we support more job types; collect it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/midgard: Extract emit_varying_read
Alyssa Rosenzweig [Tue, 4 Jun 2019 23:26:09 +0000 (23:26 +0000)]
panfrost/midgard: Extract emit_varying_read

Paralleling emit_uniform_read, this allows varying reads to be emitted
independent of an honest-to-goodness load vary instruction in the NIR.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Remove "vertex/tiler render target" silliness
Alyssa Rosenzweig [Tue, 11 Jun 2019 21:56:30 +0000 (14:56 -0700)]
panfrost: Remove "vertex/tiler render target" silliness

I don't think these are actual structures, just figments of
cargo-culting dumped memory without making any sense of it. Nothing seems
to break if the region is zeroed out, anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/decode: Print line number of bad memory access
Alyssa Rosenzweig [Tue, 11 Jun 2019 20:47:37 +0000 (13:47 -0700)]
panfrost/decode: Print line number of bad memory access

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Replace pantrace with direct decoding
Alyssa Rosenzweig [Tue, 11 Jun 2019 19:25:35 +0000 (12:25 -0700)]
panfrost: Replace pantrace with direct decoding

History lesson! In the early days of Panfrost, we had a library
independent of the driver called `panwrap` which would be LD_PRELOAD'ed
into a driver to decode its cmdstream in real-time. When upstreaming
Panfrost, we realized that we would much rather have this decode
functionality maintained in-tree to avoid divergence, but that we could
not upstream panwrap because of its use with the legacy API. So we
instead dumped GPU memory to the filesystem with an out-of-tree panwrap,
and decoded that with the in-tree pandecode module. When we migrated to
the new kernel, we just added support for doing this memory dump
directly from the driver (via a module "pantrace").

This works, but dumping memory every frame is sloooooooooooooow and
error-prone. I figured if we have pandecode in-tree, we might as well
link to it directly in the driver, allowing us to decode Panfrost's
command streams without dumping memory to the filesystem first. This
cleans up the code *substantially* and improves dumping performance by a
HUGE margin. I'm talking "several seconds per frame" to "dumping in
real-time" kind of jump.

Note to users: this removes the environment variable "PANTRACE_BASE".
Instead, for equivalent functionality set "PAN_MESA_DEBUG=trace" and
redirect stdout to the file of your choosing.

This should make debugging Panfrost much more pleasant.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agost/mesa: Add rgbx handling for fp formats
Kevin Strasser [Thu, 30 May 2019 20:31:20 +0000 (13:31 -0700)]
st/mesa: Add rgbx handling for fp formats

Add missing cases for fp32 and fp16 formats.

Fixes: c68334ffc0a9 "st/mesa: add floating point formats in st_new_renderbuffer_fb()"
Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agogallium/winsys/kms: Fix dumb buffer bpp
Kevin Strasser [Thu, 30 May 2019 19:37:07 +0000 (12:37 -0700)]
gallium/winsys/kms: Fix dumb buffer bpp

The bpp in the dumb buffer creation request is hardcoded to 32, which is an
incorrect assumption as the caller is free to pick any pipe format. Use the
bpp supplied to us through util_format_get_blocksizebits().
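
A rough sketch of the idea, assuming a hypothetical format enum and
request struct: sketch_format_bits stands in for
util_format_get_blocksizebits(), and sketch_create_dumb only mirrors the
fields relevant here. It is not the winsys code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of pipe formats, for illustration only. */
enum sketch_format {
   SKETCH_FORMAT_B8G8R8A8_UNORM,
   SKETCH_FORMAT_B5G6R5_UNORM,
   SKETCH_FORMAT_R16G16B16A16_FLOAT,
};

/* Stand-in for util_format_get_blocksizebits(): bits per pixel block. */
static unsigned
sketch_format_bits(enum sketch_format format)
{
   switch (format) {
   case SKETCH_FORMAT_B8G8R8A8_UNORM:
      return 32;
   case SKETCH_FORMAT_B5G6R5_UNORM:
      return 16;
   case SKETCH_FORMAT_R16G16B16A16_FLOAT:
      return 64;
   }
   assert(!"unknown format");
   return 0;
}

/* Hypothetical request struct carrying only the fields discussed here. */
struct sketch_create_dumb {
   uint32_t width;
   uint32_t height;
   uint32_t bpp;
};

/* Derive bpp from the caller's format instead of hardcoding 32. */
static struct sketch_create_dumb
sketch_make_request(enum sketch_format format, uint32_t width, uint32_t height)
{
   struct sketch_create_dumb req = {
      .width = width,
      .height = height,
      .bpp = sketch_format_bits(format),
   };
   return req;
}
```

A hardcoded 32 would only match the first format above; the 16-bit and
64-bit formats would get a wrongly sized buffer.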

Fixes: 3b176c441b "gallium: Add a dumb drm/kms winsys backed swrast provider"
Signed-off-by: Kevin Strasser <kevin.strasser@intel.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoutil/futex: fix dangling pointer use
Eric Engestrom [Wed, 12 Jun 2019 16:23:27 +0000 (17:23 +0100)]
util/futex: fix dangling pointer use

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110901
Fixes: 7dc2f4788288ec9c7ab6 "util: emulate futex on FreeBSD using umtx"
Cc: Greg V <greg@unrelenting.technology>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoradv: fix VK_EXT_memory_budget if one heap isn't available
Samuel Pitoiset [Wed, 12 Jun 2019 07:44:29 +0000 (09:44 +0200)]
radv: fix VK_EXT_memory_budget if one heap isn't available

When the visible VRAM size is equal to the VRAM size only two
heaps are exposed.

This fixes dEQP-VK.api.info.device.memory_budget.

Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: fix occlusion queries on VegaM
Samuel Pitoiset [Tue, 11 Jun 2019 14:46:32 +0000 (16:46 +0200)]
radv: fix occlusion queries on VegaM

The number of render backends is 16 but the enabled mask is 0xaaaa.

As noticed by Bas, allowing disabled render backends might break
the OCCLUSION_QUERY packet. We don't use it yet but keep this in
mind.
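
To illustrate the arithmetic only (this is not radv code, and the helper
names are made up): a mask of 0xaaaa has 8 bits set, but they are spread
over 16 positions, so the count of enabled backends and the index of the
highest enabled backend disagree.

```c
#include <assert.h>
#include <stdint.h>

/* Number of enabled render backends = number of set bits in the mask. */
static unsigned
sketch_count_enabled(uint32_t enabled_mask)
{
   unsigned count = 0;
   while (enabled_mask) {
      enabled_mask &= enabled_mask - 1; /* clear the lowest set bit */
      count++;
   }
   return count;
}

/* Index of the highest enabled backend; the mask must be non-zero. */
static unsigned
sketch_highest_enabled(uint32_t enabled_mask)
{
   assert(enabled_mask != 0);

   unsigned index = 0;
   while (enabled_mask >>= 1)
      index++;
   return index;
}
```

For 0xaaaa the first helper yields 8 and the second yields 15, so code
that assumes the enabled backends occupy the low slots would look in the
wrong places.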

This fixes dEQP-VK.query_pool.* and dEQP-VK.multiview.*.

Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoanv: do not parse genxml data without INTEL_DEBUG=bat
Lionel Landwerlin [Wed, 12 Jun 2019 09:41:36 +0000 (12:41 +0300)]
anv: do not parse genxml data without INTEL_DEBUG=bat

This significantly slows down the CTS runs.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 32ffd90002b04b ("anv: add support for INTEL_DEBUG=bat")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
5 years agointel/dump: fix segfault when the app hasn't accessed the device
Lionel Landwerlin [Sat, 27 Apr 2019 01:36:23 +0000 (09:36 +0800)]
intel/dump: fix segfault when the app hasn't accessed the device

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoiris: Only upload surface state for grid info when needed
Caio Marcelo de Oliveira Filho [Wed, 5 Jun 2019 05:29:13 +0000 (22:29 -0700)]
iris: Only upload surface state for grid info when needed

Special care is needed to ensure that when we have two consecutive
calls with the same grid size, we only bail in the second one if it
either doesn't need the surface state or the surface state was already
uploaded.

v2: Instead of having a new bool in ice->state to know whether we had
    a surface, check whether we have state->ref.  (Ken)
    Clean up the logic a little bit by adding 'grid_updated' local. (Ken)
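
The condition can be sketched roughly as follows. The struct and function
are hypothetical, and has_surface_state stands in for the state->ref
check; this is an illustration of the rule, not the iris code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tracking state; has_surface_state plays the role of
 * "state->ref is set". */
struct sketch_grid_state {
   uint32_t last_grid[3];
   bool has_surface_state;
};

/* Returns true when the surface state has to be (re)uploaded. */
static bool
sketch_update_grid(struct sketch_grid_state *s, const uint32_t grid[3],
                   bool needs_surface)
{
   assert(s != NULL);

   bool grid_updated =
      memcmp(s->last_grid, grid, sizeof(s->last_grid)) != 0;

   if (grid_updated) {
      memcpy(s->last_grid, grid, sizeof(s->last_grid));
      /* Any previously uploaded surface state describes the old grid. */
      s->has_surface_state = false;
   }

   /* Bail if the shader doesn't read the grid through a surface, or if
    * an up-to-date surface state is already there. */
   if (!needs_surface || s->has_surface_state)
      return false;

   s->has_surface_state = true; /* the upload would happen here */
   return true;
}
```

The interesting case is a second call with the same grid size that needs
the surface when the first call did not: it must not bail.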

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoiris: Create binding table slot for num_work_groups only when needed
Caio Marcelo de Oliveira Filho [Tue, 4 Jun 2019 20:38:36 +0000 (13:38 -0700)]
iris: Create binding table slot for num_work_groups only when needed

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agor300g: implement GLSL disk shader caching
Rui Salvaterra [Fri, 7 Jun 2019 11:19:23 +0000 (12:19 +0100)]
r300g: implement GLSL disk shader caching

This implements GLSL disk shader caching for the R300-R500 series of AMD GPUs.

Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
5 years agor300g: restore performance after RADEON_FLAG_NO_INTERPROCESS_SHARING was added
Richard Thier [Sat, 8 Jun 2019 06:35:36 +0000 (08:35 +0200)]
r300g: restore performance after RADEON_FLAG_NO_INTERPROCESS_SHARING was added

v1: Fix skipped slab allocators and the buffer cache.

v2: Use only 1 domain for texture allocation

v3: Added flag for the create_fence call too

Based on Marek v1 and v2 proposed fixes.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=1107812.patch

Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: don't test SDMA perf if SDMA is disabled/unsupported
Marek Olšák [Tue, 21 May 2019 22:39:43 +0000 (18:39 -0400)]
radeonsi: don't test SDMA perf if SDMA is disabled/unsupported