mesa.git
4 years ago  intel/fs: Lower 64-bit MOVs after lower_load_payload()
Caio Marcelo de Oliveira Filho [Thu, 12 Dec 2019 21:25:33 +0000 (13:25 -0800)]
intel/fs: Lower 64-bit MOVs after lower_load_payload()

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3070>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3070>

4 years ago  amd/common: Always use addrlib for HTILE tc-compat.
Bas Nieuwenhuizen [Thu, 12 Dec 2019 11:10:58 +0000 (12:10 +0100)]
amd/common: Always use addrlib for HTILE tc-compat.

Even without depth+stencil, addrlib can (correctly!) decide to
disable tc-compatible HTILE.

One example is 8x sampling with 32-bit depth on Stoney. The row size
on Stoney is 1024, while the tile size is 2048, which results in
tile splits which are not supported with tc-compat.
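
The arithmetic behind this example can be sketched as follows. This is an
illustration only: the 8x8-pixel micro tile and the helper names are
assumptions, not addrlib's actual decision logic. The 1024-byte row size and
the 2048-byte tile size are the figures quoted above.

```python
# Illustrative sketch only; not addrlib's real decision logic.
MICRO_TILE_PIXELS = 8 * 8  # assumed 8x8-pixel micro tile


def tile_bytes(bytes_per_sample: int, samples: int) -> int:
    """Size of one micro tile holding every sample of every pixel."""
    return MICRO_TILE_PIXELS * bytes_per_sample * samples


def needs_tile_split(tile_size: int, row_size: int) -> bool:
    """A tile larger than the row size has to be split."""
    return tile_size > row_size


stoney_row_size = 1024                               # from the commit message
d32_8x = tile_bytes(bytes_per_sample=4, samples=8)   # 32-bit depth, 8 samples
print(d32_8x, needs_tile_split(d32_8x, stoney_row_size))
```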

On Stoney, this fixes
dEQP-VK.glsl.builtin_var.fragdepth.*_list_d32_sfloat_multisample_8

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>

4 years ago  amd/common: Fix tcCompatible degradation on Stoney.
Bas Nieuwenhuizen [Wed, 11 Dec 2019 15:04:58 +0000 (16:04 +0100)]
amd/common: Fix tcCompatible degradation on Stoney.

addrlib sometimes returns smaller sizes for tcCompat as it does
not seem to take into account the depth+stencil matching config
gymnastics with tcCompat.

This fixes
dEQP-VK.pipeline.render_to_image.core.2d_array.huge.height.r8g8b8a8_unorm_d32_sfloat_s8_uint

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>

4 years ago  docs/features: mark GL_ARB_texture_compression_bptc as done for llvmpipe, softpipe...
Denis Pauk [Sat, 14 Dec 2019 07:58:25 +0000 (09:58 +0200)]
docs/features: mark GL_ARB_texture_compression_bptc as done for llvmpipe, softpipe, swr

Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Rhys Perry <pendingchaos02@gmail.com>
CC: Bruce Cherniak <bruce.cherniak@intel.com>
CC: Matt Turner <mattst88@gmail.com>

4 years ago  gallium/swr: Enable support bptc format.
Denis Pauk [Sat, 14 Dec 2019 07:54:48 +0000 (09:54 +0200)]
gallium/swr: Enable support bptc format.

Reuse code from:
f69bc797e1 gallium/auxiliary: Add helper support for bptc format compress/decompress

Signed-off-by: Denis Pauk <pauk.denis@gmail.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: Marek Olšák <maraeo@gmail.com>
CC: Tim Rowley <timothy.o.rowley@intel.com>

4 years ago  freedreno/a6xx: fix OUT_REG() vs growable cmdstream
Rob Clark [Sat, 14 Dec 2019 17:09:08 +0000 (09:09 -0800)]
freedreno/a6xx: fix OUT_REG() vs growable cmdstream

BEGIN_RING() could decide we can't fit the next packet in the current
cmdstream segment, and grow a new segment.  So we need to grab ring->cur
*after* BEGIN_RING(), otherwise we are writing cmdstream past the end of
the previous segment.
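
The ordering problem can be pictured with a toy model. Everything below is
hypothetical and only mirrors the roles of BEGIN_RING() and ring->cur; it is
not the freedreno code.

```python
# Toy model of a growable command stream; names are hypothetical and only
# mirror the roles of BEGIN_RING() and ring->cur in the message above.
class Ring:
    def __init__(self, segment_size: int):
        self.segment_size = segment_size
        self.segments = [[]]          # list of segments, each a list of dwords

    @property
    def cur(self):
        """Return (segment index, write offset), like a write pointer."""
        return len(self.segments) - 1, len(self.segments[-1])

    def begin(self, ndwords: int) -> None:
        """Reserve space, starting a new segment if the packet won't fit."""
        if len(self.segments[-1]) + ndwords > self.segment_size:
            self.segments.append([])

    def emit(self, dwords) -> None:
        self.segments[-1].extend(dwords)


ring = Ring(segment_size=4)
ring.begin(3)
ring.emit([1, 2, 3])

stale = ring.cur        # wrong order: pointer taken before begin()
ring.begin(2)           # packet does not fit, so a new segment is started
fresh = ring.cur        # right order: pointer taken after begin()

print(stale, fresh)
```

In this model the stale pointer still refers to the end of the old segment,
which is where the out-of-bounds write described above would land.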

Fixes: bdd98b892f3 ("freedreno: New struct packing macros")
Signed-off-by: Rob Clark <robdclark@chromium.org>

4 years ago  lima: split draw calls on 64k vertices
Erico Nunes [Sat, 9 Nov 2019 12:50:52 +0000 (13:50 +0100)]
lima: split draw calls on 64k vertices

The Mali400 only supports draws with up to 64k vertices per command.
To handle this, break the draw_vbo call into multiple commands.
Indexed drawing is left to a separate code path.
This implementation was ported from vc4_draw_vbo.
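
A rough sketch of the splitting idea, for non-indexed draws only. This is not
the vc4 or lima implementation: the function name, the mode strings, and the
choice to keep the triangle-strip step even (so winding order is preserved)
are assumptions made for illustration.

```python
# Hypothetical sketch; not the vc4 or lima implementation.
def split_draw(start, count, mode, max_verts=65535):
    """Split a non-indexed draw into (first_vertex, vertex_count) pieces."""
    if mode == "points":
        chunk = step = max_verts
    elif mode == "lines":
        chunk = step = max_verts - max_verts % 2   # whole line segments
    elif mode == "triangles":
        chunk = step = max_verts - max_verts % 3   # whole triangles
    elif mode == "line_strip":
        chunk = max_verts
        step = chunk - 1                           # re-send 1 shared vertex
    elif mode == "triangle_strip":
        chunk = max_verts - max_verts % 2          # even step keeps winding
        step = chunk - 2                           # re-send 2 shared vertices
    else:
        raise ValueError(f"unhandled primitive mode: {mode}")
    if step <= 0:
        raise ValueError("max_verts too small for this primitive mode")

    draws = []
    first, remaining = start, count
    while remaining > 0:
        n = min(remaining, chunk)
        draws.append((first, n))
        if n == remaining:                         # last piece emitted
            break
        first += step
        remaining -= step
    return draws


print(split_draw(0, 70000, "points"))   # [(0, 65535), (65535, 4465)]
```

Strips advance by less than the chunk size so that each piece repeats the
vertices it shares with the previous one.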

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>

4 years ago  vc4: move the draw splitting routine to shared code
Erico Nunes [Tue, 12 Nov 2019 13:56:33 +0000 (14:56 +0100)]
vc4: move the draw splitting routine to shared code

This can also be useful for other hardware which has similar limitations
on vertex count per single draw.
The Mali400 has a similar limitation and can reuse this.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>

4 years ago  lima: refactor indexed draw indices upload
Erico Nunes [Sat, 9 Nov 2019 12:50:07 +0000 (13:50 +0100)]
lima: refactor indexed draw indices upload

As of this commit this is just a refactor in preparation to enable
support for more than 64k vertices.
To support splitting the draw_vbo call, indices shouldn't be re-uploaded
every time.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>

4 years ago  lima: allocate separate bo to store varyings
Erico Nunes [Wed, 23 Oct 2019 22:27:22 +0000 (00:27 +0200)]
lima: allocate separate bo to store varyings

The current strategy using the suballocator with fixed size doesn't
scale and causes some programs with a large number of vertices (like some
glmark2 scenes) to crash.
Change it to dynamically allocate a separate bo to accommodate an
arbitrary number of vertices.
This also fixes the buffer read/write flags for gp.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>

4 years ago  gallium/util: add alignment parameter to util_upload_index_buffer
Erico Nunes [Sat, 7 Dec 2019 03:38:03 +0000 (04:38 +0100)]
gallium/util: add alignment parameter to util_upload_index_buffer

At least on Mali Utgard, index buffers need to be aligned on 0x40.
To avoid duplicating this, add an alignment parameter.
Keep the previous default for the other existing users.
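
For reference, rounding an upload offset up to such an alignment looks roughly
like this. The helper name is an assumption and is not a Mesa function; only
the 0x40 value comes from the message above.

```python
# Minimal sketch; the helper name is an assumption, not a Mesa function.
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of a power-of-two alignment."""
    assert alignment > 0 and alignment & (alignment - 1) == 0
    return (offset + alignment - 1) & ~(alignment - 1)


UTGARD_INDEX_ALIGNMENT = 0x40   # from the commit message

print(hex(align_up(0x41, UTGARD_INDEX_ALIGNMENT)))   # 0x80
```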

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>

4 years ago  drirc: Final Fantasy VIII: Remastered needs allow_higher_compat_version
Kenneth Graunke [Fri, 13 Dec 2019 05:26:12 +0000 (21:26 -0800)]
drirc: Final Fantasy VIII: Remastered needs allow_higher_compat_version

This gets it running on i965 with Mesa master.  (The game won't start
without GL 3.3 compatibility, but uses 1.20 with GL_EXT_gpu_shader4
for shaders.)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3076>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3076>

4 years ago  st/glsl_to_nir: fix SSO validation regression
Timothy Arceri [Fri, 13 Dec 2019 10:58:28 +0000 (21:58 +1100)]
st/glsl_to_nir: fix SSO validation regression

Fixes: b77907edb554 ("st/glsl_to_nir: use nir based program resource list builder")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2216
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

4 years ago  ci: Remove T760/T860 from CI temporarily
Alyssa Rosenzweig [Fri, 13 Dec 2019 22:13:14 +0000 (17:13 -0500)]
ci: Remove T760/T860 from CI temporarily

I feel really bad about this but this one test is flaking. I don't want
to do a mass revert (and bisection is extremely difficult with
nondeterministic/Heisenbugs), but it's Friday night and master needs to
pass. This commit should be reverted asap (once the flake is solved)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  iris: Implement WA for push constants.
Rafael Antognolli [Tue, 19 Nov 2019 23:00:06 +0000 (15:00 -0800)]
iris: Implement WA for push constants.

v2: Apply WA to gen11+ instead of gen12+ (Jordan).

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

4 years ago  lima/parser: Add texture descriptor parser
Andreas Baierl [Mon, 9 Dec 2019 11:42:30 +0000 (12:42 +0100)]
lima/parser: Add texture descriptor parser

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>

4 years ago  lima/parser: Add RSW parsing
Andreas Baierl [Fri, 6 Dec 2019 08:30:14 +0000 (09:30 +0100)]
lima/parser: Add RSW parsing

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>

4 years ago  lima/parser: Some fixes and cleanups
Andreas Baierl [Thu, 5 Dec 2019 16:39:01 +0000 (17:39 +0100)]
lima/parser: Some fixes and cleanups

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>

4 years ago  vulkan/overlay: Update docs.
Rafael Antognolli [Thu, 12 Dec 2019 21:55:11 +0000 (13:55 -0800)]
vulkan/overlay: Update docs.

Add a mention of the overlay control socket.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

4 years ago  vulkan/overlay: Add basic overlay control script.
Rafael Antognolli [Thu, 12 Dec 2019 14:45:51 +0000 (06:45 -0800)]
vulkan/overlay: Add basic overlay control script.

This can be used to start/stop statistics capturing from the command
line.

v3:
 - Install script (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

4 years ago  vulkan/overlay: Add a command to start capturing data to a file.
Rafael Antognolli [Fri, 6 Dec 2019 20:18:19 +0000 (12:18 -0800)]
vulkan/overlay: Add a command to start capturing data to a file.

By default, if an output_file is specified, the overlay layer will start
capturing data immediately. After this commit, when a control socket is
used, the capture starts disabled by default, and is only enabled when a
command ":capture=1;" is received.

When the capture is enabled, we might have already accumulated some
stats. To avoid capturing such noise, we discard and reset the fps and
stats, updating the display and capturing only data from that point on.
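
The behaviour described here can be modelled in a few lines. The class and
field names are hypothetical and do not come from the overlay layer; only the
":capture=1;" command string is taken from the message.

```python
# Hypothetical model of the behaviour described above; the class and field
# names are illustrative, only the ":capture=1;" command is from the message.
class OverlayStats:
    def __init__(self, control_socket: bool):
        # Without a control socket, capture starts right away.
        self.capturing = not control_socket
        self.frames = 0

    def on_frame(self) -> None:
        self.frames += 1

    def on_command(self, command: str) -> None:
        if command == ":capture=1;" and not self.capturing:
            self.frames = 0          # discard stats gathered before capture
            self.capturing = True


overlay = OverlayStats(control_socket=True)
for _ in range(3):
    overlay.on_frame()               # noise accumulated before the command
overlay.on_command(":capture=1;")
overlay.on_frame()
print(overlay.capturing, overlay.frames)
```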

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

4 years ago  vulkan/overlay: Add support for a control socket.
Rafael Antognolli [Fri, 6 Dec 2019 21:44:19 +0000 (13:44 -0800)]
vulkan/overlay: Add support for a control socket.

Add support for a socket from which the overlay layer can receive
commands. This control socket can be useful to allow setting options
once the application is already running. For instance, triggering the
capture of fps data at a certain point.
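
As a self-contained illustration of sending a command over a socket, the
sketch below uses a connected socket pair inside one process. It does not use
the layer's real socket name or setup, and the assumption that commands are
':'-prefixed and ';'-terminated is inferred from the ":capture=1;" example in
the capture commit.

```python
import socket

# Self-contained demo using a connected socket pair in one process. A real
# setup would use a listening socket in the layer and a separate client.
layer_end, client_end = socket.socketpair()

client_end.sendall(b":capture=1;")   # command format from the capture commit
client_end.close()

received = b""
while True:
    chunk = layer_end.recv(64)
    if not chunk:                    # peer closed the connection
        break
    received += chunk
layer_end.close()

# Commands are assumed to be ';'-terminated and ':'-prefixed.
commands = [c.lstrip(":") for c in received.decode().split(";") if c]
print(commands)
```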

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

4 years ago  vulkan/overlay: Add a control socket.
Rafael Antognolli [Fri, 6 Dec 2019 22:38:07 +0000 (14:38 -0800)]
vulkan/overlay: Add a control socket.

v2: Use a socket instead of named pipe.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

4 years ago  util/os_socket: Add socket related functions.
Rafael Antognolli [Wed, 11 Dec 2019 23:01:11 +0000 (15:01 -0800)]
util/os_socket: Add socket related functions.

v3:
 - Add os_socket.c/h into Makefile.sources (Lionel)
 - Add empty non-linux implementation to public functions.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

4 years ago  anv: drop unused #include
Eric Engestrom [Fri, 13 Dec 2019 17:16:17 +0000 (17:16 +0000)]
anv: drop unused #include

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

4 years ago  util/simple_mtx: don't set the canary when it can't be checked
Eric Engestrom [Fri, 13 Dec 2019 17:12:48 +0000 (17:12 +0000)]
util/simple_mtx: don't set the canary when it can't be checked

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

4 years ago  intel/compiler: replace `0` pointer with `NULL`
Eric Engestrom [Fri, 13 Dec 2019 17:01:39 +0000 (17:01 +0000)]
intel/compiler: replace `0` pointer with `NULL`

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

4 years ago  intel/compiler: add ASSERTED annotation to avoid "unused variable" warning
Eric Engestrom [Fri, 13 Dec 2019 17:01:17 +0000 (17:01 +0000)]
intel/compiler: add ASSERTED annotation to avoid "unused variable" warning

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

4 years ago  iris: Alphabetize source files after iris_perf.c was added
Kenneth Graunke [Fri, 13 Dec 2019 19:03:03 +0000 (11:03 -0800)]
iris: Alphabetize source files after iris_perf.c was added

4 years ago  freedreno/ir3: add iterator macros
Rob Clark [Thu, 12 Dec 2019 23:30:49 +0000 (15:30 -0800)]
freedreno/ir3: add iterator macros

So many open coded list iterators were getting annoying.

Signed-off-by: Rob Clark <robdclark@chromium.org>

4 years ago  freedreno/ir3: add scheduler traces
Rob Clark [Fri, 22 Nov 2019 19:13:19 +0000 (11:13 -0800)]
freedreno/ir3: add scheduler traces

Add some infrastructure to trace scheduler decisions.  The next patch
will add some more traces, just splitting this out to reduce clutter.

Signed-off-by: Rob Clark <robdclark@chromium.org>

4 years ago  freedreno/ir3: add last-baryf shaderdb stat
Rob Clark [Wed, 11 Dec 2019 23:52:32 +0000 (15:52 -0800)]
freedreno/ir3: add last-baryf shaderdb stat

Sometimes sched changes that are a win in terms of instruction count
and/or register pressure are worse in real life, due to keeping varying
storage locked for too long.  Add a shader-db stat to give this more
visibility.

Signed-off-by: Rob Clark <robdclark@chromium.org>

4 years ago  nir/opt_peephole_select: remove unused variables
Alejandro Piñeiro [Fri, 13 Dec 2019 13:57:37 +0000 (14:57 +0100)]
nir/opt_peephole_select: remove unused variables

To avoid "unused variable" warnings.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

4 years ago  panfrost: Report GPU name in es2_info
Alyssa Rosenzweig [Mon, 9 Dec 2019 21:02:17 +0000 (16:02 -0500)]
panfrost: Report GPU name in es2_info

We can prettify the ID.

Closes #2093

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  panfrost: Add panfrost_model_name helper
Alyssa Rosenzweig [Mon, 9 Dec 2019 21:02:03 +0000 (16:02 -0500)]
panfrost: Add panfrost_model_name helper

This gives us a string representation of a GPU ID.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  panfrost: Move property queries to _encoder
Alyssa Rosenzweig [Mon, 9 Dec 2019 20:54:09 +0000 (15:54 -0500)]
panfrost: Move property queries to _encoder

We'll want these in non-Gallium devices.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  panfrost: Move nir_undef_to_zero to Midgard compiler
Alyssa Rosenzweig [Mon, 9 Dec 2019 20:41:52 +0000 (15:41 -0500)]
panfrost: Move nir_undef_to_zero to Midgard compiler

Nothing Gallium about it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  pandecode: Add cast
Alyssa Rosenzweig [Fri, 13 Dec 2019 15:22:06 +0000 (10:22 -0500)]
pandecode: Add cast

Fixes minor coverity warning about the format specifier.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  panfrost: Pass size to panfrost_batch_get_scratchpad
Alyssa Rosenzweig [Mon, 9 Dec 2019 16:02:15 +0000 (11:02 -0500)]
panfrost: Pass size to panfrost_batch_get_scratchpad

We'll compute the size with the new scratchpad helpers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  panfrost: Calculate maximum stack_size per batch
Alyssa Rosenzweig [Mon, 9 Dec 2019 16:18:47 +0000 (11:18 -0500)]
panfrost: Calculate maximum stack_size per batch

We'll need this so we can allocate a stack for the batch large enough
for all the jobs within it.
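
In outline, tracking the maximum is a running max over the draws queued into
the batch. The class and attribute names below are hypothetical and are not
the panfrost structures.

```python
# Illustrative only; class and attribute names are hypothetical.
class Batch:
    def __init__(self):
        self.stack_size = 0          # largest per-thread stack seen so far

    def queue_draw(self, shader_stack_size: int) -> None:
        self.stack_size = max(self.stack_size, shader_stack_size)


batch = Batch()
for size in (0, 256, 64):            # stack sizes reported by the compiler
    batch.queue_draw(size)
print(batch.stack_size)
```

The final value is only known once every draw has been queued, so any
allocation sized from it has to happen at that point.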

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  pan/midgard: Handle misc. cppcheck warnings
Alyssa Rosenzweig [Fri, 13 Dec 2019 15:13:24 +0000 (10:13 -0500)]
pan/midgard: Handle misc. cppcheck warnings

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  pan/midgard: Remove unused ld/st packing helpers
Alyssa Rosenzweig [Fri, 13 Dec 2019 15:09:16 +0000 (10:09 -0500)]
pan/midgard: Remove unused ld/st packing helpers

Identified by cppcheck.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  panfrost: Handle minor cppcheck issues
Alyssa Rosenzweig [Fri, 13 Dec 2019 15:07:44 +0000 (10:07 -0500)]
panfrost: Handle minor cppcheck issues

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  panfrost: Emit SFBD/MFBD after a batch, instead of before
Alyssa Rosenzweig [Mon, 9 Dec 2019 16:00:42 +0000 (11:00 -0500)]
panfrost: Emit SFBD/MFBD after a batch, instead of before

The size of the scratchpad (as well as some tiler details) depends on the
contents of the batch, so we need to defer filling out the FBD
until after all draws are queued.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  panfrost: Route stack_size from compiler
Alyssa Rosenzweig [Mon, 9 Dec 2019 16:18:21 +0000 (11:18 -0500)]
panfrost: Route stack_size from compiler

We'll need it in pan_context.c

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  etnaviv: add missing vs_needs_z_div handling to NIR backend
Jonathan Marek [Sun, 8 Dec 2019 23:16:34 +0000 (18:16 -0500)]
etnaviv: add missing vs_needs_z_div handling to NIR backend

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>

4 years ago  etnaviv: add missing formats
Jonathan Marek [Sun, 8 Dec 2019 16:52:32 +0000 (11:52 -0500)]
etnaviv: add missing formats

Add missing texture/render formats supported by hardware.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>

4 years ago  etnaviv: remove swizzle from format table
Jonathan Marek [Sun, 8 Dec 2019 16:20:46 +0000 (11:20 -0500)]
etnaviv: remove swizzle from format table

The only format that needs swizzle is R8 emulated with L8, so we can get
rid of the SWIZ(X, Y, Z, W) everywhere.

Note: R8G8 also had a swizzle, but it wasn't necessary.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>

4 years ago  etnaviv: disable integer vertex formats on pre-HALTI2 hardware
Jonathan Marek [Sun, 8 Dec 2019 16:54:31 +0000 (11:54 -0500)]
etnaviv: disable integer vertex formats on pre-HALTI2 hardware

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>

4 years ago  etnaviv: update INT_FILTER choice for GLES3 formats
Jonathan Marek [Sun, 20 Oct 2019 06:10:43 +0000 (02:10 -0400)]
etnaviv: update INT_FILTER choice for GLES3 formats

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>

4 years ago  etnaviv: set output mode and saturate bits
Jonathan Marek [Mon, 12 Aug 2019 18:19:59 +0000 (14:19 -0400)]
etnaviv: set output mode and saturate bits

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>

4 years ago  etnaviv: sRGB render target support
Jonathan Marek [Sun, 20 Oct 2019 06:21:43 +0000 (02:21 -0400)]
etnaviv: sRGB render target support

Note: no srgb render target support before HALTI3

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>

4 years ago  etnaviv: remove sRGB formats from format table
Jonathan Marek [Sat, 10 Aug 2019 18:38:32 +0000 (14:38 -0400)]
etnaviv: remove sRGB formats from format table

This supports all sRGB formats, without having them in the format table.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>

4 years ago  gallium/swr: Fix arb_transform_feedback2
Tomasz Pyra [Thu, 12 Dec 2019 14:38:43 +0000 (15:38 +0100)]
gallium/swr: Fix arb_transform_feedback2

Added support for pause/resume transform feedback.
Fixed DrawTransformFeedback.

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>
Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>

4 years ago  radv: handle unaligned vertex fetches on GFX6/GFX10
Samuel Pitoiset [Fri, 29 Nov 2019 14:12:30 +0000 (15:12 +0100)]
radv: handle unaligned vertex fetches on GFX6/GFX10

The Vulkan spec doesn't say anything about vertex attribute alignment.

Fixes a test failure on GFX6 and a GPU hang on GFX10 with:
dEQP-VK.spirv_assembly.instruction.spirv1p4.entrypoint.tess_con_pc_entry_point

vkpipeline-db results on GFX10:
Totals from affected shaders:
SGPRS: 463772 -> 472972 (1.98 %)
VGPRS: 343208 -> 343752 (0.16 %)
Spilled SGPRs: 323 -> 336 (4.02 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 13806200 -> 14164472 (2.60 %) bytes
Max Waves: 84021 -> 83755 (-0.32 %)

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2161
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

4 years ago  i965/iris: perf-queries: don't invalidate/flush 3d pipeline
Lionel Landwerlin [Mon, 20 May 2019 06:56:18 +0000 (07:56 +0100)]
i965/iris: perf-queries: don't invalidate/flush 3d pipeline

Our current implementation of performance queries is fairly harsh
because it completely flushes and invalidates the 3d pipeline caches
at the beginning and end of each query. An argument can be made that
this is how performance should be measured but it probably doesn't
reflect what the application is actually doing and the actual cost of
draw calls.

A more appropriate approach is to just stall the pipeline at
scoreboard, so that we measure the effect of a draw call without
having the pipeline in a completely pristine state for every draw
call.

v2: Use end of pipe PIPE_CONTROL instruction for Iris (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

4 years ago  intel/perf: drop batchbuffer flushing at query begin
Lionel Landwerlin [Thu, 12 Dec 2019 10:14:09 +0000 (12:14 +0200)]
intel/perf: drop batchbuffer flushing at query begin

This was initially intended to fix issues with the query timings
occasionally going high.

It turns out there was a bug in the attribution of OA reports to our
context when parsing the OA data. This led to reports flagged with
other context IDs being included in our query results.
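
The kind of attribution involved can be pictured with the sketch below. The
report layout (a context ID paired with a counter delta) and the function name
are assumptions for illustration; real OA reports are binary records and the
intel/perf parser is considerably more involved.

```python
# Hypothetical report layout; real OA reports are binary records.
def accumulate(reports, our_ctx_id):
    """Sum counter deltas, skipping reports tagged with other contexts."""
    total = 0
    for ctx_id, delta in reports:
        if ctx_id != our_ctx_id:
            continue                 # belongs to another context
        total += delta
    return total


reports = [(7, 100), (9, 5000), (7, 120)]   # (context ID, counter delta)
print(accumulate(reports, our_ctx_id=7))
```

Without the context check, the large delta from the other context would be
folded into the result, which matches the symptom of timings occasionally
going high.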

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

4 years ago  v3d: actually root the first BO in a command list in the job
Iago Toral Quiroga [Thu, 12 Dec 2019 10:19:23 +0000 (11:19 +0100)]
v3d: actually root the first BO in a command list in the job

We were passing cl->bo, which is NULL, so v3d_job_add_bo was a no-op.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  etnaviv: drop compiled_rs_state forward declaration
Christian Gmeiner [Tue, 10 Dec 2019 16:15:35 +0000 (17:15 +0100)]
etnaviv: drop compiled_rs_state forward declaration

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>

4 years ago  etnaviv: remove not used etna_bits_ones(..)
Christian Gmeiner [Tue, 10 Dec 2019 08:54:18 +0000 (09:54 +0100)]
etnaviv: remove not used etna_bits_ones(..)

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>

4 years ago  swr: Fix build with llvm-10.0.
Vinson Lee [Thu, 12 Dec 2019 01:34:30 +0000 (17:34 -0800)]
swr: Fix build with llvm-10.0.

Fix build error after llvm-10.0 commit ("1b2842bf902a [Alignment][NFC]
CreateMemSet use MaybeAlign").

../src/gallium/drivers/swr/swr_shader.cpp: In member function ‘void (* BuilderSWR::CompileGS(swr_context*, swr_jit_gs_key&))(HANDLE, HANDLE, SWR_GS_CONTEXT*)’:
../src/gallium/drivers/swr/swr_shader.cpp:738:65: error: no matching function for call to ‘BuilderSWR::MEMSET(llvm::Value*&, llvm::Constant*, int, long unsigned int)’
       MEMSET(pStream, C((char)0), VERTEX_COUNT_SIZE + CONTROL_HEADER_SIZE, sizeof(float) * KNOB_SIMD_WIDTH);
                                                                 ^
In file included from ../src/gallium/drivers/swr/rasterizer/jitter/builder.h:163:0,
                 from ../src/gallium/drivers/swr/swr_shader.cpp:43:
src/gallium/drivers/swr/rasterizer/jitter/gen_builder.hpp:51:11: note: candidate: llvm::CallInst* SwrJit::Builder::MEMSET(llvm::Value*, llvm::Value*, uint64_t, llvm::MaybeAlign, bool, llvm::MDNode*, llvm::MDNode*, llvm::MDNode*)
 CallInst* MEMSET(Value *Ptr, Value *Val, uint64_t Size, MaybeAlign Align, bool isVolatile = false, MDNode *TBAATag = nullptr, MDNode *ScopeTag = nullptr, MDNode *NoAliasTag = nullptr)
           ^
src/gallium/drivers/swr/rasterizer/jitter/gen_builder.hpp:51:11: note:   no known conversion for argument 4 from ‘long unsigned int’ to ‘llvm::MaybeAlign’
In file included from ../src/gallium/drivers/swr/rasterizer/jitter/builder.h:163:0,
                 from ../src/gallium/drivers/swr/swr_shader.cpp:43:
src/gallium/drivers/swr/rasterizer/jitter/gen_builder.hpp:56:11: note: candidate: llvm::CallInst* SwrJit::Builder::MEMSET(llvm::Value*, llvm::Value*, llvm::Value*, llvm::MaybeAlign, bool, llvm::MDNode*, llvm::MDNode*, llvm::MDNode*)
 CallInst* MEMSET(Value *Ptr, Value *Val, Value *Size, MaybeAlign Align, bool isVolatile = false, MDNode *TBAATag = nullptr, MDNode *ScopeTag = nullptr, MDNode *NoAliasTag = nullptr)
           ^
src/gallium/drivers/swr/rasterizer/jitter/gen_builder.hpp:56:11: note:   no known conversion for argument 4 from ‘long unsigned int’ to ‘llvm::MaybeAlign’

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>

4 years ago  turnip: implement subpass input attachments
Jonathan Marek [Thu, 12 Dec 2019 22:05:22 +0000 (17:05 -0500)]
turnip: implement subpass input attachments

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  turnip: CmdClearAttachments fixes
Jonathan Marek [Thu, 12 Dec 2019 19:02:49 +0000 (14:02 -0500)]
turnip: CmdClearAttachments fixes

Partial depth/stencil clear and skipping unused attachments.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  turnip: subpass rework
Jonathan Marek [Fri, 6 Dec 2019 01:53:34 +0000 (20:53 -0500)]
turnip: subpass rework

A renderpass is a tile load/store cycle.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  turnip: add dirty bit for push constants
Jonathan Marek [Thu, 12 Dec 2019 22:13:55 +0000 (17:13 -0500)]
turnip: add dirty bit for push constants

Fixes push constants not updating in some cases.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  turnip: no 8x msaa on 128bpp formats
Jonathan Marek [Thu, 12 Dec 2019 22:03:26 +0000 (17:03 -0500)]
turnip: no 8x msaa on 128bpp formats

We don't have an entry for cpp 128 in the tile_alignment table, but I don't
think the HW supports this at all (blob driver just doesn't have 8x msaa).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  turnip: fix VK_IMAGE_ASPECT_STENCIL_BIT image view
Jonathan Marek [Thu, 12 Dec 2019 22:01:52 +0000 (17:01 -0500)]
turnip: fix VK_IMAGE_ASPECT_STENCIL_BIT image view

Use a special format which allows sampling the stencil and set the correct
swizzle.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  turnip: set FRAG_WRITES_SAMPMASK bit
Jonathan Marek [Thu, 12 Dec 2019 22:00:13 +0000 (17:00 -0500)]
turnip: set FRAG_WRITES_SAMPMASK bit

GPU hangs if SAMPMASK_REGID is used without this bit.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  turnip: set load_layer_id to zero
Jonathan Marek [Thu, 12 Dec 2019 21:58:56 +0000 (16:58 -0500)]
turnip: set load_layer_id to zero

We don't have layered rendering and ir3 doesn't support this intrinsic, so
just set it to zero for now.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  turnip: update tile_align_w/tile_align_h
Jonathan Marek [Thu, 12 Dec 2019 21:58:03 +0000 (16:58 -0500)]
turnip: update tile_align_w/tile_align_h

It looks like the actual tile alignment requirement is less than 32x32, but
in some cases input attachment texture needs 64 alignment.

Reduced the h alignment to 16 to compensate and it seems to work fine.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  turnip: fix tile layout logic
Jonathan Marek [Thu, 12 Dec 2019 21:55:15 +0000 (16:55 -0500)]
turnip: fix tile layout logic

Use DIV_ROUND_UP and stop trying to increase the tile_count width/height
once tile_align_w/tile_align_h are reached.
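
A simplified picture of the two ideas, assuming a single dimension. The
function names and the stopping rule are assumptions for illustration and are
not the turnip tiling code; only the round-up division mirrors DIV_ROUND_UP.

```python
# Sketch only; the function names and the stopping rule are assumptions.
def div_round_up(a: int, b: int) -> int:
    """Integer division rounding up, like the DIV_ROUND_UP macro."""
    return (a + b - 1) // b


def tile_size(extent: int, tile_count: int, align: int) -> int:
    """Aligned size of each tile when extent is cut into tile_count tiles."""
    size = div_round_up(extent, tile_count)
    return div_round_up(size, align) * align


def more_tiles_help(extent: int, tile_count: int, align: int) -> bool:
    """Once a tile is already at the minimum alignment, stop adding tiles."""
    return tile_size(extent, tile_count, align) > align


print(div_round_up(100, 32), tile_size(100, 2, 32))
```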

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  turnip: fix hw binning render area
Jonathan Marek [Thu, 12 Dec 2019 21:51:39 +0000 (16:51 -0500)]
turnip: fix hw binning render area

Fix a mistake in the y2 coordinate.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  freedreno/registers: add a6xx texture format for stencil sampler
Jonathan Marek [Thu, 12 Dec 2019 18:59:45 +0000 (13:59 -0500)]
freedreno/registers: add a6xx texture format for stencil sampler

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  freedreno/ir3: add GLSL_SAMPLER_DIM_SUBPASS to tex_info
Jonathan Marek [Thu, 12 Dec 2019 18:58:28 +0000 (13:58 -0500)]
freedreno/ir3: add GLSL_SAMPLER_DIM_SUBPASS to tex_info

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  turnip: fix incorrectly failing assert
Jonathan Marek [Fri, 6 Dec 2019 01:58:58 +0000 (20:58 -0500)]
turnip: fix incorrectly failing assert

pColorBlendState is allowed to be NULL if subpass has >0 color attachments
but they are all unused.
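
The relaxed condition can be sketched like this. The function names are
hypothetical and this is not the turnip assert; the constant mirrors the value
of Vulkan's VK_ATTACHMENT_UNUSED.

```python
# Hypothetical check; the constant mirrors Vulkan's VK_ATTACHMENT_UNUSED.
ATTACHMENT_UNUSED = 0xFFFFFFFF


def blend_state_required(color_attachments) -> bool:
    """True if at least one color attachment is actually used."""
    return any(a != ATTACHMENT_UNUSED for a in color_attachments)


def check_pipeline(color_attachments, color_blend_state) -> None:
    # Only insist on a blend state when some attachment is in use.
    assert color_blend_state is not None or not blend_state_required(
        color_attachments
    )


check_pipeline([ATTACHMENT_UNUSED, ATTACHMENT_UNUSED], None)      # passes
check_pipeline([0, ATTACHMENT_UNUSED], {"logicOpEnable": False})  # passes
```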

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>

4 years ago  panfrost: Query core count and thread tls alloc
Alyssa Rosenzweig [Mon, 9 Dec 2019 14:06:51 +0000 (09:06 -0500)]
panfrost: Query core count and thread tls alloc

This is supported only on newer kernels.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  panfrost: Factor out panfrost_query_raw
Alyssa Rosenzweig [Mon, 9 Dec 2019 14:00:49 +0000 (09:00 -0500)]
panfrost: Factor out panfrost_query_raw

We would like to query properties other than product ID.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

4 years ago  st/glsl_to_nir: use nir based program resource list builder
Timothy Arceri [Fri, 6 Dec 2019 10:57:16 +0000 (21:57 +1100)]
st/glsl_to_nir: use nir based program resource list builder

Here we use the NIR based builder to add everything to the resource
list except for SSO packed varyings. Since the details of those
varyings get lost during packing we leave the special handling to
the GLSL IR pass for now. In order to do this we add some bools
to the build resource list functions.

Using the NIR based resource list builder gets us a step closer to
using a native NIR based linker. It should also be faster than the
GLSL IR builder, one because the NIR optimisations should mean we
add fewer entries, and two because NIR gives us better lists to work
with and we don't need to walk the entire IR to find the resources.

Ack-by: Alejandro Piñeiro <apinheiro@igalia.com>

4 years ago  st/glsl_to_nir: call gl_nir_lower_buffers() a little later
Timothy Arceri [Fri, 6 Dec 2019 10:53:17 +0000 (21:53 +1100)]
st/glsl_to_nir: call gl_nir_lower_buffers() a little later

In a following commit we will use a NIR based builder to build the
OpenGL resource list, so we want to delay this call a little.

Ack-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agoglsl: add subroutine support to nir_build_program_resource_list()
Timothy Arceri [Wed, 23 Oct 2019 02:43:34 +0000 (13:43 +1100)]
glsl: add subroutine support to nir_build_program_resource_list()

This is required so we can use the NIR linker to link GLSL in
addition to SPIR-V.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agoglsl: add support for named varyings in nir_build_program_resource_list()
Timothy Arceri [Tue, 22 Oct 2019 03:59:27 +0000 (14:59 +1100)]
glsl: add support for named varyings in nir_build_program_resource_list()

This adds support for adding the names of varyings to the resource
list, which is required for us to use this function with the GLSL
linker. Support for names is optional for SPIR-V, which is why it had
not been added yet.

This is mostly a copy of the GLSL IR code adapted to nir.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agoglsl: copy the new data fields when converting to nir
Timothy Arceri [Tue, 22 Oct 2019 03:54:34 +0000 (14:54 +1100)]
glsl: copy the new data fields when converting to nir

These fields, added in the previous commit, will be used by a NIR
based GLSL linker.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agonir: add some fields to nir_variable_data
Timothy Arceri [Wed, 23 Oct 2019 01:05:10 +0000 (12:05 +1100)]
nir: add some fields to nir_variable_data

These will be used to provide NIR linking functionality to GLSL.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agoglsl: copy the how_declared field when converting to nir
Timothy Arceri [Tue, 22 Oct 2019 00:29:47 +0000 (11:29 +1100)]
glsl: copy the how_declared field when converting to nir

This is needed to make use of nir_build_program_resource_list().

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agoglsl: move nir_remap_dual_slot_attributes() call out of glsl_to_nir()
Timothy Arceri [Fri, 6 Dec 2019 02:53:24 +0000 (13:53 +1100)]
glsl: move nir_remap_dual_slot_attributes() call out of glsl_to_nir()

In order to be able to implement a NIR based GLSL linker we need to
build the program resource list with NIR. This change delays the
remapping so that a later commit can call the NIR based resource
list builder.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
4 years agodocs: Update release notes, index, and calendar for 19.3.0
Dylan Baker [Thu, 12 Dec 2019 20:05:00 +0000 (12:05 -0800)]
docs: Update release notes, index, and calendar for 19.3.0

4 years agodocs/19.3.0: Add SHA256 sums
Dylan Baker [Thu, 12 Dec 2019 19:55:00 +0000 (11:55 -0800)]
docs/19.3.0: Add SHA256 sums

4 years agodocs: add release notes for 19.3.0
Dylan Baker [Thu, 12 Dec 2019 19:21:43 +0000 (11:21 -0800)]
docs: add release notes for 19.3.0

4 years agoi965: Enable GL_EXT_gpu_shader4 on Gen6+
Jason Ekstrand [Wed, 4 Dec 2019 14:55:50 +0000 (08:55 -0600)]
i965: Enable GL_EXT_gpu_shader4 on Gen6+

It's already enabled for all gallium drivers that support GLSL 1.40 or
above, and we already support everything in our compiler on SNB+.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
4 years agoradv: enable SpvCapabilityImageMSArray
Samuel Pitoiset [Thu, 12 Dec 2019 17:22:34 +0000 (18:22 +0100)]
radv: enable SpvCapabilityImageMSArray

The Vulkan spec says that the StorageImageMultisample and ImageMSArray
SPIR-V capabilities must be enabled if the
shaderStorageImageMultisample feature is supported.

This fixes a warning with RenderDoc.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2212
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agopanfrost: Add routines to calculate stack size/shift
Alyssa Rosenzweig [Mon, 9 Dec 2019 14:00:24 +0000 (09:00 -0500)]
panfrost: Add routines to calculate stack size/shift

These implement the aforementioned formulas.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Split stack_shift nibble from unk0
Alyssa Rosenzweig [Mon, 9 Dec 2019 13:41:33 +0000 (08:41 -0500)]
panfrost: Split stack_shift nibble from unk0

It's conceptually independent from the upper part (which is not yet
understood, but for spilling generally remains equal to 0x1e).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Rename unknown_address_0 -> scratchpad
Alyssa Rosenzweig [Mon, 9 Dec 2019 13:41:07 +0000 (08:41 -0500)]
panfrost: Rename unknown_address_0 -> scratchpad

It's the analogous pointer in SFBD.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Describe thread local storage sizing rules
Alyssa Rosenzweig [Sat, 7 Dec 2019 21:42:01 +0000 (16:42 -0500)]
panfrost: Describe thread local storage sizing rules

Deeply nested powers-of-two, basically :-)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Fix shift for TLS access
Alyssa Rosenzweig [Sat, 7 Dec 2019 20:54:36 +0000 (15:54 -0500)]
pan/midgard: Fix shift for TLS access

Due to this issue we were using 4x the memory we should have for TLS,
which was messing up the size calculations. Oops!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
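
The 4x figure is what a shift that is too large by two bits would produce. The sketch below only illustrates that arithmetic: the function name, the slot count, and the shift values are assumptions chosen for the example, not the values used in the Midgard compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes of thread local storage addressed when each
 * spill slot index is turned into a byte offset by a left shift. */
static uint32_t
sketch_tls_bytes(uint32_t slots, unsigned shift)
{
   return slots << shift;
}

/* Illustrative values only: a shift that is two bits too large quadruples
 * the footprint, since 1 << 2 == 4. */
static uint32_t
sketch_overallocation_factor(uint32_t slots, unsigned good_shift,
                             unsigned bad_shift)
{
   return sketch_tls_bytes(slots, bad_shift) /
          sketch_tls_bytes(slots, good_shift);
}
```

Any size calculation derived from the inflated offsets would be off by the same factor, which is consistent with the miscalculated sizes the message mentions.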
4 years agopan/midgard: Simplify and fix vector copyprop
Alyssa Rosenzweig [Fri, 6 Dec 2019 22:11:44 +0000 (17:11 -0500)]
pan/midgard: Simplify and fix vector copyprop

Fixes a regression in QuakeSpasm. See
https://gitlab.freedesktop.org/mesa/mesa/issues/2169 for apitrace.

Closes #2169

Fixes: f72873e6aa0 ("pan/midgard: Copypropagate vector creation")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Icecream95
4 years agopan/midgard: Don't try to free NULL in LCRA
Alyssa Rosenzweig [Fri, 6 Dec 2019 21:49:26 +0000 (16:49 -0500)]
pan/midgard: Don't try to free NULL in LCRA

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: 12e393bacf0 ("panfrost: add lcra_free() to free lcra state")
4 years agopan/midgard: Force alignment for csel_v
Alyssa Rosenzweig [Fri, 6 Dec 2019 21:22:06 +0000 (16:22 -0500)]
pan/midgard: Force alignment for csel_v

The swizzle on the conditional gets lost.

Fixes "horizontal mirroring" in godot. See
https://gitlab.freedesktop.org/mesa/mesa/issues/2108 which has attached
apitrace.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: d3b3daa9d3f ("pan/midgard: Use new scheduler")
Reported-by: Icecream95
4 years agopan/midgard: Don't use no_spill for memory spill src
Alyssa Rosenzweig [Fri, 6 Dec 2019 20:28:08 +0000 (15:28 -0500)]
pan/midgard: Don't use no_spill for memory spill src

I'm not totally sure why this would *break* things, but it's certainly
not necessary and it does break things. Somehow this gives the RA more
freedom, fixing some spill issues.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Use no_spill bitmask
Alyssa Rosenzweig [Fri, 6 Dec 2019 20:17:44 +0000 (15:17 -0500)]
pan/midgard: Use no_spill bitmask

We would like no_spill decisions to be class-specific -- spilling from
special register to a work register doesn't preclude also spilling that
work register to stack.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
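
A class-specific no_spill flag can be sketched as one bit per register class. This is a minimal illustration under assumed names: the enum, the struct, and the helpers are hypothetical and do not reflect the actual Midgard data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register classes, one bit each in the mask. */
enum sketch_reg_class {
   SKETCH_CLASS_WORK = 0,
   SKETCH_CLASS_SPECIAL = 1,
};

/* Hypothetical per-node state: bit N set means "do not spill from class N". */
struct sketch_node {
   uint8_t no_spill;
};

/* Forbid spilling this node from the given class only. */
static void
sketch_mark_no_spill(struct sketch_node *node, enum sketch_reg_class cls)
{
   node->no_spill |= (uint8_t)(1u << cls);
}

/* A node may still be spilled from any class whose bit is clear. */
static bool
sketch_can_spill(const struct sketch_node *node, enum sketch_reg_class cls)
{
   return (node->no_spill & (1u << cls)) == 0;
}
```

With a single boolean, a value moved from a special register to a work register could never be spilled again; with the mask, it is excluded only from further special-register spilling and remains a candidate for spilling to stack.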