Tapani Pälli [Wed, 19 Feb 2020 06:33:04 +0000 (08:33 +0200)]
nir/glsl: gather bitmask of images used by program
In a similar fashion as commit
f5c7df4dc95 does for textures.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4080>
Danylo Piliaiev [Fri, 13 Mar 2020 15:10:08 +0000 (17:10 +0200)]
st/mesa: Fix signed integer overflow when using util_throttle_memory_usage
../src/mesa/state_tracker/st_cb_texture.c:1719:57: runtime error: signed integer overflow:
203489280 * 16 cannot be represented in type 'int'
Fixes: 21ca322e637291b89a445159fc45b8dbf638e6c9
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4185>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4185>
Matt Turner [Thu, 12 Mar 2020 21:44:46 +0000 (14:44 -0700)]
isl: Avoid EXPECT_DEATH in unit tests
EXPECT_DEATH works by forking the process and letting the forked process
fail with an assertion. This process is evidently incredibly expensive,
taking ~30 seconds to run the whole isl_aux_info_test on a 2.8GHz
Skylake. Annoyingly all of the (expected) assertion failures also leaves
lots of messages in dmesg and potentially generates lots of coredumps.
Instead, avoid the expense of fork/exec by redefining assert() and
unreachable() in the code we're testing to return a unit-test-only
value. With this patch, the test takes ~1ms.
Also, while modifying the EXPECT_EQ() calls, reverse the arguments so
that the expected value comes first, as is intended. Otherwise gtest
failure messages don't make much sense.
Fixes: https://gitlab.freedesktop.org/mesa/mesa/issues/2567
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4174>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4174>
Jan Zielinski [Fri, 13 Mar 2020 19:00:39 +0000 (20:00 +0100)]
gallium/swr: use ElementCount type arguments for getSplat()
Reviewed-by: Alok Hota <alok.hota@intel.com>
In LLVM11, ConstantVector::getSplat() function definition
has changed and the first function argument has now ElementCount type.
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4188>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4188>
Christian Gmeiner [Fri, 6 Mar 2020 21:44:42 +0000 (22:44 +0100)]
etnaviv: enable shareable shaders
We are not using any pctx reference in the shader so it seems fine
to enable this cap.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4095>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4095>
Christian Gmeiner [Fri, 6 Mar 2020 22:23:47 +0000 (23:23 +0100)]
etnaviv: get rid of etna_spec in etna_context
There is no need to have a complete copy of etna_spec - just
reference the one and only from etna_screen.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4095>
Jason Ekstrand [Thu, 12 Mar 2020 23:05:31 +0000 (18:05 -0500)]
anv: Dump push ranges via VK_KHR_pipeline_executable_properties
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4173>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4173>
Rhys Perry [Tue, 11 Feb 2020 16:55:39 +0000 (16:55 +0000)]
aco: don't stop scheduling at exports
This allows us to move v_cvt_pkrtz_f16_f32 instructions upwards, improving
schedules and (somewhat unintentionally) moving the exports slightly
closer together.
Totals from affected shaders:
SGPRS:
1030224 ->
1030248 (0.00 %)
VGPRS: 794080 -> 794392 (0.04 %)
Spilled SGPRs: 127117 -> 127117 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size:
89028152 ->
89032312 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 65252 -> 65219 (-0.05 %)
SMEM score: 843808.00 -> 843918.00 (0.01 %)
VMEM score:
5331687.00 ->
5397802.00 (1.24 %)
SMEM clauses: 567659 -> 567655 (-0.00 %)
VMEM clauses: 290715 -> 290716 (0.00 %)
Instructions:
17143219 ->
17144259 (0.01 %)
Cycles:
1098442808 ->
1098446968 (0.00 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
Rhys Perry [Tue, 11 Feb 2020 14:15:32 +0000 (14:15 +0000)]
aco: allow barriers to be skipped during scheduling
Much better scheduling apparently in 160 shaders
Totals from affected shaders:
SGPRS: 6272 -> 6344 (1.15 %)
VGPRS: 4832 -> 4844 (0.25 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 467192 -> 467428 (0.05 %) bytes
LDS: 459 -> 459 (0.00 %) blocks
Max Waves: 1407 -> 1409 (0.14 %)
SMEM score: 9309.00 -> 11216.00 (20.49 %)
VMEM score: 26679.00 -> 33652.00 (26.14 %)
SMEM clauses: 1817 -> 1776 (-2.26 %)
VMEM clauses: 2286 -> 2288 (0.09 %)
Instructions: 86537 -> 86596 (0.07 %)
Cycles: 676260 -> 676568 (0.05 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
Rhys Perry [Thu, 7 Nov 2019 14:48:51 +0000 (14:48 +0000)]
aco: add helpers for ensuring correct ordering while scheduling
Pipeline-db changes in 721 shaders.
Totals from affected shaders:
SGPRS: 42336 -> 42656 (0.76 %)
VGPRS: 38368 -> 38636 (0.70 %)
Spilled SGPRs: 11967 -> 11967 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size:
5268088 ->
5269840 (0.03 %) bytes
LDS: 1069 -> 1069 (0.00 %) blocks
Max Waves: 4473 -> 4447 (-0.58 %)
SMEM score: 41155.00 -> 41826.00 (1.63 %)
VMEM score: 146339.00 -> 147471.00 (0.77 %)
SMEM clauses: 24434 -> 24535 (0.41 %)
VMEM clauses: 16637 -> 16592 (-0.27 %)
Instructions: 996037 -> 996388 (0.04 %)
Cycles:
76476112 ->
75281416 (-1.56 %)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
Rhys Perry [Wed, 6 Nov 2019 16:38:57 +0000 (16:38 +0000)]
aco: add helpers for moving instructions for scheduling
No pipeline-db changes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
Samuel Pitoiset [Thu, 12 Mar 2020 13:49:55 +0000 (14:49 +0100)]
radv: add llvm_compiler_shader() helper
To match aco_compile_shader().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
Samuel Pitoiset [Thu, 12 Mar 2020 13:29:54 +0000 (14:29 +0100)]
radv: remove unnecessary LLVM includes
They are already included from src/amd/llvm.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
Samuel Pitoiset [Thu, 12 Mar 2020 13:24:49 +0000 (14:24 +0100)]
radv: remove radv_shader_variant::aco_used
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
Samuel Pitoiset [Thu, 12 Mar 2020 13:20:05 +0000 (14:20 +0100)]
radv: cleanup occurences of use_aco everywhere
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
Danylo Piliaiev [Wed, 11 Mar 2020 13:29:12 +0000 (15:29 +0200)]
glsl: do not crash if string literal is used outside of #include/#line
Fixes: 67b32190f3c953c5b7091d76ddeff95c0cbfb439
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2619
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4146>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4146>
Caio Marcelo de Oliveira Filho [Thu, 5 Mar 2020 20:33:05 +0000 (12:33 -0800)]
anv: Remove duplicate code in anv_cmd_buffer_bind_descriptor_set
Also use a single condition statement instead of two.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 23:42:33 +0000 (15:42 -0800)]
anv: Reduce compute pipeline batch_data size
The batch associated with the compute pipeline only needs room for a
MEDIA_VFE_STATE. So this patch moves the batch_data to each pipeline
struct and cap the one in compute pipeline.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 23:31:50 +0000 (15:31 -0800)]
anv: Split graphics and compute bits from anv_pipeline
Add two new structs that use the anv_pipeline as base. Changed all
functions that work on a specific pipeline to use the corresponding
struct.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 21:43:39 +0000 (13:43 -0800)]
anv: Use a separate field in the pipeline for compute shader
This is a preparation for splitting the compute and graphics pipelines
into separate structs.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 21:21:45 +0000 (13:21 -0800)]
anv: Decouple flush_descriptor_sets() from pipeline struct
Explicitly pass the active stages and the array (and size) of shaders
to be processed. This will make easy to store only the shaders needed
for each pipeline.
The active stages can be identified by a non-NULL shader in the
shaders array, so stop using it and keep track of the flushed stages
as iteration happens.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 20:51:53 +0000 (12:51 -0800)]
anv: Decouple flush_descriptor_sets() helpers from pipeline struct
Pass the `anv_shader_bin *` instead of expecting the helpers to peek
into the pipeline struct. Also reach for the device from the
cmd_buffer instead of the pipeline.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 20:40:37 +0000 (12:40 -0800)]
anv: Remove redundant check in flush_descriptor_sets() helpers
These helpers are only called for stages that are active, so the code
for a non-active stage is never executed.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 20:23:31 +0000 (12:23 -0800)]
anv: Pass the right pipe_state to flush_descriptor_sets()
The caller has this information, so pass directly instead of making
each helper function call figure that one out. Also, since we can
reach the pipeline from pipe_state, drop that parameter from the
function.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 20:10:00 +0000 (12:10 -0800)]
anv: Keep the shader stage in anv_shader_bin
This will be used to decouple the logic flush_descriptor_sets() from
the position in the shader array, allowing us to store just the
shaders needed for each pipeline.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 18:19:15 +0000 (10:19 -0800)]
anv: Use a dynamic array for storing executables in pipeline
Avoids waste for pipelines that don't use all the shaders, and is
flexible enough to cover cases where there are multiple variants per
shader (e.g. SIMD8/16/32 for fragment shader).
Even though we could pre-calculate the exact size of the array, this
is not a critical path so it is worth preventing the bug that will
likely happen when new variants are added but not accounted for.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 18:10:05 +0000 (10:10 -0800)]
anv: Use pipeline type to decide whether or not lower multiview
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
Caio Marcelo de Oliveira Filho [Tue, 3 Mar 2020 18:09:29 +0000 (10:09 -0800)]
anv: Add a new enum to identify the pipeline type
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4040>
Eric Anholt [Tue, 10 Mar 2020 21:52:42 +0000 (14:52 -0700)]
glsl/tests: Fix waiting for disk_cache_put() to finish.
We were wasting 4s on waiting for expected-not-to-appear files to show
up on every test. Using timeouts in test code is error-prone anyway,
as our shared runners may be busy on other jobs.
Fixes: 50989f87e62e ("util/disk_cache: use a thread queue to write to shader cache")
Link: https://gitlab.freedesktop.org/mesa/mesa/issues/2505
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4140>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4140>
Eric Anholt [Tue, 10 Mar 2020 20:36:51 +0000 (13:36 -0700)]
glsl/tests: Catch mkdir errors to help explain when they happen.
A recent pipeline
(https://gitlab.freedesktop.org/Venemo/mesa/-/jobs/
1893357) failed
with what looks like an intermittent error related to making files for
the cache test inside of the core of the cache. Given some of the
errors, it looks like maybe a mkdir failed, so log those errors
earlier so we can debug what's going on next time.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4140>
Caio Marcelo de Oliveira Filho [Fri, 21 Feb 2020 18:58:48 +0000 (10:58 -0800)]
intel/fs: Combine adjacent memory barriers
This will avoid generating multiple identical fences in a row.
For Gen11+ we have multiple types of fences (affecting different
variable modes), but is still better to combine them in a single
scoped barrier so that the translation to backend IR have the option
of dispatching both fences in parallel.
This will clean up redundant barriers from various
dEQP-VK.memory_model.* tests.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3224>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3224>
Caio Marcelo de Oliveira Filho [Fri, 21 Feb 2020 18:53:05 +0000 (10:53 -0800)]
nir: Add pass to combine adjacent scoped memory barriers
SPIR-V generates very granular barriers, however HW and backends might
not necessarily take advantage of those. This pass provides a general
mechanism to combine such barriers.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3224>
Caio Marcelo de Oliveira Filho [Fri, 21 Feb 2020 18:46:29 +0000 (10:46 -0800)]
nir: Reorder nir_scopes so wider scope has larger numeric value
Makes code comparing and combining scopes slightly more readable.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3224>
Caio Marcelo de Oliveira Filho [Thu, 20 Feb 2020 17:47:06 +0000 (09:47 -0800)]
nir: Don't skip a bit in nir_memory_semantics
There was another enum entry in the draft versions of
nir_memory_semantics, but when it got dropped the entries were not
updated.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3224>
Samuel Pitoiset [Wed, 11 Mar 2020 08:25:21 +0000 (09:25 +0100)]
radv: use ac_gpu_info::use_late_alloc
Based on PAL and RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144>
Samuel Pitoiset [Wed, 11 Mar 2020 08:20:17 +0000 (09:20 +0100)]
radv: rewrite late alloc computation
Based on PAL and RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144>
Samuel Pitoiset [Wed, 11 Mar 2020 07:57:37 +0000 (08:57 +0100)]
radv: tune primitive binning for small chips
Based on PAL and RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144>
Samuel Pitoiset [Wed, 11 Mar 2020 07:51:08 +0000 (08:51 +0100)]
radv: use better tessellation tunables on GFX9+
Based on PAL and RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144>
Samuel Pitoiset [Wed, 11 Mar 2020 08:31:26 +0000 (09:31 +0100)]
radv/gfx10: cache metadata in L2 on small chips
Based on PAL and RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144>
Jason Ekstrand [Thu, 5 Mar 2020 20:50:59 +0000 (14:50 -0600)]
intel/isl: Set DepthStencilResource based on aux usage
In ISL, usage flags only carry intent and not semantic meaning. We
don't have a bulletproof way in ISL to specify that an image is of
depth/stencil type. The usage flags are great but blorp, for instance,
loves to disrespect them. One proposed solution to this problem is to
add explicit depth/stencil formats which are distinct from the
corresponding color formats.
Fortunately, however, empirical evidence suggests that this bit only
affects the sampler's interpretation of the CCS data. Therefore, we can
set the bit based off of the aux_usage which is now very specific and
does carry semantic meaning. In particular, aux_usage now makes a
distinction between color CCS and depth/stencil CCS which appears to be
exactly what the DepthStencilResource bit is for.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>
Jason Ekstrand [Tue, 10 Mar 2020 17:11:43 +0000 (12:11 -0500)]
intel: Require ISL_AUX_USAGE_STC_CCS for stencil CCS
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>
Jason Ekstrand [Thu, 5 Mar 2020 17:46:51 +0000 (11:46 -0600)]
iris: Use ISL_AUX_USAGE_STC_CCS for stencil CCS
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>
Jason Ekstrand [Thu, 5 Mar 2020 17:47:27 +0000 (11:47 -0600)]
intel/blorp: Allow STC_CCS in blit sources
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>
Jason Ekstrand [Thu, 5 Mar 2020 17:17:28 +0000 (11:17 -0600)]
intel/isl: Add a separate ISL_AUX_USAGE_STC_CCS
Stencil CCS is slightly different from color CCS. Using a color CCS
resolve with stencil CCS doesn't do the right thing and you can't sample
from a stencil CCS image without the DepthStencilResource bit set or you
will get the wrong data. Stencil CCS also has it's own rules such as it
doesn't support fast-clear and has no partial resolve. This seems to
indicate that it should probably be its own isl_aux_usage. Now that
adding new isl_aux_usage values is pretty cheap, let's split stencil CCS
out on its own.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>
Jason Ekstrand [Wed, 4 Mar 2020 19:56:30 +0000 (13:56 -0600)]
intel/isl: Require ISL_AUX_USAGE_HIZ_CCS_WT for HZ+CCS WT mode
We also delete the badly named isl_surf_supports_hiz_ccs_wt. The name
is misleading because it doesn't return whether or not the surface
supports HiZ+CCS in write-through mode (any single-sampled HiZ+CCS
capable surface does) but rather a heuristic decision about whether or
not we want to enable write-through mode based on the usage flags in the
isl_surf.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>
Jason Ekstrand [Wed, 4 Mar 2020 04:20:26 +0000 (22:20 -0600)]
iris: Use ISL_AUX_USAGE_HIZ_CCS_WT to indicate write-through HiZ
Previously, we always set the aux_usage to ISL_AUX_USAGE_HIZ_CCS and let
ISL choose write-through based on isl_surf_supports_hiz_ccs_wt. This
commit makes us choose explicitly at surface creation time whether to
use HIZ_CCS or HIZ_CCS_WT based on the same set of conditions. This is
more explicit and should be more robust as it lets us choose WT mode in
one place rather than trusting isl_surf_supports_hiz_ccs_wt to return
the same thing every time.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>
Jason Ekstrand [Wed, 4 Mar 2020 20:26:43 +0000 (14:26 -0600)]
intel/blorp: Allow HIZ_CCS_WT in copy sources
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>
Jason Ekstrand [Tue, 3 Mar 2020 23:03:41 +0000 (17:03 -0600)]
intel/isl: Add a separate ISL_AUX_USAGE_HIZ_CCS_WT
This is distinct from ISL_AUX_USAGE_HIZ_CCS in that the HiZ surface
operates in write-through mode which means that the HiZ surface is only
used for depth-testing acceleration and the CCS-compressed main surface
is always valid so we can texture from it.
Separating full HiZ from write-through mode at the isl_aux_usage level
has a couple of advantages:
1. It's more explicit. Instead of write-through mode depending on the
heuristic decision in isl_surf_supports_hiz_ccs_wt, it's now
something that's explicitly requested by the driver. This should be
more robust than hoping isl_surf_supports_hiz_ccs_wt always returns
the same thing every time. If someone (say BLORP) ever drops a
usage flag on the isl_surf, there's a chance it could return a
different value without us noticing leading to corruptions.
2. Because ISL_AUX_USAGE_HIZ_CCS_WT is it's own isl_aux_usage flag, we
can say inside the driver that HIZ_CCS does not support sampling but
HIZ_CCS_WT does. We can also pass HIZ_CCS_WT to isl_surf_fill_state
and it can do some validation for us beyond what we would be able to
do if we conflate HIZ_CCS_WT and CCS_E.
3. In the future, we can add new heuristics to the driver which do
things such as start all depth surfaces (regardless of usage flags)
off in HIZ_CCS and then do a full resolve and drop to HIZ_CCS_WT the
first time it gets used by the sampler. This would potentially let
us enable the faster HIZ_CCS mode even in cases where it technically
comes in through the API as a texture.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>
Jason Ekstrand [Thu, 5 Mar 2020 17:30:59 +0000 (11:30 -0600)]
intel/isl: Clean up some aux surface logic
The first check is redundant because the first thing we do in the "emit
the aux surface" section is assert that we actually have an aux_surf.
The second check involves an exclusion list of things which don't have
aux surfaces on Gen12 but an inclusion list is much simpler because it's
just "does it have MCS?".
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4056>
Marek Olšák [Wed, 11 Mar 2020 01:52:42 +0000 (21:52 -0400)]
ac: disable late alloc on small gfx10 chips
same as PAL.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4143>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4143>
Marek Olšák [Wed, 11 Mar 2020 01:51:01 +0000 (21:51 -0400)]
ac: add radeon_info::use_late_alloc to control LATE_ALLOC globally
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4143>
Marek Olšák [Wed, 11 Mar 2020 00:46:16 +0000 (20:46 -0400)]
radeonsi: tune primitive binning for small chips
same as PAL
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4143>
Marek Olšák [Wed, 11 Mar 2020 00:45:08 +0000 (20:45 -0400)]
radeonsi: set better tessellation tunables on gfx9 and gfx10
same as PAL
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4143>
Marek Olšák [Wed, 11 Mar 2020 00:44:03 +0000 (20:44 -0400)]
radeonsi/gfx10: cache metadata in L2 on small chips
same as PAL.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4143>
Samuel Pitoiset [Tue, 3 Mar 2020 14:53:20 +0000 (15:53 +0100)]
radv/sqtt: describe layout transitions with user markers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4138>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4138>
Samuel Pitoiset [Tue, 3 Mar 2020 14:03:25 +0000 (15:03 +0100)]
radv/sqtt: describe begin/end subpass barriers with user markers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4138>
Juan A. Suarez Romero [Tue, 10 Mar 2020 10:50:30 +0000 (10:50 +0000)]
nir/algebraic: coalesce fmod lowering
As fmod for 16/32/64 bits lowering does the same, let's merge all of
them in a single case.
Fixes dEQP-VK.glsl.builtin.precision_double.mod.compute.* on ACO.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4118>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4118>
Juan A. Suarez Romero [Tue, 10 Mar 2020 10:49:42 +0000 (10:49 +0000)]
nir/lower_double_ops: relax lower mod()
Currently when lowering mod() we add an extra instruction so if
mod(a,b) == b then 0 is returned instead of b, as mathematically
mod(a,b) is in the interval [0, b).
But Vulkan spec has relaxed this restriction, and allows the result to
be in the interval [0, b].
For the OpenGL case, while the spec does not allow this behaviour, due
the allowed precision errors we can end up having the same result, so
from a practical point of view, this behaviour is allowed (see
https://github.com/KhronosGroup/VK-GL-CTS/issues/51).
This commit takes this in account to remove the extra instruction
required to return 0 instead.
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4118>
Bas Nieuwenhuizen [Mon, 2 Dec 2019 08:53:37 +0000 (09:53 +0100)]
amd/llvm: Fix divergent descriptor indexing. (v3)
There are multiple LLVM passes that very much move the
intrinsic using the descriptor outside of the loop, defeating
the entire point of creating the loop.
Defeat the optimizer by splitting the break into a separate
if-statement and putting an optimization barrier on the bool
in between.
v2: Move from a callback based system to begin/end loop.
This does not make it significantly less intrusive but
is a bit nicer with all the extra struct and callback
stubs.
v3: Deal with non-divergent values in divergent path.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2160
Fixes: 028ce527395 "radv: Add non-uniform indexing lowering."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4109>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4109>
Ian Romanick [Wed, 11 Mar 2020 22:53:23 +0000 (15:53 -0700)]
intel/fs: Fix NULL destinations on 3-source instructions again after late DCE
We considered moving this down near the call to
insert_gen4_send_dependency_workarounds. By that point it's too late
for a couple reasons. One, we're potentially increasing resiter
pressure that may lead to anoter spill. Two, fixup_3src_null_dest tries
to allocate a VGRF, but the post-register allocation shader uses
physical registers.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2621
Fixes: ba2fa1ceaf4 ("intel/fs: Do cmod prop again after scheduling")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4155>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4155>
Timur Kristóf [Wed, 11 Mar 2020 12:39:46 +0000 (13:39 +0100)]
radv: Enable subgroup shuffle on GFX10 when ACO is used.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4159>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4159>
Timur Kristóf [Wed, 11 Mar 2020 14:05:47 +0000 (15:05 +0100)]
radv: Enable lowering dynamic quad broadcasts.
This will lower dynamic quad broadcasts into something that both
LLVM and ACO can understand. On hardware which supports shuffles,
they are lowered to shuffle, on older hardware (GFX6-7) they will
get lowered to constant quad broadcasts.
Fixes dEQP-VK.subgroups.quad.*.subgroupquadbroadcast_nonconst_*
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4147>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4147>
Timur Kristóf [Wed, 11 Mar 2020 14:01:56 +0000 (15:01 +0100)]
nir: Add ability to lower non-const quad broadcasts to const ones.
Some hardware doesn't support subgroup shuffle, and on such hardware
it makes no sense to lower quad broadcasts to shuffle. Instead, let's
lower them to four const quad broadcasts, paired with bcsel instructions.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4147>
Eric Engestrom [Mon, 9 Mar 2020 11:58:05 +0000 (12:58 +0100)]
gen_release_notes: resolve ambiguity by renaming `version` to `previous_version` and `next_version` to `this_version`
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4113>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4113>
Eric Engestrom [Mon, 9 Mar 2020 11:54:24 +0000 (12:54 +0100)]
gen_release_notes: fix version in "you should wait" message
Fixes: 86079447da1e00d49db0 ("scripts: Add a gen_release_notes.py script")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4113>
Alyssa Rosenzweig [Thu, 12 Mar 2020 12:05:58 +0000 (08:05 -0400)]
pan/bi: Interpret register allocation results
Once LCRA has run, we have a map from IR indices to byte offsets into
the register file, so we need to "install" these results, rewriting the
IR to use native registers and fixing up writemasks/swizzles to
substitute vectorization for adjacent registers (for LCRA, we're
modeling in terms of real vectors).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4158>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4158>
Alyssa Rosenzweig [Thu, 12 Mar 2020 00:39:36 +0000 (20:39 -0400)]
pan/bi: Add register allocator
We model the machine as vector (with restrictions) to natively handle
mixed types and I/O and other goodies. We use LCRA for the heavylifting.
This commit adds only the modeling to feed into LCRA and spit LCRA
solutions back; next commit will integrate it with the IR.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4158>
Alyssa Rosenzweig [Thu, 12 Mar 2020 01:45:32 +0000 (21:45 -0400)]
pan/bi: Fix missing src_types
We want types to be consistent throughout the IR so we don't have to
make exceptions to parse things out. These cases just got missed.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4158>
Alyssa Rosenzweig [Thu, 12 Mar 2020 01:41:57 +0000 (21:41 -0400)]
pan/bi: Fix vector handling of readmasks
The issue was messing with liveness analysis... with Midgard we look at
the writemask to decide how the instruction behaves. Here, since our ALU
is scalar (except for subdivision which doesn't have proper writemasks
anyway) we just look at the component count directly -- either 4 for
vector instructions (essentially - for smaller loads we can replicate
manually without much burden), or 1 for scalar.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4158>
Alyssa Rosenzweig [Thu, 12 Mar 2020 01:04:26 +0000 (21:04 -0400)]
pan/bi: Minor fixes in iteration macros
Found during RA bringup.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4158>
Alyssa Rosenzweig [Thu, 12 Mar 2020 00:15:08 +0000 (20:15 -0400)]
pan/midgard: Remove incorrect comment in RA
Ironically, this comment was mistakenly added by the commit that fixed
the purported issue in the comment (
1bce7fdecd86 - found by `git blame`)
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4158>
Alyssa Rosenzweig [Thu, 12 Mar 2020 00:08:03 +0000 (20:08 -0400)]
panfrost: Move lcra to panfrost/util
We'll want to use it for the Bifrost RA as well.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4158>
Rhys Perry [Wed, 19 Feb 2020 15:09:38 +0000 (15:09 +0000)]
glsl/list: use uintptr_t for exec_node_data()'s subtraction
This fixes UBSan warnings when foreach_list_typed_safe() passes NULL:
pointer index expression with base 0x000000000000 overflowed to 0xffffffffffffffa8
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4157>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4157>
Rhys Perry [Tue, 10 Mar 2020 15:07:19 +0000 (15:07 +0000)]
aco: fix uninitialized data error in waitcnt pass
Shouldn't create any incorrect waitcnts but may create suboptimial
waitcnts in rare cases.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4133>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4133>
Samuel Pitoiset [Mon, 27 Jan 2020 12:42:11 +0000 (13:42 +0100)]
ac/llvm: add missing optimization barrier for 64-bit readlanes
Otherwise, LLVM optimizes it but it's actually incorrect.
Fixes: 0f45d4dc2b1 ("ac: add ac_build_readlane without optimization barrier")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3585>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3585>
Tapani Pälli [Tue, 10 Mar 2020 07:21:09 +0000 (09:21 +0200)]
iris: toggle on PIPE_CAP_MIXED_COLOR_DEPTH_BITS
This enables additional EGL configs where we have depth/stencil buffer
with different number of bits per pixel than color buffer has. This
enables some Android games to work that require such config.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4127>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4127>
Hyunjun Ko [Thu, 5 Mar 2020 06:59:55 +0000 (06:59 +0000)]
turnip: Add tu6_control struct.
Follow the way that freedreno is doing so that we could see the whole
layout of the scratch buffer.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3942>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3942>
Hyunjun Ko [Thu, 20 Feb 2020 05:41:55 +0000 (14:41 +0900)]
turnip: Enable VK_EXT_transform_feedback
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3942>
Hyunjun Ko [Thu, 20 Feb 2020 06:46:57 +0000 (15:46 +0900)]
turnip: Implement an empty function vkCmdDrawIndirectByteCountEXT
TODO. We should implement this since indirect draw is enabled.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3942>
Hyunjun Ko [Tue, 25 Feb 2020 01:08:25 +0000 (10:08 +0900)]
turnip: Implement stream-out emit and vkApis for transform feedback
1. Implement vkCmdBindTransformFeedbackBuffersEXT,
vkCmdBeginTransformFeedbackEXT and vkCmdEndTransformFeedbackEXT.
- Not handling counter buffers yet.
2. Implement streamout emit function, mostly taken from fd6_emit.c
v2. Replace emit_pkt4 funcs with emit_regs.
v3. Don't copy the state of stream-output from tu_pipeline.
v4. Set zero to VPC_SO_CNTL/VPC_SO_BUF_CNTL in tu6_init_hw.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3942>
Hyunjun Ko [Tue, 25 Feb 2020 01:07:25 +0000 (10:07 +0900)]
turnip: Setup stream-output when linking program
Mostly taken from fd6_program.c.
v2. Note that it forces to use full VS instead of binning pass VS if
there's stream output as the binning pass VS will have outputs on
other than position/psize stripped out, which is the same as freedreno.
v3. fix indentation.
v4. Use register index instead of location when setup streamout.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3942>
Hyunjun Ko [Thu, 20 Feb 2020 05:54:35 +0000 (14:54 +0900)]
turnip: Define structs for transform feedback
Define new structures for streamout buffers and state.
Most members of the state struct are taken from freedreno driver.
v2. Use IR3_MAX_SO_* and avoid using magic values.
v3. Remove the state of stream-output in tu_cmd_state and use one in
tu_pipeline and split out reset and enabled fields.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3942>
Hyunjun Ko [Thu, 20 Feb 2020 05:48:28 +0000 (14:48 +0900)]
turnip: Gather information for transform feedback
- Add one member to the existed ir3_stream_output so that we could
assign location information from nir_xfb_info, rather than defining
new struct.
- Redefine maximum of so buffers, streams and outputs, which will be
used for turnip.
- Also enable caps for transform feedback for spirv_to_nir.
v2. Remove redefined maximums and use IR3_MAX_SO_* and add
IR3_MAX_SO_STREAMS.
v3. Remove the newly added location field so that we could keep aligned
with 32 bytes. Instead we create an array mapping between the location
and consecutive index, which is GL driver is doing.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3942>
David Stevens [Thu, 5 Mar 2020 03:52:38 +0000 (12:52 +0900)]
egl/android: set window usage flags
When creating an egl surface from an ANativeWindow, the window's usage
flags need to be set so that buffers are allocated properly.
Signed-off-by: David Stevens <stevensd@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lepton Wu <lepton@chromium.org>
Eric Anholt [Tue, 3 Mar 2020 22:38:09 +0000 (14:38 -0800)]
ci: Make a simple little bare-metal fastboot mode for db410c.
This supports powering up the device (using an external tool you
provide based on your particular lab), talking over serial to wait for
the fastboot prompt, and then booting a fastboot image on a target
device.
I was previously relying on LAVA for this, but that ran afoul of
corporate policies related to the AGPL. However, LAVA wasn't doing
too much for us, given that gitlab already has a job scheduler and
tagging and runners. We were spending a lot of engineering on making
the two systems match up, when we can just have gitlab do it directly.
Lightly-reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4076>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4076>
Eric Anholt [Thu, 5 Mar 2020 23:26:28 +0000 (15:26 -0800)]
ci: Fix installation of firmware for db410c's nic.
The debian firmware package doesn't actually contain it, costing us a
minute of boot time waiting for it to show up.
Lightly-reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4076>
Eric Anholt [Thu, 5 Mar 2020 22:35:55 +0000 (14:35 -0800)]
ci: Print the renderer/version that our dEQP invocation is using.
This is useful for sanity checking how the driver loads.
Lightly-reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4076>
Yevhenii Kolesnikov [Fri, 3 Jan 2020 14:37:00 +0000 (16:37 +0200)]
intel/compiler: fix cmod propagation optimisations
Knowing following:
- CMP writes to flag register the result of
applying cmod to the `src0 - src1`.
After that it stores the same value to dst.
Other instructions first store their result to
dst, and then store cmod(dst) to the flag
register.
- inst is either CMP or MOV
- inst->dst is null
- inst->src[0] overlaps with scan_inst->dst
- inst->src[1] is zero
- scan_inst wrote to a flag register
There can be three possible paths:
- scan_inst is CMP:
Considering that src0 is either 0x0 (false),
or 0xffffffff (true), and src1 is 0x0:
- If inst's cmod is NZ, we can always remove
scan_inst: NZ is invariant for false and true. This
holds even if src0 is NaN: .nz is the only cmod,
that returns true for NaN.
- .g is invariant if src0 has a UD type
- .l is invariant if src0 has a D type
- scan_inst and inst have the same cmod:
If scan_inst is anything than CMP, it already
wrote the appropriate value to the flag register.
- else:
We can change cmod of scan_inst to that of inst,
and remove inst. It is valid as long as we make
sure that no instruction uses the flag register
between scan_inst and inst.
Nine new cmod_propagation unit tests:
- cmp_cmpnz
- cmp_cmpg
- plnnz_cmpnz
- plnnz_cmpz (*)
- plnnz_sel_cmpz
- cmp_cmpg_D
- cmp_cmpg_UD (*)
- cmp_cmpl_D (*)
- cmp_cmpl_UD
(*) this would fail without changes to brw_fs_cmod_propagation.
This fixes optimisation that used to be illegal (see issue #2154)
= Before =
0: linterp.z.f0.0(8) vgrf0:F, g2:F, attr0<0>:F
1: cmp.nz.f0.0(8) null:F, vgrf0:F, 0f
= After =
0: linterp.z.f0.0(8) vgrf0:F, g2:F, attr0<0>:F
Now it is optimised as such (note change of cmod in line 0):
= Before =
0: linterp.z.f0.0(8) vgrf0:F, g2:F, attr0<0>:F
1: cmp.nz.f0.0(8) null:F, vgrf0:F, 0f
= After =
0: linterp.nz.f0.0(8) vgrf0:F, g2:F, attr0<0>:F
No shaderdb changes
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2154
Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3348>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3348>
Alyssa Rosenzweig [Wed, 11 Mar 2020 19:17:25 +0000 (15:17 -0400)]
pan/bi: Fix swizzle for second argument to ST_VARY
Off-by-one.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>
Alyssa Rosenzweig [Wed, 11 Mar 2020 19:15:41 +0000 (15:15 -0400)]
pan/bi: Implement nir_op_ffma
We have native FMA which works for graphics usage (unlike Midgard where
it's really reserved for compute for various reasons), let's use it.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>
Alyssa Rosenzweig [Wed, 11 Mar 2020 19:10:32 +0000 (15:10 -0400)]
pan/bi: Add dead code elimination pass
Now that we have liveness analysis, we can cleanup the IR considerably.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>
Alyssa Rosenzweig [Wed, 11 Mar 2020 18:54:49 +0000 (14:54 -0400)]
pan/bi: Add liveness analysis pass
Now that all the guts are shared with Midgard, it's just a matter of
wiring it in.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>
Alyssa Rosenzweig [Wed, 11 Mar 2020 18:51:57 +0000 (14:51 -0400)]
pan/bi: Add bi_max_temp helper
Instead of trying to reindex all the times, just be okay with consistent
but sparse indices, then figuring out the max index is easy enough.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>
Alyssa Rosenzweig [Wed, 11 Mar 2020 18:48:55 +0000 (14:48 -0400)]
pan/bi: Add bi_next/prev_op helpers
From Midgard. These are surprisingly helpful.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>
Alyssa Rosenzweig [Wed, 11 Mar 2020 18:46:01 +0000 (14:46 -0400)]
pan/bi: Add bi_bytemask_of_read_components helpers
Same purpose as the Midgard version, but the implementation is
*dramatically* simpler thanks to our more regular IR.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>
Alyssa Rosenzweig [Wed, 11 Mar 2020 18:40:01 +0000 (14:40 -0400)]
pan/bi: Paste over bi_has_arg
While we're at it, cleanup the Midgard one.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>
Alyssa Rosenzweig [Wed, 11 Mar 2020 18:35:38 +0000 (14:35 -0400)]
panfrost: Sync Midgard/Bifrost control flow
We can move e v e n more code to be shared and let bi_block inherit from
pan_block, which will allow us to use the shared data flow analysis.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>
Alyssa Rosenzweig [Wed, 11 Mar 2020 17:58:10 +0000 (13:58 -0400)]
panfrost: Move liveness analysis to root panfrost/
This way we can share the code with Bifrost.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>
Alyssa Rosenzweig [Wed, 11 Mar 2020 12:36:31 +0000 (08:36 -0400)]
pan/midgard: Subclass midgard_block from pan_block
Promote as much as we feasibly can while keeping it Midgard/Bifrost
agnostic.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>
Alyssa Rosenzweig [Wed, 11 Mar 2020 12:22:08 +0000 (08:22 -0400)]
pan/midgard: Sync midgard_block field names with Bifrost
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>