Erik Faye-Lund [Tue, 29 Oct 2019 09:07:53 +0000 (10:07 +0100)]
zink: heap-allocate sampler objects
VkSampler is 64-bit even on 32-bit systems, so casting it to a pointer
is a bad idea there. So let's heap-allocate the sampler-object instead.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2017
Reviewed-by: Witold Baryluk <witold.baryluk@gmail.com>
Tested-by: Witold Baryluk <witold.baryluk@gmail.com>
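A minimal sketch of the idea described in this commit, not the actual zink code. The names fake_sampler_handle, sampler_state and the three helper functions are hypothetical; a uint64_t stands in for the 64-bit VkSampler handle. Instead of squeezing the handle into a (possibly 32-bit) pointer, the handle is stored in a small heap allocation and the pointer to that allocation is what gets passed around.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a 64-bit handle such as VkSampler. */
typedef uint64_t fake_sampler_handle;

/* Heap-allocated wrapper, so the opaque state pointer is a real pointer. */
struct sampler_state {
   fake_sampler_handle sampler;
};

/* Returns an opaque state pointer, or NULL if the allocation fails. */
static void *
create_sampler_state(fake_sampler_handle handle)
{
   struct sampler_state *state = malloc(sizeof(*state));
   if (state == NULL)
      return NULL;
   state->sampler = handle;
   return state;
}

/* Recovers the full 64-bit handle, regardless of pointer width. */
static fake_sampler_handle
sampler_from_state(const void *cso)
{
   assert(cso != NULL);
   return ((const struct sampler_state *)cso)->sampler;
}

static void
destroy_sampler_state(void *cso)
{
   free(cso);
}
```

The cost is one small allocation per sampler; in exchange, no bits of the handle can be lost on 32-bit systems.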
Jason Ekstrand [Wed, 30 Oct 2019 17:31:12 +0000 (12:31 -0500)]
anv: Zero released anv_bo structs
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Tue, 29 Oct 2019 19:26:15 +0000 (14:26 -0500)]
anv: Use a bitset for tracking residency
Now that we can conveniently map between GEM handles and struct anv_bo
pointers, we can use a simple bitset for residency tracking instead of
the complex hash set. This shaves about 3% off of a CPU-limited example
running with the Dawn WebGPU implementation.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
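As a rough illustration of bitset-based residency tracking, here is a sketch that indexes a fixed-size bitset by GEM handle. The type residency_set, the MAX_HANDLES capacity and the function names are assumptions made for this example and are not taken from the anv sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_HANDLES 1024u /* hypothetical fixed capacity for the sketch */
#define WORD_BITS 32u

struct residency_set {
   uint32_t words[MAX_HANDLES / WORD_BITS];
};

static void
residency_set_clear(struct residency_set *set)
{
   memset(set->words, 0, sizeof(set->words));
}

/* Marks a handle as resident. Returns true if it was not in the set yet. */
static bool
residency_set_add(struct residency_set *set, uint32_t gem_handle)
{
   assert(gem_handle < MAX_HANDLES);
   uint32_t mask = 1u << (gem_handle % WORD_BITS);
   uint32_t *word = &set->words[gem_handle / WORD_BITS];
   bool was_new = (*word & mask) == 0;
   *word |= mask;
   return was_new;
}

static bool
residency_set_contains(const struct residency_set *set, uint32_t gem_handle)
{
   assert(gem_handle < MAX_HANDLES);
   return (set->words[gem_handle / WORD_BITS] >>
           (gem_handle % WORD_BITS)) & 1u;
}
```

Membership tests and insertions are a shift and a mask each, with no hashing or probing.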
Jason Ekstrand [Wed, 30 Oct 2019 19:37:45 +0000 (14:37 -0500)]
anv: Set the batch allocator for compute pipelines
Otherwise relocations just up and crash.
Fixes: a3153162a9b "anv: Delay allocation of relocation lists"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Tue, 29 Oct 2019 20:18:16 +0000 (15:18 -0500)]
anv: Add a device parameter to anv_execbuf_add_bo
We're about to start needing to look up BO pointers by GEM handle so we
need access to the device.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 28 Oct 2019 23:03:32 +0000 (18:03 -0500)]
anv: Drop anv_bo_init and anv_bo_init_new
BOs are now only ever allocated through the BO cache so there's no need
to have these exposed.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 28 Oct 2019 22:28:09 +0000 (17:28 -0500)]
anv: Allocate misc BOs from the cache
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 28 Oct 2019 21:42:02 +0000 (16:42 -0500)]
anv: Allocate scratch BOs from the cache
While we're here, we get rid of the locking and use a lock-free
algorithm. The chances of spilling contention are low and this is
actually a bit simpler in some ways.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
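One common way to replace a lock with a lock-free scheme for a lazily allocated buffer is a compare-and-swap on the slot that holds it. The sketch below uses C11 <stdatomic.h> and is only an assumption about the general shape of such an algorithm; scratch_bo, scratch_slot and scratch_slot_get are hypothetical names and malloc stands in for the real BO allocation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical scratch buffer; a real driver would hold a BO here. */
struct scratch_bo {
   size_t size;
};

struct scratch_slot {
   _Atomic(struct scratch_bo *) bo;
};

/* Returns the slot's buffer, allocating it on first use without a lock. */
static struct scratch_bo *
scratch_slot_get(struct scratch_slot *slot, size_t size)
{
   assert(size > 0);

   struct scratch_bo *bo = atomic_load(&slot->bo);
   if (bo != NULL)
      return bo;

   struct scratch_bo *new_bo = malloc(sizeof(*new_bo));
   if (new_bo == NULL)
      return NULL;
   new_bo->size = size;

   /* Install our buffer only if the slot is still empty. */
   struct scratch_bo *expected = NULL;
   if (atomic_compare_exchange_strong(&slot->bo, &expected, new_bo))
      return new_bo;

   /* Lost the race: another thread installed a buffer first. Drop ours
    * and use theirs, which the failed exchange stored in 'expected'.
    */
   free(new_bo);
   return expected;
}
```

When contention is rare, as the commit message expects, the occasional wasted allocation on a lost race is cheap compared with taking a mutex on every lookup.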
Jason Ekstrand [Mon, 28 Oct 2019 20:42:20 +0000 (15:42 -0500)]
anv: Allocate batch and fence buffers from the cache
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 28 Oct 2019 19:49:38 +0000 (14:49 -0500)]
util: Add a free list structure for use with util_sparse_array
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 22:15:31 +0000 (17:15 -0500)]
anv: Allocate descriptor buffers from the BO cache
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 22:12:06 +0000 (17:12 -0500)]
anv: Set more flags on descriptor pool buffers
The ASYNC flag, in particular, has the potential to help performance
because it means less sync tracking in the kernel.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 22:07:36 +0000 (17:07 -0500)]
anv: Allocate query pool BOs from the cache
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 22:07:02 +0000 (17:07 -0500)]
anv: Use the query_slot helper in vkResetQueryPoolEXT
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 21:29:29 +0000 (16:29 -0500)]
anv: Allocate block pool BOs from the cache
This commit switches block pools over to being allocated from the BO
cache rather than being allocated manually by the block pool.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Wed, 30 Oct 2019 16:44:12 +0000 (11:44 -0500)]
anv/tests: Initialize the BO cache and device mutex
We're about to start depending on the BO cache in the state and block
pools so we need them properly initialized for the tests to work.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Wed, 30 Oct 2019 16:43:53 +0000 (11:43 -0500)]
anv/tests: Zero-initialize instances
Some of the tests were actually relying on some of those uninitialized
bits to be non-zero. In particular, a couple want use_softpin = true.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 23:18:52 +0000 (18:18 -0500)]
anv: Choose BO flags internally in anv_block_pool
All block pools are allocated with the same flags. There's no good
reason why it needs to be configurable.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 22:45:28 +0000 (17:45 -0500)]
anv: Rework the internal BO allocation API
This makes a number of changes to the current API:
1. Everything is renamed to anv_device_* instead of anv_bo_cache_*
because the BO cache is soon going to be the sole BO allocation path
and not some special case to make import/export work.
2. Drop the cache parameter. It's totally redundant with the device
and just annoying to keep typing.
3. Rework flags so that they go the convenient direction for usage in
ANV rather than whichever awkward way the i915 specified it to
maintain backwards compatibility. This also gives us the
opportunity to set some defaults.
4. Add flags for mapping and coherency.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 21:42:47 +0000 (16:42 -0500)]
anv: Use anv_block_pool_foreach_bo in get_bo_from_pool
While we're at it, use gen_48b_address().
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 21:10:11 +0000 (16:10 -0500)]
anv: Rework anv_block_pool_expand_range
The growing algorithms for the softpin case and the userptr version are
almost entirely different. Having this weird join doesn't make the code
more comprehensible. This rework does a few things:
1. Move the comment about 48-bit addresses to anv_device_init where we
actually unset the EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag.
2. Separate the paths in anv_block_pool_expand_range so it's easier to
see what happens in the two different cases.
3. Use the anv_block_pool::bos array for storing all allocated BOs in
both paths rather than using the cleanup list in both paths. This
lets us make the cleanups array only used for mmaps of the memfd for
the userptr case.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 20:42:22 +0000 (15:42 -0500)]
anv: Fix a potential BO handle leak
Fixes: 731c4adcf9b "anv/allocator: Add support for non-userptr"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 19:52:37 +0000 (14:52 -0500)]
anv: Handle state pool relocations using "wrapper" BOs
Instead of depending on a mutable BO in the state pool for handling
growing state pools, add a concept of "wrapper" BOs which just wrap an
actual BO. This way, the wrapper can exist once for all of time and we
can put it in relocation lists even if the actual BO it references gets
swapped out.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
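The "wrapper" concept can be sketched as a BO-like struct that merely points at the current real BO. The struct layout and the names sketch_bo and sketch_bo_unwrap below are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_bo {
   uint32_t gem_handle;
   bool is_wrapper;
   struct sketch_bo *wrapped; /* only meaningful when is_wrapper is true */
};

/* Follows wrapper links until the actual BO is reached. */
static struct sketch_bo *
sketch_bo_unwrap(struct sketch_bo *bo)
{
   while (bo->is_wrapper) {
      assert(bo->wrapped != NULL);
      bo = bo->wrapped;
   }
   return bo;
}
```

A relocation list can then hold a pointer to the wrapper for its whole lifetime; swapping wrapper.wrapped to a larger BO when the pool grows changes what every such entry resolves to at submit time.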
Jason Ekstrand [Tue, 29 Oct 2019 01:12:24 +0000 (20:12 -0500)]
anv: Replace ANV_BO_EXTERNAL with anv_bo::is_external
We're not THAT strapped for space that we can't burn one extra bit for
a boolean. If we're really worried about it, we can always shrink the
flags field to 16 bits because the kernel only uses 7 currently.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 19:51:19 +0000 (14:51 -0500)]
anv: Inline anv_block_pool_get_bo
It has exactly one caller and we're about to change some of the dynamics
which would make this confusing as a separate function.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 21:33:23 +0000 (16:33 -0500)]
anv: Declare the bo in the anv_block_pool_foreach_bo loop
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 19:33:48 +0000 (14:33 -0500)]
anv: Stop storing the GEM handle in anv_reloc_list_add
We have to go through and rewrite them all anyway so it doesn't do us
any good to put them in the list in anv_reloc_list_add. Also, for state
pools the handles are likely wrong by the time vkQueueSubmit is called.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 19:28:02 +0000 (14:28 -0500)]
anv: Fix a relocation race condition
Previously, we would read the offset from the BO in anv_reloc_list_add
to generate the presumed offset and then again in the caller to compute
the 64-bit address to write into the buffer. However, if the offset
somehow changed between these two points, the presumed offset would no
longer match the written offset. This is unlikely to actually ever be a
problem in practice because the presumed offset gets recorded first and
so if the written address is wrong then the presumed offset is almost
certainly wrong and the relocation will trigger. However, it's much
safer to simply have anv_reloc_list_add return the 64-bit address.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
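The fix described above amounts to reading the target offset once and deriving both values from that single read. The following sketch uses hypothetical types and names (sketch_bo, sketch_reloc_list, sketch_reloc_list_add) and a fixed-size list; it is not the anv implementation, and a real multi-threaded version would also need the read itself to be a single atomic load.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_RELOCS 16 /* hypothetical fixed capacity for the sketch */

struct sketch_bo {
   uint64_t offset;
};

struct sketch_reloc {
   uint64_t presumed_offset;
   uint32_t delta;
};

struct sketch_reloc_list {
   struct sketch_reloc relocs[MAX_RELOCS];
   uint32_t count;
};

/* Records a relocation and returns the address the caller should write. */
static uint64_t
sketch_reloc_list_add(struct sketch_reloc_list *list,
                      const struct sketch_bo *target, uint32_t delta)
{
   assert(list->count < MAX_RELOCS);

   /* Read the offset a single time into a local; the recorded presumed
    * offset and the returned address both derive from this one value.
    */
   uint64_t target_offset = target->offset;

   list->relocs[list->count++] = (struct sketch_reloc) {
      .presumed_offset = target_offset,
      .delta = delta,
   };

   return target_offset + delta;
}
```

Because the caller writes the returned value instead of re-reading the BO, the written address and the presumed offset cannot disagree.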
Jason Ekstrand [Fri, 25 Oct 2019 17:45:41 +0000 (12:45 -0500)]
anv: Use a util_sparse_array for the GEM handle -> BO map
This lets us do less allocation because the anv_bo's are now embedded in
the sparse array and it also allows lock-free translation from GEM
handle to BO which will be useful in future commits.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 25 Oct 2019 18:01:55 +0000 (13:01 -0500)]
anv: Move refcount to anv_bo
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Sat, 5 Oct 2019 19:07:50 +0000 (14:07 -0500)]
util: Add a util_sparse_array data structure
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Pierre-Eric Pelloux-Prayer [Tue, 29 Oct 2019 18:45:48 +0000 (19:45 +0100)]
mesa: enable msaa in clear_with_quad if needed
If the DrawBuffer sample count is > 1 and msaa is enabled we must also
enable msaa when clearing it.
Fixes: ea5b7de138b ("radeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1991
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Witold Baryluk <witold.baryluk@gmail.com>
Lionel Landwerlin [Thu, 31 Oct 2019 09:34:35 +0000 (11:34 +0200)]
intel/perf: fix Android build
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 15b7b56eb2fb ("intel/perf: add TGL support")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-By: Tapani Pälli <tapani.palli@intel.com>
Tomeu Vizoso [Wed, 30 Oct 2019 10:41:41 +0000 (11:41 +0100)]
gitlab-ci: Disable lima jobs
The runner that submits jobs there is down and will take some time to
get fixed. Disable them for now to keep the CI green.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Bas Nieuwenhuizen [Wed, 30 Oct 2019 14:00:39 +0000 (15:00 +0100)]
radv: Fix disk_cache_get size argument.
Got some int->pointer warnings and 20 is not a valid pointer ....
Fixes: 2e3a635ee69 "radv: Add an early exit in the secure compile if we already have the cache entries."
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Andrii Simiklit [Mon, 28 Oct 2019 12:23:55 +0000 (14:23 +0200)]
main: fix several 'may be used uninitialized' warnings
This patch fixes approximately 39 warnings in 'texcompress_etc.c'
for the release configuration
v2: Fixed by adding the unreachable case to the etc2_rgb8_fetch_texel
( Eric Engestrom <eric.engestrom@intel.com> )
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Bas Nieuwenhuizen [Thu, 31 Oct 2019 01:36:23 +0000 (02:36 +0100)]
anv: Remove _mesa_locale_init/fini calls.
The resulting locale is not used for Vulkan, and it is not reference
counted, giving issues when multiple instances are created.
CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Thu, 31 Oct 2019 01:35:51 +0000 (02:35 +0100)]
turnip: Remove _mesa_locale_init/fini calls.
The resulting locale is not used for Vulkan, and it is not reference
counted, giving issues when multiple instances are created.
CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Thu, 31 Oct 2019 01:33:46 +0000 (02:33 +0100)]
radv: Remove _mesa_locale_init/fini calls.
The resulting locale is not used for Vulkan, and it is not reference
counted, giving issues when multiple instances are created.
CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Pierre-Eric Pelloux-Prayer [Wed, 30 Oct 2019 13:28:01 +0000 (14:28 +0100)]
radeonsi: tell the shader disk cache what IR is used
Until 8bef4df196fbb, the IR (TGSI or NIR) was used in disk_cache driver_flags.
This commit restores this feature to avoid crashing when switching from
one IR to the other.
As radeonsi's default is TGSI, I used "driver_flags & 0x8000000 = 0" for TGSI
to keep the same driver_flags.
Fixes: 8bef4df196f ("radeonsi: add si_debug_options for convenient adding/removing of options")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
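A small sketch of how a single bit can distinguish the two IRs in the cache key. Only the bit value 0x8000000 comes from the commit message; the macro and function names are hypothetical, and the sketch assumes no other option uses that bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit value quoted in the commit message; the name is hypothetical. */
#define SKETCH_IR_NIR_BIT UINT32_C(0x8000000)

/* Builds the flags that go into the disk cache key. TGSI leaves the bit
 * clear, so keys for the default IR are unchanged.
 */
static uint32_t
sketch_driver_flags(uint32_t other_flags, bool uses_nir)
{
   return uses_nir ? (other_flags | SKETCH_IR_NIR_BIT) : other_flags;
}
```

With the bit folded in, shaders compiled from TGSI and from NIR get different keys and can no longer be confused with each other in the cache.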
Lionel Landwerlin [Fri, 20 Sep 2019 18:11:33 +0000 (21:11 +0300)]
intel/perf: add TGL support
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Robert Foss [Tue, 22 Oct 2019 17:31:52 +0000 (19:31 +0200)]
android: Add panfrost support to build scripts
Currently the Android build system doesn't expose the panfrost
driver.
This patch enables the panfrost driver to be built for the
Android platform.
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-By: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Robert Foss [Fri, 25 Oct 2019 15:34:37 +0000 (17:34 +0200)]
nir: Build nir_lower_point_size.c in libmesa_nir
nir_lower_point_size.c was not built into the libmesa_nir library for non-meson
builds. However it was included in the meson build.
This patch fixes that.
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Iago Toral Quiroga [Tue, 29 Oct 2019 07:32:44 +0000 (08:32 +0100)]
v3d: rename vertex shader key (num)_fs_inputs fields
Until now this made sense because we always paired vertex shaders
with fragment shaders, but as soon as we implement geometry and
tessellation shaders that will no longer be the case, so rename
this to (num_)used_outputs.
v2: Use 'used_outputs' instead of ns_outputs, which is more explicit (Eric).
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Mauro Rossi [Thu, 31 Oct 2019 00:59:07 +0000 (01:59 +0100)]
android: aco: fix Lower to CSSA
Fixes the following build error:
external/mesa/src/amd/compiler/aco_spill.cpp:1768:
error: undefined reference to 'aco::lower_to_cssa(aco::Program*, aco::live&, radv_nir_compiler_options const*)'
Fixes: 0b8216b ("aco: Lower to CSSA")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Jan Zielinski [Tue, 29 Oct 2019 18:29:27 +0000 (19:29 +0100)]
gallium/swr: Fix depth values for blit scenario
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Jordan Justen [Fri, 15 Feb 2019 19:35:28 +0000 (11:35 -0800)]
iris/gen11+: Move flush for render target change
When starting a BLORP operation, we do the BTI-change flush. However,
when ending it and transitioning back to regular drawing, we change the
render target again - without a set_framebuffer_state() call. We need
to do the BTI flush there too. BLORP flags IRIS_DIRTY_RENDER_BUFFER
now, which will cause the next draw to get the BTI flush again.
(explanation of fix by Ken)
Fixes: 2b956a093a1 ("iris: totally untested icelake support")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jordan Justen [Fri, 15 Feb 2019 19:31:31 +0000 (11:31 -0800)]
iris: Add IRIS_DIRTY_RENDER_BUFFER state flag
Fixes: 2b956a093a1 ("iris: totally untested icelake support")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Samuel Pitoiset [Mon, 28 Oct 2019 13:41:13 +0000 (14:41 +0100)]
radv: declare NGG scratch for VS or TES and only on GFX10
Do not need to declare it for other stages because this is for
streamout.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Arno Messiaen [Tue, 17 Sep 2019 21:40:03 +0000 (23:40 +0200)]
lima: add cubemap support
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Arno Messiaen [Sat, 12 Oct 2019 22:05:57 +0000 (00:05 +0200)]
lima: introduce ppir_op_load_coords_reg to differentiate between loading texture coordinates straight from a varying vs loading them from a register
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Arno Messiaen [Sun, 29 Sep 2019 21:20:45 +0000 (23:20 +0200)]
lima: add layer_stride field to lima_resource struct
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Arno Messiaen [Sun, 29 Sep 2019 21:21:39 +0000 (23:21 +0200)]
lima: fix stride in texture descriptor
Signed-off-by: Arno Messiaen <arnomessiaen@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Ian Romanick [Tue, 29 Oct 2019 19:18:16 +0000 (12:18 -0700)]
intel/compiler: Report the number of non-spill/fill SEND messages on vec4 too
This makes shader-db's report.py work on Haswell and earlier platforms.
The problem is that the script would detect the "sends" output for
scalar shaders and expect it in vec4 shaders too. When it didn't find
it, the script would fail with:
Traceback (most recent call last):
  File "./report.py", line 351, in <module>
    main()
  File "./report.py", line 182, in main
    before_count = before[p][m]
KeyError: 'sends'
Fixes: f192741ddd8 ("intel/compiler: Report the number of non-spill/fill SEND messages")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tapani Pälli [Wed, 30 Oct 2019 12:43:57 +0000 (14:43 +0200)]
nir: fix couple of compile warnings
Fixes "warning: braces around scalar initializer" warnings.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bas Nieuwenhuizen [Wed, 30 Oct 2019 20:58:42 +0000 (21:58 +0100)]
radv: Fix timeout handling in syncobj wait.
libdrm returns -errno rather than the raw ioctl return value of -1.
Fixes: 1c3cda7d277 "radv: Add syncobj signal/reset/wait to winsys."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
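A hedged sketch of what handling a "-errno" style return looks like. The enum and the function classify_wait_return are hypothetical, and the sketch assumes that a timeout is reported as ETIME; the point is only that the error code is carried in the return value itself rather than in the global errno.

```c
#include <assert.h>
#include <errno.h>

enum wait_result {
   WAIT_SUCCESS,
   WAIT_TIMEOUT,
   WAIT_ERROR,
};

/* Classifies a return value that is 0 on success and -errno on failure. */
static enum wait_result
classify_wait_return(int ret)
{
   assert(ret <= 0);

   if (ret == 0)
      return WAIT_SUCCESS;

   /* With a -errno convention, comparing against -1 and then checking
    * errno would misreport a timeout as a hard error.
    */
   if (ret == -ETIME)
      return WAIT_TIMEOUT;

   return WAIT_ERROR;
}
```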
Ilia Mirkin [Mon, 14 Oct 2019 06:40:11 +0000 (02:40 -0400)]
nv50/ir: mark STORE destination inputs as used
Observed an issue when looking at the code generated by the
image-vertex-attrib-input-output piglit test. Even though the test
itself worked fine (due to TIC 0 being used for the image), this needs
to be fixed.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Ilia Mirkin [Mon, 4 Feb 2019 04:25:07 +0000 (23:25 -0500)]
gm107/ir: fix loading z offset for layered 3d image bindings
Unfortunately we don't know if a particular load is a real 2d image (as
would be a cube face or 2d array element), or a layer of a 3d image.
Since we pass in the TIC reference, the instruction's type has to match
what's in the TIC (experimentally). In order to properly support
bindless images, this also can't be done by looking at the current
bindings and generating appropriate code.
As a result all plain 2d loads are converted into a pair of 2d/3d loads,
with appropriate predicates to ensure only one of those actually
executes, and the values are all merged in.
This goes somewhat against the current flow, so for GM107 we do the OOB
handling directly in the surface processing logic. Perhaps the other
gens should do something similar, but that is left to another change.
This fixes dEQP tests like image_load_store.3d.*_single_layer and GL-CTS
tests like shader_image_load_store.non-layered_binding without breaking
anything else.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "20.0" <mesa-stable@lists.freedesktop.org>
Lionel Landwerlin [Wed, 30 Oct 2019 22:03:30 +0000 (00:03 +0200)]
intel/dev: set default num_eu_per_subslice on gen12
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8125d7960b ("intel/dev: Add preliminary device info for Tigerlake")
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Dylan Baker [Wed, 30 Oct 2019 22:18:27 +0000 (15:18 -0700)]
docs/new_features: Empty the feature list for the 20.0 cycle
Dylan Baker [Wed, 30 Oct 2019 21:56:02 +0000 (14:56 -0700)]
Bump VERSION to 20.0.0-devel
Jordan Justen [Fri, 25 Oct 2019 11:20:37 +0000 (04:20 -0700)]
docs/relnotes/new_features.txt: Add note about gen12 support
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Jordan Justen [Tue, 20 Mar 2018 15:23:35 +0000 (08:23 -0700)]
intel/eu/validate/gen12: Add TGL to eu_validate tests.
These reworks were combined into this patch:
* Matt Turner: i965: Disable NoDDChk/NoDDClr test on Gen12+
* Francisco Jerez: intel/eu/validate/gen12: Disable
qword_low_power_no_depctrl eu_validate test.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jordan Justen [Tue, 8 Aug 2017 21:08:58 +0000 (14:08 -0700)]
intel/dev: Add preliminary device info for Tigerlake
Reworks:
* adjust 64-bit support, hiz (Jason Ekstrand)
* sim-id (Lionel Landwerlin)
* adjust threads, urb size (Rafael Antognolli)
* adjust urb size (Kenneth Graunke)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Lionel Landwerlin [Fri, 25 Oct 2019 10:52:47 +0000 (13:52 +0300)]
intel/dump_gpu: handle context create extended ioctl
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Bas Nieuwenhuizen [Wed, 30 Oct 2019 18:52:51 +0000 (19:52 +0100)]
radv: Allocate space for temp. semaphore parts.
Calculated the number for allocation and did not
reserve space ....
Fixes: 2117c53b723 "radv: Add temporary datastructure for submissions."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Rafael Antognolli [Tue, 30 Apr 2019 20:34:20 +0000 (13:34 -0700)]
anv: Add Tile Cache Flush for Unified Cache.
Rafael Antognolli [Tue, 30 Apr 2019 20:34:06 +0000 (13:34 -0700)]
blorp: Add Tile Cache Flush for Unified Cache.
Rafael Antognolli [Mon, 29 Apr 2019 18:05:07 +0000 (11:05 -0700)]
iris: Add Tile Cache Flush for Unified Cache.
Jordan Justen [Sat, 9 Sep 2017 02:08:21 +0000 (19:08 -0700)]
intel/genxml: Add gen12 tile cache flush bit
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Daniel Schürmann [Thu, 24 Oct 2019 16:27:25 +0000 (18:27 +0200)]
aco: implement VGPR spilling
VGPR spilling is implemented via MUBUF instructions and scratch memory.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 30 Oct 2019 17:24:39 +0000 (18:24 +0100)]
aco: always set scratch_offset in startpgm
This patch also moves private_segment_buffer and
scratch_offset to Program to easily access it.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 30 Oct 2019 13:54:44 +0000 (14:54 +0100)]
aco: omit linear VGPRs as spill variables
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 30 Oct 2019 13:42:00 +0000 (14:42 +0100)]
aco: ensure that spilled VGPR reloads are done after p_logical_start
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Thu, 24 Oct 2019 09:38:37 +0000 (11:38 +0200)]
aco: simplify calculation of target register pressure when spilling
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Rhys Perry [Wed, 30 Oct 2019 18:00:36 +0000 (18:00 +0000)]
aco: fix new_demand calculation for first instructions
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Daniel Schürmann [Wed, 30 Oct 2019 11:32:32 +0000 (12:32 +0100)]
aco: don't add interferences between spilled phi operands
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 30 Oct 2019 11:04:22 +0000 (12:04 +0100)]
aco: consider loop_exit blocks like merge blocks, even if they have only one predecessor
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 30 Oct 2019 11:00:23 +0000 (12:00 +0100)]
aco: don't insert the exec mask into set of live-out variables when spilling
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Wed, 16 Oct 2019 14:39:06 +0000 (16:39 +0200)]
aco: fix transitive affinities of spilled variables
Variables spilled on both branch legs need to be assigned to the same spilling slot.
These affinities can be transitive through multiple merge blocks.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Tue, 29 Oct 2019 10:58:21 +0000 (11:58 +0100)]
aco: fix live-range splits of phis
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Tue, 29 Oct 2019 10:57:11 +0000 (11:57 +0100)]
aco: remove potential critical edge on loops.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Tue, 29 Oct 2019 10:56:09 +0000 (11:56 +0100)]
aco: improve live variable analysis
This patch makes the live variable analysis more precise
w.r.t. killed phi operands and the block's register pressure.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Daniel Schürmann [Tue, 15 Oct 2019 16:23:52 +0000 (18:23 +0200)]
aco: Lower to CSSA
Converting to 'Conventional SSA Form' ensures correctness w.r.t. spilling of phi nodes.
Previously, it was possible that phi operands have intersecting live-ranges, and thus,
couldn't get spilled to the same spilling slot. For this reason, ACO tried to avoid
spilling phis, even if it was beneficial.
This patch implements a conversion pass which is currently only called if spilling is necessary.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Jonathan Marek [Wed, 3 Jul 2019 18:08:37 +0000 (14:08 -0400)]
etnaviv: fix non-pointsprite points on GC7000L
Fixes these deqp tests (and more):
dEQP-GLES2.functional.draw.draw_arrays.points.single_attribute
dEQP-GLES2.functional.draw.draw_arrays.points.multiple_attributes
dEQP-GLES2.functional.draw.draw_arrays.points.default_attribute
dEQP-GLES2.functional.draw.draw_elements.points.single_attribute
dEQP-GLES2.functional.draw.draw_elements.points.multiple_attributes
dEQP-GLES2.functional.draw.draw_elements.points.default_attribute
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Jonathan Marek [Sun, 20 Oct 2019 18:37:25 +0000 (14:37 -0400)]
etnaviv: stencil fix
The final version of previous stencil fix patch ended up breaking one-sided
stencil.
Fixes remaining failures in these deqp tests (tested on GC3000/GC7000L):
dEQP-GLES2.functional.fragment_ops.depth_stencil.*
Note: deqp tests require --deqp-gl-config-name=rgba8888d24s8ms0
Fixes: 05da025f ("etnaviv: fix two-sided stencil")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Jonathan Marek [Mon, 2 Sep 2019 18:46:15 +0000 (14:46 -0400)]
etnaviv: fix depth bias
Fixes remaining failures in these deqp tests (tested on GC3000/GC7000L):
dEQP-GLES2.functional.polygon_offset.*
Fixes: 6c3c05dc ("etnaviv: fix polygon offset")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Jordan Justen [Fri, 10 May 2019 18:50:54 +0000 (11:50 -0700)]
iris: Set MOCS for external surfaces to uncached
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Rafael Antognolli [Tue, 13 Aug 2019 21:47:27 +0000 (14:47 -0700)]
iris: Align fast clear color state buffer to a page.
On gen11 and older, compressed images are tiled and aligned to 4K. On
gen12 this 4K alignment restriction was removed. However, only aligning
the fast clear color buffer to 64B (a cacheline, as stated in the
documentation) is causing some bugs where the fast clear color is not
converted during the fast clear operation. Aligning things to 4K seems
to fix it.
v2: Fix typo case in the comment (Nanley)
v3: Rebase and fix conflicts.
v4: Fix rebase mistake (Nanley).
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
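For reference, rounding an offset up to a power-of-two boundary such as 64B or 4K can be written as below. The helper name sketch_align_u64 is hypothetical; Mesa has its own alignment helpers, which this sketch does not try to reproduce.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds 'value' up to the next multiple of 'alignment', which must be a
 * non-zero power of two.
 */
static uint64_t
sketch_align_u64(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}
```

For example, an offset of 4097 aligned to 64 becomes 4160, while aligning it to 4096 moves it to 8192, the start of the next page.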
Rafael Antognolli [Tue, 13 Aug 2019 21:47:27 +0000 (14:47 -0700)]
anv: Align fast clear color state buffer to a page.
On gen11 and older, compressed images are tiled and aligned to 4K. On
gen12 this 4K alignment restriction was removed. However, only aligning
the fast clear color buffer to 64B (a cacheline, as stated in the
documentation) is causing some bugs where the fast clear color is not
converted during the fast clear operation. Aligning things to 4K seems
to fix it.
v2: Assert that image->planes[plane].offset is 4K aligned (Nanley)
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Erik Faye-Lund [Wed, 23 Oct 2019 10:16:22 +0000 (12:16 +0200)]
zink: only enable KHR_external_memory_fd if supported
While we're at it, make sure we error out if it's not supported when
required.
This brings us a bit closer to being able to test on SwiftShader, which
doesn't currently support KHR_external_memory_fd.
Bas Nieuwenhuizen [Wed, 30 Oct 2019 13:51:17 +0000 (14:51 +0100)]
radv: Start signalling semaphores in WSI acquire.
Winsys semaphores without signal operation get silently ignored.
Not so for syncobjs, so actually signal them.
Fixes: 84d9551b232 "radv: Always enable syncobj when supported for all fences/semaphores."
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2030
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Rhys Perry [Mon, 21 Oct 2019 14:08:07 +0000 (15:08 +0100)]
aco: rename README to README.md
Closes: #1974
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Rhys Perry [Tue, 29 Oct 2019 11:19:39 +0000 (11:19 +0000)]
aco: a couple loop handling fixes for GFX10 hazard pass
It was joining from the wrong blocks and block.kind is a bitmask instead
of an enum.
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
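The second fix hinges on the difference between comparing a bitmask for equality and testing one bit in it. The flag names below are hypothetical and written in C for illustration; they are not the ACO definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical block-kind flags; several can be set on the same block. */
enum sketch_block_kind {
   sketch_block_kind_loop_header = 1u << 0,
   sketch_block_kind_loop_exit   = 1u << 1,
   sketch_block_kind_merge       = 1u << 2,
};

/* Correct test for a bitmask: check the bit, do not compare for equality. */
static bool
sketch_block_is_loop_header(uint32_t kind)
{
   return (kind & sketch_block_kind_loop_header) != 0;
}
```

A block that is both a loop header and a merge block fails a "kind == loop_header" comparison but passes the bit test.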
Matt Turner [Mon, 9 Sep 2019 20:01:06 +0000 (13:01 -0700)]
intel/compiler: Add instruction compaction support on Gen12
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Matt Turner [Thu, 15 Feb 2018 18:33:18 +0000 (10:33 -0800)]
intel/compiler: Make separate src0/src1 index tables
TGL uses different data (and even a different format!) for each source.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Matt Turner [Tue, 13 Feb 2018 00:35:49 +0000 (16:35 -0800)]
intel/compiler: Inline get_src_index()
TGL will have separate tables for src0 and src1, so the shared function
will no longer make sense.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Matt Turner [Tue, 13 Feb 2018 00:26:20 +0000 (16:26 -0800)]
intel/compiler: Restructure instruction compaction in preparation for Gen12
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Matt Turner [Wed, 16 Oct 2019 19:45:55 +0000 (12:45 -0700)]
intel/compiler: Remove unreachable() from brw_reg_type.c
The EU compaction unit test fuzzes the compaction code by flipping bits.
We use a simple skip_bits() function with a list of reserved bits to
ignore, but for more complex cases like invalid combinations of register
file:type, we need either machinery to check validity or for these
functions to simply inform us whether a combination was valid.
enum brw_reg_type is a 4-bit field in brw_reg, so rather than expanding it
with an "INVALID" value, just return -1 and let the caller check for
that.
Scott suggested redefining unreachable() within the unit test to
longjmp() which would allow driver code like this to still use it and
allow the test to handle expected failures like this. If that plan works
out, I plan to revert this.
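The "return -1 and let the caller check" approach might look roughly like the sketch below. The enums, the encodings and the function sketch_reg_type_to_hw are invented for illustration and do not match the real tables in brw_reg_type.c.

```c
/* Hypothetical register files and types for the sketch. */
enum sketch_reg_file { SKETCH_FILE_GRF, SKETCH_FILE_IMM };
enum sketch_reg_type { SKETCH_TYPE_D, SKETCH_TYPE_F, SKETCH_TYPE_VF };

/* Returns a made-up hardware encoding, or -1 if the file:type combination
 * is invalid. A fuzzing test can then skip invalid inputs instead of
 * hitting unreachable().
 */
static int
sketch_reg_type_to_hw(enum sketch_reg_file file, enum sketch_reg_type type)
{
   switch (type) {
   case SKETCH_TYPE_D:
      return 0;
   case SKETCH_TYPE_F:
      return 1;
   case SKETCH_TYPE_VF:
      /* In this sketch, packed vector-float only exists as an immediate. */
      return file == SKETCH_FILE_IMM ? 2 : -1;
   }

   /* Values outside the enum, e.g. produced by bit flipping. */
   return -1;
}
```

The trade-off is that every caller now has to check for -1, which is why the commit message treats this as a stopgap.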
Jonathan Marek [Fri, 6 Sep 2019 16:59:15 +0000 (12:59 -0400)]
freedreno/a2xx: add missing vertex formats (SSCALE/USCALE/FIXED)
Mostly for vertex formats, but they are supported as texture formats too
(untested however).
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@gmail.com>