mesa.git
5 years ago radv: Unset vk_info in radv_image_create_layout.
Bas Nieuwenhuizen [Tue, 24 Sep 2019 11:42:31 +0000 (13:42 +0200)]
radv: Unset vk_info in radv_image_create_layout.

For better test coverage of this corner case.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago radv: Handle slightly different image dimensions.
Bas Nieuwenhuizen [Tue, 24 Sep 2019 11:23:36 +0000 (13:23 +0200)]
radv: Handle slightly different image dimensions.

The minigbm comment really says it all. We should
fix minigbm as well, but for now this is the more
robust solution.

Note that this only changes width and height for
the surface creation, not for the image and hence
also not for the sampler, where it would wreak
havoc due to the normalized coords.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago radv: Delay patching for imported images until layout time.
Bas Nieuwenhuizen [Mon, 23 Sep 2019 14:42:39 +0000 (16:42 +0200)]
radv: Delay patching for imported images until layout time.

We want this flexibility because in GFX10 we lose any stride fields,
so we have to make sure our width/height are in alignment with
the external image we import.

Furthermore, we need the ability to inject tiling modifiers at import
time which is strictly after create time for Android. So, with the
layout & patch functions being fully independent of pCreateInfo, we
can delay it until import/bind time.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago radv: Split out layout code from image creation.
Bas Nieuwenhuizen [Mon, 23 Sep 2019 14:14:50 +0000 (16:14 +0200)]
radv: Split out layout code from image creation.

So we can delay the layout until later in some import cases.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago radv: Handle device memory alloc failure with normal free.
Bas Nieuwenhuizen [Wed, 10 Jul 2019 12:21:11 +0000 (14:21 +0200)]
radv: Handle device memory alloc failure with normal free.

Less duplication/complexity.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago radv: Cleanup buffer_from_fd.
Bas Nieuwenhuizen [Tue, 24 Sep 2019 15:26:41 +0000 (17:26 +0200)]
radv: Cleanup buffer_from_fd.

Unused stride/offset args.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago gitlab-ci/lava: Test Lima driver with dEQP
Tomeu Vizoso [Sun, 6 Oct 2019 15:49:56 +0000 (08:49 -0700)]
gitlab-ci/lava: Test Lima driver with dEQP

Run dEQP on boards with Mali 400 and 450 in Baylibre's lab.

There are lots of skipped tests because of crashes and undetermined
behavior. May be a good idea to run the tests with valgrind and fix any
issues found.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
5 years ago gitlab-ci/lava: Use files to list tests to skip
Tomeu Vizoso [Sun, 6 Oct 2019 22:21:39 +0000 (15:21 -0700)]
gitlab-ci/lava: Use files to list tests to skip

As the non-LAVA runner script does, have per-GPU version files listing
the tests that are to be skipped, due to being very slow, unstable, etc.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
5 years ago intel/tools: Support multiple contexts in intel_dump_gpu.
Rafael Antognolli [Mon, 16 Sep 2019 16:07:26 +0000 (09:07 -0700)]
intel/tools: Support multiple contexts in intel_dump_gpu.

Create basic aub_context on GEM_CONTEXT_CREATE.

Set it up and submit a context + ring + pphwsp during execbuf
submission, if it has not been initialized yet.

v2: Write the HWSP only once per engine (Lionel).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
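The lazy setup described in the commit above (create a basic context on GEM_CONTEXT_CREATE, initialize it at the first execbuf) can be sketched as follows. This is only an illustration under stated assumptions: struct example_context, example_context_setup() and example_execbuf() are hypothetical names, not the intel_dump_gpu code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-GEM-context state described above. */
struct example_context {
    uint32_t id;
    bool initialized;
};

/* Counts how often the (expensive) setup ran, for illustration only. */
static unsigned setup_calls;

/* Hypothetical setup step: would write context + ring + pphwsp. */
static void
example_context_setup(struct example_context *ctx)
{
    setup_calls++;
    ctx->initialized = true;
}

/* Hypothetical execbuf hook: set the context up only on first use. */
static void
example_execbuf(struct example_context *ctx)
{
    if (!ctx->initialized)
        example_context_setup(ctx);
    /* ... the batch itself would be dumped here ... */
}
```

Calling example_execbuf() twice on the same context runs the setup once, which is the behaviour the message describes.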
5 years ago intel/tools: Add basic aub_context code and helpers.
Rafael Antognolli [Fri, 13 Sep 2019 22:13:31 +0000 (15:13 -0700)]
intel/tools: Add basic aub_context code and helpers.

v2:
 - Only dump context if there were no errors (Lionel).
 - Store counter for context handles in aub_file (Lionel).
v3:
 - Add a comment about aub_context -> GEM context (Lionel).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years ago intel/tools: Use common code for GGTT address allocation.
Rafael Antognolli [Tue, 17 Sep 2019 01:12:02 +0000 (18:12 -0700)]
intel/tools: Use common code for GGTT address allocation.

We want to be able to create contexts on demand, and increase the GGTT
as needed for that. Use the aub_map_ggtt() function for that.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years ago intel/tools: Factor out GGTT allocation.
Rafael Antognolli [Mon, 9 Sep 2019 21:24:41 +0000 (14:24 -0700)]
intel/tools: Factor out GGTT allocation.

We want to reuse it in execlists_setup().

v2: Rename it to write_ggtt_ptes() (Lionel).
v3: Rename it to aub_map_ggtt() (Lionel).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years ago radv: Implement & enable VK_EXT_texel_buffer_alignment.
Bas Nieuwenhuizen [Thu, 10 Oct 2019 09:40:27 +0000 (11:40 +0200)]
radv: Implement & enable VK_EXT_texel_buffer_alignment.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago radv: use a compute shader for copying timestamp query results
Samuel Pitoiset [Wed, 9 Oct 2019 11:34:52 +0000 (13:34 +0200)]
radv: use a compute shader for copying timestamp query results

When the timestamp is not ready (i.e. UINT64_MAX), the availability bit
should be zero. The previous code used to copy the timestamp value
as the availability bit and that's completely wrong.

Because it's not that simple to emit a conditional with the CP, the
driver now uses a compute shader for copying timestamp query results.

Fixes dEQP-VK.pipeline.timestamp.misc_tests.reset_query_before_copy.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
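The availability rule from the commit above can be sketched on the CPU side as below. This is a minimal sketch, not the compute shader RADV uses; copy_timestamp_result() and TIMESTAMP_NOT_READY are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Sentinel for a timestamp slot that has not been written yet,
 * as stated in the commit message. */
#define TIMESTAMP_NOT_READY UINT64_MAX

/* Hypothetical helper: copies one timestamp query result and returns
 * the availability value (1 if the timestamp was written, 0 if not). */
static uint64_t
copy_timestamp_result(uint64_t timestamp, uint64_t *dst_value)
{
    uint64_t available = (timestamp != TIMESTAMP_NOT_READY) ? 1 : 0;

    /* The availability word comes from the comparison; it is never a
     * copy of the timestamp itself. */
    if (available)
        *dst_value = timestamp;

    return available;
}
```

A slot that still holds UINT64_MAX yields availability 0 and leaves the destination value untouched.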
5 years ago radv: sync before resetting query pools if timestamps have been written
Samuel Pitoiset [Wed, 9 Oct 2019 12:30:49 +0000 (14:30 +0200)]
radv: sync before resetting query pools if timestamps have been written

Otherwise, the GPU might write timestamp queries after the reset
operation. This is similar to other query operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago aco: Clean up usages of PhysReg::reg from aco_assembler.
Timur Kristóf [Wed, 9 Oct 2019 08:40:24 +0000 (10:40 +0200)]
aco: Clean up usages of PhysReg::reg from aco_assembler.

These are not needed anymore, since PhysReg has an implicit
conversion operator that can convert it to unsigned int,
which is equivalent to accessing this field.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Add extra assertion for number of FS input VGPRs.
Timur Kristóf [Thu, 3 Oct 2019 17:32:48 +0000 (19:32 +0200)]
aco: Add extra assertion for number of FS input VGPRs.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Fix s_dcache_wb on GFX10.
Timur Kristóf [Tue, 17 Sep 2019 17:59:17 +0000 (19:59 +0200)]
aco: Fix s_dcache_wb on GFX10.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Have s_waitcnt_vscnt write to NULL.
Rhys Perry [Thu, 12 Sep 2019 14:28:49 +0000 (15:28 +0100)]
aco: Have s_waitcnt_vscnt write to NULL.

Not sure if this instruction actually writes anything, but LLVM
disassembles a destination and sets it to NULL.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Use the VOP3-only add/sub GFX10 instructions if needed.
Rhys Perry [Thu, 12 Sep 2019 12:25:18 +0000 (13:25 +0100)]
aco: Use the VOP3-only add/sub GFX10 instructions if needed.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Initial work to avoid GFX10 hazards.
Rhys Perry [Thu, 12 Sep 2019 15:42:17 +0000 (17:42 +0200)]
aco: Initial work to avoid GFX10 hazards.

Currently just breaks up SMEM groups and fixes
FeatureVMEMtoScalarWriteHazard (name from LLVM).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: pad code with s_code_end on GFX10
Rhys Perry [Tue, 8 Oct 2019 12:47:00 +0000 (14:47 +0200)]
aco: pad code with s_code_end on GFX10

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: workaround GFX10 0x3f branch bug
Rhys Perry [Tue, 10 Sep 2019 17:11:13 +0000 (18:11 +0100)]
aco: workaround GFX10 0x3f branch bug

According to LLVM, branches with an offset of 0x3f are buggy.

v2: (by Timur Kristóf)
- extract the GFX10 specific part to its own function

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
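One way to picture the 0x3f workaround from the commit above is shown below. It is a sketch under assumptions: it supposes the branch is a forward branch and that padding one dword directly after it pushes the target one dword further away. The function names are hypothetical and this is not the aco_assembler code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Branch offset reported as buggy on GFX10 (per the commit message). */
#define BUGGY_BRANCH_OFFSET 0x3f

/* Returns true when a branch with this offset needs the workaround. */
static bool
branch_needs_workaround(int16_t offset)
{
    return offset == BUGGY_BRANCH_OFFSET;
}

/* Hypothetical fix-up: if the offset hits the buggy value, assume one
 * padding dword (a no-op) is inserted right after the branch, so the
 * forward offset grows by one. Returns the number of padding dwords. */
static unsigned
fix_branch_offset(int16_t *offset)
{
    if (!branch_needs_workaround(*offset))
        return 0;

    *offset += 1;
    return 1;
}
```

Because inserting padding shifts later code, a real assembler would have to recheck the other branches afterwards; the sketch leaves that out.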
5 years ago aco: Fix VS input VGPRs on GFX10.
Timur Kristóf [Tue, 27 Aug 2019 14:27:41 +0000 (16:27 +0200)]
aco: Fix VS input VGPRs on GFX10.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Assemble opsel in VOP3 instructions.
Rhys Perry [Thu, 12 Sep 2019 18:55:12 +0000 (19:55 +0100)]
aco: Assemble opsel in VOP3 instructions.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Allow literals on VOP3 instructions.
Rhys Perry [Thu, 12 Sep 2019 18:55:36 +0000 (19:55 +0100)]
aco: Allow literals on VOP3 instructions.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
5 years ago aco: Support subvector loops in aco_assembler.
Timur Kristóf [Tue, 8 Oct 2019 12:43:43 +0000 (14:43 +0200)]
aco: Support subvector loops in aco_assembler.

These are currently not used, but could be useful later.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Set GFX10 dimensionality on the instructions that need it.
Timur Kristóf [Tue, 8 Oct 2019 12:42:52 +0000 (14:42 +0200)]
aco: Set GFX10 dimensionality on the instructions that need it.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Use ac_get_sampler_dim, delete duplicate code.
Timur Kristóf [Fri, 4 Oct 2019 13:12:21 +0000 (15:12 +0200)]
aco: Use ac_get_sampler_dim, delete duplicate code.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Set GFX10 DLC bit properly.
Timur Kristóf [Thu, 26 Sep 2019 15:53:17 +0000 (17:53 +0200)]
aco: Set GFX10 DLC bit properly.

The DLC bit is now set to 1 for all loads when GLC is also set,
but cleared to 0 for all stores (otherwise it causes issues),
and also cleared to 0 for atomics.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
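The DLC rule stated in the commit above reduces to a small predicate. The sketch below only restates that rule; enum mem_op and gfx10_dlc_bit() are hypothetical names, not ACO identifiers.

```c
#include <stdbool.h>

/* Hypothetical classification of a memory instruction. */
enum mem_op {
    MEM_OP_LOAD,
    MEM_OP_STORE,
    MEM_OP_ATOMIC,
};

/* DLC is 1 only for loads that also have GLC set; it is 0 for all
 * stores and all atomics, as described in the commit message. */
static bool
gfx10_dlc_bit(enum mem_op op, bool glc)
{
    return op == MEM_OP_LOAD && glc;
}
```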
5 years ago aco: Support GFX10 VOP3 and VOP1 as VOP3 in aco_assembler.
Timur Kristóf [Thu, 26 Sep 2019 15:51:51 +0000 (17:51 +0200)]
aco: Support GFX10 VOP3 and VOP1 as VOP3 in aco_assembler.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Support GFX10 EXP in aco_assembler.
Timur Kristóf [Thu, 26 Sep 2019 15:50:48 +0000 (17:50 +0200)]
aco: Support GFX10 EXP in aco_assembler.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Fix GFX9 FLAT, SCRATCH, GLOBAL instructions, add GFX10 support.
Timur Kristóf [Thu, 26 Sep 2019 15:50:06 +0000 (17:50 +0200)]
aco: Fix GFX9 FLAT, SCRATCH, GLOBAL instructions, add GFX10 support.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Support GFX10 MIMG and GFX9 D16 in aco_assembler.
Timur Kristóf [Thu, 26 Sep 2019 15:48:55 +0000 (17:48 +0200)]
aco: Support GFX10 MIMG and GFX9 D16 in aco_assembler.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Support GFX10 MTBUF in aco_assembler.
Timur Kristóf [Thu, 26 Sep 2019 15:48:08 +0000 (17:48 +0200)]
aco: Support GFX10 MTBUF in aco_assembler.

Also remove img_format from aco_ir, since it can be calculated
from dfmt and nfmt. So only the assembler needs to deal with it.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Link ACO with amd/common.
Timur Kristóf [Fri, 27 Sep 2019 07:26:40 +0000 (09:26 +0200)]
aco: Link ACO with amd/common.

We'd like to use some functions, for example some
ac_shader_util functions in ACO, so we need to link
ACO to AC.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago amd/common: Add extern "C" to some headers that were missing it.
Timur Kristóf [Fri, 27 Sep 2019 07:26:14 +0000 (09:26 +0200)]
amd/common: Add extern "C" to some headers that were missing it.

We'd like to include some of these in C++ code later.
Specifically, ACO is written in C++ and we would like to use
some of this code in ACO in order to avoid code duplication.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Support GFX10 MUBUF in aco_assembler.
Timur Kristóf [Thu, 26 Sep 2019 15:47:51 +0000 (17:47 +0200)]
aco: Support GFX10 MUBUF in aco_assembler.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Support GFX10 DS in aco_assembler.
Timur Kristóf [Thu, 26 Sep 2019 15:47:30 +0000 (17:47 +0200)]
aco: Support GFX10 DS in aco_assembler.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Support GFX10 VINTRP in aco_assembler.
Timur Kristóf [Thu, 26 Sep 2019 15:46:43 +0000 (17:46 +0200)]
aco: Support GFX10 VINTRP in aco_assembler.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Support GFX10 SMEM in aco_assembler.
Timur Kristóf [Thu, 26 Sep 2019 15:46:05 +0000 (17:46 +0200)]
aco: Support GFX10 SMEM in aco_assembler.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Add missing GFX10 specific fields and some README notes.
Timur Kristóf [Thu, 26 Sep 2019 15:45:13 +0000 (17:45 +0200)]
aco: Add missing GFX10 specific fields and some README notes.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: Set +wavefrontsize64 for LLVM disassembler in GFX10 wave64 mode.
Timur Kristóf [Sat, 21 Sep 2019 15:58:08 +0000 (17:58 +0200)]
aco: Set +wavefrontsize64 for LLVM disassembler in GFX10 wave64 mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago v3d: take into account prim_counts_offset
Alejandro Piñeiro [Tue, 8 Oct 2019 13:44:19 +0000 (15:44 +0200)]
v3d: take into account prim_counts_offset

Specifically when reading the primitive counters.

This fixed ~700 CTS tests using this pattern:
dEQP-GLES3.functional.transform_feedback.*

when run after tests like
dEQP-GLES3.functional.prerequisite.read_pixels on the same
caselist. When run individually those tests were passing because
prim_counts_offset was zero.

Fixes: 0f2d1dfe65bfe1ee8f02ce45f100a5508debdfd4 ("v3d: use the GPU to
       record primitives written to transform feedback")

Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
5 years ago radv: get the device name from radeon_info::name
Samuel Pitoiset [Wed, 9 Oct 2019 16:15:42 +0000 (18:15 +0200)]
radv: get the device name from radeon_info::name

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago st/mesa: fix R8 bitmap texture for TGSI paths.
Dave Airlie [Wed, 9 Oct 2019 23:56:27 +0000 (09:56 +1000)]
st/mesa: fix R8 bitmap texture for TGSI paths.

The initial patch only fixed up the NIR path, but forgot
the TGSI path needed fixing as well.

Fixes: f92226931b ("st/mesa: Prefer R8 for bitmap textures")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago anv/pipeline: Capture serialized NIR
Jason Ekstrand [Wed, 9 Oct 2019 18:21:21 +0000 (13:21 -0500)]
anv/pipeline: Capture serialized NIR

This allows the serialized NIR to be displayed in RenderDoc and similar
tools.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years ago clover: Remove unused code
Matt Turner [Wed, 9 Oct 2019 21:52:09 +0000 (14:52 -0700)]
clover: Remove unused code

Fixes: 96b592696f1 ("gallium: Require LLVM >= 3.9")
Bug: https://bugs.gentoo.org/685678

5 years ago clover: use iterator_range in get_kernel_nodes
Greg V [Wed, 4 Jul 2018 17:15:04 +0000 (20:15 +0300)]
clover: use iterator_range in get_kernel_nodes

With libc++ (LLVM's STL implementation), the original code does not compile because an
appropriate vector constructor cannot be found (for the _ForwardIterator one, requirement
is_constructible is not satisfied).

5 years ago radeonsi: enable MSAA shader images
Marek Olšák [Fri, 13 Sep 2019 01:57:00 +0000 (21:57 -0400)]
radeonsi: enable MSAA shader images

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years ago radeonsi: expand FMASK before MSAA image stores are used
Marek Olšák [Fri, 13 Sep 2019 00:20:53 +0000 (20:20 -0400)]
radeonsi: expand FMASK before MSAA image stores are used

Image stores don't use FMASK, so we have to turn it into identity.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years ago radeonsi: apply FMASK to MSAA image loads
Marek Olšák [Fri, 13 Sep 2019 01:56:41 +0000 (21:56 -0400)]
radeonsi: apply FMASK to MSAA image loads

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years ago radeonsi: clean up image_fetch_rsrc
Marek Olšák [Fri, 13 Sep 2019 01:39:31 +0000 (21:39 -0400)]
radeonsi: clean up image_fetch_rsrc

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years ago radeonsi: add FMASK slots for shader images (for MSAA images)
Marek Olšák [Fri, 13 Sep 2019 01:13:08 +0000 (21:13 -0400)]
radeonsi: add FMASK slots for shader images (for MSAA images)

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years ago radeonsi: set the sample index for shader images correctly
Marek Olšák [Sat, 14 Sep 2019 03:58:52 +0000 (23:58 -0400)]
radeonsi: set the sample index for shader images correctly

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years ago radeonsi: fix GLSL imageSamples()
Marek Olšák [Mon, 16 Sep 2019 23:37:36 +0000 (19:37 -0400)]
radeonsi: fix GLSL imageSamples()

We haven't supported MSAA images, so it doesn't matter much.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years ago tgsi/scan: add tgsi_shader_info::msaa_images_declared
Marek Olšák [Fri, 13 Sep 2019 01:09:19 +0000 (21:09 -0400)]
tgsi/scan: add tgsi_shader_info::msaa_images_declared

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years ago nir: add shader_info::last_msaa_image
Marek Olšák [Fri, 13 Sep 2019 01:09:50 +0000 (21:09 -0400)]
nir: add shader_info::last_msaa_image

for radeonsi

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years ago radeonsi: don't set BO metadata for non-zero planes
Marek Olšák [Wed, 9 Oct 2019 21:02:07 +0000 (17:02 -0400)]
radeonsi: don't set BO metadata for non-zero planes

pointed out by Bas

5 years ago radeonsi: ignore metadata for non-zero planes
Marek Olšák [Mon, 30 Sep 2019 18:03:30 +0000 (14:03 -0400)]
radeonsi: ignore metadata for non-zero planes

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radeonsi: remove si_vid_join_surfaces and use combined planar allocations
Marek Olšák [Thu, 19 Sep 2019 01:17:11 +0000 (21:17 -0400)]
radeonsi: remove si_vid_join_surfaces and use combined planar allocations

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radeonsi: allocate planar multimedia formats in 1 buffer
Marek Olšák [Wed, 28 Aug 2019 01:18:57 +0000 (21:18 -0400)]
radeonsi: allocate planar multimedia formats in 1 buffer

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago vl: use u_format in vl_video_buffer_formats
Marek Olšák [Thu, 19 Sep 2019 00:48:15 +0000 (20:48 -0400)]
vl: use u_format in vl_video_buffer_formats

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago gallium/u_tests: test NV12 allocation and export
Marek Olšák [Sat, 24 Aug 2019 01:16:43 +0000 (21:16 -0400)]
gallium/u_tests: test NV12 allocation and export

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago gallium/util: add planar format layouts and helpers
Marek Olšák [Tue, 27 Aug 2019 00:10:43 +0000 (20:10 -0400)]
gallium/util: add planar format layouts and helpers

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago gallium/util: remove enum numbering from util_format_layout
Marek Olšák [Mon, 26 Aug 2019 23:53:09 +0000 (19:53 -0400)]
gallium/util: remove enum numbering from util_format_layout

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago i965: Disable fast clears when running with INTEL_DEBUG=nofc
Caio Marcelo de Oliveira Filho [Tue, 8 Oct 2019 00:15:26 +0000 (17:15 -0700)]
i965: Disable fast clears when running with INTEL_DEBUG=nofc

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years ago iris: Disable fast clears when running with INTEL_DEBUG=nofc
Caio Marcelo de Oliveira Filho [Tue, 8 Oct 2019 00:12:14 +0000 (17:12 -0700)]
iris: Disable fast clears when running with INTEL_DEBUG=nofc

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years ago anv: Disable fast clears when running with INTEL_DEBUG=nofc
Caio Marcelo de Oliveira Filho [Tue, 8 Oct 2019 00:05:23 +0000 (17:05 -0700)]
anv: Disable fast clears when running with INTEL_DEBUG=nofc

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years ago intel: Add INTEL_DEBUG=nofc for disabling fast clears
Caio Marcelo de Oliveira Filho [Tue, 8 Oct 2019 00:04:01 +0000 (17:04 -0700)]
intel: Add INTEL_DEBUG=nofc for disabling fast clears

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years ago llvmpipe: avoid left-shifting a negative number.
Maya Rashish [Tue, 3 Sep 2019 10:04:15 +0000 (13:04 +0300)]
llvmpipe: avoid left-shifting a negative number.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Maya Rashish <coypu@sdf.org>
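Left-shifting a negative signed integer is undefined behaviour in C, which is what the commit above avoids. The sketch below shows one common way around it, doing the shift on an unsigned value; it is not necessarily the change made in llvmpipe, and shift_left_signed() is a hypothetical name. It assumes shift < 32 and a two's complement target.

```c
#include <stdint.h>

/* Shifts a possibly negative value left without invoking undefined
 * behaviour: the shift happens on the unsigned representation.
 * Assumes shift < 32. Converting the result back to int32_t relies on
 * the usual two's complement behaviour of the compiler. */
static int32_t
shift_left_signed(int32_t value, unsigned shift)
{
    return (int32_t)((uint32_t)value << shift);
}
```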
5 years ago egl: Include stddef.h in generated source
Danilo Spinella [Fri, 13 Sep 2019 22:03:20 +0000 (00:03 +0200)]
egl: Include stddef.h in generated source

Required for NULL macro used throughout the generated file.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years ago util: fix to detect NetBSD properly
OBATA Akio [Mon, 16 Sep 2019 07:39:32 +0000 (16:39 +0900)]
util: fix to detect NetBSD properly

<sys/param.h> is required for NetBSD version detection,
and __NetBSD__ must be used to detect even on older releases.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
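The detection pattern described in the commit above can be sketched as follows: test __NetBSD__ first, and include <sys/param.h> before reading the version macro. The helper names is_netbsd() and netbsd_version() are hypothetical and the sketch is not the util code itself.

```c
/* __NetBSD__ is predefined by the compiler on every NetBSD release;
 * the version macro only exists after including <sys/param.h>. */
#if defined(__NetBSD__)
#include <sys/param.h>
#endif

/* Returns 1 when built for NetBSD, 0 otherwise. */
static int
is_netbsd(void)
{
#if defined(__NetBSD__)
    return 1;
#else
    return 0;
#endif
}

/* Returns __NetBSD_Version__ on NetBSD, 0 on other systems. */
static unsigned long
netbsd_version(void)
{
#if defined(__NetBSD__) && defined(__NetBSD_Version__)
    return __NetBSD_Version__;
#else
    return 0;
#endif
}
```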
5 years ago util: simplify BSD includes
Jan Beich [Mon, 16 Sep 2019 12:55:06 +0000 (12:55 +0000)]
util: simplify BSD includes

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jan Beich <jbeich@FreeBSD.org>
5 years ago util: detect AltiVec at runtime on BSDs
Jan Beich [Mon, 16 Sep 2019 11:42:07 +0000 (11:42 +0000)]
util: detect AltiVec at runtime on BSDs

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jan Beich <jbeich@FreeBSD.org>
5 years ago util: skip AltiVec detection if built with -maltivec
Jan Beich [Mon, 16 Sep 2019 11:26:24 +0000 (11:26 +0000)]
util: skip AltiVec detection if built with -maltivec

Helps platforms where runtime detection isn't implemented.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jan Beich <jbeich@FreeBSD.org>
5 years ago util: detect NEON at runtime on FreeBSD
Jan Beich [Mon, 16 Sep 2019 11:34:28 +0000 (11:34 +0000)]
util: detect NEON at runtime on FreeBSD

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jan Beich <jbeich@FreeBSD.org>
5 years ago util: skip NEON detection if built with -mfpu=neon
Jan Beich [Mon, 16 Sep 2019 11:26:24 +0000 (11:26 +0000)]
util: skip NEON detection if built with -mfpu=neon

Helps platforms where runtime detection isn't implemented.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jan Beich <jbeich@FreeBSD.org>
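The idea behind skipping detection when the feature is enabled at build time can be sketched with the compiler's predefined macros. The sketch assumes GCC/Clang behaviour (__ARM_NEON or __ARM_NEON__ defined when NEON code generation is enabled); detect_neon_at_runtime() is a hypothetical stub standing in for a platform-specific probe, and none of this is the util code itself.

```c
#include <stdbool.h>

/* True when the compiler was already told to generate NEON code,
 * e.g. via -mfpu=neon, so no runtime probing is needed. */
static bool
neon_known_at_build_time(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return true;
#else
    return false;
#endif
}

/* Hypothetical runtime probe; a platform without an implementation
 * simply reports "not detected". */
static bool
detect_neon_at_runtime(void)
{
    return false;
}

/* Prefer the build-time answer, fall back to the runtime probe. */
static bool
has_neon(void)
{
    if (neon_known_at_build_time())
        return true;
    return detect_neon_at_runtime();
}
```

The same shape applies to AltiVec, where GCC and Clang define __ALTIVEC__ under -maltivec.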
5 years ago egl: Make native display detection work more than once
Adam Jackson [Wed, 9 Oct 2019 00:20:36 +0000 (20:20 -0400)]
egl: Make native display detection work more than once

eglGetDisplay is awful because you have to inspect the pointer you're
given and guess what type of native display it corresponds to. We make
it worse by caching the type of the first such display we detect, so if
the second call to eglGetDisplay is to a different display type, kaboom.

Fortunately this is a problem that can be solved with the delete key.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/156
5 years ago aco: enable nir_opt_sink
Rhys Perry [Wed, 18 Sep 2019 19:39:41 +0000 (20:39 +0100)]
aco: enable nir_opt_sink

SGPRS: 880272 -> 838936 (-4.70 %)
VGPRS: 705316 -> 680988 (-3.45 %)
Spilled SGPRs: 1032 -> 832 (-19.38 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 252 -> 252 (0.00 %) dwords per thread
Code Size: 55150788 -> 55172436 (0.04 %) bytes
LDS: 451 -> 451 (0.00 %) blocks
Max Waves: 66178 -> 68706 (3.82 %)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago nir/sink: Don't sink load_ubo to outside of its defining loop
Connor Abbott [Wed, 25 Sep 2019 12:17:23 +0000 (14:17 +0200)]
nir/sink: Don't sink load_ubo to outside of its defining loop

Previously, this could have made the resource divergent in code like
that which is generated by nir_lower_non_uniform_access.

Fixes: da8ed68a ('nir: replace nir_move_load_const() with nir_opt_sink()')
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago nir/sink: Rewrite loop handling logic
Connor Abbott [Wed, 25 Sep 2019 12:02:48 +0000 (14:02 +0200)]
nir/sink: Rewrite loop handling logic

Previously, for code like:
loop {
    loop {
        a = load_ubo()
    }
    use(a)
}
adjust_block_for_loops() would return the block before the first loop.
Now we compute the range of allowed blocks and then walk the dominance
tree directly, guaranteeing directly that we always choose a block that
dominates all the uses and is dominated by the definition.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
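The dominance-tree walk described in the commit above can be illustrated with a much simplified model. This is a sketch under assumptions, not the nir_opt_sink implementation: struct block, its fields and choose_sink_block() are hypothetical, and the only placement rule modelled is "never place the instruction in a deeper loop than its definition".

```c
#include <stddef.h>

/* Hypothetical, simplified basic block: only the immediate dominator
 * and the loop nesting depth are modelled. */
struct block {
    struct block *imm_dom;
    unsigned loop_depth;
};

/* Walks the dominance tree from the use towards the definition and
 * returns the first block (closest to the use) that is not nested
 * deeper than the definition. Every block visited lies on the
 * dominance path, so the result dominates the use and is dominated by
 * the definition. def_block itself always passes the depth test, so
 * the walk ends there at the latest. */
static struct block *
choose_sink_block(struct block *def_block, struct block *use_block)
{
    for (struct block *b = use_block; b != NULL; b = b->imm_dom) {
        if (b->loop_depth <= def_block->loop_depth)
            return b;
    }

    /* Only reached if def_block does not dominate use_block. */
    return def_block;
}
```

For the nested-loop example in the message, a definition at depth 2 with a use at depth 1 resolves to the use block rather than to a block before the outer loop.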
5 years ago amd: don't use AMD_FAMILY definitions from amdgpu_drm.h
Marek Olšák [Wed, 9 Oct 2019 16:25:10 +0000 (12:25 -0400)]
amd: don't use AMD_FAMILY definitions from amdgpu_drm.h

use the ones from addrlib

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago docs: update calendar, add news item, and link release notes for 19.2.1
Dylan Baker [Wed, 9 Oct 2019 17:25:17 +0000 (10:25 -0700)]
docs: update calendar, add news item, and link release notes for 19.2.1

5 years ago docs: Add SHA256 sum for 19.2.1
Dylan Baker [Wed, 9 Oct 2019 17:19:16 +0000 (10:19 -0700)]
docs: Add SHA256 sum for 19.2.1

5 years ago docs: Add relnotes for 19.2.1
Dylan Baker [Wed, 9 Oct 2019 16:42:37 +0000 (09:42 -0700)]
docs: Add relnotes for 19.2.1

5 years ago aco: move s_andn2_b64 instructions out of the p_discard_if
Rhys Perry [Tue, 8 Oct 2019 12:40:17 +0000 (13:40 +0100)]
aco: move s_andn2_b64 instructions out of the p_discard_if

And use a new p_discard_early_exit instruction. This fixes some cases
where a definition having the same register as an operand causes issues.

v2: rename instruction to p_exit_early_if
v2: modify the existing instruction instead of creating a new one
v3: merge the "i == num - 1" IFs

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years ago aco: don't reorder instructions in order to lower boolean phis
Daniel Schürmann [Mon, 7 Oct 2019 00:52:55 +0000 (02:52 +0200)]
aco: don't reorder instructions in order to lower boolean phis

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
5 years ago aco: re-use existing phi instruction when lowering boolean phis
Daniel Schürmann [Mon, 7 Oct 2019 00:32:54 +0000 (02:32 +0200)]
aco: re-use existing phi instruction when lowering boolean phis

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
5 years ago aco: Cleanup insert_before_logical_end
Michael Schellenberger Costa [Mon, 12 Aug 2019 18:40:37 +0000 (20:40 +0200)]
aco: Cleanup insert_before_logical_end

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
5 years ago lima/ppir: don't clone texture loads
Vasily Khoruzhick [Tue, 8 Oct 2019 03:11:46 +0000 (20:11 -0700)]
lima/ppir: don't clone texture loads

Cloning texture loads isn't a good idea since we may move it into
a block that is not shared between all the invocations of the shader.
We'd like to avoid that since it may result in undefined behavior.

Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years ago gitlab-ci/lava: Add needs: for container image to test jobs
Michel Dänzer [Tue, 8 Oct 2019 15:50:07 +0000 (17:50 +0200)]
gitlab-ci/lava: Add needs: for container image to test jobs

Without this, the test jobs could spuriously run after the container
job failed or was cancelled, even if the build job didn't run at all.

Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
5 years ago radv: bump minTexelBufferOffsetAlignment to 4
Samuel Pitoiset [Wed, 9 Oct 2019 08:37:04 +0000 (10:37 +0200)]
radv: bump minTexelBufferOffsetAlignment to 4

The spec has probably been misinterpreted during RADV bringup.

This fixes GPU hangs with dEQP-VK.binding_model.*offset_nonzero*.

Fixes: f4e499ec791 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
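What the new limit means for an application can be shown with a small check: a texel buffer view offset has to be a multiple of minTexelBufferOffsetAlignment. The sketch hard-codes the value 4 from the commit above; the macro and function names are hypothetical, and a real application would read the limit from VkPhysicalDeviceLimits instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the commit message; assumed here as a constant. */
#define MIN_TEXEL_BUFFER_OFFSET_ALIGNMENT 4u

/* Returns true when the offset respects the advertised alignment. */
static bool
texel_buffer_offset_is_valid(uint64_t offset)
{
    return (offset % MIN_TEXEL_BUFFER_OFFSET_ALIGNMENT) == 0;
}
```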
5 years ago meta: leak of shader program when decompressing tex-images
Sergii Romantsov [Thu, 18 Jul 2019 12:43:59 +0000 (15:43 +0300)]
meta: leak of shader program when decompressing tex-images

CC: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
5 years ago mesa/main: prefer R8-textures instead of A8 for glBitmap in display lists
Erik Faye-Lund [Mon, 15 Jul 2019 10:03:43 +0000 (12:03 +0200)]
mesa/main: prefer R8-textures instead of A8 for glBitmap in display lists

This allows drivers to communicate that they prefer R8 textures rather
than A8 for glBitmap usage.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago st/mesa: Prefer R8 for bitmap textures
Dave Airlie [Thu, 4 Oct 2018 01:41:26 +0000 (02:41 +0100)]
st/mesa: Prefer R8 for bitmap textures

If it's not available, we fall back to A8. This should work on all drivers,
because we depend on it in the display-list code already.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago drirc: enable vk_x11_override_min_image_count for DOOM
Samuel Pitoiset [Tue, 8 Oct 2019 08:30:03 +0000 (10:30 +0200)]
drirc: enable vk_x11_override_min_image_count for DOOM

DOOM fails to handle more images than expected when the adaptive
sync mode is enabled.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1902
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: implement VK_KHR_shader_clock
Samuel Pitoiset [Mon, 7 Oct 2019 08:26:22 +0000 (10:26 +0200)]
radv: implement VK_KHR_shader_clock

NIR->LLVM and ACO already support nir_intrinsic_shader_clock.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago iris: Implement the Broadwell NP Z PMA Stall Fix
Kenneth Graunke [Wed, 25 Sep 2019 07:31:07 +0000 (00:31 -0700)]
iris: Implement the Broadwell NP Z PMA Stall Fix

This should help avoid stalls in the pixel mask array in certain
non-promoted depth cases.  It especially helps for Z16, as each bit
in the PMA corresponds to two pixels when using Z16, as opposed to
the usual one pixel.

Improves performance in GFXBench5 TRex by 22% (n=1).

5 years ago docs: Update recently enabled VK extensions on Intel
Caio Marcelo de Oliveira Filho [Tue, 8 Oct 2019 22:52:04 +0000 (15:52 -0700)]
docs: Update recently enabled VK extensions on Intel