mesa.git
4 years agofreedreno/ir3: Drop the max_const on a6xx to 512.
Eric Anholt [Fri, 29 May 2020 23:35:43 +0000 (16:35 -0700)]
freedreno/ir3: Drop the max_const on a6xx to 512.

The GLES blob on the p3a limits constlen to 512 between VS and FS across
a6xx gpu ids (615, 630, 640, and 650).  Experimentally, exceeding that
limit in any one stage results in rendering corruption or GPU hangs
(though my most detailed testing had a loop limit in a uniform, so that
may the cause of the hang).  Clamp the limit we use inside of a shader so
we don't exceed it within a stage.

This commit doesn't resovle limiting inter-stage.  Experimentally, I've
found that I can push up to a total of ~768 vec4s between VS and FS on
a630, with or without uniform updates between each draw.  We'll need to do
some shader key-based limiting of constlen at draw time to respect that
limit, but that's left for future work, and this commit is enough for the
google earth case that initiated this work.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>

4 years agofreedreno/ir3: Account for driver params in UBO max const upload.
Eric Anholt [Fri, 29 May 2020 23:31:43 +0000 (16:31 -0700)]
freedreno/ir3: Account for driver params in UBO max const upload.

The const state setup needs to be able to push its driver params, so
account for them in the analyze_ubo_ranges.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>

4 years agofreedreno/ir3: Stop shifting UBO 1 down to be UBO 0.
Eric Anholt [Fri, 29 May 2020 17:21:02 +0000 (10:21 -0700)]
freedreno/ir3: Stop shifting UBO 1 down to be UBO 0.

It turns out the GL uniforms file is larger than the hardware constant
file, so we need to limit how many UBOs we lower to constbuf loads.  To do
actual UBO loads, we'll need to be able to upload UBO 0's pointer or
descriptor.

No difference on nohw 1 UBO update drawoverhead case (n=35).

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>

4 years agofreedreno/ir3: Drop unnecessary alignment of pushed UBO size.
Eric Anholt [Mon, 1 Jun 2020 18:40:36 +0000 (11:40 -0700)]
freedreno/ir3: Drop unnecessary alignment of pushed UBO size.

The analysis pass gives us vec4-aligned size, and all of our other
constbuf allocations here are in vec4 units, so we can just divide by 16.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>

4 years agofreedreno/ir3: Stop pushing immediates once we've filled the constbuf.
Eric Anholt [Mon, 1 Jun 2020 18:32:04 +0000 (11:32 -0700)]
freedreno/ir3: Stop pushing immediates once we've filled the constbuf.

If we filled the constbuf up with UBOs, we may need to avoid generating
more immediate push constants.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>

4 years agofreedreno/ir3: Refactor ir3_cp's lower_immed().
Eric Anholt [Mon, 1 Jun 2020 18:20:51 +0000 (11:20 -0700)]
freedreno/ir3: Refactor ir3_cp's lower_immed().

There was duplicated handling in the callers that we can just move inside.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>

4 years agofreedreno: Upload gallium constbufs as needed when referenced as a UBO.
Eric Anholt [Mon, 1 Jun 2020 18:53:22 +0000 (11:53 -0700)]
freedreno: Upload gallium constbufs as needed when referenced as a UBO.

For now we never ask to set up UBO 0 as a real UBO, so this doesn't
trigger, but it gets us ready for handling the case where UBO 0 is too big
to be push constants in the HW.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>

4 years agofreedreno/a6xx: Add support for ALPHA_TO_ONE.
Eric Anholt [Fri, 5 Jun 2020 00:08:18 +0000 (17:08 -0700)]
freedreno/a6xx: Add support for ALPHA_TO_ONE.

Fixes piglit ext_framebuffer_multisample-draw-buffers-alpha-to-one

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5343>

4 years agoturnip: Add support for alphaToOne.
Eric Anholt [Fri, 5 Jun 2020 00:00:59 +0000 (17:00 -0700)]
turnip: Add support for alphaToOne.

Comparing a blob trace using the feature to one not, the difference was
pretty obvious and in the spot you'd expect compared to alphaToCoverage.
The SP_ reg didn't have a corresponding bit set, though it also has an
alphaToCoverage.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5343>

4 years agoturnip: Use tu_cs_emit_regs() for BLEND_CONTROL.
Eric Anholt [Thu, 4 Jun 2020 23:51:13 +0000 (16:51 -0700)]
turnip: Use tu_cs_emit_regs() for BLEND_CONTROL.

Just a cleanup since I was in the area.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5343>

4 years agoradv: set keep_statistic_info with RADV_DEBUG=shaderstats
Rhys Perry [Fri, 5 Jun 2020 13:28:28 +0000 (14:28 +0100)]
radv: set keep_statistic_info with RADV_DEBUG=shaderstats

Needed for RADV_DEBUG=shaderstats to dump ACO statistics.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5358>

4 years agointel: fix gen_sort_tags.py
Eric Engestrom [Fri, 5 Jun 2020 09:26:56 +0000 (11:26 +0200)]
intel: fix gen_sort_tags.py

The script was failing for me (python 3.8), not sure if this is a recent
python version break or not as I don't know how often people have been
running this script:

    Processing ./gen9.xml... Traceback (most recent call last):
      File "./gen_sort_tags.py", line 177, in <module>
        main()
      File "./gen_sort_tags.py", line 170, in main
        genxml[:] = enums + sorted_structs.values() + instructions + registers
    TypeError: can only concatenate list (not "odict_values") to list

Turning the odict into a list fixes it for me, and the resulting xml
file are identical to before :)

Fixes: 903e142f0d35bc550ffd ("genxml: add a sorting script")
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5352>

4 years agoradv/aco: enable VK_KHR_shader_subgroup_extended_types on GFX6-GFX7
Samuel Pitoiset [Thu, 4 Jun 2020 08:41:50 +0000 (10:41 +0200)]
radv/aco: enable VK_KHR_shader_subgroup_extended_types on GFX6-GFX7

CTS pass on Pitcairn (GFX6). This extension isn't really useful
without 8-bit/16-bit storage though but this is going to be exposed
soon.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5327>

4 years agoaco: fix nir_intrinsic_quad_* with 8-bit in GFX6-GFX7
Samuel Pitoiset [Thu, 4 Jun 2020 08:39:51 +0000 (10:39 +0200)]
aco: fix nir_intrinsic_quad_* with 8-bit in GFX6-GFX7

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5327>

4 years agoaco: fix sign-extend 8-bit subgroup operations on GFX6-GFX7
Samuel Pitoiset [Thu, 4 Jun 2020 08:35:23 +0000 (10:35 +0200)]
aco: fix sign-extend 8-bit subgroup operations on GFX6-GFX7

SDWA is GFX8+.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5327>

4 years agoaco: use v_bfe_u32 for unsigned reductions sign-extension on GFX6-GFX7
Samuel Pitoiset [Fri, 5 Jun 2020 06:54:52 +0000 (08:54 +0200)]
aco: use v_bfe_u32 for unsigned reductions sign-extension on GFX6-GFX7

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5327>

4 years agointel/genxml: drop sort_xml.sh and move the loop directly in gen_sort_tags.py
Eric Engestrom [Fri, 5 Jun 2020 09:49:06 +0000 (11:49 +0200)]
intel/genxml: drop sort_xml.sh and move the loop directly in gen_sort_tags.py

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5353>

4 years agoradv: Use ac_surface to allocate aux surfaces.
Bas Nieuwenhuizen [Sun, 24 May 2020 13:04:04 +0000 (15:04 +0200)]
radv: Use ac_surface to allocate aux surfaces.

For consistency and a bunch of codesharing.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5194>

4 years agoamd/common: Add total alignment calculation.
Bas Nieuwenhuizen [Sun, 24 May 2020 12:23:24 +0000 (14:23 +0200)]
amd/common: Add total alignment calculation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5194>

4 years agoradv: Allocate values/predicates at the end of the image.
Bas Nieuwenhuizen [Sun, 24 May 2020 12:14:34 +0000 (14:14 +0200)]
radv: Allocate values/predicates at the end of the image.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5194>

4 years agoradv: Disable HTILE in ac_surface.
Bas Nieuwenhuizen [Sun, 24 May 2020 12:00:05 +0000 (14:00 +0200)]
radv: Disable HTILE in ac_surface.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5194>

4 years agoradv: Disable DCC in ac_surface.
Bas Nieuwenhuizen [Sun, 24 May 2020 11:57:02 +0000 (13:57 +0200)]
radv: Disable DCC in ac_surface.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5194>

4 years agoradv: Use offsets in surface struct.
Bas Nieuwenhuizen [Sun, 24 May 2020 11:47:20 +0000 (13:47 +0200)]
radv: Use offsets in surface struct.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5194>

4 years agoradv: Rely on ac_surface for avoiding cmask for linear images.
Bas Nieuwenhuizen [Sun, 24 May 2020 11:25:53 +0000 (13:25 +0200)]
radv: Rely on ac_surface for avoiding cmask for linear images.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5194>

4 years agoradv: Enforce the contiguous memory for DCC layers in ac_surface.
Bas Nieuwenhuizen [Sun, 24 May 2020 10:50:55 +0000 (12:50 +0200)]
radv: Enforce the contiguous memory for DCC layers in ac_surface.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5194>

4 years agoradv: Pass no_metadata_planes info in to ac_surface.
Bas Nieuwenhuizen [Sun, 24 May 2020 10:10:00 +0000 (12:10 +0200)]
radv: Pass no_metadata_planes info in to ac_surface.

Also do not allocate aux surfaces for multi-plane images. I may
have messed up and used plane 1 offsets for the other planes as well.
I cannot imagine that sharing aux surfaces between the planes will
work well.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5194>

4 years agoradv: Use ac_surface to determine fmask enable.
Bas Nieuwenhuizen [Sun, 24 May 2020 09:57:09 +0000 (11:57 +0200)]
radv: Use ac_surface to determine fmask enable.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5194>

4 years agoci: add U-Boot specific fetch strings
Christian Gmeiner [Thu, 4 Jun 2020 10:56:00 +0000 (12:56 +0200)]
ci: add U-Boot specific fetch strings

U-Boot's fastboot over udp generates the following output:
  Listening for fastboot command on x.y.z.w

Also add a general 'data abort' error string seen with an
too old U-Boot version:
  https://github.com/u-boot/u-boot/commit/95712af

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5258>

4 years agoci: extend expect-output.sh
Christian Gmeiner [Thu, 21 May 2020 22:05:37 +0000 (00:05 +0200)]
ci: extend expect-output.sh

We need to support different fastboot fetch strings for different
bootloader solutions. Lets extend expect-output.sh to support
multiple fetch strings (-f) and add support for error catch
strings (-e) to stop the CI run early.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5258>

4 years agofreedreno/computerator: fix missing dependency on generated header
Rob Clark [Thu, 4 Jun 2020 21:15:58 +0000 (14:15 -0700)]
freedreno/computerator: fix missing dependency on generated header

Fixes:
```
 ../mesa-freedreno-20.2.0_pre/src/freedreno/computerator/ir3_asm.c:25:10: fatal error: 'ir3/ir3_parser.h' file not found
 #include "ir3/ir3_parser.h"
          ^~~~~~~~~~~~~~~~~~
 1 error generated.
```

Fixes: da467817e3e ("freedreno/ir3: Move ir3 assembler to backend compiler")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5340>

4 years agoglapi: remove deprecated .getchildren() that has been replace with an iterator
Eric Engestrom [Thu, 4 Jun 2020 23:05:46 +0000 (01:05 +0200)]
glapi: remove deprecated .getchildren() that has been replace with an iterator

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3086
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5342>

4 years agoradv/aco: enable 64-bit atomic features if RADV is linked with LLVM 8
Samuel Pitoiset [Thu, 4 Jun 2020 12:53:20 +0000 (14:53 +0200)]
radv/aco: enable 64-bit atomic features if RADV is linked with LLVM 8

Just in case someone links RADV with this old LLVM 8 and wants ACO.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5331>

4 years agosvga: Performance fixes
Neha Bhende [Tue, 26 May 2020 15:59:50 +0000 (21:29 +0530)]
svga: Performance fixes

This is a squash commit of in house performance fixes and misc bug fixes
for GL4.1 support.

Performance fixes:
* started using system memory for constant buffer to gain 3X performance boost with metro redux

Misc bug fixes:
* fixed usage of vertexid in shader
* added empty control point phase in hull shader for zero ouput control point
* misc shader signature fixes
* fixed clip_distance input declaration
* clearing the dirty bit for the surface while using direct map if surface is already flushed
  and there is no pending primitive

This patch also uses SVGA_RETRY macro for commands retries. Part of it is already
used in previous patch.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5317>

4 years agosvga: Add GL4.1(compatibility profile) support in svga driver
Neha Bhende [Tue, 26 May 2020 15:56:42 +0000 (21:26 +0530)]
svga: Add GL4.1(compatibility profile) support in svga driver

This patch is a squash commit of a very long in-house patch series.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5317>

4 years agosvga/include: Headers for GL4.1 support
Neha Bhende [Tue, 26 May 2020 15:47:44 +0000 (21:17 +0530)]
svga/include: Headers for GL4.1 support

This brings in the new types, enums and #defines for GL 4.1
features in the virtual device.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5317>

4 years agowinsys/drm: Add GL4.1 support in drm winsys
Neha Bhende [Tue, 26 May 2020 15:45:23 +0000 (21:15 +0530)]
winsys/drm: Add GL4.1 support in drm winsys

This is to check whether virtual hardware has SM5 support

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5317>

4 years agoutil: Add util functionality for GL4.1 support
Neha Bhende [Tue, 26 May 2020 15:43:14 +0000 (21:13 +0530)]
util: Add util functionality for GL4.1 support

This patch adds the following tgsi utilities

* tgsi_dynamic_indexing: This utility flattens out the dyanamic indexing of constant buffers
* tgsi_vpos: This utility writes zeros to position at index 0 in vertex shader.
  This utility can be used if there is no shader output in vertex shader
* util_make_tess_ctrl_passthrough_shader: This adds passthough tessellation control shader.
  Input of passthrough tess ctrl shader is output of vertex shader
  and output is input of tessellation eval shader.
  If program has tessellation eval shader but no tessellation control shader,
  this utility can be used to create passthrough tessellation control shader.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Neha Bhende <bhenden@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5317>

4 years agofreedreno/a6xx: more early-z
Rob Clark [Thu, 4 Jun 2020 17:07:57 +0000 (10:07 -0700)]
freedreno/a6xx: more early-z

Technically we only have to do late-z in the alpha-test or discard case
if depth-write is enabled.  If depth write is disabled, the depth read /
test / conditional-write interlock that we need to emulate is not a
problem, so we can still use early-z test.

There is a slightly weird case when there is no zsbuf attachment (see
dEQP-GLES31.functional.fbo.no_attachments.*) where the hw wants us to
use LATE_Z.. not entirely sure if this is an interaction with occlusion
query or just a pecularity of how the hw works when there is no depth
buffer.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5336>

4 years agoci: bump virglrenderer to latest version
Dave Airlie [Thu, 4 Jun 2020 04:03:47 +0000 (14:03 +1000)]
ci: bump virglrenderer to latest version

Need this for upcoming GL 4.0 llvmpipe support.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5323>

4 years agoturnip: Simplify vertex buffer bindings.
Eric Anholt [Thu, 4 Jun 2020 00:07:06 +0000 (17:07 -0700)]
turnip: Simplify vertex buffer bindings.

We were remapping the bindings so the HW binding points were consecutive,
which there's no need for.  Now that we don't shuffle, we can mostly drop
the dependency on the pipeline for this SDS.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5321>

4 years agoturnip: Don't bother clamping VB size.
Eric Anholt [Thu, 4 Jun 2020 17:11:44 +0000 (10:11 -0700)]
turnip: Don't bother clamping VB size.

From the VK spec: "All elements of pOffsets must be less than the size of
the corresponding element in pBuffers"

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5321>

4 years agoturnip: Move vertex buffer bindings to SET_DRAW_STATE.
Eric Anholt [Thu, 4 Jun 2020 17:10:01 +0000 (10:10 -0700)]
turnip: Move vertex buffer bindings to SET_DRAW_STATE.

This means that the HW can skip over the vertex buffer state when it's not
used in a bin.  The blob also has this behavior.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5321>

4 years agollvmpipe: move coroutines out of noopt case
Dave Airlie [Thu, 4 Jun 2020 02:10:40 +0000 (12:10 +1000)]
llvmpipe: move coroutines out of noopt case

the virgl CI code was using the noopt path and crashing with a
wierd can't select llvm.coro.subfn.addr error, turns out we have
to call the cleanup pass no matter what.

This enable a lot more virgl gles31 passes, but we have
to disable tessellation shaders as now they executed, they
crash due to missing OES_gpu_shader5, I should try and reenable
them when llvmpipe is further along

Fixes: d32690b43c91d ("gallivm: add coroutine pass manager support")
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Acked-by: Elie Tournier <elie.tournier@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5320>

4 years agopan/mdg: Ensure ld_vary_16 is aligned
Alyssa Rosenzweig [Thu, 4 Jun 2020 15:32:59 +0000 (11:32 -0400)]
pan/mdg: Ensure ld_vary_16 is aligned

Otherwise packing may fail.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: 5f8dd413bcc ("pan/mdg: Handle 16-bit ld_vary")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5339>

4 years agofreedreno/a6xx: Fix VFD_CONTROL emit
Kristian H. Kristensen [Thu, 4 Jun 2020 06:18:52 +0000 (23:18 -0700)]
freedreno/a6xx: Fix VFD_CONTROL emit

The FETCH_CNT field isn't actually the FETCH count. We don't have a
lot of data where it's different from DECODE_CNT, so there's not much
to go by. It could be number of VFD_DEST_CNTL or maybe DECODE_CNT for
binning.  For now, setting both to number of DEST_CNTL gets Google
Earth working again.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5324>

4 years agoradv: Always expose non-visible local memory type on dedicated GPUs
Clément Guérin [Wed, 3 Jun 2020 05:14:44 +0000 (22:14 -0700)]
radv: Always expose non-visible local memory type on dedicated GPUs

DOOM Eternal expects this type, but RADV doesn't expose it when the VRAM
is entirely host-visible, in my case on Fiji. Matches AMDVLK behavior.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/3054
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5308>

4 years agopan/mdg: Legalize inverts with constants
Alyssa Rosenzweig [Tue, 2 Jun 2020 16:15:18 +0000 (12:15 -0400)]
pan/mdg: Legalize inverts with constants

We need to force src_invert to be in the right place even if we flip
when lowering an embedded->inline constant.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Fixes: 449e5ded934 ("pan/mdg: Treat inot as a modifier")
Reported-by: Icecream95 <ixn@keemail.me>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5299>

4 years agonir: reuse existing psiz-variable
Erik Faye-Lund [Tue, 2 Jun 2020 08:44:13 +0000 (10:44 +0200)]
nir: reuse existing psiz-variable

For shaders where there's already a psiz-variable, we should rather
reuse it than create a second one. This can happen if a shader writes
gl_PointSize, but disables GL_PROGRAM_POINT_SIZE.

Fixes: 878c94288a8 ("nir: add lowering-pass for point-size mov")
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5328>

4 years agoi965: fix export of GEM handles
Lionel Landwerlin [Sat, 2 May 2020 13:59:19 +0000 (16:59 +0300)]
i965: fix export of GEM handles

We reuse DRM file descriptors internally. Therefore when we export a
GEM handle we must do so in the file descriptor used externally.

v2: Fix dmabuf leak
    Fix GEM handle leaks by tracking exported handles

v3: Check os_same_file_description error (Michel)
    Don't create multiple exports for a given GEM table

v4: Add WARN_ONCE (Ken)

v5: Remove blank line (Ian)
    Remove unused field (Ian)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2882
Fixes: 4094558e8643 ("i965: share buffer managers across screens")
Tested-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4861>

4 years agoiris: fix export of GEM handles
Lionel Landwerlin [Sat, 2 May 2020 13:46:47 +0000 (16:46 +0300)]
iris: fix export of GEM handles

We reuse DRM file descriptors internally. Therefore when we export a
GEM handle we must do so in the file descriptor used externally.

This change also fixes a file descriptor leak of the FD given at
screen creation.

v2: Don't bother checking fd equals, they're always different
    Fix dmabuf leak
    Fix GEM handle leaks by tracking exported handles

v3: Check os_same_file_description error (Michel)
    Don't create multiple exports for a given GEM table

v4: Add WARN_ONCE (Ken)
    Rename external_fd to winsys_fd

v5: Remove export lock in favor of bufmgr's

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2882
Fixes: 7557f1605968 ("iris: share buffer managers accross screens")
Tested-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4861>

4 years agoi965: don't forget to set screen on duped image
Lionel Landwerlin [Tue, 2 Jun 2020 08:52:35 +0000 (11:52 +0300)]
i965: don't forget to set screen on duped image

We'll start using this field more for querying image properties.
Without it we run into a crash.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4861>

4 years agoiris: fix BO destruction in error path
Lionel Landwerlin [Sat, 2 May 2020 19:43:22 +0000 (22:43 +0300)]
iris: fix BO destruction in error path

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4861>

4 years agomesa: Fix NetBSD compiler macro.
Vinson Lee [Sat, 23 May 2020 20:46:28 +0000 (13:46 -0700)]
mesa: Fix NetBSD compiler macro.

Reported-by: Rafał Mikrut <mikrutrafal54@gmail.com>
Fixes: a63b90712aad ("mesa: also check for __NetBSD__")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3015
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5191>

4 years agofreedreno/a6xx: also consider alpha-test for ztest-mode
Rob Clark [Wed, 3 Jun 2020 22:01:11 +0000 (15:01 -0700)]
freedreno/a6xx: also consider alpha-test for ztest-mode

Looks like we don't have CI coverage for this (since deqp==GLES) but
alpha test is conceptually the same as frag shaders with discard, and
should be handled as such.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5298>

4 years agofreedreno/a6xx: add early-lrz-late-z mode
Rob Clark [Wed, 3 Jun 2020 17:08:55 +0000 (10:08 -0700)]
freedreno/a6xx: add early-lrz-late-z mode

Now that we are doing a better job of managing LRZ, add support for the
EARLY_LRZ_LATE_Z mode.  Since we properly disable LRZ write in cases
where we don't know a fragment's z value during the binning pass (or
when blend is enabled in a later draw, meaning we will need the earlier
fragment's color), we can enable a mode that keeps the early-lrz test
when the frag shader has kill/discard.  This will only discard geometry
that is definitely not visible.

This is a pretty big win for games/benchmarks that have a lot of frag
shaders with kill/discard.  More than 10% gain for gfxbench trex/mh and
40% gain for mh31.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5298>

4 years agofreedreno/a6xx: re-work LRZ state tracking
Rob Clark [Wed, 3 Jun 2020 17:06:58 +0000 (10:06 -0700)]
freedreno/a6xx: re-work LRZ state tracking

In particular, properly detect reversal of depth-test direction.
With that we can remove a lot of cases where we were unnecessarily
invalidating LRZ, which was simply papering over the direction-
reversal issue in deqp.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5298>

4 years agofreedreno/a6xx: update depth-plane control regs
Rob Clark [Sun, 31 May 2020 17:46:54 +0000 (10:46 -0700)]
freedreno/a6xx: update depth-plane control regs

And document the early-lrz-late-z mode.

Initially I thought this would be two bits to control early-lrz vs
early-z.  But having early-z without early-lrz does not make sense,
and the way the values line up makes an enum fit better.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5298>

4 years agofreedreno/a6xx: sync registers from envytools
Rob Clark [Sat, 30 May 2020 21:24:36 +0000 (14:24 -0700)]
freedreno/a6xx: sync registers from envytools

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5298>

4 years agofreedreno/ir3: split kill from no_earlyz
Rob Clark [Tue, 2 Jun 2020 00:29:00 +0000 (17:29 -0700)]
freedreno/ir3: split kill from no_earlyz

Unlike other conditions which prevent early-discard of fragments, kill
does not prevent early LRZ test.  Split `has_kill` from `no_earlyz` so
we can take advantage of this.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5298>

4 years agodocs/features.txt: Update for freedreno
Kristian H. Kristensen [Wed, 3 Jun 2020 19:50:15 +0000 (12:50 -0700)]
docs/features.txt: Update for freedreno

We've had GL_OES_texture_cube_map_array for a while for a4xx+ and
support for geometry and tessellation for a6xx+.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5319>

4 years agofreedreno/a6xx: Turn on robustness extensions
Kristian H. Kristensen [Wed, 3 Jun 2020 19:28:05 +0000 (12:28 -0700)]
freedreno/a6xx: Turn on robustness extensions

With UBO access going through LDC, all memory access uses buffer based
io primitives.  We can then advertise
PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR and
PIPE_CAP_DEVICE_RESET_STATUS_QUERY, which turn on GL_EXT_robustness,
GL_KHR_robust_buffer_access_behavior and GL_KHR_robustness.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5319>

4 years agovdpau: Fix wrong calloc sizeof argument.
Vinson Lee [Sat, 23 May 2020 00:59:27 +0000 (17:59 -0700)]
vdpau: Fix wrong calloc sizeof argument.

Fix warning reported by Coverity Scan.

Wrong sizeof argument (SIZEOF_MISMATCH)
suspicious_sizeof: Passing argument 3544UL (sizeof
(vlVdpPresentationQueue)) to function calloc that returns a pointer of
type vlVdpPresentationQueueTarget * is suspicious because a multiple of
sizeof (vlVdpPresentationQueueTarget) /*16*/ is expected.

Fixes: 65fe0866aec7 ("vl: implemented a few functions and made stubs to get mplayer running")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3026
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5182>

4 years agoOPTIONAL: iris: Perform BLORP buffer barriers outside of iris_blorp_exec() hook.
Francisco Jerez [Wed, 6 May 2020 22:40:30 +0000 (15:40 -0700)]
OPTIONAL: iris: Perform BLORP buffer barriers outside of iris_blorp_exec() hook.

The iris_blorp_exec() hook needs to be executed under a single
indivisible sync region, which means that in cases where we need to
emit a PIPE_CONTROL for a buffer barrier we won't be able to track the
subsequent commands separately from the previous commands, which will
prevent us from optimizing out subsequent PIPE_CONTROLs if we
encounter the same buffers again.  In particular I've encountered this
situation in some SynMark test-cases which perform lots of BLORP
operations with the same buffer bound as both source and destination
(in order to generate mipmaps): In such a scenario if the source
requires flushing we'd also end up flushing for the destination
redundantly, even though a single PIPE_CONTROL would have been
sufficient.

This avoids a 4.5% FPS regression in SynMark OglHdrBloom and a 3.5%
FPS regression in SynMark OglMultithread.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Remove iris_flush_depth_and_render_caches().
Francisco Jerez [Thu, 6 Feb 2020 04:29:09 +0000 (20:29 -0800)]
iris: Remove iris_flush_depth_and_render_caches().

This helper is unused now.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Emit single render target flush PIPE_CONTROL on format mismatch.
Francisco Jerez [Fri, 1 May 2020 00:40:52 +0000 (17:40 -0700)]
iris: Emit single render target flush PIPE_CONTROL on format mismatch.

The big-hammer iris_flush_depth_and_render_caches() is largely
redundant whenever a format mismatch is detected from
iris_cache_flush_for_render().  There is no need to kick the depth,
sampler nor constant caches in that case.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Open-code iris_cache_flush_for_read() and iris_cache_flush_for_depth().
Francisco Jerez [Thu, 6 Feb 2020 04:36:35 +0000 (20:36 -0800)]
iris: Open-code iris_cache_flush_for_read() and iris_cache_flush_for_depth().

These have become one-liners now so they can be easily inlined.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Remove render cache hash table-based synchronization.
Francisco Jerez [Wed, 19 Feb 2020 06:39:43 +0000 (22:39 -0800)]
iris: Remove render cache hash table-based synchronization.

The render cache hash table is now *mostly* redundant with the more
general seqno matrix-based cache tracking mechanism.  Most hash table
operations are now gone except for the format mismatch checks done in
iris_cache_flush_for_render().  Redundant code removed as a separate
patch for bisectability.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Remove depth cache set tracking and synchronization.
Francisco Jerez [Wed, 19 Feb 2020 04:53:26 +0000 (20:53 -0800)]
iris: Remove depth cache set tracking and synchronization.

The depth cache set is now redundant with the more general seqno
matrix-based cache tracking mechanism.  Removed as a separate patch
for bisectability.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Perform compute predraw flushes from compute batch.
Francisco Jerez [Thu, 6 Feb 2020 02:27:46 +0000 (18:27 -0800)]
iris: Perform compute predraw flushes from compute batch.

Whenever iris_predraw_resolve_inputs() ends up doing a flush or
invalidate, we really want it to be on the same batch which is going
to consume the result.  Any resolves should still be performed from
the render batch thanks to the previous patch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Remove batch argument of iris_resource_prepare_access() and friends.
Francisco Jerez [Fri, 24 Apr 2020 01:00:15 +0000 (18:00 -0700)]
iris: Remove batch argument of iris_resource_prepare_access() and friends.

The resolves performed by this function are only expected to work from
the render batch, so make sure we use it independently of the batch
the caller wants to use.  This function provides no synchronization
guarantees anyway, the caller is expected to insert any cache flushing
and synchronization required for the resolved surface to be visible to
the target batch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Insert buffer barrier in existing cache flush helpers.
Francisco Jerez [Thu, 6 Feb 2020 02:58:23 +0000 (18:58 -0800)]
iris: Insert buffer barrier in existing cache flush helpers.

As a first step to phasing out the current hashtable-based depth and
render cache tracking mechanisms.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Implement buffer-local memory barrier based on cache coherency matrix.
Francisco Jerez [Thu, 6 Feb 2020 02:59:46 +0000 (18:59 -0800)]
iris: Implement buffer-local memory barrier based on cache coherency matrix.

This takes advantage of the previously introduced cache tracking
infrastructure in order to define a multi-purpose barrier operation
that allows the caller to order memory operations with respect to
previous operations performed on the same buffer from any other cache
domain.

v2: Assorted CPU overhead micro-optimizations (Francisco).
v3: Use C99 designated initializers (Ken).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Update cache coherency matrix on PIPE_CONTROL.
Francisco Jerez [Mon, 10 Feb 2020 09:24:29 +0000 (01:24 -0800)]
iris: Update cache coherency matrix on PIPE_CONTROL.

This introduces a batch synchronization boundary at every PIPE_CONTROL
command, and updates the cache coherency status tracked during batch
construction according to the specified control bits.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Introduce cache coherency matrix for batch-local memory ordering.
Francisco Jerez [Wed, 19 Feb 2020 04:48:23 +0000 (20:48 -0800)]
iris: Introduce cache coherency matrix for batch-local memory ordering.

This introduces a representation of the cache coherency status of the
GPU at any point in the batch.  This is done by defining a matrix C of
synchronization sequence numbers such that at any point of batch
construction, a memory operation from domain i introduced into the
batch is guaranteed to be ordered after any memory operation from
domain j in a previous batch section with seqno n if the following
condition holds:

  C_i_j >= n

This allows us to efficiently determine whether additional flushing
and/or invalidation is required in order to access a buffer object
from some arbitrary domain.

Except for batch buffer reset which requires clearing the whole
matrix, all operations on the matrix are either O(n) or O(1) on the
number of caching domains (which is basically constant).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Report use of any in-flight buffers on first draw call after sync boundary.
Francisco Jerez [Fri, 7 Feb 2020 05:06:17 +0000 (21:06 -0800)]
iris: Report use of any in-flight buffers on first draw call after sync boundary.

This is the main performance trade-off of this cache tracking
mechanism: In order for the seqno vector of buffer objects to be
accurate, they need to be marked as used again every time the batch is
split into a new synchronization section if they remain bound to the
pipeline.  This can be achieved easily by re-using
iris_restore_render_saved_bos() and iris_restore_compute_saved_bos(),
which currently serve a similar purpose across batch buffer
boundaries.

The impact on Piglit drawoverhead results seems to be within a
standard deviation of the current results.

XXX - It might be possible to completely remove the current
      iris_batch::contains_draw flag at a small additional performance
      cost.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Drop redundant iris_address::write flag.
Francisco Jerez [Wed, 27 May 2020 20:42:22 +0000 (13:42 -0700)]
iris: Drop redundant iris_address::write flag.

The write flag is redundant since it can be inferred easily from the
iris_address::access domain.  This allows the iris_address struct to
be laid out more efficiently in memory, leading to a measurable
improvement in several Piglit Drawoverhead test-cases.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Annotate all BO uses with domain and sequence number information.
Francisco Jerez [Fri, 29 May 2020 23:38:43 +0000 (16:38 -0700)]
iris: Annotate all BO uses with domain and sequence number information.

Probably the most annoying patch to review from the whole series --
Mark every buffer object use as accessed through some caching domain
with the sequence number of the current synchronization section of the
batch.  The additional argument of iris_use_pinned_bo() makes sure I'd
have gotten a compile error if I had missed any buffer added to the
batch validation list.

There are only a few exceptions where a buffer is left untracked while
adding it to the validation list, justified below:

 - Batch buffers: These are strictly read-only for the moment.

 - BLORP buffer objects: Their seqnos are bumped manually at the end
   of iris_blorp_exec() instead, in order to avoid plumbing domain
   information through BLORP address combining.

 - Scratch buffers: The contents of these are strictly thread-local.

 - Shader images and SSBOs: Accesses of these buffers are explicitly
   synchronized at the API level.

v2: Opt out of tracking more aggressively (Ken): In addition to the
    above, surface states, binding tables, instructions and most
    dynamic states are now left untracked, which means a *lot* more BO
    uses marked IRIS_DOMAIN_NONE which need to be reviewed extremely
    carefully, since the cache tracker won't be able to provide any
    coherency guarantees for them.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Bracket batch operations which access memory within sync regions.
Francisco Jerez [Fri, 24 Apr 2020 00:58:48 +0000 (17:58 -0700)]
iris: Bracket batch operations which access memory within sync regions.

This delimits all batch operations which access memory between
iris_batch_sync_region_start() and iris_batch_sync_region_end() calls.
This makes sure that any buffer objects accessed within the region are
considered in use through the same caching domain until the end of the
region.

Adding any buffer to the batch validation list outside of a sync
region will lead to an assertion failure in a future commit, unless
the caller explicitly opted out of the cache tracking mechanism.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Add infrastructure to partition batch into sync boundaries.
Francisco Jerez [Wed, 27 May 2020 20:34:04 +0000 (13:34 -0700)]
iris: Add infrastructure to partition batch into sync boundaries.

This introduces some minimalistic infrastructure which will be used in
order to partition the batch into a series of sections, each one with
a unique, monotonically-increasing sequence number.  Section
boundaries will typically lie at points in the batch where the
execution and memory coherency status of some previous commands are
known, e.g. at batch buffer boundaries or PIPE_CONTROL commands.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agoiris: Add batch-local synchronization book-keeping to iris_bo.
Francisco Jerez [Fri, 24 Apr 2020 00:56:11 +0000 (17:56 -0700)]
iris: Add batch-local synchronization book-keeping to iris_bo.

The purpose of this is to represent the cache coherency state of a
buffer as a vector of integers (AKA seqnos), one for each incoherent
caching domain of the GPU.  A seqno will identify a single section of
a batch buffer uniquely across the whole pipe_screen (which means that
there will be no ambiguity about what context a given seqno belongs to
even if there are multiple threads accessing the same buffer in
parallel), and is guaranteed to be allocated in monotonically
increasing order within any given context.  The iris_bo_bump_seqno()
helper is provided for marking the last update of a buffer from a
given caching domain in a lockless manner.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3875>

4 years agopanfrost: Mark point sprites as todo on Bifrost
Alyssa Rosenzweig [Tue, 2 Jun 2020 00:52:59 +0000 (20:52 -0400)]
panfrost: Mark point sprites as todo on Bifrost

Emulating them will be a rather annoying dance. Let's not worry about
this until further down the line when we have a better sence of how to
do handle them efficiently.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5290>

4 years agopanfrost: Fix gl_PointSize out of GL_POINTS
Alyssa Rosenzweig [Tue, 2 Jun 2020 00:44:19 +0000 (20:44 -0400)]
panfrost: Fix gl_PointSize out of GL_POINTS

In this case, vs->writes_point_size is true as the VS writes
gl_PointSize, but panfrost_writes_points_size() is false as we are not
drawing points so the hardware doesn't process it. Thus the varying
descriptor is emitted but elements is never written. When the VS runs,
it will attempt to write to elements, a NULL pointer.

The behaviour is architecture-independent. On Midgard, the write
silently fails, hence why this bug was never noticed before. On Bifrost,
this raises an MMU fault.

The fix is to set the format to VARYING_DISCARD to ignore the write.

Noticed on Neverball.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5290>

4 years agopanfrost: Prefer sysval for gl_PointCoord on Bifrost
Alyssa Rosenzweig [Mon, 1 Jun 2020 22:26:03 +0000 (18:26 -0400)]
panfrost: Prefer sysval for gl_PointCoord on Bifrost

It's like gl_FragCoord. Still not implemented. This unfortunately makes
point sprites a lot more complicated.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5290>

4 years agopan/bi: Disassemble gl_PointCoord reads.
Alyssa Rosenzweig [Mon, 1 Jun 2020 21:36:35 +0000 (17:36 -0400)]
pan/bi: Disassemble gl_PointCoord reads.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5290>

4 years agopanfrost: Explicitly convert to 32-bit for logic-ops
Alyssa Rosenzweig [Tue, 2 Jun 2020 00:34:34 +0000 (20:34 -0400)]
panfrost: Explicitly convert to 32-bit for logic-ops

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Icecream95 <ixn@keemail.me>
Fixes: 19b4e586f62 ("panfrost: Switch to pan_lower_framebuffer")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5289>

4 years agopanfrost: Readd MIDGARD_SHADERLESS quirk to t760
Alyssa Rosenzweig [Mon, 1 Jun 2020 22:32:41 +0000 (18:32 -0400)]
panfrost: Readd MIDGARD_SHADERLESS quirk to t760

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reported-by: Icecream95 <ixn@keemail.me>
Fixes: e53d27de61b ("panfrost: Add quirks for blend shader types")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5289>

4 years agoiris: Extend iris_context dirty state flags to 128 bits.
Francisco Jerez [Fri, 29 May 2020 23:57:01 +0000 (16:57 -0700)]
iris: Extend iris_context dirty state flags to 128 bits.

We're nearly out of dirty bits, and some patches pending review on
GitLab no longer apply due to that.  Make room for them by splitting
off shader stage-specific bits into a separate stage_dirty mask.

An alternative would be to split compute-related bits into a separate
mask, but that would prevent the '<< stage' indexing done in various
parts of the driver from working.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5279>

4 years agoiris: Simplify iris_batch_prepare_noop().
Francisco Jerez [Fri, 29 May 2020 23:54:35 +0000 (16:54 -0700)]
iris: Simplify iris_batch_prepare_noop().

This makes iris_batch_prepare_noop() return a boolean instead of
passing through the relevant set of dirty flags.  It will make it
easier to change the representation of dirty flags.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5279>

4 years agonir/lower_tex: fixes for fp16 yuv lowering
Rob Clark [Wed, 3 Jun 2020 18:34:09 +0000 (11:34 -0700)]
nir/lower_tex: fixes for fp16 yuv lowering

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3079
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5318>

4 years agonir/builder: add bitsize conversion helpers
Rob Clark [Wed, 3 Jun 2020 19:12:54 +0000 (12:12 -0700)]
nir/builder: add bitsize conversion helpers

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5318>

4 years agonir: extract out convert_to_bitsize() helper
Rob Clark [Wed, 3 Jun 2020 19:09:59 +0000 (12:09 -0700)]
nir: extract out convert_to_bitsize() helper

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5318>

4 years agonir: get_base_type() should return enum type
Rob Clark [Wed, 3 Jun 2020 19:07:52 +0000 (12:07 -0700)]
nir: get_base_type() should return enum type

Needed by the next patch, for c++ code which is more strict about
conversions between integers and enums.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5318>

4 years agopanfrost: Handle writes_memory correctly
Alyssa Rosenzweig [Tue, 2 Jun 2020 18:12:29 +0000 (14:12 -0400)]
panfrost: Handle writes_memory correctly

We need to pass it thru to EARLY_Z and WRITES_GLOBAL instead of ignoring
and assuming respectively. Nontrivial performance fix.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5300>

4 years agopanfrost: Document MALI_WRITES_GLOBAL bit
Alyssa Rosenzweig [Tue, 2 Jun 2020 18:05:34 +0000 (14:05 -0400)]
panfrost: Document MALI_WRITES_GLOBAL bit

We've been setting this unconditionally -- oops!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5300>

4 years agopanfrost: Update MALI_EARLY_Z description
Alyssa Rosenzweig [Tue, 2 Jun 2020 18:03:58 +0000 (14:03 -0400)]
panfrost: Update MALI_EARLY_Z description

Via the ES3.1 early-z testing force, I've confirmed this bit is e-z.
I've also confirmed e-z must be disabled for global writes, as expected.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5300>

4 years agoiris: remove unused iris_bo->swizzle_mode
Marcin Ślusarz [Wed, 3 Jun 2020 15:00:38 +0000 (17:00 +0200)]
iris: remove unused iris_bo->swizzle_mode

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5313>

4 years agoaco: sign-extend input/identity for 16-bit subgroup ops on GFX6-GFX7
Samuel Pitoiset [Tue, 26 May 2020 14:21:44 +0000 (16:21 +0200)]
aco: sign-extend input/identity for 16-bit subgroup ops on GFX6-GFX7

16-bit subgroup ops are implemented with 32-bit instructions
on GFX6-GFX7.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5227>

4 years agoaco: fix subdword copies on GFX6-GFX7
Samuel Pitoiset [Tue, 26 May 2020 07:38:27 +0000 (09:38 +0200)]
aco: fix subdword copies on GFX6-GFX7

SDWA is only GFX8+. Use v_mov_b32 since the upper 16 bits don't matter.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5227>

4 years agoaco: implement 16-bit nir_intrinsic_quad_* on GFX6-GFX7
Samuel Pitoiset [Tue, 26 May 2020 14:06:14 +0000 (16:06 +0200)]
aco: implement 16-bit nir_intrinsic_quad_* on GFX6-GFX7

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5227>

4 years agoaco: implement 16-bit reduce operations on GFX6-GFX7
Samuel Pitoiset [Mon, 25 May 2020 17:59:57 +0000 (19:59 +0200)]
aco: implement 16-bit reduce operations on GFX6-GFX7

No fp16 on GFX6-GFX7.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5227>