mesa.git
4 years agoutil/u_debug: use detect_os.h
Eric Engestrom [Thu, 1 Aug 2019 21:36:30 +0000 (22:36 +0100)]
util/u_debug: use detect_os.h

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agoutil/os_misc: use detect_os.h to start uncoupling from gallium
Eric Engestrom [Thu, 1 Aug 2019 21:33:05 +0000 (22:33 +0100)]
util/os_misc: use detect_os.h to start uncoupling from gallium

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agoutil/os_memory: use detect_os.h to uncouple it from gallium
Eric Engestrom [Thu, 1 Aug 2019 15:55:39 +0000 (16:55 +0100)]
util/os_memory: use detect_os.h to uncouple it from gallium

While at it, remove p_compiler.h as well as it is unused.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agogallium: deduplicate os detection logic by using detect_os.h
Eric Engestrom [Thu, 1 Aug 2019 13:58:52 +0000 (14:58 +0100)]
gallium: deduplicate os detection logic by using detect_os.h

This allows us to avoid having to rename all the PIPE_OS_* at once while
still making sure PIPE_OS_* and DETECT_OS_* are always in sync.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agogallium/utils: drop PIPE_SUBSYSTEM_WINDOWS_USER
Eric Engestrom [Thu, 1 Aug 2019 21:49:05 +0000 (22:49 +0100)]
gallium/utils: drop PIPE_SUBSYSTEM_WINDOWS_USER

This is basically just an alias for PIPE_OS_WINDOWS.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agoscons: rename PIPE_SUBSYSTEM_EMBEDDED to EMBEDDED_DEVICE
Eric Engestrom [Thu, 1 Aug 2019 20:45:25 +0000 (21:45 +0100)]
scons: rename PIPE_SUBSYSTEM_EMBEDDED to EMBEDDED_DEVICE

It has nothing to do with the PIPE_SUBSYSTEM_* stuff from gallium.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agogallium: remove never-used PIPE_SUBSYSTEM_DRI
Eric Engestrom [Thu, 1 Aug 2019 14:02:15 +0000 (15:02 +0100)]
gallium: remove never-used PIPE_SUBSYSTEM_DRI

PIPE_SUBSYSTEM_DRI was introduced in dacfef158943665fc0d1 ("gallium: New
configuration header.") 11 years ago, and was never used.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agoutil: fix typo in comment
Eric Engestrom [Thu, 1 Aug 2019 14:01:54 +0000 (15:01 +0100)]
util: fix typo in comment

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agoutil: introduce detect_os.h
Eric Engestrom [Thu, 1 Aug 2019 13:48:26 +0000 (14:48 +0100)]
util: introduce detect_os.h

Mostly copied from src/gallium/include/pipe/p_config.h, so I kept its
copyright and authorship.

Other than the obvious rename, the big difference is that these are
always defined, to be used as `#if DETECT_OS_LINUX`.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
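
As an illustration of the "always defined" convention described in the commit above, here is a reduced sketch. It is not the contents of detect_os.h: only two macros are shown, and the helper os_name() is hypothetical.

```c
/* Sketch only: each macro is first set to 1 when the platform matches... */
#if defined(__linux__)
#define DETECT_OS_LINUX 1
#endif

#if defined(_WIN32)
#define DETECT_OS_WINDOWS 1
#endif

/* ...and then defaulted to 0, so it is always defined. */
#ifndef DETECT_OS_LINUX
#define DETECT_OS_LINUX 0
#endif

#ifndef DETECT_OS_WINDOWS
#define DETECT_OS_WINDOWS 0
#endif

/* Hypothetical helper: callers use #if, never #ifdef. */
static const char *
os_name(void)
{
#if DETECT_OS_LINUX
   return "linux";
#elif DETECT_OS_WINDOWS
   return "windows";
#else
   return "other";
#endif
}
```

Because the macros always expand to 0 or 1, a misspelled name can be caught with -Wundef instead of silently evaluating as "not defined".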
4 years agofreedreno/batch: fix dependency loop detection
Rob Clark [Wed, 31 Jul 2019 19:30:24 +0000 (12:30 -0700)]
freedreno/batch: fix dependency loop detection

We can have a scenario like:

  A -> B
  A -> C -> B

When adding the A->C dependency, it doesn't really matter that C depends
on something that A depends on; that isn't a necessary condition for a
dependency loop.

Instead what we want to know is that nothing C depends on, directly or
indirectly, depends on A.  We can detect this by recursively OR'ing the
dependents_mask of C and all its dependencies.

Signed-off-by: Rob Clark <robdclark@chromium.org>
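
The check described in the commit above can be sketched as follows. This is not the freedreno code: struct batch_sketch, the flat batches[] table indexed by idx, and both function names are assumptions made for illustration, as is the reading of dependents_mask as "bit i set means this batch depends on batch i".

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_BATCHES 32

/* Hypothetical, reduced batch: only what the loop check needs. */
struct batch_sketch {
   uint32_t idx;             /* this batch's bit position, < MAX_BATCHES */
   uint32_t dependents_mask; /* assumed: bit i set => depends on batch i */
};

/* OR together the dependents_mask of b and of everything b depends on,
 * directly or indirectly.  Terminates as long as the existing graph
 * contains no loop, which is the invariant being protected. */
static uint32_t
recursive_dependents_mask(const struct batch_sketch *batches,
                          const struct batch_sketch *b)
{
   uint32_t mask = b->dependents_mask;

   for (uint32_t i = 0; i < MAX_BATCHES; i++) {
      if (b->dependents_mask & (UINT32_C(1) << i))
         mask |= recursive_dependents_mask(batches, &batches[i]);
   }
   return mask;
}

/* Would adding the dependency a -> dep create a loop? */
static bool
would_create_loop(const struct batch_sketch *batches,
                  const struct batch_sketch *a,
                  const struct batch_sketch *dep)
{
   if (a == dep)
      return true;
   return (recursive_dependents_mask(batches, dep) &
           (UINT32_C(1) << a->idx)) != 0;
}
```

With A -> B and C -> B already recorded, would_create_loop() for A -> C reports no loop, because nothing reachable from C depends on A.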
4 years agofreedreno/a6xx: add missing flush/invalidates for blit
Rob Clark [Tue, 30 Jul 2019 16:34:53 +0000 (09:34 -0700)]
freedreno/a6xx: add missing flush/invalidates for blit

Various things we were missing for multiple blits in a single batch.

Signed-off-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno/a6xx: skip tiles with no geometry
Rob Clark [Sat, 27 Jul 2019 16:00:37 +0000 (09:00 -0700)]
freedreno/a6xx: skip tiles with no geometry

If no clear, and no geometry according to VSC_STATE[pipe] we can skip
the tile entirely.  If there is a fast-clear, we can't skip restore
(clear) or resolve IBs, but we can still skip draw IB.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agofreedreno/a6xx: VSC overflow detection/handling
Rob Clark [Fri, 26 Jul 2019 16:55:14 +0000 (09:55 -0700)]
freedreno/a6xx: VSC overflow detection/handling

Check VSC_SIZE/VSC_SIZE2 regs from cmdstream to detect overflow, and
skip use of VSC visibility stream when overflow is detected, to avoid
GPU hangs.  This is done w/ introduction of some CP_REG_TEST/
CP_COND_REG_EXEC packet pairs.

In addition, eventually (after a frame or two) detect the condition and
resize the VSC buffers until overflow no longer happens.

Note that this significantly reduces the initial size of the VSC
buffers, backing out a previous hack to make them 16x larger than
what should be typically required (the previous "solution" for
VSC overflow).

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agofreedreno/a6xx: remove USE/IGNORE_VISIBILITY draw patching
Rob Clark [Fri, 26 Jul 2019 16:05:58 +0000 (09:05 -0700)]
freedreno/a6xx: remove USE/IGNORE_VISIBILITY draw patching

Seems this isn't needed anymore on a6xx to control whether visibility
stream is used.  And it would be hard to deal with if it was, for
disabling use of VSC stream in draw pass.  So just remove it and
simplify things.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agofreedreno/a6xx: cleanup "blit_mem"
Rob Clark [Thu, 25 Jul 2019 23:34:25 +0000 (16:34 -0700)]
freedreno/a6xx: cleanup "blit_mem"

Rename to "control_mem", and switch to using a struct to manage the
layout, rather than just ad-hoc hard-coded offsets.

For recovering from VSC stream overflow, we'll need to add more, but
best to clean it up first.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
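
A minimal sketch of the idea in the commit above: describe a shared buffer with a struct and derive offsets from it, rather than scattering hard-coded numbers. The struct name and every field name here are placeholders, not the driver's real layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the fields are placeholders. */
struct control_mem_sketch {
   uint32_t seqno;
   uint32_t _pad0;
   uint64_t counter;
   uint32_t scratch[4];
};

/* Byte offset of a member, derived from the struct definition. */
#define control_offset(member) offsetof(struct control_mem_sketch, member)

/* Address of a member, given the base address of the buffer. */
static uint64_t
control_addr(uint64_t base, size_t offset)
{
   return base + offset;
}
```

Adding a field then only means editing the struct; every user of control_offset() picks up the new layout automatically.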
4 years agofreedreno: refresh tile debug
Rob Clark [Wed, 24 Jul 2019 21:28:10 +0000 (14:28 -0700)]
freedreno: refresh tile debug

Fix some #ifdef'd bitrot, and get rid of #ifdef so it doesn't bitrot
again.

And add prints for per-tile state.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agofreedreno: update registers
Rob Clark [Thu, 25 Jul 2019 22:25:22 +0000 (15:25 -0700)]
freedreno: update registers

Pull in some updates of VSC regs

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agofreedreno/gmem: small cleanup
Rob Clark [Thu, 25 Jul 2019 20:40:02 +0000 (13:40 -0700)]
freedreno/gmem: small cleanup

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agofreedreno/drm: convert ring_pool to child_pool
Rob Clark [Mon, 29 Jul 2019 18:48:06 +0000 (11:48 -0700)]
freedreno/drm: convert ring_pool to child_pool

Worth another couple percent at driver2

Signed-off-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno/drm: remove idx_lock
Rob Clark [Mon, 29 Jul 2019 17:27:18 +0000 (10:27 -0700)]
freedreno/drm: remove idx_lock

Since it ends up contended, it is a bit of a bottleneck for workloads
with high driver overhead.  Worth nearly +10% at gfxbench driver2.

Signed-off-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno/batch: always update last_fence
Rob Clark [Sun, 28 Jul 2019 17:04:25 +0000 (10:04 -0700)]
freedreno/batch: always update last_fence

Not all flush paths come thru fd_context_flush(), so we should also set
last_fence in the batch flush path.  This avoids some no-op flushes just
to get a fence.  For example when pctx->flush_resource() triggers a
flush.

We should probably keep the last_fence update in fd_context_flush() as
well to handle deferred flush case.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agofreedreno: drop unused fd_fence_ref param
Rob Clark [Tue, 30 Jul 2019 15:12:46 +0000 (08:12 -0700)]
freedreno: drop unused fd_fence_ref param

The pscreen param was just there to satisfy pipe_screen::fence_reference

But some of the internal uses passed NULL for screen.  Which is a bit
ugly.  Instead drop the param and add a shim function to plug into the
screen.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agopan/midgard: Print invert modifier
Alyssa Rosenzweig [Fri, 26 Jul 2019 21:59:00 +0000 (14:59 -0700)]
pan/midgard: Print invert modifier

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Flip conditionals
Alyssa Rosenzweig [Fri, 26 Jul 2019 21:25:25 +0000 (14:25 -0700)]
pan/midgard: Flip conditionals

We would like to flip ops to have a constant in the second place to
enable inlining of the constant.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
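
To illustrate what flipping a conditional means in the commit above: swapping the operands of a comparison requires mirroring the operator, so that "c < x" becomes "x > c" with the constant in the second place. The enum and both functions below are hypothetical and are not the Midgard compiler's representation.

```c
#include <stdbool.h>

enum cmp_op { CMP_LT, CMP_LE, CMP_GT, CMP_GE, CMP_EQ, CMP_NE };

/* Operator to use after the two operands have been swapped. */
static enum cmp_op
flip_cmp(enum cmp_op op)
{
   switch (op) {
   case CMP_LT: return CMP_GT;
   case CMP_LE: return CMP_GE;
   case CMP_GT: return CMP_LT;
   case CMP_GE: return CMP_LE;
   default:     return op; /* EQ and NE are symmetric */
   }
}

/* Reference evaluation, used to check that flipping preserves meaning. */
static bool
eval_cmp(enum cmp_op op, int a, int b)
{
   switch (op) {
   case CMP_LT: return a < b;
   case CMP_LE: return a <= b;
   case CMP_GT: return a > b;
   case CMP_GE: return a >= b;
   case CMP_EQ: return a == b;
   default:     return a != b;
   }
}
```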
4 years agopan/midgard: Add bitwise src/invert fusing
Alyssa Rosenzweig [Fri, 26 Jul 2019 20:32:54 +0000 (13:32 -0700)]
pan/midgard: Add bitwise src/invert fusing

De Morgan's Laws and some special ops basically.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
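
The De Morgan identities mentioned in the commit above, written out on plain 32-bit integers. This only demonstrates the algebra; how the compiler maps these onto Midgard opcodes is not shown, and the function names are made up for the example.

```c
#include <stdint.h>

/* Inverted-result ops: a single op that yields the complemented result. */
static uint32_t nand32(uint32_t a, uint32_t b) { return ~(a & b); }
static uint32_t nor32(uint32_t a, uint32_t b)  { return ~(a | b); }

/* Ops whose two sources are both inverted first. */
static uint32_t and_of_inverts(uint32_t a, uint32_t b) { return ~a & ~b; }
static uint32_t or_of_inverts(uint32_t a, uint32_t b)  { return ~a | ~b; }

/* De Morgan:
 *    ~a & ~b == ~(a | b)   i.e. and_of_inverts(a, b) == nor32(a, b)
 *    ~a | ~b == ~(a & b)   i.e. or_of_inverts(a, b)  == nand32(a, b)
 * so two source inverts can be traded for one inverted-result op. */
```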
4 years agopan/midgard: Add .not propagation pass
Alyssa Rosenzweig [Fri, 26 Jul 2019 20:14:55 +0000 (13:14 -0700)]
pan/midgard: Add .not propagation pass

Essentially .pos propagation but for bitwise.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Fuse invert into bitwise ops
Alyssa Rosenzweig [Fri, 26 Jul 2019 20:08:54 +0000 (13:08 -0700)]
pan/midgard: Fuse invert into bitwise ops

We use the new invert flag to produce ops like inand.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agofreedreno: a2xx: implement texture tiling
Jonathan Marek [Thu, 1 Aug 2019 18:41:44 +0000 (14:41 -0400)]
freedreno: a2xx: implement texture tiling

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno: a2xx: use nir_lower_alu_to_scalar instead of lowering pass
Jonathan Marek [Thu, 1 Aug 2019 19:52:58 +0000 (15:52 -0400)]
freedreno: a2xx: use nir_lower_alu_to_scalar instead of lowering pass

nir_lower_alu_to_scalar can now be used to only lower certain ops, so we
don't need the custom pass. And we can lower fall_equal/fany_nequal with
lower_vector_cmp instead.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno: a2xx: fix HW binning for batches with >256K vertices
Jonathan Marek [Thu, 1 Aug 2019 19:22:47 +0000 (15:22 -0400)]
freedreno: a2xx: fix HW binning for batches with >256K vertices

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno: a2xx: fix fneg/fabs/fsat opcodes
Jonathan Marek [Thu, 1 Aug 2019 18:43:12 +0000 (14:43 -0400)]
freedreno: a2xx: fix fneg/fabs/fsat opcodes

Previously we would get a fmov with modifiers, but now that mov has no type
these opcodes need to be supported.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno: a2xx: fix order of NIR opts
Jonathan Marek [Thu, 1 Aug 2019 18:38:18 +0000 (14:38 -0400)]
freedreno: a2xx: fix order of NIR opts

int_to_float needs to come after bool_to_float, and lower_to_source_mods
needs to come after both, since they don't deal with source mods.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno: a2xx: fix non-etc1 cubemaps
Jonathan Marek [Thu, 1 Aug 2019 18:36:41 +0000 (14:36 -0400)]
freedreno: a2xx: fix non-etc1 cubemaps

Not sure how this happened, but apparently all cubemaps need swapped XY.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno: a2xx: fix fast clear not being used for Z24X8 buffers
Jonathan Marek [Thu, 1 Aug 2019 16:50:03 +0000 (12:50 -0400)]
freedreno: a2xx: fix fast clear not being used for Z24X8 buffers

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agofreedreno: align renderonly scanout buffers
Jonathan Marek [Thu, 1 Aug 2019 16:42:33 +0000 (12:42 -0400)]
freedreno: align renderonly scanout buffers

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Rob Clark <robdclark@chromium.org>
4 years agogitlab-ci: just build all the tools
Eric Engestrom [Thu, 1 Aug 2019 22:32:34 +0000 (23:32 +0100)]
gitlab-ci: just build all the tools

This line was mistakenly added while there is already a `-D tools=all`
a few lines below.

Fixes: f60defa72d5d20d99e3a ("gitlab-ci: Add a shader-db run using v3d on drm-shim.")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agoi965/clear: clear_value better precision
Sergii Romantsov [Fri, 12 Jul 2019 13:46:45 +0000 (16:46 +0300)]
i965/clear: clear_value better precision

Test-case with depth-clear 0.5 and format
MESA_FORMAT_Z24_UNORM_X8_UINT fails due to an inconsistent
clear-value of 0.4999997.
Maybe it's better to improve?

CC: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: 0ae9ce0f29ea (i965/clear: Quantize the depth clear value based on the format)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111113
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
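
One way a 24-bit depth clear of 0.5 can end up slightly below 0.5 is truncation during quantization. The sketch below shows that effect, with rounding as one possible remedy. It is an assumption about the failure mode, not the code changed by the commit above, and all function names are hypothetical.

```c
#include <stdint.h>

#define Z24_MAX 0xffffffu /* largest 24-bit unorm code */

/* Quantize by truncation: 0.5 * 16777215 = 8388607.5 becomes 8388607. */
static uint32_t
z24_truncate(double depth)
{
   return (uint32_t)(depth * Z24_MAX);
}

/* Quantize by rounding to nearest (inputs assumed to be in [0, 1]). */
static uint32_t
z24_round(double depth)
{
   return (uint32_t)(depth * Z24_MAX + 0.5);
}

/* Convert a 24-bit code back to a normalized value. */
static double
z24_to_unorm(uint32_t code)
{
   return (double)code / Z24_MAX;
}
```

Since 0xffffff is odd, 0.5 has no exact 24-bit representation: the truncated code converts back to a value just under 0.5, the rounded one to a value just over it.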
4 years agoradv: fix image_has_{cmask,fmask}() helpers
Samuel Pitoiset [Fri, 2 Aug 2019 11:55:01 +0000 (13:55 +0200)]
radv: fix image_has_{cmask,fmask}() helpers

The driver should now rely on cmask_offset because CMASK can be
disabled by the driver for some reason (e.g. mipmaps). Apply the
same change for FMASK, although it should be useless.

Fixes: ad1bc8621df ("radv: remove radv_get_image_fmask_info()")
Fixes: 10d08da52c6 ("radv/gfx10: add missing dcc_tile_swizzle tweak")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv: remove radv_get_image_fmask_info()
Samuel Pitoiset [Thu, 1 Aug 2019 15:59:56 +0000 (17:59 +0200)]
radv: remove radv_get_image_fmask_info()

It's unnecessary to duplicate fields in another struct.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: add missing dcc_tile_swizzle tweak
Samuel Pitoiset [Thu, 1 Aug 2019 13:45:11 +0000 (15:45 +0200)]
radv/gfx10: add missing dcc_tile_swizzle tweak

Fixes: c90f46700dd ("radv/gfx10: mask DCC tile swizzle by alignment")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv: remove radv_get_image_cmask_info()
Samuel Pitoiset [Thu, 1 Aug 2019 15:59:55 +0000 (17:59 +0200)]
radv: remove radv_get_image_cmask_info()

It's unnecessary to duplicate fields in another struct.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv: only account for tile_swizzle for color surfaces with DCC
Samuel Pitoiset [Thu, 1 Aug 2019 13:45:10 +0000 (15:45 +0200)]
radv: only account for tile_swizzle for color surfaces with DCC

It's 0 for depth surfaces with TC compat HTILE enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv: Enable VK_KHR_shader_atomic_int64
Bas Nieuwenhuizen [Fri, 2 Aug 2019 00:16:23 +0000 (02:16 +0200)]
radv: Enable VK_KHR_shader_atomic_int64

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
4 years agoac/nir: Implement LLVM9 64-bit buffer compare & exchange.
Bas Nieuwenhuizen [Fri, 2 Aug 2019 10:01:34 +0000 (12:01 +0200)]
ac/nir: Implement LLVM9 64-bit buffer compare & exchange.

LLVM 9 does not have a 64-bit buffer compswap intrinsic, so this
extracts the ptr, does a bound check and then uses a cmpxchg LLVM
instruction.

Not ideal, but the earliest release we're going to get a proper
intrinsic is LLVM 10.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
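
The fallback described in the commit above (compute the pointer, bounds-check it, then issue a plain compare-and-exchange) has roughly the following shape when written with C11 atomics on the CPU. This is an analogy only: the function name is hypothetical, and returning 0 for an out-of-bounds access is an assumption, not something the commit states.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the value that was in buf[index] before the operation. */
static uint64_t
buffer_cmpxchg64(_Atomic uint64_t *buf, size_t num_elems, size_t index,
                 uint64_t expected, uint64_t desired)
{
   /* Bounds check first: out-of-range accesses never touch memory.
    * Returning 0 here is an assumption made for this sketch. */
   if (index >= num_elems)
      return 0;

   /* On failure, 'expected' is overwritten with the current value;
    * on success it already equals the old value. */
   atomic_compare_exchange_strong(&buf[index], &expected, desired);
   return expected;
}
```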
4 years agoRevert "ac/nir: handle negate modifier"
Connor Abbott [Fri, 2 Aug 2019 09:14:50 +0000 (11:14 +0200)]
Revert "ac/nir: handle negate modifier"

This reverts commit bfea7e4d2965269bff8f1f6449cb99c312fd7384.

4 years agoRevert "ac/nir: handle abs modifier"
Connor Abbott [Fri, 2 Aug 2019 09:14:08 +0000 (11:14 +0200)]
Revert "ac/nir: handle abs modifier"

This reverts commit d3c80733cdfe8552b2f447ec8ed62465d0f2af1a.

These were only appearing due to memory corruption.

4 years agoiris: bump compat profile support to 4.6
Timothy Arceri [Fri, 28 Jun 2019 04:55:20 +0000 (14:55 +1000)]
iris: bump compat profile support to 4.6

All of the current piglit compat profile tests pass.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
4 years agoegl: fix OpenGL 3.1 context creation
Timothy Arceri [Fri, 2 Aug 2019 01:38:45 +0000 (11:38 +1000)]
egl: fix OpenGL 3.1 context creation

From the EGL_KHR_create_context spec:

   "* If OpenGL 3.1 is requested, the context returned may implement
       any of the following versions:

         * Version 3.1. The GL_ARB_compatibility extension may or may
           not be implemented, as determined by the implementation.
         * The core profile of version 3.2 or greater."

Fixes CTS tests:

    dEQP-EGL.functional.create_context_ext.gl_31.rgb888_depth_stencil
    dEQP-EGL.functional.create_context_ext.robust_gl_31.rgb888_depth_stencil
    dEQP-EGL.functional.create_context_ext.gl_31.rgb888_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.robust_gl_31.rgb888_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.gl_31.rgba8888_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.gl_31.rgb888_no_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.robust_gl_31.rgba8888_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.robust_gl_31.rgb888_no_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.gl_31.rgba8888_no_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.robust_gl_31.rgba8888_no_depth_no_stencil
    dEQP-EGL.functional.create_context_ext.gl_31.rgba8888_depth_stencil
    dEQP-EGL.functional.create_context_ext.robust_gl_31.rgba8888_depth_stencil

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
4 years agonir/find_array_copies: Reject copies with mismatched type
Connor Abbott [Wed, 31 Jul 2019 09:32:30 +0000 (11:32 +0200)]
nir/find_array_copies: Reject copies with mismatched type

When we detect a scalar/vector copy through load_deref/store_deref, we
have to be careful since those can bitcast an int to a float and
vice-versa even though copy_deref can't.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111251
Fixes: 156306e5e62 ("nir/find_array_copies: Handle wildcards and overlapping copies")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agoradv: re-apply "Optimize rebinding the same descriptor set."
Samuel Pitoiset [Fri, 2 Aug 2019 07:56:32 +0000 (09:56 +0200)]
radv: re-apply "Optimize rebinding the same descriptor set."

This makes it cheaper to just change the dynamic offsets with
the same descriptor sets.

This optimization was reverted a while back because of
random GPU hangs on GFX9. Now it looks fine: at least CTS no longer
hangs on GFX9, and it doesn't hang on GFX10 either.

It fixes a performance problem with Wolfenstein Youngblood.

Suggested-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
4 years agoradv/gfx10: use the correct target machine for Wave32
Samuel Pitoiset [Thu, 1 Aug 2019 08:43:44 +0000 (10:43 +0200)]
radv/gfx10: use the correct target machine for Wave32

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders
Samuel Pitoiset [Thu, 1 Aug 2019 08:43:42 +0000 (10:43 +0200)]
radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders

It can be enabled with RADV_PERFTEST=gewave32.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agoradv/gfx10: add Wave32 support for fragment shaders
Samuel Pitoiset [Thu, 1 Aug 2019 08:43:41 +0000 (10:43 +0200)]
radv/gfx10: add Wave32 support for fragment shaders

It can be enabled with RADV_PERFTEST=pswave32.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
4 years agogallium: Implement GL_EXT_shader_samples_identical via a new capability
Kenneth Graunke [Wed, 31 Jul 2019 22:47:34 +0000 (15:47 -0700)]
gallium: Implement GL_EXT_shader_samples_identical via a new capability

This exposes the textureSamplesIdenticalEXT function in GLSL.

We enable it for iris and radeonsi, because their compilers already
have support for this.  Tested on Intel Kabylake and AMD Vega 64.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
4 years agointel/tools: Fix aubinator_viewer build.
Kenneth Graunke [Fri, 2 Aug 2019 06:36:41 +0000 (23:36 -0700)]
intel/tools: Fix aubinator_viewer build.

This function was recently renamed and not all callers were updated.

Fixes: 086c486a75f ("intel/device: rename gen_get_device_info")
4 years agointel/ir: Fix CFG corruption in opt_predicated_break().
Francisco Jerez [Tue, 23 Jul 2019 23:17:07 +0000 (16:17 -0700)]
intel/ir: Fix CFG corruption in opt_predicated_break().

Specifically the optimization of a conditional BREAK + WHILE sequence
into a conditional WHILE seems pretty broken.  The list of successors
of "earlier_block" (where the conditional BREAK was found) is emptied
and then re-created with the same edges for no apparent reason.  On
top of that the list of predecessors of the block immediately after
the WHILE loop is emptied, but only one of the original edges will be
added back, which means that potentially several blocks that still
have it on their list of successors won't be on its list of
predecessors anymore, causing all sorts of hilarity due to the
inconsistency in the control flow graph.

The solution is to remove the code that's removing valid edges from
the CFG.  cfg_t::remove_block() will already clean up after itself.
The assert in bblock_t::combine_with() also needs to be removed since
we will be merging a block with multiple children into the first one
of them.

Found the issue on a hardware enabling branch originally, but
apparently somebody reproduced the same problem independently on
master in the meantime.

Fixes: d13bcdb3a9f ("i965/fs: Extend predicated break pass to predicate WHILE.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111009
Cc: jiradet.jd@gmail.com
Cc: Sergii Romantsov <sergii.romantsov@globallogic.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Tested-by: Paul Chelombitko <qamonstergl@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
4 years agointel/device: make internal functions private
Mark Janes [Tue, 30 Jul 2019 00:38:42 +0000 (17:38 -0700)]
intel/device: make internal functions private

The device info initializer makes several functions internal:

  - handling of device override
  - updating topology from kernel information

The implementation file is slightly reordered due to the renamed
functions being static.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agointel/device: rename gen_get_device_info
Mark Janes [Thu, 25 Jul 2019 22:57:30 +0000 (15:57 -0700)]
intel/device: rename gen_get_device_info

Rename the original device info initialization routine so callers
don't mistakenly call the wrong one:

  gen_get_device_info_from_fd:

      Queries kernel for full device info, including topology
      details.

  gen_get_device_info_from_pci_id:

      Partially initializes device info based on PCI ID lookup, when
      the kernel is not available.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agointel/tools: use device info initializer
Mark Janes [Thu, 25 Jul 2019 21:31:40 +0000 (14:31 -0700)]
intel/tools: use device info initializer

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoanv: use initialization routine for gen_device_info
Mark Janes [Thu, 25 Jul 2019 17:40:55 +0000 (10:40 -0700)]
anv: use initialization routine for gen_device_info

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoiris/screen: use initialization routine for gen_device_info
Mark Janes [Wed, 24 Jul 2019 22:21:36 +0000 (15:21 -0700)]
iris/screen: use initialization routine for gen_device_info

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoi965: Move device info initialization to common code
Mark Janes [Wed, 24 Jul 2019 20:48:03 +0000 (13:48 -0700)]
i965: Move device info initialization to common code

With perf queries, initializing the device info is much more complex
than just getting a PCI ID and calling gen_get_device_info.  This commit
adds a new gen_get_device_info_from_fd helper in common code which does
all of the requisite kernel queries to get device info including all of
the topology information.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agoi965/perf: verify kernel support before registering OA metrics
Mark Janes [Wed, 31 Jul 2019 23:16:50 +0000 (16:16 -0700)]
i965/perf: verify kernel support before registering OA metrics

When gen_device_info updates the topology in its initializer, the
kernel queries will fail silently.  Iris and anv have minimum
kernel requirements that support the queries.  i965 must verify kernel
support before reporting OA metrics.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
4 years agointel/common: provide common ioctl routine
Mark Janes [Thu, 25 Jul 2019 17:50:36 +0000 (10:50 -0700)]
intel/common: provide common ioctl routine

i965 links against libdrm for drmIoctl, but anv and iris both
re-implement this routine to avoid the dependency.

intel/dev also needs an ioctl wrapper, so let's share the same
implementation everywhere.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
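
For context, an ioctl wrapper of the kind referred to in the commit above typically just retries the call when it is interrupted, which is what libdrm's drmIoctl does. The sketch below uses a hypothetical name, retry_ioctl, and is not the shared Mesa helper itself.

```c
#include <errno.h>
#include <sys/ioctl.h>

/* Call ioctl(), restarting it if it was interrupted (EINTR) or asked
 * to be retried (EAGAIN).  Any other result is returned unchanged. */
static int
retry_ioctl(int fd, unsigned long request, void *arg)
{
   int ret;

   do {
      ret = ioctl(fd, request, arg);
   } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

   return ret;
}
```

A helper this small has no dependency beyond libc, which is why a driver can carry its own copy rather than link against libdrm for it.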
4 years agopanfrost: Remove unused argument
Alyssa Rosenzweig [Thu, 1 Aug 2019 15:10:03 +0000 (08:10 -0700)]
panfrost: Remove unused argument

A relic from when we didn't have an online compiler, hah.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Handle MESA_SHADER_COMPUTE in compile callback
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:52:04 +0000 (15:52 -0700)]
panfrost: Handle MESA_SHADER_COMPUTE in compile callback

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/midgard: Use standard list traversal to find initial tag
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:49:30 +0000 (15:49 -0700)]
pan/midgard: Use standard list traversal to find initial tag

Fixes a hang (and abort) on empty shaders, which you shouldn't have
anyway but better safe than sorry. DCE going on the fritz is no reason
to freeze the system.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Use gl_shader_stage directly for compiles
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:49:13 +0000 (15:49 -0700)]
panfrost: Use gl_shader_stage directly for compiles

No need to add a third set of enums to the mix.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Emit "draw" info for compute jobs
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:32:18 +0000 (15:32 -0700)]
panfrost: Emit "draw" info for compute jobs

Important fields relating to shader state and UBOs are filled out from
this (misnomer) function.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Feed compute shaders into the compiler
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:31:23 +0000 (15:31 -0700)]
panfrost: Feed compute shaders into the compiler

The path for compute shader compiles resembles the graphic shader
compile path, although it is substantially simpler as we don't need any
shader keying.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Expose compute shaders as panfrost_shader_variants
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:20:00 +0000 (15:20 -0700)]
panfrost: Expose compute shaders as panfrost_shader_variants

Whether variants are packed by graphics or compute is irrelevant.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Remove shader state *base
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:19:44 +0000 (15:19 -0700)]
panfrost: Remove shader state *base

It is now unused.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Remove CSO dependency from shader_compile
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:19:09 +0000 (15:19 -0700)]
panfrost: Remove CSO dependency from shader_compile

We want this routine to be generic across graphics and compute, so let
the caller deal with the typing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Generalize UBO upload for other shader stages
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:06:38 +0000 (15:06 -0700)]
panfrost: Generalize UBO upload for other shader stages

Now that everything is unified, this generalization is nice and easy.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Guard vertex upload by ctx->vertex != NULL
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:06:14 +0000 (15:06 -0700)]
panfrost: Guard vertex upload by ctx->vertex != NULL

This is irrelevant for graphics but matters for compute workloads.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Generalize vertex shader upload
Alyssa Rosenzweig [Wed, 31 Jul 2019 22:05:57 +0000 (15:05 -0700)]
panfrost: Generalize vertex shader upload

This allows us to reuse the same code path for compute.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Share gl_enables between VERTEX/COMPUTE
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:56:03 +0000 (14:56 -0700)]
panfrost: Share gl_enables between VERTEX/COMPUTE

Catch-all for magic bits.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Invoke compute shader according to grid info
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:27:53 +0000 (14:27 -0700)]
panfrost: Invoke compute shader according to grid info

We already have helpers for packing invocations (due to its role in
instanced vertex shaders), so we can reuse this drop in for compute
shaders.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Explain and include compute FBD
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:22:37 +0000 (14:22 -0700)]
panfrost: Explain and include compute FBD

Squint at it hard enough and you realize it's the beginning of an
SFBD... I guess...

A compute shader with register spilling would be able to confirm this,
but we would expect to see the first field | 1 and an address splattered
later, setting up TLS.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Unify-driven cleanup
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:15:19 +0000 (14:15 -0700)]
panfrost: Unify-driven cleanup

Again, now that stages are unified some logic goes away.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Unify ctx->vs and ctx->fs
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:13:30 +0000 (14:13 -0700)]
panfrost: Unify ctx->vs and ctx->fs

It's a little verbose, but this way we can support other shader stages
without too much contortion.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Flesh out launch_grid stub
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:08:59 +0000 (14:08 -0700)]
panfrost: Flesh out launch_grid stub

It's still incomplete, but we're able to hook into launch_grid to
create a stub COMPUTE job.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Cleanup via payload unification
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:08:07 +0000 (14:08 -0700)]
panfrost: Cleanup via payload unification

Since these are now indexable, quite a bit of code cleans up.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Unify payload_vertex/payload_tiler
Alyssa Rosenzweig [Wed, 31 Jul 2019 21:05:14 +0000 (14:05 -0700)]
panfrost: Unify payload_vertex/payload_tiler

Rather than disparate variables, let's use an array of payloads indexed
by the shader stage.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
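
The unification described above (an array of payloads indexed by shader stage instead of separate payload_vertex/payload_tiler variables) might look roughly like the sketch below. Every name here (sketch_stage, sketch_payload, sketch_context, sketch_bind_shader) is hypothetical and chosen for illustration; none of them are the driver's actual types or fields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stage enum for illustration only. */
enum sketch_stage {
    SKETCH_STAGE_VERTEX,
    SKETCH_STAGE_FRAGMENT,
    SKETCH_STAGE_COMPUTE,
    SKETCH_STAGE_COUNT
};

/* Hypothetical per-stage payload; the real payload carries far more state. */
struct sketch_payload {
    uint64_t shader_addr;
};

/* One array replaces the separate per-stage variables. */
struct sketch_context {
    struct sketch_payload payloads[SKETCH_STAGE_COUNT];
};

/* The same code path now serves every stage: the stage is just an index. */
static void
sketch_bind_shader(struct sketch_context *ctx, enum sketch_stage stage,
                   uint64_t shader_addr)
{
    assert(stage < SKETCH_STAGE_COUNT);
    ctx->payloads[stage].shader_addr = shader_addr;
}
```

With this layout, adding a stage such as compute means adding an enum value rather than another variable and another copy of each helper.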
4 years agopanfrost: Only wallpaper if we drew something
Alyssa Rosenzweig [Wed, 31 Jul 2019 20:54:23 +0000 (13:54 -0700)]
panfrost: Only wallpaper if we drew something

last_tiler.gpu may be NULL at flush time despite no clear and existing
jobs -- if we executed a compute-only workload.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Adjust shader CAPs to expose dEQP compute
Alyssa Rosenzweig [Wed, 31 Jul 2019 20:40:46 +0000 (13:40 -0700)]
panfrost: Adjust shader CAPs to expose dEQP compute

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Expose NIR as our PIPE_SHADER_CAP_SUPPORTED_IRS
Alyssa Rosenzweig [Tue, 23 Jul 2019 23:40:42 +0000 (16:40 -0700)]
panfrost: Expose NIR as our PIPE_SHADER_CAP_SUPPORTED_IRS

We *could* expose TGSI as well -- we pipe it through tgsi_to_nir for
Gallium-internal shaders anyway -- but we'd rather not.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Copy freedreno's panfrost_get_compute_param
Alyssa Rosenzweig [Tue, 23 Jul 2019 17:41:25 +0000 (10:41 -0700)]
panfrost: Copy freedreno's panfrost_get_compute_param

Values reported here aren't remotely correct, but it's a start to just
get the entrypoint stubbed out.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Expose COMPUTE-related caps for GLES3.1
Alyssa Rosenzweig [Tue, 23 Jul 2019 16:05:40 +0000 (09:05 -0700)]
panfrost: Expose COMPUTE-related caps for GLES3.1

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Stub out launch_grid
Alyssa Rosenzweig [Tue, 23 Jul 2019 15:31:14 +0000 (08:31 -0700)]
panfrost: Stub out launch_grid

Just dumps some information about the invocation for later debug.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Stub out compute CSO
Alyssa Rosenzweig [Tue, 23 Jul 2019 15:28:23 +0000 (08:28 -0700)]
panfrost: Stub out compute CSO

Doesn't do anything, just gets the functions there.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Implement gl_FrontFacing
Alyssa Rosenzweig [Wed, 31 Jul 2019 19:24:32 +0000 (12:24 -0700)]
panfrost: Implement gl_FrontFacing

Interestingly, this requires no compiler changes. It's just exposed as a
special varying.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopanfrost: Add support for decoding gl_FrontFacing
Alyssa Rosenzweig [Wed, 31 Jul 2019 18:56:55 +0000 (11:56 -0700)]
panfrost: Add support for decoding gl_FrontFacing

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agopan/decode: Use max varying index as varying buffer count
Alyssa Rosenzweig [Wed, 31 Jul 2019 18:52:52 +0000 (11:52 -0700)]
pan/decode: Use max varying index as varying buffer count

This allows us to decode asymmetric varyings correctly, which occurs
with e.g. gl_FrontFacing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
4 years agoiris: add support for gl_ClipVertex in tess eval shaders
Timothy Arceri [Fri, 28 Jun 2019 12:25:57 +0000 (22:25 +1000)]
iris: add support for gl_ClipVertex in tess eval shaders

Required for OpenGL compat support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
4 years agoiris: add support for gl_ClipVertex in geometry shaders
Timothy Arceri [Thu, 27 Jun 2019 05:06:30 +0000 (15:06 +1000)]
iris: add support for gl_ClipVertex in geometry shaders

This will enable us to support the OpenGL compat profile.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
4 years agonir: Stop whacking gl_FrontFacing to a system value
Jason Ekstrand [Wed, 31 Jul 2019 20:17:17 +0000 (15:17 -0500)]
nir: Stop whacking gl_FrontFacing to a system value

We have a cap bit for gallium and a GLSL compiler flag to control this.
Just trust what GLSL gives us and stop forcing it.  In order for this to
be safe, we have to advertise another cap in some of the gallium
drivers.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agopanfrost: Implement panfrost_set_shader_buffers callback
Alyssa Rosenzweig [Thu, 1 Aug 2019 17:31:35 +0000 (10:31 -0700)]
panfrost: Implement panfrost_set_shader_buffers callback

Just copy over the passed SSBO for now.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
4 years agogallium/util: Add util_set_shader_buffers_mask helper
Alyssa Rosenzweig [Thu, 1 Aug 2019 17:30:40 +0000 (10:30 -0700)]
gallium/util: Add util_set_shader_buffers_mask helper

Conceptually follows util_set_vertex_buffers_mask but for SSBOs.

v2: Fix missing ~ when clearing mask. Adjust mask behaviour to match
freedreno/v3d when buffer == NULL.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
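
As a rough illustration of the mask bookkeeping mentioned in the v2 note, the sketch below copies bindings into a slot array while keeping an enabled-bit mask in sync. It is not the code from this commit: buffer_binding, set_bindings_mask, and MAX_SLOTS are hypothetical names, and treating a NULL buffer as "unbind the slot and clear its bit" is an assumption about the intended behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SLOTS 32 /* hypothetical limit, one bit per slot */

/* Hypothetical stand-in for a buffer binding. */
struct buffer_binding {
    const void *buffer;
    unsigned offset;
    unsigned size;
};

/* Copy `count` bindings into `dst` starting at slot `start`, keeping
 * `enabled_mask` in sync: bit i is set iff slot i holds a non-NULL buffer.
 * Passing src == NULL unbinds the whole range. */
static void
set_bindings_mask(struct buffer_binding *dst, uint32_t *enabled_mask,
                  const struct buffer_binding *src,
                  unsigned start, unsigned count)
{
    assert(start + count <= MAX_SLOTS);

    for (unsigned i = 0; i < count; i++) {
        unsigned slot = start + i;
        uint32_t bit = UINT32_C(1) << slot;

        if (src && src[i].buffer) {
            dst[slot] = src[i];
            *enabled_mask |= bit;
        } else {
            memset(&dst[slot], 0, sizeof(dst[slot]));
            /* Clearing needs the complement. Writing `&= bit` here
             * would instead wipe every other slot's bit. */
            *enabled_mask &= ~bit;
        }
    }
}
```

The comment in the else branch points at the kind of mistake a missing ~ produces: the mask ends up keeping only the bit that was supposed to be cleared.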
4 years agokmsro: move entry points from etnaviv to kmsro
Jonathan Marek [Thu, 1 Aug 2019 16:48:40 +0000 (12:48 -0400)]
kmsro: move entry points from etnaviv to kmsro

These drivers are kmsro drivers, so they should be part of the kmsro #if.

This fixes the missing imx_drm driver when building with only freedreno+kmsro.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>