git.libre-soc.org Git - mesa.git/log

iris: Move fresh BO allocation into a helper function.

There's enough going on here to warrant a helper. More cleaning coming.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

iris: Do SET_TILING at a single point rather than in two places.

Both the from-cache and fresh-from-GEM cases were calling SET_TILING.
In the cached case, we would retry the allocation on failure, pitching
one BO from the cache each time.  This is silly, because the only time
it should fail is if the tiling or stride parameters are unacceptable,
which has nothing to do with the particular BO in question.  So there's
no point in retrying - we should simply fail the allocation.

This patch moves both calls to bo_set_tiling_internal() below the
cache/fresh split, so we have it at a single point in time instead
of two.

To preserve the ordering between SET_TILING and SET_DOMAIN, we move
that below as well.  (I am unsure if the order matters.)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

iris: Use the BO cache even for coherent buffers on non-LLC.

We mark snooped BOs as non-reusable, so we never return them to the
cache.  This means that we'd need to call I915_GEM_SET_CACHING to make
any BO we find in the cache snooped.  But then again, any BO we freshly
allocate from the kernel will also be non-snooped, so it has the same
issue.  There's really no reason to skip the cache - we may as well use
it to avoid the I915_GEM_CREATE overhead.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

iris: Fix locking around vma_alloc in iris_bo_create_userptr

util_vma needs to be protected by a lock. All other callers of
vma_alloc and vma_free appear to be holding a lock already.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

iris: Fix lock/unlock mismatch for non-LLC coherent BO allocation.

The goto jumped over the mtx_lock, but proceeded to hit the mtx_unlock.
We can simply set the bucket to NULL and it will skip the cache without
goto, and without messing up locking.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>

radeonsi: fix timestamp queries for compute-only contexts

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>

Change a few frequented uses of DEBUG to !NDEBUG

debugoptimized builds don't define NDEBUG, but they also don't define
DEBUG. We want to enable cheap debug code for these builds.
I only chose those occurences that I care about.

Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>

iris: Re-emit Surface State Base Address when context is lost.

When we hit a GPU hang, we failed to reset Surface State Base Address
right away, and would keep hanging until we filled up the binder. Then
we'd finally get it right after a lot of repeated stumbles. Update it
right away so we hopefully hang fewer times before succeeding.

iris: Enable nir_opt_large_constants

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15306230 -> 15304726 (<.01%)
    instructions in affected programs: 4570 -> 3066 (-32.91%)
    helped: 16
    HURT: 0

    total cycles in shared programs: 361703436 -> 361680041 (<.01%)
    cycles in affected programs: 129388 -> 105993 (-18.08%)
    helped: 16
    HURT: 0

    LOST:   0
    GAINED: 2

The helped programs were in XCom 2, Deus Ex: Mankind Divided, and Kerbal
Space Program

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

iris: Don't assume UBO indices are constant

It will be true for the constant/system value buffer because they use a
constant zero but it's not true in general. If we ever got here when
the source wasn't constant, nir_src_as_uint would assert.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org

iris: Move upload_ubo_ssbo_surf_state to iris_program.c

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

nir: silence three compiler warnings seen with MinGW

Silence two unused var warnings. And init elem_size, elem_align to
zero to silence "maybe uninitialized" warnings.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>

svga: clamp max_const_buffers to SVGA_MAX_CONST_BUFS

In case the device reports 15 (or more) buffers.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>

iris: Clone before calling nir_strip and serializing

This is non-destructive and leaves the debugging information in place.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

iris: Only store the SHA1 of the NIR in iris_uncompiled_shader

Jason pointed out that we don't need to keep an entire copy of the
serialized NIR around, we just need the SHA1. This does change our
disk cache key to be taking a SHA1 of a SHA1, which is a bit odd,
but should work out and be faster and use less memory.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

spirv: Change spirv_to_nir() to return a nir_shader

spirv_to_nir() returned the nir_function corresponding to the
entrypoint, as a way to identify it.  There's now a bool is_entrypoint
in nir_function and also a helper function to get the entry_point from
a nir_shader.

The return type reflects better what the function name suggests.  It
also helps drivers avoid the mistake of reusing internal shader
references after running NIR_PASS on it.  When using NIR_TEST_CLONE or
NIR_TEST_SERIALIZE, those would be invalidated right in the first pass
executed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: Don't re-use entry_point pointer from spirv_to_nir

Replace its uses with checking for is_entrypoint and calling
nir_shader_get_entrypoint().

This is a preparation to change spirv_to_nir() return type.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

glspirv: Don't re-use entry_point pointer from spirv_to_nir

Replace its use with checking for is_entrypoint.

This is a preparation to change spirv_to_nir() return type.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

turnip: Don't re-use entry_point pointer from spirv_to_nir

Replace its uses with nir_shader_get_entrypoint(), and change the
helper function to return nir_shader *.

This is a preparation to change spirv_to_nir() return type.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

virgl: fix readback with pending transfers

When readback is true, and there are pending writes in the transfer
queue, we should flush to avoid reading back outdated data. This
fixes piglit arb_copy_buffer/dlist and a subtest of
arb_copy_buffer/data-sync.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>

nir: Allow derefs to be used as phi sources

It is possible and valid for a pointer to be selected based on a
conditional before used, and depending on the mode, those cases will
result in a phi with derefs as sources.

To achieve this, we don't rematerialize derefs that are used by phis.
As a consequence, when converting from SSA to regs, we may have phis
that come from different blocks and are used by phis. We now convert
those to regs too.

Validation was added to ensure only derefs of certain modes can be
used as phi sources. No extra validation is needed for the presence
of cast, any instruction that uses derefs will validate the
deref-chain is complete (ending in a cast or a var).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

radeonsi: Fix editorconfig

At least on vim, indenting doesn't work without this. Copied from
src/amd/vulkan.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

mesa/main: clean up extension-check for GL_SAMPLE_MASK

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa/main: clean up extension-check for GL_SAMPLE_SHADING

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa/main: correct extension-checks for GL_PRIMITIVE_RESTART_FIXED_INDEX

This shouldn't be allowed in GLES 1/2.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa/main: correct extension-checks for GL_BLEND_ADVANCED_COHERENT_KHR

KHR_blend_equation_advanced_coherent isn't exposed on OpenGL ES 1.x, so
we shouldn't allow its enums there either.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa/main: correct extension-checks for GL_FRAMEBUFFER_SRGB

This enum shouldn't be allowed on OpenGL ES 1.x, so let's instead
use the extenion-helpers, and check for desktop and gles extensions
separately.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa/main: correct extension-checks for MESA_tile_raster_order

This extension isn't enabled for GLES 1.x, so we shouldn't allow the
state there. Let's use the extension-helpers instead of CHECK_EXTENSION
for this.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa/main: make the CONSERVATIVE_RASTERIZATION_NV checks consistent

This just makes the logic of the checks for this enum the same for
gl{Enable,Disable} and for glIsEnabled. They are already functionally
the same, so this is just a minor code-cleanup.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa/main: make the PRIMITIVE_RESTART_NV checks consistent

{En,Dis}ableClientState(PRIMITIVE_RESTART_NV) should only work on
compatibility contextxs. While we're at it, modernize the code a bit,
by using the extension helpers instead of open-coding.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

radv: use view format when selecting the resolve path for subpasses

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: always use view format when performing subpass resolves

It makes sense to use the image view formats when resolving
inside subpasses, while we have to use the image formats for
normal resolves.

Original patch by Philip Rebohle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110348
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: sync before resetting a pool if there is active pending queries

Make sure to sync all previous work if the given command buffer
has pending active queries. Otherwise the GPU might write queries
data after the reset operation.

This fixes a bunch of new dEQP-VK.query_pool.* CTS failures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

intel/decoder: Use get_state_size() over guessed counts in more cases

This makes the following packets use actual driver provided sizes rather
than guessing an arbitrary number:

  - CC_VIEWPORT
  - SF_CLIP_VIEWPORT
  - BLEND_STATE
  - COLOR_CALC_STATE
  - SCISSOR_RECT

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>

meson: Link Gallium drivers with ld_args_build_id

Link all Gallium drivers with ld_args_build_id to prevent failures in
Iris that uses GNU_BUILD_ID

Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=110757
Fixes: 4756864cdc5f "iris: Start wiring up on-disk shader cache"
Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

nir/lower_non_uniform: safely iterate over blocks

This fixes a problem where the same instruction gets replaced twice.
This was happening when the replaced instruction would be at the end
of a block.

Replacement of :

   if ssa_8 {
                ....
      intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */
   }

Would be :

   if ssa_8 {
      loop {
         vec1 32 ssa_47 = intrinsic read_first_invocation (ssa_44) ()
         vec1 1 ssa_48 = ieq ssa_47, ssa_44
         if ssa_48 {
            loop {
               vec1 32 ssa_49 = intrinsic read_first_invocation (ssa_44) ()
               vec1 1 ssa_50 = ieq ssa_49, ssa_44
               if ssa_50 {
                  intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */
                  break
               } else {
        ....
   }

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3bd545764151 ("nir: Add a lowering pass for non-uniform resource access")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

radv: allocate more space in the CS when emitting events

If the driver waits for CP DMA to be idle and emit an EOP event
we need more space.

This fixes a crash with Quake Champions.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

iris: Ask st to vectorize our IO.

(Technically this is common code, but it doesn't affect i965 or anv.)

Improves performance of GFXBench5/gl_tess_off on Skylake GT4e at 1080p
by 9.3933% +/- 0.0305157% by eliminating all spilling in the GS.

Improves performance of GFXBench5/gl_4_off (Car Chase) on Skylake GT4e
at 1080p by 0.325208% +/- 0.0842233% (n=18).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

st/nir: Re-vectorize shader IO

We scalarize IO to enable further optimizations, such as propagating
constant components across shaders, eliminating dead components, and
so on.  This patch attempts to re-vectorize those operations after
the varying optimizations are done.

Intel GPUs are a scalar architecture, but IO operations work on whole
vec4's at a time, so we'd prefer to have a single IO load per vector
rather than 4 scalar IO loads.  This re-vectorization can help a lot.

Broadcom GPUs, however, really do want scalar IO.  radeonsi may want
this, or may want to leave it to LLVM.  So, we make a new flag in the
NIR compiler options struct, and key it off of that, allowing drivers
to pick.  (It's a bit awkward because we have per-stage settings, but
this is about IO between two stages...but I expect drivers to globally
prefer one way or the other.  We can adjust later if needed.)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

mesa: Prevent classic swrast crash on a surfaceless context v2.

This fixes the egl_mesa_platform_surfaceless piglit test as well
as the new egl_ext_device_base piglit test on classic swrast.

v2: Fix swrast surfaceless contexts on the driver side.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>

radv add radv_get_resolve_pipeline() in the compute path

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: cleanup the compute resolve path for subpass

This makes use of radv_meta_resolve_compute_image() by filling
a VkImageResolve region instead of duplicating code.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radeonsi: add drirc workaround for American Truck Simulator

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110711

Revert "st/mesa: expose 0 shader binary formats for compat profiles for Qt"

This reverts commit 55376cb31e2f495a4d872b4ffce2135c3365b873.

It's been over a year and both QT 5.9.5 and 5.11.0 contained a fix for the
original issue. It seems i965 only ever applied this workaround to the
18.0 branch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

anv: fix apply_pipeline_layout pass for arrays of YCbCr descriptors

When using the binding tables to access arrays of YCbCr descriptors we
did not consider the offset of the accessed element. We can't do a
simple multiple because the binding table entries are tightly packed.

For example element 0 of the array could use 2 entries/planes and
element 1 could use 2 entries/planes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3bb8768b9d62 ("anv: toggle on support for VK_EXT_ycbcr_image_arrays")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

radeonsi: clean up winsys creation

- unify the code
- choose radeon or amdgpu based on the DRM version, not based on which one
succeeds first

radeonsi: allow query functions for compute-only contexts

ac: treat Mullins as Kabini, remove the enum

it's the same design

etnaviv: rs: choose clear format based on block size

Fixes following piglit and does not introduce any regressions.
spec@ext_packed_depth_stencil@fbo-depth-gl_depth24_stencil8-blit

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>

lima/ppir: implement discard and discard_if

This commit also adds codegen for branch since we need it
for discard_if.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>

radv: ignore the loadOp if the first use of an attachment is a resolve

Based on ANV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: always dirty the framebuffer when restoring a subpass

The old code was not wrong because the transitions performed
after the resolves should re-emit the framebuffer if needed.

This change is mostly a no-op but it improves consistency
regarding other meta operations that need to save/restore subpasses.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

radv: add radv_clear_htile() helper

This helper will be useful for clearing HTILE after some
depth/stencil resolves.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

anv/android: fix missing dependencies issue during parallel build

The libmesa_anv_gen* modules require anv_extensions.h, patch makes sure
it gets generated as a dependency before building them.

Signed-off-by: Chenglei Ren <chenglei.ren@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>

radv: tidy up GetQueryPoolResults for occlusion queries

Just move the block that checks the availability bit into the
switch like other query types.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

iris: Don't flag IRIS_DIRTY_URB after BLORP operations unless it changed

We already flag IRIS_DIRTY_URB when we change it, but we were
additionally flagging it on every BLORP operation, even if we didn't.

Revert "mesa: unreference current winsys buffers when unbinding winsys buffers"

This reverts commit 12bf7cfecf52083c484602f971738475edfe497e.

This commits caused lots of problems:
https://bugs.freedesktop.org/show_bug.cgi?id=110721
https://bugs.freedesktop.org/show_bug.cgi?id=110761

Fixes: 12bf7cfecf52 ("mesa: unreference current winsys buffers when unbinding winsys buffers")
Pushing without review as we need to get it into next stable.

panfrost/midgard: Implement fneg/fabs/fsat

Fix a regression I inadvertently caused by acking typeless movs before
implementing/pushing this *whistles*

Nothing to see here, move along folks.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

lima: fix lima_blit with non-zero level source resource

lima_blit will do blit between resources with different levels.
When blit from a level!=0 source, it will sample from that level
of resource as texture.

Current texture setup won't respect level when not mipmap filter.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>

lima: fix render to non-zero level texture

Current implementation won't respect level of surface to render.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>

editorconfig: Fix meson style

The syntax was wrong, resulting in it not working at all.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

virgl: remove an incorrect check in virgl_res_needs_flush

Imagine this

  resource_copy_region(ctx, dst, ..., src, ...);
  transfer_map(ctx, src, 0, PIPE_TRANSFER_WRITE, ...);

at the beginning of a cmdbuf.  We need to flush in transfer_map so
that the transfer is not reordered before the resource copy.  The
check for "vctx->num_draws == 0 && vctx->num_compute == 0" is not
enough.  Removing the optimization entirely.

Because of the more precise resource tracking in the previous
commit, I hope the performance impact is minimized.  We will have to
go with perfect resource tracking, or attempt a more limited
optimization, if there are specific cases we really need to optimize
for.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>

virgl: reemit resources on first draw/clear/compute

This gives us more precise resource tracking. It can be beneficial
because glFlush is often followed by state changes. We don't want
to reemit resources that are going to be unbound.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>

virgl: add missing emit_res for SO targets

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>

gallivm: fix default cbuf info.

The default null_output really needs to be static, otherwise the values
we'll eventually get later are doubly random (they are not initialized,
and even if they were it's a pointer to a local stack variable).
VMware bug 2349556.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

scons: fix build with llvm 9.

The x86asmprinter component is gone, and things seem to work by just
removing it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110707

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

panfrost: Dereference sampled texture

We are currently leaking resources if they were sampled from. Once we
are done with a sampler, we should dereference that resource.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

panfrost: ci: Avoid pulling Docker image on every run

Jump over the container stage if we haven't changed any of the files
that involved in building the container images.

This saves 1-2 minutes in each run and helps conserve resources.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

nir: Drop imov/fmov in favor of one mov instruction

The difference between imov and fmov has been a constant source of
confusion in NIR for years.  No one really knows why we have two or when
to use one vs. the other.  The real reason is that they do different
things in the presence of source and destination modifiers.  However,
without modifiers (which many back-ends don't have), they are identical.
Now that we've reworked nir_lower_to_source_mods to leave one abs/neg
instruction in place rather than replacing them with imov or fmov
instructions, we don't need two different instructions at all anymore.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Rob Clark <robdclark@chromium.org>

nir/builder: Merge nir_[if]mov_alu into one nir_mov_alu helper

Unless source modifiers are present, fmov and imov are the same.
There's no good reason for having two helpers.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

nir/lower_to_source_mods: Stop turning add, sat, and neg into mov

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

nir/source_mods: Add a helpers for setting source modifiers

It's potentially a tiny bit less efficient but the helpers make it much
easier to sort out the rules for updating source modifiers.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

intel: Implement abs, neg, and sat in the back-end

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

intel/nir: Call alu_to_scalar one last time before out-of-ssa

A few of our very late passes can end up generating vectors accidentally
so we need to get rid of them.  The only known case of this is the ffma
peephole which generates fneg and fabs as vectors.  Currently, they're
not a problem because they get turned into fmov which the back-end
compiler knows how to handle as a vector.  That's about to change.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

nir/builder: Remove the use_fmov parameter from nir_swizzle

This flag has caused more confusion than good in most cases. You can
validly use imov for floats or fmov for integers because, without source
modifiers, neither modify their input in any way. Using imov for floats
is more reliable so we go that direction.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

ptn,ttn: Use nir_channel for selecting channels

Both of these passes predate the nir_channel helper. We should just use
it instead of hand-rolling it in both passes.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>

scons: For MinGW use -posix flag.

Signed-off-by: Jose Fonseca <jfonseca@vmware.com>

etnaviv: use the correct uniform dirty bits

Found during code inspection.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>

anv: Do not emulate texture swizzle for INPUT_ATTACHMENT, STORAGE_IMAGE

If descriptorType is VK_DESCRIPTOR_TYPE_STORAGE_IMAGE
or VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT, the imageView member of each
element of pImageInfo must have been created with the identity swizzle.

Fixes: d2aa65eb
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

st/dri: enable EGL_ANDROID_blob_cache on gallium drivers

Verified to work properly with Iris driver on Android Celadon. Cache
files get generated as 'com.android.opengl.shaders_cache' for each
application.

v2: check that cache was returned (Eric Engestrom)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

panfrost: Remove the standalone compiler

Now that the online compiler and pandecode are reliable and upstreamed,
nobody is using this. If somebody does need it, it should be easy enough
to bring back, I suppose. At the moment, it's just a maintenance hazard,
since meson is silly and does double builds for compiler updates (triple
for disassembler changes).

If people need the standalone _disassembler_, that can be added
trivially into pandecode (pandecode already includes the disassembler).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ryan Houdek <Sonicadvance1@gmail.com>

vk/util: suppress warning about out-of-enum android value

src/vulkan/util/vk_enum_to_str.c: In function ‘vk_structure_type_size’:
src/vulkan/util/vk_enum_to_str.c:3335:9: warning: case value ‘1000010000’ not in enumerated type ‘VkStructureType’ {aka ‘const enum VkStructureType’} [-Wswitch]
case VK_STRUCTURE_TYPE_NATIVE_BUFFER_ANDROID: return sizeof(VkNativeBufferANDROID);
^~~~

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

iris: Advertise coherent framebuffer fetches

This lets us advertise GL_EXT_shader_framebuffer_fetch and
GL_KHR_blend_equation_advanced_coherent support.

gallium: Add PIPE_CAP_FBFETCH_COHERENT and expose extensions

st/mesa now exposes KHR_blend_equation_advanced_coherent and
EXT_shader_framebuffer_fetch if the new capability is supported.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

st/mesa: Advertise GL_EXT_shader_framebuffer_fetch_non_coherent

This extension requires the ability to read from all render targets,
so we only enable it if PIPE_CAP_FBFETCH >= PIPE_CAP_MAX_RENDER_TARGETS.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

gallium: Change PIPE_CAP_TGSI_FS_FBFETCH bool to PIPE_CAP_FBFETCH count

TGSI's FBFETCH instruction currently only supports reading from a single
render target, but NIR intrinsics can support multiple render targets.

radeonsi can only support fetching from RT 0, but other drivers may be
able to support fetching from any render target.

To express this, this patch renames PIPE_CAP_TGSI_FS_FBFETCH to simply
PIPE_CAP_FBFETCH, and converts it from a boolean "is FBFETCH supported?"
to an integer number of render targets which can be fetched.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

iris: Record state sizes for INTEL_DEBUG=bat decoding.

Felix noticed a crash when using INTEL_DEBUG=bat decoding.  It turned
out that we were sometimes placing variable length data near the end
of a buffer, and with the decoder guessing random lengths rather than
having an actual count, it was walking off the end and crashing.  So
this does more than improve the decoder output.

Unfortunately, this is a bit more complicated than i965's handling,
because we don't have a single state buffer.  Various places upload
data via u_upload_mgr, and so there isn't a central place to record
the size.  We don't need to catch every single place, however, since
it's only important to record variable length packets (like viewports
and binding tables).

State data also lives arbitrarily long, rather than being discarded on
every batch like i965, so we don't know when to clear out old entries
either.  (We also don't have a callback when an upload buffer is
released.)  So, this tracking may space leak over time.  That's probably
okay though, as this is only a debugging feature and it's a slow leak.
We may also get lucky and overwrite existing entries as we reuse BOs,
though I find this unlikely to happen.

The fact that the decoder works in terms of offsets from a state base
address is also not ideal, as dynamic state base address and surface
state base address differ for iris.  However, because dynamic state
addresses start from the top of a 4GB region, and binding tables start
from addresses [0, 64K), it's highly unlikely that we'll get overlap.

We can always improve this, but for now it's better than what we had.

vk/util: drop no-op compiler warning workaround

`-Wswitch` applies to `switch()`, not `case:`, and is bypassed by the
presence of a `default:` anyway, so let's drop the `default:` and move
the warning suppression to where it can make a difference, and then it
turns out that we don't need to keep a list of special cases anymore :)

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

mesa/main: make the CONSERVATIVE_RASTERIZATION_INTEL checks consistent

INTEL_conservative_rasterization isn't exposed on compatibility
contexts, nor for GLES 3.0 and below. We already do this correctly for
gl{Enable,Disable}, but we should do the same for glIsEnabled as well.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa/main: make the FRAGMENT_PROGRAM checks consistent

IsEnabled(FRAGMENT_PROGRAM) isn't supposed to be allowed, but our
check allowed this anyway. Let's make these checks consistent, and
while we're at it, modernize them a bit.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa/main: make the TEXTURE_CUBE_MAP checks consistent

IsEnabled(TEXTURE_CUBE_MAP) isn't supposed to be allowed, but our
check allowed this anyway. Let's make these checks consistent, and
while we're at it, modernize them a bit.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa/main: remove duplicate macros

These are already defined as the exactly same, so let's get rid of
the duplicate definitions.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa/main: remove unused argument

The 'CAP' argument has been unused in both of these macros since
2010, so let's get rid of it from both.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

mesa/main: remove unused macro

The first version of this macro is unused, so let's get rid of it.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>

glsl: simplify resource list building code

This greatly simplifies the code to calculate if we should add a
buffer to the resource list. This uses the spec rules and simple
math to decide if we should add the buffer rather than complex
string processing.

This patch refines a patch present in the ARB_gl_spriv merge
request for the NIR linker and applies it to the GLSL IR linker.
This is why we also move the function to the shared linker code,
because we will want to reuse the code for the NIR linker also.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

virgl: track valid buffer range for transfer sync

virgl_transfer_queue_is_queued was used to avoid flushing.  That
fails when the resource is being accessed by previous cmdbufs but
not the current one.

The new approach of tracking the valid buffer range does not apply
to textures however.  But hopefully it is fine because the goal is
to avoid waiting for this scenario

  glBufferSubData(..., offset, size, data1);
  glDrawArrays(...);
  // append new vertex data
  glBufferSubData(..., offset+size, size, data2);
  glDrawArrays(...);

If glTex(Sub)Image* turns out to be an issue, we will need to track
valid level/layer ranges as well.

v2: update virgl_buffer_transfer_extend as well
v3: do not remove virgl_transfer_queue_is_queued

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com> (v1)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> (v2)

virgl: remove support for buffer surfaces

st/mesa does not need it and virglrenderer does not really support
it. Remove the support so that we are sure pipe_surface never
refers to a buffer resource.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>

virgl: handle NULL shader resource explicitly

When shader images/buffers are set, do not rely on
virgl_encoder_write_res and virgl_resource_dirty to do the implicit
NULL check.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>

vulkan: fix build dependency issue with generated files

On machines with many cores, you can run into that issue :

../mesa-9999/src/vulkan/overlay-layer/overlay.cpp:42:10: fatal error: vk_enum_to_str.h: No such file or directory

v2: Move declare_dependency around (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Jan Ziak
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

gallium: enable dmabuf on BSD as well

The DRM_CONF_SHARE_FD code did not check for Linux, so the commit that
introduced PIPE_CAP_DMABUF broke Wayland-EGL clients on FreeBSD.

Fixes: 8ae50e60 (gallium: replace DRM_CONF_SHARE_FD with PIPE_CAP_DMABUF)
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>