mesa.git
5 years agonir: Actually propagate progress in nir_opt_move_load_ubo.
Bas Nieuwenhuizen [Thu, 30 May 2019 20:48:46 +0000 (22:48 +0200)]
nir: Actually propagate progress in nir_opt_move_load_ubo.

Found with Jason's new metadata rework (https://gitlab.freedesktop.org/mesa/mesa/merge_requests/950).

Fixes: af355aaa071 "nir: add nir_opt_move_load_ubo() optimization pass"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
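
The class of bug fixed above can be illustrated with a minimal sketch. It is not the actual nir_opt_move_load_ubo code; all names (toy_block, toy_pass_block, toy_pass_shader) are hypothetical. The point is only that a driver loop has to accumulate the per-block result instead of dropping it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a basic block: just an array of ints. */
struct toy_block {
   int *vals;
   size_t count;
};

/* Per-block worker: replaces negative values with 0 and reports whether
 * anything changed. */
static bool
toy_pass_block(struct toy_block *block)
{
   assert(block != NULL);

   bool progress = false;
   for (size_t i = 0; i < block->count; i++) {
      if (block->vals[i] < 0) {
         block->vals[i] = 0;
         progress = true;
      }
   }
   return progress;
}

/* Whole-shader driver.  The bug class is calling toy_pass_block() and
 * ignoring its return value; the fix is to accumulate it. */
static bool
toy_pass_shader(struct toy_block *blocks, size_t num_blocks)
{
   bool progress = false;
   for (size_t i = 0; i < num_blocks; i++)
      progress |= toy_pass_block(&blocks[i]);
   return progress;
}
```

A caller that relies on the return value (for example to decide whether to re-run other passes or invalidate metadata) only behaves correctly if the accumulated value is returned.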
5 years agoradv: use RADV_CMD_DIRTY_DYNAMIC_* when restoring viewport/scissor
Samuel Pitoiset [Thu, 30 May 2019 10:29:40 +0000 (12:29 +0200)]
radv: use RADV_CMD_DIRTY_DYNAMIC_* when restoring viewport/scissor

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: use CmdPushConstants when restoring constants after meta operations
Samuel Pitoiset [Thu, 30 May 2019 10:29:39 +0000 (12:29 +0200)]
radv: use CmdPushConstants when restoring constants after meta operations

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agonir/split_vars: Properly bail in the presence of complex derefs
Jason Ekstrand [Wed, 22 May 2019 20:54:39 +0000 (15:54 -0500)]
nir/split_vars: Properly bail in the presence of complex derefs

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir/vars_to_ssa: Properly ignore variables with complex derefs
Jason Ekstrand [Mon, 4 Mar 2019 18:12:48 +0000 (12:12 -0600)]
nir/vars_to_ssa: Properly ignore variables with complex derefs

Because the core principle of the vars_to_ssa pass is that it globally
(within a function) looks at all of the uses of a never-indirected path
and does a full into-SSA on that path, it can't handle a path which has
any chance of having aliasing.  If a function_temp variable has a cast
or anything else which may cause aliasing, we have to assume that all
paths to that variable may alias and ignore the entire variable.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir/vars_to_ssa: Use a non-null UNDEF_NODE pointer
Jason Ekstrand [Wed, 22 May 2019 23:01:14 +0000 (18:01 -0500)]
nir/vars_to_ssa: Use a non-null UNDEF_NODE pointer

We're about to change the meaning of get_deref_node returning NULL so we
need a non-NULL value to mean properly undefined.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir/deref: Add a has_complex_use helper
Jason Ekstrand [Wed, 22 May 2019 20:00:20 +0000 (15:00 -0500)]
nir/deref: Add a has_complex_use helper

This lets passes easily detect derefs which have uses that fall outside
the standard load/store/copy pattern so they can bail appropriately.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir/dead_cf: Call instructions aren't dead
Jason Ekstrand [Thu, 23 May 2019 03:13:15 +0000 (22:13 -0500)]
nir/dead_cf: Call instructions aren't dead

When we inlined cf_node_has_side_effects into node_is_dead, all the
conditions flipped and we forgot to flip one.  Fortunately, it doesn't
matter right now because no one uses this pass on shaders with more than
one function.

Fixes: b50465d197 "nir/dead_cf: Inline cf_node_has_side_effects"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
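
As a simplified illustration of how such a flip goes wrong (this is not the real nir_opt_dead_cf code, and every name below is hypothetical): when a "has side effects" predicate is inlined into an "is dead" predicate, every early return has to be inverted, including the one for call instructions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction kinds for the sketch. */
enum toy_instr_type { TOY_ALU, TOY_CALL, TOY_STORE };

/* Original helper: true if any instruction has a side effect. */
static bool
toy_node_has_side_effects(const enum toy_instr_type *instrs, size_t count)
{
   assert(instrs != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (instrs[i] == TOY_CALL)
         return true;
      if (instrs[i] == TOY_STORE)
         return true;
   }
   return false;
}

/* Inlined form: each return value is the negation of the one above.
 * Forgetting to flip the TOY_CALL case would report calls as dead. */
static bool
toy_node_is_dead(const enum toy_instr_type *instrs, size_t count)
{
   assert(instrs != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (instrs[i] == TOY_CALL)
         return false;
      if (instrs[i] == TOY_STORE)
         return false;
   }
   return true;
}
```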
5 years agovtn: create cast with type stride.
Dave Airlie [Mon, 27 May 2019 01:07:52 +0000 (11:07 +1000)]
vtn: create cast with type stride.

When creating function parameters, we create pointers from ssa
values.  This creates nir casts with stride 0, but we have
nowhere else to get this value from.  Later passes that lower
explicit io need this stride value to do the right thing.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
5 years agolist: add some iterator debug
Rob Clark [Sat, 25 May 2019 17:50:41 +0000 (10:50 -0700)]
list: add some iterator debug

Debugging use of unsafe iterators when you should have used the _safe
version sucks.  Add some DEBUG build support to catch and assert if
someone does that.

I didn't update the UPPERCASE versions of the iterators.  They should
probably be deprecated/removed.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
5 years agonir: Accept nir_var_mem_global in derefs used by phis
Caio Marcelo de Oliveira Filho [Thu, 30 May 2019 20:49:19 +0000 (13:49 -0700)]
nir: Accept nir_var_mem_global in derefs used by phis

This mode is used by PhysicalStorageBufferEXT storage class.

Fixes: 8bdf5a008b3 "nir: Allow derefs to be used as phi sources"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agointel/blorp: Use the hardware op for CCS ambiguate on gen10+
Jason Ekstrand [Tue, 15 May 2018 22:28:05 +0000 (15:28 -0700)]
intel/blorp: Use the hardware op for CCS ambiguate on gen10+

Cannonlake hardware adds a new resolve type in 3DSTATE_PS called
FAST_CLEAR_0 which does an ambiguate.  Now that the hardware can do it
directly, we should use that instead of binding the CCS as a render
target and doing it manually.  This was tested with a full Vulkan CTS
run on Cannonlake.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
5 years agoswr/rast: Enable ARB_GL_texture_buffer_range
Jan Zielinski [Mon, 27 May 2019 12:21:49 +0000 (14:21 +0200)]
swr/rast: Enable ARB_GL_texture_buffer_range

No significant changes in the code needed to enable
the extension. Just updating SWR capabilities
and the documentation

Reviewed-by: Alok Hota <alok.hota@intel.com>
5 years agoswr/rast: fix 32-bit compilation on Linux
Jan Zielinski [Mon, 27 May 2019 12:15:53 +0000 (14:15 +0200)]
swr/rast: fix 32-bit compilation on Linux

Removing unused but problematic code from simdlib header to fix
compilation problem on 32-bit Linux.

Reviewed-by: Alok Hota <alok.hota@intel.com>
5 years agointel/fs: Do a stalling MFENCE in endInvocationInterlock()
Jason Ekstrand [Wed, 22 May 2019 17:36:17 +0000 (12:36 -0500)]
intel/fs: Do a stalling MFENCE in endInvocationInterlock()

Fixes: 939312702e "i965: Add ARB_fragment_shader_interlock support"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel/fs,vec4: Use g0 as the header for MFENCE
Jason Ekstrand [Wed, 22 May 2019 17:20:01 +0000 (12:20 -0500)]
intel/fs,vec4: Use g0 as the header for MFENCE

We set header_present but then pass it some random garbage.  Give it g0
instead.  I'm not actually sure this does anything but g0 is the usual
header data and this is what the windows driver does so it seems like a
good idea.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoradv: enable transformFeedbackStreamsLinesTriangles
Samuel Pitoiset [Thu, 30 May 2019 08:08:48 +0000 (10:08 +0200)]
radv: enable transformFeedbackStreamsLinesTriangles

The driver should already support this without any changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: implement VK_EXT_sample_locations and disable it
Samuel Pitoiset [Thu, 16 May 2019 09:55:02 +0000 (11:55 +0200)]
radv: implement VK_EXT_sample_locations and disable it

Basically, this extension allows applications to use custom
sample locations. It doesn't support variable sample locations
during subpass. Note that we don't have to upload the user
sample locations because the spec doesn't allow this.

The extension is currently disabled because the driver needs to
support variable sample locations during layout transitions. The
depth decompress needs to know them and that's a bit invasive.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoiris: Avoid holding the lock while allocating pages.
Kenneth Graunke [Thu, 30 May 2019 07:04:38 +0000 (00:04 -0700)]
iris: Avoid holding the lock while allocating pages.

We only need the lock for:
1. Rummaging through the cache
2. Allocating VMA

We don't need it for alloc_fresh_bo(), which does GEM_CREATE, and also
SET_DOMAIN to allocate the underlying pages.  The idea behind calling
SET_DOMAIN was to avoid a lock in the kernel while allocating pages,
now we avoid our own global lock as well.

We do have to re-lock around VMA.  Hopefully this shouldn't happen too
much in practice because we'll find a cached BO in the right memzone
and not have to reallocate it.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
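
The locking pattern described above can be sketched roughly as follows. This is an assumption-laden toy, not the iris buffer manager: toy_bufmgr, toy_bo and toy_bo_alloc are hypothetical, calloc() stands in for the GEM_CREATE/SET_DOMAIN work, and a bump counter stands in for the VMA allocator.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct toy_bo {
   size_t size;
   unsigned long address;
};

struct toy_bufmgr {
   pthread_mutex_t lock;
   struct toy_bo *cached;        /* one-entry "cache" */
   unsigned long next_address;   /* trivial bump "VMA allocator" */
};

static struct toy_bo *
toy_bo_alloc(struct toy_bufmgr *mgr, size_t size)
{
   assert(size > 0);
   struct toy_bo *bo = NULL;

   /* 1. Rummage through the cache with the lock held. */
   pthread_mutex_lock(&mgr->lock);
   if (mgr->cached && mgr->cached->size >= size) {
      bo = mgr->cached;
      mgr->cached = NULL;
   }
   pthread_mutex_unlock(&mgr->lock);

   if (bo)
      return bo;   /* a cached BO already has an address */

   /* 2. Do the slow allocation without holding the lock. */
   bo = calloc(1, sizeof(*bo));
   if (!bo)
      return NULL;
   bo->size = size;

   /* 3. Re-take the lock only for the address allocation. */
   pthread_mutex_lock(&mgr->lock);
   bo->address = mgr->next_address;
   mgr->next_address += size;
   pthread_mutex_unlock(&mgr->lock);

   return bo;
}
```

The cost of this shape is the second lock acquisition on the fresh-allocation path, which the commit message accepts because a cache hit is expected to be the common case.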
5 years agoiris: Move SET_DOMAIN to alloc_fresh_bo()
Kenneth Graunke [Thu, 30 May 2019 06:40:20 +0000 (23:40 -0700)]
iris: Move SET_DOMAIN to alloc_fresh_bo()

Chris pointed out that the order between SET_DOMAIN and SET_TILING
doesn't matter, so we can just do the page allocation when creating
a new BO.  Simplifies the flow a bit.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
5 years agoiris: Be lazy about cleaning up purged BOs in the cache.
Kenneth Graunke [Thu, 30 May 2019 06:20:31 +0000 (23:20 -0700)]
iris: Be lazy about cleaning up purged BOs in the cache.

Mathias Fröhlich reported that commit 6244da8e23e5470d067680 crashes.
list_for_each_entry_safe is safe against removing the current entry,
but iris_bo_cache_purge_bucket was potentially removing next entries
too, which broke our saved next pointer.

To fix this, don't bother with the iris_bo_cache_purge_bucket step.
We just detected a single entry where the kernel has purged the BO's
memory, and so it isn't a usable entry for our cache.  We're about to
continue the search with the next BO.  If that one's purged, we'll
clean it up too.  And so on.

We may miss cleaning up purged BOs that are further down the list
after non-purged BOs...but that's probably fine.  We still have the
time-based cleaner (cleanup_bo_cache) which will take care of them
eventually, and the kernel's already freed their memory, so it's not
that harmful to have a few kicking around a little longer.

Fixes: 6244da8e23e iris: Dig through the cache to find a BO in the right memzone
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
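
A minimal sketch of the lazy approach, using a hand-rolled list rather than Mesa's util/list.h (all toy_* names are hypothetical): the walk saves the next pointer before touching the current node and only ever removes the current node, so the saved pointer cannot be invalidated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list node. */
struct toy_node {
   struct toy_node *prev, *next;
   bool purged;   /* stand-in for "the kernel purged this BO's memory" */
};

static void
toy_list_init(struct toy_node *head)
{
   head->prev = head->next = head;
}

static void
toy_list_add_tail(struct toy_node *head, struct toy_node *n)
{
   n->prev = head->prev;
   n->next = head;
   head->prev->next = n;
   head->prev = n;
}

static void
toy_list_remove(struct toy_node *n)
{
   assert(n->prev != NULL && n->next != NULL);
   n->prev->next = n->next;
   n->next->prev = n->prev;
   n->prev = n->next = NULL;
}

static size_t
toy_list_length(const struct toy_node *head)
{
   size_t n = 0;
   for (const struct toy_node *cur = head->next; cur != head; cur = cur->next)
      n++;
   return n;
}

/* Return the first usable entry, dropping purged entries as they are
 * encountered.  Only the current node is removed, never a later one. */
static struct toy_node *
toy_find_usable(struct toy_node *head)
{
   struct toy_node *cur = head->next;
   while (cur != head) {
      struct toy_node *next = cur->next;   /* saved before any removal */
      if (!cur->purged)
         return cur;
      toy_list_remove(cur);
      cur = next;
   }
   return NULL;
}
```

Purged entries that sit behind the first usable entry are left in place in this sketch, which mirrors the trade-off the message describes: a separate time-based cleanup is assumed to collect them later.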
5 years agoiris: Dig through the cache to find a BO in the right memzone
Kenneth Graunke [Mon, 27 May 2019 00:11:59 +0000 (17:11 -0700)]
iris: Dig through the cache to find a BO in the right memzone

This saves some util_vma thrash when the first entry in the cache
happens to be in a different memory zone, but one just a tiny bit
ahead is already there and instantly reusable.  Hopefully the cost
of a little extra searching won't break the bank - if it does, we
can consider having separate list heads or keeping a separate VMA
cache.

Improves OglDrvRes performance by 22%, restoring a regression from
deleting the bucket allocators in 694d1a08d3e5883d97d5352895f8431f.

Thanks to Clayton Craft for alerting me to the regression.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Tidy BO sizing code and comments
Kenneth Graunke [Sun, 26 May 2019 23:21:55 +0000 (16:21 -0700)]
iris: Tidy BO sizing code and comments

Buckets haven't been power of two sized in over a decade.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Move some field setting after we drop the lock.
Kenneth Graunke [Sun, 26 May 2019 23:11:46 +0000 (16:11 -0700)]
iris: Move some field setting after we drop the lock.

It's not much, but we may as well hold the lock for a bit less time.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Move cached BO allocation into a helper function.
Kenneth Graunke [Sun, 26 May 2019 22:52:56 +0000 (15:52 -0700)]
iris: Move cached BO allocation into a helper function.

There's enough going on here to warrant a helper.  This also simplifies
the control flow and eliminates the last non-error-case goto.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Fall back to fresh allocations of mapping for zero-memset fails.
Kenneth Graunke [Sun, 26 May 2019 20:48:42 +0000 (13:48 -0700)]
iris: Fall back to fresh allocations of mapping for zero-memset fails.

It is unlikely that we would fail to map a cached BO in order to zero
its contents.  When we did, we would free the first BO in the cache and
try again with the second.  It's possible that this next BO already had
a map setup, in which case we'd succeed.  But if it didn't, we'd likely
fail again in the same manner.

There's not much point in optimizing this case (and frankly, if we're
out of CPU-side VMA we should probably dump the cache entirely)...so
instead, just fall back to allocating a fresh BO from the kernel which
will already be zeroed so we don't have to try and map it.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Move fresh BO allocation into a helper function.
Kenneth Graunke [Sun, 26 May 2019 20:43:32 +0000 (13:43 -0700)]
iris: Move fresh BO allocation into a helper function.

There's enough going on here to warrant a helper.  More cleaning coming.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Do SET_TILING at a single point rather than in two places.
Kenneth Graunke [Sun, 26 May 2019 20:34:28 +0000 (13:34 -0700)]
iris: Do SET_TILING at a single point rather than in two places.

Both the from-cache and fresh-from-GEM cases were calling SET_TILING.
In the cached case, we would retry the allocation on failure, pitching
one BO from the cache each time.  This is silly, because the only time
it should fail is if the tiling or stride parameters are unacceptable,
which has nothing to do with the particular BO in question.  So there's
no point in retrying - we should simply fail the allocation.

This patch moves both calls to bo_set_tiling_internal() below the
cache/fresh split, so we have it at a single point in time instead
of two.

To preserve the ordering between SET_TILING and SET_DOMAIN, we move
that below as well.  (I am unsure if the order matters.)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Use the BO cache even for coherent buffers on non-LLC.
Kenneth Graunke [Sun, 26 May 2019 20:03:20 +0000 (13:03 -0700)]
iris: Use the BO cache even for coherent buffers on non-LLC.

We mark snooped BOs as non-reusable, so we never return them to the
cache.  This means that we'd need to call I915_GEM_SET_CACHING to make
any BO we find in the cache snooped.  But then again, any BO we freshly
allocate from the kernel will also be non-snooped, so it has the same
issue.  There's really no reason to skip the cache - we may as well use
it to avoid the I915_GEM_CREATE overhead.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Fix locking around vma_alloc in iris_bo_create_userptr
Kenneth Graunke [Sun, 26 May 2019 22:57:55 +0000 (15:57 -0700)]
iris: Fix locking around vma_alloc in iris_bo_create_userptr

util_vma needs to be protected by a lock.  All other callers of
vma_alloc and vma_free appear to be holding a lock already.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoiris: Fix lock/unlock mismatch for non-LLC coherent BO allocation.
Kenneth Graunke [Sun, 26 May 2019 19:58:42 +0000 (12:58 -0700)]
iris: Fix lock/unlock mismatch for non-LLC coherent BO allocation.

The goto jumped over the mtx_lock, but proceeded to hit the mtx_unlock.
We can simply set the bucket to NULL and it will skip the cache without
goto, and without messing up locking.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
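
The shape of this fix can be sketched as below, under the assumption of a toy manager with a lock-depth counter for checking balance (toy_mgr, toy_alloc and the skip_cache parameter are all hypothetical, not iris code). Clearing the bucket pointer keeps lock and unlock paired on every path, where a goto past the lock call would not.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_mgr {
   pthread_mutex_t lock;
   int lock_depth;   /* +1 on lock, -1 on unlock; must end at 0 */
};

static void
toy_lock(struct toy_mgr *mgr)
{
   pthread_mutex_lock(&mgr->lock);
   mgr->lock_depth++;
}

static void
toy_unlock(struct toy_mgr *mgr)
{
   assert(mgr->lock_depth > 0);   /* catches an unlock without a lock */
   mgr->lock_depth--;
   pthread_mutex_unlock(&mgr->lock);
}

/* Returns the cached value from the bucket, or -1 when the caller should
 * fall back to a fresh allocation. */
static int
toy_alloc(struct toy_mgr *mgr, const int *bucket, bool skip_cache)
{
   int result = -1;

   /* Instead of jumping past toy_lock() and still reaching toy_unlock(),
    * drop the bucket so the cache lookup is skipped. */
   if (skip_cache)
      bucket = NULL;

   toy_lock(mgr);
   if (bucket)
      result = *bucket;
   toy_unlock(mgr);

   return result;
}
```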
5 years agoradeonsi: fix timestamp queries for compute-only contexts
Marek Olšák [Mon, 27 May 2019 20:09:33 +0000 (16:09 -0400)]
radeonsi: fix timestamp queries for compute-only contexts

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
5 years agoChange a few frequented uses of DEBUG to !NDEBUG
Marek Olšák [Fri, 10 May 2019 01:04:23 +0000 (21:04 -0400)]
Change a few frequented uses of DEBUG to !NDEBUG

debugoptimized builds don't define NDEBUG, but they also don't define
DEBUG. We want to enable cheap debug code for these builds.
I only chose those occurrences that I care about.

Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
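
A small sketch of the distinction, assuming only the standard NDEBUG macro and a project-defined DEBUG macro as described above (the two function names are hypothetical): cheap checks key off !NDEBUG, so they are active whenever assert() is, while expensive checks stay behind DEBUG.

```c
#include <assert.h>
#include <stdbool.h>

/* Cheap debug code: enabled in every build where assertions are enabled. */
static bool
cheap_checks_enabled(void)
{
#ifndef NDEBUG
   return true;
#else
   return false;
#endif
}

/* Expensive debug code: enabled only when the build defines DEBUG. */
static bool
expensive_checks_enabled(void)
{
#ifdef DEBUG
   return true;
#else
   return false;
#endif
}
```

With neither macro defined, as the message says is the case for debugoptimized builds, the first function returns true and the second returns false.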
5 years agoiris: Re-emit Surface State Base Address when context is lost.
Kenneth Graunke [Wed, 29 May 2019 21:48:41 +0000 (14:48 -0700)]
iris: Re-emit Surface State Base Address when context is lost.

When we hit a GPU hang, we failed to reset Surface State Base Address
right away, and would keep hanging until we filled up the binder.  Then
we'd finally get it right after a lot of repeated stumbles.  Update it
right away so we hopefully hang fewer times before succeeding.

5 years agoiris: Enable nir_opt_large_constants
Jason Ekstrand [Tue, 28 May 2019 22:33:58 +0000 (17:33 -0500)]
iris: Enable nir_opt_large_constants

Shader-db results on Kaby Lake:

    total instructions in shared programs: 15306230 -> 15304726 (<.01%)
    instructions in affected programs: 4570 -> 3066 (-32.91%)
    helped: 16
    HURT: 0

    total cycles in shared programs: 361703436 -> 361680041 (<.01%)
    cycles in affected programs: 129388 -> 105993 (-18.08%)
    helped: 16
    HURT: 0

    LOST:   0
    GAINED: 2

The helped programs were in XCom 2, Deus Ex: Mankind Divided, and Kerbal
Space Program

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoiris: Don't assume UBO indices are constant
Jason Ekstrand [Wed, 29 May 2019 02:56:04 +0000 (21:56 -0500)]
iris: Don't assume UBO indices are constant

It will be true for the constant/system value buffer because they use a
constant zero but it's not true in general.  If we ever got here when
the source wasn't constant, nir_src_as_uint would assert.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
5 years agoiris: Move upload_ubo_ssbo_surf_state to iris_program.c
Jason Ekstrand [Tue, 28 May 2019 22:52:58 +0000 (17:52 -0500)]
iris: Move upload_ubo_ssbo_surf_state to iris_program.c

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir: silence three compiler warnings seen with MinGW
Brian Paul [Mon, 20 May 2019 13:33:14 +0000 (07:33 -0600)]
nir: silence three compiler warnings seen with MinGW

Silence two unused var warnings.  And init elem_size, elem_align to
zero to silence "maybe uninitialized" warnings.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agosvga: clamp max_const_buffers to SVGA_MAX_CONST_BUFS
Brian Paul [Mon, 20 May 2019 12:24:06 +0000 (06:24 -0600)]
svga: clamp max_const_buffers to SVGA_MAX_CONST_BUFS

In case the device reports 15 (or more) buffers.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
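
A clamp of this kind might look like the following sketch. TOY_MAX_CONST_BUFS and its value of 14 are assumptions made for illustration, chosen only to match the "15 (or more)" wording above; the real SVGA_MAX_CONST_BUFS definition is not shown in this log.

```c
#include <assert.h>

/* Assumed limit for illustration only. */
#define TOY_MAX_CONST_BUFS 14u

/* Clamp the device-reported count to what the driver can track. */
static unsigned
toy_clamp_const_buffers(unsigned reported)
{
   unsigned clamped =
      reported < TOY_MAX_CONST_BUFS ? reported : TOY_MAX_CONST_BUFS;
   assert(clamped <= TOY_MAX_CONST_BUFS);
   return clamped;
}
```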
5 years agoiris: Clone before calling nir_strip and serializing
Kenneth Graunke [Tue, 28 May 2019 22:39:24 +0000 (15:39 -0700)]
iris: Clone before calling nir_strip and serializing

This is non-destructive and leaves the debugging information in place.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoiris: Only store the SHA1 of the NIR in iris_uncompiled_shader
Kenneth Graunke [Tue, 28 May 2019 22:34:52 +0000 (15:34 -0700)]
iris: Only store the SHA1 of the NIR in iris_uncompiled_shader

Jason pointed out that we don't need to keep an entire copy of the
serialized NIR around, we just need the SHA1.  This does change our
disk cache key to be taking a SHA1 of a SHA1, which is a bit odd,
but should work out and be faster and use less memory.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agospirv: Change spirv_to_nir() to return a nir_shader
Caio Marcelo de Oliveira Filho [Sun, 19 May 2019 07:22:17 +0000 (00:22 -0700)]
spirv: Change spirv_to_nir() to return a nir_shader

spirv_to_nir() returned the nir_function corresponding to the
entrypoint, as a way to identify it.  There's now a bool is_entrypoint
in nir_function and also a helper function to get the entry_point from
a nir_shader.

The return type reflects better what the function name suggests.  It
also helps drivers avoid the mistake of reusing internal shader
references after running NIR_PASS on it.  When using NIR_TEST_CLONE or
NIR_TEST_SERIALIZE, those would be invalidated right in the first pass
executed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: Don't re-use entry_point pointer from spirv_to_nir
Caio Marcelo de Oliveira Filho [Sun, 19 May 2019 07:11:37 +0000 (00:11 -0700)]
radv: Don't re-use entry_point pointer from spirv_to_nir

Replace its uses with checking for is_entrypoint and calling
nir_shader_get_entrypoint().

This is a preparation to change spirv_to_nir() return type.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoglspirv: Don't re-use entry_point pointer from spirv_to_nir
Caio Marcelo de Oliveira Filho [Sun, 19 May 2019 06:57:25 +0000 (23:57 -0700)]
glspirv: Don't re-use entry_point pointer from spirv_to_nir

Replace its use with checking for is_entrypoint.

This is a preparation to change spirv_to_nir() return type.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoturnip: Don't re-use entry_point pointer from spirv_to_nir
Caio Marcelo de Oliveira Filho [Sun, 19 May 2019 06:55:01 +0000 (23:55 -0700)]
turnip: Don't re-use entry_point pointer from spirv_to_nir

Replace its uses with nir_shader_get_entrypoint(), and change the
helper function to return nir_shader *.

This is a preparation to change spirv_to_nir() return type.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agovirgl: fix readback with pending transfers
Chia-I Wu [Tue, 21 May 2019 23:21:27 +0000 (23:21 +0000)]
virgl: fix readback with pending transfers

When readback is true, and there are pending writes in the transfer
queue, we should flush to avoid reading back outdated data.  This
fixes piglit arb_copy_buffer/dlist and a subtest of
arb_copy_buffer/data-sync.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
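
The rule stated above can be sketched as follows. This is a toy model, not virgl code: toy_transfer_queue, toy_queue_flush and toy_prepare_map are hypothetical names, and the queue is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_transfer_queue {
   int pending_writes;   /* writes queued but not yet sent to the host */
   int flush_count;
};

static void
toy_queue_flush(struct toy_transfer_queue *q)
{
   q->pending_writes = 0;
   q->flush_count++;
}

/* Prepare to map a resource.  Returns true when a flush was needed so
 * that the readback observes the queued writes. */
static bool
toy_prepare_map(struct toy_transfer_queue *q, bool readback)
{
   assert(q != NULL);

   if (readback && q->pending_writes > 0) {
      toy_queue_flush(q);
      return true;
   }
   return false;
}
```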
5 years agonir: Allow derefs to be used as phi sources
Caio Marcelo de Oliveira Filho [Tue, 7 May 2019 09:03:59 +0000 (02:03 -0700)]
nir: Allow derefs to be used as phi sources

It is possible and valid for a pointer to be selected based on a
conditional before being used and, depending on the mode, those cases will
result in a phi with derefs as sources.

To achieve this, we don't rematerialize derefs that are used by phis.
As a consequence, when converting from SSA to regs, we may have phis
that come from different blocks and are used by phis.  We now convert
those to regs too.

Validation was added to ensure only derefs of certain modes can be
used as phi sources.  No extra validation is needed for the presence
of a cast: any instruction that uses derefs will validate that the
deref chain is complete (ending in a cast or a var).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoradeonsi: Fix editorconfig
Connor Abbott [Fri, 17 May 2019 13:04:21 +0000 (15:04 +0200)]
radeonsi: Fix editorconfig

At least on vim, indenting doesn't work without this. Copied from
src/amd/vulkan.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa/main: clean up extension-check for GL_SAMPLE_MASK
Erik Faye-Lund [Mon, 25 Feb 2019 11:23:27 +0000 (12:23 +0100)]
mesa/main: clean up extension-check for GL_SAMPLE_MASK

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agomesa/main: clean up extension-check for GL_SAMPLE_SHADING
Erik Faye-Lund [Mon, 25 Feb 2019 11:21:43 +0000 (12:21 +0100)]
mesa/main: clean up extension-check for GL_SAMPLE_SHADING

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agomesa/main: correct extension-checks for GL_PRIMITIVE_RESTART_FIXED_INDEX
Erik Faye-Lund [Mon, 25 Feb 2019 12:34:41 +0000 (13:34 +0100)]
mesa/main: correct extension-checks for GL_PRIMITIVE_RESTART_FIXED_INDEX

This shouldn't be allowed in GLES 1/2.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agomesa/main: correct extension-checks for GL_BLEND_ADVANCED_COHERENT_KHR
Erik Faye-Lund [Mon, 25 Feb 2019 12:27:07 +0000 (13:27 +0100)]
mesa/main: correct extension-checks for GL_BLEND_ADVANCED_COHERENT_KHR

KHR_blend_equation_advanced_coherent isn't exposed on OpenGL ES 1.x, so
we shouldn't allow its enums there either.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agomesa/main: correct extension-checks for GL_FRAMEBUFFER_SRGB
Erik Faye-Lund [Mon, 25 Feb 2019 12:18:05 +0000 (13:18 +0100)]
mesa/main: correct extension-checks for GL_FRAMEBUFFER_SRGB

This enum shouldn't be allowed on OpenGL ES 1.x, so let's instead
use the extension-helpers, and check for desktop and gles extensions
separately.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agomesa/main: correct extension-checks for MESA_tile_raster_order
Erik Faye-Lund [Mon, 25 Feb 2019 12:14:50 +0000 (13:14 +0100)]
mesa/main: correct extension-checks for MESA_tile_raster_order

This extension isn't enabled for GLES 1.x, so we shouldn't allow the
state there. Let's use the extension-helpers instead of CHECK_EXTENSION
for this.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agomesa/main: make the CONSERVATIVE_RASTERIZATION_NV checks consistent
Erik Faye-Lund [Mon, 25 Feb 2019 12:28:39 +0000 (13:28 +0100)]
mesa/main: make the CONSERVATIVE_RASTERIZATION_NV checks consistent

This just makes the logic of the checks for this enum the same for
gl{Enable,Disable} and for glIsEnabled. They are already functionally
the same, so this is just a minor code-cleanup.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agomesa/main: make the PRIMITIVE_RESTART_NV checks consistent
Erik Faye-Lund [Mon, 25 Feb 2019 11:06:23 +0000 (12:06 +0100)]
mesa/main: make the PRIMITIVE_RESTART_NV checks consistent

{En,Dis}ableClientState(PRIMITIVE_RESTART_NV) should only work on
compatibility contexts. While we're at it, modernize the code a bit,
by using the extension helpers instead of open-coding.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agoradv: use view format when selecting the resolve path for subpasses
Samuel Pitoiset [Tue, 28 May 2019 09:03:29 +0000 (11:03 +0200)]
radv: use view format when selecting the resolve path for subpasses

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: always use view format when performing subpass resolves
Samuel Pitoiset [Tue, 28 May 2019 08:47:12 +0000 (10:47 +0200)]
radv: always use view format when performing subpass resolves

It makes sense to use the image view formats when resolving
inside subpasses, while we have to use the image formats for
normal resolves.

Original patch by Philip Rebohle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110348
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: sync before resetting a pool if there is active pending queries
Samuel Pitoiset [Tue, 28 May 2019 09:08:32 +0000 (11:08 +0200)]
radv: sync before resetting a pool if there is active pending queries

Make sure to sync all previous work if the given command buffer
has pending active queries. Otherwise the GPU might write queries
data after the reset operation.

This fixes a bunch of new dEQP-VK.query_pool.* CTS failures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agointel/decoder: Use get_state_size() over guessed counts in more cases
Kenneth Graunke [Thu, 23 May 2019 01:11:50 +0000 (18:11 -0700)]
intel/decoder: Use get_state_size() over guessed counts in more cases

This makes the following packets use actual driver provided sizes rather
than guessing an arbitrary number:

  - CC_VIEWPORT
  - SF_CLIP_VIEWPORT
  - BLEND_STATE
  - COLOR_CALC_STATE
  - SCISSOR_RECT

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
5 years agomeson: Link Gallium drivers with ld_args_build_id
Mike Lothian [Tue, 28 May 2019 11:26:21 +0000 (12:26 +0100)]
meson: Link Gallium drivers with ld_args_build_id

Link all Gallium drivers with ld_args_build_id to prevent failures in
Iris that uses GNU_BUILD_ID

Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=110757
Fixes: 4756864cdc5f "iris: Start wiring up on-disk shader cache"
Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agonir/lower_non_uniform: safely iterate over blocks
Lionel Landwerlin [Tue, 28 May 2019 07:52:50 +0000 (08:52 +0100)]
nir/lower_non_uniform: safely iterate over blocks

This fixes a problem where the same instruction gets replaced twice.
This was happening when the replaced instruction would be at the end
of a block.

Replacement of :

   if ssa_8 {
                ....
      intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */
   }

Would be :

   if ssa_8 {
      loop {
         vec1 32 ssa_47 = intrinsic read_first_invocation (ssa_44) ()
         vec1 1 ssa_48 = ieq ssa_47, ssa_44
         if ssa_48 {
            loop {
               vec1 32 ssa_49 = intrinsic read_first_invocation (ssa_44) ()
               vec1 1 ssa_50 = ieq ssa_49, ssa_44
               if ssa_50 {
                  intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */
                  break
               } else {
        ....
   }

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3bd545764151 ("nir: Add a lowering pass for non-uniform resource access")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoradv: allocate more space in the CS when emitting events
Samuel Pitoiset [Tue, 28 May 2019 10:58:05 +0000 (12:58 +0200)]
radv: allocate more space in the CS when emitting events

If the driver waits for CP DMA to be idle and emits an EOP event,
we need more space.

This fixes a crash with Quake Champions.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoiris: Ask st to vectorize our IO.
Kenneth Graunke [Tue, 21 May 2019 22:18:25 +0000 (15:18 -0700)]
iris: Ask st to vectorize our IO.

(Technically this is common code, but it doesn't affect i965 or anv.)

Improves performance of GFXBench5/gl_tess_off on Skylake GT4e at 1080p
by 9.3933% +/- 0.0305157% by eliminating all spilling in the GS.

Improves performance of GFXBench5/gl_4_off (Car Chase) on Skylake GT4e
at 1080p by 0.325208% +/- 0.0842233% (n=18).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agost/nir: Re-vectorize shader IO
Kenneth Graunke [Thu, 11 Apr 2019 19:28:48 +0000 (12:28 -0700)]
st/nir: Re-vectorize shader IO

We scalarize IO to enable further optimizations, such as propagating
constant components across shaders, eliminating dead components, and
so on.  This patch attempts to re-vectorize those operations after
the varying optimizations are done.

Intel GPUs are a scalar architecture, but IO operations work on whole
vec4's at a time, so we'd prefer to have a single IO load per vector
rather than 4 scalar IO loads.  This re-vectorization can help a lot.

Broadcom GPUs, however, really do want scalar IO.  radeonsi may want
this, or may want to leave it to LLVM.  So, we make a new flag in the
NIR compiler options struct, and key it off of that, allowing drivers
to pick.  (It's a bit awkward because we have per-stage settings, but
this is about IO between two stages...but I expect drivers to globally
prefer one way or the other.  We can adjust later if needed.)
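
As a rough illustration of the trade-off described above, the sketch below counts how many IO loads remain when scalar loads to the same location are merged back into one vector load, gated by a per-driver option. This is a toy model, not the st/nir pass: `toy_compiler_options`, `toy_scalar_load` and `toy_count_io_loads` are hypothetical names, and the input is assumed to be sorted by location.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-driver compiler option. */
struct toy_compiler_options {
   bool vectorize_io;
};

/* One scalar IO load: which varying slot and which component (0..3). */
struct toy_scalar_load {
   unsigned location;
   unsigned component;
};

/* Returns the number of IO load instructions left after the optional
 * re-vectorization step.  Loads are assumed to be sorted by location.
 */
size_t
toy_count_io_loads(const struct toy_compiler_options *opts,
                   const struct toy_scalar_load *loads, size_t n)
{
   assert(opts != NULL);

   if (!opts->vectorize_io)
      return n; /* driver prefers scalar IO: leave the loads alone */

   size_t vec_loads = 0;
   for (size_t i = 0; i < n; i++) {
      /* A new vector load starts whenever the location changes. */
      if (i == 0 || loads[i].location != loads[i - 1].location)
         vec_loads++;
   }
   return vec_loads;
}
```

With four component loads at one location and two at another, a driver that sets the option ends up with two loads, while a driver that leaves it unset keeps all six.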

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: Prevent classic swrast crash on a surfaceless context v2.
Mathias Fröhlich [Wed, 8 May 2019 06:07:24 +0000 (08:07 +0200)]
mesa: Prevent classic swrast crash on a surfaceless context v2.

This fixes the egl_mesa_platform_surfaceless piglit test as well
as the new egl_ext_device_base piglit test on classic swrast.

v2: Fix swrast surfaceless contexts on the driver side.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years agoradv: add radv_get_resolve_pipeline() in the compute path
Samuel Pitoiset [Mon, 27 May 2019 15:42:36 +0000 (17:42 +0200)]
radv: add radv_get_resolve_pipeline() in the compute path

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: cleanup the compute resolve path for subpass
Samuel Pitoiset [Mon, 27 May 2019 15:42:35 +0000 (17:42 +0200)]
radv: cleanup the compute resolve path for subpass

This makes use of radv_meta_resolve_compute_image() by filling
a VkImageResolve region instead of duplicating code.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi: add drirc workaround for American Truck Simulator
Timothy Arceri [Mon, 27 May 2019 01:57:27 +0000 (11:57 +1000)]
radeonsi: add drirc workaround for American Truck Simulator

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110711

5 years agoRevert "st/mesa: expose 0 shader binary formats for compat profiles for Qt"
Timothy Arceri [Mon, 27 May 2019 10:07:41 +0000 (20:07 +1000)]
Revert "st/mesa: expose 0 shader binary formats for compat profiles for Qt"

This reverts commit 55376cb31e2f495a4d872b4ffce2135c3365b873.

It's been over a year and both Qt 5.9.5 and 5.11.0 contained a fix for the
original issue. It seems i965 only ever applied this workaround to the
18.0 branch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoanv: fix apply_pipeline_layout pass for arrays of YCbCr descriptors
Lionel Landwerlin [Fri, 24 May 2019 12:17:43 +0000 (13:17 +0100)]
anv: fix apply_pipeline_layout pass for arrays of YCbCr descriptors

When using the binding tables to access arrays of YCbCr descriptors, we
did not consider the offset of the accessed element. We can't do a
simple multiplication because the binding table entries are tightly packed.

For example element 0 of the array could use 2 entries/planes and
element 1 could use 2 entries/planes.
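
A minimal sketch of why a running sum is needed, assuming each array element may occupy a different number of tightly packed entries. `toy_first_entry` is a hypothetical helper, not the anv pass, and the plane counts used with it are purely illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index of the first binding-table entry used by
 * array element `elem`, when element i occupies planes[i] entries and
 * the entries of all elements are tightly packed one after another.
 */
unsigned
toy_first_entry(const unsigned *planes, size_t count, size_t elem)
{
   assert(elem < count);

   unsigned offset = 0;
   for (size_t i = 0; i < elem; i++)
      offset += planes[i]; /* skip every entry of the earlier elements */
   return offset;
}
```

For illustrative plane counts of 2, 3 and 1, the elements start at entries 0, 2 and 5. Multiplying the element index by any single per-element size cannot reproduce that sequence.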

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3bb8768b9d62 ("anv: toggle on support for VK_EXT_ycbcr_image_arrays")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
5 years agoradeonsi: clean up winsys creation
Marek Olšák [Fri, 17 May 2019 19:36:57 +0000 (15:36 -0400)]
radeonsi: clean up winsys creation

- unify the code
- choose radeon or amdgpu based on the DRM version, not based on which one
  succeeds first

5 years agoradeonsi: allow query functions for compute-only contexts
Marek Olšák [Mon, 13 May 2019 22:39:44 +0000 (18:39 -0400)]
radeonsi: allow query functions for compute-only contexts

5 years agoac: treat Mullins as Kabini, remove the enum
Marek Olšák [Wed, 15 May 2019 18:33:34 +0000 (14:33 -0400)]
ac: treat Mullins as Kabini, remove the enum

it's the same design

5 years agoetnaviv: rs: choose clear format based on block size
Christian Gmeiner [Sun, 26 May 2019 19:06:51 +0000 (21:06 +0200)]
etnaviv: rs: choose clear format based on block size

Fixes the following piglit test and does not introduce any regressions:
  spec@ext_packed_depth_stencil@fbo-depth-gl_depth24_stencil8-blit

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
5 years agolima/ppir: implement discard and discard_if
Vasily Khoruzhick [Sat, 11 May 2019 02:17:40 +0000 (19:17 -0700)]
lima/ppir: implement discard and discard_if

This commit also adds codegen for branch since we need it
for discard_if.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agoradv: ignore the loadOp if the first use of an attachment is a resolve
Samuel Pitoiset [Mon, 27 May 2019 08:20:03 +0000 (10:20 +0200)]
radv: ignore the loadOp if the first use of an attachment is a resolve

Based on ANV.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: always dirty the framebuffer when restoring a subpass
Samuel Pitoiset [Thu, 23 May 2019 12:57:07 +0000 (14:57 +0200)]
radv: always dirty the framebuffer when restoring a subpass

The old code was not wrong because the transitions performed
after the resolves should re-emit the framebuffer if needed.

This change is mostly a no-op but it improves consistency
regarding other meta operations that need to save/restore subpasses.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: add radv_clear_htile() helper
Samuel Pitoiset [Wed, 22 May 2019 13:38:47 +0000 (15:38 +0200)]
radv: add radv_clear_htile() helper

This helper will be useful for clearing HTILE after some
depth/stencil resolves.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoanv/android: fix missing dependencies issue during parallel build
Chenglei Ren [Thu, 23 May 2019 03:22:00 +0000 (11:22 +0800)]
anv/android: fix missing dependencies issue during parallel build

The libmesa_anv_gen* modules require anv_extensions.h; this patch makes
sure it gets generated as a dependency before building them.

Signed-off-by: Chenglei Ren <chenglei.ren@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
5 years agoradv: tidy up GetQueryPoolResults for occlusion queries
Samuel Pitoiset [Wed, 22 May 2019 15:46:33 +0000 (17:46 +0200)]
radv: tidy up GetQueryPoolResults for occlusion queries

Just move the block that checks the availability bit into the
switch like other query types.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoiris: Don't flag IRIS_DIRTY_URB after BLORP operations unless it changed
Kenneth Graunke [Fri, 24 May 2019 07:16:11 +0000 (00:16 -0700)]
iris: Don't flag IRIS_DIRTY_URB after BLORP operations unless it changed

We already flag IRIS_DIRTY_URB when we change it, but we were
additionally flagging it on every BLORP operation, even if we didn't.
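
The idea can be sketched as follows, under the assumption that the URB configuration is tracked as a single value in the context. `toy_context`, `TOY_DIRTY_URB` and `toy_set_urb_size` are hypothetical names for illustration, not iris code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_DIRTY_URB (1u << 0)

struct toy_context {
   uint32_t dirty;     /* bitmask of state that must be re-emitted */
   unsigned urb_size;  /* last URB configuration that was programmed */
};

/* Flag the URB state as dirty only when the configuration really changes.
 * A caller that ends up with the same configuration costs nothing.
 */
void
toy_set_urb_size(struct toy_context *ctx, unsigned size)
{
   assert(ctx != NULL);

   if (ctx->urb_size != size) {
      ctx->urb_size = size;
      ctx->dirty |= TOY_DIRTY_URB;
   }
}
```

An operation that leaves the configuration untouched then no longer forces a redundant re-emit on the next draw.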

5 years agoRevert "mesa: unreference current winsys buffers when unbinding winsys buffers"
Dave Airlie [Sun, 26 May 2019 23:36:28 +0000 (09:36 +1000)]
Revert "mesa: unreference current winsys buffers when unbinding winsys buffers"

This reverts commit 12bf7cfecf52083c484602f971738475edfe497e.

This commit caused lots of problems:
https://bugs.freedesktop.org/show_bug.cgi?id=110721
https://bugs.freedesktop.org/show_bug.cgi?id=110761

Fixes: 12bf7cfecf52 ("mesa: unreference current winsys buffers when unbinding winsys buffers")
Pushing without review as we need to get it into next stable.

5 years agopanfrost/midgard: Implement fneg/fabs/fsat
Alyssa Rosenzweig [Sun, 26 May 2019 03:16:37 +0000 (03:16 +0000)]
panfrost/midgard: Implement fneg/fabs/fsat

Fix a regression I inadvertently caused by acking typeless movs before
implementing/pushing this *whistles*

Nothing to see here, move along folks.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agolima: fix lima_blit with non-zero level source resource
Qiang Yu [Thu, 16 May 2019 11:38:01 +0000 (19:38 +0800)]
lima: fix lima_blit with non-zero level source resource

lima_blit can blit between resources at different levels.
When blitting from a source with level != 0, it samples from that
level of the resource as a texture.

The current texture setup does not respect the level when no mipmap
filter is used.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
5 years agolima: fix render to non-zero level texture
Qiang Yu [Wed, 15 May 2019 09:35:19 +0000 (17:35 +0800)]
lima: fix render to non-zero level texture

The current implementation does not respect the level of the surface
being rendered to.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
5 years agoeditorconfig: Fix meson style
Dylan Baker [Thu, 23 May 2019 17:21:05 +0000 (10:21 -0700)]
editorconfig: Fix meson style

The syntax was wrong, resulting in it not working at all.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agovirgl: remove an incorrect check in virgl_res_needs_flush
Chia-I Wu [Wed, 8 May 2019 21:53:47 +0000 (14:53 -0700)]
virgl: remove an incorrect check in virgl_res_needs_flush

Imagine this

  resource_copy_region(ctx, dst, ..., src, ...);
  transfer_map(ctx, src, 0, PIPE_TRANSFER_WRITE, ...);

at the beginning of a cmdbuf.  We need to flush in transfer_map so
that the transfer is not reordered before the resource copy.  The
check for "vctx->num_draws == 0 && vctx->num_compute == 0" is not
enough.  Removing the optimization entirely.
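
A toy model of the two checks, assuming the old shortcut returned early whenever no draw or compute call had been queued. The struct and function names are hypothetical, and the real virgl_res_needs_flush may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_ctx {
   unsigned num_draws;
   unsigned num_compute;
   bool cmdbuf_references_res; /* e.g. set by a queued resource copy */
};

/* Assumed shape of the old check: skip the flush when nothing has been
 * drawn or dispatched yet, even if a queued copy still uses the resource.
 */
bool
toy_needs_flush_old(const struct toy_ctx *c)
{
   assert(c != NULL);
   if (c->num_draws == 0 && c->num_compute == 0)
      return false;
   return c->cmdbuf_references_res;
}

/* Without the shortcut, only the resource tracking decides. */
bool
toy_needs_flush_new(const struct toy_ctx *c)
{
   assert(c != NULL);
   return c->cmdbuf_references_res;
}
```

In the scenario above (a copy queued, no draws, no compute), the old variant reports that no flush is needed while the new one asks for it.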

Because of the more precise resource tracking in the previous
commit, I hope the performance impact is minimized.  We will have to
go with perfect resource tracking, or attempt a more limited
optimization, if there are specific cases we really need to optimize
for.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agovirgl: reemit resources on first draw/clear/compute
Chia-I Wu [Wed, 8 May 2019 21:19:08 +0000 (14:19 -0700)]
virgl: reemit resources on first draw/clear/compute

This gives us more precise resource tracking.  It can be beneficial
because glFlush is often followed by state changes.  We don't want
to reemit resources that are going to be unbound.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agovirgl: add missing emit_res for SO targets
Chia-I Wu [Wed, 8 May 2019 21:33:28 +0000 (14:33 -0700)]
virgl: add missing emit_res for SO targets

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agogallivm: fix default cbuf info.
Roland Scheidegger [Fri, 24 May 2019 00:41:12 +0000 (02:41 +0200)]
gallivm: fix default cbuf info.

The default null_output really needs to be static, otherwise the values
we'll eventually get later are doubly random (they are not initialized,
and even if they were it's a pointer to a local stack variable).
VMware bug 2349556.
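
The storage-duration point can be sketched in plain C. The names below (`toy_output_info`, `toy_output_or_default`, `toy_cbuf_format`) are hypothetical and only mirror the shape of the problem, not the gallivm code.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CBUFS 8

struct toy_output_info {
   unsigned cbuf_format[TOY_MAX_CBUFS];
};

/* Returns `info` if the caller supplied one, otherwise a shared default.
 * The default has static storage duration, so it is zero-initialized and
 * the returned pointer stays valid after this function returns.  A plain
 * local (automatic) variable here would be uninitialized, and its address
 * would dangle as soon as the function returned.
 */
const struct toy_output_info *
toy_output_or_default(const struct toy_output_info *info)
{
   static const struct toy_output_info null_output = { { 0 } };
   return info ? info : &null_output;
}

/* Reads one entry later on, long after the default was handed out. */
unsigned
toy_cbuf_format(const struct toy_output_info *info, unsigned index)
{
   assert(index < TOY_MAX_CBUFS);
   return toy_output_or_default(info)->cbuf_format[index];
}
```

Every call without caller-supplied info yields the same address and reads zeros, which is the behavior the static qualifier guarantees.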

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
5 years agoscons: fix build with llvm 9.
Roland Scheidegger [Fri, 24 May 2019 01:46:07 +0000 (03:46 +0200)]
scons: fix build with llvm 9.

The x86asmprinter component is gone, and things seem to work by just
removing it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110707

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
5 years agopanfrost: Dereference sampled texture
Tomeu Vizoso [Thu, 23 May 2019 08:09:33 +0000 (10:09 +0200)]
panfrost: Dereference sampled texture

We are currently leaking resources if they were sampled from. Once we
are done with a sampler, we should dereference that resource.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Avoid pulling Docker image on every run
Tomeu Vizoso [Mon, 13 May 2019 07:11:27 +0000 (09:11 +0200)]
panfrost: ci: Avoid pulling Docker image on every run

Jump over the container stage if we haven't changed any of the files
that are involved in building the container images.

This saves 1-2 minutes in each run and helps conserve resources.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agonir: Drop imov/fmov in favor of one mov instruction
Jason Ekstrand [Mon, 6 May 2019 16:45:46 +0000 (11:45 -0500)]
nir: Drop imov/fmov in favor of one mov instruction

The difference between imov and fmov has been a constant source of
confusion in NIR for years.  No one really knows why we have two or when
to use one vs. the other.  The real reason is that they do different
things in the presence of source and destination modifiers.  However,
without modifiers (which many back-ends don't have), they are identical.
Now that we've reworked nir_lower_to_source_mods to leave one abs/neg
instruction in place rather than replacing them with imov or fmov
instructions, we don't need two different instructions at all anymore.
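
To illustrate the distinction, here is a small sketch on raw 32-bit values, assuming IEEE 754 single-precision floats. The `toy_*` functions are hypothetical models of a move with and without a negate modifier, not NIR code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Plain move: the same bit copy whether the value is viewed as an
 * integer or as a float.
 */
uint32_t
toy_mov(uint32_t bits)
{
   return bits;
}

/* Move with a float negate modifier: flips the sign bit. */
uint32_t
toy_fmov_neg(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof f);
   f = -f;
   memcpy(&bits, &f, sizeof f);
   return bits;
}

/* Move with an integer negate modifier: two's complement negation. */
uint32_t
toy_imov_neg(uint32_t bits)
{
   return 0u - bits;
}
```

For the bit pattern of 1.0f (0x3f800000), the float negate gives 0xbf800000 and the integer negate gives 0xc0800000, while the unmodified move returns the input unchanged in both interpretations.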

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Rob Clark <robdclark@chromium.org>
5 years agonir/builder: Merge nir_[if]mov_alu into one nir_mov_alu helper
Jason Ekstrand [Mon, 6 May 2019 16:26:27 +0000 (11:26 -0500)]
nir/builder: Merge nir_[if]mov_alu into one nir_mov_alu helper

Unless source modifiers are present, fmov and imov are the same.
There's no good reason for having two helpers.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agonir/lower_to_source_mods: Stop turning add, sat, and neg into mov
Jason Ekstrand [Mon, 6 May 2019 16:17:40 +0000 (11:17 -0500)]
nir/lower_to_source_mods: Stop turning add, sat, and neg into mov

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agonir/source_mods: Add helpers for setting source modifiers
Jason Ekstrand [Mon, 6 May 2019 20:30:36 +0000 (15:30 -0500)]
nir/source_mods: Add helpers for setting source modifiers

It's potentially a tiny bit less efficient but the helpers make it much
easier to sort out the rules for updating source modifiers.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agointel: Implement abs, neg, and sat in the back-end
Jason Ekstrand [Mon, 6 May 2019 16:16:25 +0000 (11:16 -0500)]
intel: Implement abs, neg, and sat in the back-end

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agointel/nir: Call alu_to_scalar one last time before out-of-ssa
Jason Ekstrand [Mon, 6 May 2019 17:25:29 +0000 (12:25 -0500)]
intel/nir: Call alu_to_scalar one last time before out-of-ssa

A few of our very late passes can end up generating vectors accidentally
so we need to get rid of them.  The only known case of this is the ffma
peephole which generates fneg and fabs as vectors.  Currently, they're
not a problem because they get turned into fmov which the back-end
compiler knows how to handle as a vector.  That's about to change.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>