mesa.git
5 years agopanfrost: Cleanup after scoreboarding
Alyssa Rosenzweig [Mon, 15 Jul 2019 18:32:23 +0000 (11:32 -0700)]
panfrost: Cleanup after scoreboarding

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Allocate UBOs on the stack, not the heap
Alyssa Rosenzweig [Mon, 15 Jul 2019 18:30:35 +0000 (11:30 -0700)]
panfrost: Allocate UBOs on the stack, not the heap

Saves a call to calloc (the maximum size is small and known at
compile-time) and fixes a leak.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agonir,intel: Add support for lowering 64-bit nir_opt_extract_*
Jason Ekstrand [Mon, 15 Jul 2019 15:31:49 +0000 (10:31 -0500)]
nir,intel: Add support for lowering 64-bit nir_opt_extract_*

We need this when doing full software 64-bit emulation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110309
Fixes: cbad201c2b3 "nir/algebraic: Add missing 64-bit extract_[iu]8..."
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agonir/opt_if: Clean up single-src phis in opt_if_loop_terminator
Jason Ekstrand [Wed, 10 Jul 2019 20:14:42 +0000 (15:14 -0500)]
nir/opt_if: Clean up single-src phis in opt_if_loop_terminator

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111071
Fixes: 2a74296f24ba "nir: add opt_if_loop_terminator()"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoradeonsi: verify buffer_offset value before using it
Pierre-Eric Pelloux-Prayer [Fri, 5 Jul 2019 12:57:29 +0000 (14:57 +0200)]
radeonsi: verify buffer_offset value before using it

This buffer_offset can come directly from the application (e.g. when using
glVertexAttribPointer) and can contain an invalid value.

st_atom_array already makes sure that it's not negative, so all that's left
is to verify that it's smaller than the buffer size.

Bugs related to this issue:

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105251#c52
Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=109693
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
5 years agost/mesa: verify that vertex buffer offset isn't negative
Pierre-Eric Pelloux-Prayer [Fri, 5 Jul 2019 12:51:23 +0000 (14:51 +0200)]
st/mesa: verify that vertex buffer offset isn't negative

For drivers supporting PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET the buffer_offset value
will be interpreted as a signed int.

An example of application code causing a negative offset:

            float b[] = { ... }; // 3 float for pos, 3 for color
            glBufferData(GL_ARRAY_BUFFER, ..., b, ...);
            glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 6 * sizeof(float), 0);
            glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, 6 * sizeof(float), &b[3]);
                                                                                ^
                                                                    should be 3 * sizeof(float)

The offset is a ptr so when interpreted as a signed int it can be negative.

This commit adds a verification that (int) buffer_offset is not negative - this would
indicate an application bug. Since it's too late to emit a GL_INVALID_VALUE error,
we replace the negative offset by 0 and emit a debug message.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
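
A minimal sketch of the clamp described in this commit, not the st/mesa
code itself: the function name, the uintptr_t parameter and the message
text are assumptions made for illustration. It keeps the low 32 bits of
the application-supplied value, treats a set sign bit as a negative
offset, and substitutes 0.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not a Mesa function. */
static uint32_t
sanitize_buffer_offset(uintptr_t app_value)
{
   /* Keep the low 32 bits, as a 32-bit buffer_offset field would. */
   uint32_t offset = (uint32_t)app_value;

   /* Bit 31 set means the value is negative when read as a signed int. */
   if (offset & UINT32_C(0x80000000)) {
      fprintf(stderr, "negative vertex buffer offset, using 0 instead\n");
      return 0;
   }

   return offset;
}

/* Usage sketch: a sane byte offset passes through unchanged, while a
 * pointer-like value with the sign bit set is replaced by 0. */
static void
sanitize_buffer_offset_example(void)
{
   assert(sanitize_buffer_offset(3 * sizeof(float)) == 3 * sizeof(float));
   assert(sanitize_buffer_offset((uintptr_t)UINT32_C(0xfffffff4)) == 0);
}
```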
5 years agost/mesa: don't invalidate a buffer range that is mapped
Marek Olšák [Wed, 3 Jul 2019 22:51:24 +0000 (18:51 -0400)]
st/mesa: don't invalidate a buffer range that is mapped

This is needed to fix an issue with OpenGL when a buffer is mapped and
BufferSubData is called. In this case, we can't invalidate the buffer range.

5 years agogallium: use MAP_DIRECTLY to mean suppression of DISCARD in buffer_subdata
Marek Olšák [Wed, 3 Jul 2019 22:51:24 +0000 (18:51 -0400)]
gallium: use MAP_DIRECTLY to mean suppression of DISCARD in buffer_subdata

This is needed to fix an issue with OpenGL when a buffer is mapped and
BufferSubData is called. In this case, we can't invalidate the buffer range.

5 years agoiris: Better handle decoder base addresses
Kenneth Graunke [Fri, 12 Jul 2019 20:52:35 +0000 (13:52 -0700)]
iris: Better handle decoder base addresses

It can be useful to call the decoder on a single batch.  But, that batch
may not contain STATE_BASE_ADDRESS, at which point the decoder will have
no idea how to find any buffers.  We can initialize the two static bases
at the beginning of time, so it has them even if it never sees SBA.

Surface base address changes dynamically, possibly in the middle of a
batch.  So we update it at the start of each batch, making it always
start at the value we inherited from the previous one.  SBA commands
inside the batch can update it to a proper value.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
5 years agoradv/gfx10: enable OC_LDS_EN for NGG GS if the ES stage is TES
Samuel Pitoiset [Mon, 15 Jul 2019 16:46:48 +0000 (18:46 +0200)]
radv/gfx10: enable OC_LDS_EN for NGG GS if the ES stage is TES

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoanv: Add android dependencies on android.
Bas Nieuwenhuizen [Mon, 8 Jul 2019 10:47:14 +0000 (12:47 +0200)]
anv: Add android dependencies on android.

Specifically needed for nativewindow for some VK_EXT_external_memory_android_hardware_buffers
functions, where we call into some AHardwareBuffer functions.

The legacy Android ext did not have us call into any Android function
at all and hence it was not noticed.

Fixes: 755c633b8d9 "anv: Fix vulkan build in meson."
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
5 years agopanfrost: Advertise more depth/stencil formats
Alyssa Rosenzweig [Mon, 15 Jul 2019 14:12:47 +0000 (07:12 -0700)]
panfrost: Advertise more depth/stencil formats

Fixes a regression in glmark's shadow/refract scenes.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/mfbd: Add Z32 rendering support
Alyssa Rosenzweig [Mon, 15 Jul 2019 14:10:31 +0000 (07:10 -0700)]
panfrost/mfbd: Add Z32 rendering support

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Fix blend_cso if nr_cbufs == 0
Alyssa Rosenzweig [Mon, 15 Jul 2019 14:08:15 +0000 (07:08 -0700)]
panfrost: Fix blend_cso if nr_cbufs == 0

Fixes: 46396af1ec4b69ca4a ("panfrost: Refactor blend infrastructure")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Cleanup shader upload code
Alyssa Rosenzweig [Fri, 12 Jul 2019 23:53:52 +0000 (16:53 -0700)]
panfrost: Cleanup shader upload code

The old algorithm is still used (and the same issue -- namely, leaking
all shaders -- applies) but we're way more concise about it since we're
*only* using the routine for shaders nowadays; everything else is a
BO-proper or transient.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Remove all old allocators
Alyssa Rosenzweig [Fri, 12 Jul 2019 23:45:44 +0000 (16:45 -0700)]
panfrost: Remove all old allocators

With the new refactor, this all becomes dead code.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Use transient memory for occlusion queries
Alyssa Rosenzweig [Fri, 12 Jul 2019 23:38:11 +0000 (16:38 -0700)]
panfrost: Use transient memory for occlusion queries

These only last a frame anyway.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Remove bizarre hack
Alyssa Rosenzweig [Fri, 12 Jul 2019 23:37:45 +0000 (16:37 -0700)]
panfrost: Remove bizarre hack

I don't think this is still necessary, and if it is, we'll have to
figure out how to fix it the right way.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Upload vertex descriptors to *transient* memory
Alyssa Rosenzweig [Fri, 12 Jul 2019 23:35:47 +0000 (16:35 -0700)]
panfrost: Upload vertex descriptors to *transient* memory

It's not legal to reuse the vertex shader descriptor across frames now
that we patch it at draw-time, so upload to transient memory.

Ideally, we could be smarter about this such that subsequent draws with
the same vertex shader and same patched state would reuse the
descriptor, but for now, let's simply achieve correctness.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Delay resource mmaps
Alyssa Rosenzweig [Fri, 12 Jul 2019 22:50:58 +0000 (15:50 -0700)]
panfrost: Delay resource mmaps

We use the new PAN_ALLOCATE_DELAY_MMAP flag to only map resources
on-demand, which should avoid mapping FBOs.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Cleanup PAN_ALLOCATE_*
Alyssa Rosenzweig [Fri, 12 Jul 2019 22:45:28 +0000 (15:45 -0700)]
panfrost: Cleanup PAN_ALLOCATE_*

While we're at it, prompted by a semantics issue around INVISIBLE, also
add a separate DELAY_MMAP flag.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/drm: Don't mmap INVISIBLE buffers
Alyssa Rosenzweig [Fri, 12 Jul 2019 22:39:47 +0000 (15:39 -0700)]
panfrost/drm: Don't mmap INVISIBLE buffers

On the new kernel, mmaping doesn't *hurt* per se, but it's still
wasteful for buffers explicitly marked as not needing an mmap.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agoanv: fix crash in vkCmdClearAttachments with unused attachment
Lionel Landwerlin [Mon, 15 Jul 2019 12:35:11 +0000 (15:35 +0300)]
anv: fix crash in vkCmdClearAttachments with unused attachment

anv_render_pass_compile() turns an unused attachment into a NULL
depth_stencil_attachment pointer so check that pointer before
accessing it.

Found with updates to existing CTS tests.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 208be8eafa30be ("anv: Make subpass::depth_stencil_attachment a pointer")
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
5 years agoradv/gfx10: export the PrimitiveID for ES stages (VS or TES)
Samuel Pitoiset [Wed, 10 Jul 2019 22:34:18 +0000 (00:34 +0200)]
radv/gfx10: export the PrimitiveID for ES stages (VS or TES)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: declare an external symbol for the ESGS ring
Samuel Pitoiset [Wed, 10 Jul 2019 22:29:50 +0000 (00:29 +0200)]
radv/gfx10: declare an external symbol for the ESGS ring

It will be used for stream output, but for now it is only declared
for VS, and only if the PrimitiveID needs to be exported.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: allocate ESGS ring space for exporting PrimitiveID
Samuel Pitoiset [Wed, 10 Jul 2019 22:25:28 +0000 (00:25 +0200)]
radv/gfx10: allocate ESGS ring space for exporting PrimitiveID

Only VS needs that. We shouldn't hardcode these values, but
avoiding that is complicated for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: fix crash when emitting NGG GS prologue
Samuel Pitoiset [Sun, 14 Jul 2019 10:55:48 +0000 (12:55 +0200)]
radv/gfx10: fix crash when emitting NGG GS prologue

ac_nir_context is initialized after the driver emits the NGG GS
prologue so it's likely to crash.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agolima/ppir: Fix branch codegen
Vasily Khoruzhick [Mon, 15 Jul 2019 01:22:36 +0000 (18:22 -0700)]
lima/ppir: Fix branch codegen

"unknown_2" field is actually a size of instruction that branch
points to. If it's set to a smaller size than the actual instruction,
branch behavior is undefined (and it usually wedges the GPU).

Fix it by setting this field correctly.

Fixes: af0de6b91c0b ("lima/ppir: implement discard and discard_if")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/ppir: Fix assert condition in ppir_codegen_encode_discard
Vasily Khoruzhick [Mon, 15 Jul 2019 01:21:57 +0000 (18:21 -0700)]
lima/ppir: Fix assert condition in ppir_codegen_encode_discard

Fixes: af0de6b91c0b ("lima/ppir: implement discard and discard_if")
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agoetnaviv: fix incorrect varying interpolation
Jonathan Marek [Wed, 19 Jun 2019 21:49:47 +0000 (17:49 -0400)]
etnaviv: fix incorrect varying interpolation

This corresponds to what the GC3000 blob does. The USED / UNUSED enums are
wrong, at least for GC2000/GC3000.

Without this the 3rd texture component is not interpolated correctly (flat?)
in the following test (and others):

dEQP-GLES2.functional.texture.mipmap.cube.generate.rgba8888_nicest

Strangely, when the texture is sampled from OpenGL it works correctly,
the problem only shows up for sampling by gallium/blitter. This fixes other
cube map tests which use util_blitter_blit.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: reduce rs alignment requirement for two pixel pipes GPU
Jonathan Marek [Wed, 19 Jun 2019 21:16:27 +0000 (17:16 -0400)]
etnaviv: reduce rs alignment requirement for two pixel pipes GPU

The rs alignment doesn't have to be multiplied by # of pixel pipes.

This works on GC2000 which doesn't have the SINGLE_BUFFER feature.

This fixes some cubemaps (NPOT / small mipmap levels) because aligning by 8
breaks the expected alignment of 4 for tiled format. We don't want to mess
with the alignment of tiled formats.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: fix nearest_linear / linear_nearest filtering on GC3000
Jonathan Marek [Wed, 19 Jun 2019 15:43:54 +0000 (11:43 -0400)]
etnaviv: fix nearest_linear / linear_nearest filtering on GC3000

The MIN filter is never used when not using mipmaps. This fixes that.

Interestingly, only GC3000 needs this (GC2000 works without this fix).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: fix nearest filtering
Jonathan Marek [Wed, 19 Jun 2019 15:42:13 +0000 (11:42 -0400)]
etnaviv: fix nearest filtering

ROUND_UV rounding breaks nearest filtering.

Enable it only when nearest filtering isn't used.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoradv/gfx10: Fix DCC clears.
Bas Nieuwenhuizen [Sat, 13 Jul 2019 14:05:40 +0000 (16:05 +0200)]
radv/gfx10: Fix DCC clears.

Looks like if the reg clear bit is set, the hardware does not use the 0/1
clears for textures.

Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agomeson: Add dep_thread dependency.
Vinson Lee [Thu, 13 Jun 2019 22:08:27 +0000 (15:08 -0700)]
meson: Add dep_thread dependency.

Fix this build error on Ubuntu 18.04.

/usr/bin/ld: src/util/libmesa_util.a(u_cpu_detect.c.o): undefined reference to symbol 'pthread_once@@GLIBC_2.2.5'

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110663
Suggested-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Eric Engestrom <eric@engestrom.ch>
5 years agogitlab-ci: Build i386 and ARM drivers in surfaceless mode.
Eric Anholt [Thu, 11 Jul 2019 19:58:28 +0000 (12:58 -0700)]
gitlab-ci: Build i386 and ARM drivers in surfaceless mode.

I don't particularly care about getting x86/ARM cross-build coverage
of all the window systems, but we do want to be building src/mesa/
(for x86 asm) and gallium drivers (for vc4 NEON asm).  I'm also hoping
to use these build products for testing freedreno on actual HW (which
we do using surfaceless).

This increases the docker image from 1.4G to 1.5G.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
5 years agolima: Fix compiler warnings for unused functions.
Andreas Baierl [Thu, 11 Jul 2019 13:26:24 +0000 (15:26 +0200)]
lima: Fix compiler warnings for unused functions.

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years agoanv: Fix pool allocator when first alloc needs to grow
Caio Marcelo de Oliveira Filho [Fri, 12 Jul 2019 21:37:38 +0000 (14:37 -0700)]
anv: Fix pool allocator when first alloc needs to grow

When using softpin, the first allocation was not calculating the
padding and offset correctly when that allocation needed the pool
to grow.  We were missing the initialization of state.end right after
expanding the pool for the first time.

This is not a problem for non-softpin since there we don't use
leftover padding so the ends would re-arrange incrementally.

This fixes running dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 in
SKL -- the test uses a shader larger than the initial size for the
instruction pool.

Fixes: dfc9ab2ccd9 "anv/allocator: Add padding information."
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agomesa: Port errors.c to util/list.h instead of simple_list.
Kenneth Graunke [Sun, 31 May 2015 23:02:36 +0000 (16:02 -0700)]
mesa: Port errors.c to util/list.h instead of simple_list.

There is widespread consensus that simple_list should go away.
This patch converts one more use to the modern kernel-style list.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agointel: Run the optimization loop before and after lowering int64
Jason Ekstrand [Fri, 12 Jul 2019 23:47:15 +0000 (18:47 -0500)]
intel: Run the optimization loop before and after lowering int64

For bindless SSBO access, we have to do 64-bit address calculations.  On
ICL and above, we don't have 64-bit integer support so we have to lower
the address calculations to 32-bit arithmetic.  If we don't run the
optimization loop before lowering, we won't fold any of the address
chain calculations before lowering 64-bit arithmetic and they aren't
really foldable afterwards.  This cuts the size of the generated code in
the compute shader in dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 by
around 30%.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
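
To illustrate why folding is harder after lowering, here is a rough
sketch of what a single 64-bit add looks like once it is expressed in
32-bit arithmetic. It is not the NIR lowering pass; the u32pair type and
the function names are hypothetical. A chain such as base + 8 + 16 folds
trivially while it is still 64-bit, whereas after lowering every add
becomes a low add, a carry compare and a high add.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value split into two 32-bit halves (hypothetical type). */
struct u32pair {
   uint32_t lo;
   uint32_t hi;
};

/* 64-bit addition using only 32-bit operations. */
static struct u32pair
add64_lowered(struct u32pair a, struct u32pair b)
{
   struct u32pair r;

   r.lo = a.lo + b.lo;             /* wraps modulo 2^32 */
   uint32_t carry = r.lo < a.lo;   /* 1 if the low add overflowed */
   r.hi = a.hi + b.hi + carry;

   return r;
}

/* Usage sketch: 0x00000000ffffffff + 1 carries into the high half. */
static void
add64_lowered_example(void)
{
   struct u32pair a = { .lo = 0xffffffffu, .hi = 0 };
   struct u32pair b = { .lo = 1, .hi = 0 };
   struct u32pair r = add64_lowered(a, b);

   assert(r.lo == 0 && r.hi == 1);
}
```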
5 years agopanfrost/decode: Drop _replay prefix
Alyssa Rosenzweig [Fri, 12 Jul 2019 15:57:10 +0000 (08:57 -0700)]
panfrost/decode: Drop _replay prefix

We don't even support replay anymore; this is just wasting characters
and adding clutter.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/decode: Drop _name suffixes
Alyssa Rosenzweig [Fri, 12 Jul 2019 15:54:49 +0000 (08:54 -0700)]
panfrost/decode: Drop _name suffixes

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/decode: Add MEMORY_PROP_DIR variant
Alyssa Rosenzweig [Fri, 12 Jul 2019 15:47:35 +0000 (08:47 -0700)]
panfrost/decode: Add MEMORY_PROP_DIR variant

This allows dumping memory properties directly without dereferencing an
address, allowing us to fix more -Waddress-of-packed-member warnings.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/decode: Copy embedded structs before using
Alyssa Rosenzweig [Fri, 12 Jul 2019 15:45:51 +0000 (08:45 -0700)]
panfrost/decode: Copy embedded structs before using

Fixes some, but not all, warnings from -Waddress-of-packed-member

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
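
A small sketch of the copy-before-use pattern, under the assumption of a
GCC or Clang toolchain (the packed attribute is a compiler extension).
The struct and function names are made up for illustration and are not
the pandecode types.

```c
#include <assert.h>
#include <stdint.h>

struct inner {
   uint32_t value;
};

/* Packed layout: `in` sits at offset 1, so it is not 4-byte aligned.
 * __attribute__((packed)) is a GCC/Clang extension. */
struct __attribute__((packed)) outer {
   uint8_t tag;
   struct inner in;
};

static uint32_t
read_inner(const struct inner *in)
{
   return in->value;
}

static uint32_t
decode_outer(const struct outer *o)
{
   /* Passing &o->in directly would trigger -Waddress-of-packed-member.
    * Copy the member into a naturally aligned local instead. */
   struct inner copy = o->in;

   return read_inner(&copy);
}

/* Usage sketch. */
static void
decode_outer_example(void)
{
   struct outer o = { .tag = 1, .in = { .value = 42 } };

   assert(decode_outer(&o) == 42);
}
```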
5 years agopanfrost/decode: Remove pandecode_decode_fbd_type
Alyssa Rosenzweig [Fri, 12 Jul 2019 15:45:36 +0000 (08:45 -0700)]
panfrost/decode: Remove pandecode_decode_fbd_type

It is unused.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/midgard: Use generic outmod type
Alyssa Rosenzweig [Fri, 12 Jul 2019 15:41:13 +0000 (08:41 -0700)]
panfrost/midgard: Use generic outmod type

It could be midgard_outmod_float or midgard_outmod_int; don't assume
it's one or the other. Fixes -Wenum-conversion warnings.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Precompute scoreboard dependents
Alyssa Rosenzweig [Fri, 12 Jul 2019 21:48:34 +0000 (14:48 -0700)]
panfrost: Precompute scoreboard dependents

Mali job dependency graphs, at least for GLES3.0, have the special
property that a given node will only have at most a single dependent.
This allows us to efficiently precompute the dependent array and
replace an inner loop's O(N) search with an O(1) lookup, bringing the
algorithmic complexity of scoreboarding from O(N^2) to O(N).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
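
The precomputation can be sketched roughly as follows. This is not the
panfrost scoreboard code: the job struct, the two dependency slots per
job and the NO_JOB sentinel are assumptions chosen for illustration.
Because each job has at most one dependent, a flat array indexed by job
is enough, and it is filled in a single O(N) pass.

```c
#include <assert.h>
#include <stddef.h>

#define NO_JOB (-1)

/* Hypothetical job record: up to two dependencies, NO_JOB if unused. */
struct job {
   int deps[2];
};

/* Fill dependent[j] with the index of the single job that depends on
 * job j, or NO_JOB if nothing depends on it. */
static void
precompute_dependents(const struct job *jobs, size_t count, int *dependent)
{
   for (size_t i = 0; i < count; ++i)
      dependent[i] = NO_JOB;

   for (size_t i = 0; i < count; ++i) {
      for (int d = 0; d < 2; ++d) {
         int dep = jobs[i].deps[d];

         if (dep == NO_JOB)
            continue;

         assert(dep >= 0 && (size_t)dep < count);

         /* The "at most one dependent" property is what makes a flat
          * array sufficient; check it rather than silently overwrite. */
         assert(dependent[dep] == NO_JOB || dependent[dep] == (int)i);
         dependent[dep] = (int)i;
      }
   }
}
```

Once the array is filled, finding the job that depends on job j is a
single load of dependent[j] rather than a scan over all jobs.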
5 years agopanfrost: Remove transient pool abstraction
Alyssa Rosenzweig [Fri, 12 Jul 2019 21:09:57 +0000 (14:09 -0700)]
panfrost: Remove transient pool abstraction

Now that it has been totally replaced by the borrow mechanism, it is
unused code.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Subdivide fixed-size transient slabs
Alyssa Rosenzweig [Fri, 12 Jul 2019 20:59:35 +0000 (13:59 -0700)]
panfrost: Subdivide fixed-size transient slabs

The whole purpose of the transient memory model is to make subdivision
stupidly easy, so let's handle that.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
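
A rough sketch of why subdivision is simple when everything in a slab
shares one lifetime: allocation is a bump of an offset and nothing is
freed individually. The slab size, struct and function names below are
assumptions, and alignment is ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SLAB_SIZE 4096   /* assumed size, for illustration only */

/* Hypothetical slab state: bytes already handed out. */
struct slab {
   size_t used;
};

/* Try to carve `size` bytes out of the slab. On success, store the byte
 * offset of the new range and return true. Return false when the request
 * does not fit, so the caller can pick another slab or a dedicated BO. */
static bool
slab_suballoc(struct slab *slab, size_t size, size_t *offset)
{
   assert(size > 0);
   assert(slab->used <= SLAB_SIZE);

   if (size > SLAB_SIZE - slab->used)
      return false;

   *offset = slab->used;
   slab->used += size;
   return true;
}
```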
5 years agopanfrost: Recycle fixed-size transient BOs
Alyssa Rosenzweig [Fri, 12 Jul 2019 20:05:14 +0000 (13:05 -0700)]
panfrost: Recycle fixed-size transient BOs

The usual case. We use the bitset to mark freedom and seize it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
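
The bitset idea can be sketched like this, assuming a pool of at most 64
fixed-size slabs tracked in one 64-bit word. The names and the pool size
are hypothetical and this is not the panfrost implementation.

```c
#include <assert.h>
#include <stdint.h>

#define SLAB_COUNT 64   /* assumed pool size: one bit per slab */

/* Bit i set means slab i is free. */
struct slab_pool {
   uint64_t free_mask;
};

/* Seize the lowest-numbered free slab; return its index, or -1 if all
 * slabs are currently in use. */
static int
slab_acquire(struct slab_pool *pool)
{
   for (int i = 0; i < SLAB_COUNT; ++i) {
      uint64_t bit = UINT64_C(1) << i;

      if (pool->free_mask & bit) {
         pool->free_mask &= ~bit;
         return i;
      }
   }

   return -1;
}

/* Mark a previously acquired slab as free again. */
static void
slab_release(struct slab_pool *pool, int index)
{
   assert(index >= 0 && index < SLAB_COUNT);
   pool->free_mask |= UINT64_C(1) << index;
}
```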
5 years agopanfrost: Bookkeep transient indices
Alyssa Rosenzweig [Fri, 12 Jul 2019 19:53:36 +0000 (12:53 -0700)]
panfrost: Bookkeep transient indices

The batch now temporarily possesses the transient buffer, so it'll need
to remember that to free it later.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Rewrite allocate_transient with new abstraction
Alyssa Rosenzweig [Fri, 12 Jul 2019 19:49:23 +0000 (12:49 -0700)]
panfrost: Rewrite allocate_transient with new abstraction

We use a fixed size slab if we can, otherwise we create a dedicated
("oversized") BO and add that to the job. In the latter case we'll get
reference counting for free so we can forget about this corner case for
the rest of the series.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Add pan_bo_for_screen helper
Alyssa Rosenzweig [Fri, 12 Jul 2019 20:55:45 +0000 (13:55 -0700)]
panfrost: Add pan_bo_for_screen helper

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Add panfrost_transient_bo array
Alyssa Rosenzweig [Thu, 11 Jul 2019 17:34:40 +0000 (10:34 -0700)]
panfrost: Add panfrost_transient_bo array

We would like transient allocations to occur on the screen (borrowed by
the batch) rather than on the context. Add fields to track this.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Don't upload vertex/tiler twice
Alyssa Rosenzweig [Thu, 11 Jul 2019 18:39:33 +0000 (11:39 -0700)]
panfrost: Don't upload vertex/tiler twice

The latter upload is correct, but the former upload is unassociated with
any particular FBO and therefore becomes orphaned. We do have to upload
at draw-time at the latest, if we haven't by then.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/drm: Check allocation size is positive
Alyssa Rosenzweig [Thu, 11 Jul 2019 17:53:37 +0000 (10:53 -0700)]
panfrost/drm: Check allocation size is positive

Zero-sized allocations will fail with an unhelpful errno from the
kernel; check size explicitly in userspace before it gets that far.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agomesa/glspirv: Validate that compute shaders are not linked with other stages
Neil Roberts [Tue, 12 Jun 2018 20:24:00 +0000 (22:24 +0200)]
mesa/glspirv: Validate that compute shaders are not linked with other stages

The test is based on link_shaders().

For example, it allows the following test (when run in SPIR-V mode) to
pass:
   spec/arb_compute_shader/linker/mix_compute_and_non_compute.shader_test

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agomesa/glspirv: Validate that there is a VS when there is a TCS, TES or GS
Neil Roberts [Fri, 25 May 2018 13:34:26 +0000 (15:34 +0200)]
mesa/glspirv: Validate that there is a VS when there is a TCS, TES or GS

The shader combination tests are copied from link_shaders().

For example, it allows the following tests (when run in SPIR-V mode) to
pass:
   spec/arb_tessellation_shader/linker/no-vs
   spec/arb_tessellation_shader/linker/tcs-no-vs
   spec/arb_tessellation_shader/linker/tes-no-vs
   spec/glsl-1.50/linker/gs-without-vs

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoi965: don't use disk cache with SPIR-V shaders
Alejandro Piñeiro [Wed, 27 Feb 2019 14:28:47 +0000 (15:28 +0100)]
i965: don't use disk cache with SPIR-V shaders

Right now we don't support disk cache for SPIR-V shaders (from
ARB_gl_spirv), so let's avoid writing the program data to or reading
it from the disk if any in-use shaders use SPIR-V.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoglsl/shader_cache: handle SPIR-V shaders
Alejandro Piñeiro [Wed, 27 Feb 2019 14:29:15 +0000 (15:29 +0100)]
glsl/shader_cache: handle SPIR-V shaders

Right now we don't have cache support for SPIR-V shaders (from
ARB_gl_spirv). Right now they are properly skipped because they fall
on the ff shader code path (no key, no name), but it would be better
to update current comments, and add some guards.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir/linker: Initialize UniformDataDefaults when using SPIR-V
Arcady Goldmints-Orlov [Mon, 28 Jan 2019 16:19:28 +0000 (10:19 -0600)]
nir/linker: Initialize UniformDataDefaults when using SPIR-V

Allocate UniformDataDefaults and fill in the data defaults when
linking a SPIR-V program. Among other things, this allows program
serialization to work.

It allows the following piglit test (when run in SPIR-V mode) to pass:
  spec/arb_get_program_binary/execution/uniform-after-restore.shader_test

v2: use memcpy to initialize UniformDataDefaults

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoglsl/serialize: Update write_program_resource_data() to handle NULL input and output...
Arcady Goldmints-Orlov [Thu, 20 Dec 2018 01:12:25 +0000 (02:12 +0100)]
glsl/serialize: Update write_program_resource_data() to handle NULL input and output variable names

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoglsl/serialize: Handle NULL uniform name in write_uniforms()
Arcady Goldmints-Orlov [Thu, 29 Nov 2018 14:16:34 +0000 (15:16 +0100)]
glsl/serialize: Handle NULL uniform name in write_uniforms()

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agomesa/main: Fix UBO/SSBO ACTIVE_VARIABLES query (ARB_gl_spirv)
Antia Puentes [Tue, 18 Dec 2018 10:55:04 +0000 (11:55 +0100)]
mesa/main: Fix UBO/SSBO ACTIVE_VARIABLES query (ARB_gl_spirv)

When querying MAX_NUM_ACTIVE_VARIABLES, NUM_ACTIVE_VARIABLES and
ACTIVE_VARIABLES over SSBO and UBO interfaces, we filter the variables
which are active using the variable's name and looking for it in the
program resource list. If it is in the program resource list, the
variable will be considered active.

However, with ARB_gl_spirv, where name reflection information is not
mandatory, we instead use the UBO/SSBO binding and variable offset to
filter which variables are active.

v2: use RESOURCE_UBO/UNI macros instead of direct castings, update
    comment (Alejandro)

v3: Change signature of _mesa_program_resource_find_active_variable
    to simplify calling it. Also, squash the fix for find_binding_offset
    for arrays of blocks (Arcady)

Signed-off-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Arcady Goldmints-Orlov <agoldmints@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agomesa/shader_query: Fix LOCATION_INDEX query (ARB_gl_spirv)
Antia Puentes [Fri, 14 Sep 2018 06:55:24 +0000 (08:55 +0200)]
mesa/shader_query: Fix LOCATION_INDEX query (ARB_gl_spirv)

When querying GL_LOCATION_INDEX using glGetProgramResourceiv
we already know the index of the resource, we do not need to find
it using the name, which is convenient for shaders coming from
SPIR-V binaries where names are optional.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agomesa/shaderapi: Fix TRANSFORM_FEEDBACK_VARYING program query
Antia Puentes [Mon, 13 Aug 2018 12:13:38 +0000 (14:13 +0200)]
mesa/shaderapi: Fix TRANSFORM_FEEDBACK_VARYING program query

Fixes the program queries API (glGetProgramiv):
TRANSFORM_FEEDBACK_VARYINGS and TRANSFORM_FEEDBACK_VARYING_MAX_LENGTH
in two cases:

  1. ARB_enhanced_layouts:

The queries were not working for GLSL shaders which specify the
varyings using enhanced layouts. We were returning the info as if the
varyings could only be specified using the API.

  2. ARB_gl_spirv:

TRANSFORM_FEEDBACK_VARYING_MAX_LENGTH should return 1 if there is no
name reflection information available.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agomesa/uniforms: Fix GetUniformLocation (ARB_gl_spirv)
Antia Puentes [Wed, 8 Aug 2018 15:52:04 +0000 (17:52 +0200)]
mesa/uniforms: Fix GetUniformLocation (ARB_gl_spirv)

From the ARB_gl_spirv specification, glGetUniformLocation should
return -1 when no name reflection is available.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agomesa/shader_query: Fix NAME_LENGTH queries (ARB_gl_spirv)
Antia Puentes [Mon, 13 Aug 2018 16:48:37 +0000 (18:48 +0200)]
mesa/shader_query: Fix NAME_LENGTH queries (ARB_gl_spirv)

For shaders constructed from SPIR-V binaries, it is possible that
no name reflection information is available. In that case,

 - glGetProgramInterfaceiv(.., pname=MAX_NAME_LENGTH, ..)
 - glGetProgramResourceiv(.., props=NAME_LENGTH, ..)

should return 1.

Signed-off-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agomesa: Fix ACTIVE_*_MAX_LENGTH program queries (ARB_gl_spirv)
Alejandro Piñeiro [Sat, 18 Nov 2017 09:04:42 +0000 (10:04 +0100)]
mesa: Fix ACTIVE_*_MAX_LENGTH program queries (ARB_gl_spirv)

With ARB_gl_spirv, a lot of name reflection information may be
missing, so several queries need NULL name checks and must return a
specific value in those cases. This commit adds them for
ACTIVE_UNIFORM_BLOCK_MAX_NAME_LENGTH,
ACTIVE_ATTRIBUTE_MAX_LENGTH and ACTIVE_UNIFORM_MAX_LENGTH.

From ARB_gl_spirv spec:

   "If pname is ACTIVE_UNIFORM_BLOCK_MAX_NAME_LENGTH, the length of
    the longest active uniform block name, including the null
    terminator, is returned. If no active uniform blocks exist, zero
    is returned. If no name reflection information is available, one
    is returned.

    If pname is ACTIVE_ATTRIBUTE_MAX_LENGTH, the length of the longest
    active attribute name, including a null terminator, is returned.
    If no active attributes exist, zero is returned. If no name
    reflection information is available, one is returned.

    If pname is ACTIVE_UNIFORM_MAX_LENGTH, the length of the longest
    active uniform name, including a null terminator, is returned. If
    no active uniforms exist, zero is returned. If no name reflection
    information is available, one is returned."

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
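
The three cases in the spec text quoted above (no resources, no names,
names present) can be sketched as follows. The function name and the
array-of-strings input are assumptions made for illustration; the real
code walks Mesa's own resource lists:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: names[i] is NULL when resource i has no name
 * reflection information. Returns 0 when there are no active
 * resources, 1 when none of them has a name, and otherwise the longest
 * name length including the null terminator. */
static int
active_max_name_length(const char *const *names, size_t count)
{
   if (count == 0)
      return 0;

   int max_len = 1; /* value reported when no name is available */

   for (size_t i = 0; i < count; i++) {
      if (names[i] == NULL)
         continue;

      int len = (int) strlen(names[i]) + 1;
      if (len > max_len)
         max_len = len;
   }

   return max_len;
}
```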
5 years agonir/types: Add glsl_type_is_unsized_array helper
Antia Puentes [Thu, 15 Nov 2018 08:13:08 +0000 (09:13 +0100)]
nir/types: Add glsl_type_is_unsized_array helper

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir/linker: Fill TOP_LEVEL_ARRAY_SIZE and STRIDE
Antia Puentes [Sun, 30 Jun 2019 23:27:59 +0000 (18:27 -0500)]
nir/linker: Fill TOP_LEVEL_ARRAY_SIZE and STRIDE

From the ARB_program_interface_query specification:

    "For the property TOP_LEVEL_ARRAY_SIZE, a single integer
    identifying the number of active array elements of the top-level
    shader storage block member containing to the active variable is
    written to <params>.  If the top-level block member is not
    declared as an array, the value one is written to <params>.  If
    the top-level block member is an array with no declared size, the
    value zero is written to <params>."

    "For the property TOP_LEVEL_ARRAY_STRIDE, a single integer
    identifying the stride between array elements of the top-level
    shader storage block member containing the active variable is
    written to <params>.  For top-level block members declared as
    arrays, the value written is the difference, in basic machine
    units, between the offsets of the active variable for consecutive
    elements in the top-level array.  For top-level block members not
    declared as an array, zero is written to <params>."

v2: move top_level_array_size and stride into nir_link_uniforms_state
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
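
A sketch of the two properties as the quoted spec text defines them,
assuming a hypothetical `struct top_level_member` that already records
whether the top-level block member is an array, its declared length
and its stride (the names are illustrative, not the fields of
nir_link_uniforms_state):

```c
#include <stdbool.h>

/* Hypothetical description of a top-level shader storage block member. */
struct top_level_member {
   bool is_array;
   unsigned length; /* 0 for an array with no declared size */
   unsigned stride; /* basic machine units between consecutive elements */
};

/* TOP_LEVEL_ARRAY_SIZE: 1 for non-arrays, 0 for unsized arrays,
 * otherwise the number of elements. */
static unsigned
top_level_array_size(const struct top_level_member *m)
{
   return m->is_array ? m->length : 1;
}

/* TOP_LEVEL_ARRAY_STRIDE: 0 for non-arrays, otherwise the stride. */
static unsigned
top_level_array_stride(const struct top_level_member *m)
{
   return m->is_array ? m->stride : 0;
}
```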
5 years agonir/linker: Compute the offset for non-trivial uniform types.
Antia Puentes [Wed, 12 Sep 2018 11:51:57 +0000 (13:51 +0200)]
nir/linker: Compute the offset for non-trivial uniform types.

ARB_gl_spirv points out that the offset must be explicit; however,
this is only true for 'root' types. For complex types, like struct
members or arrays of arrays, it needs to be computed.

We are not using the offset stored in the gl_buffer_variables during
the uniform blocks linking because currently we do not have a way to
relate a gl_buffer_variable with its corresponding gl_uniform_storage.
The GLSL path uses the name for that, but we can not rely on that
because names are optional in SPIR-V.

Notice that uniforms not backed by a buffer object will have an offset
equal to -1, as in the GLSL path.

v2: add offset and var_is_in_block as per-variable state in
    nir_link_uniforms_state (Arcady)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
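
As an illustration of the kind of computation involved, here is a
sketch for one element of an array of arrays. The function and its
parameters are hypothetical and the linker's real bookkeeping is more
involved; only the -1 convention for uniforms outside a block is taken
from the message above:

```c
#include <stdbool.h>

/* Hypothetical helper: offset of element [outer_index][inner_index] of
 * an array of arrays whose explicit 'root' offset is base_offset.
 * Strides are in bytes. Uniforms not backed by a buffer object get -1. */
static int
element_offset(bool var_is_in_block, int base_offset,
               unsigned outer_index, unsigned outer_stride,
               unsigned inner_index, unsigned inner_stride)
{
   if (!var_is_in_block)
      return -1;

   return base_offset +
          (int) (outer_index * outer_stride + inner_index * inner_stride);
}
```

For a root offset of 16, an outer stride of 64 and an inner stride of
16, element [1][2] lands at 16 + 64 + 32 = 112.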
5 years agonir/linker: Add atomic counters to the program resource list
Antia Puentes [Sat, 15 Dec 2018 17:34:11 +0000 (18:34 +0100)]
nir/linker: Add atomic counters to the program resource list

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/linker: Add XFB resources to the program resource list
Antia Puentes [Sat, 15 Dec 2018 17:33:18 +0000 (18:33 +0100)]
nir/linker: Add XFB resources to the program resource list

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir/linker: Add BUFFER_VARIABLEs to the prog resource list
Antia Puentes [Sat, 15 Dec 2018 17:25:41 +0000 (18:25 +0100)]
nir/linker: Add BUFFER_VARIABLEs to the prog resource list

v2: use link_util_should_add_buffer_variable() (Arcady)
Signed-off-by: Arcady Goldmints-Orlov <agoldmints@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/linker: Add inputs/outputs to the program resource list
Antia Puentes [Wed, 8 Aug 2018 12:29:38 +0000 (14:29 +0200)]
nir/linker: Add inputs/outputs to the program resource list

v2: added TODO comment hinting possible future refactoring of
    nir_build_program_resource_list and build_program_resource_list,
    to avoid code duplication (Alejandro, to explicitly reflect a
    valid concern from Timothy during the review).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/linker: add ubo/ssbo to the program resource list
Alejandro Piñeiro [Fri, 23 Mar 2018 11:35:48 +0000 (12:35 +0100)]
nir/linker: add ubo/ssbo to the program resource list

v2: "nir/linker: Use the stageref when adding UBO/SSBO resources"
     squashed on this one (Timothy)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir/linker: Fill the uniform's BLOCK_INDEX
Antia Puentes [Sat, 25 Aug 2018 13:15:30 +0000 (15:15 +0200)]
nir/linker: Fill the uniform's BLOCK_INDEX

Binding comparison is used to determine the block the uniform is part
of. Note that to do the binding comparison we need the information in
UniformBlocks[] and ShaderStorageBlocks[] to be available, so we have
to call gl_nir_link_uniform_blocks() before linking the uniforms.

v2: add missing break (Timothy)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
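
The binding comparison can be sketched like this, assuming a
hypothetical `struct block` that only carries the binding. Returning
-1 when nothing matches is an assumption of the sketch, chosen to
mirror the value GL reports for uniforms that are not in a block:

```c
#include <stddef.h>

/* Hypothetical, reduced view of an entry in UniformBlocks[] or
 * ShaderStorageBlocks[]. */
struct block {
   unsigned binding;
};

/* Returns the index of the first block with a matching binding, or -1. */
static int
find_block_index(const struct block *blocks, size_t num_blocks,
                 unsigned binding)
{
   for (size_t i = 0; i < num_blocks; i++) {
      if (blocks[i].binding == binding)
         return (int) i;
   }

   return -1;
}
```

This is also why the block arrays have to be filled in first: the
lookup has nothing to compare against until
gl_nir_link_uniform_blocks() has run.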
5 years agoradv/gfx10: enable 1D textures
Samuel Pitoiset [Fri, 12 Jul 2019 06:17:09 +0000 (08:17 +0200)]
radv/gfx10: enable 1D textures

Mirror RadeonSI. This also fixes crashes in addrlib.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agointel/compiler: remove abandoned comments
Andres Gomez [Fri, 12 Jul 2019 15:17:01 +0000 (18:17 +0300)]
intel/compiler: remove abandoned comments

c8665005: ("intel/compiler: Don't always require precise lowering of flrp")
forgot to remove some comments that didn't apply any more after the
change.

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrnd.net>
5 years agonir/compiler: keep same bit size when lowering with flrp
Andres Gomez [Mon, 8 Jul 2019 13:26:52 +0000 (16:26 +0300)]
nir/compiler: keep same bit size when lowering with flrp

This was probably not caught before because no supported test was
exercising the flrp lowering with a bit size other than 32.

With the arrival of VK_KHR_shader_float_controls we will have some of
those and, unless we keep the bit size, we will end up with something
like:

../src/compiler/nir/nir_builder.h:420: nir_builder_alu_instr_finish_and_insert: Assertion `src_bit_size == bit_size' failed.

Fixes: 158370ed2a0 ("nir/flrp: Add new lowering pass for flrp instructions")
Fixes: ae02622d8fd ("nir/flrp: Lower flrp(a, b, c) differently if another flrp(_, b, c) exists")
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrnd.net>
5 years agoanv: Properly compute image usage in CreateImageView
Jason Ekstrand [Fri, 12 Jul 2019 13:20:25 +0000 (08:20 -0500)]
anv: Properly compute image usage in CreateImageView

With separate stencil usage, we can't just grab the usage from the image
directly and have to consider the per-aspect usage instead.

Fixes: 1be38f9178 "anv:Use VK_EXT_separate_stencil_usage to avoid..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
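
A rough sketch of per-aspect usage selection. The struct, the flag
names and the choice to OR the usages of all selected aspects are
assumptions made for illustration, not a copy of the anv code:

```c
#include <stdint.h>

/* Hypothetical aspect bits, mirroring the Vulkan aspect mask layout. */
#define ASPECT_COLOR   0x1u
#define ASPECT_DEPTH   0x2u
#define ASPECT_STENCIL 0x4u

/* Hypothetical, reduced image: one usage mask for stencil, one for
 * every other aspect. */
struct image {
   uint32_t usage;
   uint32_t stencil_usage;
};

/* Collect the usage of each aspect the view selects instead of reading
 * the image-wide usage directly. */
static uint32_t
view_usage_for_aspects(const struct image *img, uint32_t aspects)
{
   uint32_t usage = 0;

   if (aspects & ASPECT_STENCIL)
      usage |= img->stencil_usage;

   if (aspects & ~ASPECT_STENCIL)
      usage |= img->usage;

   return usage;
}
```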
5 years agoradv/gfx10: emit DISABLE_CONSERVATIVE_ZPASS_COUNTS
Samuel Pitoiset [Fri, 12 Jul 2019 10:17:18 +0000 (12:17 +0200)]
radv/gfx10: emit DISABLE_CONSERVATIVE_ZPASS_COUNTS

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: init more registers in the graphics preamble
Samuel Pitoiset [Fri, 12 Jul 2019 10:17:17 +0000 (12:17 +0200)]
radv/gfx10: init more registers in the graphics preamble

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: set HS/GS/CS.WGP_MODE
Samuel Pitoiset [Fri, 12 Jul 2019 10:17:16 +0000 (12:17 +0200)]
radv/gfx10: set HS/GS/CS.WGP_MODE

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: emit GE_PC_ALLOC
Samuel Pitoiset [Fri, 12 Jul 2019 10:17:15 +0000 (12:17 +0200)]
radv/gfx10: emit GE_PC_ALLOC

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: enable vertex shaders without export parameters
Samuel Pitoiset [Fri, 12 Jul 2019 10:17:14 +0000 (12:17 +0200)]
radv/gfx10: enable vertex shaders without export parameters

GFX10 allows this.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: launch 2 compute waves per CU before going onto the next CU
Samuel Pitoiset [Fri, 12 Jul 2019 10:17:13 +0000 (12:17 +0200)]
radv/gfx10: launch 2 compute waves per CU before going onto the next CU

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: use ac_get_compute_resource_limits()
Samuel Pitoiset [Fri, 12 Jul 2019 10:17:12 +0000 (12:17 +0200)]
radv: use ac_get_compute_resource_limits()

No behaviour change.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoac: import ac_get_compute_resource_limits() from RadeonSI
Samuel Pitoiset [Fri, 12 Jul 2019 10:17:11 +0000 (12:17 +0200)]
ac: import ac_get_compute_resource_limits() from RadeonSI

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agopanfrost: Initialize shift/extra_flags
Alyssa Rosenzweig [Thu, 11 Jul 2019 22:46:22 +0000 (15:46 -0700)]
panfrost: Initialize shift/extra_flags

Don't rely on them being preinitialized to zero; this can cause junk to
appear on the wire.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Fix build warnings
Alyssa Rosenzweig [Thu, 11 Jul 2019 22:34:56 +0000 (15:34 -0700)]
panfrost: Fix build warnings

A bunch of these are from asserts not being compiled in 32-bit mode
(once Erik's ASSERTABLE stuff is merged, we'll want to switch).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agoradv/gfx10: invalidate everything in L2 when shaders read data
Samuel Pitoiset [Fri, 12 Jul 2019 11:59:08 +0000 (13:59 +0200)]
radv/gfx10: invalidate everything in L2 when shaders read data

This includes metadata as well. On GFX10, we have to invalidate
the L2 metadata cache when shaders read DCC.

Note that we still have to implement GFX10 coherency by
introducing INV_L2_METADATA, but for now just flush L2.

This fixes a corruption with DCC and Talos.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/gfx10: fix wrong emission of GE_CNTL
Samuel Pitoiset [Fri, 12 Jul 2019 09:12:58 +0000 (11:12 +0200)]
radv/gfx10: fix wrong emission of GE_CNTL

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: add more assertions to make sure packets are correctly emitted
Samuel Pitoiset [Fri, 12 Jul 2019 09:12:57 +0000 (11:12 +0200)]
radv: add more assertions to make sure packets are correctly emitted

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agov3d: use inc/dec tmu operation with image atomic sub/add of 1
Alejandro Piñeiro [Tue, 25 Jun 2019 13:02:56 +0000 (15:02 +0200)]
v3d: use inc/dec tmu operation with image atomic sub/add of 1

This allows removing a mov of 1/-1, as it is implicit in the
operation.

As with atomic inc/dec/add, the usual shader-db set doesn't include
any GLES shader using it. So, using the vk-gl-cts shaders as a
workaround, we get this:

total instructions in shared programs: 1217013 -> 1217006 (<.01%)
instructions in affected programs: 53 -> 46 (-13.21%)
helped: 2
HURT: 0

One of the helped shaders went from 40 to 34 instructions.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: refactor some code from v3d40_vir_emit_image_load_store
Alejandro Piñeiro [Thu, 27 Jun 2019 12:16:15 +0000 (14:16 +0200)]
v3d: refactor some code from v3d40_vir_emit_image_load_store

Move it to a new auxiliary method, v3d40_image_load_store_tmu_op,
equivalent to the nir_to_vir v3d_general_tmu_op, to clean up a little.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: use inc/dec tmu operation with atomic sub/add of 1
Alejandro Piñeiro [Wed, 19 Jun 2019 11:17:41 +0000 (13:17 +0200)]
v3d: use inc/dec tmu operation with atomic sub/add of 1

Among other things, this avoids the need to load 1/-1 constants (so
one less operation).

The removed comment suggests adding support for inc/dec in NIR.
Intel just uses an auxiliary method to get which hw operation is
needed, so no lowering is needed. At the same time, the change being
so small, it seems unreasonable to try to add a general one to NIR
itself. It is easier to just adapt the method here (that is what
the patch does right now).

It is worth noting that we are not getting any change in shader-db
stats because all those methods are used in the usual shader-db set
by shaders needing GLSL > 4.2. In general there aren't too many GLSL
ES 3.1 tests.

As an alternative, we captured the GLES3/GLSL31/GLS32 shaders used by
vk-gl-cts, even if that is not a real-life usage of shaders. With
those we get the following:

total instructions in shared programs: 1217022 -> 1217013 (<.01%)
instructions in affected programs: 117 -> 108 (-7.69%)
helped: 6
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.50 x̃: 1
helped stats (rel) min: 3.57% max: 10.00% x̄: 8.09% x̃: 9.09%
95% mean confidence interval for instructions value: -2.07 -0.93
95% mean confidence interval for instructions %-change: -10.54% -5.64%
Instructions are helped.

Note that the number of shaders helped is really low because most of
the vk-gl-cts tests using AtomicInc/Dec/Add use them in compute
shaders. Although there is currently a branch with CS support, the
usual practice is to gather the stats against master.

Reviewed-by: Eric Anholt <eric@anholt.net>
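
The selection described above can be sketched as a small helper. The
enum and its values are hypothetical stand-ins for the generated TMU
op definitions, and the constant check is reduced to a flag plus a
value rather than a real NIR source query:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical op names; the real values come from the generated
 * packet headers. */
enum tmu_op {
   TMU_OP_ADD,
   TMU_OP_INC,
   TMU_OP_DEC,
};

/* Pick inc/dec when the addend is a compile-time constant of 1 or -1,
 * so the constant never has to be loaded; otherwise fall back to the
 * generic add. */
static enum tmu_op
op_for_atomic_add(bool addend_is_const, int32_t addend)
{
   if (addend_is_const && addend == 1)
      return TMU_OP_INC;

   if (addend_is_const && addend == -1)
      return TMU_OP_DEC;

   return TMU_OP_ADD;
}
```

A non-constant addend always takes the generic add path, even if it
happens to hold 1 at run time.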
5 years agov3d: remove redefinition of tmu operations on nir_to_vir
Alejandro Piñeiro [Tue, 2 Jul 2019 10:45:49 +0000 (12:45 +0200)]
v3d: remove redefinition of tmu operations on nir_to_vir

They are already defined in the generated packet headers, although in
a slightly different format, so how they are used in nir_to_vir
needed to change.

In addition to allowing the removal of some duplicated headers, this
will allow defining just one get_op_for_atomic_add aux method later to
support using inc/dec instead of add of 1/-1.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: tweak initial comment on pack generator script
Alejandro Piñeiro [Tue, 2 Jul 2019 10:02:04 +0000 (12:02 +0200)]
v3d: tweak initial comment on pack generator script

The files it mentions as references have slightly different names.

Reviewed-by: Eric Anholt <eric@anholt.net>