mesa.git
5 years agopanfrost: Get rid of the "free imported BO" logic
Boris Brezillon [Tue, 2 Jul 2019 08:30:45 +0000 (10:30 +0200)]
panfrost: Get rid of the "free imported BO" logic

bo->imported was never set to true which means this path was never taken.
Moreover, panfrost_drm_free_imported_bo() is doing missing the munmap()
call which seems wrong because the import BO function calls mmap().

Let's just kill this function along with the ->imported field.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
5 years agopanfrost: Get rid of the panfrost_driver abstraction leftovers
Boris Brezillon [Tue, 2 Jul 2019 08:25:26 +0000 (10:25 +0200)]
panfrost: Get rid of the panfrost_driver abstraction leftovers

Commit 5f81669d880b ("panfrost: Remove the panfrost_driver abstraction")
left a few things behind, remove them now.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
5 years agopanfrost: Move scanout res creation out of panfrost_resource_create()
Boris Brezillon [Tue, 2 Jul 2019 08:12:10 +0000 (10:12 +0200)]
panfrost: Move scanout res creation out of panfrost_resource_create()

Which improves readability and help us avoid a memory leak.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
5 years agopanfrost: Add the sampled texture BO to the job
Boris Brezillon [Mon, 1 Jul 2019 15:22:26 +0000 (17:22 +0200)]
panfrost: Add the sampled texture BO to the job

Otherwise we get random use-after-{free,unmap} errors.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
Changes in v2:
- Move the panfrost_job_add_bo() call out of the loop

5 years agoradv: enable DCC for layers on GFX8
Samuel Pitoiset [Mon, 1 Jul 2019 14:31:01 +0000 (16:31 +0200)]
radv: enable DCC for layers on GFX8

It's currently only enabled if dcc_slice_size is equal to
dcc_slice_fast_clear_size because the driver assumes that
portions of multiple layers are contiguous but it's not
always true.

Still not supported on GFX9.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: do not enable DCC for mipmapped arrays because performance is worse
Samuel Pitoiset [Mon, 1 Jul 2019 14:31:00 +0000 (16:31 +0200)]
radv: do not enable DCC for mipmapped arrays because performance is worse

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: implement clearing DCC layers on GFX8
Samuel Pitoiset [Mon, 1 Jul 2019 14:30:59 +0000 (16:30 +0200)]
radv: implement clearing DCC layers on GFX8

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: merge radv_dcc_clear_level() into radv_clear_dcc()
Samuel Pitoiset [Mon, 1 Jul 2019 14:30:58 +0000 (16:30 +0200)]
radv: merge radv_dcc_clear_level() into radv_clear_dcc()

This will help for clearing DCC arrays because we need to know
the subresource range.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: add support for decompressing DCC layers with compute
Samuel Pitoiset [Mon, 1 Jul 2019 14:30:57 +0000 (16:30 +0200)]
radv: add support for decompressing DCC layers with compute

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoac: compute the DCC fast clear size per slice on GFX8
Samuel Pitoiset [Mon, 1 Jul 2019 14:30:56 +0000 (16:30 +0200)]
ac: compute the DCC fast clear size per slice on GFX8

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoac: compute the size of one DCC slice on GFX8
Samuel Pitoiset [Mon, 1 Jul 2019 14:30:55 +0000 (16:30 +0200)]
ac: compute the size of one DCC slice on GFX8

Addrlib doesn't provide this info. Because DCC is linear, at least
on GFX8, it's easy to compute the size of one slice.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoiris: Defer closing and freeing VMA until buffers are idle.
Kenneth Graunke [Sat, 4 May 2019 03:27:40 +0000 (20:27 -0700)]
iris: Defer closing and freeing VMA until buffers are idle.

There will unfortunately be circumstances where we cannot re-use a
virtual memory address until it's no longer active on the GPU.  To
facilitate this, we instead move BOs to a "dead" list, and defer
closing them and returning their VMA until they are idle.  We
periodically sweep these away in cleanup_bo_cache, which triggers
every time a new object's refcount hits zero.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
5 years agoiris: Add an explicit alignment parameter to iris_bo_alloc_tiled().
Kenneth Graunke [Sat, 23 Mar 2019 17:04:16 +0000 (10:04 -0700)]
iris: Add an explicit alignment parameter to iris_bo_alloc_tiled().

In the future, some images will need to be aligned to a larger value
than 4096.  Most buffers, however, don't have any such requirement,
so for now we only add the parameter to iris_bo_alloc_tiled() and
leave the others with the simpler interface.

v2: Fix missing alignment in vma_alloc, caught by Caio!

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
5 years agov3d: do not flush jobs that are synced with 'Wait for transform feedback'
Iago Toral Quiroga [Thu, 20 Jun 2019 11:46:02 +0000 (13:46 +0200)]
v3d: do not flush jobs that are synced with 'Wait for transform feedback'

Generally, we achieve this by skipping the flush on calls to
v3d_flush_jobs_writing_resource() when we detect that the resource is written
in the current job from a transform feedback write.

The exception to this is the case where the caller is about to map the
resource, in which case we need to flush immediately since we can only emit
'Wait for transform feedback' commands on rendering jobs. We add a parameter
to the function so the caller can identify that scenario.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: emit 'Wait for transform feedback' commands when needed
Iago Toral Quiroga [Thu, 20 Jun 2019 10:14:17 +0000 (12:14 +0200)]
v3d: emit 'Wait for transform feedback' commands when needed

The hardware can flush transform feedback writes before reads in the same
job by inserting this command.

This patch detects when the rendering state for the current draw call reads
resources that had been previously written by transform feedback in the
same job and inserts the 'Wait for transform feedback' command before
emitting the new draw.

v2 (Eric):
  - this was intended to look at job->tf_write_prscs for TF jobs.
  - clear job->tf_write_prscs after we emit the TF flush.
  - can skip flushes for fragment shader reads from TF.

v3 (Eric):
  - all resources in job->tf_write_prscs are resources written by TF so
   we don't need to check if they are bound to PIPE_BIND_STREAM_OUTPUT.
  - documented optimization opportunity for geometry stages.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: keep track of resources written by transform feedback
Iago Toral Quiroga [Thu, 20 Jun 2019 11:38:56 +0000 (13:38 +0200)]
v3d: keep track of resources written by transform feedback

The hardware provides a feature to sync reads from previous transform feedback
writes in the same job so if we use this mechanism we no longer have to flush
the job.

In order to identify this scenario we need a mechanism to identify resources
that are written by transform feedback.

v2: use _mesa_pointer_set_create (Eric)

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agost/dri: fix typo in format table for GR1616 format
Mike Blumenkrantz [Thu, 30 May 2019 12:47:59 +0000 (08:47 -0400)]
st/dri: fix typo in format table for GR1616 format

the dri image format here should match the fourcc format

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agost/dri: pass dri2_format_mapping directly to dri2_create_image_from_winsys
Mike Blumenkrantz [Wed, 29 May 2019 20:56:30 +0000 (16:56 -0400)]
st/dri: pass dri2_format_mapping directly to dri2_create_image_from_winsys

this makes the entire struct available for use here

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agomesa/st: simplify format usage in st_bind_egl_image
Mike Blumenkrantz [Wed, 29 May 2019 20:41:59 +0000 (16:41 -0400)]
mesa/st: simplify format usage in st_bind_egl_image

the formats handled in the switch statement will always return an
unknown mesa format, so process them directly and leave the default
case for other/unknown formats

no functional changes

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoiris: Use MI_COPY_MEM_MEM for tiny resource_copy_region calls.
Kenneth Graunke [Wed, 26 Jun 2019 07:05:06 +0000 (00:05 -0700)]
iris: Use MI_COPY_MEM_MEM for tiny resource_copy_region calls.

If our resource_copy_region size is a small number of DWords, then
instead of firing up BLORP, we can simply use MI_COPY_MEM_MEM (after
a CS stall).  We also try and select the optimal batch.

Improves performance in Shadow of Mordor on Low settings at 1920x1080
on Skylake GT4e by 0.689096% +/- 0.473968% (n=4).  It tries to copy
4 bytes of data to a buffer which was most recently used as a writable
compute shader SSBO.  Previously we were switching from compute to the
render pipeline, then firing up all of blorp_buffer_copy...for 4 bytes.

I arbitrarily decided to support 4/8/12/16 bytes.  Jason thinks this
is about the right threshold where it's cheaper to use MI_COPY_MEM_MEM.

5 years agoradv: Only allocate supplied number of descriptors when variable.
Bas Nieuwenhuizen [Sat, 29 Jun 2019 01:07:03 +0000 (03:07 +0200)]
radv: Only allocate supplied number of descriptors when variable.

Fixes: b5e04e9217b "radv: Support allocating variable size descriptor sets."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111019
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoegl: simplify loop
Eric Engestrom [Thu, 13 Dec 2018 17:35:28 +0000 (17:35 +0000)]
egl: simplify loop

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Sagar Ghuge<sagar.ghuge@intel.com>
5 years agosparc: Reuse m_vector_asm.h.
Eric Anholt [Thu, 20 Jun 2019 17:35:32 +0000 (10:35 -0700)]
sparc: Reuse m_vector_asm.h.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agomesa: Enable asm unconditionally, now that gen_matypes is gone.
Eric Anholt [Thu, 20 Jun 2019 17:27:28 +0000 (10:27 -0700)]
mesa: Enable asm unconditionally, now that gen_matypes is gone.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agomesa: Replace gen_matypes with a simple header for V4F/mat layout.
Eric Anholt [Thu, 20 Jun 2019 17:18:41 +0000 (10:18 -0700)]
mesa: Replace gen_matypes with a simple header for V4F/mat layout.

We can greatly simplify our builds by just hardcoding GLvector4f and
GLmatrix's layouts.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agomatypes: Drop some unused defines.
Eric Anholt [Thu, 20 Jun 2019 16:18:20 +0000 (09:18 -0700)]
matypes: Drop some unused defines.

Most of these haven't been used since the conversion from checked-in
matypes to generation.  By cutting down the generated contents, this
should clarify why the file is generated: we need
architecture-specific offsets to the V4F fields in the asm that uses
it.

v2: Keep matrix offsets to prevent x86 build breakage..

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agomeson: drop duplicate source & inc_dir
Eric Engestrom [Sat, 29 Jun 2019 23:14:47 +0000 (00:14 +0100)]
meson: drop duplicate source & inc_dir

These two are already pulled from `idep_vulkan_util_headers`.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agoswrast: simplify function pointer calls
Eric Engestrom [Sat, 29 Jun 2019 23:01:15 +0000 (00:01 +0100)]
swrast: simplify function pointer calls

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
5 years agoegl/wayland: use bitset.h for `formats` bit set
Eric Engestrom [Tue, 27 Nov 2018 12:27:45 +0000 (12:27 +0000)]
egl/wayland: use bitset.h for `formats` bit set

Currently only 7 formats are supported, but we don't want the 16 limit
(it's an `unsigned`) to hit us by surprise :]

Let's use bitset.h's BITSET magic to allow us to have any number of
formats, with a static assert to make sure we don't forget to update it.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agointel/tools: Add assembler unit tests for ROL/ROR instructions
Sagar Ghuge [Tue, 4 Jun 2019 20:05:20 +0000 (13:05 -0700)]
intel/tools: Add assembler unit tests for ROL/ROR instructions

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/tools: Add ROL/ROR support in assembler
Sagar Ghuge [Tue, 4 Jun 2019 20:04:49 +0000 (13:04 -0700)]
intel/tools: Add ROL/ROR support in assembler

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agonir: Add lower_rotate flag and set to true in all drivers
Sagar Ghuge [Tue, 4 Jun 2019 00:11:57 +0000 (17:11 -0700)]
nir: Add lower_rotate flag and set to true in all drivers

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Suggested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/compiler: Emit ROR and ROL instruction
Sagar Ghuge [Thu, 30 May 2019 21:14:52 +0000 (14:14 -0700)]
intel/compiler: Emit ROR and ROL instruction

v2: Reorder patch (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agonir: Add optimization to use ROR/ROL instructions
Sagar Ghuge [Thu, 30 May 2019 21:15:51 +0000 (14:15 -0700)]
nir: Add optimization to use ROR/ROL instructions

v2: 1) Add more optimization rules for ROL/ROR (Matt Turner)
    2) Add lowering rules for ROL/ROR (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agonir: Add urol and uror opcodes
Sagar Ghuge [Thu, 30 May 2019 21:11:58 +0000 (14:11 -0700)]
nir: Add urol and uror opcodes

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/compiler: Enable the emission of ROR/ROL instructions
Sagar Ghuge [Wed, 29 May 2019 18:43:30 +0000 (11:43 -0700)]
intel/compiler: Enable the emission of ROR/ROL instructions

v2: 1) Drop changes for vec4 backend as on Gen11+ we don't support
       align16 mode (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agopanfrost: Implement instanced rendering
Alyssa Rosenzweig [Thu, 27 Jun 2019 21:13:10 +0000 (14:13 -0700)]
panfrost: Implement instanced rendering

We implement GLES3.0 instanced rendering with full support for instanced
arrays (via instance divisors). To do so, we use the new invocation
helpers to invoke a triplet of (1, vertex_count, instance_count), rather
than simply (1, vertex_count, 1). We rewrite the attribute handling code
into a new pan_instancing.c file which handles both the simple LINEAR
case for non-instanced as well as each of the new instancing cases:
MODULO (for per-vertex attributes), POT and NPOT divisors.

As a side effect, we rework how vertex buffers are handled, duplicating
them to be 1:1 with vertex descriptors to simplify instancing code paths
dramatically. This might be a performance regression, but this remains
to be seen; if so, we can always deduplicate later with some added logic
in pan_instancing.c

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/decode: Compute padded_num_vertices for MODULO
Alyssa Rosenzweig [Thu, 27 Jun 2019 17:18:22 +0000 (10:18 -0700)]
panfrost/decode: Compute padded_num_vertices for MODULO

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/midgard: Emit type appropriate ld_vary
Alyssa Rosenzweig [Fri, 28 Jun 2019 16:30:59 +0000 (09:30 -0700)]
panfrost/midgard: Emit type appropriate ld_vary

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/midgard: Add unsigned ld/st ops
Alyssa Rosenzweig [Fri, 28 Jun 2019 16:07:30 +0000 (09:07 -0700)]
panfrost/midgard: Add unsigned ld/st ops

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost/midgard: Use the appropriate ld_attr type
Alyssa Rosenzweig [Thu, 27 Jun 2019 22:33:07 +0000 (15:33 -0700)]
panfrost/midgard: Use the appropriate ld_attr type

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Implement dispatch helpers
Alyssa Rosenzweig [Thu, 27 Jun 2019 15:29:06 +0000 (08:29 -0700)]
panfrost: Implement dispatch helpers

Rather than open-coding workgroups_shift_* type fields, we include a
general routine for packing the vertex/tiler/compute descriptor based on
the provided dispatch parameters.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Remove ancient comment
Alyssa Rosenzweig [Thu, 27 Jun 2019 14:43:33 +0000 (07:43 -0700)]
panfrost: Remove ancient comment

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Extend software tiling to larger bpp
Alyssa Rosenzweig [Mon, 1 Jul 2019 14:39:22 +0000 (07:39 -0700)]
panfrost: Extend software tiling to larger bpp

Should not affect lima.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Rewrite u-interleaving code
Alyssa Rosenzweig [Tue, 25 Jun 2019 15:09:58 +0000 (08:09 -0700)]
panfrost: Rewrite u-interleaving code

Rather than using a magic lookup table with no explanations, let's add
liberal comments to the code to explain what this tiling scheme is and
how to encode/decode it efficiently.

It's not so mysterious after all -- just reordering bits with some XORs
thrown in.

v2: Correct copyright identifier. Fix spelling error. Switch space_4 to
a LUT. Fix comment typo. Use LUT instead of space_x tricks. Fallback on
generic rather than split up unaligned writes.

v3: Correct stride order (fixes crash loading). Correct coordinate
system mishap.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Andreas Baierl <ichgeh@imkreisrum.de>
5 years agofreedreno: update generated registers
Rob Clark [Mon, 1 Jul 2019 13:14:41 +0000 (06:14 -0700)]
freedreno: update generated registers

Corrects the a3xx texconst state for TILE_MODE.

Signed-off-by: Rob Clark <robdclark@chromium.org>
5 years agoradv: rework how the number of VGPRs is computed
Samuel Pitoiset [Wed, 26 Jun 2019 13:11:03 +0000 (15:11 +0200)]
radv: rework how the number of VGPRs is computed

Just a cleanup, it shouldn't change anything.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: gather if a vertex shaders needs the instance ID
Samuel Pitoiset [Wed, 26 Jun 2019 13:11:02 +0000 (15:11 +0200)]
radv: gather if a vertex shaders needs the instance ID

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: fix decompressing DCC levels with compute
Samuel Pitoiset [Thu, 27 Jun 2019 13:06:17 +0000 (15:06 +0200)]
radv: fix decompressing DCC levels with compute

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: the number of VGPR_COMP_CNT for GS is expected to be 0 on GFX8
Samuel Pitoiset [Wed, 26 Jun 2019 13:11:01 +0000 (15:11 +0200)]
radv: the number of VGPR_COMP_CNT for GS is expected to be 0 on GFX8

Just move around the switch case. GFX9+ is handled below.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: reduce number of VGPRs for TESS_EVAL if primitive ID is not used
Samuel Pitoiset [Wed, 26 Jun 2019 13:11:00 +0000 (15:11 +0200)]
radv: reduce number of VGPRs for TESS_EVAL if primitive ID is not used

We only need to 2.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: make sure to mark the image as compressed when clearing DCC levels
Samuel Pitoiset [Thu, 27 Jun 2019 13:06:15 +0000 (15:06 +0200)]
radv: make sure to mark the image as compressed when clearing DCC levels

Found while working on DCC for arrays.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agotargets/opencl: Add clangASTMatchers library as dependency
Michel Dänzer [Fri, 28 Jun 2019 09:07:39 +0000 (11:07 +0200)]
targets/opencl: Add clangASTMatchers library as dependency

Fixes link failure since clang r364424 "[clang/DIVar] Emit the flag for
params that have unmodified value", clangCodeGen depends on
clangASTMatchers now.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agoglsl/nir: Lower buffers using Binding instead of Names
Caio Marcelo de Oliveira Filho [Fri, 22 Mar 2019 04:47:02 +0000 (21:47 -0700)]
glsl/nir: Lower buffers using Binding instead of Names

When using ARB_gl_spirv, the block names are optional and the uniform
blocks are referred using Bindings instead.  Teach
gl_nir_lower_buffers to handle those.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoglspirv: Enable the new deref-base UBO/SSBO path on gl_spirv
Alejandro Piñeiro [Sat, 29 Dec 2018 14:08:50 +0000 (15:08 +0100)]
glspirv: Enable the new deref-base UBO/SSBO path on gl_spirv

Among other things, it supports arrays of arrays of UBO/SSBO (default
codepath doesn't).

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
v2: nir_address_format_vk_index_offset got renamed to
    nir_address_format_32bit_index_offset (after rebase against master)

v3: the ptr_type fields in spirv_to_nir_options got changed to be of
    type nir_address_format.

v4: remove phys_ssbo_addr_format and push_const_addr_format as they are
    not used by glspirv

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoi965: call to gl_nir_link_uniform_blocks
Alejandro Piñeiro [Thu, 30 Nov 2017 11:50:24 +0000 (12:50 +0100)]
i965: call to gl_nir_link_uniform_blocks

When using a SPIR-V shader. Note that needs to be done before linking
uniforms, so when creating the uniform storage entries, block_index
could be filled properly (among other things).

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoi965: use GLboolean for all brw_link_shader returns
Alejandro Piñeiro [Sat, 15 Sep 2018 14:42:58 +0000 (16:42 +0200)]
i965: use GLboolean for all brw_link_shader returns

The function had a mix of true/GL_TRUE and false/GL_FALSE
returns. Using GL_TRUE/GL_FALSE as the function returns a GLboolean.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/linker: update already processed uniforms search for UBOs/SSBOs
Alejandro Piñeiro [Sat, 1 Sep 2018 11:18:24 +0000 (13:18 +0200)]
nir/linker: update already processed uniforms search for UBOs/SSBOs

Until now, we were using the uniform explicit location to check if the
current nir variable was already processed while adding entries on the
uniform storage. But for UBOs/SSBOs, entries are added too but we lack
a explicit location.

For those we need to rely on the UBO/SSBO binding and the unifor
storage block_index. In that case several uniforms would need to be
updated at once.

v2: (from Timothy review)
   * Improve wording and fix typos of some long comments.
   * Rename update_uniform_storage for mark_stage_as_active

v3: (from cmarcelo review)
   * Fixed some comment typos

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/linker: fill up uniform_storage with explicit data
Alejandro Piñeiro [Fri, 31 Aug 2018 10:20:49 +0000 (12:20 +0200)]
nir/linker: fill up uniform_storage with explicit data

Specifically, offset, stride (coming from arrays or matrices) and
row_major.

On GLSL, most of that info is computed using the layout qualifier, but
on ARB_gl_spirv they are explicit, and for Mesa, included on the
glsl_type.

From ARB_gl_spirv spec:

   "Mapping of layouts

      std140/std430 -> explicit *Offset*, *ArrayStride*, and
                       *MatrixStride* Decoration on struct members""

    "7.6.2.spv SPIR-V Uniform Offsets and Strides

    The SPIR-V decorations *GLSLShared* or *GLSLPacked* must not be
    used. A variable in the *Uniform* Storage Class decorated as a
    *Block* must be explicitly laid out using the *Offset*,
    *ArrayStride*, and *MatrixStride* decorations"

For offset we needed to include the parent and index_in_parent while
processing the type, as the offset is maintained on glsl_struct_field
of the parent type, not on the type itself.

v2: Fix the default values for MATRIX_STRIDE, ARRAY_STRIDE and
    ROW_MAJOR when the variable is not backed by a buffer object
    (Antia Puentes).

v3: Update after Jason series "SPIR-V: Use NIR deref instructions for
    UBO/SSBO access" that included just one explicit stride, instead
    of a previous patch we wrote that had matrix_stride and
    array_stride (Alejandro)

Signed-off-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/linker: use only the array element type for array of ssbo/ubo
Alejandro Piñeiro [Wed, 29 Aug 2018 14:24:12 +0000 (16:24 +0200)]
nir/linker: use only the array element type for array of ssbo/ubo

For this interfaces, the inner members are added only once as uniforms
or resources, in opposite to other cases, like a uniform array of
structs.

For those guessing why a issue (16) from ARB_program_interface_query
was used, instead of a quote of the core spec: The core spec is not
really clear about how members of arrays of blocks should be
enumerated.

On GLSL this was also problematic, specially when we were trying to
pass the 4.5 CTS tests. See commit "glsl: Fix program interface
queries relating to interface blocks"
(4c4d9e4f032d5753034361ee70aa88d16d3a04b4), as a reference. That one
also needed to rely on issue (16) to justify the change, pointing that
the core spec needs to be clarified.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/linker: fill is_shader_storage for uniforms
Alejandro Piñeiro [Mon, 12 Feb 2018 14:50:18 +0000 (15:50 +0100)]
nir/linker: fill is_shader_storage for uniforms

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir/linker: add gl_nir_link_uniform_blocks.c
Alejandro Piñeiro [Thu, 30 Nov 2017 11:01:52 +0000 (12:01 +0100)]
nir/linker: add gl_nir_link_uniform_blocks.c

Adding the ability to link uniform blocks and shader storage blocks
using NIR, intended for ARB_gl_spirv support. Among other things, this
linking needs to take into account that everything should work without
names, as they could be not present, while the GLSL IR uniform block
linking was wrote with the names on its core.

The other major difference compared with the GLSL IR linker is that we
don't deal with layouts. There are no references to std140, std430,
etc. Layouts are expressed through explicit offset, array stride and
matrix stride. That simplifies how the buffer size are computed. But
also means that we couldn't use the existing methods at glsl_types, so
we needed to implement new methods.

It is worth to note that this linking do a iteration over the
glsl_types, similarly to what the linking uniforms do. A possible
future improvement would be refactor both cases to try to share more
code that it sharing right now. On GLSL IR there are a class visitor,
specialized on each case, for that sharing. As adding a class visitor
on C would more complicated, for now we are just iterating on both.

Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Antia Puentes <apuentes@igalia.com>
v2: (from Timothy review)
   * Fix variable name convention
   * Stop to use _function_name convention
   * Don't use // for comments
   * "nir/linker: Keep track of the stages referencing an UBO/SSBO"
     squashed with this patch

v3: (from Caio review)
   * Don't delete the linked shader on failure
   * Use rzalloc_array to avoid some explicit initializations

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir_types: add glsl_type_is_leaf helper
Alejandro Piñeiro [Thu, 17 Jan 2019 12:20:54 +0000 (13:20 +0100)]
nir_types: add glsl_type_is_leaf helper

Helper used to know when a glsl_type is a leaf when iteraring through
a complex type. Note that GLSL IR linking also uses the concept of
leaf while doing the same iteration, although in that case it uses a
visitor. See link_uniform_blocks, process_array_leaf and others as
reference.

Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Antia Puentes <apuentes@igalia.com>
v2:
   * Moved from gl_nir_linker to nir_types, so it could be used on nir
     xfb gathering (Timothy Arceri)
   * Minor update after Timothy's series about record to struct
     renaming landed master.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoglsl/nir: add glsl_types::explicit_size plus nir C wrapper
Alejandro Piñeiro [Wed, 16 Jan 2019 15:50:29 +0000 (16:50 +0100)]
glsl/nir: add glsl_types::explicit_size plus nir C wrapper

While using SPIR-V shaders (ARB_gl_spirv), layout data is not implicit
to a specific value (std140, std430, etc) but explicitly included on
the type (explicit values for offset, stride and row_major).

So this method is equivalent to the existing std140_size and
std430_size, but using such explicit values.

Note that the value returned by this method is only valid if such data
is set, so when dealing with SPIR-V shaders.

v2: (all changes suggested by Jason Ekstrand)
   * Iterate through all struct members, instead of assume that fields
     are ordered by offset
   * Use else if
   * Take into account the case that explicit_stride > elem_size, to
     fine graine the final size on arrays and matrices
   * Handle different bit-sizes in general, not just 32 and 64.

v3: (change suggested by Caio Marcelo de Oliveira Filho)
   * fix up explicit_size() to consider interface types

Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Antia Puentes <apuentes@igalia.com>
Signed-off-by: Neil Roberts <nroberts@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agoglsl_types: add type::bit_size and glsl_base_type_bit_size helpers
Alejandro Piñeiro [Wed, 6 Feb 2019 17:11:19 +0000 (18:11 +0100)]
glsl_types: add type::bit_size and glsl_base_type_bit_size helpers

Note that the nir_types glsl_get_bit_size is not a wrapper of this
one, because for bools at the nir level, we want to return size 1, but
at the glsl_types we want to return 32.

v2: reuse the new method in order to simplify is_16bit and is_32bit
    helpers (Timothy)

v3: add a comment clarifying the difference between
    glsl_base_type_bit_size and glsl_get_bit_size.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir: add is_in_ubo/ssbo/block helpers
Alejandro Piñeiro [Tue, 11 Sep 2018 10:40:08 +0000 (12:40 +0200)]
nir: add is_in_ubo/ssbo/block helpers

Equivalent to the already existing ir_variable is_in_buffer_block and
is_in_shader_storage_block, adding the uniform buffer object one. I'm
using the short forms (ssbo, ubo) to avoid having method names too
long.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agospirv/nir: fill up nir variable info for ubos and ssbo
Alejandro Piñeiro [Mon, 4 Dec 2017 15:29:34 +0000 (16:29 +0100)]
spirv/nir: fill up nir variable info for ubos and ssbo

The data for some nir variables is only filled up for some specific
modes. We need now too for UBO/SSBO, as such info would be used when
linking for OpenGL (ARB_gl_spirv).

There is an existing comment just before that code (starts with XXX)
that points that binding still needs to be filled up for uniform
variables at that point, and that should be fixed, although it doesn't
specify why that's a problem or what would be the alternative. For now
doing the same for UBO/SSBO, and will hope that the future fixing is
done for all of them.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agospirv/nir: create nir variable for UBO/SSBO
Alejandro Piñeiro [Sun, 19 Nov 2017 08:47:36 +0000 (09:47 +0100)]
spirv/nir: create nir variable for UBO/SSBO

Providing nir variables for UBO/SSBO it is not required for Vulkan,
but it is needed for OpenGL (ARB_gl_spirv), like for example, to
gather info from the UBO/SSBO while linking.

In opposite with most cases where the nir variables is created, here
the type assigned is the full type (not just the bare type). This is
needed because while linking using the nir shader we need the explicit
layout info (explicit stride, explicit offset, row_major, etc).

Also, we need to assign an interface type, used also on the OpenGL
linker if it is a UBO/SSBO. See ir_variable::is_in_buffer_block as
example.

v2: assign interface_type to be the variable type, not need to be
    arrayness (Timothy)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agovl: Use CS composite shader only if TEX_LZ and DIV are supported
Gert Wollny [Thu, 13 Jun 2019 07:26:01 +0000 (09:26 +0200)]
vl: Use CS composite shader only if TEX_LZ and DIV are supported

Enable the compute shader copositer only when TEX_LZ is supported by the driver.

v2: Also check whether DIV is supported.

https://bugs.freedesktop.org/show_bug.cgi?id=110783

Fixes: 9364d66cb7f7
 gallium/auxiliary/vl: Add video compositor compute shader render

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agogallium: Add CAP for opcode DIV
Gert Wollny [Fri, 14 Jun 2019 14:54:24 +0000 (16:54 +0200)]
gallium: Add CAP for opcode DIV

Not all drivers support TGSI_OPCODE_DIV, so we should have a cap to be able
to check this.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agovl: replace DIV-ADD with MAD using inverse size
Gert Wollny [Wed, 12 Jun 2019 14:04:33 +0000 (16:04 +0200)]
vl: replace DIV-ADD with MAD using inverse size

Optimize the shader a bit by emitting MAD with the inverse size values
instead of DIV+ADD.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoetnaviv: blt: blit with the original format when possible
Jonathan Marek [Thu, 20 Jun 2019 15:53:05 +0000 (11:53 -0400)]
etnaviv: blt: blit with the original format when possible

This fixes BGR565 blit: currently BGRA444 is used for the blit, but with
swizzles from the original BGR565 format, so the 4 alpha bits are set to 1.
We can't just use the swizzle from the 'compatible' format, since there are
cases where BGR<->RGB swap needs to happen.

We can avoid all this trouble by using the original formats and only
falling back to the 'compatible' format when we need to.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: clear all bits for 24bpp depth without stencil
Jonathan Marek [Mon, 24 Jun 2019 21:05:06 +0000 (17:05 -0400)]
etnaviv: clear all bits for 24bpp depth without stencil

For fast clear to happen, all bits must be cleared.

This allows using fast clear for 24bpp depth without stencil.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agomesa: use binary search for MESA_EXTENSION_OVERRIDE
Eric Engestrom [Tue, 20 Nov 2018 09:57:41 +0000 (09:57 +0000)]
mesa: use binary search for MESA_EXTENSION_OVERRIDE

Not a hot path obviously, but the table still has 425 extensions, which
you can go through in just 9 steps with a binary search.

The table is already sorted, as required by other parts of the code and
enforced by mesa's `main-test`.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
5 years agogitlab-ci: test meson installation
Eric Engestrom [Fri, 28 Jun 2019 19:11:11 +0000 (20:11 +0100)]
gitlab-ci: test meson installation

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoanv: fix indentation
Eric Engestrom [Sat, 29 Jun 2019 13:00:03 +0000 (14:00 +0100)]
anv: fix indentation

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: fix typo
Eric Engestrom [Sat, 29 Jun 2019 12:59:37 +0000 (13:59 +0100)]
anv: fix typo

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: replace hard-coded platform list with vk.xml parse
Eric Engestrom [Sat, 29 Jun 2019 12:58:59 +0000 (13:58 +0100)]
anv: replace hard-coded platform list with vk.xml parse

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoandroid: fix typo LOCAL_EXPORT_C_INCLUDES
Chih-Wei Huang [Mon, 24 Jun 2019 06:54:02 +0000 (06:54 +0000)]
android: fix typo LOCAL_EXPORT_C_INCLUDES

Should be LOCAL_EXPORT_C_INCLUDE_DIRS.

Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Tested-by: Mauro Rossi <issor.oruam@gmail.com>
5 years agoandroid: virgl: fix generated virgl_driinfo.h building rules
Mauro Rossi [Fri, 21 Jun 2019 17:50:45 +0000 (19:50 +0200)]
android: virgl: fix generated virgl_driinfo.h building rules

Changelog in Android makefile:
- Add LOCAL_MODULE_CLASS, intermediates and LOCAL_GENERATED_SOURCES
- Use LOCAL_EXPORT_C_INCLUDE_DIRS to export $(intermediates) path
- Move generated header rules before 'include $(BUILD_STATIC_LIBRARY)'

Fixes the following building error:

In file included from external/mesa/src/gallium/targets/dri/target.c:1:
external/mesa/src/gallium/auxiliary/target-helpers/drm_helper.h:257:16:
fatal error: 'virgl/virgl_driinfo.h' file not found
      #include "virgl/virgl_driinfo.h"
               ^~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: cf800998a ("virgl: Add driinfo file and tie it into the build")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Review-by: Chih-Wei Huang <cwhuang@linux.org.tw>
5 years agointel/compiler: don't use byte operands for src1 on ICL
Lionel Landwerlin [Wed, 19 Jun 2019 12:09:35 +0000 (05:09 -0700)]
intel/compiler: don't use byte operands for src1 on ICL

The simulator complains about using byte operands, we also have
documentation telling us.

Note that add operations on bytes seems to work fine on HW (like ADD).
Using dwords operands with CMP & SEL fixes the following tests :

   dEQP-VK.spirv_assembly.type.vec*.i8.*

v2: Drop the GLK changes (Matt)
    Add validator tests (Matt)

v3: Drop GLK ref (Matt)
    Don't mix float/integer in MAD (Matt)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> (v1)
Reviewed-by: Matt Turner <mattst88@gmail.com>
BSpec: 3017
Cc: <mesa-stable@lists.freedesktop.org>
5 years agoegl: Enable eglGetPlatformDisplay on Android Platform
renchenglei [Fri, 28 Jun 2019 07:21:08 +0000 (15:21 +0800)]
egl: Enable eglGetPlatformDisplay on Android Platform

This helps to add eglGetPlatformDisplay support on Android
Platform.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agonir/serach: Increase maximum commutative expressions from 4 to 8
Ian Romanick [Mon, 24 Jun 2019 22:30:35 +0000 (15:30 -0700)]
nir/serach: Increase maximum commutative expressions from 4 to 8

No shader-db change on any Intel platform.  No shader-db run-time
difference on a certain 36-core / 72-thread system at 95% confidence
(n=20).

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agonir/algebraic: Don't mark expression with duplicate sources as commutative
Ian Romanick [Mon, 24 Jun 2019 23:00:29 +0000 (16:00 -0700)]
nir/algebraic: Don't mark expression with duplicate sources as commutative

There is no reason to mark the fmul in the expression

    ('fmul', ('fadd', a, b), ('fadd', a, b))

as commutative.  If a source of an instruction doesn't match one of the
('fadd', a, b) patterns, it won't match the other either.

This change is enough to make this pattern work:

    ('~fadd@32', ('fmul', ('fadd', 1.0, ('fneg', a)),
                          ('fadd', 1.0, ('fneg', a))),
                 ('fmul', ('flrp', a, 1.0, a), b))

This pattern has 5 commutative expressions (versus a limit of 4), but
the first fmul does not need to be commutative.

No shader-db change on any Intel platform.  No shader-db run-time
difference on a certain 36-core / 72-thread system at 95% confidence
(n=20).

There are more subpatterns that could be marked as non-commutative, but
detecting these is more challenging.  For example, this fadd:

    ('fadd', ('fmul', a, b), ('fmul', a, c))

The first fadd:

    ('fmul', ('fadd', a, b), ('fadd', a, b))

And this fadd:

    ('flt', ('fadd', a, b), 0.0)

This last case may be easier to detect.  If all sources are variables
and they are the only instances of those variables, then the pattern can
be marked as non-commutative.  It's probably not worth the effort now,
but if we end up with some patterns that bump up on the limit again, it
may be worth revisiting.

v2: Update the comment about the explicit "len(self.sources)" check to
be more clear about why it is necessary.  Requested by Connor.  Many
Python fixes style / idom fixes suggested by Dylan.  Add missing (!!!)
opcode check in Expression::__eq__ method.  This bug is the reason the
expected number of commutative expressions in the bitfield_reverse
pattern changed from 61 to 45 in the first version of this patch.

v3: Use all() in Expression::__eq__ method.  Suggested by Connor.
Revert away from using __eq__ overloads.  The "equality" implementation
of Constant and Variable needed for commutativity pruning is weaker than
the one needed for propagating and validating bit sizes.  Using actual
equality caused the pruning to fail for my ('fmul', ('fadd', 1, a),
('fadd', 1, a)) case.  I changed the name to "equivalent" rather than
the previous "same_as" to further differentiate it from __eq__.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agonir/search: Log Boolean constants instead of asserting
Ian Romanick [Mon, 24 Jun 2019 21:49:17 +0000 (14:49 -0700)]
nir/search: Log Boolean constants instead of asserting

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agonir/algebraic: Fail build when too many commutative expressions are used
Ian Romanick [Mon, 24 Jun 2019 22:12:56 +0000 (15:12 -0700)]
nir/algebraic: Fail build when too many commutative expressions are used

Search patterns that are expected to have too many (e.g., the giant
bitfield_reverse pattern) can be added to a white list.

This would have saved me a few hours debugging. :(

v2: Implement the expected-failure annotation as a property of the
search-replace pattern instead of as a property of the whole list of
patterns.  Suggested by Connor.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agonir/algebraic: Fix whitespace error
Ian Romanick [Tue, 25 Jun 2019 17:04:21 +0000 (10:04 -0700)]
nir/algebraic: Fix whitespace error

Trivial

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agopanfrost: Allow R11G11B10 rendering
Alyssa Rosenzweig [Sat, 29 Jun 2019 01:47:10 +0000 (18:47 -0700)]
panfrost: Allow R11G11B10 rendering

Doesn't fully work yet, but better than crashing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Default to util_pack_color for clears
Alyssa Rosenzweig [Sat, 29 Jun 2019 01:46:43 +0000 (18:46 -0700)]
panfrost: Default to util_pack_color for clears

This might help as we bringup more render-target formats.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agointel/vec4: Try both sources as candidates for being immediates
Ian Romanick [Wed, 26 Jun 2019 20:36:17 +0000 (13:36 -0700)]
intel/vec4: Try both sources as candidates for being immediates

For some reason, when I first wrote try_immediate_source, I thought the
sources had already been ordered so that the immediate value was the
second source.  That's rubbish.  The generator assumes *neither* source
is immediate, and it relies on later copy/constant propagation passes to
do the reordering.

For this reason, the changes to try_immediate_source have to go to some
efforts to reorder the operands and tell the caller when it reordered
them.  The generator for comparison instructions uses this to determine
when the comparison needs to change (e.g., from GT to LT).

No changes on any Gen8 or later platform because those platforms do not
use the vec4 backend.

Haswell
total instructions in shared programs: 13484431 -> 13480500 (-0.03%)
instructions in affected programs: 441138 -> 437207 (-0.89%)
helped: 1883
HURT: 0
helped stats (abs) min: 1 max: 49 x̄: 2.09 x̃: 1
helped stats (rel) min: 0.07% max: 8.91% x̄: 1.10% x̃: 0.90%
95% mean confidence interval for instructions value: -2.19 -1.98
95% mean confidence interval for instructions %-change: -1.14% -1.06%
Instructions are helped.

total cycles in shared programs: 376420286 -> 376406400 (<.01%)
cycles in affected programs: 15995668 -> 15981782 (-0.09%)
helped: 1692
HURT: 219
helped stats (abs) min: 2 max: 764 x̄: 13.78 x̃: 4
helped stats (rel) min: <.01% max: 9.69% x̄: 0.69% x̃: 0.35%
HURT stats (abs)   min: 2 max: 516 x̄: 43.09 x̃: 22
HURT stats (rel)   min: 0.02% max: 12.09% x̄: 2.30% x̃: 1.13%
95% mean confidence interval for cycles value: -9.70 -4.83
95% mean confidence interval for cycles %-change: -0.42% -0.28%
Cycles are helped.

total spills in shared programs: 23166 -> 23158 (-0.03%)
spills in affected programs: 66 -> 58 (-12.12%)
helped: 2
HURT: 0

total fills in shared programs: 34592 -> 34580 (-0.03%)
fills in affected programs: 75 -> 63 (-16.00%)
helped: 2
HURT: 0

Ivy Bridge
total instructions in shared programs: 12051590 -> 12048513 (-0.03%)
instructions in affected programs: 355911 -> 352834 (-0.86%)
helped: 1481
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 2.08 x̃: 1
helped stats (rel) min: 0.07% max: 4.92% x̄: 1.08% x̃: 0.90%
95% mean confidence interval for instructions value: -2.17 -1.98
95% mean confidence interval for instructions %-change: -1.12% -1.04%
Instructions are helped.

total cycles in shared programs: 180319624 -> 180307642 (<.01%)
cycles in affected programs: 15591028 -> 15579046 (-0.08%)
helped: 1340
HURT: 174
helped stats (abs) min: 2 max: 764 x̄: 14.19 x̃: 2
helped stats (rel) min: <.01% max: 8.68% x̄: 0.64% x̃: 0.32%
HURT stats (abs)   min: 2 max: 518 x̄: 40.41 x̃: 14
HURT stats (rel)   min: 0.02% max: 8.37% x̄: 1.59% x̃: 0.67%
95% mean confidence interval for cycles value: -10.85 -4.97
95% mean confidence interval for cycles %-change: -0.45% -0.31%
Cycles are helped.

All Gen6 and earlier platforms had simlar results. (Sandy Bridge shown)
total instructions in shared programs: 10863159 -> 10861462 (-0.02%)
instructions in affected programs: 157839 -> 156142 (-1.08%)
helped: 715
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 2.37 x̃: 2
helped stats (rel) min: 0.23% max: 4.33% x̄: 1.07% x̃: 0.85%
95% mean confidence interval for instructions value: -2.53 -2.21
95% mean confidence interval for instructions %-change: -1.13% -1.02%
Instructions are helped.

total cycles in shared programs: 153957782 -> 153948778 (<.01%)
cycles in affected programs: 3171648 -> 3162644 (-0.28%)
helped: 696
HURT: 62
helped stats (abs) min: 2 max: 390 x̄: 15.72 x̃: 4
helped stats (rel) min: 0.02% max: 10.57% x̄: 0.57% x̃: 0.12%
HURT stats (abs)   min: 2 max: 300 x̄: 31.29 x̃: 2
HURT stats (rel)   min: 0.11% max: 7.23% x̄: 0.83% x̃: 0.34%
95% mean confidence interval for cycles value: -15.65 -8.11
95% mean confidence interval for cycles %-change: -0.56% -0.36%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/vec4: Try immediate sources for dot products too
Ian Romanick [Wed, 26 Jun 2019 01:47:18 +0000 (18:47 -0700)]
intel/vec4: Try immediate sources for dot products too

No changes on any Gen8 or later platform because those platforms do not
use the vec4 backend.

All Haswell and earlier platforms has similar results. (Haswell shown)
total instructions in shared programs: 13484467 -> 13484431 (<.01%)
instructions in affected programs: 8540 -> 8504 (-0.42%)
helped: 33
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.09 x̃: 1
helped stats (rel) min: 0.31% max: 1.53% x̄: 0.49% x̃: 0.35%
95% mean confidence interval for instructions value: -1.19 -0.99
95% mean confidence interval for instructions %-change: -0.60% -0.38%
Instructions are helped.

total cycles in shared programs: 376420572 -> 376420286 (<.01%)
cycles in affected programs: 56260 -> 55974 (-0.51%)
helped: 26
HURT: 5
helped stats (abs) min: 2 max: 204 x̄: 11.85 x̃: 2
helped stats (rel) min: 0.11% max: 3.08% x̄: 0.39% x̃: 0.13%
HURT stats (abs)   min: 2 max: 6 x̄: 4.40 x̃: 6
HURT stats (rel)   min: 0.03% max: 0.35% x̄: 0.24% x̃: 0.35%
95% mean confidence interval for cycles value: -22.91 4.45
95% mean confidence interval for cycles %-change: -0.56% -0.02%
Inconclusive result (value mean confidence interval includes 0).

Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/vec4: Try emitting non-scalar immediates
Ian Romanick [Wed, 26 Jun 2019 01:39:59 +0000 (18:39 -0700)]
intel/vec4: Try emitting non-scalar immediates

Sometimes an instruction has a vector as a source, but all of the
components have the same value.  For example,

    vec3 32 ssa_16 = load_const (1.0, 1.0, 1.0)
    ...
    vec3 32 ssa_82 = fadd ssa_16, -ssa_81.xyz

No changes on any Gen8 or later platform because those platforms do not
use the vec4 backend.

Haswell
total instructions in shared programs: 13487811 -> 13484467 (-0.02%)
instructions in affected programs: 421981 -> 418637 (-0.79%)
helped: 1859
HURT: 0
helped stats (abs) min: 1 max: 15 x̄: 1.80 x̃: 1
helped stats (rel) min: 0.04% max: 9.80% x̄: 1.04% x̃: 0.84%
95% mean confidence interval for instructions value: -1.85 -1.74
95% mean confidence interval for instructions %-change: -1.07% -1.00%
Instructions are helped.

total cycles in shared programs: 376423252 -> 376420572 (<.01%)
cycles in affected programs: 14800970 -> 14798290 (-0.02%)
helped: 1519
HURT: 329
helped stats (abs) min: 2 max: 462 x̄: 10.59 x̃: 4
helped stats (rel) min: 0.03% max: 16.73% x̄: 0.79% x̃: 0.36%
HURT stats (abs)   min: 2 max: 598 x̄: 40.74 x̃: 16
HURT stats (rel)   min: <.01% max: 10.32% x̄: 2.56% x̃: 0.98%
95% mean confidence interval for cycles value: -3.53 0.63
95% mean confidence interval for cycles %-change: -0.30% -0.09%
Inconclusive result (value mean confidence interval includes 0).

total fills in shared programs: 34601 -> 34592 (-0.03%)
fills in affected programs: 91 -> 82 (-9.89%)
helped: 9
HURT: 0

Ivy Bridge
total instructions in shared programs: 12053565 -> 12051626 (-0.02%)
instructions in affected programs: 298103 -> 296164 (-0.65%)
helped: 1228
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 1.58 x̃: 1
helped stats (rel) min: 0.04% max: 3.57% x̄: 0.91% x̃: 0.81%
95% mean confidence interval for instructions value: -1.63 -1.53
95% mean confidence interval for instructions %-change: -0.95% -0.88%
Instructions are helped.

total cycles in shared programs: 180322270 -> 180319922 (<.01%)
cycles in affected programs: 14123840 -> 14121492 (-0.02%)
helped: 1036
HURT: 195
helped stats (abs) min: 2 max: 462 x̄: 11.93 x̃: 2
helped stats (rel) min: 0.03% max: 14.05% x̄: 0.82% x̃: 0.35%
HURT stats (abs)   min: 2 max: 598 x̄: 51.33 x̃: 16
HURT stats (rel)   min: <.01% max: 9.68% x̄: 3.02% x̃: 0.72%
95% mean confidence interval for cycles value: -4.92 1.10
95% mean confidence interval for cycles %-change: -0.35% -0.07%
Inconclusive result (value mean confidence interval includes 0).

Sandy Bridge
total instructions in shared programs: 10864286 -> 10863189 (-0.01%)
instructions in affected programs: 159722 -> 158625 (-0.69%)
helped: 724
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 1.52 x̃: 1
helped stats (rel) min: 0.10% max: 2.91% x̄: 0.79% x̃: 0.62%
95% mean confidence interval for instructions value: -1.58 -1.46
95% mean confidence interval for instructions %-change: -0.82% -0.75%
Instructions are helped.

total cycles in shared programs: 153967938 -> 153957926 (<.01%)
cycles in affected programs: 1923186 -> 1913174 (-0.52%)
helped: 654
HURT: 56
helped stats (abs) min: 2 max: 170 x̄: 20.00 x̃: 4
helped stats (rel) min: 0.03% max: 11.82% x̄: 0.89% x̃: 0.18%
HURT stats (abs)   min: 2 max: 390 x̄: 54.75 x̃: 32
HURT stats (rel)   min: 0.05% max: 6.92% x̄: 3.09% x̃: 2.92%
95% mean confidence interval for cycles value: -17.42 -10.78
95% mean confidence interval for cycles %-change: -0.76% -0.40%
Cycles are helped.

Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8142677 -> 8141721 (-0.01%)
instructions in affected programs: 139511 -> 138555 (-0.69%)
helped: 588
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 1.63 x̃: 1
helped stats (rel) min: 0.21% max: 4.39% x̄: 0.84% x̃: 0.46%
95% mean confidence interval for instructions value: -1.70 -1.55
95% mean confidence interval for instructions %-change: -0.89% -0.78%
Instructions are helped.

total cycles in shared programs: 188549394 -> 188547676 (<.01%)
cycles in affected programs: 3171960 -> 3170242 (-0.05%)
helped: 527
HURT: 0
helped stats (abs) min: 2 max: 18 x̄: 3.26 x̃: 2
helped stats (rel) min: <.01% max: 0.80% x̄: 0.08% x̃: 0.06%
95% mean confidence interval for cycles value: -3.49 -3.03
95% mean confidence interval for cycles %-change: -0.09% -0.07%
Cycles are helped.

Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agonir: Fix lowering of bitfield_insert to shifts.
Eric Anholt [Thu, 27 Jun 2019 21:29:37 +0000 (14:29 -0700)]
nir: Fix lowering of bitfield_insert to shifts.

The bfi/bfm behavior change replaced the bfi/bfm usage in
lower_bitfield_insert_to_shifts with actual shifts like the name says,
but it failed to handle the offset=0, bits==32 case in the new
lowering.

v2: Use 31 < bits instead of bits == 32, to get the 31 < (iand bits,
    31) -> false optimization.

Fixes regressions in dEQP-GLES31.*bitfield_insert* on freedreno.

Fixes: 165b7f3a4487 ("nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec.")
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoRevert "meson: Add support for using cmake for finding LLVM"
Dylan Baker [Fri, 28 Jun 2019 23:36:38 +0000 (16:36 -0700)]
Revert "meson: Add support for using cmake for finding LLVM"

This reverts commit 5157a4276500c77e2210e853b262be1d1b30aedf.

There is a meson bug that causes llvm to always be statically linked,
which is obviously not what we want. I haven't had time to look into it
yet, but for now let's just revert it.

5 years agoRevert "meson: try to use cmake as a finder for clang"
Dylan Baker [Fri, 28 Jun 2019 23:36:27 +0000 (16:36 -0700)]
Revert "meson: try to use cmake as a finder for clang"

This reverts commit 0ba0c0c15c633a5a3b7a4651a743f800f30bcbf6.

5 years agomesa: stop trying new filenames if the filename existing is not the issue
Eric Engestrom [Sat, 22 Jun 2019 12:49:02 +0000 (13:49 +0100)]
mesa: stop trying new filenames if the filename existing is not the issue

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomesa: use os_file_create_unique()
Eric Engestrom [Wed, 12 Jun 2019 14:46:11 +0000 (15:46 +0100)]
mesa: use os_file_create_unique()

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoutil: add os_file_create_unique()
Eric Engestrom [Mon, 3 Jun 2019 16:51:37 +0000 (17:51 +0100)]
util: add os_file_create_unique()

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agopanfrost: Disable DXT-style texture compression
Alyssa Rosenzweig [Wed, 26 Jun 2019 23:38:50 +0000 (16:38 -0700)]
panfrost: Disable DXT-style texture compression

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopanfrost: Dump unknown formats before aborting
Alyssa Rosenzweig [Wed, 26 Jun 2019 23:36:17 +0000 (16:36 -0700)]
panfrost: Dump unknown formats before aborting

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>