mesa.git
5 years agopan/midgard: Fix component count handling for ldst
Alyssa Rosenzweig [Fri, 27 Sep 2019 21:07:30 +0000 (17:07 -0400)]
pan/midgard: Fix component count handling for ldst

It's not based on the writemask and it can't be inferred; it's just
intrinsic to the op itself.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agopan/midgard: Add missing parans in SWIZZLE definition
Alyssa Rosenzweig [Fri, 27 Sep 2019 21:07:10 +0000 (17:07 -0400)]
pan/midgard: Add missing parans in SWIZZLE definition

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
5 years agonouveau: set lower_sub = true
Daniel Schürmann [Fri, 27 Sep 2019 10:12:25 +0000 (12:12 +0200)]
nouveau: set lower_sub = true

Subtractions are already implemented as additions anyway.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agov3d: Enable the late algebraic optimizations to get real subs.
Eric Anholt [Wed, 25 Sep 2019 18:56:06 +0000 (11:56 -0700)]
v3d: Enable the late algebraic optimizations to get real subs.

This worked better than my original v3d-local pass for just subs, and is a
huge win over not producing subs.

total instructions in shared programs: 6408469 -> 6167932 (-3.75%)
total threads in shared programs: 153784 -> 154104 (0.21%)
total uniforms in shared programs: 2157078 -> 1905823 (-11.65%)
total max-temps in shared programs: 904546 -> 895796 (-0.97%)
total spills in shared programs: 4959 -> 4993 (0.69%)
total fills in shared programs: 6558 -> 6670 (1.71%)
total sfu-stalls in shared programs: 25845 -> 25175 (-2.59%)
total inst-and-stalls in shared programs: 6434314 -> 6193107 (-3.75%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agoaco: call nir_opt_algebraic_late() exhaustively
Daniel Schürmann [Thu, 26 Sep 2019 10:08:13 +0000 (12:08 +0200)]
aco: call nir_opt_algebraic_late() exhaustively

57559 shaders in 28980 tests
Totals:
SGPRS: 2963407 -> 2959935 (-0.12 %)
VGPRS: 2014812 -> 2016328 (0.08 %)
Spilled SGPRs: 1077 -> 1077 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 10348 -> 10348 (0.00 %) dwords per thread
Code Size: 114545436 -> 114498084 (-0.04 %) bytes
LDS: 933 -> 933 (0.00 %) blocks
Max Waves: 375997 -> 375866 (-0.03 %)

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agoradv/aco: Don't lower subtractions
Daniel Schürmann [Wed, 25 Sep 2019 14:34:29 +0000 (16:34 +0200)]
radv/aco: Don't lower subtractions

40228 shaders in 20236 tests
Totals:
SGPRS: 2045512 -> 2046496 (0.05 %)
VGPRS: 1430856 -> 1430464 (-0.03 %)
Spilled SGPRs: 1077 -> 1077 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 10348 -> 10348 (0.00 %) dwords per thread
Code Size: 77202840 -> 77151832 (-0.07 %) bytes
LDS: 863 -> 863 (0.00 %) blocks
Max Waves: 260729 -> 260754 (0.01 %)

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agonir: Remove unnecessary subtraction optimizations
Daniel Schürmann [Wed, 25 Sep 2019 14:33:10 +0000 (16:33 +0200)]
nir: Remove unnecessary subtraction optimizations

These optimizations are already covered after lowering.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agonir: recombine nir_op_*sub when lower_sub = false
Daniel Schürmann [Wed, 25 Sep 2019 14:20:09 +0000 (16:20 +0200)]
nir: recombine nir_op_*sub when lower_sub = false

There are some optimizations which are only implemented for additions
and some optimizations which assume that subtractions have been lowered.
By lowering all subtractions first and later recombine for backends
which prefer this option, we don't have to implement them twice.

This patch also moves lower_negate to nir_opt_algebraic_late() to enable
these optimizations for backends which make use of it.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agofreedreno: Enable the nir_opt_algebraic_late() pass.
Daniel Schürmann [Fri, 27 Sep 2019 10:49:06 +0000 (12:49 +0200)]
freedreno: Enable the nir_opt_algebraic_late() pass.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agovc4: Enable the nir_opt_algebraic_late() pass.
Eric Anholt [Thu, 26 Sep 2019 18:03:46 +0000 (11:03 -0700)]
vc4: Enable the nir_opt_algebraic_late() pass.

Upcoming changes to sub optimization will make this pass required.  Over
the course of that series, we see uniforms +.46%, instructions -.24%
(seems like a fine tradeoff -- uniforms are 1/2 the size of instructions
as far as cache occupancy)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agogitlab-ci: Add test-container:arm64 to needs: for arm64 test jobs
Michel Dänzer [Wed, 18 Sep 2019 14:28:41 +0000 (16:28 +0200)]
gitlab-ci: Add test-container:arm64 to needs: for arm64 test jobs

Without this, it was theoretically possible for the jobs to run before
the docker image was ready.

v2:
* Use - list syntax instead of [] (Eric Engestrom)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agogitlab-ci: Add needs: for x86 buster docker image
Michel Dänzer [Wed, 18 Sep 2019 14:21:32 +0000 (16:21 +0200)]
gitlab-ci: Add needs: for x86 buster docker image

This allows most build jobs to run before the stretch or arm64 docker
images are ready.

v2:
* Use - list syntax instead of [] (Eric Engestrom)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agogitlab-ci: Declare needs: for stretch docker image
Michel Dänzer [Wed, 18 Sep 2019 14:17:01 +0000 (16:17 +0200)]
gitlab-ci: Declare needs: for stretch docker image

This allows the *-old-llvm jobs to run before the buster docker images
are ready.

v2:
* Use - list syntax instead of [] (Eric Engestrom)

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoscons: Fix MSYS2 Mingw-w64 build.
pal1000 [Fri, 1 Mar 2019 10:30:15 +0000 (12:30 +0200)]
scons: Fix MSYS2 Mingw-w64 build.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
This patch is based on https://github.com/msys2/MINGW-packages/blob/28e3f85e09b6947ea80036c49f6c38f1394f93ca/mingw-w64-mesa/link-ole32.patch but with tweaks to avoid MSVC build break when applied.

v2: Create Mingw platform alias pointing to windows host platform define to avoid spurious crosscompilation;

v3: Fix obviously wrong compiler flags for swr driver;

v4: Update original patch URL because it has been relocated;

v5: Don't bother patching autools stuff as it's not used by MSYS2 Mingw-w64 build and it's days are numbered anyway;

v6: After Mingw posix flag fix in 295851eb things are far simpler as we don't need more linking of uuid, ole32, version and shell32 than what is already in place.

5 years agoscons/windows: Support build with LLVM 9.
pal1000 [Fri, 6 Sep 2019 14:34:30 +0000 (17:34 +0300)]
scons/windows: Support build with LLVM 9.

As X86AsmPrinter component is gone, LLVMX86AsmPrinter got replaced
with LLVMRemarks, LLVMBitstreamReader and LLVMDebugInfoDWARF.

Tests done with llvm-config on both LLVM 8 and 9 indicate that
mcjit, bitwriter and x86asmprinter fully fit inside engine component.

On other platforms and with meson build mcdisassembler was used to replace
X86AsmPrinter but mcdisassembler also fully fits inside engine component
for LLVM>=8 according to same tests.

v2: Avoid duplicating code related to Mingw pthreads.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
On 19.1 this patch does not apply cleanly without 88eb2a1f

5 years agolima: set uniforms_address lower bits properly
Vasily Khoruzhick [Sat, 28 Sep 2019 16:57:55 +0000 (09:57 -0700)]
lima: set uniforms_address lower bits properly

Looks like blob uses following values for uniforms buffer:

0 for 8 bytes
1 for 16 bytes
2 for 24 bytes
2 for 32 bytes
3 for 40 bytes
3 for 48 bytes
3 for 56 bytes
3 for 64 bytes
4 for 72 bytes

It all looks like log2(size / 8) rounded up, so let's do the same.

Fixes: 931fc2a7b3f9("lima: do not set the PP uniforms address lowest bits")
Reviewed-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agoscons: add py3 support
Michel Zou [Sat, 28 Sep 2019 06:53:38 +0000 (08:53 +0200)]
scons: add py3 support

SCons 3.1 has moved to python 3, requiring this fix
to continue supporting scons builds.

Closes: #944
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Eric Engestrom <eric@engestrom.ch>
5 years agoandroid: aco: add support for libmesa_aco
Mauro Rossi [Sat, 21 Sep 2019 15:58:52 +0000 (17:58 +0200)]
android: aco: add support for libmesa_aco

Android building rules are added in src/amd/Android.compiler.mk
libmesa_aco static library is built conditionally to radeonsi
as done for vulkan.radv module

This will prevent Android build errors for non x86 systems

filter-out compiler/aco_instruction_selection_setup.cpp source,
as already included by compiler/aco_instruction_selection.cpp
and would cause several multiple definition linker errors

NOTE: libLLVM requires AMDGPU Disassembler to build radv with aco

Fixes: 93c8ebf ("aco: Initial commit of independent AMD compiler")
Fixes: a70a998 ("radv/aco: Setup alternate path in RADV to support the experimental ACO compiler")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
5 years agoandroid: compiler/nir: build nir_divergence_analysis.c
Mauro Rossi [Sat, 21 Sep 2019 15:48:52 +0000 (17:48 +0200)]
android: compiler/nir: build nir_divergence_analysis.c

Prerequisite to avoid following radv linking error happening with aco

FAILED: out/target/product/x86_64/obj_x86/SHARED_LIBRARIES/vulkan.radv_intermediates/LINKED/vulkan.radv.so
...
external/mesa/src/amd/compiler/aco_instruction_selection_setup.cpp:178:
error: undefined reference to 'nir_divergence_analysis'
clang.real: error: linker command failed with exit code 1 (use -v to see invocation)

Fixes: df86c5f ("nir: add divergence analysis pass.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
5 years agoandroid: aco: fix undefined template 'std::__1::array' build errors
Mauro Rossi [Sat, 21 Sep 2019 15:38:52 +0000 (17:38 +0200)]
android: aco: fix undefined template 'std::__1::array' build errors

Fixes a few building errors similar to the following:

In file included from external/mesa/src/amd/compiler/aco_instruction_selection.cpp:26:
In file included from external/libcxx/include/algorithm:639:
external/libcxx/include/utility:321:9:
error: implicit instantiation of undefined template 'std::__1::array<aco::Temp, 4>'
    _T2 second;
        ^

Fixes: 93c8ebf ("aco: Initial commit of independent AMD compiler")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
5 years agoetnaviv: nir: fix gl_FragDepth
Jonathan Marek [Thu, 12 Sep 2019 20:12:02 +0000 (16:12 -0400)]
etnaviv: nir: fix gl_FragDepth

Fixes the following piglit test: fragdepth_gles2 (for ETNA_MESA_DEBUG=nir)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: disable earlyZ when shader writes fragment depth
Jonathan Marek [Tue, 17 Sep 2019 11:49:46 +0000 (07:49 -0400)]
etnaviv: disable earlyZ when shader writes fragment depth

Fixes the following piglit test: fragdepth_gles2

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: nir: make lower_alu easier to follow
Jonathan Marek [Thu, 12 Sep 2019 17:51:32 +0000 (13:51 -0400)]
etnaviv: nir: make lower_alu easier to follow

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: remove extra allocation for shader code
Jonathan Marek [Thu, 12 Sep 2019 17:17:21 +0000 (13:17 -0400)]
etnaviv: remove extra allocation for shader code

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: nir: remove "options" struct
Jonathan Marek [Thu, 12 Sep 2019 17:14:20 +0000 (13:14 -0400)]
etnaviv: nir: remove "options" struct

It just makes thing more complicated for no reason.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: nir: use store_deref instead of store_output
Jonathan Marek [Thu, 12 Sep 2019 16:28:28 +0000 (12:28 -0400)]
etnaviv: nir: use store_deref instead of store_output

Allows some simplification.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: nir: add native integers (HALTI2+)
Jonathan Marek [Wed, 11 Sep 2019 20:59:31 +0000 (16:59 -0400)]
etnaviv: nir: add native integers (HALTI2+)

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoqetnaviv: nir: use new immediates when possible
Jonathan Marek [Thu, 12 Sep 2019 20:07:14 +0000 (16:07 -0400)]
qetnaviv: nir: use new immediates when possible

Note it can still be improved a bit:
* Use alu swizzle to determine if src is scalar
* Take into account new immediates in the multiple uniform src lowering

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: nir: set num_components for inputs/outputs
Jonathan Marek [Wed, 11 Sep 2019 20:45:05 +0000 (16:45 -0400)]
etnaviv: nir: set num_components for inputs/outputs

This can improve performance by allowing the LAST_VARYING_2X bit to be
set when possible (and possibility more benefits on HALTI5 where the
number of components is set for each varying).

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: nir: allocate contiguous components for LOAD destination
Jonathan Marek [Wed, 11 Sep 2019 18:29:10 +0000 (14:29 -0400)]
etnaviv: nir: allocate contiguous components for LOAD destination

LOAD starts reading into the first enabled destination component, and
doesn't skip disabled components, so we need to allocate a destination with
contiguous components.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agoetnaviv: nir: fix gl_FrontFacing
Jonathan Marek [Wed, 11 Sep 2019 17:42:43 +0000 (13:42 -0400)]
etnaviv: nir: fix gl_FrontFacing

Only invert front facing when glFrontFace is GL_CW.

Fixes following deqp test:
dEQP-GLES2.functional.shaders.builtin_variable.frontfacing

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agolima: do not set the PP uniforms address lowest bits
Icenowy Zheng [Thu, 26 Sep 2019 15:25:09 +0000 (23:25 +0800)]
lima: do not set the PP uniforms address lowest bits

The PP uniforms address register in render state is not a direct pointer
to the uniforms storage -- instead, it points to an one-item array, and
the array item is the real pointer to the uniforms storage.

This register reuses some of its LSBs as a size field. Currently the
size is set according to the length of the real uniforms storage.
However, as the register itself contains only a pointer to the one-item
array, the size field should be set to the length of the one-item array
and subtract it by 1, which means a fixed value of 0. That means we can
just omit it now.

Test shows this should be the correct approach to set this register.

Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agoglsl: disallow incompatible matrices multiplication
Andrii Simiklit [Tue, 10 Sep 2019 14:00:32 +0000 (17:00 +0300)]
glsl: disallow incompatible matrices multiplication

glsl 4.4 spec section '5.9 expressions':
"The operator is multiply (*), where both operands are matrices or one operand is a vector and the
 other a matrix. A right vector operand is treated as a column vector and a left vector operand as a
 row vector. In all these cases, it is required that the number of columns of the left operand is equal
 to the number of rows of the right operand. Then, the multiply (*) operation does a linear
 algebraic multiply, yielding an object that has the same number of rows as the left operand and the
 same number of columns as the right operand. Section 5.10 “Vector and Matrix Operations”
 explains in more detail how vectors and matrices are operated on."

This fix disallows a multiplication of incompatible matrices like:
mat4x3(..) * mat4x3(..)
mat4x2(..) * mat4x2(..)
mat3x2(..) * mat3x2(..)
....

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111664
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
5 years agoturnip: Fix failure behavior of vkCreateGraphicsPipelines.
Eric Anholt [Thu, 19 Sep 2019 18:09:46 +0000 (11:09 -0700)]
turnip: Fix failure behavior of vkCreateGraphicsPipelines.

According to the 1.1.123 spec:

    "The implementation will attempt to create all pipelines, and only
     return VK_NULL_HANDLE values for those that actually failed."

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agoturnip: Silence compiler warning about uninit pipeline.
Eric Anholt [Thu, 19 Sep 2019 18:08:25 +0000 (11:08 -0700)]
turnip: Silence compiler warning about uninit pipeline.

The code was fine as far as I see, but the warning was irritating.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agoturnip: Add a .editorconfig and .dir-locals.el
Eric Anholt [Thu, 19 Sep 2019 18:03:35 +0000 (11:03 -0700)]
turnip: Add a .editorconfig and .dir-locals.el

I was inheriting the one from src/freedreno with funny tabs, while
this driver is written with normal Mesa 3-space indents.
Unfortunately I have to add both files, because I use emacs and emacs
prefers .dir-locals to .editorconfig :(

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agoshader_enums: Move MAX_DRAW_BUFFERS to this file.
Eric Anholt [Thu, 19 Sep 2019 17:54:08 +0000 (10:54 -0700)]
shader_enums: Move MAX_DRAW_BUFFERS to this file.

We include shader_enums.h from freedreno's compiler for both GL and
Vulkan, and the main/config.h include resulted in polluting the
namespace with things like MAX_VIEWPORTS that other Vulkan drivers use
as their driver-specific maximums.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agointel/fs: Fix fs_inst::flags_read for ANY/ALL predicates
Jason Ekstrand [Tue, 24 Sep 2019 22:06:12 +0000 (17:06 -0500)]
intel/fs: Fix fs_inst::flags_read for ANY/ALL predicates

Without this, we were DCEing flag writes because we didn't think their
results were used because we didn't understand that an ANY32 predicate
actually read all the flags.

Fixes: df1aec763eb "i965/fs: Define methods to calculate the flag..."
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agoetnaviv: support ARB_framebuffer_object
Christian Gmeiner [Fri, 27 Sep 2019 08:39:30 +0000 (10:39 +0200)]
etnaviv: support ARB_framebuffer_object

Passes most of piglit's tests regarding arb_framebuffer_object
and unlocks some more piglit tests.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
5 years agoetnaviv: etna_resource_copy_region(..): drop assert
Christian Gmeiner [Fri, 27 Sep 2019 11:08:24 +0000 (13:08 +0200)]
etnaviv: etna_resource_copy_region(..): drop assert

We are using util_resource_copy_region(..) as fallback which supports
different formats for src and dst. Improves the experience when running
deqp or piglit with a debug build.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
5 years agomeson: Link xvmc with libxv
Dylan Baker [Thu, 26 Sep 2019 22:42:59 +0000 (15:42 -0700)]
meson: Link xvmc with libxv

Prior to xvmc 1.0.12 libxvmc incorrectly required libxv, but that was
fixed. This results in compilation failures for the gallium xvmc tracker
and tools. This patch fixes that by explicitly linking to libxv.

Fixes: 22a817af8a89eb3c762fc3e07b443a3ce37d7416
       ("meson: build gallium xvmc state tracker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1844
Reviewed-by: Adam Jackson <ajax@redhat.com>
5 years agomeson: Try finding libxvmcw via pkg-config before using find_library
Dylan Baker [Thu, 26 Sep 2019 22:19:54 +0000 (15:19 -0700)]
meson: Try finding libxvmcw via pkg-config before using find_library

This fixes cross compiling issues, because pkg-config is less likely to
get the wrong libs.

v2: - Fix typo in comment

Fixes: 22a817af8a89eb3c762fc3e07b443a3ce37d7416
       ("meson: build gallium xvmc state tracker")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/939
Reviewed-by: Adam Jackson <ajax@redhat.com>
5 years agodrisw: Fix shared memory leak on drawable resize
Andreas Gottschling [Fri, 27 Sep 2019 16:02:06 +0000 (12:02 -0400)]
drisw: Fix shared memory leak on drawable resize

XDestroyImage will mark the segment as to-be-destroyed, but it will
persist until we detach it, and we weren't doing so.

Cc: mesa-stable@lists.freedesktop.org
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/121
Reviewed-by: Adam Jackson <ajax@redhat.com>
5 years agodrisw: Fix and simplify drawable setup
Adam Jackson [Thu, 26 Sep 2019 19:43:46 +0000 (15:43 -0400)]
drisw: Fix and simplify drawable setup

We don't want to require a visual for the drawable, because there exist
fbconfigs that don't correspond to any visual (say a 565 pixmap|pbuffer
config on a depth-24 display). Fortunately, we don't need one either.
Passing the visual to XCreateImage serves only to fill in the XImage's
{red,green,blue}_mask fields, which libX11 itself never uses, they exist
only for the client's convenience, and we don't care. And we already
have the drawable depth in glx_config::rgbBits. So replace the
XVisualInfo field in the drawable private with a pointer to the
glx_config.

Having done that driswCreateGCs becomes trivial, so inline it into its
caller.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1194
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agodrisw: Simplify GC setup
Adam Jackson [Thu, 26 Sep 2019 18:42:16 +0000 (14:42 -0400)]
drisw: Simplify GC setup

There's no reason to have two GCs here. The only difference between
them is that swapgc would generate graphics exposures, except we only
ever use this GC for PutImage, and PutImage doesn't generate graphics
exposures. We also don't need to explicitly ChangeGC to GXCopy, because
that's the default.

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: Add todo for d24_s8 copies
Bas Nieuwenhuizen [Fri, 20 Sep 2019 13:57:02 +0000 (15:57 +0200)]
turnip: Add todo for d24_s8 copies

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: Disallow NPoT formats.
Bas Nieuwenhuizen [Thu, 19 Sep 2019 21:34:36 +0000 (23:34 +0200)]
turnip: Disallow NPoT formats.

Copying is a mess for these formats for now.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: Always use UINT formats for copies.
Bas Nieuwenhuizen [Fri, 20 Sep 2019 11:49:44 +0000 (13:49 +0200)]
turnip: Always use UINT formats for copies.

Looks like r16_unorm might have precision issues.

dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.color.r16_unorm.r16_unorm.general_general

fails, but the dumped images in the xml are the same so
I'd guess the low bits are the issue.

r8_unorm and r16_uint work.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: Add image->image blitting.
Bas Nieuwenhuizen [Thu, 19 Sep 2019 21:29:50 +0000 (23:29 +0200)]
turnip: Add image->image blitting.

3D blits & format reinterpretation are still TBD.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoaco: don't remove the loop exec mask in transition_to_Exact()
Rhys Perry [Thu, 26 Sep 2019 14:38:09 +0000 (15:38 +0100)]
aco: don't remove the loop exec mask in transition_to_Exact()

No pipeline-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: set loop_info::has_discard for demotes
Rhys Perry [Thu, 26 Sep 2019 09:33:43 +0000 (10:33 +0100)]
aco: set loop_info::has_discard for demotes

We need the loop header phis for the outer exec masks. Needed for
dEQP-VK.glsl.demote.dynamic_loop_texture

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoiris: Only resolve for image levels/layers which are actually in use.
Kenneth Graunke [Mon, 11 Mar 2019 04:30:28 +0000 (21:30 -0700)]
iris: Only resolve for image levels/layers which are actually in use.

There's no need to resolve everything.

5 years agolima/ppir: add NIR pass to split varying loads
Vasily Khoruzhick [Mon, 23 Sep 2019 05:03:22 +0000 (22:03 -0700)]
lima/ppir: add NIR pass to split varying loads

NIR may emit a single instrinsic to load several packed varyings,
but that's suboptimal for Utgard PP for several reasons:
- varyings that are used as sampler inputs can be passed using
  pipeline register with increased precision
- we have small number of regs, so using a vec4 regs for storing
  two vec2 varyings increases reg pressure.

Add NIR pass to split a single load into several loads and utilize
it in lima.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agoradv: Fix L2 cache rinse programming.
Timur Kristóf [Thu, 26 Sep 2019 07:37:16 +0000 (09:37 +0200)]
radv: Fix L2 cache rinse programming.

According to radeonsi, GLM doesn't support WB alone, so
we have to set INV too when WB is set.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoturnip: emit texture and uniform state
Jonathan Marek [Thu, 26 Sep 2019 04:37:26 +0000 (00:37 -0400)]
turnip: emit texture and uniform state

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: add some shader information in pipeline state
Jonathan Marek [Thu, 26 Sep 2019 04:31:56 +0000 (00:31 -0400)]
turnip: add some shader information in pipeline state

This information is needed by texture/uniform descriptors.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: use nir_opt_copy_prop_vars
Jonathan Marek [Thu, 26 Sep 2019 04:30:22 +0000 (00:30 -0400)]
turnip: use nir_opt_copy_prop_vars

Avoids getting a "load_output" in a case like this:

   gl_Position = ubuf.MVP * ubuf.position[gl_VertexIndex];
   frag_pos = gl_Position.xyz;

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: lower samplers and uniform buffer indices
Jonathan Marek [Thu, 26 Sep 2019 04:29:26 +0000 (00:29 -0400)]
turnip: lower samplers and uniform buffer indices

Lower these to something compatible with ir3, and save the descriptor set
and binding information.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: basic descriptor sets (uniform buffer and samplers)
Jonathan Marek [Thu, 26 Sep 2019 03:00:16 +0000 (23:00 -0400)]
turnip: basic descriptor sets (uniform buffer and samplers)

Mostly copy-paste from radv, with a few modifications.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: enable linear filtering
Jonathan Marek [Thu, 26 Sep 2019 18:58:49 +0000 (14:58 -0400)]
turnip: enable linear filtering

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: align layer_size
Jonathan Marek [Thu, 26 Sep 2019 18:56:17 +0000 (14:56 -0400)]
turnip: align layer_size

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: use linear tiling for scanout image
Jonathan Marek [Thu, 26 Sep 2019 02:47:02 +0000 (22:47 -0400)]
turnip: use linear tiling for scanout image

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: implement image view descriptor
Jonathan Marek [Thu, 26 Sep 2019 02:24:13 +0000 (22:24 -0400)]
turnip: implement image view descriptor

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: implement sampler state
Jonathan Marek [Wed, 25 Sep 2019 16:55:14 +0000 (12:55 -0400)]
turnip: implement sampler state

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: fix vertex_id
Jonathan Marek [Wed, 25 Sep 2019 16:48:39 +0000 (12:48 -0400)]
turnip: fix vertex_id

ir3 uses non-zero based vertex id for a6xx

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: emit shader immediates
Jonathan Marek [Wed, 25 Sep 2019 16:46:04 +0000 (12:46 -0400)]
turnip: emit shader immediates

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agoutil/rb_tree: Stop relying on &iter->field != NULL
Jason Ekstrand [Wed, 25 Sep 2019 15:02:15 +0000 (10:02 -0500)]
util/rb_tree: Stop relying on &iter->field != NULL

The old version of the iterators relies on a &iter->field != NULL check
which works fine on older GCC but newer GCC versions and clang have
optimizations that break if you do pointer math on a null pointer.  The
correct solution to this is to do the null comparisons before we do any
sort of &iter->field or use rb_node_data to do the reverse operation.

Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoutil/rb_tree: Also test _safe iterators
Jason Ekstrand [Wed, 25 Sep 2019 15:01:27 +0000 (10:01 -0500)]
util/rb_tree: Also test _safe iterators

Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agofreedreno/a3xx: Mostly fix min-vs-mag filtering decisions on non-mipmap tex.
Eric Anholt [Wed, 28 Aug 2019 20:24:41 +0000 (13:24 -0700)]
freedreno/a3xx: Mostly fix min-vs-mag filtering decisions on non-mipmap tex.

This is based on the fix I used for the same problem on V3D.  In this
case, it fixes all but the the
dEQP-GLES2.functional.texture.filtering.2d.*_npot cases of
dEQP-GLES2.functional.texture.filtering.2d.*'s failures.

Acked-by: Rob Clark <robdclark@chromium.org>
5 years agointel/compiler: avoid truncating int64_t to int
Maya Rashish [Thu, 26 Sep 2019 14:14:34 +0000 (17:14 +0300)]
intel/compiler: avoid truncating int64_t to int

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Maya Rashish <maya@netbsd.org>
5 years agolima: support rectangle texture
Icenowy Zheng [Thu, 26 Sep 2019 03:10:18 +0000 (11:10 +0800)]
lima: support rectangle texture

As Vasily discovered, the bit 7 of the word 1 of the texture descriptor
is set when reloading the framebuffer, to use framebuffer-based offset
rather than normalized one. This bit also works for regular textures to
enable accessing with non-normalized offset.

Add support for rectangle texture by setting this bit for
PIPE_TEXTURE_RECT.

Suggested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agoloader: Avoid use-after-free / use of uninitialized local variables
Michel Dänzer [Thu, 26 Sep 2019 09:02:46 +0000 (11:02 +0200)]
loader: Avoid use-after-free / use of uninitialized local variables

Per the valgrind output below, we were returning the pointer to freed
memory if none of the later conditional pointer assignments were
executed. This caused dEQP CI jobs to crash on certain runners,
presumably due to a double-free down the line.

Also, we were skipping to the out: label before the vendor_id & chip_id
variables used by it were initialized, resulting in broken
LIBGL_DEBUG=verbose output such as

libGL: pci id for fd 4: 51108f00:51108f00, driver radeonsi

Fixes: 5a545e355b23 "loader: always map the "amdgpu" kernel driver name to radeonsi (v2)"
==403== Invalid read of size 1
==403==    at 0x4AFD576: surfaceless_probe_device (platform_surfaceless.c:316)
==403==    by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958)
==403==    by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75)
==403==    by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96)
==403==    by 0x4AE9367: eglInitialize (eglapi.c:617)
==403==    by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x53EBD1: glu::createRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::RenderConfig const&, glu::RenderContext const*) (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x53EFE9: glu::createDefaultRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::ApiType) (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DE07A: deqp::gles2::Context::Context(tcu::TestContext&) (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DB5EF: deqp::gles2::TestPackage::init() (in /deqp/modules/gles2/deqp-gles2)
==403==  Address 0x56bd340 is 0 bytes inside a block of size 4 free'd
==403==    at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==403==    by 0x4B01767: loader_get_driver_for_fd (loader.c:464)
==403==    by 0x4AFD553: surfaceless_probe_device (platform_surfaceless.c:308)
==403==    by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958)
==403==    by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75)
==403==    by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96)
==403==    by 0x4AE9367: eglInitialize (eglapi.c:617)
==403==    by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x53EBD1: glu::createRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::RenderConfig const&, glu::RenderContext const*) (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x53EFE9: glu::createDefaultRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::ApiType) (in /deqp/modules/gles2/deqp-gles2)
==403==  Block was alloc'd at
==403==    at 0x483577F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==403==    by 0x4EE5E09: strndup (strndup.c:43)
==403==    by 0x4B010B1: loader_get_kernel_driver_name (loader.c:101)
==403==    by 0x4B016AF: loader_get_driver_for_fd (loader.c:462)
==403==    by 0x4AFD553: surfaceless_probe_device (platform_surfaceless.c:308)
==403==    by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958)
==403==    by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75)
==403==    by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96)
==403==    by 0x4AE9367: eglInitialize (eglapi.c:617)
==403==    by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2)

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agoRevert "glx: Lift sending the MakeCurrent request to top-level code"
Adam Jackson [Thu, 26 Sep 2019 15:07:42 +0000 (11:07 -0400)]
Revert "glx: Lift sending the MakeCurrent request to top-level code"

Apparently this provokes crashes elsewhere in code unrelated to
MakeCurrent. I hate GLX so very very much.

This reverts commit 999c2aed8826f403b071f52b040ce25b56d35f9d.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1207

5 years agoRevert "glx: Implement GLX_EXT_no_config_context"
Adam Jackson [Thu, 26 Sep 2019 15:07:13 +0000 (11:07 -0400)]
Revert "glx: Implement GLX_EXT_no_config_context"

This reverts commit 0d635ccc912d7122f35f81eec27d8b2c0a2a7a28.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1207

5 years agoradv: Add debug option to dump meta shaders.
Timur Kristóf [Wed, 18 Sep 2019 12:39:10 +0000 (14:39 +0200)]
radv: Add debug option to dump meta shaders.

This new option can help debug shader compiler problems when
there are issues with the meta shaders.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoamd/common: Introduce ac_get_fs_input_vgpr_cnt.
Timur Kristóf [Wed, 25 Sep 2019 14:40:07 +0000 (16:40 +0200)]
amd/common: Introduce ac_get_fs_input_vgpr_cnt.

Add a function called ac_get_fs_input_vgpr_cnt which will return
the number of input VGPRs used by an AMD shader. Previously,
radv and radeonsi had the same code duplicated, but this commit also
allows them to share this code.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradv: Set shared VGPR count in radv_postprocess_config.
Timur Kristóf [Fri, 13 Sep 2019 13:53:09 +0000 (15:53 +0200)]
radv: Set shared VGPR count in radv_postprocess_config.

This commit allows RADV to set the shared VGPR count according to
the shader config.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoamd/common: Add num_shared_vgprs to ac_shader_config for GFX10.
Timur Kristóf [Fri, 13 Sep 2019 13:38:50 +0000 (15:38 +0200)]
amd/common: Add num_shared_vgprs to ac_shader_config for GFX10.

In GFX10 wave64 mode, shared VGPRs allow the two wave halves to
share some data with each other.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoamd/common: Extract some helper functions to ac_shader_util.
Timur Kristóf [Wed, 25 Sep 2019 12:10:18 +0000 (14:10 +0200)]
amd/common: Extract some helper functions to ac_shader_util.

This commit moves ac_get_tbuffer_format, ac_get_sampler_dim and
ac_get_image_dim into ac_shader_util, thus enabling them to be used
by compilers other than LLVM.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoamd/common: Move ac_export_mrt_z to ac_llvm_build.
Timur Kristóf [Wed, 25 Sep 2019 12:05:19 +0000 (14:05 +0200)]
amd/common: Move ac_export_mrt_z to ac_llvm_build.

The aim of this commit is to keep ac_shader_util LLVM-free,
since we would like to use it in ACO later.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoaco: CSE readlane/readfirstlane/permute/reduce with the same exec mask
Rhys Perry [Mon, 23 Sep 2019 13:31:24 +0000 (14:31 +0100)]
aco: CSE readlane/readfirstlane/permute/reduce with the same exec mask

v2: rename pass_temp to pass_flags
v2: also CSE reductions
v3: add ds_swizzle_b32 support
v3: check gds/offset0/offset1 fields

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: don't CSE v_readlane_b32/v_readfirstlane_b32
Rhys Perry [Sat, 21 Sep 2019 15:00:45 +0000 (16:00 +0100)]
aco: don't CSE v_readlane_b32/v_readfirstlane_b32

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco,radv: rename record_llvm_ir/llvm_ir_string to record_ir/ir_string
Rhys Perry [Wed, 25 Sep 2019 10:48:04 +0000 (11:48 +0100)]
aco,radv: rename record_llvm_ir/llvm_ir_string to record_ir/ir_string

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/aco: return a correct name and description for the backend IR
Rhys Perry [Tue, 24 Sep 2019 14:25:07 +0000 (15:25 +0100)]
radv/aco: return a correct name and description for the backend IR

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoaco: store printed backend IR in binary
Rhys Perry [Tue, 24 Sep 2019 14:23:46 +0000 (15:23 +0100)]
aco: store printed backend IR in binary

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoaco,radv/aco: get dissassembly for release builds if requested
Rhys Perry [Tue, 24 Sep 2019 14:21:06 +0000 (15:21 +0100)]
aco,radv/aco: get dissassembly for release builds if requested

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/aco: actually disable ACO when unsupported
Rhys Perry [Wed, 25 Sep 2019 11:04:51 +0000 (12:04 +0100)]
radv/aco: actually disable ACO when unsupported

We were setting this twice. The second time, we weren't later disabling
it if unsupported.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agomesa/st: calculate texture size based on EGLImage miplevel
Tapani Pälli [Tue, 24 Sep 2019 11:34:40 +0000 (14:34 +0300)]
mesa/st: calculate texture size based on EGLImage miplevel

Fixes issues with 'egl-gl_oes_egl_image' Piglit test.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomeson: fix logic for generating .pc files with old glvnd
Dylan Baker [Wed, 25 Sep 2019 23:25:27 +0000 (23:25 +0000)]
meson: fix logic for generating .pc files with old glvnd

We want to generate PC files for non-glvnd builds and for builds with
old glvnd, but the current logic doesn't do that, it builds them
unconditionally, and for GLES it builds the shared libraries, which is
also not what we want. This does not generate .pc files for gles1 or
gles2. Which it we weren't doing before either, making this not a
regression but a return to status-quo.o

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1838
Fixes: 93df862b6affb6b8507e40601212a58012bfa873
       ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agonir/range-analysis: Use types to provide better ranges from bcsel and mov
Ian Romanick [Tue, 13 Aug 2019 00:28:35 +0000 (17:28 -0700)]
nir/range-analysis: Use types to provide better ranges from bcsel and mov

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16328255 -> 16315391 (-0.08%)
instructions in affected programs: 218318 -> 205454 (-5.89%)
helped: 988
HURT: 0
helped stats (abs) min: 1 max: 72 x̄: 13.02 x̃: 10
helped stats (rel) min: 0.33% max: 16.04% x̄: 6.27% x̃: 4.88%
95% mean confidence interval for instructions value: -13.69 -12.35
95% mean confidence interval for instructions %-change: -6.55% -5.99%
Instructions are helped.

total cycles in shared programs: 363683977 -> 363615417 (-0.02%)
cycles in affected programs: 1475193 -> 1406633 (-4.65%)
helped: 923
HURT: 36
helped stats (abs) min: 1 max: 624 x̄: 75.78 x̃: 48
helped stats (rel) min: 0.08% max: 13.89% x̄: 5.20% x̃: 5.08%
HURT stats (abs)   min: 1 max: 179 x̄: 38.58 x̃: 4
HURT stats (rel)   min: 0.06% max: 16.56% x̄: 3.33% x̃: 0.29%
95% mean confidence interval for cycles value: -75.88 -67.10
95% mean confidence interval for cycles %-change: -5.10% -4.66%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10785779 -> 10785654 (<.01%)
instructions in affected programs: 13855 -> 13730 (-0.90%)
helped: 67
HURT: 0
helped stats (abs) min: 1 max: 15 x̄: 1.87 x̃: 1
helped stats (rel) min: 0.20% max: 3.45% x̄: 0.97% x̃: 0.78%
95% mean confidence interval for instructions value: -2.47 -1.26
95% mean confidence interval for instructions %-change: -1.13% -0.81%
Instructions are helped.

total cycles in shared programs: 153704799 -> 153704481 (<.01%)
cycles in affected programs: 101509 -> 101191 (-0.31%)
helped: 38
HURT: 13
helped stats (abs) min: 1 max: 38 x̄: 12.53 x̃: 16
helped stats (rel) min: 0.07% max: 2.69% x̄: 0.87% x̃: 0.53%
HURT stats (abs)   min: 1 max: 36 x̄: 12.15 x̃: 7
HURT stats (rel)   min: 0.06% max: 2.53% x̄: 0.73% x̃: 0.44%
95% mean confidence interval for cycles value: -10.24 -2.24
95% mean confidence interval for cycles %-change: -0.75% -0.17%
Cycles are helped.

LOST:   2
GAINED: 0

No shader-db change on Iron Lake or GM45.

5 years agonir/range-analysis: Use types in the hash key
Ian Romanick [Tue, 13 Aug 2019 00:28:35 +0000 (17:28 -0700)]
nir/range-analysis: Use types in the hash key

This allows the reslut of mov and bcsel to be separately interpreted as
float or int depending on the use.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir/range-analysis: Bail if the types don't match
Ian Romanick [Tue, 24 Sep 2019 22:55:49 +0000 (15:55 -0700)]
nir/range-analysis: Bail if the types don't match

Some shaders are hurt by this change because now a
load_const(0x00000000) is not recognized as eq_zero when loaded as a
float.  This behavior is restored in a later patch (nir/range-analysis:
Use types to provide better ranges from bcsel and mov).

v2: Add a comment about reinterpretation of int/uint/bool.  Suggested by
Caio.  Rewrite condition the check for types being float versus checking
for types not being all the things that aren't float.

Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16327543 -> 16328255 (<.01%)
instructions in affected programs: 55928 -> 56640 (1.27%)
helped: 0
HURT: 208
HURT stats (abs)   min: 1 max: 16 x̄: 3.42 x̃: 3
HURT stats (rel)   min: 0.33% max: 6.74% x̄: 1.31% x̃: 1.12%
95% mean confidence interval for instructions value: 3.06 3.79
95% mean confidence interval for instructions %-change: 1.17% 1.46%
Instructions are HURT.

total cycles in shared programs: 363682759 -> 363683977 (<.01%)
cycles in affected programs: 325758 -> 326976 (0.37%)
helped: 44
HURT: 133
helped stats (abs) min: 1 max: 179 x̄: 33.61 x̃: 5
helped stats (rel) min: 0.06% max: 14.21% x̄: 2.47% x̃: 0.29%
HURT stats (abs)   min: 1 max: 157 x̄: 20.28 x̃: 14
HURT stats (rel)   min: 0.07% max: 14.44% x̄: 1.42% x̃: 0.73%
95% mean confidence interval for cycles value: 0.38 13.39
95% mean confidence interval for cycles %-change: -0.06% 0.96%
Inconclusive result (%-change mean confidence interval includes 0).

Sandy Bridge
total instructions in shared programs: 10787433 -> 10787443 (<.01%)
instructions in affected programs: 1842 -> 1852 (0.54%)
helped: 0
HURT: 10
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.33% max: 1.85% x̄: 0.73% x̃: 0.49%
95% mean confidence interval for instructions value: 1.00 1.00
95% mean confidence interval for instructions %-change: 0.36% 1.10%
Instructions are HURT.

total cycles in shared programs: 153724543 -> 153724563 (<.01%)
cycles in affected programs: 8407 -> 8427 (0.24%)
helped: 1
HURT: 3
helped stats (abs) min: 18 max: 18 x̄: 18.00 x̃: 18
helped stats (rel) min: 0.98% max: 0.98% x̄: 0.98% x̃: 0.98%
HURT stats (abs)   min: 4 max: 18 x̄: 12.67 x̃: 16
HURT stats (rel)   min: 0.21% max: 0.75% x̄: 0.56% x̃: 0.72%
95% mean confidence interval for cycles value: -21.31 31.31
95% mean confidence interval for cycles %-change: -1.11% 1.46%
Inconclusive result (value mean confidence interval includes 0).

No shader-db changes on Iron Lake or GM45.

5 years agointel: Add new Comet Lake PCI-ids
Lionel Landwerlin [Wed, 25 Sep 2019 14:43:07 +0000 (17:43 +0300)]
intel: Add new Comet Lake PCI-ids

Commit bfc4c359b282 ("drm/i915/cml: Add Missing PCI IDs") in i915
added 3 new CML PCI ids.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel: use proper label for Comet Lake skus
Lionel Landwerlin [Wed, 25 Sep 2019 14:40:50 +0000 (17:40 +0300)]
intel: use proper label for Comet Lake skus

Fixes: 82f6a746e8 ("intel: Add support for Comet Lake")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agofreedreno/a6xx: Move instrlen and obj_start writes to fd6_emit_shader
Kristian H. Kristensen [Fri, 20 Sep 2019 17:10:57 +0000 (10:10 -0700)]
freedreno/a6xx: Move instrlen and obj_start writes to fd6_emit_shader

Consolidate a few more generic shaders setup regs in fd6_emit_shader.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/a6xx: Emit const and texture state for HS/DS/GS
Kristian H. Kristensen [Thu, 19 Sep 2019 22:04:09 +0000 (15:04 -0700)]
freedreno/a6xx: Emit const and texture state for HS/DS/GS

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: Add HS/DS/GS to shader key and cache
Kristian H. Kristensen [Thu, 19 Sep 2019 20:59:36 +0000 (13:59 -0700)]
freedreno/ir3: Add HS/DS/GS to shader key and cache

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/a6xx: Add generic program stateobj support for HS/DS/GS
Kristian H. Kristensen [Thu, 19 Sep 2019 20:55:35 +0000 (13:55 -0700)]
freedreno/a6xx: Add generic program stateobj support for HS/DS/GS

This add generic stage state setup for HS/DS/GS to the program state
object.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno: Move fs functions after geometry pipeline stages
Kristian H. Kristensen [Thu, 19 Sep 2019 20:45:44 +0000 (13:45 -0700)]
freedreno: Move fs functions after geometry pipeline stages

Let's try to always order the stages in the pipeline order.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno: Add state binding functions for HS/DS/GS
Kristian H. Kristensen [Thu, 19 Sep 2019 20:25:00 +0000 (13:25 -0700)]
freedreno: Add state binding functions for HS/DS/GS

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>