mesa.git
5 years agoturnip: fix vertex_id
Jonathan Marek [Wed, 25 Sep 2019 16:48:39 +0000 (12:48 -0400)]
turnip: fix vertex_id

ir3 uses non-zero based vertex id for a6xx

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agoturnip: emit shader immediates
Jonathan Marek [Wed, 25 Sep 2019 16:46:04 +0000 (12:46 -0400)]
turnip: emit shader immediates

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agoutil/rb_tree: Stop relying on &iter->field != NULL
Jason Ekstrand [Wed, 25 Sep 2019 15:02:15 +0000 (10:02 -0500)]
util/rb_tree: Stop relying on &iter->field != NULL

The old version of the iterators relies on a &iter->field != NULL check
which works fine on older GCC but newer GCC versions and clang have
optimizations that break if you do pointer math on a null pointer.  The
correct solution to this is to do the null comparisons before we do any
sort of &iter->field or use rb_node_data to do the reverse operation.
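
A minimal sketch of the pattern described above, using a hypothetical singly linked intrusive node rather than the real rb_tree types. The loop compares the node pointer itself against NULL and only then converts it back to the containing struct. All names here (struct node, struct item, item_from_node, sum_items) are illustrative assumptions, not util/rb_tree API.

```c
#include <stddef.h>

/* Hypothetical intrusive link, standing in for an rb_node. */
struct node {
    struct node *next;
};

/* Hypothetical container embedding the link as a non-first field. */
struct item {
    int value;
    struct node link;
};

/* Reverse operation (container_of style). NULL maps to NULL, so no
 * pointer arithmetic is ever performed on a null pointer. */
static struct item *item_from_node(struct node *n)
{
    if (n == NULL)
        return NULL;
    return (struct item *)((char *)n - offsetof(struct item, link));
}

/* Iterate by testing the node pointer, not &iter->link. */
static int sum_items(struct node *head)
{
    int sum = 0;
    for (struct node *n = head; n != NULL; n = n->next)
        sum += item_from_node(n)->value;
    return sum;
}
```

Because the termination test never looks at the address of a member, the compiler has no null-pointer arithmetic to reason about.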

Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoutil/rb_tree: Also test _safe iterators
Jason Ekstrand [Wed, 25 Sep 2019 15:01:27 +0000 (10:01 -0500)]
util/rb_tree: Also test _safe iterators

Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agofreedreno/a3xx: Mostly fix min-vs-mag filtering decisions on non-mipmap tex.
Eric Anholt [Wed, 28 Aug 2019 20:24:41 +0000 (13:24 -0700)]
freedreno/a3xx: Mostly fix min-vs-mag filtering decisions on non-mipmap tex.

This is based on the fix I used for the same problem on V3D.  In this
case, it fixes all but the
dEQP-GLES2.functional.texture.filtering.2d.*_npot cases of
dEQP-GLES2.functional.texture.filtering.2d.*'s failures.

Acked-by: Rob Clark <robdclark@chromium.org>
5 years agointel/compiler: avoid truncating int64_t to int
Maya Rashish [Thu, 26 Sep 2019 14:14:34 +0000 (17:14 +0300)]
intel/compiler: avoid truncating int64_t to int

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Maya Rashish <maya@netbsd.org>
5 years agolima: support rectangle texture
Icenowy Zheng [Thu, 26 Sep 2019 03:10:18 +0000 (11:10 +0800)]
lima: support rectangle texture

As Vasily discovered, bit 7 of word 1 of the texture descriptor
is set when reloading the framebuffer, to use a framebuffer-based offset
rather than a normalized one. This bit also works for regular textures to
enable accessing with a non-normalized offset.

Add support for rectangle texture by setting this bit for
PIPE_TEXTURE_RECT.
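
As a rough illustration of the change described above, the sketch below sets bit 7 of descriptor word 1 for rectangle textures. The macro name, the function name and the plain uint32_t array are assumptions made for this example. They are not lima identifiers, and the real descriptor layout is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 7 of descriptor word 1, as described in the commit message.
 * The macro name is a hypothetical label. */
#define TEX_DESC_WORD1_UNNORM_COORDS (UINT32_C(1) << 7)

/* desc points at the descriptor words; is_rect stands in for a
 * PIPE_TEXTURE_RECT target check. */
static void set_rect_coords(uint32_t *desc, bool is_rect)
{
    if (is_rect)
        desc[1] |= TEX_DESC_WORD1_UNNORM_COORDS;
}
```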

Suggested-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agoloader: Avoid use-after-free / use of uninitialized local variables
Michel Dänzer [Thu, 26 Sep 2019 09:02:46 +0000 (11:02 +0200)]
loader: Avoid use-after-free / use of uninitialized local variables

Per the valgrind output below, we were returning the pointer to freed
memory if none of the later conditional pointer assignments were
executed. This caused dEQP CI jobs to crash on certain runners,
presumably due to a double-free down the line.

Also, we were skipping to the out: label before the vendor_id & chip_id
variables used by it were initialized, resulting in broken
LIBGL_DEBUG=verbose output such as

libGL: pci id for fd 4: 51108f00:51108f00, driver radeonsi
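
The use-after-free described in the first paragraph follows a common shape: a pointer is freed, and a later assignment that was expected to replace it is skipped. The sketch below is a simplified illustration, not the loader code. The names copy_string and pick_driver and the have_override flag are hypothetical. Resetting the pointer to NULL right after free() means the caller can never receive the stale address.

```c
#include <stdlib.h>
#include <string.h>

/* Small helper so the sketch does not depend on POSIX strdup(). */
static char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Hypothetical driver lookup: the caller owns the returned string. */
static char *pick_driver(const char *kernel_name, int have_override)
{
    char *driver = copy_string(kernel_name);

    if (driver != NULL && strcmp(driver, "amdgpu") == 0) {
        free(driver);
        driver = NULL; /* never leave a dangling pointer behind */
        if (have_override)
            driver = copy_string("radeonsi");
    }
    return driver;
}
```

The same discipline helps with the second problem in the message: initialising locals at their declaration keeps any early jump to a cleanup label from reading garbage.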

Fixes: 5a545e355b23 "loader: always map the "amdgpu" kernel driver name to radeonsi (v2)"
==403== Invalid read of size 1
==403==    at 0x4AFD576: surfaceless_probe_device (platform_surfaceless.c:316)
==403==    by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958)
==403==    by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75)
==403==    by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96)
==403==    by 0x4AE9367: eglInitialize (eglapi.c:617)
==403==    by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x53EBD1: glu::createRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::RenderConfig const&, glu::RenderContext const*) (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x53EFE9: glu::createDefaultRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::ApiType) (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DE07A: deqp::gles2::Context::Context(tcu::TestContext&) (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DB5EF: deqp::gles2::TestPackage::init() (in /deqp/modules/gles2/deqp-gles2)
==403==  Address 0x56bd340 is 0 bytes inside a block of size 4 free'd
==403==    at 0x48369AB: free (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==403==    by 0x4B01767: loader_get_driver_for_fd (loader.c:464)
==403==    by 0x4AFD553: surfaceless_probe_device (platform_surfaceless.c:308)
==403==    by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958)
==403==    by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75)
==403==    by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96)
==403==    by 0x4AE9367: eglInitialize (eglapi.c:617)
==403==    by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x53EBD1: glu::createRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::RenderConfig const&, glu::RenderContext const*) (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x53EFE9: glu::createDefaultRenderContext(tcu::Platform&, tcu::CommandLine const&, glu::ApiType) (in /deqp/modules/gles2/deqp-gles2)
==403==  Block was alloc'd at
==403==    at 0x483577F: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==403==    by 0x4EE5E09: strndup (strndup.c:43)
==403==    by 0x4B010B1: loader_get_kernel_driver_name (loader.c:101)
==403==    by 0x4B016AF: loader_get_driver_for_fd (loader.c:462)
==403==    by 0x4AFD553: surfaceless_probe_device (platform_surfaceless.c:308)
==403==    by 0x4AFD915: dri2_initialize_surfaceless (platform_surfaceless.c:391)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:984)
==403==    by 0x4AF5EEA: dri2_initialize (egl_dri2.c:958)
==403==    by 0x4AF1EEC: _eglMatchAndInitialize (egldriver.c:75)
==403==    by 0x4AF1F3B: _eglMatchDriver (egldriver.c:96)
==403==    by 0x4AE9367: eglInitialize (eglapi.c:617)
==403==    by 0x1D99C9: tcu::surfaceless::EglRenderContext::EglRenderContext(glu::RenderConfig const&, tcu::CommandLine const&) [clone .constprop.57] (in /deqp/modules/gles2/deqp-gles2)
==403==    by 0x1DABB0: tcu::surfaceless::ContextFactory::createContext(glu::RenderConfig const&, tcu::CommandLine const&, glu::RenderContext const*) const (in /deqp/modules/gles2/deqp-gles2)

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
5 years agoRevert "glx: Lift sending the MakeCurrent request to top-level code"
Adam Jackson [Thu, 26 Sep 2019 15:07:42 +0000 (11:07 -0400)]
Revert "glx: Lift sending the MakeCurrent request to top-level code"

Apparently this provokes crashes elsewhere in code unrelated to
MakeCurrent. I hate GLX so very very much.

This reverts commit 999c2aed8826f403b071f52b040ce25b56d35f9d.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1207

5 years agoRevert "glx: Implement GLX_EXT_no_config_context"
Adam Jackson [Thu, 26 Sep 2019 15:07:13 +0000 (11:07 -0400)]
Revert "glx: Implement GLX_EXT_no_config_context"

This reverts commit 0d635ccc912d7122f35f81eec27d8b2c0a2a7a28.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1207

5 years agoradv: Add debug option to dump meta shaders.
Timur Kristóf [Wed, 18 Sep 2019 12:39:10 +0000 (14:39 +0200)]
radv: Add debug option to dump meta shaders.

This new option can help debug shader compiler problems when
there are issues with the meta shaders.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoamd/common: Introduce ac_get_fs_input_vgpr_cnt.
Timur Kristóf [Wed, 25 Sep 2019 14:40:07 +0000 (16:40 +0200)]
amd/common: Introduce ac_get_fs_input_vgpr_cnt.

Add a function called ac_get_fs_input_vgpr_cnt which will return
the number of input VGPRs used by an AMD shader. Previously,
radv and radeonsi each had their own copy of this code; this commit
allows them to share it.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradv: Set shared VGPR count in radv_postprocess_config.
Timur Kristóf [Fri, 13 Sep 2019 13:53:09 +0000 (15:53 +0200)]
radv: Set shared VGPR count in radv_postprocess_config.

This commit allows RADV to set the shared VGPR count according to
the shader config.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoamd/common: Add num_shared_vgprs to ac_shader_config for GFX10.
Timur Kristóf [Fri, 13 Sep 2019 13:38:50 +0000 (15:38 +0200)]
amd/common: Add num_shared_vgprs to ac_shader_config for GFX10.

In GFX10 wave64 mode, shared VGPRs allow the two wave halves to
share some data with each other.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoamd/common: Extract some helper functions to ac_shader_util.
Timur Kristóf [Wed, 25 Sep 2019 12:10:18 +0000 (14:10 +0200)]
amd/common: Extract some helper functions to ac_shader_util.

This commit moves ac_get_tbuffer_format, ac_get_sampler_dim and
ac_get_image_dim into ac_shader_util, thus enabling them to be used
by compilers other than LLVM.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoamd/common: Move ac_export_mrt_z to ac_llvm_build.
Timur Kristóf [Wed, 25 Sep 2019 12:05:19 +0000 (14:05 +0200)]
amd/common: Move ac_export_mrt_z to ac_llvm_build.

The aim of this commit is to keep ac_shader_util LLVM-free,
since we would like to use it in ACO later.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoaco: CSE readlane/readfirstlane/permute/reduce with the same exec mask
Rhys Perry [Mon, 23 Sep 2019 13:31:24 +0000 (14:31 +0100)]
aco: CSE readlane/readfirstlane/permute/reduce with the same exec mask

v2: rename pass_temp to pass_flags
v2: also CSE reductions
v3: add ds_swizzle_b32 support
v3: check gds/offset0/offset1 fields

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: don't CSE v_readlane_b32/v_readfirstlane_b32
Rhys Perry [Sat, 21 Sep 2019 15:00:45 +0000 (16:00 +0100)]
aco: don't CSE v_readlane_b32/v_readfirstlane_b32

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco,radv: rename record_llvm_ir/llvm_ir_string to record_ir/ir_string
Rhys Perry [Wed, 25 Sep 2019 10:48:04 +0000 (11:48 +0100)]
aco,radv: rename record_llvm_ir/llvm_ir_string to record_ir/ir_string

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/aco: return a correct name and description for the backend IR
Rhys Perry [Tue, 24 Sep 2019 14:25:07 +0000 (15:25 +0100)]
radv/aco: return a correct name and description for the backend IR

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoaco: store printed backend IR in binary
Rhys Perry [Tue, 24 Sep 2019 14:23:46 +0000 (15:23 +0100)]
aco: store printed backend IR in binary

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoaco,radv/aco: get disassembly for release builds if requested
Rhys Perry [Tue, 24 Sep 2019 14:21:06 +0000 (15:21 +0100)]
aco,radv/aco: get disassembly for release builds if requested

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv/aco: actually disable ACO when unsupported
Rhys Perry [Wed, 25 Sep 2019 11:04:51 +0000 (12:04 +0100)]
radv/aco: actually disable ACO when unsupported

We were setting this twice. The second time, we weren't later disabling
it if unsupported.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agomesa/st: calculate texture size based on EGLImage miplevel
Tapani Pälli [Tue, 24 Sep 2019 11:34:40 +0000 (14:34 +0300)]
mesa/st: calculate texture size based on EGLImage miplevel

Fixes issues with 'egl-gl_oes_egl_image' Piglit test.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agomeson: fix logic for generating .pc files with old glvnd
Dylan Baker [Wed, 25 Sep 2019 23:25:27 +0000 (23:25 +0000)]
meson: fix logic for generating .pc files with old glvnd

We want to generate PC files for non-glvnd builds and for builds with
old glvnd, but the current logic doesn't do that: it builds them
unconditionally, and for GLES it builds the shared libraries, which is
also not what we want. This does not generate .pc files for gles1 or
gles2, which we weren't doing before either, so this is not a
regression but a return to the status quo.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1838
Fixes: 93df862b6affb6b8507e40601212a58012bfa873
       ("meson: re-add incorrect pkg-config files with GLVND for backward compatibility")
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agonir/range-analysis: Use types to provide better ranges from bcsel and mov
Ian Romanick [Tue, 13 Aug 2019 00:28:35 +0000 (17:28 -0700)]
nir/range-analysis: Use types to provide better ranges from bcsel and mov

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16328255 -> 16315391 (-0.08%)
instructions in affected programs: 218318 -> 205454 (-5.89%)
helped: 988
HURT: 0
helped stats (abs) min: 1 max: 72 x̄: 13.02 x̃: 10
helped stats (rel) min: 0.33% max: 16.04% x̄: 6.27% x̃: 4.88%
95% mean confidence interval for instructions value: -13.69 -12.35
95% mean confidence interval for instructions %-change: -6.55% -5.99%
Instructions are helped.

total cycles in shared programs: 363683977 -> 363615417 (-0.02%)
cycles in affected programs: 1475193 -> 1406633 (-4.65%)
helped: 923
HURT: 36
helped stats (abs) min: 1 max: 624 x̄: 75.78 x̃: 48
helped stats (rel) min: 0.08% max: 13.89% x̄: 5.20% x̃: 5.08%
HURT stats (abs)   min: 1 max: 179 x̄: 38.58 x̃: 4
HURT stats (rel)   min: 0.06% max: 16.56% x̄: 3.33% x̃: 0.29%
95% mean confidence interval for cycles value: -75.88 -67.10
95% mean confidence interval for cycles %-change: -5.10% -4.66%
Cycles are helped.

Sandy Bridge
total instructions in shared programs: 10785779 -> 10785654 (<.01%)
instructions in affected programs: 13855 -> 13730 (-0.90%)
helped: 67
HURT: 0
helped stats (abs) min: 1 max: 15 x̄: 1.87 x̃: 1
helped stats (rel) min: 0.20% max: 3.45% x̄: 0.97% x̃: 0.78%
95% mean confidence interval for instructions value: -2.47 -1.26
95% mean confidence interval for instructions %-change: -1.13% -0.81%
Instructions are helped.

total cycles in shared programs: 153704799 -> 153704481 (<.01%)
cycles in affected programs: 101509 -> 101191 (-0.31%)
helped: 38
HURT: 13
helped stats (abs) min: 1 max: 38 x̄: 12.53 x̃: 16
helped stats (rel) min: 0.07% max: 2.69% x̄: 0.87% x̃: 0.53%
HURT stats (abs)   min: 1 max: 36 x̄: 12.15 x̃: 7
HURT stats (rel)   min: 0.06% max: 2.53% x̄: 0.73% x̃: 0.44%
95% mean confidence interval for cycles value: -10.24 -2.24
95% mean confidence interval for cycles %-change: -0.75% -0.17%
Cycles are helped.

LOST:   2
GAINED: 0

No shader-db change on Iron Lake or GM45.
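
The percentages in the report above follow directly from the before and after totals. A small sketch of that arithmetic, where pct_change is a hypothetical helper and not part of shader-db:

```c
/* Relative change from before to after, in percent.
 * Negative values mean the count went down. */
static double pct_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

For the Ice Lake affected programs, 218318 -> 205454 is a drop of 12864 instructions, which is about 5.89% of 218318. The same 12864 instructions are only about 0.08% of the 16328255 instructions in all shared programs.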

5 years agonir/range-analysis: Use types in the hash key
Ian Romanick [Tue, 13 Aug 2019 00:28:35 +0000 (17:28 -0700)]
nir/range-analysis: Use types in the hash key

This allows the result of mov and bcsel to be separately interpreted as
float or int depending on the use.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agonir/range-analysis: Bail if the types don't match
Ian Romanick [Tue, 24 Sep 2019 22:55:49 +0000 (15:55 -0700)]
nir/range-analysis: Bail if the types don't match

Some shaders are hurt by this change because now a
load_const(0x00000000) is not recognized as eq_zero when loaded as a
float.  This behavior is restored in a later patch (nir/range-analysis:
Use types to provide better ranges from bcsel and mov).
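
To illustrate why the type matters when classifying a constant, the sketch below reinterprets the same 32 bits as an integer and as a float. The helper names are hypothetical and this is not the range-analysis code. It only shows that "equal to zero" depends on the interpretation.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Treat the bits as an IEEE-754 single and compare against zero. */
static bool bits_eq_zero_as_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f); /* well-defined bit reinterpretation */
    return f == 0.0f;
}

/* Treat the bits as an integer and compare against zero. */
static bool bits_eq_zero_as_int(uint32_t bits)
{
    return bits == 0;
}
```

0x00000000 is zero under both interpretations, which is the case the message says is temporarily lost. 0x80000000 is -0.0 as a float, so it compares equal to zero there, but it is nonzero as an integer.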

v2: Add a comment about reinterpretation of int/uint/bool.  Suggested by
Caio.  Rewrite the condition to check for types being float rather than
checking for types not being any of the things that aren't float.

Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 16327543 -> 16328255 (<.01%)
instructions in affected programs: 55928 -> 56640 (1.27%)
helped: 0
HURT: 208
HURT stats (abs)   min: 1 max: 16 x̄: 3.42 x̃: 3
HURT stats (rel)   min: 0.33% max: 6.74% x̄: 1.31% x̃: 1.12%
95% mean confidence interval for instructions value: 3.06 3.79
95% mean confidence interval for instructions %-change: 1.17% 1.46%
Instructions are HURT.

total cycles in shared programs: 363682759 -> 363683977 (<.01%)
cycles in affected programs: 325758 -> 326976 (0.37%)
helped: 44
HURT: 133
helped stats (abs) min: 1 max: 179 x̄: 33.61 x̃: 5
helped stats (rel) min: 0.06% max: 14.21% x̄: 2.47% x̃: 0.29%
HURT stats (abs)   min: 1 max: 157 x̄: 20.28 x̃: 14
HURT stats (rel)   min: 0.07% max: 14.44% x̄: 1.42% x̃: 0.73%
95% mean confidence interval for cycles value: 0.38 13.39
95% mean confidence interval for cycles %-change: -0.06% 0.96%
Inconclusive result (%-change mean confidence interval includes 0).

Sandy Bridge
total instructions in shared programs: 10787433 -> 10787443 (<.01%)
instructions in affected programs: 1842 -> 1852 (0.54%)
helped: 0
HURT: 10
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.33% max: 1.85% x̄: 0.73% x̃: 0.49%
95% mean confidence interval for instructions value: 1.00 1.00
95% mean confidence interval for instructions %-change: 0.36% 1.10%
Instructions are HURT.

total cycles in shared programs: 153724543 -> 153724563 (<.01%)
cycles in affected programs: 8407 -> 8427 (0.24%)
helped: 1
HURT: 3
helped stats (abs) min: 18 max: 18 x̄: 18.00 x̃: 18
helped stats (rel) min: 0.98% max: 0.98% x̄: 0.98% x̃: 0.98%
HURT stats (abs)   min: 4 max: 18 x̄: 12.67 x̃: 16
HURT stats (rel)   min: 0.21% max: 0.75% x̄: 0.56% x̃: 0.72%
95% mean confidence interval for cycles value: -21.31 31.31
95% mean confidence interval for cycles %-change: -1.11% 1.46%
Inconclusive result (value mean confidence interval includes 0).

No shader-db changes on Iron Lake or GM45.

5 years agointel: Add new Comet Lake PCI-ids
Lionel Landwerlin [Wed, 25 Sep 2019 14:43:07 +0000 (17:43 +0300)]
intel: Add new Comet Lake PCI-ids

Commit bfc4c359b282 ("drm/i915/cml: Add Missing PCI IDs") in i915
added 3 new CML PCI ids.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agointel: use proper label for Comet Lake skus
Lionel Landwerlin [Wed, 25 Sep 2019 14:40:50 +0000 (17:40 +0300)]
intel: use proper label for Comet Lake skus

Fixes: 82f6a746e8 ("intel: Add support for Comet Lake")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agofreedreno/a6xx: Move instrlen and obj_start writes to fd6_emit_shader
Kristian H. Kristensen [Fri, 20 Sep 2019 17:10:57 +0000 (10:10 -0700)]
freedreno/a6xx: Move instrlen and obj_start writes to fd6_emit_shader

Consolidate a few more generic shader setup regs in fd6_emit_shader.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/a6xx: Emit const and texture state for HS/DS/GS
Kristian H. Kristensen [Thu, 19 Sep 2019 22:04:09 +0000 (15:04 -0700)]
freedreno/a6xx: Emit const and texture state for HS/DS/GS

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/ir3: Add HS/DS/GS to shader key and cache
Kristian H. Kristensen [Thu, 19 Sep 2019 20:59:36 +0000 (13:59 -0700)]
freedreno/ir3: Add HS/DS/GS to shader key and cache

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/a6xx: Add generic program stateobj support for HS/DS/GS
Kristian H. Kristensen [Thu, 19 Sep 2019 20:55:35 +0000 (13:55 -0700)]
freedreno/a6xx: Add generic program stateobj support for HS/DS/GS

This adds generic stage state setup for HS/DS/GS to the program state
object.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno: Move fs functions after geometry pipeline stages
Kristian H. Kristensen [Thu, 19 Sep 2019 20:45:44 +0000 (13:45 -0700)]
freedreno: Move fs functions after geometry pipeline stages

Let's try to always order the stages in the pipeline order.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno: Add state binding functions for HS/DS/GS
Kristian H. Kristensen [Thu, 19 Sep 2019 20:25:00 +0000 (13:25 -0700)]
freedreno: Add state binding functions for HS/DS/GS

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno: Rename vp and fp to vs and fs in fd_program_stateobj
Kristian H. Kristensen [Thu, 19 Sep 2019 20:40:31 +0000 (13:40 -0700)]
freedreno: Rename vp and fp to vs and fs in fd_program_stateobj

We're using vs and fs now, and adding hs, ds and gs soon.  It's
confusing enough that we have both HS/TCS and DS/TES. At least for VS
and FS there doesn't have to be multiple names.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/a6xx: Factor out const state setup
Kristian H. Kristensen [Thu, 19 Sep 2019 20:19:34 +0000 (13:19 -0700)]
freedreno/a6xx: Factor out const state setup

We'll be sharing this logic for new shader stages soon.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agoglsl: turn runtime asserts of compile-time value into compile-time asserts
Eric Engestrom [Tue, 24 Sep 2019 15:58:31 +0000 (16:58 +0100)]
glsl: turn runtime asserts of compile-time value into compile-time asserts
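
A short sketch of the general idea, under the assumption of a C11 compiler. The enum, the table and the message are made up for illustration, and the actual patch may use a different assertion macro. A condition that depends only on compile-time values is checked by _Static_assert at build time, while assert() stays for values known only at run time.

```c
#include <assert.h>

/* Hypothetical type list; TYPE_COUNT must stay last. */
enum base_type { TYPE_UINT, TYPE_INT, TYPE_FLOAT, TYPE_COUNT };

static const char *const type_names[] = { "uint", "int", "float" };

/* Compile-time check: the build fails if the table and enum drift apart. */
_Static_assert(sizeof(type_names) / sizeof(type_names[0]) == TYPE_COUNT,
               "type_names must have one entry per base_type");

static const char *type_name(enum base_type t)
{
    assert(t < TYPE_COUNT); /* run-time check: t is not a constant here */
    return type_names[t];
}
```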

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
5 years agodocs/release-calendar: add missing <td> and </td>
Eric Engestrom [Wed, 25 Sep 2019 18:20:00 +0000 (19:20 +0100)]
docs/release-calendar: add missing <td> and </td>

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agodocs/release-calendar: fix bugfix release numbers
Eric Engestrom [Wed, 25 Sep 2019 18:19:05 +0000 (19:19 +0100)]
docs/release-calendar: fix bugfix release numbers

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agoanv: gem-stubs: return a valid fd for anv_gem_userptr()
Lionel Landwerlin [Wed, 25 Sep 2019 13:26:52 +0000 (16:26 +0300)]
anv: gem-stubs: return a valid fd for anv_gem_userptr()

Fixes invalid close(-1) in the unit tests.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agost/nine: Ignore D3DSIO_RET if it is the last instruction in a shader
Danylo Piliaiev [Tue, 24 Sep 2019 11:12:39 +0000 (14:12 +0300)]
st/nine: Ignore D3DSIO_RET if it is the last instruction in a shader

RET as the last instruction can be safely ignored.
Remove it to prevent crashes/warnings in case the underlying driver
doesn't implement arbitrary returns.

A better way would be to remove the RET after the whole shader
is parsed which will handle a possible case when the last RET is
followed by a comment.
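
A simplified sketch of the first approach, on a hypothetical opcode array rather than the real D3D9 token stream. The enum and function name are assumptions for illustration. It does not handle the case mentioned above where a comment follows the final RET.

```c
#include <stddef.h>

/* Hypothetical opcodes; OP_RET stands in for D3DSIO_RET. */
enum opcode { OP_MOV, OP_ADD, OP_RET };

/* Return the number of instructions to translate, dropping a RET
 * only when it is the very last instruction. */
static size_t count_without_trailing_ret(const enum opcode *ops, size_t count)
{
    if (count > 0 && ops[count - 1] == OP_RET)
        return count - 1;
    return count;
}
```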

CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
5 years agobin/get-pick-list: use --pretty=oneline instead of --oneline
Dylan Baker [Mon, 23 Sep 2019 18:05:16 +0000 (11:05 -0700)]
bin/get-pick-list: use --pretty=oneline instead of --oneline

--oneline shortens hashes, while --pretty=oneline doesn't; otherwise
they are the same. Having full hashes is convenient as that is the
format that the bin/.cherry-ignore script requires to work correctly.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
5 years agorelease: Push 19.3 back two weeks
Dylan Baker [Fri, 20 Sep 2019 17:37:14 +0000 (10:37 -0700)]
release: Push 19.3 back two weeks

The main reason to do this is that 19.2 has slipped by two weeks, and
as such the 19.3 branch is due to happen extremely close to the release of
19.2.0. I think it would be better to have a little more time between
releases for developers and for packagers.

This would still have the 19.3 release out before December, even if it
slips by 1 week.

Acked-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
5 years agodocs: update calendar, add news item, and link release notes for 19.2.0
Dylan Baker [Wed, 25 Sep 2019 17:40:00 +0000 (10:40 -0700)]
docs: update calendar, add news item, and link release notes for 19.2.0

5 years agodocs: add SHA256 sum for 19.2.0
Dylan Baker [Wed, 25 Sep 2019 17:39:51 +0000 (10:39 -0700)]
docs: add SHA256 sum for 19.2.0

5 years agodocs: Add release notes for 19.2.0
Dylan Baker [Wed, 25 Sep 2019 16:55:33 +0000 (09:55 -0700)]
docs: Add release notes for 19.2.0

5 years agolima/ppir: Add various varying fetch sources to disassembler
Andreas Baierl [Thu, 19 Sep 2019 06:53:18 +0000 (08:53 +0200)]
lima/ppir: Add various varying fetch sources to disassembler

Signed-off-by: Andreas Baierl <ichgeh@imkreisrum.de>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
5 years agomeson: re-add incorrect pkg-config files with GLVND for backward compatibility
Eric Engestrom [Thu, 19 Sep 2019 13:18:55 +0000 (14:18 +0100)]
meson: re-add incorrect pkg-config files with GLVND for backward compatibility

This is a bit counter-intuitive, but the issue is that GLVND is broken
in versions <= 1.1.1, so we need to keep wrongly providing these files
to cover up their mistake, otherwise the rest of the world ends up
broken.

Suggested-by: Dylan Baker <dylan@pnwbakers.com>
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agoaco: check for duplicate opcode numbers
Rhys Perry [Tue, 24 Sep 2019 14:46:37 +0000 (15:46 +0100)]
aco: check for duplicate opcode numbers

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoaco: fix opcode for s_mul_hi_i32
Rhys Perry [Sat, 21 Sep 2019 13:10:38 +0000 (14:10 +0100)]
aco: fix opcode for s_mul_hi_i32

Fixes dEQP-VK.glsl.builtin.function.integer.imulextended.*_compute

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoaco: fix v_subrev_co_u32_e64 opcode
Rhys Perry [Tue, 24 Sep 2019 14:45:48 +0000 (15:45 +0100)]
aco: fix v_subrev_co_u32_e64 opcode

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoaco: fix GFX9 opcode for v_xad_u32
Rhys Perry [Tue, 24 Sep 2019 13:34:28 +0000 (14:34 +0100)]
aco: fix GFX9 opcode for v_xad_u32

Fixes various dEQP-VK.image.store.* tests.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoaco: implement 64-bit ineg
Rhys Perry [Wed, 25 Sep 2019 11:16:34 +0000 (12:16 +0100)]
aco: implement 64-bit ineg

We currently lower them, but nir_opt_algebraic() can add new ones because
lower_sub=true.
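
With lower_sub=true a subtraction a - b can be rewritten as a + (-b), which is how a fresh 64-bit negate can appear after earlier lowering. One way to negate a 64-bit value using only 32-bit operations is sketched below. This is a generic illustration with a hypothetical function name, not necessarily the instruction sequence ACO emits.

```c
#include <stdint.h>

/* Negate a 64-bit value using only 32-bit arithmetic on its halves. */
static uint64_t ineg64_via_halves(uint64_t x)
{
    uint32_t lo = (uint32_t)x;
    uint32_t hi = (uint32_t)(x >> 32);

    uint32_t neg_lo = 0u - lo;          /* low half of 0 - x */
    uint32_t borrow = lo != 0;          /* 0 - lo borrows unless lo == 0 */
    uint32_t neg_hi = 0u - hi - borrow; /* high half, including the borrow */

    return ((uint64_t)neg_hi << 32) | neg_lo;
}
```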

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agoaco: run nir_lower_int64() before nir_lower_idiv()
Rhys Perry [Tue, 24 Sep 2019 14:15:26 +0000 (15:15 +0100)]
aco: run nir_lower_int64() before nir_lower_idiv()

nir_lower_idiv() asserts on 64-bit integers.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
5 years agonir: Fix overlapping vars in nir_assign_io_var_locations()
Connor Abbott [Tue, 24 Sep 2019 15:29:53 +0000 (17:29 +0200)]
nir: Fix overlapping vars in nir_assign_io_var_locations()

When handling two variables with overlapping locations, we process the
one with lower location first, and then extend the location ->
driver_location map to guarantee that it's contiguous for the second
variable too. But the loop had the wrong bound, so we weren't extending
the map 100%, which could lead to problems later such as an incorrect
num_inputs. The loop index i is an index into the slots of the variable,
so we need to stop at the final slot of the variable (var_size) instead
of the number of unassigned slots.
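
The loop-bound issue can be sketched with a plain array standing in for the location -> driver_location map. Everything below (MAX_SLOTS, UNASSIGNED, extend_location_map) is a hypothetical simplification, not the NIR code. The key point is that i indexes the slots of the variable, so the loop must run to var_size.

```c
#define MAX_SLOTS 16
#define UNASSIGNED (-1)

/* map[location] holds the driver_location, or UNASSIGNED.
 * Assumes map[first_slot] was already assigned by an earlier,
 * overlapping variable and first_slot + var_size <= MAX_SLOTS. */
static void extend_location_map(int map[MAX_SLOTS], int first_slot,
                                int var_size)
{
    int base = map[first_slot];

    /* Bound is var_size, not the count of unassigned slots. */
    for (int i = 0; i < var_size; i++) {
        if (map[first_slot + i] == UNASSIGNED)
            map[first_slot + i] = base + i;
    }
}
```

For example, if an earlier variable covers locations 0-1 and a second variable covers locations 1-3, two slots are unassigned. A loop bounded by that count would visit only locations 1 and 2 and leave location 3 unmapped.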

This fixes
spec@arb_enhanced_layouts@execution@component-layout@vs-fs-array-interleave-range
on radeonsi NIR.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoclover: eliminate "ignoring attributes on template argument" warning
Karol Herbst [Fri, 20 Sep 2019 11:08:50 +0000 (13:08 +0200)]
clover: eliminate "ignoring attributes on template argument" warning

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
5 years agoclover/codegen: remove unused get_symbol_offsets function
Karol Herbst [Fri, 20 Sep 2019 10:45:11 +0000 (12:45 +0200)]
clover/codegen: remove unused get_symbol_offsets function

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
5 years agoclover/llvm: remove harmful std::move call
Karol Herbst [Fri, 20 Sep 2019 10:43:10 +0000 (12:43 +0200)]
clover/llvm: remove harmful std::move call

both clang and gcc warn with:
"moving a local object in a return statement prevents copy elision"

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
5 years agoiris: disable aux on first get_param if not created with aux
Tapani Pälli [Mon, 2 Sep 2019 10:02:33 +0000 (13:02 +0300)]
iris: disable aux on first get_param if not created with aux

This moves the fix from commit 361f3d19f1f to happen in get_param
(used now instead of get_handle by st/dri). This fixes artifacts
seen with Xorg and CCS_E.

Fixes: fc12fd05f56 "iris: Implement pipe_screen::resource_get_param"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoglsl: correct bitcast-helpers
Erik Faye-Lund [Tue, 24 Sep 2019 14:57:03 +0000 (16:57 +0200)]
glsl: correct bitcast-helpers

Without this, we'll incorrectly round off huge values to the nearest
representable double instead of keeping them at the exact value, as
we're supposed to.

Found by inspecting compiler-warnings.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 85faf5082f ("glsl: Add 64-bit integer support for constant expressions")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
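
The rounding problem mentioned in the bitcast-helpers commit above comes from the general difference between converting a 64-bit integer to double by value and copying its bits unchanged. The sketch below shows that difference only; the helper names are hypothetical and are not the functions touched by the commit.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(double) == sizeof(uint64_t),
               "this sketch assumes a 64-bit double");

/* Bit-preserving cast: the 64 bits are copied unchanged. */
static double bitcast_u64_to_double(uint64_t u)
{
   double d;
   memcpy(&d, &u, sizeof(d));
   return d;
}

static uint64_t bitcast_double_to_u64(double d)
{
   uint64_t u;
   memcpy(&u, &d, sizeof(u));
   return u;
}

/* Value conversion: integers above 2^53 that a double cannot represent
 * exactly are rounded to the nearest representable double.
 */
static uint64_t roundtrip_by_value(uint64_t u)
{
   /* volatile forces a real store to a double, so no excess
    * precision survives on targets that compute in wider registers.
    */
   volatile double d = (double)u;
   return (uint64_t)d;
}
```

For example, 2^53 + 1 does not survive roundtrip_by_value() on an IEEE 754 target, while the two bitcast helpers return it unchanged.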
5 years agolima/ppir: add support for indirect load of uniforms and varyings
Vasily Khoruzhick [Sat, 14 Sep 2019 18:00:16 +0000 (11:00 -0700)]
lima/ppir: add support for indirect load of uniforms and varyings

Utgard PP supports indirect load of uniforms and varyings, so let's
enable it.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/ppir: add node dependency types
Vasily Khoruzhick [Sat, 14 Sep 2019 18:01:03 +0000 (11:01 -0700)]
lima/ppir: add node dependency types

Currently we add dependencies in 3 cases:
1) One node consumes a value produced by another node
2) Sequence dependencies
3) Write-after-read dependencies

2) and 3) only affect scheduler decisions, since we can still use a pipeline
register if we have only 1 dependency of type 1).

Add 3 dependency types and mark dependencies as we add them.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
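
The three dependency types from the lima/ppir commit above could be modelled roughly as follows. This is a sketch under assumptions: the enum, struct and function names are hypothetical, and the real ppir data structures are not shown in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three cases listed in the commit message. */
enum dep_type {
   DEP_SRC,              /* 1) one node consumes a value produced by another */
   DEP_SEQUENCE,         /* 2) ordering only */
   DEP_WRITE_AFTER_READ, /* 3) ordering only */
};

struct dep {
   enum dep_type type;
};

/* Count the dependencies that actually carry a value. */
static size_t count_src_deps(const struct dep *deps, size_t num_deps)
{
   size_t count = 0;

   assert(deps != NULL || num_deps == 0);
   for (size_t i = 0; i < num_deps; i++) {
      if (deps[i].type == DEP_SRC)
         count++;
   }
   return count;
}

/* Types 2) and 3) only constrain scheduling, so they are ignored here:
 * a pipeline register stays usable as long as there is exactly one
 * dependency of type 1).
 */
static int can_use_pipeline_reg(const struct dep *deps, size_t num_deps)
{
   return count_src_deps(deps, num_deps) == 1;
}
```

Tagging each dependency when it is added lets a check like this skip the ordering-only edges instead of treating every edge as a value use.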
5 years agolima/ppir: don't attempt to clone tex coords if it's not varying
Vasily Khoruzhick [Tue, 24 Sep 2019 04:20:07 +0000 (21:20 -0700)]
lima/ppir: don't attempt to clone tex coords if it's not varying

It makes no sense to clone texture coords if it's not varying, moreover
we don't support cloning ALU nodes.

Fixes: 1c1890fa7077 ("lima/ppir: clone uniforms and load_coords into each successor")
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agoradeonsi/nir: lower load constants to scalar
Timothy Arceri [Fri, 20 Sep 2019 06:54:31 +0000 (16:54 +1000)]
radeonsi/nir: lower load constants to scalar

We call nir_lower_load_const_to_scalar in the state tracker's linker;
however, some later passes can reintroduce constant vectors. Here
we lower these to scalar and perform optimisations. The Intel
drivers do a similar call in their backend.

shader-db results VEGA 64:

Totals from affected shaders:
SGPRS: 152168 -> 151976 (-0.13 %)
VGPRS: 135224 -> 135112 (-0.08 %)
Spilled SGPRs: 4027 -> 4163 (3.38 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 10670028 -> 10654776 (-0.14 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 13122 -> 13135 (0.10 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoturnip: use image tile_mode for gmem configuration
Jonathan Marek [Tue, 24 Sep 2019 18:39:55 +0000 (14:39 -0400)]
turnip: use image tile_mode for gmem configuration

Fixes at least this deqp test:
dEQP-VK.api.smoke.triangle

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoturnip: fix binning shader compilation
Jonathan Marek [Tue, 24 Sep 2019 18:36:53 +0000 (14:36 -0400)]
turnip: fix binning shader compilation

ir3 segfaults if nonbinning is NULL for the binning pass shader.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agonir/opt_remove_phis: handle phis with no sources
Rhys Perry [Mon, 23 Sep 2019 13:48:22 +0000 (14:48 +0100)]
nir/opt_remove_phis: handle phis with no sources

This can happen with loops with unreachable exits which are later
optimized away.

Fixes assertion in dEQP-VK.graphicsfuzz.unreachable-loops with RADV.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoradeonsi: fix VAAPI segfault due to various bugs
Michel Dänzer [Thu, 19 Sep 2019 00:18:39 +0000 (20:18 -0400)]
radeonsi: fix VAAPI segfault due to various bugs

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111236

5 years agogallium/vl: don't set PIPE_HANDLE_USAGE_EXPLICIT_FLUSH
Marek Olšák [Tue, 17 Sep 2019 22:22:08 +0000 (18:22 -0400)]
gallium/vl: don't set PIPE_HANDLE_USAGE_EXPLICIT_FLUSH

because vl doesn't call flush_resource and I wasn't able to find
all places where flush_resource needs to be called.

This fixes corrupted / unflushed surfaces with fullscreen videos on Raven.

Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
5 years agoradeonsi: initialize displayable DCC using the retile blit to prevent hangs
Marek Olšák [Tue, 17 Sep 2019 02:31:48 +0000 (22:31 -0400)]
radeonsi: initialize displayable DCC using the retile blit to prevent hangs

Cc: 19.2 <mesa-stable@lists.freedesktop.org>

5 years agonir/opt_large_constants: Handle store writemasks
Connor Abbott [Tue, 24 Sep 2019 10:43:29 +0000 (12:43 +0200)]
nir/opt_large_constants: Handle store writemasks

This fixes some piglit tests on radeonsi NIR where a varying is
initialized to a constant array in the vertex shader. Varying packing
after nir_lower_io_to_temporaries creates writemasked stores which
persist after pulling the constant initialization down into the fragment
shader.

While we're here, rewrite handle_constant_store() to do the loop over
components outside the switch, so that we don't have to duplicate the
writemask checking for every bitsize.

Fixes: 1235850522c ("nir: Add a large constants optimization pass")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
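
The restructuring described in the nir/opt_large_constants commit above (looping over components outside the switch so that the writemask is checked once) might look like the sketch below. It is not the code of handle_constant_store(); store_masked() and its parameters are hypothetical and the source values are simplified to an array of uint64_t.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the components of src selected by writemask into dst, which holds
 * num_components tightly packed values of bit_size bits each.
 */
static void store_masked(uint8_t *dst, const uint64_t *src,
                         unsigned num_components, unsigned bit_size,
                         unsigned writemask)
{
   assert(bit_size == 8 || bit_size == 16 ||
          bit_size == 32 || bit_size == 64);
   assert(num_components <= 32);

   const unsigned bytes = bit_size / 8;

   for (unsigned i = 0; i < num_components; i++) {
      /* Single writemask check, shared by every bit size. */
      if (!(writemask & (1u << i)))
         continue;

      uint8_t *p = dst + i * bytes;

      switch (bit_size) {
      case 8:  { uint8_t  v = (uint8_t)src[i];  memcpy(p, &v, sizeof(v)); break; }
      case 16: { uint16_t v = (uint16_t)src[i]; memcpy(p, &v, sizeof(v)); break; }
      case 32: { uint32_t v = (uint32_t)src[i]; memcpy(p, &v, sizeof(v)); break; }
      case 64: { uint64_t v = src[i];           memcpy(p, &v, sizeof(v)); break; }
      }
   }
}
```

Components whose writemask bit is clear keep whatever the destination already contained, which is the behaviour a writemasked store needs.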
5 years agomeson: split more compiler options to their own line
Eric Engestrom [Mon, 23 Sep 2019 17:53:22 +0000 (18:53 +0100)]
meson: split more compiler options to their own line

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agomeson: drop -Wno-foo bug workaround for Meson < 0.46
Eric Engestrom [Mon, 23 Sep 2019 17:37:01 +0000 (18:37 +0100)]
meson: drop -Wno-foo bug workaround for Meson < 0.46

This was a workaround for a bug in Meson that was fixed in 0.46 [1].

[1] https://github.com/mesonbuild/meson/pull/2284

Fixes: f7b6a8d12fdc446e3251 ("meson: bump required version to 0.46")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agoradv: fix s/load/store/ copy-paste typo
Eric Engestrom [Mon, 23 Sep 2019 16:48:55 +0000 (17:48 +0100)]
radv: fix s/load/store/ copy-paste typo

Fixes: cdc6efddf918bc07d30d ("radv: implement all depth/stencil resolve modes using graphics")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agonouveau: add idep_nir_headers as dep for libnouveau
Stephen Barber [Tue, 24 Sep 2019 00:51:43 +0000 (17:51 -0700)]
nouveau: add idep_nir_headers as dep for libnouveau

Fixes a compilation error when building libnouveau:

In file included from ../src/gallium/drivers/nouveau/nv50/nv50_program.c:25:
../src/compiler/nir/nir.h:1115:10: fatal error: nir_intrinsics.h: No such file or directory
 #include "nir_intrinsics.h"
           ^~~~~~~~~~~~~~~~~~
           compilation terminated.

Fixes: f014ae3c7cce504afe5d ("nouveau: add support for nir")
Signed-off-by: Stephen Barber <smbarber@chromium.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
5 years agoradv: Add workaround for hang in The Surge 2.
Bas Nieuwenhuizen [Tue, 24 Sep 2019 00:53:21 +0000 (02:53 +0200)]
radv: Add workaround for hang in The Surge 2.

Released today and hangs on RADV. We don't have the root cause yet,
but this should unblock people playing the game.

No drirc because the radv debugflags are not usable from drirc and
I want this backported.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agoi965/fs: set rounding mode when emitting the flrp instruction
Andres Gomez [Mon, 23 Sep 2019 22:37:57 +0000 (01:37 +0300)]
i965/fs: set rounding mode when emitting the flrp instruction

flrp was overlooked when the rounding mode was added for other
instructions.

Fixes: ba1e25e1aa6 ("i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions")
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
5 years agoi965/fs: add a comment about how the rounding mode in fmul is set
Andres Gomez [Mon, 23 Sep 2019 22:16:11 +0000 (01:16 +0300)]
i965/fs: add a comment about how the rounding mode in fmul is set

After
1711bf6cf2d ("intel/fs: Generate better code for fsign multiplied by a value"),
the conflict resolution for setting the rounding mode after the
fused fmul and fsign optimization is non-obvious.

Basically, the optimization doesn't really result in a MUL, or any
other operation which would need to have the rounding mode set. Hence,
we set it just before the actual MUL in the treatment of fmul.

Fixes: ba1e25e1aa6 ("i965/fs: set rounding mode when emitting fadd, fmul and ffma instructions")
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
5 years agobin/get-pick-list.sh: sha1 commits can be smaller than 8 chars
Juan A. Suarez Romero [Tue, 10 Sep 2019 08:30:43 +0000 (10:30 +0200)]
bin/get-pick-list.sh: sha1 commits can be smaller than 8 chars

The script only handles commits with "Fixes: <sha1>" where <sha1> is
8 or more chars long. But <sha1> can be shorter, e.g. 7 chars.

This commit relaxes the restriction to handle a <sha1> of 4 or more chars.

Fixes: 533fead4236 ("bin/get-pick-list.sh: tweak the commit sha matching pattern")
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
5 years agolima/gpir: Fix 64-bit shift in scheduler spilling
Connor Abbott [Wed, 18 Sep 2019 17:47:28 +0000 (00:47 +0700)]
lima/gpir: Fix 64-bit shift in scheduler spilling

There are 64 physical registers so the shift must be 64 bits.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
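
The 64-bit shift requirement from the lima/gpir commit above can be shown with a small sketch. It assumes the physical registers are tracked as bits in a 64-bit mask, which the commit message does not state; physreg_bit() and physreg_is_live() are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_PHYSREGS 64 /* per the commit message: 64 physical registers */

/* Bit for one physical register in a 64-bit mask. The constant must be
 * 64 bits wide: a plain "1 << reg" is an int shift, which is undefined
 * once reg reaches the width of int.
 */
static uint64_t physreg_bit(unsigned reg)
{
   assert(reg < NUM_PHYSREGS);
   return UINT64_C(1) << reg;
}

static int physreg_is_live(uint64_t live_mask, unsigned reg)
{
   return (live_mask & physreg_bit(reg)) != 0;
}
```

Registers 32 to 63 are the ones affected: with a 32-bit shift their bits cannot be represented at all.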
5 years agolima/gpir: Don't emit movs when translating from NIR
Connor Abbott [Wed, 18 Sep 2019 11:13:08 +0000 (18:13 +0700)]
lima/gpir: Don't emit movs when translating from NIR

The scheduler doesn't expect them. To do this, I had to refactor the
registration part of gpir_node_create_dest() to be separate from
creating and inserting the node, since the last two now aren't done when
handling moves. This adds more code but creates the possibility of
automatically inserting input dependencies when inserting nodes, similar
to what's done in NIR with the use-def lists (this isn't done yet).

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/gpir: Fix postlog2 fixup handling
Connor Abbott [Wed, 18 Sep 2019 05:29:45 +0000 (12:29 +0700)]
lima/gpir: Fix postlog2 fixup handling

We guarantee that a complex1 op is always used by postlog2 directly by
rewriting the postlog2 op to be a move when there would be a move
inserted between them. But we weren't doing this in all circumstances
where there might be a move. Move the logic to place_move() so that it
always happens. Fixes a few log tests that happened to start failing due
to changes in the register allocator leading to a different scheduling
order.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/gpir: Use registers for values live in multiple blocks
Connor Abbott [Fri, 13 Sep 2019 06:23:56 +0000 (13:23 +0700)]
lima/gpir: Use registers for values live in multiple blocks

This commit adds the framework for cross-basic-block register
allocation. Like ARM's compiler, we assume that the value registers
aren't usable across branches, which means we have to use physical
registers to store any value that crosses a basic block. There are three
parts to this:

1. When translating from NIR, we rely on the NIR out-of-ssa pass to
coalesce values into registers. We insert store_reg instructions for
values used in more than one basic block, and load_reg instructions for
values not defined in the same basic block (or defined after their use,
for loops). So by the time we've translated out of NIR we've already
split things into values (which are only used in the same basic block)
and registers (which are only used in different basic blocks than where
they're defined).

2. We allocate the registers at the same time that we allocate the
values, before the final scheduler. Unlike the values, where the
assigned color is fake, we assign the actual physical index & component
to physregs at this stage. load_reg and store_reg are treated as moves
in the allocator and when creating write-after-read dependencies.

3. Finally, in the main scheduler we have to avoid overwriting existing
live physregs when spilling. First, we have to tell the scheduler which
physical registers are live at the end of each block, to avoid
overwriting those. If a register is only live at the beginning, we can
reuse it for spilling after the last original use in the final program
happens, i.e. before any original use is scheduled, but we have to be
careful to add the proper dependencies so that the spill write is
scheduled before the original reads. To handle this we repurpose
reg_link for uses to be used by the scheduler.

A few register-related things copied over from NIR or from other
drivers can be dropped.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/gpir: Support branch instructions
Connor Abbott [Wed, 28 Aug 2019 08:57:35 +0000 (10:57 +0200)]
lima/gpir: Support branch instructions

Because branch conditions have to be in the pass slot, there is no
unconditional branch, and realistically the pass slot has to contain a
move when branching (there's nothing it does that would be useful for
operating on booleans, so we can't use it for anything when computing
the branch condition), we put the branch instruction in the pass slot
and at codegen time turn it into a move of the branch condition. This
means that it doesn't have to be special-cased like store instructions
are in the scheduler. Because of this decision we can remove the
half-implemented BRANCH codegen slot. Finally, we (ab)use the existing
schedule_first mechanism to make sure that branches are always last in
the basic block.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/gpir: Only try to place actual children
Connor Abbott [Tue, 10 Sep 2019 14:11:42 +0000 (21:11 +0700)]
lima/gpir: Only try to place actual children

When picking a node to be scheduled, we try to schedule its children as
well. But we shouldn't try to schedule nodes which only have a fake
dependency on the original node, since this isn't the point of
scheduling children at the same time and can break some expectations of
the rest of the code.

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/gpir: Fix compiler warning
Connor Abbott [Fri, 13 Sep 2019 06:18:29 +0000 (13:18 +0700)]
lima/gpir: Fix compiler warning

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agoglx: Implement GLX_EXT_no_config_context
Adam Jackson [Tue, 14 Nov 2017 20:13:06 +0000 (15:13 -0500)]
glx: Implement GLX_EXT_no_config_context

This is the GLX counterpart to EGL_KHR_no_config_context. Contexts may
now be created without reference to an fbconfig, in which case it is
treated as compatible with any fbconfig (and thus any GLX drawable).

Khronos: https://github.com/KhronosGroup/OpenGL-Registry/pull/102
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agoglx: Lift sending the MakeCurrent request to top-level code
Adam Jackson [Tue, 14 Nov 2017 20:13:04 +0000 (15:13 -0500)]
glx: Lift sending the MakeCurrent request to top-level code

Somewhat terrifyingly, we never sent this for direct contexts, which
means the server never knew the context/drawable bindings. To handle
this sanely, pull the request code up out of the indirect backend, and
rewrite the context switch path to call it as appropriate.  This
attempts to preserve the existing behavior of not calling unbind() on
the context if its refcount would not drop to zero.

Of course, you can't just do this indiscriminately, because this is GLX
and extant X servers have bugs and everything is terrible. To wit:

- For 1.20.x prior to 1.20.6, you can bind a direct context once, but
the second time you try to modify the context's binding you will get
GLXBadContextTag. This includes unbinding the context. And "deleting"
the context will leak memory, because it will still appear to be
current.

- For 1.19 and earlier, glXMakeCurrent(dpy, None, ctx) should be legal
for GL 3.0+ contexts, but the server will throw BadMatch.

To guard against this, we only send the request for indirect contexts
unless the server is known good, and only mention one context at a time
in such a request; if switching between contexts, we first unbind the
old, and then bind the new. Note that the second VendorRelease() version
is to catch XFree86 4.x and Xorg [67].x, which almost certainly have the
above bugs. Other servers might report different version numbers here,
but we can't do direct rendering against them, so this should be safe.

Fixes glx-make-context, glx-multi-window-single-context and
glx-query-drawable-glx_fbconfig_id-window. Sufficiently old piglit will
regress on glx-make-glxdrawable-current (throwing BadMatch), which is
fixed by mesa/piglit!116.

5 years agoglx: Move vertex array protocol state into the indirect backend
Adam Jackson [Tue, 14 Nov 2017 20:13:03 +0000 (15:13 -0500)]
glx: Move vertex array protocol state into the indirect backend

Only relevant for indirect contexts, so let's get that code out of the
common path.

5 years agointel: Increase Gen11 compute shader scratch IDs to 64.
Kenneth Graunke [Fri, 23 Aug 2019 00:32:25 +0000 (17:32 -0700)]
intel: Increase Gen11 compute shader scratch IDs to 64.

From the MEDIA_VFE_STATE docs:

   "Starting with this configuration, the Maximum Number of Threads must
    be set to (#EU * 8) for GPGPU dispatches.

    Although there are only 7 threads per EU in the configuration, the
    FFTID is calculated as if there are 8 threads per EU, which in turn
    requires a larger amount of Scratch Space to be allocated by the
    driver."

It's pretty clear that we need to increase this for scratch address
calculations, because the FFTID has a certain bit-pattern.  The quote
above seems to indicate that we should increase the actual thread count
programmed in MEDIA_VFE_STATE as well, but we think the intention is to
only bump the scratch space.

Fixes GPU hangs in Bioshock Infinite and Synmark's CSDof on Icelake 8x8.

Fixes: 5ac804bd9ac ("intel: Add a preliminary device for Ice Lake")
Reviewed-by: Matt Turner <mattst88@gmail.com>
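
The arithmetic behind the Gen11 scratch ID commit above can be sketched as follows. The helper names are hypothetical, and the figure of 8 EUs used in the usage note is an assumption made for illustration; the commit message itself only gives the resulting value of 64 and the (#EU * 8) rule.

```c
#include <assert.h>

/* Number of scratch IDs to reserve. The FFTID is calculated as if each EU
 * had fftid_threads_per_eu threads, even when fewer threads really exist.
 */
static unsigned scratch_ids(unsigned num_eus, unsigned fftid_threads_per_eu)
{
   assert(num_eus > 0 && fftid_threads_per_eu > 0);
   return num_eus * fftid_threads_per_eu;
}

/* Total scratch allocation for a given per-thread scratch size. */
static unsigned long long scratch_bytes(unsigned ids,
                                        unsigned bytes_per_thread)
{
   return (unsigned long long)ids * bytes_per_thread;
}
```

Under the assumed 8 EUs, sizing by the 7 real threads per EU gives 56 IDs, while sizing by the 8 threads the FFTID calculation assumes gives 64, so an allocation based on 56 would be too small for the highest FFTIDs.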
5 years agoRevert "intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM"
Kenneth Graunke [Mon, 23 Sep 2019 23:30:29 +0000 (16:30 -0700)]
Revert "intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM"

This reverts commit 729de1488f49033bc181b8123af5658228a51bf1.

It turns out that, although the register is in the logical context,
it isn't whitelisted, so we can't actually write it from userspace
batch buffers.  The write just becomes a noop, which is why we saw
no performance changes.

I manually whitelisted it, and still observed no performance gains, but
it did regress KHR-GL46.texture_cube_map_array.color_depth_attachments
on the iris driver.  So we might need to fix something before enabling
this.  To prevent it randomly getting turned on should the kernel ever
whitelist this register, we revert the patch for now.

5 years agoutil/rb_tree: Replace useless ifs with asserts
Jason Ekstrand [Mon, 23 Sep 2019 17:24:12 +0000 (12:24 -0500)]
util/rb_tree: Replace useless ifs with asserts

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agobroadcom/genxml: Stop manually scrubbing 'α' -> "alpha"
Kenneth Graunke [Fri, 20 Sep 2019 22:43:12 +0000 (15:43 -0700)]
broadcom/genxml: Stop manually scrubbing 'α' -> "alpha"

'α' has never appeared in any genxml files, so there's no need to
replace it with the word "alpha".

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agointel/genxml: Stop manually scrubbing 'α' -> "alpha"
Kenneth Graunke [Thu, 22 Aug 2019 20:29:31 +0000 (13:29 -0700)]
intel/genxml: Stop manually scrubbing 'α' -> "alpha"

'α' has never appeared in any genxml files, so there's no need to
replace it with the word "alpha".

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
5 years agofreedreno/a6xx: do streamout only in binning pass
Rob Clark [Fri, 20 Sep 2019 21:58:49 +0000 (14:58 -0700)]
freedreno/a6xx: do streamout only in binning pass

Use VPC_SO_OVERRIDE to control whether we do streamout in binning or
draw pass.  Normally we want to do streamout in binning pass, except
when there is a single tile and the binning pass is skipped.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/a6xx: fix binning pass vs. xfb
Rob Clark [Fri, 20 Sep 2019 20:50:21 +0000 (13:50 -0700)]
freedreno/a6xx: fix binning pass vs. xfb

We could be doing streamout from the binning pass.  In this case we want to
use the full VS which doesn't have (potentially streamed out) varyings
stripped out.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agofreedreno/a6xx: un-open-code PC_PRIMITIVE_CNTL_1.PSIZE
Rob Clark [Thu, 19 Sep 2019 18:30:01 +0000 (11:30 -0700)]
freedreno/a6xx: un-open-code PC_PRIMITIVE_CNTL_1.PSIZE

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agoac/nir: force unnormalized coordinates for RECT
Marek Olšák [Wed, 18 Sep 2019 23:41:18 +0000 (19:41 -0400)]
ac/nir: force unnormalized coordinates for RECT

This fixes VAAPI.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>