mesa.git
Caio Marcelo de Oliveira Filho [Wed, 10 Apr 2019 18:13:40 +0000 (11:13 -0700)]
spirv: Tell which opcode or value is unhandled when failing

v2: When available, include the opcode name too. (Karol)

v3: Use more to_string helpers. (Karol)
    Include the wrong bit_size in those failures.
    Include the capability number in spv_check_supported.
    Provide vtn_fail_with_* macros to avoid noise in the call sites.

v4: Provide macros only for opcode and decoration, which have enough
    usages to justify them. (Jason)

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Caio Marcelo de Oliveira Filho [Wed, 10 Apr 2019 17:04:05 +0000 (10:04 -0700)]
spirv: Add more to_string helpers

Also, use a set to identify repeated values.  The previous arrangement
worked when the repetitions were one after another, but in some of the
new cases they are not.
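
A minimal sketch of the deduplication idea, written in Python as an illustration rather than taken from the real generator script.  The function name emit_to_string and the enum data are made up.  Several names can share one value, and a C switch cannot repeat a case label, so the values already emitted are remembered in a set:

```python
def emit_to_string(enum_name, entries):
    """Return C source for a to_string helper, skipping repeated values.

    entries is a list of (name, value) pairs; aliases share a value and
    need not be adjacent in the list.
    """
    seen = set()
    lines = [f"const char *{enum_name}_to_string(int v)", "{", "   switch (v) {"]
    for name, value in entries:
        if value in seen:
            continue  # a second 'case' for the same value would not compile
        seen.add(value)
        lines.append(f'   case {value}: return "{name}";')
    lines += ['   default: return "unknown";', "   }", "}"]
    return "\n".join(lines)


# Hypothetical enum: "A" and "A_ALIAS" share value 1 but are not adjacent.
source = emit_to_string("example", [("A", 1), ("B", 2), ("A_ALIAS", 1)])
```

Comparing each entry only with the previous one would miss A_ALIAS here, which is the situation the message describes.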

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Tue, 16 Apr 2019 17:37:12 +0000 (12:37 -0500)]
intel/mi_builder: Disable mem_mem tests on IVB

Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Kenneth Graunke [Tue, 16 Apr 2019 07:27:33 +0000 (00:27 -0700)]
iris: Change vendor and renderer strings

This patch changes the GL_VENDOR string from "Mesa Project" to "Intel".
This makes GLX_MESA_query_renderer report "Vendor: Intel (0x8086)"
instead of "Vendor: Mesa Project (0x8086)" which is arguably wrong.
We now also use a consistent vendor string across Windows and Linux.

It also prepends "Mesa" to the GL_RENDERER string, both to credit the
community and have a distinguishing mark between the two drivers.  We
drop "DRI" compared to i965, as it's not really that important.

Improves performance in Portal by 1.8x.  Iris is now 3.86% faster
than i965 at the portal-d1.dem timedemo on my Kabylake laptop.  One
change is that Portal selects the MapBufferRange path based on the
vendor string, and iris's BufferSubData path is still missing the
storage invalidation optimization.

Jason Ekstrand [Mon, 15 Apr 2019 20:39:22 +0000 (15:39 -0500)]
intel/mi_builder: Re-order an initializer

The order doesn't matter in C99 but some C++ compilers seem to care.

Tested-by: Clayton Craft <clayton.a.craft@intel.com>
Jason Ekstrand [Sat, 13 Apr 2019 15:35:07 +0000 (10:35 -0500)]
nir/algebraic: Use a cache to avoid re-emitting structs

This takes the simplest and most reliable approach to reducing
redundancy that I could come up with: just use the struct declaration
as the cache key.  This cuts the size of the generated C file roughly
in half and takes about 50 KiB off the .data section.
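
A rough sketch of such a cache, assuming the generator is a Python script that emits C declarations.  The class name StructCache, the variable naming scheme and the struct type are hypothetical:

```python
class StructCache:
    """Emit each distinct struct declaration once, keyed by its text."""

    def __init__(self):
        self.names = {}         # declaration text -> emitted variable name
        self.declarations = []  # C declarations, in emission order

    def get(self, c_type, initializer):
        # The declaration text (minus the variable name) is the cache key.
        key = f"{c_type} = {initializer}"
        if key not in self.names:
            name = f"cached{len(self.names)}"  # hypothetical naming scheme
            self.names[key] = name
            self.declarations.append(
                f"static const {c_type} {name} = {initializer};")
        return self.names[key]


cache = StructCache()
a = cache.get("struct example_expr", "{ 1, 2 }")
b = cache.get("struct example_expr", "{ 1, 2 }")  # identical text: reused
c = cache.get("struct example_expr", "{ 3, 4 }")
```

Keying on the text is crude, but two declarations with the same text are necessarily interchangeable, which is what makes it reliable.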

size before (release build):

   text    data     bss     dec     hex filename
5363833  336880   13584 5714297  573179 _install/lib64/libvulkan_intel.so

size after (release build):

   text    data     bss     dec     hex filename
5229017  285264   13584 5527865  545939 _install/lib64/libvulkan_intel.so

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Jason Ekstrand [Sat, 13 Apr 2019 15:32:55 +0000 (10:32 -0500)]
nir/algebraic: Move the template closer to the render function

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Kenneth Graunke [Tue, 16 Apr 2019 05:58:17 +0000 (22:58 -0700)]
iris: Move iris_debug_recompile calls before uploading.

Order of operations is important, otherwise we'll find the program we
just uploaded as the "old" compile and get confused why nothing is
different between the two keys.
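
A toy illustration of why the order matters, in Python.  The cache layout (a dict from a program id to the key of its latest upload) and the function name are assumptions, not the driver's data structures:

```python
def recompile(cache, program_id, new_key, check_first):
    """Return True if a previous upload with a different key was found.

    cache maps a program id to the key of its most recent upload.
    """
    if not check_first:
        cache[program_id] = new_key      # uploading first is the bug
    old_key = cache.get(program_id)      # look for the "old" compile
    differs = old_key is not None and old_key != new_key
    cache[program_id] = new_key          # upload the new program
    return differs


good = recompile({"fs": ("samples", 1)}, "fs", ("samples", 4), check_first=True)
bad = recompile({"fs": ("samples", 1)}, "fs", ("samples", 4), check_first=False)
```

With check_first=False the lookup returns the entry that was just stored, so the two keys compare equal and no difference is reported.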

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke [Tue, 16 Apr 2019 05:17:49 +0000 (22:17 -0700)]
iris: Print the reason for shader recompiles.

I was lazy earlier and hadn't bothered typing / refactoring this.
Now I'm hitting some extra recompiles and would like to see why.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke [Tue, 16 Apr 2019 04:59:50 +0000 (21:59 -0700)]
i965: Move program key debugging to the compiler.

The i965 driver has a bunch of code to compare two sets of program keys
and print out the differences.  This can be useful for debugging why a
shader needed to be recompiled on the fly due to non-orthogonal state
dependencies.  anv doesn't do recompiles, so we didn't need to share
this in the past - but I'd like to use it in iris.

This moves the bulk of the code to the compiler where it can be reused.
To make that possible, we need to decouple it from i965 - we can't get
at the brw program cache directly, nor use brw_context to print things.
Instead, we use compiler->shader_perf_log(), and simply pass in keys.

We put all of this debugging code in brw_debug_recompile.c, and only
export a single function, for simplicity.  I also tidied the code a
bit while moving it, now that it all lives in one file.
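
The shape of such a helper might look like the following Python sketch.  FsKey and its fields are hypothetical stand-ins for a program key, and perf_log plays the role of the compiler->shader_perf_log() callback mentioned above:

```python
from dataclasses import dataclass, fields


@dataclass
class FsKey:
    """Hypothetical stand-in for a fragment shader program key."""
    alpha_test: bool = False
    num_samples: int = 1
    flat_shade: bool = False


def debug_recompile(perf_log, old_key, new_key):
    """Report each field that differs; return True if any was found."""
    found = False
    for f in fields(new_key):
        old, new = getattr(old_key, f.name), getattr(new_key, f.name)
        if old != new:
            perf_log(f"  {f.name} {old}->{new}")
            found = True
    if not found:
        perf_log("  something else")
    return found


messages = []
found = debug_recompile(messages.append, FsKey(), FsKey(num_samples=4))
```

Because the helper only receives two keys and a logging callback, it needs no access to a driver context or program cache.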

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Marek Olšák [Mon, 15 Apr 2019 16:49:33 +0000 (12:49 -0400)]
winsys/amdgpu: don't set GTT with GDS & OA placements on APUs

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Wed, 10 Apr 2019 01:40:33 +0000 (21:40 -0400)]
nir: optimize gl_SampleMaskIn to gl_HelperInvocation for radeonsi when possible

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
suresh guttula [Thu, 11 Apr 2019 04:51:56 +0000 (10:21 +0530)]
st/va/enc: Add support for frame_cropping_flag of VAEncSequenceParameterBufferH264

This patch adds support for frame cropping when the input size does not
match the aligned size. Currently the vaapi driver ignores the frame
cropping values provided by the client. This change updates the SPS NAL
unit with the proper cropping values.

Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
suresh guttula [Thu, 11 Apr 2019 04:49:33 +0000 (10:19 +0530)]
radeon/vce: Add support for frame_cropping_flag of VAEncSequenceParameterBufferH264

This patch adds support for frame cropping when the input size does not
match the aligned size. Currently the vaapi driver ignores the frame
cropping values provided by the client. This change updates the SPS NAL
unit with the proper cropping values.

v2: Move the default crop setting to the else branch, taken when enc_frame_cropping_flag is not set.

Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
suresh guttula [Thu, 11 Apr 2019 04:39:10 +0000 (10:09 +0530)]
vl: Add cropping flags for H264

This patch adds cropping flags for H264 in pipe_h264_enc_pic_control.

Signed-off-by: Satyajit Sahu <satyajit.sahu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Tapani Pälli [Fri, 15 Mar 2019 07:47:49 +0000 (09:47 +0200)]
compiler/glsl: handle case where we have multiple users for types

Both Vulkan and OpenGL might be using glsl_types simultaneously, or we
can also have multiple concurrent Vulkan instances using glsl_types.
The patch adds a one-time init to track the number of users and releases
the types only when the last user calls _glsl_type_singleton_decref().
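
The user-counting scheme can be sketched as follows.  This is an illustrative Python model, not the C implementation; the function names and the shared table are assumptions:

```python
import threading

_lock = threading.Lock()
_users = 0
_types = None  # shared table, created by the first user


def singleton_init_or_ref():
    """Create the shared table on first use, otherwise just count the user."""
    global _users, _types
    with _lock:
        if _users == 0:
            _types = {}
        _users += 1


def singleton_decref():
    """Drop one user; release the shared table when the last one leaves."""
    global _users, _types
    with _lock:
        assert _users > 0
        _users -= 1
        if _users == 0:
            _types = None


def types_alive():
    with _lock:
        return _types is not None
```

Two users (say, a GL context and a Vulkan instance) each call the init function once; the table survives the first decref and is released by the second.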

This change fixes the glsl_type memory leaks we have with the anv driver.

v2: reuse hash_mutex, cleanup, apply fix also to radv driver and
    rename helper functions (Jason)

v3: move init, destroy to happen on GL context init and destroy

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Danylo Piliaiev [Mon, 25 Mar 2019 12:15:27 +0000 (14:15 +0200)]
intel/compiler: Do not reswizzle dst if instruction writes to flag register

If we write to the flag register, changing the swizzle would change
which channels are written to it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110201
Fixes: 4cd1a0be
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Michel Dänzer [Thu, 11 Apr 2019 16:38:30 +0000 (18:38 +0200)]
gitlab-ci: Use LLVM 3.4 from Debian jessie for scons-llvm job

This gets us closer to the officially supported minimum version of LLVM,
which is 3.3.

Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Michel Dänzer [Fri, 5 Apr 2019 16:32:25 +0000 (18:32 +0200)]
gitlab-ci: Do not use subshells for compiling dependencies

bash subshells don't inherit the -e option by default, so failures in
the subshell commands wouldn't cause the CI job to fail.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Michel Dänzer [Fri, 5 Apr 2019 08:38:05 +0000 (10:38 +0200)]
gitlab-ci: Drop unused clang 5/6 packages

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Michel Dänzer [Fri, 5 Apr 2019 08:36:29 +0000 (10:36 +0200)]
gitlab-ci: Use clang 8 instead of 7

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Michel Dänzer [Wed, 3 Apr 2019 13:48:51 +0000 (15:48 +0200)]
gitlab-ci: Remove unused Debian packages from Docker image

v2:
* Also remove autotools, now that the Mesa autotools build system has
  been dropped.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> # v1
Michel Dänzer [Wed, 3 Apr 2019 10:21:48 +0000 (12:21 +0200)]
gitlab-ci: Remove unneeded (stuff from) APT command lines

We either compile these locally, or they are dependencies of other
packages we install.

v2:
* Adapt to leaving self-compiled packages untouched.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Michel Dänzer [Thu, 4 Apr 2019 16:01:27 +0000 (18:01 +0200)]
gitlab-ci: Install most packages from Debian buster

We now use the C frontend of GCC 8 instead of 6 (required tweaking the
before_script for the clang job). We cannot use the C++ frontend of GCC
7 or newer yet, because upstream GCC 7 changed some C++ name mangling
stuff in backwards incompatible ways, and LLVM < 6.0 packages aren't
available in buster.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Michel Dänzer [Wed, 3 Apr 2019 10:23:51 +0000 (12:23 +0200)]
gitlab-ci: Use Debian packages instead of pip ones for meson and scons

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Michel Dänzer [Thu, 4 Apr 2019 09:25:28 +0000 (11:25 +0200)]
gitlab-ci: Use HTTPS for APT repositories

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Michel Dänzer [Tue, 2 Apr 2019 14:56:54 +0000 (16:56 +0200)]
gitlab-ci: Use Debian stretch instead of Ubuntu bionic

The APT archive used by the Ubuntu docker image can be slow, even timing
out sometimes, causing spurious failures of the containers-build job.
The Debian docker image uses deb.debian.org, which is backed by a
content distribution network.

One downside is that stretch only has GCC 6, whereas bionic had 7.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Gert Wollny [Thu, 11 Apr 2019 07:18:37 +0000 (09:18 +0200)]
doc/features: Add a few extensions to the feature matrix

These additions already landed but I forgot to update the feature
matrix.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Tue, 16 Apr 2019 07:13:37 +0000 (09:13 +0200)]
radv: sort the shader capabilities alphabetically

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Kenneth Graunke [Tue, 16 Apr 2019 05:34:15 +0000 (22:34 -0700)]
iris: Make shader_perf_log print to stderr if INTEL_DEBUG=perf is set

This matches i965's behavior, and makes sure that shader compiler
messages are visible when setting INTEL_DEBUG=perf.

Samuel Pitoiset [Mon, 15 Apr 2019 15:42:20 +0000 (17:42 +0200)]
radv: enable shaderInt8 on SI and CIK

No CTS failures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Chia-I Wu [Tue, 9 Apr 2019 20:46:38 +0000 (20:46 +0000)]
virgl: fix fence fd version check

Fixes: d1a1c21e762 ("virgl: native fence fd support")
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Chia-I Wu [Tue, 9 Apr 2019 20:16:00 +0000 (20:16 +0000)]
virgl: introduce virgl_drm_fence

virgl_drm_fence can wrap either a fence fd or a virgl_hw_res.  Because a
fence fd is cheaper than a virgl_hw_res, we use it whenever it is
available.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Chia-I Wu [Tue, 9 Apr 2019 18:18:43 +0000 (18:18 +0000)]
virgl: hide fence internals from the driver

Fence fds are cheaper than resources.  We want to let winsys make the
decision and use fence fds whenever they are supported.  This commit
prepares the work.

For the moment, we create a resource _and_ a fence fd when
supports_fences is true.  This will be fixed such that we create a
resource _or_ a fence fd.  (And because of a version check bug that we
will fix later, supports_fences is actually never true).

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Chia-I Wu [Tue, 9 Apr 2019 17:55:40 +0000 (17:55 +0000)]
virgl: handle fence_server_sync in winsys

It does not need help from the driver.  This also fixes one issue where
the fence is ignored when the transfer queue is full.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Roland Scheidegger [Mon, 15 Apr 2019 19:36:32 +0000 (21:36 +0200)]
gallivm: fix bogus assert in get_indirect_index

0 is a valid value as max index, and the code handles it fine. This isn't
commonly seen, as it will only happen with array declarations of size 1.
Fixes piglit tests/shaders/complex-loop-analysis-bug.shader_test

Fixes: a3c898dc97ec "gallivm: fix improper clamping of vertex index when fetching gs inputs"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110441

Reviewed-by: Brian Paul <brianp@vmware.com>
Andres Gomez [Fri, 8 Mar 2019 21:21:58 +0000 (23:21 +0200)]
glsl/linker: always validate explicit locations for first and last interfaces

Until now, we were only doing this when linking an SSO
program. However, nothing prevents linking a non-SSO program which
doesn't have both a VS and FS. In those cases, we also need to report
the usual linking errors, if any occur.

v2: Use a better name for the renamed function (Timothy).

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Rhys Perry [Mon, 15 Apr 2019 22:11:49 +0000 (23:11 +0100)]
vc4: fix build

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 5131b7a43f8488a7 ('gallium: add support for formatted image loads')
Andres Gomez [Sat, 13 Apr 2019 20:50:41 +0000 (22:50 +0200)]
docs: drop Andres Gomez from the release cycles

Juan A. Suarez takes his place, and the shorter rotation means Dylan's
turn comes around earlier.

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Kenneth Graunke [Sun, 7 Apr 2019 06:35:49 +0000 (23:35 -0700)]
iris: Fix FLUSH_EXPLICIT handling with staging buffers.

I neglected to blit the staging buffer back to the real one at
transfer_flush_region (FlushMappedBufferRange) time.

Kenneth Graunke [Mon, 8 Apr 2019 07:45:41 +0000 (00:45 -0700)]
iris: Preserve all PIPE_TRANSFER flags in xfer->usage

We need to preserve PIPE_TRANSFER_FLUSH_EXPLICIT, DISCARD_RANGE, and
so on, but don't want to pass them to iris_bo_map().  So, keep them all,
but mask them off when calling map.
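
In outline, the pattern is "store everything, pass on only what the mapper understands".  A Python sketch with made-up flag values (only the flag names echo the message above; MAP_FLAGS and both function names are assumptions):

```python
# Hypothetical bit values, for illustration only.
READ = 1 << 0
WRITE = 1 << 1
DISCARD_RANGE = 1 << 2
FLUSH_EXPLICIT = 1 << 3

# Assumed: the only bits the low-level map function accepts.
MAP_FLAGS = READ | WRITE


def bo_map(flags):
    """Stand-in for the low-level map; rejects flags it does not know."""
    assert (flags & ~MAP_FLAGS) == 0, "unexpected flag passed to map"
    return flags


def transfer_map(usage):
    xfer_usage = usage                       # keep every flag for later
    mapped_with = bo_map(usage & MAP_FLAGS)  # mask only at the call site
    return xfer_usage, mapped_with


xfer_usage, mapped_with = transfer_map(WRITE | DISCARD_RANGE | FLUSH_EXPLICIT)
```

Later code (for example a flush-region hook) can still test xfer_usage & FLUSH_EXPLICIT, which would be lost if the stored value were masked.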

Chris Wilson told me to do this a long time ago and he was right.

Kenneth Graunke [Fri, 12 Apr 2019 22:36:52 +0000 (15:36 -0700)]
iris: Actually mark blorp_copy_buffer destinations as written.

grmat [Tue, 9 Apr 2019 09:20:35 +0000 (11:20 +0200)]
drirc: add Spectacle, Falkon to a-sync blacklist

Spectacle is the Plasma screenshot utility.

Falkon is a KDE web browser that should succeed Konqueror.

davidbepo [Wed, 10 Apr 2019 09:16:49 +0000 (09:16 +0000)]
drirc: add Waterfox to adaptive-sync blacklist

El Christianito [Tue, 9 Apr 2019 18:30:16 +0000 (20:30 +0200)]
drirc: add Budgie WM to adaptive-sync blacklist

Budgie Window Manager is an increasingly used alternative to GNOME and MATE.
Default in Solus OS, also used in other distros.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Dylan Baker [Mon, 8 Apr 2019 20:37:31 +0000 (13:37 -0700)]
ci: Delete autotools build jobs

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Matt Turner <mattst88@gmail.com>
Dylan Baker [Mon, 8 Apr 2019 19:56:51 +0000 (12:56 -0700)]
docs: drop most autoconf references

There are still a few in here, but those docs are already so out of date
that it probably makes more sense to delete them. The GLES docs, for
example, still claim we only support 1.1 and 2.0, with no mention of
3.x at all.

v2: - Add docs for testing back end (Eric Engestrom)
    - Drop more autotools references
    - meson is now required not recommended
    - Add $PWD

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Matt Turner <mattst88@gmail.com>
Dylan Baker [Mon, 8 Apr 2019 19:44:17 +0000 (12:44 -0700)]
Delete autotools

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Matt Turner <mattst88@gmail.com>
Marek Olšák [Mon, 15 Apr 2019 17:03:13 +0000 (13:03 -0400)]
radeonsi: enable GL_EXT_shader_image_load_formatted

no changes - the driver doesn't use the format

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Rhys Perry [Wed, 16 Jan 2019 23:18:27 +0000 (23:18 +0000)]
st/mesa: add support for EXT_shader_image_load_formatted

v3: rebase

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Rhys Perry [Wed, 16 Jan 2019 23:18:26 +0000 (23:18 +0000)]
mesa, glsl: add support for EXT_shader_image_load_formatted

v3: rebase

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Rhys Perry [Wed, 16 Jan 2019 23:18:25 +0000 (23:18 +0000)]
gallium: add support for formatted image loads

v3: rebase
v3: make use of u_pipe_screen_get_param_defaults

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Mon, 15 Apr 2019 16:41:15 +0000 (18:41 +0200)]
radv: set ACCESS_NON_READABLE on stores for copy/fill/clear meta shaders

The compiler will emit GLC=1.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bas Nieuwenhuizen [Tue, 9 Apr 2019 23:42:31 +0000 (01:42 +0200)]
radv: Use local buffers for the global bo list.

We do this even if we don't use local buffers in general.  It turns out
that even though the performance is not the best, the kernel still does
it better than our own list.

We still have to keep the radv bo list for buffers that are shared
externally.

This improves Talos on lowest quality setting (so as CPU bound as
possible) by ~10% if the global bo list is enabled.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Tue, 9 Apr 2019 23:16:25 +0000 (01:16 +0200)]
ac: Move has_local_buffers disable to radeonsi.

In radv we had a separate flag to actually use it + an env option
to experimentally use it.

The common code setting has_local_buffers to false of course broke
that experimental option.

Also the "enable on APU" did not make sense for RADV as it is still
disabled by default.

Fixes: b21a4efb553 "radv/winsys: allow local BOs on APUs"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Tue, 9 Apr 2019 22:37:54 +0000 (00:37 +0200)]
radv: Add bolist RADV_PERFTEST flag.

To test global_bo_list performance.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Fri, 12 Apr 2019 15:39:02 +0000 (11:39 -0400)]
ac: fix incorrect bindless atomic code in visit_image_atomic

Coverity: CID 1444664

Fixes: d62d434fe920 ("ac/nir_to_llvm: add image bindless support")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Rhys Perry [Fri, 12 Apr 2019 10:07:53 +0000 (11:07 +0100)]
nir,ac/nir: fix cube_face_coord

Seems it was missing the "/ ma + 0.5" and the order was swapped.
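
For reference, a Python sketch of a cube face coordinate computation that ends in the "/ ma + 0.5" step.  It follows the OpenGL cube map face-selection table; treating ma as twice the magnitude of the major axis, and this table matching the AMD instruction in every sign and ordering detail, are assumptions:

```python
def cube_face_coord(rx, ry, rz):
    """Map a non-zero direction vector to (s, t) in [0, 1] on its cube face."""
    ax, ay, az = abs(rx), abs(ry), abs(rz)
    if ax >= ay and ax >= az:        # +X / -X face
        sc, tc, major = (-rz if rx >= 0 else rz), -ry, ax
    elif ay >= az:                   # +Y / -Y face
        sc, tc, major = rx, (rz if ry >= 0 else -rz), ay
    else:                            # +Z / -Z face
        sc, tc, major = (rx if rz >= 0 else -rx), -ry, az
    ma = 2.0 * major                 # assumed: ma is twice the major axis
    return (sc / ma + 0.5, tc / ma + 0.5)
```

Without the division and the 0.5 offset, the result would stay in the unnormalized [-major, +major] range instead of [0, 1].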

Fixes: a1a2a8dfda7b9cac7e ('nir: add AMD_gcn_shader extended instructions')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Jason Ekstrand [Sat, 13 Apr 2019 23:44:03 +0000 (18:44 -0500)]
anv: Update to use the new features struct names

These were updated in version 1.1.106 of vulkan.h to make more sense
with the extension names.  We may as well keep with the times.

Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Sat, 13 Apr 2019 23:41:07 +0000 (18:41 -0500)]
vulkan: Update the XML and headers to 1.1.106

Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Timothy Arceri [Mon, 15 Apr 2019 05:00:02 +0000 (15:00 +1000)]
nir: fix packing components with arrays

When gathering info for unmovable types we need to handle arrays.
While we don't support packing/moving arrays we do support packing
scalar components with these arrays.

Fixes piglit:
tests/spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-interleave-range.shader_test

Fixes: 5eb17506e159 ("nir: do not pack varying with different types")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Samuel Pitoiset [Fri, 12 Apr 2019 06:53:36 +0000 (08:53 +0200)]
radv: enable VK_KHR_shader_float16_int8

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 12 Apr 2019 06:53:35 +0000 (08:53 +0200)]
spirv: add SpvCapabilityFloat16 support

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Kenneth Graunke [Fri, 12 Apr 2019 18:55:38 +0000 (11:55 -0700)]
intel: Emit 3DSTATE_VF_STATISTICS dynamically

Pipeline statistics queries should not count BLORP's rectangles.

    (23) How do operations like Clear, TexSubImage, etc. affect the
         results of the newly introduced queries?

      DISCUSSION: Implementations might require "helper" rendering
      commands be issued to implement certain operations like Clear,
      TexSubImage, etc.

      RESOLVED: They don't. Only application submitted rendering
      commands should have an effect on the results of the queries.

Piglit's arb_pipeline_statistics_query-vert_adj exposes this bug when
the driver is hacked to always perform glBufferData via a GPU staging
copy (for debugging purposes).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Tue, 2 Apr 2019 02:42:37 +0000 (21:42 -0500)]
nir/validate: Require unused bits of nir_const_value to be zero

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Wed, 27 Mar 2019 23:31:01 +0000 (18:31 -0500)]
nir/load_const_to_scalar: Get rid of a bit size switch statement

Now that nir_const_value is a scalar, we don't need the switch on bit
size in order to pluck off components properly.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Wed, 27 Mar 2019 23:28:30 +0000 (18:28 -0500)]
spirv: Drop some unneeded bit size switch statements

Now that nir_const_value is a scalar, we don't need the switch on bit
size in order to copy components around properly.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Wed, 27 Mar 2019 23:27:39 +0000 (18:27 -0500)]
nir/constant_folding: Get rid of a bit size switch statement

Now that nir_const_value is a scalar, we don't need the switch on bit
size in order to swizzle them properly.

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Karol Herbst [Tue, 26 Mar 2019 23:59:03 +0000 (00:59 +0100)]
nir: make nir_const_value scalar

v2: remove & operator in a couple of memsets
    add some memsets
v3: fixup lima

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v2)
Karol Herbst [Wed, 10 Apr 2019 14:46:50 +0000 (16:46 +0200)]
spirv: reduce array size in vtn_handle_constant

We already assert above that there are no more than 3 sources, so it
doesn't make sense to use an array of 4 sources.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Karol Herbst [Tue, 2 Apr 2019 12:12:06 +0000 (14:12 +0200)]
nir/loop_analyze: use nir_const_value.b for boolean results, not u32

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Jason Ekstrand [Tue, 2 Apr 2019 02:36:12 +0000 (21:36 -0500)]
nir/print: Use nir_src_as_int for array indices

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Tue, 2 Apr 2019 02:31:26 +0000 (21:31 -0500)]
nir/builder: Add a nir_imm_zero helper

v2: replace nir_zero_vec with nir_imm_zero (Karol Herbst)

Reviewed-by: Karol Herbst <kherbst@redhat.com>
Karol Herbst [Fri, 29 Mar 2019 23:53:42 +0000 (00:53 +0100)]
nir/builder: Move nir_imm_vec2 from blorp into the builder

While we're here, fix a typo which caused it to actually return a vec4
with the third and fourth components zero.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Karol Herbst [Sat, 13 Apr 2019 17:33:41 +0000 (19:33 +0200)]
lima: use nir_src_as_float

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Karol Herbst [Fri, 29 Mar 2019 19:57:52 +0000 (20:57 +0100)]
freedreno/ir3: use nir_src_as_uint in a few places

v2 (Jason Ekstrand):
 - Add even more places

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Karol Herbst [Sun, 31 Mar 2019 01:54:21 +0000 (03:54 +0200)]
intel/nir: use nir_src_is_const and nir_src_as_uint

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Jason Ekstrand [Wed, 27 Mar 2019 22:34:10 +0000 (17:34 -0500)]
intel/nir: Take a nir_tex_instr and src index in brw_texture_offset

This makes things a bit simpler and it's also more robust because it no
longer has a hard dependency on the offset being a 32-bit value.

Karol Herbst [Thu, 28 Mar 2019 15:53:47 +0000 (16:53 +0100)]
radv: use nir constant helpers

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Karol Herbst [Thu, 28 Mar 2019 15:46:30 +0000 (16:46 +0100)]
amd/nir: some cleanups

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agopanfrost/midgard: Use shared nir_lower_viewport_transform
Alyssa Rosenzweig [Sun, 7 Apr 2019 16:37:28 +0000 (16:37 +0000)]
panfrost/midgard: Use shared nir_lower_viewport_transform

v2: Run before lowering I/O.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years agonir: Add nir_lower_viewport_transform
Alyssa Rosenzweig [Sun, 14 Apr 2019 15:43:13 +0000 (15:43 +0000)]
nir: Add nir_lower_viewport_transform

On Mali hardware (supported by Panfrost and Lima), the fixed-function
transformation from world-space to screen-space coordinates is done in
the vertex shader prior to writing out the gl_Position varying, rather
than in dedicated hardware. This commit adds a shared NIR pass for
implementing coordinate transformation and lowering gl_Position writes
into screen-space gl_Position writes.
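
As a rough illustration, the transformation can be sketched in plain C,
assuming the conventional viewport transform: a perspective divide
followed by a per-axis scale and offset. The struct names, the function
name, and the choice to store 1/w in the w component are assumptions made
for this sketch; it is not the code of the pass itself.

```c
/* Hypothetical types for illustration only. */
struct vec4 { float x, y, z, w; };
struct viewport { float scale[3]; float offset[3]; };

/* Map a clip-space position to screen space: divide by w, then apply
 * the viewport scale and offset on each axis. */
static struct vec4
viewport_transform(struct vec4 clip, const struct viewport *vp)
{
    float inv_w = 1.0f / clip.w;
    struct vec4 out;

    out.x = clip.x * inv_w * vp->scale[0] + vp->offset[0];
    out.y = clip.y * inv_w * vp->scale[1] + vp->offset[1];
    out.z = clip.z * inv_w * vp->scale[2] + vp->offset[2];
    out.w = inv_w; /* assumption: keep 1/w for later interpolation */
    return out;
}
```

With an 800x600 viewport (scale and offset of 400 and 300), a clip-space
position of (1, -1, 0, 2) lands at screen position (600, 150).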

v2: Run directly on derefs before io/vars are lowered to clean up the
code substantially. Thank you to Qiang for this suggestion!

v3: Bikeshed continues.

v4: Add to Makefile.sources (per Jason's comment). Bikeshed comment.

Ian's and Qiang's reviews are from v3, but v4 makes no real functional
changes. Rob's review is from v4.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Suggested-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agopanfrost: Cleanup indexed draw handling
Alyssa Rosenzweig [Sat, 13 Apr 2019 00:10:20 +0000 (00:10 +0000)]
panfrost: Cleanup indexed draw handling

As part of this cleanup, we use the newly-exposed
u_vbuf_get_minmax_index, deduplicating quite a bit of bookkeeping. We
also centralize the draw_flags tracking to make this code cleaner /
futureproofed; we have already had bugs regarding this field so we might
as well get it right now.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost/midgard: Drop dependence on mesa/st
Alyssa Rosenzweig [Sat, 13 Apr 2019 00:04:52 +0000 (00:04 +0000)]
panfrost/midgard: Drop dependence on mesa/st

This was used as a workaround for uniform sizing which was fixed in
771adffe ("st: Lower uniforms in st in the...")

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agodraw: fix building error in draw_gs_init()
Mauro Rossi [Sat, 13 Apr 2019 16:34:53 +0000 (18:34 +0200)]
draw: fix building error in draw_gs_init()

Fixes the following build error seen with the Android build system:

external/mesa/src/gallium/auxiliary/draw/draw_gs.c:740:79:
error: address of array 'draw->gs.tgsi.machine->PrimitiveOffsets' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
         if (!draw->gs.tgsi.machine->Primitives[i] || !draw->gs.tgsi.machine->PrimitiveOffsets)
                                                      ~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~
1 error generated.
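
The warning fires because an array member, unlike a pointer, always has a
non-NULL address, so negating it is always false. A minimal sketch of the
pattern follows; the struct, its size, and the helper are hypothetical
stand-ins, and checking the per-stream element is only one plausible way
to express the intended test.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_STREAMS 4 /* hypothetical size */

/* Hypothetical stand-in for the machine state: two arrays of
 * per-stream buffer pointers. */
struct machine {
    unsigned *primitives[MAX_STREAMS];
    unsigned *primitive_offsets[MAX_STREAMS];
};

/* Returns true only if both buffers for stream i are allocated.
 * Testing `m->primitive_offsets` without an index would test the
 * address of the array, which is never NULL. */
static bool
stream_buffers_ok(const struct machine *m, unsigned i)
{
    return m->primitives[i] != NULL && m->primitive_offsets[i] != NULL;
}
```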

Fixes: 7720ce3 ("draw: add support to tgsi paths for geometry streams. (v2)")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agolima/gpir: fix alu check miss last store slot
Qiang Yu [Fri, 12 Apr 2019 03:35:34 +0000 (11:35 +0800)]
lima/gpir: fix alu check miss last store slot

Fixes: 92d7ca4b1cd "gallium: add lima driver"
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima/gpir: fix compile fail when two slot node
Qiang Yu [Thu, 11 Apr 2019 07:42:59 +0000 (15:42 +0800)]
lima/gpir: fix compile fail when two slot node

Comes from the glmark2-es2 jellyfish test.

Fixes: 92d7ca4b1cd "gallium: add lima driver"
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agolima: add support for depth/stencil fbo attachments and textures
Vasily Khoruzhick [Sun, 7 Apr 2019 05:55:36 +0000 (22:55 -0700)]
lima: add support for depth/stencil fbo attachments and textures

Hardware supports writing back Z/S buffers and sampling from them,
so add support for that.

Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Icenowy Zheng <icenowy@aosc.io>
5 years agolima: use individual tile heap for each GP job.
Vasily Khoruzhick [Sun, 7 Apr 2019 05:48:16 +0000 (22:48 -0700)]
lima: use individual tile heap for each GP job.

It looks like the tile heap is somehow used by the subsequent PP job, so
we have to preserve its contents until the PP job is done.

Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tested-by: Icenowy Zheng <icenowy@aosc.io>
5 years agonir: add lower_ftrunc
Christian Gmeiner [Fri, 12 Apr 2019 08:12:27 +0000 (10:12 +0200)]
nir: add lower_ftrunc

Port the TGSI TRUNC lowering to NIR.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoandroid: fix LLVM version string related building errors
Mauro Rossi [Sat, 13 Apr 2019 16:56:14 +0000 (18:56 +0200)]
android: fix LLVM version string related building errors

Escaping each " with \ in the LLVM version string fixes the following build errors:

external/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1290:14:
error: expected ')'
                 ", LLVM " MESA_LLVM_VERSION_STRING
                           ^
<command line>:8:34: note: expanded from here
                                 ^
external/mesa/src/gallium/drivers/r600/r600_pipe_common.c:1287:10:
note: to match this '('
        snprintf(rscreen->renderer_string, sizeof(rscreen->renderer_string),
                ^
1 error generated.
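
The failure comes from C string-literal concatenation: the macro has to
expand to a quoted literal, and if the quotes are lost on the command
line it expands to a bare token instead. The sketch below shows the
working case; the version value "7.0" and the helper name are
hypothetical.

```c
#include <stdio.h>

/* Stand-in for what a correctly escaped
 * -DMESA_LLVM_VERSION_STRING=\"7.0\" flag would define.
 * The version value is hypothetical. */
#ifndef MESA_LLVM_VERSION_STRING
#define MESA_LLVM_VERSION_STRING "7.0"
#endif

/* Adjacent string literals are joined at compile time, so the macro
 * must expand to a quoted literal. If it expands to a bare 7.0, the
 * compiler sees a stray token after the first literal and reports
 * "expected ')'". */
static void
format_renderer(char *buf, size_t size, const char *chip)
{
    snprintf(buf, size, "%s (LLVM " MESA_LLVM_VERSION_STRING ")", chip);
}
```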

Fixes: 05b114e ("simplify LLVM version string printing")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
5 years agoanv: leave the top 4Gb of the high heap VMA unused
Lionel Landwerlin [Fri, 12 Apr 2019 10:05:33 +0000 (11:05 +0100)]
anv: leave the top 4Gb of the high heap VMA unused

In 628c9ca9089789 I forgot to subtract the same 4Gb from the top address
of the high heap VMA. This was previously accounted for in
HIGH_HEAP_MAX_ADDRESS.
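
The arithmetic can be sketched as follows, assuming the heap size is
derived from the total address space size and the heap's start address.
The function and parameter names are hypothetical and the driver's actual
computation may differ.

```c
#include <stdint.h>

#define GIB(n) ((uint64_t)(n) << 30)

/* Size of the high heap VMA when the top 4 GiB of the address space
 * are left unused. Both parameters are hypothetical inputs:
 * gtt_size is the total address space size, heap_start the first
 * address of the high heap. */
static uint64_t
high_heap_vma_size(uint64_t gtt_size, uint64_t heap_start)
{
    return gtt_size - GIB(4) - heap_start;
}
```

For a 48-bit address space and a heap starting at 3 GiB, the last usable
address (heap_start + size - 1) is 0xfffeffffffff.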

Many thanks to James for pointing this out.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Xiong, James <james.xiong@intel.com>
Fixes: 628c9ca9089789 ("anv: store heap address bounds when initializing physical device")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agov3d: Use the new lower_to_scratch implementation for indirects on temps.
Eric Anholt [Thu, 11 Apr 2019 18:12:01 +0000 (11:12 -0700)]
v3d: Use the new lower_to_scratch implementation for indirects on temps.

We can use the same register spilling infrastructure for our loads/stores
of indirect access of temp variables, instead of doing an if ladder.
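
To illustrate the difference in plain C (a conceptual sketch, not the
compiler's code; all names are hypothetical): an if ladder compares the
index against every element, while a scratch-based access is a single
load at a computed address.

```c
#include <stdint.h>

/* "If ladder": each array element lives in its own register-like
 * value and a chain of comparisons selects one. The instruction
 * count grows with the array length. */
static uint32_t
load_if_ladder(uint32_t t0, uint32_t t1, uint32_t t2, uint32_t t3,
               unsigned idx)
{
    if (idx == 0)
        return t0;
    else if (idx == 1)
        return t1;
    else if (idx == 2)
        return t2;
    else
        return t3;
}

/* Scratch lowering: the array lives in memory, so an indirect access
 * becomes one load at base + idx regardless of array length. */
static uint32_t
load_scratch(const uint32_t *scratch, unsigned base, unsigned idx)
{
    return scratch[base + idx];
}
```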

Cuts 50% of instructions and max-temps from 2 KSP shaders in shader-db.
Also causes several other KSP shaders with large bodies and large loop
counts to not be force-unrolled.

The change was originally motivated by NOLTIS slightly modifying register
pressure in piglit temp mat4 array read/write tests, triggering register
allocation failures.

5 years agonir: Add a pass for selectively lowering variables to scratch space
Jason Ekstrand [Fri, 2 Dec 2016 19:36:42 +0000 (11:36 -0800)]
nir: Add a pass for selectively lowering variables to scratch space

This commit adds new nir_load/store_scratch opcodes which read and write
a virtual scratch space.  It's up to the back-end to figure out what to
do with it and where to put the actual scratch data.

v2: Drop const_index comments (by anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agov3d: Detect the correct number of QPUs and use it to fix the spill size.
Eric Anholt [Thu, 11 Apr 2019 19:28:30 +0000 (12:28 -0700)]
v3d: Detect the correct number of QPUs and use it to fix the spill size.

We were missing a * 4 even if the particular hardware matched our
assumption.

5 years agov3d: Add missing dumping for the spill offset/size uniforms.
Eric Anholt [Thu, 11 Apr 2019 18:46:47 +0000 (11:46 -0700)]
v3d: Add missing dumping for the spill offset/size uniforms.

5 years agov3d: Add missing base offset to CS shared memory accesses.
Eric Anholt [Thu, 11 Apr 2019 19:04:41 +0000 (12:04 -0700)]
v3d: Add missing base offset to CS shared memory accesses.

This code is so touchy, trying to emit the minimum amount of address math.
Some day we'll move it all to NIR, I hope.

5 years agov3d: Add Compute Shader compilation support.
Eric Anholt [Wed, 5 Dec 2018 23:41:35 +0000 (15:41 -0800)]
v3d: Add Compute Shader compilation support.

While waiting for the CSD UABI to get reviewed, I keep having to rebase
the CS patch.  Just land the compiler side for now to keep it from
diverging.

For now this covers just GLES 3.1 compute shaders, not CL kernels.

5 years agov3d: Replace the old shader-db env var output with the ARB_debug_output.
Eric Anholt [Thu, 14 Mar 2019 20:59:13 +0000 (13:59 -0700)]
v3d: Replace the old shader-db env var output with the ARB_debug_output.

We're using ARB_debug_output for the main shader-db, but I had this env
var left around from the shader-db-2 support (vc4 apitrace-based).  Keep
the env var around since it's nice sometimes to get the stats on a shader
you're optimizing without having to do a shader-db run, but drop the old
formatting that's not useful and keeps tricking me when I go to add
another measurement to the shader-db output.

5 years agov3d: Include the number of max temps used in the shader-db output.
Eric Anholt [Wed, 13 Mar 2019 21:19:02 +0000 (14:19 -0700)]
v3d: Include the number of max temps used in the shader-db output.

This gives us finer-grained feedback on how we're doing on register
pressure than "did we trigger a new shader to spill or drop thread count?"