mesa.git
7 years agoradeonsi: emit TGSI_OPCODE_BALLOT
Nicolai Hähnle [Thu, 30 Mar 2017 10:15:19 +0000 (12:15 +0200)]
radeonsi: emit TGSI_OPCODE_BALLOT

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: implement TGSI_SEMANTIC_SUBGROUP_*
Nicolai Hähnle [Thu, 30 Mar 2017 12:15:10 +0000 (14:15 +0200)]
radeonsi: implement TGSI_SEMANTIC_SUBGROUP_*

64-bit system values are stored as v2i32 to simplify the fetch logic.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: support 64-bit system values
Nicolai Hähnle [Fri, 31 Mar 2017 11:02:34 +0000 (13:02 +0200)]
radeonsi: support 64-bit system values

For simplicity, always store system values as 32-bit values or arrays
of 32-bit values. 64-bit values are unpacked and packed accordingly.
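
The unpack/pack step described above can be illustrated with a minimal
sketch. The helper names and the low-word-first ordering are assumptions
made for this example; this is not the radeonsi code, which operates on
LLVM values rather than plain integers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split a 64-bit value into two 32-bit words.
 * Assumption for this sketch: the low word is stored first. */
static void unpack_u64(uint64_t value, uint32_t out[2])
{
    assert(out != NULL);
    out[0] = (uint32_t)(value & 0xffffffffu); /* low 32 bits */
    out[1] = (uint32_t)(value >> 32);         /* high 32 bits */
}

/* Hypothetical helper: rebuild the 64-bit value from the two words. */
static uint64_t pack_u64(const uint32_t in[2])
{
    assert(in != NULL);
    return (uint64_t)in[0] | ((uint64_t)in[1] << 32);
}
```

Storing everything as 32-bit words means the fetch path only ever deals
with one element type; the 64-bit view is rebuilt on demand.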

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: bump RADEON_LLVM_MAX_SYSTEM_VALUES
Nicolai Hähnle [Thu, 30 Mar 2017 12:14:27 +0000 (14:14 +0200)]
radeonsi: bump RADEON_LLVM_MAX_SYSTEM_VALUES

ARB_shader_ballot introduces 7 new system values that can be used
in all shader stages.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/mesa: enable ARB_shader_ballot
Nicolai Hähnle [Thu, 30 Mar 2017 09:16:27 +0000 (11:16 +0200)]
st/mesa: enable ARB_shader_ballot

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/glsl_to_tgsi: implement ARB_shader_ballot system variables
Nicolai Hähnle [Thu, 30 Mar 2017 12:14:45 +0000 (14:14 +0200)]
st/glsl_to_tgsi: implement ARB_shader_ballot system variables

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/glsl_to_tgsi: implement ARB_shader_ballot builtin functions
Nicolai Hähnle [Thu, 30 Mar 2017 10:15:51 +0000 (12:15 +0200)]
st/glsl_to_tgsi: implement ARB_shader_ballot builtin functions

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agotgsi: add SUBGROUP_* semantics
Ilia Mirkin [Thu, 9 Feb 2017 23:48:18 +0000 (18:48 -0500)]
tgsi: add SUBGROUP_* semantics

v2: add documentation (Nicolai)

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agotgsi: add BALLOT/READ_* opcodes
Ilia Mirkin [Thu, 9 Feb 2017 23:38:17 +0000 (18:38 -0500)]
tgsi: add BALLOT/READ_* opcodes

v2 (Nicolai):
- BALLOT isn't per-channel
- expand the documentation (also for VOTE_*)

v3:
- only BALLOT returns a 64-bit lanemask (Boyan)
- relax the requirement on READ_INVOC: the invocation number to read
  from must be uniform within a sub-group. This matches the
  GL_ARB_shader_ballot spec (and the v_readlane instruction of AMD
  GCN)

v4:
- hopefully really fix the doc of VOTE_* returns (Ilia)
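
As a rough illustration of the "BALLOT returns a 64-bit lanemask" note
above, the following is a CPU-side simulation only. The function name
and the array-of-predicates representation are assumptions made for
this sketch; real hardware evaluates the predicate per invocation in a
sub-group.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simulation: bit N of the result is set when the
 * predicate of invocation N is true. Supports at most 64 lanes. */
static uint64_t ballot_sim(const bool *pred, unsigned num_lanes)
{
    assert(pred != NULL);
    assert(num_lanes <= 64);

    uint64_t mask = 0;
    for (unsigned lane = 0; lane < num_lanes; lane++) {
        if (pred[lane])
            mask |= UINT64_C(1) << lane;
    }
    return mask;
}
```

The result is one value for the whole sub-group, which is why the
opcode is not per-channel.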

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v2)
7 years agogallium: add PIPE_CAP_TGSI_BALLOT
Nicolai Hähnle [Thu, 30 Mar 2017 09:16:09 +0000 (11:16 +0200)]
gallium: add PIPE_CAP_TGSI_BALLOT

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoglsl: add gl_SubGroup*ARB builtins
Nicolai Hähnle [Thu, 30 Mar 2017 10:10:00 +0000 (12:10 +0200)]
glsl: add gl_SubGroup*ARB builtins

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoglsl: add ARB_shader_ballot builtin functions
Nicolai Hähnle [Thu, 30 Mar 2017 09:18:47 +0000 (11:18 +0200)]
glsl: add ARB_shader_ballot builtin functions

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoglsl: add ARB_shader_ballot operations
Nicolai Hähnle [Thu, 30 Mar 2017 09:18:30 +0000 (11:18 +0200)]
glsl: add ARB_shader_ballot operations

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoglsl: add ARB_shader_ballot enable
Nicolai Hähnle [Thu, 30 Mar 2017 09:17:47 +0000 (11:17 +0200)]
glsl: add ARB_shader_ballot enable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agomesa: add GL_ARB_shader_ballot boilerplate
Nicolai Hähnle [Thu, 30 Mar 2017 09:14:01 +0000 (11:14 +0200)]
mesa: add GL_ARB_shader_ballot boilerplate

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoswr: automake: add gen_common.py to the tarball
Emil Velikov [Tue, 4 Apr 2017 17:18:45 +0000 (18:18 +0100)]
swr: automake: add gen_common.py to the tarball

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agointel: genxml: automake: include gen_bits_header.py in the tarball
Emil Velikov [Tue, 4 Apr 2017 16:54:28 +0000 (17:54 +0100)]
intel: genxml: automake: include gen_bits_header.py in the tarball

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agointel: genxml: automake: polish automake rules
Emil Velikov [Tue, 4 Apr 2017 16:10:51 +0000 (17:10 +0100)]
intel: genxml: automake: polish automake rules

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoamd/addrlib: automake: add all headers to the tarball
Emil Velikov [Tue, 4 Apr 2017 16:47:09 +0000 (17:47 +0100)]
amd/addrlib: automake: add all headers to the tarball

Fixes: 7f160efcde4 ("amd/addrlib: import gfx9 support")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoradeonsi: enable ARB_sparse_buffer
Nicolai Hähnle [Thu, 2 Feb 2017 20:11:05 +0000 (21:11 +0100)]
radeonsi: enable ARB_sparse_buffer

v2:
- fill in DRM version requirement
- disable on SI due to CP DMA faults

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: disable SDMA clears and copies for sparse buffers
Nicolai Hähnle [Fri, 24 Mar 2017 22:30:55 +0000 (23:30 +0100)]
radeonsi: disable SDMA clears and copies for sparse buffers

VM faults cannot be disabled for SDMA on <= VI.

We could still use SDMA by asking the winsys about which parts of the
buffers are committed. This is left as a potential future improvement.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/radeon: implement pipe->resource_commit
Nicolai Hähnle [Wed, 8 Feb 2017 10:07:19 +0000 (11:07 +0100)]
gallium/radeon: implement pipe->resource_commit

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/radeon: transfers and invalidation for sparse buffers
Nicolai Hähnle [Tue, 7 Feb 2017 17:24:59 +0000 (18:24 +0100)]
gallium/radeon: transfers and invalidation for sparse buffers

Sparse buffers can never be mapped by the CPU.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium/radeon: implement sparse buffer creation
Nicolai Hähnle [Tue, 7 Feb 2017 17:03:55 +0000 (18:03 +0100)]
gallium/radeon: implement sparse buffer creation

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: sparse buffer debugging helpers
Nicolai Hähnle [Mon, 13 Feb 2017 11:51:12 +0000 (12:51 +0100)]
winsys/amdgpu: sparse buffer debugging helpers

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: take fences when freeing a backing buffer
Nicolai Hähnle [Tue, 7 Feb 2017 16:58:39 +0000 (17:58 +0100)]
winsys/amdgpu: take fences when freeing a backing buffer

We never add fences to backing buffers during submit. When we free a
backing buffer, it must inherit the sparse buffer's fences, so that it
doesn't get re-used prematurely via the cache.

v2:
- remove pipe_mutex_*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: add sparse buffers to CS
Nicolai Hähnle [Tue, 7 Feb 2017 16:11:00 +0000 (17:11 +0100)]
winsys/amdgpu: add sparse buffers to CS

... and implement the corresponding fence handling.

v2:
- add missing bit in amdgpu_bo_is_referenced_by_cs_with_usage
- remove pipe_mutex_*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: sparse buffer creation / destruction / commitment
Nicolai Hähnle [Tue, 7 Feb 2017 16:04:49 +0000 (17:04 +0100)]
winsys/amdgpu: sparse buffer creation / destruction / commitment

This is the bulk of the buffer allocation logic. It is fairly simple and
stupid. We'll probably want to use e.g. interval trees at some point to
keep track of commitments, but Mesa doesn't have an implementation of those
yet.
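
To make the "simple" bookkeeping concrete, here is a minimal sketch of
a per-page commitment array. All names are hypothetical and the real
winsys code also tracks backing buffers and VA mappings, which this
example leaves out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical structure: one flag per page of the sparse buffer. */
struct sparse_commitments {
    uint32_t num_pages;
    bool *committed;
};

static bool sparse_init(struct sparse_commitments *s, uint32_t num_pages)
{
    s->committed = calloc(num_pages, sizeof(bool));
    s->num_pages = s->committed ? num_pages : 0;
    return s->committed != NULL;
}

/* Commit or decommit pages [first, first + count) and return how many
 * pages actually changed state. This is a linear walk; an interval
 * tree would avoid touching every page. */
static uint32_t sparse_set_range(struct sparse_commitments *s,
                                 uint32_t first, uint32_t count,
                                 bool commit)
{
    assert(first <= s->num_pages && count <= s->num_pages - first);

    uint32_t changed = 0;
    for (uint32_t i = first; i < first + count; i++) {
        if (s->committed[i] != commit) {
            s->committed[i] = commit;
            changed++;
        }
    }
    return changed;
}

static void sparse_free(struct sparse_commitments *s)
{
    free(s->committed);
    s->committed = NULL;
    s->num_pages = 0;
}
```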

v2:
- remove pipe_mutex_*
- fix total_backing_pages accounting
- simplify by using the new VA_OP_CLEAR/REPLACE kernel interface

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: add sparse buffer data structures
Nicolai Hähnle [Tue, 7 Feb 2017 16:03:59 +0000 (17:03 +0100)]
winsys/amdgpu: add sparse buffer data structures

v2:
- remove pipe_mutex_*
- use a simple page commitment array

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: extend amdgpu_add_fence to allow adding multiple fences
Nicolai Hähnle [Tue, 7 Feb 2017 16:53:49 +0000 (17:53 +0100)]
winsys/amdgpu: extend amdgpu_add_fence to allow adding multiple fences

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: build handles and flags list late on submit thread
Nicolai Hähnle [Tue, 7 Feb 2017 16:35:02 +0000 (17:35 +0100)]
winsys/amdgpu: build handles and flags list late on submit thread

This probably has only minor performance effects, but it simplifies some
subsequent code slightly.

Ideally, it could also be used to simplify the handling of slab buffers
in the same way, but unfortunately that's not possible as long as we need
indices for relocations.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: share common code in amdgpu_add_fence_dependencies
Nicolai Hähnle [Tue, 7 Feb 2017 16:08:07 +0000 (17:08 +0100)]
winsys/amdgpu: share common code in amdgpu_add_fence_dependencies

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/amdgpu: extract amdgpu_do_add_real_buffer
Nicolai Hähnle [Tue, 7 Feb 2017 16:06:06 +0000 (17:06 +0100)]
winsys/amdgpu: extract amdgpu_do_add_real_buffer

We will use it for delayed adding of sparse buffers' backing buffers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agowinsys/radeon: sparse buffers will not be supported
Nicolai Hähnle [Tue, 7 Feb 2017 16:00:10 +0000 (17:00 +0100)]
winsys/radeon: sparse buffers will not be supported

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeon/winsys: add sparse buffer interface
Nicolai Hähnle [Tue, 7 Feb 2017 15:59:54 +0000 (16:59 +0100)]
radeon/winsys: add sparse buffer interface

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/mesa: plumbing for sparse buffers
Nicolai Hähnle [Thu, 2 Feb 2017 22:32:48 +0000 (23:32 +0100)]
st/mesa: plumbing for sparse buffers

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/mesa: enable ARB_sparse_buffer when supported
Nicolai Hähnle [Thu, 2 Feb 2017 20:11:29 +0000 (21:11 +0100)]
st/mesa: enable ARB_sparse_buffer when supported

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agotrace: add resource_commit pass-through
Nicolai Hähnle [Thu, 23 Mar 2017 17:43:07 +0000 (18:43 +0100)]
trace: add resource_commit pass-through

v2: fix return type to bool (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoddebug: add resource_commit pass-through
Nicolai Hähnle [Fri, 17 Feb 2017 13:55:49 +0000 (14:55 +0100)]
ddebug: add resource_commit pass-through

v2: fix return type to bool (Marek)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agogallium: add sparse buffer interface and capability
Nicolai Hähnle [Thu, 2 Feb 2017 20:10:44 +0000 (21:10 +0100)]
gallium: add sparse buffer interface and capability

v2:
- explain the resource_commit interface in more detail

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agomesa: implement sparse buffer commitment
Nicolai Hähnle [Thu, 2 Feb 2017 22:24:35 +0000 (23:24 +0100)]
mesa: implement sparse buffer commitment

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agomesa: implement sparse storage buffer allocation
Nicolai Hähnle [Thu, 2 Feb 2017 20:50:32 +0000 (21:50 +0100)]
mesa: implement sparse storage buffer allocation

v2:
- spec quote and style (Ian)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agomesa: implement SPARSE_BUFFER_PAGE_SIZE_ARB
Nicolai Hähnle [Thu, 2 Feb 2017 19:56:48 +0000 (20:56 +0100)]
mesa: implement SPARSE_BUFFER_PAGE_SIZE_ARB

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agomesa: Add GL_ARB_sparse_buffer boilerplate
Nicolai Hähnle [Thu, 2 Feb 2017 19:47:31 +0000 (20:47 +0100)]
mesa: Add GL_ARB_sparse_buffer boilerplate

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoconfigure.ac: require libdrm_amdgpu 2.4.77
Nicolai Hähnle [Tue, 4 Apr 2017 14:35:27 +0000 (16:35 +0200)]
configure.ac: require libdrm_amdgpu 2.4.77

The sparse buffer implementation requires amdgpu_bo_va_op_raw.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agomesa: Replace program locks with atomic inc/dec.
Matt Turner [Wed, 5 Apr 2017 00:49:35 +0000 (10:49 +1000)]
mesa: Replace program locks with atomic inc/dec.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agoanv: Advertise larger heap sizes
Jason Ekstrand [Fri, 17 Mar 2017 23:16:06 +0000 (16:16 -0700)]
anv: Advertise larger heap sizes

Instead of just advertising the aperture size, we do something more
intelligent.  On systems with a full 48-bit PPGTT, we can address 100%
of the available system RAM from the GPU.  In order to keep clients from
burning 100% of your available RAM for graphics resources, we have a
nice little heuristic (which has received exactly zero tuning) to keep
things under a reasonable level of control.

Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
7 years agoanv: Add support for 48-bit addresses
Jason Ekstrand [Sat, 18 Mar 2017 00:31:44 +0000 (17:31 -0700)]
anv: Add support for 48-bit addresses

This commit adds support for using the full 48-bit address space on
Broadwell and newer hardware.  Thanks to certain limitations, not all
objects can be placed above the 32-bit boundary.  In particular, general
and state base address need to live within 32 bits.  (See also
Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.)  In order
to handle this, we add a supports_48bit_address field to anv_bo and only
set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set.  We set the bit
for all client-allocated memory objects but leave it false for
driver-allocated objects.  While this is more conservative than needed,
all driver allocations should easily fit in the first 32 bits of address
space, and this keeps things simple because we don't have to think about
whether or not any given one of our allocation data structures will be
used in a 48-bit-unsafe way.
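
The per-object opt-in described above can be sketched as follows. The
struct, the function and the SKETCH_ constant are stand-ins invented
for this example; the real flag comes from the kernel's i915 uapi
header and the real field lives in anv_bo.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for EXEC_OBJECT_SUPPORTS_48B_ADDRESS; the value here is an
 * assumption for illustration, real code uses the kernel header. */
#define SKETCH_EXEC_OBJECT_SUPPORTS_48B_ADDRESS (UINT64_C(1) << 3)

/* Hypothetical, reduced version of a buffer object. */
struct sketch_bo {
    bool supports_48bit_address; /* true for client-allocated memory */
};

/* Only buffers that opted in may be placed above the 32-bit boundary. */
static uint64_t sketch_exec_flags(const struct sketch_bo *bo,
                                  uint64_t base_flags)
{
    assert(bo != NULL);
    if (bo->supports_48bit_address)
        base_flags |= SKETCH_EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
    return base_flags;
}
```

Defaulting the field to false keeps every driver-internal allocation
in the low 4 GiB without auditing each data structure individually.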

Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
7 years agoanv: Replace anv_bo::is_winsys_bo with a uint32_t flags
Jason Ekstrand [Thu, 30 Mar 2017 18:48:05 +0000 (11:48 -0700)]
anv: Replace anv_bo::is_winsys_bo with a uint32_t flags

Reviewed-by: Kristian H. Kristensen <krh@bitplanet.net>
7 years agoi965/blorp: Align vertex buffers to 64B
Jason Ekstrand [Fri, 31 Mar 2017 22:23:35 +0000 (15:23 -0700)]
i965/blorp: Align vertex buffers to 64B

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
7 years agoanv/blorp: Align vertex buffers to 64B
Jason Ekstrand [Fri, 31 Mar 2017 22:21:04 +0000 (15:21 -0700)]
anv/blorp: Align vertex buffers to 64B

This fixes issues seen when adding support for full 48-bit addresses.
The 48-bit addresses themselves have nothing to do with it other than
that it caused the kernel to place buffers slightly differently so they
interacted differently with the caches.
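
For reference, rounding an offset up to a 64-byte boundary is usually
done along these lines. The helper name is hypothetical; this is a
generic sketch, not the blorp allocation code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round value up to the next multiple of
 * alignment, which must be a non-zero power of two. */
static uint64_t align_u64(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}
```

With an alignment of 64, offsets 1 through 64 all round up to 64, so
each vertex buffer starts on its own 64-byte boundary.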

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
7 years agoanv: Query the kernel for reset status
Jason Ekstrand [Mon, 27 Mar 2017 23:03:57 +0000 (16:03 -0700)]
anv: Query the kernel for reset status

When a client causes a GPU hang (or experiences issues due to a hang in
another client) we want to let it know as soon as possible.  In
particular, if it submits work with a fence and calls vkWaitForFences or
vkQueueWaitIdle and it returns VK_SUCCESS, then the client should be
able to trust the results of that rendering.  In order to provide this
guarantee, we have to ask the kernel for context status in a few key
locations.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoanv: Check for device loss at the end of WaitForFences
Jason Ekstrand [Mon, 27 Mar 2017 23:01:42 +0000 (16:01 -0700)]
anv: Check for device loss at the end of WaitForFences

It's possible that the device could have been lost while we were
waiting.  We should let the user know if this has happened.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoanv/pipeline: Properly handle unset gl_Layer and gl_ViewportIndex
Jason Ekstrand [Mon, 3 Apr 2017 19:25:15 +0000 (12:25 -0700)]
anv/pipeline: Properly handle unset gl_Layer and gl_ViewportIndex

When the shader does not set one of these values, they are supposed to
get a default value of 0.  We have hardware bits in 3DSTATE_CLIP for
this but haven't been setting them.  This fixes the intermittent failure
of dEQP-VK.geometry.layered.3d.render_to_default_layer.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
7 years agoi965/fs: Always provide a default LOD of 0 for TXS and TXL
Jason Ekstrand [Wed, 29 Mar 2017 22:16:15 +0000 (15:16 -0700)]
i965/fs: Always provide a default LOD of 0 for TXS and TXL

We already provide a default LOD for textureQueryLevels and texture() on
non-fragment stages.  However, there are more cases where one is needed
such as textureSize(gsampler2DMS*) in SPIR-V.  Instead of trying to list
out all of the cases one at a time, just provide the default for all TXS
and TXL operations.  This fixes a shader validation error in the new
Sascha deferredmultisampling demo which uses textureSize(gsampler2DMS).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100391
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
7 years agomesa: Require mipmap completeness for glCopyImageSubData(), sometimes.
Kenneth Graunke [Thu, 23 Feb 2017 23:04:52 +0000 (15:04 -0800)]
mesa: Require mipmap completeness for glCopyImageSubData(), sometimes.

This patch makes glCopyImageSubData require mipmap completeness when the
texture object's built-in sampler object has a mipmapping MinFilter.

Fixes (on i965):
dEQP-GLES31.functional.debug.negative_coverage.*.buffer.copy_image_sub_data

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
7 years agolibgl-xlib: Link with libunwind.
Vinson Lee [Tue, 4 Apr 2017 21:52:39 +0000 (14:52 -0700)]
libgl-xlib: Link with libunwind.

Fix linking error.

  CXXLD    libGL.la
../../../../src/gallium/auxiliary/.libs/libgallium.a(u_debug_stack.o): In function `debug_backtrace_capture':
src/gallium/auxiliary/util/u_debug_stack.c:59: undefined reference to `_Ux86_64_getcontext'
src/gallium/auxiliary/util/u_debug_stack.c:60: undefined reference to `_ULx86_64_init_local'
src/gallium/auxiliary/util/u_debug_stack.c:62: undefined reference to `_ULx86_64_step'
src/gallium/auxiliary/util/u_debug_stack.c:71: undefined reference to `_ULx86_64_get_proc_info'
src/gallium/auxiliary/util/u_debug_stack.c:73: undefined reference to `_ULx86_64_get_proc_name'
src/gallium/auxiliary/util/u_debug_stack.c:65: undefined reference to `_ULx86_64_step'

Fixes: 70c272004f72 ("gallium/util: libunwind support")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100562
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
7 years agointel/isl: Refactor and clarify gen8 alignment calculations
Jason Ekstrand [Tue, 4 Apr 2017 18:31:22 +0000 (11:31 -0700)]
intel/isl: Refactor and clarify gen8 alignment calculations

Adding the actual table from the docs makes it clearer exactly what the
restrictions are.  In particular, it becomes clear that compressed
textures ignore the alignment parameters in RENDER_SURFACE_STATE.

Reviewed-by: Chad Versace <chadversary@chromium.org>
7 years agodrirc: Set glsl_zero_init for Kerbal Space Program.
Francisco Jerez [Tue, 4 Apr 2017 21:12:59 +0000 (14:12 -0700)]
drirc: Set glsl_zero_init for Kerbal Space Program.

This fixes the stripes of garbage rendered on the floor of the vehicle
assembly building among other rendering issues.  The reason for the
misrendering seems to be that some of the GLSL shaders used by the
application use variables before initializing them, incorrectly
assuming that they will be implicitly set to zero by the
implementation.

Acked-by: Matt Turner <mattst88@gmail.com>
7 years agointel: tools: add aubinator_error_decode tool
Lionel Landwerlin [Wed, 1 Mar 2017 14:39:58 +0000 (14:39 +0000)]
intel: tools: add aubinator_error_decode tool

This is pretty much the same tool as what i-g-t has, only with a more
fancy decoding of the instructions/registers. It also doesn't support
anything before gen4.

v2 (from Matt): Drop authors
                Remove undefined automake variable

v3: Fix incorrect offsets for dword > 1 (Jordan)

v4: Fix decompression error with large blobs (Jordan)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
7 years agointel: genxml: add RING_BUFFER_CTL registers
Lionel Landwerlin [Fri, 10 Mar 2017 17:27:01 +0000 (17:27 +0000)]
intel: genxml: add RING_BUFFER_CTL registers

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agointel: genxml: add FAULT_REG register
Lionel Landwerlin [Fri, 10 Mar 2017 14:27:53 +0000 (14:27 +0000)]
intel: genxml: add FAULT_REG register

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agointel: genxml: add gen7 ERR_INT register
Lionel Landwerlin [Fri, 10 Mar 2017 14:27:23 +0000 (14:27 +0000)]
intel: genxml: add gen7 ERR_INT register

v2: add register to gen7.5 (Matt)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agointel: genxml: add ACTHD registers
Lionel Landwerlin [Thu, 9 Mar 2017 15:38:43 +0000 (15:38 +0000)]
intel: genxml: add ACTHD registers

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agointel: genxml: add GFX_ARB_ERROR_RPT register
Lionel Landwerlin [Thu, 9 Mar 2017 15:38:20 +0000 (15:38 +0000)]
intel: genxml: add GFX_ARB_ERROR_RPT register

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agointel: genxml: add INSTDONE registers
Lionel Landwerlin [Thu, 9 Mar 2017 11:58:19 +0000 (11:58 +0000)]
intel: genxml: add INSTDONE registers

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agotargets: export radeon winsys_create functions to silence LLVM warning
Marek Olšák [Thu, 23 Mar 2017 23:55:55 +0000 (00:55 +0100)]
targets: export radeon winsys_create functions to silence LLVM warning

It silences the following radeonsi LLVM warning due to a previous
commit adding an LLVM workaround:
  "mesa: for the -simplifycfg-sink-common option: may only occur zero or one
   times!"

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

7 years agor600g: check rasterizer primitive states like in radeonsi
Constantine Kharlamov [Sun, 2 Apr 2017 17:33:06 +0000 (20:33 +0300)]
r600g: check rasterizer primitive states like in radeonsi

Specifically, non-line primitives are skipped, and the default is to
reset on each packet.

Skipping non-line primitives saves ≈110 resets of the
PA_SC_LINE_STIPPLE register per frame in Kane&Lynch2.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
7 years agor600g: extract a code into a r600_emit_rasterizer_prim_state()
Constantine Kharlamov [Sun, 2 Apr 2017 17:33:05 +0000 (20:33 +0300)]
r600g: extract a code into a r600_emit_rasterizer_prim_state()

Also change gs_output_prim type: unsigned → pipe_prim_type. The idea of
the code is mostly taken from radeonsi. The new code operating on
prev/curr rast_primitives saves ≈15 reloads of PA_SC_LINE_STIPPLE per
frame in Kane&Lynch2.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
7 years agor600g/radeonsi: use the correct types (taken from pipe_draw_info)
Constantine Kharlamov [Sun, 2 Apr 2017 17:33:04 +0000 (20:33 +0300)]
r600g/radeonsi: use the correct types (taken from pipe_draw_info)

Note: si_shader.h also has a "type" variable that should be changed to
"enum pipe_prim_type". However, that triggers a bunch of warnings about
unhandled switch cases, and since I don't know the correct way to handle
them, I decided to leave it as is.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
7 years agor600g: remove duplicate memset by using a pointer, and constify args
Constantine Kharlamov [Sun, 2 Apr 2017 17:33:03 +0000 (20:33 +0300)]
r600g: remove duplicate memset by using a pointer, and constify args

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
7 years agoglsl: remove unused file
Elie TOURNIER [Thu, 9 Mar 2017 15:16:54 +0000 (15:16 +0000)]
glsl: remove unused file

udivmod64 appears in src/compiler/glsl/builtin_int64.h and src/compiler/glsl/udivmod.h
The second file seems unused.
Fix commit 6b03b345eb64e15e577bc8b2cf04b314a4c70537

This change doesn't affect shader-db.

Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoradeonsi: access gallivm through ctx in most places
Marek Olšák [Mon, 3 Apr 2017 09:49:59 +0000 (11:49 +0200)]
radeonsi: access gallivm through ctx in most places

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: use ctx->types instead of bld->types etc.
Marek Olšák [Mon, 3 Apr 2017 09:37:10 +0000 (11:37 +0200)]
radeonsi: use ctx->types instead of bld->types etc.

even vec_type is f32.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: use i32_0/1 instead of *int_bld.zero/one in most places
Marek Olšák [Mon, 3 Apr 2017 09:23:59 +0000 (11:23 +0200)]
radeonsi: use i32_0/1 instead of *int_bld.zero/one in most places

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agogallium: decrease the size of pipe_draw_info - 88 -> 80 bytes
Marek Olšák [Sun, 2 Apr 2017 12:42:17 +0000 (14:42 +0200)]
gallium: decrease the size of pipe_draw_info - 88 -> 80 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agogallium: decrease the size of pipe_vertex_element - 16 -> 8 bytes
Marek Olšák [Sun, 2 Apr 2017 00:36:16 +0000 (02:36 +0200)]
gallium: decrease the size of pipe_vertex_element - 16 -> 8 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agogallium: decrease the size of pipe_resource - 64 -> 48 bytes
Marek Olšák [Sun, 2 Apr 2017 00:13:12 +0000 (02:13 +0200)]
gallium: decrease the size of pipe_resource - 64 -> 48 bytes

Some other changes were needed here.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agogallium: decrease the size of pipe_box - 24 -> 16 bytes
Marek Olšák [Sun, 2 Apr 2017 00:00:49 +0000 (02:00 +0200)]
gallium: decrease the size of pipe_box - 24 -> 16 bytes

Also:

pipe_transfer: 48 -> 40 bytes.
pipe_blit_info: 176 -> 160 bytes.

v2: add a comment at pipe_box

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agogallium: decrease the size of pipe_sampler_view - 48 -> 32 bytes
Marek Olšák [Sat, 1 Apr 2017 23:51:57 +0000 (01:51 +0200)]
gallium: decrease the size of pipe_sampler_view - 48 -> 32 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agogallium: decrease the size of pipe_surface - 48 -> 40 bytes
Marek Olšák [Sat, 1 Apr 2017 23:46:11 +0000 (01:46 +0200)]
gallium: decrease the size of pipe_surface - 48 -> 40 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agogallium: decrease the size of pipe_framebuffer_state - 96 -> 80 bytes
Marek Olšák [Sat, 1 Apr 2017 23:27:13 +0000 (01:27 +0200)]
gallium: decrease the size of pipe_framebuffer_state - 96 -> 80 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agogallium: decrease the size of pipe_stream_output_info - 532 -> 268 bytes
Marek Olšák [Sat, 1 Apr 2017 23:24:47 +0000 (01:24 +0200)]
gallium: decrease the size of pipe_stream_output_info - 532 -> 268 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agogallium: decrease the size of pipe_rasterizer_state - 36 -> 32 bytes
Marek Olšák [Sat, 1 Apr 2017 23:10:36 +0000 (01:10 +0200)]
gallium: decrease the size of pipe_rasterizer_state - 36 -> 32 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agoamd/addrlib: second update for Vega10 + bug fixes
Marek Olšák [Mon, 27 Feb 2017 21:25:44 +0000 (22:25 +0100)]
amd/addrlib: second update for Vega10 + bug fixes

Highlights:
- Display needs tiled pitch alignment to be at least 32 pixels
- Implement Addr2ComputeDccAddrFromCoord().
- Macro-pixel packed formats don't support Z swizzle modes
- Pad pitch and base alignment of PRT + TEX1D to 64KB.
- Fix support for multimedia formats
- Fix a case where "PRT" entries are not selected on SI.
- Fix wrong upper bits in equations for 3D resource.
- We can't support 2d array slice rotation in gfx8 swizzle pattern
- Set base alignment for PRT + non-xor swizzle mode resource to 64KB.
- Bug workaround for Z16 4x/8x and Z32 2x/4x/8x MSAA depth texture
- Add stereo support
- Optimize swizzle mode selection
- Report pitch and height in pixels for each mip
- Adjust bpp/expandX for format ADDR_FMT_GB_GR/ADDR_FMT_BG_RG
- Correct tcCompatible flag output for mipmap surface
- Other fixes and cleanups

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: use i32_0 and i32_1 more
Marek Olšák [Sun, 2 Apr 2017 23:44:32 +0000 (01:44 +0200)]
radeonsi: use i32_0 and i32_1 more

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: remove most uses of lp_build_const*
Marek Olšák [Sun, 2 Apr 2017 23:41:24 +0000 (01:41 +0200)]
radeonsi: remove most uses of lp_build_const*

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: clean up 'radeon_bld' references
Marek Olšák [Sun, 2 Apr 2017 23:25:02 +0000 (01:25 +0200)]
radeonsi: clean up 'radeon_bld' references

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradeonsi: fix broken texture filtering on SI-CIK since GFX9 changes
Marek Olšák [Sun, 2 Apr 2017 22:22:16 +0000 (00:22 +0200)]
radeonsi: fix broken texture filtering on SI-CIK since GFX9 changes

Don't clear state[7] on SI-CIK, and only do the meta stuff on VI+.
Fixes: 5abf60076ce4 ("radeonsi/gfx9: image descriptor changes in mutable fields")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100531
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agobin/get-fixes-pick-list.sh: fix typo
Juan A. Suarez Romero [Mon, 3 Apr 2017 16:48:33 +0000 (18:48 +0200)]
bin/get-fixes-pick-list.sh: fix typo

Replace "nore" by "more".

Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
7 years ago android: intel: genxml: fix genX_xml.h generation rules
Mauro Rossi [Sat, 1 Apr 2017 10:50:33 +0000 (12:50 +0200)]
android: intel: genxml: fix genX_xml.h generation rules

Recent changes in Makefile.sources merged the aubinator files into
a single list of generated files, so a rule for genxml/genX_xml.h is now
needed to avoid the following build error:

ninja: error: '.../genxml/genX_xml.h', needed by '.../genxml/genX_xml.h',
missing and no known rule to make it
build/core/ninja.mk:148: recipe for target 'ninja_wrapper' failed

Fixes: 0f83c05 "intel: genxml: compress all gen files into one"
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years ago intel/vec4: Add some fall through comments
Jason Ekstrand [Mon, 3 Apr 2017 23:24:47 +0000 (16:24 -0700)]
intel/vec4: Add some fall through comments

Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years ago mesa/glthread: Avoid unnecessary batch reallocation
Bartosz Tomczyk [Mon, 3 Apr 2017 19:19:40 +0000 (21:19 +0200)]
mesa/glthread: Avoid unnecessary batch reallocation

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years ago radv: Increase descriptor limits.
Bas Nieuwenhuizen [Mon, 3 Apr 2017 17:40:06 +0000 (19:40 +0200)]
radv: Increase descriptor limits.

In general we support more than the old limits. The dynamic buffer limit
is decreased, though, as we only support 16 for uniform+storage.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
7 years ago mesa/glthread: fix misaligned address access
Bartosz Tomczyk [Mon, 3 Apr 2017 19:12:54 +0000 (21:12 +0200)]
mesa/glthread: fix misaligned address access

AddressSanitizer reports many misaligned accesses:
SUMMARY: AddressSanitizer: undefined-behavior main/marshal.c:276:31 in
main/marshal.c:276:31: runtime error: load of misaligned address 0x631000104866 for type
'const GLuint' (aka 'const unsigned int'), which requires 4 byte alignment
0x631000104866: note: pointer points here
 92 88 00 00 00 00  00 00 4a 03 0c 00 93 88  00 00 00 00 00 00 02 01  0c 00 40 8d 00 00 00 00  00 00
             ^
SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28725:12 in
main/marshal_generated.c:28726:12: runtime error: member access within misaligned address 0x6310003fc874 for type
'struct marshal_cmd_VertexAttribPointer', which requires 8 byte alignment
0x6310003fc874: note: pointer points here
  01 00 00 00 7a 02 20 00  00 00 00 00 be be be be  be be be be be be be be  be be be be be be be be
              ^
SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28726:12 in
main/marshal_generated.c:28726:12: runtime error: store to misaligned address 0x6310003fc87c for type
'GLint' (aka 'int'), which requires 8 byte alignment
0x6310003fc87c: note: pointer points here
  00 00 00 00 be be be be  be be be be be be be be  be be be be be be be be  be be be be be be be be
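
The message does not describe the fix itself. One common way to avoid this class of error, shown here purely as an assumption and not as the actual change, is to round every command's size up to a multiple of 8 bytes so that each command in the batch starts on an 8-byte boundary. The helper names `align_cmd_size` and `alloc_cmd` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round a size up to the next multiple of 8. */
static size_t
align_cmd_size(size_t size)
{
   return (size + 7) & ~(size_t)7;
}

/* Hypothetical allocator over a byte buffer: returns the offset at which
 * the new command starts and advances *used by the aligned size, so the
 * following command also starts on an 8-byte boundary. */
static size_t
alloc_cmd(size_t *used, size_t cmd_size)
{
   assert(used != NULL);
   size_t offset = *used;
   *used += align_cmd_size(cmd_size);
   return offset;
}
```

As long as the buffer itself is 8-byte aligned, every offset returned this way is a multiple of 8, at the cost of a few padding bytes per command.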

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years ago glsl: Fix blob memory leak
Bartosz Tomczyk [Mon, 3 Apr 2017 17:39:19 +0000 (19:39 +0200)]
glsl: Fix blob memory leak

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years ago radv: Rework guard band calculation.
Bas Nieuwenhuizen [Sun, 2 Apr 2017 10:32:39 +0000 (12:32 +0200)]
radv: Rework guard band calculation.

We want the guardband_x/y to be the largest scalars such that each
viewport scaled by that amount is still a subrange of [-32767, 32767].

The old code has a couple of issues:
1) It used scissor instead of viewport_scissor, potentially taking into
   account a viewport that is too small and therefore selecting a scale
   that is too large.
2) Merging the viewports isn't ideal, as for example viewports with
   boundaries [0,1] and [1000, 1001] would allow a guardband scale of ~30k,
   while their union [0, 1001] only allows a scale of ~32.

The new code just determines the guardband per viewport and takes the minimum.
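
A minimal sketch of the per-viewport approach for one axis, assuming each viewport is described by a center (`translate`) and a positive half-extent (`scale`). The struct and function names are hypothetical and this is not the driver's code. Because it works with the center and half-extent, its absolute numbers differ from the rough figures quoted above, but the comparison between separate and merged viewports comes out the same way.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical description of a viewport along one axis. */
struct vp_axis {
   float translate; /* center of the viewport */
   float scale;     /* half the viewport size, assumed > 0 */
};

/* Largest factor g such that
 * [translate - g * scale, translate + g * scale]
 * is still a subrange of [-max_range, max_range]. */
static float
axis_guardband(struct vp_axis vp, float max_range)
{
   assert(vp.scale > 0.0f);
   float center = vp.translate < 0.0f ? -vp.translate : vp.translate;
   return (max_range - center) / vp.scale;
}

/* Determine the guardband per viewport and take the minimum. */
static float
min_guardband(const struct vp_axis *vps, size_t count, float max_range)
{
   float guardband = FLT_MAX;
   for (size_t i = 0; i < count; i++) {
      float g = axis_guardband(vps[i], max_range);
      if (g < guardband)
         guardband = g;
   }
   return guardband;
}
```

With the viewports [0,1] and [1000,1001], the per-viewport minimum stays in the tens of thousands, while a single merged viewport [0,1001] yields a scale well below 100.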

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
7 years ago radv: Enable VK_KHR_incremental_present.
Bas Nieuwenhuizen [Mon, 3 Apr 2017 19:33:51 +0000 (21:33 +0200)]
radv: Enable VK_KHR_incremental_present.

Just enabling the driver-independent implementation that Jason did.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago anv: Implement VK_KHR_incremental_present
Jason Ekstrand [Tue, 24 Jan 2017 23:13:31 +0000 (15:13 -0800)]
anv: Implement VK_KHR_incremental_present

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
7 years ago vulkan/wsi/wayland: Pass damage through to the compositor
Jason Ekstrand [Tue, 24 Jan 2017 23:29:43 +0000 (15:29 -0800)]
vulkan/wsi/wayland: Pass damage through to the compositor

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>