mesa.git
7 years ago i965: rename BRW_NEW_FAST_CLEAR_COLOR to BRW_NEW_AUX_STATE
Iago Toral Quiroga [Thu, 14 Sep 2017 08:06:33 +0000 (10:06 +0200)]
i965: rename BRW_NEW_FAST_CLEAR_COLOR to BRW_NEW_AUX_STATE

We want to use this flag to signal changes to the aux surfaces,
so let's not make it about fast clearing only. Suggested by Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago docs: update calendar, add news item and link release notes for 17.2.1
Emil Velikov [Sun, 17 Sep 2017 23:16:42 +0000 (00:16 +0100)]
docs: update calendar, add news item and link release notes for 17.2.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago docs: add sha256 checksums for 17.2.1
Emil Velikov [Sun, 17 Sep 2017 23:12:36 +0000 (00:12 +0100)]
docs: add sha256 checksums for 17.2.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit bd903d4ee15333288848708a60d6c8002cbb5cb1)

7 years ago docs: add release notes for 17.2.1
Emil Velikov [Sun, 17 Sep 2017 22:57:32 +0000 (23:57 +0100)]
docs: add release notes for 17.2.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit d6d2b6b5ec9b1638c0827582872670c7da79bb53)

7 years ago docs: update sourcetree following omx rename
Eric Engestrom [Sat, 16 Sep 2017 22:56:08 +0000 (23:56 +0100)]
docs: update sourcetree following omx rename

Fixes: 6a8aa11c207b99920b93 "st/omx_bellagio: Rename state tracker and option"
Cc: Gurkirpal Singh <gurkirpal204@gmail.com>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Andres Gomez <agomez@igalia.com>
7 years ago gbm: Add gbm_device_get_format_modifier_plane_count to test
Gert Wollny [Sat, 16 Sep 2017 16:03:16 +0000 (18:03 +0200)]
gbm: Add gbm_device_get_format_modifier_plane_count to test

Adding gbm_device_get_format_modifier_plane_count made the
gbm-symbols-check test fail. This patch adds the corresponding
function name to the test.

Fixes: 8824141b8d48d9120ddbf542d6fb661046c41c62
 (gbm: Add a gbm_device_get_format_modifier_plane_count function)

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Andres Gomez <agomez@igalia.com>
7 years ago travis: replace omx feature flag with omx-bellagio one
Andres Gomez [Sat, 16 Sep 2017 17:23:56 +0000 (20:23 +0300)]
travis: replace omx feature flag with omx-bellagio one

Fixes: 6a8aa11c207 ("st/omx_bellagio: Rename state tracker and
option")
Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Gurkirpal Singh <gurkirpal204@gmail.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
7 years ago docs/submittingpatches: add 'test each commit' instructions
Eric Engestrom [Fri, 15 Sep 2017 17:10:57 +0000 (17:10 +0000)]
docs/submittingpatches: add 'test each commit' instructions

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
7 years ago radv: Add support for more DCC compression with VK_KHR_image_format_list.
Bas Nieuwenhuizen [Fri, 15 Sep 2017 22:30:18 +0000 (00:30 +0200)]
radv: Add support for more DCC compression with VK_KHR_image_format_list.

Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years ago radv: Add code to check if two formats can share DCC metadata.
Bas Nieuwenhuizen [Fri, 15 Sep 2017 21:39:45 +0000 (23:39 +0200)]
radv: Add code to check if two formats can share DCC metadata.

Ported from radeonsi.

Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years ago i965: Add an INTEL_DEBUG=reemit option.
Kenneth Graunke [Sat, 16 Sep 2017 00:47:07 +0000 (17:47 -0700)]
i965: Add an INTEL_DEBUG=reemit option.

Jason and I use this for debugging all the time.  Recompiling the driver
to enable it is kind of annoying.  It's a great thing to try along with
always_flush_batch=true and always_flush_cache=true to detect a class of
problems - namely, atoms listening to an insufficient set of dirty bits.

Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years ago clover: Fix build after LLVM r313390
Jan Vesely [Sat, 16 Sep 2017 00:34:42 +0000 (20:34 -0400)]
clover: Fix build after LLVM r313390

v2: pass llvm context reference instead of a pointer

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years ago radv: Don't redundantly emit pipelines after secondary cmd buffer.
Bas Nieuwenhuizen [Tue, 12 Sep 2017 22:12:48 +0000 (00:12 +0200)]
radv: Don't redundantly emit pipelines after secondary cmd buffer.

Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years ago radv: Check for GFX9 for 1D arrays in image_size intrinsic.
Bas Nieuwenhuizen [Fri, 15 Sep 2017 19:40:00 +0000 (21:40 +0200)]
radv: Check for GFX9 for 1D arrays in image_size intrinsic.

We implement them as 2D images only on GFX9.

This fixes:
dEQP-VK.image.image_size.1d_array.readonly_12x34
dEQP-VK.image.image_size.1d_array.readonly_1x1
dEQP-VK.image.image_size.1d_array.readonly_32x32
dEQP-VK.image.image_size.1d_array.readonly_7x1
dEQP-VK.image.image_size.1d_array.readonly_writeonly_12x34
dEQP-VK.image.image_size.1d_array.readonly_writeonly_1x1
dEQP-VK.image.image_size.1d_array.readonly_writeonly_32x32
dEQP-VK.image.image_size.1d_array.readonly_writeonly_7x1
dEQP-VK.image.image_size.1d_array.writeonly_12x34
dEQP-VK.image.image_size.1d_array.writeonly_1x1
dEQP-VK.image.image_size.1d_array.writeonly_32x32
dEQP-VK.image.image_size.1d_array.writeonly_7x1

Fixes: 1bcb953e166 "radv: handle GFX9 1D textures"
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years ago i965: drop unused variables
Eric Engestrom [Fri, 15 Sep 2017 17:11:11 +0000 (18:11 +0100)]
i965: drop unused variables

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
7 years ago i965/tex: Unify the TexImage and TexSubImage code
Jason Ekstrand [Wed, 31 May 2017 20:48:10 +0000 (13:48 -0700)]
i965/tex: Unify the TexImage and TexSubImage code

It's nearly the same so there's no good reason why it can't be in a
common function.  The one difference is that _mesa_store_teximage
calls AllocTextureImageBuffer for us, while _mesa_store_texsubimage
doesn't, but we don't need that anyway - intelTexImage already does it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
7 years ago i965/tex: Remove the for_glTexImage parameter from texsubimage_tiled_memcpy
Jason Ekstrand [Wed, 31 May 2017 20:43:54 +0000 (13:43 -0700)]
i965/tex: Remove the for_glTexImage parameter from texsubimage_tiled_memcpy

It is set to false in both callers.  It isn't needed for glTexImage
because intelTexImage calls AllocTextureImageBuffer before calling
texsubimage_tiled_memcpy.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
7 years ago i965/tex: Make a couple of helpers static
Jason Ekstrand [Wed, 31 May 2017 20:35:30 +0000 (13:35 -0700)]
i965/tex: Make a couple of helpers static

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
7 years ago i965: Move TexSubImage functions to intel_tex_image.c
Jason Ekstrand [Wed, 31 May 2017 20:32:29 +0000 (13:32 -0700)]
i965: Move TexSubImage functions to intel_tex_image.c

These two paths are basically the same.  There's no good reason to have
them in different files.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
7 years ago i965/blorp: Set r8stencil_needs_update when writing stencil
Jason Ekstrand [Sat, 17 Jun 2017 20:50:30 +0000 (13:50 -0700)]
i965/blorp: Set r8stencil_needs_update when writing stencil

This fixes a crash on Haswell when we try to upload a stencil texture
with blorp.  It would also be a problem if someone tried to texture from
stencil after glBlitFramebuffers.

Cc: "17.2 17.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
7 years ago util/u_atomic: Add implementation of __sync_val_compare_and_swap_8
Matt Turner [Thu, 14 Sep 2017 18:00:26 +0000 (11:00 -0700)]
util/u_atomic: Add implementation of __sync_val_compare_and_swap_8

Needed for 32-bit PowerPC.
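
For illustration, a lock-based fallback for a 64-bit compare-and-swap might look like the sketch below. It is not the code added by this commit: `emulated_cas_u64` and `cas_lock` are hypothetical names, and a single global mutex is assumed to be acceptable for the emulation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* One global lock serializes every emulated 64-bit atomic. */
static pthread_mutex_t cas_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for __sync_val_compare_and_swap_8: store newval
 * if *ptr equals oldval, and return the value *ptr held before the call. */
static uint64_t
emulated_cas_u64(uint64_t *ptr, uint64_t oldval, uint64_t newval)
{
   assert(ptr != NULL);

   pthread_mutex_lock(&cas_lock);
   uint64_t prev = *ptr;
   if (prev == oldval)
      *ptr = newval;
   pthread_mutex_unlock(&cas_lock);

   return prev;
}
```

The swap succeeded exactly when the returned value equals oldval, which matches the contract of the compiler builtin.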

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Fixes: a6a38a038bd ("util/u_atomic: provide 64bit atomics where
they're missing")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago util: Link libmesautil into u_atomic_test
Matt Turner [Thu, 14 Sep 2017 17:48:57 +0000 (10:48 -0700)]
util: Link libmesautil into u_atomic_test

Platforms without particular atomic operations require the
implementations in u_atomic.c

Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Fixes: a6a38a038bd ("util/u_atomic: provide 64bit atomics where
they're missing")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago vulkan: update headers & registry to VK 1.0.61
Lionel Landwerlin [Fri, 15 Sep 2017 14:10:53 +0000 (15:10 +0100)]
vulkan: update headers & registry to VK 1.0.61

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
7 years ago automake: enable libunwind in `make distcheck'
Emil Velikov [Mon, 11 Sep 2017 17:13:55 +0000 (18:13 +0100)]
automake: enable libunwind in `make distcheck'

Enable the toggle to catch when the library is missing from the link
path. Better to test, fail and address before releasing Mesa ;-)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years ago travis: Add libunwind-dev to gallium/make builds
Gert Wollny [Thu, 14 Sep 2017 10:27:42 +0000 (12:27 +0200)]
travis: Add libunwind-dev to gallium/make builds

libunwind is an optional dependency used by the gallium aux module
(libgallium) and consequently the final binaries must be linked against
it. To test whether the library is properly specified in the link pass
add it to the travis-ci build environment and force its use.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago travis: force llvm-3.3 for "make Gallium ST Other"
Gert Wollny [Thu, 14 Sep 2017 10:27:41 +0000 (12:27 +0200)]
travis: force llvm-3.3 for "make Gallium ST Other"

In Ubuntu Trusty the default version of llvm is 3.4 and the build was
actually randomly picking 3.5 or 3.9. Adding libunwind would then result
in build success or failure depending on which version was picked.

Install the llvm-3.3-dev package and force its use: on the one hand it is
the minimum required version we want to build-test against, and on
the other hand forcing the version stabilizes the build.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago mesa/st/tests: Correct build flags and force -std=c++11
Gert Wollny [Wed, 13 Sep 2017 13:03:34 +0000 (15:03 +0200)]
mesa/st/tests: Correct build flags and force -std=c++11

Include src/gallium/Automake.inc, correct the build flags accordingly.

Force -std=c++11 (extensively used by the test) as otherwise it gets
defined only when building against llvm >= 3.9.

Fixes: 7be6d8fe12 ("mesa/st: glsl_to_tgsi: add tests for the new
temporary lifetime tracker")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102665
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
7 years ago automake: include radv_shader.h in the sources list
Emil Velikov [Fri, 15 Sep 2017 12:40:22 +0000 (13:40 +0100)]
automake: include radv_shader.h in the sources list

Otherwise it will be missing from the tarball, leading to a build failure.

Fixes: d4d777317b9 ("radv: move shaders related code to radv_shader.c")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago st/omx_bellagio: Rename state tracker and option
Gurkirpal Singh [Sat, 12 Aug 2017 16:07:15 +0000 (21:37 +0530)]
st/omx_bellagio: Rename state tracker and option

Changes --enable-omx option to --enable-omx-bellagio

Signed-off-by: Gurkirpal Singh <gurkirpal204@gmail.com>
Reviewed-and-Tested-by: Julien Isorce <julien.iso...@gmail.com>
Acked-by: Christian König <christian.koenig@amd.com>
7 years ago i965: fix build warning on clang
Tapani Pälli [Thu, 14 Sep 2017 07:26:39 +0000 (10:26 +0300)]
i965: fix build warning on clang

Fixes the following warning:
   warning: format specifies type 'long' but the argument has type 'uint64_t' (aka 'unsigned long long')

A cast is needed to keep this change from turning into another warning:
   warning: format specifies type 'unsigned long long' but the argument has type 'uint64_t' (aka 'unsigned long')
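
As a sketch of the two usual ways to print a uint64_t without either warning: use the PRIu64 macro from <inttypes.h>, or cast the value to unsigned long long and use %llu. The helper name `format_u64` is hypothetical, and the message does not say which form the patch itself uses beyond mentioning a cast.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Write value twice into buf: once with PRIu64, which expands to the
 * right length modifier for the platform, and once cast to unsigned
 * long long with %llu.  Returns the snprintf() result. */
static int
format_u64(char *buf, size_t size, uint64_t value)
{
   assert(buf != NULL && size > 0);

   return snprintf(buf, size, "%" PRIu64 " %llu",
                   value, (unsigned long long) value);
}
```

Both conversions are correct whether uint64_t is a typedef for unsigned long or for unsigned long long.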

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
7 years ago radv: fix a potential crash if attachments allocation failed
Samuel Pitoiset [Thu, 14 Sep 2017 16:47:04 +0000 (18:47 +0200)]
radv: fix a potential crash if attachments allocation failed

Also, it's useless to set the error code twice. Though, we
should probably skip the next commands when the command buffer
is considered invalid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years ago radv: dump the device name into the hang report
Samuel Pitoiset [Thu, 14 Sep 2017 09:25:24 +0000 (11:25 +0200)]
radv: dump the device name into the hang report

Similar to RadeonSI renderer string.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years ago radv: add get_chip_name() callback
Samuel Pitoiset [Thu, 14 Sep 2017 09:25:23 +0000 (11:25 +0200)]
radv: add get_chip_name() callback

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years ago r600: add .gitignore for egd_tables.h
Dave Airlie [Fri, 15 Sep 2017 03:52:22 +0000 (13:52 +1000)]
r600: add .gitignore for egd_tables.h

7 years ago radeonsi: enable STD430 packing of UBOs by default
Timothy Arceri [Thu, 14 Sep 2017 22:22:33 +0000 (08:22 +1000)]
radeonsi: enable STD430 packing of UBOs by default

Before this change we were defaulting to STD140 which is slightly
less efficient at packing arrays.
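
To make the packing difference concrete, the sketch below computes the array stride of a scalar or vector element under each layout. It follows the GLSL layout rules as I understand them (std140 rounds array element alignment up to that of a vec4, std430 does not). The function names are hypothetical and this is not driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Round value up to the next multiple of alignment. */
static size_t
align_up(size_t value, size_t alignment)
{
   assert(alignment > 0);
   return (value + alignment - 1) / alignment * alignment;
}

/* Array stride in bytes for an element whose own alignment is
 * base_alignment (4 for float, 8 for vec2, 16 for vec3 and vec4). */
static size_t
std140_array_stride(size_t base_alignment)
{
   /* std140 rounds the element alignment up to that of a vec4. */
   return align_up(base_alignment, 16);
}

static size_t
std430_array_stride(size_t base_alignment)
{
   /* std430 keeps the element's own alignment. */
   return base_alignment;
}
```

For a `float[]` array this gives a 16-byte stride under std140 and a 4-byte stride under std430, so the same array takes a quarter of the space.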

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago st/mesa: set UseSTD430AsDefaultPacking const based on CAP
Timothy Arceri [Thu, 14 Sep 2017 22:21:22 +0000 (08:21 +1000)]
st/mesa: set UseSTD430AsDefaultPacking const based on CAP

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago gallium: introduce PIPE_CAP_LOAD_CONSTBUF
Timothy Arceri [Thu, 17 Aug 2017 10:12:42 +0000 (20:12 +1000)]
gallium: introduce PIPE_CAP_LOAD_CONSTBUF

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago radeonsi: make use of LOAD for UBOs
Timothy Arceri [Thu, 17 Aug 2017 03:29:54 +0000 (13:29 +1000)]
radeonsi: make use of LOAD for UBOs

v2: always set can_speculate and allow_smem to true

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago mesa/st: add LOAD support for UBOs
Timothy Arceri [Tue, 25 Jul 2017 03:08:36 +0000 (13:08 +1000)]
mesa/st: add LOAD support for UBOs

This will allow us to use STD430 packing by default if the driver
supports it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago mesa/st: create add_buffer_to_load_and_stores() helper
Timothy Arceri [Thu, 17 Aug 2017 10:29:27 +0000 (20:29 +1000)]
mesa/st: create add_buffer_to_load_and_stores() helper

Will be used to add LOAD support to UBOs.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago gallium: add CONSTBUF type to tgsi_file_type
Timothy Arceri [Thu, 17 Aug 2017 06:50:01 +0000 (16:50 +1000)]
gallium: add CONSTBUF type to tgsi_file_type

This will be used to distinguish between load types when using
the TGSI_OPCODE_LOAD opcode.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago virgl: drop const dimensions on first block.
Dave Airlie [Tue, 12 Sep 2017 23:23:15 +0000 (09:23 +1000)]
virgl: drop const dimensions on first block.

The virgl protocol version of tgsi doesn't handle this yet,
transform it back to the old ways.

Thanks to Nicolai Hähnle <nicolai.haehnle@amd.com>
for also writing nearly the same patch.

Fixes: 41e342d5 tgsi/ureg: always emit constants (and their decls) as 2D
Tested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago st/glsl->tgsi: fix u64 to bool comparisons.
Dave Airlie [Thu, 14 Sep 2017 04:03:19 +0000 (05:03 +0100)]
st/glsl->tgsi: fix u64 to bool comparisons.

Otherwise we end up using a 32-bit comparison, which doesn't end well.
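
A plain C analogue of the problem, not the TGSI code itself: if only 32 bits take part in the comparison, a 64-bit value whose low word is zero converts to false. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Buggy: only the low 32 bits take part in the comparison. */
static bool
u64_to_bool_32bit_compare(uint64_t value)
{
   return (uint32_t) value != 0;
}

/* Correct: compare all 64 bits against zero. */
static bool
u64_to_bool_64bit_compare(uint64_t value)
{
   return value != 0;
}
```

With the value `1ull << 32` the first function returns false and the second returns true.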

Timothy caught this while playing around with some opt passes.

Fixes: 278580729a (st/glsl_to_tgsi: add support for 64-bit integers)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago i965: Print size of validation and relocation lists in INTEL_DEBUG=flush
Kenneth Graunke [Wed, 6 Sep 2017 17:55:07 +0000 (10:55 -0700)]
i965: Print size of validation and relocation lists in INTEL_DEBUG=flush

It's nice to have this information.  While we're at it, tweak the
formatting to try and vertically align numbers in the common case.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years ago i965: Disentangle batch and state buffer flushing.
Kenneth Graunke [Tue, 5 Sep 2017 22:14:18 +0000 (15:14 -0700)]
i965: Disentangle batch and state buffer flushing.

We now flush the batch when either the batchbuffer or statebuffer
reaches the original intended batch size, instead of when the sum of
the two reaches a certain size (which makes no sense now that they're
separate buffers).

With this change, we also need to update our "are we near the end?"
estimate to require separate batch and state buffer space.  I obtained
these estimates by looking at the size of draw calls in the Unreal 4
Elemental Demo (using INTEL_DEBUG=flush and always_flush_batch=true).

This will significantly impact the size of our batches.  I've adjusted
both down to try and be roughly similar to what we had been doing.  On
various benchmarks, a 20kB batch and 16kB statebuffer seemed to be about
right, but we may need to adjust this further.  I tried a 16kB batch,
but that regressed Synmark OglMultithread performance by a fair bit.
32kB for both would have significantly increased our batch sizes.
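
The change in the flush rule can be sketched as two predicates. The 20kB and 16kB figures come from the text above; the constant and function names are hypothetical, and the combined limit is left as a parameter because the message does not state it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sizes quoted in the commit message; the names are hypothetical. */
#define BATCH_SZ (20 * 1024)
#define STATE_SZ (16 * 1024)

/* Old rule: flush when the combined usage reaches a single limit. */
static bool
should_flush_combined(size_t batch_used, size_t state_used, size_t limit)
{
   return batch_used + state_used >= limit;
}

/* New rule: flush when either buffer reaches its own limit. */
static bool
should_flush_separate(size_t batch_used, size_t state_used)
{
   return batch_used >= BATCH_SZ || state_used >= STATE_SZ;
}
```

Under the new rule a nearly full batch no longer forces a flush just because the state buffer also holds data, which is why the individual limits had to be tuned down.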

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years ago i965: Delete BATCH_RESERVED handling.
Kenneth Graunke [Tue, 5 Sep 2017 22:03:48 +0000 (15:03 -0700)]
i965: Delete BATCH_RESERVED handling.

Now that we can grow the batchbuffer if we absolutely need the extra
space, we don't need to reserve space for the final do-or-die ending
commands.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years ago i965: Make BLORP properly avoid batch wrapping.
Kenneth Graunke [Tue, 5 Sep 2017 22:57:41 +0000 (15:57 -0700)]
i965: Make BLORP properly avoid batch wrapping.

We need to set brw->no_batch_wrap to actually avoid flushing in the
middle of our BLORP operation, and instead grow the batchbuffer.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years ago i965: Grow the batch/state buffers if we need space and can't flush.
Kenneth Graunke [Fri, 1 Sep 2017 23:42:56 +0000 (16:42 -0700)]
i965: Grow the batch/state buffers if we need space and can't flush.

Previously, we would just assert fail and die in this case.  The only
safeguard is the "estimated max prim size" checks when starting a draw
(or compute dispatch or BLORP operation)...which are woefully broken.

Growing is fairly straightforward:

1. Allocate a new larger BO.
2. memcpy the existing contents over to the new buffer.
3. Set the new BO to the same GTT offset as the old BO.  When emitting
   relocations, we write the presumed GTT offset of the target BO.  If
   we changed it, we'd have to update all the existing values (by
   walking the relocation list and looking at offsets), which is more
   expensive.  With the old BO freed, ideally the kernel could simply
   place the new BO at that offset anyway.
4. Update the validation list to contain the new BO.
5. Update the relocation list to have the GEM handle for the new BO
   (which we can skip if using I915_EXEC_HANDLE_LUT).
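
Steps 1-3 can be sketched with a plain malloc'd buffer. `struct fake_bo` and `grow_bo()` are hypothetical stand-ins, not the driver's buffer object API, and steps 4 and 5 (validation and relocation list updates) are left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a buffer object. */
struct fake_bo {
   void *map;            /* CPU-visible contents */
   size_t size;          /* allocation size in bytes */
   uint64_t gtt_offset;  /* presumed GPU address used by relocations */
};

/* Grow bo to new_size.  Returns 0 on success, or -1 if the allocation
 * fails, in which case bo is left untouched. */
static int
grow_bo(struct fake_bo *bo, size_t used, size_t new_size)
{
   assert(used <= bo->size && new_size > bo->size);

   void *new_map = malloc(new_size);      /* 1. allocate a larger buffer */
   if (new_map == NULL)
      return -1;

   memcpy(new_map, bo->map, used);        /* 2. copy the existing contents */

   free(bo->map);
   bo->map = new_map;
   bo->size = new_size;
   /* 3. gtt_offset is deliberately left unchanged, so presumed offsets
    * already written into relocations stay valid. */
   return 0;
}
```

Keeping the offset fixed is what avoids a walk over the relocation list after the copy.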

v2: Update to handle malloc'd shadow buffers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years ago i965: Use a separate state buffer, but avoid changing flushing behavior.
Kenneth Graunke [Wed, 30 Aug 2017 08:37:24 +0000 (01:37 -0700)]
i965: Use a separate state buffer, but avoid changing flushing behavior.

Previously, we emitted GPU commands and indirect state into the same
buffer, using a stack/heap like system where we filled in commands from
the start of the buffer, and state from the end of the buffer.  We then
flushed before the two met in the middle.
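
The old single-buffer scheme can be sketched as a two-ended allocator. `struct shared_buffer`, `alloc_cmd()` and `alloc_state()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-buffer layout: commands grow up from offset 0,
 * indirect state grows down from the end of the buffer. */
struct shared_buffer {
   size_t size;        /* total buffer size in bytes */
   size_t cmd_end;     /* first free byte after the commands */
   size_t state_start; /* lowest byte used by state */
};

/* Reserve bytes for commands; fails if they would run into the state. */
static bool
alloc_cmd(struct shared_buffer *buf, size_t bytes)
{
   assert(buf->cmd_end <= buf->state_start);
   if (bytes > buf->state_start - buf->cmd_end)
      return false;
   buf->cmd_end += bytes;
   return true;
}

/* Reserve bytes for state; fails if it would run into the commands. */
static bool
alloc_state(struct shared_buffer *buf, size_t bytes)
{
   assert(buf->cmd_end <= buf->state_start);
   if (bytes > buf->state_start - buf->cmd_end)
      return false;
   buf->state_start -= bytes;
   return true;
}
```

A failed allocation here corresponds to the two regions meeting in the middle, so a flush has to happen before that point is reached.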

Meeting in the middle is fatal, so you have to be certain that you
reserve the correct amount of space before emitting commands or state
for a draw.  Currently, we will assert !no_batch_wrap and die if the
estimate is ever too small.  This has been mercifully obscure, but has
happened on a number of occasions, and could in theory happen to any
application that issues a large draw at just the wrong time.

Estimating the amount of batch space required is painful - it's hard to
get right, and getting it right involves a lot of code that would burn
CPU time, and also be painful to maintain.  Rolling back to a saved
state and retrying is also painful - failing to save/restore all the
required state will break things, and redoing state emission burns a
lot of CPU.  memcpy'ing to a new batch and continuing is painful,
because commands we issue for a draw depend on earlier commands as well
(such as STATE_BASE_ADDRESS, or the GPU being in a particular state).

The best plan is to never run out of space, which is totally doable but
pretty wasteful - a pessimal draw requires a huge amount of space, and
rarely occurs.  Instead, we'd like to grow the batch buffer if we need
more space and can't safely flush.

We can't grow with a meet in the middle approach - we'd have to move the
state to the end, which would mean updating every offset from dynamic
state base address.  Using separate batch and state buffers, where both
fill starting at the beginning, makes it easy to grow either as needed.

This patch separates the two concepts.  We create a separate state
buffer, with a second relocation list, and use that for brw_state_batch.

However, this patch tries to retain the original flushing behavior - it
adds the amount of batch and state space together, as if they were still
co-existing in a single buffer.  The hope is to flush at the same time
as before.  This is necessary to avoid provoking bugs caused by broken
batch wrap handling (which we'll fix shortly).  It also avoids suddenly
increasing the size of the batch (due to state not taking up space),
which could have a significant performance impact.  We'll tune it later.

v2:
- Mark the statebuffer with EXEC_OBJECT_CAPTURE when supported (caught
  by Chris).  Unfortunately, we lose the ability to capture state data
  on older kernels.
- Continue to support the malloc'd shadow buffers.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years ago i965: Pass screen to intel_batchbuffer_reset().
Kenneth Graunke [Fri, 8 Sep 2017 06:43:46 +0000 (23:43 -0700)]
i965: Pass screen to intel_batchbuffer_reset().

This will let us access screen->kernel_features in the next patch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years ago i965: Prepare INTEL_DEBUG=bat decoding for a separate statebuffer.
Kenneth Graunke [Wed, 30 Aug 2017 08:37:24 +0000 (01:37 -0700)]
i965: Prepare INTEL_DEBUG=bat decoding for a separate statebuffer.

We'll need to read from both buffers when decoding state.

This also drops the "failed to map" fallback - it's completely useless
on LLC systems where we write directly to the mapped BO.  It's not that
useful on non-LLC systems either.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years ago i965: Split brw_emit_reloc into brw_batch_reloc and brw_state_reloc.
Kenneth Graunke [Thu, 31 Aug 2017 20:10:19 +0000 (13:10 -0700)]
i965: Split brw_emit_reloc into brw_batch_reloc and brw_state_reloc.

brw_batch_reloc emits a relocation from the batchbuffer to elsewhere.
brw_state_reloc emits a relocation from the statebuffer to elsewhere.

For now, they do the same thing, but when we actually split the two
buffers, we'll change brw_state_reloc to use the state buffer.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years ago i965: Refactor relocs into a brw_reloc_list structure.
Kenneth Graunke [Thu, 31 Aug 2017 18:57:01 +0000 (11:57 -0700)]
i965: Refactor relocs into a brw_reloc_list structure.

I'm planning on splitting batch and state into separate buffers, at
which point we'll need two relocation lists.  In preparation for that,
this patch refactors the relocation stuff into a structure we can
replicate...which looks a lot like anv_reloc_list.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years ago i965: Move brw_state_batch code to intel_batchbuffer.c
Kenneth Graunke [Wed, 30 Aug 2017 08:40:00 +0000 (01:40 -0700)]
i965: Move brw_state_batch code to intel_batchbuffer.c

The batch buffer and state buffer code is fairly tied together,
and having it in one .c file will make refactoring easier.

Also, drop some commentary above brw_state_batch.  The "aperture
checking performance hacks" are long since gone, so that paragraph
makes little sense at this point.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years ago i965: Drop a useless ret == 0 check.
Kenneth Graunke [Wed, 30 Aug 2017 07:47:03 +0000 (00:47 -0700)]
i965: Drop a useless ret == 0 check.

Prior to the previous patch, we would pwrite the batchbuffer contents,
and wanted to skip the execbuffer if that failed.  Now that we memcpy,
we don't set ret != 0 on failure anymore, so it will always be 0.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years ago i965: Use a WC map and memcpy for the batch instead of pwrite.
Kenneth Graunke [Fri, 8 Sep 2017 22:00:14 +0000 (15:00 -0700)]
i965: Use a WC map and memcpy for the batch instead of pwrite.

We'd like to eliminate the malloc'd shadow copy eventually, but there
are still unresolved performance problems.  In the meantime, let's at
least get rid of pwrite.

On Apollolake, improves Synmark OglBatch6 performance by:
1.53581% +/- 0.269589% (n=108).

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years ago i965: Use batch->bo->size in brw_emit_reloc assertion.
Kenneth Graunke [Wed, 30 Aug 2017 08:04:48 +0000 (01:04 -0700)]
i965: Use batch->bo->size in brw_emit_reloc assertion.

This makes the assertion safe against batchbuffers growing.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years ago i965: Delete a batch size assertion that isn't very useful.
Kenneth Graunke [Wed, 30 Aug 2017 07:54:40 +0000 (00:54 -0700)]
i965: Delete a batch size assertion that isn't very useful.

This assertion prevents you from doing intel_batchbuffer_require_space
with a size so huge it won't fit in the batchbuffer.  This doesn't seem
like a common mistake, and I've never seen the assert be useful.

Soon, I hope to have batches grow, at which point this won't make sense.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years ago i965/screen: Implement queryDmaBufFormatModifierAttribs
Jason Ekstrand [Wed, 16 Aug 2017 19:09:16 +0000 (12:09 -0700)]
i965/screen: Implement queryDmaBufFormatModifierAttribs

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
7 years ago i965/screen: Report the correct number of image planes
Jason Ekstrand [Wed, 16 Aug 2017 19:01:15 +0000 (12:01 -0700)]
i965/screen: Report the correct number of image planes

For non-CCS images, we were reporting just one plane even though they
may have multiple in the case of YUV.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
7 years ago gbm: Add a gbm_device_get_format_modifier_plane_count function
Jason Ekstrand [Wed, 16 Aug 2017 18:54:11 +0000 (11:54 -0700)]
gbm: Add a gbm_device_get_format_modifier_plane_count function

This allows the user to query the number of planes required by a given
format+modifier combination without having to create a bo or surface.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
7 years ago dri/image: Add a format modifier attributes query
Jason Ekstrand [Wed, 16 Aug 2017 18:53:38 +0000 (11:53 -0700)]
dri/image: Add a format modifier attributes query

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
7 years ago drirc: enable glthread for more games (Civ5, CivBE, Dreamfall, Hitman, SR3)
Christoph Berliner [Thu, 14 Sep 2017 19:01:04 +0000 (21:01 +0200)]
drirc: enable glthread for more games (Civ5, CivBE, Dreamfall, Hitman, SR3)

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
7 years ago glsl: avoid accessing invalid memory after get_variable_being_redeclared()
Iago Toral Quiroga [Wed, 13 Sep 2017 07:34:38 +0000 (09:34 +0200)]
glsl: avoid accessing invalid memory after get_variable_being_redeclared()

After get_variable_being_redeclared() has been called, it is no longer
safe to access the original variable pointer, since its memory might have
been freed.

Since callers of this function should only be accessing the variable pointer
returned by the function, avoid potential bugs by re-assigning the
original variable pointer to the result of the function call,
making it impossible for the remaining code to access an invalid variable
pointer.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago glsl: make the redeclared variable NULL if it is deleted
Iago Toral Quiroga [Wed, 13 Sep 2017 07:08:01 +0000 (09:08 +0200)]
glsl: make the redeclared variable NULL if it is deleted

get_variable_being_redeclared() can delete the original variable
in a specific scenario. The code sets it to NULL after this so other
code in that same function doesn't try to access trashed memory after
the fact. However, the copy of that variable in the caller code
won't see any of this, making it very easy to overlook.

Make the function a bit safer by taking a pointer to the original
variable so we can also make NULL the caller's pointer to the variable
if this function deletes it.
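
The pattern can be sketched in C as follows. `struct variable` and `get_redeclared()` are hypothetical stand-ins for the compiler's types, and the condition under which the original is deleted is simplified to "an earlier declaration exists".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the compiler's variable type. */
struct variable {
   int id;
};

/* The callee takes a pointer to the caller's pointer.  If it frees the
 * original variable, it also sets the caller's pointer to NULL, so the
 * caller cannot touch freed memory by accident.  Returns the variable
 * the caller should keep using. */
static struct variable *
get_redeclared(struct variable **var_ptr, struct variable *earlier)
{
   assert(var_ptr != NULL && *var_ptr != NULL);

   if (earlier == NULL)
      return *var_ptr;      /* not a redeclaration: keep the original */

   free(*var_ptr);          /* redeclaration: drop the new variable */
   *var_ptr = NULL;         /* and clear the caller's copy as well */
   return earlier;
}
```

After the call, a stale use of the caller's pointer dereferences NULL and fails loudly instead of silently reading freed memory.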

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago glsl: use 'declared_var' instead of 'var' after checking redeclarations
Iago Toral Quiroga [Wed, 13 Sep 2017 06:59:18 +0000 (08:59 +0200)]
glsl: use 'declared_var' instead of 'var' after checking redeclarations

Since the original 'var' might have been deleted from this point forward.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102685
Fixes: 51bf007d2c27fba (glsl: Disallow unsized array of atomic_uint)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago dri/radeon: use ARRAY_SIZE macro
Eric Engestrom [Mon, 4 Sep 2017 12:54:52 +0000 (13:54 +0100)]
dri/radeon: use ARRAY_SIZE macro

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
7 years agoradv: dump the list of enabled options when a hang occured
Samuel Pitoiset [Mon, 11 Sep 2017 20:28:42 +0000 (22:28 +0200)]
radv: dump the list of enabled options when a hang occured

Useful to know which debug/perftest options were enabled when
a hang report is generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: dump last 60 lines of dmesg when a hang occured
Samuel Pitoiset [Mon, 11 Sep 2017 20:02:54 +0000 (22:02 +0200)]
radv: dump last 60 lines of dmesg when a hang occured

Copied from dd_dump_dmesg().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: dump descriptors when a hang occured
Samuel Pitoiset [Mon, 11 Sep 2017 14:13:05 +0000 (16:13 +0200)]
radv: dump descriptors when a hang occured

Might be useful for checking if all descriptors are set by
the application.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: save all descriptor pointers into the trace BO
Samuel Pitoiset [Mon, 11 Sep 2017 14:12:15 +0000 (16:12 +0200)]
radv: save all descriptor pointers into the trace BO

To dump them when a hang is detected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: dump annotated shaders using UMR
Samuel Pitoiset [Mon, 11 Sep 2017 11:44:20 +0000 (13:44 +0200)]
radv: dump annotated shaders using UMR

This might be very useful in order to figure out where a shader
is stuck. This uses UMR to detect which instruction is executing
bad things.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradeonsi: move si_get_wave_info() to AMD common code
Samuel Pitoiset [Thu, 7 Sep 2017 09:05:29 +0000 (11:05 +0200)]
radeonsi: move si_get_wave_info() to AMD common code

This will allow us to use it from radv.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: dump some status MMIO registers when a hang occured
Samuel Pitoiset [Wed, 6 Sep 2017 07:47:21 +0000 (09:47 +0200)]
radv: dump some status MMIO registers when a hang occured

Might report some useful information to help figure out where
the hang happened.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv/winsys: add a read_registers() callback
Samuel Pitoiset [Wed, 6 Sep 2017 07:38:19 +0000 (09:38 +0200)]
radv/winsys: add a read_registers() callback

To dump some status MMIO registers when a hang is detected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: dump shader stats when a hang occured
Samuel Pitoiset [Tue, 5 Sep 2017 13:36:59 +0000 (15:36 +0200)]
radv: dump shader stats when a hang occured

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: add radv_shader_dump_stats() helper
Samuel Pitoiset [Tue, 5 Sep 2017 13:34:07 +0000 (15:34 +0200)]
radv: add radv_shader_dump_stats() helper

To dump the shader stats when a hang is detected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: dump the active shaders when a hang occured
Samuel Pitoiset [Tue, 5 Sep 2017 19:07:57 +0000 (21:07 +0200)]
radv: dump the active shaders when a hang occured

Only the disassembly is currently dumped.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: add debug flags for syncing shaders after every draw call
Samuel Pitoiset [Mon, 11 Sep 2017 13:00:41 +0000 (15:00 +0200)]
radv: add debug flags for syncing shaders after every draw call

To improve GPU hang detection when shaders are stuck.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: add radv_cmd_buffer_after_draw() helper function
Samuel Pitoiset [Mon, 11 Sep 2017 12:50:12 +0000 (14:50 +0200)]
radv: add radv_cmd_buffer_after_draw() helper function

To share common code after every draw/compute call.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: save the bound pipeline pointers into the trace BO
Samuel Pitoiset [Tue, 5 Sep 2017 19:02:14 +0000 (21:02 +0200)]
radv: save the bound pipeline pointers into the trace BO

When a GPU hang is detected in radv_gpu_hang_occured() we know
which command buffer is faulty but the bound pipelines might
have been updated during the execution.

The pointers to the radv_pipeline objects are emitted just
after the second trace ID, that way it would be easy to dump
the active shaders at the moment of the hang.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
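
As a rough sketch of how a hang handler might read such pointers back from the trace BO: the dword offsets below (two 32-bit trace IDs first, then one 64-bit pointer each for a graphics and a compute pipeline, low dword first) are assumptions made for illustration, and read_bound_pipeline() is a hypothetical helper, not a radv function.

```c
#include <stdint.h>

/* Assumed layout, in 32-bit dwords (illustrative only):
 *   [0]    first trace ID
 *   [1]    second trace ID
 *   [2-3]  64-bit graphics pipeline pointer
 *   [4-5]  64-bit compute pipeline pointer
 */
#define TRACE_FIRST_PIPELINE_DWORD 2u

/* Hypothetical helper: reassemble a 64-bit pointer from two dwords. */
static uint64_t
read_bound_pipeline(const uint32_t *trace_bo, int compute)
{
    unsigned idx = TRACE_FIRST_PIPELINE_DWORD + (compute ? 2u : 0u);
    uint64_t lo = trace_bo[idx];
    uint64_t hi = trace_bo[idx + 1];

    return lo | (hi << 32);
}
```

Because the pointers are stored in the same buffer as the trace IDs, the values read back reflect what was bound when the last recorded trace point executed, not what the command buffer bound last.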
7 years agoradv: add a comment that describes the trace BO layout
Samuel Pitoiset [Mon, 11 Sep 2017 13:12:25 +0000 (15:12 +0200)]
radv: add a comment that describes the trace BO layout

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
7 years agoradv: initialize the trace BO to 0
Samuel Pitoiset [Wed, 13 Sep 2017 09:13:03 +0000 (11:13 +0200)]
radv: initialize the trace BO to 0

To avoid random initial values.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoswr: use ARRAY_SIZE macro
Eric Engestrom [Wed, 6 Sep 2017 10:21:32 +0000 (11:21 +0100)]
swr: use ARRAY_SIZE macro

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agomesa: Deal with size differences between GLuint and GLhandleARB in GetAttachedObjectsARB
Jeremy Huddleston Sequoia [Sun, 8 May 2016 07:47:10 +0000 (00:47 -0700)]
mesa: Deal with size differences between GLuint and GLhandleARB in GetAttachedObjectsARB

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agogallium/{r600, radeonsi}: Fix segfault with color format (v2)
Denis Pauk [Tue, 12 Sep 2017 20:38:45 +0000 (23:38 +0300)]
gallium/{r600, radeonsi}: Fix segfault with color format (v2)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102552

v2: Patch cleanup proposed by Nicolai Hähnle.
    * deleted changes in si_translate_texformat.

Cc: Nicolai Hähnle <nhaehnle@gmail.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
7 years agoi965: Add an INTEL_DEBUG=submit option for printing batch statistics.
Kenneth Graunke [Tue, 5 Sep 2017 22:46:30 +0000 (15:46 -0700)]
i965: Add an INTEL_DEBUG=submit option for printing batch statistics.

When a batch is submitted, INTEL_DEBUG=bat prints a message indicating
which part of the code triggered the flush, and some statistics about
the batch/state buffer utilization.

It also decodes the batchbuffer in debug builds...which is so much
output that it drowns out the utilization messages, if that's all you
care about.

INTEL_DEBUG=submit now just does the utilization messages.
INTEL_DEBUG=bat continues to do both (as the message is a good indicator
that we're starting decode of a new batch).

v2: Rename from "flush" to "submit" (suggested by Chris) because we
    might want "flush" for PIPE_CONTROL debugging someday.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
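
The relationship between the two options can be sketched as a pair of flag bits. The bit values, parse_debug_flags(), and the two predicates are hypothetical names for illustration and do not reflect the actual i965 option table; the sketch also ignores the debug-build condition on decoding.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit assignments. */
#define DEBUG_BATCH  (1u << 0) /* "bat": decode and print statistics */
#define DEBUG_SUBMIT (1u << 1) /* "submit": print statistics only */

/* Parse a comma-separated option string such as "bat,submit". */
static uint32_t
parse_debug_flags(const char *s)
{
    uint32_t flags = 0;

    while (s != NULL && *s != '\0') {
        size_t n = strcspn(s, ","); /* length of the current token */

        if (n == 3 && strncmp(s, "bat", 3) == 0)
            flags |= DEBUG_BATCH;
        else if (n == 6 && strncmp(s, "submit", 6) == 0)
            flags |= DEBUG_SUBMIT;

        s += n;
        if (*s == ',')
            s++;
    }
    return flags;
}

/* Utilization messages are printed for either option. */
static bool
prints_utilization(uint32_t flags)
{
    return (flags & (DEBUG_BATCH | DEBUG_SUBMIT)) != 0;
}

/* Batch decoding happens only for "bat". */
static bool
decodes_batch(uint32_t flags)
{
    return (flags & DEBUG_BATCH) != 0;
}
```

With this split, "submit" yields the short utilization messages without the decode output, while "bat" keeps its existing behaviour.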
7 years agoradv/nir: call opt_remove_phis after trivial continues.
Dave Airlie [Wed, 13 Sep 2017 02:49:31 +0000 (03:49 +0100)]
radv/nir: call opt_remove_phis after trivial continues.

With the shaders in the ssao demo, nir_opt_if wasn't working
properly without this. With it, the if gets optimised so that
loop unrolling gets called.

(loop unrolling fails due to instruction count, but at least
it gets to do that.)

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agoutil/build_id: Include <dlfcn.h>
Chad Versace [Wed, 13 Sep 2017 18:51:04 +0000 (11:51 -0700)]
util/build_id: Include <dlfcn.h>

Fix the build for Android Nougat.

The dladdr(3) manpage says that <dlfcn.h> is required. On Linux, the
build succeeded without it because build_id.c includes <link.h> which
includes <dlfcn.h>. On Android, we must include <dlfcn.h> directly.

Fixes: 5c98d382 "util: Query build-id by symbol address, not library name"
Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agoutil: Query build-id by symbol address, not library name
Chad Versace [Tue, 12 Sep 2017 22:52:03 +0000 (15:52 -0700)]
util: Query build-id by symbol address, not library name

This patch renames build_id_find_nhdr() to
build_id_find_nhdr_for_addr(), and changes it to never examine the
library name.

Tested on Fedora by confirming that build_id_get_data() returns the same
build-id as the file(1) tool. For BSD, I confirmed that the API used
(dladdr() and struct Dl_info) is documented in FreeBSD's manpages.

This solves two problems:

    - We can now query the build-id without knowing the installed library's
      filename.

      This matters because Android requires specific filenames for HAL
      modules, such as "/vendor/lib/hw/vulkan.${board}.so". The HAL
      filenames do not follow the Unix convention of "libfoo.so".  In
      other words, the same query code will now work on Linux and Android.

    - Querying the build-id now works correctly when the process
      contains multiple shared objects with the same basename.
      (Admittedly, this is a highly unlikely scenario).

Cc: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
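
The two problems listed above can be illustrated with a simplified model of the loaded-object list. The sketch below does not use dladdr() or the real build_id_find_nhdr_for_addr(); struct loaded_object and both lookup helpers are hypothetical, and the address ranges are made up.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record for one shared object mapped into the process. */
struct loaded_object {
    const char *path;
    uintptr_t start;
    uintptr_t end; /* exclusive */
};

static const char *
base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Name-based lookup: returns the first object with a matching basename,
 * so it is ambiguous when two objects share one. */
static const struct loaded_object *
find_by_basename(const struct loaded_object *objs, size_t count,
                 const char *name)
{
    for (size_t i = 0; i < count; i++)
        if (strcmp(base_name(objs[i].path), name) == 0)
            return &objs[i];
    return NULL;
}

/* Address-based lookup: returns the object whose range contains addr,
 * regardless of what the file is called. */
static const struct loaded_object *
find_by_addr(const struct loaded_object *objs, size_t count, uintptr_t addr)
{
    for (size_t i = 0; i < count; i++)
        if (addr >= objs[i].start && addr < objs[i].end)
            return &objs[i];
    return NULL;
}
```

Given two objects that are both named libfoo.so, the name-based lookup can only ever return the first, whereas an address inside the second resolves to the second. An object with an Android HAL style filename is found by address without the caller knowing that name.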
7 years agost/glsl_to_tgsi: remove unused code in temprename
Nicolai Hähnle [Wed, 6 Sep 2017 09:49:12 +0000 (11:49 +0200)]
st/glsl_to_tgsi: remove unused code in temprename

Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
7 years agost/glsl_to_tgsi: be precise about merging scopes
Nicolai Hähnle [Wed, 6 Sep 2017 09:43:06 +0000 (11:43 +0200)]
st/glsl_to_tgsi: be precise about merging scopes

enclosing_scope already contains enclosing_scope_first_read.
What we really want to check here -- not for correctness, but
for speed -- is whether last_read_scope already contains
enclosing_scope.

Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
7 years agoac/surface: match Z and stencil tile config
Nicolai Hähnle [Tue, 5 Sep 2017 14:16:29 +0000 (16:16 +0200)]
ac/surface: match Z and stencil tile config

Fixes various piglit tests on Stoney, see the comment.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoac/surface: sanity-check that we got a TC-compatible HTILE if requested
Nicolai Hähnle [Thu, 7 Sep 2017 11:20:25 +0000 (13:20 +0200)]
ac/surface: sanity-check that we got a TC-compatible HTILE if requested

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoac/addrlib: enable assertions in debug builds
Nicolai Hähnle [Thu, 7 Sep 2017 10:16:14 +0000 (12:16 +0200)]
ac/addrlib: enable assertions in debug builds

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoac/addrlib: relax an assertion
Nicolai Hähnle [Mon, 11 Sep 2017 13:20:41 +0000 (15:20 +0200)]
ac/addrlib: relax an assertion

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoac/addrlib: relax an assertion
Nicolai Hähnle [Thu, 7 Sep 2017 10:16:36 +0000 (12:16 +0200)]
ac/addrlib: relax an assertion

This assertion is triggered on Stoney in Piglit
./bin/framebuffer-blit-levels {draw,read} stencil -auto -fbo
and similar tests. It should be harmless -- just relax it until
we can get internal clarification.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: hard-code pixel center for interpolateAtSample without multisample buffers
Nicolai Hähnle [Sun, 10 Sep 2017 17:46:31 +0000 (19:46 +0200)]
radeonsi: hard-code pixel center for interpolateAtSample without multisample buffers

The GLSL rules for interpolateAtSample are unfortunate:

   "Returns the value of the input interpolant variable at
    the location of sample number sample. If
    multisample buffers are not available, the input
    variable will be evaluated at the center of the pixel.
    If sample sample does not exist, the position used to
    interpolate the input variable is undefined."

This fix will fall back to monolithic shader compilation when
interpolateAtSample is used without multisampling.

One alternative would be to always upload 16 sample positions,
filling the buffer up with repetition when the actual number of
samples is less, and then ANDing the sample ID with 0xf. However,
that punishes all well-behaving users of interpolateAtSample,
when in reality, only conformance tests should be affected by
the issue.

Fixes
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.non_multisample_buffer.*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
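
For reference, the alternative that the commit above describes and rejects (always uploading 16 sample positions and masking the sample ID) might look roughly like this. It is a sketch of the rejected idea only, with hypothetical names, and not what radeonsi does.

```c
#include <assert.h>

#define MAX_SAMPLES 16u

struct sample_pos {
    float x, y;
};

/* Fill all 16 slots, repeating the real positions when fewer exist.
 * With num_samples == 1 every slot holds the same position. */
static void
fill_sample_table(struct sample_pos table[MAX_SAMPLES],
                  const struct sample_pos *pos, unsigned num_samples)
{
    assert(num_samples >= 1 && num_samples <= MAX_SAMPLES);

    for (unsigned i = 0; i < MAX_SAMPLES; i++)
        table[i] = pos[i % num_samples];
}

/* ANDing with 0xf keeps any sample ID inside the 16-entry table. */
static struct sample_pos
lookup_sample(const struct sample_pos table[MAX_SAMPLES], unsigned sample_id)
{
    return table[sample_id & 0xf];
}
```

Without multisampling the single entry would be the pixel center, so every sample ID evaluates there, which matches the GLSL rule quoted above. The cost is the extra upload and masking for every user of interpolateAtSample, which is the drawback the commit message gives.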
7 years agoradeonsi: apply a mask to gl_SampleMaskIn in the PS prolog
Nicolai Hähnle [Sun, 10 Sep 2017 17:19:40 +0000 (19:19 +0200)]
radeonsi: apply a mask to gl_SampleMaskIn in the PS prolog

gl_SampleMaskIn is supposed to contain set bits only for the samples that
are covered by the current fragment shader invocation, but the VGPR
initialization hardware loads the set of all bits that are covered at the
current pixel.

Fixes various tests in
dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
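
The kind of masking described above can be sketched as follows, under the simplifying assumption that with per-sample shading each invocation covers exactly one sample. sample_mask_in() is a hypothetical helper; the actual PS prolog code may compute its mask differently.

```c
#include <stdint.h>

/*
 * pixel_coverage: bits for all samples covered at the current pixel,
 *                 as loaded by the hardware.
 * sample_id:      sample handled by this invocation (assumed < 32).
 * per_sample:     nonzero when the shader runs once per sample.
 */
static uint32_t
sample_mask_in(uint32_t pixel_coverage, unsigned sample_id, int per_sample)
{
    if (!per_sample)
        return pixel_coverage; /* one invocation covers the whole pixel */

    /* Keep only the bit of the sample this invocation shades. */
    return pixel_coverage & (UINT32_C(1) << sample_id);
}
```

For a pixel with coverage 0xb, the invocation for sample 1 would see 0x2 and the invocation for sample 2 would see 0.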
7 years agoradeonsi: remove SET_PREDICATION workaround on newer firmware
Nicolai Hähnle [Fri, 25 Aug 2017 23:11:14 +0000 (01:11 +0200)]
radeonsi: remove SET_PREDICATION workaround on newer firmware

We need to keep the workaround for older firmware, though.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>