mesa.git
Samuel Pitoiset [Wed, 3 Oct 2018 14:09:24 +0000 (16:09 +0200)]
radv: always set PA_SC_MODE_CNTL_1.OUT_OF_ORDER_WATER_MARK

It probably has no effect without out-of-order rasterization
anyway.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 3 Oct 2018 14:09:23 +0000 (16:09 +0200)]
radv: set DB_EQAA.INCOHERENT_EQAA_READS

My intent was to set this field instead of duplicating one.

Fixes: 6cfa321c39 ("radv: add potential missing fields for DB_EQAA")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Chystiakov, Dmytro [Wed, 3 Oct 2018 09:52:52 +0000 (12:52 +0300)]
i965: fallback RGBX to RGBA in glEGLImageTargetRenderbufferStorageOES

In the same fashion as is done for glEGLImageTargetTexture2DOES.

v2: share the fallback which sets baseformat and internalformat correctly
    which makes both of the tests pass (Tapani)

Fixes android.hardware.nativehardware.cts.AHardwareBufferNativeTests:

   #SingleLayer_ColorTest_GpuColorOutputCpuRead_R8G8B8X8_UNORM
   #SingleLayer_ColorTest_GpuColorOutputIsRenderable_R8G8B8X8_UNORM

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Tapani Pälli [Tue, 25 Sep 2018 14:04:40 +0000 (17:04 +0300)]
glsl: do not attempt assignment if operand type not parsed correctly

v2: check types of both operands (Ian)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108012

Marek Olšák [Mon, 1 Oct 2018 19:51:06 +0000 (15:51 -0400)]
util/u_queue: add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY

Initial version discussed with Rob Clark under a different patch name.
This approach leaves his driver unaffected.

Marek Olšák [Fri, 21 Sep 2018 07:37:16 +0000 (03:37 -0400)]
radeonsi: fix a typo at CS_PARTIAL_FLUSH

harmless

Marek Olšák [Sat, 22 Sep 2018 01:30:09 +0000 (21:30 -0400)]
ac: add ac_build_round

Marek Olšák [Fri, 21 Sep 2018 07:30:18 +0000 (03:30 -0400)]
ac: correct PKT3_COPY_DATA definitions

Marek Olšák [Fri, 21 Sep 2018 07:27:06 +0000 (03:27 -0400)]
ac: simplify LLVM alloca helpers

Marek Olšák [Fri, 7 Sep 2018 22:44:54 +0000 (18:44 -0400)]
ac: define all address spaces properly

Gert Wollny [Fri, 5 Oct 2018 13:08:51 +0000 (15:08 +0200)]
gallivm: Make it possible to disable some optimization shortcuts in release builds

For testing it is of interest that all tests of dEQP pass, e.g. to test
virglrenderer on a host only providing software rendering like in a CI.
Hence make it possible to disable certain optimizations that make tests fail.

While we are there also add some documentation to the flags to make it clear
that this is opt-out.

Setting the environment variable "GALLIVM_PERF=no_filter_hacks" can be used to make
the following tests pass in release mode:

  dEQP-GLES2.functional.texture.mipmap.2d.affine.*_linear_*
  dEQP-GLES2.functional.texture.mipmap.cube.generate.*
  dEQP-GLES2.functional.texture.vertex.2d.filtering.*_mipmap_linear_*
  dEQP-GLES2.functional.texture.vertex.2d.wrap.*

Related:
  https://bugs.freedesktop.org/show_bug.cgi?id=94957

v2: rename optimization disabling flag to 'safemath' and also move the
    nopt flag to the perf flags.

v3: rename flag "safemath" to "no_filter_hacks" since safemath is usually
    associated with floating point operations (Roland)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tomeu Vizoso [Thu, 4 Oct 2018 14:40:08 +0000 (16:40 +0200)]
virgl: Pass resource size and transfer offsets

Pass the size of a resource when creating it so a backing can be kept on
the other side.

Also pass the required offset to transfer commands.

This moves vtest closer to how virtio-gpu works, making it more useful
for testing.

v2: - Use new messages for creation and transfers, as changing the
      behavior of the existing messages would be messy given that we don't
      want to break compatibility with older servers.

v3: - Use correct strides: The resource corresponding to the output display
      might have a different line stride than the IOVs, so when reading back
      to this resource take the resource stride and the IOV stride
      into account.

v4: Fix transfer size calculation (Andrey Simiklit)

v5: Add comment about transfer size value in the PUT command (Gurchetan).
    Add a comment about the size correction for transfers for reading and
    writing the resource. Fixing this by correctly evaluating the size
    upfront will need some work also on the virglrenderer side.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> (v2)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Gert Wollny [Thu, 4 Oct 2018 14:40:07 +0000 (16:40 +0200)]
virgl, vtest: Correct the transfer size calculation

The transfer size used in virglrenderer refers to uint32_t, so one
must add 3 and then divide by 4 instead of adding 3/4 which is a no-op
with integers.
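
The arithmetic described above can be illustrated with a minimal sketch. The
helper names below are hypothetical and are not taken from the vtest code; the
sketch only assumes that a byte count must be rounded up to a count of 32-bit
words.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of 32-bit words needed to hold size_bytes. */
static inline uint32_t dwords_for_bytes(uint32_t size_bytes)
{
    /* Guard against wrap-around in the addition below. */
    assert(size_bytes <= UINT32_MAX - 3);
    /* Round up: add 3 first, then divide by 4. */
    return (size_bytes + 3) / 4;
}

/* The mistaken form: 3 / 4 is evaluated first and is 0 in integer
 * arithmetic, so the expression just returns size_bytes unchanged. */
static inline uint32_t dwords_for_bytes_buggy(uint32_t size_bytes)
{
    return size_bytes + 3 / 4;
}
```

For a 5-byte payload the first helper yields 2 words, while the mistaken form
yields 5, the unmodified byte count.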

Fixes: b3b82fe8ea virgl/vtest: add vtest driver
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Alan Coopersmith [Fri, 5 Oct 2018 23:34:35 +0000 (16:34 -0700)]
util: Make xmlconfig.c build on Solaris without d_type in dirent (v2)

v2: check for lstat() failing

Fixes: 04bdbbcab3c "xmlconfig: read more config files from drirc.d/"
Signed-off-by: Alan Coopersmith <alan.coopersmith@oracle.com>
Reviewed-by: Roland Mainz <roland.mainz@nrubsig.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Sonny Jiang [Wed, 3 Oct 2018 15:53:14 +0000 (11:53 -0400)]
radeonsi: optimizing SET_CONTEXT_REG for shaders vgt_vertex_reuse

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Sonny Jiang [Wed, 3 Oct 2018 15:53:13 +0000 (11:53 -0400)]
radeonsi: optimizing SET_CONTEXT_REG for shaders Tessellation

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Sonny Jiang [Wed, 3 Oct 2018 15:53:12 +0000 (11:53 -0400)]
radeonsi: optimizing SET_CONTEXT_REG for shaders PS

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Sonny Jiang [Wed, 3 Oct 2018 15:53:11 +0000 (11:53 -0400)]
radeonsi: optimizing SET_CONTEXT_REG for shaders VS

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Sonny Jiang [Wed, 3 Oct 2018 15:53:10 +0000 (11:53 -0400)]
radeonsi: optimizing SET_CONTEXT_REG for shaders GS

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Fri, 5 Oct 2018 22:09:37 +0000 (18:09 -0400)]
radeonsi: optimize and allow reg > 31 in radeon_opt_set_context_reg functions

reg_saved will have 64 bits, and (1 << reg) where reg > 31 has undefined
behavior. (1ull << reg) would be correct for 64 bits.

This commit shifts the other way in order to merge the conditions.
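
A minimal sketch of the shift issue, assuming a 64-bit mask with one bit per
tracked register. The function names are hypothetical and this is not the
radeonsi implementation; it only contrasts a 64-bit left shift with the
alternative of shifting the mask right.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: set the bit for 'reg' in a 64-bit saved-register mask. */
static inline uint64_t mark_reg_saved(uint64_t saved_mask, unsigned reg)
{
    assert(reg < 64);
    /* 1ull forces a 64-bit shift. Plain (1 << reg) shifts an int, which is
     * undefined for reg > 31 where int is 32 bits wide. */
    return saved_mask | (1ull << reg);
}

/* Test the bit by shifting a 64-bit constant left. */
static inline bool reg_is_saved(uint64_t saved_mask, unsigned reg)
{
    assert(reg < 64);
    return (saved_mask & (1ull << reg)) != 0;
}

/* Test the bit by shifting the mask right instead, so the bit of interest
 * lands in bit 0 and no wide constant is needed. */
static inline bool reg_is_saved_shift_right(uint64_t saved_mask, unsigned reg)
{
    assert(reg < 64);
    return ((saved_mask >> reg) & 0x1) != 0;
}
```

Both test functions agree for every register index below 64; the right-shift
form is one way to read "shifts the other way".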

Sonny Jiang [Wed, 3 Oct 2018 15:53:09 +0000 (11:53 -0400)]
radeonsi: optimizing SET_CONTEXT_REG for shaders ES

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Fri, 5 Oct 2018 12:39:01 +0000 (14:39 +0200)]
spirv: mark variables decorated with XfbBuffer as always active

Otherwise, they are removed during NIR linking or in some
lowering passes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Juan A. Suarez Romero [Fri, 5 Oct 2018 10:51:34 +0000 (12:51 +0200)]
docs: update calendar, add news and link release notes to 18.2.2

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Juan A. Suarez Romero [Fri, 5 Oct 2018 10:45:35 +0000 (12:45 +0200)]
docs: add sha256 checksums for 18.2.2

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit cb63a4e1144d9cd8feda3799c68a32a769417b5f)

Juan A. Suarez Romero [Fri, 5 Oct 2018 10:13:33 +0000 (12:13 +0200)]
docs: add release notes for 18.2.2

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit abaeb79eb2c16d7abad06719f24d1e59ad775aa6)

Jason Ekstrand [Wed, 3 Oct 2018 17:14:20 +0000 (12:14 -0500)]
nir/alu_to_scalar: Use ssa_for_alu_src in hand-rolled expansions

The ssa_for_alu_src helper will correctly handle swizzles and other
source modifiers for you.  The expansions for unpack_half_2x16,
pack_uvec2_to_uint, and pack_uvec4_to_uint were all broken with regards
to swizzles.  The brokenness of unpack_half_2x16 was causing rendering
errors in Rise of the Tomb Raider on Intel ever since c11833ab24dcba26
which added an extra copy propagation to the optimization pipeline and
caused us to start seeing swizzles where we hadn't seen any before.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107926
Fixes: 9ce901058f3d "nir: Add lowering of nir_op_unpack_half_2x16."
Fixes: 9b8786eba955 "nir: Add lowering support for packing opcodes."
Tested-by: Alex Smith <asmith@feralinteractive.com>
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Vadym Shovkoplias [Wed, 3 Oct 2018 08:39:04 +0000 (11:39 +0300)]
glsl/linker: Check the subroutine associated functions names

From Section 6.1.2 (Subroutines) of the GLSL 4.00 specification

    "A program will fail to compile or link if any shader
     or stage contains two or more functions with the same
     name if the name is associated with a subroutine type."

v2:
  - error out earlier (Tapani)
  - style fixes (Iago)

Fixes:
    * no-overloads.vert

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108109
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tomeu Vizoso [Tue, 2 Oct 2018 07:07:31 +0000 (09:07 +0200)]
virgl: Negotiate version with vtest server

Check if server supports version negotiation by sending a PING_PROTOCOL_VERSION
message right before a dummy RESOURCE_BUSY_WAIT. If we don't get a reply
for the first, we know the server doesn't support it.

If it does support it, we can query the max protocol version supported
by the server and fall back if needed.

v2: - Send a new message to negotiate the protocol version, checking if
      the server supports this message by immediately sending a busy wait
      message. (Dave Airlie)

v3: - Send a zero-arg command PING_PROTOCOL_VERSION so we actually keep
      compatibility with older servers. (Code by Dave Airlie)

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Sagar Ghuge [Wed, 5 Sep 2018 17:19:47 +0000 (10:19 -0700)]
intel: aubinator: Fix memory leaks

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Sagar Ghuge [Thu, 6 Sep 2018 04:14:23 +0000 (21:14 -0700)]
intel/decoder: construct correct xml filename

Construct the correct gen xml filename when we try to load the hardware xml
description from a given path.

v2: remove temporary variable (Francesco Ansanelli)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Sagar Ghuge [Thu, 6 Sep 2018 19:37:28 +0000 (12:37 -0700)]
intel/decoder: Avoid freeing invalid pointer

v2: Free ctx.spec if error while reading genxml (Lionel Landwerlin)

v3: Handle case where genxml is empty (Lionel Landwerlin)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Sagar Ghuge [Thu, 6 Sep 2018 04:05:21 +0000 (21:05 -0700)]
intel/decoder: add gen_spec_init method

Initialize gen_spec instance properly when loading hardware xml
description from a specific directory to avoid a segmentation fault.

v2: correct function definition (Lionel Landwerlin)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Samuel Pitoiset [Thu, 4 Oct 2018 08:37:09 +0000 (10:37 +0200)]
radv: fix resetting the pool for timestamp queries

Since the driver no longer uses the availability bit for
timestamp queries it shouldn't reset it. Instead, it should
reset the query values to UINT32_MAX. This fixes VM faults.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108164
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Guido Günther [Mon, 1 Oct 2018 16:37:28 +0000 (18:37 +0200)]
etnaviv: Use write combine instead of uncached mappings for shader bo

The latter are sensitive to unaligned accesses on arm64[1] and we don't
need an uncached mapping here.

[1]: https://lists.freedesktop.org/archives/etnaviv/2018-September/001956.html

Signed-off-by: Guido Günther <guido.gunther@puri.sm>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Marek Olšák [Thu, 4 Oct 2018 04:55:52 +0000 (00:55 -0400)]
drirc: add a workaround for ARMA 3

Cc: 18.2 <mesa-stable@lists.freedesktop.org>
Jason Ekstrand [Tue, 2 Oct 2018 22:19:32 +0000 (17:19 -0500)]
anv/batch_chain: Don't start a new BO just for BATCH_BUFFER_START

Previously, we just went ahead and emitted MI_BATCH_BUFFER_START as
normal.  If we are near enough to the end, this can cause us to start a
new BO just for the MI_BATCH_BUFFER_START which messes up chaining.  We
always reserve enough space at the end for an MI_BATCH_BUFFER_START so
we can just increment cmd_buffer->batch.end prior to emitting the
command.

Fixes: a0b133286a3 "anv/batch_chain: Simplify secondary batch return..."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107926
Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 9 Jul 2018 21:21:33 +0000 (14:21 -0700)]
anv: Use separate MOCS settings for external BOs

On Broadwell and above, we have to use different MOCS settings to allow
the kernel to take over and disable caching when needed for external
buffers.  On Broadwell, this is especially important because the kernel
can't disable eLLC so we have to do it in userspace.  We very badly
don't want to do that on everything so we need separate MOCS for
external and internal BOs.

In order to do this, we add an anv-specific BO flag for "external" and
use that to distinguish between buffers which may be shared with other
processes and/or display and those which are entirely internal.  That,
together with an anv_mocs_for_bo helper lets us choose the right MOCS
settings for each BO use.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99507
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Emil Velikov [Fri, 7 Sep 2018 13:58:03 +0000 (14:58 +0100)]
meson: remove invalid "opencl" llvm component

Seemingly a copy/paste mistake from configure.ac, which uses $2 for the
component and $3 for the fancy name printing.

Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Emil Velikov [Mon, 24 Sep 2018 15:01:38 +0000 (16:01 +0100)]
Revert "mesa: remove unnecessary 'sort by year' for the GL extensions"

This reverts commit 3d81e11b49366b5636b8524ba0f8c7076e3fdf34.

As reported by Federico, some games require the 'sort by year' since
they truncate the extensions which do not fit the fixed size string
array.
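
The failure mode described above can be modelled with a small sketch. It
assumes an application that copies the whole extension string into a
fixed-size buffer; the function name and buffer size are hypothetical and do
not come from any particular game.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an application copying the extension string into a
 * fixed-size buffer. Everything that does not fit is silently dropped, so
 * only the extensions listed first survive. */
static inline void copy_extensions_truncated(char *dst, size_t dst_size,
                                             const char *extensions)
{
    assert(dst_size > 0);
    strncpy(dst, extensions, dst_size - 1);
    dst[dst_size - 1] = '\0';   /* strncpy does not terminate on truncation */
}
```

With a 16-byte buffer and the string "GL_OLD_ext GL_NEW_ext", only
"GL_OLD_ext GL_N" survives. Listing older extensions first keeps the ones such
an application was written against inside the surviving prefix.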

Seemingly I did not consider that, as the documentation (both Mesa and
Nvidia) mentions program crashes ... which are worked around by
setting the env. variable.

This commit reinstates the workaround and enhances the documentation.

Cc: Marek Olšák <maraeo@gmail.com>
Cc: Ian Romanick <idr@freedesktop.org>
Reported-by: Federico Dossena <info@fdossena.com>
Fixes: 3d81e11b493 ("mesa: remove unnecessary 'sort by year' for the GL extensions")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Federico Dossena <info@fdossena.com>
Emil Velikov [Wed, 5 Sep 2018 16:35:17 +0000 (17:35 +0100)]
mesa: reorder and document the tokens in glheader.h

Split into different sections, document each one as well as strange
cases like GL_ATI_texture_compression_3dc.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Emil Velikov [Wed, 5 Sep 2018 16:35:16 +0000 (17:35 +0100)]
mesa: remove duplicate declarations from glheader.h

Remove all the desktop GL and GLX entries from the list.
Former are pulled by the gl.h and glext.h includes at the top while the
latter are no longer needed.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Emil Velikov [Wed, 5 Sep 2018 16:35:15 +0000 (17:35 +0100)]
i965: reference __DRI_ATTRIB_SWAP_COPY token over the GLX one

Earlier commit updated the code to use the DRI tokens, yet forgot to
update the comment.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Emil Velikov [Wed, 5 Sep 2018 16:35:14 +0000 (17:35 +0100)]
i915: reference __DRI_ATTRIB_SWAP_COPY token over the GLX one

Earlier commit updated the code to use the DRI tokens, yet forgot to
update the comment.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Emil Velikov [Wed, 5 Sep 2018 16:35:13 +0000 (17:35 +0100)]
dri/common: move the required GLX_* token definitions locally

Will allow us to remove an even bigger hack elsewhere. But more
importantly, we should not be using _any_ GLX tokens in DRI.

Document the gory details about the current side-effects.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Emil Velikov [Wed, 5 Sep 2018 16:35:12 +0000 (17:35 +0100)]
dri/common: use __DRI_ATTRIB_SWAP* instances when describing db_modes

Somewhat recently Thomas Hellstrom added the respective DRI tokens
and updated the drivers. Update the documentation to match reality.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Emil Velikov [Mon, 3 Sep 2018 12:05:27 +0000 (13:05 +0100)]
egl/x11: remove eglSwap* surface check

Already handled further up in eglapi.c.

To make things a tiny bit strange, X11+DRI3 was doing the wrong thing by
returning EGL_FALSE (+ no error), while X11+DRI2 was returning EGL_TRUE.

Cc: samiuddi <sami.uddin.mohammad@intel.com>
Cc: Eric Engestrom <eric.engestrom@intel.com>
Cc: Erik Faye-Lund <kusmabite@gmail.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Mon, 3 Sep 2018 12:05:26 +0000 (13:05 +0100)]
egl/surfaceless: remove eglSwap* stubs

The API validation in eglapi.c already returns if the surface type is
!window.

Cc: samiuddi <sami.uddin.mohammad@intel.com>
Cc: Erik Faye-Lund <kusmabite@gmail.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Gurchetan Singh <gurchetansingh@chromium.org>
Cc: Chad Versace <chadversary@chromium.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Emil Velikov [Mon, 3 Sep 2018 12:05:25 +0000 (13:05 +0100)]
egl/drm: remove eglSwap* surface check

Already handled further up in eglapi.c

Cc: samiuddi <sami.uddin.mohammad@intel.com>
Cc: Erik Faye-Lund <kusmabite@gmail.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Emil Velikov [Mon, 3 Sep 2018 12:05:24 +0000 (13:05 +0100)]
egl/android: remove eglSwap* surface check

Already handled further up in eglapi.c

Cc: samiuddi <sami.uddin.mohammad@intel.com>
Cc: Erik Faye-Lund <kusmabite@gmail.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Emil Velikov [Mon, 3 Sep 2018 12:05:23 +0000 (13:05 +0100)]
egl: make eglSwapBuffers* a no-op for !window surfaces

Analogous to the previous commit - the spec says the function is a
no-op when a pbuffer or pixmap surface is used.

Cc: samiuddi <sami.uddin.mohammad@intel.com>
Cc: Erik Faye-Lund <kusmabite@gmail.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Emil Velikov [Mon, 3 Sep 2018 12:05:22 +0000 (13:05 +0100)]
egl: make eglSwapInterval a no-op for !window surfaces

As the spec says, the function is a no-op when the surface is not a
window one.

The spec implies that EGL_TRUE should be returned in that case, yet
the ARM driver seems to return EGL_FALSE + EGL_BAD_SURFACE.

The Nvidia driver returns EGL_TRUE. We follow that behaviour until a
decision is made.

https://gitlab.khronos.org/egl/API/merge_requests/17

Cc: samiuddi <sami.uddin.mohammad@intel.com>
Cc: Erik Faye-Lund <kusmabite@gmail.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Emil Velikov [Fri, 17 Aug 2018 12:51:47 +0000 (13:51 +0100)]
freedreno: add the a6xx sources to the Android build

Add the files otherwise things just won't build.
Haven't actually tested it, but it's a small step in the right
direction.

Fixes: de3b34df973 ("freedreno: Add a6xx backend")
Cc: Kristian H. Kristensen <hoegsberg@chromium.org>
Cc: Rob Clark <robdclark@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Emil Velikov [Thu, 30 Aug 2018 16:24:16 +0000 (17:24 +0100)]
pipe-loader: add a dup() in pipe_loader_sw_probe_kms

The pipe_loader_release API closes the fd given, even if the pipe-loader
should _not_ take ownership of it.

With an earlier commit we fixed pipe_loader_drm_probe_fd, and now we
cover the final piece.

Note that unlike the DRM case, here the caller _did_ forget to dup
before using it ... most likely leading to all sorts of fun.

Don't forget the close in the error path. Seems like the things are a
bit leaky/asymmetrical with the semi-recent config work. But we can shave
that yak another day ;-)

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Wed, 29 Aug 2018 17:13:14 +0000 (18:13 +0100)]
pipe-loader: move dup(fd) within pipe_loader_drm_probe_fd

Currently pipe_loader_drm_probe_fd takes ownership of the fd given.
To match that, pipe_loader_release closes it.

Yet we have many instances which do not want the change of ownership,
and thus duplicate the fd before passing it to the pipe-loader.

Move the dup() within pipe-loader, explicitly document that and document
all the cases through the codebase.
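
The ownership rule can be sketched as follows. The struct and function names
are hypothetical stand-ins, not the pipe-loader API; the sketch only assumes
that the probe step duplicates the caller's fd and that the release step
closes the duplicate.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical loader state standing in for a pipe-loader device. */
struct loader_dev {
    int fd;   /* owned by the loader, closed on release */
};

/* Probe duplicates the caller's fd, so the caller keeps ownership of the
 * descriptor it passed in. */
static inline bool loader_probe_fd(struct loader_dev *dev, int caller_fd)
{
    int fd = dup(caller_fd);
    if (fd < 0)
        return false;
    dev->fd = fd;
    return true;
}

/* Release closes only the loader's own duplicate. */
static inline void loader_release(struct loader_dev *dev)
{
    assert(dev->fd >= 0);
    close(dev->fd);
    dev->fd = -1;
}
```

Under this arrangement a caller no longer needs its own dup() before probing,
and its descriptor stays valid after the loader is released.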

A trivial git grep -2 pipe_loader_release makes things as obvious as it
gets ;-)

Cc: Leo Liu <leo.liu@amd.com>
Cc: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Axel Davy <davyaxel0@gmail.com>
Cc: Patrick Rudolph <siro@das-labor.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com> (for nine)
Emil Velikov [Wed, 29 Aug 2018 17:13:13 +0000 (18:13 +0100)]
st/nine: do not double-close the fd on teardown

As the newly introduced comment says:
 The pipe loader takes ownership of the fd

Thus, there's no need to close it again.

Cc: Patrick Rudolph <siro@das-labor.org>
Cc: Axel Davy <davyaxel0@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
Emil Velikov [Wed, 5 Sep 2018 16:09:10 +0000 (17:09 +0100)]
mesa: fold _glapi_check_multithread() back into _mesa_make_current

With commit c6c0f947142, back in 2006 Brian removed the
_glapi_check_multithread() call from core mesa - _mesa_make_current.

It was done to remove a fairly awkward #ifdef guard which caused subtle
differences in core mesa.

Since that guard is long gone, we can drop the duplication and
reintroduce the call in core.

Note that the function call was missing when using EGL + classic dri HW
drivers. Yet on TLS builds it's a no-op, so we're safe.

Any non-TLS users - more or less anything !Linux (or even musl on Linux
up to semi-recently) may have experienced problems.

v2: don't remove the call from swrast - move it to core (Eric)

Cc: Eric Anholt <eric@anholt.net>
Cc: Brian Paul <brianp@vmware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Emil Velikov [Wed, 29 Aug 2018 17:14:02 +0000 (18:14 +0100)]
vl/dri3: do full teardown on screen_destroy

Earlier commit added support for 'front_buffers', erroneously adding a
return in vl_dri3_screen_destroy. Effectively leaking a lot of state.

Fixes: 8d7ac0a4e4d ("vl/dri3: implement DRI3 BufferFromPixmap")
Cc: Leo Liu <leo.liu@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Emil Velikov [Fri, 24 Aug 2018 13:06:00 +0000 (14:06 +0100)]
st/dri: make swrast_no_present member of dri_screen

Just like the dri2 options, this is better suited in the dri_screen
struct.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Fri, 24 Aug 2018 13:05:59 +0000 (14:05 +0100)]
st/dri: inline dri2_buffer.h within dri2.c

The header was used only by dri2.c, containing a two-member struct and cast wrapper.
Just inline it where it's used/needed.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Emil Velikov [Fri, 24 Aug 2018 13:05:58 +0000 (14:05 +0100)]
st/xa: remove unused xa_screen::d[s]_depth_bits_last

Unused since the initial import.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Emil Velikov [Fri, 24 Aug 2018 13:05:57 +0000 (14:05 +0100)]
mesa: use C99 initializer in get_gl_override()

The overrides array contains entries indexed on the gl_api enum.
Use a C99 initializer to make it a bit more obvious.
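
A minimal sketch of the technique, using a hypothetical enum and table rather
than the real gl_api enum or the contents of get_gl_override(). Each entry is
keyed by its enum value through a C99 designated initializer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical API enum, standing in for an enum such as gl_api. */
enum demo_api {
    DEMO_API_COMPAT,
    DEMO_API_ES1,
    DEMO_API_ES2,
    DEMO_API_CORE,
    DEMO_API_COUNT
};

/* Entries are written out of order on purpose: the [index] designators,
 * not the textual order, decide where each value is stored. */
static const char *const demo_api_names[DEMO_API_COUNT] = {
    [DEMO_API_CORE]   = "core",
    [DEMO_API_ES2]    = "es2",
    [DEMO_API_ES1]    = "es1",
    [DEMO_API_COMPAT] = "compat",
};

static inline const char *demo_api_name(enum demo_api api)
{
    assert((size_t)api < DEMO_API_COUNT);
    return demo_api_names[api];
}
```

Compared with a positional initializer, this form makes the mapping from enum
value to entry explicit, so reordering the enum cannot silently shift the
table.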

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Gabriel Majeri [Sun, 26 Aug 2018 18:48:01 +0000 (21:48 +0300)]
anv: Ensure discreteQueuePriorities is at least 2

This is the minimum value according to the spec.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Timothy Arceri [Wed, 19 Sep 2018 01:59:09 +0000 (11:59 +1000)]
r600: use build-id when available for disk cache

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 19 Sep 2018 01:56:37 +0000 (11:56 +1000)]
nouveau: use build-id when available for disk cache

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 19 Sep 2018 01:07:22 +0000 (11:07 +1000)]
radeonsi: use build-id when available for disk cache

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 19 Sep 2018 01:44:12 +0000 (11:44 +1000)]
util: add disk_cache_get_function_identifier()

This can be used as a drop in replacement for
disk_cache_get_function_timestamp().

Here we use build-id to generate a driver-id rather than build
timestamp if available. This should resolve issues such as
distros using reproducible builds and flatpak not having
real build timestamps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 19 Sep 2018 00:21:05 +0000 (10:21 +1000)]
util: rename timestamp param in disk_cache_create()

Only some drivers use a timestamp here. Others use things such
as build-id, or even a combination of build-ids from Mesa and
LLVM.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi: avoid sending GS_EMIT in shaders without outputs
Józef Kucia [Sun, 23 Sep 2018 22:44:00 +0000 (00:44 +0200)]
radeonsi: avoid sending GS_EMIT in shaders without outputs

Fixes GPU hangs.

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107857
Signed-off-by: Józef Kucia <joseph.kucia@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
6 years agoi965: Replace checks for rb->Name with FlipY (v2)
Fritz Koenig [Mon, 17 Sep 2018 20:51:35 +0000 (13:51 -0700)]
i965: Replace checks for rb->Name with FlipY (v2)

In the GL_MESA_framebuffer_flip_y implementation
_mesa_is_winsys_fbo checks were replaced with
FlipY checks.  rb->Name is also used to determine
if a buffer is winsys.

v2: Fixes annotation [for emil]

Fixes: ab05dd183cc ("i965: implement GL_MESA_framebuffer_flip_y [v3]")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
6 years agoradeonsi: initialize ac_gpu_info::name when using SI_FORCE_FAMILY
Marek Olšák [Mon, 17 Sep 2018 01:18:47 +0000 (21:18 -0400)]
radeonsi: initialize ac_gpu_info::name when using SI_FORCE_FAMILY

so that it's not NULL when loading radeonsi and a GCN GPU is not
present in the system.

6 years agoradeonsi: don't set the VS prolog key for the blit VS
Marek Olšák [Sun, 23 Sep 2018 03:57:05 +0000 (23:57 -0400)]
radeonsi: don't set the VS prolog key for the blit VS

6 years agospirv: Move function call handling to vtn_cfg
Jason Ekstrand [Sat, 22 Sep 2018 15:33:51 +0000 (10:33 -0500)]
spirv: Move function call handling to vtn_cfg

It makes way more sense for it to live there with the rest of function
handling.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agonir/from_ssa: Don't rewrite derefs destinations to registers
Jason Ekstrand [Sat, 22 Sep 2018 11:59:22 +0000 (06:59 -0500)]
nir/from_ssa: Don't rewrite derefs destinations to registers

We already call nir_rematerialize_derefs_in_use_blocks_impl prior to
calling nir_lower_ssa_defs_to_regs_block so the assertion that all deref
uses in the block should hold.  This fixes the following CTS test when
dEQP is run with SPIR-V optimization recipe 1:

dEQP-VK.glsl.struct.local.loop_nested_struct_array_vertex

Fixes: 606eb56ab9449b "intel/nir: Only lower load/store derefs"
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agonir/cf: Remove phi sources if needed in nir_handle_add_jump
Jason Ekstrand [Fri, 21 Sep 2018 14:27:48 +0000 (09:27 -0500)]
nir/cf: Remove phi sources if needed in nir_handle_add_jump

If the block in which the jump is inserted is the predecessor of a phi
then we need to remove phi sources otherwise the phi may end up with
things improperly connected.  This fixes the following CTS test when
dEQP is run with SPIR-V optimization recipe 1:

dEQP-VK.glsl.functions.control_flow.return_in_nested_loop_vertex

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
6 years agoanv: suppress warning about unhandled image layout
Eric Engestrom [Tue, 2 Oct 2018 13:31:42 +0000 (14:31 +0100)]
anv: suppress warning about unhandled image layout

Let's just be explicit that VK_NV_shading_rate_image is not supported.

Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Fixes: 6ee17091708a41c4aa81a "vulkan: Update the XML and headers to 1.1.86"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
6 years agofreedreno/a6xx: hwbinning
Rob Clark [Tue, 11 Sep 2018 19:59:22 +0000 (15:59 -0400)]
freedreno/a6xx: hwbinning

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: update generated headers
Rob Clark [Fri, 28 Sep 2018 18:13:28 +0000 (14:13 -0400)]
freedreno: update generated headers

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agointel/fs: Fix a typo in need_matching_subreg_offset
Jason Ekstrand [Tue, 2 Oct 2018 01:17:24 +0000 (20:17 -0500)]
intel/fs: Fix a typo in need_matching_subreg_offset

This fixes a bunch of Vulkan subgroup tests on little core platforms.

Fixes: 4150920b95 "intel/fs: Add a helper for emitting scan operations"
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agoutil: disable cache if we have no build-id and timestamp is zero
Timothy Arceri [Wed, 19 Sep 2018 22:54:32 +0000 (08:54 +1000)]
util: disable cache if we have no build-id and timestamp is zero

Timestamp can be zero for example when Flatpak is used. In this
case just disable the cache rather than segfaulting when
incompatible cache items are loaded.

V2: actually return false when mtime is 0.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
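
The decision described above can be sketched as follows. This is only an illustration under stated assumptions: struct demo_cache_id and demo_get_cache_id() are hypothetical names, not the Mesa disk cache API, and the real code derives its identifier differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ID_MAX 32

/* Hypothetical container for the bytes that identify a driver build. */
struct demo_cache_id {
   uint8_t bytes[DEMO_ID_MAX];
   size_t  len;
};

/* Prefer the build-id; fall back to the build timestamp; report failure
 * (the caller should disable the cache) when neither is usable. */
static bool
demo_get_cache_id(const uint8_t *build_id, size_t build_id_len,
                  uint32_t mtime, struct demo_cache_id *out)
{
   assert(out != NULL);

   if (build_id != NULL && build_id_len > 0) {
      out->len = build_id_len < DEMO_ID_MAX ? build_id_len : DEMO_ID_MAX;
      memcpy(out->bytes, build_id, out->len);
      return true;
   }

   /* A zero timestamp cannot tell two builds apart. */
   if (mtime == 0)
      return false;

   memcpy(out->bytes, &mtime, sizeof(mtime));
   out->len = sizeof(mtime);
   return true;
}
```

Returning false instead of a constant identifier means items written by an incompatible build are never looked up in the first place.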
6 years agoinclude: sync eglext.h from Khronos
Eric Engestrom [Sun, 10 Jun 2018 08:35:53 +0000 (09:35 +0100)]
include: sync eglext.h from Khronos

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
6 years agoradeonsi: add a workaround for bitfield_extract when count is 0
Timothy Arceri [Sat, 22 Sep 2018 02:38:11 +0000 (12:38 +1000)]
radeonsi: add a workaround for bitfield_extract when count is 0

This ports the fix from 3d41757788ac. Both LLVM 7 & 8 continue
to have this problem.

It fixes rendering issues in some menu and loading screens of
Civ VI which can be seen in the trace from bug 104602.

Note: This does not fix the black triangles on Vega for bug
104602.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104602
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107276
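
For reference, GLSL defines bitfieldExtract() to return 0 when the bit count is 0. A minimal sketch of a guarded extract in plain C is shown below; demo_bitfield_extract() is a hypothetical helper for illustration and is not the LLVM IR that the driver emits.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `count` bits of `value` starting at bit `offset`.
 * The explicit count == 0 check is the kind of guard the workaround adds. */
static uint32_t
demo_bitfield_extract(uint32_t value, uint32_t offset, uint32_t count)
{
   /* Ranges outside the 32-bit word are undefined in GLSL; reject them. */
   assert(offset <= 32 && count <= 32 - offset);

   if (count == 0)
      return 0;
   if (count == 32)
      return value;   /* offset is necessarily 0 here */

   return (value >> offset) & ((UINT32_C(1) << count) - 1);
}
```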

6 years agoanv: Implement VK_KHR_driver_properties
Jason Ekstrand [Wed, 20 Jun 2018 03:27:36 +0000 (20:27 -0700)]
anv: Implement VK_KHR_driver_properties

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agovulkan: Update the XML and headers to 1.1.86
Jason Ekstrand [Tue, 24 Apr 2018 15:30:24 +0000 (08:30 -0700)]
vulkan: Update the XML and headers to 1.1.86

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: do not try to set DCC_CONTROL when image doesn't use DCC
Samuel Pitoiset [Fri, 28 Sep 2018 12:35:52 +0000 (14:35 +0200)]
radv: do not try to set DCC_CONTROL when image doesn't use DCC

Unnecessary. While we are at it, remove the check for pre-VI
because it's already checked earlier.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: add a sanity check for mutable formats and TC-compat HTILE
Samuel Pitoiset [Fri, 28 Sep 2018 13:05:24 +0000 (15:05 +0200)]
radv: add a sanity check for mutable formats and TC-compat HTILE

If apps use the MUTABLE bit and the same formats as the image one
in the list, we can still enable TC-compat HTILE. I don't think
this happens often but given the fact that TC-compat HTILE allows
a nice boost in some situations, it's worth checking.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: disable HTILE for very small depth surfaces
Samuel Pitoiset [Fri, 28 Sep 2018 14:28:50 +0000 (16:28 +0200)]
radv: disable HTILE for very small depth surfaces

We already do the same with DCC/CMASK for small color surfaces.
Serious Sam 2017 creates a 1x1 depth surface and I think
it should be faster to do slow clears on the graphics queue
instead of fast clears on compute, and eventually a depth
expand if the surface isn't TC-compatible HTILE.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: add potential missing fields for DB_EQAA
Samuel Pitoiset [Fri, 28 Sep 2018 10:30:08 +0000 (12:30 +0200)]
radv: add potential missing fields for DB_EQAA

Other drivers set these two as well, just apply the same rule.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: disable complicated point clipping against user clip planes
Samuel Pitoiset [Fri, 28 Sep 2018 10:30:07 +0000 (12:30 +0200)]
radv: disable complicated point clipping against user clip planes

I don't think this is required by Vulkan either.

Ported from RadeonSI (AMDVLK doesn't set it either).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agogallium/util: Clarify comment in util_init_thread_pinning
Michel Dänzer [Tue, 18 Sep 2018 15:23:04 +0000 (17:23 +0200)]
gallium/util: Clarify comment in util_init_thread_pinning

As discussed in the review of the patch which added the comment:

Nothing happens when a thread is created, because pthread_atfork doesn't
affect creating threads. However, spawning a child process will likely
crash.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradv: do not sync CP DMA when copying buffers
Samuel Pitoiset [Wed, 26 Sep 2018 09:21:06 +0000 (11:21 +0200)]
radv: do not sync CP DMA when copying buffers

We already track if the DMA engine is busy/idle with a flag,
and we emit a packet that waits for all CP DMA operations
to be complete. This is done at end of command buffer because
the kernel doesn't wait for them, and also when emitting
barriers, so it should be safe.

This improves small copies for both aligned and unaligned sizes.

Aligned sizes:
BEFORE:
1 KB: 59.840000 ms
2 KB: 71.200000 ms
AFTER:
1 KB: 31.200000 ms
2 KB: 31.040000 ms

Unaligned sizes:
BEFORE:
2 KB: 68.3200 ms
3 KB: 79.3600 ms
5 KB: 76.6400 ms
9 KB: 90.8800 ms
17 KB: 116.0000 ms
AFTER:
2 KB: 31.0400 ms
3 KB: 32.0000 ms
5 KB: 30.8800 ms
9 KB: 30.5600 ms
17 KB: 29.6000 ms

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
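
To read the timings above as speedups, divide the BEFORE time by the AFTER time. The helper below is a trivial sketch for that arithmetic and is not part of the commit; demo_speedup() is a hypothetical name.

```c
#include <assert.h>

/* Speedup factor: how many times faster the AFTER run is than BEFORE. */
static double
demo_speedup(double before_ms, double after_ms)
{
   assert(after_ms > 0.0);
   return before_ms / after_ms;
}
```

For example, the aligned 1 KB case gives 59.84 / 31.20, roughly 1.9x, and the unaligned 17 KB case gives 116.00 / 29.60, roughly 3.9x.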
6 years agoradv: adjust the CmdUpdateBuffer threshold for optimal performance
Samuel Pitoiset [Wed, 26 Sep 2018 09:10:58 +0000 (11:10 +0200)]
radv: adjust the CmdUpdateBuffer threshold for optimal performance

According to my benchmark results, it appears that we should
reduce the threshold to 1024.

BEFORE:
1 KB: 68.656000 ms
2 KB: 118.368000 ms

AFTER:
1 KB: 31.760000 ms
2 KB: 29.840000 ms

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: do not use the availability bit for timestamp queries
Samuel Pitoiset [Tue, 25 Sep 2018 18:26:58 +0000 (20:26 +0200)]
radv: do not use the availability bit for timestamp queries

It's unnecessary because we can just check whether the timestamp
is different from the default value set when a pool is created
or reset. Instead of waiting for the availability bit to
be 1, we have to emit a not-equal WAIT_REG_MEM to check
whether the timestamp is ready.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
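
The idea can be sketched on the CPU side as follows. This is an illustration only: the sentinel value and the names DEMO_TIMESTAMP_NOT_READY, demo_pool_reset() and demo_timestamp_ready() are assumptions made for the example, not the radv implementation, where the wait is performed by the GPU via WAIT_REG_MEM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed default value written on pool creation and reset. */
#define DEMO_TIMESTAMP_NOT_READY UINT64_MAX

/* Fill every slot with the default value so that stale results from a
 * previous use of the pool cannot be mistaken for new ones. */
static void
demo_pool_reset(uint64_t *slots, size_t count)
{
   assert(slots != NULL);
   for (size_t i = 0; i < count; i++)
      slots[i] = DEMO_TIMESTAMP_NOT_READY;
}

/* A timestamp is available once its slot no longer holds the default
 * value; no separate availability bit is needed. */
static bool
demo_timestamp_ready(const uint64_t *slots, size_t index)
{
   return slots[index] != DEMO_TIMESTAMP_NOT_READY;
}
```

The scheme relies on the default value never being a legitimate timestamp.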
6 years agofreedreno/a6xx: Build up draw dword0 outside visibility if statement
Kristian H. Kristensen [Fri, 21 Sep 2018 19:24:47 +0000 (12:24 -0700)]
freedreno/a6xx: Build up draw dword0 outside visibility if statement

Pulling this logic out means we can share the logic and avoid a couple
of temporary variables that helped make things clearer before. Note
that in either vismode case, we always program vismode 0.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
6 years agofreedreno/a6xx: Simplify draw_emit() branches a bit
Kristian H. Kristensen [Fri, 21 Sep 2018 19:07:22 +0000 (12:07 -0700)]
freedreno/a6xx: Simplify draw_emit() branches a bit

Now that we've copied the emit logic into each branch of the
if (info->index_size) statement, we can simplify the logic a bit
according to which case we're in.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
6 years agofreedreno/a6xx: Copy OUT_RING() part into each branch of the index if
Kristian H. Kristensen [Fri, 21 Sep 2018 19:02:34 +0000 (12:02 -0700)]
freedreno/a6xx: Copy OUT_RING() part into each branch of the index if

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
6 years agofreedreno/a6xx: Split fd6_draw_emit into direct and indirect paths
Kristian H. Kristensen [Fri, 21 Sep 2018 18:37:36 +0000 (11:37 -0700)]
freedreno/a6xx: Split fd6_draw_emit into direct and indirect paths

This splits the two code paths into separate functions and moves the
"if (info->indirect)" test into draw_impl().

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
6 years agofreedreno/a6xx: Inline fd6_draw()
Kristian H. Kristensen [Fri, 21 Sep 2018 04:25:27 +0000 (21:25 -0700)]
freedreno/a6xx: Inline fd6_draw()

Simplify the code a bit by inlining this helper.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
6 years agofreedreno/a6xx: Move emit_marker and wfi to draw_impl()
Kristian H. Kristensen [Fri, 21 Sep 2018 04:19:57 +0000 (21:19 -0700)]
freedreno/a6xx: Move emit_marker and wfi to draw_impl()

This way the markers clearly bracket the draw call and aren't
duplicated for both direct and indirect draw code.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
6 years agofreedreno/a6xx: Move inline functions out of fd6_draw.h
Kristian H. Kristensen [Fri, 21 Sep 2018 04:09:04 +0000 (21:09 -0700)]
freedreno/a6xx: Move inline functions out of fd6_draw.h

Only used in fd6_draw.c so put them there.

Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
6 years agofreedreno: fix a typo in launch_grid
Hyunjun Ko [Thu, 20 Sep 2018 02:39:49 +0000 (11:39 +0900)]
freedreno: fix a typo in launch_grid