mesa.git
6 years agoandroid: util/disk_cache: fix building errors in gallium drivers
Mauro Rossi [Sat, 21 Jul 2018 08:40:32 +0000 (10:40 +0200)]
android: util/disk_cache: fix building errors in gallium drivers

This patch applies the necessary changes in Android.common.mk,
as per the automake rules, to avoid the following build error:

external/mesa/src/gallium/drivers/nouveau/nouveau_screen.c:159:8:
error: implicit declaration of function 'disk_cache_get_function_timestamp'
is invalid in C99 [-Werror,-Wimplicit-function-declaration]
   if (disk_cache_get_function_timestamp(nouveau_disk_cache_create,
       ^
1 error generated.

(v2) The -DENABLE_SHADER_CACHE Android cflag is kept, so that the existing capability stays enabled

Fixes: cc10b34 ("util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoAndroid: fix a missing nir_intrinsics.h error
Chih-Wei Huang [Thu, 24 May 2018 07:03:31 +0000 (15:03 +0800)]
Android: fix a missing nir_intrinsics.h error

The commit 76dfed8ae2d5 changed nir_intrinsics.h to be a generated
header, but the corresponding dependency was not updated for Android.
It causes the error:

[  0% 19/4336] target  C: libmesa_pipe_radeonsi <= external/mesa/src/gallium/drivers/radeonsi/si_debug.c
...
In file included from external/mesa/src/gallium/drivers/radeonsi/si_debug.c:25:
In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.h:28:
In file included from external/mesa/src/gallium/drivers/radeonsi/si_shader.h:140:
In file included from external/mesa/src/amd/common/ac_llvm_build.h:30:
external/mesa/src/compiler/nir/nir.h:966:10: fatal error: 'nir_intrinsics.h' file not found
         ^~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 76dfed8ae2d5 ("nir: mako all the intrinsics")
Signed-off-by: Chih-Wei Huang <cwhuang@linux.org.tw>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Mauro Rossi <issor.oruam@gmail.com>
6 years agonir: Fix end of function without return warning/error.
Bas Nieuwenhuizen [Fri, 20 Jul 2018 17:54:56 +0000 (19:54 +0200)]
nir: Fix end of function without return warning/error.

There always is a continue block, so let us just do unreachable.
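
As an illustration of the pattern (not the actual nir_opt_if code), a
search that is guaranteed to find its target can end in an unreachable
marker instead of a dummy return. The UNREACHABLE macro below is a stand-in
for Mesa's unreachable() and assumes the GCC/Clang __builtin_unreachable()
builtin; find_continue_block and enum block_kind are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Mesa's unreachable(): assert in debug builds, and tell the
 * compiler that control never gets here (GCC/Clang builtin assumed). */
#define UNREACHABLE(msg)          \
   do {                           \
      assert(!msg);               \
      __builtin_unreachable();    \
   } while (0)

enum block_kind { BLOCK_BODY, BLOCK_CONTINUE };

/* Hypothetical helper: returns the index of the continue block.  The
 * caller guarantees that one exists, so falling out of the loop is a bug. */
static size_t
find_continue_block(const enum block_kind *blocks, size_t count)
{
   for (size_t i = 0; i < count; i++) {
      if (blocks[i] == BLOCK_CONTINUE)
         return i;
   }
   UNREACHABLE("a loop always has a continue block");
}
```

Because the macro never returns, the compiler no longer sees a path that
reaches the end of a non-void function without a return value.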

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 8cacf38f527 "nir: Do not use continue block after removing it."
CC: 18.1 <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107312

6 years agost: Sweep NIR after linking phase to free held memory
Danylo Piliaiev [Tue, 10 Jul 2018 08:51:45 +0000 (11:51 +0300)]
st: Sweep NIR after linking phase to free held memory

After optimization passes and many transformations, most of the memory
NIR holds is garbage, which was being freed only after shader deletion.
Freeing it at the end of linking saves memory, which is useful
when a lot of complex shaders are being compiled.
The common case for this issue is a 32-bit game running under Wine.

The cost of the optimization is roughly 3-5% of compilation speed
with complex shaders.

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agost/dri: Don't require a dri_format for image creation.
Eric Anholt [Mon, 16 Jul 2018 22:22:57 +0000 (15:22 -0700)]
st/dri: Don't require a dri_format for image creation.

Nothing in EGL_KHR_gl_image.txt seems to let us deny creation based on
formats, and doing so causes many failures in
dEQP-EGL.functional.image.api.*

The NONE value we were protecting from only gets looked at in the
__DRI_IMAGE_ATTRIB_FORMAT and __DRI_IMAGE_ATTRIB_FOURCC queries, which are
used from wayland and gbm (which throw an error cleanly on unknown format)
and DMABUF export.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoegl: Refuse EGL_MESA_image_dma_buf_export if we don't have a DRM fourcc.
Eric Anholt [Mon, 16 Jul 2018 23:18:03 +0000 (16:18 -0700)]
egl: Refuse EGL_MESA_image_dma_buf_export if we don't have a DRM fourcc.

The EGL CTS expects that you can make images from all sorts of things,
including things like z16 and s8, which we don't have DRM fourccs for.
Just return an error when trying to export one of those.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agov3d: Fix incorrect handling of two fences created back-to-back.
Eric Anholt [Mon, 9 Jul 2018 19:41:46 +0000 (12:41 -0700)]
v3d: Fix incorrect handling of two fences created back-to-back.

Recreating our context's syncobj with ALREADY_SIGNALED meant that if you
created two fences in a row, then waiting on the second would succeed
immediately.  Instead, export a sync file in the gallium fence (since we
don't have a syncobj clone ioctl), and just create a new syncobj to wait
on whenever we need to.

Noticed while debugging
dEQP-GLES3.functional.fence_sync.client_wait_sync_finish

6 years agov3d: Fix the timeout value passed to drmSyncobjWait().
Eric Anholt [Mon, 9 Jul 2018 20:18:34 +0000 (13:18 -0700)]
v3d: Fix the timeout value passed to drmSyncobjWait().

The API wants an absolute time, so we need to go add gallium's argument to
CLOCK_MONOTONIC.
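
A minimal sketch of that conversion, assuming the relative timeout is given
in nanoseconds. The helper names are hypothetical, the saturation to
INT64_MAX is a choice made for this sketch, and the actual call into
drmSyncobjWait() is left out.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() under strict -std */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC time in nanoseconds. */
static int64_t
monotonic_now_ns(void)
{
   struct timespec ts;
   clock_gettime(CLOCK_MONOTONIC, &ts);
   return (int64_t)ts.tv_sec * 1000000000ll + ts.tv_nsec;
}

/* Hypothetical helper: turn a relative timeout into an absolute deadline,
 * saturating instead of overflowing for "wait forever" style values. */
static int64_t
abs_timeout_ns(int64_t now_ns, uint64_t rel_timeout_ns)
{
   assert(now_ns >= 0);
   if (rel_timeout_ns > (uint64_t)(INT64_MAX - now_ns))
      return INT64_MAX;
   return now_ns + (int64_t)rel_timeout_ns;
}
```

A caller would pass abs_timeout_ns(monotonic_now_ns(), timeout) as the
absolute deadline rather than the raw relative value.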

6 years agov3d: Fix drmSyncobjWait() return value checking even more.
Eric Anholt [Wed, 18 Jul 2018 19:06:45 +0000 (12:06 -0700)]
v3d: Fix drmSyncobjWait() return value checking even more.

It tends to return >0 in the success case (I think the value is something
like "how much of the timeout remained").  Fixes
dEQP-GLES3.functional.fence_sync.client_wait_sync_finish

6 years agov3d: Use the list_first_entry/list_last_entry macros.
Eric Anholt [Tue, 17 Jul 2018 21:33:19 +0000 (14:33 -0700)]
v3d: Use the list_first_entry/list_last_entry macros.

6 years agov3d: Move BO cache counting to dump time instead of cache management.
Eric Anholt [Tue, 17 Jul 2018 21:29:41 +0000 (14:29 -0700)]
v3d: Move BO cache counting to dump time instead of cache management.

This is one less way to get the dump stats wrong.

6 years agov3d: Reduce the stale BO reclamation spam with dump_stats set.
Eric Anholt [Tue, 17 Jul 2018 20:21:58 +0000 (13:21 -0700)]
v3d: Reduce the stale BO reclamation spam with dump_stats set.

This was obviously meant to be when we were actually freeing a BO, not
just when there was at least one BO in the list.

6 years agov3d: Respect a sampler view's first_layer field.
Eric Anholt [Mon, 16 Jul 2018 23:44:58 +0000 (16:44 -0700)]
v3d: Respect a sampler view's first_layer field.

Fixes texturing from EGL images created from cubemap faces, as in
dEQP-EGL.functional.image.create.gles2_cubemap_negative_x_rgba_texture

6 years agoradeonsi: emit_spi_map packets optimization
Sonny Jiang [Wed, 18 Jul 2018 21:48:50 +0000 (17:48 -0400)]
radeonsi: emit_spi_map packets optimization

v2: marek: remove an empty line before break;
    rename reg_val_seq -> spi_ps_input_cntl
    "type * x" -> "type *x"

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
6 years agovirgl: Expose GL_ARB_copy_image if host supports it
Gert Wollny [Tue, 3 Jul 2018 11:32:21 +0000 (13:32 +0200)]
virgl: Expose GL_ARB_copy_image if host supports it

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
6 years agovirgl: Allow RGB32* textures only as buffer objects
Gert Wollny [Thu, 12 Jul 2018 10:55:36 +0000 (12:55 +0200)]
virgl: Allow RGB32* textures only as buffer objects

When requesting a texture of the internal format GL_RGB32F, Gallium will
try to allocate a renderable texture and return RGBA32F or RGBX32F, but
when one requests GL_RGB32I or GL_RGB32UI the corresponding 3-component
texture will be returned. This leads to problems later, when one wants
to use glCopyImageSubData to copy data between these textures, which should
be compatible: given the way virgl and Gallium handle this, the latter
fails with an assertion because the per-texel bit size is different.

By allowing the GL_RGB32* only for texture buffers these problems are avoided
without losing the ARB_tbo_rgb32 extension (thanks Ilia Mirkin).

v2: Correct spelling (Gurchetan Singh)

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
6 years agointel: tools: dump: protect against multiple calls on destructor
Lionel Landwerlin [Fri, 20 Jul 2018 10:20:41 +0000 (11:20 +0100)]
intel: tools: dump: protect against multiple calls on destructor

When running gdb, make sure to pass the LD_PRELOAD variable only to
the executed program, not the debugger. Otherwise the debugger will
run the preloaded constructor/destructor too and bad things will
happen.

Suggested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
6 years agointel: tools: dump: make dump tool reliable under gdb
Lionel Landwerlin [Fri, 20 Jul 2018 10:18:18 +0000 (11:18 +0100)]
intel: tools: dump: make dump tool reliable under gdb

The problem with passing the configuration of the dump lib through a
file descriptor is that it can be read only once. But under gdb you
might want to rerun your program multiple times.

This change hands the configuration through a temporary file that is
deleted once the command line passed to intel_dump_gpu has exited.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
6 years agoradv: don't flush DB before subpass FS resolves
Samuel Pitoiset [Fri, 20 Jul 2018 13:07:34 +0000 (15:07 +0200)]
radv: don't flush DB before subpass FS resolves

That shouldn't be needed because the DB state is invalid.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agor600: Correct evaluation of cube array index and face
Gert Wollny [Tue, 17 Jul 2018 17:04:09 +0000 (19:04 +0200)]
r600: Correct evaluation of cube array index and face

The array index needs to be corrected, and it must be ensured that it is
rounded and its value is non-negative before it is combined with the
face id.

v5: Use RNDNE instead of ADD 0.5 and FLOOR (Ilia Mirkin)

v6: Fix type (Roland Scheidegger)

Fixes 182 from android/cts/master/gles31-master.txt:
  dEQP-GLES31.functional.texture.filtering.cube_array.formats.*
  dEQP-GLES31.functional.texture.filtering.cube_array.sizes.*
  dEQP-GLES31.functional.texture.filtering.cube_array.combinations.nearest_mipmap_*
  dEQP-GLES31.functional.texture.filtering.cube_array.combinations.linear_mipmap_*
  dEQP-GLES31.functional.texture.filtering.cube_array.no_edges_visible.*

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
6 years agor600: correct texture offset for array index lookup
Gert Wollny [Tue, 17 Jul 2018 17:04:08 +0000 (19:04 +0200)]
r600: correct texture offset for array index lookup

Correct the array index for TEXTURE_*1D_ARRAY, and TEXTURE_*2D_ARRAY
The standard says the array index is evaluated according to

   floor(z + 0.5)

but RNDNE is also sufficient for the test cases where z is close to 1.5
and it is likely to hit 1.5, the corner case where RNDNE gives a result
different from the above formula.
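
The two roundings can be compared with a small sketch. It assumes that
RNDNE rounds to nearest with ties going to the even integer and that the
index is non-negative; both helpers are hypothetical stand-ins written
with plain casts, not the shader instructions themselves.

```c
#include <assert.h>

/* floor(z + 0.5) for non-negative z: truncation equals floor here. */
static int
index_floor_half(float z)
{
   assert(z >= 0.0f);
   return (int)(z + 0.5f);
}

/* Round to nearest, ties to even, for non-negative z.  Stand-in for the
 * RNDNE instruction; the tie rule is an assumption of this sketch. */
static int
index_rndne(float z)
{
   assert(z >= 0.0f);
   int base = (int)z;
   float frac = z - (float)base;
   if (frac > 0.5f)
      return base + 1;
   if (frac < 0.5f)
      return base;
   return base + (base & 1);   /* exact tie: pick the even neighbour */
}
```

Under these assumptions the two helpers differ only on exact ties whose
integer part is even, for example 0.5 and 2.5; everywhere else, including
at 1.5 where both give 2, they return the same index.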

v5: - Use RNDNE instead of ADD 0.5 and FLOOR (Ilia Mirkin)
    - update commit message

Fixes 325 tests from android/cts/master/gles3-master.txt:
  dEQP-GLES3.functional.shaders.texture_functions.texture.*sampler2darray*
  dEQP-GLES3.functional.shaders.texture_functions.textureoffset.*sampler2darray*
  dEQP-GLES3.functional.shaders.texture_functions.texturelod.sampler2darray*
  dEQP-GLES3.functional.shaders.texture_functions.texturelodoffset.*sampler2darray*
  dEQP-GLES3.functional.shaders.texture_functions.texturegrad.*sampler2darray*
  dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.*sampler2darray*
  dEQP-GLES3.functional.texture.filtering.2d_array.formats.*
  dEQP-GLES3.functional.texture.filtering.2d_array.sizes.*
  dEQP-GLES3.functional.texture.filtering.2d_array.combinations.*
  dEQP-GLES3.functional.texture.shadow.2d_array.*
  dEQP-GLES3.functional.texture.vertex.2d_array.*

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
6 years agor600: Delay emission of texture gradients and lookup offsets
Gert Wollny [Tue, 17 Jul 2018 17:04:07 +0000 (19:04 +0200)]
r600: Delay emission of texture gradients and lookup offsets

Gradients used in texture lookups and the offsets must reside in the
same fetch clause (the first is imposed by the hardware and the second
is expected by sb). In order to ensure that no ALU clause is inserted
between emission and use of these, delay the emission of these
instructions until the texture instruction using them is also emitted.

This is needed in preparation for the correction of the texture array
indices.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
6 years agoutil/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.
Bas Nieuwenhuizen [Wed, 18 Jul 2018 11:58:49 +0000 (13:58 +0200)]
util/disk_cache: Fix disk_cache_get_function_timestamp with disabled cache.

radv always needs it, so just check the header instead. Also
do not declare the function if the variable is not set, so we
get a nice compile error instead of failing to open a device
at runtime.

Fixes: b87ef9e606a "util: fix MSVC build issue in disk_cache.h"
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agonir: Do not use continue block after removing it.
Bas Nieuwenhuizen [Sat, 14 Jul 2018 23:19:17 +0000 (01:19 +0200)]
nir: Do not use continue block after removing it.

Reinserting code directly before a jump means the block gets split
and merged, removing the original block and replacing it in the
process.

Hence keeping a pointer to the continue block over a reinsert
causes issues.

This code changes nir_opt_if to simply look for the new continue
block.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107275
CC: 18.1 <mesa-stable@lists.freedesktop.org>
6 years agoradv: simplify a condition in radv_src_access_flush()
Samuel Pitoiset [Wed, 18 Jul 2018 14:19:07 +0000 (16:19 +0200)]
radv: simplify a condition in radv_src_access_flush()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: save current state just before resolving with FS
Samuel Pitoiset [Wed, 18 Jul 2018 14:19:06 +0000 (16:19 +0200)]
radv: save current state just before resolving with FS

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: don't check if a subpass has resolve attachments twice
Samuel Pitoiset [Wed, 18 Jul 2018 14:19:05 +0000 (16:19 +0200)]
radv: don't check if a subpass has resolve attachments twice

We already check that in radv_cmd_buffer_resolve_subpass().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoradv: make use of radv_subpass_barrier() when resolving subpasses
Samuel Pitoiset [Wed, 18 Jul 2018 14:19:04 +0000 (16:19 +0200)]
radv: make use of radv_subpass_barrier() when resolving subpasses

The goal is to use radv_barrier()/radv_subpass_barrier() as
much as possible for further optimizations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agonv50/ir: move LateAlgebraicOpt back to right after ConstantFolding
Rhys Perry [Tue, 12 Jun 2018 11:14:14 +0000 (12:14 +0100)]
nv50/ir: move LateAlgebraicOpt back to right after ConstantFolding

total instructions in shared programs : 5480808 -> 5472107 (-0.16%)
total gprs used in shared programs    : 647530 -> 647532 (0.00%)
total shared used in shared programs  : 389120 -> 389120 (0.00%)
total local used in shared programs   : 21064 -> 21064 (0.00%)
total bytes used in shared programs   : 58551648 -> 58459352 (-0.16%)

                local     shared        gpr       inst      bytes
    helped           0           0          73        2609        2609
      hurt           0           0          71          34          34

6 years agonv50/ir: handle SHLADD in IndirectPropagation
Rhys Perry [Tue, 12 Jun 2018 10:43:49 +0000 (11:43 +0100)]
nv50/ir: handle SHLADD in IndirectPropagation

An alternative solution to the problem fixed in
0bd83d0 ("nv50/ir: move LateAlgebraicOpt to the very end").

total instructions in shared programs : 5481195 -> 5480808 (-0.01%)
total gprs used in shared programs    : 647535 -> 647530 (-0.00%)
total shared used in shared programs  : 389120 -> 389120 (0.00%)
total local used in shared programs   : 21064 -> 21064 (0.00%)
total bytes used in shared programs   : 58555784 -> 58551648 (-0.01%)

                local     shared        gpr       inst      bytes
    helped           0           0           2          34          34
      hurt           0           0           0           0           0

6 years agogm107/ir: use CS2R for SV_CLOCK
Rhys Perry [Thu, 19 Jul 2018 15:58:46 +0000 (16:58 +0100)]
gm107/ir: use CS2R for SV_CLOCK

This instruction seems to be faster than S2R and requires no barrier,
though the range of special registers it can read from is limited.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
6 years agointel: tools: dump: remove mentions of intel_aubdump
Lionel Landwerlin [Wed, 18 Jul 2018 16:38:52 +0000 (17:38 +0100)]
intel: tools: dump: remove mentions of intel_aubdump

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Rafael Antognolli <rafael.antognolli@intel.com>
6 years agointel: tools: aubwrite: fix invalid frees on finish
Lionel Landwerlin [Wed, 18 Jul 2018 16:39:19 +0000 (17:39 +0100)]
intel: tools: aubwrite: fix invalid frees on finish

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
6 years agoac/nir: add a workaround for bitfield_extract when count is 0
Samuel Pitoiset [Thu, 19 Jul 2018 18:27:11 +0000 (20:27 +0200)]
ac/nir: add a workaround for bitfield_extract when count is 0

LLVM 7 returns incorrect results when count is 0, something
has been broken since LLVM 6. Of course, the best solution is
to fix LLVM but this workaround works as expected for now.
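
For reference, GLSL's bitfieldExtract yields 0 when the bit count is 0.
The sketch below shows that contract in plain C with an explicit guard,
mirroring the general shape of such a workaround; ubfe_guarded is a
hypothetical helper, not the code emitted by ac/nir.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned bitfield extract with an explicit guard for count == 0. */
static uint32_t
ubfe_guarded(uint32_t value, unsigned offset, unsigned count)
{
   assert(offset < 32 && count <= 32 - offset);
   if (count == 0)
      return 0;                       /* extracting zero bits yields 0 */
   uint32_t mask = count == 32 ? UINT32_MAX : (1u << count) - 1u;
   return (value >> offset) & mask;
}
```

In portable C the mask expression already evaluates to 0 for a zero
count, so the guard here only makes the intended result explicit.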

Original workaround by Philippe Rebohle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107276
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agointel/isl/gen4: Make depth/stencil buffers Y-Tiled
Nanley Chery [Mon, 16 Jul 2018 22:42:39 +0000 (15:42 -0700)]
intel/isl/gen4: Make depth/stencil buffers Y-Tiled

Rendering to a linear depth buffer on gen4 is causing a GPU hang in the
CI system. Until a better explanation is found, assume that errata is
applicable to all gen4 platforms.

Fixes fbe01625f6bf2cef6742e1ff0d3d44a2afec003e
("i965/miptree: Share tiling_flags in miptree_create").

Reported-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107248
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoi965/misc: Use depth/stencil surf's tiling on gen4-5
Nanley Chery [Mon, 16 Jul 2018 20:03:09 +0000 (13:03 -0700)]
i965/misc: Use depth/stencil surf's tiling on gen4-5

Make the 3D engine aware of the depth/stencil surface's tiling before
doing any render operations.

Fixes fbe01625f6bf2cef6742e1ff0d3d44a2afec003e
("i965/miptree: Share tiling_flags in miptree_create").

Reported-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107248
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
6 years agoglsl: don't let an 'if' then-branch kill copy propagation (elements) for else-branch
Caio Marcelo de Oliveira Filho [Tue, 26 Jun 2018 23:26:46 +0000 (16:26 -0700)]
glsl: don't let an 'if' then-branch kill copy propagation (elements) for else-branch

When handling 'if' in copy propagation elements, if a certain variable
was killed when processing the first branch of the 'if', then the
second would not get any propagation from previous nodes.

    x = y;
    if (...) {
        z = x;  // This would turn into z = y.
        x = 22; // x gets killed.
    } else {
        w = x;  // This would NOT turn into w = y.
    }

With the change, we let copy propagation happen independently in the
two branches and only then apply the killed values for the subsequent
code.
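
The idea can be modelled with a toy available-copies table. This is a
sketch only: every name in it is hypothetical, whole variables are tracked
instead of channels, and the merge rule (drop anything either branch
changed) is a conservative simplification.

```c
#include <assert.h>

enum { VAR_X, VAR_Y, VAR_Z, VAR_W, NUM_VARS };
enum { NO_COPY = -1 };

/* copy_of[v] is the variable v currently mirrors, or NO_COPY. */
struct acp_state {
   int copy_of[NUM_VARS];
};

static void
acp_init(struct acp_state *s)
{
   for (int v = 0; v < NUM_VARS; v++)
      s->copy_of[v] = NO_COPY;
}

/* What a read of var can be replaced with. */
static int
acp_resolve(const struct acp_state *s, int var)
{
   return s->copy_of[var] != NO_COPY ? s->copy_of[var] : var;
}

/* var was written: forget copies into it and copies that read from it. */
static void
acp_kill(struct acp_state *s, int var)
{
   s->copy_of[var] = NO_COPY;
   for (int v = 0; v < NUM_VARS; v++) {
      if (s->copy_of[v] == var)
         s->copy_of[v] = NO_COPY;
   }
}

/* Record "dst = src". */
static void
acp_copy(struct acp_state *s, int dst, int src)
{
   int resolved = acp_resolve(s, src);
   acp_kill(s, dst);
   if (resolved != dst)
      s->copy_of[dst] = resolved;
}

/* After the if: keep only the facts both branches left untouched. */
static void
acp_merge(struct acp_state *parent, const struct acp_state *then_s,
          const struct acp_state *else_s)
{
   for (int v = 0; v < NUM_VARS; v++) {
      if (then_s->copy_of[v] != parent->copy_of[v] ||
          else_s->copy_of[v] != parent->copy_of[v])
         parent->copy_of[v] = NO_COPY;
   }
}

/* Runs the example above; returns what "w = x" reads in the else-branch
 * and stores the state that applies after the if in *after. */
static int
run_example(struct acp_state *after)
{
   struct acp_state s;
   acp_init(&s);
   acp_copy(&s, VAR_X, VAR_Y);               /* x = y */

   struct acp_state then_s = s, else_s = s;  /* independent clones */
   acp_copy(&then_s, VAR_Z, VAR_X);          /* z = x becomes z = y */
   acp_kill(&then_s, VAR_X);                 /* x = 22 kills x */

   int w_source = acp_resolve(&else_s, VAR_X);
   acp_copy(&else_s, VAR_W, VAR_X);          /* w = x becomes w = y */

   acp_merge(&s, &then_s, &else_s);
   *after = s;
   return w_source;
}
```

Because each branch starts from its own clone, the kill of x in the
then-branch no longer hides x = y from the else-branch; the kill only
takes effect for the code after the if.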

One example in shader-db part of shaders/unity/8.shader_test:

    (assign  (xyz) (var_ref col_1)  (var_ref tmpvar_8) )
    (if (expression bool < (swiz y (var_ref xlv_TEXCOORD0) )(constant float (0.000000)) ) (
      (assign  (xyz) (var_ref col_1)  (expression vec3 + (var_ref tmpvar_8) ... ) ... )
    )
    (
      (assign  (xyz) (var_ref col_1)  (expression vec3 lrp (var_ref col_1) ... ) ... )
    ))

The variable col_1 was replaced by tmpvar_8 in the then-part but not
in the else-part.

NIR deals well with copy propagation, so it already covered for the
missing ones that this patch fixes.

Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agoglsl: change opt_copy_propagation_elements data structures
Caio Marcelo de Oliveira Filho [Mon, 25 Jun 2018 17:44:56 +0000 (10:44 -0700)]
glsl: change opt_copy_propagation_elements data structures

Instead of keeping multiple acp_entries in lists, have a single
acp_entry per variable. With this, clone is more convenient to
implement and is now complete. In the previous code, clone was
only partial.

Before this patch, each acp_entry struct represented a write to a
variable including LHS, RHS and a mask of what channels were written
to. There were two main hash tables, the first (lhs_ht) stored a list
of acp_entries per LHS variable, with the values available to copy for
that variable; the second (rhs_ht) was a "reverse index" for the first
hash table, so stored acp_entries per RHS variable.

After the patch, there's a single acp_entry struct per LHS variable,
it contains an array with references to the RHS variables per
channel. There now is a single hash table, from LHS variable to the
corresponding entry. The "reverse index" is stored in the ACP entry,
in the form of a set of variables that copy from the LHS. To make the
clone operation cheaper, the ACP entries are created on demand.
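
A rough sketch of the layout described above, under simplifying
assumptions: a flat array stands in for the hash table, the set of
destination variables is a bitmask, swizzles are identity only, and
on-demand entry creation is not modelled. All names are illustrative and
not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_VARS = 8, NUM_CHANNELS = 4, NO_VAR = -1 };

/* One entry per LHS variable. */
struct acp_entry {
   int rhs_var[NUM_CHANNELS];      /* source variable for each channel */
   int rhs_channel[NUM_CHANNELS];  /* source component for each channel */
   uint32_t dsts;                  /* reverse index: bit v is set when
                                    * variable v copies from this one */
};

/* A flat array stands in for the LHS-variable hash table. */
struct acp_table {
   struct acp_entry entries[NUM_VARS];
};

static void
table_init(struct acp_table *t)
{
   for (int v = 0; v < NUM_VARS; v++) {
      for (int c = 0; c < NUM_CHANNELS; c++) {
         t->entries[v].rhs_var[c] = NO_VAR;
         t->entries[v].rhs_channel[c] = 0;
      }
      t->entries[v].dsts = 0;
   }
}

/* Record "lhs.<write_mask> = rhs" with an identity swizzle. */
static void
table_write(struct acp_table *t, int lhs, unsigned write_mask, int rhs)
{
   for (int c = 0; c < NUM_CHANNELS; c++) {
      if (write_mask & (1u << c)) {
         t->entries[lhs].rhs_var[c] = rhs;
         t->entries[lhs].rhs_channel[c] = c;
      }
   }
   t->entries[rhs].dsts |= 1u << lhs;
}

/* var was overwritten: use the reverse index to drop copies reading it. */
static void
table_kill_readers(struct acp_table *t, int var)
{
   for (int v = 0; v < NUM_VARS; v++) {
      if (!(t->entries[var].dsts & (1u << v)))
         continue;
      for (int c = 0; c < NUM_CHANNELS; c++) {
         if (t->entries[v].rhs_var[c] == var)
            t->entries[v].rhs_var[c] = NO_VAR;
      }
   }
   t->entries[var].dsts = 0;
}
```

Since the whole state is one table of plain entries, cloning it for a
branch amounts to a structure copy in this sketch.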

This should not change the result of copy propagation, a later patch
will take advantage of the clone operation.

v2: Add note clarifying how the hashtable is destroyed.

v3: (all from Eric Anholt)
    Add remove_unused_var_from_dsts() function for reuse.
    Remove from dsts as we go instead of clearing at the end.
    Add clarifying comment to erase().

Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agoglsl: separate copy propagation state
Caio Marcelo de Oliveira Filho [Sat, 23 Jun 2018 00:35:23 +0000 (17:35 -0700)]
glsl: separate copy propagation state

Separate the higher-level logic of visiting instructions and choosing when
to store and use new copy data from the data structure holding the copy
propagation information. This will also make later patches that
change the structure easier.

v2: Remove empty destructor and clarify how hash tables are destroyed.

Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agointel: tools: dump: trace memory writes
Lionel Landwerlin [Wed, 18 Jul 2018 17:19:31 +0000 (18:19 +0100)]
intel: tools: dump: trace memory writes

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
6 years agointel: tools: dump: remove command execution feature
Lionel Landwerlin [Wed, 18 Jul 2018 14:12:57 +0000 (15:12 +0100)]
intel: tools: dump: remove command execution feature

In commit 86cb05a6d35a52 ("intel: aubinator: remove standard input
processing option") we removed the ability to process aub as an input
stream because we now rely on mmapping the aub file to back the
buffers aubinator is parsing.

intel_aubdump was the provider of the standard input data and since
we've copied/reworked intel_aubdump into intel_dump_gpu within Mesa,
we don't need that code anymore.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoradv: Fix incorrect assumption about ternary operator precedence
Danylo Piliaiev [Wed, 18 Jul 2018 08:47:19 +0000 (11:47 +0300)]
radv: Fix incorrect assumption about ternary operator precedence
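
The class of bug can be shown with a generic sketch; the flag names and
helpers are hypothetical and this is not the radv expression that was
fixed. In C the conditional operator binds more loosely than '|', so
"base | extra ? FLAG_B : 0" uses all of "base | extra" as the condition.

```c
#include <assert.h>
#include <stdbool.h>

enum { FLAG_A = 0x1, FLAG_B = 0x4 };

/* How "base | extra ? FLAG_B : 0" is actually parsed: the whole
 * "base | extra" expression becomes the condition. */
static unsigned
flags_as_parsed(unsigned base, bool extra)
{
   return (base | extra) ? FLAG_B : 0;
}

/* The intended grouping: OR in FLAG_B only when extra is set. */
static unsigned
flags_intended(unsigned base, bool extra)
{
   return base | (extra ? FLAG_B : 0u);
}
```

With base set to FLAG_A and extra false, the first form yields FLAG_B and
loses FLAG_A, while the parenthesized form keeps FLAG_A unchanged.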

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agomesa: fix make check for AMD_performance_monitor
Marek Olšák [Thu, 19 Jul 2018 05:16:56 +0000 (01:16 -0400)]
mesa: fix make check for AMD_performance_monitor

6 years agomesa: remove dead code from api_loopback
Marek Olšák [Tue, 17 Jul 2018 03:29:48 +0000 (23:29 -0400)]
mesa: remove dead code from api_loopback

This should only contain functions not set in vtxfmt.c.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agomesa: expose ARB_indirect_parameters in the compatibility profile
Marek Olšák [Tue, 17 Jul 2018 03:16:31 +0000 (23:16 -0400)]
mesa: expose ARB_indirect_parameters in the compatibility profile

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> (v1)
v2: fix dispatch_sanity

6 years agovbo: fix ARB_multi_draw_indirect for the compatibility profile
Marek Olšák [Tue, 17 Jul 2018 03:14:14 +0000 (23:14 -0400)]
vbo: fix ARB_multi_draw_indirect for the compatibility profile

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agomesa: expose ARB_shader_viewport_layer_array in the compatibility profile
Marek Olšák [Tue, 17 Jul 2018 03:10:17 +0000 (23:10 -0400)]
mesa: expose ARB_shader_viewport_layer_array in the compatibility profile

no changes needed for GL compat

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agomesa: expose ARB_ES3_1_compatibility in the compatibility profile
Marek Olšák [Tue, 17 Jul 2018 03:09:53 +0000 (23:09 -0400)]
mesa: expose ARB_ES3_1_compatibility in the compatibility profile

no changes needed for GL compat

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agowinsys/amdgpu: remove RADEON_SURF_FMASK leftover
Marek Olšák [Mon, 25 Jun 2018 22:57:08 +0000 (18:57 -0400)]
winsys/amdgpu: remove RADEON_SURF_FMASK leftover

RADEON_SURF_FMASK is never set.

6 years agoac: run LLVM optimization passes only on the final function after inlining
Marek Olšák [Thu, 5 Jul 2018 06:27:45 +0000 (02:27 -0400)]
ac: run LLVM optimization passes only on the final function after inlining

6 years agoradv: Enable binning and dfsm by default on Raven.
Bas Nieuwenhuizen [Sat, 14 Jul 2018 12:28:23 +0000 (14:28 +0200)]
radv: Enable binning and dfsm by default on Raven.

Seems like it increases performance by 2-3% for some demos and games.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoradv: Always set disable zpass increment bit when possible.
Bas Nieuwenhuizen [Sat, 14 Jul 2018 12:28:22 +0000 (14:28 +0200)]
radv: Always set disable zpass increment bit when possible.

When no occlusion queries are active even if out of order is enabled.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoradv: Select correct entries for binning.
Bas Nieuwenhuizen [Sat, 14 Jul 2018 12:28:21 +0000 (14:28 +0200)]
radv: Select correct entries for binning.

Overshot it by one every time.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoradv: Fix number of samples used for binning.
Bas Nieuwenhuizen [Sat, 14 Jul 2018 12:28:20 +0000 (14:28 +0200)]
radv: Fix number of samples used for binning.

Used the wrong register ...

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoradv: Disable disabled color buffers in rbplus opts.
Bas Nieuwenhuizen [Sat, 14 Jul 2018 12:28:19 +0000 (14:28 +0200)]
radv: Disable disabled color buffers in rbplus opts.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agor600: silence the signed overflow warning like radeonsi
Marek Olšák [Wed, 18 Jul 2018 21:47:54 +0000 (17:47 -0400)]
r600: silence the signed overflow warning like radeonsi

r600_gpu_load.c: In function ‘r600_gpu_load_thread’:
../../../../src/util/os_time.h:82:7: warning: assuming signed overflow does not occur when assuming that (X + c) >= X is always true [-Wstrict-overflow]
    if (start <= end)

6 years agoradv: fix wmaybe-uninitialized in radv_meta_fast_clear.c
Andres Rodriguez [Wed, 18 Jul 2018 18:18:57 +0000 (14:18 -0400)]
radv: fix wmaybe-uninitialized in radv_meta_fast_clear.c

Assignment and usage of this variable both happen inside
if (rad_image_has_dcc()) {} blocks. It seems gcc plays it safe and
assumes that both function calls could have different return values.

But in this case we should be safe.
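
The pattern looks roughly like the sketch below, in which every name is
hypothetical. Giving the variable an initial value is one common way to
silence the warning; whether the patch does exactly that is not stated
here.

```c
#include <assert.h>
#include <stdbool.h>

struct image {
   bool has_dcc;
   unsigned dcc_value;
};

static bool
image_has_dcc(const struct image *img)
{
   return img->has_dcc;
}

static unsigned
process_image(const struct image *img)
{
   /* Without the initializer the compiler cannot prove that the second
    * image_has_dcc() call returns the same value as the first one. */
   unsigned saved = 0;
   unsigned result = 1;

   if (image_has_dcc(img))
      saved = img->dcc_value;      /* assignment guarded by the check */

   if (image_has_dcc(img))
      result += saved;             /* use guarded by the same check */

   return result;
}
```

The initial value is never observed when both checks agree, so behaviour
is unchanged and only the diagnostic goes away.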

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoradeonsi: emit_guardband packets optimization
Sonny Jiang [Tue, 17 Jul 2018 14:22:03 +0000 (10:22 -0400)]
radeonsi: emit_guardband packets optimization

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi: Save CLEAR_STATE initial values for optimization
Sonny Jiang [Tue, 17 Jul 2018 14:22:02 +0000 (10:22 -0400)]
radeonsi: Save CLEAR_STATE initial values for optimization

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi: Refuse to accept code with unhandled relocations
Jan Vesely [Tue, 17 Jul 2018 01:22:22 +0000 (21:22 -0400)]
radeonsi: Refuse to accept code with unhandled relocations

They might lead to unrecoverable GPU hang.
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable@lists.freedesktop.org
6 years agoAllow AMD_perfmon on GLES contexts
Eric Anholt [Wed, 18 Jul 2018 15:01:17 +0000 (11:01 -0400)]
Allow AMD_perfmon on GLES contexts

v2: whitespace alignment fix

Reviewed-by: Rob Clark <robdclark@gmail.com>
6 years agoegl: Use the canonical drm-uapi fourcc header to avoid local defines.
Eric Anholt [Mon, 16 Jul 2018 22:57:24 +0000 (15:57 -0700)]
egl: Use the canonical drm-uapi fourcc header to avoid local defines.

We should only use a #define locally once it's been upstreamed, and at
that point you should just update our drm_fourcc.h.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agov3d: Fix tiling modifier support to use the new UIF define.
Eric Anholt [Wed, 20 Jun 2018 23:54:31 +0000 (16:54 -0700)]
v3d: Fix tiling modifier support to use the new UIF define.

You can't use T-tiled buffers on V3D 3.x and newer; the layout has been
replaced with a newer one shared with other hardware blocks.

6 years agodrm-uapi: Update drm_fourcc.h for new format modifiers.
Eric Anholt [Wed, 20 Jun 2018 23:51:39 +0000 (16:51 -0700)]
drm-uapi: Update drm_fourcc.h for new format modifiers.

This brings in the Broadcom VC4 SAND and V3D 3.x+ UIF modifiers, from
drm-next commit 4da1d4c751c9b1b713c13043bad7c4d27cd1418c.

6 years agost/mesa: notify u_vbuf/driver that draw index bounds are unknown for indirect
Marek Olšák [Tue, 17 Jul 2018 05:50:42 +0000 (01:50 -0400)]
st/mesa: notify u_vbuf/driver that draw index bounds are unknown for indirect

Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agoradeonsi: Use signed char for color_interp_vgpr_index
Timothy Pearson [Mon, 16 Jul 2018 19:20:42 +0000 (14:20 -0500)]
radeonsi: Use signed char for color_interp_vgpr_index

color_interp_vgpr_index was declared as a generic char value.
Because signed values are used in this variable, the result
was not safe across architectures and crashed on ppc64[el]
and arm.

Declare color_interp_vgpr_index as a signed type.
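
The underlying portability issue is that C leaves the signedness of plain
char implementation-defined. A minimal sketch of the effect follows; the
struct names are hypothetical stand-ins, not the radeonsi definitions.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the real field; only the types matter here. */
struct interp_plain  { char        vgpr_index; };
struct interp_signed { signed char vgpr_index; };

/* Whether -1 survives a round trip through plain char depends on the ABI:
 * where plain char is unsigned, -1 is stored as 255 and this returns false. */
static bool plain_keeps_minus_one(void)
{
   struct interp_plain info = { .vgpr_index = -1 };
   return info.vgpr_index == -1;
}

/* With an explicitly signed type the comparison holds on every ABI. */
static bool signed_keeps_minus_one(void)
{
   struct interp_signed info = { .vgpr_index = -1 };
   return info.vgpr_index == -1;
}
```

CHAR_MIN from <limits.h> reports which behavior the current target has: it
is negative exactly when plain char is signed.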

Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
6 years agointel/blorp: Take an explicit filter parameter in blorp_blit
Jason Ekstrand [Mon, 25 Jun 2018 22:14:38 +0000 (15:14 -0700)]
intel/blorp: Take an explicit filter parameter in blorp_blit

This lets us move the glBlitFramebuffer nonsense into the GL driver and
make the usage of BLORP much more explicit and obvious as to what it's
doing.

Reviewed-by: Chad Versace <chadversary@chromium.org>
6 years agointel/blorp: Add a blorp_filter enum for use in blorp_blit
Jason Ekstrand [Wed, 20 Jun 2018 05:05:57 +0000 (22:05 -0700)]
intel/blorp: Add a blorp_filter enum for use in blorp_blit

At the moment, this is entirely internal but we'll expose it to clients
of the BLORP API in the next commit.

Reviewed-by: Chad Versace <chadversary@chromium.org>
6 years agointel/tools: add missing include for stdarg.h
Caio Marcelo de Oliveira Filho [Wed, 18 Jul 2018 16:15:53 +0000 (09:15 -0700)]
intel/tools: add missing include for stdarg.h

Fixes build in GCC 8.1.1:

FAILED: src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o
gcc -Isrc/intel/tools/src@intel@tools@@intel_dump_gpu@sha -Isrc/intel/tools -I../../src/intel/tools -Isrc/../include -I../../src/../include -Isrc -I../../src -Isrc/mapi -I../../src/mapi -Isrc/mesa -I../../src/mesa -I../../src/gallium/include -I../../src/gallium/auxiliary -Isrc/intel -I../../src/intel -I../../include/drm-uapi -fdiagnostics-color=always -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -std=c99 -O2 -g -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS '-DVERSION="18.2.0-devel"' -DPACKAGE_VERSION=VERSION '-DPACKAGE_BUGREPORT="https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa"' -DGLX_USE_TLS -DENABLE_ST_OMX_BELLAGIO=0 -DENABLE_ST_OMX_TIZONIA=0 -DHAVE_X11_PLATFORM -DGLX_INDIRECT_RENDERING -DGLX_DIRECT_RENDERING -DGLX_USE_DRM -DHAVE_DRM_PLATFORM -DHAVE_SURFACELESS_PLATFORM -DENABLE_SHADER_CACHE -DHAVE___BUILTIN_BSWAP32 -DHAVE___BUILTIN_BSWAP64 -DHAVE___BUILTIN_CLZ -DHAVE___BUILTIN_CLZLL -DHAVE___BUILTIN_CTZ -DHAVE___BUILTIN_EXPECT -DHAVE___BUILTIN_FFS -DHAVE___BUILTIN_FFSLL -DHAVE___BUILTIN_POPCOUNT -DHAVE___BUILTIN_POPCOUNTLL -DHAVE___BUILTIN_UNREACHABLE -DHAVE_FUNC_ATTRIBUTE_CONST -DHAVE_FUNC_ATTRIBUTE_FLATTEN -DHAVE_FUNC_ATTRIBUTE_MALLOC -DHAVE_FUNC_ATTRIBUTE_PURE -DHAVE_FUNC_ATTRIBUTE_UNUSED -DHAVE_FUNC_ATTRIBUTE_WARN_UNUSED_RESULT -DHAVE_FUNC_ATTRIBUTE_WEAK -DHAVE_FUNC_ATTRIBUTE_FORMAT -DHAVE_FUNC_ATTRIBUTE_PACKED -DHAVE_FUNC_ATTRIBUTE_RETURNS_NONNULL -DHAVE_FUNC_ATTRIBUTE_VISIBILITY -DHAVE_FUNC_ATTRIBUTE_ALIAS -DHAVE_FUNC_ATTRIBUTE_NORETURN -D_GNU_SOURCE -DUSE_SSE41 -DUSE_GCC_ATOMIC_BUILTINS -DUSE_X86_64_ASM -DMAJOR_IN_SYSMACROS -DHAVE_SYS_SYSCTL_H -DHAVE_LINUX_FUTEX_H -DHAVE_ENDIAN_H -DHAVE_STRTOF -DHAVE_MKOSTEMP -DHAVE_POSIX_MEMALIGN -DHAVE_TIMESPEC_GET -DHAVE_MEMFD_CREATE -DHAVE_STRTOD_L -DHAVE_DLADDR -DHAVE_DL_ITERATE_PHDR -DHAVE_ZLIB -DHAVE_PTHREAD -DHAVE_LIBDRM -DHAVE_LLVM=0x0600 -DMESA_LLVM_VERSION_PATCH=1 -DHAVE_VALGRIND -DHAVE_LIBUNWIND -DHAVE_WAYLAND_PLATFORM -DWL_HIDE_DEPRECATED -DHAVE_DRI3 -DHAVE_DRI3_MODIFIERS 
-Wall -Werror=implicit-function-declaration -Werror=missing-prototypes -fno-math-errno -fno-trapping-math -Wno-missing-field-initializers -fPIC -fvisibility=hidden -Wno-override-init  -MD -MQ 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o' -MF 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o.d' -o 'src/intel/tools/src@intel@tools@@intel_dump_gpu@sha/aub_write.c.o' -c ../../src/intel/tools/aub_write.c
../../src/intel/tools/aub_write.c: In function ‘fail_if’:
../../src/intel/tools/aub_write.c:243:4: error: implicit declaration of function ‘va_start’; did you mean ‘assert’? [-Werror=implicit-function-declaration]
    va_start(args, format);
    ^~~~~~~~
    assert
../../src/intel/tools/aub_write.c:245:4: error: implicit declaration of function ‘va_end’; did you mean ‘rand’? [-Werror=implicit-function-declaration]
    va_end(args);
    ^~~~~~
    rand
cc1: some warnings being treated as errors
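
For illustration, a variadic helper needs <stdarg.h> for va_list, va_start
and va_end. The sketch below is a hypothetical formatting helper, not the
fail_if() from aub_write.c.

```c
#include <stdarg.h>   /* va_list, va_start, va_end */
#include <stdio.h>    /* vsnprintf, size_t */

/* Hypothetical helper: formats a failure message into a caller-provided
 * buffer and returns the length the full message would have had. */
static int format_failure(char *out, size_t out_size, const char *format, ...)
{
   va_list args;
   va_start(args, format);
   int needed = vsnprintf(out, out_size, format, args);
   va_end(args);
   return needed;
}
```

Without the <stdarg.h> include, va_start and va_end are treated as
undeclared functions, which is what -Werror=implicit-function-declaration
rejects in the log above.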

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agointel/tools: Rename error2aub to intel_error2aub
Jason Ekstrand [Wed, 18 Jul 2018 16:02:25 +0000 (09:02 -0700)]
intel/tools: Rename error2aub to intel_error2aub

Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agoi965: Sweep NIR after linking phase to free held memory
Danylo Piliaiev [Wed, 11 Jul 2018 12:29:00 +0000 (15:29 +0300)]
i965: Sweep NIR after linking phase to free held memory

After optimization passes and many transformations, most of the memory
NIR holds is garbage, which was being freed only after shader deletion.
Freeing it at the end of linking saves memory, which is useful
when a lot of complex shaders are being compiled.
The common case for this issue is a 32-bit game running under Wine.

The cost of the optimization is around 3-5% of compilation speed
with complex shaders.

V2: by Jason Ekstrand
    - Move nir_sweep up, right after the last change of NIR

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103274
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
6 years agowinsys/amdgpu: fix VDPAU interop by having one amdgpu_winsys_bo per BO (v2)
Marek Olšák [Mon, 16 Jul 2018 17:11:29 +0000 (13:11 -0400)]
winsys/amdgpu: fix VDPAU interop by having one amdgpu_winsys_bo per BO (v2)

Dependencies between rings are inserted correctly if a buffer is
represented by only one unique amdgpu_winsys_bo instance.
Use a hash table keyed by amdgpu_bo_handle to have exactly one
amdgpu_winsys_bo per amdgpu_bo_handle.
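
The idea can be sketched with a small open-addressing table. Everything
below is a hypothetical stand-in (fixed size, no deletion, made-up hash);
it is not the winsys code or Mesa's hash table API.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 64   /* power of two; this sketch never grows or deletes */

struct wrapper_bo {        /* stand-in for amdgpu_winsys_bo */
   const void *handle;     /* stand-in for amdgpu_bo_handle, must be non-NULL */
   unsigned refcount;
};

static struct wrapper_bo table[TABLE_SIZE];

/* Made-up pointer hash: mixes in higher bits because the low bits of
 * aligned pointers carry little information. */
static size_t hash_handle(const void *handle)
{
   uintptr_t v = (uintptr_t)handle;
   return (size_t)((v >> 4) ^ (v >> 16)) & (TABLE_SIZE - 1);
}

/* Returns the single wrapper for this handle, creating it on first import. */
static struct wrapper_bo *wrapper_from_handle(const void *handle)
{
   size_t slot = hash_handle(handle);
   for (size_t probe = 0; probe < TABLE_SIZE; probe++) {
      struct wrapper_bo *bo = &table[(slot + probe) & (TABLE_SIZE - 1)];
      if (bo->handle == handle) {     /* already imported: reuse the wrapper */
         bo->refcount++;
         return bo;
      }
      if (bo->handle == NULL) {       /* first import: claim the empty slot */
         bo->handle = handle;
         bo->refcount = 1;
         return bo;
      }
   }
   return NULL;                       /* table full */
}
```

Importing the same handle twice yields the same wrapper pointer, so any
per-buffer dependency tracking attached to the wrapper is shared.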

v2: return offset and stride properly

Tested-by: Leo Liu <leo.liu@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
6 years agowinsys/amdgpu: use a better hash_pointer function
Marek Olšák [Mon, 16 Jul 2018 17:10:57 +0000 (13:10 -0400)]
winsys/amdgpu: use a better hash_pointer function

Tested-by: Leo Liu <leo.liu@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
6 years agowinsys/amdgpu: clean up error handling in amdgpu_bo_from_handle
Marek Olšák [Mon, 16 Jul 2018 17:07:09 +0000 (13:07 -0400)]
winsys/amdgpu: clean up error handling in amdgpu_bo_from_handle

Tested-by: Leo Liu <leo.liu@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
6 years agowinsys/amdgpu: shorten bo->ws in amdgpu_bo_destroy
Marek Olšák [Mon, 16 Jul 2018 17:04:53 +0000 (13:04 -0400)]
winsys/amdgpu: shorten bo->ws in amdgpu_bo_destroy

Tested-by: Leo Liu <leo.liu@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
6 years agointel/tools: Add an error state to aub translator
Jason Ekstrand [Tue, 17 Jul 2018 16:14:38 +0000 (09:14 -0700)]
intel/tools: Add an error state to aub translator

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agointel/tools: Break aub file writing into a helper
Jason Ekstrand [Tue, 17 Jul 2018 06:13:20 +0000 (23:13 -0700)]
intel/tools: Break aub file writing into a helper

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agointel/tools: Refactor aub dumping to remove singletons
Jason Ekstrand [Tue, 17 Jul 2018 05:38:08 +0000 (22:38 -0700)]
intel/tools: Refactor aub dumping to remove singletons

Instead of having quite so many singletons, we use a struct aub_file to
organize the bits we need for writing an aub file.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agointel/dump_gpu: Fix corner cases in PPGTT range calculations
Jason Ekstrand [Tue, 17 Jul 2018 22:55:07 +0000 (15:55 -0700)]
intel/dump_gpu: Fix corner cases in PPGTT range calculations

For large buffers which span an entire l1 page table, we got the range
calculations wrong.  In this case, we end up with an l1_start which is
the first byte represented by the given l1 table and an l1_end which is
the first byte after the range represented by the l1 table.  Then
l2_start_index == L2_index(l2_end) due to roll-over.  Instead, compute
lN_end using (1Ull << shift) - 1 so that lN_end is the last byte in the
range represented by the Nth level page table.  When we do this, we
don't need the conditional expression anymore.
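
A minimal sketch of the roll-over, assuming 4 KiB pages and 512 entries per
table so that one l1 table spans 1 << 21 bytes. The helper names are
hypothetical and do not match the intel_dump_gpu code.

```c
#include <stdint.h>

/* Assumed layout: 4 KiB pages, 512 entries per table. */
#define L1_TABLE_SHIFT 21

/* Index of the page within its l1 table. */
static unsigned l1_index(uint64_t addr)
{
   return (unsigned)((addr >> 12) & 0x1ff);
}

/* Exclusive end: first byte after the l1 table that covers addr. */
static uint64_t l1_end_exclusive(uint64_t addr)
{
   return (addr | ((1ULL << L1_TABLE_SHIFT) - 1)) + 1;
}

/* Inclusive end: last byte inside the l1 table that covers addr. */
static uint64_t l1_end_inclusive(uint64_t addr)
{
   return addr | ((1ULL << L1_TABLE_SHIFT) - 1);
}
```

For a table-aligned address, the index of the exclusive end wraps back to
0 and equals the start index, while the inclusive end gives index 511, so
the whole table is covered without a special case.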

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agointel/blorp: fix uninitialized variable warning
Caio Marcelo de Oliveira Filho [Mon, 16 Jul 2018 22:22:50 +0000 (15:22 -0700)]
intel/blorp: fix uninitialized variable warning

Compiler doesn't pick up that level and start_layer will be defined,
so do as was done for num_layers in 4d8b476fa9a "intel/blorp: Fix
compiler warning about num_layers." and always set it.

Fixes warning

../../src/mesa/drivers/dri/i965/brw_blorp.c: In function ‘brw_blorp_clear_depth_stencil’:
../../src/mesa/drivers/dri/i965/brw_blorp.c:1439:4: warning: ‘start_layer’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    blorp_clear_depth_stencil(&batch, &depth_surf, &stencil_surf,
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                              level, start_layer, num_layers,
                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                              x0, y0, x1, y1,
                              ~~~~~~~~~~~~~~~
                              (mask & BUFFER_BIT_DEPTH), ctx->Depth.Clear,
                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                              stencil_mask, ctx->Stencil.Clear);
                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../src/mesa/drivers/dri/i965/brw_blorp.c:1439:4: warning: ‘level’ may be used uninitialized in this function [-Wmaybe-uninitialized]

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
6 years agoutil/string_buffer: fix warning in tests
Caio Marcelo de Oliveira Filho [Mon, 16 Jul 2018 22:12:00 +0000 (15:12 -0700)]
util/string_buffer: fix warning in tests

Also specify the maximum size when writing to static buffers. The
warning below refers to the case where "str5" could be larger than
"str5 - str4", in which case the strcat would have overlapping dst and src.

Compiler doesn't pick up the bound from the snprintf above, so we make
clear the bounds of str5 by using strncat() instead of strcat().

../../src/util/tests/string_buffer/string_buffer_test.cpp: In member function ‘virtual void string_buffer_string_buffer_tests_Test::TestBody()’:
../../src/util/tests/string_buffer/string_buffer_test.cpp:106:10: warning: ‘char* strcat(char*, const char*)’ accessing 81 or more bytes at offsets 48 and 128 may overlap 1 byte at offset 128 [-Wrestrict]
    strcat(str4, str5);
    ~~~~~~^~~~~~~~~~~~
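
A minimal sketch of a bounded append with strncat(); the helper name and
its parameters are hypothetical and not taken from the test file.

```c
#include <string.h>

/* Appends src to the NUL-terminated string in dst, never writing more than
 * dst_size bytes in total (including the terminating NUL). */
static void bounded_append(char *dst, size_t dst_size, const char *src)
{
   size_t used = strlen(dst);
   if (used + 1 >= dst_size)
      return;                               /* no room for even one char */
   /* strncat copies at most n chars and then adds the NUL itself. */
   strncat(dst, src, dst_size - used - 1);
}
```

Because the bound is stated in the call, both the reader and the compiler
can see that the write stays inside dst.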

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
6 years agoi965/miptree: avoid uninitialized variable warnings
Caio Marcelo de Oliveira Filho [Mon, 16 Jul 2018 21:59:32 +0000 (14:59 -0700)]
i965/miptree: avoid uninitialized variable warnings

GCC 8.1.1 is having a hard time identifying that the values are
properly initialized when used. In the 'memset_value' case, we pass
the uninitialized value to another function (that will use it only if the
conditions match the initialization).

Just give enough hint to the compiler to figure things out. Fixes the
warnings

../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c: In function ‘intel_miptree_alloc_aux’:
../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1839:18: warning: ‘memset_value’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    mt->aux_buf = intel_alloc_aux_buffer(brw, &aux_surf, needs_memset,
                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                         memset_value);
                                         ~~~~~~~~~~~~~
../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1698:10: warning: ‘initial_state’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       if (wants_memset)
          ^
../../src/mesa/drivers/dri/i965/intel_mipmap_tree.c:1772:23: note: ‘initial_state’ was declared here
    enum isl_aux_state initial_state;
                       ^~~~~~~~~~~~~

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
6 years agointel/batch-decoder: fix uninitialized values warnings
Caio Marcelo de Oliveira Filho [Mon, 16 Jul 2018 21:17:38 +0000 (14:17 -0700)]
intel/batch-decoder: fix uninitialized values warnings

Code assumes that all the necessary fields will exist, but compiler
doesn't know about this. Provide zero as default values, like in other
decoding functions.

Fixes warnings

../../src/intel/common/gen_batch_decoder.c: In function ‘handle_media_interface_descriptor_load’:
../../src/intel/common/gen_batch_decoder.c:347:7: warning: ‘binding_entry_count’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       dump_binding_table(ctx, binding_table_offset, binding_entry_count);
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../src/intel/common/gen_batch_decoder.c:347:7: warning: ‘binding_table_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized]

../../src/intel/common/gen_batch_decoder.c:346:7: warning: ‘sampler_count’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       dump_samplers(ctx, sampler_offset, sampler_count);
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../src/intel/common/gen_batch_decoder.c:346:7: warning: ‘sampler_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized]

../../src/intel/common/gen_batch_decoder.c:343:7: warning: ‘ksp’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       ctx_disassemble_program(ctx, ksp, "compute shader");
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

../../src/intel/common/gen_batch_decoder.c: In function ‘decode_dynamic_state_pointers’:
../../src/intel/common/gen_batch_decoder.c:663:54: warning: ‘state_offset’ may be used uninitialized in this function [-Wmaybe-uninitialized]
    const uint32_t *state_map = ctx->dynamic_base.map + state_offset;
                                ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~

../../src/intel/common/gen_batch_decoder.c: In function ‘gen_print_batch’:
../../src/intel/common/gen_batch_decoder.c:856:13: warning: ‘next_batch.map’ may be used uninitialized in this function [-Wmaybe-uninitialized]
          if (next_batch.map == NULL) {
             ^
../../src/intel/common/gen_batch_decoder.c:860:13: warning: ‘next_batch.addr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
             gen_print_batch(ctx, next_batch.map, next_batch.size,
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                             next_batch.addr);
                             ~~~~~~~~~~~~~~~~
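
A minimal sketch of the pattern, assuming a decoder that scans named fields
and may not find the one it wants. The struct and the field name are
illustrative only, not the gen_batch_decoder types.

```c
#include <stdint.h>
#include <string.h>

struct field {               /* hypothetical decoded field */
   const char *name;
   uint64_t value;
};

static uint64_t sampler_count_from(const struct field *fields, size_t n)
{
   /* The zero default keeps the result defined when the field is absent,
    * which is also what lets the compiler prove it is initialized. */
   uint64_t sampler_count = 0;

   for (size_t i = 0; i < n; i++) {
      if (strcmp(fields[i].name, "Sampler Count") == 0)
         sampler_count = fields[i].value;
   }
   return sampler_count;
}
```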

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
6 years agointel/decoder: use snprintf(..., "%s", ...) instead of strncpy
Caio Marcelo de Oliveira Filho [Mon, 16 Jul 2018 21:06:09 +0000 (14:06 -0700)]
intel/decoder: use snprintf(..., "%s", ...) instead of strncpy

strncpy() doesn't guarantee a terminating NUL, so we would need to
set it ourselves. Just use snprintf() instead.

Fixes the warnings

../../src/intel/common/gen_decoder.c: In function ‘iter_decode_field’:
../../src/intel/common/gen_decoder.c:897:7: warning: ‘strncpy’ specified bound 128 equals destination size [-Wstringop-truncation]
       strncpy(iter->name, iter->field->name, sizeof(iter->name));
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘iter_advance_field’,
    inlined from ‘gen_field_iterator_next’ at ../../src/intel/common/gen_decoder.c:1015:9:
../../src/intel/common/gen_decoder.c:844:7: warning: ‘strncpy’ specified bound 128 equals destination size [-Wstringop-truncation]
       strncpy(iter->name, iter->field->name, sizeof(iter->name));
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
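
A minimal sketch of the replacement; copy_name() is a hypothetical wrapper,
not a function in gen_decoder.c.

```c
#include <stdio.h>

/* Copies src into dst, truncating if needed. Unlike strncpy(), snprintf()
 * always writes a terminating NUL as long as dst_size is non-zero. */
static void copy_name(char *dst, size_t dst_size, const char *src)
{
   snprintf(dst, dst_size, "%s", src);
}
```

With an 8-byte destination and a 10-character source, the result holds the
first 7 characters followed by NUL, where strncpy() with the same bound
would have left the buffer unterminated.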

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
6 years agoanv: give more room to debug report
Caio Marcelo de Oliveira Filho [Mon, 16 Jul 2018 20:56:23 +0000 (13:56 -0700)]
anv: give more room to debug report

The error buffer is limited to 256, but the report contains the
filename and possibly other data. So give it more space.

Avoids the warnings

../../src/intel/vulkan/anv_util.c: In function ‘__anv_perf_warn’:
../../src/intel/vulkan/anv_util.c:66:42: warning: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 254 [-Wformat-truncation=]
    snprintf(report, sizeof(report), "%s: %s", file, buffer);
                                          ^~         ~~~~~~
../../src/intel/vulkan/anv_util.c:66:4: note: ‘snprintf’ output 3 or more bytes (assuming 258) into a destination of size 256
    snprintf(report, sizeof(report), "%s: %s", file, buffer);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

../../src/intel/vulkan/anv_util.c: In function ‘__vk_errorf’:
../../src/intel/vulkan/anv_util.c:96:48: warning: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 252 [-Wformat-truncation=]
       snprintf(report, sizeof(report), "%s:%d: %s (%s)", file, line, buffer,
                                                ^~                    ~~~~~~
../../src/intel/vulkan/anv_util.c:96:7: note: ‘snprintf’ output 8 or more bytes (assuming 263) into a destination of size 256
       snprintf(report, sizeof(report), "%s:%d: %s (%s)", file, line, buffer,
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                error_str);
                ~~~~~~~~~~
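
For illustration, snprintf() returns the length the full output would have
needed, so truncation can be detected at run time. build_report() below is
a hypothetical helper, not the anv code.

```c
#include <stdbool.h>
#include <stdio.h>

/* Returns true when "file: message" fit into report without truncation. */
static bool build_report(char *report, size_t report_size,
                         const char *file, const char *message)
{
   int needed = snprintf(report, report_size, "%s: %s", file, message);
   return needed >= 0 && (size_t)needed < report_size;
}
```

A report buffer that is larger than the message buffer plus the prefix
makes the truncated case unreachable, which is what silences the warning.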

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
6 years agoanv: avoid warning when switching in VkStructureType
Caio Marcelo de Oliveira Filho [Mon, 16 Jul 2018 20:50:07 +0000 (13:50 -0700)]
anv: avoid warning when switching in VkStructureType

When one of the cases is not part of the enum, the compiler complains:

../../src/intel/vulkan/anv_formats.c: In function ‘anv_GetPhysicalDeviceFormatProperties2’:
../../src/intel/vulkan/anv_formats.c:728:7: warning: case value ‘1000001004’ not in enumerated type ‘VkStructureType’ {aka ‘enum VkStructureType’} [-Wswitch]
       case VK_STRUCTURE_TYPE_WSI_FORMAT_MODIFIER_PROPERTIES_LIST_MESA:
       ^~~~

Given the switch has a "default:" case, we don't lose anything by
switching on the unsigned value to avoid the warning.
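
A minimal sketch of the technique with a hypothetical enum standing in for
VkStructureType. It assumes the compiler picks an enum representation wide
enough for the private value, which is the default for GCC and Clang.

```c
enum structure_type {            /* hypothetical stand-in for VkStructureType */
   STRUCTURE_TYPE_A = 1,
   STRUCTURE_TYPE_B = 2,
};

/* Private value that is deliberately not a member of the enum. */
#define STRUCTURE_TYPE_PRIVATE 1000001004u

static const char *describe(enum structure_type type)
{
   /* Switching on the unsigned value rather than the enum means -Wswitch
    * no longer checks case labels against the enumerators. */
   switch ((unsigned)type) {
   case STRUCTURE_TYPE_A:
      return "a";
   case STRUCTURE_TYPE_PRIVATE:
      return "private";
   default:
      return "unknown";
   }
}
```

The trade-off is that the compiler also stops reporting enumerators that
are missing from the switch, which is acceptable when a default case
handles them anyway.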

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
6 years agoglsl: remove unnecessary parenthesis from macro
Caio Marcelo de Oliveira Filho [Mon, 16 Jul 2018 20:40:26 +0000 (13:40 -0700)]
glsl: remove unnecessary parenthesis from macro

The "__inst" will contain the name used for the variable of type
"__type *". Parentheses are not necessary as the name itself shouldn't
be an expression.

Fixes warning:

In file included from ../../src/mesa/main/mtypes.h:49,
                 from ../../src/intel/compiler/brw_compiler.h:30,
                 from ../../src/intel/compiler/brw_shader.h:29,
                 from ../../src/intel/compiler/brw_fs.h:31,
                 from ../../src/intel/compiler/brw_fs_cse.cpp:24:
../../src/intel/compiler/brw_fs_cse.cpp: In member function ‘bool fs_visitor::opt_cse_local(bblock_t*)’:
../../src/compiler/glsl/list.h:675:12: warning: unnecessary parentheses in declaration of ‘entry’ [-Wparentheses]
    __type *(__inst);                                      \
            ^
../../src/intel/compiler/brw_fs_cse.cpp:257:10: note: in expansion of macro ‘foreach_in_list_use_after’
          foreach_in_list_use_after(aeb_entry, entry, &aeb) {
          ^~~~~~~~~~~~~~~~~~~~~~~~~
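
A minimal sketch of the distinction, using a hypothetical list macro rather
than the one in list.h: a macro argument used as an expression keeps its
parentheses, while one used as the declared name does not need them.

```c
#include <stddef.h>

struct node {                /* hypothetical list node */
   int value;
   struct node *next;
};

/* "head" is an expression, so it stays parenthesized; "inst" is only ever a
 * variable name, so "type *inst" is written without parentheses. */
#define foreach_node(type, inst, head) \
   for (type *inst = (head); inst != NULL; inst = inst->next)

static int sum_values(struct node *head)
{
   int sum = 0;
   foreach_node(struct node, n, head)
      sum += n->value;
   return sum;
}
```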

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
6 years agointel/compiler: fix -Wsign-compare warning
Caio Marcelo de Oliveira Filho [Mon, 16 Jul 2018 20:32:36 +0000 (13:32 -0700)]
intel/compiler: fix -Wsign-compare warning

Explicitly convert to signed integer. Conversion is valid since it is the
same one (implicitly) used to initialize the loop. Avoids the warning:

../../src/intel/compiler/brw_fs.cpp: In member function ‘bool fs_visitor::lower_simd_width()’:
../../src/intel/compiler/brw_fs.cpp:5761:45: warning: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Wsign-compare]
             split_inst.eot = inst->eot && i == n - 1;
                                           ~~^~~~~~~~
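
A minimal sketch of the explicit conversion; is_last_split() is a
hypothetical helper and the cast placement is an assumption about the fix,
not a quote from brw_fs.cpp.

```c
#include <stdbool.h>

/* i is a signed loop counter, n an unsigned count. Without the cast, i
 * would be converted to unsigned for the comparison, which is what
 * -Wsign-compare reports. */
static bool is_last_split(int i, unsigned n)
{
   return i == (int)n - 1;
}
```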

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
6 years agointel/compiler: silence -Wclass-memaccess warnings
Caio Marcelo de Oliveira Filho [Mon, 16 Jul 2018 20:19:30 +0000 (13:19 -0700)]
intel/compiler: silence -Wclass-memaccess warnings

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
6 years agospirv: initialize is_vertex_input
Caio Marcelo de Oliveira Filho [Mon, 16 Jul 2018 19:36:29 +0000 (12:36 -0700)]
spirv: initialize is_vertex_input

Fixes warning:

../../src/compiler/spirv/vtn_variables.c: In function ‘var_decoration_cb’:
../../src/compiler/spirv/vtn_variables.c:1400:12: warning: ‘is_vertex_input’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       bool is_vertex_input;
            ^~~~~~~~~~~~~~~

The code used to set is_vertex_input in all possible codepaths, but
after 23edc5b1ef3 "spirv: translate default-block uniforms" the
compiler isn't sure all codepaths will initialize the variable.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
6 years agofreedreno/a5xx: performance counters
Rob Clark [Wed, 18 Jul 2018 13:42:29 +0000 (09:42 -0400)]
freedreno/a5xx: performance counters

AMD_performance_monitor support

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: batch query support (perfcounters)
Rob Clark [Wed, 18 Jul 2018 13:40:04 +0000 (09:40 -0400)]
freedreno: batch query support (perfcounters)

Core infrastructure for performance counters, using gallium's batch
query interface (to support AMD_performance_monitor).

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: batch query prep-work
Rob Clark [Thu, 28 Jun 2018 12:16:12 +0000 (08:16 -0400)]
freedreno: batch query prep-work

For batch queries we have N different query_types for one query, so
mapping a single query_type to a sample_provider doesn't really work
out.  Instead add a new constructor to construct a query directly
from a sample_provider.

Also, the sample buffer size needs to be determined at runtime, as
it depends on the number of query_types.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: rework accumulated query result vfunc
Rob Clark [Thu, 28 Jun 2018 12:14:10 +0000 (08:14 -0400)]
freedreno: rework accumulated query result vfunc

Take the query object, rather than the ctx.  The ctx ptr isn't hugely
useful, but for batch queries we will need the query object to properly
get the results.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: output ir3 and nir asm for frameretrace
Rob Clark [Mon, 9 Jul 2018 17:17:12 +0000 (13:17 -0400)]
freedreno/ir3: output ir3 and nir asm for frameretrace

See: https://github.com/janesma/apitrace/commit/298dc8195bf082fe1f47aa474e28411f85dd5393

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: redirectable ir3 disasm output
Rob Clark [Mon, 9 Jul 2018 16:36:10 +0000 (12:36 -0400)]
freedreno/ir3: redirectable ir3 disasm output

For now it still goes to stdout, but this will make it easier to support
output on stderr like what frameretrace expects.

(If we eventually have a proper GL extension for this, implementation
probably looks like dumping shader disasm to a tmp file and then dumping
that out over whatever mechanism is used.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: resync ir3 disassembler
Rob Clark [Mon, 9 Jul 2018 14:14:14 +0000 (10:14 -0400)]
freedreno/ir3: resync ir3 disassembler

Pull in latest updates from cffdump in envytools tree, so we can output
to destinations other than just stdout.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: register usage queries
Rob Clark [Mon, 25 Jun 2018 12:47:55 +0000 (08:47 -0400)]
freedreno: register usage queries

Avg number of (half) regs per draw, so we can correlate fps dips to
shader register usage.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agonir: add lowering for gl_HelperInvocation
Rob Clark [Fri, 1 Jun 2018 18:07:15 +0000 (14:07 -0400)]
nir: add lowering for gl_HelperInvocation

v2: reword comment about lower_helper_invocations to be more clear
    that it might not work on all hardware
v3: add special variant of load_sample_id which does not imply per-
    sample shading

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agomesa: don't double incr/decr ActiveCounters
Rob Clark [Mon, 2 Jul 2018 14:40:36 +0000 (10:40 -0400)]
mesa: don't double incr/decr ActiveCounters

Frameretrace ends up w/ excess calls to SelectPerfMonitorCountersAMD(),
which ends up re-enabling already enabled counters.  This causes
ActiveCounters[group] to be double incremented for the same counter,
which in turn causes BeginPerfMonitorAMD() to fail.

The AMD_performance_monitor spec doesn't say that an error should be
generated in this case.  So I think the safe thing to do is just to
safeguard against excess increments/decrements.
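
A minimal sketch of such a guard, assuming a single group with at most 32
counters tracked in a bitmask. The struct and function are hypothetical and
do not mirror the Mesa data structures.

```c
#include <stdbool.h>
#include <stdint.h>

struct perf_monitor {        /* hypothetical: one group, counters 0..31 */
   uint32_t active_mask;     /* bit set for every enabled counter */
   unsigned active_count;    /* number of bits set in active_mask */
};

static void select_counter(struct perf_monitor *m, unsigned counter, bool enable)
{
   uint32_t bit = UINT32_C(1) << counter;
   bool was_enabled = (m->active_mask & bit) != 0;

   if (enable == was_enabled)
      return;                /* redundant request: leave the count untouched */

   if (enable) {
      m->active_mask |= bit;
      m->active_count++;
   } else {
      m->active_mask &= ~bit;
      m->active_count--;
   }
}
```

Enabling the same counter twice, or disabling it twice, changes the count
only once, so the count always matches the set of enabled counters.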

Signed-off-by: Rob Clark <robdclark@gmail.com>