mesa.git
5 years agonir/phi_builder: Internal users should use nir_phi_builder_value_set_block_def too
Ian Romanick [Tue, 30 Oct 2018 16:46:26 +0000 (09:46 -0700)]
nir/phi_builder: Internal users should use nir_phi_builder_value_set_block_def too

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoetnaviv: drop redundant ctx function parameter
Christian Gmeiner [Wed, 12 Dec 2018 13:45:56 +0000 (14:45 +0100)]
etnaviv: drop redundant ctx function parameter

There is no need to have an extra ctx paramter as all the other
parameters carry all the needed information.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
5 years agogenxml: Consistently use a numeric "MOCS" field
Kenneth Graunke [Tue, 11 Dec 2018 08:34:11 +0000 (00:34 -0800)]
genxml: Consistently use a numeric "MOCS" field

When we first started using genxml, we decided to represent MOCS as an
actual structure, and pack values.  However, in many places, it was more
convenient to use a numeric value rather than treating it as a struct,
so we added secondary setters in a bunch of places as well.

We were not entirely consistent, either.  Some places only had one.
Gen6 had both kinds of setters for STATE_BASE_ADDRESS, but newer gens
only had the struct-based setters.  The names were sometimes "Constant
Buffer Object Control State" instead of "Memory", making it harder to
find.  Many had prefixes like "Vertex Buffer MOCS"...in a vertex buffer
packet...which is a bit redundant.

On modern hardware, MOCS is simply an index into a table, but we were
still carrying around the structure with an "Index to MOCS Table" field,
in addition to the direct numeric setters.  This is clunky - we really
just want a number on new hardware.

This patch eliminates the struct-based setters, and makes the numeric
setters be consistently called "MOCS".  We leave the struct definition
around on Gen7-8 for reference purposes, but it is unused.

v2: Drop bonus "Depth Buffer MOCS" fields on Gen7.5 and Gen9

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years agonir: fix opt_if_loop_last_continue()
Timothy Arceri [Thu, 13 Dec 2018 23:23:27 +0000 (10:23 +1100)]
nir: fix opt_if_loop_last_continue()

The pass did not correctly handle loops ending in:

if ssa_7 {
block block_8:
/* preds: block_7 */
continue
/* succs: block_1 */
} else {
block block_9:
/* preds: block_7 */
break
/* succs: block_11 */
}

The break will get eliminated by another opt but if this pass gets
called first (as it does on RADV) we ended up inserting
instructions after the break.

Fixes: 5921a19d4b0c ("nir: add if opt opt_if_loop_last_continue()")
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agofreedreno/a6xx: fix resource_copy_region()
Rob Clark [Thu, 13 Dec 2018 14:15:33 +0000 (09:15 -0500)]
freedreno/a6xx: fix resource_copy_region()

pctx->resource_copy_region() needs to fall back to sw copy for
non-renderable formats.  But previously for things that we could
not use the blitter for, would fall back to 3d.  Which won't work
if 3d can't render to the dst format either.

Instead rework things to fallback to fd_resource_copy_region(),
which will try 3d core and then fall back to memcpy().

Fixes (for example) dEQP-GLES3.functional.texture.format.sized.2d.rgb9_e5_pot

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno: move fd_resource_copy_region()
Rob Clark [Thu, 13 Dec 2018 14:14:48 +0000 (09:14 -0500)]
freedreno: move fd_resource_copy_region()

Code-motion prep for next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno/a6xx: more blitter fixes
Rob Clark [Tue, 11 Dec 2018 17:36:52 +0000 (12:36 -0500)]
freedreno/a6xx: more blitter fixes

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno: update generated headers
Rob Clark [Tue, 11 Dec 2018 15:59:53 +0000 (10:59 -0500)]
freedreno: update generated headers

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years agogallium/aux: add is_unorm() helper
Rob Clark [Tue, 11 Dec 2018 16:13:49 +0000 (11:13 -0500)]
gallium/aux: add is_unorm() helper

We already had one for is_snorm() but not unorm.

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno/a6xx: fix blitter crash
Rob Clark [Mon, 10 Dec 2018 15:40:31 +0000 (10:40 -0500)]
freedreno/a6xx: fix blitter crash

Fixes a crash with unsupported formats in dEQP-GLES3.functional.texture.format.sized.2d.rgb9_e5_pot

Also fixes gpu hangs with some formats that are supported, but which we
don't know what internal-format to use for the blitter, for ex
dEQP-GLES3.functional.texture.format.sized.2d_array.rgb10_a2_pot

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno/ir3: don't remove unused input components
Rob Clark [Thu, 13 Dec 2018 18:50:50 +0000 (13:50 -0500)]
freedreno/ir3: don't remove unused input components

Fixes: 0d240c22141 freedreno/ir3: don't fetch unused tex components
Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno/ir3: fix crash
Rob Clark [Mon, 10 Dec 2018 15:39:28 +0000 (10:39 -0500)]
freedreno/ir3: fix crash

Fixes a crash in dEQP-GLES3.functional.shaders.fragdepth.compare.fragcoord_z

Fixes: 0d240c22141 freedreno/ir3: don't fetch unused tex components
Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno: also set DUMP flag on shaders
Rob Clark [Tue, 4 Dec 2018 13:07:50 +0000 (08:07 -0500)]
freedreno: also set DUMP flag on shaders

If we emit shader as a pointer to a GEM object, also set the RELOC_DUMP
flag as a hint to kernel that this is a useful buffer to snapshot for
debug dumps.

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno: debug GEM obj names
Rob Clark [Fri, 30 Nov 2018 13:29:51 +0000 (08:29 -0500)]
freedreno: debug GEM obj names

With a recent enough kernel, set debug names for GEM BOs, which will
show up in $debugfs/gem

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years agofreedreno/drm: sync uapi and enable softpin
Rob Clark [Wed, 28 Nov 2018 13:50:19 +0000 (08:50 -0500)]
freedreno/drm: sync uapi and enable softpin

Pull in updated UAPI and use kernel API version to enable softpin.
Since MSM_SUBMIT_BO_DUMP flag was added at same time, use that to
signal to kernel that cmdstream buffers are useful to dump for
debugging/cmdstream-traces.

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years agonir: Move intel's half-float image store lowering to to nir_format.h.
Eric Anholt [Tue, 11 Dec 2018 21:49:28 +0000 (13:49 -0800)]
nir: Move intel's half-float image store lowering to to nir_format.h.

I needed the same function for v3d.  This was originally in d3e046e76c06
("nir: Pull some of intel's image load/store format conversion to
nir_format.h") before we made am istake about simplifying the function.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoRevert "intel: Simplify the half-float packing in image load/store lowering."
Eric Anholt [Thu, 13 Dec 2018 19:25:08 +0000 (11:25 -0800)]
Revert "intel: Simplify the half-float packing in image load/store lowering."

This reverts commit 06fbcd2cd5cc5702c9039c26d20082a99bc157bf.
nir_pack_half_2x16_split *isn't* vectorizable, it's 1-component only, thus
why we had this split-scalar code in the first place.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: Print the format of image variables.
Eric Anholt [Thu, 13 Dec 2018 19:15:07 +0000 (11:15 -0800)]
nir: Print the format of image variables.

This helps a lot when debugging image load/store lowering on large
testcases.  Unfortunately the Mesa enum name stuff is under src/mesa and
we can't get at it from the compiler.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agomesa/st: Expose compute shaders when NIR support is advertised.
Eric Anholt [Thu, 6 Dec 2018 00:08:12 +0000 (16:08 -0800)]
mesa/st: Expose compute shaders when NIR support is advertised.

We have a NIR path, and V3D doesn't have TGSI input for compute (only what
TTN can handle for the various gallium-internal shaders).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
5 years agoradv/xfb: fix counter buffer bounds checks.
Dave Airlie [Thu, 13 Dec 2018 03:29:04 +0000 (03:29 +0000)]
radv/xfb: fix counter buffer bounds checks.

If we gave this function 0 counter buffers, we'd still try and
access pCounterBuffers[0] as this check was incorrect.

Fixes crash with ext_transform_feedback-pipeline-basic-primgen
on zink on radv.

Fixes: 677b496b6 (radv: fix begin/end transform feedback with 0 counter buffers.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoi965: Enable nir_opt_idiv_const for 32 and 64-bit integers
Jason Ekstrand [Fri, 29 Dec 2017 03:53:36 +0000 (19:53 -0800)]
i965: Enable nir_opt_idiv_const for 32 and 64-bit integers

The pass should work for all bit sizes but it's less clear that the
extra instructions are worth it on small integers.  Also, the hardware
doesn't do mul_high on anything other than 32-bit integers and, absent
any decent mechanism for testing the pass on 8 and 16-bit types, it's
probably best to just leave it disabled for now.

Shader-db results on Sky Lake:

    total instructions in shared programs: 15105795 -> 15111403 (0.04%)
    instructions in affected programs: 72774 -> 78382 (7.71%)
    helped: 0
    HURT: 265

Note that hurt here actually means helped because we're getting rid of
integer quotient operations (which are a send on some platforms!) and
replacing them with fairly cheap ALU ops.

Reviewed-by: Ian Romanick ian.d.romanick@intel.com
5 years agoi965/vec4: Implement nir_op_uadd_sat
Jason Ekstrand [Mon, 8 Oct 2018 22:33:10 +0000 (17:33 -0500)]
i965/vec4: Implement nir_op_uadd_sat

Reviewed-by: Ian Romanick ian.d.romanick@intel.com
5 years agoi965/fs: Implement nir_op_uadd_sat
Ian Romanick [Sat, 6 Oct 2018 02:04:47 +0000 (21:04 -0500)]
i965/fs: Implement nir_op_uadd_sat

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: Add a pass for lowering integer division by constants
Jason Ekstrand [Thu, 28 Dec 2017 21:06:28 +0000 (13:06 -0800)]
nir: Add a pass for lowering integer division by constants

It's a reasonably well-known fact in the world of compilers that integer
divisions by constants can be replaced by a multiply, an add, and some
shifts.  This commit adds such an optimization to NIR for easiest case
of udiv.  Other division operations will be added in following commits.
In order to provide some additional driver control, the pass takes a
minimum bit size to optimize.

Reviewed-by: Ian Romanick ian.d.romanick@intel.com
5 years agonir: Add a saturated unsigned integer add opcode
Ian Romanick [Sat, 6 Oct 2018 01:22:41 +0000 (20:22 -0500)]
nir: Add a saturated unsigned integer add opcode

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir/lower_int64: Add support for [iu]mul_high
Jason Ekstrand [Fri, 29 Dec 2017 22:38:55 +0000 (14:38 -0800)]
nir/lower_int64: Add support for [iu]mul_high

Reviewed-by: Ian Romanick ian.d.romanick@intel.com
5 years agonir: Allow [iu]mul_high on non-32-bit types
Jason Ekstrand [Fri, 29 Dec 2017 21:49:43 +0000 (13:49 -0800)]
nir: Allow [iu]mul_high on non-32-bit types

Reviewed-by: Ian Romanick ian.d.romanick@intel.com
5 years agoglx: mandate xf86vidmode only for "drm" dri platforms
Emil Velikov [Tue, 11 Dec 2018 16:20:40 +0000 (16:20 +0000)]
glx: mandate xf86vidmode only for "drm" dri platforms

Currently we have the three dri "platforms" - drm, apple and windows.

Since xf86vidmode is a thing only for the drm one, adjust the
preprocessor guards and correctly check for the dependency.

v2: terminate the GLX_USE_WINDOWSGL hunk

Cc: Jon TURNEY <jon.turney@dronecode.org.uk>
Fixes: 5bc509363b6 ("glx: make xf86vidmode mandatory for direct rendering")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agonir: remove unused variable
Alejandro Piñeiro [Thu, 13 Dec 2018 14:25:57 +0000 (15:25 +0100)]
nir: remove unused variable

To avoid the following warning:
./src/compiler/nir/nir_loop_analyze.c:807:16: warning: unused variable ‘ns’ [-Wunused-variable]
    nir_shader *ns = impl->function->shader;
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agovirgl: work around bad assumptions in virglrenderer
Erik Faye-Lund [Tue, 11 Dec 2018 14:02:53 +0000 (14:02 +0000)]
virgl: work around bad assumptions in virglrenderer

Virglrenderer does the wrong thing when given an instance divisor;
it tries to use the element-index rather than the binding-index as
the argument to glVertexBindingDivisor(). This worked fine as long
as there was a 1:1 relationship between elements and bindings,
which was the case util 19a91841c34 "st/mesa: Use Array._DrawVAO in
st_atom_array.c.".

So let's detect instance divisors, and restore a 1:1 relationship in
that case. This will make old versions of virglrenderer behave
correctly. For newer versions, we can consider making a better
interface, where the instance divisor isn't specified per element,
but rather per binding. But let's save that for another day.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: 19a91841c34 "st/mesa: Use Array._DrawVAO in st_atom_array.c."
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
5 years agovirgl: wrap vertex element state in a struct
Erik Faye-Lund [Tue, 11 Dec 2018 12:34:38 +0000 (12:34 +0000)]
virgl: wrap vertex element state in a struct

This just has one member for now; the handle. But this is about to
change.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
5 years agovirgl: simplify virgl_hw_set_index_buffer
Erik Faye-Lund [Tue, 11 Dec 2018 11:16:47 +0000 (11:16 +0000)]
virgl: simplify virgl_hw_set_index_buffer

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
5 years agovirgl: simplify virgl_hw_set_vertex_buffers
Erik Faye-Lund [Tue, 11 Dec 2018 11:16:35 +0000 (11:16 +0000)]
virgl: simplify virgl_hw_set_vertex_buffers

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-By: Gert Wollny <gert.wollny@collabora.com>
5 years agodocs: update calendar, add news item and link release notes for 18.2.7
Juan A. Suarez Romero [Thu, 13 Dec 2018 14:45:20 +0000 (15:45 +0100)]
docs: update calendar, add news item and link release notes for 18.2.7

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
5 years agodocs: add sha256 checksums for 18.2.7
Juan A. Suarez Romero [Thu, 13 Dec 2018 14:42:26 +0000 (15:42 +0100)]
docs: add sha256 checksums for 18.2.7

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit e90429cc6dc5b5ada4253fc0a8517645d86a4f6c)

5 years agodocs: add release notes for 18.2.7
Juan A. Suarez Romero [Thu, 13 Dec 2018 13:58:30 +0000 (14:58 +0100)]
docs: add release notes for 18.2.7

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 419ee20097597bed77e73fd283e6d15e8dcb89e9)

5 years agoradv: don't check if format is depth in radv_image_can_enable_hile()
Samuel Pitoiset [Wed, 12 Dec 2018 13:15:53 +0000 (14:15 +0100)]
radv: don't check if format is depth in radv_image_can_enable_hile()

This is always TRUE if htile_size is not 0.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: check if addrlib enabled HTILE in radv_image_can_enable_htile()
Samuel Pitoiset [Wed, 12 Dec 2018 13:15:52 +0000 (14:15 +0100)]
radv: check if addrlib enabled HTILE in radv_image_can_enable_htile()

When hile_size is 0, we can't enable HTILE. This doesn't change
anything, except not calling radv_image_alloc_htile().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: switch on EOP when primitive restart is enabled with triangle strips
Samuel Pitoiset [Tue, 9 Oct 2018 12:15:12 +0000 (14:15 +0200)]
radv: switch on EOP when primitive restart is enabled with triangle strips

Otherwise, Yakuza hangs the GPU with DXVK. We don't know if
linetrip and pointlist are affected, so my point is to do that
only for triangle strips.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: allow to skip DCC decompressions with the new predicate
Samuel Pitoiset [Mon, 10 Dec 2018 12:00:33 +0000 (13:00 +0100)]
radv: allow to skip DCC decompressions with the new predicate

Feral games aren't affected because they don't decompress DCC.
F1 2018 has one DCC decompression per frame, but I don't see
any performance improvements. This new predicate will be
probably more useful for DCC/MSAA.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: add a predicate for reflecting DCC decompression state
Samuel Pitoiset [Mon, 10 Dec 2018 11:57:34 +0000 (12:57 +0100)]
radv: add a predicate for reflecting DCC decompression state

It's somehow similar to the FCE predicate.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoi965/compute: Emit GPGPU_WALKER in genX_state_upload
Jordan Justen [Mon, 12 Nov 2018 02:01:56 +0000 (18:01 -0800)]
i965/compute: Emit GPGPU_WALKER in genX_state_upload

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoi965/genX_state: Add register access functions
Jordan Justen [Mon, 12 Nov 2018 01:46:33 +0000 (17:46 -0800)]
i965/genX_state: Add register access functions

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agointel: Simplify the half-float packing in image load/store lowering.
Eric Anholt [Wed, 12 Dec 2018 19:29:29 +0000 (11:29 -0800)]
intel: Simplify the half-float packing in image load/store lowering.

This was noted by Jason in review when I tried to make a helper for the
old path.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: Pull some of intel's image load/store format conversion to nir_format.h
Eric Anholt [Tue, 11 Dec 2018 21:49:28 +0000 (13:49 -0800)]
nir: Pull some of intel's image load/store format conversion to nir_format.h

I needed the same functions for v3d.  Note that the color value in the
Intel lowering has already been cut down to image.chans num_components.

v2: Drop the half float one, since it was a 1-liner after cleanup.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: Add some more consts to the nir_format_convert.h helpers.
Eric Anholt [Tue, 11 Dec 2018 21:40:54 +0000 (13:40 -0800)]
nir: Add some more consts to the nir_format_convert.h helpers.

Most of the bits were constant, but a few were missed.  Avoids warnings
from v3d's upcoming static const bits declarations.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: detect more induction variables
Timothy Arceri [Mon, 26 Nov 2018 01:05:00 +0000 (12:05 +1100)]
nir: detect more induction variables

This allows loop analysis to detect inductions variables that
are incremented in both branches of an if rather than in a main
loop block. For example:

   loop {
      block block_1:
      /* preds: block_0 block_7 */
      vec1 32 ssa_8 = phi block_0: ssa_4, block_7: ssa_20
      vec1 32 ssa_9 = phi block_0: ssa_0, block_7: ssa_4
      vec1 32 ssa_10 = phi block_0: ssa_1, block_7: ssa_4
      vec1 32 ssa_11 = phi block_0: ssa_2, block_7: ssa_21
      vec1 32 ssa_12 = phi block_0: ssa_3, block_7: ssa_22
      vec4 32 ssa_13 = vec4 ssa_12, ssa_11, ssa_10, ssa_9
      vec1 32 ssa_14 = ige ssa_8, ssa_5
      /* succs: block_2 block_3 */
      if ssa_14 {
         block block_2:
         /* preds: block_1 */
         break
         /* succs: block_8 */
      } else {
         block block_3:
         /* preds: block_1 */
         /* succs: block_4 */
      }
      block block_4:
      /* preds: block_3 */
      vec1 32 ssa_15 = ilt ssa_6, ssa_8
      /* succs: block_5 block_6 */
      if ssa_15 {
         block block_5:
         /* preds: block_4 */
         vec1 32 ssa_16 = iadd ssa_8, ssa_7
         vec1 32 ssa_17 = load_const (0x3f800000 /* 1.000000*/)
         /* succs: block_7 */
      } else {
         block block_6:
         /* preds: block_4 */
         vec1 32 ssa_18 = iadd ssa_8, ssa_7
         vec1 32 ssa_19 = load_const (0x3f800000 /* 1.000000*/)
         /* succs: block_7 */
      }
      block block_7:
      /* preds: block_5 block_6 */
      vec1 32 ssa_20 = phi block_5: ssa_16, block_6: ssa_18
      vec1 32 ssa_21 = phi block_5: ssa_17, block_6: ssa_4
      vec1 32 ssa_22 = phi block_5: ssa_4, block_6: ssa_19
      /* succs: block_1 */
   }

Unfortunatly GCM could move the addition out of the if for us
(making this patch unrequired) but we still cannot enable the GCM
pass without regressions.

This unrolls a loop in Rise of The Tomb Raider.

vkpipeline-db results (VEGA):

Totals from affected shaders:
SGPRS: 88 -> 96 (9.09 %)
VGPRS: 56 -> 52 (-7.14 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 2168 -> 4560 (110.33 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 4 -> 4 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211

5 years agonir: reword code comment
Timothy Arceri [Mon, 26 Nov 2018 01:04:35 +0000 (12:04 +1100)]
nir: reword code comment

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
5 years agonir: in loop analysis track actual control flow type
Timothy Arceri [Sun, 25 Nov 2018 23:14:28 +0000 (10:14 +1100)]
nir: in loop analysis track actual control flow type

This will allow us to improve analysis to find more induction
variables.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
5 years agonir: add if opt opt_if_loop_last_continue()
Danylo Piliaiev [Sun, 25 Nov 2018 23:59:52 +0000 (10:59 +1100)]
nir: add if opt opt_if_loop_last_continue()

Removing the last continue can allow more loops to unroll. Also
inserting code into the if branch can allow the various if opts
to progress further.

The insertion of some loops into the if branch also reduces VGPR
use in some shaders.

vkpipeline-db results (VEGA):

Totals from affected shaders:
SGPRS: 6552 -> 6576 (0.37 %)
VGPRS: 6544 -> 6532 (-0.18 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 481952 -> 478032 (-0.81 %) bytes
LDS: 13 -> 13 (0.00 %) blocks
Max Waves: 241 -> 242 (0.41 %)
Wait states: 0 -> 0 (0.00 %)

Shader-db results radeonsi (VEGA):

Totals from affected shaders:
SGPRS: 168 -> 168 (0.00 %)
VGPRS: 144 -> 140 (-2.78 %)
Spilled SGPRs: 157 -> 157 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 8524 -> 8488 (-0.42 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 7 -> 7 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

v2: (Timothy Arceri):
- allow for continues in either branch
- move any trailing loops inside the if as well as blocks.
- leave nir_opt_trivial_continues() to actually remove the
  continue.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=32211

5 years agonir: rework force_unroll_array_access()
Timothy Arceri [Thu, 15 Nov 2018 10:28:31 +0000 (21:28 +1100)]
nir: rework force_unroll_array_access()

Here we rework force_unroll_array_access() so that we can reuse
the induction variable detection in a following patch.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
5 years agonir: factor out some of the complex loop unroll code to a helper
Timothy Arceri [Thu, 15 Nov 2018 08:51:20 +0000 (19:51 +1100)]
nir: factor out some of the complex loop unroll code to a helper

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
5 years agodocs: Document GitLab merge request process (email alternative)
Jordan Justen [Tue, 27 Nov 2018 23:39:10 +0000 (15:39 -0800)]
docs: Document GitLab merge request process (email alternative)

This documents a process for using GitLab Merge Requests as an second
way to submit code changes for Mesa. Only one of the two methods is
allowed for each patch series.

We will *not* require all patches to be emailed. Some code changes may
be reviewed and merged without any discussion on the mesa-dev email
list.

v2:
 * No longer require email. Allow submitter to choose email or a
   GitLab merge request.
 * Various feedback from Brian, Daniel, Dylan, Eric, Erik, Jason,
   Matt, Michel and Rob.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Rob Clark <robdclark@gmail.com>
5 years agomeson: libfreedreno depends upon libdrm (for fence support)
Rhys Kidd [Wed, 12 Dec 2018 07:45:18 +0000 (02:45 -0500)]
meson: libfreedreno depends upon libdrm (for fence support)

Error message building freedreno Gallium driver with meson:

  ../src/gallium/drivers/freedreno/freedreno_fence.c:27:21: fatal error: libsync.h: No such file or directory
   \#include <libsync.h>

Fixes: 4aa69cc4257 ("meson: build freedreno")
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agonir: Document the function inlining process
Jason Ekstrand [Mon, 29 Oct 2018 17:00:40 +0000 (12:00 -0500)]
nir: Document the function inlining process

This has thrown a few people off recently and it's good to have the
process and all the rational for it documented somewhere.  A comment at
the top of nir_inline_functions seems as good a place as any.

Acked-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/blorp: Assert that we don't re-layout a compressed surface
Jason Ekstrand [Sat, 10 Feb 2018 18:52:51 +0000 (10:52 -0800)]
intel/blorp: Assert that we don't re-layout a compressed surface

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv/pipeline: Set the correct binding count for compute shaders
Jason Ekstrand [Tue, 13 Feb 2018 03:34:48 +0000 (19:34 -0800)]
anv/pipeline: Set the correct binding count for compute shaders

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoradv: bump reported version to 1.1.90
Samuel Pitoiset [Tue, 11 Dec 2018 12:53:05 +0000 (13:53 +0100)]
radv: bump reported version to 1.1.90

After going through the spec changelog, it looks like RADV
is up to date. Note that ANV also reports 1.1.90.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agovirgl: force linear texturing support
Erik Faye-Lund [Mon, 10 Dec 2018 14:57:32 +0000 (14:57 +0000)]
virgl: force linear texturing support

When I made sure that half-float texture-filtering was required for ES3,
I didn't realize that virgl doesn't report support for this correctly.
This regressed the GLES version available on top of several drivers,
including i965 from 3.2 to 2.0.

This is going to need protocol changes to fix properly, so let's just
restore the previous behavior by enabling floating-point filtering
unconditionally for now.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: fcf9fcee3c8 "mesa/main: do not require float-texture filtering for es3"
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years agointel/compiler: do not copy-propagate strided regions to ddx/ddy arguments
Iago Toral Quiroga [Mon, 28 May 2018 11:03:24 +0000 (13:03 +0200)]
intel/compiler: do not copy-propagate strided regions to ddx/ddy arguments

The implementation of these opcodes in the generator assumes that their
arguments are packed, and it generates register regions based on that
assumption.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoanv: Advertise support for MinLod on Skylake+
Jason Ekstrand [Wed, 3 Oct 2018 03:04:09 +0000 (22:04 -0500)]
anv: Advertise support for MinLod on Skylake+

These are usually used for dealing with sparse resources but there's no
reason why we can't hook them up before we have sparse.  We have the
hardware; let's light it up.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agointel/fs: Support min_lod parameters on texture instructions
Jason Ekstrand [Thu, 11 Oct 2018 20:57:50 +0000 (15:57 -0500)]
intel/fs: Support min_lod parameters on texture instructions

We have to lower some shadow instructions because they don't exist in
hardware and we have to lower txb+offset+clamp because the message gets
too big and we run into the sampler message length limit of 11 regs.

Acked-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agonir/lower_tex: Add lowering for some min_lod cases
Jason Ekstrand [Thu, 11 Oct 2018 19:14:29 +0000 (14:14 -0500)]
nir/lower_tex: Add lowering for some min_lod cases

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agonir/lower_tex: Modify txd instructions instead of replacing them
Jason Ekstrand [Thu, 11 Oct 2018 19:27:26 +0000 (14:27 -0500)]
nir/lower_tex: Modify txd instructions instead of replacing them

I don't know if one is better than the other or not but this approach
has the advantage that we never forget to copy information over and
we're not hard-coding quite as many assumptions.  It's also a lot
simpler and much less code.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agonir/lower_tex: Simplify lower_gradient logic
Jason Ekstrand [Thu, 11 Oct 2018 17:56:21 +0000 (12:56 -0500)]
nir/lower_tex: Simplify lower_gradient logic

Instead of having to call two different lower_gradient functions based
on whether or not it's a cube, just make lower_gradient handle cubes.
This significantly simplifies some of the logic.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agospirv: Add support for MinLod
Jason Ekstrand [Wed, 3 Oct 2018 02:15:47 +0000 (21:15 -0500)]
spirv: Add support for MinLod

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agointel/ir: Don't allow allocating zero registers
Jason Ekstrand [Thu, 11 Oct 2018 18:52:45 +0000 (13:52 -0500)]
intel/ir: Don't allow allocating zero registers

This simple check helps catch bugs early that can end up propagating
into later stages of the compile and triggering strange asserts.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agogallivm: remove unused float coord wrapping for aos sampling
Roland Scheidegger [Fri, 7 Dec 2018 01:28:01 +0000 (02:28 +0100)]
gallivm: remove unused float coord wrapping for aos sampling

AoS sampling tries to use integers for coord wrapping when possible,
as it should be faster. However, for AVX, this was suboptimal, because
only floats can use 8x32bit vectors, whereas integers have to be split
into 4x32bit vectors. (I believe part of why it was slower was also
that at least earlier llvm versions had trouble optimizing it properly,
since you can still do simple bit ops with 8x32bit vectors, so a
sequence of int add / and / int add / and with such vectors would
actually end up doing 128bit inserts/extracts between the operations
instead of just doing the cheap 128bit ands.)
Hence, a special float coord wrapping path was added to AoS sampling.
But this path was actually disabled for a long time already, since we
found that just splitting everything before entering the AoS path was
still sligthly faster usually, so none of this float coord wrapping
code was used anymore (AoS sampling code, when avx2 isn't supported,
never sees vectors with length > 4). I thought it might be useful some
day again, but I'm not interested anymore in optimizing for very weird
instruction sets which have support for 256bit vectors for floats but
not for ints, so just drop it.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
5 years agodocs: update calendar, add news item and link release notes for 18.3.1
Emil Velikov [Tue, 11 Dec 2018 21:25:18 +0000 (21:25 +0000)]
docs: update calendar, add news item and link release notes for 18.3.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
5 years agodocs: add sha256 checksums for 18.3.1
Emil Velikov [Tue, 11 Dec 2018 21:19:03 +0000 (21:19 +0000)]
docs: add sha256 checksums for 18.3.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
5 years agodocs: add release notes for 18.3.1
Emil Velikov [Tue, 11 Dec 2018 21:12:55 +0000 (21:12 +0000)]
docs: add release notes for 18.3.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
5 years agofreedreno: Add .dir-locals to the common directory
Neil Roberts [Sun, 2 Dec 2018 17:22:11 +0000 (18:22 +0100)]
freedreno: Add .dir-locals to the common directory

The commit aa0fed10d35 moved a bunch of Freedreno code to a common
directory. The previous directory had a .dir-locals file for Emacs.
This patch copies it to the new directory as well.

Reviewed-by: Kristian H. Kristensen <hoegsberg@chromium.org>
5 years agomesa/st/nir: fix missing nir_compact_varyings
Rob Clark [Sat, 8 Dec 2018 18:21:52 +0000 (13:21 -0500)]
mesa/st/nir: fix missing nir_compact_varyings

LinkedTransformFeedback is normally populated, which had nerf'd varying
packing since the check was introduced.

Fixes: dbd52585fa9 st/nir: Disable varying packing when doing transform feedback.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agonir: fix spelling typo
Rob Clark [Sat, 8 Dec 2018 18:19:51 +0000 (13:19 -0500)]
nir: fix spelling typo

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoanv,radv: Disable VK_EXT_pci_bus_info
Jason Ekstrand [Mon, 10 Dec 2018 16:57:35 +0000 (10:57 -0600)]
anv,radv: Disable VK_EXT_pci_bus_info

The Vulkan working group recently discovered that we made a mistake in
assuming that PCI domains are 16-bit even though they can potentially be
32-bit values.  To fix this, the next spec update will change the types
in the VK_EXT_pci_bus_info struct to be 32 bits which will be a
backwards-incompatible change.  Normally, Khronos tries very hard to
never make backwards incompatible changes to specs.  Hopefully, the
extension is new enough (2 months) that there are no shipping apps which
use the extension so this should be safe.

This commit disables the extension for both anv and radv in mesa and
should be back-ported to 18.3 ASAP so we avoid any potential issues with
new apps running on old drivers.  I'll send out a commit (which we can
also back-port to 18.3 if we really care) to re-enable the extension in
both drivers once this week's spec update ships.  The one known use of
this extension is internal to mesa and will continue working with the
extension disabled and will naturally update when we get a new header.

Cc: "18.3" <mesa-stable@lists.freedesktop.org>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agodocs: extends 18.2 lifecycle
Juan A. Suarez Romero [Mon, 10 Dec 2018 12:22:59 +0000 (13:22 +0100)]
docs: extends 18.2 lifecycle

As 18.3 was published with some delay, let's extend 18.2 life for
another extra release.

CC: Andres Gomez <agomez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
5 years agoglapi: fixup EXT_multisampled_render_to_texture dispatch
Kristian H. Kristensen [Mon, 10 Dec 2018 18:14:34 +0000 (18:14 +0000)]
glapi: fixup EXT_multisampled_render_to_texture dispatch

There's a few missing and convoluted bits:

 - FramebufferTexture2DMultisampleEXT
Missing sanity check, should be desktop="false"

 - RenderbufferStorageMultisampleEXT
Missing sanity check, is aliased to RenderbufferStorageMultisample.
Thus it's set only when desktop GL or GLES2 v3.0+, while the extension
is GLES2 2.0+.

If we flip the aliasing we'll break indirect GLX, so loosen the version
to 2.0. Not perfect, yet this is the most sane thing I could think of.

v2: [Emil] Fixup RenderbufferStorageMultisampleEXT, commmit message

Cc: Kristian H. Kristensen <hoegsberg@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108974
Fixes: 1b331ae505e ("mesa: Add core support for EXT_multisampled_render_to_texture{,2}")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
5 years agofreedreno: Fix the Makefile.am fix
Kristian H. Kristensen [Mon, 10 Dec 2018 22:22:47 +0000 (14:22 -0800)]
freedreno: Fix the Makefile.am fix

Commit b028ce29f090938d12b0999fe4b0e712d2adc431 fixed a typo in
src/freedreno/Makefile.am, but ended up breaking the build for
freedreno.  The typo inadvertently made things work, as we were not
supposed to link with libnir or libmesautil to begin with.  Those come
in through libmesagallium and the typo prevented the duplicated
linkage.

Fixes: b028ce29f ("freedreno: add the missing _la in libfreedreno_ir3_la")
Cc: Emil Velikov <emil.velikov@collabora.com>
5 years agoi965/fs: Handle V/UV immediates in dump_instructions()
Matt Turner [Mon, 5 Nov 2018 17:52:09 +0000 (09:52 -0800)]
i965/fs: Handle V/UV immediates in dump_instructions()

5 years agointel/compiler: Always print flag subregister number
Sagar Ghuge [Sun, 9 Dec 2018 07:07:43 +0000 (23:07 -0800)]
intel/compiler: Always print flag subregister number

While disassembling the predicate always print flag subregister number
to keep grammar same across the generation for assembler tool.

v2: Combine consecutive format calls (Matt Turner)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/compiler: Set swizzle to BRW_SWIZZLE_XXXX for scalar region
Sagar Ghuge [Sun, 9 Dec 2018 05:50:36 +0000 (21:50 -0800)]
intel/compiler: Set swizzle to BRW_SWIZZLE_XXXX for scalar region

When RepCtrl is set, the swizzle field is ignored by the hardware. In
order to ensure a 1-to-1 correspondence between the human-readable
disassembly and the binary instruction encoding always set the swizzle
to XXXX (all zeros) when it is unused due to RepCtrl

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agomeson: Add nir_algebraic_parser_test to suites
Dylan Baker [Fri, 7 Dec 2018 17:15:27 +0000 (09:15 -0800)]
meson: Add nir_algebraic_parser_test to suites

Just to make it easier to run a nir tests together.

Fixes: a0ae12ca91a45f81897e774019cde9bd081f03a0
       ("nir/algebraic: Add unit tests for bitsize validation")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoamd/addrlib: drop si_ci_vi_merged_enum.h from the list
Emil Velikov [Mon, 10 Dec 2018 14:47:38 +0000 (14:47 +0000)]
amd/addrlib: drop si_ci_vi_merged_enum.h from the list

Fixes: 776b9113656 ("amd/addrlib: update Mesa's copy of addrlib")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
5 years agofreedreno: add the missing _la in libfreedreno_ir3_la
Emil Velikov [Mon, 10 Dec 2018 11:48:24 +0000 (11:48 +0000)]
freedreno: add the missing _la in libfreedreno_ir3_la

Fixes: aa0fed10d35 ("freedreno: move ir3 to common location")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
5 years agofreedreno: drop duplicate MKDIR_GEN declaration
Emil Velikov [Mon, 10 Dec 2018 11:45:42 +0000 (11:45 +0000)]
freedreno: drop duplicate MKDIR_GEN declaration

Fixes: aa0fed10d35 ("freedreno: move ir3 to common location")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
5 years agotravis: radeonsi and radv require LLVM 7.0
Rhys Kidd [Mon, 10 Dec 2018 05:21:23 +0000 (00:21 -0500)]
travis: radeonsi and radv require LLVM 7.0

Fixes: 3fbdcd942fe ("amd: remove support for LLVM 6.0")
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Jan Vesely <jan.vesely@rutgers.edu>
Cc: Andres Gomez <agomez@igalia.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agoloader: free error state, when checking the drawable type
Kirill Burtsev [Wed, 5 Dec 2018 15:54:27 +0000 (15:54 +0000)]
loader: free error state, when checking the drawable type

Currently we distinguish if the drawable is a window or pixmap by
checking xcb_present_select_input throws an error or not.

Yet, we don't always free the error state returned by xcb.

Cc: Kirill Burtsev <kirill.burtsev@qt.io>
Cc: Boyan Ding <boyan.j.ding@gmail.com>
Fixes: 6bd9ba7d074 ("loader: Add dri3 helper")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
[Emil: add commit message, fixes tag]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
5 years agonir: make use of new nir_cf_list_clone_and_reinsert() helper
Timothy Arceri [Fri, 16 Nov 2018 03:58:03 +0000 (14:58 +1100)]
nir: make use of new nir_cf_list_clone_and_reinsert() helper

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: add a new nir_cf_list_clone_and_reinsert() helper
Timothy Arceri [Fri, 16 Nov 2018 03:57:11 +0000 (14:57 +1100)]
nir: add a new nir_cf_list_clone_and_reinsert() helper

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: clarify some nit_loop_info member names
Timothy Arceri [Tue, 20 Nov 2018 00:35:37 +0000 (11:35 +1100)]
nir: clarify some nit_loop_info member names

Following commits will introduce additional fields such as
guessed_trip_count. Renaming these will help avoid confusion
as our unrolling feature set grows.

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: small tidy ups for nir_loop_analyze()
Timothy Arceri [Thu, 15 Nov 2018 09:40:08 +0000 (20:40 +1100)]
nir: small tidy ups for nir_loop_analyze()

Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoi965: Flip arguments to load_register_reg helpers.
Kenneth Graunke [Fri, 7 Dec 2018 21:01:07 +0000 (13:01 -0800)]
i965: Flip arguments to load_register_reg helpers.

load_register_imm and load_register_mem take the destination as the
first argument, so I'd like load_register_reg to do the same the sake
of consistency.  Otherwise, reading sequences of mixed LRI/LRM/LRR is
needlessly confusing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoi965: Delete dead brw_meta_resolve_color prototype.
Kenneth Graunke [Sat, 8 Dec 2018 02:41:19 +0000 (18:41 -0800)]
i965: Delete dead brw_meta_resolve_color prototype.

Dead since commit 09e041d61d367ff3a9e8492521606090050255d4 (May 2016).

5 years agonv50/ir: fix use-after-free in ConstantFolding::visit
Karol Herbst [Fri, 7 Dec 2018 08:44:55 +0000 (09:44 +0100)]
nv50/ir: fix use-after-free in ConstantFolding::visit

opnd() might delete the passed in instruction, but it's used through
i->srcExists() later in visit

v2: use continue instead return
v3: use brackets for the outer if/else chain

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
5 years agonouveau: use atomic operations for driver statistics
Karol Herbst [Fri, 7 Dec 2018 19:10:50 +0000 (20:10 +0100)]
nouveau: use atomic operations for driver statistics

multiple threads can write to those at the same time

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
5 years agonv50/ir: initialize relDegree staticly
Karol Herbst [Fri, 7 Dec 2018 08:47:05 +0000 (09:47 +0100)]
nv50/ir: initialize relDegree staticly

this race condition is pretty harmless, but also pretty trivial to fix

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
5 years agoshader-packing
Eric Anholt [Sat, 8 Dec 2018 00:51:12 +0000 (16:51 -0800)]
shader-packing

5 years agotfu
Eric Anholt [Sat, 8 Dec 2018 00:49:41 +0000 (16:49 -0800)]
tfu

5 years agov3d: Fix a leak of the disassembled instruction string during debug dumps.
Eric Anholt [Fri, 7 Dec 2018 18:34:40 +0000 (10:34 -0800)]
v3d: Fix a leak of the disassembled instruction string during debug dumps.

Fixes: ade416d02369 ("broadcom: Add VC5 NIR compiler.")
5 years agovc4: Fix a leak of the transfer helper on screen destroy.
Eric Anholt [Fri, 7 Dec 2018 18:31:27 +0000 (10:31 -0800)]
vc4: Fix a leak of the transfer helper on screen destroy.

Fixes: d009463a6549 ("vc4: Switch to using u_transfer_helper for MSAA maps.")