mesa.git
6 years agomesa: reduce the size of gl_viewport_attrib
Marek Olšák [Thu, 16 Nov 2017 00:46:40 +0000 (01:46 +0100)]
mesa: reduce the size of gl_viewport_attrib

All drivers convert these to float, so there is no reason to use double.
The piglit test that expects double precision from glGet will be adjusted
not to require it (there is a piglit patch).

gl_context::ViewportArray: 512 -> 384 bytes
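
The size change above can be reproduced with a small sketch. The struct and field names below are hypothetical stand-ins, not Mesa's definitions, and the array length of 16 is an assumption chosen because it matches the 512 -> 384 figures on common ABIs (4-byte float, 8-byte double).

```c
#include <stddef.h>

/* Hypothetical stand-ins for the viewport state, not Mesa's structs. */
struct viewport_before {
    float x, y, width, height;
    double near_val, far_val;   /* depth range kept in double precision */
};

struct viewport_after {
    float x, y, width, height;
    float near_val, far_val;    /* depth range narrowed to float */
};

enum { NUM_VIEWPORTS = 16 };    /* assumed array length */

/* Total bytes used by an array of the double-based layout. */
size_t viewport_array_bytes_before(void)
{
    return NUM_VIEWPORTS * sizeof(struct viewport_before);
}

/* Total bytes used by an array of the float-only layout. */
size_t viewport_array_bytes_after(void)
{
    return NUM_VIEWPORTS * sizeof(struct viewport_after);
}
```

Dropping the two doubles removes 8 bytes per element, which is where the 128-byte saving comes from.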

Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agomesa: reduce the size of gl_texture_object
Marek Olšák [Thu, 16 Nov 2017 00:44:10 +0000 (01:44 +0100)]
mesa: reduce the size of gl_texture_object

Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agomesa: reduce the size of gl_program
Marek Olšák [Thu, 16 Nov 2017 00:10:27 +0000 (01:10 +0100)]
mesa: reduce the size of gl_program

gl_program: 1456 -> 976 bytes

Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agomesa: reduce the size of gl_image_unit (v2)
Marek Olšák [Wed, 15 Nov 2017 23:44:43 +0000 (00:44 +0100)]
mesa: reduce the size of gl_image_unit (v2)

gl_context::ImageUnits: 6144 -> 4608 bytes

v2: use ASSERT_BITFIELD_SIZE
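
A rough sketch of the kind of check that a bitfield-size assertion performs. This is not Mesa's ASSERT_BITFIELD_SIZE macro; the enum, struct and function names are hypothetical and only illustrate the idea of verifying that the largest enum value survives a round trip through a narrowed field.

```c
#include <stdbool.h>

/* Hypothetical enum whose largest value must fit the bitfield below. */
enum access_mode {
    ACCESS_READ = 0,
    ACCESS_WRITE = 1,
    ACCESS_READ_WRITE = 2,
    ACCESS_MODE_MAX = ACCESS_READ_WRITE
};

/* Hypothetical packed state: the field is 2 bits instead of a full int. */
struct image_unit_packed {
    unsigned access : 2;
    unsigned layered : 1;
};

/* Returns true if `value` reads back unchanged from the 2-bit field. */
bool access_fits(unsigned value)
{
    struct image_unit_packed u = {0};
    u.access = value;           /* unsigned bitfields wrap modulo 2^width */
    return u.access == value;
}
```

Calling access_fits(ACCESS_MODE_MAX) once at init time would catch a field that was narrowed too far.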

Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agomesa: further reduce the size of ctx->Texture
Marek Olšák [Wed, 15 Nov 2017 21:41:12 +0000 (22:41 +0100)]
mesa: further reduce the size of ctx->Texture

Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agomesa: decrease the array size of ctx->Texture.FixedFuncUnit to 8
Marek Olšák [Wed, 15 Nov 2017 21:10:43 +0000 (22:10 +0100)]
mesa: decrease the array size of ctx->Texture.FixedFuncUnit to 8

GL allows doing glTexEnv on 192 texture units, while in reality,
only MaxTextureCoordUnits units are used by fixed-func shaders.

There is a piglit patch that adjusts piglits/texunits to check only
MaxTextureCoordUnits units.

Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agomesa: separate legacy stuff from gl_texture_unit into gl_fixedfunc_texture_unit
Marek Olšák [Wed, 15 Nov 2017 21:02:51 +0000 (22:02 +0100)]
mesa: separate legacy stuff from gl_texture_unit into gl_fixedfunc_texture_unit

Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agomesa: inline init_texture_unit
Marek Olšák [Wed, 15 Nov 2017 16:50:33 +0000 (17:50 +0100)]
mesa: inline init_texture_unit

because this is going to be changed

Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agomesa: use GLenum16 in a few more places
Marek Olšák [Tue, 30 Jan 2018 21:25:25 +0000 (22:25 +0100)]
mesa: use GLenum16 in a few more places

Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agoanv: Move setting current_pipeline to cmd_state_init
Jason Ekstrand [Mon, 12 Feb 2018 16:17:57 +0000 (08:17 -0800)]
anv: Move setting current_pipeline to cmd_state_init

We were setting current_pipeline to UINT32_MAX and then calling
anv_cmd_state_reset, which memsets the entire state struct to 0 and so
implicitly resets current_pipeline to 3D.  I have no idea how this
hasn't caused everything to explode.
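
A minimal sketch of the ordering problem, using a hypothetical two-field state struct rather than the real anv one. Zero stands in for the 3D pipeline value, as the message above describes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the command buffer state. */
struct cmd_state {
    uint32_t current_pipeline;
    uint32_t other;
};

/* Buggy order: the sentinel is written, then wiped by the memset. */
uint32_t pipeline_after_buggy_init(void)
{
    struct cmd_state s;
    s.current_pipeline = UINT32_MAX;
    memset(&s, 0, sizeof(s));
    return s.current_pipeline;
}

/* Fixed order: clear first, then set the sentinel inside init. */
uint32_t pipeline_after_fixed_init(void)
{
    struct cmd_state s;
    memset(&s, 0, sizeof(s));
    s.current_pipeline = UINT32_MAX;
    return s.current_pipeline;
}
```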

Fixes: cd3feea74582 "anv/cmd_buffer: Rework anv_cmd_state_reset"
cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agoanv: Don't resolve or ambiguate non-existent layers
Jason Ekstrand [Mon, 12 Feb 2018 17:48:12 +0000 (09:48 -0800)]
anv: Don't resolve or ambiguate non-existent layers

The previous code was trying to avoid non-existent layers by taking a
MAX with anv_image_aux_layers.  Unfortunately, it wasn't taking into
account that layer_count starts at base_layer which may not be zero.
Instead, we need to subtract base_layer from anv_image_aux_layers with
a guard against roll-over.
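
The subtraction with a roll-over guard might look roughly like the sketch below. The helper name and signature are hypothetical; the driver's actual code may be structured differently.

```c
#include <stdint.h>

/*
 * Hypothetical helper: how many of the layers in
 * [base_layer, base_layer + layer_count) actually have aux data,
 * given that only the first `aux_layers` layers do.
 */
uint32_t clamp_aux_layer_count(uint32_t aux_layers, uint32_t base_layer,
                               uint32_t layer_count)
{
    /* Guard: without this, aux_layers - base_layer would wrap around. */
    if (base_layer >= aux_layers)
        return 0;

    uint32_t remaining = aux_layers - base_layer;
    return layer_count < remaining ? layer_count : remaining;
}
```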

Fixes: de3be6180169f9 "anv/cmd_buffer: Rework aux tracking"
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoi965: Fix bugs in intel_from_planar
Daniel Stone [Mon, 12 Feb 2018 17:54:41 +0000 (17:54 +0000)]
i965: Fix bugs in intel_from_planar

This commit fixes two bugs in intel_from_planar.  First, if the planar
format was non-NULL but only had a single plane, we were falling through
to the planar case.  If we had a CCS modifier and plane == 1, we would
return NULL instead of the CCS plane.  Second, if we did end up in the
planar_format == NULL case and the modifier was DRM_FORMAT_MOD_INVALID,
we would end up segfaulting in isl_drm_modifier_has_aux.

Cc: mesa-stable@lists.freedesktop.org
Fixes: 8f6e54c92966bb94a3f05f2cc7ea804273e125ad
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoradv: Fix compiler warning about uninitialized 'set'
Eric Anholt [Sat, 10 Feb 2018 11:06:45 +0000 (11:06 +0000)]
radv: Fix compiler warning about uninitialized 'set'

The compiler doesn't figure out that we only get result == VK_SUCCESS if
set got initialized.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoglsl/tests: Fix strict aliasing warning about int64/double.
Eric Anholt [Sat, 10 Feb 2018 11:01:20 +0000 (11:01 +0000)]
glsl/tests: Fix strict aliasing warning about int64/double.
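
One common way to avoid strict aliasing warnings when moving bits between a 64-bit integer and a double is to copy through memcpy instead of casting pointers. The sketch below shows that technique only; it is an assumption that the test was fixed this way, and the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a 64-bit integer as a double without aliasing. */
double double_from_bits(uint64_t bits)
{
    double d;
    memcpy(&d, &bits, sizeof(d));
    return d;
}

/* Reinterpret the bits of a double as a 64-bit integer without aliasing. */
uint64_t bits_from_double(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof(bits));
    return bits;
}
```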

Fixes: 4bf986274728 ("glsl/tests: Add UINT64 and INT64 types")
Reviewed-by: Rhys Kidd <rhyskidd@gmail.com>
6 years agoac/nir: Fix compiler warning about uninitialized dw_addr.
Eric Anholt [Sat, 10 Feb 2018 10:37:37 +0000 (10:37 +0000)]
ac/nir: Fix compiler warning about uninitialized dw_addr.

Even switching the def's condition to be the same chip revision check as
the use, the compiler doesn't figure it out.  Just NULL-init it.
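
A reduced sketch of the pattern: the pointer is defined and used under the same condition, yet a compiler may still warn, so it is NULL-initialized. The function and parameter names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reduction of "defined and used under the same check". */
int read_entry(bool is_gfx9, const int *table)
{
    const int *dw_addr = NULL;  /* explicit init silences the warning */

    if (is_gfx9)
        dw_addr = &table[1];

    int base = table[0];

    /* Same condition as the definition, so dw_addr is valid here. */
    if (is_gfx9)
        return base + *dw_addr;

    return base;
}
```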

Fixes: ec53e527421d ("ac/nir: Add ES output to LDS for GFX9.")
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agogallium/llvmpipe: Fix compiler warnings about ddx/ddy/ddmax.
Eric Anholt [Sat, 10 Feb 2018 10:24:14 +0000 (10:24 +0000)]
gallium/llvmpipe: Fix compiler warnings about ddx/ddy/ddmax.

My gcc doesn't figure out that dims >= 1 (seems reasonable), and doesn't
notice that ddmax is used from the same no_rho_opt as its initialization.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
6 years agoanv: Drop I915_EXEC_CONSTANTS_REL_GENERAL from execbuf.
Kenneth Graunke [Sun, 11 Feb 2018 22:52:27 +0000 (14:52 -0800)]
anv: Drop I915_EXEC_CONSTANTS_REL_GENERAL from execbuf.

The kernel used to have execbuf parameters to program the INSTPM bit
for whether 3DSTATE_CONSTANT_* should be relative to dynamic state
base address or an absolute address.  However, they never worked in
the presence of hardware contexts, so I deleted them a while back.

It doesn't make sense to set this flag, as it doesn't exist anymore.
It also never did anything anyway - the flag is zero, so |'ing it in
did nothing.  The default is relative anyway.
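
Why the flag was a no-op can be shown in a couple of lines. The constant below is a local stand-in with the value zero, as stated above, not the kernel header's definition.

```c
#include <stdint.h>

/* Local stand-in: the message above says the real flag is zero. */
#define SKETCH_EXEC_CONSTANTS_REL_GENERAL 0u

/* OR-ing a zero flag into the execbuf flags leaves them unchanged. */
uint64_t with_rel_general(uint64_t flags)
{
    return flags | SKETCH_EXEC_CONSTANTS_REL_GENERAL;
}
```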

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agor200: remove left over dead code
Eric Engestrom [Fri, 9 Feb 2018 11:38:43 +0000 (11:38 +0000)]
r200: remove left over dead code

0aaa27f29187ffb739c7 removed the references to this array without
removing the array itself

Cc: Ian Romanick <ian.d.romanick@intel.com>
Fixes: 0aaa27f29187ffb739c7 "mesa: Pass the translated color logic op dd_function_table::LogicOpcode"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
6 years agoac/nir: remove backlink to nir_to_llvm_context
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:35 +0000 (13:54 +0100)]
ac/nir: remove backlink to nir_to_llvm_context

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/nir: remove nir_to_llvm_context::module
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:34 +0000 (13:54 +0100)]
ac/nir: remove nir_to_llvm_context::module

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/nir: remove nir_to_llvm_context::builder
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:33 +0000 (13:54 +0100)]
ac/nir: remove nir_to_llvm_context::builder

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/nir: drop nir_to_llvm_context from glsl_to_llvm_type()
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:32 +0000 (13:54 +0100)]
ac/nir: drop nir_to_llvm_context from glsl_to_llvm_type()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/nir: drop nir_to_llvm_context from visit_var_atomic()
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:31 +0000 (13:54 +0100)]
ac/nir: drop nir_to_llvm_context from visit_var_atomic()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/nir: drop nir_to_llvm_context from visit_vulkan_resource_reindex()
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:30 +0000 (13:54 +0100)]
ac/nir: drop nir_to_llvm_context from visit_vulkan_resource_reindex()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/nir: drop nir_to_llvm_context from visit_load_push_constant()
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:29 +0000 (13:54 +0100)]
ac/nir: drop nir_to_llvm_context from visit_load_push_constant()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/nir: drop nir_to_llvm_context from cast_ptr()
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:28 +0000 (13:54 +0100)]
ac/nir: drop nir_to_llvm_context from cast_ptr()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/nir: drop nir_to_llvm_context from visit_load_local_invocation_index()
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:27 +0000 (13:54 +0100)]
ac/nir: drop nir_to_llvm_context from visit_load_local_invocation_index()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/nir: drop nir_to_llvm_context from emit_f2f16()
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:26 +0000 (13:54 +0100)]
ac/nir: drop nir_to_llvm_context from emit_f2f16()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac: remove unused parameters in abi::load_tess_coord()
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:25 +0000 (13:54 +0100)]
ac: remove unused parameters in abi::load_tess_coord()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/nir: remove useless bitcast in load_tess_coord()
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:24 +0000 (13:54 +0100)]
ac/nir: remove useless bitcast in load_tess_coord()

nir_intrinsic_load_tess_coord always returns a v3i32.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac: add load_resource() to the ABI
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:23 +0000 (13:54 +0100)]
ac: add load_resource() to the ABI

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac: add load_sample_mask_in() to the ABI
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:22 +0000 (13:54 +0100)]
ac: add load_sample_mask_in() to the ABI

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac: move view_index to the ABI
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:21 +0000 (13:54 +0100)]
ac: move view_index to the ABI

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac: move push_constants to the ABI
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:20 +0000 (13:54 +0100)]
ac: move push_constants to the ABI

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac: move tg_size to the ABI
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:19 +0000 (13:54 +0100)]
ac: move tg_size to the ABI

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoac/nir: remove unused nir_to_llvm_context:{defs,phis}
Samuel Pitoiset [Fri, 9 Feb 2018 12:54:18 +0000 (13:54 +0100)]
ac/nir: remove unused nir_to_llvm_context:{defs,phis}

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
6 years agoegl/gbm: Fix compiler warning about visual matching.
Eric Anholt [Sat, 10 Feb 2018 16:32:57 +0000 (16:32 +0000)]
egl/gbm: Fix compiler warning about visual matching.

The compiler doesn't know that num_visuals > 0.

Fixes: 37a8d907cc16 ("egl/gbm: Ensure EGLConfigs match GBM surface format")
Reviewed-by: Daniel Stone <daniels@collabora.com>
6 years agofreedreno: small fix for flushing dependent batches
Rob Clark [Sat, 10 Feb 2018 19:12:11 +0000 (14:12 -0500)]
freedreno: small fix for flushing dependent batches

Flush a resource's previous write_batch synchronously.  Because a
resource's associated batches are not updated until after the flush
thread submits rendering to the kernel, this was causing a bit of
confusion in the following loop.  This fixes a bug that appeared with
recent stk.

Perhaps we need to re-work things a bit to clear out dependent batches
in the ctx's thread and use a fence to deal with the period between
when a flush is queued and when it is submitted to the kernel.  But
this will do until time permits a larger refactor.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: intra-block scheduling
Rob Clark [Mon, 5 Feb 2018 13:45:29 +0000 (08:45 -0500)]
freedreno/ir3: intra-block scheduling

Because of loops, we can't schedule all of a block's predecessors first.
Instead just assume that the result consumed in a block was written far
enough away in all paths into a block.  And do an intra-block scheduling
pass to figure out if there are any cases where we need to insert extra
nop's.  This works out better than always assuming the worst case (ie.
that a value live into a block was written in the last instruction in
the predecessor block).

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: "boost" the depth of if/else condition
Rob Clark [Sun, 4 Feb 2018 17:52:24 +0000 (12:52 -0500)]
freedreno/ir3: "boost" the depth of if/else condition

Account for the move to predicate register, to try to avoid needing to
insert extra NOPs later.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: account for arrays in delayslot calc
Rob Clark [Sun, 4 Feb 2018 17:42:19 +0000 (12:42 -0500)]
freedreno/ir3: account for arrays in delayslot calc

Normally false-deps are not something to consider, since they mostly
exist for reasons unrelated to delay slots:

 * barriers
 * ordering writes after read
 * SSBO/image access ordering

The exception is a false-dependency on an array store.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: more clever legalize algorithm
Rob Clark [Thu, 1 Feb 2018 14:08:39 +0000 (09:08 -0500)]
freedreno/ir3: more clever legalize algorithm

Previously we didn't handle flow control in legalize, and instead just
set (ss)(sy) on the first instruction in every block.  Which isn't very
clever.

Instead, consider output state of all predecessor blocks, so we only
set a sync bit if needed for any possible path leading into a block.
Because of loops, we can't require that all predecessor blocks are
legalized before a given block, so instead run in a loop until results
converge.
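
The "run until results converge" idea is a fixed-point iteration over the block graph. The sketch below is a simplification under stated assumptions: a single boolean "pending" state stands in for the (ss)/(sy) tracking, and the struct and function names are hypothetical rather than ir3's.

```c
#include <stdbool.h>

#define MAX_PREDS 4

/* Hypothetical block: inputs are preds/writes_pending/waits. */
struct block {
    int preds[MAX_PREDS];
    int num_preds;
    bool writes_pending;  /* block ends with an outstanding result */
    bool waits;           /* block syncs anything it inherited */
    bool in_pending;      /* computed: a sync bit is needed on entry */
    bool out_pending;     /* computed: state handed to successors */
};

/* Iterate until no block's state changes; returns the pass count. */
int legalize_until_stable(struct block *blocks, int n)
{
    int passes = 0;
    bool changed;

    do {
        changed = false;
        passes++;
        for (int i = 0; i < n; i++) {
            struct block *b = &blocks[i];
            bool in = false;

            /* Pending on entry if any path into the block is pending. */
            for (int p = 0; p < b->num_preds; p++)
                in = in || blocks[b->preds[p]].out_pending;

            bool out = b->writes_pending || (in && !b->waits);
            if (in != b->in_pending || out != b->out_pending) {
                b->in_pending = in;
                b->out_pending = out;
                changed = true;
            }
        }
    } while (changed);

    return passes;
}

/* Entry -> loop header <-> loop body; the body leaves a result pending. */
bool loop_header_needs_sync(void)
{
    struct block cfg[3] = {
        { .num_preds = 0 },
        { .preds = {0, 2}, .num_preds = 2 },
        { .preds = {1}, .num_preds = 1, .writes_pending = true },
    };

    legalize_until_stable(cfg, 3);
    return cfg[1].in_pending;
}
```

In this example the loop header only learns that it needs a sync on the second pass, through the back-edge from the loop body, which is why a single in-order walk is not enough.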

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: track block predecessors
Rob Clark [Wed, 31 Jan 2018 17:58:05 +0000 (12:58 -0500)]
freedreno/ir3: track block predecessors

Useful in the following patches.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: clean up dangling false-dep's
Rob Clark [Mon, 29 Jan 2018 20:59:55 +0000 (15:59 -0500)]
freedreno/ir3: clean up dangling false-dep's

Maybe there is a better way to do this.  Where it is useful is "array"
loads, which end up as a false-dep for a later array store.

If all the uses of an array load are CP'd into their consumer, it still
leaves the dangling array load, leading to funny things like:

  mov.u32u32 r5.y, r0.y
  mov.u32u32 r5.y, r0.z

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: handle IMMED for mad 2nd src special case
Rob Clark [Tue, 30 Jan 2018 17:18:13 +0000 (12:18 -0500)]
freedreno/ir3: handle IMMED for mad 2nd src special case

Consider also immediates for swapping the first two srcs, because they
can be lowered to constant.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: remove ir3 phi instruction
Rob Clark [Mon, 29 Jan 2018 21:22:26 +0000 (16:22 -0500)]
freedreno/ir3: remove ir3 phi instruction

Now that we convert phi webs to ssa, we can drop all this.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: remove lower_if_else pass
Rob Clark [Mon, 29 Jan 2018 21:09:44 +0000 (16:09 -0500)]
freedreno/ir3: remove lower_if_else pass

Now that it is unused.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: add experimental GCM pass
Rob Clark [Mon, 29 Jan 2018 19:53:13 +0000 (14:53 -0500)]
freedreno/ir3: add experimental GCM pass

Generally seems to do worse on instruction count and register usage,
according to shader-db.  But shader-db also doesn't do a very good job
of weighting loop bodies, so that might not be totally valid.

So add an env variable to enable GCM pass for easier experimentation.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: change opt passes
Rob Clark [Fri, 26 Jan 2018 15:43:48 +0000 (10:43 -0500)]
freedreno/ir3: change opt passes

There are more useful nir passes added since initial conversion to nir.
But ir3 was never updated to use them.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: use peephole select pass
Rob Clark [Fri, 19 Jan 2018 21:13:09 +0000 (16:13 -0500)]
freedreno/ir3: use peephole select pass

Aggressively lowering all if/else to selects in some extreme cases
results in much higher register pressure.  Using peephole select instead
with a modest threshold speeds up alu2 4x!

16 seems like a good limit, low enough to help alu2 but not so low that
it penalizes everything else.  With a bit better scheduling of the
instruction that moves a value into a predicate register, we might be
able to lower this limit a bit more in the future, but since we need 6
cycles from the move to predicate register to predicated branch, that
puts some sort of lower bound on how far we can lower this threshold.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: lower phi webs to regs
Rob Clark [Thu, 18 Jan 2018 13:32:22 +0000 (08:32 -0500)]
freedreno/ir3: lower phi webs to regs

nir's from_ssa pass is much better at avoiding inserting extra moves
than our logic is.  And lowering phi webs to regs just treats anything
involved in a phi web as an array of length=1.  Which with previous
array related fixes in RA/etc ends up working out quite well.  This cuts
down on extra instructions and also helps with register pressure.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: separate arrays from groups
Rob Clark [Mon, 29 Jan 2018 17:32:24 +0000 (12:32 -0500)]
freedreno/ir3: separate arrays from groups

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: make block/instruction serialno per-shader
Rob Clark [Mon, 29 Jan 2018 14:54:07 +0000 (09:54 -0500)]
freedreno/ir3: make block/instruction serialno per-shader

Makes it easier to compare values seen in-game (where there are many
shaders) to cmdline standalone compiler.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: add spirv support to cmdline compiler
Rob Clark [Tue, 23 Jan 2018 14:28:44 +0000 (09:28 -0500)]
freedreno/ir3: add spirv support to cmdline compiler

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: don't lower fsat
Rob Clark [Sun, 21 Jan 2018 17:31:51 +0000 (12:31 -0500)]
freedreno/ir3: don't lower fsat

Instead, if possible fold (sat) flag into src, otherwise use:

  (sat)max.f rD, rS, rS
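
In scalar terms, the fallback above relies on max(x, x) being x, so a saturating max of a value with itself is just a clamp to [0, 1]. A sketch with hypothetical helper names (NaN handling is ignored here):

```c
/* Plain max, standing in for max.f. */
static float max_f(float a, float b)
{
    return a > b ? a : b;
}

/* Clamp to [0, 1], standing in for the (sat) modifier. */
float saturate(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* (sat)max.f rD, rS, rS expressed on a single float. */
float fsat_via_max(float x)
{
    return saturate(max_f(x, x));
}
```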

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: add encoding/decoding for (sat) bit
Rob Clark [Sun, 21 Jan 2018 17:20:01 +0000 (12:20 -0500)]
freedreno/ir3: add encoding/decoding for (sat) bit

Seems to be there since a3xx, but we always lowered fsat.  But we can
shave some instructions, especially in shaders that use lots of
clamp(foo, 0.0, 1.0) by not lowering fsat.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: extend liverange of arrays
Rob Clark [Sun, 21 Jan 2018 16:13:44 +0000 (11:13 -0500)]
freedreno/ir3: extend liverange of arrays

Use livein state of other blocks to extend liverange of arrays when they
are still needed by successor blocks.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: avoid extra mov's for "arrays"
Rob Clark [Fri, 19 Jan 2018 20:45:37 +0000 (15:45 -0500)]
freedreno/ir3: avoid extra mov's for "arrays"

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: a couple more array fixes
Rob Clark [Mon, 29 Jan 2018 21:01:42 +0000 (16:01 -0500)]
freedreno/ir3: a couple more array fixes

(Plus a couple TODOs)

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: keep array stores
Rob Clark [Mon, 29 Jan 2018 20:58:49 +0000 (15:58 -0500)]
freedreno/ir3: keep array stores

Since these are not in SSA form, add to block's keeps so it doesn't
appear unused.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: propagate barrier information
Rob Clark [Mon, 29 Jan 2018 20:38:06 +0000 (15:38 -0500)]
freedreno/ir3: propagate barrier information

When eliminating movs, the instruction that is now directly using the
src of the mov has the same scheduling order constraints as the original
mov instruction.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: remove pointless statement
Rob Clark [Mon, 29 Jan 2018 20:35:12 +0000 (15:35 -0500)]
freedreno/ir3: remove pointless statement

Function ends after this if/else ladder, so it was pointless.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: some more debug prints
Rob Clark [Mon, 29 Jan 2018 20:33:55 +0000 (15:33 -0500)]
freedreno/ir3: some more debug prints

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: fix printing of relative branch offsets
Rob Clark [Mon, 29 Jan 2018 20:24:17 +0000 (15:24 -0500)]
freedreno/ir3: fix printing of relative branch offsets

The number of bits depends on generation.  But printing negative values
with a5xx encoding (largest size) but compiling for a3xx or a4xx, would
result in negative values printed as large positive values.

I guess in practice huge negative branch offsets aren't likely (and if
that is the case, the shader is probably too big to grok by reading the
assembly).  So just print using smallest bitfield size.
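
The effect of decoding with the wrong field width can be sketched with a generic sign-extension helper. The widths used in the usage note (16 and 20 bits) are illustrative assumptions, not the real per-generation encodings.

```c
#include <stdint.h>

/* Sign-extend the low `bits` bits of `raw` (1 <= bits <= 32). */
int32_t sign_extend(uint32_t raw, unsigned bits)
{
    uint32_t sign = 1u << (bits - 1);
    uint32_t field = raw & ((sign << 1) - 1u);

    /* Flip the sign bit, then subtract it to propagate the sign. */
    return (int32_t)((field ^ sign) - sign);
}
```

With these assumed widths, sign_extend(0xFFFE, 16) gives -2, while decoding the same bits as a 20-bit field gives 65534, which is the "large positive value" symptom described above.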

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: be more clever with if/else jumps
Rob Clark [Mon, 15 Jan 2018 20:57:52 +0000 (15:57 -0500)]
freedreno/ir3: be more clever with if/else jumps

Try to clean up things like:

  br !p0.x #2
  br p0.x #something

to eliminate the first branch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: avoid some spurious sync bits
Rob Clark [Mon, 15 Jan 2018 19:58:41 +0000 (14:58 -0500)]
freedreno/ir3: avoid some spurious sync bits

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: print # of sync bits for shaderdb
Rob Clark [Mon, 29 Jan 2018 20:28:10 +0000 (15:28 -0500)]
freedreno/ir3: print # of sync bits for shaderdb

When trying to optimize to reduce stalls, it is nice to see this info.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: add debug trace for flush
Rob Clark [Mon, 29 Jan 2018 14:53:30 +0000 (09:53 -0500)]
freedreno: add debug trace for flush

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agointel/compiler: fix 64bit value prints on 32bit
Grazvydas Ignotas [Sat, 3 Feb 2018 21:59:05 +0000 (23:59 +0200)]
intel/compiler: fix 64bit value prints on 32bit

Fix the following:
warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but
argument 3 has type ‘uint64_t {aka long long unsigned int}’.
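
A portable way to print a uint64_t is the PRIx64 macro from <inttypes.h>, which expands to the right length modifier on both 32-bit and 64-bit builds. Whether the patch uses exactly this is an assumption; the helper below is hypothetical.

```c
#include <inttypes.h>
#include <stdio.h>

/* Format a 64-bit value as hex; returns a pointer to a static buffer. */
const char *u64_hex(uint64_t value)
{
    static char buf[32];

    /* PRIx64 is "lx" or "llx" depending on the platform. */
    snprintf(buf, sizeof(buf), "0x%" PRIx64, value);
    return buf;
}
```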

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agost/glsl_to_nir: remove unused options variable
Timothy Arceri [Sat, 10 Feb 2018 00:06:55 +0000 (11:06 +1100)]
st/glsl_to_nir: remove unused options variable

6 years agost/radeonsi: enable disk cache for nir
Timothy Arceri [Tue, 23 Jan 2018 05:00:31 +0000 (16:00 +1100)]
st/radeonsi: enable disk cache for nir

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agost: add nir shader disk cache support
Timothy Arceri [Wed, 24 Jan 2018 00:12:51 +0000 (11:12 +1100)]
st: add nir shader disk cache support

v2: include compute shader support

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agost/glsl_to_tgsi: move nir detection earlier
Timothy Arceri [Tue, 30 Jan 2018 00:56:57 +0000 (11:56 +1100)]
st/glsl_to_tgsi: move nir detection earlier

We move the nir check before the shader cache call so that we can
call a nir based caching function in a following patch.

Also with this change we simply check if vertex shaders support
NIR rather than looping over the stages as mixing of shader types
is not supported anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi: stop returning PIPE_SHADER_IR_NATIVE for PIPE_SHADER_CAP_PREFERRED_IR
Timothy Arceri [Fri, 9 Feb 2018 01:02:27 +0000 (12:02 +1100)]
radeonsi: stop returning PIPE_SHADER_IR_NATIVE for PIPE_SHADER_CAP_PREFERRED_IR

Clover now checks PIPE_SHADER_CAP_SUPPORTED_IRS for native support instead.

This change indirectly enables NIR support for compute shaders
on radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agor600: always return PIPE_SHADER_IR_TGSI for PIPE_SHADER_CAP_PREFERRED_IR
Timothy Arceri [Fri, 9 Feb 2018 01:01:04 +0000 (12:01 +1100)]
r600: always return PIPE_SHADER_IR_TGSI for PIPE_SHADER_CAP_PREFERRED_IR

We now use PIPE_SHADER_CAP_SUPPORTED_IRS to check for native support
in clover.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoclover: use PIPE_SHADER_CAP_SUPPORTED_IRS to discover IR
Timothy Arceri [Fri, 9 Feb 2018 01:03:57 +0000 (12:03 +1100)]
clover: use PIPE_SHADER_CAP_SUPPORTED_IRS to discover IR

PIPE_SHADER_CAP_PREFERRED_IR was conflicting with PIPE_SHADER_IR_NIR
for compute shaders, so we let clover pick the one it wants to use.
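
A sketch of a frontend choosing an IR from a supported-IRs value instead of a single preferred one. It assumes the cap is a bitmask with one bit per IR; the enum values and function names are hypothetical and do not come from gallium's headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical IR identifiers used as bit positions. */
enum shader_ir { IR_TGSI = 0, IR_NATIVE = 1, IR_NIR = 2 };

/* Test one bit of the (assumed) supported-IRs bitmask. */
bool ir_supported(uint32_t supported_irs, enum shader_ir ir)
{
    return ((supported_irs >> ir) & 1u) != 0;
}

/* The frontend picks what it wants: native if offered, else TGSI. */
enum shader_ir pick_compute_ir(uint32_t supported_irs)
{
    return ir_supported(supported_irs, IR_NATIVE) ? IR_NATIVE : IR_TGSI;
}
```

The point of the design is that a driver can advertise NIR and native support at the same time without the two competing for one "preferred" slot.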

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agor600: add PIPE_SHADER_IR_NATIVE to supported shaders for cs
Timothy Arceri [Fri, 9 Feb 2018 00:59:54 +0000 (11:59 +1100)]
r600: add PIPE_SHADER_IR_NATIVE to supported shaders for cs

Acked-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi/nir: add depth layout to scan pass
Timothy Arceri [Fri, 9 Feb 2018 10:09:35 +0000 (21:09 +1100)]
radeonsi/nir: add depth layout to scan pass

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoradeonsi/nir: add FRAG_RESULT_COLOR to scan pass
Timothy Arceri [Fri, 9 Feb 2018 09:34:53 +0000 (20:34 +1100)]
radeonsi/nir: add FRAG_RESULT_COLOR to scan pass

Fixes a number of draw buffers piglit tests.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoac: convert nir_op_f2f32 src to a float
Timothy Arceri [Fri, 9 Feb 2018 06:17:31 +0000 (17:17 +1100)]
ac: convert nir_op_f2f32 src to a float

Fixes the following piglit test:

./bin/arb_vertex_attrib_64bit-check-explicit-location -auto -fbo

Where we would end up with the nir such as:

vec1 64 ssa_11 = pack_64_2x32_split ssa_9, ssa_10
vec1 32 ssa_12 = f2f32 ssa_2

And our pack_64_2x32_split nir to llvm code always produces
a 64bit integer as output.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoac: fix some 64bit unpack asserts
Timothy Arceri [Fri, 9 Feb 2018 06:15:54 +0000 (17:15 +1100)]
ac: fix some 64bit unpack asserts

Previously the asserts did not take swizzles into account.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoRevert "i965: prevent potentially null pointer access"
Mark Janes [Fri, 9 Feb 2018 17:37:57 +0000 (09:37 -0800)]
Revert "i965: prevent potentially null pointer access"

This reverts commit 712332ed54f14b5ee34c2990e351ca48992488b2, which
caused over 90k failures in Mesa i965 CI.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
6 years agoegl/gbm: Ensure EGLConfigs match GBM surface format
Daniel Stone [Tue, 6 Feb 2018 17:59:05 +0000 (17:59 +0000)]
egl/gbm: Ensure EGLConfigs match GBM surface format

When we create an EGL window surface on a GBM surface, ensure that the
EGLConfig is compatible with the GBM format, notwithstanding XRGB/ARGB
interchange.

For example, rendering with an XRGB8888 EGLConfig on to an ARGB8888
gbm_surface (and vice-versa) is acceptable, but rendering with an
XRGB2101010 EGLConfig on to an XRGB8888 gbm_surface will now be
rejected.

This was previously allowed through; when 10bpc formats were enabled,
clients which picked a completely random EGL config and hoped/assumed
they were XRGB8888 would break.

If you have bisected a failure to start a GBM/KMS client to this commit,
please look at its EGLConfig selection (e.g. through eglChooseConfig),
and add an EGL_NATIVE_VISUAL_ID == gbm_surface format match to the
attribs for config selection.
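
The matching rule can be sketched without any EGL or GBM headers. The FourCC macro and format constants are defined locally as stand-ins, and pick_config() is a hypothetical helper; in a real client the visual IDs would come from querying EGL_NATIVE_VISUAL_ID on each candidate config with eglGetConfigAttrib.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the FourCC format codes. */
#define FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))
#define FMT_XRGB8888    FOURCC('X', 'R', '2', '4')
#define FMT_ARGB8888    FOURCC('A', 'R', '2', '4')
#define FMT_XRGB2101010 FOURCC('X', 'R', '3', '0')

/* Treat ARGB8888 and XRGB8888 as interchangeable. */
static uint32_t ignore_alpha(uint32_t fmt)
{
    return fmt == FMT_ARGB8888 ? FMT_XRGB8888 : fmt;
}

/* Index of the first config whose visual matches the surface, or -1. */
int pick_config(const uint32_t *visual_ids, size_t count,
                uint32_t surface_format)
{
    for (size_t i = 0; i < count; i++) {
        if (ignore_alpha(visual_ids[i]) == ignore_alpha(surface_format))
            return (int)i;
    }
    return -1;
}
```

A client that simply took the first config in the list would pick index 0 here even when it is a 10bpc visual, which is the failure mode described above.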

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/gbm: Remove duplicate format table
Daniel Stone [Tue, 6 Feb 2018 17:44:37 +0000 (17:44 +0000)]
egl/gbm: Remove duplicate format table

Now that we have mask/channel information in gbm_dri's format conversion
table, we can remove the copy in EGL.

As this table contains more formats (notably including R8 and RG8, which
can be used for BO but not surface allocation), we now compare the masks
of all channels when trying to find a suitable config. Without doing
this, an XRGB8888 EGLConfig would match on an R8 format.
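
A sketch of why comparing a single channel is not enough once the table holds more formats. The struct and function names are hypothetical, and any mask values passed in are purely illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-format channel masks. */
struct channel_masks {
    uint32_t red, green, blue, alpha;
};

/* Too weak: two different formats can share a red mask. */
bool red_only_match(const struct channel_masks *a,
                    const struct channel_masks *b)
{
    return a->red == b->red;
}

/* Stricter: every channel mask has to agree. */
bool all_channels_match(const struct channel_masks *a,
                        const struct channel_masks *b)
{
    return a->red == b->red && a->green == b->green &&
           a->blue == b->blue && a->alpha == b->alpha;
}
```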

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agogbm/dri: Expose visuals table through gbm_dri_device
Daniel Stone [Tue, 6 Feb 2018 17:42:03 +0000 (17:42 +0000)]
gbm/dri: Expose visuals table through gbm_dri_device

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agogbm/dri: Add RGBA masks to GBM format table
Daniel Stone [Tue, 6 Feb 2018 17:38:37 +0000 (17:38 +0000)]
gbm/dri: Add RGBA masks to GBM format table

Eventually, we can replace the visuals list inside GBM EGL driver with
this one.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/wayland: Use an array for modifiers
Daniel Stone [Tue, 6 Feb 2018 10:29:13 +0000 (10:29 +0000)]
egl/wayland: Use an array for modifiers

Each Wayland EGLDisplay currently contains a struct with one vector of
modifiers per format, hardcoded in the header. To allow easier support
for more formats, turn this into an array of u_vectors which is opaque
outside of platform_wayland.c.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/wayland: Remove has_format enum
Daniel Stone [Tue, 6 Feb 2018 11:14:32 +0000 (11:14 +0000)]
egl/wayland: Remove has_format enum

Instead of the has_format enum, use an index into the visual array. This
way, adding a new format takes less typing.
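
One way this could look, as a sketch rather than the real implementation:
the bit position in the "formats" mask is simply the index of the visual in
the table. The format codes, the table contents and all function names here
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical format codes and table, for illustration only. */
enum { CODE_XRGB8888 = 10, CODE_ARGB8888 = 11, CODE_RGB565 = 12 };

static const struct {
   const char *name;
   uint32_t code;
} visuals[] = {
   { "XRGB8888", CODE_XRGB8888 },
   { "ARGB8888", CODE_ARGB8888 },
   { "RGB565",   CODE_RGB565 },
};

#define NUM_VISUALS (sizeof(visuals) / sizeof(visuals[0]))

/* Returns the table index for a format code, or -1 if it is unknown. */
static int
visual_idx_from_code(uint32_t code)
{
   for (size_t i = 0; i < NUM_VISUALS; i++) {
      if (visuals[i].code == code)
         return (int)i;
   }
   return -1;
}

/* Records an advertised format; unknown codes leave the mask unchanged. */
static unsigned
mark_format(unsigned formats, uint32_t code)
{
   int idx = visual_idx_from_code(code);
   if (idx < 0)
      return formats;
   return formats | (1u << idx);
}

/* Tests whether the visual at the given table index was advertised. */
static bool
has_format(unsigned formats, size_t idx)
{
   assert(idx < NUM_VISUALS);
   return (formats & (1u << idx)) != 0;
}
```

No per-format enum value is needed: a new row in the table automatically
gets its own bit.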

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years ago egl/wayland: Add bpp to visual map
Daniel Stone [Tue, 6 Feb 2018 11:58:45 +0000 (11:58 +0000)]
egl/wayland: Add bpp to visual map

Both the DRI2 GetBuffersWithFormat interface and SHM buffer allocation
had their own format -> bpp lookup tables. Replace these with a lookup
into the visual map.
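
A small sketch of a single bpp lookup shared by both callers. The format
codes and helper names are hypothetical; only the bpp values (32 for
XRGB8888, 16 for RGB565) follow from the formats themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical format codes, for illustration only. */
enum { CODE_XRGB8888 = 10, CODE_RGB565 = 12 };

struct visual {
   uint32_t code;
   int bpp; /* bits per pixel */
};

static const struct visual visuals[] = {
   { CODE_XRGB8888, 32 },
   { CODE_RGB565,   16 },
};

#define NUM_VISUALS (sizeof(visuals) / sizeof(visuals[0]))

/* Returns bits per pixel for a format code, or -1 if it is unknown. */
static int
bpp_from_code(uint32_t code)
{
   for (size_t i = 0; i < NUM_VISUALS; i++) {
      if (visuals[i].code == code)
         return visuals[i].bpp;
   }
   return -1;
}

/* Bytes per row of an unpadded buffer, or -1 for an unknown format. */
static int
stride_from_code(uint32_t code, int width)
{
   int bpp = bpp_from_code(code);
   assert(width >= 0);
   if (bpp < 0)
      return -1;
   return width * (bpp / 8);
}
```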

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years ago egl/wayland: Use visual map for DRIImage<->FourCC map
Daniel Stone [Tue, 6 Feb 2018 11:51:17 +0000 (11:51 +0000)]
egl/wayland: Use visual map for DRIImage<->FourCC map

When trying to translate between DRIImage format enums and FourCC codes,
use our visual map rather than an open-coded subset.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years ago egl/wayland: Use visual map for format advertisement
Daniel Stone [Tue, 6 Feb 2018 10:20:39 +0000 (10:20 +0000)]
egl/wayland: Use visual map for format advertisement

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years ago egl/wayland: Use visual map for buffer_from_image
Daniel Stone [Tue, 6 Feb 2018 10:15:32 +0000 (10:15 +0000)]
egl/wayland: Use visual map for buffer_from_image

When creating a wl_buffer on an upstream Wayland display from an
existing EGLImage, use the dri2_wl_visual map rather than another
hardcoded list of formats.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years ago egl/wayland: Use visual map for config->format lookup
Daniel Stone [Tue, 6 Feb 2018 10:07:23 +0000 (10:07 +0000)]
egl/wayland: Use visual map for config->format lookup

Having hoisted the format -> config map into common code, we now use it
for config -> format lookups.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years ago egl/wayland: Add format enums to visual map
Daniel Stone [Tue, 6 Feb 2018 09:42:27 +0000 (09:42 +0000)]
egl/wayland: Add format enums to visual map

Extend the visual map, which so far contained only names and bitmasks,
to also carry the three format enums we need. These are the DRIImage
format tokens for internal allocation, FourCC codes for the wl_drm and
dmabuf protocols, and wl_shm codes for swrast drivers.

We will later use these formats to eliminate a bunch of open-coded
conversions.
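
A sketch of what such a table entry and a table-driven conversion might
look like. The struct layout, the image_format enum (a placeholder for the
DRIImage tokens) and the helper names are assumptions; the FourCC and
wl_shm values are written from memory of the DRM and Wayland definitions
and should be checked against the real headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FOURCC(a, b, c, d) \
   ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
    ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Placeholder tokens standing in for the DRIImage format enum. */
enum image_format { IMG_NONE = 0, IMG_XRGB8888, IMG_ARGB8888, IMG_RGB565 };

struct visual {
   const char *name;
   uint32_t fourcc;                /* code used on wl_drm and dmabuf */
   uint32_t shm_code;              /* code used on wl_shm */
   enum image_format image_format; /* token used for internal allocation */
};

static const struct visual visuals[] = {
   { "XRGB8888", FOURCC('X', 'R', '2', '4'), 1, IMG_XRGB8888 },
   { "ARGB8888", FOURCC('A', 'R', '2', '4'), 0, IMG_ARGB8888 },
   { "RGB565",   FOURCC('R', 'G', '1', '6'),
                 FOURCC('R', 'G', '1', '6'), IMG_RGB565 },
};

#define NUM_VISUALS (sizeof(visuals) / sizeof(visuals[0]))

/* FourCC -> image format; IMG_NONE if the code is not in the table. */
static enum image_format
image_format_from_fourcc(uint32_t fourcc)
{
   for (size_t i = 0; i < NUM_VISUALS; i++) {
      if (visuals[i].fourcc == fourcc)
         return visuals[i].image_format;
   }
   return IMG_NONE;
}

/* Image format -> FourCC; 0 if the token is not in the table. */
static uint32_t
fourcc_from_image_format(enum image_format fmt)
{
   for (size_t i = 0; i < NUM_VISUALS; i++) {
      if (visuals[i].image_format == fmt)
         return visuals[i].fourcc;
   }
   return 0;
}
```

Because both directions read the same rows, the two conversions cannot
drift apart the way separate switch statements can.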

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years ago egl/wayland: Use proper enum type in visual definition
Daniel Stone [Tue, 6 Feb 2018 09:33:56 +0000 (09:33 +0000)]
egl/wayland: Use proper enum type in visual definition

No semantic change.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years ago egl/wayland: Widen channel masks to bpp
Daniel Stone [Tue, 6 Feb 2018 18:06:52 +0000 (18:06 +0000)]
egl/wayland: Widen channel masks to bpp

Widen the channel masks given in the visual table to the full width of
the pixel format, i.e. pad them with as many leading zeros as required.

No functional change.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years ago egl/wayland: Hoist format <-> EGLConfig definition up
Daniel Stone [Tue, 6 Feb 2018 09:32:22 +0000 (09:32 +0000)]
egl/wayland: Hoist format <-> EGLConfig definition up

Pull the mapping between Wayland formats and EGLConfigs up to the top
level, so we can reuse it elsewhere.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years ago egl/wayland: Fix ARGB/XRGB transposition in config map
Daniel Stone [Tue, 6 Feb 2018 09:45:01 +0000 (09:45 +0000)]
egl/wayland: Fix ARGB/XRGB transposition in config map

When 0b2b7191214eb moved from an if tree to a struct to map between
wl_drm formats and EGLConfigs, it transposed the mapping between XRGB
and ARGB. Luckily, everyone exposes both formats, so this is harmless.
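
To illustrate why the swap goes unnoticed when both formats are exposed,
here is a toy model, not the code from either commit. It assumes a lookup
keyed on the alpha mask; the format bits, the map_entry struct and
config_supported() are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit flags for the formats a server advertises. */
enum { FMT_XRGB8888 = 1 << 0, FMT_ARGB8888 = 1 << 1 };

struct map_entry {
   unsigned format_bit;
   uint32_t alpha_mask; /* alpha mask of configs using this format */
};

/* XRGB carries no alpha channel, ARGB has alpha in the top byte. */
static const struct map_entry correct_map[] = {
   { FMT_XRGB8888, 0x00000000 },
   { FMT_ARGB8888, 0xff000000 },
};

/* The same table with the two formats swapped. */
static const struct map_entry transposed_map[] = {
   { FMT_XRGB8888, 0xff000000 },
   { FMT_ARGB8888, 0x00000000 },
};

/* A config is usable if the format mapped to its alpha mask is advertised. */
static bool
config_supported(const struct map_entry *map, size_t n,
                 unsigned advertised, uint32_t alpha_mask)
{
   assert(map != NULL);
   for (size_t i = 0; i < n; i++) {
      if (map[i].alpha_mask == alpha_mask)
         return (advertised & map[i].format_bit) != 0;
   }
   return false;
}
```

With both bits advertised, the two tables give the same answer for every
config. They only disagree once a server exposes just one of the formats.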

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: 0b2b7191214eb ("egl/wayland: introduce dri2_wl_add_configs_for_visuals() helper")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years ago st/mesa: generate blend state according to the number of enabled color buffers
Marek Olšák [Wed, 31 Jan 2018 03:37:00 +0000 (04:37 +0100)]
st/mesa: generate blend state according to the number of enabled color buffers

Non-MRT cases always translate blend state for 1 color buffer only.
MRT cases only check and translate blend state for enabled color buffers.
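
The rule above, reduced to a hypothetical helper. This is a sketch under
the assumption that "MRT" means more than one enabled color buffer; the
name num_blend_states and the MAX_COLOR_BUFS limit are invented for the
example and are not taken from st/mesa.

```c
#include <assert.h>

#define MAX_COLOR_BUFS 8 /* hypothetical upper bound for this sketch */

/* Number of per-buffer blend states that need to be translated. */
static unsigned
num_blend_states(unsigned num_enabled_color_bufs)
{
   assert(num_enabled_color_bufs <= MAX_COLOR_BUFS);

   /* Non-MRT (zero or one buffer): a single blend state is enough.
    * MRT: one blend state per enabled color buffer, and no more. */
   return num_enabled_color_bufs > 1 ? num_enabled_color_bufs : 1;
}
```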

This also avoids an assertion failure in translate_blend for:
  dEQP-GLES31.functional.draw_buffers_indexed.overwrite_common.common_advanced_blend_eq_buffer_blend_eq

Reviewed-by: Eric Anholt <eric@anholt.net>
6 years ago st/mesa: don't translate blend state when color writes are disabled
Marek Olšák [Tue, 30 Jan 2018 23:53:16 +0000 (00:53 +0100)]
st/mesa: don't translate blend state when color writes are disabled

Reviewed-by: Eric Anholt <eric@anholt.net>