mesa.git
Samuel Pitoiset [Thu, 21 Dec 2017 16:45:23 +0000 (17:45 +0100)]
radv: reduce the number of small surfaces that need CMASK or DCC

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Ilia Mirkin [Wed, 20 Dec 2017 04:37:25 +0000 (23:37 -0500)]
gm107/ir: use lane 0 for manual textureGrad handling

This is parallel to the pre-SM50 change which does this. Adjusts the
shuffles / quadops to make the values correct relative to lane 0, and
then splat the results to all lanes for the final move into the target
register.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-By: Karol Herbst <kherbst@redhat.com>
Dave Airlie [Thu, 21 Dec 2017 06:52:24 +0000 (16:52 +1000)]
radv/meta: fix blit paths for depth/stencil (v2.1)

This fixes the layout issue for the blit path as well.

This fixes:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint*

v2: use compatible render passes.
v2.1: use enum

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 21 Dec 2017 06:23:30 +0000 (16:23 +1000)]
radv: handle depth/stencil image copy with layouts better. (v3.1)

If we are doing a general->general transfer with HIZ enabled,
we want to hit the tile surface disable bits in radv_emit_fb_ds_state,
however we never get the current layout to know we are in general
and meta hardcoded the transfer layout which is always tile enabled.

This fixes:
dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_general
dEQP-VK.api.copy_and_blit.core.image_to_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.general_general

v2: refactor some shared helpers for blit patches
v3: we only need multiple render passes as they should be compatible.
v3.1: use enum (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 20 Dec 2017 23:00:43 +0000 (09:00 +1000)]
radv: refactor blit2d pipeline creation

This just refactors the gfx9 blit2d pipeline creation
to use fewer lines of code.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 19 Dec 2017 05:42:10 +0000 (15:42 +1000)]
radv/gfx9: add support for 3d images to blit 2d paths

This adds support for a 3D image reading path to the blit 2d paths,
like I did for the clear paths.

Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 19 Dec 2017 03:55:18 +0000 (13:55 +1000)]
radv/gfx9: add 3d sampler image->buffer copy shader. (v3)

On GFX9 we must access 3D textures with 3D samplers AFAICS.

This fixes:
dEQP-VK.api.image_clearing.core.clear_color_image.3d.single_layer

on GFX9 for me.

v1.1: fix tex->sampler_dim to dim
v2: send layer in from outside
v3: don't regress on pre-gfx9

Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 19 Dec 2017 05:41:42 +0000 (15:41 +1000)]
radv: fix surface max layer count (v2)

Looking at traces, I noticed we'd set slice_max too large sometimes.

This should fix it.

v2: fix missing - 1

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Francisco Jerez [Sun, 17 Dec 2017 08:21:13 +0000 (00:21 -0800)]
intel/fs: Initialize fs_visitor::grf_used on construction.

This should shut up some Valgrind errors during pre-regalloc
scheduling.  The errors were harmless since they could only have led
to the estimation of the bank conflict penalty of an instruction
pre-regalloc, which is inaccurate at that point of the program
compilation, but no less accurate than the intended "return 0"
fall-back path.  The scheduling pass is normally re-run after regalloc
with a well-defined grf_used value and accurate bank conflict
information.

Fixes: acf98ff933d "intel/fs: Teach instruction scheduler about GRF bank conflict cycles."
Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Francisco Jerez [Sun, 17 Dec 2017 21:05:55 +0000 (13:05 -0800)]
intel/fs/bank_conflicts: Use posix_memalign() instead of overaligned new to obtain vector storage.

The weight_vector_type constructor was inadvertently assuming C++17
semantics of the new operator applied on a type with alignment
requirement greater than the largest fundamental alignment.
Unfortunately on earlier C++ dialects the implementation was allowed
to raise an allocation failure when the alignment requirement of the
allocated type was unsupported, in an implementation-defined fashion.
It's expected that a C++ implementation recent enough to implement
P0035R4 would have honored allocation requests for such over-aligned
types even if the C++17 dialect wasn't active, which is likely the
reason why this problem wasn't caught by our CI system.

A more elegant fix would involve wrapping the __SSE2__ block in a
'__cpp_aligned_new >= 201606' preprocessor conditional and continue
taking advantage of the language feature, but that would yield lower
compile-time performance on old compilers not implementing it
(e.g. GCC versions older than 7.0).

Fixes: af2c320190f3c731 "intel/fs: Implement GRF bank conflict mitigation pass."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104226
Reported-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
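
The allocation strategy described above can be sketched roughly as follows. This is only an illustration, not the code from the patch: the helper names alloc_aligned_floats() and free_aligned_floats() are hypothetical, and the 16-byte alignment used in the usage note is an assumption chosen to match an SSE2 vector. The only real API relied on is POSIX posix_memalign(), which honors over-aligned requests regardless of the active C++ dialect.

```cpp
#include <stdlib.h>   // posix_memalign(), free() (POSIX)
#include <cstddef>
#include <cstdint>

// Hypothetical helper: obtain storage for `count` floats aligned to
// `alignment` bytes without relying on C++17 over-aligned operator new.
// posix_memalign() requires `alignment` to be a power of two and a
// multiple of sizeof(void *).
inline float *alloc_aligned_floats(std::size_t count, std::size_t alignment)
{
   void *p = nullptr;
   if (posix_memalign(&p, alignment, count * sizeof(float)) != 0)
      return nullptr;   // allocation failed or alignment was invalid
   return static_cast<float *>(p);
}

// Memory obtained from posix_memalign() is released with free(),
// not with delete.
inline void free_aligned_floats(float *p)
{
   free(p);
}
```

A caller would request, for example, alloc_aligned_floats(8, 16) and later hand the pointer to free_aligned_floats(). The trade-off is that the storage has to be managed by hand instead of through new/delete.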
Mark Janes [Thu, 21 Dec 2017 20:15:40 +0000 (12:15 -0800)]
Revert "spirv: consider bitsize when handling OpSwitch cases"

This reverts commit 9702fac68e8bd07be8871f7925d7f9fb98da3699, which
hangs vulkancts and crucible on all platforms.

The patch is being reverted because it disables continuous integration
testing.  The patch from bug 104359 does not apply to master.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104359

Dave Airlie [Thu, 21 Dec 2017 04:03:20 +0000 (14:03 +1000)]
radv: fix issue with multisample positions and interp_var_at_sample.

This fixes vmfaults seen on vega with:
dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_single_sample_.128_128_1.samples_1

These were triggered by the "don't allocate cmask" change, but only by accident.

The actual problem was the shader was trying to get the sample positions from
a buffer, but the buffer was never getting configured to contain them, as the
previous shader never needed them.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 1171b304f3 (radv: overhaul fragment shader sample positions.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Emil Velikov [Thu, 21 Dec 2017 17:38:04 +0000 (17:38 +0000)]
docs: update calendar, add news item and link release notes for 17.3.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Thu, 21 Dec 2017 17:34:52 +0000 (17:34 +0000)]
docs: add sha256 checksums for 17.3.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit f66496d291881f1eaca2ee5d326367fb73537541)

Emil Velikov [Thu, 21 Dec 2017 17:04:41 +0000 (17:04 +0000)]
docs: add release notes for 17.3.1

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit 4f5e85e9e97de4ae6e3d779ff42bf392c4739234)

Samuel Pitoiset [Wed, 20 Dec 2017 19:57:21 +0000 (20:57 +0100)]
radv/gfx9: fix primitive topology when adjacency is used

Found by inspection.

Cc: 17.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Brian Paul [Mon, 18 Dec 2017 19:32:56 +0000 (12:32 -0700)]
glsl: disable vec3 packing/splitting in tfb separate mode

This fixes a varying packing issue when using transform feedback in
GL_SEPARATE_ATTRIBS mode.  By the time we get to linking, we already
know that the number of feedback attributes is under the
GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS limit so packing isn't
as critical.  In fact, packing/splitting vec3 attributes can cause
trouble because splitting effectively creates another TFB output
which can exceed device limits.  So, disable vec3 packing when it's
not needed to avoid that issue.

Fixes the Piglit ext_transform_feedback-separate test on VMware
driver.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Brian Paul [Fri, 15 Dec 2017 22:21:46 +0000 (15:21 -0700)]
glsl: simply packing class comparison

Handle comparing the packing class using the same method as we do
for var->data.is_xfb_only

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Brian Paul [Fri, 15 Dec 2017 22:08:17 +0000 (15:08 -0700)]
glsl: document varying_matches::assign_locations() params and return value

And change *components to components[] as a reminder that it's an array.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Brian Paul [Fri, 15 Dec 2017 21:36:25 +0000 (14:36 -0700)]
glsl: remove some continue statements

In some cases, I think loop code is easier to read without continue
statements.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Brian Paul [Fri, 15 Dec 2017 21:30:26 +0000 (14:30 -0700)]
glsl: use bitwise operators in varying_matches::compute_packing_class()

The mix of bitwise operators with * and + to compute the packing_class
values was a little weird.  Just use bitwise ops instead.

v2: add assertion to make sure interpolation bits fit without collision,
per Timothy.  Basically, rewrite function to be simpler.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
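
As a rough illustration of the idea above (pure bitwise packing, plus an assertion that the interpolation bits cannot collide with the other fields), consider the sketch below. It is not the actual varying_matches::compute_packing_class() code: the function name pack_class(), the three flag bits, and the 3-bit width for the interpolation mode are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed width of the interpolation field (hypothetical, for illustration).
constexpr unsigned INTERP_BITS = 3;

// Hypothetical packing: the interpolation mode lives in the low bits and
// three qualifier flags sit directly above it. Only shifts and ORs are
// used, so each field occupies its own bit range.
inline uint32_t pack_class(bool centroid, bool sample, bool patch,
                           unsigned interp)
{
   // The interpolation value must fit in its field, otherwise it would
   // spill into the flag bits.
   assert(interp < (1u << INTERP_BITS));

   const uint32_t flags = (centroid ? 1u : 0u) |
                          (sample   ? 2u : 0u) |
                          (patch    ? 4u : 0u);
   return (flags << INTERP_BITS) | interp;
}
```

Two varyings would then be eligible for packing together only if their packed values compare equal. With non-overlapping bit ranges, that holds only when every field matches.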
Brian Paul [Fri, 15 Dec 2017 21:27:55 +0000 (14:27 -0700)]
glsl: simplify loop in varying_matches::assign_locations()

The use of break/continue was kind of weird/confusing.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Brian Paul [Fri, 15 Dec 2017 21:25:20 +0000 (14:25 -0700)]
glsl: minor simplification in assign_varying_locations()

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Brian Paul [Fri, 15 Dec 2017 21:23:39 +0000 (14:23 -0700)]
glsl: make varying_matches::is_varying_packing_safe() const

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Brian Paul [Fri, 15 Dec 2017 17:18:00 +0000 (10:18 -0700)]
glsl: trivial comment fixes in lower_packed_varyings.cpp

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>

Andres Gomez [Mon, 18 Dec 2017 19:31:23 +0000 (21:31 +0200)]
docs: update 17.3 and 18.0 cycles for the release calendar

Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Juan A. Suarez Romero [Wed, 20 Dec 2017 10:51:31 +0000 (11:51 +0100)]
spirv: Makefile.nir.am: include vtn_gather_types_c.py script in tarball dist

Fixes: bb1e6ff161c ("spirv: Add a prepass to set types on vtn_values")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Lucas Stach [Tue, 30 May 2017 13:07:13 +0000 (15:07 +0200)]
st/dri: allow direct YUYV import

Push this format to the pipe driver unchanged.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Juan A. Suarez Romero [Tue, 19 Dec 2017 17:55:24 +0000 (17:55 +0000)]
spirv: consider bitsize when handling OpSwitch cases

When walking over all the cases in an OpSwitch, take into account the bitsize
of the literals to avoid getting wrong cases.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
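
To make the bitsize issue concrete, here is a minimal sketch of walking (literal, label) pairs when the selector may be wider than 32 bits. It is not the vtn code from the patch: parse_switch_cases() and its flat word-vector input are hypothetical. The one assumption taken from SPIR-V is that a literal wider than 32 bits occupies two words, low-order word first.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper. `words` holds only the (literal, label) pairs that
// follow the selector and default-label operands of an OpSwitch.
// `bit_size` is the bit width of the selector type.
inline std::vector<std::pair<uint64_t, uint32_t>>
parse_switch_cases(const std::vector<uint32_t> &words, unsigned bit_size)
{
   // A literal wider than 32 bits takes two words (low-order word first).
   const std::size_t lit_words = bit_size > 32 ? 2 : 1;

   std::vector<std::pair<uint64_t, uint32_t>> cases;
   for (std::size_t i = 0; i + lit_words < words.size(); i += lit_words + 1) {
      uint64_t literal = words[i];
      if (lit_words == 2)
         literal |= static_cast<uint64_t>(words[i + 1]) << 32;

      // The label id is the word right after the literal.
      cases.emplace_back(literal, words[i + lit_words]);
   }
   return cases;
}
```

Stepping by a fixed two words per case would misread a 64-bit literal's high word as a label, which is the kind of wrong case the message refers to.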
Tapani Pälli [Wed, 20 Dec 2017 07:23:55 +0000 (09:23 +0200)]
drirc: set allow_glsl_cross_stage_interpolation_mismatch for more games

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104288
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Samuel Iglesias Gonsálvez [Tue, 19 Dec 2017 07:59:36 +0000 (08:59 +0100)]
anv: disallow VK_REMAINING_ARRAY_LAYERS in vkCmdClearAttachments()

Vulkan spec doesn't specify that VK_REMAINING_ARRAY_LAYERS is allowed
in the passed VkClearRect struct.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Ilia Mirkin [Wed, 16 Aug 2017 04:34:43 +0000 (00:34 -0400)]
nvc0/ir: change textureGrad to always use lane 0 as the tex origin

Thanks to Karol Herbst for the debugging / tracing work that led to this
change.

Move to using lane 0 as the "work" lane for the texture. It is unclear
why this helps, as that computation should be identical to doing it in
the "correct" lane with the properly adjusted quadops.

In order to be able to use the lane 0 result, we also have to ensure
that lane 0 contains the proper array/indirect/shadow values.

This applies to Fermi and Kepler. Maxwell+ may or may not need fixing,
but that lowering logic is separate.

Fixes KHR-GL45.texture_cube_map_array.sampling

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Eric Anholt [Tue, 19 Dec 2017 22:23:06 +0000 (14:23 -0800)]
broadcom/vc5: Add missing setting of the UIF XOR disable flag in textures.

Most piglit textures happened to work out by RGBW not changing in that
bit, but it did cause failures in RGBA16F fbo-generatemipmap-formats.

Eric Anholt [Tue, 19 Dec 2017 22:20:19 +0000 (14:20 -0800)]
broadcom/vc5: Clean up the comment and code around level 0 UIF.

I wrote this early in driver development, and our UIF handling is much
better now.

Eric Anholt [Tue, 19 Dec 2017 22:08:18 +0000 (14:08 -0800)]
broadcom/vc5: Simplify the tiling calculations.

The mb_tile_layout table was just the utile_w/h times two, so reuse the
utile code instead.

Eric Anholt [Thu, 16 Nov 2017 20:01:13 +0000 (12:01 -0800)]
broadcom/vc5: Return the depth in all components of depth textures.

Apparently gallium's u_blitter wants depth from at least the .z component,
and other swizzling appears to apply on top of that.  Fixes
fbo-generatemipmap-formats failures with depth formats.

Eric Anholt [Fri, 15 Dec 2017 22:40:43 +0000 (14:40 -0800)]
broadcom/vc5: Enable decompressing RGTC for desktop GL support.

This matches freedreno's behavior.

Eric Anholt [Fri, 15 Dec 2017 22:40:24 +0000 (14:40 -0800)]
broadcom/vc5: Use u_transfer_helper for MSAA mappings.

Eric Anholt [Wed, 6 Dec 2017 02:58:41 +0000 (18:58 -0800)]
broadcom/vc5: Start adding support for rendering to Z32F_S8X24_UINT.

There may be some more RCL work to be done (I think I need to split my Z/S
stores when doing separate stencil), but this gets piglit's "texwrap
GL_ARB_depth_buffer_float" working.

v2: Unwrap the z32f_wrapper before calling the helper, rather than having
    the helper have a callback.
v3: Rebase on Rob Clark's u_transfer_helper instead

Rob Clark [Fri, 25 Aug 2017 11:58:40 +0000 (07:58 -0400)]
freedreno: add debug flag to force high priority context

Mainly for testing, FD_MESA_DEBUG=hiprio will force high priority
contexts.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Thu, 24 Aug 2017 14:22:24 +0000 (10:22 -0400)]
freedreno: context priority support

For devices (and kernels) which support different priority ringbuffers,
expose context priority support.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Wed, 23 Aug 2017 18:39:55 +0000 (14:39 -0400)]
gallium: plumb context priority through to driver

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Rafael Antognolli [Mon, 18 Dec 2017 23:23:11 +0000 (15:23 -0800)]
intel/compiler/gen10: Disable push constants.

We still have gpu hangs on Cannonlake when using push constants, so
disable them for now until we have a proper fix for these hangs.

v2: Add warning message when creating context too.

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Samuel Pitoiset [Mon, 18 Dec 2017 21:06:38 +0000 (22:06 +0100)]
radv: properly load unused gl_LocalInvocationID/gl_WorkGroupID components

F1 2017 looks good now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 18 Dec 2017 18:38:58 +0000 (19:38 +0100)]
radv: do not add extra SGPR when push constants are not used

Just because the vertex stage needs some push constants does not
mean that other stages need them too. This should reduce the number
of loaded SGPRs in some situations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 18 Dec 2017 18:38:57 +0000 (19:38 +0100)]
radv: change the needs_push_constants logic

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 18 Dec 2017 18:38:56 +0000 (19:38 +0100)]
radv: store pipeline stages that need push constants

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 18 Dec 2017 18:38:55 +0000 (19:38 +0100)]
radv: remove one useless check in ac_nir_shader_info_pass()

pipeline->layout can't be NULL now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 18 Dec 2017 18:38:54 +0000 (19:38 +0100)]
radv: remove one useless check in radv_flush_constants()

pipeline->layout can't be NULL now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 18 Dec 2017 18:38:53 +0000 (19:38 +0100)]
radv: add assertions to make sure pipeline layout objects are valid

The spec requires it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 18 Dec 2017 18:38:52 +0000 (19:38 +0100)]
radv: create pipeline layout objects for all meta operations

They are dummy objects but the spec requires layout to not be
NULL, this just makes sure we are creating valid pipeline layout
objects. This will allow us to remove some useless checks.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bas Nieuwenhuizen [Tue, 19 Dec 2017 08:01:32 +0000 (09:01 +0100)]
radv: Use a sort for rebuilding the sparse buffer bo list.

It uses slightly more memory (though still bounded by the number
of mapped ranges), but gives less quadratic behavior.

Cuts 4 minutes from the runtime of the CTS *.sparse.* tests.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
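
The trade-off described above (a temporary copy bounded by the number of mapped ranges, in exchange for avoiding a quadratic duplicate search) can be sketched like this. The function unique_handles() and the use of plain 64-bit handle values are assumptions for illustration. The real radv code works on its own BO structures.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: given one buffer handle per mapped range (with
// repeats, since many ranges may map the same buffer), build the list of
// distinct handles. The by-value parameter is the extra memory: one entry
// per mapped range.
inline std::vector<uint64_t> unique_handles(std::vector<uint64_t> handles)
{
   // O(n log n) sort brings duplicates next to each other...
   std::sort(handles.begin(), handles.end());
   // ...so a single linear pass can drop them.
   handles.erase(std::unique(handles.begin(), handles.end()), handles.end());
   return handles;
}
```

Checking every new handle against the list built so far would instead cost O(n) per range, O(n^2) overall, which matters when a sparse buffer has many mapped ranges.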
Rob Clark [Mon, 18 Dec 2017 20:09:49 +0000 (15:09 -0500)]
freedreno/ir3: handle VTXID_BASE for indirect draws

Need to do some gymnastics to copy the parameter from the indirect
parameters buffer to uniform so shader sees the correct base-vertex-id.

Fixes ./bin/arb_draw_indirect-vertexid on a5xx and probably a4xx too.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 18 Dec 2017 20:06:37 +0000 (15:06 -0500)]
freedreno/ir3: add ctx->mem_to_mem()

For dealing with indirect-draw + gl_VertexID, we'll introduce another
case where we need to use CP_MEM_TO_MEM.  Rather than adding more
if(a5xx)/else make this a ctx vfunc.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 18 Dec 2017 18:34:18 +0000 (13:34 -0500)]
freedreno/a5xx: use vertex_id_zero_base

Cmdstream traces from the blob make it clear that the blob driver devs
*think* a5xx has a real (non-zero-based) vtxid.  But reality claims
differently.

Fixes ./bin/gl-3.2-basevertex-vertexid and probably others.

This means draw-indirect is going to need some gymnastics to copy
base-vertex into uniform.  (a4xx probably needs that too.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
Dave Airlie [Tue, 19 Dec 2017 05:36:53 +0000 (05:36 +0000)]
r600: clear compressed flags in image state on unbind.

If we aren't binding an image, clear the compressed flags.

This fixes a segfault seen with an apitrace.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104331
Signed-off-by: Dave Airlie <airlied@redhat.com>
George Kyriazis [Thu, 14 Dec 2017 18:01:53 +0000 (12:01 -0600)]
swr: Account for index_bias in offsets

When calculating buffer offsets for client buffers account for info.index_bias.

Fixes the following piglit tests:
arb_draw_elements_base_vertex-drawelements-user_varrays
arb_draw_elements_base_vertex-negative-index-user_varrays

Reviewed-By: Bruce Cherniak <bruce.cherniak@intel.com>
Dave Airlie [Mon, 18 Dec 2017 21:38:09 +0000 (21:38 +0000)]
r600: only reported tgsi ir compute support on evergreen+

This fixes a crash on r600/r700.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 18 Dec 2017 20:09:19 +0000 (21:09 +0100)]
radv: Advertise sync fd import and export.

Passes dEQP-VK.*.sync_fd.*

Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 18 Dec 2017 20:02:05 +0000 (21:02 +0100)]
radv: Implement sync file import/export for fences & semaphores.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 18 Dec 2017 19:33:07 +0000 (20:33 +0100)]
radv/amdgpu: wrap sync fd import/export.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 18 Dec 2017 06:53:44 +0000 (16:53 +1000)]
ac/nir: fix lds store for patch outputs.

This wasn't calculating the correct value. This, along with
a nir patch, fixes a regression in:
dEQP-VK.tessellation.shader_input_output.barrier

Fixes: 043d14db30a (ac/nir: don't write tcs outputs to LDS that aren't read back.)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 18 Dec 2017 06:49:43 +0000 (16:49 +1000)]
nir/linking: always set the used_across_stages/outputs_read bits

If we don't remap an output, this code would trample the outputs_read
bits.

This fixes a regression in
dEQP-VK.tessellation.shader_input_output.barrier

Fixes: 1c9c42d16b4c (nir: add varying component packing helpers)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jason Ekstrand [Fri, 15 Dec 2017 03:53:05 +0000 (19:53 -0800)]
spirv: Relax the validation conditions of OpSelect

The Talos Principle contains shaders with an OpSelect between two
vectors where the condition is a scalar boolean.  This is technically
against the spec but nir_builder gracefully handles it by splatting
out the condition to all the channels.  So long as the condition is a
boolean, just emit a warning instead of failing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104246
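
The splatting behavior mentioned above can be pictured with the small sketch below. It is plain C++ rather than nir_builder code: select_with_splat() is a hypothetical stand-in that broadcasts a one-component condition to every channel before doing a component-wise select.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a component-wise select. `cond` must have either
// one component (scalar condition, splatted to all channels) or as many
// components as the operands.
inline std::vector<float> select_with_splat(const std::vector<bool> &cond,
                                            const std::vector<float> &a,
                                            const std::vector<float> &b)
{
   assert(a.size() == b.size());
   assert(cond.size() == 1 || cond.size() == a.size());

   std::vector<float> out(a.size());
   for (std::size_t i = 0; i < a.size(); i++) {
      // A scalar condition applies to every channel.
      const bool c = cond.size() == 1 ? cond[0] : cond[i];
      out[i] = c ? a[i] : b[i];
   }
   return out;
}
```

With a scalar true condition the whole first operand is returned. With a per-channel condition each component is picked independently.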

Samuel Pitoiset [Fri, 15 Dec 2017 17:54:00 +0000 (18:54 +0100)]
radv: remove useless radv_cmask_info::base_address_reg

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 15 Dec 2017 14:37:19 +0000 (15:37 +0100)]
amd/common: add ac_vgt_gs_mode() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 15 Dec 2017 14:37:18 +0000 (15:37 +0100)]
amd/common: add ac_get_cb_shader_mask() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 15 Dec 2017 15:01:56 +0000 (16:01 +0100)]
Revert "radv: do not load unused gl_LocalInvocationID/gl_WorkGroupID components"

This reverts commit 2294d35b243dee15af15895e876a63b7d22e48cc.

We can't do this without adjusting the input SGPRs/VGPRs logic.
For now, just revert it. I will send a proper solution later.

It fixes a rendering issue in F1 2017 that the CTS didn't catch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Dave Airlie [Mon, 18 Dec 2017 05:05:52 +0000 (15:05 +1000)]
radv: port merge tess info from anv

anv merges the tess info correctly, but radv wasn't doing this.

This fixes hangs in
dEQP-VK.tessellation.winding.default_domain.hlsl_triangles_ccw

Fixes: 60fc0544e0 (radv/pipeline: handle tessellation shader compilation)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 27 Nov 2017 23:28:14 +0000 (00:28 +0100)]
radv: Add external fence support.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 27 Nov 2017 23:21:12 +0000 (00:21 +0100)]
radv: Implement VK_KHR_external_fence_fd.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 27 Nov 2017 22:58:35 +0000 (23:58 +0100)]
radv: Implement fences based on syncobjs.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 27 Nov 2017 00:06:11 +0000 (01:06 +0100)]
amd/common: Add detection of the syncobj wait/signal/reset ioctls.

First amdgpu bump after inclusion was 20 (which was done for local BOs).

Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Mon, 27 Nov 2017 00:02:42 +0000 (01:02 +0100)]
radv: Add syncobj signal/reset/wait to winsys.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sat, 16 Dec 2017 23:51:58 +0000 (00:51 +0100)]
configure/meson: Bump libdrm_amdgpu version requirement.

For the radv dependencies on syncobj signal/reset.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Tapani Pälli [Tue, 12 Dec 2017 08:01:57 +0000 (10:01 +0200)]
android: fix vulkan driver build

fixes undefined references by adding missing wsi common API

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tapani Pälli [Tue, 12 Dec 2017 08:01:56 +0000 (10:01 +0200)]
android: fix undefined references to futex API

Fixes: f98a2768ca "mesa: Add new fast mtx_t mutex type for basic use cases"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Mon, 18 Dec 2017 04:28:07 +0000 (04:28 +0000)]
docs: mark GL4.3 as finished for r600

Still only on fp64 supported hw.

Dave Airlie [Fri, 3 Nov 2017 01:53:36 +0000 (11:53 +1000)]
r600: export robust buffer access

Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agor600: export GLSL 430
Dave Airlie [Fri, 3 Nov 2017 01:52:26 +0000 (11:52 +1000)]
r600: export GLSL 430

Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agor600/cs: add compute support to caps
Dave Airlie [Fri, 3 Nov 2017 01:30:12 +0000 (11:30 +1000)]
r600/cs: add compute support to caps

Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agor600: always flush between gfx and compute
Dave Airlie [Fri, 24 Nov 2017 00:51:35 +0000 (10:51 +1000)]
r600: always flush between gfx and compute

This is in no way optimal, but mixing gfx and compute currently seems to
cause problems, including lots of hangs. Mixing should be possible; we just
need to figure out more of the magic.

Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agor600: fix unused variable warning
Dave Airlie [Mon, 18 Dec 2017 04:29:19 +0000 (04:29 +0000)]
r600: fix unused variable warning

Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agoradv: Fix multi-layer blits.
Bas Nieuwenhuizen [Sun, 17 Dec 2017 22:53:37 +0000 (23:53 +0100)]
radv: Fix multi-layer blits.

We did not set the layer correctly for the dst, as we would keep
using the base layer. Same for the source image.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102710
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agofreedreno/a5xx: add a5xx blitter
Rob Clark [Thu, 23 Nov 2017 16:58:31 +0000 (11:58 -0500)]
freedreno/a5xx: add a5xx blitter

Set FD_MESA_DEBUG=noblit to disable it.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: add generic blitter
Rob Clark [Wed, 22 Nov 2017 17:37:15 +0000 (12:37 -0500)]
freedreno: add generic blitter

Basically a clone of util_blitter_blit() but with special handling to
blit PIPE_BUFFER as a PIPE_TEXTURE_1D.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: add non-draw batches for compute/blit
Rob Clark [Fri, 24 Nov 2017 15:37:22 +0000 (10:37 -0500)]
freedreno: add non-draw batches for compute/blit

Get rid of the "gmem" (i.e. tiling) ringbuffer, and just emit setup commands
directly to the "draw" ringbuffer for compute (and in future for blits not
using the 3d pipe). This way we can have a simple flat cmdstream buffer
and bypass setup related to the 3d pipe.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: track staging and shadow perf ctrs for the HUD
Rob Clark [Tue, 21 Nov 2017 18:20:53 +0000 (13:20 -0500)]
freedreno: track staging and shadow perf ctrs for the HUD

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: staging upload transfers
Rob Clark [Mon, 20 Nov 2017 20:34:40 +0000 (15:34 -0500)]
freedreno: staging upload transfers

In the busy && !needs_flush case, we can support a DISCARD_RANGE upload
using a staging buffer.  This is a bit different from the case of mid-
batch uploads which require us to shadow the whole resource (because
later draws in an earlier tile happen before earlier draws in a later
tile).

Signed-off-by: Rob Clark <robdclark@gmail.com>
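
The distinction drawn in the staging upload commit above can be sketched as
a small decision function. This is a hedged illustration rather than the
driver's code: the enum, the function name, the boolean parameters, and the
catch-all "other" result are hypothetical, and only the two cases named in
the commit message are modelled.

```c
#include <stdbool.h>

/* Hypothetical labels for the upload strategies mentioned in the message. */
enum upload_path {
    UPLOAD_PATH_STAGING, /* copy through a staging buffer */
    UPLOAD_PATH_SHADOW,  /* shadow the whole resource */
    UPLOAD_PATH_OTHER,   /* anything the message does not describe */
};

/* Assumption: needs_flush marks a mid-batch upload, busy marks a resource
 * still in use by already-submitted work, and discard_range is set for a
 * DISCARD_RANGE transfer. */
enum upload_path choose_upload_path(bool busy, bool needs_flush,
                                    bool discard_range)
{
    /* Mid-batch upload: tiles replay draws out of order, so the whole
     * resource has to be shadowed. */
    if (needs_flush)
        return UPLOAD_PATH_SHADOW;

    /* busy && !needs_flush: a DISCARD_RANGE upload can use staging. */
    if (busy && discard_range)
        return UPLOAD_PATH_STAGING;

    return UPLOAD_PATH_OTHER;
}
```

Checking needs_flush first mirrors the message's point that the mid-batch
case is the more restrictive one.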
6 years agofreedreno: update generated headers
Rob Clark [Sat, 25 Nov 2017 19:10:34 +0000 (14:10 -0500)]
freedreno: update generated headers

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agoanv: Remove unused variable.
Bas Nieuwenhuizen [Sat, 16 Dec 2017 21:02:11 +0000 (22:02 +0100)]
anv: Remove unused variable.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoradeonsi: don't call force_dcc_off for buffers
Marek Olšák [Tue, 12 Dec 2017 21:21:13 +0000 (22:21 +0100)]
radeonsi: don't call force_dcc_off for buffers

This was undefined yet harmless behavior in LLVM.
Not anymore - it causes a hang now.

Cc: 17.3 <mesa-stable@lists.freedesktop.org>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
6 years agoisl: Don't require VALIGN_2 for R32G32B32_FLOAT on Haswell.
Kenneth Graunke [Fri, 15 Dec 2017 00:17:45 +0000 (16:17 -0800)]
isl: Don't require VALIGN_2 for R32G32B32_FLOAT on Haswell.

According to the RENDER_SURFACE_STATE internal documentation, the
R32G32B32_FLOAT restriction is marked "IVB" only.  We choose to apply
it to Ivybridge and Baytrail, but not Haswell.

Apparently fixes KHR-GL46.texture_size_promotion.functional on Haswell.

Changes these tests from crashing to skipping on Haswell:
- KHR-GL46.direct_state_access.textures_storage_multisample_2d_rgb32f
- KHR-GL46.direct_state_access.textures_storage_multisample_3d_rgb32f

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
6 years agoradeon/uvd: add and manage render picture list
Boyuan Zhang [Fri, 15 Dec 2017 16:23:25 +0000 (11:23 -0500)]
radeon/uvd: add and manage render picture list

Create a list in the decoder to store all render picture buffer pointers that
are currently being used in reference picture lists.

During the get message buffer call, check each pointer in render_pic_list[]
against the given pic->ref[] list and remove any pointer that is no longer
used by pic->ref[]. Then add the current render surface pointer to
render_pic_list[] and assign the associated index to result.curr_idx.

As a result, result.curr_idx will have the correct index to represent the
current render picture, instead of the previous incrementing values.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
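
The bookkeeping described in the radeon/uvd commit above can be sketched as
follows. This is a minimal illustration under assumptions, not the code from
the commit: the struct, the function name, the list size of 16, and the -1
return for a full list are all hypothetical. Only the two steps (prune
unreferenced pointers, then store the current surface and report its index)
come from the commit message.

```c
#include <stddef.h>
#include <stdbool.h>

#define RENDER_PIC_LIST_SIZE 16 /* hypothetical capacity */

/* Hypothetical stand-in for the decoder state holding the list. */
struct decoder_sketch {
    const void *render_pic_list[RENDER_PIC_LIST_SIZE];
};

/* Returns the slot index assigned to the current render picture (the value
 * that would go into result.curr_idx), or -1 if no slot is free. */
int update_render_pic_list(struct decoder_sketch *dec,
                           const void *const *ref, size_t num_refs,
                           const void *cur)
{
    size_t i, j;

    /* Step 1: drop every stored pointer that pic->ref[] no longer uses. */
    for (i = 0; i < RENDER_PIC_LIST_SIZE; i++) {
        bool still_used = false;

        if (dec->render_pic_list[i] == NULL)
            continue;
        for (j = 0; j < num_refs; j++) {
            if (ref[j] == dec->render_pic_list[i]) {
                still_used = true;
                break;
            }
        }
        if (!still_used)
            dec->render_pic_list[i] = NULL;
    }

    /* Step 2: store the current render surface in the first free slot. */
    for (i = 0; i < RENDER_PIC_LIST_SIZE; i++) {
        if (dec->render_pic_list[i] == NULL) {
            dec->render_pic_list[i] = cur;
            return (int)i;
        }
    }
    return -1;
}
```

Because freed slots are reused, the returned index stays within the list
bounds instead of growing with every decoded picture.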
6 years agoradeon/vcn: add and manage render picture list
Boyuan Zhang [Fri, 15 Dec 2017 16:17:32 +0000 (11:17 -0500)]
radeon/vcn: add and manage render picture list

Create a list in the decoder to store all render picture buffer pointers that
are currently being used in reference picture lists.

During the get message buffer call, check each pointer in render_pic_list[]
against the given pic->ref[] list and remove any pointer that is no longer
used by pic->ref[]. Then add the current render surface pointer to
render_pic_list[] and assign the associated index to result.curr_idx.

As a result, result.curr_idx will have the correct index to represent the
current render picture, instead of the previous incrementing values.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
6 years agovl: remove is idr flag
Boyuan Zhang [Thu, 7 Dec 2017 21:13:51 +0000 (16:13 -0500)]
vl: remove is idr flag

Remove the is_idr flag since it is no longer used.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
6 years agost/va: directly use idr pic flag
Boyuan Zhang [Fri, 8 Dec 2017 23:22:25 +0000 (18:22 -0500)]
st/va: directly use idr pic flag

Remove the is_idr flag, and use the idr_pic_flag provided by VAAPI directly.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
6 years agoradeon/vce: determine idr by pic type
Boyuan Zhang [Thu, 7 Dec 2017 21:10:13 +0000 (16:10 -0500)]
radeon/vce: determine idr by pic type

The VAAPI encode interface provides IDR frame flags, whereas the OMX interface
doesn't. Therefore, use the picture type to determine whether a frame is an
IDR frame, which works for both interfaces.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
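
The change described in the radeon/vce commit above amounts to deriving the
IDR decision from the picture type instead of from an interface-specific
flag. A minimal sketch, assuming a hypothetical picture type enum and
struct (the names below are stand-ins, not the driver's identifiers):

```c
#include <stdbool.h>

/* Hypothetical stand-in for the encoder's picture type enum. */
enum enc_picture_type {
    ENC_PICTURE_TYPE_P,
    ENC_PICTURE_TYPE_B,
    ENC_PICTURE_TYPE_I,
    ENC_PICTURE_TYPE_IDR,
};

/* Hypothetical per-picture encode parameters. */
struct enc_picture {
    enum enc_picture_type picture_type;
};

/* Both interfaces fill in the picture type, so this check does not depend
 * on a flag that only one of them provides. */
bool enc_picture_is_idr(const struct enc_picture *pic)
{
    return pic->picture_type == ENC_PICTURE_TYPE_IDR;
}
```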
6 years agoradeon/vcn: determine idr by pic type
Boyuan Zhang [Thu, 30 Nov 2017 16:58:32 +0000 (11:58 -0500)]
radeon/vcn: determine idr by pic type

The VAAPI encode interface provides IDR frame flags, whereas the OMX interface
doesn't. Therefore, use the picture type to determine whether a frame is an
IDR frame, which works for both interfaces.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
6 years agoutil: scons: wire up the sha1 test
Emil Velikov [Thu, 14 Dec 2017 17:20:30 +0000 (17:20 +0000)]
util: scons: wire up the sha1 test

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>