mesa.git
5 years ago virgl: comment on a sync issue in transfers
Chia-I Wu [Fri, 10 May 2019 18:06:49 +0000 (11:06 -0700)]
virgl: comment on a sync issue in transfers

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years ago virgl: PIPE_TRANSFER_READ does not imply flush
Chia-I Wu [Tue, 7 May 2019 20:22:51 +0000 (13:22 -0700)]
virgl: PIPE_TRANSFER_READ does not imply flush

virgl_res_needs_flush should suffice.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years ago virgl: do not skip readback because of explicit flush
Chia-I Wu [Tue, 7 May 2019 17:56:40 +0000 (10:56 -0700)]
virgl: do not skip readback because of explicit flush

Both apps and we (see virgl_buffer_transfer_flush_region) might
flush regions that are unmodified.  We have to read back for those
flushes.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years ago virgl: remove unused virgl_transfer_inline_write
Chia-I Wu [Tue, 7 May 2019 17:01:31 +0000 (10:01 -0700)]
virgl: remove unused virgl_transfer_inline_write

It currently has no user and is probably incorrect (resource_wait is
required in some more cases).  Remove it so that we can focus on
transfers first.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
5 years ago iris/resource: Drop redundant checks for aux support
Nanley Chery [Wed, 1 May 2019 21:57:23 +0000 (14:57 -0700)]
iris/resource: Drop redundant checks for aux support

Drop some checks that are already done by ISL.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years ago iris/resource: Fall back to no aux if creation fails
Nanley Chery [Wed, 1 May 2019 21:42:58 +0000 (14:42 -0700)]
iris/resource: Fall back to no aux if creation fails

No surface requires an auxiliary surface to operate correctly. Fall back
to an uncompressed surface if mesa fails to create and allocate an
auxiliary surface. This enables adding more restrictions to ISL without
having to update iris.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years ago i965/miptree: Refactor intel_miptree_supports_ccs_e()
Nanley Chery [Thu, 2 May 2019 22:41:50 +0000 (15:41 -0700)]
i965/miptree: Refactor intel_miptree_supports_ccs_e()

Update and rename this function to format_supports_ccs_e() to better
match its behavior.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years ago i965/miptree: Drop intel_*_supports_hiz()
Nanley Chery [Mon, 29 Apr 2019 20:00:25 +0000 (13:00 -0700)]
i965/miptree: Drop intel_*_supports_hiz()

intel_tiling_supports_hiz() and intel_miptree_supports_hiz() duplicate
much of the work done by isl_surf_get_hiz_surf(). Replace them with simple
expressions.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years ago isl: Add restrictions to isl_surf_get_hiz_surf()
Nanley Chery [Mon, 29 Apr 2019 19:59:35 +0000 (12:59 -0700)]
isl: Add restrictions to isl_surf_get_hiz_surf()

Import some restrictions from intel_tiling_supports_hiz() and
intel_miptree_supports_hiz().

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years ago i965/miptree: Drop intel_*_supports_ccs()
Nanley Chery [Sat, 27 Apr 2019 01:18:44 +0000 (18:18 -0700)]
i965/miptree: Drop intel_*_supports_ccs()

intel_tiling_supports_ccs() and intel_miptree_supports_ccs() duplicate
much of the work done by isl_surf_get_ccs_surf(). Drop them both and index
a boolean array to choose CCS_D in intel_miptree_choose_aux_usage().

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years ago isl: Add restriction and comments to isl_surf_get_ccs_surf()
Nanley Chery [Sat, 27 Apr 2019 01:17:19 +0000 (18:17 -0700)]
isl: Add restriction and comments to isl_surf_get_ccs_surf()

Import some restrictions and comments from intel_miptree_supports_ccs().

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years ago i965/miptree: Drop intel_miptree_supports_mcs()
Nanley Chery [Sat, 27 Apr 2019 00:00:34 +0000 (17:00 -0700)]
i965/miptree: Drop intel_miptree_supports_mcs()

This function duplicates much of the work done by isl_surf_get_mcs_surf().
Replace it with a simple expression.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years ago isl: Modify restrictions in isl_surf_get_mcs_surf()
Nanley Chery [Fri, 26 Apr 2019 23:52:48 +0000 (16:52 -0700)]
isl: Modify restrictions in isl_surf_get_mcs_surf()

Import some restrictions from intel_miptree_supports_mcs() and don't
assume that the caller knows which device generations are supported.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years ago i965/miptree: Fall back to no aux if creation fails
Nanley Chery [Wed, 24 Apr 2019 20:34:15 +0000 (13:34 -0700)]
i965/miptree: Fall back to no aux if creation fails

No surface requires an auxiliary surface to operate correctly. Fall back
to an uncompressed surface if mesa fails to create and allocate an
auxiliary surface. This enables adding more restrictions to ISL without
having to update i965.

Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years ago mesa: Set _NEW_VARYING_VP_INPUTS iff varying_vp_inputs are set.
Mathias Fröhlich [Sun, 12 May 2019 08:35:52 +0000 (10:35 +0200)]
mesa: Set _NEW_VARYING_VP_INPUTS iff varying_vp_inputs are set.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years ago mesa: Avoid setting _NEW_VARYING_VP_INPUTS in non fixed function mode.
Mathias Fröhlich [Sun, 12 May 2019 08:35:52 +0000 (10:35 +0200)]
mesa: Avoid setting _NEW_VARYING_VP_INPUTS in non fixed function mode.

Instead of checking the API variant on entry of set_varying_vp_inputs
to check if we can ever be interested in fixed function processing
or not, we can check if we are actually fixed function processing.
To check this we can use the immediately updated
gl_context::VertexProgram._VPMode value that tells us if we have a
user provided shader program or if we are in fixed function processing
either through an internal TNL shader or directly through hardware.
When doing so, we also need to recheck the varying_vp_inputs variable
at the time gl_context::VertexProgram._VPMode is set to VP_MODE_FF.
Put asserts at the consumers of gl_context::varying_vp_inputs to make
sure gl_context::VertexProgram._VPMode is set to VP_MODE_FF. By then,
gl_context::varying_vp_inputs should be up to date.

By not looking at the OpenGL API for this decision we should actually
catch more cases where we can avoid setting a state change flag, including
the ones where we cannot get into VP_MODE_FF by the choice of the API.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years ago mesa: Fix test for setting the _NEW_VARYING_VP_INPUTS flag.
Mathias Fröhlich [Sun, 12 May 2019 08:35:52 +0000 (10:35 +0200)]
mesa: Fix test for setting the _NEW_VARYING_VP_INPUTS flag.

The precondition stated in the comment is not true. The values mentioned are
only set from _mesa_update_state which in turn may not yet be called.
For now set the _NEW_VARYING_VP_INPUTS flag a bit more often, we will narrow
that down to a minimum again in a later patch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years ago mesa: Make _mesa_set_varying_vp_inputs static in state.c.
Mathias Fröhlich [Sun, 12 May 2019 08:35:52 +0000 (10:35 +0200)]
mesa: Make _mesa_set_varying_vp_inputs static in state.c.

It is no longer used outside that file.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years ago mesa: Fix old outdated variable name in a comment.
Mathias Fröhlich [Sun, 12 May 2019 08:35:52 +0000 (10:35 +0200)]
mesa: Fix old outdated variable name in a comment.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years ago mesa/vbo: Update Comment to what is actually happening.
Mathias Fröhlich [Sun, 12 May 2019 08:35:52 +0000 (10:35 +0200)]
mesa/vbo: Update Comment to what is actually happening.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
5 years ago wayland/egl: Ensure correct buffer size when allocating
Jonas Ådahl [Mon, 6 May 2019 07:54:27 +0000 (09:54 +0200)]
wayland/egl: Ensure correct buffer size when allocating

Whenever a buffer is allocated, e.g. by the first draw call or EGL call after a
buffer swap, make sure the size is up to date. Prior to this commit, we
failed to do so when querying the buffer age, or swapping buffers
without any prior EGL call or draw call.

Signed-off-by: Jonas Ådahl <jadahl@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years ago egl: check if a window/pixmap is already used on surface creation
Paulo Zanoni [Wed, 1 May 2019 23:26:47 +0000 (16:26 -0700)]
egl: check if a window/pixmap is already used on surface creation

The spec says we can't create another surface if we already created a
surface with the given window or pixmap. Implement this check.

This behavior is exercised by piglit/egl-create-surface.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
5 years ago egl: store the native surface pointer in struct _egl_surface
Paulo Zanoni [Wed, 1 May 2019 22:42:26 +0000 (15:42 -0700)]
egl: store the native surface pointer in struct _egl_surface

Each platform stores this in a different place:
  - platform_drm uses dri2_surf->gbm_surf->base
  - platform_android uses dri2_surf->window
  - platform_wayland uses dri2_surf->wl_win
  - platform_x11 uses dri2_surf->drawable
  - platform_x11_dri3 uses dri3_surf->loader_drawable.drawable
  - haiku doesn't even store it!

We need access to the native surface since the specification asks us
to refuse creating a new surface if there's already an EGLSurface
associated with native_surface.

An alternative to this patch would be to create a new
API.GetNativeWindow callback that each platform would have to
implement. While that's something we can definitely do, I prefer
this approach.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
5 years ago radv: add support for VK_KHR_uniform_buffer_standard_layout
Samuel Pitoiset [Mon, 13 May 2019 16:43:55 +0000 (18:43 +0200)]
radv: add support for VK_KHR_uniform_buffer_standard_layout

Nothing to do.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago softpipe/buffer: load only as many components as the buffer resource type provides
Gert Wollny [Mon, 13 May 2019 12:02:24 +0000 (14:02 +0200)]
softpipe/buffer: load only as many components as the buffer resource type provides

Otherwise we risk reading past the end of the buffer.

In addition, change the loop counters to unsigned to be consistent
with the types.

Fixes: afa8707ba93a7d226a76319acda2a8dd89524db7
    softpipe: add SSBO/shader atomics support.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years ago panfrost: ci: Reduce batch size to 3000
Tomeu Vizoso [Mon, 13 May 2019 05:28:24 +0000 (07:28 +0200)]
panfrost: ci: Reduce batch size to 3000

With the previous value of 5000, we seemed to be reaching OOM in some
circumstances.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years ago panfrost: ci: Update expectations
Tomeu Vizoso [Mon, 13 May 2019 06:05:01 +0000 (08:05 +0200)]
panfrost: ci: Update expectations

Since last Friday, these two tests have been fixed:

dEQP-GLES2.functional.shaders.functions.control_flow.return_in_nested_loop_fragment
dEQP-GLES2.functional.shaders.linkage.varying_7

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years ago freedreno: Fix warning on printing a uint64_t using %llx.
Eric Anholt [Mon, 13 May 2019 18:04:46 +0000 (11:04 -0700)]
freedreno: Fix warning on printing a uint64_t using %llx.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
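A minimal sketch of the usual portable fix for this class of warning, assuming the value is printed through a printf-style call: the `PRIx64` macro from `<inttypes.h>` expands to the correct length modifier for `uint64_t` on each target. The helper name `format_u64_hex` is hypothetical and is not taken from the freedreno sources.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit value as hex without assuming
 * that uint64_t is "unsigned long long", which is what %llx expects. */
static int
format_u64_hex(char *buf, size_t len, uint64_t value)
{
   /* PRIx64 expands to the right modifier ("lx", "llx", ...) per target. */
   return snprintf(buf, len, "0x%" PRIx64, value);
}
```

An alternative that keeps `%llx` is to cast the argument to `unsigned long long` at the call site; which of the two the commit chose is not stated in the message.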
5 years ago freedreno: Silence compiler warnings about "*" in boolean context.
Eric Anholt [Fri, 10 May 2019 00:32:25 +0000 (17:32 -0700)]
freedreno: Silence compiler warnings about "*" in boolean context.

It sure looks like we just want both of them to be nonzero, and && is
probably going to be cheaper than * anyway.
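To illustrate the difference, here is a small sketch with hypothetical function names (this is not the freedreno code). Besides triggering the warning when used directly as a condition, a product of two nonzero values can wrap around to zero, whereas `&&` cannot. The wraparound shown assumes a platform where `unsigned int` is 32 bits wide.

```c
#include <stdbool.h>
#include <stdint.h>

/* Multiplication used as a "both nonzero" test. Written with an explicit
 * comparison here; the bare form "if (a * b)" is what draws the warning. */
static bool
both_nonzero_mul(uint32_t a, uint32_t b)
{
   return (a * b) != 0;   /* can wrap to 0 for large nonzero inputs */
}

/* Logical AND states the intent directly and cannot overflow. */
static bool
both_nonzero_and(uint32_t a, uint32_t b)
{
   return a && b;
}
```

With `a = b = 0x10000`, the product is 2^32, which wraps to 0 in 32-bit unsigned arithmetic, so the two helpers disagree for that input.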

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years ago freedreno: Silence compiler warnings about uninit 'layers'
Eric Anholt [Fri, 10 May 2019 00:30:35 +0000 (17:30 -0700)]
freedreno: Silence compiler warnings about uninit 'layers'

My gcc can't see that the uninitialized value from the PIPE_BUFFER case
isn't used from the !PIPE_BUFFER cases later.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years ago freedreno: Quiet compiler warnings on 64-bit.
Eric Anholt [Fri, 10 May 2019 00:26:44 +0000 (17:26 -0700)]
freedreno: Quiet compiler warnings on 64-bit.

__u64 is a ulonglong on x86_64, not uint64_t, so my gcc was complaining
about the wrong type being passed in.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years ago freedreno: Make emacs indent the way robclark's eclipse does.
Eric Anholt [Fri, 10 May 2019 00:13:14 +0000 (17:13 -0700)]
freedreno: Make emacs indent the way robclark's eclipse does.

The .editorconfig helps with the tabs, but we've got this
two-tabs-from-previous-indentation line continuation style that requires
whacking the c-file-offsets.  This will throw emacs warnings when first
opening a file in the directory; press '!' to shut it up for the future.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years ago freedreno: Make .editorconfig match .dir-locals.el.
Eric Anholt [Thu, 9 May 2019 23:33:56 +0000 (16:33 -0700)]
freedreno: Make .editorconfig match .dir-locals.el.

The editorconfig takes precedence over dir-locals in emacs26 with
editorconfig enabled, so the /.editorconfig was affecting these
directories.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
5 years ago anv: Implement VK_KHR_uniform_buffer_standard_layout
Jason Ekstrand [Tue, 12 Feb 2019 17:02:57 +0000 (11:02 -0600)]
anv: Implement VK_KHR_uniform_buffer_standard_layout

There's no real work to do here since we already support scalar block
layout which is a direct superset of what this extension allows.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years ago vulkan: Update the XML and headers to 1.1.108
Jason Ekstrand [Mon, 13 May 2019 16:08:32 +0000 (11:08 -0500)]
vulkan: Update the XML and headers to 1.1.108

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years ago tu/entrypoints: Import copy
Jason Ekstrand [Mon, 13 May 2019 22:20:12 +0000 (17:20 -0500)]
tu/entrypoints: Import copy

It's used without being imported

5 years ago nv50/ir/nir: make use of SYSTEM_VALUE_MAX when iterating read sysvals
Karol Herbst [Sun, 12 May 2019 13:55:15 +0000 (15:55 +0200)]
nv50/ir/nir: make use of SYSTEM_VALUE_MAX when iterating read sysvals

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
5 years ago nv50/ir/nir: prefer to shift 1ull instead of 1ll
Karol Herbst [Sun, 12 May 2019 05:32:03 +0000 (07:32 +0200)]
nv50/ir/nir: prefer to shift 1ull instead of 1ll

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Pierre Moreau <dev@pmoreau.org>
5 years ago radv: Clean up signalled and submitted fields from winsys fences.
Bas Nieuwenhuizen [Sat, 11 May 2019 22:30:06 +0000 (00:30 +0200)]
radv: Clean up signalled and submitted fields from winsys fences.

Other types like syncobj do not need it, so let's make things a bit more uniform.

Also reduce confusion about what the signalled/submitted fields referred to
(especially with imported fences).

Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years ago radv: bump reported version to 1.1.107
Samuel Pitoiset [Mon, 13 May 2019 16:41:57 +0000 (18:41 +0200)]
radv: bump reported version to 1.1.107

VK_AMD_draw_indirect_count has been promoted with the suffix
changed to KHR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago v3d: Use driconf to expose non-MSAA texture limits for Xorg.
Eric Anholt [Wed, 1 May 2019 22:02:27 +0000 (15:02 -0700)]
v3d: Use driconf to expose non-MSAA texture limits for Xorg.

The V3D 4.2 HW has a limit to MSAA texture sizes of 4096.  With non-MSAA,
we can go up to 7680 (actually probably 8138, but that hasn't been
validated by the HW team).  Exposing 7680 in X11 will allow dual 4k displays.

5 years ago gallium: Redefine the max texture 2d cap from _LEVELS to _SIZE.
Eric Anholt [Mon, 29 Apr 2019 22:38:24 +0000 (15:38 -0700)]
gallium: Redefine the max texture 2d cap from _LEVELS to _SIZE.

The _LEVELS cap assumes that the max is always a power of two.  For V3D 4.2, we
can support up to 7680 non-power-of-two MSAA textures, which will let X11
support dual 4k displays on newer hardware.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago mesa: Replace MaxTextureLevels with MaxTextureSize.
Eric Anholt [Wed, 1 May 2019 21:00:33 +0000 (14:00 -0700)]
mesa: Replace MaxTextureLevels with MaxTextureSize.

In most places (glGetInteger, max_legal_texture_dimensions), we wanted the
number of pixels, not the number of levels.  Number of levels is easily
recovered with util_next_power_of_two() and ffs().  More importantly, for
V3D we want to be able to expose a non-power-of-two maximum texture size
to cover 2x4k displays on HW that can't quite do 8192 wide.
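The recovery of the level count from a pixel size can be sketched as follows. This is a self-contained illustration, not Mesa code: `sketch_next_power_of_two` and `sketch_ffs` are hypothetical stand-ins for `util_next_power_of_two()` and `ffs()`, and `sketch_max_texture_levels` is an assumed name for the combined expression.

```c
#include <stdint.h>

/* Stand-in for util_next_power_of_two(): round up to a power of two.
 * Only meant for sizes up to 2^31. */
static uint32_t
sketch_next_power_of_two(uint32_t v)
{
   if (v <= 1)
      return 1;
   v--;
   v |= v >> 1;
   v |= v >> 2;
   v |= v >> 4;
   v |= v >> 8;
   v |= v >> 16;
   return v + 1;
}

/* Stand-in for ffs(): 1-based index of the lowest set bit, 0 if none. */
static int
sketch_ffs(uint32_t v)
{
   int index = 1;

   if (v == 0)
      return 0;
   while ((v & 1) == 0) {
      v >>= 1;
      index++;
   }
   return index;
}

/* Number of mip levels needed to cover a maximum size given in pixels. */
static int
sketch_max_texture_levels(uint32_t max_texture_size)
{
   return sketch_ffs(sketch_next_power_of_two(max_texture_size));
}
```

For a non-power-of-two limit such as 7680, the size is first rounded up to 8192, which gives 14 levels (8192 down to 1), the same count a true 8192 limit would give.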

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago mesa: Remove proxy image checks for maximum level.
Eric Anholt [Wed, 1 May 2019 21:01:58 +0000 (14:01 -0700)]
mesa: Remove proxy image checks for maximum level.

We've already verified this by _mesa_legal_texture_dimensions() before
this call.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago mesa: Reuse _mesa_max_texture_levels() instead of open-coding it.
Eric Anholt [Wed, 1 May 2019 21:13:18 +0000 (14:13 -0700)]
mesa: Reuse _mesa_max_texture_levels() instead of open-coding it.

The shared function has some extension presence checks, but other than
that has the same switch statement contents.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago intel/tools: Fix build with glibc < 2.27.
Vinson Lee [Fri, 10 May 2019 18:24:18 +0000 (11:24 -0700)]
intel/tools: Fix build with glibc < 2.27.

glibc < 2.27 defines OVERFLOW in /usr/include/math.h.

This patch fixes this build error.

In file included from ../include/c99_math.h:37:0,
                 from ../src/util/u_math.h:44,
                 from ../src/mesa/main/macros.h:35,
                 from ../src/intel/compiler/brw_reg.h:47,
                 from ../src/intel/tools/i965_asm.h:32,
                 from ../src/intel/tools/i965_gram.y:29:
src/intel/tools/i965_gram.tab.c:562:5: error: expected identifier before numeric constant
     OVERFLOW = 412,
     ^

Fixes: 70308a5a8a80 ("intel/tools: New i965 instruction assembler tool")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110656
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Eric Engestrom <eric@engestrom.ch>
5 years ago st/mesa: enable the ST_DEBUG env var in release and debugoptimized builds
Marek Olšák [Fri, 10 May 2019 01:07:57 +0000 (21:07 -0400)]
st/mesa: enable the ST_DEBUG env var in release and debugoptimized builds

Useful for dumping shaders.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years ago radeonsi: overhaul the vertex fetch fixup mechanism
Nicolai Hähnle [Mon, 1 Apr 2019 13:44:39 +0000 (15:44 +0200)]
radeonsi: overhaul the vertex fetch fixup mechanism

The overall goal is to support unaligned loads from vertex buffers
natively on SI.

In the unaligned case, we fall back to the general case implementation in
ac_build_opencoded_load_format. Since this function is fully general,
we will also use it going forward for cases requiring fully manual format
conversions of dwords anyway.

This requires a different encoding of the fix_fetch array, which will now
contain the entire format information if a fixup is required.

Having to check the alignment of vertex buffers is awkward. To keep the
impact on the fast path minimal, the si_context will keep track of which
vertex buffers are (not) at least dword-aligned, while the
si_vertex_elements will note which vertex buffers have some (at most dword)
alignment requirement. Vertex buffers should be dword-aligned most of the
time, which allows a fast early-out in almost all cases.

Add the radeonsi_vs_fetch_always_opencode configuration variable for
testing purposes. Note that it can only be used reliably on LLVM >= 9,
because support for byte and short load is required.

v2:
- add a missing check to si_bind_vertex_elements

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago radeonsi: store sctx->vertex_elements in a local in si_shader_selector_key_vs
Nicolai Hähnle [Mon, 1 Apr 2019 13:45:25 +0000 (15:45 +0200)]
radeonsi: store sctx->vertex_elements in a local in si_shader_selector_key_vs

Purely as a shorthand in the remainder of the function.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago amd/common: add ac_build_opencoded_fetch_format
Nicolai Hähnle [Fri, 29 Mar 2019 22:03:51 +0000 (23:03 +0100)]
amd/common: add ac_build_opencoded_fetch_format

Implement software emulation of buffer_load_format for all types required
by vertex buffer fetches.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago nir/validate: Use a single set for SSA def validation
Jason Ekstrand [Sat, 4 May 2019 23:02:50 +0000 (18:02 -0500)]
nir/validate: Use a single set for SSA def validation

The current SSA def validation we do in nir_validate validates three
things:

 1. That each SSA def is only ever used in the function in which it is
    defined.

 2. That an nir_src exists in an SSA def's use list if and only if it
    points to that SSA def.

 3. That each nir_src is in the correct use list (uses or if_uses) based
    on whether it's an if condition or not.

The way we were doing this before was that we had a hash table which
provided a map from SSA def to a small ssa_def_validate_state data
structure which contained a pointer to the nir_function_impl and two
hash sets, one for each use list.  This meant piles of allocation and
creating of little hash sets.  It also meant one hash lookup for each
SSA def plus one per use as well as two per src (because we have to look
up the ssa_def_validate_state and then look up the use.)  It also
involved a second walk over the instructions as a post-validate step.

This commit changes us to use a single low-collision hash set of SSA
sources for all of this by being a bit more clever.  We accomplish the
objectives above as follows:

 1. The set is empty when we start validating a function.  If the
    nir_src references an SSA def which is defined in a different
    function, it simply won't be in the set.

 2. When validating the SSA defs, we walk the uses and verify that they
    have is_ssa set and that each use points to the SSA def we're
    validating.  This catches the case of a nir_src being in the wrong
    list.  We then put the nir_src in the set and, when we validate the
    nir_src, we assert that it's in the set.  This takes care of any
    cases where a nir_src isn't in the use list.  After checking that
    the nir_src is in the set, we remove it from the set and, at the end
    of nir_function_impl validation, we assert that the set is empty.
    This takes care of any cases where a nir_src is in a use list but
    the instruction is no longer in the shader.

 3. When we put a nir_src in the set, we set the bottom bit of the
    pointer to 1 if it's the condition of an if.  This lets us detect
    whether or not a nir_src is in the right list.
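The low-bit tagging described in item 3 can be sketched like this. The helper names are hypothetical and the code is not taken from nir_validate; it assumes the pointed-to objects are at least 2-byte aligned, so bit 0 of a valid pointer is always zero and is free to carry one flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fold an "is an if condition" flag into bit 0 of the pointer that is
 * stored in the set. Relies on at least 2-byte alignment of the target. */
static const void *
tag_src(const void *src, bool is_if_condition)
{
   return (const void *)((uintptr_t)src | (uintptr_t)is_if_condition);
}

/* Read the flag back from a tagged pointer. */
static bool
tagged_src_is_if(const void *tagged)
{
   return ((uintptr_t)tagged & 1) != 0;
}

/* Recover the original pointer by clearing bit 0. */
static const void *
untag_src(const void *tagged)
{
   return (const void *)((uintptr_t)tagged & ~(uintptr_t)1);
}
```

Because the tagged and untagged values are different set keys, a source that was inserted from one use list and later looked up as belonging to the other would presumably fail the membership check, which is how a single set can cover both lists.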

When running shader-db with an optimized debug build of mesa on my
laptop, I get the following shader-db CPU times:

   With NIR_VALIDATE=0       3033.34 seconds
   Before this commit       20224.83 seconds
   After this commit         6255.50 seconds

Assuming shader-db is a representative sampling of GLSL shaders, this
means that making this change yields an 81% reduction in the time spent
in nir_validate.  It still isn't cheap but enabling validation now only
increases compile times by 2x instead of 6.6x.
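The quoted percentages follow from the three timings above once the baseline is subtracted. The sketch below redoes that arithmetic; the macro and function names are only illustrative.

```c
/* shader-db CPU times quoted in the commit message, in seconds. */
#define T_NO_VALIDATE   3033.34
#define T_BEFORE       20224.83
#define T_AFTER         6255.50

/* Fraction by which the time spent inside nir_validate shrank. */
static double
validate_time_reduction(void)
{
   double before = T_BEFORE - T_NO_VALIDATE;   /* 17191.49 s in validation */
   double after = T_AFTER - T_NO_VALIDATE;     /*  3222.16 s in validation */

   return 1.0 - after / before;
}

/* Total compile time relative to a run with NIR_VALIDATE=0. */
static double
validate_slowdown(double time_with_validation)
{
   return time_with_validation / T_NO_VALIDATE;
}
```

This gives a reduction of roughly 0.81 and slowdown factors of roughly 2.06 after the commit and 6.67 before it, in line with the rounded figures in the message.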

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
5 years ago util/set: Add a helper to resize a set
Jason Ekstrand [Fri, 10 May 2019 18:50:56 +0000 (13:50 -0500)]
util/set: Add a helper to resize a set

Oftentimes you don't know how big a set will be and you want the code
to just grow it as needed.  However, sometimes you do know and you can
avoid a lot of rehashing if you just specify a size up-front.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
5 years ago util/set: Add a search_and_add function
Jason Ekstrand [Fri, 10 May 2019 18:37:42 +0000 (13:37 -0500)]
util/set: Add a search_and_add function

This function is identical to _mesa_set_add except that it takes an
extra out parameter that lets the caller detect if a replacement
happened.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
5 years ago nir/validate: Use a ralloc context for our temporary data
Jason Ekstrand [Fri, 10 May 2019 19:44:45 +0000 (14:44 -0500)]
nir/validate: Use a ralloc context for our temporary data

All of our hash tables and sets are already using ralloc.  There's
really no good reason why we don't just make a ralloc context rather
than try to remember to clean everything up manually.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
5 years ago lima: add Allwinner H5 support
Patrick Lerda [Sun, 12 May 2019 22:03:58 +0000 (00:03 +0200)]
lima: add Allwinner H5 support

The H5 hardware variant requires a specific plb_max_blk number. This
value can't be probed at the hardware level.

Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years ago lima: refactor plb_max_blk
Patrick Lerda [Sun, 12 May 2019 22:03:22 +0000 (00:03 +0200)]
lima: refactor plb_max_blk

Move plb_max_blk to lima_screen, and add a new debug option:
LIMA_PLB_MAX_BLK

Signed-off-by: Patrick Lerda <patrick9876@free.fr>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
5 years ago radv: Do not use extra descriptor space for the 3rd plane.
Bas Nieuwenhuizen [Wed, 8 May 2019 23:57:28 +0000 (01:57 +0200)]
radv: Do not use extra descriptor space for the 3rd plane.

While ImageFormatProperties returns the number of internal descriptors,
it turns out that applications do not need to actually allocate more
descriptors in the descriptor pool.

So if we make descriptors with more planes larger we have to be
conservative and always allocate space for the larger descriptors,
which is a waste given the low usage of this ext.

So let us make use of the fact that 3plane formats all have the
same formats & dimensions for the last two planes. This way we
only need the first half of the descriptor of the 3rd plane and
can share the second half of the second plane.

This allows us to use 16 bytes for the descriptor which nicely
fits into the 16 bytes that are unused right next to the sampler.

Fixes: 5564c38212a "radv: Update descriptor sets for multiple planes."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago radv: Add support for icd loader interface v4.
Bas Nieuwenhuizen [Sat, 5 Jan 2019 12:46:53 +0000 (13:46 +0100)]
radv: Add support for icd loader interface v4.

Adds support for physical device functions unknown to the loader.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago panfrost/midgard: Handle csel correctly
Alyssa Rosenzweig [Tue, 7 May 2019 02:52:08 +0000 (02:52 +0000)]
panfrost/midgard: Handle csel correctly

We use an algebraic pass for the csel optimizations, and use proper
vectorized csel ops (i/fcsel_v) for mixed, rather than lowering.

To avoid regressions along the way, we fix an issue with the copy
propagation pass (it should not attempt to propagate constants).
Similarly, we take care to break bundles when using csel to fix some
scheduler corner cases.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years ago iris: Implement ARB_indirect_parameters
Illia Iorin [Thu, 9 May 2019 21:44:39 +0000 (00:44 +0300)]
iris: Implement ARB_indirect_parameters

iris_draw_vbo is divided into two functions to remove unnecessary
operations from the loop. This implementation of ARB_indirect_parameters
takes into account NV_conditional_render by saving MI_PREDICATE_RESULT
at the start of a draw call and restoring it at the end. The result
of NV_conditional_render is also taken into account when computing
predicates that limit draw calls for ARB_indirect_parameters, in a
similar way to 1952fd8d in ANV.

v2: Optimize indirect draws (suggested by Kenneth Graunke)
v3: (by Kenneth Graunke)
 - Fix an issue where indirect draws wouldn't set patch information
   before updating the compiled TCS.
 - Move some code back to iris_draw_vbo to avoid duplicating it.
 - Fix minor indentation issues.

Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years ago iris: Split iris_update_draw_info into two functions.
Kenneth Graunke [Sun, 12 May 2019 06:43:17 +0000 (23:43 -0700)]
iris: Split iris_update_draw_info into two functions.

Shader draw parameters need updating on each iteration of a multidraw
loop, but the primitive based information only needs to be updated once.

Also, patch information needs to be recorded before filling out the TCS
program key, as it determines the number of HS instances.

5 years ago nir: Fix wrong sign in lower_rcp
Ruslan Kabatsayev [Sat, 11 May 2019 11:04:36 +0000 (14:04 +0300)]
nir: Fix wrong sign in lower_rcp

The nested fma calls were supposed to implement

x_new = x + x * (1 - x*src),

but instead current code is equivalent to

x_new = x - x * (1 - x*src).

The result is that Newton-Raphson steps don't improve precision at all.
This patch fixes this problem.
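A small numeric sketch of why the sign matters, using plain double arithmetic in place of the nested fma calls (the function names are hypothetical, not the NIR lowering code). If x = (1 + e)/src, the correct step yields (1 - e^2)/src, while the flipped sign yields (1 + e)^2/src, so the relative error roughly doubles instead of being squared.

```c
/* One Newton-Raphson refinement of x ~= 1/src:
 *    x_new = x + x * (1 - x*src) */
static double
rcp_step_fixed(double x, double src)
{
   return x + x * (1.0 - x * src);
}

/* The same step with the sign flipped, as in the code before the fix:
 *    x_new = x - x * (1 - x*src) */
static double
rcp_step_wrong_sign(double x, double src)
{
   return x - x * (1.0 - x * src);
}

/* Relative error of x as an approximation of 1/src: |x*src - 1|. */
static double
rcp_rel_error(double x, double src)
{
   double e = x * src - 1.0;

   return e < 0.0 ? -e : e;
}
```

Starting from x = 0.3 as an estimate of 1/3 (relative error about 0.1), the correct step gives 0.33 (error about 0.01), whereas the wrong-sign step gives 0.27 (error about 0.19).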

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110435
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years ago intel: drop misleading driver name from gen_get_device_info()
Mike Blumenkrantz [Mon, 15 Apr 2019 16:20:54 +0000 (12:20 -0400)]
intel: drop misleading driver name from gen_get_device_info()

5 years ago radv: clear vertex bindings while resetting command buffer
Józef Kucia [Fri, 10 May 2019 19:38:22 +0000 (21:38 +0200)]
radv: clear vertex bindings while resetting command buffer

Only vertex inputs accessed by vertex shader must have valid buffers
bound.

Signed-off-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 5010436e09f "radv: bail out when binding the same vertex buffers"
5 years ago st/mesa: fix 2 crashes in st_tgsi_lower_yuv
Marek Olšák [Thu, 25 Apr 2019 22:44:51 +0000 (18:44 -0400)]
st/mesa: fix 2 crashes in st_tgsi_lower_yuv

src/mesa/state_tracker/st_tgsi_lower_yuv.c:68: void reg_dst(struct
 tgsi_full_dst_register *, const struct tgsi_full_dst_register *, unsigned
 int): assertion "dst->Register.WriteMask" failed

The second crash was due to insufficient allocated size for TGSI
instructions.

Cc: 19.0 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agoiris: Use full ways for L3 cache setup on Icelake.
Kenneth Graunke [Fri, 10 May 2019 21:15:53 +0000 (14:15 -0700)]
iris: Use full ways for L3 cache setup on Icelake.

Anuj fixed this in i965 and anv, but the fix never landed in iris.
Fixes tessellation corruption on Icelake.  Thanks to Rafael for
bisecting this and tracking it down.

Fixes: d0996d5fab6 iris: Emit default L3 config for the render pipeline
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
5 years agoanv: Fix limits when VK_EXT_descriptor_indexing is used
Caio Marcelo de Oliveira Filho [Thu, 9 May 2019 08:01:19 +0000 (01:01 -0700)]
anv: Fix limits when VK_EXT_descriptor_indexing is used

Update various limits in
VkPhysicalDeviceDescriptorIndexingPropertiesEXT that were previously
zero to their values from VkPhysicalDeviceLimits.  When using
VK_EXT_descriptor_indexing, the former limits will apply to all the
descriptor layout sets -- not only those using the new feature bits.

For reference, VK_EXT_descriptor_indexing says

    "There are new descriptor set layout and descriptor pool creation
    flags that are required to opt in to the update-after-bind
    functionality, and there are separate maxPerStage* and
    maxDescriptorSet* limits that apply to these descriptor set
    layouts which may be much higher than the pre-existing limits. The
    old limits only count descriptors in non-updateAfterBind
    descriptor set layouts, and the new limits count descriptors in
    all descriptor set layouts in the pipeline layout."

Fixes: 6e230d7607f "anv: Implement VK_EXT_descriptor_indexing"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agovulkan/overlay: keep allocating draw data until it can be reused
Lionel Landwerlin [Thu, 9 May 2019 16:52:11 +0000 (17:52 +0100)]
vulkan/overlay: keep allocating draw data until it can be reused

The original implementation assumed that we could allocate the same
amount of command buffers as the number of images in the swapchain.
But the application could potentially render much faster and rerender
into images that have been submitted for presentation but not yet
presented.

This change keeps on allocating command buffers, vertex buffer, vertex
indices as well as a semaphore and a fence for as long as we can't
reuse a previously submitted one.
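
The allocation policy described above can be sketched roughly as follows. This is an illustrative model only: the names are hypothetical, and a plain in_flight flag stands in for whatever the layer actually uses (a fence, per the description) to decide that a previously submitted entry can be reused.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-draw record; the real one would hold the command
 * buffer, vertex/index buffers, semaphore and fence. */
struct draw_data {
   bool in_flight;
   struct draw_data *next;
};

struct draw_pool {
   struct draw_data *head;
   unsigned allocated;
};

/* Reuse an entry that is no longer in flight, otherwise allocate a new one. */
static struct draw_data *draw_pool_get(struct draw_pool *pool)
{
   struct draw_data *d;

   for (d = pool->head; d != NULL; d = d->next) {
      if (!d->in_flight) {
         d->in_flight = true;
         return d;
      }
   }

   d = calloc(1, sizeof(*d));
   if (d == NULL)
      return NULL;
   d->in_flight = true;
   d->next = pool->head;
   pool->head = d;
   pool->allocated++;
   return d;
}

/* Mark an entry as finished so that it can be handed out again. */
static void draw_pool_retire(struct draw_data *d)
{
   d->in_flight = false;
}

/* Release every entry owned by the pool. */
static void draw_pool_free(struct draw_pool *pool)
{
   while (pool->head != NULL) {
      struct draw_data *next = pool->head->next;
      free(pool->head);
      pool->head = next;
   }
   pool->allocated = 0;
}
```

The pool grows only while every existing entry is still in flight, so its size follows the application's rendering rate rather than the swapchain image count.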

This fixes rendering issues in the overlay at high frame rates.

v2: Don't recreate semaphores constantly (Józef)

v3: Drop useless surface & FreeCommandBuffers (Józef)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110655
Cc: 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Józef Kucia <joseph.kucia@gmail.com>
5 years agovulkan/overlay: fix truncating error on 32bit platforms
Lionel Landwerlin [Thu, 9 May 2019 14:22:40 +0000 (15:22 +0100)]
vulkan/overlay: fix truncating error on 32bit platforms

Non dispatchable handles can be uint64_t. When compiling the layer on
a 32bit platform, this will lead to casting uint64_t into (void *)
which is 32bit, leading to incorrect handles being mapped internally
in the layer.
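
A small model of the truncation, for illustration only: ptr32_t is a hypothetical stand-in for a 32-bit (void *), so that the effect can be observed on any host. It is not the layer's code and does not show how HKEY() is defined.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit pointer: casting a uint64_t handle
 * to a 32-bit (void *) keeps only the low 32 bits, as modelled here. */
typedef uint32_t ptr32_t;

/* Key derived the lossy way, through a pointer-sized cast. */
static ptr32_t key_via_pointer_cast(uint64_t handle)
{
   return (ptr32_t)handle;
}

/* Key that keeps the full 64-bit handle value. */
static uint64_t key_full_width(uint64_t handle)
{
   return handle;
}
```

Two distinct handles such as 0x100000001 and 0x200000001 collapse to the same key under the pointer-sized cast, but stay distinct when the full 64-bit value is kept.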

v2: Use more HKEY() (Eric)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Józef Kucia <joseph.kucia@gmail.com>
Fixes: 2d2927938f074f ("vulkan/overlay-layer: fix cast errors")
Reviewed-by: Józef Kucia <joseph.kucia@gmail.com>
5 years agoi965: Fix memory leaks in brw_upload_cs_work_groups_surface().
Kenneth Graunke [Thu, 9 May 2019 22:40:13 +0000 (15:40 -0700)]
i965: Fix memory leaks in brw_upload_cs_work_groups_surface().

This was taking a reference to the 64kB upload buffer and never
returning it, leaking a reference each time this atom triggered.

This leaked lots of 64kB upload BOs, eventually running us out of
VMA space.  This would usually happen when using mpv to watch a
movie, after 20-40 minutes.
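
The pattern behind this kind of leak can be modelled with a bare reference count. The sketch below uses hypothetical names and is not the i965 code; it only shows how an unbalanced reference accumulates on every invocation.

```c
#include <assert.h>

/* Hypothetical buffer object with a bare reference count. */
struct bo {
   int refcount;
};

static void bo_reference(struct bo *bo)
{
   bo->refcount++;
}

static void bo_unreference(struct bo *bo)
{
   assert(bo->refcount > 0);
   bo->refcount--;
}

/* Leaky shape: takes a reference for the duration of the upload and
 * never gives it back. */
static void upload_leaky(struct bo *upload_bo)
{
   bo_reference(upload_bo);
   /* ... use upload_bo ... */
}

/* Balanced shape: every reference taken is released before returning. */
static void upload_balanced(struct bo *upload_bo)
{
   bo_reference(upload_bo);
   /* ... use upload_bo ... */
   bo_unreference(upload_bo);
}
```

With the leaky shape, the count rises by one per call and the buffer can never be freed. With the balanced shape, it returns to its starting value.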

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110134
Fixes: 63d7b33f516 i965/cs: Setup surface binding for gl_NumWorkGroups
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agost/va: set the visible image dimensions in vlVaDeriveImage
Julien Isorce [Wed, 8 May 2019 19:40:50 +0000 (12:40 -0700)]
st/va: set the visible image dimensions in vlVaDeriveImage

This fixes video being rendered incorrectly.

The user wants a height of 360, but internally pipe_video_buffer's height
is 368 in the test below.
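
The two heights are consistent with the internal buffer being padded up to a multiple of 16 lines. That alignment is an assumption inferred from the numbers, not something stated here. A minimal sketch of the arithmetic:

```c
#include <assert.h>

/* Round value up to the next multiple of alignment (a power of two). */
static unsigned align_up(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}
```

Under that assumption, align_up(360, 16) gives 368, which is why the derived image has to report the visible dimensions rather than the padded ones.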

Test:
  GST_GL_PLATFORM=egl gst-launch-1.0 videotestsrc ! video/x-raw, width=868, height=360, format=NV12 ! vaapipostproc ! glimagesink

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443
Signed-off-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agoswrast: Rename blend_func->swrast_blend_func
Alyssa Rosenzweig [Fri, 10 May 2019 16:22:05 +0000 (16:22 +0000)]
swrast: Rename blend_func->swrast_blend_func

This avoids a conflict with the new (driver-agnostic) blend_func enum in
shader_enum.h, which broke the build of swrast (and i965 by extension).

My apologies :(

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Fixes: f41be53a ("compiler: Add enums for blend state")
Cc: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
5 years agotravis: fix syntax, and drop unused stuff
Eric Engestrom [Thu, 9 May 2019 12:28:41 +0000 (13:28 +0100)]
travis: fix syntax, and drop unused stuff

Fixes: a988d953899c099719f3 "ci: Delete autotools build jobs"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agonir: Add blend_const_color_rgba sysval
Alyssa Rosenzweig [Mon, 6 May 2019 02:00:37 +0000 (02:00 +0000)]
nir: Add blend_const_color_rgba sysval

This represents a float vec4 constant color, as passed to glBlendColor.
While the existing 4 shader sysvals are retained to minimize code churn,
a single vectorized intrinsic is required for efficient blending on
vector architectures. (This may also apply to architectures like
Bifrost where ALU is scalar but load/store is vector; it largely depends
on how blending is implemented per-driver.)

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agogallium: Add helper to convert PIPE blending to shader_enum style
Alyssa Rosenzweig [Mon, 6 May 2019 02:03:48 +0000 (02:03 +0000)]
gallium: Add helper to convert PIPE blending to shader_enum style

Complementing the new API-agnostic shader_enum blending style, we add
helpers to translate between the two forms. Ideally, we could just use
PIPE blending directly, but that makes Vulkan support challenging.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agocompiler: Add enums for blend state
Alyssa Rosenzweig [Mon, 6 May 2019 01:58:56 +0000 (01:58 +0000)]
compiler: Add enums for blend state

We add enums corresponding to (GLES) blend state to shader_enums.h,
complementing the existing advanced blending enums in the file. This
allows us to represent blending state in a driver-agnostic, API-agnostic
way to permit lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
5 years agonir: allow specifying a set of opcodes in lower_alu_to_scalar
Jonathan Marek [Wed, 8 May 2019 16:45:48 +0000 (12:45 -0400)]
nir: allow specifying a set of opcodes in lower_alu_to_scalar

This can be used by both etnaviv and freedreno/a2xx as they are both vec4
architectures with some instructions being scalar-only.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agointel/fs/copy-prop: Don't walk all the ACPs for each instruction
Jason Ekstrand [Sun, 5 May 2019 05:13:20 +0000 (00:13 -0500)]
intel/fs/copy-prop: Don't walk all the ACPs for each instruction

In order to set up KILL sets, the dataflow code was walking the entire
array of ACPs for every instruction.  If you assume the number of ACPs
increases roughly with the number of instructions, this is O(n^2).  As
it turns out, regions_overlap() is not nearly as cheap as one would like
and shows up as a significant chunk on perf traces.

This commit changes things around and instead first builds an array of
exec_lists which it uses like a hash table (keyed off ACP source or
destination) similar to what's done in the rest of the copy-prop code.
By first walking the list of ACPs and populating the table and then
walking instructions and only looking at ACPs which probably have the
same VGRF number, we can reduce the complexity to O(n).  This takes the
execution time of the piglit vs-isnan-dvec test from about 56.4 seconds
on an unoptimized debug build (what we run in CI) with NIR_VALIDATE=0 to
about 38.7 seconds.
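
The bucketing idea can be sketched as below. This is a simplified model with hypothetical names: entries are keyed on a single destination VGRF number and matched exactly, whereas the real pass keys off source or destination and checks region overlap.

```c
#include <stddef.h>

#define NUM_BUCKETS 64   /* illustrative bucket count */

/* Hypothetical, simplified ACP entry: only the destination VGRF number. */
struct acp_entry {
   unsigned dst_vgrf;
   int killed;
   struct acp_entry *next;
};

struct acp_table {
   struct acp_entry *bucket[NUM_BUCKETS];
};

static void acp_table_init(struct acp_table *t)
{
   for (size_t i = 0; i < NUM_BUCKETS; i++)
      t->bucket[i] = NULL;
}

/* First walk: file every entry under its VGRF number. */
static void acp_table_add(struct acp_table *t, struct acp_entry *e)
{
   unsigned b = e->dst_vgrf % NUM_BUCKETS;
   e->killed = 0;
   e->next = t->bucket[b];
   t->bucket[b] = e;
}

/* Second walk, per instruction: only look at the bucket for the VGRF the
 * instruction writes.  Returns how many entries were visited. */
static unsigned acp_table_kill(struct acp_table *t, unsigned written_vgrf)
{
   unsigned visited = 0;
   struct acp_entry *e;

   for (e = t->bucket[written_vgrf % NUM_BUCKETS]; e != NULL; e = e->next) {
      visited++;
      if (e->dst_vgrf == written_vgrf)
         e->killed = 1;
   }
   return visited;
}
```

With 128 entries numbered 0..127, a write to VGRF 5 visits only the two entries in its bucket (5 and 69) instead of all 128.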

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/fs/copy-prop: Purge unused ACPs
Jason Ekstrand [Sun, 5 May 2019 04:51:23 +0000 (23:51 -0500)]
intel/fs/copy-prop: Purge unused ACPs

If the destination of an ACP entry exists only within this block, then
there's no need to keep it for dataflow analysis.  We can delete it from
the out_acp table and avoid growing the bitsets any bigger than we
absolutely have to.  This reduces the maximum number of global ACP
entries in the vs-isnan-dvec with software fp64 on Kaby Lake from 8630
to 3942 and takes the execution time of the piglit vs-isnan-dvec test
from about 1:16.2 on an unoptimized debug build (what we run in CI) with
NIR_VALIDATE=0 to about 56.4 seconds.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agointel/fs/copy-prop: Bump the hash table size to 64
Jason Ekstrand [Sun, 5 May 2019 03:47:59 +0000 (22:47 -0500)]
intel/fs/copy-prop: Bump the hash table size to 64

While the number of ACPs is generally not huge compared to the number of
blocks, 16 does seem a bit small.  Bumping it to 64 takes the execution
time of the piglit vs-isnan-dvec test from about 1:18.1 on an unoptimized
debug build (what we run in CI) with NIR_VALIDATE=0 to about 1:16.2.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agowinsys/amdgpu: add VCN JPEG to no user fence group
Leo Liu [Wed, 8 May 2019 12:13:52 +0000 (08:13 -0400)]
winsys/amdgpu: add VCN JPEG to no user fence group

There is no user fence for JPEG; the bug was triggering the kernel's
WARN_ON(flags & AMDGPU_FENCE_FLAG_64BIT).

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: mesa-stable@lists.freedesktop.org
5 years agolima: fix width 4096 resolution GP fail
Qiang Yu [Thu, 9 May 2019 12:40:24 +0000 (20:40 +0800)]
lima: fix width 4096 resolution GP fail

When width=4096 and shift_w=0, block_w=0x100, which overflows the
8 bits PLBU_CMD has for it.
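
The arithmetic can be reproduced under one assumption that is not stated here: that block_w is the width in 16-pixel tiles, shifted right by shift_w. The helper names below are hypothetical, and this is not the driver's code.

```c
#include <assert.h>

/* Assumption: block_w is the framebuffer width in 16-pixel tiles,
 * shifted right by shift_w. */
static unsigned plb_block_w(unsigned width, unsigned shift_w)
{
   assert(shift_w < 32);
   unsigned tiled_w = (width + 15) / 16;
   return tiled_w >> shift_w;
}

/* True when the value can be stored in an 8-bit field. */
static int fits_in_8_bits(unsigned value)
{
   return value <= 0xff;
}
```

Under that assumption, a width of 4096 with shift_w=0 gives 0x100, one more than an 8-bit field can hold, while 4080 gives 0xff and still fits.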

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
5 years agopanfrost: Add CAPFs for conservative rasterization
Tomeu Vizoso [Thu, 9 May 2019 12:07:45 +0000 (14:07 +0200)]
panfrost: Add CAPFs for conservative rasterization

Just do what everybody else but Nouveau does and return 0.0f.

This prevents the repeated logging of these messages on startup:

Unexpected PIPE_CAPF 6 query
Unexpected PIPE_CAPF 7 query
Unexpected PIPE_CAPF 8 query

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: Only take the fast paths on buffers aligned to block size
Tomeu Vizoso [Wed, 8 May 2019 06:59:30 +0000 (08:59 +0200)]
panfrost: Only take the fast paths on buffers aligned to block size

As the functions operate on 16-byte blocks.

Fixes this Valgrind error:

Invalid read of size 4
   at 0x5857568: swizzle_bpp1_align16 (pan_swizzle.c:85)
   by 0x585780F: panfrost_texture_swizzle (pan_swizzle.c:171)
   by 0x584F587: panfrost_tile_texture (pan_resource.c:489)
   by 0x584F641: panfrost_transfer_unmap (pan_resource.c:525)
   by 0x587718D: u_transfer_helper_transfer_unmap (u_transfer_helper.c:516)
   by 0x5875D85: pipe_transfer_unmap (u_inlines.h:515)
   by 0x5875F13: u_default_texture_subdata (u_transfer.c:80)
   by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480)
   by 0x54005BB: st_TexImage (st_cb_texture.c:1709)
   by 0x5391353: teximage (teximage.c:3105)
   by 0x5391353: teximage_err (teximage.c:3132)
   by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170)
   by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833)
 Address 0x1e94f1e8 is 0 bytes after a block of size 16 alloc'd
   at 0x483F5C8: malloc (vg_replace_malloc.c:299)
   by 0x584F47D: panfrost_transfer_map (pan_resource.c:467)
   by 0x587694D: u_transfer_helper_transfer_map (u_transfer_helper.c:243)
   by 0x5875EA7: u_default_texture_subdata (u_transfer.c:59)
   by 0x53FFDC3: st_TexSubImage (st_cb_texture.c:1480)
   by 0x54005BB: st_TexImage (st_cb_texture.c:1709)
   by 0x5391353: teximage (teximage.c:3105)
   by 0x5391353: teximage_err (teximage.c:3132)
   by 0x5391B9B: _mesa_TexImage2D (teximage.c:3170)
   by 0x5097A77: shared_dispatch_stub_183 (glapi_mapi_tmp.h:18833)
   by 0x4DA8AB: glu::CallLogWrapper::glTexImage2D(unsigned int, int, int, int, int, int, unsigned int, unsigned int, void const*) (in /home/tomeu/deqp-build/modules/gles2/deqp-gles2)
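
A generic sketch of the guard, using hypothetical names and a plain copy in place of the real swizzle routines: the block-based fast path is taken only when the length is a whole number of 16-byte blocks, and everything else falls back to a byte-wise path that never reads past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 16

/* Fast path: processes whole 16-byte blocks, so len must be a multiple
 * of BLOCK_SIZE or it would touch memory past the buffer. */
static void copy_blocks(unsigned char *dst, const unsigned char *src, size_t len)
{
   assert(len % BLOCK_SIZE == 0);
   for (size_t i = 0; i < len; i += BLOCK_SIZE)
      memcpy(dst + i, src + i, BLOCK_SIZE);
}

/* Slow path: safe for any length. */
static void copy_bytes(unsigned char *dst, const unsigned char *src, size_t len)
{
   for (size_t i = 0; i < len; i++)
      dst[i] = src[i];
}

/* Dispatcher: returns 1 when the fast path was taken, 0 otherwise. */
static int copy_row(unsigned char *dst, const unsigned char *src, size_t len)
{
   if (len % BLOCK_SIZE == 0) {
      copy_blocks(dst, src, len);
      return 1;
   }
   copy_bytes(dst, src, len);
   return 0;
}
```

A 20-byte row takes the slow path and a 16-byte row takes the fast one. Both produce identical output.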

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Cc: 19.1 <mesa-stable@lists.freedesktop.org>
5 years agopanfrost: Fix two uninitialized accesses in compiler
Tomeu Vizoso [Tue, 7 May 2019 15:28:36 +0000 (17:28 +0200)]
panfrost: Fix two uninitialized accesses in compiler

Valgrind was complaining of those.

NIR_PASS only sets progress to TRUE if there was progress.

nir_const_load_to_arr() only sets as many constants as the instruction
has components.
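
The NIR_PASS point can be illustrated with a hypothetical macro that behaves as described, setting the flag only when a pass reports progress. RUN_PASS and the toy passes below are invented for this sketch and are not the NIR definitions.

```c
#include <stdbool.h>

/* Hypothetical macro: sets progress to true when the pass reports
 * progress, and otherwise leaves it untouched. */
#define RUN_PASS(progress, pass, ...)      \
   do {                                    \
      if (pass(__VA_ARGS__))               \
         (progress) = true;                \
   } while (0)

/* Toy pass that never changes anything. */
static bool pass_noop(int *ir)
{
   (void)ir;
   return false;
}

/* Toy pass that always changes its input. */
static bool pass_increment(int *ir)
{
   (*ir)++;
   return true;
}

/* Runs one round of passes and reports whether anything changed. */
static bool optimize_once(int *ir, bool run_increment)
{
   /* Must start as false: if no pass makes progress, the macro never
    * writes the variable, and reading it would be an uninitialized use. */
   bool progress = false;

   RUN_PASS(progress, pass_noop, ir);
   if (run_increment)
      RUN_PASS(progress, pass_increment, ir);

   return progress;
}
```

Because such a macro never writes false, the caller has to initialize the flag before the first pass runs.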

This was causing some dEQP tests to flip-flop, such as:

dEQP-GLES2.functional.fragment_ops.blend.equation_src_func_dst_func.add_src_color_constant_color

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Fixes: 14531d676b11 ("nir: make nir_const_value scalar")
5 years agopanfrost: ci: Skip running some tests
Tomeu Vizoso [Tue, 7 May 2019 11:43:14 +0000 (13:43 +0200)]
panfrost: ci: Skip running some tests

These tests add too much time to the total run time, and some of them
even hang the DUTs, although I haven't been able to reproduce that locally.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Don't restart Weston
Tomeu Vizoso [Tue, 7 May 2019 10:57:58 +0000 (12:57 +0200)]
panfrost: ci: Don't restart Weston

There don't seem to actually be any noticeable memory leaks in Weston
when running dEQP. We do seem to leak quite a bit in the client, so we
still have to run the dEQP runner in batches.

This removes the risk of Weston not restarting properly and introducing
spurious failures.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Update list of expected failures
Tomeu Vizoso [Tue, 7 May 2019 08:58:36 +0000 (10:58 +0200)]
panfrost: ci: Update list of expected failures

This matches the current state of things on both RK3288 and RK3399.
Hopefully, from now on we'll only remove stuff from this list.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Tweak dEQP to improve throughput
Tomeu Vizoso [Tue, 7 May 2019 08:00:23 +0000 (10:00 +0200)]
panfrost: ci: Tweak dEQP to improve throughput

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Fix list of tests to run
Tomeu Vizoso [Tue, 7 May 2019 06:45:19 +0000 (08:45 +0200)]
panfrost: ci: Fix list of tests to run

Make sure we have only test case names in the list, excluding names of
test groups.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Check for incomplete runs
Tomeu Vizoso [Tue, 7 May 2019 06:44:03 +0000 (08:44 +0200)]
panfrost: ci: Check for incomplete runs

To improve robustness, check that we got the expected number of results.
Right now we hard-code the expected number of tests run, but with some
effort we may be able to infer it.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Add tests to flip-flop list
Tomeu Vizoso [Mon, 6 May 2019 05:39:42 +0000 (07:39 +0200)]
panfrost: ci: Add tests to flip-flop list

These tests aren't giving reliable results. Mask them for now.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agopanfrost: ci: Add support for running the tests on RK3288
Tomeu Vizoso [Fri, 3 May 2019 15:48:48 +0000 (17:48 +0200)]
panfrost: ci: Add support for running the tests on RK3288

Build artifacts for armhf and schedule them on a Veyron Chromebook with
RK3288.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
5 years agolima: fix tile buffer reloading
Vasily Khoruzhick [Wed, 8 May 2019 02:03:34 +0000 (19:03 -0700)]
lima: fix tile buffer reloading

The buffer needs to be reloaded every time unless an explicit clear() was
called.

Fixes rendering issues with wayland compositors.

Reviewed-by: Qiang Yu <yuq825@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
5 years agoanv: Remove special allocation for anv_push_constants
Caio Marcelo de Oliveira Filho [Tue, 7 May 2019 06:46:42 +0000 (23:46 -0700)]
anv: Remove special allocation for anv_push_constants

The key reason for that mechanism is gone: all the extra optional data
that could be in the anv_push_constants was moved elsewhere.  At this
point, just put anv_push_constants directly in anv_cmd_state (part of
anv_cmd_buffer).

v2: Remove a NULL check we don't need anymore in
    anv_cmd_buffer_push_constants().  (Lionel)
    Fix size we consider for valid push params.  (Lionel)

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoiris: Expose PIPE_CAP_DEVICE_RESET_STATUS_QUERY
Kenneth Graunke [Wed, 8 May 2019 18:33:50 +0000 (11:33 -0700)]
iris: Expose PIPE_CAP_DEVICE_RESET_STATUS_QUERY

This provides a way for the application to query whether any resets have
happened, which lets us expose "robust" contexts.  This also enables the
KHR_robust_buffer_access_behavior tests.

5 years agoiris: Hook up device reset callbacks
Kenneth Graunke [Wed, 8 May 2019 05:26:22 +0000 (22:26 -0700)]
iris: Hook up device reset callbacks

This mechanism lets the driver inform the state tracker about GPU
resets, say for destroying a robust API context and reporting a "device
lost" error to the application, making it take action to deal with this.

5 years agoiris: Try to recover from GPU hangs.
Kenneth Graunke [Wed, 8 May 2019 06:19:30 +0000 (23:19 -0700)]
iris: Try to recover from GPU hangs.

The iris batch module now tries to detect that the kernel has banned
our GEM context, creates a new non-banned context, and informs the
iris context module that all assumptions about state are now invalid
and it needs to reinitialize the relevant state.

Based on Chris Wilson's work, but significantly rewritten by me.

5 years agoiris: Add helpers to clone a hardware context.
Chris Wilson [Wed, 8 May 2019 06:23:13 +0000 (23:23 -0700)]
iris: Add helpers to clone a hardware context.

(Chris Wilson wrote this code in a patch titled "i965: Be resilient in
the face of GPU hangs"; Ken fixed a bug and copied it to iris.)

5 years agoiris: Mark render batches as non-recoverable.
Kenneth Graunke [Wed, 8 May 2019 06:03:46 +0000 (23:03 -0700)]
iris: Mark render batches as non-recoverable.

Adapted from Chris Wilson's patch.  The comment is largely his.

Currently, when iris hangs the GPU, it will continue sending batches
which incrementally update the state, assuming it's preserved across
batches.  However, the kernel's GPU reset support reinitializes the
guilty context to the default GPU state (reasonably not wanting to
trust the current state).  This ends up resetting critical things
like STATE_BASE_ADDRESS, causing memory accesses in all subsequent
batches to be garbage, and almost certainly result in more hangs
until we're banned or we kill the machine.

We now ask the kernel to ban our render context immediately, so we
notice we've gone off the rails as fast as possible.  Eventually, we'll
attempt to recover and continue.  For now, we just avoid torching the
GPU over and over.