git.libre-soc.org Git - mesa.git/log

Timur Kristóf [Tue, 17 Sep 2019 17:59:52 +0000 (19:59 +0200)]

radv: Enable ACO on Navi.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Leo Liu [Mon, 28 Oct 2019 17:17:04 +0000 (13:17 -0400)]

radeonsi: enable 8K video decode support for HEVC and VP9

HW 8K decode support starts at Renoir

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>

Leo Liu [Mon, 28 Oct 2019 17:08:25 +0000 (13:08 -0400)]

radeon/vcn: Add VP9 8K decode support

Require increase of context buffer size

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>

Rhys Perry [Fri, 18 Oct 2019 12:05:00 +0000 (13:05 +0100)]

aco: try to group together VMEM loads of the same resource

v2: remove accidental shaderInt16 change
v2: simplify can_move_down initialization
v2: simplify VMEM_CLAUSE_MAX_GRAB_DIST

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Daniel Schürmann [Thu, 10 Oct 2019 14:31:40 +0000 (16:31 +0200)]

aco: don't schedule instructions through depending VMEM instructions

Previously, the scheduler tried to move up instructions from below depending
VMEM instructions only to move them down again when scheduling the VMEM
instruction.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Daniel Schürmann [Thu, 10 Oct 2019 12:55:13 +0000 (14:55 +0200)]

aco: add can_reorder flags to load_ubo and load_constant

These got lost due to some refactoring.
Due to the way our scheduler works currently, for now
we add back the reorder flag for divergent loads only.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Daniel Schürmann [Wed, 28 Aug 2019 10:08:12 +0000 (12:08 +0200)]

aco: only skip RAR dependencies if the variable is killed somewhere

This patch changes VMEM scheduling so that VMEM instructions can only
be moved upwards by previous VMEM instructions, but not downwards.
This improves the order of VMEM instructions in relation
to their users.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Daniel Schürmann [Thu, 29 Aug 2019 15:17:32 +0000 (17:17 +0200)]

aco: restrict scheduling depending on max_waves

Previously, we allowed all shaders to reduce the number of max_waves to as low as 5.
Restricting this for shaders with low register demand increases the total number of waves
while the VMEM def-use distances hardly change.
This patch also changes the max number of move operations per MEM instruction.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>

Jason Ekstrand [Tue, 29 Oct 2019 21:10:49 +0000 (16:10 -0500)]

anv: Avoid emitting UBO surface states that won't be used

This shaves around 4-5% off of a CPU-limited example running with the
Dawn WebGPU implementation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Jason Ekstrand [Tue, 29 Oct 2019 22:28:18 +0000 (17:28 -0500)]

intel/vec4: Set brw_stage_prog_data::has_ubo_pull

In 0e4a75f917, Ken added a flag to brw_stage_prog_data which indicates
whether any UBO pulls ever occur.  Unfortunately, he neglected to set
the bit in the vec4 back-end.  This was fine at the time because the
optimization was intended for iris, which does not support Gen7, and using
the vec4 back-end on Gen8+ requires an environment variable.  We want to
use this in Vulkan, which does support Gen7, so we want the information
from the vec4 back-end as well as the scalar one.

Fixes: 0e4a75f917 "intel/compiler: Record whether any pull constant..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Samuel Pitoiset [Mon, 28 Oct 2019 14:12:27 +0000 (15:12 +0100)]

radv: fix perftest options

RADV_PERFTEST=outooforder was removed a while ago. This fixes
dumping the options into hang reports.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Mon, 28 Oct 2019 14:12:03 +0000 (15:12 +0100)]

radv: move nomemorycache debug option to the right place

Fixes: 6571000071d ("radv: add debug option to turn off in memory cache")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Samuel Pitoiset [Mon, 28 Oct 2019 15:56:15 +0000 (16:56 +0100)]

radv: fix dumping SPIR-V into hang reports

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Tapani Pälli [Fri, 25 Oct 2019 08:06:05 +0000 (11:06 +0300)]

mesa: enable ARB_gpu_shader_int64 in compat profile

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Tapani Pälli [Fri, 25 Oct 2019 08:00:04 +0000 (11:00 +0300)]

mesa: add [Program]Uniform*64ARB display list support

This is required for int64 to be enabled in compat profile.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Bas Nieuwenhuizen [Fri, 25 Oct 2019 08:26:50 +0000 (10:26 +0200)]

radv: Enable VK_KHR_timeline_semaphore.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Bas Nieuwenhuizen [Mon, 28 Oct 2019 01:44:54 +0000 (02:44 +0100)]

radv: Add wait-before-submit support for timelines.

This is actually a non-threaded implementation. I'd summarize this
as event-based submission.

When a submit happens we walk a tree of submissions that depend on
the syncobj signal operations to be submitted, and if those submissions
have no other dependencies we start to execute them immediately.

Or, well I still use a list to avoid issues with long chains and
the stacksize when using recursion.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
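
The list-based walk described in this commit message could look roughly
like the sketch below. It is an illustration only, not the radv code:
`struct submission`, `pending_deps`, `process_submission_chain` and the
fixed-size `dependents` array are hypothetical names made up for the
example, and "executing" a submission is reduced to setting a flag.

```c
#include <stddef.h>

#define MAX_DEPENDENTS 4

/* Hypothetical deferred submission; not the real radv structure. */
struct submission {
    int pending_deps;                              /* unsignalled waits left */
    int executed;                                  /* set once "submitted" */
    size_t num_dependents;
    struct submission *dependents[MAX_DEPENDENTS]; /* woken by our signals */
    struct submission *next;                       /* link in the work list */
};

/* Execute `first` (which must have no pending dependencies) and everything
 * that becomes ready as a result. A singly linked work list replaces
 * recursion, so a long dependency chain cannot overflow the stack.
 * Returns the number of submissions executed. */
static int process_submission_chain(struct submission *first)
{
    int count = 0;
    struct submission *head = first, *tail = first;
    first->next = NULL;

    while (head) {
        struct submission *s = head;
        head = s->next;
        if (!head)
            tail = NULL;

        s->executed = 1;
        count++;

        /* Signalling may make dependents ready; queue them at the tail. */
        for (size_t i = 0; i < s->num_dependents; i++) {
            struct submission *d = s->dependents[i];
            if (--d->pending_deps == 0) {
                d->next = NULL;
                if (tail)
                    tail->next = d;
                else
                    head = d;
                tail = d;
            }
        }
    }
    return count;
}
```

A submission is only queued once its last outstanding dependency is
cleared, so each one is executed exactly once, in dependency order.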

Bas Nieuwenhuizen [Tue, 22 Oct 2019 08:18:06 +0000 (10:18 +0200)]

radv: Add timelines with a VK_KHR_timeline_semaphore impl.

This does not fully do wait-before-submit, to be done in a follow
up patch.

For kernels without support for timeline syncobjs, this adds an
implementation of non-shareable timelines using legacy syncobjs.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Bas Nieuwenhuizen [Wed, 23 Oct 2019 13:31:43 +0000 (15:31 +0200)]

radv: Add temporary datastructure for submissions.

So we can defer them.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Bas Nieuwenhuizen [Sun, 20 Oct 2019 20:50:58 +0000 (22:50 +0200)]

radv: Split semaphore into two parts as enum+union.

This is in preparation to adding more types.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
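
As a rough picture of what an enum+union split can look like, here is a
small tagged-union sketch. All names (`sem_kind`, `sem_part`,
`sem_active_part`) are hypothetical, and reading the "two parts" as a
permanent and a temporary payload is an assumption; the log does not
show the actual radv types.

```c
#include <stdint.h>

/* Hypothetical names; the real radv types are not shown in the log. */
enum sem_kind {
    SEM_KIND_NONE,       /* part is unused */
    SEM_KIND_SYNCOBJ,
    SEM_KIND_WINSYS,
};

struct sem_part {
    enum sem_kind kind;  /* selects the active union member */
    union {
        uint32_t syncobj;    /* valid when kind == SEM_KIND_SYNCOBJ */
        void *ws_sem;        /* valid when kind == SEM_KIND_WINSYS */
    };
};

/* Assumption: the two parts are a permanent and a temporary payload. */
struct semaphore {
    struct sem_part permanent;
    struct sem_part temporary;   /* takes precedence while set */
};

static const struct sem_part *sem_active_part(const struct semaphore *sem)
{
    return sem->temporary.kind != SEM_KIND_NONE ? &sem->temporary
                                                : &sem->permanent;
}
```

Adding another semaphore type then means one new enum value and one new
union member, without touching the code that only looks at `kind`.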

Bas Nieuwenhuizen [Sun, 20 Oct 2019 17:15:24 +0000 (19:15 +0200)]

radv: Always enable syncobj when supported for all fences/semaphores.

This simplifies code for timeline semaphores by needing to support
fewer configurations.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Bas Nieuwenhuizen [Sun, 20 Oct 2019 17:12:24 +0000 (19:12 +0200)]

radv: Improve fence signalling in QueueSubmit.

Only signalling it once.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Bas Nieuwenhuizen [Sat, 19 Oct 2019 15:05:22 +0000 (17:05 +0200)]

radv: Do sparse binding in queue submission.

So we have one place to do queue things if we end up deferring
submissions.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Bas Nieuwenhuizen [Thu, 3 Oct 2019 19:08:29 +0000 (21:08 +0200)]

radv: Split out commandbuffer submission.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Bas Nieuwenhuizen [Tue, 1 Oct 2019 16:14:34 +0000 (18:14 +0200)]

radv: Clean up unused variable.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

Bas Nieuwenhuizen [Wed, 30 Oct 2019 02:29:21 +0000 (03:29 +0100)]

radv: Add an early exit in the secure compile if we already have the cache entries.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>

Bas Nieuwenhuizen [Wed, 30 Oct 2019 01:54:37 +0000 (02:54 +0100)]

radv: Compute hashes in secure process for secure compilation.

To prevent poisoning arbitrary cache entries.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>

Erik Faye-Lund [Tue, 29 Oct 2019 13:12:02 +0000 (14:12 +0100)]

zink: drop nop descriptor-updates

If there's nothing to be done, let's actually do nothing. Seems like a
good idea.

Reviewed-by: Dave Airlie <airlied@redhat.com>

Erik Faye-Lund [Tue, 29 Oct 2019 12:27:58 +0000 (13:27 +0100)]

zink: use bitfield for dirty flagging

Bitfields are a bit more idiomatic than explicit flags, and harder to
get wrong.

Reviewed-by: Dave Airlie <airlied@redhat.com>
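
To illustrate the idea of replacing separate boolean flags with a single
dirty mask, here is a minimal sketch. The bit names and the `context`
struct are hypothetical; zink's actual fields are not shown in the log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical dirty bits; zink's actual names are not shown in the log. */
enum dirty_bit {
    DIRTY_SHADERS    = 1u << 0,
    DIRTY_BLEND      = 1u << 1,
    DIRTY_RASTERIZER = 1u << 2,
};

struct context {
    uint32_t dirty;   /* one bit per piece of state, replacing separate bools */
};

static void mark_dirty(struct context *ctx, uint32_t bits)
{
    ctx->dirty |= bits;
}

static bool is_dirty(const struct context *ctx, uint32_t bits)
{
    return (ctx->dirty & bits) != 0;
}

/* Return the pending bits and clear them all in one step, so a newly
 * added state bit cannot be forgotten in the reset path. */
static uint32_t flush_dirty(struct context *ctx)
{
    uint32_t pending = ctx->dirty;
    ctx->dirty = 0;
    return pending;
}
```

With separate booleans, every new piece of state needs its own reset
line; with a mask, clearing is a single assignment.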

Erik Faye-Lund [Tue, 29 Oct 2019 11:43:56 +0000 (12:43 +0100)]

zink: use dynamic state for line-width

This will lead to fewer pipelines in the cache, which is assumed to
become our most unavoidable performance bottle-neck down the line.

Reviewed-by: Dave Airlie <airlied@redhat.com>

Duncan Hopkins [Wed, 14 Aug 2019 10:07:47 +0000 (11:07 +0100)]

zink: Use optimal layout instead of general. Reduces validation layer warnings. Fixes RADV image noise.

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>

Michel Dänzer [Wed, 30 Oct 2019 08:38:20 +0000 (09:38 +0100)]

gitlab-ci: Disable meson-windows job for the time being

It needs a CI runner carrying the mesa-windows tag, but there's none
available currently.

Timothy Arceri [Tue, 29 Oct 2019 06:46:57 +0000 (17:46 +1100)]

radv: make use of radv_sc_read()

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Timothy Arceri [Tue, 29 Oct 2019 06:43:40 +0000 (17:43 +1100)]

radv: add radv_sc_read() helper

This is a function with timeout support for reading from the pipe
between processes used for secure compile.

Initially we hardcode the timeout to 5 seconds. We can adjust the
timeout limit in future if needed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
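
A read helper with a timeout, built on POSIX select() as the neighbouring
commit suggests, might be sketched as follows. The function name
`read_with_timeout` and its signature are assumptions made for this
example; the real radv_sc_read() signature is not shown in the log. For
brevity the sketch treats an interrupted select() (EINTR) as a failure.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical stand-in for a helper like radv_sc_read(). Reads exactly
 * `size` bytes from `fd`, giving up if no data becomes available within
 * `timeout_sec` seconds. */
static bool read_with_timeout(int fd, void *buf, size_t size, long timeout_sec)
{
    char *dst = buf;
    size_t done = 0;

    while (done < size) {
        fd_set fds;
        struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };

        FD_ZERO(&fds);
        FD_SET(fd, &fds);

        /* select() returns 0 on timeout and -1 on error. */
        if (select(fd + 1, &fds, NULL, NULL, &tv) <= 0)
            return false;

        ssize_t n = read(fd, dst + done, size - done);
        if (n <= 0)             /* error, or the writer closed the pipe */
            return false;
        done += (size_t)n;
    }
    return true;
}
```

The timeout is re-armed for every chunk, so it bounds the wait for each
piece of data rather than the total transfer time.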

Timothy Arceri [Tue, 29 Oct 2019 06:41:41 +0000 (17:41 +1100)]

radv: allow select() calls in secure compile

This will be used in the following patch to support timeouts for
reading the pipe between processes.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Lepton Wu [Wed, 30 Oct 2019 00:52:21 +0000 (17:52 -0700)]

mapi: Improve the x86 tsd stubs performance.

This skips touching %ebx most of the time; glGetString performance
increased from 114M/s to 120M/s on my desktop.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>

Lepton Wu [Tue, 22 Oct 2019 03:22:18 +0000 (20:22 -0700)]

mapi: Inline call x86_current_tls.

This saves one return and a simple benchmark which calls glGetString
repeatedly on my desktop shows it improves calls per second from 123M
to 141M.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1997
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>

Lepton Wu [Sat, 26 Oct 2019 00:27:04 +0000 (17:27 -0700)]

mapi: Clean up entry_patch_public for x86 tls

Remove hard coded 16 and use entry_generate_or_patch to patch
public stubs. The generated code actually is slightly tighter
than before since the "nop" instructions before the final "jmp"
get removed.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>

Lepton Wu [Fri, 25 Oct 2019 23:54:35 +0000 (16:54 -0700)]

mapi: split entry_generate_or_patch for x86 tls

The code works exactly the same as before. Just split this function
out so we can reuse it.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>

Jonathan Gray [Fri, 13 Sep 2019 17:09:15 +0000 (10:09 -0700)]

mapi: Adapted libglvnd x86 tsd changes

The x86 assembly language stub in src/mapi/entry_x86_tsd.h does not
generate PIC (position-independent code). This causes text relocations
which cause trouble on recent versions of FreeBSD, OpenBSD, and Android.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108541
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lepton Wu <lepton@chromium.org>

Caio Marcelo de Oliveira Filho [Tue, 29 Oct 2019 19:09:38 +0000 (12:09 -0700)]

spirv: Don't fail if multiple ordering semantics bits are set

Vulkan requires that only one bit for the ordering is set, but old
versions of GLSLang just set all the bits. This was fixed as part of
https://github.com/KhronosGroup/glslang/commit/c51287d744fb6e7e9ccc09f6f8451e6c64b1dad6
but we can still find older versions (or shaders compiled with it)
around.

So instead of failing, emit a warning and fall back to the effective
result of any combination of multiple bits: AcquireRelease.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2018
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
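
The fallback described in this commit can be sketched as below. The bit
values are the ordering bits of the SPIR-V Memory Semantics mask; the
helper name `sanitize_ordering` and the warning text are hypothetical,
and including SequentiallyConsistent in the checked mask is an assumption
of this sketch rather than something stated in the log.

```c
#include <stdint.h>
#include <stdio.h>

/* Ordering bits of the SPIR-V Memory Semantics mask. */
#define SEM_ACQUIRE         0x2u
#define SEM_RELEASE         0x4u
#define SEM_ACQUIRE_RELEASE 0x8u
#define SEM_SEQ_CST         0x10u
#define SEM_ORDER_MASK \
    (SEM_ACQUIRE | SEM_RELEASE | SEM_ACQUIRE_RELEASE | SEM_SEQ_CST)

/* Hypothetical helper: return a mask with at most one ordering bit set,
 * leaving all non-ordering bits untouched. */
static uint32_t sanitize_ordering(uint32_t semantics)
{
    uint32_t order = semantics & SEM_ORDER_MASK;

    /* order & (order - 1) is non-zero iff more than one bit is set. */
    if (order & (order - 1)) {
        fprintf(stderr, "warning: multiple ordering bits set (0x%x), "
                        "using AcquireRelease\n", (unsigned)order);
        semantics = (semantics & ~SEM_ORDER_MASK) | SEM_ACQUIRE_RELEASE;
    }
    return semantics;
}
```

For example, an input with both Acquire and Release set comes back with
only AcquireRelease set, while a mask with a single ordering bit passes
through unchanged.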

Sagar Ghuge [Tue, 15 Oct 2019 21:13:29 +0000 (14:13 -0700)]

intel/isl: Allow stencil buffer to support compression on Gen12+

v2: (Nanley Chery)
- Fix commit title
- Fix comment

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

Sagar Ghuge [Tue, 17 Sep 2019 20:20:16 +0000 (13:20 -0700)]

iris: Resolve stencil resource prior to copy or used by CPU

v2: Decide aux usage in get_copy_region_aux_settings (Nanley Chery)

v3: Use isl_surf_usage_is_stencil function (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

Sagar Ghuge [Tue, 3 Sep 2019 23:30:14 +0000 (16:30 -0700)]

iris: Prepare resources before stencil blit operation

We have to resolve destination surfaces if we are blitting to and from
the same surface.

v2: Revert unrelated change (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

Sagar Ghuge [Wed, 28 Aug 2019 07:21:20 +0000 (00:21 -0700)]

iris: Prepare depth resource if clear_depth enable

Avoid preparing depth resource, if we did fast depth clear before.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

Sagar Ghuge [Wed, 14 Aug 2019 20:58:57 +0000 (13:58 -0700)]

iris: Prepare stencil resource before clear depth stencil

Let aux surface state tracker track the stencil buffer's aux state while
clearing depth stencil buffer.

v2: Fix condition check (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

Sagar Ghuge [Wed, 7 Aug 2019 20:42:39 +0000 (13:42 -0700)]

iris: Resolve stencil buffer lossless compression with WM_HZ_OP packet

Even though stencil buffer compression looks like regular lossless color
compression w/o fast clear support, we have to resolve stencil buffer
with WM_HZ_OP packet.

v2: Check if resource is stencil with helper function (Nanley Chery)

v3: Remove unnecessary included file (Nanley Chery)

v4: (Nanley Chery)
- Avoid stencil buffer aux state transition by improving condition check

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

Sagar Ghuge [Tue, 17 Sep 2019 18:04:15 +0000 (11:04 -0700)]

intel/blorp: Set stencil resolve enable bit

When set, the stencil buffer is filled with the true stencil values and
we have to disable stencil buffer clear enable bit.

v2: 1) Refactor code a little bit (Nanley Chery)
2) Fix assertion (Nanley Chery)

v3: 1) Remove unnecessary assignment (Nanley Chery)
2) Fix GEN_GEN check (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

Sagar Ghuge [Wed, 23 Oct 2019 23:24:46 +0000 (16:24 -0700)]

intel: Track stencil aux usage on Gen12+

Enable stencil compression enable and control surface enable bit if
stencil buffer lossless compression is enabled.

v2: Remove unnecessary GEN_GEN check (Nanley Chery)

v3: (Nanley Chery)
- Change commit subject tag from intel/isl to intel
- Keep assignment order correct

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

Sagar Ghuge [Tue, 15 Oct 2019 18:15:22 +0000 (11:15 -0700)]

intel/blorp: Add helper function for stencil buffer resolve

On Gen12+, Stencil buffer's lossless compression should be resolved
with WM_HZ_OP packet.

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

Sagar Ghuge [Wed, 14 Aug 2019 20:58:33 +0000 (13:58 -0700)]

intel/blorp: Assign correct view while clearing depth stencil

We never saw any failures regarding this typo but it's good to assign
correct stencil view while constructing blorp_params.

Fixes: 0cabf93b80d0 "intel/blorp: Add an entrypoint for clearing depth and stencil"
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

Sagar Ghuge [Wed, 23 Oct 2019 23:17:48 +0000 (16:17 -0700)]

genxml/gen12: Add Stencil Buffer Resolve Enable bit

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

Nanley Chery [Thu, 24 Oct 2019 16:14:07 +0000 (09:14 -0700)]

iris: Allocate main and aux surfaces together

On Gen12, the CCS buffer address doesn't have to be referenced in state
packets. In the case of a stencil buffer with CCS, the kernel won't know
the location of the CCS unless an extra call is made to pin its address.
To avoid this extra call, make the CCS part of the main surface.

v2: Update comment above bo_size. (Jordan)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Nanley Chery [Fri, 25 Oct 2019 22:07:42 +0000 (15:07 -0700)]

iris: Determine aux offsets within configure_aux

If a resource has a modifier, the main and aux surfaces will share a BO.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Nanley Chery [Fri, 25 Oct 2019 22:38:18 +0000 (15:38 -0700)]

iris: Bail resource creation upon aux creation error

The functions used during aux buffer configuration and creation only
return false for exceptional errors. Don't proceed with surface creation
in those cases.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Nanley Chery [Fri, 25 Oct 2019 19:05:58 +0000 (12:05 -0700)]

iris: Drop iris_resource::aux::extra_aux::bo

The primary and secondary aux buffers are always allocated in the same
BO.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Duncan Hopkins [Tue, 24 Sep 2019 15:03:04 +0000 (16:03 +0100)]

zink: pass line width from rast_state to gfx_pipeline_state.

Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>

Jason Ekstrand [Mon, 28 Oct 2019 16:17:06 +0000 (11:17 -0500)]

anv: Reduce the minimum number of relocations

The original value of 256 was under the assumption that you're a batch
buffer which is likely going to have a large number of relocations.
However, pipeline objects on Gen7 will have at most 6 relocations (one
per shader stage and one for the workaround BO) so this is a lot of
per-pipeline wasted space.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>

Jason Ekstrand [Mon, 28 Oct 2019 15:22:47 +0000 (10:22 -0500)]

anv: Delay allocation of relocation lists

The old relocation list code always allocated 256 relocations and a hash
set up-front without knowing whether or not we really need them.  In
particular, in the softpin case, this is two fairly large allocations
that we don't need to be making.  Also, for pipeline objects on Haswell
where we don't have softpin, we don't need relocations unless scratch is
used so this is extra data per-pipeline.  Instead, we should do it
on-demand.  This shaves 3.5% off of a CPU-limited example running with
the Dawn WebGPU implementation.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
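
The on-demand allocation described in this commit follows a common
pattern, sketched below. Everything here is hypothetical: the
`reloc_list` struct, the function names, and the initial capacity of 8
are made up for the example and do not reflect anv's real relocation
list or its hash set.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical relocation list; not anv's real structure. */
struct reloc_list {
    unsigned count;
    unsigned capacity;
    unsigned long long *entries;   /* NULL until the first relocation */
};

static void reloc_list_init(struct reloc_list *list)
{
    /* No allocation here: objects that never add a relocation pay nothing. */
    list->count = 0;
    list->capacity = 0;
    list->entries = NULL;
}

static bool reloc_list_add(struct reloc_list *list, unsigned long long offset)
{
    if (list->count == list->capacity) {
        /* First use allocates; later uses double the capacity. */
        unsigned new_cap = list->capacity ? list->capacity * 2 : 8;
        void *p = realloc(list->entries, new_cap * sizeof(*list->entries));
        if (!p)
            return false;
        list->entries = p;
        list->capacity = new_cap;
    }
    list->entries[list->count++] = offset;
    return true;
}

static void reloc_list_finish(struct reloc_list *list)
{
    free(list->entries);   /* free(NULL) is a no-op */
    reloc_list_init(list);
}
```

Initialization becomes a few stores, and the allocation cost is only paid
by objects that actually record a relocation.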

Plamena Manolova [Wed, 23 Oct 2019 22:47:03 +0000 (23:47 +0100)]

anv: Implement new way for setting streamout buffers.

For gen12 we set the streamout buffers using 4 separate
commands instead of 3DSTATE_SO_BUFFER.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Plamena Manolova [Wed, 23 Oct 2019 22:45:58 +0000 (23:45 +0100)]

iris: Implement new way for setting streamout buffers.

For gen12 we set the streamout buffers using 4 separate
commands instead of 3DSTATE_SO_BUFFER.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

Plamena Manolova [Thu, 17 Oct 2019 20:05:55 +0000 (21:05 +0100)]

genxml: Add 3DSTATE_SO_BUFFER_INDEX_* instructions

For gen12 we set the streamout buffers using 4 separate
commands instead of 3DSTATE_SO_BUFFER.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>

Rob Clark [Thu, 24 Oct 2019 21:29:39 +0000 (14:29 -0700)]

freedreno/a6xx: add a618 support

Signed-off-by: Rob Clark <robdclark@chromium.org>

Rob Clark [Thu, 24 Oct 2019 21:03:32 +0000 (14:03 -0700)]

freedreno/a6xx: cleanup magic registers

Extract out values for the handful of unknown registers which have
different values across different a6xx models, to simplify adding
support for new a6xx's.

Signed-off-by: Rob Clark <robdclark@chromium.org>

Rob Clark [Thu, 24 Oct 2019 21:22:09 +0000 (14:22 -0700)]

freedreno/a6xx: remove some left over dead code

These registers don't exist, just remnants of initial port from a5xx.

Signed-off-by: Rob Clark <robdclark@chromium.org>

Plamena Manolova [Mon, 28 Oct 2019 23:47:39 +0000 (23:47 +0000)]

anv: Set depthBounds to true in anv_GetPhysicalDeviceFeatures.

Add depth bounds testing to the list of supported
physical device features.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Plamena Manolova [Mon, 28 Oct 2019 23:44:28 +0000 (23:44 +0000)]

genxml: Change 3DSTATE_DEPTH_BOUNDS bias.

The bias for the 3DSTATE_DEPTH_BOUNDS instruction
should be 2 not 1.

Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

Michel Dänzer [Fri, 25 Oct 2019 16:59:56 +0000 (18:59 +0200)]

gitlab-ci: Only run the pipeline if any files affecting it have changed

E.g. documentation-only changes cannot affect the outcome of the
pipeline, so don't waste resources on running it.

The thing we need to be careful about here is that the container stage
jobs must always run if any later stage jobs using the corresponding
docker images run. We're currently using the same .ci-run-policy
template for all jobs, so this is trivially true.

v2:
* Add bin/ and common.py (Eric Engestrom)

Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> # v1
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>

Krzysztof Raszkowski [Tue, 29 Oct 2019 14:50:02 +0000 (14:50 +0000)]

gallium/swr: Enable GL_ARB_gpu_shader5: multiple streams

Added support for geometry shader multiple streams (part of
GL_ARB_gpu_shader5 extension).

Reviewed-by: Jan Zielinski <jan.zielinski@intel.com>

Alyssa Rosenzweig [Sun, 27 Oct 2019 23:46:50 +0000 (19:46 -0400)]

panfrost: Remove unused definitions in mali-job.h

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Alyssa Rosenzweig [Sun, 27 Oct 2019 23:46:21 +0000 (19:46 -0400)]

panfrost: Cleanup _shader_upper -> shader

I don't believe this is actually a tagged pointer; warn if it is.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>

Eric Engestrom [Sat, 26 Oct 2019 21:43:50 +0000 (22:43 +0100)]

meson: define _GNU_SOURCE on FreeBSD

_mesa_strtod() needs this to use strtod_l(), which behaves correctly
wrt `,` vs `.` decimal separator.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2008
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>

Lionel Landwerlin [Fri, 20 Sep 2019 18:12:13 +0000 (21:12 +0300)]

intel/perf: update ICL configurations

A few equations/programming changes for ICL.

v2: Fix a couple of issues in naming and floating/integer operations (Ken)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>

Alexandros Frantzis [Tue, 29 Oct 2019 09:01:57 +0000 (11:01 +0200)]

gitlab-ci: Update required libdrm version

Commit 9edcce2a32ed bumped the required libdrm-amdgpu version to
2.4.100. Update the version we use in our CI scripts to avoid CI
build failures.

Also bump the debian image name for this change to take effect.
Note that amdgpu is only built with the debian-buster image,
so only this image requires an update.

Fixes: 9edcce2a ("ac: get tcc_harvested from the kernel")
Signed-off-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>

Eric Engestrom [Tue, 29 Oct 2019 09:24:36 +0000 (09:24 +0000)]

travis: fix scons build after deprecation warning

Fixes: 54053bc8d0dad89a38e2 ("scons: Print a deprecation warning about using scons on not windows")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>

Caio Marcelo de Oliveira Filho [Mon, 28 Oct 2019 21:46:23 +0000 (14:46 -0700)]

anv: Fix output of INTEL_DEBUG=bat for chained batches

The anv_batch_bo contents are linked one to another, and when printing
we have to start with the first of those. Since in `u_vector` new
elements are added to the head, to get the first element we need the
vector's tail.

Fixes: 32ffd90002b ("anv: add support for INTEL_DEBUG=bat")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
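
The head/tail point made in this commit can be seen in a simplified ring
queue. The sketch below is not Mesa's `u_vector`; `struct ring` and the
`ring_*` functions are hypothetical stand-ins that only mirror the
behaviour the message describes: new elements go in at the head, so the
first element added is found at the tail.

```c
#include <stddef.h>

#define RING_CAP 8   /* power of two, fixed for the sketch */

/* Simplified stand-in for a u_vector-style queue of ints. */
struct ring {
    unsigned head;   /* index one past the newest element */
    unsigned tail;   /* index of the oldest element */
    int data[RING_CAP];
};

/* Reserve a slot for a new element at the head; NULL if the ring is full. */
static int *ring_add(struct ring *r)
{
    if (r->head - r->tail == RING_CAP)
        return NULL;
    return &r->data[r->head++ & (RING_CAP - 1)];
}

/* Most recently added element, or NULL if empty. */
static int *ring_head(struct ring *r)
{
    return r->head == r->tail ? NULL
                              : &r->data[(r->head - 1) & (RING_CAP - 1)];
}

/* First element added (the start of a chain), or NULL if empty. */
static int *ring_tail(struct ring *r)
{
    return r->head == r->tail ? NULL : &r->data[r->tail & (RING_CAP - 1)];
}
```

Code that wants to print a chain from its beginning therefore has to
start from the tail, not the head.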

Marek Olšák [Wed, 9 Oct 2019 23:32:42 +0000 (19:32 -0400)]

winsys/amdgpu: use the new GPU reset query

Marek Olšák [Tue, 24 Sep 2019 21:55:52 +0000 (17:55 -0400)]

ac: get tcc_harvested from the kernel

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Marek Olšák [Sat, 26 Oct 2019 00:25:59 +0000 (20:25 -0400)]

radeonsi: initialize shader compilers in threads on demand

It takes a noticeable amount of time with piglit.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>

Marek Olšák [Thu, 24 Oct 2019 04:22:58 +0000 (00:22 -0400)]

radeonsi: don't print diagnostic LLVM remarks and notes

We don't use them.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>

Timur Kristóf [Thu, 24 Oct 2019 15:34:37 +0000 (17:34 +0200)]

aco: Introduce vgpr_limit to keep track of available VGPRs.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Timur Kristóf [Sat, 21 Sep 2019 16:03:56 +0000 (18:03 +0200)]

aco: Implement subgroup shuffle in GFX10 wave64 mode.

Previously subgroup shuffle was implemented using the bpermute
instruction, which only works across half-waves, so by itself it's
not suitable for implementing subgroup shuffle when the shader is
running in wave64 mode.

This commit adds a trick using shared VGPRs that still allows subgroup
shuffle to be implemented relatively effectively in this mode.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Rhys Perry [Thu, 12 Sep 2019 19:04:20 +0000 (20:04 +0100)]

aco: Remove dead code in reduction lowering.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Rhys Perry [Thu, 12 Sep 2019 18:28:52 +0000 (19:28 +0100)]

aco: Fix reductions on GFX10.

Fixes p_reduce (all cluster sizes), p_inclusive_scan and p_exclusive_scan
with all reduction operations.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>

Eric Engestrom [Sat, 5 Oct 2019 21:30:51 +0000 (22:30 +0100)]

loader: default to iris for all future PCI IDs

The existing "fallback" code didn't actually do anything, so this
removes it, and instead we just always fall back to `iris` for future
PCI IDs.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

Eric Engestrom [Thu, 24 Oct 2019 12:29:37 +0000 (13:29 +0100)]

anv: add a couple printflike() annotations

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>

Erik Faye-Lund [Mon, 28 Oct 2019 13:02:02 +0000 (14:02 +0100)]

st/mesa: lower global vars to local after lowering clip

When this code was merged, this wasn't necessary because the
state-tracker would do it later anyway. But this recently got changed,
without changing the code that depended on this.

Arguably, this was a mistake in the lowering pass to begin with. Either
way, let's fix it by not assuming that the lowering code gets called
later when it's not needed.

This fixed user-defined clip-planes in Zink.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: eaffdad1082 ("st/mesa: don't lower_global_vars_to_local for VS if there are no dead inputs")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Sagar Ghuge [Wed, 18 Sep 2019 20:14:31 +0000 (13:14 -0700)]

iris: Create resource with aux_usage MCS_CCS

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

Sagar Ghuge [Wed, 18 Sep 2019 19:37:59 +0000 (12:37 -0700)]

intel/isl: Support lossless compression with multisamples

GEN12 adds the ability to losslessly compress each sample plane in a
multisampled buffer that uses MCS compression.

v2: Remove unnecessary assertion (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

commit | commitdiff | tree

Sagar Ghuge [Fri, 20 Sep 2019 21:05:58 +0000 (14:05 -0700)]

iris: Get correct resource aux usage for copy

Add a case for MCS_CCS so that we get the correct aux usage during a
copy operation.

v2: Fix commit subject (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

commit | commitdiff | tree

Sagar Ghuge [Thu, 10 Oct 2019 17:40:17 +0000 (10:40 -0700)]

intel/blorp: Use isl_aux_usage_has_mcs instead of comparing

Depending on whether the aux usage is MCS_CCS or MCS, we can emit
blorp blit shaders.

As we support both MCS_CCS and MCS, it makes sense to use the
isl_aux_usage_has_mcs function.

v2: Fix commit message (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
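
The motivation for a predicate over a direct comparison can be sketched as follows. The enum is a reduced stand-in with assumed names, and `aux_usage_has_mcs` is a hypothetical helper that only mirrors the idea; it is not a copy of the isl function.

```c
#include <stdbool.h>

/* Reduced stand-in for an aux-usage enum; names are assumptions. */
enum aux_usage {
    AUX_USAGE_NONE,
    AUX_USAGE_MCS,
    AUX_USAGE_MCS_CCS,
    AUX_USAGE_CCS_E,
};

/* True for every usage that involves an MCS surface. A caller that wrote
 * `usage == AUX_USAGE_MCS` would silently miss the MCS_CCS case. */
static inline bool
aux_usage_has_mcs(enum aux_usage usage)
{
    return usage == AUX_USAGE_MCS || usage == AUX_USAGE_MCS_CCS;
}
```

Centralizing the check means a future MCS-based usage only needs to be added in one place.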

commit | commitdiff | tree

Sagar Ghuge [Wed, 18 Sep 2019 20:15:47 +0000 (13:15 -0700)]

iris: Define MCS_CCS state transitions and usages

v2: 1) Fix assertion check (Nanley Chery)
2) Correct commit subject (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

commit | commitdiff | tree

Sagar Ghuge [Thu, 19 Sep 2019 15:13:15 +0000 (08:13 -0700)]

iris: Initialize CCS to fast clear while using with MCS

v2: Explain Bspec quotes properly (Nanley Chery)

v3: 1) Fix comment format (Nanley Chery)
2) Fix typo in comment (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

commit | commitdiff | tree

Sagar Ghuge [Thu, 19 Sep 2019 15:20:34 +0000 (08:20 -0700)]

intel/isl: Don't reconfigure aux surfaces for MCS

If aux for MCS is already configured, don't configure again.

v2: Fix missing period in commit message (Nanley Chery)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>

commit | commitdiff | tree

Erik Faye-Lund [Wed, 23 Oct 2019 09:59:03 +0000 (11:59 +0200)]

zink: emulate optional depth-formats

The Vulkan spec says that an implementation has to support one of
VK_FORMAT_X8_D24_UNORM_PACK32 and VK_FORMAT_D32_SFLOAT, as well as
one of VK_FORMAT_D24_UNORM_S8_UINT and VK_FORMAT_D32_SFLOAT_S8_UINT.

So let's keep track of which one of each pair is supported, and emulate
one on top of the other one.

This won't give the exact result for comparisons, or when mapping and
unmapping the resources. But it's better than flat out failing to create
the resource, and we can fix the map/unmap issue later if needed.

Tested-by: Duncan Hopkins <duncan@thefoundry.co.uk>
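
The pairing logic described above can be sketched roughly like this. The enum values are local stand-ins for the four VkFormat names in the message, and `struct depth_caps` and `resolve_depth_format` are hypothetical; the sketch assumes, per the spec requirement quoted above, that at least one format of each pair is supported.

```c
#include <stdbool.h>

/* Stand-ins for the four VkFormat values named in the commit message. */
enum depth_format {
    FMT_X8_D24_UNORM_PACK32,
    FMT_D32_SFLOAT,
    FMT_D24_UNORM_S8_UINT,
    FMT_D32_SFLOAT_S8_UINT,
    FMT_COUNT,
};

/* Hypothetical capability record, filled in once at screen creation. */
struct depth_caps {
    bool supported[FMT_COUNT];
};

/* Each format has exactly one partner that can stand in for it. */
static enum depth_format
other_in_pair(enum depth_format fmt)
{
    switch (fmt) {
    case FMT_X8_D24_UNORM_PACK32: return FMT_D32_SFLOAT;
    case FMT_D32_SFLOAT:          return FMT_X8_D24_UNORM_PACK32;
    case FMT_D24_UNORM_S8_UINT:   return FMT_D32_SFLOAT_S8_UINT;
    default:                      return FMT_D24_UNORM_S8_UINT;
    }
}

/* Use the requested format when available, otherwise its partner. */
static enum depth_format
resolve_depth_format(const struct depth_caps *caps, enum depth_format wanted)
{
    return caps->supported[wanted] ? wanted : other_in_pair(wanted);
}
```

Because the substitute has a different bit layout, depth comparisons and mapped contents can differ from the requested format, which is the inexactness the message warns about.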

commit | commitdiff | tree

Erik Faye-Lund [Tue, 22 Oct 2019 13:29:55 +0000 (15:29 +0200)]

zink: error if VK_KHR_maintenance1 isn't supported

While we're at it, remove the VK_-prefix from the extension bool; all
extensions have this so it's kinda superfluous.

commit | commitdiff | tree

Nanley Chery [Fri, 2 Aug 2019 22:38:36 +0000 (15:38 -0700)]

iris: Disallow incomplete resource creation

If a modifier specifies an aux, it must be created.

Fixes: 75a3947af46 ("iris/resource: Fall back to no aux if creation fails")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Nanley Chery [Wed, 25 Sep 2019 19:48:57 +0000 (12:48 -0700)]

iris: Don't leak the resource for unsupported modifier

Make sure the res struct is freed before returning.

Fixes: 2dce0e94a3d ("iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
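
The class of bug being fixed is an early return that skips cleanup. A generic sketch follows, with a hypothetical `struct resource`, a placeholder support check, and a made-up constructor name; none of these are the iris functions themselves.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical resource; the real struct carries far more state. */
struct resource {
    uint64_t modifier;
};

/* Placeholder policy for illustration: only modifier 0 is "supported". */
static bool
modifier_is_supported(uint64_t modifier)
{
    return modifier == 0;
}

static struct resource *
resource_create_with_modifier(uint64_t modifier)
{
    struct resource *res = calloc(1, sizeof(*res));
    if (!res)
        return NULL;

    if (!modifier_is_supported(modifier)) {
        /* Without this free(), the early return would leak res. */
        free(res);
        return NULL;
    }

    res->modifier = modifier;
    return res;
}
```

In functions with several failure points, a single `fail:` label that performs the cleanup is a common alternative to repeating `free()` before each return.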

commit | commitdiff | tree

Nanley Chery [Wed, 21 Aug 2019 22:23:24 +0000 (15:23 -0700)]

iris: Enable HIZ_CCS sampling

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Nanley Chery [Wed, 18 Sep 2019 16:44:02 +0000 (09:44 -0700)]

intel/blorp: Satisfy clear color rules for HIZ_CCS

Store the converted depth value into two dwords. This avoids regressing
the piglit test "fbo-depth-array depth-clear" when HIZ_CCS sampling is
enabled in a later commit.

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
