Alyssa Rosenzweig [Mon, 9 Mar 2020 18:25:00 +0000 (14:25 -0400)]
pan/bi: Introduce writemasks
I feel so dirty. But this will let the IR be a lot more flexible seeing
as we really are vector in a certain sense (I/O, small types)
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4139>
Alyssa Rosenzweig [Mon, 9 Mar 2020 18:09:04 +0000 (14:09 -0400)]
pan/bi: Generalize swizzles to avoid extracts
We'd really rather not emit extracts. We are approaching on a vector IR
anyway which is annoying but really necessary to handle I/O and fp16
correctly. So let's just go all the way and deal with swizzles and masks
within reason; it'll still be somewhat saner in the long-term.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4139>
Alyssa Rosenzweig [Tue, 10 Mar 2020 00:19:29 +0000 (20:19 -0400)]
panfrost: Move mir_to_bytemask to common code
...also so we can start sharing code properly between the panfrost
compilers.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4139>
Rob Clark [Tue, 10 Mar 2020 15:10:22 +0000 (08:10 -0700)]
freedreno/fdperf: set locale
Set local to get numbers printed w/ commas.. much easier to read that
way.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4119>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4119>
Rob Clark [Sun, 8 Mar 2020 23:42:23 +0000 (16:42 -0700)]
freedreno/computerator: add performance counter support
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4119>
Jason Ekstrand [Mon, 2 Mar 2020 23:26:43 +0000 (17:26 -0600)]
vulkan/wsi: Return an error if dup() fails
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4135>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4135>
Jason Ekstrand [Mon, 2 Mar 2020 23:05:59 +0000 (17:05 -0600)]
vulkan/wsi: Don't leak the FD when GetImageDrmFormatModifierProperties fails
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4135>
Rob Clark [Wed, 4 Mar 2020 17:06:51 +0000 (09:06 -0800)]
freedreno/ir3: try to avoid syncs
Update postsched to be better aware of where costly (ss) syncs would
result. Sometimes it is better to allow a nop or two, to avoid a
sync quickly after an SFU.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Rob Clark [Fri, 28 Feb 2020 20:48:16 +0000 (12:48 -0800)]
freedreno/ir3: round-robin RA
In the second (scalar pass) use the information about # of registers
used in the first pass as the target max, and round-robin within that
range. This generally gives the post-RA sched pass more opportunities
to re-order instructions to remove nop's.
Also, we can be a bit clever when assigning dest registers for SFU
instructions, by picking the register used for it's src (if available
and already assigned). This avoids some (ss) syncs caused by write
after read hazards. (Ie. the SFU instruction will read it's own src
before writing dest.)
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Rob Clark [Fri, 28 Feb 2020 19:02:26 +0000 (11:02 -0800)]
freedreno/ir3: track register usage in first RA pass
We'll use the feedback from the first pass to select a target register
usage in the second pass.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Rob Clark [Fri, 28 Feb 2020 23:45:53 +0000 (15:45 -0800)]
freedreno/ir3: fix has_latency_to_hide
Also count tex-prefetch instructions. And only let the no-latency rule
kick in for frag shaders.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Rob Clark [Fri, 28 Feb 2020 20:47:29 +0000 (12:47 -0800)]
freedreno/ir3: split out has_latency_to_hide()
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Rob Clark [Mon, 9 Mar 2020 21:20:50 +0000 (14:20 -0700)]
util/ra: move NO_REG to header
In the select_reg callback, I want to be able to determine if a given
node is already assigned, and if so what physical register has been
assigned.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Rob Clark [Fri, 28 Feb 2020 18:06:54 +0000 (10:06 -0800)]
util/ra: spiff out select_reg_callback
Add a parameter so the callback can know which node it is selecting a
register for. And remove the graph parameter, as it is unused by
existing users, and somewhat unnecessary (ie. the callback data could
be used instead).
And add a comment so $future_me remembers how this works.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Rob Clark [Sun, 1 Mar 2020 22:16:59 +0000 (14:16 -0800)]
freedreno: fix FD_MESA_DEBUG=inorder
Fixes: 2c07e03b792 ("freedreno: allow ctx->batch to be NULL")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Rob Clark [Wed, 4 Mar 2020 18:51:10 +0000 (10:51 -0800)]
freedreno/ir3: add simplified stall estimation
Doesn't take into account stalls that result from a register written in
a different block, etc. But this should be more useful than just using
number of (ss)'s by trying to estimate how costly a given sync is.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Rob Clark [Tue, 25 Feb 2020 18:42:57 +0000 (10:42 -0800)]
freedreno/ir3: remove extra nops inserted in scheduler
They were inserting a nop between back to back SFU instrucions. But
that doesn't actually appear to be required. And they get stripped out
later anyways before legalize.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Rob Clark [Sun, 1 Mar 2020 22:20:56 +0000 (14:20 -0800)]
freedreno/computerator: add hrsq/hlog2/hexp2
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Rob Clark [Wed, 4 Mar 2020 20:06:30 +0000 (12:06 -0800)]
freedreno/ir3: also lower lowp frag outputs
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Rob Clark [Wed, 4 Mar 2020 19:54:26 +0000 (11:54 -0800)]
nir/print: show variable precision
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
Danylo Piliaiev [Tue, 10 Mar 2020 14:24:59 +0000 (16:24 +0200)]
intel/tools: Fix compilation with UBSan
Compilation failed with several similar errors:
../src/intel/tools/aub_read.c:322:4: error: case label does not reduce to an integer constant
322 | case MAKE_HEADER(TYPE_AUB, OPCODE_AUB, SUBOPCODE_HEADER):
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4132>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4132>
Mathias Fröhlich [Tue, 11 Dec 2018 17:45:43 +0000 (18:45 +0100)]
i965: Use gl_vertex_format in brw_vertex_element.
State upload needs to cope with the vertex format
rather than with the full attribute data.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
Mathias Fröhlich [Tue, 11 Dec 2018 17:45:43 +0000 (18:45 +0100)]
i965: Make use of the vertex format functions in i965.
v2: Style fixes.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
Mathias Fröhlich [Tue, 11 Dec 2018 17:45:43 +0000 (18:45 +0100)]
mesa: Provide gl_vertex_format accessors.
Provide the same set of VAO and current value gl_vertex_format
accessor functions like we have for the gl_array_attributes.
For most purpose the vertex format is what we need.
v2: Style fixes.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
Mathias Fröhlich [Tue, 11 Dec 2018 17:45:43 +0000 (18:45 +0100)]
mesa: Remove now unused _mesa_draw_attrib.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
Mathias Fröhlich [Tue, 11 Dec 2018 17:45:43 +0000 (18:45 +0100)]
mesa: Remove now unused _mesa_draw_attrib_and_binding.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
Mathias Fröhlich [Mon, 9 Mar 2020 21:23:31 +0000 (22:23 +0100)]
i965: Remove glbinding from brw_vertex_element.
v2: Rebase.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
Mathias Fröhlich [Sat, 20 Apr 2019 05:28:19 +0000 (07:28 +0200)]
i965: Reorder workaround flags computation.
Vertex processing workaround flags can be split into
array and current vertex attributes. By that we
can use specific access functions for these different
vertex attribute kinds. This finally obsoletes
some access functions that I introduced last winter
for a smooth transition.
v2: Style fixes.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
Mathias Fröhlich [Tue, 11 Dec 2018 17:45:43 +0000 (18:45 +0100)]
i965: Split merge_inputs and clear_buffers.
The merge_inputs function handles that part that changes when the
inputs change. The clear_buffers function triggers when we may need
a new upload. Thus the merge_inputs can be limited to be once
per brw_draw_prims.
v2: Move declaration of attribute index into the for scope.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
Mathias Fröhlich [Sat, 20 Apr 2019 05:39:56 +0000 (07:39 +0200)]
i965: Test original vertex array pointer to skip array upload.
Rather than do a NULL pointer check on a pointer that may be offset by the
min-max index range of an GL draw operation, execute the NULL test on the
original vertex array pointer.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
Mathias Fröhlich [Sat, 20 Apr 2019 05:57:15 +0000 (07:57 +0200)]
i965: Use the VAOs binding information in array setup.
The change basically reimplements array setup by walking
the gl_contex::Array._DrawVAO on a per binding sequence.
In this way we can make direct use of the application
provided minimum set of buffer objects and emit fewer relocs.
v2: Rebase onto:
compiler: Move double_inputs to gl_program::DualSlotInputs
v3: Rebase onto introduction of gl_vertex_format
v4: Reorder and extend patch series.
v5: Split out two hunks into seperate patches.
v6: Avoid using GL* types.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
Mathias Fröhlich [Sat, 20 Apr 2019 05:57:12 +0000 (07:57 +0200)]
i965: Use 32 bit u_bit_scan for vertex attribute setup.
The vertex array object contains 32 vertex arrays. By that we cannot
reference more then these in the vertex shader inputs. So, we can use
the 32 bits u_bit_scan function to iterate the vertex shader inputs
and place an assert that only these are present. Also place an other
assert that the vertex array setup in i965 does not overrun the
enabled array in brw_context::vs::enabled.
v2: Style fixes.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
Mathias Fröhlich [Sat, 2 Nov 2019 07:06:03 +0000 (08:06 +0100)]
iris: Move down iris_emit_sbe_swiz in profiles.
Harvest the information gathered in the previous patch
inside of iris.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
Mathias Fröhlich [Tue, 11 Dec 2018 17:45:43 +0000 (18:45 +0100)]
i965: Move down genX_upload_sbe in profiles.
Avoid looping over all VARYING_SLOT_MAX urb_setup array
entries from genX_upload_sbe. Prepare an array indirection
to the active entries of urb_setup already in the compile
step. On upload only walk the active arrays.
v2: Use uint8_t to store the attribute numbers.
v3: Change loop to build up the array indirection.
v4: Rebase.
v5: Style fix.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/308>
Boris Brezillon [Fri, 6 Mar 2020 10:46:39 +0000 (11:46 +0100)]
panfrost: Get rid of ctx->payloads[]
Now that vertex/tiler payloads are re-initialized at draw/launch_grid
time we can get of of the ctx->payloads[] field and allocate those
payload templates on the stack.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Fri, 6 Mar 2020 10:43:38 +0000 (11:43 +0100)]
panfrost: Use ctx->active_prim in panfrost_writes_point_size()
Check ctx->active_prim instead of prefix.draw_mode so we can eventually
get rid of ctx->payloads.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Fri, 6 Mar 2020 10:31:06 +0000 (11:31 +0100)]
panfrost: Re-init the VT payloads at draw/launch_grid() time
Doing that should help us avoiding state leaks between draw/launch_grid
calls.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Fri, 6 Mar 2020 08:45:31 +0000 (09:45 +0100)]
panfrost: Move panfrost_emit_varying_descriptor() to pan_cmdstream.c
Move panfrost_emit_varying_descriptor() to pan_cmdstream.c where other
emit functions live and adjust the prototype to be consistent with other
emit helpers.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Fri, 6 Mar 2020 08:09:03 +0000 (09:09 +0100)]
panfrost: Move panfrost_emit_vertex_data() to pan_cmdstream.c
Move panfrost_emit_vertex_data() to pan_cmdstream.c where other emit
functions live, and adjust the prototype for consistency.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Fri, 6 Mar 2020 07:57:31 +0000 (08:57 +0100)]
panfrost: Inline panfrost_queue_draw() and panfrost_emit_for_draw()
Now that panfrost_queue_draw() and panfrost_emit_for_draw() are
small enough, we can move the code to panfrost_draw_vbo() and have all
vt and emit calls grouped in one place.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Fri, 6 Mar 2020 07:02:14 +0000 (08:02 +0100)]
panfrost: Move vertex/tiler payload initialization out of panfrost_draw_vbo()
Add a panfrost_vt_set_draw_info() function taking care of the draw
related initialization of the vertex and tiler payloads.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 20:55:01 +0000 (21:55 +0100)]
panfrost: Move streamout offset update out of panfrost_draw_vbo()
That's part of out attempt to shrink panfrost_draw_vbo().
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 20:48:09 +0000 (21:48 +0100)]
panfrost: Rename panfrost_stage_attributes()
panfrost_stage_attributes() is emitting mali_attr_meta descriptors, so
let's rename it accordingly and move it to pan_cmdstream.c.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 18:32:01 +0000 (19:32 +0100)]
panfrost: Move the mali_attr.src_offset adjustment to a sub-function
Create a panfrost_vertex_state_upd_attr_offs() helper to adjust
the attr_meta src_offsets.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 18:48:30 +0000 (19:48 +0100)]
panfrost: Emit attribute descriptors after patching the templates
Patching attribute desc when they are in cacheable memory should be
more efficient.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 18:40:15 +0000 (19:40 +0100)]
panfrost: Prepare attribute for builtins at state creation time
The attribute meta slots reserved for gl_VertexID and gl_InstanceID
can be pre-filled at state creation time. Only the index needs to be
adjusted when attributes are generated.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 18:26:03 +0000 (19:26 +0100)]
panfrost: Ignore BO start addr when adjusting src_offset
BOs are guaranteed to be aligned on 4K which inherently guarantees the
64 byte alignment.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 18:22:26 +0000 (19:22 +0100)]
panfrost: Drop initial mali_attr_meta.src_offset assignment
The mali_attr_meta.src_offset is initialized to
pipe_vertex_element.src_offset at vertex element creation time, but
this field is then adjusted when the descrptors are emitted. Let's
use the pipe_vertex_element data we saved earlier and drop this initial
assignment.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 17:53:08 +0000 (18:53 +0100)]
panfrost: Add an helper to emit a pair of vertex/tiler jobs
Add the panfrost_emit_vertex_tiler_jobs() helper and use it in
panfrost_queue_draw().
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 17:43:13 +0000 (18:43 +0100)]
panfrost: Move sampler/tex descs emission helpers to pan_cmdstream.c
Move panfrost_upload_texture_descriptors() and
panfrost_upload_sampler_descriptors() to pan_cmdstream.c where other
cmdstream related helpers live. While at it, change their prototype
and name to make it consistent with the other helpers and prepare
things for ctx->payloads[] removal.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 15:26:56 +0000 (16:26 +0100)]
panfrost: Add a panfrost_sampler_desc_init() helper
It just makes sense to group all HW descriptor initilization logic in
pan_cmdstream.c, so let's move this code out of
panfrost_create_sampler_state().
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 15:20:18 +0000 (16:20 +0100)]
panfrost: Prepare shader_meta descriptors at emission time
This way we avoid potential state leaks and keep the shader_meta
initialization in once place. The time spent preparing the shader
descriptors should be negligible compared to the time spent pushing
those descriptors to the transient buffer (remember we are writing to
non-cacheable memory here).
Note that we might get back to some sort of shader_meta descriptor
caching at some point if that proves necessary, but now we have those
panfrost_frag_meta_xxx_update() helpers now where xxx maps directly to
a CSO bind, which should ease desc template updates.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 14:17:31 +0000 (15:17 +0100)]
panfrost: Prepare things to get rid of panfrost_shader_state.tripipe
panfrost_shader_state.tripipe is used as a template for shader_meta
desc emission, but shader_meta desc preparation time should be negligible
compared to desc emission time (remember we are writing to non-cacheable
memory here). Let's prepare for generating the the shader_meta desc
entirely at draw time by adding the necessary fields to
panfrost_shader_state.
Note that we might brink back some sort of shader_meta desc caching at
some point, but let's simplify things a bit for now.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 10:52:12 +0000 (11:52 +0100)]
panfrost: Add an helper to update the rasterizer part of a tiler job desc
That's part of our attempt to make panfrost_emit_for_draw() a bit more
dry and eventually get rid of it by inlining the code in
panfrost_draw_vbo(). This is just one step in this direction.
Note that we get rid of the panfrost_rasterizer.tiler_gl_enables field
along the way, as setting/clearing those bits at draw time instead of
doing when the state is created should make a huge difference. We might
get back to pre-computed VT descs at some point, but let's keep things
simple for now.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 10:47:04 +0000 (11:47 +0100)]
panfrost: Add an helper to update the occclusion query part of a tiler job desc
That's part of our attempt to make panfrost_emit_for_draw() a bit more
dry and eventually get rid of it by inlining the code in
panfrost_draw_vbo(). This is just one step in this direction.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 10:23:41 +0000 (11:23 +0100)]
panfrost: Simplify panfrost_emit_for_draw() and make it private
Now that panfrost_launch_grid() no longer calls panfrost_emit_for_draw(),
we can keep it private to pan_context.c and drop all compute-related
stuff.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 10:18:37 +0000 (11:18 +0100)]
panfrost: Stop using panfrost_emit_for_draw() for compute jobs
We actually need a small subset of what's done in
panfrost_emit_for_draw() when emitting compute jobs, so let's copy
what we need directly in panfrost_launch_grid() instead of re-using
this function whose initial purpose was to generate vertex/tiler jobs
for draw operations.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Fri, 6 Mar 2020 08:59:56 +0000 (09:59 +0100)]
panfrost: Move panfrost_attach_vt_framebuffer() to pan_cmdstream.c
Move panfrost_attach_vt_framebuffer() to pan_cmdstream.c and change its
name to panfrost_vt_attach_framebuffer() so we can use a consistent
prefix (panfrost_vt_) for all helpers initializing/updating
midgard_payload_vertex_tiler fields.
Note that the function only initializes one VT object now.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 10:02:56 +0000 (11:02 +0100)]
panfrost: Dissociate shader meta patching from the desc emission
Right now we emit two shader descriptors for the fragment shader, one
when panfrost_patch_shader_state() is called, and the final one
including both the shader_meta and the blend RT descriptors.
The first generated fragment shader descriptor is never used, since the
second one overrides the postfix.shader pointer.
Let's dissociate the state patching logic from the descriptor emission
so we don't upload descriptors that are never used.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 08:57:44 +0000 (09:57 +0100)]
panfrost: Move shared mem desc emission out of panfrost_launch_grid()
Let's move the shared memory descriptor emission to a dedicated function
living with its pairs in pan_cmdstream.c.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 08:46:42 +0000 (09:46 +0100)]
panfrost: Move the const buf emission logic out of panfrost_emit_for_draw()
Let's move the constant buffer emission logic in a dedicated helper
to make panfrost_emit_for_draw() a bit more dry.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 08:30:58 +0000 (09:30 +0100)]
panfrost: Move viewport desc emission out of panfrost_emit_for_draw()
Let's move the viewport descriptor emission logic to a dedicated helper
in order to shrink a bit the panfrost_emit_for_draw().
Note that this helper is placed in a new pan_cmdstream.c file where we
will group all cmdstream related helpers (everything that's related to
HW descriptor initialization emission).
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 07:58:10 +0000 (08:58 +0100)]
panfrost: Move the batch stack size adjustment out of panfrost_queue_draw()
That's part of our attempt to sanitize panfrost_queue_draw(),
panfrost_draw_vbo() and panfrost_emit_for_draw(). The new
panfrost_batch_adjust_stack_size() helper is placed in pan_job.c, where
all batch related functions live.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 09:46:39 +0000 (10:46 +0100)]
panfrost: Add an helper to retrieve the currently active shader state
Doing that improves readability and helps avoiding code duplication.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Boris Brezillon [Thu, 5 Mar 2020 16:24:39 +0000 (17:24 +0100)]
panfrost: Assign primitive_size.pointer only if writes_point_size() returns true
Checking vs->writes_point_size is not enough, as we might have a vertex
shader writing point size, but a primitive that's not MALI_POINT. That
currently works because emit_varying_descriptor() is called before the
primitive_size.constant field is update, but let's make the logic more
robust, just in case things are re-ordered at some point.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4083>
Samuel Pitoiset [Tue, 3 Mar 2020 13:24:55 +0000 (14:24 +0100)]
radv/sqtt: describe pipeline and wait events barriers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
Samuel Pitoiset [Tue, 3 Mar 2020 13:08:30 +0000 (14:08 +0100)]
radv/rgp: bump the instrumentation spec version to 1
RGP expects the version to be 1, otherwise it doesn't display the
barriers (including layout transitions) correctly.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
Samuel Pitoiset [Tue, 3 Mar 2020 13:06:26 +0000 (14:06 +0100)]
radv/sqtt: describe render pass color/depthstencil clears
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
Samuel Pitoiset [Tue, 3 Mar 2020 13:03:25 +0000 (14:03 +0100)]
radv/sqtt: describe draw/dispatch and emit event markers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
Samuel Pitoiset [Tue, 3 Mar 2020 12:42:41 +0000 (13:42 +0100)]
radv/sqtt: describe begin/end command buffers with user markers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
Samuel Pitoiset [Wed, 26 Feb 2020 13:05:15 +0000 (14:05 +0100)]
radv: initial implementation of the driver internal layer SQTT
This layer is used to emit SQTT user markers to command buffers. It
currently only emits API markers but it will consolidated soon with
barrier markers and more.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
Samuel Pitoiset [Wed, 26 Feb 2020 13:12:01 +0000 (14:12 +0100)]
radv/sqtt: add a helper that emits thread trace userdata markers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
Samuel Pitoiset [Wed, 26 Feb 2020 12:59:05 +0000 (13:59 +0100)]
radv: use device entrypoints from the SQTT layer if enabled
This allows to override RADV device entrypoints if the prefix
is 'sqtt' instead of 'radv'.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
Samuel Pitoiset [Wed, 26 Feb 2020 12:52:11 +0000 (13:52 +0100)]
radv/entrypoints: declare a driver internal layer for SQTT
Some Vulkan commands will be overriden to emit user SQTT markers.
These markers are then used by the Radeon GPU Profiler to display
timings, barrier operations (cache flushes, pipeline stalls, layout
transitions) and more.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
Boris Brezillon [Sat, 7 Mar 2020 15:24:03 +0000 (16:24 +0100)]
panfrost: Pass the sampler view format when creating a tex descriptor
A sampler can use a different format than the native texture format.
Let's pass the sampler format instead of the native texture format when
creating a texture descriptor.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4101>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4101>
Boris Brezillon [Sat, 7 Mar 2020 14:40:15 +0000 (15:40 +0100)]
Revert "panfrost: Z24 variants should be sampled as R32UI"
Commit
0406ea485649 ("panfrost: Z24 variants should be sampled as
R32UI") causes a regression when depth textures are sampled.
It's still not clear how MALI_Z32 can work for for Z32 and Z24{S,X}8,
but let's leave that question for later.
Reported-by: Icecream95 <ixn@keemail.me>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4101>
Tomeu Vizoso [Mon, 9 Mar 2020 13:17:41 +0000 (14:17 +0100)]
gallium: Add forgotten docs for new CAPs related to transform feedback
These three caps were missing docs.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4115>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4115>
Vasily Khoruzhick [Wed, 4 Mar 2020 06:05:52 +0000 (22:05 -0800)]
lima: enable minmax cache for index buffers
Re-use minmax cache for index buffers from panfrost.
Reviewed-by: Andreas Baierl <ichgeh@imkreisrum.de>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4051>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4051>
Vasily Khoruzhick [Wed, 4 Mar 2020 05:31:51 +0000 (21:31 -0800)]
panfrost: split index cache into shared part
Split it into shared part since we're going to re-use it in lima.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4051>
Marek Olšák [Wed, 12 Feb 2020 22:15:34 +0000 (17:15 -0500)]
st/mesa: fix a possible crash with selection and feedback modes
The index bounds are always valid without an index buffer, but they won't be.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3986>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3986>
Marek Olšák [Thu, 23 Jan 2020 00:14:50 +0000 (19:14 -0500)]
st/mesa: flush the bitmap cache before st/dri and vbo flushes
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3986>
Francisco Jerez [Tue, 3 Mar 2020 21:20:47 +0000 (13:20 -0800)]
intel/fs: Fix workaround for VxH indirect addressing bug under control flow.
The current workaround for this hardware bug involved marking the ADD
instruction used to initialize the address register as NoMask on
Gen12, which was based on the assumption that the problem was caused
by a hardware bug affecting the application of the execution mask to
the address register write.
However that doesn't seem to be the case: The address register write
was working correctly, the real problem leading to hangs on TGL is
that the indirect addressing logic is unable to deal with garbage
values in the address register (e.g. misaligned offsets), even for
channels which are currently inactive due to non-uniform control flow.
The current workaround isn't able to avoid that situation in general,
since the result of the NoMask ADD instruction for a dead channel is
calculated based on the corresponding (dead) component of the
indirect_byte_offset source, which would still be undefined in the
likely case that the source was initialized under control flow itself.
This would lead to hangs whenever MOV_INDIRECT was used under
non-uniform control flow in some scenarios like a tessellation shader
from GFXBench5/gl_4 (AKA Car Chase) on TGL. In addition I've managed
to reproduce the same issue on earlier platforms by initializing the
whole address register with garbage before the ADD instruction, so
this seems to be a long-standing issue we have avoided mostly by luck.
This patch fixes the problem and applies the workaround to all
platforms, since even when the hardware is able to deal with garbage
address values without hanging there might be a significant
performance cost from reading random GRF registers due to the useless
extra EU cycles spent fetching registers for dead channels and due to
the potential for unintended serialization with respect to other
random instructions that could be executed in parallel, which may have
had a cost of the order of hundreds of cycles in the worst case
scenario.
Fixes: f93dfb509c "intel/fs: Write the address register with NoMask for MOV_INDIRECT"
Tested-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Ian Romanick [Tue, 25 Feb 2020 19:37:01 +0000 (11:37 -0800)]
intel/fs: Allow NOT instructions in conditional discard optimization
I don't know why I explicitly disallowed NOT in the first place. :(
All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs:
14549846 ->
14549770 (<.01%)
instructions in affected programs: 12934 -> 12858 (-0.59%)
helped: 76
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.13% max: 5.56% x̄: 1.04% x̃: 0.90%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -1.25% -0.84%
Instructions are helped.
total cycles in shared programs:
203793967 ->
203792696 (<.01%)
cycles in affected programs: 77920 -> 76649 (-1.63%)
helped: 67
HURT: 1
helped stats (abs) min: 2 max: 36 x̄: 19.00 x̃: 16
helped stats (rel) min: 0.04% max: 4.68% x̄: 2.35% x̃: 2.28%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03%
95% mean confidence interval for cycles value: -20.75 -16.63
95% mean confidence interval for cycles %-change: -2.57% -2.05%
Cycles are helped.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3965>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3965>
Ian Romanick [Mon, 24 Feb 2020 19:22:02 +0000 (11:22 -0800)]
intel/fs: Do cmod prop again after scheduling
Pre-RA scheduling can create more opportunities for CMOD propagation.
This takes advantage of that.
It may be worth doing this again in post-RA scheduling, but there are
additional problems there.
I'm a little torn about the use of the OPT() macro. On the one hand, it
would be confusing to see dumps from INTEL_DEBUG=optimizer that don't
match the final output. On the other hand, since register allocation
can fail, the same pass can be run multiple times. Each time one or
both passes might or might not make progress. This would also lead to
incongruous, confusing output.
Ice Lake
total instructions in shared programs:
14549808 ->
14548529 (<.01%)
instructions in affected programs: 231985 -> 230706 (-0.55%)
helped: 632
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 2.02 x̃: 1
helped stats (rel) min: 0.05% max: 2.56% x̄: 0.57% x̃: 0.41%
95% mean confidence interval for instructions value: -2.25 -1.79
95% mean confidence interval for instructions %-change: -0.61% -0.54%
Instructions are helped.
total cycles in shared programs:
203770850 ->
203776599 (<.01%)
cycles in affected programs:
2495653 ->
2501402 (0.23%)
helped: 282
HURT: 197
helped stats (abs) min: 1 max: 242 x̄: 20.37 x̃: 16
helped stats (rel) min: <.01% max: 11.65% x̄: 0.91% x̃: 0.64%
HURT stats (abs) min: 2 max: 609 x̄: 58.35 x̃: 20
HURT stats (rel) min: <.01% max: 10.97% x̄: 1.35% x̃: 0.66%
95% mean confidence interval for cycles value: 5.27 18.73
95% mean confidence interval for cycles %-change: -0.16% 0.21%
Inconclusive result (%-change mean confidence interval includes 0).
LOST: 0
GAINED: 2
Skylake
total instructions in shared programs:
13447708 ->
13446594 (<.01%)
instructions in affected programs: 216813 -> 215699 (-0.51%)
helped: 623
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 1.79 x̃: 1
helped stats (rel) min: 0.06% max: 2.86% x̄: 0.59% x̃: 0.42%
95% mean confidence interval for instructions value: -1.99 -1.59
95% mean confidence interval for instructions %-change: -0.63% -0.55%
Instructions are helped.
total cycles in shared programs:
193759224 ->
193762726 (<.01%)
cycles in affected programs:
2540035 ->
2543537 (0.14%)
helped: 249
HURT: 190
helped stats (abs) min: 2 max: 196 x̄: 16.67 x̃: 14
helped stats (rel) min: <.01% max: 4.71% x̄: 0.66% x̃: 0.62%
HURT stats (abs) min: 2 max: 614 x̄: 40.27 x̃: 14
HURT stats (rel) min: 0.02% max: 5.78% x̄: 0.86% x̃: 0.37%
95% mean confidence interval for cycles value: 2.57 13.39
95% mean confidence interval for cycles %-change: -0.11% 0.11%
Inconclusive result (%-change mean confidence interval includes 0).
LOST: 0
GAINED: 1
Broadwell
total instructions in shared programs:
13418631 ->
13417393 (<.01%)
instructions in affected programs: 243192 -> 241954 (-0.51%)
helped: 694
HURT: 0
helped stats (abs) min: 1 max: 31 x̄: 1.78 x̃: 1
helped stats (rel) min: 0.06% max: 2.86% x̄: 0.59% x̃: 0.44%
95% mean confidence interval for instructions value: -1.95 -1.62
95% mean confidence interval for instructions %-change: -0.62% -0.55%
Instructions are helped.
total cycles in shared programs:
200822940 ->
200829128 (<.01%)
cycles in affected programs:
2128651 ->
2134839 (0.29%)
helped: 251
HURT: 226
helped stats (abs) min: 1 max: 200 x̄: 14.32 x̃: 12
helped stats (rel) min: <.01% max: 3.56% x̄: 0.60% x̃: 0.50%
HURT stats (abs) min: 2 max: 611 x̄: 43.28 x̃: 18
HURT stats (rel) min: 0.02% max: 7.03% x̄: 0.93% x̃: 0.54%
95% mean confidence interval for cycles value: 7.44 18.50
95% mean confidence interval for cycles %-change: 0.02% 0.23%
Cycles are HURT.
Haswell and Ivy Bridge had similar results. (Haswell shown)
total instructions in shared programs:
11569710 ->
11568829 (<.01%)
instructions in affected programs: 147862 -> 146981 (-0.60%)
helped: 487
HURT: 0
helped stats (abs) min: 1 max: 34 x̄: 1.81 x̃: 1
helped stats (rel) min: 0.12% max: 4.75% x̄: 0.57% x̃: 0.45%
95% mean confidence interval for instructions value: -2.03 -1.59
95% mean confidence interval for instructions %-change: -0.61% -0.54%
Instructions are helped.
total cycles in shared programs:
187079425 ->
187079437 (<.01%)
cycles in affected programs:
1088494 ->
1088506 (<.01%)
helped: 234
HURT: 124
helped stats (abs) min: 2 max: 282 x̄: 22.66 x̃: 16
helped stats (rel) min: 0.03% max: 7.88% x̄: 0.93% x̃: 0.75%
HURT stats (abs) min: 1 max: 276 x̄: 42.86 x̃: 20
HURT stats (rel) min: 0.03% max: 6.70% x̄: 0.99% x̃: 0.53%
95% mean confidence interval for cycles value: -5.54 5.61
95% mean confidence interval for cycles %-change: -0.41% -0.11%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 7746 -> 7740 (-0.08%)
spills in affected programs: 6 -> 0
helped: 1
HURT: 0
total fills in shared programs: 6264 -> 6258 (-0.10%)
fills in affected programs: 6 -> 0
helped: 1
HURT: 0
Sandy Bridge
total instructions in shared programs:
10688576 ->
10688177 (<.01%)
instructions in affected programs: 137875 -> 137476 (-0.29%)
helped: 358
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 1.11 x̃: 1
helped stats (rel) min: 0.15% max: 1.43% x̄: 0.35% x̃: 0.28%
95% mean confidence interval for instructions value: -1.18 -1.05
95% mean confidence interval for instructions %-change: -0.37% -0.32%
Instructions are helped.
total cycles in shared programs:
153397144 ->
153393046 (<.01%)
cycles in affected programs:
1220713 ->
1216615 (-0.34%)
helped: 255
HURT: 31
helped stats (abs) min: 1 max: 304 x̄: 16.71 x̃: 16
helped stats (rel) min: <.01% max: 6.70% x̄: 0.41% x̃: 0.31%
HURT stats (abs) min: 1 max: 41 x̄: 5.29 x̃: 3
HURT stats (rel) min: 0.02% max: 0.65% x̄: 0.16% x̃: 0.11%
95% mean confidence interval for cycles value: -17.44 -11.22
95% mean confidence interval for cycles %-change: -0.40% -0.29%
Cycles are helped.
Iron Lake
total instructions in shared programs:
8106894 ->
8105529 (-0.02%)
instructions in affected programs: 287197 -> 285832 (-0.48%)
helped: 1099
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 1.24 x̃: 1
helped stats (rel) min: 0.16% max: 4.55% x̄: 0.67% x̃: 0.61%
95% mean confidence interval for instructions value: -1.29 -1.19
95% mean confidence interval for instructions %-change: -0.70% -0.64%
Instructions are helped.
total cycles in shared programs:
188347022 ->
188344266 (<.01%)
cycles in affected programs:
3740632 ->
3737876 (-0.07%)
helped: 758
HURT: 10
helped stats (abs) min: 2 max: 38 x̄: 3.68 x̃: 2
helped stats (rel) min: <.01% max: 1.00% x̄: 0.12% x̃: 0.08%
HURT stats (abs) min: 2 max: 4 x̄: 3.20 x̃: 4
HURT stats (rel) min: 0.03% max: 0.07% x̄: 0.06% x̃: 0.07%
95% mean confidence interval for cycles value: -3.82 -3.35
95% mean confidence interval for cycles %-change: -0.13% -0.11%
Cycles are helped.
GM45
total instructions in shared programs:
4985449 ->
4984768 (-0.01%)
instructions in affected programs: 145154 -> 144473 (-0.47%)
helped: 547
HURT: 0
helped stats (abs) min: 1 max: 10 x̄: 1.24 x̃: 1
helped stats (rel) min: 0.16% max: 2.86% x̄: 0.66% x̃: 0.61%
95% mean confidence interval for instructions value: -1.31 -1.18
95% mean confidence interval for instructions %-change: -0.69% -0.62%
Instructions are helped.
total cycles in shared programs:
128835062 ->
128833144 (<.01%)
cycles in affected programs:
2720650 ->
2718732 (-0.07%)
helped: 517
HURT: 1
helped stats (abs) min: 2 max: 38 x̄: 3.71 x̃: 2
helped stats (rel) min: <.01% max: 0.89% x̄: 0.11% x̃: 0.07%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.04% max: 0.04% x̄: 0.04% x̃: 0.04%
95% mean confidence interval for cycles value: -4.02 -3.39
95% mean confidence interval for cycles %-change: -0.12% -0.10%
Cycles are helped.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3965>
Eric Engestrom [Mon, 9 Mar 2020 22:40:40 +0000 (23:40 +0100)]
docs: update calendar, add news item, and link releases notes for 19.3.5
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4121>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4121>
Eric Engestrom [Mon, 9 Mar 2020 18:49:32 +0000 (19:49 +0100)]
docs: add release notes for 19.3.5
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4121>
Vinson Lee [Mon, 2 Mar 2020 05:38:18 +0000 (21:38 -0800)]
st/nine: Fix incompatible-pointer-types-discards-qualifiers errors.
../src/gallium/state_trackers/nine/nine_ff.c:129:28: error: initializing 'struct nine_ff_vs_key *' with an expression of type 'const void *' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers]
struct nine_ff_vs_key *vs = key;
^ ~~~
../src/gallium/state_trackers/nine/nine_ff.c:145:28: error: initializing 'struct nine_ff_ps_key *' with an expression of type 'const void *' discards qualifiers [-Werror,-Wincompatible-pointer-types-discards-qualifiers]
struct nine_ff_ps_key *ps = key;
^ ~~~
Fixes: fdd96578ef2d ("nine: Add state tracker nine for Direct3D9 (v3)")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Andre Heider <a.heider@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4015>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4015>
Marek Olšák [Thu, 27 Feb 2020 03:48:41 +0000 (22:48 -0500)]
radeonsi: determine uses_bindless_samplers correctly
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4079>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4079>
Marek Olšák [Wed, 4 Mar 2020 00:01:17 +0000 (19:01 -0500)]
ac: add a bug workaround for the 100% NGG culling case
Fixes: 8db00a51f85 - radeonsi/gfx10: implement NGG culling for 4x wave32 subgroups
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4079>
Marek Olšák [Tue, 3 Mar 2020 23:40:50 +0000 (18:40 -0500)]
radeonsi: add a bug workaround for NGG - LATE_ALLOC_GS
Cc: 19.3 20.0 <mesa-stable@lists.freedesktop.org>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4079>
Sonny Jiang [Wed, 8 Jan 2020 22:18:40 +0000 (17:18 -0500)]
radeonsi: enable EXT_texture_shadow_lod
Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4079>
Chia-I Wu [Thu, 7 Feb 2019 23:14:19 +0000 (15:14 -0800)]
egl/android: require ANDROID_native_fence_sync for buffer age
Querying buffer age requires a buffer to be dequeued. But dequeuing
without ANDROID_native_fence_sync might imply eglClientWaitSync,
which results in a deadlock as the display lock is already held by
eglQuerySurface.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/221>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/221>
Edmondo Tommasina [Mon, 9 Mar 2020 15:25:03 +0000 (16:25 +0100)]
radv/sqtt: fix RADV_THREAD_TRACE_BUFFER_SIZE spelling
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4116>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4116>
Eric Engestrom [Mon, 9 Mar 2020 11:23:00 +0000 (12:23 +0100)]
docs/releasing: add missing </li> tags
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4094>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4094>
Eric Engestrom [Fri, 6 Mar 2020 22:07:16 +0000 (23:07 +0100)]
docs: trivial fix for html structure
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4094>
Neil Roberts [Fri, 7 Jun 2019 06:52:14 +0000 (08:52 +0200)]
glsl/opt_minmax: Add support for float16
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
Kristian H. Kristensen [Wed, 4 Mar 2020 19:58:58 +0000 (11:58 -0800)]
glsl/lower_instructions: Handle fp16 for FDIV_TO_MUL_RCP
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
Hyunjun Ko [Mon, 3 Jun 2019 08:44:14 +0000 (08:44 +0000)]
glsl/lower_instructions: Handle fp16 for MOD_TO_FLOOR
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
Neil Roberts [Sun, 26 May 2019 13:32:49 +0000 (15:32 +0200)]
glsl/lower_instructions: Use float16 constants when appropriate
When lowering instructions that involve floating-point constants, pick
the appropriate type for the constant so that it will also work with
float16 parameters.
v2: Use float16_t constructor instead of helper function.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>
Neil Roberts [Thu, 16 May 2019 15:01:06 +0000 (17:01 +0200)]
glsl/validate: Allow float16 in the expression tree
v2. [Hyunjun Ko (zzoon@igalia.com)] squashed 3 commits
into one commit.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3929>