Jason Ekstrand [Thu, 17 May 2018 15:46:03 +0000 (08:46 -0700)]
intel/fs: Set up FB write message headers in the visitor
Doing instruction header setup in the generator is awful for a number
of reasons. For one, we can't schedule the header setup at all. For
another, it means lots of implied writes which the instruction scheduler
and other passes can't properly reason about. The second isn't a huge
problem for FB writes since they always happen at the end. We made a
similar change to sampler handling in ff4726077d86.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 22:18:22 +0000 (14:18 -0800)]
intel/fs: Fix implied_mrf_writes() for headerless FB writes.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 22:17:20 +0000 (14:17 -0800)]
intel/fs: Fix fs_inst::flags_written() for Gen4-5 FB writes.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Francisco Jerez [Fri, 13 Jan 2017 22:16:12 +0000 (14:16 -0800)]
intel/eu: Return new instruction to caller from brw_fb_WRITE().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Fri, 18 May 2018 01:47:19 +0000 (18:47 -0700)]
intel/fs: Pull FB write implied headers from src[0]
Now that we have the implied header in src[0] for tracking purposes, we
may as well use it in the generator. This makes things a tiny bit more
general.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Thu, 17 May 2018 22:40:48 +0000 (15:40 -0700)]
intel/fs: Properly track implied header regs read by FB writes
The FB write opcode on gen4-5 does implied copies from g0 and g1 to the
message payload. With this commit, we start tracking that as part of
the IR by having the FB write read from g0-1.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Thu, 17 May 2018 00:51:10 +0000 (17:51 -0700)]
intel/fs: FS_OPCODE_REP_FB_WRITE has side effects
It doesn't matter since we don't ever run replicated write shaders
through the optimizer but it's good to be complete.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Dylan Baker [Thu, 28 Jun 2018 17:06:44 +0000 (10:06 -0700)]
docs: Add news item for mesa 18.1.2
Which I forgot to do when 18.1.2 came out.
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Rhys Perry [Fri, 22 Jun 2018 20:47:43 +0000 (21:47 +0100)]
nvc0: remove magic values in nve4_set_tex_handles()
With this commit, things no longer break if NVC0_CB_AUX_TEX_INFO is
changed to anything other than 0x20.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Rhys Perry [Thu, 28 Jun 2018 13:26:33 +0000 (14:26 +0100)]
nvc0/ir: fix TargetNVC0::insnCanLoadOffset()
Previously, TargetNVC0::insnCanLoadOffset() returned whether the offset
could be set to a specific value. The IndirectPropagation pass expected
it to return whether the offset could be increased by a specific value,
which is what TargetNV50::insnCanLoadOffset() does.
Fixes: 37b67db6ae34fb6586d640a7a1b6232f091dd812
("nvc0/ir: be careful about propagating very large offsets into const load")
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
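To illustrate the insnCanLoadOffset() fix above, here is a minimal C sketch of the two semantics the commit message contrasts. The function names and the MAX_OFFSET limit are hypothetical and are not taken from the nvc0 code; only the "set to a value" versus "increase by a value" distinction comes from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encodable range for an immediate offset. The real limit
 * depends on the instruction and is not stated in the commit. */
#define MAX_OFFSET 0xffff

/* "Set" semantics: can the offset field hold exactly `value`? */
static bool can_set_offset(int32_t value)
{
   return value >= 0 && value <= MAX_OFFSET;
}

/* "Increase" semantics: can `delta` be added on top of the offset the
 * instruction already carries? This is the question a propagation pass
 * asks before folding an extra constant into a load. */
static bool can_add_offset(int32_t current, int32_t delta)
{
   assert(current >= 0 && current <= MAX_OFFSET);
   int64_t sum = (int64_t)current + delta;
   return sum >= 0 && sum <= MAX_OFFSET;
}
```

With these assumed limits, a delta of 0x10 is acceptable as a value on its own but not when the instruction already has an offset of 0xfff8, which is the kind of case that differs between the two interpretations.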
Alok Hota [Fri, 22 Jun 2018 14:11:26 +0000 (09:11 -0500)]
swr/rast: Updating code style based on current clang-format rules
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Vinson Lee [Mon, 25 Jun 2018 14:52:19 +0000 (09:52 -0500)]
swr/rast: Fix addPassesToEmitFile usage with llvm-7.0.
Fix build error after llvm-7.0svn r332881 ("CodeGen: Add a dwo output
file argument to addPassesToEmitFile and hook it up to dwo output.").
CXX rasterizer/jitter/libmesaswr_la-JitManager.lo
rasterizer/jitter/JitManager.cpp:368:93: error: too few arguments to function call, expected at least 4, have 3
pTarget->addPassesToEmitFile(*pMPasses, filestream, TargetMachine::CGFT_AssemblyFile);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Alok Hota [Mon, 25 Jun 2018 14:52:18 +0000 (09:52 -0500)]
swr/rast: Handling removed LLVM intrinsics in trunk
- Functionality replaced with emulated intrinsics
- Fixes Bug 106558
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Alok Hota [Mon, 25 Jun 2018 14:52:17 +0000 (09:52 -0500)]
swr/rast: Adding SCATTERPS functionality to BuilderGfxMem
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Alok Hota [Mon, 25 Jun 2018 14:52:16 +0000 (09:52 -0500)]
swr/rast: Adding Read/Write specifier to TranslateGfxAddress stack
- Removing unused generic translate function
- Requiring read/write specifier in builder_gfx_mem
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Chad Versace [Fri, 1 Jun 2018 02:57:55 +0000 (19:57 -0700)]
gallium: Fix automake for Android (v2)
Chromium OS uses Autotools and pkg-config when building Mesa for
Android. The gallium drivers were failing to find the headers and
libraries for zlib and Android's libbacktrace.
v2:
- Don't add a check for zlib.pc. configure.ac already checks for
zlib.pc elsewhere. [for tfiga]
- Check for backtrace.pc separately from the other Android libs.
[for tfiga]
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Wed, 27 Jun 2018 23:23:20 +0000 (09:23 +1000)]
glsl: skip comparison opt when adding vars of different size
The spec allows adding a scalar to a vector or matrix. In this case
the opt was losing swizzle and size information.
This fixes a bug with Doom (2016) shaders.
Fixes: 34ec1a24d61f ("glsl: Optimize (x + y cmp 0) into (x cmp -y).")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Jason Ekstrand [Wed, 27 Jun 2018 21:09:51 +0000 (14:09 -0700)]
Revert "anv: Print the actual enum for ignored structure types"
This reverts commit fda7014c35e5f5dfa26f078ad0512d13ead8b717. It was
hitting an unreachable when the sType was unknown.
Jason Ekstrand [Tue, 26 Jun 2018 20:33:29 +0000 (13:33 -0700)]
anv: Print the actual enum for ignored structure types
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Jason Ekstrand [Wed, 27 Jun 2018 01:32:38 +0000 (18:32 -0700)]
i965/bufmgr: Use the correct argument order for bo_alloc_internal
The memzone and flags parameters were accidentally flipped in the call
from brw_bo_alloc_tiled_2d.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Keith Packard [Tue, 26 Jun 2018 23:01:45 +0000 (16:01 -0700)]
vulkan/wsi_common_display: Return SURFACE_LOST for fatal DRM errors
Instead of encouraging the client to re-create the swapchain and keep
going with an OUT_OF_DATE error, tell the client that further use of
the current surface will not succeed as the associated kernel objects
are no longer valid.
In particular, when a DRM lease is revoked, then the client needs to
get another lease and create a new surface for that.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
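From the client's point of view, the difference between the two errors in the wsi_common_display commit above can be sketched as below. The enums are hypothetical stand-ins for VK_ERROR_OUT_OF_DATE_KHR and VK_ERROR_SURFACE_LOST_KHR so that the example compiles without the Vulkan headers; the recovery actions are a simplified reading of the commit message, not driver code.

```c
#include <assert.h>

/* Hypothetical stand-ins for the relevant VkResult values. */
enum result {
   RESULT_SUCCESS,
   RESULT_OUT_OF_DATE,   /* stands in for VK_ERROR_OUT_OF_DATE_KHR */
   RESULT_SURFACE_LOST   /* stands in for VK_ERROR_SURFACE_LOST_KHR */
};

enum action {
   ACTION_CONTINUE,
   ACTION_RECREATE_SWAPCHAIN,
   ACTION_RECREATE_SURFACE
};

/* Decide how a client recovers after a present or acquire call. */
static enum action next_action(enum result r)
{
   switch (r) {
   case RESULT_OUT_OF_DATE:
      /* The surface is still usable; a new swapchain is enough. */
      return ACTION_RECREATE_SWAPCHAIN;
   case RESULT_SURFACE_LOST:
      /* The kernel objects are gone (for example a revoked DRM lease),
       * so a new lease and a new surface are needed first. */
      return ACTION_RECREATE_SURFACE;
   default:
      return ACTION_CONTINUE;
   }
}
```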
Eric Anholt [Thu, 21 Jun 2018 23:39:15 +0000 (16:39 -0700)]
glsl: Make sure that packed varyings reflect always_active_io properly.
The always_active_io flag was only set according to the first variable
that got packed in, so NIR io compaction would end up compacting XFB
varyings that shouldn't move at that point.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
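The packed-varyings fix above amounts to combining the flag across every variable that lands in a packed slot instead of copying it from the first one. A minimal C sketch, with a hypothetical struct standing in for the real GLSL IR variable:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shader variable; only the flag matters. */
struct var {
   bool always_active_io;
};

/* A packed varying must be treated as always active if any variable
 * packed into it carries the flag, not only the first one. */
static bool packed_always_active(const struct var *vars, size_t count)
{
   assert(vars != NULL || count == 0);
   bool active = false;
   for (size_t i = 0; i < count; i++)
      active = active || vars[i].always_active_io;
   return active;
}
```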
Eric Anholt [Mon, 25 Jun 2018 20:29:42 +0000 (13:29 -0700)]
v3d: Fix Z clipping when viewport.scale[2] is negative.
Fixes:
dEQP-GLES3.functional.shaders.builtin_variable.depth_range_fragment
dEQP-GLES3.functional.shaders.builtin_variable.depth_range_vertex
Eric Anholt [Tue, 26 Jun 2018 22:58:21 +0000 (15:58 -0700)]
v3d: Convert a bunch of our "minus one" fields over to the new XML attr.
This fixes up their formatting for CLIF files and makes the code more
legible.
Eric Anholt [Tue, 26 Jun 2018 22:53:26 +0000 (15:53 -0700)]
v3d: Add pack/unpack/decode support for fields with a "- 1" modifier.
Right now, we name these fields as "field name minus one" so that your C
code obviously states what the value should be. However, it's easy enough
to handle at the codegen level with another little XML attribute, meaning
less C code and easier-to-read values in CLIF dumping and gdb as well.
(The actual CLIF format for simulator and FPGA replay takes in
pre-minus-one values, so we need it there too).
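The "- 1" modifier described above can be sketched in C as a pair of helpers. These are hypothetical functions, not the code generated from the v3d XML; they only show that the caller works with the natural value while the hardware field stores that value minus one.

```c
#include <assert.h>
#include <stdint.h>

/* Mask covering bits [start, end] shifted down to bit 0. */
static uint32_t field_mask(unsigned start, unsigned end)
{
   unsigned width = end - start + 1;
   return width == 32 ? UINT32_MAX : (1u << width) - 1;
}

/* Pack `value` into bits [start, end], storing value - 1 in the field.
 * Assumes value >= 1 and that value - 1 fits in the field. */
static uint32_t pack_minus_one(uint32_t value, unsigned start, unsigned end)
{
   uint32_t mask = field_mask(start, end);
   assert(value >= 1 && value - 1 <= mask);
   return ((value - 1) & mask) << start;
}

/* Unpack the field and add the one back, so callers, debuggers and
 * dumps see the natural value. */
static uint32_t unpack_plus_one(uint32_t word, unsigned start, unsigned end)
{
   return ((word >> start) & field_mask(start, end)) + 1;
}
```

A 4-bit field handled this way covers the values 1 through 16 rather than 0 through 15.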
Tapani Pälli [Thu, 14 Jun 2018 11:08:11 +0000 (14:08 +0300)]
i965: small cleanup in blorp debug printing output (trivial)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tapani Pälli [Thu, 14 Jun 2018 11:08:10 +0000 (14:08 +0300)]
mesa: add a space between headers and source (trivial)
There used to be one and it looks like it was removed by eb63640c1d.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tapani Pälli [Thu, 14 Jun 2018 11:08:09 +0000 (14:08 +0300)]
features.txt: mark some extensions as done
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Danylo Piliaiev [Thu, 21 Jun 2018 09:34:15 +0000 (12:34 +0300)]
mesa: Return number of result bits for GL_ANY_SAMPLES_PASSED_CONSERVATIVE
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106986
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Samuel Pitoiset [Tue, 26 Jun 2018 09:19:26 +0000 (11:19 +0200)]
radv: use separate bind points for the dynamic buffers
The Vulkan spec says:
"pipelineBindPoint is a VkPipelineBindPoint indicating whether
the descriptors will be used by graphics pipelines or compute
pipelines. There is a separate set of bind points for each of
graphics and compute, so binding one does not disturb the other."
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
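One way to picture the separate bind points from the radv commit above is a per-bind-point state array, as in this C sketch. All type and function names here are hypothetical and much simpler than the radv structures; the point is only that a compute bind leaves the graphics state untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the graphics/compute pipeline bind points. */
enum bind_point {
   BIND_POINT_GRAPHICS,
   BIND_POINT_COMPUTE,
   BIND_POINT_COUNT
};

#define MAX_DYNAMIC_BUFFERS 4   /* arbitrary size for the sketch */

struct descriptor_state {
   uint32_t dynamic_offsets[MAX_DYNAMIC_BUFFERS];
};

/* One descriptor state per bind point instead of a single shared one. */
struct cmd_buffer {
   struct descriptor_state state[BIND_POINT_COUNT];
};

static void bind_dynamic_offset(struct cmd_buffer *cmd, enum bind_point point,
                                unsigned index, uint32_t offset)
{
   assert(point < BIND_POINT_COUNT && index < MAX_DYNAMIC_BUFFERS);
   cmd->state[point].dynamic_offsets[index] = offset;
}
```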
Samuel Pitoiset [Tue, 26 Jun 2018 20:35:04 +0000 (22:35 +0200)]
radv: remove unused 'predicated' parameter from some functions
It's always false.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 8 Jun 2018 00:02:20 +0000 (10:02 +1000)]
virgl: add ARB_texture_view support
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Jason Ekstrand [Mon, 25 Jun 2018 23:18:19 +0000 (16:18 -0700)]
nir/opt_if: Remove unneeded phis if we make progress
Now that SSA values can be derefs and they have special rules, we have
to be a bit more careful about our LCSSA phis. In particular, we need
to clean up in case LCSSA ended up creating a phi node for a deref.
This fixes validation issues with some Vulkan CTS tests with the new
deref instructions.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Samuel Pitoiset [Fri, 22 Jun 2018 17:16:43 +0000 (19:16 +0200)]
radv: emit PIPELINESTAT_{START,STOP} events for pipeline stats queries
Ported from RadeonSI.
This appears to fix some random fails with:
dEQP-VK.query_pool.statistics_query.*
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tapani Pälli [Thu, 14 Jun 2018 08:10:20 +0000 (11:10 +0300)]
glsl: serialize data from glTransformFeedbackVaryings
While XFB has been enabled for cache, we did not serialize enough
data for the whole API to work (such as glGetProgramiv).
Fixes: 6d830940f7 "Allow shader cache usage with transform feedback"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106907
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Samuel Pitoiset [Mon, 25 Jun 2018 13:56:46 +0000 (15:56 +0200)]
radv: enable VK_EXT_shader_stencil_export
The driver already supports exporting the stencil value.
The following CTS test now passes:
dEQP-VK.pipeline.shader_stencil_export.op_replace
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 25 Jun 2018 14:22:43 +0000 (16:22 +0200)]
radv: ignore pInheritanceInfo for primary command buffers
From the Vulkan spec:
"If this is a primary command buffer, then this value is ignored."
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
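The rule quoted in the pInheritanceInfo commit above can be sketched in C as follows. The types are hypothetical stand-ins for the Vulkan structures so that the example compiles on its own; it is not the radv implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the command buffer level and begin info. */
enum level {
   LEVEL_PRIMARY,
   LEVEL_SECONDARY
};

struct inheritance_info {
   unsigned subpass;
};

struct begin_info {
   const struct inheritance_info *inheritance;
};

/* Returns the inheritance info to use, or NULL when it must be ignored.
 * For a primary command buffer the pointer is not even read, since an
 * ignored field may hold anything. */
static const struct inheritance_info *
effective_inheritance(enum level level, const struct begin_info *info)
{
   assert(info != NULL);
   if (level == LEVEL_PRIMARY)
      return NULL;
   return info->inheritance;
}
```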
Andrii Simiklit [Fri, 22 Jun 2018 07:59:57 +0000 (10:59 +0300)]
i965/gen6/gs: Handle case where a GS doesn't allocate VUE
We cannot use the VUE Dereference flags combination for the EOT
message on ILK and SNB because, unlike Pre-IL, the threads there are
not initialized with an initial VUE handle.
So to avoid GPU hangs on SNB and ILK we need
to avoid using the VUE Dereference flags combination.
(This was tested only on SNB, but according to the specification,
SNB Volume 2 Part 1: 1.6.5.3, 1.6.5.6,
ILK should behave in a similar way.)
v2: Approach to fix this issue was changed.
Instead of different EOT flags in the program end
we will create VUE every time even if GS produces no output.
v3: Clean up the patch.
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105399
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Dave Airlie [Tue, 26 Jun 2018 00:43:14 +0000 (10:43 +1000)]
radeon: duplicate cmask surface for now.
The radeon winsys isn't linked against the ac code. I have vague
memories of this causing some problems before, so for now fix the build
by just duplicating the code.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Fri, 22 Jun 2018 04:02:47 +0000 (00:02 -0400)]
radeonsi: rename r600_transfer -> si_transfer
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 22 Jun 2018 04:00:11 +0000 (00:00 -0400)]
radeonsi: properly set cmask_buffer in si_reallocate_texture_inplace
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 22 Jun 2018 03:54:20 +0000 (23:54 -0400)]
radeonsi: remove redundant si_texture::cmask_size
cmask_buffer and surface.cmask_size can replace its role.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 22 Jun 2018 03:00:56 +0000 (23:00 -0400)]
radeonsi: inline struct r600_cmask_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 22 Jun 2018 02:54:59 +0000 (22:54 -0400)]
radeonsi: move CMASK size computation into ac_surface
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 22 Jun 2018 02:50:51 +0000 (22:50 -0400)]
ac/surface: move cmask_size/alignment into radeon_surf
cmask_size is changed to uint32_t because it can't be greater than 4GB.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 22 Jun 2018 02:16:07 +0000 (22:16 -0400)]
radeonsi: rename r600_surface -> si_surface
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 22 Jun 2018 02:07:24 +0000 (22:07 -0400)]
radeonsi: rename r600_memory_object -> si_memory_object
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 22 Jun 2018 02:04:33 +0000 (22:04 -0400)]
radeonsi: remove unused r600_memory_object::offset
The real offset is passed through resource_from_memobj.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 22 Jun 2018 00:41:06 +0000 (20:41 -0400)]
radeonsi: unify duplicated texture_from_handle & texture_from_memobj
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Fri, 15 Jun 2018 19:28:28 +0000 (15:28 -0400)]
radeonsi: reorder and initialize more fields in si_reallocate_texture_inplace
Some fields shouldn't be initialized, like framebuffers_bound and other stats.
It's hopefully complete now.
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
Marek Olšák [Thu, 21 Jun 2018 23:19:49 +0000 (19:19 -0400)]
radeonsi: stop using lp_build_emit_llvm_unary/binary
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Thu, 21 Jun 2018 22:52:47 +0000 (18:52 -0400)]
radeonsi: stop using lp_build_alloc
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Thu, 21 Jun 2018 22:52:21 +0000 (18:52 -0400)]
radeonsi: use gallivm less
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Thu, 21 Jun 2018 22:20:59 +0000 (18:20 -0400)]
radeonsi: stop using lp_bld_intr.h
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Thu, 21 Jun 2018 22:20:59 +0000 (18:20 -0400)]
radeonsi: remove last uses of lp_build_context::undef
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Thu, 21 Jun 2018 22:18:42 +0000 (18:18 -0400)]
radeonsi: stop using lp_bld_arit.h
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Thu, 21 Jun 2018 22:06:23 +0000 (18:06 -0400)]
radeonsi: stop using lp_build_gather_values
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Thu, 21 Jun 2018 22:03:06 +0000 (18:03 -0400)]
radeonsi: clean up some #includes
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Thu, 21 Jun 2018 05:36:22 +0000 (01:36 -0400)]
radeonsi: clean up passing the is_monolithic flag for compilation
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Robert Foss [Wed, 18 Apr 2018 15:27:40 +0000 (17:27 +0200)]
egl/android: Add DRM node probing and filtering
This patch both adds support for probing & filtering DRM nodes
and switches away from using the GRALLOC_MODULE_PERFORM_GET_DRM_FD
gralloc call.
Currently the filtering is based just on the driver name,
and the desired name is supplied using the "drm.gpu.vendor_name"
Android property.
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
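The name-based filtering from the egl/android commit above might be sketched in C like this. The struct, the function, and the "no filter means take the first node" fallback are assumptions made for the example; the real code probes DRM devices and reads the "drm.gpu.vendor_name" property, neither of which is modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of a probed DRM node. */
struct drm_node {
   const char *path;
   const char *driver_name;
};

/* Return the first node whose driver name matches `wanted`.
 * Assumption for this sketch: a NULL `wanted` means no filter was
 * configured, so the first probed node is accepted. */
static const struct drm_node *
pick_node(const struct drm_node *nodes, size_t count, const char *wanted)
{
   assert(nodes != NULL || count == 0);
   for (size_t i = 0; i < count; i++) {
      if (wanted == NULL || strcmp(nodes[i].driver_name, wanted) == 0)
         return &nodes[i];
   }
   return NULL;
}
```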
Rob Herring [Thu, 26 Apr 2018 14:02:01 +0000 (16:02 +0200)]
egl/android: #ifdef out flink name support
Maintaining both flink names and prime fd support which are provided by
2 different gralloc implementations is problematic because we have a
dependency on a specific gralloc implementation header.
This mostly disables the dependency on the gralloc implementation and
headers. The dependency on GRALLOC_MODULE_PERFORM_GET_DRM_FD remains for
now, but the definition is added locally to remove the header
dependency.
drm_gralloc support can be enabled by setting
BOARD_USES_DRM_GRALLOC=true in BoardConfig.mk.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Robert Foss [Fri, 4 May 2018 15:05:13 +0000 (17:05 +0200)]
gallium/util: Fix build error due to cast to different size
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Mon, 25 Jun 2018 11:34:10 +0000 (13:34 +0200)]
radv: fix HTILE metadata initialization in presence of subpass clears
If the driver ends up performing a slow depthstencil clear,
the HTILE metadata won't be initialized correctly.
This fixes random VM faults on Polaris while running CTS
with Bas's runner. This doesn't seem to regress performance.
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Gert Wollny [Thu, 31 May 2018 23:20:54 +0000 (01:20 +0200)]
r600/sb: give the scheduler more margin to find valid instructions groups
For instruction sequences that change the address register with every load
the current limit to bail out of the scheduler and reject the optimisation
was too tight, i.e. it was expected that at least one pending instruction
would be scheduled each time.
Give the scheduler more margin to sort out these load sequences by allowing
a number of rounds where no instruction is scheduled.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106163
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
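The extra margin described in the r600/sb scheduler commit above can be pictured as a counter of consecutive rounds without progress. This C sketch uses hypothetical names and a callback in place of the real scheduler; setting max_idle_rounds to 0 corresponds to the old behaviour, where every round had to schedule at least one instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: tries to schedule pending work and returns how
 * many instructions were placed in this round. */
typedef unsigned (*schedule_round_fn)(void *ctx);

/* Run rounds until all `pending` instructions are placed. Bail out only
 * after more than `max_idle_rounds` consecutive rounds placed nothing. */
static bool schedule_all(schedule_round_fn round, void *ctx,
                         unsigned pending, unsigned max_idle_rounds)
{
   assert(round != NULL);
   unsigned idle = 0;
   while (pending > 0) {
      unsigned placed = round(ctx);
      if (placed == 0) {
         if (++idle > max_idle_rounds)
            return false;   /* reject the optimisation */
         continue;
      }
      idle = 0;
      pending -= placed > pending ? pending : placed;
   }
   return true;
}

/* Example round for experimenting: places one instruction on every
 * third call, using a call counter passed through `ctx`. */
static unsigned every_third_round(void *ctx)
{
   unsigned *calls = ctx;
   return (++*calls % 3 == 0) ? 1 : 0;
}
```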
Gert Wollny [Thu, 31 May 2018 21:25:09 +0000 (23:25 +0200)]
r600/sb: fix rotated register in while loop
This patch is based on
https://lists.freedesktop.org/archives/mesa-dev/2018-February/185805.html
Dave Airlie:
"A bunch of CTS tests led me to write
tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
which r600/sb always fell over on.
GCM seems to move some of the copies into other basic blocks,
if we don't allow this to happen then it doesn't seem to schedule
them badly.
Everything I've read on SSA/phi copies say they have to happen
in parallel, so keeping them in the same basic block seems like
a good way to keep some of that property."
This patch differs from the one proposed by Dave in that it only adds
the NF_DONT_MOVE flag to copy_move instructions that are created by split_phi*
and that are located in loops.
Fixes piglit: tests/shaders/ssa/fs-while-loop-rotate-value.shader_test
(no regressions in the shader set). It also fixes all failing tests from
dEQP-GLES3.functional.shaders.loops.*
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
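The remark in the commit above that phi copies "have to happen in parallel" can be illustrated with a three-value rotation in C. This is a toy model with hypothetical function names, not the r600/sb code, and it does not model GCM or basic blocks; it only shows why copies that are not kept together with parallel semantics can read a value that was already overwritten.

```c
#include <assert.h>

/* Phi copies for a rotating loop: a' = b, b' = c, c' = a.
 * Parallel semantics: read every source before writing any destination. */
static void rotate_parallel(int v[3])
{
   int a = v[0], b = v[1], c = v[2];
   v[0] = b;
   v[1] = c;
   v[2] = a;
}

/* Naive sequential copies: the last copy reads v[0] after it has been
 * overwritten, so the original first value is lost. */
static void rotate_sequential(int v[3])
{
   v[0] = v[1];
   v[1] = v[2];
   v[2] = v[0];   /* reads the new v[0], not the original one */
}
```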
Rob Clark [Sat, 23 Jun 2018 15:02:50 +0000 (11:02 -0400)]
freedreno/ir3: fix deref conversion fallout
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sat, 23 Jun 2018 10:47:49 +0000 (06:47 -0400)]
freedreno/ir3: fix unused variable warning
Fixes: cf0c7258ee0 freedreno/a5xx: MSAA
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Fri, 22 Jun 2018 20:09:25 +0000 (16:09 -0400)]
freedreno: fix HW_ATOMIC_COUNTERS cap
This was mistakenly exposed, even though we want atomic counters to be
lowered to atomic ops on an SSBO like nearly every other GPU. Exposing
it somehow recently started causing segfaults due to calling a null
pipe->set_hw_atomic_buffers().
Fixes a crash in stk, and probably other things.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Keith Packard [Fri, 16 Jun 2017 04:00:56 +0000 (21:00 -0700)]
radv: add VK_EXT_display_control to radv driver [v5]
This extension provides fences and frame count information to direct
display contexts. It uses new kernel ioctls to provide 64-bits of
vblank sequence and nanosecond resolution.
v2:
Rework fence integration into the driver so that waiting for
any of a mixture of fence types (wsi, driver or syncobjs)
causes the driver to poll, while a list of just syncobjs or
just driver fences will block. When we get syncobjs for wsi
fences, we'll adapt to use them.
v3: Adopt Jason Ekstrand's coding conventions
Declare variables at first use, eliminate extra whitespace between
types and names. Wrap lines to 80 columns.
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
v4: Adapt to WSI fence API change. It now returns VkResult and
no longer has an option for relative timeouts.
v5: wsi_register_display_event and wsi_register_device_event now
use the default allocator when NULL is provided, so remove the
computation of 'alloc' here.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Keith Packard [Fri, 16 Jun 2017 04:00:56 +0000 (21:00 -0700)]
anv: add VK_EXT_display_control to anv driver [v5]
This extension provides fences and frame count information to direct
display contexts. It uses new kernel ioctls to provide 64-bits of
vblank sequence and nanosecond resolution.
v2: Adopt Jason Ekstrand's coding conventions
Declare variables at first use, eliminate extra whitespace between
types and names. Wrap lines to 80 columns.
Add extension to list in alphabetical order
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
v3: Adapt to WSI fence API change. It now returns VkResult and
no longer has an option for relative timeouts.
v4: wsi_register_display_event and wsi_register_device_event now
use the default allocator when NULL is provided, so remove the
computation of 'alloc' here.
v5: use zalloc2 instead of alloc2 for the WSI fence.
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Keith Packard [Fri, 16 Jun 2017 04:00:56 +0000 (21:00 -0700)]
vulkan: add VK_EXT_display_control [v10]
This extension provides fences and frame count information to direct
display contexts. It uses new kernel ioctls to provide 64-bits of
vblank sequence and nanosecond resolution.
v2: Remove DRM_CRTC_SEQUENCE_FIRST_PIXEL_OUT flag. This has
been removed from the proposed kernel API.
Add NULL parameter to drmCrtcQueueSequence ioctl as we
don't care what sequence the event was actually queued to.
v3: Adapt to pthread clock switch to MONOTONIC
v4: Fix scope for wsi_display_mode and wsi_display_connector allocs
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
v5: Adopt Jason Ekstrand's coding conventions
Declare variables at first use, eliminate extra whitespace between
types and names. Wrap lines to 80 columns.
Use wsi_rel_to_abs_time helper function to convert relative
timeouts to absolute timeouts without causing overflow.
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
v6:
Change WSI fence wait function to return VkResult instead of
bool. This makes the meaning of the return value easier to
understand, and allows for the indication of failure.
Also change the WSI fence wait function to take only absolute
timeouts and not provide an option for a relative timeout. No
users wanted relative timeouts, and it's simpler if that option
isn't available.
Terminate the DPMS property loop once we've found the property.
Assert that the fence hasn't already been destroyed in
wsi_display_fence_destroy.
Rearrange the event handler function order in the file to place
routines in an easier to find order.
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
v7:
Adapt to API changes for surface_get_capabilities
v8:
Use wsi->alloc in register_display_event so that callers
don't have to dig out an allocator for us.
v9:
Fix a few minor formatting issues
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
v10:
Use wsi->alloc if none provided in wsi_display_fence_alloc.
Now that drivers are expected to pass the allocator argument
straight through from the application, we need to check those
for NULL everywhere.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
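The v5 note above mentions converting relative timeouts to absolute ones without causing overflow. A minimal C sketch of that idea follows; the function name and the clamping behaviour are assumptions for illustration and are not copied from the wsi_rel_to_abs_time helper.

```c
#include <assert.h>
#include <stdint.h>

/* Add a relative timeout to the current time. If the sum would wrap
 * around, clamp to the largest representable time so that "wait
 * forever" style timeouts stay far in the future. */
static uint64_t rel_to_abs_time(uint64_t now, uint64_t rel_timeout)
{
   if (rel_timeout > UINT64_MAX - now)
      return UINT64_MAX;
   return now + rel_timeout;
}
```

Without the check, a very large relative timeout added to the current time would wrap to a small absolute time and the wait would expire immediately.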
Keith Packard [Wed, 6 Jun 2018 05:18:56 +0000 (23:18 -0600)]
anv: Support wait for heterogeneous list of fences [v3]
Handle the case where the set of fences to wait for is not all of the
same type by either waiting for them sequentially (waitAll), or
polling them until the timer has expired (!waitAll). We hope the
latter case is not common.
While the current code makes sure that it always has fences of only
one type, that will not be true when we add WSI fences. Split out this
refactoring to make merging that clearer.
v2: Adopt Jason Ekstrand's coding conventions
Declare variables at first use, eliminate extra whitespace between
types and names. Wrap lines to 80 columns.
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
v2:
Cast INT64_MAX to uint64_t to make its use as the maximum
possible timeout clearly unsigned to the reader.
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Make anv_wait_for_fences with !waitAll check all fences at least
once, even if the requested timeout has already passed.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
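The two strategies from the anv commit above (sequential waits for waitAll, polling otherwise) can be sketched in C as below. The fence type, the per-fence wait, and the clock callback are hypothetical simplifications; real fences block or query the kernel. The do/while loop reflects the note that all fences are checked at least once even if the timeout has already passed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fence; real ones come in several backing types. */
struct fence {
   bool signaled;
};

/* Hypothetical per-fence wait. A real one would block until
 * `abs_deadline`; this sketch only reports the current state. */
static bool fence_wait(const struct fence *f, uint64_t abs_deadline)
{
   (void)abs_deadline;
   return f->signaled;
}

typedef uint64_t (*clock_fn)(void);

static bool wait_for_fences(const struct fence *fences, size_t count,
                            bool wait_all, uint64_t abs_deadline,
                            clock_fn now)
{
   if (wait_all) {
      /* Every fence must signal, so waiting on them one after another
       * is sufficient regardless of their types. */
      for (size_t i = 0; i < count; i++) {
         if (!fence_wait(&fences[i], abs_deadline))
            return false;
      }
      return true;
   }

   /* Any fence may signal: poll each one with a zero timeout. */
   do {
      for (size_t i = 0; i < count; i++) {
         if (fence_wait(&fences[i], 0))
            return true;
      }
   } while (now() < abs_deadline);
   return false;
}

/* Fake clock for experimenting: advances by one tick per call. */
static uint64_t fake_clock(void)
{
   static uint64_t ticks;
   return ticks++;
}
```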
Bas Nieuwenhuizen [Sun, 3 Jun 2018 23:10:12 +0000 (01:10 +0200)]
radv: Enable lower_io_to_temporaries after deref changes.
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Sat, 7 Apr 2018 05:34:57 +0000 (22:34 -0700)]
nir/lower_system_values: Assert/assume direct var derefs
System values are never arrays or structs so we can assume a direct var
deref. This simplifies things a bit and prevents us from accidentally
throwing away an array index.
Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 26 Mar 2018 21:50:38 +0000 (14:50 -0700)]
nir: Remove old-school deref chain support
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 26 Mar 2018 22:26:21 +0000 (15:26 -0700)]
nir: Remove deref chain support from analyze_loops
Note that this patch needs to come late in the series since this pass
can be run after any pass that damages nir_metadata_loop_analysis.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rob Clark [Thu, 5 Apr 2018 00:41:59 +0000 (20:41 -0400)]
freedreno/ir3: convert to deref instructions
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rob Clark [Thu, 5 Apr 2018 00:40:33 +0000 (20:40 -0400)]
nir: promote intrinsic_get_var() to helper
Useful in a few other places.. let's not copy-pasta
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 3 Apr 2018 00:41:28 +0000 (17:41 -0700)]
nir: Rework lower_locals_to_regs to use deref instructions
This completely reworks the pass to support deref instructions and
deletes support for old deref chains.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 28 Mar 2018 04:00:01 +0000 (21:00 -0700)]
intel,ir3: Re-enable nir_opt_copy_prop_vars
Now that it's rewritten for deref instructions, we can turn it back on.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bas Nieuwenhuizen [Sat, 12 May 2018 23:17:23 +0000 (01:17 +0200)]
radeonsi: Remove deref chain support in nir scan pass.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Fri, 11 May 2018 12:38:12 +0000 (14:38 +0200)]
radv: Remove deref chain support in radv shader info pass.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sat, 12 May 2018 23:48:18 +0000 (01:48 +0200)]
ac/nir: Remove deref chain support.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Fri, 11 May 2018 23:02:32 +0000 (01:02 +0200)]
radeonsi: Add deref support to the nir scan pass.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Jason Ekstrand [Wed, 28 Mar 2018 03:57:30 +0000 (20:57 -0700)]
nir: Rework opt_copy_prop_vars to use deref instructions
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 28 Mar 2018 00:21:35 +0000 (17:21 -0700)]
nir/copy_prop_vars: Re-order some logic in compare_derefs
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 2 Apr 2018 23:47:35 +0000 (16:47 -0700)]
nir: Remove deref chain support from split_per_member_structs
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 2 Apr 2018 23:44:40 +0000 (16:44 -0700)]
nir: Remove deref chain support from opt_undef
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 2 Apr 2018 23:24:10 +0000 (16:24 -0700)]
nir: Remove deref chain support from split_var_copies
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 2 Apr 2018 23:15:39 +0000 (16:15 -0700)]
nir: Remove deref chain support from dead_variables
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 2 Apr 2018 23:10:04 +0000 (16:10 -0700)]
nir: Remove deref chain support from propagate_invariant
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 2 Apr 2018 23:02:43 +0000 (16:02 -0700)]
nir: Remove deref chain support from lower_var_copies
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 2 Apr 2018 22:59:39 +0000 (15:59 -0700)]
nir: Remove deref chain support from lower_drawpixels
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 27 Mar 2018 16:45:23 +0000 (09:45 -0700)]
nir: Remove deref chain support from opt_peephole_select
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 27 Mar 2018 16:43:23 +0000 (09:43 -0700)]
nir: Remove deref chain support from lower_tex
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 27 Mar 2018 16:15:54 +0000 (09:15 -0700)]
nir: Remove deref chain support from lower_wpos_ytransform
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 27 Mar 2018 16:14:56 +0000 (09:14 -0700)]
nir: Remove deref chain support from lower_wpos_center
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 27 Mar 2018 16:08:31 +0000 (09:08 -0700)]
nir: Remove deref chain support from lower_system_values
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 27 Mar 2018 14:56:49 +0000 (07:56 -0700)]
nir: Remove deref chain support from remove_unused_varyings
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Tue, 27 Mar 2018 14:37:18 +0000 (07:37 -0700)]
nir: Delete lower_io_types
It's only used by the ir3 stand-alone compiler and Rob said we could
delete it.
Acked-by: Rob Clark <robdclark@gmail.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>