Tomeu Vizoso [Thu, 20 Jun 2019 13:37:10 +0000 (15:37 +0200)]
panfrost: Set job requirements during draw
Right now we are doing it at a moment when we don't have all the
information we need.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Suggested-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Rohan Garg <rohan.garg@collabora.com>
Cc: Rohan Garg <rohan.garg@collabora.com>
Fixes: bfca21b622df ("panfrost: Figure out job requirements in pan_job.c")
Alyssa Rosenzweig [Thu, 20 Jun 2019 15:54:56 +0000 (08:54 -0700)]
panfrost/meson: Link with libpanfrost_shared
Fixes: 035a07c0 ("panfrost: Switch to lima tiling")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Hyunjun Ko [Thu, 20 Jun 2019 13:12:23 +0000 (22:12 +0900)]
freedreno/ir3: fix typo
Fixes: a9b556d3a04 ("freedreno/ir3: check the type of regs of absneg opcode in is_same_type_mov")
Reviewed-by: Rob Clark <robdclark@gmail.com>
Alyssa Rosenzweig [Tue, 18 Jun 2019 17:48:43 +0000 (10:48 -0700)]
panfrost: Load from tiled images
Now that we have lima tiling code available, use it to load from a tiled
source.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Tue, 18 Jun 2019 17:43:07 +0000 (10:43 -0700)]
panfrost: Switch to lima tiling
Lima and Panfrost both have implementations of software tiling
(the Lima one was forked off the Panfrost one which was forked off the
original Lima one...). Switch to the most recent Lima code, since it's
more complete than ours at this point.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Thu, 20 Jun 2019 15:19:06 +0000 (08:19 -0700)]
panfrost: Fix tiled NPOT textures with bpp<4
Panfrost's tiling routines (incorrectly) ignored the source stride,
masking this bug; lima's routines respect this stride, causing issues
when tiling NPOT textures whose stride is not a multiple of 64
(for instance, NPOT textures with bpp=1).
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Tue, 18 Jun 2019 18:16:21 +0000 (11:16 -0700)]
lima,panfrost: Move lima_tiling.c/h to /src/panfrost
This will allow both drivers to share this code. Both drivers
build-tested with meson. Android build not tested.
v2: Change naming from tiling->shared, in case Lima and Panfrost can
share more in the future. Fix Android build system.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-and-tested-by: Qiang Yu <yuq825@gmail.com>
Kenneth Graunke [Thu, 20 Jun 2019 15:02:42 +0000 (10:02 -0500)]
iris: Use render_batch/compute_batch locals in memory_barrier
We have them, may as well use them.
Lionel Landwerlin [Sun, 21 Apr 2019 00:01:55 +0000 (01:01 +0100)]
anv: only resort to sync fds internally with no syncobj support
We can rely on only one kind of synchronization object (drm-syncobj)
when it is available. This reduces the number of file descriptors we
use in our implementation.
This will be required later for timeline semaphores implementation, at
this point we won't ever want to use anything else but syncobjs.
v2: Only use has_syncobj for semaphores (Jason)
v3: Only has_syncobj in assert on semaphores in QueueSubmit (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Alyssa Rosenzweig [Wed, 19 Jun 2019 16:36:22 +0000 (09:36 -0700)]
panfrost: Remove other commented pointers
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 19 Jun 2019 16:35:57 +0000 (09:35 -0700)]
panfrost/decode: Elide more zero fields
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 19 Jun 2019 16:33:19 +0000 (09:33 -0700)]
panfrost/decode: Remove memory comments
These do more harm than good at this point.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 19 Jun 2019 16:31:49 +0000 (09:31 -0700)]
panfrost: Add missing 0x in invocation_count
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 19 Jun 2019 16:31:16 +0000 (09:31 -0700)]
panfrost/decode: Skip decode of fragment backend in non-fragment
This is all zero for anything but fragment shaders.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 19 Jun 2019 16:24:01 +0000 (09:24 -0700)]
panfrost/decode: Clip mali_compute_fbd at 64-bytes
Looking at internal evidence (later fields including a literal other
compute job inception-style, seeming memory corruption, no clear
function, and the field after this being a pointer to *itself*), it
looks like this is really a much smaller descriptor.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 19 Jun 2019 16:07:13 +0000 (09:07 -0700)]
panfrost/decode: Print COMPUTE uniforms as pointers
In OpenGL, uniforms generally represent fp32 vec4s (at least in highp
mode). In OpenCL, they represent vec2s of 64-bit pointers.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 19 Jun 2019 15:59:23 +0000 (08:59 -0700)]
panfrost/decode: Show int uniforms
Float is ambiguous.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 19 Jun 2019 15:55:03 +0000 (08:55 -0700)]
panfrost/decode: Expand pointers in compute descriptor
Just as an aid.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Wed, 19 Jun 2019 15:41:51 +0000 (08:41 -0700)]
panfrost/decode: Identify "compute FBD"
There is fundamentally not a framebuffer associated with a compute job.
Allocate a new structure for it so we don't mess up graphics when
decoding.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tomeu Vizoso [Wed, 19 Jun 2019 11:16:45 +0000 (13:16 +0200)]
panfrost: Allocate panfrost_job in panfrost_context
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tomeu Vizoso [Wed, 19 Jun 2019 11:16:14 +0000 (13:16 +0200)]
panfrost: Release transient pools
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Tomeu Vizoso [Thu, 20 Jun 2019 05:59:01 +0000 (07:59 +0200)]
panfrost: ci: Exclude flip-flops from results
These tests are failing at times, blacklist for now:
dEQP-GLES2.functional.fbo.render.shared_colorbuffer_clear.tex2d_rgba
dEQP-GLES2.functional.fbo.render.shared_colorbuffer_clear.tex2d_rgb
dEQP-GLES2.functional.shaders.matrix.mul.dynamic_highp_mat4_vec4_vertex
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alejandro Piñeiro [Thu, 20 Jun 2019 12:48:35 +0000 (14:48 +0200)]
util: add empty line before virgl options
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Alejandro Piñeiro [Thu, 20 Jun 2019 11:13:06 +0000 (13:13 +0200)]
util: add missing DRI_CONF_OPT_END
When DRI_CONF_GLES_EMULATE_BGRA was added for the virgl driver, it
missed a DRI_CONF_OPT_END.
This make some drivers, like v4c/v3d to crash with the following
error:
Fatal error in __driConfigOptions line 99, column 2: mismatched tag.
Not sure why it doesn't fail with virgl.
Fixes: b79366344929c6e477c64a63f246c6db0766a71c
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Eric Engestrom [Wed, 19 Jun 2019 21:18:15 +0000 (22:18 +0100)]
isl: tag unreachable path as such
GCC should be able to figure out that all the possible enum values are
exhausted in the switch() and all the branches return from the function,
but apparently it doesn't, so let's tell the compiler explicitly.
This gets rid of the following warnings in GCC 9:
[1/24] Compiling C object 'src/intel/isl/
60d23f8@@isl@sta/isl.c.o'.
../src/intel/isl/isl.c: In function ‘isl_surf_init_s’:
../src/intel/isl/isl.c:1569:10: warning: ‘array_pitch_el_rows’ may be used uninitialized in this function [-Wmaybe-uninitialized]
1569 | *surf = (struct isl_surf) {
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~
1570 | .dim = info->dim,
| ~~~~~~~~~~~~~~~~~
1571 | .dim_layout = dim_layout,
| ~~~~~~~~~~~~~~~~~~~~~~~~~
1572 | .msaa_layout = msaa_layout,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1573 | .tiling = tiling,
| ~~~~~~~~~~~~~~~~~
1574 | .format = info->format,
| ~~~~~~~~~~~~~~~~~~~~~~~
1575 |
|
1576 | .levels = info->levels,
| ~~~~~~~~~~~~~~~~~~~~~~~
1577 | .samples = info->samples,
| ~~~~~~~~~~~~~~~~~~~~~~~~~
1578 |
|
1579 | .image_alignment_el = image_align_el,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1580 | .logical_level0_px = logical_level0_px,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1581 | .phys_level0_sa = phys_level0_sa,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1582 |
|
1583 | .size_B = size_B,
| ~~~~~~~~~~~~~~~~~
1584 | .alignment_B = base_alignment_B,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1585 | .row_pitch_B = row_pitch_B,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1586 | .array_pitch_el_rows = array_pitch_el_rows,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1587 | .array_pitch_span = array_pitch_span,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1588 |
|
1589 | .usage = info->usage,
| ~~~~~~~~~~~~~~~~~~~~~
1590 | };
| ~
../src/intel/isl/isl.c:1488:24: warning: ‘*((void *)&phys_total_el+4)’ may be used uninitialized in this function [-Wmaybe-uninitialized]
1488 | struct isl_extent2d phys_total_el;
| ^~~~~~~~~~~~~
../src/intel/isl/isl.c:1335:38: warning: ‘phys_total_el’ may be used uninitialized in this function [-Wmaybe-uninitialized]
1335 | isl_align_div(phys_total_el->w * tile_el_scale,
| ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
../src/intel/isl/isl.c:1488:24: note: ‘phys_total_el’ was declared here
1488 | struct isl_extent2d phys_total_el;
| ^~~~~~~~~~~~~
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Samuel Pitoiset [Thu, 20 Jun 2019 07:17:37 +0000 (09:17 +0200)]
radv: enable DCC for mipmapped color textures on GFX8
It's tricky on GFX9, so only GFX8 for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 20 Jun 2019 07:17:36 +0000 (09:17 +0200)]
radv: do not fast clears if one level can't be fast cleared
And fallback to slow color clears.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 20 Jun 2019 07:17:35 +0000 (09:17 +0200)]
radv: add fast clears support for mipmapped color images with DCC
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 20 Jun 2019 07:17:34 +0000 (09:17 +0200)]
radv: add radv_dcc_clear_level() helper
For clearing only one level.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 20 Jun 2019 07:17:33 +0000 (09:17 +0200)]
radv: re-initialize DCC metadata after decompressing using compute
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 20 Jun 2019 07:17:32 +0000 (09:17 +0200)]
radv: initialize levels without DCC during layout transitions
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Thomas Hellstrom [Fri, 5 Apr 2019 15:07:49 +0000 (17:07 +0200)]
svga: Support ARB_buffer_storage
This basically boils down to supporting persistent and coherent buffer
storage.
We chose to use coherent buffer storage for all persistent buffers
even if it's not explicitly specified, since using glMemoryBarrier to
obtain coherency would be particularly expensive in our driver stack,
and require a lot of additional bookkeeping.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Thomas Hellstrom [Wed, 10 Apr 2019 11:54:30 +0000 (13:54 +0200)]
gallium/util: Make it possible to disable persistent maps in the upload manager
For svga, the use of persistent / coherent maps is typically slightly
slower than without them. It's probably a bit case-dependent and
possible to tune, but for now, make sure we can disable those.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Thomas Hellstrom [Thu, 11 Apr 2019 08:10:04 +0000 (10:10 +0200)]
svga: Map vertex- index- and constant buffers ansynchronously when reading
With SWTNL and index translation we're mapping buffers for reading. These
buffers are commonly upload_mgr buffers that might already be referenced
by another submitted or unsubmitted GPU command. A synchronous map will
then trigger a flush and sync, at least on Linux that doesn't distinguish
between read- and write referencing. So map these buffers async. If they
for some obscure reason happen to be dirty (stream-output, buffer-copy),
the resource_buffer code will read-back and sync anyway. For persistent /
coherent buffers a corresponding read-back and sync will happen in the
kernel fault handler.
Testing: Piglit quick. No regressions.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Thomas Hellstrom [Thu, 11 Apr 2019 06:56:20 +0000 (08:56 +0200)]
svga: Fix index buffer uploads
In the case of SWTNL and index translation we were uploading index buffers
and then reading out from them using the CPU. Furthermore, when translating
indices we often cached the results with an upload_mgr buffer, causing the
cached indexes to be immediately discarded on the next write to that
upload_mgr buffer.
Fix this by only uploading when we know the index buffer is going to be
used by hardware. If translating, only cache translated indices if the
original buffer was not a user buffer. In the latter case when we're not
caching, use an upload_mgr buffer for the hardware indices.
This means we can also remove the SWTNL hand-crafted index buffer upload
mechanism in favour of the upload_mgr.
Finally avoid using util_upload_index_buffer(). It wastes index buffer
space by trying to make sure that the offset of the indices in the
upload_mgr buffer is larger or equal to the position of the indices in
the source buffer. From what I can tell, the SVGA device does not
require that.
Testing done: Piglit quick. No regressions.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Thomas Hellstrom [Fri, 5 Apr 2019 07:09:19 +0000 (09:09 +0200)]
winsys/svga: Make it possible to specify coherent resources
Add a flag in the surface cache key and a winsys usage flag to
specify coherent memory.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Thomas Hellstrom [Sat, 6 Apr 2019 09:13:39 +0000 (11:13 +0200)]
gallium/util: Make u_debug_flush support persistent maps
Previously unsynchronized maps have been assumed to also be persistent,
Now destinguish between persistent and unsynchronized map and also support
PIPE_TRANSFER_PERSISTENT from ARB_buffer_storage.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Gert Wollny [Mon, 27 May 2019 14:38:38 +0000 (16:38 +0200)]
virgl: Add debug flag to bypass driconf to enable the BGRA tweaks
This useful for testing, also because with vtest the dri configuration
is not read.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Gert Wollny [Mon, 27 May 2019 14:38:14 +0000 (16:38 +0200)]
virgl: Add a tweak to set the value for emulated queries of GL_SAMPLES_PASSED
On GLES hosts GL_SAMPLES_PASSED is emulated by GL_ANY_SAMPLES_PASSED which returns a boolen.
With this tweak the value that is returned if any sample passed can be set. This
may be of iterest when an application decides whether some geometry is rendered based
on an amount of visibility and not just a binary desicion. virgelrenderer sets a default
of 1024 on th host.
v2: Remove reference from virgl and correct description (Emil)
v3: Send the tweak binary encoded instead of using strings (Gurchetan)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Gert Wollny [Mon, 27 May 2019 14:32:40 +0000 (16:32 +0200)]
virgl: Add tweak to apply a swizzle when drawing/blitting to a emulated BGRA texture
With Qemu this final swizzle is not needed, but with vtest it is, i.e. it depends on
how a program using virglrenderer uses the surface that is rendered to, hence
a tweak is added.
v2: Update description and fix spelling (Emil)
v3: Send tweak as binary value instead of using strings (Gurchetan)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Gert Wollny [Mon, 27 May 2019 14:31:17 +0000 (16:31 +0200)]
virgl: Add driconf tweak for emulating BGRA surfaces on GLES
These tweaks are used to fix rendering issues with Valve games and
at least also "The Raven Remastered" when run on a GLES host.
v2: Fix type in define and remove virgl from driconf option (Emil)
v3: Encode tweak binary instead of using strings (Gurchetan)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Gert Wollny [Mon, 27 May 2019 14:28:44 +0000 (16:28 +0200)]
virgl: Add override for BGRA format to use swizzled SRGB format
Tie in the check whether the host supports tweaks and whether this tweak
is enabled.
v2: Add comment about the emulated formats not being used directly in the
guest (Gurchetan)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Gert Wollny [Mon, 27 May 2019 14:26:25 +0000 (16:26 +0200)]
virgl: Add code to accept BGRx_SRGB as RGBx_SRGB
This will be enabled in later patches by the emulation tweak.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Gert Wollny [Mon, 27 May 2019 14:02:28 +0000 (16:02 +0200)]
virgl: Add skeleton to evaluate cap and send tweaks
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Gert Wollny [Fri, 12 Apr 2019 07:52:31 +0000 (09:52 +0200)]
virgl: factor out format host bits check
This will make it a single location when we want to replace a format.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Gert Wollny [Wed, 10 Apr 2019 11:54:14 +0000 (13:54 +0200)]
gallium/virgl: Add code path for virgl to read driconf
This works only for the drm variant of virgl and not for the vtest
variant.
v2: Rebase, replace the configuration query function by a pointer to
the configuration data.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Gert Wollny [Wed, 10 Apr 2019 12:03:00 +0000 (14:03 +0200)]
virgl: Add driinfo file and tie it into the build
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Caio Marcelo de Oliveira Filho [Sat, 23 Mar 2019 17:28:03 +0000 (10:28 -0700)]
glspirv: Call pass to lower frexp instructions
These were previously handled by the spirv_to_nir, but that changed to
be an explict pass in
23d30f4099f "spirv,nir: lower
frexp_exp/frexp_sig inside a new NIR pass"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Caio Marcelo de Oliveira Filho [Fri, 22 Mar 2019 05:58:30 +0000 (22:58 -0700)]
spirv: Restrict use of descriptor intrinsics to Vulkan
In ARB_gl_spirv we'll be able to use variables for uniform buffers, so
don't use the descriptor intrinsics to lower the block access.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Nicolai Hähnle [Thu, 23 May 2019 13:17:51 +0000 (15:17 +0200)]
ac/rtld: report better error messages for LDS overallocation
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Tue, 21 May 2019 23:17:34 +0000 (19:17 -0400)]
ac/rtld: check correct LDS max size
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Nicolai Hähnle [Fri, 7 Dec 2018 09:31:41 +0000 (10:31 +0100)]
radeonsi: add s_sethalt to shaders for debugging
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Nicolai Hähnle [Thu, 23 May 2019 13:17:24 +0000 (15:17 +0200)]
ac/rtld: fix sorting of LDS symbols by alignment
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Bas Nieuwenhuizen [Wed, 19 Jun 2019 13:16:51 +0000 (15:16 +0200)]
meson: Allow building radeonsi with just the android platform.
Just as was allowed by autotools.
Fixes: 108d257a168 "meson: build libEGL"
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Bas Nieuwenhuizen [Wed, 19 Jun 2019 13:05:40 +0000 (15:05 +0200)]
anv: Fix vulkan build in meson.
Apparently the android part was never ported to meson.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Bas Nieuwenhuizen [Wed, 19 Jun 2019 13:03:43 +0000 (15:03 +0200)]
radv: Fix vulkan build in meson.
Apparently the android part was never ported to meson.
CC: <mesa-stable@lists.freedesktop.org>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Jason Ekstrand [Wed, 19 Jun 2019 20:18:06 +0000 (15:18 -0500)]
anv/image: Set different usage flags for shadow surfaces
For the block BLOCK_TEXEL_VIEW_COMPATIBLE case, this didn't matter
because the flags were already more-or-less what we wanted. However,
for gen7 stencil shadow images, it still had ISL_SURF_USAGE_STENCIL_BIT
so we were getting W-tiled which isn't what we want for the shadow. By
passing just ISL_SURF_USAGE_TEXTURE_BIT (and CUBE if we care), we now
get something that's actually texturable.
Fixes: f3ea0cf828 "anv: Add stencil texturing support for gen7"
Jason Ekstrand [Wed, 19 Jun 2019 19:14:20 +0000 (14:14 -0500)]
anv: Flush caches in anv_image_copy_to_shadow
Copies to a shadow image happen during a VkCmdPipelineBarrier or at
subpass transitions. We could potentially be a bit more conservative
but these transitions shouldn't happen often and it's better to have our
bases covered.
Fixes: f3ea0cf828 "anv: Add stencil texturing support for gen7"
Jason Ekstrand [Thu, 6 Jun 2019 15:51:25 +0000 (10:51 -0500)]
nir: Make nir_constant a vector rather than a matrix
Most places in NIR, we treat matrices like arrays. The one annoying
exception to this has been nir_constant where a matrix is a first-class
thing. This commit changes that so a matrix nir_constant is the same as
an array nir_constant. This makes matrix nir_constants a tiny bit more
expensive but shrinks all others by 96B.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Thu, 6 Jun 2019 15:46:25 +0000 (10:46 -0500)]
glsl/nir: Fix handling of 64-bit values in uniform storage
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Thu, 6 Jun 2019 15:20:48 +0000 (10:20 -0500)]
spirv: Only copy needed components for OpSpecConstantOp
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Thu, 6 Jun 2019 15:09:01 +0000 (10:09 -0500)]
spirv: Use a single path for OpSpecConstantOp of OpVectorShuffle
Now that nir_const_value is a scalar, there's no reason why we need
multiple paths here and it's just extra paths to keep working. While
we're here, we also add a vtn_fail_if check that component indices are
in-bounds.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Thu, 6 Jun 2019 14:49:55 +0000 (09:49 -0500)]
spirv: Use vtn_constan_uint() for array lengths and gather components
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Thu, 6 Jun 2019 14:53:27 +0000 (09:53 -0500)]
spirv: Add a vtn_constant_int helper
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Thu, 6 Jun 2019 16:40:13 +0000 (11:40 -0500)]
glsl/types: Add a real is_integer helper
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Thu, 6 Jun 2019 16:37:32 +0000 (11:37 -0500)]
glsl/types: Rename is_integer to is_integer_32
It only accepts 32-bit integers so it should have a more descriptive
name. This patch should not be a functional change.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Thu, 6 Jun 2019 16:26:51 +0000 (11:26 -0500)]
glsl/types: Ignore bit sizes in contains_integer()
All of the callers for this function are looking at interpolation
qualifiers and want to make sure they're declared flat. Any 64-bit
integer inputs need to be flat. It's also makes the function make more
sense since "integer" is fairly generic.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Thu, 6 Jun 2019 16:30:45 +0000 (11:30 -0500)]
glsl/types: Handle all bit sizes in glsl_type_is_integer
All of the callers of this function really just want to know if the type
is an integer and don't care about bit size.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Caio Marcelo de Oliveira Filho [Wed, 19 Jun 2019 18:39:24 +0000 (11:39 -0700)]
glsl/nir_opt_access: Update uniforms correctly when only vars change
Even if only variables access flags are changed, the existing NIR
infrastructure expects metadata to be explicitly preserved, so do
that. Don't care about avoiding preserve to be called twice since the
cost is negligible.
This scenario can be triggered by dead variables, and also by other
intrinsics that read the variables -- but not cause progress to be
made when processing the intrinsics.
Fixes: f2d0e48ddc7 "glsl/nir: Add optimization pass for access flags"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Caio Marcelo de Oliveira Filho [Wed, 19 Jun 2019 17:00:39 +0000 (10:00 -0700)]
glsl/nir: Fix getting the sampler dim when arrays are involved
Unwrap any array in the variable type so we can get the sampler dim.
This fixes piglit test
spec/arb_arrays_of_arrays/execution/image_store/basic-imageStore-const-uniform-index.shader_test.
Fixes: f2d0e48ddc7 "glsl/nir: Add optimization pass for access flags"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jory Pratt [Wed, 8 May 2019 02:47:40 +0000 (21:47 -0500)]
meson: Search for execinfo.h
Rather than checking __GLIBC__/__UCLIBC__ macros as a proxy for
execinfo.h presence, just check directly. This allows the build to work
on musl.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Jory Pratt [Mon, 10 Jun 2019 18:48:02 +0000 (11:48 -0700)]
util: Heap-allocate 256K zlib buffer
The disk cache code tries to allocate a 256 Kbyte buffer on the stack.
Since musl only gives 80 Kbyte of stack space per thread, this causes a
trap.
See https://wiki.musl-libc.org/functional-differences-from-glibc.html#Thread-stack-size
(In musl-1.1.21 the default stack size has increased to 128K)
[mattst88]: Original author unknown, but I think this is small enough
that it is not copyrightable.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Kenneth Graunke [Wed, 19 Jun 2019 16:57:01 +0000 (11:57 -0500)]
anv: Fix wrong printf formatter
%lu is for unsigned long, %zu is for size_t. Just cast the data.
Kenneth Graunke [Wed, 19 Jun 2019 04:47:12 +0000 (23:47 -0500)]
iris: Bail on queries for INTEL_NO_HW=1.
We don't execute any of the commands to record snapshots, so we can't
actually produce a real result. We do however need to avoid waiting
on a syncpt which will never be signalled. So, just return 0.
David Riley [Thu, 13 Jun 2019 00:16:35 +0000 (17:16 -0700)]
virgl: Support VIRGL_BIND_SHARED
Support a new virgl bind type for shared buffers.
Signed-off-by: David Riley <davidriley@chormium.org>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
Lionel Landwerlin [Thu, 2 May 2019 16:43:03 +0000 (17:43 +0100)]
anv: write spirv-nir logs back to the application
Using the existing VK_EXT_debug_report extension.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Connor Abbott [Tue, 4 Jun 2019 12:42:54 +0000 (14:42 +0200)]
ac/nir: Set speculatable for buffer loads where allowed
This brings the nir path in line with the TGSI path.
Totals from affected shaders:
SGPRS: 2984 -> 2984 (0.00 %)
VGPRS: 2792 -> 2652 (-5.01 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 247380 -> 248072 (0.28 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 121 -> 132 (9.09 %)
Wait states: 0 -> 0 (0.00 %)
Most of the change came from DiRT: Showdown, and came from sinking SSBO
loads.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Connor Abbott [Tue, 4 Jun 2019 12:12:34 +0000 (14:12 +0200)]
nir: Use reorderable access flag
No changes with radeonsi shader-db.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Connor Abbott [Tue, 4 Jun 2019 11:02:31 +0000 (13:02 +0200)]
nir: Add a helper to determine if an intrinsic can be reordered
This is simple now, but we're going to be adding a few more conditions
to this later.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Connor Abbott [Tue, 4 Jun 2019 12:18:54 +0000 (14:18 +0200)]
st/nir: Use gl_nir_opt_access
Nothing uses its results yet, that will come with the following commits.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Connor Abbott [Tue, 4 Jun 2019 12:13:13 +0000 (14:13 +0200)]
glsl/nir: Add optimization pass for access flags
Right now, this just deduces when we can arbitrarily reorder SSBO and
image loads, matching the existing logic in radeonsi's TGSI->LLVM pass.
This approach can't handle some things that nir_opt_copy_prop_vars can,
but it can handle images, and with GCM it lets us hoist reads outside of
loops. We can also pass this information to LLVM which lets it do its
own optimizations on it.
This is GLSL only as I haven't tested it on Vulkan yet, and it would
probably need a few changes to work there.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Connor Abbott [Fri, 31 May 2019 17:03:48 +0000 (19:03 +0200)]
nir: Add reorderable memory access enum
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Connor Abbott [Wed, 5 Jun 2019 08:23:00 +0000 (10:23 +0200)]
nir/copy_prop_vars: Ignore volatile accesses
The spec explicitly says that volatile writes can't be removed and
volatile reads do not guarantee that the same value will still be around
after the read, as if there were a barrier after each read/write. Just
ignore them.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Connor Abbott [Tue, 4 Jun 2019 09:41:25 +0000 (11:41 +0200)]
glsl/nir: Propagate access qualifiers
We were completely ignoring these before, except for putting them on
variables. While we're here, don't set access qualifiers when converting
to bindless since glsl_to_nir will already have set a more accurate
qualifier that includes any qualifiers on struct members that are
dereferenced.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Connor Abbott [Tue, 4 Jun 2019 09:40:14 +0000 (11:40 +0200)]
nir: Allow qualifiers on copy_deref and image instructions
In the next commit, we'll properly handle access qualifiers on struct
members by propagating them to load/store instructions, but these
instructions had no way to specify the qualifier.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Connor Abbott [Fri, 31 May 2019 17:04:36 +0000 (19:04 +0200)]
ac,radeonsi: Always mark buffer stores as inaccessiblememonly
inaccessiblememonly means that it doesn't modify memory accesible via
normal LLVM pointers. This lets LLVM's dead store elimination, memcpy
forwarding, etc. ignore functions with this attribute. We don't
represent descriptors as pointers, so this property is always true of
buffer and image stores. There are plans to represent descriptors via
pointers, but this just means that now nothing is inaccessiblememonly,
as LLVM will then understand loads/stores via its usual alias analysis.
Radeonsi was mistakenly only setting it if the driver could prove that
there were no reads, and then it was cargo-culted into ac_llvm_build
and ac_llvm_to_nir. Rip it out of everything.
statistics with nir enabled:
Totals from affected shaders:
SGPRS: 152 -> 152 (0.00 %)
VGPRS: 128 -> 132 (3.12 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 9324 -> 9244 (-0.86 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Max Waves: 17 -> 17 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
The only difference was a manhattan31 shader.
Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Eric Engestrom [Tue, 11 Jun 2019 12:19:35 +0000 (13:19 +0100)]
egl: add missing #include
close() is in <unistd.h>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Samuel Pitoiset [Tue, 18 Jun 2019 16:58:40 +0000 (18:58 +0200)]
radv: disable viewport clamping even if FS doesn't write Z
This fixes new CTS dEQP-VK.pipeline.depth_range_unrestricted.*.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 14 Nov 2018 15:24:02 +0000 (16:24 +0100)]
radv: implement compressed FMASK texture reads with RADV_PERFTEST=tccompatcmask
This allows us to disable the FMASK decompress pass when
transitioning from CB writes to shader reads.
This will likely be improved and enabled by default in the future.
No CTS regressions on GFX8 but a few number of multisample CTS
failures on GFX9 (they look related to the small hint).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 18 Jun 2019 14:11:07 +0000 (16:11 +0200)]
radv: fix FMASK expand with SRGB formats
Found while working on DCC for MSAA.
Fixes: 6b976024a87 ("radv: add support for FMASK expand")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tomeu Vizoso [Tue, 18 Jun 2019 12:24:57 +0000 (14:24 +0200)]
panfrost: Move to use ralloc for some allocations
We have some serious leaks, so plug some and also move to ralloc to
limit the lifetime of some objects to that of their parent.
Lots more such work to do.
For some reason, this fixes:
dEQP-GLES2.functional.lifetime.attach.deleted_output.texture_framebuffer
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Mathias Fröhlich [Thu, 6 Jun 2019 08:22:25 +0000 (10:22 +0200)]
egl: Don't add hardware device if there is no render node v2.
Do not offer a hardware drm backed egl device if no render node
is available. The current implementation will fail on this
egl device. On top it issues a warning that is actually missleading.
There are finally more error paths that can fail on the way to a
hardware backed egl device. Fixing all of them would kind of require
opening the drm device and see if there is a usable driver associated
with the device. The taken approach avoids a full probe and fixes at
least this kind of problem on kvm virtualization hosts I observe here.
Fixes: dbb4457d985 ("egl: add EGL_EXT_device_drm support")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Christian Gmeiner [Mon, 3 Jun 2019 05:42:06 +0000 (07:42 +0200)]
etnaviv: support GL_ARB_seamless_cubemap_per_texture
Passes spec@amd_seamless_cubemap_per_texture@amd_seamless_cubemap_per_texture
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-By: Guido Günther <agx@sigxcpu.org>
Christian Gmeiner [Mon, 3 Jun 2019 05:31:08 +0000 (07:31 +0200)]
etnaviv: update headers from rnndb
Update to etna_viv commit
a3bf0da.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Dave Airlie [Tue, 18 Jun 2019 21:19:03 +0000 (07:19 +1000)]
radeonsi: fix undefined shift in macro definition
Pointed out by coverity
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Dave Airlie [Tue, 18 Jun 2019 21:10:13 +0000 (07:10 +1000)]
nouveau: fix frees in unsupported IR error paths.
This is pointless in that we won't ever hit those paths in real life,
but coverity complains.
Fixes: f014ae3c7cce ("nouveau: add support for nir")
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Rohan Garg [Wed, 5 Jun 2019 17:04:04 +0000 (19:04 +0200)]
panfrost: Move clearing logic into pan_job
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Chia-I Wu [Mon, 17 Jun 2019 16:53:48 +0000 (09:53 -0700)]
virgl: fix sync issue regarding discard/unsync transfers
GL_MAP_INVALIDATE_BUFFER_BIT cannot be treated as
GL_MAP_INVALIDATE_RANGE_BIT naively. When we run into
ptr = glMapBufferRange(buf, 0, size,
GL_WRITE_BIT|GL_MAP_INVALIDATE_BUFFER_BIT);
memcpy(ptr, data1, size);
glUnmapBuffer(buf);
ptr = glMapBufferRange(buf, size, size,
GL_WRITE_BIT|GL_MAP_UNSYNCHRONIZED_BIT);
memcpy(ptr, data2, size);
glUnmapBuffer(buf);
we never want data1 to be copy_transfer'ed. Because that would mean
that data2 might overwrite valid data.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis alexandros.frantzis@collabora.com
Fixes: a22c5df0794 ("virgl: Use buffer copy transfers to avoid waiting when mapping")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Alyssa Rosenzweig [Mon, 17 Jun 2019 23:23:41 +0000 (16:23 -0700)]
panfrost: Enable sRGB
Now that sRGB formats are supported for both rendering and sampling,
advertise support.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Alyssa Rosenzweig [Tue, 18 Jun 2019 14:41:26 +0000 (07:41 -0700)]
panfrost: Disable AFBC on sRGB buffers
The performance impact is slightly mitigated by tiling the render
target, but it's undeniably still slow compared to AFBC. Unfortunately,
it doesn't look like AFBC and sRGB play nice...
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>