Rob Clark [Mon, 2 Jul 2018 12:15:43 +0000 (08:15 -0400)]
mesa: fix error msg typo
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Rob Clark [Sat, 23 Jun 2018 22:22:42 +0000 (18:22 -0400)]
nir: fixup intrinsic comment
Now the deref is the first src.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Tomeu Vizoso [Fri, 22 Jun 2018 13:04:04 +0000 (15:04 +0200)]
mesa: handle a bunch of formats in IMPLEMENTATION_COLOR_READ_*
Virgl could save a lot of work converting buffers in the host side
between formats if Mesa supported a bunch of other formats when reading
pixels.
This commit adds cases to handle specific formats so that the values
reported by the two calls match more closely the underlying native
formats.
In GLES is important that IMPLEMENTATION_COLOR_READ_* return the native
format and data type because the spec only allows reading with those,
besides GL_RGBA or GL_RGBA_INTEGER.
Additionally, because virgl currently doesn't implement such
conversions, this commit fixes several tests in
dEQP-GLES3.functional.fbo.color.clear.*, when using virgl in the guest
side.
The logic is based on knowledge that is shared with
_mesa_format_matches_format_and_type() but we cannot assert that the
results match as we don't have all the starting information at both
points. So leave the assert out and hope CI comes soon to save us all.
v2: * Let R10G10B10A2_UINT fall back to GL_RGBA_INTEGER (Eric Anholt)
* Assert with _mesa_format_matches_format_and_type (Eric Anholt)
v3: * Remove the assert, as it won't be reliable (Eric Anholt)
v4: * Use _mesa_is_format_integer in the fallback (Eric Anholt)
v5: * Remove superfluous call to
_mesa_uncompressed_format_to_type_and_comps (Eric Anholt)
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
Samuel Pitoiset [Mon, 9 Jul 2018 09:37:15 +0000 (11:37 +0200)]
radv: add support for VK_EXT_conditional_rendering
Inherited commands buffers are not supported.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 9 Jul 2018 09:33:28 +0000 (11:33 +0200)]
radv: add support for non-inverted conditional rendering
By default, our internal rendering commands are discarded
only if the predicate is non-zero (ie. DRAW_VISIBLE). But
VK_EXT_conditional_rendering also allows to discard commands
when the predicate is zero, which means we have to use a
different flag.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 9 Jul 2018 09:16:43 +0000 (11:16 +0200)]
radv: set the predicate for indirect/indexed draw commands
VK_EXT_conditional_rendering allows to discard draw commands
(not only normal draws).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 9 Jul 2018 09:12:25 +0000 (11:12 +0200)]
radv: set the predicate for dispatch commands
VK_EXT_conditional_rendering allows to discard dispatch commands.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Lionel Landwerlin [Tue, 17 Jul 2018 14:05:28 +0000 (15:05 +0100)]
i965: batchbuffer: write correct canonical offset with softpin
Addresses in the command streams should be in canonical form (i.e
bit[63:48] == bit[47]). If the [bo->gtt_offset, bo->gtt_offset +
target_offset] range contains the address 0x800000000000, the current
code will fail that criteria.
v2: Fix missing include (Lionel)
Fixes: 1c9053d0765dc6 ("i965: Prepare batchbuffer module for softpin support.")
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Samuel Pitoiset [Wed, 18 Jul 2018 08:54:26 +0000 (10:54 +0200)]
radv: remove unused variable in radv_CreateRenderPass2KHR()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Samuel Pitoiset [Tue, 17 Jul 2018 15:03:26 +0000 (17:03 +0200)]
radv: optimize radv_stage_flush() for pre fragment shader stages
We don't need to emit PS_PARTIAL_FLUSH for the pre fragment shader
stages (ie. geometry/tessellation). Emitting VS_PARTIAL_FLUSH
is enough for these stages. Note that PS_PARTIAL_FLUSH also
synchronizes all vertex stages.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Iglesias Gonsálvez [Tue, 17 Jul 2018 06:55:48 +0000 (08:55 +0200)]
anv: fix assert in anv_CmdBindDescriptorSets()
The assert is checking that we are not binding more descriptor sets
than the supported by the driver. When binding the descriptor set
number MAX_SETS-1, it was breaking the assert because
descriptorSetCount = 1.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jan Vesely [Tue, 17 Jul 2018 06:07:45 +0000 (02:07 -0400)]
clover: Report error when pipe driver fails to create compute state
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Jan Vesely [Tue, 17 Jul 2018 06:05:02 +0000 (02:05 -0400)]
clover: Catch errors from executing event action
Abort all dependent events.
v2: Abort the current event as well.
CC: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Timothy Arceri [Mon, 16 Jul 2018 05:19:29 +0000 (15:19 +1000)]
nir: add a couple of ior opts to nir_opt_algebraic
One of these was seen in a Deus Ex shader.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Timothy Arceri [Sun, 15 Jul 2018 23:26:33 +0000 (09:26 +1000)]
nir: allow opt_peephole_select to handle nir_instr_type_deref
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Marek Olšák [Tue, 17 Jul 2018 18:51:16 +0000 (14:51 -0400)]
r600: fix warnings when unref'ing pool->bo
Konstantin Kharlamov [Fri, 29 Dec 2017 05:32:31 +0000 (08:32 +0300)]
r600g: some -Wsign-compare fixes
Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Konstantin Kharlamov [Fri, 29 Dec 2017 05:32:30 +0000 (08:32 +0300)]
st/glx: constify some variables
Just a nice hint for both peoples and compilers.
Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Konstantin Kharlamov [Fri, 29 Dec 2017 05:32:29 +0000 (08:32 +0300)]
st/nine: constify some variables
Just a nice hint for both peoples and compilers.
Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Konstantin Kharlamov [Fri, 29 Dec 2017 05:32:28 +0000 (08:32 +0300)]
r600g: constify some variables
Just a nice hint for both peoples and compilers.
Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Konstantin Kharlamov [Mon, 1 Jan 2018 07:38:37 +0000 (10:38 +0300)]
r600g: do not use "fast-clear" for small textures (v3)
Ported from radeonsi. Improves windowed glxgears ran as
vblank_mode=0 glxgears -info -geometry 0+0+512+512
from ≈2270 FPS to ≈2360 FPS. Tested with AMD TURKS.
v2: turned out glxgears ignores the option above, the correct way would
be "512x512+0+0". Now it can be seen 512x512 actually loses 30 FPS.
300×300 however wins around a hundred FPS, and to leave some room in
case results may differ for other cards I want not to nitpick in search
of an optimum but to simply leave 300×300 in the code.
v3: remove redundant braces, and try harder for the mail to stick to
the rest of the series.
Signed-off-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Rob Clark [Tue, 17 Jul 2018 14:14:59 +0000 (10:14 -0400)]
freedreno: re-work fd_batch_reference() locking
Annoyingly we still have to briefly drop the lock to unref resources..
but push the lock down into __fd_batch_destroy() so we can invalidate
the batch and reset resources before dropping the lock.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 17 Jul 2018 14:12:55 +0000 (10:12 -0400)]
freedreno: make fd_batch a one-shot thing
Re-allocate rather than re-use. Originally we had an unnecessarily
complex design to avoid re-allocating cmdstream buffers. But now that
support for "growable" cmdstream buffers has been in place for a couple
years, I guess we can care a bit less about the extra overhead on older
kernels.
But making the batches one-shot removes a class of potential race
conditions vs the flush_queue.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 17 Jul 2018 14:02:51 +0000 (10:02 -0400)]
freedreno: flush immediately when reading a pending batch
Instead of the reading batch setting a dependency on the writing batch,
simply flush the writing batch immediately. This avoids situations
where we have to flush the context's current batch later.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 17 Jul 2018 13:54:23 +0000 (09:54 -0400)]
freedreno: get rid of noop render
This was basically to avoid a zero-dword IB (indirect-branch), but
instead just don't emit the IB packet in that case.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 17 Jul 2018 13:44:23 +0000 (09:44 -0400)]
freedreno: fix samples=0 vs samples=1 confusion
pipe_framebuffer_state can have samples=0 in various cases, which is
actually the same thing as samples=1. So use the _get_num_samples()
helper to populate the key, to avoid this looking like two distinct
fb states to the cache.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 17 Jul 2018 13:42:27 +0000 (09:42 -0400)]
freedreno: comment for _invalidate_batch()
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 17 Jul 2018 13:40:23 +0000 (09:40 -0400)]
freedreno: hold batch references when flushing
It is possible for a batch to be freed under our feet when flushing, so
it is best to hold a reference to all of them up-front.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Karol Herbst [Sat, 14 Jul 2018 04:17:08 +0000 (06:17 +0200)]
nir/spirv: print id for unsupported alu opcode
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Karol Herbst [Thu, 12 Jul 2018 01:40:23 +0000 (03:40 +0200)]
nir: prepare for bumping up max components to 16
OpenCL knows vector of size 8 and 16.
v2: rebased on master (nir_swizzle rework)
rework more declarations with nir_component_mask_t
adjust print_var_decl
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Samuel Pitoiset [Thu, 12 Jul 2018 14:26:34 +0000 (16:26 +0200)]
radv/winsys: use alloca() for semaphore dependencies
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Fri, 13 Jul 2018 15:35:58 +0000 (17:35 +0200)]
radv: reduce number of CB/DB meta flushes for VK_ACCESS_TRANSFER_WRITE_BIT
If we know that the given image doesn't have any metadata,
we don't need to flush.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Fri, 13 Jul 2018 12:14:52 +0000 (14:14 +0200)]
radv: fix implementation of VK_KHR_create_renderpass2 for multiviews
The Vulkan 1.1.80 spec says:
"viewMask has the same effect for the described subpass as
VkRenderPassMultiviewCreateInfo::pViewMasks has on each
corresponding subpass."
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Erik Faye-Lund [Tue, 17 Jul 2018 05:43:57 +0000 (15:43 +1000)]
virgl: respect max_vertex_attrib_stride cap
This is required for OpenGL 4.4 and OpenGL ES 3.1 support.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Lepton Wu [Tue, 17 Jul 2018 01:56:32 +0000 (18:56 -0700)]
virgl: Fix flush in virgl_encoder_inline_write.
The current code is buggy: if there are only 12 dwords left in cbuf,
we emit a zero data length command which will be rejected by virglrenderer.
Fix it by calling flush in this case.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
Erik Faye-Lund [Mon, 16 Jul 2018 10:37:31 +0000 (12:37 +0200)]
virgl: implement set_min_samples
This allows us to implement glMinSampleShading correctly, which up
until now just got ignored.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Caio Marcelo de Oliveira Filho [Tue, 26 Jun 2018 22:46:53 +0000 (15:46 -0700)]
glsl: do second pass of const propagation in loops
When handling loops in constant propagation, implement the "FINISHME"
comment like copy propagation: perform a first pass to find values
that can't be propagated, then perform a second pass with the ACP
containing still valid values.
Certain values are killed because the loop may run more than one
iteration, so we can't copy propagate them as they would be invalid in
the later iterations.
Reviewed-by: Eric Anholt <eric@anholt.net>
Caio Marcelo de Oliveira Filho [Fri, 15 Jun 2018 20:59:45 +0000 (13:59 -0700)]
glsl: don't let an 'if' then-branch kill const propagation for else-branch
When handling 'if' in constant propagation, if a certain variable was
killed when processing the first branch of the 'if', then the second
would get any propagation from previous nodes. This is similar to the
change done for copy propagation code.
x = 1;
if (...) {
z = x; // This would turn into z = 1.
x = 22; // x gets killed.
} else {
w = x; // This would NOT turn into w = 1.
}
With the change, we let constant propagation happen independently in
the two branches and only then apply the killed values for the
subsequent code.
The new code use a single hash table for keeping the kills of both
branches (the branches only write to it), and it gets deleted after we
use -- instead of waiting for mem_ctx to collect it.
NIR deals well with constant propagation, so it already covered for
the missing ones that this patch fixes.
Reviewed-by: Eric Anholt <eric@anholt.net>
Eric Anholt [Mon, 16 Jul 2018 20:57:03 +0000 (13:57 -0700)]
v3d: Disable shader-db cycle estimates until we sort out TMU estimates.
I keep having to ignore these shader-db changes since I don't trust them,
so just disable the reports entirely.
Eric Anholt [Mon, 16 Jul 2018 20:27:13 +0000 (13:27 -0700)]
v3d: Emit the lowered uniform just before its first use in a block.
total instructions in shared programs: 98578 -> 98119 (-0.47%)
instructions in affected programs: 27571 -> 27112 (-1.66%)
and it also eliminates most spills/fills on the CTS's randomized uniform
usage testcases.
Eric Anholt [Mon, 16 Jul 2018 19:41:28 +0000 (12:41 -0700)]
v3d: Add an assert that we don't provide an invalid texture return words.
The docs had an update noting this restriction, so reflect it in the code.
Eric Anholt [Mon, 16 Jul 2018 19:35:11 +0000 (12:35 -0700)]
v3d: Apply GFXH-1625 restriction on TMUWT in the end of the shader.
This doesn't affect us yet since we're not doing TMUWTs, but I think we
will for GLES 3.1.
Sergii Romantsov [Thu, 12 Jul 2018 12:47:48 +0000 (15:47 +0300)]
intel/batch_decoder: decoding of 3DSTATE_CONSTANT_BODY.
SNB doesn't have a definition of 3DSTATE_CONSTANT_BODY, thats
why we got segmentation fault when used INTEL_DEBUG=bat.
Fixed by adding of 3DSTATE_CONSTANT_BODY into 3DSTATE_CONSTANT
of VS, GS and PS structures.
v2: added definition of 3DSTATE_CONSTANT_BODY to the gen6.xml
Fixes: 169d8e011ae (intel: Fix 3DSTATE_CONSTANT buffer decoding.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107190
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Marek Olšák [Mon, 16 Jul 2018 18:32:58 +0000 (14:32 -0400)]
r600: fix build after the removal of RADEON_PRIO_* flags
Roland Scheidegger [Sat, 14 Jul 2018 02:49:36 +0000 (04:49 +0200)]
nir: fix msvc build
Empty initializer braces aren't valid c (it's a gnu extension, and
it's valid in c++).
Hopefully fixes appveyor / msvc build...
Fixes
a3150c1d06ae7766c3d3fe3b33432e55c3c7527e
Jason Ekstrand [Wed, 4 Jul 2018 02:18:28 +0000 (19:18 -0700)]
nir/worklist: Rework the foreach macro
This makes the arguments match the (thing, container) pattern used in
other nir_foreach macros and also renames it to make that a bit more
clear.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Eric Anholt [Thu, 12 Jul 2018 18:45:27 +0000 (11:45 -0700)]
intel: tools: Fix uninitialized variable warnings in intel_dump_gpu.
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Jason Ekstrand [Fri, 6 Jul 2018 21:41:12 +0000 (14:41 -0700)]
spirv: Fix a couple of image atomic load/store bugs
For one thing, the NIR opcodes for image load/store always take and
return a vec4 value regardless of the image type. We need to fix up
both the source and destination to handle it. For another thing, we
weren't actually setting up a destination in the OpAtomicLoad case.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: mesa-stable@lists.freedesktop.org
Marek Olšák [Thu, 12 Jul 2018 04:47:11 +0000 (00:47 -0400)]
winsys/amdgpu: clean up error handling in amdgpu_cs_submit_ib
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Thu, 12 Jul 2018 04:27:06 +0000 (00:27 -0400)]
radeonsi: rework RADEON_PRIO flags to be <= 31
This decreases sizeof(struct amdgpu_cs_buffer) from 24 to 16 bytes.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Thu, 12 Jul 2018 04:17:02 +0000 (00:17 -0400)]
radeonsi: merge DCC/CMASK/HTILE priority flags
For a later simplification.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Thu, 12 Jul 2018 04:05:23 +0000 (00:05 -0400)]
radeonsi: remove non-GFX BO priority flags
For a later simplification.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Thu, 12 Jul 2018 03:24:31 +0000 (23:24 -0400)]
winsys/amdgpu: use alloca when using global_bo_list
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Thu, 12 Jul 2018 03:21:16 +0000 (23:21 -0400)]
winsys/amdgpu: remove label bo_list_error
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Thu, 12 Jul 2018 03:20:06 +0000 (23:20 -0400)]
winsys/amdgpu: always update gfx_bo_list_counter
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Marek Olšák [Thu, 12 Jul 2018 03:19:15 +0000 (23:19 -0400)]
winsys/amdgpu: make amdgpu_cs_context::flags & handles local
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Gert Wollny [Fri, 13 Jul 2018 12:46:31 +0000 (14:46 +0200)]
mesa/virgl: Fix off-by-one and copy-paste error in multisample position evaluation
Converting from a switch statement that would not allow intermediate sample counts
to use an if-else chain went a bit wrong, so that in some cases the range that
should be inclusive was exclusive and the line for 16 samples was copies wrongly.
v2: elaborate commit message.
Fixes: 91f48cdfe5c817158c533a8f67c60e9aabbe4479
virgl: Add support for glGetMultisample
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (v1)
Karol Herbst [Sun, 24 Jun 2018 20:10:28 +0000 (22:10 +0200)]
nouveau: fix 3D blitter for unsigned to signed integer conversions
fixes a couple of packed_pixel CTS tests. No regressions inside a CTS run.
v2: simplify the changes a bit
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Karol Herbst [Thu, 12 Jul 2018 04:27:49 +0000 (06:27 +0200)]
nir: fix printing of vec16 type
Fixes: 2f181c8c183cc8b4d0450789bb20c2be48d32db3
"glsl_types: vec8/vec16 support"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Rob Clark [Thu, 8 Mar 2018 19:18:59 +0000 (14:18 -0500)]
nir/spirv: implement BuiltInWorkDim
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Karol Herbst [Wed, 11 Jul 2018 23:18:23 +0000 (01:18 +0200)]
nir/spirv: print id for unsupported builtins
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Jason Ekstrand [Thu, 12 Jul 2018 21:05:26 +0000 (14:05 -0700)]
intel/blorp: Handle 3-component formats in clears
This fixes a nasty hang in Batman: Arkham City which apparently calls
vkCmdClearColorImage on a linear RGB image.
cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Thu, 12 Jul 2018 20:55:26 +0000 (13:55 -0700)]
intel/blorp: Fix blits to R8G8B8_UNORM_SRGB
In this case, the surface faking will give us a R8_UNORM surface and we
need to do an sRGB conversion in the shader. Found by inspection.
cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Caio Marcelo de Oliveira Filho [Thu, 12 Jul 2018 18:17:04 +0000 (11:17 -0700)]
util/hash_table: add helper to remove entry by key
And the corresponding test case.
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Fri, 13 Jul 2018 01:23:34 +0000 (18:23 -0700)]
nir/lower_tex: Use nir_format_srgb_to_linear
A while ago, we added a bunch of format conversion helpers; we should
use them instead of hand-rolling sRGB conversions.
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Fri, 13 Jul 2018 17:12:02 +0000 (10:12 -0700)]
vc4: Tell NIR to lower fdiv instructions
This should allow us to use them in nir_lower_tex
Reviewed-by: Eric Anholt <eric@anholt.net>
Dylan Baker [Fri, 13 Jul 2018 20:54:46 +0000 (13:54 -0700)]
docs: Update news, calendar, and relnotes for 18.1.4
Dylan Baker [Fri, 13 Jul 2018 18:46:58 +0000 (11:46 -0700)]
docs: Add sha256 sums for 18.1.4 tarballs
Dylan Baker [Fri, 13 Jul 2018 18:34:55 +0000 (11:34 -0700)]
docs: Add release notes for 18.1.4
Eric Anholt [Tue, 20 Feb 2018 17:07:09 +0000 (17:07 +0000)]
vc4: Switch to using u_transfer_helper for MSAA maps.
No requirement, just reduces code duplication.
Eric Anholt [Thu, 12 Jul 2018 23:58:07 +0000 (16:58 -0700)]
v3d: Work around GFXH-1461 bug losing our Z/S clears.
If you load S and clear Z or vice versa, the clear may get lost. Just
fall back to drawing a quad.
Fixes KHR-GLES3.packed_depth_stencil.verify_read_pixels.depth24_stencil8
Eric Anholt [Thu, 12 Jul 2018 18:41:37 +0000 (11:41 -0700)]
meson: Move xvmc test tools from unit tests to installed tools.
These are not unit tests, as they rely on the host's XVMC and some user
configuration. Switch them over to being general installed tools, to fix
unit testing.
Fixes: 22a817af8a89 ("meson: build gallium xvmc state tracker")
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Gert Wollny [Thu, 5 Jul 2018 17:11:22 +0000 (19:11 +0200)]
r600: Add spill output to group only if register or target index changes
The current spill code checks in each instruction of an instruction group whether
spilling is needed and if so, it adds spilling for each component as a seperate
instruction and it allocates a new temporary for each component and since it takes
the write mask from the TGSI representation, all components might be written
each time and as a result already written components might be overwritten with
garbage like:
...
y: MOV R9.y, [0x42140000 37].x
t: MOV R8.x, [0x42040000 33].y
...
MEM_SCRATCH WRITE_IND_ACK 0 R9.xy__, @R4.x ES:3
MEM_SCRATCH WRITE_IND_ACK 0 R8.xy__, @R4.x ES:3
...
To resolve this isse accumulate spills to the same memory location so that only one
memory write instruction is emitted for an instruction group that writes up to all
four components.
This fixes updated piglits (see https://patchwork.freedesktop.org/series/46064/):
spec/glsl-1.30/execution
fs-large-local-array-vec2.shader_test
fs-large-local-array-vec3.shader_test
fs-large-local-array-vec4.shader_test
v2: fix some typos and add comment about piglits (Roland Scheidegger)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (v1)
Nanley Chery [Fri, 6 Jul 2018 20:02:44 +0000 (13:02 -0700)]
i965/miptree: Allocate MS texture BOs as BUSY
These buffer objects are never accessed with the CPU.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Nanley Chery [Sat, 9 Jun 2018 23:44:15 +0000 (16:44 -0700)]
i965/miptree: Inline make_separate_stencil
Note that the separate stencil miptree now has the same alloc_flag as
the depth component. Only stencil renderbuffers (as opposed to textures)
have BO_ALLOC_BUSY.
v2: Add note about BO_ALLOC_BUSY in message (Topi).
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Nanley Chery [Sat, 9 Jun 2018 00:50:39 +0000 (17:50 -0700)]
i965/miptree: Init r8stencil_needs_update to false
The current behavior masked two bugs where the flag was not set to true
after modifying the stencil texture. One case was a regression
introduced with commit
bdbb527a65fc729e7a9319ae67de60d03d06c3fd and
another was a bug in the depthstencil mapping code. These have since
been fixed.
To prevent such bugs from being masked in the future, initialize
r8stencil_needs_update to false.
v2: Keep the delayed allocation.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Nanley Chery [Mon, 11 Jun 2018 18:01:52 +0000 (11:01 -0700)]
i965/miptree: Refactor miptree_create
Enable a future patch to create the r8stencil_mt in this function.
v2: Explicitly set etc_format to MESA_FORMAT_NONE (Topi).
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Nanley Chery [Mon, 11 Jun 2018 15:15:17 +0000 (08:15 -0700)]
i965/miptree: Add and use mt_surf_usage
v2: Make mt_fmt const (Topi).
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Nanley Chery [Mon, 11 Jun 2018 17:35:28 +0000 (10:35 -0700)]
i965/miptree: Share alloc_flags in miptree_create
Note that this maintains BO_ALLOC_BUSY for depth renderbuffers, but not
depth textures.
v2: Add note about BO_ALLOC_BUSY in message (Topi).
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Nanley Chery [Mon, 11 Jun 2018 17:31:05 +0000 (10:31 -0700)]
i965/miptree: Share the miptree format in miptree_create
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Nanley Chery [Mon, 11 Jun 2018 17:23:23 +0000 (10:23 -0700)]
i965/miptree: Share tiling_flags in miptree_create
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Nanley Chery [Tue, 12 Jun 2018 14:16:16 +0000 (07:16 -0700)]
i965/miptree: Delete MIPTREE_CREATE_LINEAR
This enum constant was introduced to enable blit maps with
intel_miptree_create
da2880bea05bfc87109477ab026a7f5401fc8f0c. Now that
such maps use the more direct make_surface function which allows you to
specify the tiling directly, the constant is no longer being used.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Nanley Chery [Sat, 9 Jun 2018 23:45:02 +0000 (16:45 -0700)]
i965/miptree: Use make_surface in map_blit
Do this so that we don't have to special case linearly-tiled depth
buffers in miptree_create.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Nanley Chery [Fri, 8 Jun 2018 20:24:09 +0000 (13:24 -0700)]
i965/draw: Fix adding the stencil bo to the depth cache
Fix the case where stencil writes are enabled on a depth stencil
texture. Found by inspection.
v2: Fix message to allow for depth stencil writes (Topi).
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Nanley Chery [Tue, 29 May 2018 06:18:41 +0000 (23:18 -0700)]
i965/draw: Set the r8stencil flag after drawing
Fixes the regresion introduced with commit
bdbb527a65fc729e7a9319ae67de60d03d06c3fd
"i965: Use ISL for emitting depth/stencil/hiz state on gen6+"
Found by inspection.
Prevents regressing the piglit test, fbo-depth-array stencil-draw, later
on in this series.
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Nanley Chery [Fri, 8 Jun 2018 06:55:44 +0000 (23:55 -0700)]
i965/miptree: Set the r8stencil flag in map_depthstencil
Found by initializing the r8stencil_needs_update to false in
make_separate_stencil_surface.
Prevents regressing the piglit test arb_stencil_texturing-draw, later on
in the series.
Cc: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Nanley Chery [Tue, 29 May 2018 05:35:33 +0000 (22:35 -0700)]
i965: Set the r8stencil flag in miptree_finish_write
This seems to be the most appropriate place.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Karol Herbst [Fri, 13 Jul 2018 01:33:22 +0000 (03:33 +0200)]
nir: cleanup oversized arrays in nir_swizzle calls
There are no fixed sized array arguments in C, those are simply pointers
to unsized arrays and as the size is passed in anyway, just rely on that.
where possible calls are replaced by nir_channel and nir_channels.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Nanley Chery [Wed, 30 May 2018 23:32:07 +0000 (16:32 -0700)]
i965/miptree: Use the correct BLT pitch
Retile miptrees to a linear tiling less often. Retiling can cause issues
with imported BOs.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106738
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Nanley Chery [Wed, 23 May 2018 22:50:14 +0000 (15:50 -0700)]
i965/miptree: Drop an if case from retile_as_linear
Drop an if statement whose predicate never evaluates to true. row_pitch
belongs to a surface with non-linear tiling. According to
isl_calc_tiled_min_row_pitch, the pitch is a multiple of the tile width.
By looking at isl_tiling_get_info, we see that non-linear tilings have
widths greater than or equal to 128B.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Nanley Chery [Wed, 30 May 2018 23:22:13 +0000 (16:22 -0700)]
i965: Make blt_pitch public
We'd like to reuse this helper.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Caio Marcelo de Oliveira Filho [Thu, 5 Jul 2018 20:02:30 +0000 (13:02 -0700)]
nir: delete not needed for reinserted nir_cf_list
It wasn't causing problems since there's nothing to delete, but better
be consistent with the rest of existing codebase.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Caio Marcelo de Oliveira Filho [Mon, 9 Jul 2018 18:29:41 +0000 (11:29 -0700)]
glsl: remove struct kill_entry in constant propagation
The only value in kill_entry is the writemask, which can be stored in
the data pointer of the hash table entry.
Suggested by Eric Anholt.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Caio Marcelo de Oliveira Filho [Sat, 7 Jul 2018 00:21:34 +0000 (17:21 -0700)]
glsl: slim the kill_entry struct used in const propagation
Since
4654439fdd7 "glsl: Use hash tables for
opt_constant_propagation() kill sets." uses a hash_table for storing
kill_entries, so the structs can be simplified.
Remove the exec_node from kill_entry since it is not used in an
exec_list anymore.
Remove the 'var' from kill_entry since it is now redundant with the
key of the hash table.
Suggested by Eric Anholt.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Caio Marcelo de Oliveira Filho [Fri, 29 Jun 2018 18:38:09 +0000 (11:38 -0700)]
i965: fix typo (wrong gen number) in comment
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Caio Marcelo de Oliveira Filho [Mon, 25 Jun 2018 20:42:22 +0000 (13:42 -0700)]
util/set: helper to remove entry by key
v2: Add unit test. (Eric Anholt)
Reviewed-by: Eric Anholt <eric@anholt.net>
Caio Marcelo de Oliveira Filho [Mon, 25 Jun 2018 16:51:20 +0000 (09:51 -0700)]
util/set: add a clone function
v2: Add unit test. (Eric Anholt)
Reviewed-by: Eric Anholt <eric@anholt.net>
Caio Marcelo de Oliveira Filho [Fri, 6 Jul 2018 18:26:13 +0000 (11:26 -0700)]
util/set: add a basic unit test
Reviewed-by: Eric Anholt <eric@anholt.net>
Marek Olšák [Tue, 7 Nov 2017 01:02:21 +0000 (02:02 +0100)]
radeonsi: add support for Vega20
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Eric Anholt [Wed, 11 Jul 2018 21:53:57 +0000 (14:53 -0700)]
u_blitter: Add an option to draw the triangles using an index buffer.
For V3D, the HW will interpolate slightly differently along the shared
edge of the trifan. The conformance tests manage to catch this in the
nearest_consistency_* group. To get interpolation to match, we need the
last vertex of the triangle to be shared.
I first tried implementing draw_rectangle to do triangles instead, but
that was quite a bit (147 lines) of code duplication from u_blitter, and
this seems much simpler and less likely to break as u_blitter changes.
Fixes dEQP-GLES3.functional.fbo.blit.rect.nearest_consistency_* on V3D.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>