Caio Marcelo de Oliveira Filho [Thu, 28 Mar 2019 07:36:39 +0000 (00:36 -0700)]
glsl: Enable texture builtins for NV_compute_shader_derivatives
Renamed a few predicates from "fs_only" to be "derivative_only" (or
similar pairs).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Caio Marcelo de Oliveira Filho [Tue, 26 Mar 2019 07:04:01 +0000 (00:04 -0700)]
glsl: Enable derivative builtins for NV_compute_shader_derivatives
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Caio Marcelo de Oliveira Filho [Tue, 26 Mar 2019 06:47:21 +0000 (23:47 -0700)]
glsl: Remove redundant conditions when asserting in_qualifier
As the code evolved, we ended up with a redundant conditions. Clean
this up.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Caio Marcelo de Oliveira Filho [Mon, 18 Mar 2019 21:26:52 +0000 (14:26 -0700)]
mesa: Extension boilerplate for NV_compute_shader_derivatives
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Timothy Arceri [Mon, 8 Apr 2019 10:13:49 +0000 (20:13 +1000)]
nir/radv: remove restrictions on opt_if_loop_last_continue()
When I implemented opt_if_loop_last_continue() I had restricted
this pass from moving other if-statements inside the branch opposite
the continue. At the time it was causing a bunch of spilling in
shader-db for i965.
However Samuel Pitoiset noticed that making this pass more aggressive
significantly improved the performance of Doom on RADV. Below are
the statistics he gathered.
28717 shaders in 14931 tests
Totals:
SGPRS:
1267317 ->
1267549 (0.02 %)
VGPRS: 896876 -> 895920 (-0.11 %)
Spilled SGPRs: 24701 -> 26367 (6.74 %)
Code Size:
48379452 ->
48507880 (0.27 %) bytes
Max Waves: 241159 -> 241190 (0.01 %)
Totals from affected shaders:
SGPRS: 23584 -> 23816 (0.98 %)
VGPRS: 25908 -> 24952 (-3.69 %)
Spilled SGPRs: 503 -> 2169 (331.21 %)
Code Size:
2471392 ->
2599820 (5.20 %) bytes
Max Waves: 586 -> 617 (5.29 %)
The codesize increases is related to Wolfenstein II it seems largely
due to an increase in phis rather than the existing jumps.
This gives +10% FPS with Doom on my Vega56.
Rhys Perry also benchmarked Doom on his VEGA64:
Before: 72.53 FPS
After: 80.77 FPS
v2: disable pass on non-AMD drivers
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Dave Airlie [Wed, 27 May 2015 07:41:32 +0000 (17:41 +1000)]
softpipe: add support for vertex streams (v2)
This enables the ARB_gpu_shader5 vertex streams on softpipe.
v2: only enable when not using llvm.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 27 May 2015 07:39:05 +0000 (17:39 +1000)]
draw: add support to tgsi paths for geometry streams. (v2)
This hooks up the geometry shader processing to the TGSI
support added in the previous commits.
It doesn't change the llvm interface other than to
keep things building.
v2: fix some regressions caused by primitiveoffsets
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 27 May 2015 07:37:46 +0000 (17:37 +1000)]
softpipe: add support for indexed queries.
We need indexed queries to retrieve the geom shader info.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 27 May 2015 07:35:32 +0000 (17:35 +1000)]
tgsi: add support for geometry shader streams.
This adds support to retrieve the primitive counts
for each stream, along with the offset for each
primitive into the output array.
It also adds support for parsing the stream argument
to the emit and end instructions.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 27 May 2015 07:22:13 +0000 (17:22 +1000)]
draw: add stream member to stats callback
This just adds space for the member to the callback, doesn't
change anything else.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Chia-I Wu [Fri, 8 Feb 2019 22:57:33 +0000 (14:57 -0800)]
vulkan/wsi: make wl_drm optional
When wl_drm is missing and the driver supports modifiers, use
zwp_linux_dmabuf_v1 for the list of supported formats and for buffer
creation.
Limit the supported formats to those with modifiers, which are
WL_DRM_FORMAT_{ARGB8888,XRGB8888} currently.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Chia-I Wu [Tue, 12 Feb 2019 20:31:57 +0000 (12:31 -0800)]
vulkan/wsi: add wsi_wl_display_dmabuf
Add wsi_wl_display_dmabuf for zwp_linux_dmabuf_v1-related states.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Chia-I Wu [Tue, 12 Feb 2019 20:31:57 +0000 (12:31 -0800)]
vulkan/wsi: add wsi_wl_display_drm
Add wsi_wl_display_drm for wl_drm-related states. We will move
formats into the struct in a later commit.
Remove the unnecessary check for wl_registry_bind failures.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Chia-I Wu [Tue, 12 Feb 2019 05:25:49 +0000 (21:25 -0800)]
vulkan/wsi: refactor drm_handle_format
Refactor the swtich statement in drm_handle_format out to
wsi_wl_display_add_wl_format.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Chia-I Wu [Tue, 12 Feb 2019 01:23:56 +0000 (17:23 -0800)]
vulkan/wsi: create wl_drm wrapper as needed
When modifiers are specified, we have to use dmabuf rather than
wl_drm. We don't need the wrapper in that case.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Chia-I Wu [Tue, 12 Feb 2019 01:04:54 +0000 (17:04 -0800)]
vulkan/wsi: move modifier array into wsi_wl_swapchain
This avoids repeated checks for each wsi_wl_image.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Adam Jackson [Tue, 5 Mar 2019 20:31:51 +0000 (15:31 -0500)]
drisw: Try harder to probe whether MIT-SHM works
XQueryExtension merely tells you whether the extension exists, it
doesn't tell you whether you're local enough for it to work.
XShmQueryVersion is not enough to discover this either, you need to
provoke the server to do actual work, and if it thinks you're remote it
will throw BadRequest at you. So send an invalid ShmDetach and use the
error code to distinguish local from remote.
[airlied: fixed bug not resetting xshm_error to 0 on success,
which made later stuff fail completely.]
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Adam Jackson <ajax@redhat.com>
Jason Ekstrand [Fri, 22 Mar 2019 22:45:29 +0000 (17:45 -0500)]
nir/search: Search for all combinations of commutative ops
Consider the following search expression and NIR sequence:
('iadd', ('imul', a, b), b)
ssa_2 = imul ssa_0, ssa_1
ssa_3 = iadd ssa_2, ssa_0
The current algorithm is greedy and, the moment the imul finds a match,
it commits those variable names and returns success. In the above
example, it maps a -> ssa_0 and b -> ssa_1. When we then try to match
the iadd, it sees that ssa_0 is not b and fails to match. The iadd
match will attempt to flip itself and try again (which won't work) but
it cannot ask the imul to try a flipped match.
This commit instead counts the number of commutative ops in each
expression and assigns an index to each. It then does a loop and loops
over the full combinatorial matrix of commutative operations. In order
to keep things sane, we limit it to at most 4 commutative operations (16
combinations). There is only one optimization in opt_algebraic that
goes over this limit and it's the bitfieldReverse detection for some UE4
demo.
Shader-db results on Kaby Lake:
total instructions in shared programs:
15310125 ->
15302469 (-0.05%)
instructions in affected programs:
1797123 ->
1789467 (-0.43%)
helped: 6751
HURT: 2264
total cycles in shared programs:
357346617 ->
357202526 (-0.04%)
cycles in affected programs:
15931005 ->
15786914 (-0.90%)
helped: 6024
HURT: 3436
total loops in shared programs: 4360 -> 4360 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total spills in shared programs: 23675 -> 23666 (-0.04%)
spills in affected programs: 235 -> 226 (-3.83%)
helped: 5
HURT: 1
total fills in shared programs: 32040 -> 32032 (-0.02%)
fills in affected programs: 190 -> 182 (-4.21%)
helped: 6
HURT: 2
LOST: 18
GAINED: 5
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Lionel Landwerlin [Mon, 8 Apr 2019 15:27:30 +0000 (16:27 +0100)]
intel: add dependency on genxml generated files
Drivers using genxml will start compilation before generated files are
created, so add a dependency to it.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Cc: mesa-stable@lists.freedesktop.org
Marek Olšák [Mon, 8 Apr 2019 18:24:48 +0000 (14:24 -0400)]
radeonsi: fix a crash when unbinding sampler states
Acked-by: James Zhu <James.Zhu@amd.com>
Samuel Pitoiset [Mon, 8 Apr 2019 12:11:51 +0000 (14:11 +0200)]
radv: fix getting the vertex strides if the bindings aren't contiguous
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110349
Fixes: a66b186bebf ("radv: use typed buffer loads for vertex input fetches")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Lionel Landwerlin [Mon, 25 Mar 2019 15:02:47 +0000 (15:02 +0000)]
anv: implement VK_KHR_swapchain revision 70
This revision allows for images to be :
- created by reusing image parameters from swapchain
- bound to memory from a swapchain
v2: Add color attachment flag
Use same implicit WSI parameters (tiling, samples, usage)
v3: Fix missing break in vk_foreach_struct_const() switch (Lionel)
v4: Fix accessing image aspects before android resolve (Tapani)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Eric Engestrom [Fri, 5 Apr 2019 14:58:28 +0000 (15:58 +0100)]
vk/util: remove unneeded array index
This is an array of 1, so [0] is the only content, and meson already
flattens the list so this is unnecessary.
Also, all the other uses of vk_api_xml don't do that.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Samuel Pitoiset [Mon, 8 Apr 2019 09:39:07 +0000 (11:39 +0200)]
ac/nir: fix intrinsic names for atomic operations with LLVM 9+
This fixes the following LLVM error when using RADV_DEBUG=checkir:
Intrinsic name not mangled correctly for type arguments! Should be: llvm.amdgcn.buffer.atomic.add.i32
i32 (i32, <4 x i32>, i32, i32, i1)* @llvm.amdgcn.buffer.atomic.add
The cmpswap operation still uses the old intrinsic.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Alyssa Rosenzweig [Sun, 7 Apr 2019 16:05:42 +0000 (16:05 +0000)]
panfrost: Remove "mali_unknown6" nonsense
This structure was used maaaany moons ago as a placeholder for the
varying meta (now unified with mali_attr_meta and essentially fully
decoded). I don't know why it's still in the file. Let's wack it.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Fri, 5 Apr 2019 05:45:01 +0000 (05:45 +0000)]
panfrost/midgard: Enable lower_find_lsb
This is exactly what the blob does.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Fri, 5 Apr 2019 05:37:37 +0000 (05:37 +0000)]
panfrost/midgard: Add ibitcount8 op
The mechanics of this opcode are a little opaque, but essentially, it's
used in 8-bit mode to do a bit count in parallel of a uint and then
doing a ton of clever iadd/imov ops to recombine.
v2: Correct opcode. Thank you to jernej on IRC for noticing this awkward
typo!
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Fri, 5 Apr 2019 05:34:03 +0000 (05:34 +0000)]
panfrost/midgard: Add ilzcnt op
Used for implementing findLSB/MSB
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Fri, 5 Apr 2019 05:16:54 +0000 (05:16 +0000)]
panfrost/midgard: Add umin/umax opcodes
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Fri, 5 Apr 2019 05:16:32 +0000 (05:16 +0000)]
panfrost: Add tilebuffer load? branch
Also document branches better.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Fri, 5 Apr 2019 01:17:21 +0000 (01:17 +0000)]
panfrost/decode: Add flags for tilebuffer readback
These flags are set when reading back the tilebuffer from a fragment
shader via various mechanisms (including ARM_shader_framebuffer_fetch
and EXT_pixel_local_storage).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Karol Herbst [Fri, 29 Mar 2019 20:40:45 +0000 (21:40 +0100)]
panfrost/midgard: use nir_src_is_const and nir_src_as_uint
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Jason Ekstrand [Tue, 2 Apr 2019 02:23:01 +0000 (21:23 -0500)]
vc4: Prefer nir_src_comp_as_uint over nir_src_as_const_value
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Karol Herbst [Thu, 28 Mar 2019 16:53:30 +0000 (17:53 +0100)]
v3d: prefer using nir_src_comp_as_int over nir_src_as_const_value
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Fri, 5 Apr 2019 18:52:59 +0000 (11:52 -0700)]
gallium/util: Add const to u_range_intersect
This doesn't modify the range, so it can accept a const pointer.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Greg V [Sun, 3 Mar 2019 15:40:58 +0000 (18:40 +0300)]
gallium/hud: add CPU usage support for FreeBSD
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Kenneth Graunke [Sat, 6 Apr 2019 22:57:35 +0000 (15:57 -0700)]
iris: Silence unused variable warnings in release mode
Jason Ekstrand [Fri, 29 Mar 2019 22:12:47 +0000 (17:12 -0500)]
nir/algebraic: Add some logical OR and AND patterns
The new OR pattern has been seen in the wild and can end up being
generated by GLSLang. Not sure about the other two new patterns but we
may as well throw them in for completeness. While we're here, we can
drop the '@bool' specifier from the one pattern because specifying True
already implies 1-bit which basically implies boolean.
Shader-db results on Kaby Lake:
total instructions in shared programs:
15321227 ->
15321129 (<.01%)
instructions in affected programs: 3594 -> 3496 (-2.73%)
helped: 6
HURT: 0
total cycles in shared programs:
357481321 ->
357479725 (<.01%)
cycles in affected programs: 44109 -> 42513 (-3.62%)
helped: 6
HURT: 0
VkPipeline-DB results on Kaby Lake:
total instructions in shared programs:
3770504 ->
3769734 (-0.02%)
instructions in affected programs: 19058 -> 18288 (-4.04%)
helped: 163
HURT: 0
total cycles in shared programs:
1417583701 ->
1417569727 (<.01%)
cycles in affected programs: 750958 -> 736984 (-1.86%)
helped: 158
HURT: 1
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Jason Ekstrand [Sat, 30 Mar 2019 03:51:20 +0000 (22:51 -0500)]
nir/algebraic: Drop some @bool specifiers
Now that we have one-bit booleans, we don't need to rely on looking at
parent instructions in order to figure out if a value is a Boolean most
of the time. We can drop these specifiers and now the optimizations
will apply more generally.
Shader-DB results on Kaby Lake:
total instructions in shared programs:
15321168 ->
15321227 (<.01%)
instructions in affected programs: 8836 -> 8895 (0.67%)
helped: 1
HURT: 31
total cycles in shared programs:
357481781 ->
357481321 (<.01%)
cycles in affected programs: 146524 -> 146064 (-0.31%)
helped: 22
HURT: 10
total spills in shared programs: 23675 -> 23673 (<.01%)
spills in affected programs: 11 -> 9 (-18.18%)
helped: 1
HURT: 0
total fills in shared programs: 32040 -> 32036 (-0.01%)
fills in affected programs: 27 -> 23 (-14.81%)
helped: 1
HURT: 0
No change in VkPipeline-DB
Looking at the instructions hurt, a bunch of them seem to be a case
where doing exactly the right thing in NIR ends up doing the wrong-ish
thing in the back-end because flags are dumb. In particular, there's a
case where we have a MUL followed by a CMP followed by a SEL and when we
turn that SEL into an OR, it uses the GRF result of the CMP rather than
the flag result so the CMP can't be merged with the MUL. Those shaders
appear to schedule better according to the cycle estimates so I guess
it's a win? Also it helps spilling in one Car Chase compute shader.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Andrii Simiklit [Wed, 3 Apr 2019 13:51:14 +0000 (16:51 +0300)]
util: clean the 24-bit unused field to avoid an issues
This is a field of FLOAT_32_UNSIGNED_INT_24_8_REV texture pixel.
OpenGL spec "8.4.4.2 Special Interpretations" is saying:
"the second word contains a packed 24-bit unused field,
followed by an 8-bit index"
The spec doesn't require us to clear this unused field
however it make sense to do it to avoid some
undefined behavior in some apps.
Suggested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110305
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Caio Marcelo de Oliveira Filho [Thu, 4 Apr 2019 19:46:42 +0000 (12:46 -0700)]
nir: Take if_uses into account when repairing SSA
If a def is used as an condition before its definition, we should also
consider this a case to repair. When repairing, make sure we rewrite
any if conditions too.
Found in while inspecting a SPIR-V conversion from a 'continue block'
that contains a conditional branch. We pull the continue block up to
the beggining of the loop, and the condition in the branch ends up
defined afterwards.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 364212f1ede4b "nir: Add a pass to repair SSA form"
Marek Olšák [Fri, 5 Apr 2019 15:18:21 +0000 (11:18 -0400)]
tegra: fix the build after the set_shader_buffers change
James Zhu [Fri, 29 Mar 2019 19:59:39 +0000 (15:59 -0400)]
gallium/auxiliary/vl: Add barrier/unbind after compute shader launch.
Add memory barrier sync for multiple launch cases, and unbind completed
resources after launch.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
James Zhu [Fri, 29 Mar 2019 19:57:51 +0000 (15:57 -0400)]
gallium/auxiliary/vl: Fixed blank issue with compute shader
Multiple init buffer within one open instance will cause blank issue.
Updating viewport per frame will fix this issue.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Tested-by: Bruno Milreu <bmilreu@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
James Zhu [Tue, 19 Mar 2019 19:45:29 +0000 (15:45 -0400)]
gallium/auxiliary/vl: Fixed blur issue with weave compute shader
Correct wrong interpolatation with top/bottom row which caused blur issue.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Tested-by: Bruno Milreu <bmilreu@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Fri, 5 Apr 2019 12:24:29 +0000 (13:24 +0100)]
docs: update calendar, add news item and link release notes for 18.3.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Fri, 5 Apr 2019 11:00:12 +0000 (12:00 +0100)]
docs: add sha256 checksums for 18.3.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
eb9da68cbf23aafb1192beed084b2f05df65dd04)
Emil Velikov [Fri, 5 Apr 2019 10:59:15 +0000 (11:59 +0100)]
docs: add release notes for 18.3.6
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
(cherry picked from commit
b03f51c4b4dfa54775e866b75f68a41862c062c2)
Samuel Pitoiset [Fri, 5 Apr 2019 09:26:12 +0000 (11:26 +0200)]
nir: do not pack varying with different types
The current algorithm only supports packing 32-bit types.
If a shader uses both 16-bit and 32-bit varyings, we shouldn't
compact them together.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Gert Wollny [Thu, 4 Apr 2019 09:34:26 +0000 (11:34 +0200)]
softpipe: Use mag texture filter also for clamped lod == 0
Follow the spec when selecting the magnification filter (OpenGL 4.5,
section 8.14):
If λ(x, y) is less than or equal to the constant c (see section 8.15)
the texture is said to be magnified;
While we're here also silence a potential warning about implicit float
to double conversion.
v2: Update commit message to contain a reference to the spec as pointed
out by Eric.
Fixes a number of dEQP GLES2 and GLES3 test out of:
dEQP-GLES2.functional.texture.filtering.*
dEQP-GLES2.functional.texture.vertex.2d.filtering.*
dEQP-GLES3.functional.texture.vertex.*.filtering.*
dEQP-GLES3.functional.texture.filtering.*
dEQP-GLES3.functional.texture.shadow.2d.*
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tapani Pälli [Fri, 5 Apr 2019 05:55:18 +0000 (08:55 +0300)]
iris: handle aux properly in iris_resource_get_handle
Disable aux when resource seen the first time and EXPLICIT_FLUSH
not being set. This fixes issues seen when launching Xorg and
CCS_E getting utilized.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Thu, 4 Apr 2019 17:51:22 +0000 (10:51 -0700)]
v3d: Add some more new packets for V3D 4.x.
The T/G shader references and common state will be needed for GLES 3.2.
Eric Anholt [Wed, 3 Apr 2019 19:39:54 +0000 (12:39 -0700)]
v3d: Don't try to use the TFU blit path if a scissor is enabled.
We'll need to do a render-based blit for scissors, since the TFU (as seen
in this conditional) can only update a whole surface.
Fixes: 976ea90bdca2 ("v3d: Add support for using the TFU to do some blits.")
Fixes piglit fbo-scissor-blit.
Eric Anholt [Fri, 29 Mar 2019 22:38:15 +0000 (15:38 -0700)]
v3d: Bump the maximum texture size to 4k for V3D 4.x.
4.1 and 4.2 both have the same 16k limit, but it I'm seeing GPU hangs in
the CTS at 8k and 16k. 4k at least lets us get one 4k display working.
Cc: mesa-stable@lists.freedesktop.org
Eric Anholt [Tue, 2 Apr 2019 23:51:44 +0000 (16:51 -0700)]
v3d: Add support for handling OOM signals from the simulator.
I have v3d allocating enough initial allocation memory that we've been
passing tests without it, but to match kernel behavior more it would be
good to actually exercise the OOM path.
Illia Iorin [Thu, 28 Feb 2019 10:33:50 +0000 (12:33 +0200)]
mesa/main: Fix multisample texture initialize
Sampler of Multisample textures wasn't initialized correct. So when
texture object created as multisample its sampler is initialized in a
individual case. We change the initial state of TEXTURE_MIN_FILTER and
TEXTURE_MAG_FILTER to NEAREST.
These changes are approved by KhronosGroup.
https://github.com/KhronosGroup/OpenGL-API/issues/45
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Signed-off-by: Illia Iorin <illia.iorin@globallogic.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109057
Sergii Romantsov [Thu, 24 Jan 2019 12:33:55 +0000 (14:33 +0200)]
glsl: Fix input/output structure matching across shader stages
Section 7.4.1 (Shader Interface Matching) of the OpenGL 4.30 spec says:
"Variables or block members declared as structures are considered
to match in type if and only if structure members match in name,
type, qualification, and declaration order."
Fixes:
* layout-location-struct.shader_test
v2: rebased against master and small fixes
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108250
Dave Airlie [Fri, 29 Mar 2019 02:57:55 +0000 (12:57 +1000)]
ddebug: add compute functions to help hang detection
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Dave Airlie [Thu, 4 Apr 2019 01:18:26 +0000 (11:18 +1000)]
iris: avoid use after free in shader destruction
While playing with compute shaders, I was getting a random crash,
noticed that bind_state was using the old shader info for comparision,
but gallium allows the shader to be deleted while bound, so this could
lead to a use after free.
This can't happen using the cso cache. As it tracks all of this.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Marek Olšák [Thu, 28 Feb 2019 18:02:13 +0000 (13:02 -0500)]
radeonsi: set exact shader buffer read/write usage in CS
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Wed, 3 Apr 2019 18:22:16 +0000 (14:22 -0400)]
glsl: remember which SSBOs are not read-only and pass it to gallium
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Thu, 28 Feb 2019 02:54:47 +0000 (21:54 -0500)]
gallium: add writable_bitmask parameter into set_shader_buffers
to indicate write usage per buffer.
This is just a hint (it will be used by radeonsi).
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Danylo Piliaiev [Thu, 4 Apr 2019 12:04:50 +0000 (15:04 +0300)]
iris: Fix assert when using vertex attrib without buffer binding
The GL 4.5 spec says:
"If any enabled array’s buffer binding is zero when DrawArrays or
one of the other drawing commands defined in section 10.4 is called,
the result is undefined."
The result is undefined but it should not crash.
Fixes: gl-3.1-vao-broken-attrib
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tapani Pälli [Thu, 28 Mar 2019 08:49:45 +0000 (10:49 +0200)]
iris: move iris_flush_resource so we can call it from get_handle
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Tue, 2 Apr 2019 06:28:06 +0000 (23:28 -0700)]
iris: Save/restore MI_PREDICATE_RESULT, not MI_PREDICATE_DATA.
MI_PREDICATE_DATA is an intermediate storage for the MI_PREDICATE
command's calculations - it holds the result of the subtraction when
the compare operation is SRCS_EQUAL or DELTAS_EQUAL. But the actual
result of the predication is MI_PREDICATE_RESULT, which is what we
want to copy from the render context to the compute context.
Eric Engestrom [Fri, 22 Mar 2019 16:54:57 +0000 (16:54 +0000)]
util/process: document memory leak
We consider it acceptable, but let's still document it in case people
notice it and are not sure why it's there.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Eric Engestrom [Thu, 14 Mar 2019 13:58:54 +0000 (13:58 +0000)]
simplify LLVM version string printing
Figure it out once in the build system, then just use that all over the place.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Guido Günther [Wed, 3 Apr 2019 16:13:49 +0000 (18:13 +0200)]
gallium/u_dump: util_dump_sampler_view: Dump u.tex.first_level
Dump u.tex.first_level instead of dumping u.tex.last_level twice.
Signed-off-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Guido Günther [Wed, 3 Apr 2019 11:08:47 +0000 (13:08 +0200)]
gallium: ddebug: Add missing fence related wrappers
Without that `GALLIUM_DDEBUG=always kmscube -A` would segfault like
#0 0x0000000000000000 in ()
#1 0x0000ffffa72a3c54 in dri2_get_fence_fd (_screen=0xaaaaed4f2090, _fence=0xaaaaed9ef880) at ../src/gallium/state_trackers/dri/dri_helpers.c:140
#2 0x0000ffffa8744824 in dri2_dup_native_fence_fd (drv=0xaaaaed5010c0, disp=0xaaaaed5029a0, sync=0xaaaaed9ef7c0) at ../src/egl/drivers/dri2/egl_dri2.c:3050
#3 0x0000ffffa87339b8 in eglDupNativeFenceFDANDROID (dpy=0xaaaaed5029a0, sync=0xaaaaed9ef7c0) at ../src/egl/main/eglapi.c:2107
#4 0x0000aaaabd29ca90 in ()
#5 0x0000aaaabd401000 in ()
Signed-off-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Danylo Piliaiev [Wed, 3 Apr 2019 15:09:24 +0000 (18:09 +0300)]
st/mesa: Fix GL_MAP_COLOR with glDrawPixels GL_COLOR_INDEX
Documentation for glDrawPixels with GL_COLOR_INDEX says:
"If the GL is in color index mode, and if GL_MAP_COLOR is true,
the index is replaced with the value that it references in
lookup table GL_PIXEL_MAP_I_TO_I"
We are always in RGBA mode and there is nothing in documentation
about GL_MAP_COLOR in RGBA mode for GL_COLOR_INDEX.
Scale and bias are also only applicable for RGBA format and not
mentioned for GL_COLOR_INDEX.
Thus the behaviour will be on par with i965.
Fixes: gl-1.0-drawpixels-color-index
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Eric Engestrom [Tue, 19 Mar 2019 14:15:35 +0000 (14:15 +0000)]
gallium/hud: fix rounding error in nic bps computation
While at it, fix typo in "rounding error" :P
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Eric Engestrom [Tue, 19 Mar 2019 14:11:48 +0000 (14:11 +0000)]
gallium/hud: prevent buffer overflow
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Eric Engestrom [Tue, 19 Mar 2019 14:11:09 +0000 (14:11 +0000)]
gallium/hud: fix memory leaks
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Wed, 9 Jan 2019 01:08:08 +0000 (20:08 -0500)]
radeonsi: enable displayable DCC on Ravens
Marek Olšák [Sat, 5 Jan 2019 00:39:01 +0000 (19:39 -0500)]
radeonsi: add support for displayable DCC for multi-RB chips
A compute shader is used to reorder DCC data from aligned to unaligned.
Marek Olšák [Sat, 5 Jan 2019 00:19:54 +0000 (19:19 -0500)]
radeonsi: add support for displayable DCC for 1 RB chips
This is the simpler codepath - just disable RB and pipe alignment for DCC.
Marek Olšák [Sat, 5 Jan 2019 00:27:27 +0000 (19:27 -0500)]
radeonsi: add ability to bind images as image buffers
so that we can bind DCC (texture) as an image buffer.
Marek Olšák [Fri, 9 Nov 2018 21:51:47 +0000 (16:51 -0500)]
radeonsi/gfx9: add support for PIPE_ALIGNED=0
Needed by displayable DCC.
We need to flush L2 after rendering if PIPE_ALIGNED=0 and DCC is enabled.
Marek Olšák [Wed, 3 Apr 2019 21:17:08 +0000 (17:17 -0400)]
amd/addrlib: fix uninitialized values for Addr2ComputeDccAddrFromCoord
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tapani Pälli [Tue, 2 Apr 2019 06:12:21 +0000 (09:12 +0300)]
iris: move variable to the scope where it is being used
iris_upload_border_color is passed a pointer which points to
variable that is introduced in a different scope.
CID:
1444296
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tapani Pälli [Tue, 2 Apr 2019 05:56:07 +0000 (08:56 +0300)]
st/nir: run st_nir_opts after 64bit ops lowering
CID:
1444309
Fixes: 9ab1b1d0227 "st/nir: Move 64-bit lowering later"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Alyssa Rosenzweig [Wed, 3 Apr 2019 03:52:36 +0000 (03:52 +0000)]
panfrost: Size tiled temp buffers correctly
This should lower transient memory usage and improve performance
slightly (due to less memory to malloc/free, better cache locality,
etc).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Wed, 3 Apr 2019 03:36:38 +0000 (03:36 +0000)]
panfrost: Respect box->width in tiled stores
This fixes a regression uploading partial tiled textures introduced
sometime during the cubemap series.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Wed, 3 Apr 2019 02:15:18 +0000 (02:15 +0000)]
panfrost: Cleanup some indirection in pan_resource
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Wed, 3 Apr 2019 01:48:09 +0000 (01:48 +0000)]
panfrost: Implement system values
This patch implements system values via specially-crafted uniforms.
While we previously had an ad hoc system for passing the viewport into
the vertex shader, this commit generalizes the system to allow for
arbitrary system values to be added to both shader stages. While we're
at it, we clean up uniform handling code (which was considerably muddied
to handle the ad hoc viewport uniform).
This commit serves as both a cleanup of the existing codebase and the
precursor to new functionality, like implementing textureSize().
Concurrent with these changes is respecting the depth transform, which
was not possible with the old fixed uniform system and here serves as a
proof-of-correctness test (as well as justifying the NIR changes).
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Alyssa Rosenzweig [Wed, 3 Apr 2019 01:45:44 +0000 (01:45 +0000)]
nir: Add "viewport vector" system values
While a partial set of viewport system values exist, these are scalar
values, which is a poor fit for viewport transformations on vector ISAs
like Midgard (where the vec3 values for scale and offset each need to be
coherent in a vec4 uniform slot to take advantage of vectorized
transform math). This patch adds vec3 scale/offset fields corresponding
to the 3D Gallium viewport / glViewport+depth
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Eric Anholt <eric@anholt.net>
Erik Faye-Lund [Tue, 2 Apr 2019 08:48:30 +0000 (10:48 +0200)]
virgl: also destroy all read-transfers
For texture write-transfers, we either free them on the transfer-queue
or right away. But for read-transfers, we currently only destroy them in
case they used a temp-resource. This leads to occasional resource-leaks.
Let's add a call to virgl_resource_destroy_transfer in the missing case.
Do the same thing for buffers as well, but the logic is a bit easier to
follow there.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: f0e71b10888 ("virgl: use transfer queue")
Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org>
Dylan Baker [Mon, 1 Apr 2019 17:14:54 +0000 (10:14 -0700)]
meson: Error if LLVM is turned off but clover it turned on
Since clover has a hard requirement on LLVM
v2: - make error message more specific
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Dylan Baker [Mon, 1 Apr 2019 17:12:03 +0000 (10:12 -0700)]
meson: Error if LLVM doesn't have rtti when building clover
We already do this for nouveau, but it's required for clover too.
Alyssa Rosenzweig [Sun, 31 Mar 2019 19:06:05 +0000 (19:06 +0000)]
panfrost: Remove support for legacy kernels
Previously, there was minimal support for interoperating with legacy
kernels (reusing kernel modules originally designed for proprietary
legacy userspaces, rather than for upstream-friendly free software
stacks). Now that the Panfrost kernel is stabilising, this commit drops
the legacy code path.
Panfrost users need to use a modern, mainline kernel supporting the
Panfrost kernel driver from this commit forward.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Lucas Stach [Wed, 27 Mar 2019 11:25:18 +0000 (12:25 +0100)]
etnaviv: only try to construct scanout resource when on KMS winsys
Trying to construct a scanout capable buffer will only ever work when
when we are on top of a KMS winsys, as the render node isn't capable
of allocating contiguous buffers.
Tested-by: Marius Vlad <marius.vlad@collabora.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Lucas Stach [Wed, 27 Mar 2019 11:22:58 +0000 (12:22 +0100)]
etnaviv: flush all pending contexts when accessing a resource with the CPU
When setting up a transfer to a resource, all contexts where the resource
is pending must be flushed. Otherwise a write transfer might be started
in the current context before all contexts that access the resource in
shared (read) mode have been executed.
Fixes: 64813541d575 (etnaviv: fix resource usage tracking across
different pipe_context's)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Tested-By: Guido Günther <agx@sigxcpu.org>
Lucas Stach [Wed, 27 Mar 2019 11:22:57 +0000 (12:22 +0100)]
etnaviv: don't flush own context when updating resource use
The context is self synchronizing at the GPU side, as commands are
executed in order. We must not flush our own context when updating the
resource use, as that leads to excessive flushing on effectively every
draw call, causing huge CPU overhead.
Fixes: 64813541d575 (etnaviv: fix resource usage tracking across
different pipe_context's)
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Christian Gmeiner [Wed, 27 Mar 2019 13:58:16 +0000 (14:58 +0100)]
etnaviv: shrink struct etna_3d_state
Drop struct members which are only written to but never read from.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Dave Airlie [Wed, 3 Apr 2019 02:20:40 +0000 (12:20 +1000)]
intel/compiler: use defined size for vector components
If we increase vector sizing later it would be nice to avoid
tripped over this again.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Dave Airlie [Wed, 3 Apr 2019 02:20:02 +0000 (12:20 +1000)]
nir: use proper array sizing define for vectors
If we increase the vector size in the future it would be good
to not have to fix these up, this should change nothing at present.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Timothy Arceri [Wed, 3 Apr 2019 02:24:18 +0000 (13:24 +1100)]
Revert "nir: propagate known constant values into the if-then branch"
This reverts commit
4218b6422cf1ff70c7f0feeec699a35e88522ed7.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110311
Timothy Arceri [Fri, 23 Nov 2018 00:49:19 +0000 (11:49 +1100)]
nir: propagate known constant values into the if-then branch
Helps Max Waves / VGPR use in a bunch of Unigine Heaven
shaders.
shader-db results radeonsi (VEGA):
Totals from affected shaders:
SGPRS:
5505440 ->
5505872 (0.01 %)
VGPRS:
3077520 ->
3077296 (-0.01 %)
Spilled SGPRs: 39032 -> 39030 (-0.01 %)
Spilled VGPRs: 16326 -> 16326 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 744 -> 744 (0.00 %) dwords per thread
Code Size:
123755028 ->
123753316 (-0.00 %) bytes
Compile Time:
2751028 ->
2560786 (-6.92 %) milliseconds
LDS: 1415 -> 1415 (0.00 %) blocks
Max Waves: 972192 -> 972240 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
vkpipeline-db results RADV (VEGA):
Totals from affected shaders:
SGPRS: 160 -> 160 (0.00 %)
VGPRS: 88 -> 88 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 18268 -> 18152 (-0.63 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 26 -> 26 (0.00 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Lepton Wu [Mon, 18 Mar 2019 23:40:25 +0000 (16:40 -0700)]
virgl: close drm fd when destroying virgl screen.
This fd was create in virgl_drm_screen_create and should be closed
in virgl_drm_screen_destroy.
Signed-off-by: Lepton Wu <lepton@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Rafael Antognolli [Tue, 26 Mar 2019 22:35:00 +0000 (15:35 -0700)]
iris: Enable fast clears on gen8.
Since we are now properly storing the clear color with SCS bits, we can
now enable fast clears on gen8 too.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>