Rob Clark [Thu, 11 Jun 2020 18:00:55 +0000 (11:00 -0700)]
freedreno/ir3: respect tex prefetch limits
Refactor the limit checking in the bindless case a bit, and add tex/samp
limit checking for the non-bindless case, to ensure we do not try to
prefetch textures which cannot be encoded in the number of bits available.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>
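The limit check described above comes down to asking whether an index fits in a fixed-width bitfield. A minimal sketch in Python; the field widths (`TEX_BITS`, `SAMP_BITS`) and function names are hypothetical placeholders, not values taken from the ir3 source:

```python
# Hypothetical field widths, chosen only for illustration.
TEX_BITS = 5
SAMP_BITS = 4


def fits_in_bits(value: int, nbits: int) -> bool:
    """Return True if a non-negative index can be encoded in nbits bits."""
    return 0 <= value < (1 << nbits)


def can_prefetch(tex_idx: int, samp_idx: int) -> bool:
    """Allow prefetch only when both indices are encodable."""
    return fits_in_bits(tex_idx, TEX_BITS) and fits_in_bits(samp_idx, SAMP_BITS)
```

With these assumed widths, `can_prefetch(31, 15)` is allowed while `can_prefetch(32, 0)` is rejected, because 32 needs six bits.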
Rob Clark [Thu, 11 Jun 2020 16:47:05 +0000 (09:47 -0700)]
freedreno/ir3: add debug code to print conflicting half-regs
I keep re-typing this from time to time when debugging various things.
Which is dumb.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>
Rob Clark [Thu, 11 Jun 2020 16:43:11 +0000 (09:43 -0700)]
nir/print: print tex dest type
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>
Francisco Jerez [Tue, 9 Jun 2020 22:23:30 +0000 (15:23 -0700)]
iris/icl+: Report same caching domain as main surface for clear color BO.
Even though the clear color BO is bound as a read-only buffer, report
the same caching domain as the main BO in use_surface() (typically
IRIS_DOMAIN_RENDER_WRITE) in order to avoid ping-ponging back and
forth between IRIS_DOMAIN_RENDER_WRITE and IRIS_DOMAIN_OTHER_READ,
which leads to increased stall-at-pixel-scoreboard synchronization
between draw calls.
Fixes a 5%-10% FPS regression in some benchmarks spotted on ICL.
Reported-by: Clayton Craft <clayton.a.craft@intel.com>
Fixes: eb5d1c27227302167d299 "iris: Annotate all BO uses with domain and sequence number information."
Closes: #3097
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5411>
Mauro Rossi [Thu, 11 Jun 2020 06:17:02 +0000 (08:17 +0200)]
android: aco: add aco_ir.cpp to Makefile.sources
Fixes the following build errors:
FAILED: out/target/product/x86_64/obj/SHARED_LIBRARIES/vulkan.radv_intermediates/LINKED/vulkan.radv.so
...
ld.lld: error: undefined symbol: aco::can_use_SDWA(chip_class, std::__1::unique_ptr<aco::Instruction, aco::instr_deleter_functor> const&)
...
ld.lld: error: undefined symbol: aco::can_use_opsel(chip_class, aco_opcode, int, bool)
...
clang-9: error: linker command failed with exit code 1 (use -v to see invocation)
Fixes: d9cfb8ad ("aco: validate instructions reading/writing upper halves/bytes")
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5425>
Eric Engestrom [Wed, 10 Jun 2020 18:43:39 +0000 (20:43 +0200)]
docs: update calendar, add news item, and link release notes for 20.1.1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5421>
Eric Engestrom [Wed, 10 Jun 2020 18:01:30 +0000 (20:01 +0200)]
docs: Add release notes for 20.1.1
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5421>
Marek Olšák [Thu, 11 Jun 2020 09:00:44 +0000 (05:00 -0400)]
ac/surface: don't free dcc_retile_map on failure
because the hash table now owns it.
Fixes: bd553f0546d - ac/surface: cache DCC retile maps (v2)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5424>
Marek Olšák [Thu, 11 Jun 2020 08:30:04 +0000 (04:30 -0400)]
ac/surface: enable DCC for the first level in the mip tail on gfx10
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5424>
Marek Olšák [Thu, 11 Jun 2020 08:20:44 +0000 (04:20 -0400)]
ac/surface: require that gfx8 doesn't have DCC in order to be displayable
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5424>
Marek Olšák [Wed, 10 Jun 2020 15:43:49 +0000 (11:43 -0400)]
ac/surface: don't set is_displayable if displayable DCC is missing
If flags.display isn't set, then displayable DCC will not be computed, so
is_displayable will always be false.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5424>
Marek Olšák [Wed, 10 Jun 2020 12:53:40 +0000 (08:53 -0400)]
amd/addrlib: fix the C++ one definition rule violation
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/1854
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5414>
Jason Ekstrand [Fri, 22 May 2020 03:24:28 +0000 (22:24 -0500)]
iris: Better handle metadata in NIR passes
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5171>
Jason Ekstrand [Fri, 22 May 2020 01:41:28 +0000 (20:41 -0500)]
intel/nir: Call nir_metadata_preserve on !progress
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5171>
Jason Ekstrand [Fri, 22 May 2020 03:34:37 +0000 (22:34 -0500)]
nir: Properly preserve metadata in more cases
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5171>
Jason Ekstrand [Fri, 22 May 2020 01:41:12 +0000 (20:41 -0500)]
nir: Call nir_metadata_preserve on !progress
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5171>
Jason Ekstrand [Fri, 22 May 2020 02:37:33 +0000 (21:37 -0500)]
nir: Add a nir_shader_preserve_all_metadata helper
There are some passes which really work on the shader level and it's
easier if we have a helper which preserves metadata on the whole shader.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5171>
Jason Ekstrand [Fri, 22 May 2020 01:39:30 +0000 (20:39 -0500)]
nir: Add a nir_metadata_all enum value
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5171>
Dave Airlie [Wed, 10 Jun 2020 03:12:41 +0000 (13:12 +1000)]
gallivm/sample: fix texel type for stencil 8-bit
This has to be unsigned, so clamping works properly for border
colors.
Fixes dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_uint_stencil
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5379>
Dave Airlie [Wed, 10 Jun 2020 02:42:38 +0000 (12:42 +1000)]
gallivm/conv: enable conversion min code. (v2)
I'm not sure why this code was under if (0), but switching it to if (1) fixes
dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_float_color
This test expects +inf to get mapped to 255 and -inf to 0, both values
were ending up at 0.
v2: also enable in the SSE paths
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5379>
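The expected behaviour for infinities falls out of clamping before scaling. A minimal sketch of a float to 8-bit unorm conversion, assuming round-to-nearest and an arbitrary choice of 0 for NaN; `float_to_unorm8` is a hypothetical name and this is not the gallivm code:

```python
import math


def float_to_unorm8(x: float) -> int:
    """Convert a float to an 8-bit unsigned normalized value.

    Clamping to [0.0, 1.0] before scaling is what maps +inf to 255
    and -inf to 0. NaN is mapped to 0 here as an arbitrary choice.
    """
    if math.isnan(x):
        return 0
    clamped = min(max(x, 0.0), 1.0)
    return int(clamped * 255.0 + 0.5)
```

Without the upper clamp (the "min" step), +inf has no well-defined integer result, which is consistent with both infinities ending up at 0 before the fix.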
Dave Airlie [Tue, 9 Jun 2020 03:58:59 +0000 (13:58 +1000)]
gallivm/format: convert unsigned values to float properly.
This fixes:
dEQP-GLES31.functional.draw_indirect.random.2
which ends up with 3x32-bit USCALED values going down this path
some of which have the top bit set, and end up converted to signed
float instead of unsigned float values.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5379>
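The failure mode is the classic signed/unsigned reinterpretation of a 32-bit pattern. A small sketch with hypothetical helper names, modelling the two conversions on plain Python integers:

```python
def u32_to_float_as_signed(bits: int) -> float:
    """Interpret the 32-bit pattern as signed before converting (the bug)."""
    bits &= 0xFFFFFFFF
    signed = bits - (1 << 32) if bits & 0x80000000 else bits
    return float(signed)


def u32_to_float(bits: int) -> float:
    """Interpret the 32-bit pattern as unsigned, as USCALED requires."""
    return float(bits & 0xFFFFFFFF)
```

The two agree for values below 2^31 and diverge as soon as the top bit is set: `0x80000000` becomes 2147483648.0 when treated as unsigned but -2147483648.0 when treated as signed.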
Dave Airlie [Mon, 8 Jun 2020 22:25:37 +0000 (08:25 +1000)]
llvmpipe: fix subpixel bits reporting.
This fixes some Vulkan tests later on.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5379>
Dave Airlie [Fri, 27 Mar 2020 05:27:41 +0000 (15:27 +1000)]
gallivm/nir: add group barrier support
Fixes crash in
dEQP-GLES31.functional.synchronization.inter_invocation.image_write_read
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5379>
Dave Airlie [Mon, 8 Jun 2020 07:04:50 +0000 (17:04 +1000)]
draw/gs: add more info to debugging.
Adds invocations and vertex streams to the default-off debug output,
and fixes a compile error caused by a missing comma.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5379>
Dave Airlie [Mon, 8 Jun 2020 07:02:11 +0000 (17:02 +1000)]
draw/gs: fix emitting inactive primitives crash
Fixes dEQP-GLES31.functional.geometry_shading.emit.line_strip_emit_1_end_1
This test only emits 1 primitive, but the stores don't respect
the current mask, which might only have one lane active, for that single
primitive. Also fix the final emit path to use the emitted_mask
rather than the current execution mask.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5379>
Eric Anholt [Wed, 10 Jun 2020 19:13:43 +0000 (12:13 -0700)]
ci: Leave a note as to what might be going on with a test.
dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z
fails pretty strangely (given that we're passing everything else) and
there's an old VK-GL-CTS bug open about this test, and it's suspicious
that all the ARM drivers seem to have trouble with it. I tried dropping
to -O0 on building that file in the CTS and it didn't help, though.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5419>
Eric Anholt [Wed, 10 Jun 2020 18:28:03 +0000 (11:28 -0700)]
freedreno/a6xx: Fix clip_halfz support.
Same bit as on other gens, apparently it just got missed on this one.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5419>
Ben Skeggs [Sat, 6 Jun 2020 23:52:49 +0000 (09:52 +1000)]
nvc0: initial support for tu1xx
v2:
- add proper method definitions
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:46 +0000 (09:52 +1000)]
nvc0: initial support for gv100
v2:
- remove unnecessary MAX2()
- add proper method definitions
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:45 +0000 (09:52 +1000)]
nvc0: remove hardcoded blitter vertprog
I don't really feel like writing SM70 SASS by hand...
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:43 +0000 (09:52 +1000)]
nvc0: move setting of entrypoint for a shader stage to a function
GV100 requires something different, so it is cleaner to move this to a single place.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:41 +0000 (09:52 +1000)]
nvc0: use NVIDIA headers for GP100- compute QMD
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:39 +0000 (09:52 +1000)]
nvc0: use NVIDIA headers for GK104->GM2xx compute QMD
v2:
- add header debug_printf(), and indent the output
v3:
- rename one of the helper macros
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:37 +0000 (09:52 +1000)]
nvir/gv100: enable support for tu1xx
SM75 has a bunch more stuff, but is otherwise backwards-compatible
with SM70 SASS.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:35 +0000 (09:52 +1000)]
nvir/gv100: initial support
v2:
- add TargetGV100::isBarrierRequired() for OP_BREV
- use NV50_IR_SUBOP_LOP3_LUT() convenience macro where it makes sense
- separated out nir_lower_idiv into its own commit
- make use of the shared function to generate compiler options
- disable lower_fpow, nir's lowering is broken
v3:
- use replaceCvt() instead of custom NEG/ABS/SAT lowering
v4:
- remove WAR from peephole, not needed now we're using replaceCvt()
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:33 +0000 (09:52 +1000)]
nvir/nir/gm107: switch off lower_extract_word
We can use PRMT here.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:31 +0000 (09:52 +1000)]
nvir/nir/gm107: switch off lower_extract_byte
We can use PRMT here.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:29 +0000 (09:52 +1000)]
nvir/nir/gm107: turn on nir_lower_extract64
About to disable lowering for extract_byte/word in favour of a better
local implementation, but still need lowering for 64-bit versions.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:27 +0000 (09:52 +1000)]
nvir/nir/gm107: split nir shader compiler options from gf100
We can enable some more things here vs earlier GPUs.
v2:
- make use of the shared function to generate compiler options
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:25 +0000 (09:52 +1000)]
nvir/gm107: separate out header for sched data calculator
SM70 code emitter will want to reuse this.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:23 +0000 (09:52 +1000)]
nvir/gm107: replace SHR+AND+AND with PRMT+PRMT in PFETCH lowering
This is more SM70-friendly.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:21 +0000 (09:52 +1000)]
nvir/gm107: implement OP_PERMT
PFETCH lowering will be changed to use this as it's more SM70-friendly,
and this will also allow us to implement extract_byte/word opcodes.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
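For readers unfamiliar with PRMT, a byte permute picks four bytes out of a pair of 32-bit sources. The sketch below follows the default mode of CUDA's `__byte_perm` intrinsic and ignores sign replication and the other hardware modes; how OP_PERMT is encoded in nvir is not shown, and `byte_perm` is a hypothetical name:

```python
def byte_perm(a: int, b: int, selector: int) -> int:
    """Build a 32-bit result by picking four bytes out of the 8-byte pool b:a.

    Each 4-bit selector nibble chooses one source byte: values 0-3 pick
    bytes of `a`, values 4-7 pick bytes of `b`. Only the low 3 bits of
    each nibble are used in this sketch.
    """
    pool = ((b & 0xFFFFFFFF) << 32) | (a & 0xFFFFFFFF)
    result = 0
    for i in range(4):
        sel = (selector >> (4 * i)) & 0x7
        byte = (pool >> (8 * sel)) & 0xFF
        result |= byte << (8 * i)
    return result
```

With a zero second source, a zero-extended byte extract is a single permute: `byte_perm(x, 0, 0x4442)` returns byte 2 of `x` in the low byte and zeros elsewhere.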
Ben Skeggs [Sun, 7 Jun 2020 20:23:50 +0000 (06:23 +1000)]
nvir/nir: use nir_lower_idiv
NIR provides a common implementation of this so we don't need to use a
hand-written built-in library.
v2:
- use idiv_precise instead
Especially important on SM70 where we don't have an assembler.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:17 +0000 (09:52 +1000)]
nvir/nir: nir expects the shift amount to wrap, rather than clamp
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
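The difference between the two shift behaviours only shows up for amounts of 32 or more. A minimal sketch for 32-bit left shifts, with hypothetical function names:

```python
MASK32 = 0xFFFFFFFF


def shl_wrap(x: int, shift: int) -> int:
    """32-bit left shift where the shift amount wraps modulo 32."""
    return (x << (shift & 31)) & MASK32


def shl_clamp(x: int, shift: int) -> int:
    """32-bit left shift where amounts of 32 or more shift everything out."""
    return 0 if shift >= 32 else (x << shift) & MASK32
```

Shifting 1 by 33 gives 2 with wrapping and 0 with clamping; for amounts below 32 the two are identical.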
Ben Skeggs [Sat, 6 Jun 2020 23:52:15 +0000 (09:52 +1000)]
nvir/nir: implement nir_op_uror
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:13 +0000 (09:52 +1000)]
nvir/nir: implement nir_op_urol
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
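As a reference for what the two rotate opcodes compute, here is a sketch of 32-bit rotate left and rotate right on plain integers. The function names mirror the NIR opcode names but the code is only an illustration, not the nvir lowering:

```python
MASK32 = 0xFFFFFFFF


def urol32(x: int, n: int) -> int:
    """Rotate a 32-bit value left by n bits (n taken modulo 32)."""
    x &= MASK32
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32


def uror32(x: int, n: int) -> int:
    """Rotate a 32-bit value right by n bits (n taken modulo 32)."""
    x &= MASK32
    n &= 31
    return ((x >> n) | (x << (32 - n))) & MASK32
```

The two are inverses of each other: `uror32(urol32(x, n), n)` returns `x` for any 32-bit `x`.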
Ben Skeggs [Sat, 6 Jun 2020 23:52:11 +0000 (09:52 +1000)]
nvir/nir: implement nir_op_extract_i16
v2:
- use getSSA() instead of getScratch()
v3:
- fix whitespace
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:10 +0000 (09:52 +1000)]
nvir/nir: implement nir_op_extract_u16
v2:
- use getSSA() instead of getScratch()
v3:
- fix whitespace
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:08 +0000 (09:52 +1000)]
nvir/nir: implement nir_op_extract_i8
v2:
- use getSSA() instead of getScratch()
v3:
- fix whitespace
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:06 +0000 (09:52 +1000)]
nvir/nir: implement nir_op_extract_u8
v2:
- use getSSA() instead of getScratch()
v3:
- fix whitespace
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
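The four extract opcodes implemented above select one byte or one 16-bit word from a 32-bit value and either zero-extend or sign-extend it. A sketch of those semantics on plain integers, where signed results are returned as negative Python ints rather than 32-bit patterns:

```python
def extract_u8(value: int, index: int) -> int:
    """Zero-extended byte `index` (0-3) of a 32-bit value."""
    return (value >> (8 * index)) & 0xFF


def extract_i8(value: int, index: int) -> int:
    """Sign-extended byte `index` (0-3) of a 32-bit value."""
    v = extract_u8(value, index)
    return v - 0x100 if v & 0x80 else v


def extract_u16(value: int, index: int) -> int:
    """Zero-extended 16-bit word `index` (0-1) of a 32-bit value."""
    return (value >> (16 * index)) & 0xFFFF


def extract_i16(value: int, index: int) -> int:
    """Sign-extended 16-bit word `index` (0-1) of a 32-bit value."""
    v = extract_u16(value, index)
    return v - 0x10000 if v & 0x8000 else v
```

For example, `extract_u8(0x12345678, 1)` is `0x56`, and `extract_i8(0x0000FF00, 1)` is -1.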
Ben Skeggs [Sat, 6 Jun 2020 23:52:19 +0000 (09:52 +1000)]
nvir/nir: turn on lower_rotate
This isn't implemented, and won't be for GPUs that don't support SHF.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sun, 7 Jun 2020 22:43:54 +0000 (08:43 +1000)]
nvir/nir: flesh out options
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:04 +0000 (09:52 +1000)]
nvir/nir: move nir options to codegen
These seem to make more sense living with the compiler.
v2:
- use a shared function to generate the per-chipset structs
- remove nir.h include from header, not needed
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:02 +0000 (09:52 +1000)]
nvir/nir: fix fragment program output when using MRT
v2:
- use BITFIELD64_BIT()
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Karol Herbst [Fri, 15 May 2020 09:14:12 +0000 (11:14 +0200)]
nvir/nir: use component helpers instead of insn->num_components
We have nir_intrinsic_dest_components and nir_intrinsic_src_components
which handle all the corner cases.
Fixes a bunch of regressions like front_face stuff.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Mon, 8 Jun 2020 23:52:47 +0000 (09:52 +1000)]
nvir: run replaceZero() before replaceCvt()
replaceCvt() will miss some cases otherwise.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:52:00 +0000 (09:52 +1000)]
nvir: add constant folding for OP_PERMT
Important for SM70 INSBF/EXTBF lowering, as these can often be
eliminated completely.
v2:
- skip CF when subOp is set
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:51:58 +0000 (09:51 +1000)]
nvir: introduce OP_FINAL
Required to support SM70 GS.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:51:56 +0000 (09:51 +1000)]
nvir: introduce OP_SGXT
Required for SM70 EXTBF lowering.
v2:
- added constant folding
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:51:55 +0000 (09:51 +1000)]
nvir: introduce OP_BMSK
This replaces the existing implementation without adding lowering for
earlier GPUs. The reason is that the existing code isn't
at all correct, and it also can't be hit anyway.
Will be required to support SM70 lowering passes.
v2:
- fixup source selection
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:51:53 +0000 (09:51 +1000)]
nvir: introduce OP_SHF
We already use a hack from NVC0LegalizeSSA::handleShift() on GK110 and
newer which encodes SHF into the existing SHL/SHR opcodes, but there's
a couple of problems with it:
- LO/HI are swapped in one of the directions, which is very confusing.
- The initial SM70 code will emit this from NIR->NVIR, and using the
existing encodings will confuse the optimisation passes.
As I want to limit the impact on other GPUs from the initial bring-up
of Volta/Turing, let's add an explicit representation of SHF in the IR.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
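For context, a funnel shift treats two 32-bit registers as one 64-bit value and returns 32 bits of the shifted result. The sketch below assumes the wrapping variant (shift taken modulo 32) and uses hypothetical names; it does not model the hardware's other modes or the nvir encoding:

```python
MASK32 = 0xFFFFFFFF


def shf_l(lo: int, hi: int, shift: int) -> int:
    """Funnel shift left: upper 32 bits of (hi:lo) << shift."""
    shift &= 31
    wide = ((hi & MASK32) << 32) | (lo & MASK32)
    return ((wide << shift) >> 32) & MASK32


def shf_r(lo: int, hi: int, shift: int) -> int:
    """Funnel shift right: lower 32 bits of (hi:lo) >> shift."""
    shift &= 31
    wide = ((hi & MASK32) << 32) | (lo & MASK32)
    return (wide >> shift) & MASK32
```

Passing the same value as both halves turns a funnel shift into a rotate, which is one reason rotate support tends to depend on SHF being available.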
Ben Skeggs [Sat, 6 Jun 2020 23:51:51 +0000 (09:51 +1000)]
nvir: introduce OP_BREV with lowering to EXTBF_REV for current GPUs
SM70 has this instruction, but no BFE.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:51:49 +0000 (09:51 +1000)]
nvir: introduce OP_WARPSYNC
Will be required to support SM70.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Ben Skeggs [Sat, 6 Jun 2020 23:51:45 +0000 (09:51 +1000)]
nvir: introduce OP_LOP3_LUT
Will be required to support SM70, but is also available on earlier GPUs.
v2:
- add convenience macro suggested by Karol
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
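A LOP3-style operation evaluates an arbitrary three-input boolean function per bit, with the function given as an 8-bit truth table. The sketch below shows the usual way such a table is built and applied; `lop3` and `make_lut` are hypothetical names and this is not the convenience macro from the commit:

```python
def lop3(a: int, b: int, c: int, lut: int) -> int:
    """Apply an 8-bit truth table to each bit position of three 32-bit inputs."""
    result = 0
    for bit in range(32):
        idx = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1)
        result |= ((lut >> idx) & 1) << bit
    return result


def make_lut(op) -> int:
    """Derive the truth table by applying `op` to the three canonical masks."""
    return op(0xF0, 0xCC, 0xAA) & 0xFF
```

The masks 0xF0, 0xCC and 0xAA enumerate every combination of the three inputs, so `make_lut(lambda a, b, c: a & b & c)` yields `0x80` and a three-way XOR yields `0x96`.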
Ben Skeggs [Sat, 6 Jun 2020 23:51:40 +0000 (09:51 +1000)]
nvir: bump max encoding size of instructions
SM70 SASS is encoded into 16 bytes.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5377>
Erik Faye-Lund [Tue, 9 Jun 2020 19:25:26 +0000 (21:25 +0200)]
gallium/hud: do not specify potentially invalid depth-range
Setting the depth-scale to 1 while leaving the depth-translation at 0
means our near-plane is at -1 in OpenGL semantics, which is
out-of-range on some drivers. In particular, Zink has this limitation.
But since we'll only pass a zero z in here anyway, we might as well
multiply it by zero, and get the same result. This avoids the problem.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5408>
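The arithmetic behind this is the viewport depth transform, window z = z * scale + translate, applied to OpenGL's clip range of -1 to +1. A small sketch with hypothetical helper names:

```python
def viewport_z(z: float, scale: float, translate: float) -> float:
    """Map a normalized-device z value to window depth."""
    return z * scale + translate


def depth_range(scale: float, translate: float) -> tuple:
    """Return the (near, far) depths produced by z = -1 and z = +1."""
    return (viewport_z(-1.0, scale, translate), viewport_z(1.0, scale, translate))
```

With scale 1 and translate 0 the implied range is (-1, 1), so the near plane sits at -1. With scale 0 the range collapses to (0, 0), and since the HUD only passes z = 0 the resulting depth is 0 either way.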
Dave Airlie [Wed, 13 May 2020 03:37:39 +0000 (13:37 +1000)]
draw: add disk caching for draw shaders
This adds the cache search/insert and compile skipping for cached
objects to the VS/GS/TES/TCS stages in draw.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Dave Airlie [Wed, 13 May 2020 03:37:19 +0000 (13:37 +1000)]
llvmpipe: hook draw disk cache up
Connect the draw callbacks into the llvmpipe code.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Dave Airlie [Wed, 13 May 2020 03:36:55 +0000 (13:36 +1000)]
draw: add disk cache callbacks for draw shaders
This provides a set of hooks from the driver that draw can
use to access the disk cache for the draw shaders.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Dave Airlie [Wed, 13 May 2020 00:49:51 +0000 (10:49 +1000)]
llvmpipe/cs: add shader caching
As for the fragment shader, skip the compilation step if we have the shaders cached.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Dave Airlie [Wed, 13 May 2020 00:45:37 +0000 (10:45 +1000)]
llvmpipe/fs: add caching support
Serialize and check if the object is in the cache; if there is
a cached object, skip the compilation code once we've constructed
the function interface.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
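The serialize, look up, skip-compile flow can be illustrated in a few lines. This sketch uses an in-memory dict as a stand-in for the on-disk cache and a SHA-1 of the serialized shader as the key; the class and method names are hypothetical and unrelated to the llvmpipe or util/disk_cache APIs:

```python
import hashlib


class ShaderCache:
    """Toy cache keyed by a hash of the serialized shader."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_compile(self, serialized: bytes, compile_fn):
        """Return the cached object, compiling only on a miss."""
        key = hashlib.sha1(serialized).hexdigest()
        obj = self._store.get(key)
        if obj is not None:
            self.hits += 1
            return obj
        self.misses += 1
        obj = compile_fn(serialized)
        self._store[key] = obj
        return obj
```

Calling `get_or_compile` twice with the same bytes invokes `compile_fn` once; the second call is served from the store, which is the step that lets the expensive compilation be skipped.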
Dave Airlie [Thu, 14 May 2020 05:23:48 +0000 (15:23 +1000)]
gallivm: don't cache shaders that use fetch functions.
This needs to be reworked, but it's a bit messy as we have to store
all the fetch pointers to be added as globals later once gallivm
has been initialised further. For now just refuse to cache shaders
that hit these paths (mainly ETC1 and BPTC).
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Dave Airlie [Tue, 21 Apr 2020 03:14:20 +0000 (13:14 +1000)]
llvmpipe: add infrastructure for disk cache support
This hooks up the gallium API and adds the APIs needed
for shader stages to search and add things to the cache.
It also adds cache stats debug printing.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Dave Airlie [Wed, 13 May 2020 00:43:56 +0000 (10:43 +1000)]
gallivm: add cache interface to mcjit
MCJIT uses an ObjectCache object to implement the cache.
This creates an instance of it and adds it to the MCJIT
instances, and it stores the cached object for later use by
the outer layers.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Dave Airlie [Fri, 15 May 2020 00:11:56 +0000 (10:11 +1000)]
gallivm: skip operations if we have a cached object.
If the object is loaded from the cache, a bunch of gallivm/llvm
interactions can be skipped.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Dave Airlie [Tue, 12 May 2020 23:30:44 +0000 (09:30 +1000)]
gallivm: add support for a cache object
This plumbs the cache object into the gallivm API, nothing uses
it yet.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Dave Airlie [Fri, 15 May 2020 00:05:55 +0000 (10:05 +1000)]
gallivm: rework debug printf hook to use global mapping.
Cached shaders require relinking, so hardcoding the pointer
can't work. This switches out the printf code to use new
proper API.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Dave Airlie [Fri, 15 May 2020 00:03:32 +0000 (10:03 +1000)]
gallivm: rework coroutine malloc/free callouts.
When using cached shaders we have to relink the shader with
external symbols when it's loaded. However the way gallivm does
function calls now hardcodes the function pointer into the shader.
LLVM has a mechanism for doing this properly using global mappings,
this switches the coroutine alloc/free code to use a global mapping.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Dave Airlie [Thu, 14 May 2020 23:59:34 +0000 (09:59 +1000)]
llvmpipe/draw: drop variant number from function names.
When we use an object cache for the MCJIT we can have identical
cache entries from the same shader variant in different shaders,
but the JIT objcache uses the function name to relink things,
so it has to be consistent. Just drop the variants from the
function names.
Note the modules still have the variant info.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Dave Airlie [Thu, 21 May 2020 03:21:51 +0000 (13:21 +1000)]
llvmpipe/cs: overhaul cs variant key state.
This just realigns it with the fs state, and fixes some issues
where shaders weren't getting cached correctly.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Dave Airlie [Mon, 8 Jun 2020 02:30:27 +0000 (12:30 +1000)]
util/disk_cache: add fallback for disk_cache_get_function_identifier
Otherwise drivers need to have an ifdef on Windows; it is easier to fix
here, hopefully.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Christian Gmeiner [Tue, 9 Jun 2020 17:05:21 +0000 (19:05 +0200)]
ci: fix possible spurious run of jobs
Need to list arm_test-base here as well, or jobs using this
template may spuriously run if the arm_test-base job fails or
is cancelled.
Suggested-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5405>
Marek Olšák [Tue, 9 Jun 2020 08:55:19 +0000 (04:55 -0400)]
ac/surface: cache DCC retile maps (v2)
This reduces overhead when resizing windows or when allocating
similar image sizes over and over again.
v2: optimize the memory footprint of the cache
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>
Marek Olšák [Tue, 9 Jun 2020 07:19:04 +0000 (03:19 -0400)]
ac/surface: add a wrapper structure to hold ADDR_HANDLE
and more things in the future.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>
Marek Olšák [Tue, 9 Jun 2020 07:06:22 +0000 (03:06 -0400)]
amd/addrlib: remove unused members of ADDR2_COMPUTE_DCC_ADDRFROMCOORD_INPUT
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>
Marek Olšák [Tue, 9 Jun 2020 06:40:20 +0000 (02:40 -0400)]
amd/addrlib: don't recompute DCC info for every ComputeDccAddrFromCoord call
This decreases the DCC retile map overhead from 23% to 18%.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>
Marek Olšák [Tue, 9 Jun 2020 06:08:21 +0000 (02:08 -0400)]
ac/surface: don't recompute the DCC retile map for imported textures
The retile map is not used in this case, and the retile map computation
takes 39% of CPU time when resizing a window.
This brings it down to 23%.
The dcc_retile_use_uint16 setting has to be derived from DCC sizes.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>
Rhys Perry [Thu, 21 May 2020 19:21:37 +0000 (20:21 +0100)]
aco: fix moving sub-dword values out of a register for a fixed definition
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
Rhys Perry [Fri, 15 May 2020 14:25:44 +0000 (15:25 +0100)]
aco: use Info::definition_size instead of definition's regclass
16-bit abs/neg creates v_xor_b32/v_and_b32 with v2b definitions. These
instructions never do partial writes without SDWA.
No shader-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
Rhys Perry [Mon, 18 May 2020 14:37:33 +0000 (15:37 +0100)]
aco: add Info::{operand_size,definition_size}
No shader-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
Rhys Perry [Tue, 12 May 2020 14:08:05 +0000 (15:08 +0100)]
aco: prefer 4-byte aligned definitions
shader-db (Navi, fp16 enabled):
Totals from 42 (0.03% of 127638) affected shaders:
CodeSize: 811984 -> 806224 (-0.71%)
Instrs: 155733 -> 155939 (+0.13%); split: -0.04%, +0.18%
Cycles: 1982568 -> 1984400 (+0.09%); split: -0.06%, +0.15%
VMEM: 7187 -> 7121 (-0.92%); split: +0.86%, -1.78%
SMEM: 1770 -> 1769 (-0.06%)
VClause: 1475 -> 1476 (+0.07%)
Copies: 12406 -> 12606 (+1.61%); split: -0.46%, +2.07%
Branches: 5901 -> 5900 (-0.02%); split: -0.25%, +0.24%
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
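As a rough illustration of what "prefer 4-byte aligned definitions" could mean for sub-dword values, the sketch below picks a byte offset in a byte-granular register file, trying dword-aligned offsets first and falling back to any size-aligned offset. This is not ACO's register allocator: the byte-array model and the names range_free and find_slot are assumptions made for the example, and size is assumed to be 1, 2 or 4.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if bytes [off, off + size) are all unused. */
static bool range_free(const bool *used, size_t off, size_t size)
{
   for (size_t i = 0; i < size; i++)
      if (used[off + i])
         return false;
   return true;
}

/* Return a byte offset for a value of `size` bytes (1, 2 or 4),
 * or -1 if nothing fits. Hypothetical helper, for illustration only. */
static int find_slot(const bool *used, size_t num_bytes, size_t size)
{
   /* Pass 1: prefer offsets at the start of a dword. */
   for (size_t off = 0; off + size <= num_bytes; off += 4)
      if (range_free(used, off, size))
         return (int)off;

   /* Pass 2: accept any offset aligned to the value's own size. */
   for (size_t off = 0; off + size <= num_bytes; off += size)
      if (range_free(used, off, size))
         return (int)off;

   return -1;
}
```

For an 8-byte file with bytes 0 and 1 occupied, a 2-byte request lands at offset 4 rather than offset 2, even though offset 2 is free. The upper half of the first dword is only used once no dword-aligned slot remains.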
Rhys Perry [Mon, 11 May 2020 16:49:40 +0000 (17:49 +0100)]
aco: allow reading/writing upper halves/bytes when possible
Use SDWA, opsel or a different opcode to achieve this.
shader-db (Navi, fp16 enabled):
Totals from 42 (0.03% of 127638) affected shaders:
VGPRs: 3424 -> 3416 (-0.23%)
CodeSize: 811124 -> 811984 (+0.11%); split: -0.12%, +0.23%
Instrs: 156638 -> 155733 (-0.58%)
Cycles: 1994180 -> 1982568 (-0.58%); split: -0.59%, +0.00%
VMEM: 7019 -> 7187 (+2.39%); split: +3.45%, -1.05%
SMEM: 1771 -> 1770 (-0.06%); split: +0.06%, -0.11%
VClause: 1477 -> 1475 (-0.14%)
Copies: 13216 -> 12406 (-6.13%)
Branches: 5942 -> 5901 (-0.69%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
Rhys Perry [Tue, 9 Jun 2020 19:41:49 +0000 (20:41 +0100)]
aco: p_extract_vector in 64-bit u2f16/i2f16
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
Rhys Perry [Wed, 3 Jun 2020 10:27:55 +0000 (11:27 +0100)]
aco: validate instructions reading/writing upper halves/bytes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
Icecream95 [Sat, 6 Jun 2020 10:32:04 +0000 (22:32 +1200)]
panfrost: Add writes_stencil to the EARLY_Z disable list
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>
Icecream95 [Sat, 6 Jun 2020 03:36:22 +0000 (15:36 +1200)]
pan/mdg: Print writeout sources in mir_print_instruction
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>
Icecream95 [Sat, 6 Jun 2020 05:25:08 +0000 (17:25 +1200)]
pan/mdg: Add new depth store lowering
This uses the new nir_intrinsic_store_combined_output_pan intrinsic,
which can write depth, stencil and color in a single instruction. If
there are no color writes, the "depth RT" is written to.
Fixes the dEQP GLES3 depth write tests, as well as the piglit tests
fragdepth_gles2, glsl-1.10-fragdepth and, when modified to not rely
on depth/stencil reload, glsl-fs-shader-stencil-export.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>
Icecream95 [Sat, 6 Jun 2020 03:41:51 +0000 (15:41 +1200)]
pan/mdg: Add depth/stencil support to emit_fragment_store
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>
Icecream95 [Sat, 6 Jun 2020 03:39:22 +0000 (15:39 +1200)]
pan/mdg: Move search_var to earlier in midgard_compile.c
It will be needed by the new zs lowering.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>
Icecream95 [Sat, 6 Jun 2020 03:21:21 +0000 (15:21 +1200)]
pan/mdg: Add new depth writeout code
We schedule depth writeout to smul and stencil to vlut, so scheduling
to smul has to be disabled in these cases.
When only writing stencil, scheduling to smul is still disabled to
prevent stencil writeout from being scheduled there.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>