Eric Engestrom [Sat, 3 Aug 2019 11:31:19 +0000 (12:31 +0100)]
gitlab-ci: don't install autotools deps
These could've been deleted a long time ago, but apparent we forgot.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Eric Engestrom [Sun, 7 Jul 2019 10:40:04 +0000 (11:40 +0100)]
util: fix mem leak of program path
Fixes: 759b94038987bb983398 ("util: Get program name based on path when possible")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Eric Engestrom [Sat, 3 Aug 2019 01:33:42 +0000 (02:33 +0100)]
meson: build intel-ui tools as part of `all` tools
Reported-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111289
Cc: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Engestrom [Sat, 3 Aug 2019 11:41:06 +0000 (12:41 +0100)]
gitlab-ci: add gtk3 dev files for `-D tools=intel-ui`
We also need to update wayland-protocols and libXrandr (and randrproto),
as they are too old for gdk3 (which gtk3 depends on).
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jan Vesely [Tue, 6 Aug 2019 16:24:18 +0000 (12:24 -0400)]
clover: Fix build after clang r367864
v2: Drop special case of llvm-9
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Aaron Watry <awatry@gmail.com>
Timothy Arceri [Wed, 7 Aug 2019 00:21:43 +0000 (10:21 +1000)]
mesa: remove super old TODOs from shaderapi.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
John Stultz [Wed, 24 Jul 2019 23:32:54 +0000 (23:32 +0000)]
mesa: freedreno: Android.registers.mk: Fix up register xml.h file generation
The current Androdi.registers.mk file causes build failures that
look like:
FAILED:
external/mesa3d/src/freedreno/Android.registers.mk:49: error: implicit rules are obsolete: out/target/product/linaro_db845c/gen/STATIC_LIBRARIES/libfreedreno_registers_intermediates/registers/%.xml.h
Caused by the following Android build rule change:
https://android.googlesource.com/platform/build/+/HEAD/Changes.md#implicit_rules
I tried to replace this with something similar to the static
pattern suggested in the URL above, but ended up getting all the
xml.h files generated using only the first a2xx.xml source file.
So I've fallen back to explicitly defining the make rules for
each.
Additionally, we needed to provide the proper
LOCAL_EXPORT_C_INCLUDE_DIRS and add the defined static library
to the components that depend on the register headers.
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: John Stultz <john.stultz@linaro.org>
John Stultz [Wed, 3 Jul 2019 22:37:45 +0000 (22:37 +0000)]
mesa: Add ir3/ir3_nir_imul.c generation to Android.mk
With current master we're seeing build failures with AOSP:
error: undefined symbol: ir3_nir_lower_imul
This is due to the ir3_nir_imul.c file not being generated
in the Android.mk files.
This patch simply adds it to the Android build, after which
thigns build and book ok on db410c.
Cc: Rob Clark <robdclark@chromium.org>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Amit Pundir <amit.pundir@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Greg Hartman <ghartman@google.com>
Cc: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Rohan Garg [Wed, 17 Jul 2019 16:50:13 +0000 (18:50 +0200)]
panfrost: Take into account a index_bias for glDrawElementsBaseVertex calls
Midgard does not accept a index_bias directly and relies instead on a
bias correction offset (offset_bias_correction) in order to calculate
the unbiased vertex index.
We need to make sure we adjust offset_start and vertex_count in order to
take into account the index_bias as required by a
glDrawElementsBaseVertex call and then supply a additional
offset_bias_correction to the hardware.
Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Bas Nieuwenhuizen [Sun, 4 Aug 2019 23:39:23 +0000 (01:39 +0200)]
radv/gfx10: Enable DCC for storage images.
v2: Hide it behind a perftest flag.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 4 Aug 2019 23:38:42 +0000 (01:38 +0200)]
radv: Add device argument for dcc compression check.
Because it is about to be generation dependent.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 4 Aug 2019 23:19:29 +0000 (01:19 +0200)]
radv: Disable compression for compute DCC decompress store.
Previously we relied on stores not using DCC but that is going to
change, so disable compression explicitly.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 4 Aug 2019 23:07:04 +0000 (01:07 +0200)]
radv: Add extra struct to image view creation.
For extra args. Unlike image creation, I'm not embedding the vk
struct in there, so all the inline structs can be kept.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 4 Aug 2019 23:24:45 +0000 (01:24 +0200)]
radv: Do not decompress on LAYOUT_GENERAL.
We handle render loops properly now and STORAGE still disables
DCC/TC-compat HTILE in general.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 4 Aug 2019 21:48:43 +0000 (23:48 +0200)]
radv: Pass through render loop detection to internal layout decisions.
And do nothing with it yet.
Everything outside a renderpass has no render loop.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Sun, 4 Aug 2019 21:17:20 +0000 (23:17 +0200)]
radv: Add render loop detection in renderpass.
VK spec 7.3:
"Applications must ensure that all accesses to memory that backs
image subresources used as attachments in a given renderpass instance
either happen-before the load operations for those attachments, or
happen-after the store operations for those attachments."
So the only renderloops we can have is with input attachments. Detect
these.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Timothy Arceri [Fri, 2 Aug 2019 05:17:16 +0000 (15:17 +1000)]
drirc: Add vendor workaround for Divinity: Original Sin EE
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93551
Timothy Arceri [Fri, 2 Aug 2019 05:13:59 +0000 (15:13 +1000)]
mesa/gallium: add dric option to allow overriding GL vendor string
Will be used in the following patch.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93551
Marek Olšák [Wed, 7 Aug 2019 00:09:46 +0000 (20:09 -0400)]
relnotes/19.2: document EXT_texture_dhadow_lod
Bas Nieuwenhuizen [Tue, 6 Aug 2019 09:50:37 +0000 (11:50 +0200)]
radv: Fix config reg assert.
Using the wrong bounds
Fixes: "219d6939df8 radv: add more assertions to make sure packets are correctly emitted"
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Fri, 2 Aug 2019 13:22:52 +0000 (15:22 +0200)]
tgsi_to_nir: add a few needed double opcodes
for internal radeonsi shaders
v2 (Connor):
- Split out prep work from adding opcodes, and rewrite the former
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Fri, 2 Aug 2019 13:19:00 +0000 (15:19 +0200)]
tgsi_to_nir: implement a few needed 64-bit integer opcodes
for internal radeonsi shaders
v2 (Connor):
- Split this out from the prep work, and rework the former
- Add support for U64SNE
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Connor Abbott [Fri, 2 Aug 2019 13:13:53 +0000 (15:13 +0200)]
ttn: Prepare for 64-bit sources and destinations
v2: Properly handle 32->64 bit conversions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Connor Abbott [Fri, 2 Aug 2019 13:00:30 +0000 (15:00 +0200)]
ttn: Use 1-bit NIR comparison opcodes
We shouldn't be using the versions that output a 32-bit boolean, since
nir_opt_algebraic won't optimize them as well. Drivers will lower these
to the 32-bit versions after optimizing, if appropriate. Also, this will
make implementing 64-bit comparisons easier.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Connor Abbott [Fri, 2 Aug 2019 12:56:20 +0000 (14:56 +0200)]
nir/builder: Add nir_b2i
Same as nir_b2f but for integers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Thu, 1 Aug 2019 08:17:26 +0000 (10:17 +0200)]
radeonsi: enable EXT_shader_image_load_store
This depends on LLVM 10 because this needs https://reviews.llvm.org/D65283
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Wed, 24 Jul 2019 10:07:50 +0000 (12:07 +0200)]
radeonsi: add support for nir atomic_inc_wrap/atomic_dec_wrap
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Fri, 12 Jul 2019 13:57:54 +0000 (15:57 +0200)]
radeonsi: add support for tgsi ATOMDEC_WRAP / ATOMINC_WRAP opcodes
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Wed, 24 Jul 2019 10:09:31 +0000 (12:09 +0200)]
ac: add ac_atomic_inc_wrap / ac_atomic_dec_wrap support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Wed, 24 Jul 2019 10:06:34 +0000 (12:06 +0200)]
nir: add atomic_inc_wrap/atomic_dec_wrap image intrinsics
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Fri, 12 Jul 2019 14:38:44 +0000 (16:38 +0200)]
glsl: add EXT_shader_image_load_store new image functions
This extension has 2 functions that are missing from the ARB versions:
- imageAtomicIncWrap
- imageAtomicDecWrap
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Fri, 12 Jul 2019 14:26:45 +0000 (16:26 +0200)]
glsl: add EXT_shader_image_load_store keywords to lexer
All of them already existed for ARB_shader_image_load_store.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Fri, 12 Jul 2019 14:33:46 +0000 (16:33 +0200)]
glsl: add size qualifiers from EXT_shader_image_load_store
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Fri, 12 Jul 2019 13:50:38 +0000 (15:50 +0200)]
glsl: handle differences between ARB/EXT versions of shader_image_load_store
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Fri, 12 Jul 2019 13:47:26 +0000 (15:47 +0200)]
mesa: add EXT_shader_image_load_store glBindImageTextureEXT function
The implementation is almost identical to glBindImageTexture except for error
checking.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Fri, 12 Jul 2019 13:46:27 +0000 (15:46 +0200)]
glapi: add EXT_shader_image_load_store
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Wed, 17 Jul 2019 13:43:39 +0000 (15:43 +0200)]
gallium: add PIPE_CAP_TGSI_ATOMINC_WRAP to indicate support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Fri, 12 Jul 2019 13:54:17 +0000 (15:54 +0200)]
tgsi: add ATOMICINC_WRAP/ATOMICDEC_WRAP opcode
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Wed, 31 Jul 2019 03:20:03 +0000 (23:20 -0400)]
radeonsi/gfx10: enable all CUs for GS if NGG is never used
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Mon, 29 Jul 2019 21:43:55 +0000 (17:43 -0400)]
radeonsi/gfx10: add global use_ngg and use_ngg_streamout flags
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Wed, 31 Jul 2019 01:42:26 +0000 (21:42 -0400)]
radeonsi/gfx10: remove an obsolete VGT_REUSE_OFF workaround
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Wed, 31 Jul 2019 01:39:03 +0000 (21:39 -0400)]
radeonsi/gfx10: disable LATE_ALLOC_GS on Navi14
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Wed, 31 Jul 2019 01:29:29 +0000 (21:29 -0400)]
radeonsi/gfx10: implement a bug workaround for GE_PC_ALLOC
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 30 Jul 2019 22:40:22 +0000 (18:40 -0400)]
radeonsi/gfx10: implement a bug workaround for NGG -> legacy transitions
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 30 Jul 2019 22:33:01 +0000 (18:33 -0400)]
radeonsi/gfx10: implement a GE bug workaround
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 30 Jul 2019 22:16:05 +0000 (18:16 -0400)]
radeonsi/gfx10: set GE_CNTL for tessellation correctly
to match PAL
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Mon, 29 Jul 2019 21:45:22 +0000 (17:45 -0400)]
radeonsi/gfx10: simplify NGG code in si_update_shaders
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Mon, 29 Jul 2019 21:44:52 +0000 (17:44 -0400)]
radeonsi/gfx10: fix input VGPRs for legacy VS
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 30 Jul 2019 21:43:41 +0000 (17:43 -0400)]
radeonsi: make sure that rasterizer state != NULL and remove all NULL checking
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 30 Jul 2019 21:43:41 +0000 (17:43 -0400)]
radeonsi: make sure that DSA state != NULL and remove all NULL checking
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 30 Jul 2019 21:43:41 +0000 (17:43 -0400)]
radeonsi: make sure that blend state != NULL and remove all NULL checking
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 30 Jul 2019 21:28:50 +0000 (17:28 -0400)]
radeonsi: DCC MSAA blending bug - include logic op, limit to Navi14 and older
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 30 Jul 2019 21:25:06 +0000 (17:25 -0400)]
radeonsi: determine accurately whether logic op is enabled
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 30 Jul 2019 21:10:59 +0000 (17:10 -0400)]
radeonsi: skip draw calls with 0-sized index buffers
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 30 Jul 2019 03:01:01 +0000 (23:01 -0400)]
radeonsi/nir: lower PS inputs before scanning the shader
Lowering PS inputs can eliminate some of them, which messes up
persp/linear barycentric coord usage info.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 30 Jul 2019 02:04:16 +0000 (22:04 -0400)]
radeonsi/nir: handle key.mono.u.ps.interpolate_at_sample_force_center
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Mon, 29 Jul 2019 23:24:02 +0000 (19:24 -0400)]
radeonsi: add missing prints into si_dump_shader_key
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Marek Olšák [Tue, 30 Jul 2019 20:30:04 +0000 (16:30 -0400)]
radeonsi: disable SDMA image copies on dGPUs to fix corruption in games
Cc: 19.1 19.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Pierre-Eric Pelloux-Prayer [Tue, 30 Apr 2019 16:28:15 +0000 (18:28 +0200)]
mesa: add EXT_dsa glMultiTexCoordPointerEXT function
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Tue, 30 Apr 2019 15:52:01 +0000 (17:52 +0200)]
mesa: add EXT_dsa glMultiTexGen* functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Tue, 30 Apr 2019 13:42:39 +0000 (15:42 +0200)]
mesa: add EXT_dsa glCopyMultiTexImage* and glCopyMultiTexSubImage*
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Tue, 30 Apr 2019 13:15:04 +0000 (15:15 +0200)]
mesa: add EXT_dsa glGetMultiTexParameteriv/fvEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Tue, 30 Apr 2019 12:47:31 +0000 (14:47 +0200)]
mesa: add EXT_dsa glMultiTexSubImage1D/2D/3DEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Tue, 30 Apr 2019 11:45:15 +0000 (13:45 +0200)]
mesa: add EXT_dsa glMultiTexImage1D/2D/3DEXT + glGetMultiTexImageEXT
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Thu, 1 Aug 2019 12:50:26 +0000 (14:50 +0200)]
mesa: add glBindMultiTextureEXT display list support
Fixes: 0972b0b059d ("mesa: add support for glBindMultiTextureEXT")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Tue, 30 Apr 2019 11:44:57 +0000 (13:44 +0200)]
mesa: add EXT_dsa glMultiTexParameter* functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Mon, 29 Apr 2019 17:26:55 +0000 (19:26 +0200)]
mesa: add EXT_dsa (Get)MultiTexEnv functions
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Thu, 1 Aug 2019 10:05:52 +0000 (12:05 +0200)]
mesa: add _mesa_(get)texenvi(f)v_indexed helpers
They are exactly like _mesa_GetTexEnvfv/_mesa_GetTexEnviv except they take
a GLuint texunit parameter instead of relying of ctx->Texture.CurrentUnit.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Tue, 30 Apr 2019 11:37:12 +0000 (13:37 +0200)]
mesa: add new helper _mesa_get_texobj_by_target_and_texunit
Based on the 'static get_texobj_by_target' function from texparam.c,
but extended to also take the texunit as a parameter.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Pierre-Eric Pelloux-Prayer [Mon, 29 Apr 2019 17:23:36 +0000 (19:23 +0200)]
mesa: replace _mesa_get_current_fixedfunc_tex_unit with _mesa_get_fixedfunc_tex_unit
The new function implements the same feature but doesn't depend
on ctx->Texture.CurrentUnit.
This change allows to use it from indexed functions.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Danylo Piliaiev [Thu, 25 Jul 2019 10:09:08 +0000 (13:09 +0300)]
iris: Handle vertex shader with window space position
Iris advertises support for PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION
so let's actually implement it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110657
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Erico Nunes [Tue, 6 Aug 2019 17:51:44 +0000 (19:51 +0200)]
lima: fix pipe_debug_callback warnings
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Vasily Khoruzhick [Mon, 5 Aug 2019 19:33:09 +0000 (12:33 -0700)]
lima/ppir: move sin/cos input scaling into NIR
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Antia Puentes [Thu, 28 Feb 2019 18:07:36 +0000 (19:07 +0100)]
nir/spirv: Fix gl_BaseVertex for non-indexed draws for OpenGL
Lowers BaseVertex to the correct system value for OpenGL.
v2: use options->environment rather than adding a new flag to
spirv_to_nir_options
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Kenneth Graunke [Tue, 6 Aug 2019 15:20:58 +0000 (08:20 -0700)]
iris: Increase BATCH_SZ to 64kB
This seems to improve performance by roughly ~1% across the board.
Thanks to Rafael Antognolli and Dan Walsh for their help tuning.
Bas Nieuwenhuizen [Sun, 28 Jul 2019 20:32:33 +0000 (22:32 +0200)]
ac/nir: Use correct cast for readfirstlane and ptrs.
Fixes: 028ce527 "radv: Add non-uniform indexing lowering."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Sun, 28 Jul 2019 20:31:00 +0000 (22:31 +0200)]
radv: Do non-uniform lowering before bool lowering.
Since it can introduce comparisons.
Fixes: 028ce527395 "radv: Add non-uniform indexing lowering."
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Jonathan Marek [Fri, 26 Jul 2019 17:53:36 +0000 (13:53 -0400)]
etnaviv: support 3D and 2D array textures
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Jonathan Marek [Wed, 24 Jul 2019 14:30:08 +0000 (10:30 -0400)]
etnaviv: fix 3d texture upload
Fix uploading of 3D textures and 2D array textures:
* Remove asserts in BLT and RS checking z
* Use box->z/box->depth in etna_copy_resource_box and CPU tile/untile
* Track mip level depth and use it in etna_copy_resource
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Jonathan Marek [Sat, 11 May 2019 13:51:42 +0000 (09:51 -0400)]
etnaviv: add alternative NIR compiler
enable with ETNA_MESA_DEBUG=nir
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Jonathan Marek [Fri, 28 Jun 2019 02:02:45 +0000 (22:02 -0400)]
etnaviv: prep for UBOs
Allow UBO relocs and only emitting uniforms that are actually used.
GC7000Lite has no address register, so upload uniforms to a UBO object to
LOAD from.
I removed the code to check for changes to individual uniforms and just
reupload to entire uniform state when the state is dirty. I think there
was very limited benefit to it and it isn't compatible with relocs.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Jonathan Marek [Fri, 21 Jun 2019 01:01:38 +0000 (21:01 -0400)]
etnaviv: disasm: add dual16 bits, immediate decoding, and some opcodes
Also use structs from etnaviv_asm since they hold the same information.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Jonathan Marek [Fri, 28 Jun 2019 01:52:22 +0000 (21:52 -0400)]
etnaviv: asm: new features
* Dual16 bits
* Halti5 disable multiple uniform src
* write_mask compose
* Halti2+ immediates
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Jonathan Marek [Tue, 30 Jul 2019 06:48:54 +0000 (02:48 -0400)]
etnaviv: update headers from rnndb
Update to etna_viv commit
f38ba2d.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Erico Nunes [Sat, 3 Aug 2019 08:56:12 +0000 (10:56 +0200)]
lima: add summary report for shader-db
Very basic summary, loops and gpir spills:fills are not updated yet and
are only there to comply with the strings to shader-db report.py regex.
For now it can be used to analyze the impact of changes in instruction
count in both gpir and ppir.
The LIMA_DEBUG=shaderdb setting can be useful to output stats on
applications other than shader-db.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Erico Nunes [Sat, 3 Aug 2019 08:51:58 +0000 (10:51 +0200)]
lima: add support for debug callback
This adds support for glDebugMessageCallback which is required to
support shader-db reports.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Tomeu Vizoso [Tue, 6 Aug 2019 13:19:40 +0000 (15:19 +0200)]
panfrost/ci: Remove two tests from list of failures
These tests have been fixed by:
b514f411837b ("glcpp: use pre-expansion line number for __LINE__")
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Jon Turney [Fri, 2 Aug 2019 12:42:48 +0000 (13:42 +0100)]
st/dri: Move dri2_format_mapping table and it's accessors from dri2.c to dri_helpers.c
8af1990a exposed dri2_get_mapping_by_fourcc() in dri_helpers.h, so it
could be used by dri_get_egl_image(), but didn't move it. This breaks
the build in the with_dri=false case (e.g. when building for a target
which doesn't have libdrm, so swrast is only dri driver built)
Jonathan Marek [Sun, 28 Jul 2019 18:26:00 +0000 (14:26 -0400)]
glcpp: use pre-expansion line number for __LINE__
Fixes the following deqp tests:
dEQP-GLES2.functional.shaders.preprocessor.predefined_macros.line_2_*
It don't see the spec requiring this, but it seems to be better, as the
clang preprocessor for example has this behavior.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 31 Jul 2019 15:42:24 +0000 (10:42 -0500)]
anv: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D
There is an object-level preemption workaround which requires this.
However, even without object-level preemption, we seem to have issues
with geometry flickering when 3D and compute are combined in the same
batch and this appears to fix it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109630
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111267
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ian Romanick [Tue, 23 Jan 2018 09:35:51 +0000 (17:35 +0800)]
nir/algebraic: Use value range analysis to eliminate useless unary ops
Sandy Bridge is the big winner because it lies at something of a
crossroads. It supports a fairly high OpenGL version, and it still has
the old style math box. The high OpenGL version means a lot more
shaders can run on it. The old style math box means extra moves are
necessary to resolve source modifiers on operands to complex math
instructions like COS, SQRT, and RCP.
v2: Remove a couple patterns that are now redundant.
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs:
16282006 ->
16278207 (-0.02%)
instructions in affected programs: 174555 -> 170756 (-2.18%)
helped: 661
HURT: 0
helped stats (abs) min: 1 max: 36 x̄: 5.75 x̃: 3
helped stats (rel) min: 0.06% max: 23.68% x̄: 2.81% x̃: 1.94%
95% mean confidence interval for instructions value: -6.16 -5.34
95% mean confidence interval for instructions %-change: -3.02% -2.60%
Instructions are helped.
total cycles in shared programs:
367168597 ->
367134284 (<.01%)
cycles in affected programs:
1105276 ->
1070963 (-3.10%)
helped: 460
HURT: 150
helped stats (abs) min: 1 max: 568 x̄: 96.60 x̃: 82
helped stats (rel) min: 0.02% max: 32.50% x̄: 7.99% x̃: 4.27%
HURT stats (abs) min: 1 max: 901 x̄: 67.49 x̃: 39
HURT stats (rel) min: 0.07% max: 20.00% x̄: 4.90% x̃: 4.22%
95% mean confidence interval for cycles value: -65.68 -46.82
95% mean confidence interval for cycles %-change: -5.59% -4.05%
Cycles are helped.
Sandy Bridge
total instructions in shared programs:
10824272 ->
10802557 (-0.20%)
instructions in affected programs:
1237988 ->
1216273 (-1.75%)
helped: 8199
HURT: 0
helped stats (abs) min: 1 max: 41 x̄: 2.65 x̃: 2
helped stats (rel) min: 0.12% max: 20.00% x̄: 2.04% x̃: 1.73%
95% mean confidence interval for instructions value: -2.70 -2.59
95% mean confidence interval for instructions %-change: -2.07% -2.00%
Instructions are helped.
total cycles in shared programs:
154009894 ->
153843598 (-0.11%)
cycles in affected programs:
10650486 ->
10484190 (-1.56%)
helped: 4973
HURT: 1533
helped stats (abs) min: 1 max: 3904 x̄: 40.20 x̃: 20
helped stats (rel) min: 0.02% max: 41.72% x̄: 2.63% x̃: 1.67%
HURT stats (abs) min: 1 max: 453 x̄: 21.94 x̃: 8
HURT stats (rel) min: 0.02% max: 41.91% x̄: 1.54% x̃: 0.58%
95% mean confidence interval for cycles value: -28.02 -23.10
95% mean confidence interval for cycles %-change: -1.74% -1.56%
Cycles are helped.
LOST: 0
GAINED: 2
GM45 and Iron Lake had similar results. (Iron Lake shown)
total instructions in shared programs:
8135196 ->
8134888 (<.01%)
instructions in affected programs: 31920 -> 31612 (-0.96%)
helped: 169
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 1.82 x̃: 2
helped stats (rel) min: 0.43% max: 3.23% x̄: 1.23% x̃: 1.16%
95% mean confidence interval for instructions value: -2.01 -1.64
95% mean confidence interval for instructions %-change: -1.32% -1.15%
Instructions are helped.
total cycles in shared programs:
188575724 ->
188574092 (<.01%)
cycles in affected programs: 406840 -> 405208 (-0.40%)
helped: 169
HURT: 0
helped stats (abs) min: 4 max: 72 x̄: 9.66 x̃: 10
helped stats (rel) min: 0.07% max: 2.16% x̄: 0.57% x̃: 0.47%
95% mean confidence interval for cycles value: -10.72 -8.59
95% mean confidence interval for cycles %-change: -0.63% -0.50%
Cycles are helped.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Ian Romanick [Tue, 23 Jan 2018 13:11:53 +0000 (21:11 +0800)]
nir/algebraic: Use value range analysis to convert fmin to fsat
All Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs:
16297320 ->
16282006 (-0.09%)
instructions in affected programs:
2434498 ->
2419184 (-0.63%)
helped: 8091
HURT: 1
helped stats (abs) min: 1 max: 51 x̄: 1.89 x̃: 2
helped stats (rel) min: 0.04% max: 14.29% x̄: 0.98% x̃: 0.95%
HURT stats (abs) min: 7 max: 7 x̄: 7.00 x̃: 7
HURT stats (rel) min: 0.28% max: 0.28% x̄: 0.28% x̃: 0.28%
95% mean confidence interval for instructions value: -1.94 -1.85
95% mean confidence interval for instructions %-change: -0.99% -0.96%
Instructions are helped.
total cycles in shared programs:
367221624 ->
367168597 (-0.01%)
cycles in affected programs:
126409635 ->
126356608 (-0.04%)
helped: 5612
HURT: 1023
helped stats (abs) min: 1 max: 2332 x̄: 31.11 x̃: 16
helped stats (rel) min: <.01% max: 30.31% x̄: 1.69% x̃: 1.42%
HURT stats (abs) min: 1 max: 2372 x̄: 118.84 x̃: 16
HURT stats (rel) min: <.01% max: 46.98% x̄: 1.46% x̃: 0.35%
95% mean confidence interval for cycles value: -11.52 -4.46
95% mean confidence interval for cycles %-change: -1.26% -1.14%
Cycles are helped.
total spills in shared programs: 8868 -> 8870 (0.02%)
spills in affected programs: 28 -> 30 (7.14%)
helped: 0
HURT: 1
total fills in shared programs: 21903 -> 21904 (<.01%)
fills in affected programs: 42 -> 43 (2.38%)
helped: 0
HURT: 1
Haswell
total instructions in shared programs:
13353925 ->
13338728 (-0.11%)
instructions in affected programs:
2265850 ->
2250653 (-0.67%)
helped: 8127
HURT: 5
helped stats (abs) min: 1 max: 51 x̄: 1.88 x̃: 2
helped stats (rel) min: 0.04% max: 20.00% x̄: 1.13% x̃: 1.07%
HURT stats (abs) min: 5 max: 16 x̄: 9.00 x̃: 6
HURT stats (rel) min: 0.19% max: 0.52% x̄: 0.35% x̃: 0.28%
95% mean confidence interval for instructions value: -1.91 -1.83
95% mean confidence interval for instructions %-change: -1.15% -1.11%
Instructions are helped.
total cycles in shared programs:
375535444 ->
375536343 (<.01%)
cycles in affected programs:
131206582 ->
131207481 (<.01%)
helped: 5590
HURT: 1055
helped stats (abs) min: 1 max: 2844 x̄: 34.15 x̃: 16
helped stats (rel) min: <.01% max: 21.57% x̄: 2.08% x̃: 1.60%
HURT stats (abs) min: 1 max: 2487 x̄: 181.78 x̃: 21
HURT stats (rel) min: <.01% max: 40.66% x̄: 1.96% x̃: 0.37%
95% mean confidence interval for cycles value: -4.74 5.01
95% mean confidence interval for cycles %-change: -1.51% -1.37%
Inconclusive result (value mean confidence interval includes 0).
total spills in shared programs: 23401 -> 23407 (0.03%)
spills in affected programs: 248 -> 254 (2.42%)
helped: 2
HURT: 5
total fills in shared programs: 34850 -> 34845 (-0.01%)
fills in affected programs: 383 -> 378 (-1.31%)
helped: 2
HURT: 5
Ivy Bridge
total instructions in shared programs:
11975423 ->
11968117 (-0.06%)
instructions in affected programs: 845703 -> 838397 (-0.86%)
helped: 4071
HURT: 0
helped stats (abs) min: 1 max: 51 x̄: 1.79 x̃: 1
helped stats (rel) min: 0.08% max: 8.21% x̄: 1.04% x̃: 0.93%
95% mean confidence interval for instructions value: -1.87 -1.71
95% mean confidence interval for instructions %-change: -1.06% -1.02%
Instructions are helped.
total cycles in shared programs:
179674318 ->
179635552 (-0.02%)
cycles in affected programs:
5100065 ->
5061299 (-0.76%)
helped: 2650
HURT: 611
helped stats (abs) min: 1 max: 900 x̄: 21.85 x̃: 16
helped stats (rel) min: <.01% max: 21.55% x̄: 2.39% x̃: 1.40%
HURT stats (abs) min: 1 max: 1841 x̄: 31.33 x̃: 6
HURT stats (rel) min: <.01% max: 58.71% x̄: 1.64% x̃: 0.37%
95% mean confidence interval for cycles value: -14.14 -9.64
95% mean confidence interval for cycles %-change: -1.75% -1.52%
Cycles are helped.
LOST: 3
GAINED: 7
Sandy Bridge
total instructions in shared programs:
10828844 ->
10824272 (-0.04%)
instructions in affected programs: 525678 -> 521106 (-0.87%)
helped: 2386
HURT: 0
helped stats (abs) min: 1 max: 51 x̄: 1.92 x̃: 2
helped stats (rel) min: 0.11% max: 7.96% x̄: 1.05% x̃: 0.94%
95% mean confidence interval for instructions value: -2.04 -1.80
95% mean confidence interval for instructions %-change: -1.08% -1.03%
Instructions are helped.
total cycles in shared programs:
154024591 ->
154009894 (<.01%)
cycles in affected programs:
4005766 ->
3991069 (-0.37%)
helped: 1245
HURT: 506
helped stats (abs) min: 1 max: 585 x̄: 21.07 x̃: 16
helped stats (rel) min: 0.02% max: 11.57% x̄: 1.98% x̃: 0.83%
HURT stats (abs) min: 1 max: 639 x̄: 22.81 x̃: 6
HURT stats (rel) min: 0.01% max: 26.21% x̄: 1.07% x̃: 0.26%
95% mean confidence interval for cycles value: -10.57 -6.21
95% mean confidence interval for cycles %-change: -1.23% -0.97%
Cycles are helped.
GM45 and Iron Lake had similar results. (Iron Lake shown)
total instructions in shared programs:
8137248 ->
8135196 (-0.03%)
instructions in affected programs: 148322 -> 146270 (-1.38%)
helped: 992
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 2.07 x̃: 2
helped stats (rel) min: 0.41% max: 9.73% x̄: 1.74% x̃: 1.51%
95% mean confidence interval for instructions value: -2.16 -1.98
95% mean confidence interval for instructions %-change: -1.80% -1.67%
Instructions are helped.
total cycles in shared programs:
188583424 ->
188575724 (<.01%)
cycles in affected programs:
4409620 ->
4401920 (-0.17%)
helped: 956
HURT: 6
helped stats (abs) min: 2 max: 168 x̄: 8.09 x̃: 8
helped stats (rel) min: 0.04% max: 6.76% x̄: 0.27% x̃: 0.18%
HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6
HURT stats (rel) min: 0.10% max: 0.10% x̄: 0.10% x̃: 0.10%
95% mean confidence interval for cycles value: -8.41 -7.60
95% mean confidence interval for cycles %-change: -0.29% -0.25%
Cycles are helped.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Ian Romanick [Tue, 23 Jan 2018 02:00:55 +0000 (10:00 +0800)]
nir/algebraic: Use value range analysis to eliminate tautological compares
It's only one application on one platform (Haswell) that's affected,
but spills and fills increase quite dramatically. :(
All Gen8+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs:
16320850 ->
16297320 (-0.14%)
instructions in affected programs: 448012 -> 424482 (-5.25%)
helped: 1938
HURT: 0
helped stats (abs) min: 2 max: 264 x̄: 12.14 x̃: 10
helped stats (rel) min: 0.35% max: 43.75% x̄: 5.85% x̃: 5.38%
95% mean confidence interval for instructions value: -12.80 -11.48
95% mean confidence interval for instructions %-change: -5.99% -5.72%
Instructions are helped.
total cycles in shared programs:
367496943 ->
367221624 (-0.07%)
cycles in affected programs:
8557232 ->
8281913 (-3.22%)
helped: 1907
HURT: 26
helped stats (abs) min: 4 max: 12802 x̄: 147.21 x̃: 48
helped stats (rel) min: 0.03% max: 75.85% x̄: 5.55% x̃: 3.94%
HURT stats (abs) min: 4 max: 1870 x̄: 208.23 x̃: 20
HURT stats (rel) min: 0.16% max: 32.11% x̄: 8.31% x̃: 0.79%
95% mean confidence interval for cycles value: -165.38 -119.48
95% mean confidence interval for cycles %-change: -5.68% -5.04%
Cycles are helped.
LOST: 1
GAINED: 0
Haswell
total instructions in shared programs:
13374211 ->
13353925 (-0.15%)
instructions in affected programs: 349868 -> 329582 (-5.80%)
helped: 1669
HURT: 1
helped stats (abs) min: 1 max: 264 x̄: 12.57 x̃: 10
helped stats (rel) min: 0.12% max: 46.81% x̄: 6.86% x̃: 6.49%
HURT stats (abs) min: 700 max: 700 x̄: 700.00 x̃: 700
HURT stats (rel) min: 64.34% max: 64.34% x̄: 64.34% x̃: 64.34%
95% mean confidence interval for instructions value: -13.25 -11.04
95% mean confidence interval for instructions %-change: -7.01% -6.63%
Instructions are helped.
total cycles in shared programs:
375763544 ->
375535444 (-0.06%)
cycles in affected programs:
6932686 ->
6704586 (-3.29%)
helped: 1622
HURT: 48
helped stats (abs) min: 2 max: 12229 x̄: 148.31 x̃: 68
helped stats (rel) min: 0.06% max: 74.03% x̄: 5.94% x̃: 4.12%
HURT stats (abs) min: 3 max: 7451 x̄: 259.44 x̃: 41
HURT stats (rel) min: 0.05% max: 54.99% x̄: 8.52% x̃: 2.88%
95% mean confidence interval for cycles value: -159.86 -113.31
95% mean confidence interval for cycles %-change: -5.86% -5.18%
Cycles are helped.
total spills in shared programs: 23258 -> 23401 (0.61%)
spills in affected programs: 54 -> 197 (264.81%)
helped: 4
HURT: 2
total fills in shared programs: 34775 -> 34850 (0.22%)
fills in affected programs: 52 -> 127 (144.23%)
helped: 4
HURT: 1
LOST: 5
GAINED: 0
Ivy Bridge
total instructions in shared programs:
11996051 ->
11977964 (-0.15%)
instructions in affected programs: 346679 -> 328592 (-5.22%)
helped: 1508
HURT: 0
helped stats (abs) min: 2 max: 198 x̄: 11.99 x̃: 10
helped stats (rel) min: 0.26% max: 19.83% x̄: 5.73% x̃: 5.43%
95% mean confidence interval for instructions value: -12.65 -11.34
95% mean confidence interval for instructions %-change: -5.86% -5.60%
Instructions are helped.
total cycles in shared programs:
179891389 ->
179691339 (-0.11%)
cycles in affected programs:
7869479 ->
7669429 (-2.54%)
helped: 1485
HURT: 23
helped stats (abs) min: 1 max: 12615 x̄: 136.16 x̃: 54
helped stats (rel) min: 0.02% max: 71.84% x̄: 4.69% x̃: 3.49%
HURT stats (abs) min: 1 max: 403 x̄: 93.48 x̃: 6
HURT stats (rel) min: 0.04% max: 34.01% x̄: 8.68% x̃: 0.81%
95% mean confidence interval for cycles value: -154.59 -110.73
95% mean confidence interval for cycles %-change: -4.79% -4.19%
Cycles are helped.
Sandy Bridge
total instructions in shared programs:
10829247 ->
10828844 (<.01%)
instructions in affected programs: 21258 -> 20855 (-1.90%)
helped: 88
HURT: 0
helped stats (abs) min: 2 max: 17 x̄: 4.58 x̃: 5
helped stats (rel) min: 0.52% max: 3.92% x̄: 2.05% x̃: 2.21%
95% mean confidence interval for instructions value: -5.03 -4.13
95% mean confidence interval for instructions %-change: -2.21% -1.89%
Instructions are helped.
total cycles in shared programs:
154035437 ->
154024591 (<.01%)
cycles in affected programs: 430176 -> 419330 (-2.52%)
helped: 78
HURT: 10
helped stats (abs) min: 2 max: 4649 x̄: 143.06 x̃: 32
helped stats (rel) min: 0.05% max: 6.02% x̄: 2.03% x̃: 1.07%
HURT stats (abs) min: 3 max: 265 x̄: 31.30 x̃: 6
HURT stats (rel) min: 0.10% max: 8.67% x̄: 1.03% x̃: 0.21%
95% mean confidence interval for cycles value: -232.53 -13.97
95% mean confidence interval for cycles %-change: -2.13% -1.23%
Cycles are helped.
Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs:
8137402 ->
8137248 (<.01%)
instructions in affected programs: 2280 -> 2126 (-6.75%)
helped: 10
HURT: 0
helped stats (abs) min: 12 max: 19 x̄: 15.40 x̃: 15
helped stats (rel) min: 3.90% max: 11.73% x̄: 7.19% x̃: 6.95%
95% mean confidence interval for instructions value: -17.69 -13.11
95% mean confidence interval for instructions %-change: -8.99% -5.39%
Instructions are helped.
total cycles in shared programs:
188538716 ->
188583424 (0.02%)
cycles in affected programs: 69326 -> 114034 (64.49%)
helped: 0
HURT: 10
HURT stats (abs) min: 2068 max: 7686 x̄: 4470.80 x̃: 4870
HURT stats (rel) min: 27.20% max: 173.66% x̄: 69.55% x̃: 59.41%
95% mean confidence interval for cycles value: 2830.86 6110.74
95% mean confidence interval for cycles %-change: 39.18% 99.91%
Cycles are HURT.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Ian Romanick [Thu, 11 Oct 2018 21:21:42 +0000 (14:21 -0700)]
nir/algebraic: Use value range analysis to eliminate tautological compares not used by if-statements
This just eliminates tautological / contradictory compares that are used
for bcsel and other non-if-statement cases. If-statements are not
affected because removing flow control can cause the i965 instrution
scheduler to create some very long live ranges resulting in unncessary
spilling. This causes some shaders to fall of a performance cliff.
Since many small if-statements are already flattened to bcsel, this
optimization covers more than 68% of the possible cases (2417 shaders
helped for instructions on Skylake vs. 3554).
v2: Reorder and add whitespace to make the relationship between the
patterns more obvious. Suggested by Caio.
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs:
16333474 ->
16322028 (-0.07%)
instructions in affected programs: 438559 -> 427113 (-2.61%)
helped: 1765
HURT: 0
helped stats (abs) min: 1 max: 275 x̄: 6.48 x̃: 4
helped stats (rel) min: 0.20% max: 36.36% x̄: 4.07% x̃: 1.82%
95% mean confidence interval for instructions value: -6.87 -6.10
95% mean confidence interval for instructions %-change: -4.30% -3.84%
Instructions are helped.
total cycles in shared programs:
367608554 ->
367511103 (-0.03%)
cycles in affected programs:
8368829 ->
8271378 (-1.16%)
helped: 1541
HURT: 129
helped stats (abs) min: 1 max: 4468 x̄: 66.78 x̃: 39
helped stats (rel) min: 0.01% max: 45.69% x̄: 4.10% x̃: 2.17%
HURT stats (abs) min: 1 max: 973 x̄: 42.25 x̃: 10
HURT stats (rel) min: 0.02% max: 64.39% x̄: 2.15% x̃: 0.60%
95% mean confidence interval for cycles value: -64.90 -51.81
95% mean confidence interval for cycles %-change: -3.89% -3.36%
Cycles are helped.
total spills in shared programs: 8867 -> 8868 (0.01%)
spills in affected programs: 18 -> 19 (5.56%)
helped: 0
HURT: 1
total fills in shared programs: 21900 -> 21903 (0.01%)
fills in affected programs: 78 -> 81 (3.85%)
helped: 0
HURT: 1
All Gen6 and earlier platforms had similar results. (Sandy Bridge shown)
total instructions in shared programs:
10829877 ->
10829247 (<.01%)
instructions in affected programs: 30240 -> 29610 (-2.08%)
helped: 177
HURT: 0
helped stats (abs) min: 1 max: 15 x̄: 3.56 x̃: 3
helped stats (rel) min: 0.37% max: 17.39% x̄: 2.68% x̃: 1.94%
95% mean confidence interval for instructions value: -3.93 -3.18
95% mean confidence interval for instructions %-change: -3.04% -2.32%
Instructions are helped.
total cycles in shared programs:
154036580 ->
154035437 (<.01%)
cycles in affected programs: 352402 -> 351259 (-0.32%)
helped: 96
HURT: 28
helped stats (abs) min: 1 max: 128 x̄: 14.73 x̃: 6
helped stats (rel) min: 0.03% max: 24.00% x̄: 1.51% x̃: 0.46%
HURT stats (abs) min: 1 max: 117 x̄: 9.68 x̃: 4
HURT stats (rel) min: 0.03% max: 2.24% x̄: 0.43% x̃: 0.23%
95% mean confidence interval for cycles value: -13.40 -5.03
95% mean confidence interval for cycles %-change: -1.62% -0.53%
Cycles are helped.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Ian Romanick [Thu, 1 Aug 2019 18:51:36 +0000 (11:51 -0700)]
nir/range-analysis: Range tracking for ffma and flrp
A similar technique could be used for fmin3, fmax3, and fmid3.
This could be squashed with the previous commit. I kept it separate to
ease review.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Ian Romanick [Wed, 24 Jan 2018 12:23:15 +0000 (20:23 +0800)]
nir/range-analysis: Range tracking for bcsel
This could be squashed with the previous commit. I kept it separate to
ease review.
v2: Add some missing cases. Use nir_src_is_const helper. Both
suggested by Caio. Use a table for mapping source ranges to a result
range.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Ian Romanick [Sat, 2 Jun 2018 01:54:49 +0000 (18:54 -0700)]
nir/range-analysis: Tighten the range of fsat based on the range of its source
This could be squashed with the previous commit. I kept it separate to
ease review.
v2: Use a switch statement and add more comments. Both suggested by
Caio.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Ian Romanick [Tue, 23 Jan 2018 01:48:43 +0000 (09:48 +0800)]
nir/range-analysis: Rudimentary value range analysis pass
Most integer operations are omitted because dealing with integer
overflow is hard. There are a few things that could be smarter if there
was a small amount more tracking of ranges of integer types (i.e.,
operands are Boolean, operand values fit in 16 bits, etc.).
The changes to nir_search_helpers.h are included in this patch to
simplify reordering the changes to nir_opt_algebraic.py.
v2: Memoize range analysis results. Without this, some shaders appear
to get stuck in infinite loops.
v3: Rebase on many months of Mesa changes, including 1-bit Boolean
changes.
v4: Rebase on "nir: Drop imov/fmov in favor of one mov instruction".
v5: Use nir_alu_srcs_equal for detecting (a*a). Previously just the SSA
value was compared, and this incorrectly matched (a.x*a.y).
v6: Many code improvements including (but not limited to) better names,
more comments, and better use of helper functions. All suggested by
Caio. Rework the handling of several opcodes to use a table for mapping
source ranges to a result range. This change fixed a bug that caused
fmax(gt_zero, ge_zero) to be incorrectly recognized as ge_zero.
Slightly tighten the range of fmul by recognizing that x*x is gt_zero if
x is gt_zero. Add similar handling for -x*x.
v7: Use _______ in the tables as an alias for unknown. Suggested by
Caio.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Ian Romanick [Wed, 14 Feb 2018 01:44:00 +0000 (17:44 -0800)]
nir/algebraic: Simplify some comparisons like a+constant < constant
v2: Remove unsafe integer versions of the optimizations. This change
had no effect on shader-db results. Suggested by Caio.
All Gen6+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs:
16333713 ->
16332631 (<.01%)
instructions in affected programs: 258112 -> 257030 (-0.42%)
helped: 1275
HURT: 407
helped stats (abs) min: 1 max: 7 x̄: 1.17 x̃: 1
helped stats (rel) min: 0.20% max: 8.33% x̄: 1.33% x̃: 0.86%
HURT stats (abs) min: 1 max: 2 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.11% max: 2.94% x̄: 0.98% x̃: 0.98%
95% mean confidence interval for instructions value: -0.70 -0.59
95% mean confidence interval for instructions %-change: -0.84% -0.70%
Instructions are helped.
total cycles in shared programs:
367596791 ->
367601268 (<.01%)
cycles in affected programs:
3420062 ->
3424539 (0.13%)
helped: 1553
HURT: 783
helped stats (abs) min: 1 max: 742 x̄: 24.36 x̃: 6
helped stats (rel) min: 0.05% max: 21.12% x̄: 1.47% x̃: 0.65%
HURT stats (abs) min: 1 max: 557 x̄: 54.04 x̃: 14
HURT stats (rel) min: 0.01% max: 33.66% x̄: 3.36% x̃: 1.43%
95% mean confidence interval for cycles value: -1.60 5.43
95% mean confidence interval for cycles %-change: -0.03% 0.33%
Inconclusive result (value mean confidence interval includes 0).
LOST: 0
GAINED: 2
Iron Lake
total instructions in shared programs:
8137992 ->
8137874 (<.01%)
instructions in affected programs: 17501 -> 17383 (-0.67%)
helped: 104
HURT: 2
helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1
helped stats (rel) min: 0.25% max: 2.63% x̄: 0.87% x̃: 0.72%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45%
95% mean confidence interval for instructions value: -1.22 -1.00
95% mean confidence interval for instructions %-change: -0.94% -0.76%
Instructions are helped.
total cycles in shared programs:
188540038 ->
188539650 (<.01%)
cycles in affected programs: 704574 -> 704186 (-0.06%)
helped: 125
HURT: 84
helped stats (abs) min: 2 max: 96 x̄: 6.45 x̃: 4
helped stats (rel) min: <.01% max: 3.47% x̄: 0.42% x̃: 0.25%
HURT stats (abs) min: 2 max: 58 x̄: 4.98 x̃: 4
HURT stats (rel) min: 0.01% max: 2.75% x̄: 0.36% x̃: 0.33%
95% mean confidence interval for cycles value: -3.20 -0.52
95% mean confidence interval for cycles %-change: -0.19% -0.03%
Cycles are helped.
GM45
total instructions in shared programs:
5008889 ->
5008830 (<.01%)
instructions in affected programs: 8824 -> 8765 (-0.67%)
helped: 52
HURT: 1
helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1
helped stats (rel) min: 0.25% max: 2.38% x̄: 0.86% x̃: 0.72%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45%
95% mean confidence interval for instructions value: -1.27 -0.95
95% mean confidence interval for instructions %-change: -0.96% -0.71%
Instructions are helped.
total cycles in shared programs:
128969426 ->
128969128 (<.01%)
cycles in affected programs: 399798 -> 399500 (-0.07%)
helped: 74
HURT: 30
helped stats (abs) min: 2 max: 22 x̄: 6.76 x̃: 6
helped stats (rel) min: <.01% max: 1.83% x̄: 0.46% x̃: 0.29%
HURT stats (abs) min: 2 max: 58 x̄: 6.73 x̃: 6
HURT stats (rel) min: 0.06% max: 2.75% x̄: 0.42% x̃: 0.21%
95% mean confidence interval for cycles value: -4.60 -1.14
95% mean confidence interval for cycles %-change: -0.32% -0.08%
Cycles are helped.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Ian Romanick [Thu, 21 Mar 2019 21:52:58 +0000 (14:52 -0700)]
nir/algebraic: Recognize (a < 0 || 0 < b) as min(a, -b) < 0
Similar to commit
97e6c1b9 and
f5cf74d8ba8c.
First apply 0 < b => -b < 0 to get (a < 0 || -b < 0), then apply some
pre-existing rules to get min(a, -b) < 0.
v2: Substantially update the comment explaining the use of is_used_once
and the duplication of patterns. Suggested by Caio. Also, while flt
and fge are not commutative, ior and iand are. Half of the original
patterns were redundant, so delete them. As alternate justification for
deleting them, fmin(a, -b) < 0 <=> 0 < fmax(-a, b). Proof left as an
exercise for the reader.
All Gen7+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs:
16333789 ->
16333713 (<.01%)
instructions in affected programs: 11424 -> 11348 (-0.67%)
helped: 32
HURT: 0
helped stats (abs) min: 1 max: 7 x̄: 2.38 x̃: 2
helped stats (rel) min: 0.20% max: 1.67% x̄: 0.76% x̃: 0.69%
95% mean confidence interval for instructions value: -3.03 -1.72
95% mean confidence interval for instructions %-change: -0.89% -0.62%
Instructions are helped.
total cycles in shared programs:
367598295 ->
367596791 (<.01%)
cycles in affected programs: 141414 -> 139910 (-1.06%)
helped: 23
HURT: 6
helped stats (abs) min: 3 max: 386 x̄: 72.52 x̃: 20
helped stats (rel) min: 0.15% max: 4.86% x̄: 1.01% x̃: 0.76%
HURT stats (abs) min: 4 max: 88 x̄: 27.33 x̃: 12
HURT stats (rel) min: 0.22% max: 3.95% x̄: 1.08% x̃: 0.59%
95% mean confidence interval for cycles value: -93.51 -10.21
95% mean confidence interval for cycles %-change: -1.10% -0.05%
Cycles are helped.
total instructions in shared programs:
10830836 ->
10830779 (<.01%)
instructions in affected programs: 6895 -> 6838 (-0.83%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 14 x̄: 4.75 x̃: 1
helped stats (rel) min: 0.14% max: 1.61% x̄: 0.65% x̃: 0.33%
95% mean confidence interval for instructions value: -8.46 -1.04
95% mean confidence interval for instructions %-change: -1.03% -0.27%
Instructions are helped.
total cycles in shared programs:
154028477 ->
154032740 (<.01%)
cycles in affected programs: 178433 -> 182696 (2.39%)
helped: 3
HURT: 9
helped stats (abs) min: 3 max: 20 x̄: 11.00 x̃: 10
helped stats (rel) min: 0.07% max: 0.20% x̄: 0.12% x̃: 0.09%
HURT stats (abs) min: 27 max: 1415 x̄: 477.33 x̃: 262
HURT stats (rel) min: 0.22% max: 6.45% x̄: 2.49% x̃: 1.76%
95% mean confidence interval for cycles value: 28.68 681.82
95% mean confidence interval for cycles %-change: 0.37% 3.30%
Cycles are HURT.
Iron Lake
total instructions in shared programs:
8137966 ->
8137992 (<.01%)
instructions in affected programs: 3281 -> 3307 (0.79%)
helped: 0
HURT: 6
HURT stats (abs) min: 3 max: 7 x̄: 4.33 x̃: 3
HURT stats (rel) min: 0.63% max: 1.01% x̄: 0.76% x̃: 0.64%
95% mean confidence interval for instructions value: 2.17 6.50
95% mean confidence interval for instructions %-change: 0.56% 0.96%
Instructions are HURT.
total cycles in shared programs:
188539386 ->
188540038 (<.01%)
cycles in affected programs: 103826 -> 104478 (0.63%)
helped: 0
HURT: 7
HURT stats (abs) min: 16 max: 218 x̄: 93.14 x̃: 80
HURT stats (rel) min: 0.14% max: 0.95% x̄: 0.53% x̃: 0.46%
95% mean confidence interval for cycles value: 10.26 176.02
95% mean confidence interval for cycles %-change: 0.24% 0.81%
Cycles are HURT.
GM45
total instructions in shared programs:
5008876 ->
5008889 (<.01%)
instructions in affected programs: 1645 -> 1658 (0.79%)
helped: 0
HURT: 3
HURT stats (abs) min: 3 max: 7 x̄: 4.33 x̃: 3
HURT stats (rel) min: 0.63% max: 1.00% x̄: 0.76% x̃: 0.63%
total cycles in shared programs:
128968950 ->
128969426 (<.01%)
cycles in affected programs: 64854 -> 65330 (0.73%)
helped: 0
HURT: 4
HURT stats (abs) min: 18 max: 218 x̄: 119.00 x̃: 120
HURT stats (rel) min: 0.14% max: 0.95% x̄: 0.60% x̃: 0.66%
95% mean confidence interval for cycles value: -62.92 300.92
95% mean confidence interval for cycles %-change: -0.05% 1.26%
Inconclusive result (value mean confidence interval includes 0).
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>