mesa.git
5 years ago Revert "utils/u_math: break dependency on gallium/utils"
Dylan Baker [Thu, 20 Sep 2018 17:36:33 +0000 (10:36 -0700)]
Revert "utils/u_math: break dependency on gallium/utils"

This reverts commit 0abce6d7700ee42eb00c787732ec1fdefe250d03.

Which broke the windows build.

5 years ago i965: remove outdated comment about TCS passthrough
Caio Marcelo de Oliveira Filho [Tue, 18 Sep 2018 01:31:48 +0000 (18:31 -0700)]
i965: remove outdated comment about TCS passthrough

Since commit 75881bed9e1 "i965: Rework the TCS passthrough shader to
use NIR." the created nir_shader is not dummy, and it is compiled by
the backend like the others.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years ago meson: add option to statically link llvm
Christoph Haag [Mon, 17 Sep 2018 23:08:07 +0000 (01:08 +0200)]
meson: add option to statically link llvm

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago utils/u_math: break dependency on gallium/utils
Dylan Baker [Fri, 7 Sep 2018 20:09:23 +0000 (13:09 -0700)]
utils/u_math: break dependency on gallium/utils

Currently u_math needs gallium utils for cpu detection.  Most of what
u_math uses out of u_cpu_detection is duplicated in src/mesa/x86
(surprise!), so I've just reworked it as much as possible to use the
x86/common_x86_features.h macros instead of the gallium ones. The mesa
implementation is a header only approach, with no external dependencies.
There is one small function that was copied over, as promoting
u_cpu_detection is itself a fairly hefty undertaking, as it depends on
u_debug, and this fixes the bug for now.

bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107870
Tested-by: Vinson Lee <vlee@freedesktop.org>
5 years ago egl/android: rework device probing
Emil Velikov [Mon, 3 Sep 2018 12:37:47 +0000 (13:37 +0100)]
egl/android: rework device probing

Unlike the other platforms, here we aim to guess whether the device that we
somewhat arbitrarily picked is supported or not.

In particular: when a vendor is _not_ requested we loop through all
devices, picking the first one which can create a DRI screen.

When a vendor is requested - we use that and do _not_ fall-back to any
other device.

The former seems a bit fiddly, but considering EGL_EXT_explicit_device and
EGL_MESA_query_renderer are MIA, this is the best we can do for the
moment.

With those (proposed) extensions userspace will be able to create a
separate EGL display for each device, query device details and make the
conscious decision which one to use.
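The selection policy described above can be sketched as follows. This is only an illustration under stated assumptions: `struct device`, its `can_create_screen` flag (standing in for a successful DRI screen creation) and `probe_device()` are hypothetical names, not the actual egl/android code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device description; not the real Mesa structures. */
struct device {
    const char *vendor;      /* driver/vendor name, e.g. "i915" */
    bool can_create_screen;  /* stands in for "a DRI screen could be created" */
};

/*
 * Return the index of the device to use, or -1 if none is usable.
 * With a requested vendor, only that vendor's device is considered and
 * there is no fallback. Without one, the first usable device wins.
 */
static int probe_device(const struct device *devs, size_t count,
                        const char *vendor)
{
    for (size_t i = 0; i < count; i++) {
        if (vendor != NULL) {
            if (strcmp(devs[i].vendor, vendor) != 0)
                continue;
            /* Requested vendor found: use it or fail, never fall back. */
            return devs[i].can_create_screen ? (int)i : -1;
        }
        if (devs[i].can_create_screen)
            return (int)i;
    }
    return -1;
}
```

Note that a request for a vendor whose device cannot create a screen fails outright in this sketch, even if another device would have worked.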

v2:
 - update droid_open_device_drm_gralloc()
 - set the dri2_dpy->fd before using it
 - return a EGLBoolean for droid_{probe,open}_device*
 - do not warn on droid_load_driver failure (Tomasz)
 - plug mem leak on dri2_create_screen failure (Tomasz)
 - fixup function name typo (Tomasz, Rob)

v3:
 - add forward declaration for droid_load_driver()
   Fixes the HAVE_DRM_GRALLOC build (Mauro)
 - split dup() assignment and check in separate lines (Tomasz, Eric)
 - make droid_load_driver() static (Tomasz)
 - drop unused prop_set variable (Tomasz)

v4:
 - rebase
 - fwd declaration should be for droid_probe_device()

Cc: Robert Foss <robert.foss@collabora.com>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Mauro Rossi <issor.oruam@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Tested-by: Tomasz Figa <tfiga@chromium.org>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
5 years ago glsl: Add an assert when cloning ir_dereference_record with invalid field
Danylo Piliaiev [Wed, 15 Aug 2018 12:46:23 +0000 (15:46 +0300)]
glsl: Add an assert when cloning ir_dereference_record with invalid field

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years ago glsl: Avoid propagating incompatible type of initializer
Danylo Piliaiev [Wed, 15 Aug 2018 12:46:22 +0000 (15:46 +0300)]
glsl: Avoid propagating incompatible type of initializer

do_assignment validated the assignment, but when the rhs type was not
compatible it proceeded without issues and returned error_emitted = false.
On the other hand, process_initializer expected do_assignment to always
return a compatible type and never fail.

As a result, when a variable was initialized with an incompatible type,
the type of the variable changed to the incompatible one.
This manifested in unnecessary error messages and, in one case, in a crash.

Example GLSL:
 vec4 tmp = vec2(0.0);
 tmp.z -= 1.0;

Past error messages:
 initializer of type vec2 cannot be assigned to variable of type vec4
 invalid swizzle / mask `z'
 type mismatch
 operands to arithmetic operators must be numeric

After this patch:
 initializer of type vec2 cannot be assigned to variable of type vec4

In the other case, when we initialize a variable with an incompatible struct,
accessing the variable's field led to a crash. Example:
 uniform struct {float field;} data;
 ...
 vec4 tmp = data;
 tmp.x -= 1.0;

After the patch there is only an error line, without a crash:
 initializer of type #anon_struct cannot be assigned to variable of
  type vec4

Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107547

5 years ago st/dri: don't set queryDmaBufFormats/queryDmaBufModifiers if the driver does not...
Michal Srb [Thu, 15 Mar 2018 16:27:57 +0000 (17:27 +0100)]
st/dri: don't set queryDmaBufFormats/queryDmaBufModifiers if the driver does not implement it

This is equivalent to commit a65db0ad1c3, but for dri_kms_init_screen. Without
this gbm_dri_is_format_supported always returns false.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104926
Fixes: e14fe41e0bf ("st/dri: implement createImageFromRenderbuffer(2)")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Tested-by: Adam Williamson <adamwill@fedoraproject.org>
5 years ago anv/so_memcpy: Don't consider src/dst_offset when computing block size
Jason Ekstrand [Mon, 10 Sep 2018 21:36:10 +0000 (16:36 -0500)]
anv/so_memcpy: Don't consider src/dst_offset when computing block size

The only thing that matters is the size since we never specify any
offsets in terms of blocks.
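One way to picture a block size that depends only on the copy size is the sketch below. It is an assumption-laden illustration, not the anv code: `copy_block_size()` and the cap of 16 bytes used in the usage note are hypothetical.

```c
#include <stdint.h>

/*
 * Largest power-of-two block size, capped at max_bs, that evenly divides
 * size. max_bs is assumed to be a power of two. Source and destination
 * offsets are deliberately not inputs: they are never expressed in blocks.
 */
static uint32_t copy_block_size(uint32_t size, uint32_t max_bs)
{
    uint32_t bs = max_bs;
    while (bs > 1 && (size % bs) != 0)
        bs /= 2;
    return bs;
}
```

With a cap of 16, a 64-byte copy would use 16-byte blocks, while a 20-byte copy would drop to 4-byte blocks.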

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years ago Revert "mesa: only update framebuffer-state for clears"
Jakob Bornecrantz [Wed, 19 Sep 2018 14:21:26 +0000 (15:21 +0100)]
Revert "mesa: only update framebuffer-state for clears"

This reverts commit fb86365148d5b8f3f06c5e42d9c8440fc1f6693f.

5 years ago radv: use a 64-bit unsigned integer when allocating a descriptor pool
Samuel Pitoiset [Tue, 18 Sep 2018 14:18:37 +0000 (16:18 +0200)]
radv: use a 64-bit unsigned integer when allocating a descriptor pool

pool->size is a 64-bit unsigned integer too.
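The commit message does not say which failure it guards against. One plausible reason to match the 64-bit width is that a 32-bit accumulator can wrap before the value is stored. The sketch below illustrates that general hazard with hypothetical helpers (`pool_size_32`, `pool_size_64`); it is not the radv code.

```c
#include <stdint.h>

/* Computes the size in 32 bits first: the product wraps modulo 2^32. */
static uint64_t pool_size_32(uint32_t count, uint32_t stride)
{
    uint32_t size = count * stride;
    return size;
}

/* Widens before multiplying, so the full product is preserved. */
static uint64_t pool_size_64(uint32_t count, uint32_t stride)
{
    uint64_t size = (uint64_t)count * stride;
    return size;
}
```

For example, 65536 entries of 65536 bytes each wrap to zero in the first helper but are represented correctly in the second.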

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: enable VK_SUBGROUP_FEATURE_ARITHMETIC_BIT
Samuel Pitoiset [Tue, 18 Sep 2018 13:27:52 +0000 (15:27 +0200)]
radv: enable VK_SUBGROUP_FEATURE_ARITHMETIC_BIT

All CTS pass on Polaris/Vega with LLVM 6, 7 and master, so
I think it's safe to enable the feature.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: do not support blitting surfaces with depth and stencil
Samuel Pitoiset [Tue, 18 Sep 2018 13:06:42 +0000 (15:06 +0200)]
radv: do not support blitting surfaces with depth and stencil

Fixes:
dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_optimal_nearest

And all friends that try to blit a surface with different
depth and stencil formats.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago mesa: only update framebuffer-state for clears
Erik Faye-Lund [Mon, 10 Sep 2018 20:11:16 +0000 (22:11 +0200)]
mesa: only update framebuffer-state for clears

If we update the program-state etc, we risk compiling needless shaders,
which can cost quite a bit of performance.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago nir: add initializer data to fix MSVC compile error
Juan A. Suarez Romero [Wed, 19 Sep 2018 09:27:49 +0000 (11:27 +0200)]
nir: add initializer data to fix MSVC compile error

CC: Jason Ekstrand <jason@jlekstrand.net>
Fixes: 82799a5d1b8 ("nir: Add a small pass to rematerialize derefs
per-block")
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
5 years ago nir: Add some asserts that we don't put derefs in phis
Jason Ekstrand [Tue, 11 Sep 2018 18:06:01 +0000 (13:06 -0500)]
nir: Add some asserts that we don't put derefs in phis

The lcssa and phis_to_regs passes are used by various NIR optimizations
that modify the CFG.  Putting a couple of asserts will help ensure that
we don't accidentally put derefs in phis as part of an optimization
pass.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
5 years ago nir/opt_if: Re-materialize derefs in use blocks before peeling loops
Jason Ekstrand [Tue, 11 Sep 2018 17:55:45 +0000 (12:55 -0500)]
nir/opt_if: Re-materialize derefs in use blocks before peeling loops

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107879
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
5 years ago nir/loop_unroll: Re-materialize derefs in use blocks before unrolling
Jason Ekstrand [Tue, 11 Sep 2018 17:51:09 +0000 (12:51 -0500)]
nir/loop_unroll: Re-materialize derefs in use blocks before unrolling

When we're about to re-arrange a bunch of blocks, it's a good idea to
make sure that we don't have deref uses crossing block boundaries.
Otherwise we may end up with a deref going through a phi and that would
be bad.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
5 years ago nir: Add a small pass to rematerialize derefs per-block
Jason Ekstrand [Tue, 11 Sep 2018 17:15:22 +0000 (12:15 -0500)]
nir: Add a small pass to rematerialize derefs per-block

This pass re-materializes deref instructions on a per-block basis to
ensure that every deref is defined in the same block as the
instruction which uses it.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
5 years ago amd: Add Picasso device id
Kenneth Feng [Thu, 26 Jul 2018 02:53:33 +0000 (10:53 +0800)]
amd: Add Picasso device id

No changes here compared to Raven.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
5 years ago Revert "radv: fix descriptor pool allocation size"
Bas Nieuwenhuizen [Tue, 18 Sep 2018 20:46:43 +0000 (22:46 +0200)]
Revert "radv: fix descriptor pool allocation size"

This reverts commit 90819abb56f6b1a0cd4946b13b6caf24fb46e500.

This logic was wrong, the original code is correct. The direct
impact is that we allocate up to approximately a squared amount
of memory compared to what we should allocate.

Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago radv: implement VK_EXT_conservative_rasterization
Samuel Pitoiset [Mon, 17 Sep 2018 14:34:11 +0000 (16:34 +0200)]
radv: implement VK_EXT_conservative_rasterization

Only supported by GFX9+.

The conservativeraster Sascha demo seems to work as expected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: do not re-create the sampler for every blits in CmdBlitImage()
Samuel Pitoiset [Mon, 17 Sep 2018 12:57:51 +0000 (14:57 +0200)]
radv: do not re-create the sampler for every blits in CmdBlitImage()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: allow to force anisotropy via RADV_TEX_ANISO
Samuel Pitoiset [Mon, 17 Sep 2018 20:23:19 +0000 (22:23 +0200)]
radv: allow to force anisotropy via RADV_TEX_ANISO

Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago mesa: enable EXT_framebuffer_object in core profile
Timothy Arceri [Sat, 8 Sep 2018 04:20:17 +0000 (14:20 +1000)]
mesa: enable EXT_framebuffer_object in core profile

Since user defined names are not allowed in core profile
we remove the allow_user_names bool and just check if
we have a core profile like all other buffer/texture
object handling code does.

This extension is required by "Wolfenstein: The Old Blood"
and is exposed in core in the Nvidia binary driver.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago mesa: move legacy dri config option texture_depth
Timothy Arceri [Thu, 30 Aug 2018 00:19:08 +0000 (10:19 +1000)]
mesa: move legacy dri config option texture_depth

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago mesa: move legacy dri config option fthrottle_mode
Timothy Arceri [Thu, 30 Aug 2018 00:19:07 +0000 (10:19 +1000)]
mesa: move legacy dri config option fthrottle_mode

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago mesa: move legacy dri config option def_max_anisotropy
Timothy Arceri [Thu, 30 Aug 2018 00:19:06 +0000 (10:19 +1000)]
mesa: move legacy dri config option def_max_anisotropy

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago mesa: move legacy dri config option no_neg_lod_bias
Timothy Arceri [Thu, 30 Aug 2018 00:19:05 +0000 (10:19 +1000)]
mesa: move legacy dri config option no_neg_lod_bias

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago mesa: move legacy dri config option round_mode
Timothy Arceri [Thu, 30 Aug 2018 00:19:04 +0000 (10:19 +1000)]
mesa: move legacy dri config option round_mode

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago mesa: remove unused dri option float_depth
Timothy Arceri [Thu, 30 Aug 2018 00:19:03 +0000 (10:19 +1000)]
mesa: remove unused dri option float_depth

This seems to have only been used by DRI1 drivers which were
removed with e4344161bde2.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago mesa: move legacy dri config option dither_mode
Timothy Arceri [Thu, 30 Aug 2018 00:19:02 +0000 (10:19 +1000)]
mesa: move legacy dri config option dither_mode

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago mesa: move legacy dri config option color_reduction
Timothy Arceri [Thu, 30 Aug 2018 00:19:01 +0000 (10:19 +1000)]
mesa: move legacy dri config option color_reduction

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago mesa: move legacy TCL dri config options
Timothy Arceri [Thu, 30 Aug 2018 00:19:00 +0000 (10:19 +1000)]
mesa: move legacy TCL dri config options

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago util: use force_compat_profile for Wolfenstein The Old Blood
Timothy Arceri [Wed, 12 Sep 2018 00:52:07 +0000 (10:52 +1000)]
util: use force_compat_profile for Wolfenstein The Old Blood

This game is looking for some odd extensions after creating a core
context, such as ARB_vertex_program and EXT_framebuffer_object.

Rather than enabling these in core, this forces the game to use
compat. This allows the game to run and seems to work without
issues. All other id tech games/engines use a compat profile.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago mesa/st: add force_compat_profile option to driconfig
Timothy Arceri [Wed, 12 Sep 2018 00:52:06 +0000 (10:52 +1000)]
mesa/st: add force_compat_profile option to driconfig

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago Revert "radeonsi: avoid syncing the driver thread in si_fence_finish"
Timothy Arceri [Wed, 12 Sep 2018 10:50:34 +0000 (20:50 +1000)]
Revert "radeonsi: avoid syncing the driver thread in si_fence_finish"

This reverts commit bc65dcab3bc48673ff6180afb036561a4b8b1119.

This was manually reverted. Reverting stops the menu hanging in
some id tech games such as RAGE and Wolfenstein The New Order.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107891

5 years ago v3d: Switch from FLUSH_ALL_STATE to FLUSH for ending our bin CLs.
Eric Anholt [Thu, 13 Sep 2018 19:56:18 +0000 (12:56 -0700)]
v3d: Switch from FLUSH_ALL_STATE to FLUSH for ending our bin CLs.

The HW for FLUSH_ALL_STATE isn't validated, since the closed driver only
uses FLUSH.  Now that we don't have any new state at the end of our bin
CLs, follow their lead.

5 years ago v3d: Stop clearing the OQ state at the end of the job.
Eric Anholt [Thu, 13 Sep 2018 19:59:13 +0000 (12:59 -0700)]
v3d: Stop clearing the OQ state at the end of the job.

Ever since we added OQ support, we've been clearing OQ state at the start
of the job anyway.  We're intentionally breaking old-and-new-driver-mix
systems, because we need to stop using the unvalidated FLUSH_ALL_STATE.

5 years ago v3d: Always emit a TF disable at the start of drawing on V3D 4.x.
Eric Anholt [Thu, 13 Sep 2018 19:54:26 +0000 (12:54 -0700)]
v3d: Always emit a TF disable at the start of drawing on V3D 4.x.

The HW's FLUSH_ALL_STATE is not validated, so we probably shouldn't use
it, meaning that we need to reset state at the start.  By doing this, we
also make ourselves more resilient to another client leaving the TF state
enabled at the end of their batch (as we now do, ourselves).

However, we still need to emit a single TF disable at the end of the
frame, for SWVC5-718.

5 years ago build: Don't overlink gallium xlib target
Dylan Baker [Mon, 17 Sep 2018 17:17:48 +0000 (10:17 -0700)]
build: Don't overlink gallium xlib target

Currently gallium's xlib target will fail to link due to multiple
definitions of all the symbols in libmesautil. This only shows up in
autotools, and not in meson, due to differences in the way that meson and
autotools handle linking static archives into static archives. Autotools
uses -Wl,--whole-archive implicitly; meson requires this behavior to be
opted into. The solution is just to remove libmesautil from the
libgl-xlib target, since it will get all of those symbols from
libmesagallium.

I've dropped the link from meson as well, it doesn't seem to hurt
anything and should make linking just a little faster.

Fixes: 8396043f304bb2a752130230055605c5c966e89f
       ("Replace uses of _mesa_bitcount with util_bitcount")
bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107923
Tested-by: Brian Paul <brianp@vmware.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Cc: Sergii Romantsov <sergii.romantsov@globallogic.com>
5 years ago move pthread_setaffinity_np check to the build system
Dylan Baker [Thu, 13 Sep 2018 18:06:09 +0000 (11:06 -0700)]
move pthread_setaffinity_np check to the build system

Rather than trying to encode all of the rules in a header, let's just put
them in the build system where they belong. This fixes the build on
FreeBSD, which does have pthread_setaffinity_np, but it's in
pthread_np.h, not behind _GNU_SOURCE. FreeBSD also implements cpu_set
slightly differently, so additional changes would be required to get it
working right there anyway.

v2: - fix #define in autotools

Fixes: 9f1bbbdbbd77d346c74c7abbb31f399151a85713
       ("util: try to fix the Android and MacOS build")
Cc: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years ago mesa: FramebufferParameteri parameter checking
Fritz Koenig [Fri, 14 Sep 2018 18:40:49 +0000 (11:40 -0700)]
mesa: FramebufferParameteri parameter checking

Missing break; causes parameter checking to
never pass GL_FRAMEBUFFER_FLIP_Y_MESA parameters.
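The class of bug described here, a missing break that lets a valid case fall into the error path, can be reproduced in a few lines. The enum and both functions below are hypothetical stand-ins, not the Mesa validation code.

```c
#include <stdbool.h>

/* Hypothetical parameter names for illustration only. */
enum param { PARAM_WIDTH, PARAM_FLIP_Y, PARAM_UNKNOWN };

/* Buggy: PARAM_FLIP_Y is listed as valid but falls into the default case. */
static bool validate_buggy(enum param p)
{
    switch (p) {
    case PARAM_WIDTH:
        break;
    case PARAM_FLIP_Y:
        /* fall through */
    default:
        return false;   /* reached for PARAM_FLIP_Y as well */
    }
    return true;
}

/* Fixed: the added break lets PARAM_FLIP_Y pass validation. */
static bool validate_fixed(enum param p)
{
    switch (p) {
    case PARAM_WIDTH:
        break;
    case PARAM_FLIP_Y:
        break;
    default:
        return false;
    }
    return true;
}
```

In the buggy variant the parameter can never be accepted, which matches the symptom the commit message describes.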

Fixes: 318c265160 ("mesa: GL_MESA_framebuffer_flip_y extension [v4]")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
5 years ago mesa: Additional FlipY applications
Fritz Koenig [Mon, 10 Sep 2018 19:11:16 +0000 (12:11 -0700)]
mesa: Additional FlipY applications

Instances where direction was determined based on
winsys or user fbo and should be determined based on
FlipY.

Key STATE_FB_WPOS_Y_TRANSFORM off of FlipY instead of
_mesa_is_user_fbo.  This corrects gl_FragCoord usage
when applying GL_MESA_framebuffer_flip_y.

Fixes: ab05dd183cc ("i965: implement GL_MESA_framebuffer_flip_y [v3]")
Reviewed-by: Brian Paul <brianp@vmware.com>
5 years ago radv: Use build ID if available for cache UUID.
Bas Nieuwenhuizen [Sun, 16 Sep 2018 00:50:34 +0000 (02:50 +0200)]
radv: Use build ID if available for cache UUID.

To get a useful UUID for systems that have a non-useful mtime
for the binaries.

I started using SHA1 to ensure we get reasonable mixing in the
various possibilities and the various build id lengths.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years ago radv: enable shaderInt16 capability
Samuel Pitoiset [Fri, 14 Sep 2018 10:52:40 +0000 (12:52 +0200)]
radv: enable shaderInt16 capability

Not sure if this is all wired up. CTS does pass and the Tangrams
demo works fine on Vega. There are corruption issues on Polaris
but not sure if that is related to 16-bit support.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac: add 16-bit support to ac_build_bitfield_reverse()
Samuel Pitoiset [Fri, 14 Sep 2018 10:52:39 +0000 (12:52 +0200)]
ac: add 16-bit support to ac_build_bitfield_reverse()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac: add 16-bit support to ac_build_bit_count()
Samuel Pitoiset [Fri, 14 Sep 2018 10:52:38 +0000 (12:52 +0200)]
ac: add 16-bit support to ac_build_bit_count()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac: add 16-bit support to ac_find_lsb()
Samuel Pitoiset [Fri, 14 Sep 2018 10:52:37 +0000 (12:52 +0200)]
ac: add 16-bit support to ac_find_lsb()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac: add 16-bit support to ac_build_umsb()
Samuel Pitoiset [Fri, 14 Sep 2018 10:52:36 +0000 (12:52 +0200)]
ac: add 16-bit support to ac_build_umsb()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac: add 16-bit support to ac_build_isign()
Samuel Pitoiset [Fri, 14 Sep 2018 10:52:35 +0000 (12:52 +0200)]
ac: add 16-bit support to ac_build_isign()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac: add 16-bit constant values for zero and one
Samuel Pitoiset [Fri, 14 Sep 2018 10:52:34 +0000 (12:52 +0200)]
ac: add 16-bit constant values for zero and one

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac: add ac_build_bifield_reverse() helper
Samuel Pitoiset [Fri, 14 Sep 2018 10:52:33 +0000 (12:52 +0200)]
ac: add ac_build_bifield_reverse() helper

Are we missing 64-bit support?

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac: add ac_build_bit_count() helper
Samuel Pitoiset [Fri, 14 Sep 2018 10:52:32 +0000 (12:52 +0200)]
ac: add ac_build_bit_count() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: fix use of unreachable() in the meta blit path
Samuel Pitoiset [Mon, 17 Sep 2018 09:18:06 +0000 (11:18 +0200)]
radv: fix use of unreachable() in the meta blit path

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years ago Revert "radv: Optimize rebinding the same descriptor set."
Samuel Pitoiset [Mon, 17 Sep 2018 09:20:57 +0000 (11:20 +0200)]
Revert "radv: Optimize rebinding the same descriptor set."

This introduces random GPU hangs on Vega, at least.

This reverts commit 02a43edf186cb9998741ba765cb948bb238a122d.

5 years ago radv: fix descriptor pool allocation size
Samuel Pitoiset [Fri, 14 Sep 2018 12:56:38 +0000 (14:56 +0200)]
radv: fix descriptor pool allocation size

The size has to be multiplied by the number of sets.

This gets rid of the OUT_OF_POOL_KHR error and fixes
a crash with the Tangrams demo.

CC: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago anv/query: Add an emit_srm helper
Jason Ekstrand [Fri, 14 Sep 2018 22:21:44 +0000 (17:21 -0500)]
anv/query: Add an emit_srm helper

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years ago anv: Add a mi_memset and use it for zeroing queries
Jason Ekstrand [Fri, 14 Sep 2018 22:06:48 +0000 (17:06 -0500)]
anv: Add a mi_memset and use it for zeroing queries

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years ago anv/query: Use anv_address everywhere
Jason Ekstrand [Fri, 14 Sep 2018 22:02:08 +0000 (17:02 -0500)]
anv/query: Use anv_address everywhere

Instead of passing around BOs and offsets, use addresses which are anv's
GPU equivalent of pointers.
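As a rough picture of what "GPU equivalent of pointers" means, the sketch below bundles a buffer object and an offset into one value that can be advanced like a pointer. `struct buffer`, `struct address`, `address_add()` and `address_resolve()` are hypothetical names chosen for illustration, not the anv definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for a buffer object with a known GPU base address. */
struct buffer {
    uint64_t gpu_base;
};

/* A buffer plus an offset, passed around as a single value. */
struct address {
    struct buffer *bo;
    uint32_t offset;
};

/* Pointer-style arithmetic: advance the address by delta bytes. */
static struct address address_add(struct address addr, uint32_t delta)
{
    addr.offset += delta;
    return addr;
}

/* Resolve to a flat GPU address once the buffer's base is known. */
static uint64_t address_resolve(struct address addr)
{
    return addr.bo->gpu_base + addr.offset;
}
```

Passing one value instead of a (BO, offset) pair makes it harder to update one half and forget the other.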

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years ago anv/query: Write both dwords in emit_zero_queries
Jason Ekstrand [Fri, 14 Sep 2018 21:34:22 +0000 (16:34 -0500)]
anv/query: Write both dwords in emit_zero_queries

Each query slot is a uint64_t and we were only zeroing half of it.
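The bug class is easy to see when 64-bit slots are cleared through 32-bit writes, as a command streamer would do. The following is a CPU-side sketch with a hypothetical `zero_slots()` helper, not the anv command emission code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Zero num_slots 64-bit slots, viewed as pairs of 32-bit dwords.
 * Writing only the first dword of each pair would leave the other half
 * of the 64-bit value untouched.
 */
static void zero_slots(uint32_t *dwords, size_t num_slots)
{
    for (size_t i = 0; i < num_slots; i++) {
        dwords[2 * i + 0] = 0;  /* first dword of the slot */
        dwords[2 * i + 1] = 0;  /* second dword, the half that was missed */
    }
}
```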

Fixes: 7ec6e4e68980 "anv/query: implement multiview interactions"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years ago anv/query: Increment an index while writing results
Jason Ekstrand [Fri, 14 Sep 2018 20:07:36 +0000 (15:07 -0500)]
anv/query: Increment an index while writing results

Instead of computing an index at the end which we hope maps to the
number of things written, just count the number of things as we go.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years ago i965/fs: Don't propagate conditional modifiers from integer compares to adds
Ian Romanick [Thu, 13 Sep 2018 00:16:50 +0000 (17:16 -0700)]
i965/fs: Don't propagate conditional modifiers from integer compares to adds

No shader-db changes on any Intel platform... which probably explains
why no bugs have been bisected to this problem since it landed in Mesa
18.1. :( The commit mentioned below is in 18.2, so 18.1 would need a
slightly different fix (due to code refactoring).

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes: 77f269bb560 "i965/fs: Refactor propagation of conditional modifiers from compares to adds"
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (reviewed the original patch)
Cc: Matt Turner <mattst88@gmail.com> (reviewed the original patch)
5 years ago radv: Only allow 16 user SGPRs for compute on GFX9+.
Bas Nieuwenhuizen [Sun, 16 Sep 2018 10:28:33 +0000 (12:28 +0200)]
radv: Only allow 16 user SGPRs for compute on GFX9+.

Apparently for compute there are only 16 instead of the 32 for the
graphics path.

Fixes dEQP-VK.binding_model.descriptorset_random.sets16.noarray.ubolimitlow.sbolimitlow.imglimitlow.noiub.comp.0

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago radv: Set the user SGPR MSB for Vega.
Bas Nieuwenhuizen [Sun, 16 Sep 2018 10:17:00 +0000 (12:17 +0200)]
radv: Set the user SGPR MSB for Vega.

Otherwise using 32 user SGPRs would be broken.

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago radv: Optimize rebinding the same descriptor set.
Bas Nieuwenhuizen [Sun, 16 Sep 2018 00:17:32 +0000 (02:17 +0200)]
radv: Optimize rebinding the same descriptor set.

This makes it cheaper to just change the dynamic offsets with
the same descriptor sets.

Suggested-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago r600/sb: use safe math optimizations when TGSI contains precise operations
Gert Wollny [Fri, 14 Sep 2018 14:56:48 +0000 (16:56 +0200)]
r600/sb: use safe math optimizations when TGSI contains precise operations

Fixes:
  dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_3
  dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_3
  dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_3

Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years ago android: broadcom/cle: export the broadcom top level path headers
Mauro Rossi [Sun, 26 Aug 2018 21:38:12 +0000 (23:38 +0200)]
android: broadcom/cle: export the broadcom top level path headers

Fixes the following building error in vc4 build:

In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_render_cl.c:34:
In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_drv.h:27:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_simulator_validate.h:34:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_context.h:39:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_cl.h:56:
gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h:12:10:
fatal error: 'cle/v3d_packet_helpers.h' file not found
         ^~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
5 years ago android: broadcom/cle: add gallium include path
Mauro Rossi [Sun, 26 Aug 2018 21:11:02 +0000 (23:11 +0200)]
android: broadcom/cle: add gallium include path

Fixes the following building error:

In file included from external/mesa/src/broadcom/cle/v3d_decoder.c:38:
In file included from external/mesa/src/broadcom/cle/v3d_packet_helpers.h:29:
external/mesa/src/gallium/auxiliary/util/u_math.h:42:10:
fatal error: 'pipe/p_compiler.h' file not found
         ^~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
5 years ago android: broadcom/genxml: fix collision with intel/genxml header-gen macro
Mauro Rossi [Sat, 25 Aug 2018 16:17:23 +0000 (18:17 +0200)]
android: broadcom/genxml: fix collision with intel/genxml header-gen macro

Fixes the following building error, happening when building both intel and broadcom:

Gen Header: libmesa_broadcom_genxml_32 <= v3d_packet_v21_pack.h
FAILED: gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h
/bin/bash -c "python external/mesa/src/broadcom/cle/gen_pack_header.py \
external/mesa/src/broadcom/cle/v3d_packet_v21.xml \
> gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h"
Traceback (most recent call last):
  File "external/mesa/src/broadcom/cle/gen_pack_header.py", line 626, in <module>
    p = Parser(sys.argv[2])
IndexError: list index out of range

The header-gen macro is already defined by the Intel genxml build rules,
and the existing header-gen does not have the $(PRIVATE_VER) argument;
in fact, the bash command line logged in the build error is missing
exactly the $(PRIVATE_VER) argument.

Renaming the macro to pack-header-gen in src/broadcom/Android.genxml.mk
solves the build error; another possible way is to keep the gen rule
commands expanded and not use the macros.

Fixes: 7f80a9ff13 ("vc4: Introduce XML-based packet header generation like Intel's.")
Cc: "18.2" <mesa-stable@lists.freedesktop.org>
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Mauro Rossi <issor.oruam@gmail.com>
5 years agoanv/memcpy: fix build after starting to use addresses
Caio Marcelo de Oliveira Filho [Sat, 15 Sep 2018 03:53:22 +0000 (20:53 -0700)]
anv/memcpy: fix build after starting to use addresses

The offsets now come from the anv_address, but these references were not
updated and were still using the old variable.

Fixes: e1ab8345574 "anv/memcpy: Use addresses instead of bo+offset"
Tested-by: Clayton Craft <clayton.a.craft@intel.com>
5 years agoanv/cmd_buffer: Take an address in emit_lrm
Jason Ekstrand [Tue, 11 Sep 2018 19:59:02 +0000 (14:59 -0500)]
anv/cmd_buffer: Take an address in emit_lrm

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoanv/memcpy: Use addresses instead of bo+offset
Jason Ekstrand [Mon, 10 Sep 2018 21:43:34 +0000 (16:43 -0500)]
anv/memcpy: Use addresses instead of bo+offset

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoanv/so_memcpy: Use the correct SO_BUFFER size on gen8+
Jason Ekstrand [Mon, 10 Sep 2018 21:37:17 +0000 (16:37 -0500)]
anv/so_memcpy: Use the correct SO_BUFFER size on gen8+

This shouldn't matter as we'll never write OOB anyway but we may as well
get it right.  It's supposed to be in dwords - 1.
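
As a minimal sketch of the "dwords - 1" encoding mentioned above: the
helper name below is hypothetical and is not taken from the driver; it
only shows the arithmetic, assuming the size is a non-zero multiple of
four bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a driver function): encode a buffer size
 * given in bytes as "number of dwords minus one". */
static inline uint32_t
size_in_dwords_minus_one(uint32_t size_bytes)
{
   /* Assumption: the size is a non-zero multiple of 4 bytes. */
   assert(size_bytes >= 4 && size_bytes % 4 == 0);
   return size_bytes / 4 - 1;
}
```

With this encoding a 4-byte buffer is written as 0 and a 64-byte buffer
as 15.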

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
5 years agoac: fix get_image_coords() for radeonsi
Timothy Arceri [Fri, 27 Jul 2018 05:32:36 +0000 (15:32 +1000)]
ac: fix get_image_coords() for radeonsi

Because this was setting image to true, we would end up calling
si_load_image_desc() when we should be calling
si_load_sampler_desc().

This fixes an assert() in Deus Ex: MD

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agogallium/util: don't let child processes inherit our thread affinity
Marek Olšák [Thu, 13 Sep 2018 00:29:19 +0000 (20:29 -0400)]
gallium/util: don't let child processes inherit our thread affinity

v2: corrected the comment

5 years agogallium/util: start with a random L3 cache index for AMD Zen
Marek Olšák [Fri, 7 Sep 2018 19:37:45 +0000 (15:37 -0400)]
gallium/util: start with a random L3 cache index for AMD Zen

5 years agost/mesa: Validate the result of pipe_transfer_map in make_texture (v2)
Josh Pieper [Mon, 10 Sep 2018 02:03:27 +0000 (22:03 -0400)]
st/mesa: Validate the result of pipe_transfer_map in make_texture (v2)

When using Freecad, I was getting intermittent segfaults inside of
mesa.  I traced it down to this path in st_cb_drawpixels.c where the
result of pipe_transfer_map wasn't being checked.  In my case, it was
returning NULL because nouveau_bo_new returned ENOENT.  I'm by no
means a mesa developer, but this patch solves the problem for me and
seems reasonable enough.

v2: Marek - also unmap the PBO and release the texture, and call
    the make_texture function sooner for less cleanup
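
The pattern of the fix, sketched outside of Mesa: check the pointer
returned by a mapping call before writing through it. Everything below
(map_fn, upload_pixels, map_ok, map_fails) is a hypothetical stand-in
and is not the gallium API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a mapping call that may fail with NULL. */
typedef void *(*map_fn)(void *resource);

/* Copy size bytes into the mapped resource. Returns false without
 * touching memory if the mapping failed, so the caller can release
 * whatever it has already acquired. */
static bool
upload_pixels(void *resource, map_fn map, const void *src, size_t size)
{
   void *dest = map(resource);
   if (!dest)
      return false;

   memcpy(dest, src, size);
   return true;
}

/* Two toy mappers for illustration: one succeeds, one always fails. */
static void *map_ok(void *resource) { return resource; }
static void *map_fails(void *resource) { (void)resource; return NULL; }
```

Doing the check as early as possible keeps the failure path short,
which matches the "less cleanup" remark in the v2 note.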

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
5 years agoradv: emit the initial config only once in the preambles
Samuel Pitoiset [Thu, 13 Sep 2018 10:30:21 +0000 (12:30 +0200)]
radv: emit the initial config only once in the preambles

It shouldn't be necessary to emit the initial graphics or compute
state when beginning a new command buffer. Emitting them in
the preamble should be enough and this will reduce IB sizes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: fix setting global locations for indirect descriptors
Samuel Pitoiset [Wed, 12 Sep 2018 13:40:09 +0000 (15:40 +0200)]
radv: fix setting global locations for indirect descriptors

Indirect descriptors only need one entry, so we don't have to
emit a location for every descriptor.

Fixes GPU hangs with new CTS:
dEQP-VK.binding_model.descriptorset_random.*

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: fix flushing indirect descriptors
Samuel Pitoiset [Wed, 12 Sep 2018 13:40:08 +0000 (15:40 +0200)]
radv: fix flushing indirect descriptors

Let's say we first bind a graphics pipeline that needs indirect
descriptor sets. The userdata pointers will be emitted at draw
time. Then, if we bind a compute pipeline that doesn't need any
indirect descriptors, the driver will re-emit them for all
graphics stages.

To avoid this, just check the bind point type.

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: fix GPU hangs with 32-bit indirect descriptors
Samuel Pitoiset [Wed, 12 Sep 2018 13:40:07 +0000 (15:40 +0200)]
radv: fix GPU hangs with 32-bit indirect descriptors

LLVM 6 isn't affected.

Fixes GPU hangs with new CTS:
dEQP-VK.binding_model.descriptorset_random.*

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: handle loc->indirect correctly for the first descriptor
Samuel Pitoiset [Wed, 12 Sep 2018 13:40:06 +0000 (15:40 +0200)]
radv: handle loc->indirect correctly for the first descriptor

This was wrong for descriptor #0 when all of them are indirect.
This is because indirect_offset was 0 and we emitted a
"normal" descriptor pointer for nothing.

While we are at it, remove
radv_userdata_info::indirect_offset, which is useless.

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: bump the maximum number of arguments to 64
Samuel Pitoiset [Wed, 12 Sep 2018 13:40:05 +0000 (15:40 +0200)]
radv: bump the maximum number of arguments to 64

Bumping to 64 should be safe enough.

Fixes some crashes with new CTS:
dEQP-VK.binding_model.descriptorset_random.*

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: tidy up ac_setup_rings() for the GSVS rings
Samuel Pitoiset [Thu, 13 Sep 2018 13:58:02 +0000 (15:58 +0200)]
radv: tidy up ac_setup_rings() for the GSVS rings

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: fix setting the number of entries for GSVS on VI+
Samuel Pitoiset [Thu, 13 Sep 2018 13:58:01 +0000 (15:58 +0200)]
radv: fix setting the number of entries for GSVS on VI+

According to RadeonSI, it's unnecessary to multiply by
the stride. That field seems to always be 64.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: always compute the number of components from the output mask
Samuel Pitoiset [Thu, 13 Sep 2018 13:58:00 +0000 (15:58 +0200)]
radv: always compute the number of components from the output mask

That removes two special cases for clip/cull distances.
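
A minimal sketch of deriving a component count from a write mask,
assuming one bit per written component. The function name is
hypothetical; the driver may use a different bit-count helper.

```c
#include <stdint.h>

/* Hypothetical helper: count the components enabled in a mask by
 * counting its set bits. */
static inline unsigned
num_components_from_mask(uint32_t output_mask)
{
   unsigned count = 0;

   while (output_mask) {
      output_mask &= output_mask - 1; /* clear the lowest set bit */
      count++;
   }
   return count;
}
```

For example, a mask of 0x5 (x and z written) gives 2 components, so no
output needs its own hard-coded count.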

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: emit data contiguously in the GS->VS ring buffer
Samuel Pitoiset [Thu, 13 Sep 2018 13:57:59 +0000 (15:57 +0200)]
radv: emit data contiguously in the GS->VS ring buffer

Instead of having holes. The other ring parameters like
offset and stride can be updated later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: make use of the output usage mask in GS copy shader
Samuel Pitoiset [Thu, 13 Sep 2018 13:57:58 +0000 (15:57 +0200)]
radv: make use of the output usage mask in GS copy shader

This is just for consistency because LLVM can detect and
remove unused loads.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: improve a comment in si_emit_set_predication_state()
Samuel Pitoiset [Wed, 12 Sep 2018 21:20:39 +0000 (23:20 +0200)]
radv: improve a comment in si_emit_set_predication_state()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: fix VK_EXT_conditional_rendering visibility
Samuel Pitoiset [Wed, 12 Sep 2018 21:20:38 +0000 (23:20 +0200)]
radv: fix VK_EXT_conditional_rendering visibility

It's actually just the opposite.

This fixes the new Sascha conditionalrender demo.

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: make use of ac_unpack_param() instead of ac_build_bfe()
Samuel Pitoiset [Thu, 13 Sep 2018 14:36:45 +0000 (16:36 +0200)]
radv: make use of ac_unpack_param() instead of ac_build_bfe()

The same code is generated because LLVM ends up using bfe, but
that seems cleaner to me.
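
For reference, a bit-field extract amounts to a shift followed by a
mask. The sketch below uses a hypothetical name and signature; it is
not the ac_unpack_param() or ac_build_bfe() interface.

```c
#include <stdint.h>

/* Hypothetical helper: extract bitwidth bits of value, starting at
 * bit rshift. Assumes rshift < 32 and bitwidth >= 1. */
static inline uint32_t
unpack_field(uint32_t value, unsigned rshift, unsigned bitwidth)
{
   /* Avoid the undefined shift by 32 when the whole word is wanted. */
   uint32_t mask = bitwidth >= 32 ? 0xffffffffu : (1u << bitwidth) - 1u;

   return (value >> rshift) & mask;
}
```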

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agonir: add loop unroll support for complex wrapper loops
Timothy Arceri [Wed, 11 Jul 2018 00:53:41 +0000 (10:53 +1000)]
nir: add loop unroll support for complex wrapper loops

In GLSL IR we cheat with switch statements and simply convert them
into loops with a single iteration. This allows us to make use of
the existing jump instruction handling provided by the loop handling
code. It also allows dead code to be cleaned up once we have
wrapped the code in a loop.

However, using loops in this way created loops that previously could
not be unrolled, which limits further optimisations. Here we provide a
way to unroll loops that end in a break and have multiple other exits.
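
To illustrate the loop shape in question in plain C rather than NIR
(function names are made up for the example): the first function is a
switch written as a wrapper loop that ends in an unconditional break
and has two other exits, the second is what it looks like once the
single iteration has been unrolled.

```c
/* A switch expressed as a loop that runs at most once. */
static int
classify_with_wrapper_loop(int x)
{
   int result = 0;

   for (;;) {
      if (x == 0) {
         result = 10;
         break;         /* early exit, like a "case 0: ... break;" */
      }
      if (x == 1) {
         result = 20;
         break;         /* second early exit */
      }
      result = 30;      /* default case */
      break;            /* the loop always ends in a break */
   }
   return result;
}

/* The same logic after unrolling: no loop is left. */
static int
classify_unrolled(int x)
{
   if (x == 0)
      return 10;
   if (x == 1)
      return 20;
   return 30;
}
```

Both functions return the same value for every input; only the second
form is free of loop control flow.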

All shader-db changes are from the Dolphin uber shaders. There is a
small number of HURT shaders, but in general the improvements far
exceed the HURT.

shader-db results IVB:

total instructions in shared programs: 10018187 -> 10016468 (-0.02%)
instructions in affected programs: 104080 -> 102361 (-1.65%)
helped: 36
HURT: 15

total cycles in shared programs: 220065064 -> 154529655 (-29.78%)
cycles in affected programs: 126063017 -> 60527608 (-51.99%)
helped: 51
HURT: 0

total loops in shared programs: 2515 -> 2308 (-8.23%)
loops in affected programs: 903 -> 696 (-22.92%)
helped: 51
HURT: 0

total spills in shared programs: 4370 -> 4124 (-5.63%)
spills in affected programs: 1397 -> 1151 (-17.61%)
helped: 9
HURT: 12

total fills in shared programs: 4581 -> 4419 (-3.54%)
fills in affected programs: 2201 -> 2039 (-7.36%)
helped: 9
HURT: 15

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: propagates if condition evaluation down some alu chains
Timothy Arceri [Mon, 23 Jul 2018 04:38:34 +0000 (14:38 +1000)]
nir: propagates if condition evaluation down some alu chains

v2:
 - only allow nir_op_inot or nir_op_b2i when alu input is 1.
 - use some helpers as suggested by Jason.

v3:
 - evaluate alu op for single input alu ops
 - add helper function to decide if to propagate through alu
 - make use of nir_before_src in another spot

shader-db IVB results:

total instructions in shared programs: 9993483 -> 9993472 (-0.00%)
instructions in affected programs: 1300 -> 1289 (-0.85%)
helped: 11
HURT: 0

total cycles in shared programs: 219476091 -> 219476059 (-0.00%)
cycles in affected programs: 7675 -> 7643 (-0.42%)
helped: 10
HURT: 1

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: evaluate if condition uses inside the if branches
Timothy Arceri [Wed, 29 Aug 2018 15:24:43 +0000 (10:24 -0500)]
nir: evaluate if condition uses inside the if branches

Since we know which side of the branch we ended up on, we can just
replace the use with a constant.
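
The idea, sketched in plain C rather than NIR with made-up function
names: inside the then-branch the condition is known to be true, and
inside the else-branch it is known to be false, so each use can be
replaced by the corresponding constant.

```c
#include <stdbool.h>

/* Before: the condition is used again inside both branches. */
static int
select_before(bool cond, int a, int b)
{
   if (cond)
      return a + (int)cond;   /* cond is known to be true here */
   else
      return b + (int)cond;   /* cond is known to be false here */
}

/* After: each use of cond is replaced by the constant for its branch. */
static int
select_after(bool cond, int a, int b)
{
   if (cond)
      return a + 1;
   else
      return b + 0;
}
```

Once the uses are constants, later passes can fold expressions such as
b + 0 away.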

All the spill changes in shader-db are from Dolphin uber shaders,
despite some small regressions the change is clearly positive.

V2: insert new constant after any phis in the
    use->parent_instr->type == nir_instr_type_phi path.

v3:
 - use nir_after_block_before_jump() for inserting const
 - check dominance of phi uses correctly

v4:
 - create some helpers as suggested by Jason.

v5 (Jason Ekstrand):
 - Use LIST_ENTRY to get the phi src

shader-db results IVB:

total instructions in shared programs: 9999201 -> 9993483 (-0.06%)
instructions in affected programs: 163235 -> 157517 (-3.50%)
helped: 132
HURT: 2

total cycles in shared programs: 231670754 -> 219476091 (-5.26%)
cycles in affected programs: 143424120 -> 131229457 (-8.50%)
helped: 115
HURT: 24

total spills in shared programs: 4383 -> 4370 (-0.30%)
spills in affected programs: 1656 -> 1643 (-0.79%)
helped: 9
HURT: 18

total fills in shared programs: 4610 -> 4581 (-0.63%)
fills in affected programs: 374 -> 345 (-7.75%)
helped: 6
HURT: 0

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agovirgl: adjust strides when mapping temp-resources
Erik Faye-Lund [Wed, 12 Sep 2018 07:48:41 +0000 (09:48 +0200)]
virgl: adjust strides when mapping temp-resources

When we're mapping temp-resources, we clip the resource to the
transfer-box, which means the stride might not be correct any more.

So let's update the stride from the temp-resource, and recompute the
layer-stride.
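
A rough sketch of the recomputation, under the assumption that the
temp-resource is tightly packed to the clipped box and uses an
uncompressed format. The struct and function names are hypothetical
and are not the virgl code.

```c
#include <stdint.h>

/* Hypothetical layout description for a mapped transfer. */
struct transfer_layout {
   uint32_t stride;        /* bytes between consecutive rows */
   uint32_t layer_stride;  /* bytes between consecutive layers */
};

/* Assumption: rows of the clipped box are stored back to back, so the
 * stride follows from the box width and the layer stride from the
 * stride and the box height. */
static struct transfer_layout
layout_for_clipped_box(uint32_t box_width, uint32_t box_height,
                       uint32_t bytes_per_pixel)
{
   struct transfer_layout layout;

   layout.stride = box_width * bytes_per_pixel;
   layout.layer_stride = layout.stride * box_height;
   return layout;
}
```

A 16x8 box of 4-byte pixels then has a stride of 64 bytes and a layer
stride of 512 bytes, regardless of how wide the original resource was.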

This fixes crashes when running dEQP with --deqp-gl-config-name=rgba8888d24s8ms4

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: a8987b88ff1 "virgl: add driver for virtio-gpu 3D (v2)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
5 years agonvir: Always split 64-bit IMAD/IMUL operations
Pierre Moreau [Mon, 4 Dec 2017 23:51:04 +0000 (00:51 +0100)]
nvir: Always split 64-bit IMAD/IMUL operations

Those operations do not map to actual hardware instructions, so
they should always be lowered to 32-bit instructions.
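
For illustration only, this is how a 64-bit multiply can be built from
32-bit pieces in C. It is a generic decomposition with a hypothetical
function name, not the lowering code used by the compiler.

```c
#include <stdint.h>

/* Compute the low 64 bits of a * b using 32-bit halves. */
static uint64_t
mul64_via_32(uint64_t a, uint64_t b)
{
   uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
   uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

   /* 32x32 -> 64 product of the low halves (a mul-lo/mul-hi pair). */
   uint64_t lo_product = (uint64_t)a_lo * b_lo;

   /* The cross terms only reach the upper 32 bits of the result, so
    * they can be computed modulo 2^32. a_hi * b_hi is shifted out
    * entirely and is not needed. */
   uint32_t cross = a_lo * b_hi + a_hi * b_lo;

   uint32_t hi = (uint32_t)(lo_product >> 32) + cross;

   return ((uint64_t)hi << 32) | (uint32_t)lo_product;
}
```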

Fixes: 009c54aa7af "nv50/ir: Split 64-bit integer MAD/MUL operations"
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
5 years agost/vdpau: Use output buffer as back buffer with 24-bit color only
Leo Liu [Fri, 7 Sep 2018 13:26:08 +0000 (09:26 -0400)]
st/vdpau: Use output buffer as back buffer with 24-bit color only

Using an output buffer with 8-bit video RGB as the back buffer
does not work with a 30-bit color depth visual.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
5 years agovl/dri: add color depth to vl winsys
Leo Liu [Mon, 10 Sep 2018 20:02:29 +0000 (16:02 -0400)]
vl/dri: add color depth to vl winsys

For VDPAU use later

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
5 years agovl/dri3: add support for 10 bits format
Leo Liu [Mon, 10 Sep 2018 19:33:28 +0000 (15:33 -0400)]
vl/dri3: add support for 10 bits format

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>