Gert Wollny [Mon, 12 Nov 2018 11:34:26 +0000 (12:34 +0100)]
mesa: Reference count shaders that are used by transform feedback objects
Transform feedback objects may hold a pointer to a shader program, and
at least in Gallium, this must be a valid pointer until
ctx->Driver.EndTransformFeedback in glEndTransformFeedback has been called
- which conforms with the spec that any program that is part of a
current rendering state should only be flagged for deletion by glDeleteProgram.
This was not handled properly for the transform feedback objects so that
a call sequence
glUseProgram(x)
glBeginTransformFeedback(...)
glPauseTransformFeedback(...)
glDeleteProgram(x)
glEndTransformFeedback(...)
would result in a use-after-free bug. With this patch the transform
feedback object also updates the reference count of the program in use,
thereby keeping the program valid as long as the transform feedback
object links to it.
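A minimal sketch of the reference-counting idea, not the actual Mesa code: the structs, the function names and the programs_freed counter are hypothetical stand-ins. The feedback object takes its own reference when feedback begins, so a delete in between only flags the program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the program and transform feedback objects. */
struct program {
    int  refcount;        /* number of holders, including the name table */
    bool delete_pending;  /* set by the glDeleteProgram-like call */
};

struct xfb_object {
    struct program *program;  /* program captured when feedback begins */
};

static int programs_freed = 0;  /* instrumentation for this sketch only */

static struct program *program_create(void)
{
    struct program *p = calloc(1, sizeof(*p));
    if (p)
        p->refcount = 1;  /* reference held by the (imaginary) name table */
    return p;
}

static void program_unref(struct program *p)
{
    if (p && --p->refcount == 0) {
        free(p);
        programs_freed++;
    }
}

/* Begin: the feedback object takes its own reference. */
static void xfb_begin(struct xfb_object *obj, struct program *p)
{
    p->refcount++;
    obj->program = p;
}

/* Delete: flags the program and drops only the name table's reference. */
static void program_delete(struct program *p)
{
    p->delete_pending = true;
    program_unref(p);
}

/* End: a driver hook could still read obj->program here; release afterwards. */
static void xfb_end(struct xfb_object *obj)
{
    program_unref(obj->program);
    obj->program = NULL;
}
```

Running the call sequence above against this sketch, the program is freed in xfb_end() rather than in program_delete().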
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108713
Fixes: 654587696b4234d09a6b471b70e9629cf2887c27
mesa: add end_transform_feedback() helper
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Samuel Pitoiset [Mon, 12 Nov 2018 08:46:14 +0000 (09:46 +0100)]
radv: set optimal OVERWRITE_COMBINER_WATERMARK on GFX9
Ported from RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 12 Nov 2018 20:59:29 +0000 (21:59 +0100)]
radv: set PA.SC_CONSERVATIVE_RASTERIZATION.NULL_SQUAD_AA_MASK_ENABLE
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 12 Nov 2018 10:37:20 +0000 (11:37 +0100)]
radv: binding streamout buffers doesn't change context regs
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Plamena Manolova [Sun, 11 Nov 2018 20:30:09 +0000 (22:30 +0200)]
nir: Don't lower the local work group size if it's variable.
If the local work group size is variable it won't be available
at compile time so we can't lower it in nir_lower_system_values().
Signed-off-by: Plamena Manolova <plamena.n.manolova@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Matt Turner [Sun, 11 Nov 2018 21:44:41 +0000 (13:44 -0800)]
util/ralloc: Make sizeof(linear_header) a multiple of 8
Prior to this patch sizeof(linear_header) was 20 bytes in a
non-debug build on 32-bit platforms. We do some pointer arithmetic to
calculate the next available location with
ptr = (linear_size_chunk *)((char *)&latest[1] + latest->offset);
in linear_alloc_child(). The &latest[1] adds 20 bytes, so an allocation
would only be 4-byte aligned.
On 32-bit SPARC a 'sttw' instruction (which stores a consecutive pair of
4-byte registers to memory) requires an 8-byte aligned address. Such an
instruction is used to store to an 8-byte integer type, like intmax_t
which is used in glcpp's expression_value_t struct.
As a result of the 4-byte alignment returned by linear_alloc_child() we
would generate a SIGBUS (unaligned exception) on SPARC.
According to the GNU libc manual malloc() always returns memory that has
at least an alignment of 8 bytes [1]. I think our allocator should do
the same.
So, simple fix with two parts:
(1) Increase SUBALLOC_ALIGNMENT to 8 unconditionally.
(2) Mark linear_header with an aligned attribute, which will cause
its sizeof to be rounded up to that alignment. (We already do
this for ralloc_header)
With this done, all Mesa's unit tests now pass on SPARC.
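The effect of part (2) can be illustrated with a small sketch. The struct names and fields below are hypothetical (five 4-byte fields standing in for the 20-byte header of a 32-bit build); the aligned attribute is the GCC/Clang spelling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 20-byte header: five 4-byte fields. */
struct header_plain {
    uint32_t offset;
    uint32_t size;
    uint32_t next;
    uint32_t latest;
    uint32_t parent;
};

/* Same fields; the aligned attribute rounds sizeof up to a multiple of 8. */
struct header_aligned {
    uint32_t offset;
    uint32_t size;
    uint32_t next;
    uint32_t latest;
    uint32_t parent;
} __attribute__((aligned(8)));

/* Byte distance from the start of a header to the first byte after it,
 * which is what (char *)&hdr[1] - (char *)hdr evaluates to. */
static size_t payload_start_plain(void)
{
    return sizeof(struct header_plain);
}

static size_t payload_start_aligned(void)
{
    return sizeof(struct header_aligned);
}
```

With the plain header the payload starts 20 bytes in, so it is only 4-byte aligned; with the attribute it starts 24 bytes in.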
[1] https://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html
Fixes: 47e17586924f ("glcpp: use the linear allocator for most objects")
Bug: https://bugs.gentoo.org/636326
Reviewed-by: Eric Anholt <eric@anholt.net>
Matt Turner [Sun, 11 Nov 2018 21:36:29 +0000 (13:36 -0800)]
util/ralloc: Switch from DEBUG to NDEBUG
The debug code is all asserts, so protect it with the same thing that
controls assert.
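A rough sketch of the pattern, with hypothetical names (struct header, HEADER_CANARY): a debug-only field and the assert that reads it are guarded by the same NDEBUG macro, so they appear and disappear together.

```c
#include <assert.h>

#define HEADER_CANARY 0x5A1106u  /* hypothetical marker value */

struct header {
#ifndef NDEBUG
    unsigned canary;  /* only present when assert() is active */
#endif
    unsigned size;
};

static void header_init(struct header *h, unsigned size)
{
#ifndef NDEBUG
    h->canary = HEADER_CANARY;
#endif
    h->size = size;
}

static unsigned header_size(const struct header *h)
{
    /* Under NDEBUG the assert expands to nothing, so the missing
     * field is never referenced. */
    assert(h->canary == HEADER_CANARY);
    return h->size;
}
```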
Reviewed-by: Eric Anholt <eric@anholt.net>
Timothy Arceri [Thu, 8 Nov 2018 04:45:34 +0000 (15:45 +1100)]
nir: add support for removing redundant stores to copy prop var
For example the following type of thing is seen in TCS from
a number of Vulkan and DXVK games:
vec1 32 ssa_557 = deref_var &oPatch (shader_out float)
vec1 32 ssa_558 = intrinsic load_deref (ssa_557) ()
vec1 32 ssa_559 = deref_var &oPatch@42 (shader_out float)
vec1 32 ssa_560 = intrinsic load_deref (ssa_559) ()
vec1 32 ssa_561 = deref_var &oPatch@43 (shader_out float)
vec1 32 ssa_562 = intrinsic load_deref (ssa_561) ()
intrinsic store_deref (ssa_557, ssa_558) (1) /* wrmask=x */
intrinsic store_deref (ssa_559, ssa_560) (1) /* wrmask=x */
intrinsic store_deref (ssa_561, ssa_562) (1) /* wrmask=x */
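As a toy model of the idea (not the NIR pass itself): within one block, a store is redundant when it writes back the value the variable is already known to hold. The instruction struct, MAX_VARS and the function name are hypothetical, and variables are assumed not to alias.

```c
#include <assert.h>
#include <stddef.h>

enum op { OP_LOAD, OP_STORE };

/* Toy instruction: LOAD defines SSA value `ssa` from variable `var`;
 * STORE writes SSA value `ssa` to variable `var`. */
struct instr {
    enum op op;
    int var;
    int ssa;
};

#define MAX_VARS 16

/* Drops stores that write back the value a variable already holds.
 * Compacts `code` in place and returns the new instruction count. */
static size_t remove_redundant_stores(struct instr *code, size_t n)
{
    int known[MAX_VARS];  /* known[v] = SSA value held by v, or -1 */
    size_t out = 0;

    for (int v = 0; v < MAX_VARS; v++)
        known[v] = -1;

    for (size_t i = 0; i < n; i++) {
        struct instr in = code[i];
        assert(in.var >= 0 && in.var < MAX_VARS);

        if (in.op == OP_STORE) {
            if (known[in.var] == in.ssa)
                continue;          /* value is already there: drop it */
            known[in.var] = in.ssa;
        } else {
            known[in.var] = in.ssa;  /* loaded SSA names the current value */
        }
        code[out++] = in;
    }
    return out;
}
```

Fed the three loads and three stores listed above, this model keeps only the loads.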
No shader-db changes on i965 (SKL).
vkpipeline-db results RADV (VEGA):
Totals from affected shaders:
SGPRS: 7832 -> 7728 (-1.33 %)
VGPRS: 6476 -> 6740 (4.08 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 469572 -> 456596 (-2.76 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 989 -> 960 (-2.93 %)
Wait states: 0 -> 0 (0.00 %)
The Max Waves and VGPRS changes here are misleading. What is
happening is a bunch of TCS outputs are being optimised away as
they are now recognised as unused. This results in more varyings
being compacted via nir_compact_varyings() which can result in
more register pressure when they are not packed in an optimal way.
This is an existing problem independent of this patch. I've run
some benchmarks and haven't noticed any performance regressions
in affected games.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Timothy Arceri [Wed, 7 Nov 2018 03:29:18 +0000 (14:29 +1100)]
anv/i965: make use of nir_link_constant_varyings()
shader-db results for SKL:
total instructions in shared programs: 13106498 -> 13091573 (-0.11%)
instructions in affected programs: 1186244 -> 1171319 (-1.26%)
helped: 6186
HURT: 0
total cycles in shared programs: 332062633 -> 331961653 (-0.03%)
cycles in affected programs: 8537165 -> 8436185 (-1.18%)
helped: 5371
HURT: 862
LOST: 6
GAINED: 14
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Eric Anholt [Mon, 29 Oct 2018 16:28:00 +0000 (09:28 -0700)]
egl: Improve the debugging of gbm format matching in DRI configs.
Previously the debug would be:
libEGL debug: No DRI config supports native format 0x20203852
libEGL debug: No DRI config supports native format 0x38385247
but
libEGL debug: No DRI config supports native format R8
libEGL debug: No DRI config supports native format GR88
is a lot easier to understand.
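The hex values are FourCC codes, so the readable names fall out of the bytes directly. A small sketch, assuming a hypothetical helper name and assuming that trailing padding spaces are trimmed (the real helper may keep them):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: decode a FourCC code into a printable name.
 * `out` must have room for 5 bytes (4 characters plus the terminator). */
static const char *fourcc_to_name(uint32_t fourcc, char out[5])
{
    /* The least significant byte is the first character. */
    for (int i = 0; i < 4; i++)
        out[i] = (char)((fourcc >> (8 * i)) & 0xff);
    out[4] = '\0';

    /* Short codes are padded with spaces; trim them (an assumption here). */
    for (int i = 3; i >= 0 && out[i] == ' '; i--)
        out[i] = '\0';

    return out;
}
```

The caller supplies the small buffer, for example a char name[5] on the stack.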
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Eric Anholt [Fri, 2 Nov 2018 21:35:06 +0000 (14:35 -0700)]
gbm: Introduce a helper function for printing GBM format names.
This requires that the caller make a little (stack) allocation to store
the string.
v2: Use gbm_format_canonicalize (suggested by Daniel)
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Eric Anholt [Thu, 8 Nov 2018 17:57:32 +0000 (09:57 -0800)]
gbm: Move gbm_format_canonicalize() to the core.
I want it for the format name debugging code.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Dylan Baker [Fri, 9 Nov 2018 20:56:00 +0000 (12:56 -0800)]
meson: fix libatomic tests
There are two problems:
1) the extra underscore in MISSING_64BIT_ATOMICS
2) we should link with libatomic if the previous test decided we needed
it
Fixes: d1992255bb29054fa51763376d125183a9f602f3
("meson: Add build Intel "anv" vulkan driver")
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
Marek Olšák [Fri, 9 Nov 2018 21:47:46 +0000 (16:47 -0500)]
mesa: mark GL_SR8_EXT non-renderable on GLES
Fixes: dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.sr8_ext
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Mon, 12 Nov 2018 20:43:58 +0000 (15:43 -0500)]
st/mesa: disable L3 thread pinning
This implementation can have massive drawbacks.
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Christian Gmeiner [Sat, 1 Sep 2018 19:15:27 +0000 (21:15 +0200)]
nir: add lowering for ffloor
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Alyssa Rosenzweig [Sun, 11 Nov 2018 19:09:40 +0000 (11:09 -0800)]
util: Fix warning in u_cpu_detect on non-x86
regs is only set and used on x86; on other platforms (like ARM), this
code causes a trivial warning, solved by moving the regs declaration to
the architecture-dependent usage.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Dylan Baker [Fri, 9 Nov 2018 21:27:56 +0000 (13:27 -0800)]
meson: Don't set -Wall
meson does this for you with its warn levels, so we don't need to set
it ourselves.
Fixes: d1992255bb29054fa51763376d125183a9f602f3
("meson: Add build Intel "anv" vulkan driver")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Rob Clark [Sun, 11 Nov 2018 15:41:52 +0000 (10:41 -0500)]
freedreno/drm: fix unused 'entry' warnings
Looks like importing libdrm_freedreno into mesa crossed paths with
e27902a2613.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Lionel Landwerlin [Thu, 8 Nov 2018 17:26:36 +0000 (17:26 +0000)]
i965: add support for sampling from AYUV
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Lionel Landwerlin [Thu, 8 Nov 2018 17:28:20 +0000 (17:28 +0000)]
dri: add AYUV format
v2: Add an AYUV entry in the android backend (Tapani)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Lionel Landwerlin [Thu, 8 Nov 2018 16:28:20 +0000 (16:28 +0000)]
nir/lower_tex: Add AYUV lowering support
Byte ordering is :
0: V
1: U
2: Y
3: A
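For illustration only, the byte ordering above maps to a texel like this. The struct and function names are hypothetical, and the texel is taken as a byte array to sidestep endianness.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical unpacked AYUV texel. */
struct ayuv {
    uint8_t a, y, u, v;
};

/* Unpack one 4-byte AYUV texel using the byte ordering listed above. */
static struct ayuv ayuv_unpack(const uint8_t px[4])
{
    struct ayuv c;
    c.v = px[0];
    c.u = px[1];
    c.y = px[2];
    c.a = px[3];
    return c;
}
```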
v2: Split refactoring of alpha channel (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (v1)
Acked-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
Lionel Landwerlin [Fri, 9 Nov 2018 10:33:37 +0000 (10:33 +0000)]
nir/lower_tex: add alpha channel parameter for yuv lowering
We're about to introduce AYUV support which provides its own alpha
channel. So give alpha as a parameter and set it to 1 on existing
formats.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Samuel Pitoiset [Thu, 8 Nov 2018 13:48:31 +0000 (14:48 +0100)]
radv: make use of num_good_cu_per_sh in si_emit_graphics() too
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 8 Nov 2018 13:00:36 +0000 (14:00 +0100)]
radv: clean up setting partial_es_wave for distributed tess on VI
Only needed when the pipeline actually uses tessellation. I don't
think that changes anything, except improving readability.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 8 Nov 2018 13:00:35 +0000 (14:00 +0100)]
radv: cleanup and document a Hawaii bug with offchip buffers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Hanno Böck [Wed, 7 Nov 2018 08:01:42 +0000 (09:01 +0100)]
glsl/test: Fix use after free in test_optpass.
The variable state is free'd and afterwards state->error is used
as the return value, resulting in a use after free bug detected
by memory safety tools like address sanitizer.
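The general shape of such a fix, as a sketch with a hypothetical struct and function name rather than the actual test_optpass code: read the field into a local before freeing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the parse state. */
struct parse_state {
    bool error;
};

static int finish(struct parse_state *state)
{
    /* Buggy order (do not do this):
     *     free(state);
     *     return state->error;   // reads freed memory
     */
    int status = state->error ? EXIT_FAILURE : EXIT_SUCCESS;
    free(state);
    return status;
}
```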
Signed-off-by: Hanno Böck <hanno@hboeck.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108636
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Timothy Arceri [Mon, 12 Nov 2018 02:25:27 +0000 (13:25 +1100)]
nir: don't pack varyings ints with floats unless flat
Fixes: 1c9c42d16b4c ("nir: add varying component packing helpers")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Timothy Arceri [Mon, 12 Nov 2018 02:24:42 +0000 (13:24 +1100)]
nir: add glsl_type_is_integer() helper
Fixes: 1c9c42d16b4c ("nir: add varying component packing helpers")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Francisco Jerez [Thu, 8 Nov 2018 22:03:24 +0000 (14:03 -0800)]
intel/fs: Prevent emission of IR instructions not aligned to their own execution size.
This can occur during payload setup of SIMD-split send message
instructions, which can lead to the emission of header setup
instructions with a non-zero channel group and fixed SIMD width. Such
instructions could end up using undefined channel enable signals
except they don't care since they're always marked force_writemask_all.
Not known to affect correctness of any workload at this point, but it
would be trivial to back-port to stable if something comes up.
Reported-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: Sagar Ghuge <sagar.ghuge@intel.com>
Timothy Arceri [Fri, 9 Nov 2018 22:20:10 +0000 (09:20 +1100)]
st/mesa: make use of nir_link_constant_varyings()
Shader-db results radeonsi (VEGA):
Totals from affected shaders:
SGPRS: 161464 -> 161368 (-0.06 %)
VGPRS: 86904 -> 86292 (-0.70 %)
Spilled SGPRs: 296 -> 314 (6.08 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 3618596 -> 3573852 (-1.24 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 26189 -> 26276 (0.33 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Eric Anholt <eric@anholt.net>
Timothy Arceri [Thu, 8 Nov 2018 22:24:11 +0000 (09:24 +1100)]
nir: add new linking opt nir_link_constant_varyings()
This pass moves constant outputs to the consuming shader stage
where possible.
Reviewed-by: Eric Anholt <eric@anholt.net>
Andre Heider [Tue, 6 Nov 2018 08:27:14 +0000 (09:27 +0100)]
st/nine: clean up thread shutdown sequence a bit
Just break out of the loop instead, it does the same thing.
Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
Andre Heider [Tue, 6 Nov 2018 08:27:13 +0000 (09:27 +0100)]
st/nine: plug thread related leaks
Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
Andre Heider [Tue, 6 Nov 2018 08:27:12 +0000 (09:27 +0100)]
st/nine: fix stack corruption due to ABI mismatch
This fixes various crashes and hangs when using nine's 'thread_submit'
feature.
On 64bit, the thread function's data argument would just be NULL.
On 32bit, the data argument would be garbage depending on the compiler
flags (in my case -march>=core2).
Fixes: f3fa7e3068512d ("st/nine: Use WINE thread for threadpool")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Axel Davy <davyaxel0@gmail.com>
Marek Olšák [Fri, 2 Nov 2018 20:09:13 +0000 (16:09 -0400)]
radeonsi: stop command submission with PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET only
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Fri, 2 Nov 2018 20:08:26 +0000 (16:08 -0400)]
gallium: add PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Tue, 30 Oct 2018 00:48:33 +0000 (20:48 -0400)]
radeonsi: don't set the CB clear color registers for 0/1 clear colors on Raven2
and add has_dcc_constant_encode.
Marek Olšák [Tue, 30 Oct 2018 00:46:48 +0000 (20:46 -0400)]
radeonsi: use better DCC clear codes
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Marek Olšák [Tue, 6 Nov 2018 22:11:55 +0000 (17:11 -0500)]
ac/surface: remove the overallocation workaround for Vega12
not needed anymore (probably since the tile_swizzle fix)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Lionel Landwerlin [Fri, 9 Nov 2018 16:49:09 +0000 (16:49 +0000)]
intel/aub_read: remove useless breaks
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Erik Faye-Lund [Fri, 9 Nov 2018 16:39:25 +0000 (17:39 +0100)]
Revert "mesa: expose NV_conditional_render on GLES"
This reverts commit 5213be9fab72548c799b30e320dd1b257534f096.
Erik Faye-Lund [Fri, 9 Nov 2018 16:39:22 +0000 (17:39 +0100)]
Revert "mesa/main: fixup make check after NV_conditional_render for gles"
This reverts commit cccd7a253f9ed14ea748a222f58b0e5c895eb939.
Erik Faye-Lund [Fri, 9 Nov 2018 15:16:13 +0000 (16:16 +0100)]
mesa/main: fixup make check after NV_conditional_render for gles
It seems I missed some details when exposing NV_conditional_render
on GLES; this fixes up "make check".
Fixes: 5213be9fab7 ("mesa: expose NV_conditional_render on GLES")
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@intel.com>
Nicolai Hähnle [Wed, 7 Nov 2018 11:10:21 +0000 (12:10 +0100)]
radv: include LLVM IR in the VK_AMD_shader_info "disassembly"
Helpful for debugging compiler backend problems: this allows us to
easily retrieve the LLVM IR from RenderDoc.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Erik Faye-Lund [Thu, 1 Nov 2018 12:28:25 +0000 (13:28 +0100)]
mesa: expose NV_conditional_render on GLES
The extension spec has been updated to include GLES 2 support, so let's
enable it there.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Iago Toral Quiroga [Wed, 31 Oct 2018 11:18:34 +0000 (12:18 +0100)]
nir/constant_folding: fix incorrect bit-size check
nir_alu_type_get_type_size takes a type as parameter and we were
passing a bit-size instead, which did what we wanted by accident,
since a bit-size of zero matches nir_type_invalid, which has a
size of 0 too.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Iago Toral Quiroga [Wed, 17 Oct 2018 10:05:42 +0000 (12:05 +0200)]
intel/compiler: fix node interference of simd16 instructions
SIMD16 instructions need to have additional interferences to prevent
source / destination hazards when the source and destination registers
are off by one register.
While we already have code to handle this, it was only running for SIMD16
dispatches; however, we can have SIMD16 instructions in a SIMD8 dispatch.
An example of this are pull constant loads since commit b56fa830c6095,
but there are more cases.
This fixes a number of CTS test failures found in work-in-progress
tests that were hitting this situation for 16-wide pull constants
in a SIMD8 program.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Roland Scheidegger [Thu, 8 Nov 2018 01:52:47 +0000 (02:52 +0100)]
gallivm: fix improper clamping of vertex index when fetching gs inputs
Because we only have one file_max for the (2d) gs input file, the value
actually represents the max of attrib and vertex index (although I'm
not entirely sure if we really want the max, since the max valid value
of the vertex dimension can be easily deduced from the input primitive).
Thus in cases where the number of inputs is higher than the number of
vertices per prim, we did not properly clamp the vertex index, which
would result in out-of-bound fetches, potentially causing segfaults
(the segfaults seemed actually difficult to trigger, but valgrind
certainly wasn't happy). This might have happened even if the shader
did not actually try to fetch bogus vertices, if the fetching happened
in non-active conditional clauses.
To fix simply use the correct max vertex index value (derived from
the input prim type) instead when clamping for this case.
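Roughly, the clamp bound comes from the primitive type rather than from the shared file_max. A sketch with a hypothetical enum and function names (the real code generates LLVM IR for this):

```c
#include <assert.h>

/* Hypothetical enum of geometry shader input primitive types. */
enum gs_input_prim {
    GS_POINTS,
    GS_LINES,
    GS_TRIANGLES,
    GS_LINES_ADJACENCY,
    GS_TRIANGLES_ADJACENCY
};

/* Highest valid vertex index for one input primitive. */
static unsigned max_vertex_index(enum gs_input_prim prim)
{
    switch (prim) {
    case GS_POINTS:              return 0;
    case GS_LINES:               return 1;
    case GS_TRIANGLES:           return 2;
    case GS_LINES_ADJACENCY:     return 3;
    case GS_TRIANGLES_ADJACENCY: return 5;
    }
    return 0;
}

/* Clamp against the per-primitive bound, not the attribute count. */
static unsigned clamp_vertex_index(unsigned index, enum gs_input_prim prim)
{
    unsigned max = max_vertex_index(prim);
    return index > max ? max : index;
}
```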
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Aditya Swarup [Thu, 1 Nov 2018 00:12:40 +0000 (17:12 -0700)]
i965: Lift restriction in external textures for EGLImage support
Fixes Skqp's unitTest_EGLImageTest test.
For Intel platforms, we support external textures only for EGLImages
created with EGL_EXT_image_dma_buf_import. This restriction seems to
be Intel specific and not present for other platforms.
While running SKQP test - unitTest_EGLImageTest, GL_INVALID is sent
to the test because of this restriction.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105301
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Ian Romanick [Thu, 1 Nov 2018 20:50:14 +0000 (13:50 -0700)]
glsl: Add pragma to disable all warnings
Use #pragma warning(off) and #pragma warning(on) to disable or enable
all warnings. This is a big hammer. If we ever need a smaller hammer,
we can enhance this functionality.
There is one lame thing about this. Because we parse everything, create
an AST, then convert the AST to GLSL IR, we have to treat the #pragma
like a statement. This means that you can't do something like
' void
' #pragma warning(off)
' __foo
' #pragma warning(on)
' (float param0);
Fixing that would, as far as I can tell, require a huge amount of work.
I did try just handling the #pragma during parsing (like we do for
state for the whole shader.
v2: Fix the #pragma lines in the commit message that git-commit ate.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick [Thu, 1 Nov 2018 20:47:58 +0000 (13:47 -0700)]
glsl: Add warning tests for identifiers with __
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Wed, 7 Nov 2018 21:47:18 +0000 (15:47 -0600)]
intel/fs: Add an assert to optimize_frontfacing_ternary
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Sat, 20 Oct 2018 17:25:31 +0000 (12:25 -0500)]
anv: Use nir_src_is_const and friends in lowering code
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Sat, 20 Oct 2018 17:21:46 +0000 (12:21 -0500)]
intel/analyze_ubo_ranges: Use nir_src_is_const and friends
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Sat, 20 Oct 2018 15:28:51 +0000 (10:28 -0500)]
intel/vec4: Use the new nir_src_is_const and friends
As of this commit, all uses of const sources either go through a
nir_src_as_<type> helper which handles bit sizes correctly or else are
accompanied by a nir_src_bit_size() == 32 assertion to assert that we
have the size we think we have.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Wed, 7 Nov 2018 23:47:45 +0000 (17:47 -0600)]
nir: Add a read_mask helper for ALU instructions
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Sat, 20 Oct 2018 14:55:28 +0000 (09:55 -0500)]
intel/fs: Use the new nir_src_is_const and friends
As of this commit, all uses of const sources either go through a
nir_src_as_<type> helper which handles bit sizes correctly or else are
accompanied by a nir_src_bit_size() == 32 assertion to assert that we
have the size we think we have.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Sat, 20 Oct 2018 15:05:33 +0000 (10:05 -0500)]
intel/fs,vec4: Clean up a repeated pattern with SSBOs
Everywhere we handle SSBO intrinsics, we have exactly the same pattern
for computing the index so we may as well make a helper for it. We also
add a get_nir_src_imm to vec4 and use it for SSBO offsets.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Samuel Pitoiset [Thu, 8 Nov 2018 10:16:45 +0000 (11:16 +0100)]
radv: fix GPU hangs when loading depth/stencil clear values on SI/CIK
HTILE is supported on these chips, not sure how I missed that.
This restores using PFP_SYNC_ME when LOAD_CONTEXT_REG is not used.
Fixes: f425d9ee74 ("radv: use LOAD_CONTEXT_REG when loading fast clear values")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Samuel Pitoiset [Wed, 7 Nov 2018 21:05:31 +0000 (22:05 +0100)]
radv: use LOAD_CONTEXT_REG when loading fast clear values
This avoids syncing the Micro Engine. This is only supported
for VI+ currently. There is probably a way for using
LOAD_CONTEXT_REG on previous chips but that could be done later.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Wed, 7 Nov 2018 16:06:27 +0000 (17:06 +0100)]
radv: only expose VK_SUBGROUP_FEATURE_ARITHMETIC_BIT for VI+
Inclusive and exclusive scans are missing because older chips
don't have llvm.amdgcn.update.dpp.
This fixes crashes with dEQP-VK.subgroups.arithmetic.*.
CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Adam Jackson [Tue, 7 Aug 2018 20:55:37 +0000 (16:55 -0400)]
glx: Demand success from CreateContext requests (v2)
GLXCreate{,New}Context, like most X resource creation requests, does not
emit a reply and therefore is emitted into the X stream asynchronously.
However, unlike most resource creation requests, the GLXContext we
return is a handle to library state instead of an XID. So if context
creation fails for any reason - say, the server doesn't support indirect
contexts - then we will fail in strange places for strange reasons.
We could make every GLX entrypoint robust against half-created contexts,
or we could just verify that context creation worked. Reuse the
__glXIsDirect code to do this, as a cheap way of verifying that the
XID is real.
glXCreateContextAttribsARB solves this by using the _checked version of
the xcb command, so effectively this change makes the classic context
creation paths as robust as CreateContextAttribs.
v2: Better use of Bool, check that error != NULL first (Olivier Fourdan)
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Karol Herbst [Wed, 7 Nov 2018 12:34:29 +0000 (13:34 +0100)]
gm107/ir: fix compile time warning in getTEXSMask
In function 'uint8_t nv50_ir::getTEXSMask(uint8_t)':
warning: control reaches end of non-void function [-Wreturn-type]
Reported-by: Moiman@freenode
Fixes: f821e80213e38e93f96255b3deacb737a600ed40
"gm107/ir: use scalar tex instructions where possible"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Michel Dänzer [Thu, 1 Nov 2018 11:30:42 +0000 (12:30 +0100)]
winsys/amdgpu: Stop using amdgpu_bo_handle_type_kms_noimport
It only behaves any differently from amdgpu_bo_handle_type_kms with
libdrm 2.4.93, and it breaks if an older version is picked up.
Bugzilla: https://bugs.freedesktop.org/108096
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Lionel Landwerlin [Wed, 7 Nov 2018 10:55:05 +0000 (10:55 +0000)]
intel/dump_gpu: add platform option
Got tired of remembering the PCI ids.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Lionel Landwerlin [Wed, 7 Nov 2018 10:55:04 +0000 (10:55 +0000)]
intel/dump_gpu: move output option together
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Samuel Pitoiset [Mon, 5 Nov 2018 08:54:28 +0000 (09:54 +0100)]
radv: disable conditional rendering for vkCmdCopyQueryPoolResults()
VK_EXT_conditional_rendering says that copy commands should not be
affected by conditional rendering.
Cc: 18.2 18.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Mon, 5 Nov 2018 09:35:36 +0000 (10:35 +0100)]
radv: allocate enough space in CS when copying query results with compute
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Timothy Arceri [Fri, 2 Nov 2018 02:33:52 +0000 (13:33 +1100)]
ac/nir_to_llvm: fix b2f for f64
Fixes: d7e0d47b9de3 ("nir: Add a bunch of b2[if] optimizations")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Karol Herbst [Sun, 5 Aug 2018 16:34:22 +0000 (18:34 +0200)]
gm107/ir: use scalar tex instructions where possible
TEXS, TLD4 and TLD4S are variants of tex instructions which are more
scalar, which gives RA more freedom and is less likely to insert silly
MOVs to satisfy quad registers.
shader-db changes:
total instructions in shared programs : 7687265 -> 7614782 (-0.94%)
total gprs used in shared programs    : 803620 -> 798045 (-0.69%)
total shared used in shared programs  : 639636 -> 639636 (0.00%)
total local used in shared programs   : 24648 -> 24648 (0.00%)
total bytes used in shared programs   : 82103400 -> 81330696 (-0.94%)
                local     shared        gpr       inst      bytes
    helped          0          0       3648      10647      10647
      hurt          0          0        464        205        205
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Karol Herbst [Sun, 5 Aug 2018 17:12:48 +0000 (19:12 +0200)]
nv50/ir: add scalar field to TexInstructions
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Karol Herbst [Sun, 5 Aug 2018 18:41:49 +0000 (20:41 +0200)]
nv50/ra: add condenseDef overloads for partial condenses
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Karol Herbst [Sun, 5 Aug 2018 17:12:32 +0000 (19:12 +0200)]
nv50/ir: print color masks of tex instructions
v2: print the mask for TXG as well
make the mask to be printed more mask like
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Jason Ekstrand [Tue, 6 Nov 2018 15:48:59 +0000 (09:48 -0600)]
vulkan: Update the XML and headers to 1.1.91
The biggest change here is the rename of VK_NVX_ray_tracing to
VK_NV_ray_tracing and the total removal of VK_KHR_mir_surface.
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Gert Wollny [Thu, 1 Nov 2018 11:59:27 +0000 (12:59 +0100)]
r600: Add support for EXT_texture_sRGB_R8
Enables it on R600 and makes these tests pass:
dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.*
dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8*
v2: remove chunk for dri/radeon (Emil)
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Lionel Landwerlin [Tue, 6 Nov 2018 11:37:51 +0000 (11:37 +0000)]
anv/android: mark gralloc allocated BOs as external
Allocating through Gralloc implies buffers are going to be used
outside the driver. We have special MOCS settings for external BOs and
we probably want to use them here too.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a1220e73116bad7 ("anv/android: Set the BO flags in bo_cache_import (v2)")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Lionel Landwerlin [Tue, 6 Nov 2018 11:37:50 +0000 (11:37 +0000)]
anv: stub internal android code
This reduces the amount of #ifdef ANDROID we'll have to have inside
the driver. Potentially offering better coverage of the android
extensions.
v2: Move anv_android.h include before anv_entrypoints.h (Tapani)
Fix autotools android build (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Kristian H. Kristensen [Fri, 19 Oct 2018 04:41:21 +0000 (21:41 -0700)]
freedreno/a6xx: Clear z32 and separate stencil with blitter
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>
Rob Clark [Tue, 30 Oct 2018 12:41:58 +0000 (08:41 -0400)]
freedreno/a6xx: fix VSC bug with larger # of tiles
At higher resolutions with the addition of MSAA, the number of tiles
can increase to the point where we use more than one VSC pipe per
tile, which would cause us to calculate an out-of-bounds offset for
VSC_SIZE_ADDRESS. So don't try to be clever, just always put it at
a fixed offset assuming the max 32 VSC pipes in use.
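A sketch of the fixed-offset layout, assuming a hypothetical per-pipe stride (VSC_DATA_PITCH below is a made-up value) and hypothetical function names; only the maximum of 32 pipes comes from the message above.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_VSC_PIPES  32u     /* from the commit message */
#define VSC_DATA_PITCH 0x440u  /* hypothetical per-pipe stride in bytes */

/* Offset of one pipe's data within the shared buffer. */
static uint32_t vsc_pipe_data_offset(unsigned pipe)
{
    assert(pipe < MAX_VSC_PIPES);
    return pipe * VSC_DATA_PITCH;
}

/* The size block always sits after the space reserved for all 32 pipes,
 * regardless of how many pipes the current tiling actually uses. */
static uint32_t vsc_size_offset(void)
{
    return MAX_VSC_PIPES * VSC_DATA_PITCH;
}
```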
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 29 Oct 2018 17:28:45 +0000 (13:28 -0400)]
freedreno: update generated headers
Signed-off-by: Rob Clark <robdclark@gmail.com>
Olivier Fourdan [Thu, 25 Oct 2018 12:48:15 +0000 (14:48 +0200)]
wayland/egl: Resize EGL surface on update buffer for swrast
After commit a9fb331ea ("wayland/egl: update surface size on window
resize"), the surface size is updated as soon as the resize is done, and
`update_buffers()` would resize only if the surface size differs from
the attached size.
However, in the case of swrast, there is no resize callback and the
attached size is updated in `dri2_wl_swrast_commit_backbuffer()` prior
to the `swrast_update_buffers()` so the attached size is always up to
date when it reaches `swrast_update_buffers()` and the surface is never
resized.
This can be observed with "totem" using the GDK backend on Wayland (the
default) when running on software rendering:
$ LIBGL_ALWAYS_SOFTWARE=true CLUTTER_BACKEND=gdk totem
Resizing the window would leave the EGL surface size unchanged.
To avoid the issue, partially revert commit a9fb331ea for
`swrast_update_buffers()` and resize based on the window size rather
than the attached size.
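A minimal sketch of the resulting check, using simplified stand-in types
rather than the real dri2 structures (struct sketch_surface and
sketch_update_size() are hypothetical names introduced for illustration
only):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's surface state. */
struct sketch_size {
   int width;
   int height;
};

struct sketch_surface {
   struct sketch_size surface;   /* current EGL surface size */
   struct sketch_size window;    /* size requested through the window */
   struct sketch_size attached;  /* size of the last committed buffer */
};

/* Resize when the surface size differs from the window size. The
 * attached size is deliberately not consulted: in the swrast path it is
 * already up to date by the time this check runs, so comparing against
 * it would never trigger a resize. */
bool
sketch_update_size(struct sketch_surface *s)
{
   assert(s != NULL);

   if (s->surface.width == s->window.width &&
       s->surface.height == s->window.height)
      return false;

   s->surface = s->window;
   return true;
}
```

The function returns true only when a resize happened, so a caller could
use the result to decide whether the old buffers need to be released.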
Fixes: a9fb331ea - wayland/egl: update surface size on window resize
Signed-off-by: Olivier Fourdan <ofourdan@redhat.com>
CC: Daniel Stone <daniel@fooishbar.org>
CC: Juan A. Suarez Romero <jasuarez@igalia.com>
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Lionel Landwerlin [Mon, 5 Nov 2018 20:42:40 +0000 (20:42 +0000)]
intel/decoders: fix instruction base address parsing
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 00103db04ab879 ("intel: Fix decoding for partial STATE_BASE_ADDRESS updates.")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Emil Velikov [Fri, 2 Nov 2018 18:34:19 +0000 (18:34 +0000)]
egl/glvnd: correctly report errors when vendor cannot be found
If the user provides an invalid display or device the ToVendor lookup
will fail.
In this case, the local [Mesa vendor] error code will be set. Thus on
a subsequent eglGetError(), the error will be EGL_SUCCESS.
To be more specific, GLVND remembers the last vendor and calls back
into its eglGetError, although there's no guarantee that there ever
was one.
v2:
- Add _eglError call, so the debug callback is executed (Kyle)
- Drop XXX comment.
Piglit: tests/egl/spec/egl_ext_device_query
Fixes: ce562f9e3fa ("EGL: Implement the libglvnd interface for EGL (v3)")
Cc: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kyle Brenneman <kbrenneman@nvidia.com>
Emil Velikov [Fri, 2 Nov 2018 18:50:48 +0000 (18:50 +0000)]
egl: add EGL_EXT_device_base entrypoints
eglQueryDevicesEXT (unlike the other three functions) does not depend
on the display. It is implemented in GLVND, which calls into each
driver collecting the list of devices and presenting it to the user.
For the other entrypoints, GLVND acts as pass through stub calling into
the vendor library. The vendor implementation calls back into GLVND to
get the vendor dispatch. Then the driver proceeds to call itself via
the said dispatch.
This design makes it possible to keep using "old" GLVND with newer
vendor drivers, since effectively all the extension code is within the
latter itself.
Without said entrypoints, any user will outright crash - as reported in
the bug report.
Note: there's a follow-up fix needed to our GLVND code, to make piglit
happy.
v2: add some beefy documentation in the commit message.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108635
Fixes: 7552fcb7b9b ("egl: add base EGL_EXT_device_base implementation")
Reported-by: kyle.devir@mykolab.com
Cc: kyle.devir@mykolab.com
Acked-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Fri, 2 Nov 2018 15:48:26 +0000 (15:48 +0000)]
docs: mention EXT_shader_implicit_conversions
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Marek Olšák [Sat, 3 Nov 2018 00:56:42 +0000 (20:56 -0400)]
st/va: fix incorrect use of resource_destroy
Fixes: 4373dd32154 ("st/va: Support YUV formats in vaCreateSurfaces")
Cc: Drew Davenport <ddavenport@chromium.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Sergii Romantsov [Mon, 5 Nov 2018 13:02:49 +0000 (15:02 +0200)]
i965/batch/debug: Allow log be dumped before assert
A message that may reveal the culprit of an assert is now dumped
before the assert triggers, for debugging purposes.
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Lionel G Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Lionel Landwerlin [Mon, 29 Oct 2018 18:14:47 +0000 (18:14 +0000)]
intel/sanitize_gpu: add debug message on mmap fail
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Lionel Landwerlin [Mon, 29 Oct 2018 18:14:46 +0000 (18:14 +0000)]
intel/sanitize_gpu: deal with non page multiple buffer sizes
We can only map at page-aligned offsets. We got that wrong for buffer
sizes where (size % 4096) != 0 (anv has a WA buffer of 1024).
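As an illustration of the arithmetic involved, a size can be rounded up
to the next page multiple as sketched below. PAGE_SIZE_BYTES and
align_to_page() are hypothetical names for this example and are not
taken from the sanitizer's source; the page size of 4096 is the one
quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Page size assumed from the commit message; must be a power of two. */
#define PAGE_SIZE_BYTES 4096u

/* Hypothetical helper: round size up to the next multiple of the page
 * size, so that whatever is placed after the buffer starts at a
 * page-aligned offset. */
uint64_t
align_to_page(uint64_t size)
{
   /* Guard against wrap-around when adding the rounding term. */
   assert(size <= UINT64_MAX - (PAGE_SIZE_BYTES - 1));
   return (size + PAGE_SIZE_BYTES - 1) & ~(uint64_t)(PAGE_SIZE_BYTES - 1);
}
```

For the 1024-byte workaround buffer mentioned above, this yields 4096,
while sizes that are already page multiples are returned unchanged.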
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Lionel Landwerlin [Mon, 29 Oct 2018 18:14:45 +0000 (18:14 +0000)]
intel/sanitize_gpu: add help/gdb options to wrapper
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Lionel Landwerlin [Mon, 29 Oct 2018 18:14:44 +0000 (18:14 +0000)]
intel/dump_gpu: add missing gdb option
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Eric Engestrom [Mon, 5 Nov 2018 09:57:09 +0000 (09:57 +0000)]
wsi/wayland: only finish() a successfully init()ed display
Fixes: 43691024982b3ea734ad0 "vulkan/wsi/wayland: Stop caching Wayland displays"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Eric Engestrom [Mon, 5 Nov 2018 09:55:02 +0000 (09:55 +0000)]
wsi/wayland: use proper VkResult type
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Sergii Romantsov [Thu, 1 Nov 2018 11:02:43 +0000 (13:02 +0200)]
autotools: library-dependency when no sse and 32-bit
Building 32-bit Mesa may fail if __SSE__ is not defined.
Added the missing dependency on libm.
v2: avoid depending on any flag, just link
v3: meson doesn't fail, but add the dependency on libm there too
CC: Dylan Baker <dylan@pnwbakers.com>
CC: Lionel G Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108560
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Samuel Pitoiset [Wed, 31 Oct 2018 11:00:12 +0000 (12:00 +0100)]
radv: more use of radv_cp_wait_mem()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Wed, 31 Oct 2018 11:00:11 +0000 (12:00 +0100)]
radv: replace si_emit_wait_fence() with radv_cp_wait_mem()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Wed, 31 Oct 2018 10:43:34 +0000 (11:43 +0100)]
radv: add missing TFB queries support to CmdCopyQueryPoolsResults()
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Fixes: b4eb029062a ("radv: implement VK_EXT_transform_feedback")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Samuel Pitoiset [Fri, 2 Nov 2018 11:20:48 +0000 (12:20 +0100)]
radv: remove useless sync after copying query results with compute
The spec says:
"vkCmdCopyQueryPoolResults is considered to be a transfer
operation, and its writes to buffer memory must be synchronized
using VK_PIPELINE_STAGE_TRANSFER_BIT and VK_ACCESS_TRANSFER_WRITE_BIT
before using the results."
VK_PIPELINE_STAGE_TRANSFER_BIT will wait for compute to be idle,
while VK_ACCESS_TRANSFER_WRITE_BIT will invalidate both L1 vector
caches and L2. So, it's useless to set those flags internally.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Vinson Lee [Wed, 31 Oct 2018 22:35:23 +0000 (15:35 -0700)]
r600/sb: Fix constant logical operand in assert.
Fixes: da977ad90747 ("r600/sb: start adding GDS support")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>