Ilia Mirkin [Sat, 11 Feb 2017 23:37:41 +0000 (18:37 -0500)]
nvc0: set the render condition in the compute object
Fixes GL45-CTS.compute_shader.conditional-dispatching
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Ilia Mirkin [Sat, 11 Feb 2017 23:20:50 +0000 (18:20 -0500)]
gm107/ir: fix address offset bitfield for ATOMS
Fixes GL45-CTS.compute_shader.atomic-case1 on Maxwell
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Ilia Mirkin [Fri, 10 Feb 2017 06:55:08 +0000 (01:55 -0500)]
nv50/ir: convert an ATOM.EXCH without a destination into a store
On SM35 there does not appear to be a way to emit a ATOM.EXCH with a
null destination. This should be functionally equivalent to a plain
store however, so just do that.
Fixes GL45-CTS.compute_shader.atomic-case2 on SM35.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sat, 17 Sep 2016 22:23:49 +0000 (18:23 -0400)]
nvc0: fix 64-bit integer query buffer writes
The former logic just plain didn't work at all. We need to write the
subsequent dword to the next buffer location.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sun, 12 Feb 2017 01:23:30 +0000 (20:23 -0500)]
nv50/ir: return a register when retrieving thread id sysval
We have logic to short-circuit such retrievals to zero. However "zero"
was an immediate, and some logic expected to get registers (to later be
propagated). Fix this by using loadImm.
Fixes GL45-CTS.gpu_shader5.images_array_indexing
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sat, 11 Feb 2017 22:20:52 +0000 (17:20 -0500)]
nv50/ir: add missing break after DSSG
Recently broken during int64 addition.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Christian Gmeiner [Wed, 4 Jan 2017 21:59:38 +0000 (22:59 +0100)]
etnaviv: shader-db traces
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>
Christian Gmeiner [Tue, 31 Jan 2017 19:15:53 +0000 (20:15 +0100)]
etnaviv: keep track of emitted loops
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
Christian Gmeiner [Tue, 3 Jan 2017 14:06:24 +0000 (15:06 +0100)]
etnaviv: wire up core pipe_debug_callback
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Jose Maria Casanova Crespo [Fri, 10 Feb 2017 13:25:27 +0000 (14:25 +0100)]
glsl: non-last member unsized array on SSBO must fail compilation on GLSL ES 3.1
From GLSL ES 3.10 spec, section 4.1.9 "Arrays":
"If an array is declared as the last member of a shader storage block
and the size is not specified at compile-time, it is sized at run-time.
In all other cases, arrays are sized only at compile-time."
In desktop GLSL it is allowed to have unsized-arrays that are
not last, as long as we can determine that they are implicitly
sized, which is detected at link-time.
With this patch Mesa reports a compilation error as glslang does with
the following shader:
buffer SSBO { vec4 data[]; vec4 moreData;};
void main (void)
{
}
Fixes:
dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Fri, 10 Feb 2017 21:24:37 +0000 (13:24 -0800)]
vc4: Enable glSampleMask() even when !rasterizer->multisample.
gallium's blitter expects that it can set the sample mask even when the
rasterizer doesn't have the flag on.
Between this and the previous test, 10 new ext_framebuffer_multisample
tests start passing.
Eric Anholt [Fri, 10 Feb 2017 21:18:06 +0000 (13:18 -0800)]
vc4: Respect glSampleMask() even when we're not writing color.
gallium's quad-based blitter for copying MSAA depth textures expects to be
able to do 4 passes updating a sample at a time using glSampleMask, and
there's no color buffer bound when it's doing that.
Eric Anholt [Fri, 10 Feb 2017 20:59:39 +0000 (12:59 -0800)]
vc4: Use the nir_builder helper for loading sample mask.
Eric Anholt [Fri, 10 Feb 2017 20:00:32 +0000 (12:00 -0800)]
vc4: Use accurate 1/w in coordinate shader as well as vert shader.
We probably shouldn't be emitting different scaled viewport coordinates
between vertex and coord.
Eric Anholt [Fri, 10 Feb 2017 18:45:42 +0000 (10:45 -0800)]
vc4: Drop VS inputs to 8.
In the hardware we only get to declare 8 vertex elements (GLES2's
minimum), so we should be exposing that number here. Fixes an assertion
failure in piglit texrect-many, at the expense of various GL 2.0-ish
minmax tests now complaining that our count is too low.
Eric Anholt [Tue, 7 Feb 2017 01:30:59 +0000 (17:30 -0800)]
vc4: Avoid emitting small immediates for UBO indirect load address guards.
The kernel will reject our shader if we emit one here, and having 4, 8, or
12 as the top end of our UBO clamp rare is enough that it's not worth
making the kernel let us.
Fixes piglit fs-const-array-of-struct and
fs-const-array-of-struct-of-array since recent GLSL linking changes made
us get this as an indirect load of a uniform, instead of a tempoary.
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Timothy Arceri [Wed, 8 Feb 2017 04:05:19 +0000 (15:05 +1100)]
util/disk_cache: use stat() to check if entry is a directory
d_type is not supported on all systems.
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97967
Emil Velikov [Tue, 7 Feb 2017 22:20:51 +0000 (22:20 +0000)]
st/nine: update configure options in the README
Cc: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Tue, 7 Feb 2017 15:53:14 +0000 (15:53 +0000)]
configure.ac: supersede --enable-gallium-llvm over --enable-llvm
Currently we have extra (somewhat questionable) modularity, such that
one could build some parts with LLVM while others w/o.
That is extremely fragile, error prone and requires quite noticable
amount of code throughout.
Thus lets deprecate the gallium toggle in faviour of the generic one.
The former will throw a warning when set, and it will be overwritten by
the latter. This will allow gradual transition w/o breaking people's
scripts.
v2: Rebase, document in release notes.
Cc: Dave Airlie <airlied@redhat.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de> (v1)
Emil Velikov [Tue, 7 Feb 2017 20:13:08 +0000 (20:13 +0000)]
configure.ac: remove dummy radeon_gallium_llvm_check()
The extra function brings no added benefit as of earlier commit which
made llvm_require_version (as called by radeon_llvm_check) require LLVM
(--enable-gallium-llvm).
Fixes: 5f966a96af7 "configure.ac: Mandate --enable-gallium-llvm when
checking LLVM version"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
Emil Velikov [Wed, 18 Jan 2017 13:54:04 +0000 (13:54 +0000)]
configure.ac: correctly manage llvm auto-detection
Earlier refactoring commits changed from one, dare I say it, broken
behaviour to another. Namely:
Before, as you explicitly --enable-gallium-llvm your selection was
ignored when llvm-config was not present/detected.
Today, the "auto" heuristics enables gallium llvm regardless if you have
llvm/llvm-config available or not.
Rework the auto-detection to attribute for llvm's presence.
v2: Set enable_gallium_llvm=no when LLVM is not found.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
Reported-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Emil Velikov [Wed, 18 Jan 2017 13:54:03 +0000 (13:54 +0000)]
configure.ac: disable enable_gallium_llvm in the !x86 case
Already implicitly handled throughout, but keep it clear and disable
gallium-llvm. This change should be a no-op.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
Emil Velikov [Wed, 18 Jan 2017 13:54:02 +0000 (13:54 +0000)]
configure.ac: set LLVM_{C, CXX, LD}FLAGS only as needed
Earlier refactoring commits started setting the above regardless if LLVM
is used or not. Move them to the respective section to restore the
original functionality.
Since we require the preprocessor flags (includes in particular) for the
header version parsing keep those as-is. They are not used outside of
configure.ac thus should not cause any side-effects.
As-is adding the C/CXXFLAGS can lead to build issues on when
cross-compiling.
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reported-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
Emil Velikov [Wed, 18 Jan 2017 13:54:01 +0000 (13:54 +0000)]
Revert "configure.ac: Create correct LLVM_VERSION_INT with minor >= 10"
As stated in [1] by the LLVM devs, the new versioning scheme will not
deploy any minor version (i.e. it will always be zero). As such the
patch should not be needed.
This reverts commit
0e9a5be7e74fa2a9bd2a634ef60822bd6600ca1d.
[1] http://blog.llvm.org/2016/12/llvms-new-versioning-scheme.html
Reviewed-by: Tobias Droste <tdroste@gmx.de>
Emil Velikov [Wed, 18 Jan 2017 13:54:00 +0000 (13:54 +0000)]
configure.ac: don't use == with test
Although it works, it's not the correct thing to do.
v2: Rebase
v3: Rebase
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de> (v1)
Emil Velikov [Wed, 18 Jan 2017 13:53:59 +0000 (13:53 +0000)]
configure.ac: remove unused LLVM variables
LLVM_BINDIR is completely unused while others such as LLVM_LIBDIR are
used only internally. In the latter case there's no need to AC_SUBST it.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
Tobias Droste [Tue, 7 Feb 2017 19:53:32 +0000 (19:53 +0000)]
configure.ac: Only define HAVE_LLVM if LLVM is used
Make sure that HAVE_LLVM compiler define is only set if LLVM is
actually used.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99010
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Tobias Droste <tdroste@gmx.de>
v2 [Emil] fold within the existing conditional
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tobias Droste [Sat, 28 Jan 2017 13:56:57 +0000 (14:56 +0100)]
configure.ac: Rework MESA_LLVM and LLVM detection
Set FOUND_LLVM only when LLVM is present (checking for exact version/etc
is deferred) and use enable-gallium-llvm to indicate the global LLVM
status.
Renaming the latter is not appropriate for stable patches, so we'll
address it with a later commit.
Loosely based on work by Tobias.
v2: Check FOUND_LLVM if enable_gallium_llvm is set.
Cc: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
Emil Velikov [Tue, 7 Feb 2017 21:59:14 +0000 (21:59 +0000)]
configure.ac: move enable-gallium-llvm dependency with-gallium-drivers
... to where it's applicable.
Since we effectively made --enable-gallium-llvm mean --enable-llvm with
earlier commits, we need to move the requirement to guard the compnents
added for the LLVM draw.
Otherwise we'll error (as below) when building RADV w/o gallium drivers.
configure: error: --enable-gallium-llvm is required when building radv
v2: Don't remove but move the dependency (Tobias).
Cc: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
Emil Velikov [Tue, 7 Feb 2017 16:24:44 +0000 (16:24 +0000)]
configure.ac: Mandate --enable-gallium-llvm when checking LLVM version
With this change we effectively require --enable-gallium-llvm when
building RADV. This should be perfectly safe since the gallium radeonsi
driver already explicitly requires it.
The "gallium" part in --enable-gallium-llvm is about to be removed soon
(not in stable), but until then make sure that things can build.
To reflect the requirement (as opposed to check previously) we rename
llvm_check_version_for to llvm_require_version
Cc: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
Emil Velikov [Tue, 7 Feb 2017 16:13:07 +0000 (16:13 +0000)]
configure.ac: Rename the gallium_require_llvm helper
Drop the gallium prefix since we're about it use it throughout the
configure.
Note we do want to check for enable_gallium_llvm check since (as
explicitly requested) the toggle should mean --enable-llvm. Latter of
which to be resolved with later patches.
Cc: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
Tobias Droste [Sat, 28 Jan 2017 13:57:00 +0000 (14:57 +0100)]
configure.ac: Don't check LLVM version in require_llvm
This is actually not needed because the version is checked later.
Around line 2380
if test "x$enable_gallium_llvm" == "xyes"; then
llvm_check_version_for $LLVM_REQUIRED_GALLIUM "gallium"
llvm_add_default_components "gallium"
fi
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Cc: Tobias Droste <tdroste@gmx.de>
Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
v2: [Emil Velikov: rebase/respin series order]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Tue, 7 Feb 2017 20:48:12 +0000 (20:48 +0000)]
configure.ac: move AC_ARG_ENABLE([gallium-llvm] hunk further up
With next commits we'll require --enable-gallium-llvm (en route to a
greater good later on) for RADV. The latter is required to ensure that
as otherwise we'll fail to build.
Cc: Dave Airlie <airlied@redhat.com>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
Emil Velikov [Tue, 7 Feb 2017 14:43:30 +0000 (14:43 +0000)]
configure.ac: remove unused AC_SUBST([MESA_LLVM])
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tobias Droste <tdroste@gmx.de>
Nicolai Hähnle [Tue, 7 Feb 2017 07:43:09 +0000 (08:43 +0100)]
loader: unconditionally include unistd.h and stdlib.h
Otherwise we would fail with "implicit declaration of function" geteuid
and getenv respectively.
To trigger (re)move the libdrm.pc file and use the following:
$ ./autogen.sh --disable-egl --disable-gbm --disable-dri \
--with-dri-drivers=swrast --with-gallium-drivers=swrast
$ make
Cc: Vinson Lee <vlee@freedesktop.org>
Fixes: 3f462050c ("loader: Add an environment variable to override driver name choice.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99701
v2: [Emil: handle stdlib.h add commit message]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Tue, 7 Feb 2017 21:21:56 +0000 (21:21 +0000)]
intel/blorp: do not return const data by get_px_size_sa()
Not much point in the const qualifier since we provide a copy to the
user. Resolves the following -Wignored-qualifiers warning.
src/intel/blorp/blorp_blit.c:1857:8: warning: 'const' type qualifier on
return type has no effect [-Wignored-qualifiers]
v2: keep const qualifier of local variable.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Marek Olšák [Thu, 9 Feb 2017 11:19:21 +0000 (12:19 +0100)]
gallium/radeon: use staging for texture read mappings from GTT WC
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Thu, 9 Feb 2017 11:03:34 +0000 (12:03 +0100)]
gallium/radeon: ignore the level parameter in buffer_transfer_map
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Thu, 9 Feb 2017 02:14:22 +0000 (03:14 +0100)]
gallium/radeon: fix performance of buffer readbacks
We want cached GTT for all non-persistent read mappings.
Set level = 0 on purpose.
Use dma_copy, because resource_copy_region causes a failure in the PBO
read of piglit/getteximage-luminance.
If Rocket League used the READ flag, it should get cached GTT.
v2: mask out UNSYNCHRONIZED
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Thu, 9 Feb 2017 10:38:29 +0000 (11:38 +0100)]
radeonsi: align vertex buffer descriptor list size for optimal prefetch
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Thu, 9 Feb 2017 01:18:40 +0000 (02:18 +0100)]
radeonsi: align shader binaries to CP DMA alignment for optimal prefetch
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Thu, 9 Feb 2017 01:17:37 +0000 (02:17 +0100)]
radeonsi: move CP_DMA_ALIGNMENT definition
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Wed, 8 Feb 2017 02:05:11 +0000 (03:05 +0100)]
radeonsi: remove SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER
not necessary
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Wed, 8 Feb 2017 02:01:32 +0000 (03:01 +0100)]
radeonsi: remove separate CB/DB_META flush flags
not used separately
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 7 Feb 2017 23:26:37 +0000 (00:26 +0100)]
radeonsi: reduce the number of FMASK input coordinates
Before:
image_load v3, v[0:3] ...
After:
image_load v3, v[0:1] ...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 29 Nov 2016 14:35:11 +0000 (15:35 +0100)]
radeonsi: write shader asm annotated with wave info into GPU hang reports
Note that the disassembly is written twice - first the unmodified compiler
output and then the wave-annotated output only if there are waves executing
the shader.
Sample output from a real GPU hang most likely caused by image_sample:
The number of active waves = 28
Pixel Shader - annotated disassembly:
s_mov_b64 s[6:7], exec ;
BE86017E [PC=0x10f3e3800, off=0, size=4]
s_wqm_b64 exec, exec ;
BEFE077E [PC=0x10f3e3804, off=4, size=4]
...
image_sample v[7:9], v[0:1], s[12:19], s[20:23] dmask:0x7 ;
F0800700 00A30700 [PC=0x10f3e3a94, off=660, size=8]
s_buffer_load_dword s20, s[0:3], 0x50 ;
C0220500 00000050 [PC=0x10f3e3a9c, off=668, size=8]
s_load_dwordx4 s[24:27], s[4:5], 0x170 ;
C00A0602 00000170 [PC=0x10f3e3aa4, off=676, size=8]
s_load_dwordx8 s[12:19], s[4:5], 0x140 ;
C00E0302 00000140 [PC=0x10f3e3aac, off=684, size=8]
s_buffer_load_dword s11, s[0:3], 0x5c ;
C02202C0 0000005C [PC=0x10f3e3ab4, off=692, size=8]
s_buffer_load_dword s21, s[0:3], 0x54 ;
C0220540 00000054 [PC=0x10f3e3abc, off=700, size=8]
s_buffer_load_dword s22, s[0:3], 0x58 ;
C0220580 00000058 [PC=0x10f3e3ac4, off=708, size=8]
s_waitcnt vmcnt(0) ;
BF8C0F70 [PC=0x10f3e3acc, off=716, size=4]
^ SE0 SH0 CU1 SIMD1 WAVE0 EXEC=
aaaaaaa555aaaaaa INST32=
BF8C0F70
^ SE0 SH0 CU1 SIMD2 WAVE0 EXEC=
aaaa85555555552a INST32=
BF8C0F70
^ SE0 SH0 CU1 SIMD3 WAVE0 EXEC=
000000000000000a INST32=
BF8C0F70
^ SE0 SH0 CU6 SIMD1 WAVE0 EXEC=
25a5a5aa82aaaaaa INST32=
BF8C0F70
^ SE0 SH0 CU6 SIMD3 WAVE0 EXEC=
50aaaa8fffa55555 INST32=
BF8C0F70
^ SE0 SH0 CU7 SIMD0 WAVE0 EXEC=
5554aaaaaaa1a555 INST32=
BF8C0F70
^ SE0 SH0 CU7 SIMD0 WAVE1 EXEC=
aaaa5555ffffffff INST32=
BF8C0F70
^ SE0 SH0 CU7 SIMD1 WAVE0 EXEC=
555557aaaaaaaaa5 INST32=
BF8C0F70
^ SE0 SH0 CU7 SIMD3 WAVE0 EXEC=
5555aaaaaaaaaa85 INST32=
BF8C0F70
^ SE1 SH0 CU3 SIMD1 WAVE0 EXEC=
aaaaaaaaaaaaaaaa INST32=
BF8C0F70
^ SE1 SH0 CU4 SIMD0 WAVE0 EXEC=
aaaaaaaa5a5a5a5a INST32=
BF8C0F70
^ SE1 SH0 CU4 SIMD1 WAVE0 EXEC=
aaaaaaa5a5a5a4a5 INST32=
BF8C0F70
^ SE1 SH0 CU4 SIMD2 WAVE0 EXEC=
5555555000000000 INST32=
BF8C0F70
^ SE1 SH0 CU4 SIMD3 WAVE0 EXEC=
aa555554155aaaaa INST32=
BF8C0F70
^ SE1 SH0 CU5 SIMD0 WAVE0 EXEC=
55ffff55555555aa INST32=
BF8C0F70
^ SE1 SH0 CU5 SIMD1 WAVE0 EXEC=
555555555aaaaaaa INST32=
BF8C0F70
^ SE1 SH0 CU5 SIMD2 WAVE0 EXEC=
a0aaaaaaa8555555 INST32=
BF8C0F70
^ SE1 SH0 CU5 SIMD3 WAVE0 EXEC=
8aaaaaaaaaaaa555 INST32=
BF8C0F70
^ SE1 SH0 CU6 SIMD0 WAVE0 EXEC=
000000002aaaaaaa INST32=
BF8C0F70
^ SE2 SH0 CU1 SIMD0 WAVE0 EXEC=
5aaaa5400aaaa15a INST32=
BF8C0F70
^ SE2 SH0 CU1 SIMD1 WAVE0 EXEC=
00aaaaaaaa5555aa INST32=
BF8C0F70
^ SE2 SH0 CU1 SIMD2 WAVE0 EXEC=
aa00005555554555 INST32=
BF8C0F70
^ SE2 SH0 CU1 SIMD3 WAVE0 EXEC=
aaaaaaa000000000 INST32=
BF8C0F70
^ SE3 SH0 CU4 SIMD0 WAVE0 EXEC=
5555aaaaaaaaaaaa INST32=
BF8C0F70
^ SE3 SH0 CU4 SIMD2 WAVE0 EXEC=
ffaaaaaaaaaa5555 INST32=
BF8C0F70
^ SE3 SH0 CU4 SIMD3 WAVE0 EXEC=
aaaa55555555aa00 INST32=
BF8C0F70
^ SE3 SH0 CU5 SIMD0 WAVE0 EXEC=
00aaaaaaaaaaaa5a INST32=
BF8C0F70
^ SE3 SH0 CU5 SIMD1 WAVE0 EXEC=
5a555555005555ff INST32=
BF8C0F70
v_mul_f32_e32 v7, s6, v7 ;
0A0E0E06 [PC=0x10f3e3ad0, off=720, size=4]
...
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 28 Nov 2016 22:41:38 +0000 (23:41 +0100)]
radeonsi: write wave information into GPU hang reports
UMR is our new debugging tool. It must have +s set for Mesa to use it
without root privileges:
sudo chmod +s .../umr
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marc-André Lureau [Thu, 9 Feb 2017 14:41:11 +0000 (18:41 +0400)]
tgsi-dump: dump label if instruction has one
The instruction has an associated label when Instruction.Label == 1,
as can be seen in ureg_emit_label() or tgsi_build_full_instruction().
This fixes dump generating extra :0 labels on conditionals, and virgl
parsing more than the expected tokens and eventually reaching "Illegal
command buffer" (when parsing more than a safety margin of 10 we
currently have).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Marc-André Lureau [Thu, 9 Feb 2017 14:36:32 +0000 (18:36 +0400)]
tgsi: remove ureg_label_insn
Unused since commit
2897cb3dba9287011f9c43cd2f214100952370c0.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 9 Feb 2017 03:24:05 +0000 (03:24 +0000)]
radv: handle queue submission with no cs but semaphores
It's legal to submit just semaphores with no command streams,
this patch fixes this case by emitting the empty cs, it also
handles the fence emission for this case better.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Timothy Arceri [Thu, 9 Feb 2017 11:41:15 +0000 (22:41 +1100)]
util/disk_cache: error check asprintf()
Fixes: f3d911463e8 "util/disk_cache: stop using ralloc_asprintf() unnecessarily"
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Timothy Arceri [Tue, 27 Sep 2016 22:56:26 +0000 (08:56 +1000)]
docs: add shader cache environment variables
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Ilia Mirkin [Thu, 9 Feb 2017 20:35:51 +0000 (15:35 -0500)]
nvc0/ir: fix ubo max clamp, reset file index
We just increased the max UBO, so we should also increase the clamp that
we do for robustness. Similarly, as we're including the fileIndex in the
new indirect value, we should reset fileIndex to 0 so that it is not
added in a second time.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Ilia Mirkin [Thu, 26 Jan 2017 03:16:56 +0000 (22:16 -0500)]
nv50/ir: always return 0 when trying to read thread id along unit dim
Many many many compute shaders only define a 1- or 2-dimensional block,
but then continue to use system values that take the full 3d into
account (like gl_LocalInvocationIndex, etc). So for the special case
that a dimension is exactly 1, we know that the thread id along that
axis will always be 0, so return it as such and allow constant folding
to fix things up.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Ilia Mirkin [Thu, 26 Jan 2017 04:48:23 +0000 (23:48 -0500)]
nvc0/ir: fix robustness guarantees for constbuf loads on kepler+ compute
Kepler and up unfortunately only support up to 8 constbufs. We work
around this by loading from constbufs as if they were storage buffers.
However we were not consistently applying limits to loads from these
buffers. Make sure to do the same thing we do for storage buffers.
Fixes GL45-CTS.robust_buffer_access_behavior.uniform_buffer
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Ilia Mirkin [Thu, 26 Jan 2017 04:06:22 +0000 (23:06 -0500)]
nvc0: increase number of ubo binding points
Apparently GL 4.5 requires 14 of these (there's a "*" in the spec, but
it's unclear what it refers to). We need to expose an extra binding
point for the "program parameters", which means this must be 15. Remove
the last vestige of the "use c14 for immediates" idea.
Fixes GL45-CTS.shading_language_420pack.binding_uniform_block_array
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Ilia Mirkin [Tue, 7 Feb 2017 23:18:07 +0000 (18:18 -0500)]
configure: add blurb about what the LIBDRM_*_REQUIRED stuff means
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Ilia Mirkin [Sun, 5 Feb 2017 18:08:07 +0000 (13:08 -0500)]
nvc0: expose int64
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sun, 5 Feb 2017 23:09:02 +0000 (18:09 -0500)]
nvc0/ir: make it possible to have the flags def in def0
There's all kinds of logic that doesn't like there being holes in defs
or srcs lists. Avoid them. This also fixes the sched logic for maxwell.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sun, 5 Feb 2017 15:03:53 +0000 (10:03 -0500)]
nvc0/ir: add support for 64-bit shift lowering on SM20/SM30
Unfortunately there is no SHF.L/SHF.R instruction pre-SM35. So we have
to do a bit more work to get the job done.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sun, 5 Feb 2017 03:31:04 +0000 (22:31 -0500)]
nvc0/ir: add support for all the new int64 tgsi opcodes
A few thoughts:
- Some of that LegalizeSSA logic should really live much earlier and be
subject to the likes of DCE and other useful passes
- Some of the "lowering" done in from_tgsi should be done later so that
proper optimization might be done.
However this all works and the above can be improved upon later.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Pierre Moreau [Sun, 30 Oct 2016 21:34:25 +0000 (22:34 +0100)]
nv50/ir: Split 64-bit integer MAD/MUL operations
Hardware does not support 64-bit integers MAD and MUL operations, so we need
to transform them in 32-bit operations.
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Ilia Mirkin [Sun, 5 Feb 2017 03:29:17 +0000 (22:29 -0500)]
nvc0/ir: add a "high" subop for shifts, emit shf.l/shf.r for 64-bit
Note that this is not available for SM20/SM30.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sun, 5 Feb 2017 04:57:53 +0000 (23:57 -0500)]
nvc0/ir: fix SET and SLCT emission
We were never emitting a .X flag for consuming condition code on SET,
and weren't emitting a signed type for SLCT comparison. Discovered while
working on int64 logic.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sat, 4 Feb 2017 15:55:03 +0000 (10:55 -0500)]
nvc0/ir: add support for emitting partial min/max ops for int64
These operations allow you to compute min/max on arbitrary-width
integers, 32 bits at a time.
Note that the low/med ops implicitly set the condition code, and the
med/high ops implicitly consume it.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sun, 5 Feb 2017 03:31:29 +0000 (22:31 -0500)]
gallium: add separate PIPE_CAP_INT64_DIVMOD
Nouveau does not currently have logic to implement this as a library
function. Even though such a library could be written, there's no big
advantage to do it that way for now given that int64 is a very uncommon
use-case. Allow a driver to expose INT64 without supporting division and
modulo operations.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Eric Engestrom [Thu, 9 Feb 2017 15:04:14 +0000 (15:04 +0000)]
docs: improve the list of gl implementations
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Eric Engestrom [Thu, 9 Feb 2017 15:03:30 +0000 (15:03 +0000)]
docs: improve the list of implemented APIs
Suggested-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Matt Turner [Tue, 31 Jan 2017 23:41:52 +0000 (15:41 -0800)]
glsl: Allow compatibility shaders with MESA_GL_VERSION_OVERRIDE=...
Previously if you used MESA_GL_VERSION_OVERRIDE=3.3COMPAT, Mesa exposed
an OpenGL 3.3 compatibility profile context (with various unimplemented
features and bugs), but still refused to compile shaders with
#version 330 compatibility
This patch simply adds a small bit of plumbing to let that through.
Of course the same caveats apply: compatibility profile is still not
supported (and will not be supported), so there are no guarantees that
anything will work.
Tested-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Engestrom [Thu, 9 Feb 2017 11:33:36 +0000 (11:33 +0000)]
docs: reword sentence that my brain can't parse
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Elie Tournier <tournier.elie@gmail.com>
Eric Engestrom [Thu, 9 Feb 2017 02:10:17 +0000 (02:10 +0000)]
docs: https all the links \o/
Most of them already redirected to https anyway, so we might as well
avoid the redirection and the security implications by linking directly
to the right protocol.
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Brian Paul <brianp@vmware.com>
Eric Engestrom [Thu, 9 Feb 2017 02:10:16 +0000 (02:10 +0000)]
docs: fix gallium wiki link in relnotes
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Brian Paul <brianp@vmware.com>
Eric Engestrom [Thu, 9 Feb 2017 02:10:15 +0000 (02:10 +0000)]
docs: update 'thanks' for hosting
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Brian Paul <brianp@vmware.com>
Samuel Iglesias Gonsálvez [Wed, 8 Feb 2017 12:51:22 +0000 (13:51 +0100)]
i965/fs: add support for int64 to bool conversion
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Samuel Iglesias Gonsálvez [Wed, 8 Feb 2017 12:50:57 +0000 (13:50 +0100)]
nir: add opcode to perform int64 to bool conversions
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Samuel Iglesias Gonsálvez [Wed, 8 Feb 2017 12:26:43 +0000 (13:26 +0100)]
i965/fs: Add support for nir_op_[iu]2[iu]32
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Samuel Iglesias Gonsálvez [Wed, 8 Feb 2017 12:18:59 +0000 (13:18 +0100)]
i965/fs: Add support for nir_op_[iu]642f
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Samuel Iglesias Gonsálvez [Wed, 8 Feb 2017 12:14:34 +0000 (13:14 +0100)]
i965/fs: legalize [u]int64 to 32-bit data conversions in lower_d2x
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Fri, 3 Feb 2017 01:11:35 +0000 (17:11 -0800)]
i965/fs: Add support for nir_op_[iu]642d
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Thu, 2 Feb 2017 22:56:23 +0000 (14:56 -0800)]
i965: Allow int64 conversion operations in channel_expressions
This fixes 143 of the new piglit tests added by Nicolai
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Timothy Arceri [Wed, 8 Feb 2017 22:04:52 +0000 (09:04 +1100)]
util/disk_cache: stop using ralloc_asprintf() unnecessarily
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Timothy Arceri [Tue, 22 Nov 2016 05:56:21 +0000 (16:56 +1100)]
glsl: add param to force shader recompile
This will be used to skip checking the cache and force a recompile.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Timothy Arceri [Tue, 26 Apr 2016 04:56:22 +0000 (14:56 +1000)]
util: add a disk_cache_remove() function
This will be used to remove cache items created with old versions
of Mesa or other invalid cache items from the cache.
V2: rename stub function (cache_* funtions were renamed disk_cache_*)
in master.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Timothy Arceri [Fri, 3 Feb 2017 23:46:53 +0000 (10:46 +1100)]
st/mesa/i965: create link status enum
For the on-disk shader cache we want to be able to differentiate
between a program that was linked and one that was loaded from cache.
V2:
- don't return the new enum directly to the application when queried,
instead return GL_TRUE or GL_FALSE as required. Fixes google-chrome
corruptions when using cache.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Brian Paul [Wed, 8 Feb 2017 19:31:44 +0000 (12:31 -0700)]
docs: update intro.html to mention new APIs, etc
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Brian Paul [Wed, 8 Feb 2017 19:31:43 +0000 (12:31 -0700)]
docs: the site is now hosted by freedesktop.org
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Bas Nieuwenhuizen [Wed, 8 Feb 2017 17:19:58 +0000 (18:19 +0100)]
radv: Add CPU color packing for VK_FORMAT_A2B10G10R10_UNORM_PACK32.
For allowing fast color clears in the main render targets of dota2.
[airlied: fix clear_vals[1] as suggested by Andres.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Roland Scheidegger [Wed, 8 Feb 2017 20:56:16 +0000 (21:56 +0100)]
mesa: (trivial) include <inttypes.h> for PRIx64 macros
Fixes a compile error with mingw.
Tim Rowley [Tue, 31 Jan 2017 23:05:19 +0000 (17:05 -0600)]
swr: [rasterizer jitter] Pass LLVM-IR size into jitter
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 31 Jan 2017 19:13:00 +0000 (13:13 -0600)]
swr: [rasterizer core] Frontend SIMD16 WIP
Removed temporary scafolding in PA, widended the PA_STATE interface
for SIMD16, and implemented PA_STATE_CUT and PA_TESS for SIMD16.
PA_STATE_CUT and PA_TESS now work in SIMD16.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Fri, 20 Jan 2017 23:18:50 +0000 (17:18 -0600)]
swr: [rasterizer jitter] Disable unsafe FP optimizations in the jitter
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Wed, 25 Jan 2017 18:27:41 +0000 (12:27 -0600)]
swr: [rasterizer core] Frontend SIMD16 WIP
Widen simdvertex to SIMD16/simd16vertex in frontend for passing VS
attributes from VS to PA.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 24 Jan 2017 19:30:05 +0000 (13:30 -0600)]
swr: [rasterizer jitter] Add DEBUGTRAP jit builder function
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 24 Jan 2017 18:37:13 +0000 (12:37 -0600)]
swr: [rasterizer jitter] Multisample blend jit fix
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Mon, 23 Jan 2017 17:30:13 +0000 (11:30 -0600)]
swr: [rasterizer jitter] Change SimdVector representation to array
Make all SimdVectors in LLVM represented as simdscalar[4] rather
than a struct.
Fixes issues with promotion of values from i32 to i64 to match
register width.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Sat, 21 Jan 2017 00:32:14 +0000 (18:32 -0600)]
swr: [rasterizer jitter] Fix issues with stream-out on llvm>=3.8
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Fri, 20 Jan 2017 00:32:06 +0000 (18:32 -0600)]
swr: [rasterizer jitter] Adjust jitter header includes
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Thu, 19 Jan 2017 00:08:40 +0000 (18:08 -0600)]
swr: [rasterizer core] Frontend SIMD16 WIP
SIMD16 Primitive Assembly (PA) only supports TriList and RectList.
CUT_AWARE_PA, TESS, GS, and SO disabled in the SIMD16 front end.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Eric Engestrom [Wed, 8 Feb 2017 11:27:00 +0000 (04:27 -0700)]
docs: update package contents
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Eric Engestrom [Wed, 8 Feb 2017 11:27:00 +0000 (04:27 -0700)]
docs: fix unpacking instructions
File names were wrong, file formats were wrong, bunzip command was
wrong...
I also removed all but the simplest example; people who use pipes already
know how to untar, so let's simplify and remove potential confusion for
non-tech-savvy users.
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>