Nicolai Hähnle [Tue, 18 Oct 2016 16:40:38 +0000 (18:40 +0200)]
radeonsi: fix 64-bit loads from LDS
Fixes spec/arb_tessellation_shader/execution/dvec[23]-vs-tcs-tes, among
others.
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolai Hähnle [Wed, 19 Oct 2016 16:14:48 +0000 (18:14 +0200)]
st/mesa: only set primitive_restart when the restart index is in range
Even when enabled, primitive restart has no effect when the restart index
is larger than the representable values in the index buffer.
Fixes GL45-CTS.gtf31.GL3Tests.primitive_restart.primitive_restart_upconvert
for radeonsi VI.
v2: add an explanatory comment
Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Nicolai Hähnle [Tue, 18 Oct 2016 15:35:45 +0000 (17:35 +0200)]
st/glsl_to_tgsi: sort input and output decls by TGSI index
Fixes a regression introduced by commit
777dcf81b.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98307
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Nicolai Hähnle [Sun, 16 Oct 2016 15:34:33 +0000 (17:34 +0200)]
st/glsl_to_tgsi: fix block copies of arrays of structs
Use a full writemask in this case. This is relevant e.g. when a function
has an inout argument which is an array of structs.
v2: use C-style comment (Timothy Arceri)
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Nicolai Hähnle [Sun, 16 Oct 2016 15:33:51 +0000 (17:33 +0200)]
st/glsl_to_tgsi: fix block copies of arrays of doubles
Set the type of the left-hand side to the same as the right-hand side,
so that when the base type is double, the writemask of the MOV instruction
is properly fixed up.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: 13.0 <mesa-stable@lists.freedesktop.org>
Iago Toral Quiroga [Tue, 18 Oct 2016 12:15:36 +0000 (14:15 +0200)]
glsl: Indirect array indexing on non-last SSBO member must fail compilation
After the changes in comit
5b2675093e863a52, we moved this check to the
linker, but the spec expects this to be checked at compile-time. There are
dEQP tests that expect an error at compile time and the spec seems to confirm
that expectation:
"Except for the last declared member of a shader storage block (section 4.3.9
“Interface Blocks”), the size of an array must be declared (explicitly sized)
before it is indexed with anything other than an integral constant expression.
The size of any array must be declared before passing it as an argument to a
function. Violation of any of these rules result in compile-time errors. It
is legal to declare an array without a size (unsized) and then later
redeclare the same name as an array of the same type and specify a size, or
index it only with integral constant expressions (implicitly sized)."
Commit
5b2675093e863a52 tries to take care of the case where we have implicitly
sized arrays in SSBOs and it does so by checking the max_array_access field
in ir_variable during linking. In this patch we change the approach: we look
for indirect access on SSBO arrays, and when we find one, we emit a
compile-time error if the accessed member is not the last in the SSBO
definition.
There is a corner case that the specs do not address directly though and that
dEQP checks for: the case of an unsized array in an SSBO definition that is
not defined last but is never used in the shader code either. The following
dEQP tests expect a compile-time error in this scenario:
dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader
dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader
However, since the unsized array is never used it is never indexed with a
non-constant expression, so by the spec quotation above, it should be valid and
the tests are probably incorrect.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Ilia Mirkin [Wed, 19 Oct 2016 05:20:03 +0000 (01:20 -0400)]
nv50/ir: process texture offset sources as regular sources
With ARB_gpu_shader5, texture offsets can be any source, including TEMPs
and IN's. Make sure to process them as regular sources so that we pick
up masks, etc.
This should fix some CTS tests that feed offsets directly to
textureGatherOffset, and we were not picking up the input use, thus not
advertising it in the shader header.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dave Airlie <airlied@redhat.com>
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Ilia Mirkin [Wed, 19 Oct 2016 04:05:26 +0000 (00:05 -0400)]
nv50,nvc0: avoid reading out of bounds when getting bogus so info
The state tracker tries to attach the info to the wrong shader. This is
easy enough to protect against.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 12.0 13.0 <mesa-stable@lists.freedesktop.org>
Eric Engestrom [Wed, 19 Oct 2016 23:09:11 +0000 (00:09 +0100)]
wsi/wayland: fix error path
Fixes: 1720bbd353d87412754f ("anv/wsi: split image alloc/free out to separate fns.")
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 19 Oct 2016 03:36:23 +0000 (13:36 +1000)]
anv: drop unused zero macro.
I can't see this being used anywhere.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 20 Oct 2016 00:42:22 +0000 (01:42 +0100)]
radv: use emit_icmp for samples_identical
On a debug llvm build we'd assert on the next compare
when the return from samples_identical was i1 instead
of i32.
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jordan Justen [Wed, 6 Jul 2016 22:08:27 +0000 (15:08 -0700)]
i965/cs: Don't use a thread channel ID for small local sizes
When the local group size is 8 or less, we will execute the program at
most 1 time. Therefore, the local channel ID will always be 0. By
using a constant 0 in this case we can prevent using push constant
data.
This is not expected to be common a occurance in real applications,
but it has been seen in tests.
We could extend this optimization to 16 and 32 for SIMD16 and SIMD32,
but it gets a bit more complicated, because this optimization is
currently being done early on, before we have decided the SIMD size.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Jordan Justen [Wed, 19 Oct 2016 17:25:21 +0000 (10:25 -0700)]
i965/cs: Use udiv/umod for local IDs
This allows for more optimizations relating to power-of-two divisions.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Timothy Arceri [Tue, 18 Oct 2016 23:51:48 +0000 (10:51 +1100)]
mesa: remove unused LocalSizeVariable
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Samuel Pitoiset [Wed, 19 Oct 2016 11:09:49 +0000 (13:09 +0200)]
nvc0/ir: simplify predicate logic for GK104 atomic operations
The predicate is always CC_NOT_P as defined in
processSurfaceCoordsNVE4(), so we only want to emit OR.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Wed, 19 Oct 2016 11:02:02 +0000 (13:02 +0200)]
nvc0/ir: remove useless NVC0LoweringPass::gMemBase
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Samuel Pitoiset [Wed, 19 Oct 2016 12:01:33 +0000 (14:01 +0200)]
nv50/ir: print CCTL subops in debug mode
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ian Romanick [Wed, 19 Oct 2016 15:53:10 +0000 (08:53 -0700)]
nir: Optimize integer division and modulus with 1
The previous power-of-two rules didn't catch idiv (because i965 doesn't
set lower_idiv) and imod cases. The udiv and umod cases should have
been caught, but I included them for orthogonality.
This fixes silly code observed from compute shaders with local_size_[xy]
= 1.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98299
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Marek Olšák [Tue, 18 Oct 2016 21:20:29 +0000 (23:20 +0200)]
configure.ac: enable EGL platform DRM if GBM is enabled
since GBM is enabled by default, this is also enabled by default
the whitespace changes remove tabs
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Marek Olšák [Tue, 18 Oct 2016 21:19:58 +0000 (23:19 +0200)]
configure.ac: enable GBM by default
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Marek Olšák [Tue, 18 Oct 2016 21:18:28 +0000 (23:18 +0200)]
configure.ac: print whether GBM is enabled
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Marek Olšák [Tue, 18 Oct 2016 13:20:22 +0000 (15:20 +0200)]
radeonsi: eliminate trivial constant VS outputs
These constant value VS PARAM exports:
- 0,0,0,0
- 0,0,0,1
- 1,1,1,0
- 1,1,1,1
can be loaded into PS inputs using the DEFAULT_VAL field, and the VS exports
can be removed from the IR to save export & parameter memory.
After LLVM optimizations, analyze the IR to see which exports are equal to
the ones listed above (or undef) and remove them if they are.
Targeted use cases:
- All DX9 eON ports always clear 10 VS outputs to 0.0 even if most of them
are unused by PS (such as Witcher 2 below).
- VS output arrays with unused elements that the GLSL compiler can't
eliminate (such as Batman below).
The shader-db deltas are quite interesting:
(not from upstream si-report.py, it won't be upstreamed)
PERCENTAGE DELTAS Shaders PARAM exports (affected only)
batman_arkham_origins 589 -67.17 %
bioshock-infinite 1769 -0.47 %
dirt-showdown 548 -2.68 %
dota2 1747 -3.36 %
f1-2015 776 -4.94 %
left_4_dead_2 1762 -0.07 %
metro_2033_redux 2670 -0.43 %
portal 474 -0.22 %
talos_principle 324 -3.63 %
warsow 176 -2.20 %
witcher2 1040 -73.78 %
----------------------------------------
All affected 991 -65.37 % ... 9681 -> 3353
----------------------------------------
Total 26725 -10.82 % ... 58490 -> 52162
v2: treat Undef as both 0 and 1
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> (v1)
Samuel Pitoiset [Tue, 18 Oct 2016 17:59:27 +0000 (19:59 +0200)]
nv50/ir: silent TGSI_PROPERTY_FS_DEPTH_LAYOUT
Found that information message while replaying a trace from
Metro 2033 Redux. Mark that property as useless for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Emil Velikov [Wed, 19 Oct 2016 17:46:22 +0000 (18:46 +0100)]
docs: add 13.1.0-devel release notes template, bump version
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Wed, 19 Oct 2016 16:33:38 +0000 (17:33 +0100)]
docs: rename release notes to 13.0.0
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Marek Olšák [Fri, 16 Sep 2016 20:42:54 +0000 (22:42 +0200)]
radeonsi: remove cb0_is_integer handling
st/mesa does this for us.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 16 Sep 2016 20:39:15 +0000 (22:39 +0200)]
st/mesa: disable alpha-test, alpha-to-coverage, alpha-to-one for integer FBs
v2: rebased
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 16 Oct 2016 22:54:35 +0000 (00:54 +0200)]
mesa: remove gl_shader_compiler_options::EmitNoNoise
it's always true
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 22:47:49 +0000 (00:47 +0200)]
glsl_to_tgsi: remove code for fixing up TGSI labels
I don't know what this was supposed to do, but all TGSI labels were
always 0.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 22:38:41 +0000 (00:38 +0200)]
glsl_to_tgsi: remove subroutine support
Never used. The GLSL compiler doesn't even look at EmitNoFunctions.
v2: add back "return" support in "main"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 22:11:21 +0000 (00:11 +0200)]
mesa_to_tgsi: remove remnants of flow control and subroutine support
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 22:07:01 +0000 (00:07 +0200)]
mesa_to_tgsi: drop support for instructions that can't occur here
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 20:08:03 +0000 (22:08 +0200)]
glsl_to_tgsi: allocate glsl_to_tgsi_instruction::tex_offsets on demand
sizeof(glsl_to_tgsi_instruction): 384 -> 264
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 20:04:02 +0000 (22:04 +0200)]
glsl_to_tgsi: merge buffer and sampler fields in glsl_to_tgsi_instruction
sizeof(glsl_to_tgsi_instruction): 416 -> 384
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 19:58:13 +0000 (21:58 +0200)]
glsl_to_tgsi: reduce the size of glsl_to_tgsi_instruction using bitfields
sizeof(glsl_to_tgsi_instruction): 464 -> 416
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 19:30:05 +0000 (21:30 +0200)]
glsl_to_tgsi: reduce the size of st_dst_reg and st_src_reg
I noticed that glsl_to_tgsi_instruction is too huge.
sizeof(glsl_to_tgsi_instruction): 752 -> 464 (-38%)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 19:28:36 +0000 (21:28 +0200)]
glsl_to_tgsi: remove unused st_translate::tex_offsets
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 19:22:11 +0000 (21:22 +0200)]
glsl_to_tgsi: remove unused parameters from calc_deref_offsets
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 16 Oct 2016 21:22:55 +0000 (23:22 +0200)]
glsl_to_tgsi: use array_id for temp arrays instead of hacking high bits
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Adam Jackson [Thu, 6 Oct 2016 19:37:54 +0000 (15:37 -0400)]
reviewers: Throw myself on the GLX grenade
Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Engestrom [Wed, 19 Oct 2016 14:09:26 +0000 (15:09 +0100)]
egl: bring back the default glapi.so name
Earlier commit replaced the default platform specific libglapi.so name
with an #error.
This may have been overzealous since the name is the correct for the BSD
platforms, at least. Reinstate the hunk - bringing back OpenBSD, et al.
to a successful build state.
Fixes: 7a9c92d071d ("egl/dri2: non-shared glapi cleanups")
[Emil Velikov: format the patch from Eric, add commit message and tag.]
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Iago Toral Quiroga [Tue, 27 Sep 2016 10:23:44 +0000 (12:23 +0200)]
i965: fix subnr overflow in suboffset()
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Dave Airlie [Wed, 19 Oct 2016 07:34:28 +0000 (17:34 +1000)]
radv: decompress fmask before reading using texture unit
Before we can read the fmask using the compute shader, we need
to decompress the fmask in place.
This fixes a bunch of remaining failure and hopefully multisampling
in Talos.
Dave Airlie [Wed, 19 Oct 2016 05:43:26 +0000 (15:43 +1000)]
radv: fix samples_identical return value.
This was returning an inversion, so not doing as it should have.
We need to compare the fmask value with 0, and return the result
from that.
Dave Airlie [Wed, 19 Oct 2016 03:53:55 +0000 (13:53 +1000)]
radv: fix wsi porting regression in swapchain destroy.
The code in anv is right, there's a pending patch to fix this up
different, but I'll sync the code for now.
Dave Airlie [Wed, 19 Oct 2016 02:27:04 +0000 (12:27 +1000)]
radv: fix fmask ptr issue
We were using the wrong descriptor in the fmask picking code.
Dave Airlie [Tue, 18 Oct 2016 03:20:11 +0000 (13:20 +1000)]
radv: simplify fast clear shaders
There is no need for anything but a noop shader here.
Dave Airlie [Wed, 19 Oct 2016 00:53:51 +0000 (10:53 +1000)]
vulkan/wsi: fix out of tree build.
Dave Airlie [Mon, 10 Oct 2016 02:20:36 +0000 (03:20 +0100)]
radv: start using defines for the user sgpr offsets
This adds some comments and adds defines for the user sgprs,
so that we can move them around easier later and not have
to change/revalidate every one of these.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 06:49:34 +0000 (07:49 +0100)]
radv: port to common wsi codebase
This drops all the radv WSI code in favour of using
the new shared code that was ported from anv
This regresses Talos for now, Jason has pointed out
the bug is in Talos and we should wait for them to fix it.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 06:12:33 +0000 (07:12 +0100)]
anv: move to using shared wsi code
This moves the shared code to a common subdirectory
and makes anv linked to that code instead of the copy
it was using.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 05:36:17 +0000 (06:36 +0100)]
anv/wsi: remove all anv references from WSI common code
the WSI code should be now be clean for sharing.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 04:42:29 +0000 (05:42 +0100)]
anv: move common wsi code to x11/wayland common files.
Next task is to rename all the anv_ out of this,
and move to a common location
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 04:14:45 +0000 (05:14 +0100)]
anv/wsi/wayland: add callback to get device format properties.
This avoids having to know the toplevel API name.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 02:09:02 +0000 (03:09 +0100)]
anv/wsi/wl: stop using device in more places
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 01:51:36 +0000 (02:51 +0100)]
anv/wsi: split out surface creation to avoid instance API
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 01:38:49 +0000 (02:38 +0100)]
anv/wsi: move further away from passing anv displays around
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Fri, 14 Oct 2016 00:34:10 +0000 (01:34 +0100)]
anv/wsi: split image alloc/free out to separate fns.
This moves these outside the wsi platform code, so we can reuse
that code
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 23:42:56 +0000 (00:42 +0100)]
anv/wsi: switch to using VkDevice in swapchain
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 23:35:12 +0000 (00:35 +0100)]
anv/wsi/x11: more refactoring to use generic handles
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 23:21:17 +0000 (00:21 +0100)]
anv/wsi/x11: start refactoring out the image allocation/free functionality
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:32:41 +0000 (05:32 +0100)]
anv/wsi: drop device from get format
Just use the wsi_device instead.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:26:03 +0000 (05:26 +0100)]
anv/wsi: remove device from get_support interface
replace with wsi_device and allocator.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:25:33 +0000 (05:25 +0100)]
anv/wsi/x11: abstract WSI interface from internals.
This allows the API and the internals to be split, and the
internals shared.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:18:34 +0000 (05:18 +0100)]
anv/wsi/x11: push anv_device out of the init/finish routines
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:14:52 +0000 (05:14 +0100)]
anv/wsi: abstract wsi interfaces away from device a bit more.
This is a step towards separating out the wsi code for sharing
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:07:27 +0000 (05:07 +0100)]
anv/wsi/x11: push device out of x11 connection fns.
just pass the allocator/wsi_interface instead.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:27:56 +0000 (05:27 +0100)]
anv/wsi: drop device from get caps
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 04:33:28 +0000 (05:33 +0100)]
anv/wsi: drop get present modes device arg
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dave Airlie [Thu, 13 Oct 2016 03:43:27 +0000 (04:43 +0100)]
radv/anv/wsi: drop unneeded parameter
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Roland Scheidegger [Sat, 15 Oct 2016 01:53:48 +0000 (03:53 +0200)]
draw: improve vertex fetch (v2)
The per-element fetch has quite some calculations which are constant,
these can be moved outside both the per-element as well as the main
shader loop (llvm can figure out it's constant mostly on its own, however
this can have a significant compile time cost).
Similarly, it looks easier swapping the fetch loops (outer loop per attrib,
inner loop filling up the per vertex elements - this way the aos->soa
conversion also can be done per attrib and not just at the end though again
this doesn't really make much of a difference in the generated code). (This
would also make it possible to vectorize the calculations leading to the
fetches.)
There's also some minimal change simplifying the overflow math slightly.
All in all, the generated code seems to look slightly simpler (depending
on the actual vs), but more importantly I've seen a significant reduction
in compile times for some vs (albeit with old (3.3) llvm version, and the
time reduction is only really for the optimizations run on the IR).
v2: adapt to other draw change.
No changes with piglit.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Fri, 14 Oct 2016 01:08:00 +0000 (03:08 +0200)]
draw: improved handling of undefined inputs
Previous attempts to zero initialize all inputs were not really optimal
(though no performance impact was measurable). In fact this is not really
necessary, since we know the max number of inputs used.
Instead, just generate fetch for up to max inputs used by the shader,
directly replacing inputs for which there was no vertex element by zero.
This also cleans up key generation, which previously would have stored
some garbage for these elements.
And also drop the assertion which indicates such bogus usage by a
debug_printf (the whole point of initializing the undefined inputs was to
make this case safe to handle).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Fri, 14 Oct 2016 03:37:34 +0000 (05:37 +0200)]
gallivm: print out time for jitting functions with GALLIVM_DEBUG=perf
Compilation to actual machine code can easily take as much time as the
optimization passes on the IR if not more, so print this out too.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Tue, 18 Oct 2016 01:37:37 +0000 (03:37 +0200)]
gallivm: Use native packs and unpacks for the lerps
For the texturing packs, things looked pretty terrible. For every
lerp, we were repacking the values, and while those look sort of cheap
with 128bit, with 256bit we end up with 2 of them instead of just 1 but
worse, plus 2 extracts too (the unpack, however, works fine with a
single instruction, albeit only with llvm 3.8 - the vpmovzxbw).
Ideally we'd use more clever pack for llvmpipe backend conversion too
since we actually use the "wrong" shuffle (which is more work) when doing
the fs twiddle just so we end up with the wrong order for being able to
do native pack when converting from 2x8f -> 1x16b. But this requires some
refactoring, since the untwiddle is separate from conversion.
This is only used for avx2 256bit pack/unpack for now.
Improves openarena scores by 8% or so, though overall it's still pretty
disappointing how much faster 256bit vectors are even with avx2 (or
rather, aren't...). And, of course, eliminating the needless
packs/unpacks in the first place would eliminate most of that advantage
(not quite all) from this patch.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Dave Airlie [Fri, 14 Oct 2016 03:42:01 +0000 (13:42 +1000)]
anv: drop pointless struct decl.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 03:41:47 +0000 (13:41 +1000)]
radv: drop pointless struct decl.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 03:36:45 +0000 (13:36 +1000)]
radv: move to using shared vk_alloc inlines.
This moves to the shared vk_alloc inlines for vulkan
memory allocations.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 03:31:35 +0000 (13:31 +1000)]
anv: move to using vk_alloc helpers.
This moves all the alloc/free in anv to the generic helpers.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 03:19:43 +0000 (13:19 +1000)]
vulkan: add vk_alloc.h shared allocation inlines.
vulkan allocation allows for overriding the allocator used,
add some macros for anv/radv to share for this.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 03:12:08 +0000 (13:12 +1000)]
anv: drop local MIN/MAX macros.
Use the ones from mesa, most places already did.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 03:11:20 +0000 (13:11 +1000)]
radv: drop local MIN/MAX macros.
Use the ones in macros.h instead.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 03:10:26 +0000 (13:10 +1000)]
util: move min/max/clamp macros to util macros.h
Although the vulkan drivers include mesa macros.h, for
radv I'd like to move away from that.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 02:59:55 +0000 (12:59 +1000)]
radv: make use of shared vector helper.
This removes the vector code from radv in favour of sharing
code with anv.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 02:57:44 +0000 (12:57 +1000)]
anv: port to using new u_vector shared helper.
This just removes the anv vector code and uses the new helper.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 14 Oct 2016 02:55:03 +0000 (12:55 +1000)]
util: add vector util code.
This is ported from anv, both anv and radv can share this.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Brian Paul [Tue, 18 Oct 2016 16:20:55 +0000 (10:20 -0600)]
svga: minor code improvements in svga_validate_pipe_sampler_view()
Use the 'texture' local var in more places.
Rename 'pFormat' to 'viewFormat'.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Lionel Landwerlin [Wed, 12 Oct 2016 19:04:26 +0000 (20:04 +0100)]
intel: genxml: add SAMPLER_BORDER_COLOR_STATE structures
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Boyuan Zhang [Mon, 17 Oct 2016 20:11:48 +0000 (16:11 -0400)]
st/va: force to flush the last p frame in idr period
During dual instance encoding submission, if the second encode task and first
encode task have no reference dependency, e.g. p following with idr-frame,
there is a chance the second task will use for its reconstructed picture
buffer the same buffer used by first task for its reference/reconstructed
picture. In this case, buffer corruption may occur depending on encoding
speed. Fix is to force flush these two tasks separately to avoid race condition
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98005
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Chad Versace [Tue, 18 Oct 2016 16:39:49 +0000 (09:39 -0700)]
egl/surfaceless: Fix segfault in eglSwapBuffers
Since commit
63c5d5c6c46c8472ee7a8241a0f80f13d79cb8cd, the surfaceless
platform has allowed creation of pbuffer surfaces. But the vtable entry
for eglSwapBuffers has remained NULL.
Discovered by running a little pbuffer test.
Cc: Gurchetan Singh <gurchetansingh@chromium.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Marek Olšák [Mon, 17 Oct 2016 10:51:27 +0000 (12:51 +0200)]
radeonsi: rename prefixes from radeon to si
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Marek Olšák [Mon, 17 Oct 2016 10:42:12 +0000 (12:42 +0200)]
radeonsi: merge radeon_llvm_context and si_shader_context
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Marek Olšák [Mon, 17 Oct 2016 10:30:42 +0000 (12:30 +0200)]
radeonsi: import all TGSI->LLVM code from gallium/radeon
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Marek Olšák [Sun, 16 Oct 2016 23:51:53 +0000 (01:51 +0200)]
gallium/radeon: simplify initialization of 64-bit gallivm builders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Marek Olšák [Sun, 16 Oct 2016 23:39:21 +0000 (01:39 +0200)]
gallium/radeon: remove unused radeon_llvm_reg_index_soa
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Marek Olšák [Sun, 16 Oct 2016 23:36:58 +0000 (01:36 +0200)]
radeonsi: move LLVM ALU codegen into radeonsi
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Jonathan Gray [Sun, 16 Oct 2016 12:08:42 +0000 (23:08 +1100)]
genxml: add generated headers to EXTRA_DIST
Building the Mesa 12.0.3 distfile failed on a system without python
as generated files were not included in the distfile.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Jonathan Gray [Sun, 16 Oct 2016 12:16:19 +0000 (23:16 +1100)]
mesa: automake: include mesa_glinterop.h in distfile
Add mesa_glinterop.h to the list of headers that will get included
in the distfile as it is required to build Mesa itself.
Corrects a regression introduced in
a89faa2022fd995af2019c886b152b49a01f9392.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Jonathan Gray [Sun, 16 Oct 2016 10:06:25 +0000 (21:06 +1100)]
egl: remove docs directory from EXTRA_DIST
The egl docs directory no longer exists as of
88b5c36fe1a1546bf633ee161a6715efc593acbd.
Remove it from EXTRA_DIST to unbreak 'make dist'
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Jonathan Gray [Sun, 16 Oct 2016 05:41:55 +0000 (16:41 +1100)]
genxml: avoid using a GNU make pattern rule
% pattern rules are a GNU extension. Convert the use of one to a
inference rule to allow this to build on OpenBSD.
This is a related change to the one made in
e3d43dc5eae5271e2c87bab702aa7409d3dd0b23
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 12 Sep 2016 17:47:54 +0000 (18:47 +0100)]
configure.ac: use a single require_libdrm helper
Rather than having 4-5 places which do the explicit check/message just
polish the gallium helper and use it everywhere.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>