Emil Velikov [Thu, 26 Jan 2017 19:26:13 +0000 (19:26 +0000)]
docs/releasing: add a note about the relnotes template
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Emil Velikov [Thu, 26 Jan 2017 13:24:10 +0000 (13:24 +0000)]
mesa: remove explicit __STDC_FORMAT_MACROS define
Analogous to previous commits.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Emil Velikov [Thu, 26 Jan 2017 13:24:09 +0000 (13:24 +0000)]
nouveau: remove explicit __STDC_FORMAT_MACROS define
Already handled by the build.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Emil Velikov [Thu, 26 Jan 2017 13:24:08 +0000 (13:24 +0000)]
scons: swr: remove explicit __STDC_.*_MACROS defines
Analogous to previous commits.
Cc: George Kyriazis <george.kyriazis@intel.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Emil Velikov [Thu, 26 Jan 2017 13:24:07 +0000 (13:24 +0000)]
gallium: remove explicit __STDC_.*_MACROS defines
Analogous to previous commits.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Emil Velikov [Thu, 26 Jan 2017 13:24:06 +0000 (13:24 +0000)]
gallivm: remove explicit __STDC_.*_MACROS defines
Correctly handled by the build systems.
Cc: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Emil Velikov [Thu, 26 Jan 2017 13:24:05 +0000 (13:24 +0000)]
glsl: remove explicit __STDC_FORMAT_MACROS define
Correctly handled by all the build systems.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Emil Velikov [Thu, 26 Jan 2017 13:24:04 +0000 (13:24 +0000)]
autoconf: set all __STDC_*_MACROS
Analogous to previous commit(s), with a minor detail - here we set the
macros when building both C and C++ sources.
Resolving that is a more challenging task that we'll sort out another
day.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Emil Velikov [Thu, 26 Jan 2017 13:24:03 +0000 (13:24 +0000)]
scons: always set __STDC_*_MACROS for C++ sources
Analogous to previous commit - just set the lot once throughout.
Cc: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Emil Velikov [Thu, 26 Jan 2017 13:24:02 +0000 (13:24 +0000)]
android: always set __STDC_*_MACROS for C++ sources
Various parts of the code depend on the macros being defined.
Just set those unconditionally, only where needed (c++ sources) so that
we can drop the workarounds through the code.
Cc: Rob Herring <robh@kernel.org>
Cc: Chih-Wei Huang <cwhuang@android-x86.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Emil Velikov [Thu, 26 Jan 2017 13:18:43 +0000 (13:18 +0000)]
st/xa: automake: remove duplicate -Wall
Already handled by configure.ac
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Thu, 26 Jan 2017 13:18:41 +0000 (13:18 +0000)]
mesa: move variable declaration to where it's used
The variable replacement was unused when building w/o
ENABLE_SHADER_CACHE. Since we can mix variable declarations and code,
move it to where it's used.
Fixes: 9f8dc3bf03e "utils: build sha1/disk cache only with
Android/Autoconf"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
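The idea in the commit above can be sketched as follows. This is only an illustration under stated assumptions: ENABLE_DEMO_CACHE, demo_cache_get() and demo_lookup() are hypothetical names, not Mesa code. The variable is declared inside the conditional block, next to its only use, so a build without the feature never sees an unused variable.

```c
#include <assert.h>

/* Hypothetical build switch; the commit's real one is ENABLE_SHADER_CACHE. */
#ifdef ENABLE_DEMO_CACHE
static int demo_cache_get(int key)
{
   return key == 1 ? 100 : -1;   /* pretend only key 1 is cached */
}
#endif

static int demo_lookup(int key)
{
   assert(key >= 0);

   int result = key * 2;   /* fallback used by every build */

#ifdef ENABLE_DEMO_CACHE
   /* Declared next to its only use: builds without the cache never
    * see an unused variable. */
   int cached = demo_cache_get(key);
   if (cached >= 0)
      result = cached;
#endif

   return result;
}
```

Mixing declarations and statements like this requires C99 or later.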
Emil Velikov [Thu, 26 Jan 2017 13:18:40 +0000 (13:18 +0000)]
st/mesa: use correct return statement for a void function
Analogous to previous commit.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Emil Velikov [Thu, 26 Jan 2017 13:18:39 +0000 (13:18 +0000)]
mesa: use correct return statement for a void function
Using return foo() is incorrect even if foo itself returns void.
Spotted by AppVeyor, as below:
teximage.c(3653) : warning C4098: 'copyteximage' : 'void' function returning a value
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
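To illustrate the warning above: in C, a return statement with an expression is not allowed in a function whose return type is void, even when the expression itself has type void (C++ permits it, and some C compilers only complain in pedantic mode). A minimal sketch, with hypothetical function names that are not taken from teximage.c:

```c
#include <assert.h>

static int levels_copied;

/* Hypothetical helper standing in for the real worker function. */
static void do_copy(int level)
{
   assert(level >= 0);
   levels_copied += level;
}

/* Flagged form (the shape MSVC reports as C4098):
 *
 *    static void copy_image(int level) { return do_copy(level); }
 *
 * Fixed form: call the helper, then return without an expression. */
static void copy_image(int level)
{
   do_copy(level);
   return;
}
```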
Emil Velikov [Thu, 26 Jan 2017 13:18:38 +0000 (13:18 +0000)]
svga: remove const qualifier from SVGA3D_vgpu10_GenMips() prototype
Does not match the function definition or how it's used. Triggers the
following warning in AppVeyor
svga_cmd_vgpu10.c(1301) : warning C4028: formal parameter 2 different from declaration
Cc: Charmaine Lee <charmainel@vmware.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Thu, 26 Jan 2017 13:18:37 +0000 (13:18 +0000)]
nir: add extra const notation in compare_blocks()
MSVC warns about different const qualifiers. Add the extra const to
silence it.
nir_phi_builder.c(244) : warning C4090: 'initializing' : different 'const' qualifiers
nir_phi_builder.c(245) : warning C4090: 'initializing' : different 'const' qualifiers
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
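The commit message does not show which declaration MSVC objected to, so the following is only a sketch of a fully const-qualified qsort() comparator over an array of pointers. struct block and sort_blocks() are hypothetical stand-ins, not the NIR types. Because qsort() hands the comparator const void * arguments, the local pointers keep const at every level they read through.

```c
#include <assert.h>
#include <stdlib.h>

struct block {
   unsigned index;   /* hypothetical stand-in for a block's sort key */
};

/* Each argument points at an array element, and each element is itself a
 * pointer to a block, hence "const struct block * const *". */
static int compare_blocks(const void *_a, const void *_b)
{
   const struct block * const *a = _a;
   const struct block * const *b = _b;

   /* Compare instead of subtracting to avoid unsigned wrap-around. */
   return ((*a)->index > (*b)->index) - ((*a)->index < (*b)->index);
}

static void sort_blocks(struct block **blocks, size_t count)
{
   assert(blocks != NULL || count == 0);
   qsort(blocks, count, sizeof(blocks[0]), compare_blocks);
}
```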
Emil Velikov [Thu, 26 Jan 2017 13:18:36 +0000 (13:18 +0000)]
nir: silence implicit conversion to 64bit
MSVC warns about implicit conversion as below. Annotate the literal
appropriately to silence the warning.
nir_gather_info.c(249) : warning C4334: '<<' : result of 32-bit shift
implicitly converted to 64 bits (was 64-bit shift intended?)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
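The warning above fires when a shift is evaluated at 32-bit width and only the result is widened to 64 bits. A minimal sketch of the usual remedy, giving the literal a 64-bit type so the shift itself happens at 64 bits; slot_bit() is a hypothetical helper, and the exact spelling used in nir_gather_info.c is not shown in the commit message:

```c
#include <assert.h>
#include <stdint.h>

/* Flagged form (C4334 on MSVC): the shift is done on a 32-bit int and
 * only then converted, so bits at position 31 and above are not safe:
 *
 *    uint64_t mask = 1 << slot;
 *
 * Fixed form: make the literal 64 bits wide before shifting. */
static uint64_t slot_bit(unsigned slot)
{
   assert(slot < 64);
   return UINT64_C(1) << slot;
}
```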
Emil Velikov [Mon, 16 Jan 2017 15:45:50 +0000 (15:45 +0000)]
i915, i965: automake: remove NA include directive
The path in question (... dri/intel/server) was removed years ago.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:49 +0000 (15:45 +0000)]
mesa/tests: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:48 +0000 (15:45 +0000)]
dri/osmesa: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:47 +0000 (15:45 +0000)]
dri/swrast: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:46 +0000 (15:45 +0000)]
radeon, r200: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:45 +0000 (15:45 +0000)]
mapi: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:44 +0000 (15:45 +0000)]
loader: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:43 +0000 (15:45 +0000)]
glx/windows: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:42 +0000 (15:45 +0000)]
glx/apple: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jeremy Sequoia <jeremyhu@apple.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:41 +0000 (15:45 +0000)]
glx: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:40 +0000 (15:45 +0000)]
d3dadapter9: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Axel Davy <axel.davy@ens.fr>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:39 +0000 (15:45 +0000)]
st/dri: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:38 +0000 (15:45 +0000)]
clover: automake: remove -I$(srcdir)
Already implicitly handled by the build system.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:37 +0000 (15:45 +0000)]
clover: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Aaron Watry <awatry@gmail.com>
Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:36 +0000 (15:45 +0000)]
egl: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:35 +0000 (15:45 +0000)]
i915: automake: include builddir prior to srcdir
Analogous to previous commit.
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:34 +0000 (15:45 +0000)]
i965: automake: include builddir prior to srcdir
The latter can contain stale generated files which, as-is, we'll end up
using.
Fixes: bfd17c76c12 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Mon, 16 Jan 2017 15:45:33 +0000 (15:45 +0000)]
freedreno: automake: correctly set MKDIR_GEN
Analogous to previous commit.
Fixes: 4610e5ef28e "freedreno/ir3: fix sin/cos"
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Rob Clark <robclark@freedesktop.org>
Cc: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Reported-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Nicolas Dechesne <nicolas.dechesne@linaro.org>
Emil Velikov [Mon, 16 Jan 2017 15:45:32 +0000 (15:45 +0000)]
i965: automake: correctly set MKDIR_GEN
Otherwise we might end up w/o the respective folder (depending on
autotools version) and fail at build time.
Fixes: bfd17c76c12 "i965: Port INTEL_PRECISE_TRIG=1 to NIR."
Cc: "12.0 13.0" <mesa-dev@lists.freedesktop.org>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Engestrom [Thu, 26 Jan 2017 13:48:18 +0000 (13:48 +0000)]
anv: add missing extension errors in vk_errorf()
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Engestrom [Thu, 26 Jan 2017 13:48:17 +0000 (13:48 +0000)]
anv: add missing core errors in vk_errorf()
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Lionel Landwerlin [Thu, 26 Jan 2017 11:25:44 +0000 (11:25 +0000)]
anv: don't assert on out of memory descriptor pool in debug mode
Fixes:
dEQP-VK.api.descriptor_pool.out_of_pool_memory
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Eric Engestrom [Thu, 26 Jan 2017 18:11:10 +0000 (18:11 +0000)]
docs/repository: fix name of main branch
This is git, not svn :P
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Eric Engestrom [Tue, 24 Jan 2017 18:07:06 +0000 (18:07 +0000)]
egl: EGL_PLATFORM_SURFACELESS_MESA is now upstream
EGL_PLATFORM_SURFACELESS_MESA is in eglext.h as of last commit.
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Eric Engestrom [Tue, 24 Jan 2017 18:07:05 +0000 (18:07 +0000)]
egl: update headers from registry
Khronos introduced a new macro (suggested by Google) to avoid using
C-style casts in C++ code, as those generate warnings.
Khronos Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16113
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Eric Engestrom [Thu, 26 Jan 2017 14:20:24 +0000 (14:20 +0000)]
radv: add missing extension errors in vk_errorf()
v2(Bas): Remove the extra VK_ERROR_FRAGMENTED_POOL cases.
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Eric Engestrom [Thu, 26 Jan 2017 14:20:23 +0000 (14:20 +0000)]
radv: add missing core errors in vk_errorf()
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Andreas Boll [Tue, 24 Jan 2017 15:44:12 +0000 (16:44 +0100)]
configure.ac: Require LLVM for r300 only on x86 and x86_64
b3119a3 introduced a strict LLVM requirement for r300 on all
architectures and thus configure fails on architectures where LLVM is
not available or buggy.
r300 doesn't strictly require LLVM, but for performance reasons we
highly recommend LLVM usage. So require it at least on x86 and x86_64
architectures as we have done before b3119a3.
Fixes: b3119a3 ("configure.ac: Check gallium LLVM version in gallium_require_llvm")
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Nicolai Hähnle [Tue, 24 Jan 2017 20:22:32 +0000 (21:22 +0100)]
gallium: enable int64 on radeonsi, llvmpipe, softpipe
All of these have had support for the TGSI opcodes since before most of
the glsl compiler work landed.
Also update the docs accordingly, including the missing note about i965.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Dave Airlie [Thu, 9 Jun 2016 00:17:58 +0000 (10:17 +1000)]
st/mesa: add support for enabling ARB_gpu_shader_int64.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Dave Airlie [Thu, 9 Jun 2016 00:17:26 +0000 (10:17 +1000)]
st/glsl_to_tgsi: add support for 64-bit integers
v2: add conversion opcodes.
v3 (idr): Rebase on replacement of TGSI_OPCODE_I2U64 with
TGSI_OPCODE_I2I64.
v4 (idr): "cut them down later" => Remove ir_unop_b2u64 and
ir_unop_u642b. Handle these with extra i2u or u2i casts just like
uint(bool) and bool(uint) conversion is done.
v5 (nha): add clarifying comment about a subtle assumption
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Dave Airlie [Thu, 9 Jun 2016 00:13:03 +0000 (10:13 +1000)]
gallium: Add integer 64 capability
v1.1: move to using a normal CAP. (Marek)
v2: fill in the cap everywhere
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Topi Pohjolainen [Sat, 26 Nov 2016 16:03:56 +0000 (18:03 +0200)]
meta: Refactor texture format translation
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Topi Pohjolainen [Tue, 29 Nov 2016 07:56:23 +0000 (09:56 +0200)]
intel/blorp/dbg: Name blit shaders for easy recognition in dumps
Blorp clears already have an equivalent.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Topi Pohjolainen [Wed, 11 Jan 2017 08:26:32 +0000 (10:26 +0200)]
i965/hiz/gen6: Stop setting false qpitch
The qpitch is not applicable to "all slices at each lod". The current
logic leads one to believe it has some purpose. When the miptree
layout is calculated, brw_miptree_layout_texture_array() sets
the qpitch unconditionally, but it is later ignored altogether
for ALL_SLICES_AT_EACH_LOD.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Topi Pohjolainen [Tue, 10 Jan 2017 08:24:26 +0000 (10:24 +0200)]
i965/blorp/gen6: Remove dead code in hiz setup
As the comment for intel_miptree_hiz_buffer::mt states, hiz_mt
only exists for gen6. In addition, intel_hiz_miptree_buf_create()
uses MIPTREE_LAYOUT_FORCE_ALL_SLICE_AT_LOD unconditionally.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Topi Pohjolainen [Tue, 10 Jan 2017 09:02:08 +0000 (11:02 +0200)]
i965/gen6: Simplify hiz surface setup
In intel_hiz_miptree_buf_create() intel_miptree_aux_buffer::bo
is unconditionally initialised to point to the same buffer
object as hiz_mt does. The same goes for
intel_miptree_aux_buffer::pitch/qpitch.
This will make following patches simpler to read.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Topi Pohjolainen [Tue, 10 Jan 2017 08:13:30 +0000 (10:13 +0200)]
i965/blorp/gen6: Simplify hiz surface setup
In intel_hiz_miptree_buf_create() intel_miptree_aux_buffer::bo
is unconditionally initialised to point to the same buffer
object as hiz_mt does. Also intel_miptree_aux_buffer::offset
is initialised to zero (calloc()).
This will make following patches significantly simpler to read.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Topi Pohjolainen [Thu, 29 Dec 2016 08:06:16 +0000 (10:06 +0200)]
i965/gen6: Remove check for stencil format
There is no alternative.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Topi Pohjolainen [Wed, 28 Dec 2016 15:49:56 +0000 (17:49 +0200)]
i965: Remove check for hiz on earlier gens than SNB
The only caller, brw_workaround_depthstencil_alignment(), returns
early for gen6+.
While at it, reduce the scope of brw_get_depthstencil_tile_masks() as
well.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Topi Pohjolainen [Thu, 22 Dec 2016 08:36:03 +0000 (10:36 +0200)]
i965/miptree: Remove redundant check for null texture
The exact same check is done earlier in brw_miptree_layout(), which
intel_miptree_create_layout() in turn calls unconditionally.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Topi Pohjolainen [Tue, 17 Jan 2017 08:10:17 +0000 (10:10 +0200)]
i965/miptree: Tell when brw_miptree_layout() fails
In addition, let intel_miptree_create_layout() release the
miptree - it is the allocator.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Topi Pohjolainen [Thu, 22 Dec 2016 08:09:55 +0000 (10:09 +0200)]
i965/meta: Remove unused brw_get_rb_for_slice()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Michel Dänzer [Thu, 26 Jan 2017 06:28:12 +0000 (15:28 +0900)]
clover: Fix build against clang SVN >= r293097
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Eric Anholt [Sun, 8 Jan 2017 22:54:57 +0000 (14:54 -0800)]
vc4: Use NEON to speed up utile stores on Pi2+.
Improves 1024x1024 TexSubImage2D by 41.2371% +/- 3.52799% (n=10).
Eric Anholt [Thu, 5 Jan 2017 23:11:30 +0000 (15:11 -0800)]
vc4: Use NEON to speed up utile loads on Pi2.
We had a lot of memcpy call overhead because gpu_stride wasn't being
inlined. But if you split out the stride==8 and stride==16 cases like
this code does while still using memcpy, you'd no longer have glibc's
NEON memcpy applied at which point we'd be doing 16 uncached reads
instead of 64/(NEON memcpy granularity), for about a 30% performance
hit. By hand writing the assembly, we can get a whole cacheline
loaded at a time.
Unfortunately, NEON intrinsics turned out to be unusable -- they
didn't have the vldm instruction available.
Note that, for now, the NEON code is only enabled when building for ARMv7
(Pi 2+). We may want to do runtime detection for the Raspbian case, in
the future.
Improves 1024x1024 GetTexImage by 208.256% +/- 7.07029% (n=10).
Eric Anholt [Fri, 6 Jan 2017 18:51:22 +0000 (10:51 -0800)]
vc4: Move LT tiling code to a separate file.
This paves the way for building it twice, with NEON assembly or not.
Eric Anholt [Fri, 6 Jan 2017 18:55:07 +0000 (10:55 -0800)]
vc4: Use unreachable() in an unreachable codepath for tiling.
Samuel Pitoiset [Wed, 25 Jan 2017 15:56:46 +0000 (16:56 +0100)]
gallium/radeon: add VRAM-vis-usage HUD query
This new query returns the current visible usage of VRAM accessed
by the CPU. It will return 0 on radeon because it's unimplemented.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Wed, 25 Jan 2017 15:56:45 +0000 (16:56 +0100)]
gallium/radeon: query the CPU accessible size of VRAM
R600_DEBUG="info" can be used to display that size, as well as
the total amount of VRAM/GTT.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Ian Romanick [Thu, 6 Nov 2014 19:12:31 +0000 (11:12 -0800)]
mesa: Arrange validate_uniform_parameters parameters to match call sites
Saves a measly 20 bytes on IA32 and nothing on x64. Depending on
exactly when this is applied, a lot of variation is possible due to
function alignment.
   text    data     bss     dec     hex filename
6670131  228340   22552 6921023  699b3f lib/i965_dri.so before
6670111  228340   22552 6921003  699b2b lib/i965_dri.so after
6342932  293872   29880 6666684  65b9bc lib64/i965_dri.so before
6342932  293872   29880 6666684  65b9bc lib64/i965_dri.so after
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Ian Romanick [Thu, 6 Nov 2014 02:44:21 +0000 (18:44 -0800)]
mesa: Arrange _mesa_uniform parameters to match the call sites
By putting the parameters first that match the parameters to the call
site, 4 (of 14) instructions are saved at _mesa_Uniform4fv on x64. On
IA32, the details of the instructions change, but it is the same count
and mix of instructions.
Before:
0000000000000830 <_mesa_Uniform4fv>:
830: 48 83 ec 10 sub $0x10,%rsp
834: 49 89 d0 mov %rdx,%r8
837: 48 8b 15 00 00 00 00 mov 0x0(%rip),%rdx # 83e <_mesa_Uniform4fv+0xe>
83e: 89 f8 mov %edi,%eax
840: 89 f1 mov %esi,%ecx
842: 41 b9 02 00 00 00 mov $0x2,%r9d
848: 64 48 8b 3a mov %fs:(%rdx),%rdi
84c: 48 8b 97 c8 01 02 00 mov 0x201c8(%rdi),%rdx
853: 48 8b 72 70 mov 0x70(%rdx),%rsi
857: 6a 04 pushq $0x4
859: 89 c2 mov %eax,%edx
85b: e8 00 00 00 00 callq 860 <_mesa_Uniform4fv+0x30>
860: 48 83 c4 18 add $0x18,%rsp
864: c3 retq
After:
00000000000007f0 <_mesa_Uniform4fv>:
7f0: 48 83 ec 10 sub $0x10,%rsp
7f4: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax # 7fb <_mesa_Uniform4fv+0xb>
7fb: 41 b9 02 00 00 00 mov $0x2,%r9d
801: 64 48 8b 08 mov %fs:(%rax),%rcx
805: 48 8b 81 c8 01 02 00 mov 0x201c8(%rcx),%rax
80c: 6a 04 pushq $0x4
80e: 4c 8b 40 70 mov 0x70(%rax),%r8
812: e8 00 00 00 00 callq 817 <_mesa_Uniform4fv+0x27>
817: 48 83 c4 18 add $0x18,%rsp
81b: c3 retq
Saves a measly 416 bytes of text on x64. Depending on exactly when this
is applied, a lot of variation is possible due to function alignment.
   text    data     bss     dec     hex filename
6670131  228340   22552 6921023  699b3f lib/i965_dri.so before
6670131  228340   22552 6921023  699b3f lib/i965_dri.so after
6343348  293872   29880 6667100  65bb5c lib64/i965_dri.so before
6342932  293872   29880 6666684  65b9bc lib64/i965_dri.so after
There is likely to be no performance change with just this patch.
_mesa_uniform immediately calls validate_uniform_parameters with
parameters in the "wrong" (different from the call site) order.
v2: Rebase on GL_ARB_gpu_shader_fp64.
v3: Rebase on GL_ARB_gpu_shader_int64.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Ian Romanick [Thu, 6 Nov 2014 01:48:40 +0000 (17:48 -0800)]
mesa: Arrange _mesa_uniform_matrix parameters to match the call sites
By putting the parameters first that match the parameters to the call
site, 4 (of 16) instructions are saved at _mesa_UniformMatrix4fv on
x64. On IA32, the details of the instructions change, but it is the
same count and mix of instructions.
Before:
0000000000001380 <_mesa_UniformMatrix4fv>:
1380: 48 83 ec 10 sub $0x10,%rsp
1384: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax # 138b <_mesa_UniformMatrix4fv+0xb>
138b: 41 89 f8 mov %edi,%r8d
138e: 41 89 f1 mov %esi,%r9d
1391: 0f b6 d2 movzbl %dl,%edx
1394: 64 48 8b 38 mov %fs:(%rax),%rdi
1398: 48 8b b7 c8 01 02 00 mov 0x201c8(%rdi),%rsi
139f: 48 8b 76 70 mov 0x70(%rsi),%rsi
13a3: 68 06 14 00 00 pushq $0x1406
13a8: 51 push %rcx
13a9: 52 push %rdx
13aa: b9 04 00 00 00 mov $0x4,%ecx
13af: ba 04 00 00 00 mov $0x4,%edx
13b4: e8 00 00 00 00 callq 13b9 <_mesa_UniformMatrix4fv+0x39>
13b9: 48 83 c4 28 add $0x28,%rsp
13bd: c3 retq
After:
0000000000001360 <_mesa_UniformMatrix4fv>:
1360: 48 83 ec 10 sub $0x10,%rsp
1364: 48 8b 05 00 00 00 00 mov 0x0(%rip),%rax # 136b <_mesa_UniformMatrix4fv+0xb>
136b: 0f b6 d2 movzbl %dl,%edx
136e: 64 4c 8b 00 mov %fs:(%rax),%r8
1372: 49 8b 80 c8 01 02 00 mov 0x201c8(%r8),%rax
1379: 68 06 14 00 00 pushq $0x1406
137e: 6a 04 pushq $0x4
1380: 6a 04 pushq $0x4
1382: 4c 8b 48 70 mov 0x70(%rax),%r9
1386: e8 00 00 00 00 callq 138b <_mesa_UniformMatrix4fv+0x2b>
138b: 48 83 c4 28 add $0x28,%rsp
138f: c3 retq
Saves a measly 576 bytes of text on x64.
   text    data     bss     dec     hex filename
6670131  228340   22552 6921023  699b3f lib/i965_dri.so before
6670131  228340   22552 6921023  699b3f lib/i965_dri.so after
6343924  293872   29880 6667676  65bd9c lib64/i965_dri.so before
6343348  293872   29880 6667100  65bb5c lib64/i965_dri.so after
v2: Rebase on GL_ARB_gpu_shader_fp64.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Ian Romanick [Tue, 1 Sep 2015 17:29:04 +0000 (10:29 -0700)]
mesa: Trivial clean-ups in uniform_query.cpp
This is C++, so we can mix code and declarations. Doing so allows
constification.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Lionel Landwerlin [Thu, 26 Jan 2017 16:57:40 +0000 (16:57 +0000)]
spirv: handle undefined components for OpVectorShuffle
Fixes:
dEQP-VK.spirv_assembly.instruction.compute.opspecconstantop.vector_related
dEQP-VK.spirv_assembly.instruction.graphics.opspecconstantop.vector_related*
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
Lionel Landwerlin [Thu, 26 Jan 2017 16:57:25 +0000 (16:57 +0000)]
spirv: handle OpUndef as part of the variable parsing pass
Consider the following bit of SPIR-V shader:
...
%zero = OpConstant %i32 0
%ivec3_0 = OpConstantComposite %ivec3 %zero %zero %zero
%vec3_undef = OpUndef %ivec3
%sc_0 = OpSpecConstant %i32 0
%sc_1 = OpSpecConstant %i32 0
%sc_2 = OpSpecConstant %i32 0
...
Our compiler currently stops parsing variables & types on the OpUndef
and switches to instructions, leaving the following sc_[0-2] variables
untreated.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
Lionel Landwerlin [Thu, 26 Jan 2017 11:06:53 +0000 (11:06 +0000)]
anv: fix descriptor pool internal size allocation
The size of the pool is slightly smaller than the size of the
structure containing the whole pool. We need to take that into account
when setting up the internals.
Fixes a crash due to out of bound memory access in:
dEQP-VK.api.descriptor_pool.out_of_pool_memory
v2: Drop debug traces (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
Kenneth Graunke [Sun, 22 Jan 2017 09:44:08 +0000 (01:44 -0800)]
i965: Make intelEmitCopyBlit not truncate large strides.
When trying to blit larger tiled surfaces, the pitch can be larger than
32768 bytes, which means it won't fit in a GLshort. Passing it in will
truncate the stride to 0, which has...surprising results.
The pitch can be up to 32,768 DWords, or 128kB. We measure it in bytes,
but divide by 4 when programming it. So we need to handle values up to
131,072. Switch from GLshort to int32_t to avoid the truncation.
Fixes GL45-CTS.gtf30.GL3Tests.depth_texture.depth_texture_copyteximage
at widths greater than 8192.
v2: Use int32_t as negative values can be used (Jason).
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
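The arithmetic in the commit above can be checked with a small sketch. The helper names are hypothetical and the code does not reproduce intelEmitCopyBlit(); it assumes only that GLshort is a signed 16-bit type. The maximum pitch of 32,768 DWords is 131,072 bytes, which is far outside the 16-bit range, and its low 16 bits are all zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a byte pitch survives a trip through a signed 16-bit parameter. */
static bool fits_in_int16(int32_t pitch_bytes)
{
   return pitch_bytes >= INT16_MIN && pitch_bytes <= INT16_MAX;
}

/* Low 16 bits of the pitch: what is left after a 16-bit truncation. */
static uint16_t low_16_bits(int32_t pitch_bytes)
{
   return (uint16_t)pitch_bytes;
}

/* The pitch is measured in bytes but programmed in DWords. */
static int32_t pitch_in_dwords(int32_t pitch_bytes)
{
   assert(pitch_bytes % 4 == 0);
   return pitch_bytes / 4;
}
```

An int32_t parameter holds the full range, including the negative pitches mentioned in the v2 note.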
Kenneth Graunke [Tue, 24 Jan 2017 08:45:53 +0000 (00:45 -0800)]
i965: Use a UW source type for CS_OPCODE_CS_TERMINATE.
SIMD16 compute shaders use a send(16) with mlen 1 for the EOT message,
using a source of g127 for the single register. With a UD type, this
supposedly could read g128, which doesn't exist, causing the simulator
to get cranky. Use a UW type to avoid this.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Iago Toral Quiroga [Wed, 25 Jan 2017 14:04:35 +0000 (15:04 +0100)]
anv/lower_input_attachments: honor sample index parameter to subpassLoad()
According to GL_KHR_vulkan_glsl, the signature of subpassLoad() is:
gvec4 subpassLoad(gsubpassInput subpass);
gvec4 subpassLoad(gsubpassInputMS subpass, int sample);
So the multisampled case always receives an explicit sample index that we
should use. The current implementation was ignoring this parameter
and using gl_SampleID value instead.
Fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_id.*
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Kenneth Graunke [Mon, 23 Jan 2017 19:57:21 +0000 (11:57 -0800)]
i965: Fix fast depth clears for surfaces with a dimension of 16384.
I hadn't bothered to set this bit because I figured it would just
paper over us getting the rectangle wrong. But it turns out that
there is a legitimate reason to use it, so let's do so.
The alternative would be to chop up 16k clears to multiple 8k clears,
which is pointlessly painful.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Chad Versace [Wed, 25 Jan 2017 20:12:20 +0000 (12:12 -0800)]
anv: Implement VK_KHR_get_physical_device_properties2
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Chad Versace [Wed, 25 Jan 2017 20:12:19 +0000 (12:12 -0800)]
anv: Refactor anv_GetPhysicalDeviceQueueFamilyProperties()
Add a helper function, anv_get_queue_family_properties(), which fills the
struct. This patch reduces churn in the following patch that implements
vkGetPhysicalDeviceQueueFamilyProperties2KHR.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Chad Versace [Wed, 25 Jan 2017 20:12:18 +0000 (12:12 -0800)]
anv: Refactor anv_GetPhysicalDeviceFormatProperties()
Add a helper function, anv_get_image_format_properties(), which does all
the work and has a VkPhysicalDeviceImageFormatInfo2KHR parameter. This
patch reduces churn in the following patch that implements
vkGetPhysicalDeviceImageFormatProperties2KHR.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Chad Versace [Wed, 25 Jan 2017 21:53:00 +0000 (13:53 -0800)]
anv: Revive struct anv_common
The struct was deleted by:
commit efe9d1cde3340d3a9d17e5560b609a4fb839d43d
Author: Edward O'Callaghan <funfunctor@folklore1984.net>
Subject: anv: Clean up some unused variables
Unlike the original anv_common, the new one has a non-const pNext
pointer because we will use it for the output structs of
VK_KHR_get_physical_device_properties2.
v2:
- Retype pNext from void* to struct anv_common*.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
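A struct of this kind is typically used to walk a pNext chain without knowing the concrete type of each entry. The following is a minimal sketch of that idea. All names are hypothetical stand-ins, and the layout is an assumption rather than the driver's actual definition.

```c
#include <stddef.h>

/* Hypothetical stand-in for structure type tags. */
enum example_stype { EXAMPLE_STYPE_A = 1, EXAMPLE_STYPE_B = 2 };

/* Common header shared by every chainable struct: a type tag followed by a
 * non-const link, so that output structs further down the chain can be
 * written through it. */
struct example_common {
   enum example_stype sType;
   struct example_common *pNext;
};

/* Return the first struct in the chain with the wanted tag, or NULL. */
static struct example_common *
example_find_in_chain(struct example_common *head, enum example_stype wanted)
{
   for (struct example_common *s = head; s != NULL; s = s->pNext) {
      if (s->sType == wanted)
         return s;
   }
   return NULL;
}
```

Typing pNext as a pointer to the common struct, rather than void*, lets the loop advance without a cast at every step.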
Chad Versace [Wed, 25 Jan 2017 20:12:16 +0000 (12:12 -0800)]
anv: Define macro anv_debug()
This is a printf-like macro that prints a debug message to stderr when
built with DEBUG. If no DEBUG, then do nothing.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
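A macro with this behaviour could look roughly like the sketch below. The names example_debug and example_debug_printf are hypothetical; this is not the definition added by the patch.

```c
#include <stdarg.h>
#include <stdio.h>

/* printf-like helper that writes to stderr and returns the number of
 * characters written, as vfprintf does. */
static int example_debug_printf(const char *fmt, ...)
{
   va_list ap;
   va_start(ap, fmt);
   int n = vfprintf(stderr, fmt, ap);
   va_end(ap);
   return n;
}

#ifdef DEBUG
/* Debug builds forward the arguments to the helper. */
#define example_debug(...) example_debug_printf(__VA_ARGS__)
#else
/* Other builds expand to a no-op, so the arguments are never evaluated. */
#define example_debug(...) ((void)0)
#endif
```

A call such as example_debug("queue count: %d\n", n) then costs nothing when DEBUG is not defined.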
Ian Romanick [Wed, 25 Jan 2017 00:13:01 +0000 (16:13 -0800)]
mesa: Fix copy-and-paste bug in _mesa_(Program|)Uniform[1234](i|ui)64vARB functions
All of the functions were passing 1 to _mesa_uniform instead of passing
count.
Fixes 16 unused parameter warnings like:
main/uniforms.c: In function ‘_mesa_Uniform1i64vARB’:
main/uniforms.c:1692:47: warning: unused parameter ‘count’ [-Wunused-parameter]
_mesa_Uniform1i64vARB(GLint location, GLsizei count, const GLint64 *value)
^~~~~
This is why I build with extra warnings enabled. Unfortunately, there
are so many unused parameter warnings in Mesa that I didn't notice these
added warnings for over 6 months. :(
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
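The shape of the bug can be shown with a small sketch. The functions below are hypothetical stand-ins for a wrapper and the shared upload routine it calls; they are not the Mesa entry points.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared routine: copies count elements and reports how many
 * were written. */
static size_t example_upload(int64_t *dst, const int64_t *src, size_t count)
{
   for (size_t i = 0; i < count; i++)
      dst[i] = src[i];
   return count;
}

/* Buggy wrapper: always uploads one element. The (void) cast only keeps
 * this sketch warning-free; without it the compiler would report the
 * unused parameter, which is how the bug was found. */
static size_t example_uniform_buggy(int64_t *dst, size_t count,
                                    const int64_t *value)
{
   (void)count;
   return example_upload(dst, value, 1);
}

/* Fixed wrapper: forwards count to the shared routine. */
static size_t example_uniform_fixed(int64_t *dst, size_t count,
                                    const int64_t *value)
{
   return example_upload(dst, value, count);
}
```

With a three-element array, the buggy wrapper leaves the second and third destination elements untouched.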
Lionel Landwerlin [Wed, 25 Jan 2017 14:04:05 +0000 (14:04 +0000)]
spirv: bump headers to SPIRV 1.1
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Wed, 25 Jan 2017 14:03:31 +0000 (14:03 +0000)]
spirv: add default handler for new enums
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Wed, 25 Jan 2017 12:28:50 +0000 (12:28 +0000)]
spirv: fix typos
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Wed, 25 Jan 2017 16:22:40 +0000 (16:22 +0000)]
anv: set command buffer to NULL when allocations fail
The spec section 5.2 says:
"vkAllocateCommandBuffers can be used to create multiple command
buffers. If the creation of any of those command buffers fails, the
implementation must destroy all successfully created command buffer
objects from this command, set all entries of the pCommandBuffers
array to VK_NULL_HANDLE and return the error."
Fixes:
dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary
dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
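The rule quoted from the spec amounts to an all-or-nothing allocation loop. The following is a minimal sketch using plain pointers in place of Vulkan handles. The allocator, its budget parameter, and every name are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator that fails once *budget reaches zero. */
static void *example_create(int *budget)
{
   if (*budget == 0)
      return NULL;
   (*budget)--;
   return malloc(1);
}

/* Returns 0 on success and -1 on failure. On failure, every object created
 * so far is destroyed and all count entries are reset to NULL, including
 * the ones that were never reached. */
static int example_allocate_all(void **out, size_t count, int *budget)
{
   size_t created;

   for (created = 0; created < count; created++) {
      out[created] = example_create(budget);
      if (out[created] == NULL)
         break;
   }

   if (created == count)
      return 0;

   for (size_t i = 0; i < created; i++)
      free(out[i]);
   for (size_t i = 0; i < count; i++)
      out[i] = NULL;

   return -1;
}
```

Resetting the whole array, not just the created entries, means the caller never sees a stale or uninitialized handle after an error.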
Jason Ekstrand [Wed, 25 Jan 2017 01:10:45 +0000 (17:10 -0800)]
vulkan/wsi: Lower the maximum image sizes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
Jason Ekstrand [Wed, 25 Jan 2017 00:43:15 +0000 (16:43 -0800)]
vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetPresentModes
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
Jason Ekstrand [Wed, 25 Jan 2017 00:43:01 +0000 (16:43 -0800)]
vulkan/wsi/wayland: Handle VK_INCOMPLETE for GetFormats
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "17.0" <mesa-dev@lists.freedesktop.org>
George Kyriazis [Tue, 24 Jan 2017 23:19:55 +0000 (17:19 -0600)]
swr: Update fs texture & sampler state logic
In swr_update_derived() update texture and sampler state on a new fragment
shader. GALLIUM_HUD can update fs using a previously bound texture and
sampler.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Samuel Pitoiset [Mon, 23 Jan 2017 20:44:45 +0000 (21:44 +0100)]
gallium/radeon: add a new HUD query for the number of mapped buffers
Useful when debugging applications which map a ton of buffers
and also because we used to run into Linux's limit on the number
of simultaneous mmap() calls.
v2: - update the commit message
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Iago Toral Quiroga [Tue, 24 Jan 2017 10:49:40 +0000 (11:49 +0100)]
spirv: handle gl_SampleMask
SPIR-V maps both gl_SampleMask and gl_SampleMaskIn to the same
builtin (SampleMask). The only way to tell which one we are dealing with
is to check if it is an input or an output.
Fixes:
dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.write.*
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
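The disambiguation described in this commit reduces to a check on the variable's mode. The enums and the function below are hypothetical and only illustrate the decision; they are not the identifiers used in the SPIR-V translator.

```c
/* Hypothetical variable modes and target locations. */
enum example_var_mode { EXAMPLE_MODE_INPUT, EXAMPLE_MODE_OUTPUT };
enum example_location { EXAMPLE_SAMPLE_MASK_IN, EXAMPLE_SAMPLE_MASK_OUT };

/* The SampleMask builtin alone is ambiguous, so the mode decides whether
 * the variable is the input mask or the output mask. */
static enum example_location
example_sample_mask_location(enum example_var_mode mode)
{
   return mode == EXAMPLE_MODE_OUTPUT ? EXAMPLE_SAMPLE_MASK_OUT
                                      : EXAMPLE_SAMPLE_MASK_IN;
}
```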
Iago Toral Quiroga [Tue, 24 Jan 2017 09:48:01 +0000 (10:48 +0100)]
spirv: acknowledge multisampled input attachments
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Dave Airlie [Wed, 18 Jan 2017 03:46:43 +0000 (13:46 +1000)]
radv: program a default point size.
Along the lines of what
3b804819 anv: Default PointSize to 1.0 if not written by the shader
does for anv, program a default point size in the hw of 1.0.
This preemptively fixes a bunch of geom shader tests.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Fri, 20 Jan 2017 15:02:04 +0000 (16:02 +0100)]
radeonsi: handle first_non_void correctly in si_create_vertex_elements
This fixes R11G11B10_FLOAT, because it's in the category of "OTHER",
meaning that it doesn't have any channel description.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Fri, 20 Jan 2017 01:26:42 +0000 (02:26 +0100)]
st/mesa: destroy pipe_context before destroying st_context (v2)
If radeonsi starts compiling an optimized shader variant asynchronously
with a GL debug callback set and the application destroys the GL context,
radeonsi crashes when trying to write shader stats into the debug output
of a non-existent context after compilation, because st/mesa was destroyed
before pipe_context.
Firefox with WebGL2 enabled hits this bug.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99456
v2: protect against a double destroy in st_create_context_priv and callers.
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Wed, 18 Jan 2017 02:12:37 +0000 (13:12 +1100)]
nir: bump loop max unroll limit
The original number was chosen in an attempt to match the limits applied to
GLSL IR.
A look at the git history of why these limits were chosen for GLSL IR
shows it was more to do with the slow speed of unrolling large loops in
GLSL IR than anything else. The speed of loop unrolling in NIR is not a
problem so we may wish to bump this even higher in future.
No shader-db change, however a future change that will disable the GLSL IR
optimisation loop in the i965 backend results in 4 loops from The Talos
Principle failing to unroll. Bumping the limit allows them to unroll which
results in the instruction count matching the previous output from when the
GLSL IR opts were still enabled.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Timothy Arceri [Tue, 24 Jan 2017 03:07:04 +0000 (14:07 +1100)]
glsl: lower constant arrays to uniform arrays before optimisation loop
Previously the constant array would not get copy propagated until the backend
did its GLSL IR opt loop. I plan on removing that from i965 shortly which
caused huge regressions in Deus-ex and Tomb Raider which have large
constant arrays. Moving lowering before the opt loop in the GLSL linker
fixes this and unexpectedly improves some compute shaders also.
shader-db results BDW:
instructions helped: shaders/closed/steam/deus-ex-mankind-divided/374.shader_test CS SIMD16: 204 -> 194 (-4.90%)
instructions helped: shaders/closed/steam/deus-ex-mankind-divided/318.shader_test CS SIMD8: 1010 -> 741 (-26.63%)
instructions helped: shaders/closed/steam/deus-ex-mankind-divided/144.shader_test CS SIMD8: 542 -> 385 (-28.97%)
cycles helped: shaders/closed/steam/deus-ex-mankind-divided/318.shader_test CS SIMD8: 1831382 -> 1818492 (-0.70%)
cycles helped: shaders/closed/steam/deus-ex-mankind-divided/144.shader_test CS SIMD8: 216238 -> 206180 (-4.65%)
cycles helped: shaders/closed/steam/deus-ex-mankind-divided/374.shader_test CS SIMD16: 18484 -> 16644 (-9.95%)
total instructions in shared programs: 13060313 -> 13059877 (-0.00%)
instructions in affected programs: 1756 -> 1320 (-24.83%)
helped: 3
HURT: 0
total cycles in shared programs: 256586698 -> 256561910 (-0.01%)
cycles in affected programs: 2066104 -> 2041316 (-1.20%)
helped: 3
HURT: 0
V3: only call the opt loop if lowering progressed (Suggested by Eric)
V2: call opts before and after lowering (Suggested by Ken)
Reviewed-by: Eric Anholt <eric@anholt.net>