Emil Velikov [Tue, 11 Oct 2016 17:26:24 +0000 (18:26 +0100)]
anv: use correct header guards
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Emil Velikov [Tue, 11 Oct 2016 17:26:23 +0000 (18:26 +0100)]
intel/genxml: use correct header guards
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Emil Velikov [Tue, 11 Oct 2016 17:26:22 +0000 (18:26 +0100)]
intel/common: use correct header guards
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Emil Velikov [Tue, 11 Oct 2016 17:26:21 +0000 (18:26 +0100)]
intel/blorp: use correct header guards
Avoid the discouraged use of pragma once and a missing guard for
blorp_genX_exec.h.
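For illustration, a conventional ifndef guard looks roughly like the sketch below. The macro and function names are made up for this example and are not the ones used in blorp.

```c
#include <assert.h>
#include <stdint.h>

/* Conventional include guard; EXAMPLE_GENX_EXEC_H is a made-up name. */
#ifndef EXAMPLE_GENX_EXEC_H
#define EXAMPLE_GENX_EXEC_H

/* Hypothetical stand-in for the declarations a real header would carry. */
static inline uint32_t
example_align_u32(uint32_t value, uint32_t alignment)
{
   /* alignment must be a non-zero power of two */
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

#endif /* EXAMPLE_GENX_EXEC_H */

/* Simulates the header being included a second time: the guard macro is
 * already defined, so the preprocessor drops this copy and no duplicate
 * definition error occurs. */
#ifndef EXAMPLE_GENX_EXEC_H
#define EXAMPLE_GENX_EXEC_H
static inline uint32_t
example_align_u32(uint32_t value, uint32_t alignment)
{
   return value + alignment;
}
#endif /* EXAMPLE_GENX_EXEC_H */
```

Unlike pragma once, this form relies only on standard preprocessor behaviour.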
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Emil Velikov [Tue, 11 Oct 2016 17:26:20 +0000 (18:26 +0100)]
isl: use ifndef header guards
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Emil Velikov [Tue, 11 Oct 2016 17:26:19 +0000 (18:26 +0100)]
isl: make locally used functions static
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Emil Velikov [Tue, 11 Oct 2016 17:26:18 +0000 (18:26 +0100)]
isl: trivial include-what-you-want cleanups
Noticed while skimming through the files.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Emil Velikov [Tue, 11 Oct 2016 17:26:17 +0000 (18:26 +0100)]
isl/gen7: remove unneeded ISL_DEV_GEN check
The function gen7_format_needs_valign2 has two callers - the gen7 only
gen7_choose_valign_el() and isl_gen6_filter_tiling(). The latter
already guards the invocation appropriately.
To be extra cautious add a couple of asserts alongside the removal of the
runtime check.
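A minimal sketch of the pattern, assuming a simplified device struct and made-up function names (none of these are the real ISL symbols): the gen7-only helper asserts its precondition, and the shared caller keeps the runtime guard.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device description; only the generation matters here. */
struct example_device {
   int gen;
};

/* Hypothetical format enum for the sketch. */
enum example_format {
   EXAMPLE_FORMAT_PLAIN,
   EXAMPLE_FORMAT_NEEDS_VALIGN2,
};

/* Gen7-only helper: callers guarantee gen == 7, so the generation is
 * asserted (checked in debug builds only) rather than tested at runtime. */
static bool
example_gen7_format_needs_valign2(const struct example_device *dev,
                                  enum example_format format)
{
   assert(dev->gen == 7);
   return format == EXAMPLE_FORMAT_NEEDS_VALIGN2;
}

/* Shared caller that may run on several generations: it guards the call. */
static bool
example_filter_needs_valign2(const struct example_device *dev,
                             enum example_format format)
{
   if (dev->gen != 7)
      return false;
   return example_gen7_format_needs_valign2(dev, format);
}
```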
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Emil Velikov [Tue, 11 Oct 2016 17:26:16 +0000 (18:26 +0100)]
isl: prefix non-static API with isl_
The rest of ISL already follows this approach. Be consistent and resolve
the final references.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Emil Velikov [Tue, 11 Oct 2016 17:26:15 +0000 (18:26 +0100)]
isl/gen6: correctly check msaa layout samples count
Samples == 1 is a valid value, so returning false is plain wrong.
Seemingly a copy/paste typo present since day 1.
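A sketch of the intended behaviour, using hypothetical names rather than the actual ISL code. The set of accepted multisample counts (only 4x here) is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout enum for the sketch. */
enum example_msaa_layout {
   EXAMPLE_MSAA_LAYOUT_NONE,
   EXAMPLE_MSAA_LAYOUT_INTERLEAVED,
};

/* Returns true when a layout could be chosen for the sample count. */
static bool
example_choose_msaa_layout(uint32_t samples, enum example_msaa_layout *layout)
{
   assert(layout != NULL);

   if (samples == 1) {
      /* Single-sampled surfaces are valid: report "no MSAA" and succeed.
       * Returning false here would reject every non-multisampled surface. */
      *layout = EXAMPLE_MSAA_LAYOUT_NONE;
      return true;
   }

   /* Assumption for the sketch: 4x is the only multisample count accepted. */
   if (samples == 4) {
      *layout = EXAMPLE_MSAA_LAYOUT_INTERLEAVED;
      return true;
   }

   return false;
}
```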
Fixes: afdadec77f5 ("isl: Implement isl_surf_init() for gen4-gen9")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Emil Velikov [Wed, 12 Oct 2016 18:05:33 +0000 (19:05 +0100)]
automake: add radv to the `make distcheck' hooks
Will allow us to catch issues (as fixed with previous patches) rather
than release a broken tarball.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Emil Velikov [Wed, 12 Oct 2016 18:05:32 +0000 (19:05 +0100)]
radv: move AMDGPU_LIBS later in the link chain
At the moment (albeit unlikely) one could get link-time issues, since
libdrm_amdgpu.so is before its users in the link chain.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Emil Velikov [Wed, 12 Oct 2016 18:05:31 +0000 (19:05 +0100)]
radv: correct variable name VISIBILITY_{, C}FLAGS
The letter C was missing, thus in turn all the internal symbols were
exported.
As a result we hide ~150 symbols and cut ~36K from libvulkan_radeon.so.
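For context, a sketch of what hiding symbols means at the C level, assuming GCC or Clang and that the visibility flags amount to -fvisibility=hidden. EXAMPLE_PUBLIC and both functions are made-up names.

```c
#include <assert.h>
#include <limits.h>

/* With -fvisibility=hidden on the command line, only symbols explicitly
 * marked "default" are exported from the shared object. */
#if defined(__GNUC__)
#define EXAMPLE_PUBLIC __attribute__((visibility("default")))
#else
#define EXAMPLE_PUBLIC
#endif

int example_internal_helper(int x);
EXAMPLE_PUBLIC int example_entry_point(int x);

/* Internal helper: not exported when built with -fvisibility=hidden. */
int
example_internal_helper(int x)
{
   assert(x < INT_MAX);   /* x + 1 must not overflow */
   return x + 1;
}

/* Entry point that stays exported regardless of the default visibility. */
EXAMPLE_PUBLIC int
example_entry_point(int x)
{
   return example_internal_helper(x);
}
```

If the flags never reach the compiler, as with the misspelled variable, both functions end up in the dynamic symbol table.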
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Emil Velikov [Wed, 12 Oct 2016 18:05:30 +0000 (19:05 +0100)]
amd/addrlib: hide private symbols via VISIBILITY_CXXFLAGS
Private/internal symbols should not be exported. Using the CXXFLAGS cuts
~300 exported symbols and ~23K from libvulkan_radeon.so.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Wed, 12 Oct 2016 18:05:29 +0000 (19:05 +0100)]
intel: automake: replace direct basename $@ invocation with $(@F)
Use the shorthand make variable(s) as elsewhere in the build.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Emil Velikov [Tue, 11 Oct 2016 18:39:27 +0000 (19:39 +0100)]
gallium: annotate sw_driver_descriptor instance as const data
Already treated and handled as such.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Tue, 11 Oct 2016 18:39:26 +0000 (19:39 +0100)]
gallium: annotate drm_driver_descriptor instance as const data
Already treated and handled as such.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Tue, 11 Oct 2016 18:39:25 +0000 (19:39 +0100)]
gallium: rename drm_driver_descriptor::{, driver_}name
Historically we use "device name" for the name of the kernel module and
"driver name" for the dri/other driver.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Tue, 11 Oct 2016 18:39:24 +0000 (19:39 +0100)]
gallium: remove unused drm_driver_descriptor::driver_name
Likely unused since day 1, although I've only checked back until the
st/dri unification with commit 29ca7d2c948 ("st/dri: merge dri/drm and
dri/sw backends").
Based on the comment referencing drmOpenByName, it's not something we
want to bring back.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Tue, 11 Oct 2016 18:39:23 +0000 (19:39 +0100)]
gallium: fix drm_driver_descriptor::name comment
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Wed, 12 Oct 2016 17:49:36 +0000 (18:49 +0100)]
mesa_glinterop: allow building without X and related headers
This commit effectively reverts c10dcb2ce837922c6ee4e191e6d6202098a5ee10
and fixes the typedef redefinition which inspired it.
In order to avoid requiring X packages at build time, an earlier commit
forward declared the required X/GLX typedefs. Since that approach
introduced typedef redefinition (a C11 feature) it was reverted.
To avoid the redefinition while _not_ mandating X and related headers,
forward declare the structs and use those throughout the header.
Anyone who uses the mesa interop header is expected to include the X (or,
for EGL, the corresponding) headers, which ensures that everything is
resolved within the compilation unit.
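A sketch of the approach, assuming the usual Xlib/GLX struct tags (_XDisplay, __GLXcontextRec). The function is hypothetical and only passes pointers around; the two typedefs at the end stand in for what Xlib.h and glx.h would provide in a real user's compilation unit.

```c
#include <stddef.h>

/* What the interop-style header would contain: forward declarations only.
 * No window-system header is required and no typedef is repeated, so this
 * stays valid for pre-C11 compilers. */
struct _XDisplay;
struct __GLXcontextRec;

/* Hypothetical entry point declared in terms of the struct tags. */
static int
example_query_device(struct _XDisplay *dpy, struct __GLXcontextRec *ctx)
{
   return (dpy != NULL && ctx != NULL) ? 0 : -1;
}

/* What the user's compilation unit adds, normally via the X headers.
 * The typedefs name the same struct types, so callers can pass their
 * Display pointer and GLXContext straight through. */
typedef struct _XDisplay Display;
typedef struct __GLXcontextRec *GLXContext;
```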
Cc: Vinson Lee <vlee@freedesktop.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Cc: Tapani Pälli <tapani.palli@intel.com>
Cc: Chih-Wei Huang <cwhuang@android-x86.org>
Fixes: c10dcb2ce837 ("Revert "mesa_glinterop: remove inclusion of GLX header"")
Fixes: 8472045b16b3 ("mesa_glinterop: remove inclusion of GLX header")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96770
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Mark Thompson [Wed, 12 Oct 2016 22:54:03 +0000 (23:54 +0100)]
st/va: Fix H.264 PicOrderCnt value
TopFieldPicOrderCnt is exactly the PicOrderCnt value for a frame - see
H.264 section 8.2.1.
Reviewed-by: Christian König <christian.koenig@amd.com>
Mark Thompson [Wed, 12 Oct 2016 22:53:35 +0000 (23:53 +0100)]
st/va: Baseline profile is not supported
Constrained baseline profile is supported, so use that instead. This
matches what the encoder already does (constraint_set1_flag is always
set in the output bitstream).
Reviewed-by: Christian König <christian.koenig@amd.com>
Mark Thompson [Wed, 12 Oct 2016 22:53:01 +0000 (23:53 +0100)]
st/va: Return surface formats depending on config chroma format
This makes the supported format actually match the configuration, and
allows the user to observe that NV12 is supported for video processing
where previously they couldn't (though it did always work if they
blindly tried to use it anyway).
Reviewed-by: Christian König <christian.koenig@amd.com>
Mark Thompson [Wed, 12 Oct 2016 22:52:30 +0000 (23:52 +0100)]
st/va: Save surface chroma format in config
Both YUV420 and RGB32 configurations are supported, so we need to be
able to distinguish which is being used.
Reviewed-by: Christian König <christian.koenig@amd.com>
Mark Thompson [Wed, 12 Oct 2016 22:52:01 +0000 (23:52 +0100)]
st/va: Return more useful config attributes
The encoder attributes are needed for a user of the encoder to be
able to configure it sensibly without internal knowledge.
Reviewed-by: Christian König <christian.koenig@amd.com>
Mario Kleiner [Tue, 11 Oct 2016 18:42:03 +0000 (20:42 +0200)]
glx: Perform check for valid fbconfig against proper X-Screen.
Commit cf804b4455fac9e585b3600a8318caaced9c23de ('glx: fix crash with bad
fbconfig') introduced a check in glXCreateNewContext() that the given
config is a valid fbconfig.
Unfortunately the check always checks the given config against
the fbconfigs of the DefaultScreen(dpy), instead of the
actual X-Screen specified in the config (config->screen).
This leads to failure whenever a GL context is created
on a non-DefaultScreen(dpy), e.g., on X-Screen 1 of
a multi-x-screen setup, where the default screen is
typically 0.
Fix this by using config->screen instead of DefaultScreen(dpy).
Tested to fix context creation failure on a dual-x-screen setup.
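The idea can be sketched with simplified stand-in structures; every type and function name here is hypothetical and much smaller than the real GLX client state.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the GLX client-side structures. */
struct example_config {
   int screen;   /* X screen this fbconfig belongs to */
   int id;
};

struct example_screen {
   const struct example_config *configs;
   size_t num_configs;
};

struct example_display {
   int default_screen;
   const struct example_screen *screens;
   int num_screens;
};

/* Validate the config against the screen recorded in the config itself,
 * not against the display's default screen. */
static bool
example_config_is_valid(const struct example_display *dpy,
                        const struct example_config *config)
{
   if (config->screen < 0 || config->screen >= dpy->num_screens)
      return false;

   const struct example_screen *scr = &dpy->screens[config->screen];
   for (size_t i = 0; i < scr->num_configs; i++) {
      if (&scr->configs[i] == config)
         return true;
   }
   return false;
}
```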
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tim Rowley [Fri, 14 Oct 2016 01:57:05 +0000 (20:57 -0500)]
swr: [rasterizer core] don't construct pArContext on non-ar builds
Stops debug directory being created on non-ar builds.
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
Tim Rowley [Fri, 14 Oct 2016 01:56:54 +0000 (20:56 -0500)]
swr: [rasterizer core] remove WorkerWaitForThreadEvent bucket
Cause of bucket stop capture hang, as threads get stuck in level 1.
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
Tim Rowley [Thu, 13 Oct 2016 17:30:34 +0000 (12:30 -0500)]
swr: [rasterizer core] move binner functionality to separate file
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
Tim Rowley [Thu, 13 Oct 2016 15:32:58 +0000 (10:32 -0500)]
swr: [rasterizer scripts] add DEBUG_OUTPUT_DIR knob
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
Tim Rowley [Thu, 13 Oct 2016 14:44:06 +0000 (09:44 -0500)]
swr: [rasterizer core] fix comment typo
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
Tim Rowley [Tue, 11 Oct 2016 17:57:29 +0000 (12:57 -0500)]
swr: [rasterizer core/sim] 8x2 backend + 16-wide tile clear/load/store
Work in progress (disabled).
USE_8x2_TILE_BACKEND define in knobs.h enables AVX512 code paths
(emulated on non-AVX512 HW).
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
Tim Rowley [Tue, 11 Oct 2016 17:42:35 +0000 (12:42 -0500)]
swr: [rasterizer archrast] fix event file issue with saving data
Also, tagging stats with draw id to correlate these events with
draw/dispatch events.
Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
Eric Engestrom [Wed, 12 Oct 2016 21:13:29 +0000 (22:13 +0100)]
swr: [rasterizer common] fix assert index
Fixes: b3bd8bb611bb465d2e5e ("swr: [rasterizer core] add support for "RAW" surface format")
CovID: 1373647
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Ilia Mirkin [Fri, 14 Oct 2016 01:42:54 +0000 (21:42 -0400)]
docs: mark GL 4.4/4.5 extension groups as DONE for nvc0
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Fri, 14 Oct 2016 01:39:42 +0000 (21:39 -0400)]
nv50: enable ARB_enhanced_layouts
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Wed, 12 Oct 2016 17:30:57 +0000 (13:30 -0400)]
nvc0/ir: be more careful about preserving modifiers in SHLADD creation
src2 was being given the wrong modifier, and we were not properly
managing the modifier on the SHL source either.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Brian Paul [Fri, 7 Oct 2016 21:31:34 +0000 (15:31 -0600)]
mesa: fix indentation in vertex_attrib_binding()
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Brian Paul [Fri, 7 Oct 2016 21:21:58 +0000 (15:21 -0600)]
mesa: add sanity check assertion in update_array_format
At most, one of the normalized, integer, doubles bools can be true.
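Such a check can be written by summing the flags, as in this sketch. The struct and function names are hypothetical, not the ones in Mesa.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the per-array format state. */
struct example_array_format {
   bool normalized;
   bool integer;
   bool doubles;
};

/* Same predicate as a plain function, usable in release builds too. */
static bool
example_flags_are_exclusive(bool normalized, bool integer, bool doubles)
{
   /* bool converts to 0 or 1, so the sum counts how many flags are set. */
   return (int)normalized + (int)integer + (int)doubles <= 1;
}

static void
example_update_array_format(struct example_array_format *out,
                            bool normalized, bool integer, bool doubles)
{
   /* Sanity check: at most one of the three may be true. */
   assert(example_flags_are_exclusive(normalized, integer, doubles));

   out->normalized = normalized;
   out->integer = integer;
   out->doubles = doubles;
}
```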
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Brian Paul [Fri, 7 Oct 2016 21:09:20 +0000 (15:09 -0600)]
mesa: remove needless cast in update_array()
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Brian Paul [Fri, 7 Oct 2016 21:08:50 +0000 (15:08 -0600)]
mesa: simplify update_array() with a vao local var
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Brian Paul [Fri, 7 Oct 2016 21:03:55 +0000 (15:03 -0600)]
vbo: simplify some code in check_draw_elements_data()
Use the 'vao' local var in more places.
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Brian Paul [Thu, 6 Oct 2016 23:30:20 +0000 (17:30 -0600)]
mesa: rename gl_vertex_attrib_array gl_array_attributes
The structure contains the attributes of a vertex array. The old name
was kind of confusing.
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Brian Paul [Thu, 6 Oct 2016 23:21:09 +0000 (17:21 -0600)]
mesa: rename gl_vertex_attrib_array::VertexBinding
Rename to gl_vertex_attrib_array::BufferBindingIndex because this field
is an index into the array of buffer binding points. This makes some
code a little easier to follow since there's also a "VertexBinding" field
in gl_vertex_array_object.
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Brian Paul [Thu, 6 Oct 2016 23:16:51 +0000 (17:16 -0600)]
mesa: rename some vars in arrayobj.c
Use 'vao' instead of 'obj' to be consistent with other code.
Plus, add a comment.
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Brian Paul [Tue, 11 Oct 2016 21:25:24 +0000 (15:25 -0600)]
tgsi: fix comment typo in tgsi_ureg.c
Trivial.
Brian Paul [Mon, 10 Oct 2016 17:29:14 +0000 (11:29 -0600)]
mesa: replace gl_framebuffer::_IntegerColor with _IntegerBuffers
Use a bitmask to indicate which color buffers are integer-valued, rather
than a bool. Also, the old field was mis-computed. If an integer buffer
was followed by a non-integer buffer, the _IntegerColor field was wrongly
set to false.
This fixes the new piglit gl-3.1-mixed-int-float-fbo test.
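A sketch of the bitmask approach with hypothetical names. The second function shows one plausible way a single bool can be mis-computed; it is an assumption for illustration, not the original Mesa code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_MAX_COLOR_BUFFERS 8

/* Bit i is set when color buffer i is integer-valued. */
static uint32_t
example_compute_integer_buffers(const bool *is_integer, unsigned count)
{
   uint32_t mask = 0;

   assert(count <= EXAMPLE_MAX_COLOR_BUFFERS);
   for (unsigned i = 0; i < count; i++) {
      if (is_integer[i])
         mask |= 1u << i;
   }
   return mask;
}

/* Illustrative mistake: a single flag overwritten on every iteration ends
 * up describing only the last buffer. */
static bool
example_buggy_integer_color(const bool *is_integer, unsigned count)
{
   bool integer_color = false;

   for (unsigned i = 0; i < count; i++)
      integer_color = is_integer[i];
   return integer_color;
}
```

With an integer buffer followed by a float buffer, the mask is 0x1 while the single flag reads false.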
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Brian Paul [Fri, 7 Oct 2016 20:28:21 +0000 (14:28 -0600)]
mesa: remove 'params' parameter from ctx->Driver.TexParameter()
None of the drivers which implement this hook do anything with the
texture parameter value. Drivers just look at the pname and set a
dirty flag if needed.
We were doing some ugly casting and type conversion to setup the
argument so that all goes away.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Eric Anholt [Thu, 13 Oct 2016 19:37:59 +0000 (12:37 -0700)]
vc4: Avoid loading from the texture during non-utile-aligned glTexImage().
Previously, the plan was "if the width/height we have to load/store isn't
the size the user is planning on writing, then we need to load the old
contents out beforehand to prevent writing back undefined".
However, when we're doing glTexImage() we often end up aligning the
width/height into the padding of the texture, and we don't actually
need to read out that padding.
Improves x11perf -aatrapezoid100 performance from ~460/sec to
~700/sec.
Axel Davy [Wed, 12 Oct 2016 17:10:53 +0000 (19:10 +0200)]
st/nine: Fix possible segfault in surface ctor
Regression introduced by ba0274c7d6c3b77a36bbe1b444f427b0c873e2f3.
Check the resource exists before assigning it
a flag (and use This->base.resource instead
of pResource, since the former may have a newly
allocated resource, while the latter would be
NULL).
This should reintroduce the behaviour of previous
code.
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Axel Davy [Wed, 12 Oct 2016 16:58:24 +0000 (18:58 +0200)]
st/nine: Remove useless code in nine_shader
Since 1604efa6fda9b780e8537a131ad77f3e83e5a67a,
lconsti and lconstb don't need to be initialized.
Remove some leftovers from the previous code (which
now has an invalid use of ARRAY_SIZE on a pointer instead
of an array).
Reported by Coverity.
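To illustrate why an ARRAY_SIZE-style macro is wrong on a pointer, here is a sketch with a hypothetical macro and struct:

```c
#include <stddef.h>

#define EXAMPLE_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct example_consts {
   int fixed[16];       /* real array: sizeof knows the element count */
   int *dynamic;        /* pointer: sizeof yields the pointer size only */
   size_t num_dynamic;  /* length tracked explicitly for the pointer */
};

static size_t
example_fixed_count(const struct example_consts *c)
{
   /* Correct: c->fixed has array type, so this evaluates to 16. */
   return EXAMPLE_ARRAY_SIZE(c->fixed);
}

static size_t
example_dynamic_count(const struct example_consts *c)
{
   /* EXAMPLE_ARRAY_SIZE(c->dynamic) would compile, but it evaluates to
    * sizeof(int *) / sizeof(int), which is unrelated to the allocation.
    * Use the stored length instead. */
   return c->num_dynamic;
}
```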
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Axel Davy [Tue, 11 Oct 2016 16:57:17 +0000 (18:57 +0200)]
gallium/os: Use unsigned integers for size computation
Use uint64_t instead of int64_t in the calculation,
as the result is uint64_t.
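A minimal sketch, assuming the calculation is a page count multiplied by a page size (the function name and parameters are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Total size in bytes from a page count and a page size. */
static uint64_t
example_total_bytes(long num_pages, long page_size)
{
   assert(num_pages >= 0 && page_size >= 0);

   /* Cast each operand before multiplying so the product is computed in
    * unsigned 64-bit arithmetic, matching the result type. */
   return (uint64_t)num_pages * (uint64_t)page_size;
}
```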
Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Samuel Pitoiset [Sun, 9 Oct 2016 11:48:31 +0000 (13:48 +0200)]
nvc0: enable ARB_enhanced_layouts
All ARB_enhanced_layouts piglit tests pass without any changes
in our compiler.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Dave Airlie [Thu, 13 Oct 2016 19:09:39 +0000 (05:09 +1000)]
radv: fix the wayland wsi busy bit
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 13 Oct 2016 19:08:56 +0000 (05:08 +1000)]
anv: fix the wayland wsi busy flag setting
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tom Stellard [Thu, 13 Oct 2016 17:25:58 +0000 (17:25 +0000)]
radv: Use new image load/store intrinsic signatures v2
These were changed in LLVM r284024.
v2:
- Only use float types for vdata of llvm.amdgcn.image.store. LLVM doesn't
support integer types for this intrinsic.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Tom Stellard [Thu, 13 Oct 2016 15:21:27 +0000 (15:21 +0000)]
radv: Fix incorrect comment
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 11 Oct 2016 06:46:25 +0000 (16:46 +1000)]
radv: fix identity swizzle handling
The identity swizzle should operate exactly
like an .r = R, .g = G, .b = B, .a = A swizzle.
This fixes a bunch of the 16-bit BGRA blit tests
dEQP-VK.api.copy_and_blit.blit_image.all_formats.b4g4r4a4*
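The rule can be sketched as follows. The enum is hypothetical and only loosely modelled on Vulkan's component swizzles; it is not the radv code.

```c
#include <assert.h>

/* Hypothetical swizzle values for the sketch. */
enum example_swizzle {
   EXAMPLE_SWIZZLE_IDENTITY,
   EXAMPLE_SWIZZLE_ZERO,
   EXAMPLE_SWIZZLE_ONE,
   EXAMPLE_SWIZZLE_R,
   EXAMPLE_SWIZZLE_G,
   EXAMPLE_SWIZZLE_B,
   EXAMPLE_SWIZZLE_A,
};

/* chan is the destination channel expressed as R/G/B/A. IDENTITY resolves
 * to the channel itself, so .r reads R, .g reads G, and so on. */
static enum example_swizzle
example_resolve_swizzle(enum example_swizzle swz, enum example_swizzle chan)
{
   assert(chan >= EXAMPLE_SWIZZLE_R && chan <= EXAMPLE_SWIZZLE_A);
   return swz == EXAMPLE_SWIZZLE_IDENTITY ? chan : swz;
}
```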
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 13 Oct 2016 02:43:07 +0000 (12:43 +1000)]
anv/wsi: fix apps that acquire multiple images up front
This fix was found in the radv codebase when running dota2,
no idea if anyone has reported it on anv, but the same problem
occurs.
Once an image is acquired we need to mark it busy.
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 13 Oct 2016 02:38:49 +0000 (12:38 +1000)]
radv/wsi: fix apps that acquire multiple images up front
dota2 does multiple acquires followed by multiple queues,
this bug manifested itself as a hang in the xshmfence code
randomly when dota2 was doing its menus. It also occurred
when running dota2 under phoronix-test-suite.
The fix is to mark the image busy as soon as it is acquired,
so nobody else can acquire it. We have to trust vulkan apps
that they will eventually submit it.
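A sketch of the bookkeeping, with a hypothetical swapchain struct and function names. The real WSI code tracks more state than a bool per image.

```c
#include <assert.h>
#include <stdbool.h>

#define EXAMPLE_IMAGE_COUNT 3

struct example_swapchain {
   bool busy[EXAMPLE_IMAGE_COUNT];
};

/* Returns the index of a free image and marks it busy right away, or -1
 * when every image is already in use. */
static int
example_acquire_next_image(struct example_swapchain *chain)
{
   for (int i = 0; i < EXAMPLE_IMAGE_COUNT; i++) {
      if (!chain->busy[i]) {
         chain->busy[i] = true;   /* mark busy at acquire time */
         return i;
      }
   }
   return -1;
}

/* Called once the image has been presented and handed back. */
static void
example_release_image(struct example_swapchain *chain, int index)
{
   assert(index >= 0 && index < EXAMPLE_IMAGE_COUNT);
   chain->busy[index] = false;
}
```

Without the early busy marking, two back-to-back acquires could hand out the same image.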
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 29 Aug 2016 23:46:29 +0000 (09:46 +1000)]
anv: initialise and increment send_sbc
At least set this to not be uninitialised memory.
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Wed, 12 Oct 2016 20:15:31 +0000 (22:15 +0200)]
radeonsi: adjust and clean up Z_ORDER and EXEC_ON_x settings
The table was copied from the Vulkan driver. The comment lines are as long
as the table for cosmetic reasons.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Wed, 12 Oct 2016 19:47:41 +0000 (21:47 +0200)]
radeonsi: disable ReZ
This is a serious performance fix. Discovered by luck.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94354
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 11 Oct 2016 21:19:46 +0000 (23:19 +0200)]
radeonsi: implement TC-compatible HTILE
so that decompress blits aren't needed and depth texturing needs less
memory bandwidth.
Z16 and Z24 are promoted to Z32_FLOAT by the driver, because TC-compatible
HTILE only supports Z32_FLOAT. This doubles memory footprint for Z16.
The format promotion is not visible to state trackers.
This is part of TC-compatible renderbuffer compression, which has 3 parts:
DCC, HTILE, FMASK. Only TC-compatible FMASK compression is missing now.
I don't see a measurable increase in performance though.
(I tested Talos Principle and DiRT: Showdown, the latter is improved by
0.5%, which is almost noise, and it originally used layered Z16,
so at least we know that Z16 promoted to Z32F isn't slower now)
Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Wed, 12 Oct 2016 01:06:08 +0000 (03:06 +0200)]
gallium: add PIPE_RESOURCE_FLAG_TEXTURING_MORE_LIKELY
For performance tuning in drivers. It filters out window system
framebuffers and OpenGL renderbuffers.
radeonsi will use this to guess whether a depth buffer will be read
by a shader. There is no guarantee about what will actually happen.
This is a departure from PIPE_BIND flags which are defined to be strict
but they are useless in practice.
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Nicolai Hähnle [Thu, 13 Oct 2016 14:03:06 +0000 (16:03 +0200)]
radeonsi: fix regression in image atomics
Caused by a bad rebase when pushing commit 76a940893.
Nicolai Hähnle [Mon, 10 Oct 2016 18:20:22 +0000 (20:20 +0200)]
st/mesa: fix vertex elements setup for doubles
Whether one or two slots are taken up by one API array depends on the
vertex shader, not on how the array is configured. When an array is
set up with fewer components than the shader expects, the high components
are undefined.
Fixes GL45-CTS.vertex_attrib_binding.basic-inputL-case1.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Mon, 10 Oct 2016 09:44:43 +0000 (11:44 +0200)]
st/glsl_to_tgsi: remove unnecessary ir_instruction argument from get_opcode
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Mon, 10 Oct 2016 09:44:03 +0000 (11:44 +0200)]
st/glsl_to_tgsi: fix textureGatherOffset with indirectly loaded offsets
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Sun, 9 Oct 2016 20:28:30 +0000 (22:28 +0200)]
st/glsl_to_tgsi: simplify translate_tex_offset
This fixes a bug with offsets from uniforms which seems to have only been
noticed as a crash in piglit's
arb_gpu_shader5/compiler/builtin-functions/fs-gatherOffset-uniform-offset.frag
on radeonsi.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Mon, 10 Oct 2016 13:09:40 +0000 (15:09 +0200)]
radeonsi: fix the coordinate overloading of llvm.amdgcn.image.atomic.cmpswap.*
Fixes GL45-CTS.shader_image_load_store.basic-allTargets-atomic*
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Nicolas Koch [Wed, 12 Oct 2016 11:55:46 +0000 (13:55 +0200)]
radv: Return correct result in EnumeratePhysicalDevices
If pPhysicalDevices is too small for all physical devices,
the driver must return VK_INCOMPLETE. Since only a single
physical device is supported, this is only the case when
*pPhysicalDeviceCount == 0 && pPhysicalDevices != NULL.
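A sketch of the single-device case with local stand-in types, so it builds without the Vulkan headers. All names are hypothetical; the value 5 mirrors VK_INCOMPLETE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for VkResult and VkPhysicalDevice. */
typedef enum {
   EXAMPLE_SUCCESS = 0,
   EXAMPLE_INCOMPLETE = 5
} example_result;

struct example_physical_device {
   int id;
};
typedef struct example_physical_device *example_physical_device_handle;

static struct example_physical_device example_the_device = { 1 };

static example_result
example_enumerate_physical_devices(uint32_t *count,
                                   example_physical_device_handle *devices)
{
   assert(count != NULL);

   if (devices == NULL) {
      *count = 1;               /* query: report how many devices exist */
      return EXAMPLE_SUCCESS;
   }

   if (*count < 1) {
      /* An array was provided but has no room for the one device. */
      *count = 0;
      return EXAMPLE_INCOMPLETE;
   }

   devices[0] = &example_the_device;
   *count = 1;
   return EXAMPLE_SUCCESS;
}
```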
Signed-off-by: Dave Airlie <airlied@redhat.com>
Ilia Mirkin [Wed, 12 Oct 2016 18:01:34 +0000 (14:01 -0400)]
st/mesa: only flip stipple pattern for winsys fbo's
Gallium is completely oblivious to whether the fbo is flipped or not.
Only flip the stipple pattern when the fbo is flipped as well. Otherwise
the driver has no idea when to unflip the pattern.
Fixes bin/gl-2.1-polygon-stipple-fs -fbo
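As a rough sketch, assuming a 32x32 pattern stored as one 32-bit word per row and a flag telling whether the target framebuffer is vertically flipped (function and parameter names are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_STIPPLE_ROWS 32

/* Copies the stipple pattern, reversing the row order only when the target
 * framebuffer is vertically flipped. */
static void
example_prepare_stipple(const uint32_t src[EXAMPLE_STIPPLE_ROWS],
                        uint32_t dst[EXAMPLE_STIPPLE_ROWS],
                        bool fb_is_flipped)
{
   assert(src != dst);

   for (int row = 0; row < EXAMPLE_STIPPLE_ROWS; row++) {
      int src_row = fb_is_flipped ? EXAMPLE_STIPPLE_ROWS - 1 - row : row;
      dst[row] = src[src_row];
   }
}
```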
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Emil Velikov [Wed, 12 Oct 2016 15:06:47 +0000 (16:06 +0100)]
swr: automake: add ar_eventhandlerfile_h.template to the tarball
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Wed, 12 Oct 2016 00:03:25 +0000 (01:03 +0100)]
radv: add all headers to the sources list
Otherwise they'll be missing from the tarball and the build will fail.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Ilia Mirkin [Wed, 12 Oct 2016 14:24:59 +0000 (10:24 -0400)]
nvc0/ir: fix textureGather with a single offset
Recent fix for non-const offsets broke the case of a single offset (vs 4
offsets). The later code relies on the offs array to contain null values
to tell whether they should be added onto the srcs list.
Fixes: 5239bd592 ("nvc0/ir: fix overwriting of value backing non-constant gather offset")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Ilia Mirkin [Mon, 10 Oct 2016 20:57:50 +0000 (16:57 -0400)]
nv50/ir: copy over value's register id when resolving merge of a phi
The offset needs to be properly copied over to the phi value, otherwise
it will get assigned to the base of the merge instead of the proper
location.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Nicolai Hähnle [Thu, 6 Oct 2016 21:10:22 +0000 (23:10 +0200)]
st/mesa: enable ARB_enhanced_layouts and turn the cap on
v2: mark llvmpipe & softpipe properly as well (Jason Wood)
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Fri, 7 Oct 2016 10:19:33 +0000 (12:19 +0200)]
st/glsl_to_tgsi: adjust swizzles and writemasks for explicit components
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Fri, 7 Oct 2016 10:19:11 +0000 (12:19 +0200)]
st/glsl_to_tgsi: explicitly track all input and output declarations
In order to be able to emit overlapping input and output array
declarations, we flip the logic of emitting those declarations on its
head: rather than iterating over slots and emitting the corresponding
declarations, we iterate over the declarations from GLSL and emit those.
v2: fix some regressions related to structs
v3: fix a regression in geometry and tessellation shader array handling
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v2)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v2)
Nicolai Hähnle [Fri, 7 Oct 2016 19:30:05 +0000 (21:30 +0200)]
st/glsl_to_tgsi: mark "gaps" in input/output arrays as used
In some cases, a shader may have an input/output array but not use some
entries in the middle. This happens with eON games, for example.
We emit declarations that cover the entire array range even if there are
some unused gaps. This patch now reflects that in the InputsRead etc.
fields to ensure the various input/outputMapping arrays are actually
correct, which will be important when we re-jiggle the way declarations
are emitted.
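Marking the gaps amounts to OR-ing a contiguous bit range into the usage mask. A sketch with hypothetical helper names, assuming a 64-bit mask with one bit per slot:

```c
#include <assert.h>
#include <stdint.h>

/* Returns a mask with `count` consecutive bits set starting at `first`. */
static uint64_t
example_slot_range_mask(unsigned first, unsigned count)
{
   assert(count >= 1 && first + count <= 64);

   /* Shifting a 64-bit value by 64 is undefined, so handle it apart. */
   uint64_t bits = (count == 64) ? ~UINT64_C(0)
                                 : ((UINT64_C(1) << count) - 1);
   return bits << first;
}

/* Marks every slot covered by a declared array as read, including entries
 * the shader never touches. */
static uint64_t
example_mark_array_used(uint64_t inputs_read, unsigned first, unsigned count)
{
   return inputs_read | example_slot_range_mask(first, count);
}
```

For an array covering slots 4..7 where only slots 4 and 7 are read, the mask goes from 0x90 to 0xF0.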
v2: fix a typo (Edward O'Callaghan)
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Fri, 7 Oct 2016 14:15:30 +0000 (16:15 +0200)]
st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations
This optimization is incorrect with 64-bit operations, because the
channel-splitting logic in emit_asm ends up being applied twice to
the source operands.
A lucky coincidence of how the writemask test works resulted in this
optimization basically never being applied anyway. As far as I can tell,
the only case where it would (incorrectly) have been applied is something
like
    dvec2 d;
    float x = (float)d.y;
which nobody seems to have ever done. But the moral equivalent does occur
in one of the component layout piglit tests.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Fri, 7 Oct 2016 10:49:36 +0000 (12:49 +0200)]
st/glsl_to_tgsi: simpler fixup of empty writemasks
Empty writemasks mean "copy everything", so we can always just use the number
of vector elements (which uses the GLSL meaning here, i.e. each double is a
single element/writemask bit).
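In other words, something along these lines (a sketch with a hypothetical function name, assuming at most four vector elements):

```c
#include <assert.h>

/* An empty (zero) writemask stands for "write every component", so it is
 * replaced by a mask with one bit per vector element. */
static unsigned
example_fixup_writemask(unsigned writemask, unsigned vector_elements)
{
   assert(vector_elements >= 1 && vector_elements <= 4);

   if (writemask == 0)
      return (1u << vector_elements) - 1;
   return writemask;
}
```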
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Fri, 7 Oct 2016 15:33:07 +0000 (17:33 +0200)]
st/glsl_to_tgsi: explicit handling of writemask for depth/stencil export
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Thu, 6 Oct 2016 21:10:10 +0000 (23:10 +0200)]
glsl: dump explicit location when printing IR
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Nicolai Hähnle [Wed, 12 Oct 2016 15:24:37 +0000 (17:24 +0200)]
tgsi/ureg: add ureg_DECL_output_layout
For specifying an exact location/component.
v2: change the order of parameters (Dave)
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
Nicolai Hähnle [Fri, 7 Oct 2016 10:07:21 +0000 (12:07 +0200)]
tgsi/ureg: add layout/component input declarations
v2: change the order of parameters (Dave)
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
Nicolai Hähnle [Fri, 7 Oct 2016 10:53:55 +0000 (12:53 +0200)]
tgsi/scan: fix num_inputs/num_outputs for shaders with overlapping arrays
v2: remove a tautological left-over assert (Marek)
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
Nicolai Hähnle [Fri, 7 Oct 2016 07:42:55 +0000 (09:42 +0200)]
gallium: add PIPE_CAP_TGSI_ARRAY_COMPONENTS
This is a screen cap because drivers are expected to support it either
for all shader types or for none of them.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tom Stellard [Tue, 11 Oct 2016 21:06:54 +0000 (21:06 +0000)]
radeonsi: Use the new image load/store intrinsic signatures
This patch requires LLVM r284024 or newer.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tom Stellard [Tue, 11 Oct 2016 20:23:52 +0000 (20:23 +0000)]
radeonsi: Add function for converting LLVM type to intrinsic string
The existing function only worked for integer types.
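For reference, LLVM overloaded-intrinsic suffixes look like "i32", "f32" or "v4f32". The sketch below builds such a string from a hypothetical type description; it does not use the LLVM API and is not the radeonsi function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical description of a scalar or vector type. */
struct example_type {
   bool is_float;
   unsigned bits;      /* element width, e.g. 32 */
   unsigned vec_len;   /* 1 for a scalar */
};

/* Writes a suffix such as "i32", "f32" or "v4f32" into buf. */
static void
example_build_type_name(struct example_type type, char *buf, size_t bufsize)
{
   assert(buf != NULL && bufsize > 0);

   const char kind = type.is_float ? 'f' : 'i';
   if (type.vec_len > 1)
      snprintf(buf, bufsize, "v%u%c%u", type.vec_len, kind, type.bits);
   else
      snprintf(buf, bufsize, "%c%u", kind, type.bits);
}
```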
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Tom Stellard [Tue, 11 Oct 2016 16:43:36 +0000 (16:43 +0000)]
radeonsi: Refactor image store/load intrinsic name creation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 10 Oct 2016 20:24:27 +0000 (22:24 +0200)]
winsys/amdgpu: fix infinite loop w/ RADEON_NOOP=1 caused by unsubmitted fences
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 11 Oct 2016 14:55:41 +0000 (16:55 +0200)]
radeonsi: fix R600_DEBUG=precompile for shader-db
radeonsi no longer supports pixel shaders without interpolation optimizations,
which led to assertion failures in si_shader_ps when running shader-db.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 10 Oct 2016 16:51:24 +0000 (18:51 +0200)]
radeonsi: use TC write-back instead of full cache invalidation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 10 Oct 2016 16:49:22 +0000 (18:49 +0200)]
radeonsi: implement TC L2 write-back (flush) without cache invalidation
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 10 Oct 2016 15:39:43 +0000 (17:39 +0200)]
radeonsi: don't invalidate VMEM L1 for memory barriers for index buffers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Samuel Pitoiset [Thu, 6 Oct 2016 23:16:24 +0000 (01:16 +0200)]
nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)
total instructions in shared programs :2286901 -> 2284473 (-0.11%)
total gprs used in shared programs    :335256 -> 335273 (0.01%)
total local used in shared programs   :31968 -> 31968 (0.00%)

                local        gpr       inst      bytes
    helped           0         41        852        852
      hurt           0         44         23         23
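The rewrite relies on the two forms computing the same value. In plain C terms (a sketch on 32-bit unsigned values with hypothetical function names, not the nv50 IR code):

```c
#include <assert.h>
#include <stdint.h>

/* Two-instruction form: SHL followed by ADD. */
static uint32_t
example_add_shl(uint32_t a, uint32_t b, uint32_t c)
{
   assert(b < 32);
   uint32_t shifted = a << b;   /* SHL */
   return shifted + c;          /* ADD */
}

/* Fused form: one SHLADD producing (a << b) + c. */
static uint32_t
example_shladd(uint32_t a, uint32_t b, uint32_t c)
{
   assert(b < 32);
   return (a << b) + c;
}
```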
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Nicolai Hähnle [Tue, 11 Oct 2016 13:43:44 +0000 (15:43 +0200)]
mapi: fix out-of-tree build dependencies
We shouldn't be using wildcard here in the first place, but changing that
is some effort. As it stands, make -p confirms that glapi_gen_mapi_deps only
contains mapi_abi.py when building outside the Mesa tree.
As a result, only some of the tables were updated when XML files change, but
not the tables for shared glapi. This change ensures that we pick up the
XML files and scripts from the source tree as dependencies also for shared
glapi.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>