Mathias Fröhlich [Sat, 10 Mar 2018 15:01:31 +0000 (16:01 +0100)]
gallium: Use struct gl_array_attributes* as st_pipe_vertex_format argument.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Ian Romanick [Thu, 8 Mar 2018 05:05:34 +0000 (21:05 -0800)]
mesa: Don't write to user buffer in glGetTexParameterIuiv on error
With some sets of optimization flags, GCC will generate warnings like
this:
src/mesa/main/texparam.c:2327:27: warning: ‘*((void *)&ip+12)’ may be used uninitialized in this function [-Wmaybe-uninitialized]
params[3] = ip[3];
~~^~~
src/mesa/main/texparam.c:2320:16: note: ‘*((void *)&ip+12)’ was declared here
GLint ip[4];
^~
ip is not initialized in cases where a GL error is generated. In these
cases, we should *not* write to the user's buffer, so this is actually a
bug. I wrote a new piglit test gl-3.0-texparameteri to show this bug.
I suspect that Coverity also detected this, but the scan site is
currently down.
Fixes: c2c507786 "main: Added entry points for glGetTextureParameteriv, Iiv, and Iuiv."
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Roman Gilg [Mon, 5 Mar 2018 16:41:44 +0000 (17:41 +0100)]
gallium: work around libtool relink issue for libdrm
This is similar to commit
90633079. libtool links first to system directories
instead of custom locations of libdrm on relinking. Since a more recent libdrm
version than the one provided by the system is often needed when compiling
mesa, make sure this works by putting libdrm in front.
See also: https://bugs.freedesktop.org/show_bug.cgi?id=100259
Signed-off-by: Roman Gilg <subdiff@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Thu, 8 Mar 2018 17:08:45 +0000 (17:08 +0000)]
vulkan: autotools: do not redirect stdin/stdout for wayland-scanner
The tool accepts the input and output files as arguments.
There's no need for the redirection.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Emil Velikov [Thu, 8 Mar 2018 17:07:39 +0000 (17:07 +0000)]
wayland-drm: autotools: do not redirect stdin/stdout for wayland-scanner
The tool accepts the input and output files as arguments.
There's no need for the redirection.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Emil Velikov [Thu, 8 Mar 2018 16:16:18 +0000 (16:16 +0000)]
egl: autotools: do not redirect stdin/stdout for wayland-scanner
The tool accepts the input and output files as arguments.
There's no need for the redirection.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Emil Velikov [Thu, 8 Mar 2018 14:07:07 +0000 (14:07 +0000)]
docs: document removal of GLX_SGIX_swap_{barrier,group} stubs
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Emil Velikov [Mon, 5 Mar 2018 18:33:14 +0000 (18:33 +0000)]
glx: remove empty GLX_SGIX_swap_group stubs
The extension was never implemented. Quick search suggests:
- no actual users (on my Arch setup)
- the Nvidia driver does not implement the extension
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Emil Velikov [Mon, 5 Mar 2018 18:30:40 +0000 (18:30 +0000)]
gallium/x11: remove empty GLX_SGIX_swap_group stubs
The extension was never implemented. Quick search suggests:
- no actual users (on my Arch setup)
- the Nvidia driver does not implement the extension
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Emil Velikov [Mon, 5 Mar 2018 18:28:35 +0000 (18:28 +0000)]
x11: remove empty GLX_SGIX_swap_group stubs
The extension was never implemented. Quick search suggests:
- no actual users (on my Arch setup)
- the Nvidia driver does not implement the extension
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Emil Velikov [Mon, 5 Mar 2018 18:25:16 +0000 (18:25 +0000)]
glx: remove empty GLX_SGIX_swap_barrier stubs
The extension was never implemented. Quick search suggests:
- no actual users (on my Arch setup)
- the Nvidia driver does not implement the extension
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Emil Velikov [Mon, 5 Mar 2018 18:22:38 +0000 (18:22 +0000)]
gallium/x11: remove empty GLX_SGIX_swap_barrier stubs
The extension was never implemented. Quick search suggests:
- no actual users (on my Arch setup)
- the Nvidia driver does not implement the extension
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Emil Velikov [Mon, 5 Mar 2018 18:17:13 +0000 (18:17 +0000)]
x11: remove empty GLX_SGIX_swap_barrier stubs
The extension was never implemented. Quick search suggests:
- no actual users (on my Arch setup)
- the Nvidia driver does not implement the extension
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Emil Velikov [Mon, 5 Mar 2018 18:14:51 +0000 (18:14 +0000)]
configure: remove unused AM_CONDITIONAL
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Bas Nieuwenhuizen [Fri, 9 Mar 2018 16:18:03 +0000 (17:18 +0100)]
radv: Increase the number of dynamic uniform buffers.
The vulkan API is not ideal as it does not allow us have a
shared limit.
Feral needs 15+6 for one of their games, and I'm not a fan
of overcommitting the limits, so increase the number of
dynamic uniform buffers to 16.
CC: <mesa-stable@lists.freedesktop.org>
CC: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 8 Mar 2018 20:18:55 +0000 (06:18 +1000)]
u_vbuf/translate: pass max_index into the set_buffer.
This fixes a memory trashing crash (not the test) seen with
dEQP-GLES3.stress.draw.unaligned_data.random.203
on virgl.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 9 Mar 2018 06:03:53 +0000 (16:03 +1000)]
r600: implement callstack workaround for evergreen.
This is ported from the sb backend, there are some issues with
evergreen stacks on the boundary between entries and ALU_PUSH_BEFORE
instructions.
Whenever we are going to use a push before, we check the stack
usage and if we have to use the workaround, then we switch to
a separate push.
I noticed this problem dealing with some of the soft fp64 shaders,
in nosb mode, they are quite stack happy.
This fixes all the glitches and inconsistencies I've seen with them
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Elie Tournier <elie.tournier@collabora.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Marek Olšák [Sun, 11 Mar 2018 00:23:45 +0000 (19:23 -0500)]
gallium/util: add helper util_wait_for_idle
This is an old patch that I had.
Roland Scheidegger [Sat, 10 Mar 2018 01:48:42 +0000 (02:48 +0100)]
u_blit: (trivial) u_blit.h needs to include p_defines.h
(For the pipe_tex_filter enum)
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Christian Gmeiner [Sat, 10 Mar 2018 14:53:27 +0000 (15:53 +0100)]
travis: bump libxcb version to 1.13
Fixes following dependency problem:
Native dependency xcb-dri3 found: NO found '1.11' but need: '>= 1.13'
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Fixes: c80c08e22603 ("vulkan/wsi/x11: Add support for DRI3 v1.2")
Mathias Fröhlich [Sun, 4 Mar 2018 17:15:53 +0000 (18:15 +0100)]
mesa: Make gl_vertex_array contain pointers to first order VAO members.
Instead of keeping a copy of the vertex array content in
struct gl_vertex_array only keep pointers to the first order
information originaly in the VAO.
For that represent the current values by struct gl_array_attributes
and struct gl_vertex_buffer_binding.
v2: Change comments.
Remove gl... prefix from variables except in the i965 directory where
it was like that before. Reindent because of that.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Roland Scheidegger [Fri, 9 Mar 2018 04:27:25 +0000 (05:27 +0100)]
draw: fix alpha value for very short aa lines
The logic would not work correctly for line lengths smaller than 1.0,
even a degenerated line with length 0 would still produce a fragment
with anyhwere between alpha 0.0 and 0.5.
Reviewed-by: Brian Paul <brianp@vmware.com>
Jordan Justen [Wed, 7 Mar 2018 07:28:00 +0000 (23:28 -0800)]
intel/vulkan: Hard code CS scratch_ids_per_subslice for Cherryview
Ken suggested that we might be underallocating scratch space on HD
400. Allocating scratch space as though there was actually 8 EUs
seems to help with a GPU hang seen on synmark CSDof.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jordan Justen [Tue, 6 Mar 2018 16:35:50 +0000 (08:35 -0800)]
i965: Hard code CS scratch_ids_per_subslice for Cherryview
Ken suggested that we might be underallocating scratch space on HD
400. Allocating scratch space as though there was actually 8 EUs
seems to help with a GPU hang seen on synmark CSDof.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104636
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105290
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Marek Olšák [Wed, 7 Mar 2018 18:47:28 +0000 (13:47 -0500)]
st/dri: fix OpenGL-OpenCL interop for GL_TEXTURE_BUFFER
Tested by our OpenCL team.
Fixes: 9c499e6759b26c5e "st/mesa: don't invoke st_finalize_texture & st_convert_sampler for TBOs"
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Fri, 9 Mar 2018 21:25:42 +0000 (16:25 -0500)]
radeonsi: add a workaround for GFX9 hang with init_config alignment
Fixes: 75c5d25f0f34cd702 "radeonsi: align command buffer starting address to fix some Raven hangs"
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
Marek Olšák [Fri, 9 Mar 2018 21:24:40 +0000 (16:24 -0500)]
ac/gpu_info: print ib_start_alignment, add assertion
Greg V [Tue, 6 Mar 2018 19:16:03 +0000 (22:16 +0300)]
meson: Use system_has_kms_drm in default driver selection
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Eric Anholt [Tue, 6 Feb 2018 16:43:24 +0000 (16:43 +0000)]
broadcom/vc4: Add an accelerated path to turn raster R8/RG88 into tiled.
Drawing a 1080p YV12 video stream generated by MMAL goes from 10.5 FPS to
36.
Eric Anholt [Wed, 7 Feb 2018 14:40:08 +0000 (14:40 +0000)]
gallium: Add a util_blitter path for using a custom VS and FS.
Like the r600 paths to use other custom states, we pass in a couple of
parameters to customize the innards of the blitter. It's up to the caller
to wrap other state necessary for its shaders (for example, constant
buffers for the uniforms the shader uses).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Eric Anholt [Wed, 7 Feb 2018 15:22:19 +0000 (15:22 +0000)]
broadcom/vc4: Allow binding non-zero constant buffers.
We're going to use UBO loads for implementing YUV linear-to-T-format
blits.
Eric Anholt [Wed, 28 Feb 2018 22:16:54 +0000 (14:16 -0800)]
broadcom: Remove our defines of DRM_FORMAT_MOD_INVALID.
The imported drm_fourcc.h handles it now.
Eric Anholt [Wed, 28 Feb 2018 22:15:34 +0000 (14:15 -0800)]
broadcom: Suppress compiler warnings about enum pipe_tex_filter.
Louis-Francis Ratté-Boulianne [Fri, 6 Oct 2017 05:26:51 +0000 (01:26 -0400)]
egl/x11: Re-allocate buffers if format is suboptimal
If PresentCompleteNotify event says the pixmap was presented
with mode PresentCompleteModeSuboptimalCopy, it means the pixmap
could possibly have been flipped instead if allocated with a
different format/modifier.
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Louis-Francis Ratté-Boulianne [Fri, 7 Jul 2017 06:54:26 +0000 (02:54 -0400)]
egl/x11: Support DRI3 v1.1
Add support for DRI3 v1.1, which allows pixmaps to be backed by
multi-planar buffers, or those with format modifiers. This is both
for allocating render buffers, as well as EGLImage imports from a
native pixmap (EGL_NATIVE_PIXMAP_KHR).
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Louis-Francis Ratté-Boulianne [Wed, 27 Sep 2017 03:11:55 +0000 (23:11 -0400)]
vulkan/wsi/x11: Return VK_SUBOPTIMAL_KHR for X11
When it is detected that a window could have been flipped
but has been copied because of suboptimal format/modifier.
The Vulkan client should then re-create the swapchain.
Signed-off-by: Louis-Francis Ratté-Boulianne <lfrb@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Daniel Stone [Thu, 8 Jun 2017 16:24:30 +0000 (17:24 +0100)]
vulkan/wsi/x11: Add support for DRI3 v1.2
Adds support for multiple planes and buffer modifiers.
v4: Rename "has_dri3_v1_1" to "has_dri3_modifiers"
v12: Multi-planar/modifier support is now DRI3 v1.2; also update release
versions
Dylan Baker [Fri, 2 Mar 2018 17:57:54 +0000 (09:57 -0800)]
autotools: include all meson.build files
Otherwise SWR cannot be built with meson from an autotools generated
tarball, such as the 18.0.0-rc4 tarball.
Fixes: 16bf81383080 ("meson/swr: re-shuffle generated files")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: George Kyriazis <george.kyriazis@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Michel Dänzer [Thu, 8 Mar 2018 16:32:50 +0000 (17:32 +0100)]
st/mesa: gl_program::info.system_values_read is a 64-bit-field
We were dropping the upper 32 bits, which caused assertion failures in
some compute shader piglit tests with radeonsi since the commit below.
Fixes: 752e96970303 ("compiler: Add two new system values for subgroups")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
George Kyriazis [Thu, 1 Mar 2018 18:39:18 +0000 (12:39 -0600)]
swr/rast: Refactor memory gather operations
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Tue, 27 Feb 2018 21:29:52 +0000 (15:29 -0600)]
swr/rast: Add KNOB_DISABLE_SPLIT_DRAW
This is useful for archrast data collection. This greatly speeds up the
post processing script since there is significantly less events generated.
Finally, this is a simpler option to communicate to users than having
them directly adjust MAX_PRIMS_PER_DRAW and MAX_TESS_PRIMS_PER_DRAW.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Fri, 2 Mar 2018 06:54:38 +0000 (00:54 -0600)]
swr/rast: Add VPOPCNT
Supports popcnt on vector masks (e.g. <8 x i1>)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Wed, 28 Feb 2018 23:33:13 +0000 (17:33 -0600)]
swr/rast: Add tracking for stream out topology
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Mon, 26 Feb 2018 23:55:23 +0000 (17:55 -0600)]
swr/rast: Add split draw and other state information to DrawInfoEvent.
Removed specific split draw events.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Mon, 26 Feb 2018 21:19:08 +0000 (15:19 -0600)]
swr/rast: Refactor api and worker event handlers.
In the API event handler we want to share information between the core
layer and the API. Specifically, around associating various ids with
different kinds of events. For example, associate render pass id with
draw ids, or command buffer ids with draw ids.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Sat, 24 Feb 2018 00:51:18 +0000 (18:51 -0600)]
swr/rast: Add support for generalized late and early z/stencil stats
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Fri, 23 Feb 2018 22:11:04 +0000 (16:11 -0600)]
swr/rast: Rasterized Subspans stats support
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Wed, 21 Feb 2018 01:24:55 +0000 (19:24 -0600)]
swr/rast: Added comment
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Eric Engestrom [Mon, 26 Feb 2018 13:34:54 +0000 (13:34 +0000)]
vulkan/wsi: clean up cleanup path
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Bas Nieuwenhuizen [Fri, 9 Mar 2018 13:08:38 +0000 (14:08 +0100)]
radv: Fix the autotools build take 2.
Forgot to remove a word....
Fixes: 04ffabf17a "radv: Fix autotools build."
Lucas Stach [Wed, 7 Mar 2018 13:31:59 +0000 (14:31 +0100)]
etnaviv: allow mixing different bit depths for color and depth surfaces
Vivante hardware supports this just fine. There is no reason why this shouldn't
be advertised as a valid combination.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Thierry Reding [Thu, 22 Feb 2018 17:21:45 +0000 (18:21 +0100)]
autotools: Add tegra to AM_DISTCHECK_CONFIGURE_FLAGS
This allows the driver to be built on a make distcheck and makes sure
that it properly builds when a distribution tarball is made.
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Thierry Reding [Tue, 27 May 2014 22:36:48 +0000 (00:36 +0200)]
tegra: Initial support
Tegra K1 and later use a GPU that can be driven by the Nouveau driver.
But the GPU is a pure render node and has no display engine, hence the
scanout needs to happen on the Tegra display hardware. The GPU and the
display engine each have a separate DRM device node exposed by the
kernel.
To make the setup appear as a single device, this driver instantiates
a Nouveau screen with each instance of a Tegra screen and forwards GPU
requests to the Nouveau screen. For purposes of scanout it will import
buffers created on the GPU into the display driver. Handles that
userspace requests are those of the display driver so that they can be
used to create framebuffers.
This has been tested with some GBM test programs, as well as kmscube and
weston. All of those run without modifications, but I'm sure there is a
lot that can be improved.
Some fixes contributed by Hector Martin <marcan@marcan.st>.
Changes in v2:
- duplicate file descriptor in winsys to avoid potential issues
- require nouveau when building the tegra driver
- check for nouveau driver name on render node
- remove unneeded dependency on libdrm_tegra
- remove zombie references to libudev
- add missing headers to C_SOURCES variable
- drop unneeded tegra/ prefix for includes
- open device files with O_CLOEXEC
- update copyrights
Changes in v3:
- properly unwrap resources in ->resource_copy_region()
- support vertex buffers passed by user pointer
- allocate custom stream and const uploader
- silence error message on pre-Tegra124
- support X without explicit PRIME
Changes in v4:
- ship Meson build files in distribution tarball
- drop duplicate driver_tegra dependency
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Thierry Reding [Wed, 11 Oct 2017 12:38:56 +0000 (14:38 +0200)]
nouveau: Add framebuffer modifier support
This adds support for framebuffer modifiers to Nouveau. This will be
used by the Tegra driver to share metadata about the format of buffers
(such as the tiling mode or compression).
Changes in v2:
- remove unused parameters to nouveau_buffer_create()
- move format modifier query code to nvc0 backend
- restrict format modifiers to 2D textures
- implement ->query_dmabuf_modifiers()
Changes in v4:
- add UAPI include path on meson builds
Changes in v5:
- remove unnecessary includes
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Thierry Reding [Wed, 11 Oct 2017 12:41:26 +0000 (14:41 +0200)]
nouveau/nvc0: Extract common tile mode macro
Add a new macro that can be used to extract the tiling mode from a
tile_mode value. This is will be used to determine the number of GOBs
used in block linear mode.
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Thierry Reding [Tue, 20 Feb 2018 14:48:37 +0000 (15:48 +0100)]
drm/tegra: Sanitize format modifiers
The existing format modifier definitions were merged prematurely, and
recent work has unveiled that the definitions are suboptimal in several
ways:
- The format specifiers, except for one, are not Tegra specific, but
the names don't reflect that.
- The number space is split into two, reserving 32 bits for some
"parameter" which most of the modifiers are not going to have.
- Symbolic names for the modifiers are not using the standard
DRM_FORMAT_MOD_* prefix, which makes them awkward to use.
- The vendor prefix NV is somewhat ambiguous.
Fortunately, nobody's started using these modifiers, so we can still fix
the above issues. Do so by using the standard prefix. Also, remove TEGRA
from the name of those modifiers that exist on NVIDIA GPUs as well. In
case of the block linear modifiers, make the "parameter" smaller (4
bits, though only 6 values are valid) and don't let that leak into any
of the other modifiers.
Finally, also use the more canonical NVIDIA instead of the ambiguous NV
prefix.
This is based on commit
268892cb63a822315921a8dab48ac3e4abf7dd03 from
Linux v4.16-rc1.
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Thierry Reding [Tue, 20 Feb 2018 14:47:25 +0000 (15:47 +0100)]
drm/fourcc: Fix fourcc_mod_code() definition
Avoid a compiler warnings when the val parameter is an expression.
This is based on commit
5843f4e02fbe86a59981e35adc6cabebee46fdc0 from
Linux v4.16-rc1.
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Andre Heider <a.heider@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Bas Nieuwenhuizen [Fri, 9 Mar 2018 07:43:01 +0000 (08:43 +0100)]
radv: Fix autotools build.
Forgot it again ....
Fixes: b6347807a9 "radv: Generate icd files."
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Samuel Pitoiset [Thu, 8 Mar 2018 16:30:05 +0000 (17:30 +0100)]
ac/nir: set number of channels for packed mrt exports
Bit 0 enables VSRC0 (R in low bits, G high) and bit 2 enables
VSRC1 (B in low bits, A high).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bas Nieuwenhuizen [Thu, 8 Mar 2018 23:49:57 +0000 (00:49 +0100)]
radv: Update version to 1.1.70.
Turns out they did not reset the patch number on release.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Bas Nieuwenhuizen [Thu, 8 Mar 2018 23:47:26 +0000 (00:47 +0100)]
radv: Generate icd files.
If the api version is too low, the loader clamps the application
requested version to the advertized version, which messes with
which extensions are enabled.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Ian Romanick [Thu, 22 Feb 2018 02:15:52 +0000 (18:15 -0800)]
nir: Don't i2b a value that is already Boolean
A bunch of shaders have sequences like:
i2b(u2i(floatBitsToUint(intBitsToFloat(x == y ? -1 : 0))))
Other optimizations (and NIR's typeless nature) reduce this to
i2b(x == y)
which is silly.
Skylake
total instructions in shared programs:
14498698 ->
14497948 (<.01%)
instructions in affected programs: 74480 -> 73730 (-1.01%)
helped: 277
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 2.71 x̃: 2
helped stats (rel) min: 0.04% max: 13.79% x̄: 1.45% x̃: 0.68%
95% mean confidence interval for instructions value: -3.35 -2.06
95% mean confidence interval for instructions %-change: -1.74% -1.16%
Instructions are helped.
total cycles in shared programs:
532015500 ->
531999238 (<.01%)
cycles in affected programs:
5943878 ->
5927616 (-0.27%)
helped: 251
HURT: 74
helped stats (abs) min: 1 max: 13149 x̄: 127.89 x̃: 14
helped stats (rel) min: 0.01% max: 17.31% x̄: 1.55% x̃: 0.53%
HURT stats (abs) min: 1 max: 4550 x̄: 214.04 x̃: 15
HURT stats (rel) min: <.01% max: 44.43% x̄: 2.81% x̃: 0.33%
95% mean confidence interval for cycles value: -158.51 58.43
95% mean confidence interval for cycles %-change: -1.07% -0.04%
Inconclusive result (value mean confidence interval includes 0).
total loops in shared programs: 4753 -> 4735 (-0.38%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.
Haswell and Broadwell had simliar results. (Broadwell shown)
total instructions in shared programs:
14791877 ->
14791127 (<.01%)
instructions in affected programs: 77326 -> 76576 (-0.97%)
helped: 278
HURT: 1
helped stats (abs) min: 1 max: 32 x̄: 2.70 x̃: 2
helped stats (rel) min: 0.04% max: 13.79% x̄: 1.42% x̃: 0.68%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.49% max: 0.49% x̄: 0.49% x̃: 0.49%
95% mean confidence interval for instructions value: -3.33 -2.05
95% mean confidence interval for instructions %-change: -1.70% -1.13%
Instructions are helped.
total cycles in shared programs:
558250067 ->
558252872 (<.01%)
cycles in affected programs:
5806328 ->
5809133 (0.05%)
helped: 235
HURT: 83
helped stats (abs) min: 1 max: 10630 x̄: 81.73 x̃: 16
helped stats (rel) min: 0.03% max: 18.58% x̄: 1.60% x̃: 0.51%
HURT stats (abs) min: 1 max: 10590 x̄: 265.19 x̃: 20
HURT stats (rel) min: <.01% max: 15.28% x̄: 1.89% x̃: 0.54%
95% mean confidence interval for cycles value: -89.87 107.51
95% mean confidence interval for cycles %-change: -1.06% -0.32%
Inconclusive result (value mean confidence interval includes 0).
total loops in shared programs: 4735 -> 4717 (-0.38%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.
total fills in shared programs: 83111 -> 83110 (<.01%)
fills in affected programs: 28 -> 27 (-3.57%)
helped: 1
HURT: 0
Ivy Bridge
total instructions in shared programs:
11774173 ->
11773436 (<.01%)
instructions in affected programs: 70819 -> 70082 (-1.04%)
helped: 267
HURT: 0
helped stats (abs) min: 1 max: 48 x̄: 2.76 x̃: 2
helped stats (rel) min: 0.21% max: 19.51% x̄: 1.57% x̃: 0.63%
95% mean confidence interval for instructions value: -3.51 -2.01
95% mean confidence interval for instructions %-change: -1.94% -1.21%
Instructions are helped.
total cycles in shared programs:
257153833 ->
257148932 (<.01%)
cycles in affected programs: 585341 -> 580440 (-0.84%)
helped: 167
HURT: 100
helped stats (abs) min: 1 max: 1327 x̄: 44.89 x̃: 16
helped stats (rel) min: 0.04% max: 26.54% x̄: 2.41% x̃: 0.88%
HURT stats (abs) min: 1 max: 200 x̄: 25.95 x̃: 16
HURT stats (rel) min: 0.04% max: 9.81% x̄: 1.34% x̃: 0.65%
95% mean confidence interval for cycles value: -33.25 -3.46
95% mean confidence interval for cycles %-change: -1.47% -0.54%
Cycles are helped.
total loops in shared programs: 3416 -> 3398 (-0.53%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.
LOST: 2
GAINED: 0
Sandy Bridge
total instructions in shared programs:
10499306 ->
10499094 (<.01%)
instructions in affected programs: 6051 -> 5839 (-3.50%)
helped: 43
HURT: 0
helped stats (abs) min: 1 max: 32 x̄: 4.93 x̃: 2
helped stats (rel) min: 0.39% max: 12.90% x̄: 4.29% x̃: 2.45%
95% mean confidence interval for instructions value: -7.66 -2.20
95% mean confidence interval for instructions %-change: -5.47% -3.12%
Instructions are helped.
total cycles in shared programs:
145862568 ->
145861370 (<.01%)
cycles in affected programs: 61733 -> 60535 (-1.94%)
helped: 36
HURT: 2
helped stats (abs) min: 16 max: 66 x̄: 36.61 x̃: 35
helped stats (rel) min: 0.45% max: 17.31% x̄: 4.92% x̃: 2.81%
HURT stats (abs) min: 18 max: 102 x̄: 60.00 x̃: 60
HURT stats (rel) min: 1.10% max: 1.85% x̄: 1.48% x̃: 1.48%
95% mean confidence interval for cycles value: -41.28 -21.77
95% mean confidence interval for cycles %-change: -6.16% -3.00%
Cycles are helped.
total loops in shared programs: 1803 -> 1785 (-1.00%)
loops in affected programs: 18 -> 0
helped: 18
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for loops value: -1.00 -1.00
95% mean confidence interval for loops %-change: -100.00% -100.00%
Loops are helped.
LOST: 4
GAINED: 0
No changes on Iron Lake of GM45.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Sat, 17 Feb 2018 01:33:13 +0000 (17:33 -0800)]
i965/vec4: Allow CSE on subset VF constant loads
v2: Rewrite the code that generates the VF mask. Suggested by Ken.
No changes on other platforms.
Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown)
total instructions in shared programs:
13059891 ->
13059884 (<.01%)
instructions in affected programs: 431 -> 424 (-1.62%)
helped: 7
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.19% max: 5.26% x̄: 2.05% x̃: 1.49%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -3.39% -0.71%
Instructions are helped.
total cycles in shared programs:
409260032 ->
409260018 (<.01%)
cycles in affected programs: 4228 -> 4214 (-0.33%)
helped: 7
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.28% max: 2.04% x̄: 0.54% x̃: 0.28%
95% mean confidence interval for cycles value: -2.00 -2.00
95% mean confidence interval for cycles %-change: -1.15% 0.07%
Inconclusive result (%-change mean confidence interval includes 0).
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ian Romanick [Sat, 17 Feb 2018 01:26:11 +0000 (17:26 -0800)]
i965/vec4: Relax writemask condition in CSE
If the previously seen instruction generates more fields than the new
instruction, still allow CSE to happen. This doesn't do much, but it
also enables a couple more shaders in the next patch. It helped quite a
bit in another change series that I have (at least for now) abandoned.
v2: Add some extra comentary about the parameters to instructions_match.
Suggested by Ken.
No changes on Skylake, Broadwell, Iron Lake or GM45.
Ivy Bridge and Haswell had similar results. (Ivy Bridge shown)
total instructions in shared programs:
11780295 ->
11780294 (<.01%)
instructions in affected programs: 302 -> 301 (-0.33%)
helped: 1
HURT: 0
total cycles in shared programs:
257308315 ->
257308313 (<.01%)
cycles in affected programs: 2074 -> 2072 (-0.10%)
helped: 1
HURT: 0
Sandy Bridge
total instructions in shared programs:
10506687 ->
10506686 (<.01%)
instructions in affected programs: 335 -> 334 (-0.30%)
helped: 1
HURT: 0
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ian Romanick [Thu, 22 Feb 2018 02:06:56 +0000 (18:06 -0800)]
i965/fs: Merge CMP and SEL into CSEL on Gen8+
v2: Fix several problems handling inverted predicates. Add a much
bigger comment around the BRW_CONDITIONAL_NZ case.
v3: Allow uniforms and shader inputs as sources for the original SEL and
CMP instructions. This enables a LOT more shaders to receive CSEL
merging (5816 vs 8564 on SKL).
v4: Report progress.
Broadwell and Skylake had similar results. (Broadwell shown)
helped: 8527
HURT: 0
helped stats (abs) min: 1 max: 27 x̄: 2.44 x̃: 1
helped stats (rel) min: 0.03% max: 17.80% x̄: 1.12% x̃: 0.70%
95% mean confidence interval for instructions value: -2.51 -2.36
95% mean confidence interval for instructions %-change: -1.15% -1.10%
Instructions are helped.
total cycles in shared programs:
559442317 ->
558288357 (-0.21%)
cycles in affected programs:
372699860 ->
371545900 (-0.31%)
helped: 6748
HURT: 1450
helped stats (abs) min: 1 max: 32000 x̄: 182.41 x̃: 12
helped stats (rel) min: <.01% max: 66.08% x̄: 3.42% x̃: 0.70%
HURT stats (abs) min: 1 max: 2538 x̄: 53.08 x̃: 14
HURT stats (rel) min: <.01% max: 96.72% x̄: 3.32% x̃: 0.90%
95% mean confidence interval for cycles value: -179.01 -102.51
95% mean confidence interval for cycles %-change: -2.37% -2.08%
Cycles are helped.
LOST: 0
GAINED: 6
No changes on earlier platforms.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> [v3]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Mon, 23 Nov 2015 04:12:17 +0000 (20:12 -0800)]
i965/fs: Add infrastructure for generating CSEL instructions.
v2 (idr): Don't allow CSEL with a non-float src2.
v3 (idr): Add CSEL to fs_inst::flags_written. Suggested by Matt.
v4 (idr): Only set BRW_ALIGN_16 on Gen < 10 (suggested by Matt). Don't
reset the access mode afterwards (suggested by Samuel and Matt). Add
support for CSEL not modifying the flags to more places (requested by
Matt).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v3]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick [Thu, 15 Feb 2018 22:49:55 +0000 (14:49 -0800)]
nir: Narrow some dot product operations
On vector platforms, this helps elide some constant loads.
v2: Reorder the transformations.
No changes on Broadwell or Skylake.
Haswell
total instructions in shared programs:
13093793 ->
13060163 (-0.26%)
instructions in affected programs:
1277532 ->
1243902 (-2.63%)
helped: 13216
HURT: 95
helped stats (abs) min: 1 max: 18 x̄: 2.56 x̃: 2
helped stats (rel) min: 0.21% max: 20.00% x̄: 3.63% x̃: 2.78%
HURT stats (abs) min: 1 max: 6 x̄: 1.77 x̃: 1
HURT stats (rel) min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
95% mean confidence interval for instructions value: -2.57 -2.49
95% mean confidence interval for instructions %-change: -3.65% -3.54%
Instructions are helped.
total cycles in shared programs:
409580819 ->
409268463 (-0.08%)
cycles in affected programs:
71730652 ->
71418296 (-0.44%)
helped: 9898
HURT: 2352
helped stats (abs) min: 2 max: 16014 x̄: 37.08 x̃: 16
helped stats (rel) min: <.01% max: 35.55% x̄: 6.26% x̃: 4.50%
HURT stats (abs) min: 2 max: 276 x̄: 23.25 x̃: 6
HURT stats (rel) min: <.01% max: 40.00% x̄: 3.54% x̃: 1.97%
95% mean confidence interval for cycles value: -33.19 -17.80
95% mean confidence interval for cycles %-change: -4.50% -4.26%
Cycles are helped.
total fills in shared programs: 82059 -> 82052 (<.01%)
fills in affected programs: 21 -> 14 (-33.33%)
helped: 7
HURT: 0
Sandy Bridge and Ivy Bridge had similar results (Ivy Bridge shown)
total instructions in shared programs:
11811851 ->
11780605 (-0.26%)
instructions in affected programs:
1155007 ->
1123761 (-2.71%)
helped: 12304
HURT: 95
helped stats (abs) min: 1 max: 18 x̄: 2.55 x̃: 2
helped stats (rel) min: 0.21% max: 20.00% x̄: 3.69% x̃: 2.86%
HURT stats (abs) min: 1 max: 6 x̄: 1.77 x̃: 1
HURT stats (rel) min: 0.09% max: 5.56% x̄: 1.25% x̃: 1.19%
95% mean confidence interval for instructions value: -2.56 -2.48
95% mean confidence interval for instructions %-change: -3.71% -3.59%
Instructions are helped.
total cycles in shared programs:
257618409 ->
257316805 (-0.12%)
cycles in affected programs:
71999580 ->
71697976 (-0.42%)
helped: 9155
HURT: 2380
helped stats (abs) min: 2 max: 16014 x̄: 38.44 x̃: 16
helped stats (rel) min: <.01% max: 35.75% x̄: 6.39% x̃: 4.62%
HURT stats (abs) min: 2 max: 290 x̄: 21.14 x̃: 4
HURT stats (rel) min: <.01% max: 41.55% x̄: 3.14% x̃: 1.33%
95% mean confidence interval for cycles value: -34.32 -17.97
95% mean confidence interval for cycles %-change: -4.55% -4.29%
Cycles are helped.
GM45 and Iron Lake had nearly identical results (Iron Lake shown)
total instructions in shared programs:
7886750 ->
7879944 (-0.09%)
instructions in affected programs: 373781 -> 366975 (-1.82%)
helped: 3715
HURT: 47
helped stats (abs) min: 1 max: 8 x̄: 1.86 x̃: 1
helped stats (rel) min: 0.22% max: 16.67% x̄: 2.88% x̃: 2.06%
HURT stats (abs) min: 1 max: 6 x̄: 2.55 x̃: 2
HURT stats (rel) min: 1.09% max: 5.00% x̄: 1.93% x̃: 2.35%
95% mean confidence interval for instructions value: -1.85 -1.77
95% mean confidence interval for instructions %-change: -2.91% -2.73%
Instructions are helped.
total cycles in shared programs:
178114636 ->
178095452 (-0.01%)
cycles in affected programs:
7227666 ->
7208482 (-0.27%)
helped: 3349
HURT: 301
helped stats (abs) min: 2 max: 90 x̄: 6.55 x̃: 4
helped stats (rel) min: <.01% max: 14.18% x̄: 0.95% x̃: 0.63%
HURT stats (abs) min: 2 max: 42 x̄: 9.13 x̃: 10
HURT stats (rel) min: 0.01% max: 11.19% x̄: 1.22% x̃: 1.50%
95% mean confidence interval for cycles value: -5.52 -4.99
95% mean confidence interval for cycles %-change: -0.81% -0.73%
Cycles are helped.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> [v1]
Lionel Landwerlin [Wed, 7 Mar 2018 14:10:15 +0000 (14:10 +0000)]
i965: perf: consolidate unmapping oa perf bo outside accumulation
Do this in one place outside the only caller of the accumulation
function.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Tue, 6 Mar 2018 17:11:56 +0000 (17:11 +0000)]
i965: perf: count number of accumlated reports
This will be reused later.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Tue, 6 Mar 2018 15:47:00 +0000 (15:47 +0000)]
i965: perf: reuse timescale base function from query
We already have the same function in brw_queryobj.c
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Wed, 7 Feb 2018 18:09:58 +0000 (18:09 +0000)]
i965: perf: store sysfs device entry into context
We want to reuse it later on.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Wed, 7 Feb 2018 18:10:57 +0000 (18:10 +0000)]
i965: perf: store the hw_id of the context in the query
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Lionel Landwerlin [Tue, 6 Feb 2018 17:29:32 +0000 (17:29 +0000)]
i965: perf: default case for unknown query types
Just some extra safety before further changes.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Marek Olšák [Tue, 6 Mar 2018 23:30:06 +0000 (18:30 -0500)]
radeonsi: remove chip_class parameter from si_lower_nir
We can get it from si_screen.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Sun, 11 Sep 2016 19:53:20 +0000 (21:53 +0200)]
winsys/amdgpu: query GDS info
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Tue, 6 Mar 2018 20:03:09 +0000 (15:03 -0500)]
winsys/amdgpu: pad compute IBs
v2: pad with PKT2 NOPs on SI
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Wed, 7 Mar 2018 16:36:26 +0000 (11:36 -0500)]
radeonsi: expand constbuf 0 address correctly to fix Vega10 hangs
This is only required with the latest libdrm.
This fixes 32-bit support with high addresses.
(and possibly 64-bit support too because the high bits need to be masked out)
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Wed, 7 Mar 2018 00:07:58 +0000 (19:07 -0500)]
radeonsi: align command buffer starting address to fix some Raven hangs
Cc: 17.3 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Christian Gmeiner [Mon, 5 Mar 2018 22:26:43 +0000 (23:26 +0100)]
etnaviv: add get_driver_query_group_info(..)
This enables AMD_performance_monitor extension.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Christian Gmeiner [Mon, 5 Mar 2018 22:26:42 +0000 (23:26 +0100)]
etnaviv: add query_group_info for sw counters
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Dylan Baker [Wed, 28 Feb 2018 21:07:57 +0000 (13:07 -0800)]
meson: Fix building gallium media libs without egl
v2: - rebase on omx fix
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
Dylan Baker [Wed, 28 Feb 2018 18:13:38 +0000 (10:13 -0800)]
meson: Allow building dri based EGL without GLX
It should be possible to build EGL without GLX, but the meson build
currently doesn't allow that because it too tightly couples glx and dri.
This patch eases dri and glx apart, so that EGL without GLX can be
built.
CC: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Thierry Reding [Tue, 6 Mar 2018 09:44:08 +0000 (10:44 +0100)]
glx/apple: Ship meson build file in tarball
The meson build file for Apple GLX is not listed in the EXTRA_DIST make
variable and therefore isn't shipped as part of the release tarball, so
meson builds from the tarball will fail.
Add the file to EXTRA_DIST to ensure it is included in the tarball.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Samuel Pitoiset [Thu, 8 Mar 2018 08:53:14 +0000 (09:53 +0100)]
ac/nir: do not emit unnecessary null exports in fragment shaders
Null exports should only be needed when no other exports are
emitted. This removes a bunch of 'exp null off, off, off, off done vm'.
Affected games are Dota 2 and Wolfenstein 2, not sure if that
really helps, but code size is decreasing there.
Polaris10:
Totals from affected shaders:
SGPRS: 8216 -> 8216 (0.00 %)
VGPRS: 7072 -> 7072 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 454968 -> 453896 (-0.24 %) bytes
Max Waves: 772 -> 772 (0.00 %)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Eric Engestrom [Thu, 8 Mar 2018 09:52:16 +0000 (09:52 +0000)]
drirc: whitespace fix
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Thomas Hellstrom [Mon, 26 Feb 2018 13:32:01 +0000 (14:32 +0100)]
drirc: Disable the GLX_SGI_video_sync extension for gnome-shell on vmware
With this extension enabled and a server GLX implementation that actually
honors it, Window movement lags considerably on gnome-shell/vmware, so
disable it by default.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Thomas Hellstrom [Mon, 26 Feb 2018 13:30:33 +0000 (14:30 +0100)]
gallium/st_dri: Honor the glx_disable_sgi_video_sync config option
This option is disabled by default. Primarily intended for drivers on
virtual hardware.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Thomas Hellstrom [Mon, 26 Feb 2018 13:27:40 +0000 (14:27 +0100)]
glx/dri: Add a driconf option to disable GLX_SGI_video_sync
Drivers on virtual hardware don't want to expose this extension to
GLX compositors, similarly to GLX_OML_sync_control, since that significantly
increases latency.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Timothy Arceri [Wed, 7 Mar 2018 22:46:42 +0000 (09:46 +1100)]
ac/radeonsi: add emit_kill to the abi
This should fix a regression with Rocket League grass rendering
on the NIR backend.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104717
Timothy Arceri [Wed, 7 Mar 2018 22:37:10 +0000 (09:37 +1100)]
radeonsi: add si_llvm_emit_kill() helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Wed, 7 Mar 2018 23:37:52 +0000 (10:37 +1100)]
spirv: fix autotools builds
Fixes: 68a6a3b51acc "spirv: handle AMD_gcn_shader extended instructions"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Timothy Arceri [Wed, 7 Mar 2018 00:10:54 +0000 (11:10 +1100)]
ac: make use of if/loop build helpers
These helpers insert the basic block in the same order as they
appear in NIR making it easier to follow LLVM IR dumps. The helpers
also insert more useful labels onto the blocks.
TGSI use the line number of the corresponding opcode in the TGSI
dump as the label id, here we use the corresponding block index
from NIR.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Tue, 6 Mar 2018 23:55:47 +0000 (10:55 +1100)]
radeonsi: make use of if/loop build helpers in ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Tue, 6 Mar 2018 23:53:34 +0000 (10:53 +1100)]
ac: add if/loop build helpers
These have been ported over from radeonsi.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Daniel Schürmann [Fri, 23 Feb 2018 12:55:01 +0000 (13:55 +0100)]
radv: enable AMD_gcn_shader extension
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Daniel Schürmann [Fri, 23 Feb 2018 12:55:00 +0000 (13:55 +0100)]
ac: implement AMD_gcn_shader extended instructions
Co-authored-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Daniel Schürmann [Fri, 23 Feb 2018 12:54:59 +0000 (13:54 +0100)]
spirv: handle AMD_gcn_shader extended instructions
Co-authored-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Daniel Schürmann [Fri, 23 Feb 2018 12:54:58 +0000 (13:54 +0100)]
nir: add AMD_gcn_shader extended instructions
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Daniel Schürmann [Fri, 23 Feb 2018 12:54:57 +0000 (13:54 +0100)]
spirv: import AMD extensions header from glslang
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Dylan Baker [Tue, 6 Mar 2018 18:36:09 +0000 (10:36 -0800)]
meson: Fix indent in omx meson.build
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Jon Turney <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Julien Isorce <julien.isorce@gmail.com>
Tested-by: Karol Herbst <kherbst@redhat.com>