mesa.git
7 years agor600g: constify some args at r600_asm.c
Constantine Charlamov [Mon, 17 Jul 2017 01:04:51 +0000 (04:04 +0300)]
r600g: constify some args at r600_asm.c

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agor600g: remove unused "bc" args, and one unneeded forward declaration
Constantine Charlamov [Mon, 17 Jul 2017 01:04:50 +0000 (04:04 +0300)]
r600g: remove unused "bc" args, and one unneeded forward declaration

To ease review, just highlight the "bc," string.

Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years agoradv: only report external semaphore info for opaque fd.
Dave Airlie [Tue, 25 Jul 2017 00:19:21 +0000 (10:19 +1000)]
radv: only report external semaphore info for opaque fd.

Until we support sync fd, don't report the info.

Stops CTS dEQP-VK.api.external.semaphore.sync_fd.* from crashing.

Fixes: eaa56eab6 (radv: initial support for shared semaphores (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agoi965: Simplify HiZ clears a bit
Jason Ekstrand [Thu, 15 Jun 2017 01:54:27 +0000 (18:54 -0700)]
i965: Simplify HiZ clears a bit

No need for all that switching when we can just assign a nice little
variable with the number of layers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoi965: Use {} to initialize GENX_* structs.
Rafael Antognolli [Wed, 12 Jul 2017 23:36:03 +0000 (16:36 -0700)]
i965: Use {} to initialize GENX_* structs.

gen4 has commands which start with KernelStartPointer, which is a
struct, so if we initialize them with struct = { 0 }, we get warnings on
some compilers:

"GCC (pre 4.9?) can throw a Wmissing-braces on[1] while clang
-Wmissing-field-initializers [2]." - Emil

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53119
[2] https://bugs.llvm.org/show_bug.cgi?id=21689

This change works around that and will silence such warnings. It is both
a GCC and a clang extension.

v2:
   - Use {} instead of memset macro (Matt)

Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
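
The difference described above can be sketched in a few lines of C. This is
only an illustration: example_address and example_command are hypothetical
stand-ins for a generated GENX_* layout, and the empty-brace form relies on
the GCC/clang extension the commit message mentions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an address-like sub-struct. */
struct example_address {
   uint64_t offset;
};

/* Hypothetical stand-in for a GENX_* command whose first member is a struct. */
struct example_command {
   struct example_address KernelStartPointer;
   uint32_t GRFRegisterCount;
};

/* Returns 1 when every field of an empty-brace-initialized command is zero. */
static int example_command_is_zeroed(void)
{
   /* "= { 0 }" would zero the struct as well, but the 0 is matched against
    * the nested struct, which is what some compilers warn about. */
   struct example_command cmd = {};

   return cmd.KernelStartPointer.offset == 0 && cmd.GRFRegisterCount == 0;
}
```

Calling example_command_is_zeroed() should return 1 on compilers that accept
the empty initializer.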
7 years agost/mesa: create framebuffer iface hash table per st manager
Charmaine Lee [Sat, 22 Jul 2017 04:41:06 +0000 (21:41 -0700)]
st/mesa: create framebuffer iface hash table per st manager

With commit 5124bf98239, a framebuffer interface hash table is
created in st_gl_api_create(), which is called in
dri_init_screen_helper() for each screen. When the hash table is
overwritten by multiple calls to st_gl_api_create(), it can cause a
race condition. This patch fixes the problem by creating a
framebuffer interface hash table per state tracker manager.

Fixes a crash with Steam.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101876
Fixes: 5124bf98239 ("st/mesa: add destroy_drawable interface")
Tested-by: Christoph Haag <haagch@frickel.club>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agoradv: fix buffer views on SI/CIK.
Dave Airlie [Mon, 24 Jul 2017 10:42:54 +0000 (11:42 +0100)]
radv: fix buffer views on SI/CIK.

Fixes CTS dEQP-VK.memory.pipeline_barrier.host_write_uniform_texel_buffer.1024
on SI/CIK with radv.

Fixes: f4e499ec (radv: add initial non-conformant radv vulkan driver)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agoegl/wayland: Ignore invalid modifiers
Daniel Stone [Fri, 21 Jul 2017 11:05:17 +0000 (12:05 +0100)]
egl/wayland: Ignore invalid modifiers

If the underlying driver does not support modifiers, dmabuf will still
advertise formats through the 'modifier' event, but send them with an
invalid modifier. Ignore them if this is the case, rather than passing
them through to the driver.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Fixes: 02cc35937277 ("egl/wayland: Use linux-dmabuf interface for buffers")
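
A minimal sketch of the filtering described above, assuming the modifier
arrives as two 32-bit halves as in the linux-dmabuf 'modifier' event. The
helper names are hypothetical, and the constant repeats the value of
DRM_FORMAT_MOD_INVALID so the sketch builds without libdrm headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as DRM_FORMAT_MOD_INVALID in drm_fourcc.h, redefined locally. */
#define EXAMPLE_FORMAT_MOD_INVALID 0x00ffffffffffffffULL

/* Combine the two 32-bit halves carried by the event into one 64-bit value. */
static uint64_t example_combine_modifier(uint32_t hi, uint32_t lo)
{
   return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: true if the modifier should be passed to the driver. */
static bool example_modifier_is_usable(uint32_t hi, uint32_t lo)
{
   return example_combine_modifier(hi, lo) != EXAMPLE_FORMAT_MOD_INVALID;
}
```

An event handler would call example_modifier_is_usable() first and return
early, without recording the format/modifier pair, when it yields false.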
7 years agomesa: return GL_OUT_OF_MEMORY if NewSamplerObject fails
Samuel Pitoiset [Fri, 21 Jul 2017 12:42:06 +0000 (14:42 +0200)]
mesa: return GL_OUT_OF_MEMORY if NewSamplerObject fails

This is similar to other functions that create objects.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agomesa: pass the 'caller' function to create_samplers()
Samuel Pitoiset [Fri, 21 Jul 2017 12:42:05 +0000 (14:42 +0200)]
mesa: pass the 'caller' function to create_samplers()

To return GL_OUT_OF_MEMORY if NewSamplerObject fails.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agomesa: add compressed_tex_sub_image_{error,no_error} helpers
Samuel Pitoiset [Fri, 21 Jul 2017 08:43:22 +0000 (10:43 +0200)]
mesa: add compressed_tex_sub_image_{error,no_error} helpers

To avoid inlining compressed_tex_sub_image() a bunch of times.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agointel/blorp: ship blorp_genX_exec.h within the tarball
Emil Velikov [Mon, 24 Jul 2017 14:12:52 +0000 (15:12 +0100)]
intel/blorp: ship blorp_genX_exec.h within the tarball

Fixes: c9cb37b2a6c ("intel/blorp: Add a partial resolve pass for MCS")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agodocs: add 17.3.0-devel release notes template
Emil Velikov [Mon, 24 Jul 2017 13:19:21 +0000 (14:19 +0100)]
docs: add 17.3.0-devel release notes template

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agomesa: bump version to 17.2.0-devel
Emil Velikov [Mon, 24 Jul 2017 13:20:33 +0000 (14:20 +0100)]
mesa: bump version to 17.2.0-devel

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoegl: guard wayland header dep. tracking behind HAVE_PLATFORM_WAYLAND
Emil Velikov [Mon, 24 Jul 2017 12:22:06 +0000 (13:22 +0100)]
egl: guard wayland header dep. tracking behind HAVE_PLATFORM_WAYLAND

Otherwise we'll attempt to generate the header even if we don't need to.
In that case the dependencies may not be met, leading to build failure.

Fixes: 166852e "configure.ac: rework wayland-protocols handling"
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
7 years agoswrast: add dri2ConfigQueryExtension to the correct extension list
Emil Velikov [Mon, 24 Jul 2017 09:10:49 +0000 (10:10 +0100)]
swrast: add dri2ConfigQueryExtension to the correct extension list

The extension should be in the list as returned by getExtensions().
Seems to have gone unnoticed since close to nobody wants to change the
vblank mode for the software driver.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
7 years agowayland-egl: update the SHA1 of the commit introducing v3
Emil Velikov [Mon, 24 Jul 2017 09:35:04 +0000 (10:35 +0100)]
wayland-egl: update the SHA1 of the commit introducing v3

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agowayland-egl: Update ABI checker
Miguel A. Vico [Thu, 20 Jul 2017 00:27:58 +0000 (17:27 -0700)]
wayland-egl: Update ABI checker

This change updates wayland-egl-abi-check.c with the latest changes to
wl_egl_window.

Signed-off-by: Miguel A. Vico <mvicomoya@nvidia.com>
Reviewed-by: James Jones <jajones@nvidia.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years agowayland-egl: Make wl_egl_window a versioned struct
Miguel A. Vico [Thu, 20 Jul 2017 00:27:12 +0000 (17:27 -0700)]
wayland-egl: Make wl_egl_window a versioned struct

We need wl_egl_window to be a versioned struct in order to keep track of
ABI changes.

This change makes the first member of wl_egl_window the version number.

A heuristic in the wayland driver is added so that we don't break
backwards compatibility:

 - If the first field (version) is an actual pointer, it is an old
   implementation of wl_egl_window, and version points to the wl_surface
   proxy.

 - Else, the first field is the version number, and we have
   wl_egl_window::surface pointing to the wl_surface proxy.

Signed-off-by: Miguel A. Vico <mvicomoya@nvidia.com>
Reviewed-by: James Jones <jajones@nvidia.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
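
The heuristic above could look roughly like the following sketch. Both
struct layouts are hypothetical and much shorter than the real
wl_egl_window, and example_is_pointer() is a deliberately crude stand-in
(it assumes nothing is mapped in the first 4096 bytes of the address space)
for a real dereferenceability check.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical old layout: the first field is the surface pointer. */
struct example_window_old {
   void *surface;
};

/* Hypothetical new layout: the first field is a small version number. */
struct example_window_new {
   intptr_t version;
   void *surface;
};

/* Crude stand-in check: small integers are treated as version numbers. */
static int example_is_pointer(intptr_t value)
{
   return (uintptr_t)value >= 4096;
}

/* Returns the surface regardless of which layout the caller was built with. */
static void *example_get_surface(const void *window)
{
   intptr_t first;

   /* Read the first pointer-sized field without assuming a layout. */
   memcpy(&first, window, sizeof(first));

   if (example_is_pointer(first))
      return ((const struct example_window_old *)window)->surface;

   return ((const struct example_window_new *)window)->surface;
}
```

Both layouts then resolve to the same surface through the one accessor,
which is what keeps binaries built against the old struct working.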
7 years agoegl: Fix _eglPointerIsDereferencable() to ignore page residency
Miguel A. Vico [Thu, 20 Jul 2017 00:25:57 +0000 (17:25 -0700)]
egl: Fix _eglPointerIsDereferencable() to ignore page residency

mincore() returns 0 on success, and -1 on failure.  The last parameter
is a vector of bytes with one entry for each page queried.  mincore
returns page residency information in the first bit of each byte in the
vector.

Residency doesn't actually matter when determining whether a pointer is
dereferenceable, so the output vector can be ignored.  What matters is
whether mincore succeeds. See:

  http://man7.org/linux/man-pages/man2/mincore.2.html

Signed-off-by: Miguel A. Vico <mvicomoya@nvidia.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
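
As a rough, Linux-specific illustration of the point above, the sketch
below treats a pointer as dereferenceable when mincore() succeeds on its
page and never looks at the residency vector. The function name is
hypothetical and this is not the Mesa implementation.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: true if the page containing p is mapped. */
static bool example_pointer_is_dereferenceable(const void *p)
{
   unsigned char residency;   /* required by mincore(), contents ignored */
   long page_size = sysconf(_SC_PAGESIZE);
   uintptr_t addr = (uintptr_t)p;

   if (p == NULL || page_size <= 0)
      return false;

   /* mincore() requires a page-aligned start address. */
   addr &= ~((uintptr_t)page_size - 1);

   /* Only success or failure matters, not whether the page is resident. */
   return mincore((void *)addr, (size_t)page_size, &residency) == 0;
}
```

Testing a bit of the residency byte instead of the return value would
wrongly reject valid pointers whose pages happen to be swapped out.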
7 years agoegl: Move _eglPointerIsDereferencable() to eglglobals.[ch]
Miguel A. Vico [Thu, 20 Jul 2017 00:25:08 +0000 (17:25 -0700)]
egl: Move _eglPointerIsDereferencable() to eglglobals.[ch]

Move _eglPointerIsDereferencable() to eglglobals.[ch] and make it a
non-static function so it can be used outside of egldisplay.c

Signed-off-by: Miguel A. Vico <mvicomoya@nvidia.com>
Reviewed-by: James Jones <jajones@nvidia.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years agowayland-egl: Add wl_egl_window ABI checker
Miguel A. Vico [Thu, 20 Jul 2017 00:22:44 +0000 (17:22 -0700)]
wayland-egl: Add wl_egl_window ABI checker

Add a small ABI checker for wl_egl_window so that we can check for
backwards incompatible changes at 'make check' time.

Signed-off-by: Miguel A. Vico <mvicomoya@nvidia.com>
Reviewed-by: James Jones <jajones@nvidia.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
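
One way such a checker can work is to keep a frozen copy of the struct and
compare member offsets and sizes against the current definition. The sketch
below assumes that approach; the struct names, members and macro are
hypothetical and not taken from wayland-egl-abi-check.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frozen copy of the struct as shipped in an earlier release. */
struct example_window_v1 {
   intptr_t version;
   int width;
   int height;
};

/* Hypothetical current definition: new members are only appended. */
struct example_window {
   intptr_t version;
   int width;
   int height;
   int dx;
};

/* True if a member kept both its offset and its size. */
#define EXAMPLE_CHECK_MEMBER(member)                                   \
   (offsetof(struct example_window_v1, member) ==                      \
       offsetof(struct example_window, member) &&                      \
    sizeof(((struct example_window_v1 *)0)->member) ==                 \
       sizeof(((struct example_window *)0)->member))

/* Returns 1 if every v1 member is still where old binaries expect it. */
static int example_abi_is_compatible(void)
{
   return EXAMPLE_CHECK_MEMBER(version) &&
          EXAMPLE_CHECK_MEMBER(width) &&
          EXAMPLE_CHECK_MEMBER(height) &&
          sizeof(struct example_window) >= sizeof(struct example_window_v1);
}
```

A test program run from 'make check' could simply exit non-zero when
example_abi_is_compatible() returns 0.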
7 years agoswr: use the correct variable for no undefined symbols
Emil Velikov [Fri, 21 Jul 2017 12:44:22 +0000 (13:44 +0100)]
swr: use the correct variable for no undefined symbols

The variable name was missing a leading LD_, which resulted in a missing
check for unresolved symbols in the backend binaries.

With the linking addressed by earlier patches, we can correct the typo.

Thanks to Laurent for the help spotting this.

v2: Split from a larger patch.

Cc: mesa-stable@lists.freedesktop.org
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Laurent Carlier <lordheavym@gmail.com>
Fixes: 9475251145174882b532 "swr: standardize linkage and check for
                             unresolved symbols"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reported-by: Laurent Carlier <lordheavym@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoswr: don't forget to link KNL/SKX against pthreads
Emil Velikov [Fri, 21 Jul 2017 15:49:11 +0000 (16:49 +0100)]
swr: don't forget to link KNL/SKX against pthreads

Analogous to previous commit but for the KNL/SKX backends.

Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Laurent Carlier <lordheavym@gmail.com>
Fixes: 1cb5a6061ce ("configure/swr: add KNL and SKX architecture targets")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoswr: don't forget to link AVX/AVX2 against pthreads
Emil Velikov [Fri, 21 Jul 2017 15:44:14 +0000 (16:44 +0100)]
swr: don't forget to link AVX/AVX2 against pthreads

Seems like the backends have been using pthreads since day one, yet
we've been missing the link.

With a later commit we'll fix a typo, hence the libraries will be built
with -Wl,no-undefined, aka failing the build on unresolved symbols.

v2: Split from a larger patch.

Cc: mesa-stable@lists.freedesktop.org
Cc: Bruce Cherniak <bruce.cherniak@intel.com>
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Laurent Carlier <lordheavym@gmail.com>
Fixes: c6e67f5a9373e916a8d2 "gallium/swr: add OpenSWR rasterizer"
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoconfigure.ac: rework wayland-protocols handling
Emil Velikov [Thu, 20 Jul 2017 16:53:01 +0000 (17:53 +0100)]
configure.ac: rework wayland-protocols handling

At dist/distcheck time we need to ensure that all the files and their
respective dependencies are handled.

At the moment we'll bail out as the linux-dmabuf rules are guarded in a
conditional. Move them outside of it and drop the sources from
BUILT_SOURCES.

Thus the files will be generated only as needed, which will happen only
after the wayland-protocols dependency is enforced in configure.ac.

v2: add dependency tracking for the header

Cc: Andres Gomez <agomez@igalia.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
7 years agoradv: enable sample shading
Dave Airlie [Thu, 24 Nov 2016 00:44:28 +0000 (00:44 +0000)]
radv: enable sample shading

This calculates ps_iter_samples from the minSampleShading input

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
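
For illustration, the sketch below derives a per-fragment iteration count
from minSampleShading using the Vulkan minimum of
ceil(minSampleShading * rasterizationSamples), clamped to at least 1. The
function name is hypothetical, and any further rounding the driver applies
is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: minimum shader invocations per fragment. */
static uint32_t example_ps_iter_samples(float min_sample_shading,
                                        uint32_t raster_samples)
{
   float wanted = min_sample_shading * (float)raster_samples;
   uint32_t n;

   if (wanted <= 0.0f)
      return 1;

   n = (uint32_t)wanted;        /* truncates toward zero */
   if ((float)n < wanted)
      n++;                      /* round up, a manual ceil() */

   if (n > raster_samples)
      n = raster_samples;       /* never more than the sample count */
   if (n < 1)
      n = 1;                    /* always shade at least once */

   return n;
}
```

With 4 samples, a minSampleShading of 0.5 gives 2 iterations and 1.0 gives
4, i.e. full per-sample shading.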
7 years agoradv: don't set dedicated bit for buffer external memory.
Dave Airlie [Mon, 24 Jul 2017 07:15:39 +0000 (08:15 +0100)]
radv: don't set dedicated bit for buffer external memory.

This is an alternate fix for the buffer export dedicated interaction.

Fixes CTS dEQP-VK.api.external.memory.opaque_fd.dedicated.buffer.info

Fixes: b70829708a (radv: Implement VK_KHR_external_memory)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: fix non-0 based layer clears.
Dave Airlie [Mon, 24 Jul 2017 07:09:47 +0000 (17:09 +1000)]
radv: fix non-0 based layer clears.

If the layer base was > 0, it wasn't getting passed as the start
instance or getting added in the shaders.

Fixes CTS dEQP-VK.api.image_clearing.core.clear_color_attachment.2d_r8_uint_multiple_layers

Fixes: 7e0382fb (radv: add support for layered clears (v2))
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: check enabled device features.
Dave Airlie [Mon, 24 Jul 2017 06:16:40 +0000 (07:16 +0100)]
radv: check enabled device features.

The spec says we should return VK_ERROR_FEATURE_NOT_PRESENT.

Ported from anv.

Fixes CTS test dEQP-VK.api.device_init.create_device_unsupported_features

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
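
A feature check of this kind can be sketched by walking the requested and
supported feature structs as arrays of 32-bit booleans. Everything below is
a hypothetical stand-in (a shortened feature struct and made-up result
codes), not the radv or anv code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t example_bool32;   /* plays the role of VkBool32 */

/* Hypothetical, shortened stand-in for VkPhysicalDeviceFeatures. */
struct example_features {
   example_bool32 robustBufferAccess;
   example_bool32 geometryShader;
   example_bool32 sampleRateShading;
};

/* Hypothetical result codes; the values are arbitrary. */
enum example_result {
   EXAMPLE_SUCCESS = 0,
   EXAMPLE_ERROR_FEATURE_NOT_PRESENT = -1
};

/* Fails if any requested feature is not in the supported set. */
static enum example_result
example_check_features(const struct example_features *supported,
                       const struct example_features *requested)
{
   /* The struct contains only example_bool32 members, so it can be
    * walked as a flat array. */
   const example_bool32 *s = (const example_bool32 *)supported;
   const example_bool32 *r = (const example_bool32 *)requested;
   size_t count = sizeof(struct example_features) / sizeof(example_bool32);

   for (size_t i = 0; i < count; i++) {
      if (r[i] && !s[i])
         return EXAMPLE_ERROR_FEATURE_NOT_PRESENT;
   }

   return EXAMPLE_SUCCESS;
}
```

Device creation would run this check before allocating anything and return
the error straight away when it fails.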
7 years agoradv: for external memory imports close the fd on import success
Dave Airlie [Mon, 24 Jul 2017 02:45:03 +0000 (03:45 +0100)]
radv: for external memory imports close the fd on import success

If we get an fd, we need to close it before returning.

Fixes CTS test dEQP-VK.api.external.memory.opaque_fd.dedicated.device_only.import_multiple_times

Fixes: b70829708a (radv: Implement VK_KHR_external_memory)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: Don't segfault when exporting an image which hasn't been bound yet.
Bas Nieuwenhuizen [Sun, 23 Jul 2017 22:39:51 +0000 (00:39 +0200)]
radv: Don't segfault when exporting an image which hasn't been bound yet.

The image is set on Memory allocation already, but the image doesn't
have to have the BindImageMemory called yet. Luckily, we know the offset
within a BO has to be 0 for dedicated allocations, so we can just
use the dummy 0 in the address calculations.

Fixes CTS test dEQP-VK.api.external.memory.opaque_fd.dedicated.image.export_bind_import_bind

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: b70829708ac "radv: Implement VK_KHR_external_memory"
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: Handle VK_ATTACHMENT_UNUSED in color attachments.
Bas Nieuwenhuizen [Sun, 23 Jul 2017 19:59:01 +0000 (21:59 +0200)]
radv: Handle VK_ATTACHMENT_UNUSED in color attachments.

This just sets them to INVALID COLOR, instead of shifting the
attachments together.

This also fixes a number of cases where we use it first and only
then check if it is VK_ATTACHMENT_UNUSED.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agobroadcom: correct header file in BROADCOM_FILES
Andres Gomez [Wed, 19 Jul 2017 22:44:58 +0000 (01:44 +0300)]
broadcom: correct header file in BROADCOM_FILES

This fixes `make distcheck`

> make[3]: *** No rule to make target 'common/v3d_devinfo.h', needed by 'distdir'.  Stop.
> make[3]: Leaving directory '/home/local/mesa/src/broadcom'
> Makefile:945: recipe for target 'distdir' failed
> make[2]: Leaving directory '/home/local/mesa/src'
> make[2]: *** [distdir] Error 1
> make[1]: *** [distdir] Error 1

Fixes: 427bbbb99c ("broadcom: Introduce a header for talking about chip revisions.")
Cc: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years agoetnaviv: Clear lbl_usage array correctly
Wladimir J. van der Laan [Sun, 23 Jul 2017 11:24:39 +0000 (13:24 +0200)]
etnaviv: Clear lbl_usage array correctly

Fill the entire array instead of just a quarter. This avoids
crashes with large shaders.
(currently this never causes a problem because shaders larger than 2048/4
instructions are not supported by this driver on any hardware, but it will
cause problems in the future)

Fixes: ec436051899 ("etnaviv: fix shader miscompilation with more than 16 labels")
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
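
A bug of this shape typically comes from passing an element count where
memset() expects a byte count. The sketch below assumes 4-byte array
elements and a 2048-entry array, which matches the "quarter" in the message
above; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAX_INSTRUCTIONS 2048

/* Hypothetical stand-in for the compiler state holding the label array. */
struct example_compile {
   int32_t lbl_usage[EXAMPLE_MAX_INSTRUCTIONS];
};

/* Buggy: the length is an element count, so only a quarter is filled. */
static void example_clear_buggy(struct example_compile *c)
{
   memset(c->lbl_usage, 0xff, EXAMPLE_MAX_INSTRUCTIONS);
}

/* Fixed: sizeof on the array yields the full size in bytes. */
static void example_clear_fixed(struct example_compile *c)
{
   memset(c->lbl_usage, 0xff, sizeof(c->lbl_usage));
}
```

With the buggy variant only entries 0..511 become -1, so a shader would
need more than 512 instructions before the stale entries are ever read.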
7 years agoanv/image: zalloc image views
Jason Ekstrand [Tue, 11 Jul 2017 18:07:45 +0000 (11:07 -0700)]
anv/image: zalloc image views

This allows us to avoid some extra zeroing.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv/image: Use vk_zalloc instead of an explicit memset
Jason Ekstrand [Tue, 11 Jul 2017 15:59:06 +0000 (08:59 -0700)]
anv/image: Use vk_zalloc instead of an explicit memset

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: Separate surface states by layout instead of aux_usage
Jason Ekstrand [Tue, 11 Jul 2017 15:53:42 +0000 (08:53 -0700)]
anv: Separate surface states by layout instead of aux_usage

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agointel/isl: Add some sanity checks for compressed surfaces
Jason Ekstrand [Tue, 11 Jul 2017 23:08:54 +0000 (16:08 -0700)]
intel/isl: Add some sanity checks for compressed surfaces

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agointel/isl: Add a helper to get a subimage surface
Jason Ekstrand [Tue, 11 Jul 2017 21:27:25 +0000 (14:27 -0700)]
intel/isl: Add a helper to get a subimage surface

We already have a helper for doing this in BLORP, this just moves the
logic into ISL where we can share it with other components.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoanv: Get rid of some unused function declarations
Jason Ekstrand [Tue, 11 Jul 2017 16:53:42 +0000 (09:53 -0700)]
anv: Get rid of some unused function declarations

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
7 years agoi965: Enable regular fast-clears (CCS_D) on gen9+
Jason Ekstrand [Thu, 22 Jun 2017 04:35:07 +0000 (21:35 -0700)]
i965: Enable regular fast-clears (CCS_D) on gen9+

The set of formats which supports CCS_E is actually fairly small on
gen9.  However, everything that supports fast-clears on gen8 also
supports fast-clears on gen9+.  The one very annoying exception is
that blending is broken for non-0/1 clear colors with sRGB formats.
In order to solve that problem, we do a resolve to get rid of the
clear color.  Another option would be to just not fast-clear with
non-0/1 clear colors; however, non-0/1 + blending + sRGB is uncommon
enough that this shouldn't be a significant performance problem.

This appears to help gl_manhattan31_off by about 2%.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agointel/isl: Add a helper for determining if a color is 0/1
Jason Ekstrand [Tue, 18 Jul 2017 02:48:22 +0000 (19:48 -0700)]
intel/isl: Add a helper for determining if a color is 0/1

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agointel/blorp: Allow blorp_copy on sRGB formats
Jason Ekstrand [Tue, 18 Jul 2017 00:42:46 +0000 (17:42 -0700)]
intel/blorp: Allow blorp_copy on sRGB formats

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965: Weaken the texture view rules for formats slightly
Jason Ekstrand [Tue, 18 Jul 2017 00:04:07 +0000 (17:04 -0700)]
i965: Weaken the texture view rules for formats slightly

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agointel/isl/format: Add an srgb_to_linear helper
Jason Ekstrand [Thu, 22 Jun 2017 18:51:55 +0000 (11:51 -0700)]
intel/isl/format: Add an srgb_to_linear helper

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agointel/isl/format: Dedent the template in gen_format_layout.py
Jason Ekstrand [Thu, 22 Jun 2017 18:27:04 +0000 (11:27 -0700)]
intel/isl/format: Dedent the template in gen_format_layout.py

This makes it much easier to edit the template and doesn't really dirty
the python all that much.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/surface_state: Get the aux usage from the miptree code
Jason Ekstrand [Thu, 22 Jun 2017 04:23:20 +0000 (21:23 -0700)]
i965/surface_state: Get the aux usage from the miptree code

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/surface_state: Take an isl_aux_usage in emit_surface_state
Jason Ekstrand [Thu, 22 Jun 2017 03:40:13 +0000 (20:40 -0700)]
i965/surface_state: Take an isl_aux_usage in emit_surface_state

This commit replaces the generic "flags" parameter with a more explicit
aux usage parameter.  This leads to a lot of duplicated code at the
moment but this will all get cleaned up directly.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Take an isl_format in prepare_texture
Jason Ekstrand [Thu, 22 Jun 2017 04:10:53 +0000 (21:10 -0700)]
i965/miptree: Take an isl_format in prepare_texture

This will be a bit more convenient momentarily.  It's also more correct
because it makes prepare_texture take sRGB into account.

7 years agoi965/miptree: Use miptree range helpers in has_color_unresolved
Jason Ekstrand [Thu, 22 Jun 2017 03:00:12 +0000 (20:00 -0700)]
i965/miptree: Use miptree range helpers in has_color_unresolved

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Allow for accessing a CCS_E image as CCS_D
Jason Ekstrand [Thu, 22 Jun 2017 02:36:54 +0000 (19:36 -0700)]
i965/miptree: Allow for accessing a CCS_E image as CCS_D

This requires us to start using the partial clear state.  It makes
things quite a bit more complicated but it's still a fairly
straightforward exercise in diagram following.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Use ISL_AUX_STATE_PARTIAL_CLEAR for CCS_D
Jason Ekstrand [Thu, 22 Jun 2017 02:25:16 +0000 (19:25 -0700)]
i965/miptree: Use ISL_AUX_STATE_PARTIAL_CLEAR for CCS_D

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agointel/isl: Add an aux state for "partial clear"
Jason Ekstrand [Thu, 22 Jun 2017 02:19:00 +0000 (19:19 -0700)]
intel/isl: Add an aux state for "partial clear"

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Take an aux_usage in prepare/finish
Jason Ekstrand [Wed, 21 Jun 2017 20:06:28 +0000 (13:06 -0700)]
i965/miptree: Take an aux_usage in prepare/finish

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Refactor some things to use mt->aux_usage
Jason Ekstrand [Wed, 21 Jun 2017 19:10:03 +0000 (12:10 -0700)]
i965/miptree: Refactor some things to use mt->aux_usage

Now that we have this field, it's much easier to switch on it than to
walk an if ladder that checks different things.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/blorp: Use prepare/finish_depth for depth clears
Jason Ekstrand [Sat, 24 Jun 2017 00:44:34 +0000 (17:44 -0700)]
i965/blorp: Use prepare/finish_depth for depth clears

We also simplify the way we handle stencil since we know a priori that
it will have ISL_AUX_USAGE_NONE.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/blorp: Use render_aux_usage for color clears
Jason Ekstrand [Sat, 24 Jun 2017 00:43:47 +0000 (17:43 -0700)]
i965/blorp: Use render_aux_usage for color clears

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/blorp: Be more accurate about aux usage in blorp_copy
Jason Ekstrand [Sat, 24 Jun 2017 00:43:24 +0000 (17:43 -0700)]
i965/blorp: Be more accurate about aux usage in blorp_copy

The only real change here is that we now reject clear colors for MCS
with certain formats on gen < 9 because we can't trust that the
reinterpretation will work.  This may cause some MCS partial resolves.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/blorp: Use texture/render_aux_usage for blits
Jason Ekstrand [Sat, 24 Jun 2017 00:42:52 +0000 (17:42 -0700)]
i965/blorp: Use texture/render_aux_usage for blits

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/blorp: Do prepare/finish manually
Jason Ekstrand [Sat, 24 Jun 2017 00:22:24 +0000 (17:22 -0700)]
i965/blorp: Do prepare/finish manually

Our attempts to do it automatically are problematic at best.  In order
to really be precise, we need to know both the desired aux usage and
whether or not clear is supported.  The current automatic mechanism
doesn't cover this.  This commit itself is not a functional change since
it just reworks everything to be in terms of a silly helper.  Later
commits will switch things over to more sensible ways of choosing usage.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Rework prepare/finish_render to be in terms of aux_usage
Jason Ekstrand [Thu, 22 Jun 2017 03:19:54 +0000 (20:19 -0700)]
i965/miptree: Rework prepare/finish_render to be in terms of aux_usage

We keep the old and possibly broken method of determining aux usage
intact for now.  Therefore, the only functional change here is that we
may call finish_render a bit more accurately.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Add a helper for getting the aux usage for texturing
Jason Ekstrand [Thu, 22 Jun 2017 03:19:32 +0000 (20:19 -0700)]
i965/miptree: Add a helper for getting the aux usage for texturing

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Partially resolve MCS for texture views
Jason Ekstrand [Fri, 23 Jun 2017 17:44:16 +0000 (10:44 -0700)]
i965/miptree: Partially resolve MCS for texture views

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Add support for partially resolving MCS
Jason Ekstrand [Fri, 23 Jun 2017 17:43:30 +0000 (10:43 -0700)]
i965/miptree: Add support for partially resolving MCS

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Tighten up finish_mcs_write
Jason Ekstrand [Fri, 23 Jun 2017 17:42:30 +0000 (10:42 -0700)]
i965/miptree: Tighten up finish_mcs_write

Multisample surfaces only have a single miplevel so there's no reason to
be passing the extra parameters around.  It only leads to confusion.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Make aux_state work in terms of logical layers
Jason Ekstrand [Mon, 17 Jul 2017 23:16:41 +0000 (16:16 -0700)]
i965/miptree: Make aux_state work in terms of logical layers

This commit changes layer_range_length to return logical layers and also
changes the way we allocate the aux_state field to not allocate extra
layers for MCS.  This will be important as we're about to start doing
significantly more detailed tracking of MCS state.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agointel/blorp: Add a partial resolve pass for MCS
Jason Ekstrand [Fri, 23 Jun 2017 17:27:27 +0000 (10:27 -0700)]
intel/blorp: Add a partial resolve pass for MCS

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Remove some unneeded restrictions
Jason Ekstrand [Thu, 22 Jun 2017 04:33:41 +0000 (21:33 -0700)]
i965/miptree: Remove some unneeded restrictions

intel_miptree_supports_ccs_e should handle the gen >= 9 requirement and
there's no reason why we can't do CCS_E on window system buffers so long
as we resolve.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/miptree: Stop setting FOR_SCANOUT for renderbuffers
Jason Ekstrand [Wed, 19 Jul 2017 00:00:39 +0000 (17:00 -0700)]
i965/miptree: Stop setting FOR_SCANOUT for renderbuffers

Nothing created through intel_miptree_create_for_renderbuffer will ever
be exposed externally so there's no need to set FOR_SCANOUT.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/blorp: Do flushes around depth resolves
Jason Ekstrand [Wed, 19 Jul 2017 01:44:26 +0000 (18:44 -0700)]
i965/blorp: Do flushes around depth resolves

It turns out that if you have rendering in-flight with CCS_E enabled and
you go to do a depth resolve without flushing, the CCS data may never
hit the memory.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoi965/blorp: Use the renderbuffer format for clears
Jason Ekstrand [Sun, 25 Jun 2017 05:50:53 +0000 (22:50 -0700)]
i965/blorp: Use the renderbuffer format for clears

This fixes the Piglit ARB_texture_views rendering-formats test.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
7 years agoanv: Predicate fast-clear resolves
Nanley Chery [Tue, 18 Apr 2017 18:03:42 +0000 (11:03 -0700)]
anv: Predicate fast-clear resolves

Image layouts only let us know that an image *may* be fast-cleared. For
this reason we can end up with redundant resolves. Testing has shown
that such resolves can measurably hurt performance and that predicating
them can avoid the penalty.

v2:
- Introduce additional resolve state management function (Jason Ekstrand).
- Enable easy retrieval of fast clear state fields.
v3: Use more descriptive field enums (Jason)

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agointel/blorp: Allow BLORP calls to be predicated
Nanley Chery [Tue, 25 Apr 2017 20:32:34 +0000 (13:32 -0700)]
intel/blorp: Allow BLORP calls to be predicated

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv/cmd_buffer: Skip some input attachment transitions
Nanley Chery [Wed, 24 May 2017 17:16:38 +0000 (10:16 -0700)]
anv/cmd_buffer: Skip some input attachment transitions

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv: Stop resolving CCS implicitly
Nanley Chery [Sat, 18 Mar 2017 05:36:05 +0000 (22:36 -0700)]
anv: Stop resolving CCS implicitly

With an earlier patch from this series, resolves are additionally
performed on layout transitions. Remove the now unnecessary implicit
resolves within render passes.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv: Transition more color buffer layouts
Nanley Chery [Sat, 4 Mar 2017 07:59:16 +0000 (23:59 -0800)]
anv: Transition more color buffer layouts

v2: Expound on comment for the pipe controls (Jason Ekstrand).
v3:
- Cast base_layer to uint64_t to avoid overflow.
- Remove "seems" from the pipe control comment.
- Fix clamp of layer_count (Jason Ekstrand).
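
A minimal sketch of why the 64-bit cast matters when clamping a layer range,
assuming the layer count may be as large as UINT32_MAX (for example a
"remaining layers" sentinel). clamp_layer_count() and its parameters are
hypothetical and are not taken from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Clamp layer_count so that [base_layer, base_layer + layer_count) stays
 * within total_layers. Hypothetical helper for illustration only.
 */
static uint32_t
clamp_layer_count(uint32_t base_layer, uint32_t layer_count,
                  uint32_t total_layers)
{
   assert(base_layer < total_layers);

   /* In 32 bits, base_layer + layer_count can wrap around and compare as
    * "small". Widening one operand makes the whole sum 64-bit.
    */
   if ((uint64_t)base_layer + layer_count > total_layers)
      return total_layers - base_layer;

   return layer_count;
}
```

Without the cast, a call such as clamp_layer_count(2, UINT32_MAX, 6) would
compute a wrapped sum of 1, pass the range check, and return the unclamped
count.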

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv/cmd_buffer: Warn about not enabling CCS_E
Nanley Chery [Wed, 28 Jun 2017 17:29:04 +0000 (10:29 -0700)]
anv/cmd_buffer: Warn about not enabling CCS_E

Use the performance warning infrastructure to provide helpful
information when testing applications.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv/cmd_buffer: Move aux_usage assignment up
Nanley Chery [Wed, 28 Jun 2017 17:25:49 +0000 (10:25 -0700)]
anv/cmd_buffer: Move aux_usage assignment up

For readability, bring the assignment of CCS closer to the assignment of
NONE and MCS.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv/cmd_buffer: Always enable CCS_D in render passes
Nanley Chery [Fri, 31 Mar 2017 23:05:34 +0000 (16:05 -0700)]
anv/cmd_buffer: Always enable CCS_D in render passes

The lifespan of the fast-clear data will surpass the render pass scope.
We need CCS_D to be enabled in order to invalidate blocks previously
marked as cleared and to sample cleared data correctly.

v2: Avoid refactoring.
v3: Allow CCS_D for subpass resolves.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv/cmd_buffer: Disable CCS on gen7 color attachments upfront
Nanley Chery [Wed, 28 Jun 2017 16:35:08 +0000 (09:35 -0700)]
anv/cmd_buffer: Disable CCS on gen7 color attachments upfront

The next patch enables the use of CCS_D even when the color attachment
will not be fast-cleared. Catch the gen7 case early to simplify the
changes required.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv/cmd_buffer: Ensure fast-clear values are current
Nanley Chery [Thu, 19 Jan 2017 18:12:36 +0000 (10:12 -0800)]
anv/cmd_buffer: Ensure fast-clear values are current

v2: Rewrite functions, change location of synchronization.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv/gpu_memcpy: Add a lighter-weight GPU memcpy function
Nanley Chery [Wed, 25 Jan 2017 22:54:39 +0000 (14:54 -0800)]
anv/gpu_memcpy: Add a lighter-weight GPU memcpy function

We'll be performing a GPU memcpy in more places to copy small amounts of
data. Add an alternate function that thrashes less state.

v2:
- Make a new function (Jason Ekstrand).
- Move the #define into the function.
v3:
- Update the function name (Jason).
- Update comments.
v4: Use an indirect drawing register as TEMP_REG (Jason Ekstrand).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv/cmd_buffer: Restrict fast clears in the GENERAL layout
Nanley Chery [Fri, 31 Mar 2017 20:52:53 +0000 (13:52 -0700)]
anv/cmd_buffer: Restrict fast clears in the GENERAL layout

v2: Remove ::first_subpass_layout assertion (Jason Ekstrand).
v3: Allow some fast clears in the GENERAL layout.
v4: Remove extra '||' and adjust line break (Jason Ekstrand).

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv/cmd_buffer: Don't partially fast clear image layers
Nanley Chery [Fri, 10 Mar 2017 22:41:14 +0000 (14:41 -0800)]
anv/cmd_buffer: Don't partially fast clear image layers

v2: Don't pass in the command buffer (Jason Ekstrand).
v3: Remove an incorrect assertion and an if condition for gen7.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv/cmd_buffer: Initialize the clear values buffer
Nanley Chery [Sat, 4 Mar 2017 07:59:16 +0000 (23:59 -0800)]
anv/cmd_buffer: Initialize the clear values buffer

v2: Rewrite functions.
v3 (Jason Ekstrand):
- Don't set ResourceMinLOD.
- Fix clamp of level_count.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv/image: Append CCS/MCS with a fast-clear state buffer
Nanley Chery [Thu, 19 Jan 2017 01:39:53 +0000 (17:39 -0800)]
anv/image: Append CCS/MCS with a fast-clear state buffer

v2: Update comments, function signatures, and add assertions.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv/image: Disable CCS if the image doesn't support rendering
Nanley Chery [Wed, 5 Jul 2017 19:15:24 +0000 (12:15 -0700)]
anv/image: Disable CCS if the image doesn't support rendering

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agointel/isl: Add surface state clear value information
Nanley Chery [Tue, 24 Jan 2017 23:55:57 +0000 (15:55 -0800)]
intel/isl: Add surface state clear value information

This will be used to load and store clear values from surface state
objects.

Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoanv: Transition MCS buffers from the undefined layout
Nanley Chery [Tue, 11 Jul 2017 17:46:58 +0000 (10:46 -0700)]
anv: Transition MCS buffers from the undefined layout

v2: Define MCS buffers with any sample count (Jason)

Cc: <mesa-stable@lists.freedesktop.org>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
7 years agointel/isl: Tighten up restrictions for CCS on gen7
Jason Ekstrand [Sat, 22 Jul 2017 23:50:22 +0000 (16:50 -0700)]
intel/isl: Tighten up restrictions for CCS on gen7

It may technically be possible to enable some sort of fast-clear support
for at least the base slice of a 2D array texture on gen7.  However,
it's not documented to work, we've never tried to do it in GL, and we
have no idea what the hardware does if you turn on CCS_D with arrayed
rendering.  Let's just play it safe and disallow it for now.  If someone
really cares that much about gen7 performance, they can come along and
try to get it working later.

7 years agoi965/bufmgr: Add comments about GTT coherency issues.
Chris Wilson [Sat, 22 Jul 2017 19:03:06 +0000 (12:03 -0700)]
i965/bufmgr: Add comments about GTT coherency issues.

(Patch written by Ken, but it consists entirely of comments written by Chris.)

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoi965: Drop non-LLC lunacy in the program cache code.
Kenneth Graunke [Tue, 11 Jul 2017 21:19:46 +0000 (14:19 -0700)]
i965: Drop non-LLC lunacy in the program cache code.

The non-LLC story was a horror show.  We uploaded data via pwrite
(drm_intel_bo_subdata), which would stall if the cache BO was in
use (being read) by the GPU.  Obviously, we wanted to avoid that.
So, we tried to detect whether the buffer was busy, and if so, we'd
allocate a new BO, map the old one read-only (hopefully not stalling),
copy all shaders compiled since the dawn of time to the new buffer,
upload our new one, toss the old BO, and let the state upload code
know that our program cache BO changed.  This was a lot of extra data
copying, and flagging BRW_NEW_PROGRAM_CACHE would also cause a new
STATE_BASE_ADDRESS to be emitted, stalling the entire pipeline.

Not only that, but our rudimentary busy tracking consisted of a flag
set at execbuf time, and not cleared until we threw out the program
cache BO.  So, the first shader upload after any drawing would hit this
"abandon the cache and start over" copying path.

This is largely unnecessary - it's just ancient and crufty code.  We can
use the same persistent mapping paths on all platforms.  On non-ancient
kernels, this will use a write combining map, which should be reasonably
fast.

One aspect that is worse: we do occasionally grow the program cache BO,
and copy the old contents to the newer BO.  This will suffer from UC
readback performance now.  To mitigate this, we use the MOVNTDQA based
streaming memcpy on platforms with SSE 4.1 (all Gen7+ atoms).  Gen4-5
are unfortunately going to be penalized.
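
The grow-and-copy step can be sketched as follows. This is only an
illustration: malloc'd memory stands in for a persistently mapped BO, plain
memcpy stands in for the MOVNTDQA streaming copy, and struct cache_sketch,
cache_init() and cache_upload() are hypothetical names:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the program cache: one always-mapped buffer. */
struct cache_sketch {
   uint8_t *map;        /* stands in for the persistent mapping */
   size_t size;         /* current capacity in bytes */
   size_t next_offset;  /* where the next shader will be written */
};

static int
cache_init(struct cache_sketch *c, size_t size)
{
   assert(c != NULL && size > 0);
   c->map = malloc(size);
   c->size = size;
   c->next_offset = 0;
   return c->map ? 0 : -1;
}

/* Append data, growing (and copying the old contents) only when full. */
static int
cache_upload(struct cache_sketch *c, const void *data, size_t len,
             size_t *offset_out)
{
   if (c->next_offset + len > c->size) {
      size_t new_size = c->size;
      while (c->next_offset + len > new_size)
         new_size *= 2;

      uint8_t *new_map = malloc(new_size);
      if (new_map == NULL)
         return -1;

      /* The occasional readback of the old buffer is the costly part. */
      memcpy(new_map, c->map, c->next_offset);
      free(c->map);
      c->map = new_map;
      c->size = new_size;
   }

   /* Common case: write straight through the existing mapping. */
   memcpy(c->map + c->next_offset, data, len);
   *offset_out = c->next_offset;
   c->next_offset += len;
   return 0;
}
```

The point of the design is that an upload normally touches only the new
bytes; the full copy happens only when the buffer runs out of space.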

v2: Add MOVNTDQA path, rebase on other map flag changes.
v3: Drop cache->bo_used_by_gpu too (caught by Chris Wilson).

Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agoi965: Set MAP_PERSISTENT on program cache buffers.
Kenneth Graunke [Fri, 21 Jul 2017 20:09:17 +0000 (13:09 -0700)]
i965: Set MAP_PERSISTENT on program cache buffers.

Chris Wilson pointed out that this mapping really is persistent.

Shouldn't actually have any effect today, but best to set it anyway.

Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agoi965: Correctly set MAP_WRITE when creating the LLC program cache map.
Kenneth Graunke [Fri, 21 Jul 2017 20:07:22 +0000 (13:07 -0700)]
i965: Correctly set MAP_WRITE when creating the LLC program cache map.

Using a read-only mapping is completely bogus - we use this mapping to
write all new shaders to the cache.

Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agoi965/bufmgr: Use write-combine mappings where available
Matt Turner [Tue, 11 Jul 2017 21:27:34 +0000 (22:27 +0100)]
i965/bufmgr: Use write-combine mappings where available

Write-combine mappings give much better performance on writes than
uncached access through the GTT.

Improves performance of GFXBench 4's gl_driver2 benchmark at 1024x768
on Apollolake by 3.6086% +/- 0.674193% (n=15).

v2: (by Ken) Rebase on lockless mappings, map_count deletion, valgrind
    updates, potential for CPU/WC maps failing, and other changes.

v3: (by Ken and Chris Wilson)

    (Ken): Rebase on set_domain -> gem_wait
    (Chris): Fix up a failed CPU/WC mmapping with a GTT mapping

    Not all objects will be mappable for direct access by the CPU
    (either using WC/CPU or WC paths), for example, a dmabuf wrapping an
    object on a foreign device or an object wrapping access to stolen
    memory. Since either the physical pages are not known or even do not
    exist, we need to use the mediated, indirect access via the GTT. (If
    one day, the kernel does suddenly start providing mediated access
    via a regular WB/WC mmapping, we no longer need the fallback.)

v4: Avoid falling back for MAP_RAW (Chris).
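
The fallback described in v3 and v4 can be sketched roughly like this. The
function-pointer type, map_with_fallback() and the two stub mappers are
hypothetical; they only model "direct mapping failed, try the GTT unless the
caller asked for a raw mapping":

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mapper signature: returns NULL when the mapping fails. */
typedef void *(*map_fn_sketch)(void *bo);

/* Try the direct (CPU/WC) mapping first, then fall back to the GTT. */
static void *
map_with_fallback(void *bo, map_fn_sketch direct_map, map_fn_sketch gtt_map,
                  bool raw)
{
   assert(direct_map != NULL && gtt_map != NULL);

   void *ptr = direct_map(bo);

   /* Raw mappings are assumed here not to fall back, as in the v4 note. */
   if (ptr == NULL && !raw)
      ptr = gtt_map(bo);

   return ptr;
}

/* Stub mappers standing in for the real mmap paths. */
static void *stub_map_fail(void *bo) { (void)bo; return NULL; }
static void *stub_map_ok(void *bo) { return bo; }
```

Calling map_with_fallback(bo, stub_map_fail, stub_map_ok, false) models an
object that cannot be mapped directly, such as a dmabuf from a foreign
device, and still yields a usable pointer through the indirect path.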

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoi965/bufmgr: Skip wait ioctl when not busy.
Kenneth Graunke [Mon, 17 Jul 2017 19:57:20 +0000 (12:57 -0700)]
i965/bufmgr: Skip wait ioctl when not busy.

If the buffer is idle, I915_GEM_WAIT will return immediately,
so we may as well skip the ioctl altogether.  We can't trust the
"idle" flag for external buffers, but for most, it should be fine.
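
A minimal sketch of the early-out, with a counter standing in for the real
ioctl. struct bo_sketch, kernel_wait_stub() and bo_wait_rendering_sketch()
are hypothetical names, not the bufmgr's API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer object with a cached userspace "idle" flag. */
struct bo_sketch {
   bool idle;       /* cached belief that the GPU is done with the buffer */
   bool external;   /* shared with other processes: the cache is untrusted */
   int wait_calls;  /* counts how often the (stubbed) ioctl was issued */
};

/* Stand-in for the wait ioctl: the buffer is idle once it returns. */
static void
kernel_wait_stub(struct bo_sketch *bo)
{
   bo->wait_calls++;
   bo->idle = true;
}

static void
bo_wait_rendering_sketch(struct bo_sketch *bo)
{
   assert(bo != NULL);

   /* Skip the system call when we already know the buffer is idle. */
   if (bo->idle && !bo->external)
      return;

   kernel_wait_stub(bo);
}
```

External buffers always take the slow path in this sketch, because another
process may have queued work that the cached flag knows nothing about.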

Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agoi965/bufmgr: Explicitly wait instead of using I915_GEM_SET_DOMAIN.
Kenneth Graunke [Mon, 17 Jul 2017 19:46:58 +0000 (12:46 -0700)]
i965/bufmgr: Explicitly wait instead of using I915_GEM_SET_DOMAIN.

With the advent of asynchronous maps, domain tracking doesn't make a
whole lot of sense.  Buffers can be in use on both the CPU and GPU at
the same time.  In order to avoid blocking, we stopped using set_domain
for asynchronous mappings, which means that the kernel's tracking has
lies.  We can't properly track it in userspace either, as the kernel
can change domains on us spontaneously (for example, when un-swapping).

According to Chris Wilson, I915_GEM_SET_DOMAIN does the following:

1. pins the backing storage (acquiring pages outside of the
   struct_mutex)

2. waits either for read/write access, including inter-device waits

3. updates the domain, clflushing as required

4. marks the object as used (for swapping)

5. turns off FBC/PSR/fancy scanout caching

Item (1) is not terribly important.  Most BOs are recycled via the
BO cache, so they already have pages.  Regardless, we fixed this
via an initial set_domain in the previous patch.

We implement item (2) with I915_GEM_WAIT.  This has one downside:
we'll stall unnecessarily if we do a read-only mapping of a buffer
that the GPU is reading.  I believe this is pretty uncommon.  We
may want to extend the wait ioctl at some point.

Mesa already does item (3) itself.  For cache-coherent buffers (most on
LLC systems), we don't need to do any clflushing - the CPU and GPU views
are coherent.  For non-coherent buffers (most on non-LLC systems), we
currently only use the CPU for read-only maps, and we explicitly clflush
when necessary.

We don't care about item (4)...swapping has already killed performance.
Plus, with async maps, the kernel's domain tracking is already bogus,
so it can't do this accurately regardless.

Item (5) should be okay because we avoid cached maps of scanout buffers.
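
Items (2) and (3) above can be summarized as a small decision sketch. This
is a simplification under stated assumptions: asynchronous maps skip the
wait, and only non-coherent buffers need an explicit clflush. The structs
and plan_cpu_map() are hypothetical and do not mirror the driver's code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer description. */
struct buffer_sketch {
   bool coherent;  /* CPU and GPU views agree (typical with LLC) */
};

/* What userspace has to do itself once set_domain is no longer called. */
struct map_plan_sketch {
   bool wait_for_gpu;  /* item (2): explicit wait */
   bool clflush;       /* item (3): explicit cache flush */
};

static struct map_plan_sketch
plan_cpu_map(const struct buffer_sketch *buf, bool async)
{
   assert(buf != NULL);

   struct map_plan_sketch plan;

   /* Asynchronous maps must not block, so they skip the wait. */
   plan.wait_for_gpu = !async;

   /* Coherent buffers need no flushing; non-coherent ones do. */
   plan.clflush = !buf->coherent;

   return plan;
}
```

Items (1), (4) and (5) have no counterpart in the sketch, matching the
reasoning above that they are either handled elsewhere or do not matter.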

Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agoi965/bufmgr: Allocate BO pages outside of the kernel's locking.
Kenneth Graunke [Fri, 21 Jul 2017 19:29:30 +0000 (12:29 -0700)]
i965/bufmgr: Allocate BO pages outside of the kernel's locking.

Suggested by Chris Wilson.

v2: Set the write domain to 0 (suggested by Chris).

Reviewed-by: Matt Turner <mattst88@gmail.com>
7 years agoglsl: rework misleading block layout code
Timothy Arceri [Fri, 21 Jul 2017 01:42:33 +0000 (11:42 +1000)]
glsl: rework misleading block layout code

From the ARB_uniform_buffer_object spec:

   ""shared" uniform blocks, the default layout, ..."

This doesn't fix anything as the default layout is already applied
at this point but fixes the misleading code/comment.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>