mesa.git
5 years ago intel/aub_viewer: fixup 0x address prefix
Lionel Landwerlin [Wed, 10 Oct 2018 21:30:04 +0000 (22:30 +0100)]
intel/aub_viewer: fixup 0x address prefix

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
5 years ago intel/aub_viewer: fix shader get_bo
Lionel Landwerlin [Fri, 14 Sep 2018 20:19:21 +0000 (21:19 +0100)]
intel/aub_viewer: fix shader get_bo

Instruction addresses are always in ppgtt space.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
5 years ago radeonsi: Enable adaptive_sync by default for radeon
Nicholas Kazlauskas [Tue, 23 Oct 2018 15:38:52 +0000 (11:38 -0400)]
radeonsi: Enable adaptive_sync by default for radeon

It's better to let most applications make use of adaptive sync
by default. Problematic applications can be placed on the blacklist
or the user can manually disable the feature.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
5 years ago loader/dri3: Enable adaptive_sync via _VARIABLE_REFRESH property
Nicholas Kazlauskas [Tue, 23 Oct 2018 15:38:51 +0000 (11:38 -0400)]
loader/dri3: Enable adaptive_sync via _VARIABLE_REFRESH property

The DDX driver can be notified of adaptive sync suitability by
flagging the application's window with the _VARIABLE_REFRESH property.

This property is set on the first swap the application performs
when adaptive_sync is set to true in the drirc.

It's performed here instead of when the loader is initialized for
two reasons:

(1) The window's drawable can be missing during loader init.
    This can be observed during the Unigine Superposition benchmark.

(2) Adaptive sync will only be enabled closer to when the application
    actually begins rendering.

If adaptive_sync is false then the _VARIABLE_REFRESH property
is deleted on loader init.

The property is only managed on the glx DRI3 backend for now. This
should cover most common applications and games on modern hardware.

Vulkan support can be implemented in a similar manner but would likely
require splitting the function out into a common helper function.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
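The ordering described above (delete the property at loader init when the option is off, set it once on the first swap when it is on) can be sketched roughly as follows. This is only an illustration: struct drawable_state, set_variable_refresh(), drawable_init() and drawable_swap() are hypothetical names, and the stub flips a flag where a real loader would change the X11 window property.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the loader's per-drawable state. */
struct drawable_state {
   bool adaptive_sync;         /* value of the adaptive_sync drirc option */
   bool adaptive_sync_active;  /* true once the property has been set */
   bool property_present;      /* models _VARIABLE_REFRESH on the window */
};

/* Stub: a real loader would set or delete the window property here. */
void
set_variable_refresh(struct drawable_state *draw, bool enable)
{
   draw->property_present = enable;
}

/* Loader init: the drawable may not exist yet, so only clear the
 * property when the option is off. */
void
drawable_init(struct drawable_state *draw, bool adaptive_sync)
{
   draw->adaptive_sync = adaptive_sync;
   draw->adaptive_sync_active = false;
   if (!adaptive_sync)
      set_variable_refresh(draw, false);
}

/* Swap: set the property exactly once, on the first swap. */
void
drawable_swap(struct drawable_state *draw)
{
   if (draw->adaptive_sync && !draw->adaptive_sync_active) {
      set_variable_refresh(draw, true);
      draw->adaptive_sync_active = true;
   }
}
```

Deferring the set to the first swap keeps the property off the window until the application is actually about to render, which matches reason (2) in the message.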
5 years ago drirc: Initial blacklist for adaptive sync
Nicholas Kazlauskas [Tue, 23 Oct 2018 15:38:50 +0000 (11:38 -0400)]
drirc: Initial blacklist for adaptive sync

Applications that don't present at a predictable rate (i.e. not games)
shouldn't have adaptive sync enabled. This list covers some of the
common desktop compositors, web browsers and video players.

[ Michel Dänzer: Added entry for firefox-esr ]

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
5 years ago util: Add adaptive_sync driconf option
Nicholas Kazlauskas [Tue, 23 Oct 2018 15:38:49 +0000 (11:38 -0400)]
util: Add adaptive_sync driconf option

This option lets the user decide whether mesa should notify the
window manager / DDX driver that the current application is adaptive
sync capable.

It's off by default.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
5 years ago util: Get program name based on path when possible
Nicholas Kazlauskas [Tue, 23 Oct 2018 15:38:48 +0000 (11:38 -0400)]
util: Get program name based on path when possible

Some programs start with the path and command line arguments in
argv[0] (program_invocation_name). Chromium is an example of
an application using mesa that does this.

This tries to query the real path for the symbolic link /proc/self/exe
to find the program name instead. It only uses the realpath if it
was a prefix of the invocation to avoid breaking wine programs.

Cc: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
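A minimal sketch of the prefix check described above, written as a pure function so it can be exercised without touching /proc. The name pick_program_name() and both parameters are assumptions for illustration; the caller is assumed to pass argv[0] (program_invocation_name) and the result of resolving /proc/self/exe, or NULL if that failed.

```c
#include <string.h>

/* Hypothetical helper: choose a program name from the invocation string
 * and the resolved executable path. The resolved path is trusted only
 * if it is a prefix of the invocation, so that programs whose argv[0]
 * names something other than the real binary (e.g. under wine) keep
 * their invocation-derived name. */
const char *
pick_program_name(const char *invocation, const char *exe_path)
{
   const char *slash = strrchr(invocation, '/');

   /* No path component at all: use the invocation as-is. */
   if (!slash)
      return invocation;

   if (exe_path &&
       strncmp(exe_path, invocation, strlen(exe_path)) == 0) {
      const char *name = strrchr(exe_path, '/');
      if (name)
         return name + 1;
   }

   /* Fall back to whatever follows the last '/' in the invocation. */
   return slash + 1;
}
```

With an invocation that carries arguments after the path, the fallback alone would return the tail of the last argument containing a '/', which is the misbehaviour the prefix check avoids.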
5 years ago etnaviv: Consolidate buffer references from framebuffers
Tomeu Vizoso [Mon, 17 Dec 2018 08:56:00 +0000 (09:56 +0100)]
etnaviv: Consolidate buffer references from framebuffers

We were leaking surfaces because the references taken in
etna_set_framebuffer_state weren't being released on context destroy.

Instead of just directly releasing those references in
etna_context_destroy, use the util_copy_framebuffer_state helper.

Take the chance to remove the duplicated buffer references in
compiled_framebuffer_state to avoid confusion.

The leak can be reproduced with a client that continuously creates and
destroys contexts.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reported-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years ago virgl/vtest: fix front buffer flush with protocol version 0.
Dave Airlie [Thu, 27 Dec 2018 06:09:19 +0000 (16:09 +1000)]
virgl/vtest: fix front buffer flush with protocol version 0.

Versions of virglrenderer before 33da7361aec486290df0aec4ad8dfa8ff6adde2c
misrender gears in vtest mode.

Fixes: 9d81cd8e7c (virgl: Pass resource size and transfer offsets)
Reviewed-By: Gert Wollny <gert.wollny@collabora.com>
5 years ago docs/autoconf: Mark autoconf as being replaced
Dylan Baker [Thu, 20 Dec 2018 19:49:11 +0000 (11:49 -0800)]
docs/autoconf: Mark autoconf as being replaced

I know it's not what anyone wants, but how about we start with a
message in the documentation that encourages people to try meson.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engeström <eric@engestrom.ch>
5 years ago docs/install: Update python dependency section
Dylan Baker [Thu, 20 Dec 2018 21:43:29 +0000 (13:43 -0800)]
docs/install: Update python dependency section

Note that meson requires python 3, scons requires python 2, and
autotools works with either.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engeström <eric@engestrom.ch>
5 years ago docs/meson: Update LLVM section with information about native files
Dylan Baker [Thu, 20 Dec 2018 19:46:08 +0000 (11:46 -0800)]
docs/meson: Update LLVM section with information about native files

Reviewed-by: Eric Engeström <eric@engestrom.ch>
5 years ago docs/install: Add meson to the main install page
Dylan Baker [Thu, 20 Dec 2018 19:27:52 +0000 (11:27 -0800)]
docs/install: Add meson to the main install page

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engeström <eric@engestrom.ch>
5 years ago docs: update calendar, add news item and link release notes for 18.2.8
Juan A. Suarez Romero [Thu, 27 Dec 2018 16:37:33 +0000 (17:37 +0100)]
docs: update calendar, add news item and link release notes for 18.2.8

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
5 years ago docs: add sha256 checksums for 18.2.8
Juan A. Suarez Romero [Thu, 27 Dec 2018 16:34:35 +0000 (17:34 +0100)]
docs: add sha256 checksums for 18.2.8

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 24c31bc0e237148b1c44811b17c61fc71f09bd93)

5 years ago docs: add release notes for 18.2.8
Juan A. Suarez Romero [Thu, 27 Dec 2018 15:57:51 +0000 (15:57 +0000)]
docs: add release notes for 18.2.8

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 785e09e3b32980380eb2081eeb48c157306f99ba)

5 years ago nv50,nvc0: add missing CAPs for unsupported features
Ilia Mirkin [Thu, 27 Dec 2018 01:27:29 +0000 (20:27 -0500)]
nv50,nvc0: add missing CAPs for unsupported features

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
5 years ago nvc0: enable GL_NV_shader_atomic_float on pre-Maxwell
Ilia Mirkin [Thu, 27 Dec 2018 01:01:28 +0000 (20:01 -0500)]
nvc0: enable GL_NV_shader_atomic_float on pre-Maxwell

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
5 years ago nv50/ir: add support for converting ATOMFADD to proper ir
Ilia Mirkin [Thu, 19 Apr 2018 01:36:52 +0000 (21:36 -0400)]
nv50/ir: add support for converting ATOMFADD to proper ir

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
5 years ago st/mesa: expose GL_NV_shader_atomic_float when ATOMFADD is supported
Ilia Mirkin [Wed, 5 Dec 2018 05:22:40 +0000 (00:22 -0500)]
st/mesa: expose GL_NV_shader_atomic_float when ATOMFADD is supported

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago st/mesa: select ATOMFADD when source type is float
Ilia Mirkin [Thu, 19 Apr 2018 01:16:52 +0000 (21:16 -0400)]
st/mesa: select ATOMFADD when source type is float

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago gallium: add PIPE_CAP_TGSI_ATOMFADD to indicate support
Ilia Mirkin [Wed, 5 Dec 2018 05:21:34 +0000 (00:21 -0500)]
gallium: add PIPE_CAP_TGSI_ATOMFADD to indicate support

ATOMFADD is a little special -- make drivers have to specify it
explicitly.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago tgsi: add ATOMFADD operation
Ilia Mirkin [Thu, 19 Apr 2018 01:13:22 +0000 (21:13 -0400)]
tgsi: add ATOMFADD operation

This is supported by at least NVIDIA hardware, and exposable via GL
extensions.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
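For readers unfamiliar with the operation, the semantics of an atomic floating-point add (add a float to a memory location and return the previous value) can be emulated on a CPU with a compare-and-swap loop. The sketch below is plain C11 and says nothing about how any driver implements the opcode; atomic_fadd(), float_to_bits() and bits_to_float() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern and back. */
uint32_t
float_to_bits(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof bits);
   return bits;
}

float
bits_to_float(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof f);
   return f;
}

/* Atomically add 'value' to the float stored at 'addr' and return the
 * value that was there before the add. */
float
atomic_fadd(_Atomic uint32_t *addr, float value)
{
   uint32_t old_bits = atomic_load(addr);

   for (;;) {
      float old_val = bits_to_float(old_bits);
      uint32_t new_bits = float_to_bits(old_val + value);

      /* On failure old_bits is refreshed with the current contents,
       * so the next iteration retries with up-to-date data. */
      if (atomic_compare_exchange_weak(addr, &old_bits, new_bits))
         return old_val;
   }
}
```

Hardware that supports the operation natively performs the whole read-modify-write in one instruction instead of retrying.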
5 years ago st/mesa: allow glDrawElements to work with GL_SELECT feedback
Ilia Mirkin [Wed, 19 Dec 2018 03:47:05 +0000 (22:47 -0500)]
st/mesa: allow glDrawElements to work with GL_SELECT feedback

Not sure if this ever worked, but the current logic for setting the
min/max index is definitely wrong for indexed draws. While we're at it,
bring in all the usual logic from the non-indirect drawing path.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109086
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
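One way to see why indexed draws need different min/max handling: the range of vertices touched is determined by the index values themselves, not by the draw's start and count. The following is a generic sketch, not the code from this commit; struct index_bounds and index_bounds_u16() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

struct index_bounds {
   uint32_t min_index;
   uint32_t max_index;
};

/* Scan a 16-bit index buffer and return the smallest and largest
 * vertex index it references. For count == 0 the result is the
 * empty range { UINT32_MAX, 0 }. */
struct index_bounds
index_bounds_u16(const uint16_t *indices, size_t count)
{
   struct index_bounds b = { UINT32_MAX, 0 };

   for (size_t i = 0; i < count; i++) {
      if (indices[i] < b.min_index)
         b.min_index = indices[i];
      if (indices[i] > b.max_index)
         b.max_index = indices[i];
   }
   return b;
}
```

For a non-indexed draw the same range is simply start to start + count - 1, which is why reusing that logic for indexed draws gives wrong bounds.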
5 years ago gallium/ttn: Fix setup of outputs_written.
Eric Anholt [Thu, 20 Dec 2018 16:12:50 +0000 (08:12 -0800)]
gallium/ttn: Fix setup of outputs_written.

We need a 64-bit value, otherwise we only handle the low 32, and happen to
sign-extend to claim to write all varying slots if VARYING_SLOT_VAR2 was
used.

Fixes: 4d0b2c7aaac3 ("ttn: Update shader->info as we generate code.")
Reviewed-by: Rob Clark <robdclark@gmail.com>
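The sign-extension effect described above can be reproduced in isolation. The sketch below assumes a common two's-complement compiler, where converting 0x80000000 to a 32-bit signed integer yields a negative value; mask_via_int32() and mask_64bit() are hypothetical names, not functions from the tree.

```c
#include <stdint.h>

/* Builds the bit as a 32-bit signed value first, then widens it.
 * For slot 31 the intermediate is negative, so widening sets all of
 * the upper 32 bits as well. Only valid for slot < 32. */
uint64_t
mask_via_int32(unsigned slot)
{
   int32_t bit = (int32_t)(UINT32_C(1) << slot);
   return (uint64_t)(int64_t)bit;
}

/* Builds the bit directly as a 64-bit value. Valid for slot < 64. */
uint64_t
mask_64bit(unsigned slot)
{
   return UINT64_C(1) << slot;
}
```

With slot 31 the first function yields 0xFFFFFFFF80000000, which reads as "bit 31 and every higher slot written", while the second yields just 0x80000000.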
5 years ago anv: don't do partial resolve on layer > 0
Lionel Landwerlin [Mon, 3 Dec 2018 18:40:10 +0000 (18:40 +0000)]
anv: don't do partial resolve on layer > 0

We've made the choice not to use fast clears on layer > 0 with
multilayer images. This is partly because we would need to store
multiple clear colors for each layer, making the existing memory
layout, already including aux surfaces, fast clear color, image state,
etc... even more complex.

Partial resolves are the operations transferring the clear colors into
the auxiliary buffers. This operation is currently implemented in
Blorp by loading the clear color from the image's BO, into a shader
that then samples from the auxiliary buffer and writes the color only
if it isn't there already.

The problem here is that we store only one clear color for all
layers, and it is used for partial resolves. If you trigger a partial
clear on a layer > 0, then you're likely to deal with a color that is
not what you actually want. In the particular issues below, we have
multiple layers, each cleared with a different color but the partial
resolve just writes the wrong color into the auxiliary buffers for
layers > 0.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108910
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108911
Cc: mesa-stable@lists.freedesktop.org
5 years ago st/nine: Increase the limit of cached ff shaders
Axel Davy [Sat, 8 Dec 2018 19:42:23 +0000 (20:42 +0100)]
st/nine: Increase the limit of cached ff shaders

100 is too small for some games, which triggers recompilations
every frame. Increase to 1024.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
5 years ago st/nine: Add src reference to nine_context_range_upload
Axel Davy [Mon, 3 Dec 2018 20:24:54 +0000 (21:24 +0100)]
st/nine: Add src reference to nine_context_range_upload

Just like nine_context_box_upload, nine_context_range_upload
should reference the src, which holds the ram source buffer.

Fixes: https://github.com/iXit/Mesa-3D/issues/327
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Cc: mesa-stable@lists.freedesktop.org
5 years ago st/nine: Bind src not dst in nine_context_box_upload
Axel Davy [Mon, 3 Dec 2018 20:15:47 +0000 (21:15 +0100)]
st/nine: Bind src not dst in nine_context_box_upload

nine_context_box_upload uploads a ram buffer (from src)
to a pipe_resource (dst).
We already have a refcount on the pipe_resource,
what needs to be protected from release is the ram buffer,
thus a reference to src.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Cc: mesa-stable@lists.freedesktop.org
5 years ago st/nine: Fix volumetexture dtor on ctor failure
Axel Davy [Sun, 25 Nov 2018 13:37:53 +0000 (14:37 +0100)]
st/nine: Fix volumetexture dtor on ctor failure

The dtor is called on allocation failure,
thus we must check the volumes are allocated
before trying to release them.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Cc: mesa-stable@lists.freedesktop.org
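The pattern being fixed is general: when a constructor calls the destructor on its failure path, the destructor must tolerate members that were never allocated. A minimal sketch under that assumption follows; all names are hypothetical, and the fail_array parameter exists only to simulate an allocation failure.

```c
#include <stdlib.h>

struct volume {
   unsigned level;
};

struct volume_texture {
   struct volume **volumes;  /* NULL if the array was never allocated */
   unsigned level_count;
};

/* Safe to call on a partially constructed object. */
void
volume_texture_dtor(struct volume_texture *tex)
{
   if (tex->volumes) {
      for (unsigned i = 0; i < tex->level_count; i++)
         free(tex->volumes[i]);  /* free(NULL) is a no-op */
      free(tex->volumes);
      tex->volumes = NULL;
   }
}

/* Returns 0 on success, -1 on failure. A non-zero fail_array
 * simulates the array allocation failing. */
int
volume_texture_ctor(struct volume_texture *tex, unsigned levels,
                    int fail_array)
{
   tex->level_count = levels;
   tex->volumes = fail_array ? NULL
                             : calloc(levels, sizeof(*tex->volumes));
   if (!tex->volumes) {
      volume_texture_dtor(tex);
      return -1;
   }

   for (unsigned i = 0; i < levels; i++) {
      tex->volumes[i] = malloc(sizeof(struct volume));
      if (!tex->volumes[i]) {
         volume_texture_dtor(tex);
         return -1;
      }
      tex->volumes[i]->level = i;
   }
   return 0;
}
```

Without the NULL check in the destructor, the first failure path would dereference a NULL array while trying to release the volumes.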
5 years ago st/nine: Switch to presentation buffer if resize is detected
Axel Davy [Sun, 16 Sep 2018 16:06:56 +0000 (18:06 +0200)]
st/nine: Switch to presentation buffer if resize is detected

This makes it possible to match the window size
on resize in all cases, as this currently only
works with presentation buffers.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
5 years ago st/nine: Use helper to release swapchain buffers later
Axel Davy [Wed, 21 Nov 2018 21:03:07 +0000 (22:03 +0100)]
st/nine: Use helper to release swapchain buffers later

This patch introduces a structure to release the
present_handles only when they are fully released
by the server, thus making
"DestroyD3DWindowBuffer" actually release the buffer
right away when called.

Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
5 years ago freedreno/a6xx: fix 3d texture layout
Rob Clark [Tue, 18 Dec 2018 15:34:23 +0000 (10:34 -0500)]
freedreno/a6xx: fix 3d texture layout

Maybe not 100% perfect, but seems to be a pretty good approximation of
that.

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years ago freedreno: update generated headers
Rob Clark [Fri, 21 Dec 2018 18:22:37 +0000 (13:22 -0500)]
freedreno: update generated headers

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years ago freedreno/a6xx: improve setup_slices() debug msgs
Rob Clark [Tue, 18 Dec 2018 15:33:19 +0000 (10:33 -0500)]
freedreno/a6xx: improve setup_slices() debug msgs

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years ago freedreno/a6xx: simplify special case for 3d layout
Rob Clark [Tue, 18 Dec 2018 15:28:57 +0000 (10:28 -0500)]
freedreno/a6xx: simplify special case for 3d layout

This logic can be re-written as the two cases for 3d (i.e. before/after
the miplevel sizes start reducing) vs everything else.  I think it is
easier to read this way.

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years ago freedreno: combine fd_resource_layer_offset()/fd_resource_offset()
Rob Clark [Tue, 18 Dec 2018 15:27:10 +0000 (10:27 -0500)]
freedreno: combine fd_resource_layer_offset()/fd_resource_offset()

We really only need this logic in one place.

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years ago freedreno/ir3: don't treat all inputs/outputs as vec4
Rob Clark [Wed, 5 Dec 2018 20:07:51 +0000 (15:07 -0500)]
freedreno/ir3: don't treat all inputs/outputs as vec4

This was a hold-over from the early TGSI days, and mostly not needed
with NIR.  This avoids burning an entire 4 consecutive scalar regs
for vec3 outputs, for example, which fixes a few places where we were
doing worse than we should on register usage.

Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years ago freedreno/ir3: fix fallout of extra assert
Rob Clark [Fri, 21 Dec 2018 23:47:26 +0000 (18:47 -0500)]
freedreno/ir3: fix fallout of extra assert

Fixes the following crash that happened after d6110d4d

The problem happens if we first compile a "vanilla" shader with nothing
lowered in NIR, which performs the final lowering passes on
so->shader->nir (including nir_lower_locals_to_regs()), and then later
compile a shader with some lowering.  The second time through we would
have already done nir_lower_locals_to_regs().

Arguably this was already a bug, just one we hadn't noticed yet.

Fixes: d6110d4d547 intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs
Signed-off-by: Rob Clark <robdclark@gmail.com>
5 years ago st/nir: Drop unused gl_program parameter in VS input handling helper.
Kenneth Graunke [Tue, 30 Oct 2018 05:38:57 +0000 (22:38 -0700)]
st/nir: Drop unused gl_program parameter in VS input handling helper.

Nobody uses this, so let's drop it.  This makes the helper callable
from places without a gl_program.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago st/nir: Gather info after applying lowering FS variant features
Kenneth Graunke [Thu, 1 Nov 2018 18:57:09 +0000 (11:57 -0700)]
st/nir: Gather info after applying lowering FS variant features

DrawPixels lowering, for example, adds new varyings that need to be
accounted for in inputs_read.  The earlier info gathering at link time
cannot account for this.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago st/mesa: Combine the DrawPixels and Bitmap passthrough VS programs.
Kenneth Graunke [Tue, 4 Dec 2018 06:41:32 +0000 (22:41 -0800)]
st/mesa: Combine the DrawPixels and Bitmap passthrough VS programs.

They're now identical, so we can just compile it once.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago st/mesa: Don't open code the drawpixels vertex shader.
Kenneth Graunke [Tue, 4 Dec 2018 06:38:33 +0000 (22:38 -0800)]
st/mesa: Don't open code the drawpixels vertex shader.

Now that we always copy color, we can just use the util function.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago st/mesa: Drop !passColor optimization in drawpixels shaders.
Kenneth Graunke [Tue, 4 Dec 2018 06:26:47 +0000 (22:26 -0800)]
st/mesa: Drop !passColor optimization in drawpixels shaders.

The glDrawPixels passthrough vertex shader copies position and texcoord
vertex attributes to varying outputs.  It also optionally copies a third
gl_Color attribute, which sometimes is unnecessary.  Until now, we've
compiled separate variants of the shader, one of which does this extra
copy, and the other of which doesn't.  We have done this since 2007.

But, the vertex shader runs for a whopping four vertices, and so the
cost of copying a single input to output is likely inconsequential.
In theory, we could bind one fewer vertex element - but we always bind
all three regardless.  So, we don't even get that savings.

This patch unifies the two, so we always copy the optional color,
and save having to compile the variant.  It also makes the VS input
interface match up with the vertex element state without any dead
(unused) input attributes.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago st/mesa: Drop dead 'passthrough_fs' field.
Kenneth Graunke [Tue, 4 Dec 2018 06:42:18 +0000 (22:42 -0800)]
st/mesa: Drop dead 'passthrough_fs' field.

Dead since 2015 (commit 5142564734bd68f165b02e29e384ebbcf91cce38).

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago radv: Fix wrongly positioned paren.
Bas Nieuwenhuizen [Fri, 21 Dec 2018 20:06:55 +0000 (21:06 +0100)]
radv: Fix wrongly positioned paren.

Trivial.

Fixes: 9f0bfbed11f "radv: Work around non-renderable 128bpp compressed 3d textures on GFX9."
5 years ago docs: add note about using backticks for rbs in gitlab
Dylan Baker [Thu, 20 Dec 2018 17:28:56 +0000 (09:28 -0800)]
docs: add note about using backticks for rbs in gitlab

So that gitlab will render the < and > correctly allowing the tag to be
copy-n-pasted without additional formatting.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
5 years ago pci_ids: add new VegaM pci id
Alex Deucher [Thu, 20 Dec 2018 15:11:01 +0000 (10:11 -0500)]
pci_ids: add new VegaM pci id

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: mesa-stable@lists.freedesktop.org
5 years ago gallivm: abort when trying to use non-existing intrinsic
Roland Scheidegger [Fri, 21 Dec 2018 01:41:31 +0000 (02:41 +0100)]
gallivm: abort when trying to use non-existing intrinsic

Whenever llvm removes an intrinsic we're using, we hit segfaults
due to llvm doing calls to address 0 in the jitted code instead.
However, Jose figured out we can actually detect this with
LLVMGetIntrinsicID(), so use this to abort, so we don't have to wonder
what got broken. (Of course, someone still needs to fix the code to
no longer use this intrinsic.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
5 years ago gallivm: don't use pavg.b intrinsic on llvm >= 6.0
Roland Scheidegger [Thu, 20 Dec 2018 23:57:04 +0000 (00:57 +0100)]
gallivm: don't use pavg.b intrinsic on llvm >= 6.0

This intrinsic disappeared with llvm 6.0; using it ends up in segfaults
(due to llvm issuing a call to a NULL address in the jitted shaders).
Add code doing the same thing as the autoupgrade code in llvm so it
can be matched and replaced back with a pavgb.

While here, also improve lp_test_format, so it tests both with and without
cache (as it was, it tested the cache versions only, whereas cache is
actually disabled in llvmpipe, and in any case even with it enabled
vertex and geometry shaders wouldn't use it). (Although at least for
the unorm8 uncached fetch, the code is still quite different to what
llvmpipe is using, since that would use unorm8x16 type, whereas
the test code is using unorm8x4 type, hence disabling some intrinsic
paths.)

Fixes: 6f4083143bb8 ("gallivm: use llvm jit code for decoding s3tc")
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
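For reference, the operation that pavgb performs on each unsigned byte lane is a rounded average, computed in a wider type so the intermediate sum cannot overflow. The scalar sketch below shows that arithmetic only; avg_round_u8() is a hypothetical name, and the actual change emits the equivalent vector IR (widen, add, add one, shift right, narrow) so the backend can pattern-match it.

```c
#include <stdint.h>

/* Rounded average of two unsigned bytes: (a + b + 1) >> 1, with the
 * sum held in 16 bits so that 255 + 255 + 1 does not wrap. */
uint8_t
avg_round_u8(uint8_t a, uint8_t b)
{
   uint16_t sum = (uint16_t)((uint16_t)a + (uint16_t)b + 1u);
   return (uint8_t)(sum >> 1);
}
```

Doing the add directly in 8 bits would lose the carry for large inputs, which is why the widening step is part of the pattern.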
5 years ago travis: meson: port gallium build combinations over
Emil Velikov [Thu, 13 Dec 2018 01:34:59 +0000 (01:34 +0000)]
travis: meson: port gallium build combinations over

This commit adds a number of build combinations:

 - Gallium Drivers {SWR, RadeonSI, Others}
Each one has different LLVM requirements. Building SWR alone is twice
as slow as all other drivers combined.

 - Gallium ST Clover LLVM {5,6,7}
Because the C++ API changes all the time. Analogous to the above, building
Clover takes as much time as building all other ST combined.

 - Gallium ST Others
Nouveau is used, instead of i915g since meson has explicit target
tracking. Meaning that a configure error is thrown if we use i915g
with say va, vdpau or others.

Note: LLVM prior to 5.0 is intentionally dropped. If needed we can add
that later.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago travis: meson: add explicit handling to gallium ST
Emil Velikov [Wed, 12 Dec 2018 13:52:20 +0000 (13:52 +0000)]
travis: meson: add explicit handling to gallium ST

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago travis: meson: explicitly control the DRI loaders
Emil Velikov [Wed, 12 Dec 2018 13:42:36 +0000 (13:42 +0000)]
travis: meson: explicitly control the DRI loaders

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago travis: meson: add unwind handling
Emil Velikov [Wed, 12 Dec 2018 13:33:14 +0000 (13:33 +0000)]
travis: meson: add unwind handling

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
5 years ago travis: meson: use FOO_DRIVERS directly
Emil Velikov [Wed, 12 Dec 2018 13:18:54 +0000 (13:18 +0000)]
travis: meson: use FOO_DRIVERS directly

It makes for a shorter MESON_OPTIONS and cleaner handling.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago travis: meson: enable unit tests
Dylan Baker [Tue, 11 Dec 2018 18:34:51 +0000 (10:34 -0800)]
travis: meson: enable unit tests

v2: [Emil] pass the argument directly to meson

Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
5 years ago travis: Don't try to read libdrm out of configure.ac
Dylan Baker [Tue, 11 Dec 2018 19:09:21 +0000 (11:09 -0800)]
travis: Don't try to read libdrm out of configure.ac

Since we're going to delete it shortly

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years ago travis: meson: use native files to override llvm-config
Dylan Baker [Tue, 11 Dec 2018 18:40:25 +0000 (10:40 -0800)]
travis: meson: use native files to override llvm-config

This is the supported way to do this, and should be more robust and
reliable.

v2: [Emil]
 - enable backslash escapes
 - don't hardcode the path
 - pass the argument directly to meson

Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago travis: printout llvm-config --version
Emil Velikov [Thu, 13 Dec 2018 10:38:20 +0000 (10:38 +0000)]
travis: printout llvm-config --version

Provides quick and easy feedback.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago travis: meson: print the configured state
Emil Velikov [Wed, 12 Dec 2018 17:43:07 +0000 (17:43 +0000)]
travis: meson: print the configured state

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago travis: flip to distro xenial, drop sudo false
Emil Velikov [Thu, 13 Dec 2018 11:20:41 +0000 (11:20 +0000)]
travis: flip to distro xenial, drop sudo false

The latter is the default these days and Travis will be removing sudo
soonish.

Flipping to xenial, allows us to remove a bunch of hacks we have. Plus
it prevents us from adding new ones to work around what seems like a
gcc/binutils bug. For example (from the upcoming meson build):

FAILED: ccache c++  -o src/gallium/targets/pipe-loader/pipe_r600.so ...
  ... src/util/libmesa_util.a ... /usr/lib/x86_64-linux-gnu/libz.so ...

src/util/libmesa_util.a(disk_cache.c.o): In function `deflate_and_write_to_disk':
_build/../src/util/disk_cache.c:746: undefined reference to `deflateInit_'
_build/../src/util/disk_cache.c:765: undefined reference to `deflate'
...

As we can see, even though libz.so is explicitly passed after the
object that requires it - the linker still fails to see the symbols.
Avoid all those situations - flip the switch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago configure: add CXX11_CXXFLAGS to LLVM_CXXFLAGS
Emil Velikov [Thu, 13 Dec 2018 11:56:40 +0000 (11:56 +0000)]
configure: add CXX11_CXXFLAGS to LLVM_CXXFLAGS

Seemingly with LLVM7 and GCC 5.0, the former won't properly advertise
-std=c++11 and the latter will choke.

Add this temporary workaround, otherwise we'll get errors like:

In file included from /usr/include/c++/5/type_traits:35:0,
                 from /usr/lib/llvm-7/include/llvm/Support/type_traits.h:18,
                 from /usr/lib/llvm-7/include/llvm/ADT/Optional.h:22,
                 from /usr/lib/llvm-7/include/llvm/ADT/STLExtras.h:20,
                 from /usr/lib/llvm-7/include/llvm/ADT/StringRef.h:13,
                 from /usr/lib/llvm-7/include/llvm/Target/TargetMachine.h:17,
                 from ../../../src/amd/common/ac_llvm_helper.cpp:36:
/usr/include/c++/5/bits/c++0x_warning.h:32:2: error: #error This file requires compiler and library support for the ISO C++ 2011 standard. This support must be enabled with the -std=c++11 or -std=gnu++11 compiler options.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago glx/test: meson: assorted include fixes
Emil Velikov [Wed, 12 Dec 2018 19:24:14 +0000 (19:24 +0000)]
glx/test: meson: assorted include fixes

Swap '..' with the symbolic inc_glx and add glproto as dependency. That
will pull the correct include, effectively fixing the tests on macOS.

Fixes: a47c525f328 ("meson: build glx")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago glx: meson: wire up the dispatch-index-check test
Emil Velikov [Wed, 12 Dec 2018 19:07:52 +0000 (19:07 +0000)]
glx: meson: wire up the dispatch-index-check test

Accidentally dropped with an earlier commit.

Fixes: 4ccb9816737 ("meson: Use consistent style for tests")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago glx: meson: drop includes from a link-only library
Emil Velikov [Wed, 12 Dec 2018 17:55:08 +0000 (17:55 +0000)]
glx: meson: drop includes from a link-only library

When producing the final libGL.so/libGLX_mesa.so we only link the local
static helper lib (libglx). Thus there's no reason for the includes.

Fixes: a47c525f328 ("meson: build glx")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago TODO: glx: meson: build dri based glx tests, only with -Dglx=dri
Emil Velikov [Wed, 12 Dec 2018 17:47:36 +0000 (17:47 +0000)]
TODO: glx: meson: build dri based glx tests, only with -Dglx=dri

The library itself (libGL) is only built when -Dglx=dri, yet its
accompanying tests are built even with -Dglx=xlib.

Adjust the guards, so we don't build the tests when they are not
applicable.

v2:
 - Reword commit message (Dylan)
 - Drop build_by_default hunk (Dylan)

Fixes: a47c525f328 ("meson: build glx")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago pipe-loader: meson: reference correct library
Emil Velikov [Thu, 13 Dec 2018 04:10:50 +0000 (04:10 +0000)]
pipe-loader: meson: reference correct library

The library is called libgalliumvl_stub - note singular.

Fixes: 42ea0631f10 ("meson: build clover")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago meson: don't require glx/egl/gbm with gallium drivers
Emil Velikov [Thu, 13 Dec 2018 03:54:03 +0000 (03:54 +0000)]
meson: don't require glx/egl/gbm with gallium drivers

The gallium drivers do not require a DRI loader. Drop the artificial
and unnecessary restriction.

Fixes: af9d276134d ("meson: build libmesa_gallium")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago bin/get-pick-list.sh: warn when commit lists invalid sha
Emil Velikov [Mon, 17 Dec 2018 16:25:40 +0000 (16:25 +0000)]
bin/get-pick-list.sh: warn when commit lists invalid sha

We had cases where people would list old/invalid sha in the commit.
Add a trivial checker to catch those and throw a warning.

CC: Juan A. Suarez <jasuarez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
5 years ago bin/get-pick-list.sh: rework handing of sha nominations
Emil Velikov [Mon, 17 Dec 2018 15:44:25 +0000 (15:44 +0000)]
bin/get-pick-list.sh: rework handing of sha nominations

Currently our is_sha_nomination does:
 - folds any whitespace, attempting to extract sha-like information
 - checks that at least one of the shas has landed

Split it in two and do sha-like validation first.

This way, commits with mesa-stable and sha nominations will feature the
fixes/revert/etc instead of stable (a) or will be omitted if not
applicable for the respective branch (b).

Misc examples from 18.3

(a)
-[   stable ] 5bc509363b6 glx: make xf86vidmode mandatory for direct rendering
+[    fixes ] 5bc509363b6 glx: make xf86vidmode mandatory for direct rendering

(b)
-[   stable ] 9a7b3199037 anv/query: flush render target before copying results
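
The two-stage split described above can be sketched roughly as follows.
This is a minimal Python illustration, not the shell code in
bin/get-pick-list.sh; the regular expression, the function names and the
returned strings are all assumptions made for the example.

```python
import re

# Hypothetical pattern: a standalone run of 8 to 40 lowercase hex digits.
SHA_LIKE = re.compile(r"\b[0-9a-f]{8,40}\b")


def sha_like_tokens(nomination):
    """Stage 1: fold whitespace and extract anything that looks like a sha."""
    folded = " ".join(nomination.split())
    return SHA_LIKE.findall(folded)


def classify(nomination, landed):
    """Stage 2: decide what to do with a nomination line.

    `landed` is a set of shas already present on the branch (an assumed
    input; the real script would ask git instead).
    """
    shas = sha_like_tokens(nomination)
    if not shas:
        # Not a sha nomination at all, e.g. a plain mesa-stable CC.
        return "not-a-sha-nomination"
    # Accept abbreviated shas by prefix match against the landed commits.
    if any(full.startswith(s) for s in shas for full in landed):
        return "pick"
    # Sha-like, but the referenced commit is not on this branch.
    return "omit"
```

With a `landed` set that lacks the referenced sha, the commit is reported
as "omit", which corresponds to case (b) above.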

CC: Juan A. Suarez <jasuarez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
5 years ago vc4: Hook up perf_debug() output to GL_ARB_debug_output as well.
Eric Anholt [Thu, 20 Dec 2018 05:42:36 +0000 (21:42 -0800)]
vc4: Hook up perf_debug() output to GL_ARB_debug_output as well.

This is the right channel to report these things, so that end-users don't
need to know each driver's custom debug options.

5 years ago vc4: Wire up core pipe_debug_callback
Rhys Kidd [Fri, 10 Aug 2018 16:40:09 +0000 (12:40 -0400)]
vc4: Wire up core pipe_debug_callback

This lets the driver use pipe_debug_message() for GL_ARB_debug_output.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years ago v3d: Hook up perf_debug() output to GL_ARB_debug output as well.
Eric Anholt [Thu, 20 Dec 2018 05:34:44 +0000 (21:34 -0800)]
v3d: Hook up perf_debug() output to GL_ARB_debug output as well.

This is the right channel to report these things, so that end-users don't
need to know each driver's custom debug options.

5 years ago v3d: Wire up core pipe_debug_callback
Rhys Kidd [Fri, 10 Aug 2018 16:40:10 +0000 (12:40 -0400)]
v3d: Wire up core pipe_debug_callback

This lets the driver use pipe_debug_message() for GL_ARB_debug_output.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years ago v3d: Drop shadow comparison state from shader variant key.
Eric Anholt [Thu, 20 Dec 2018 00:53:25 +0000 (16:53 -0800)]
v3d: Drop shadow comparison state from shader variant key.

The shadow state is now in the sampler.

5 years ago v3d: Fix simulator mode on i915 render nodes.
Eric Anholt [Thu, 20 Dec 2018 00:35:23 +0000 (16:35 -0800)]
v3d: Fix simulator mode on i915 render nodes.

i915 render nodes refuse the dumb ioctls, so the simulator would crash on
the original non-apitrace shader-db.  Replace them with direct i915 calls
if we detect that we're on one of their gem fds.

5 years ago docs/meson: Recommend not using CFLAGS and friends
Dylan Baker [Wed, 19 Dec 2018 21:27:27 +0000 (13:27 -0800)]
docs/meson: Recommend not using CFLAGS and friends

Because of the many caveats involved, using -Dc_args instead of CFLAGS
is recommended both by meson upstream and by us.

v2: - Fix typo

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years ago radv: enable shaderStorageImageMultisample feature on GFX8+
Samuel Pitoiset [Tue, 18 Dec 2018 08:11:30 +0000 (09:11 +0100)]
radv: enable shaderStorageImageMultisample feature on GFX8+

Untested on older chips.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: add support for FMASK expand
Samuel Pitoiset [Mon, 17 Dec 2018 19:59:33 +0000 (20:59 +0100)]
radv: add support for FMASK expand

Original patch by Dave Airlie.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: initialize FMASK for images in fully expanded mode
Samuel Pitoiset [Mon, 17 Dec 2018 20:23:42 +0000 (21:23 +0100)]
radv: initialize FMASK for images in fully expanded mode

The value depends on the number of samples.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac/nir: restrict fmask lookup to image load intrinsics
Samuel Pitoiset [Tue, 18 Dec 2018 14:21:56 +0000 (15:21 +0100)]
ac/nir: restrict fmask lookup to image load intrinsics

We don't ever want to do the fmask lookup on an atomic or
store; the fmask should have been decompressed if the
surface has been moved to IMAGE_LAYOUT.

Original patch by Dave Airlie.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago spirv: add support for SpvCapabilityStorageImageMultisample
Samuel Pitoiset [Mon, 17 Dec 2018 16:24:06 +0000 (17:24 +0100)]
spirv: add support for SpvCapabilityStorageImageMultisample

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: compute optimal VM alignment for imported buffers
Samuel Pitoiset [Thu, 20 Dec 2018 14:25:22 +0000 (15:25 +0100)]
radv: compute optimal VM alignment for imported buffers

This fixes GPU hangs on GFX9 with
dEQP-VK.memory.external_memory_host.bind_image_memory_and_render.with_zero_offset.*

Copied from RadeonSI.
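
The commit message does not spell out the computation. As a rough sketch
of one heuristic of this kind, assume the goal is to align large buffers
to the page-table fragment size and, on GFX9 and newer, to the largest
power of two not exceeding the buffer size. The function name and all
parameters below are hypothetical and are not taken from the patch.

```python
def optimal_vm_alignment(size, alignment, pte_fragment_size, is_gfx9_plus):
    """Return a VM alignment that favours faster address translation.

    Assumed heuristic, for illustration only.
    """
    vm_alignment = alignment

    # Buffers at least one PTE fragment large are aligned to the fragment.
    if size >= pte_fragment_size:
        vm_alignment = max(vm_alignment, pte_fragment_size)

    # On GFX9+ (assumption), also align to the most significant bit of size.
    if is_gfx9_plus and size > 0:
        msb_alignment = 1 << (size.bit_length() - 1)
        vm_alignment = max(vm_alignment, msb_alignment)

    return vm_alignment
```

For a 3 MiB buffer with a 64 KiB fragment size this sketch yields a 2 MiB
alignment on GFX9+ and 64 KiB otherwise.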

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: Work around non-renderable 128bpp compressed 3d textures on GFX9.
Bas Nieuwenhuizen [Mon, 17 Dec 2018 08:59:49 +0000 (09:59 +0100)]
radv: Work around non-renderable 128bpp compressed 3d textures on GFX9.

Exactly what the title says: the new addrlib does not allow the above with
certain dimensions that the CTS seems to hit. Work around it by not
allowing the app to render to it via compat with other 128bpp formats,
and do not render to it ourselves during copies.

Fixes: 776b9113656 "amd/addrlib: update Mesa's copy of addrlib"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago radv: fix subpass image transitions with multiviews
Samuel Pitoiset [Thu, 20 Dec 2018 11:03:16 +0000 (12:03 +0100)]
radv: fix subpass image transitions with multiviews

The driver needs to decompress all image layers if a fast
depth/color clear has been performed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: drop the amdgpu-skip-threshold=1 workaround for LLVM 8
Samuel Pitoiset [Wed, 19 Dec 2018 17:16:00 +0000 (18:16 +0100)]
radv: drop the amdgpu-skip-threshold=1 workaround for LLVM 8

This workaround was introduced by 135e4d434f6 to fix
DXVK GPU hangs with many games. It is no longer needed since
LLVM r345718.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac/nir: remove the bitfield_extract workaround for LLVM 8
Samuel Pitoiset [Wed, 19 Dec 2018 16:52:54 +0000 (17:52 +0100)]
ac/nir: remove the bitfield_extract workaround for LLVM 8

This workaround was introduced by 3d41757788a and
is no longer needed since LLVM r346422.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs
Iago Toral Quiroga [Wed, 19 Dec 2018 07:05:19 +0000 (08:05 +0100)]
intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs

The former expects to see SSA-only things, but the latter injects registers.

The assertions in the lowering were not seeing this because they asserted
on the bit_size values only, not on the is_ssa field, so add that assertion
too.
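
The ordering constraint can be shown with a toy model. The sketch below is
not NIR: the `Def` class and both pass functions are hypothetical
stand-ins that only mimic the precondition (SSA-only input) and the
side effect (values demoted to registers) described above.

```python
from dataclasses import dataclass


@dataclass
class Def:
    bit_size: int
    is_ssa: bool = True


def lower_bool_to_int32(defs):
    """Toy bool lowering: widen 1-bit values, but only accept SSA input."""
    for d in defs:
        # The extra assertion: checking bit_size alone would miss registers.
        assert d.is_ssa, "bool lowering expects SSA-only input"
        if d.bit_size == 1:
            d.bit_size = 32


def lower_locals_to_regs(defs):
    """Toy stand-in: demote every value to a register."""
    for d in defs:
        d.is_ssa = False
```

Running the bool lowering first succeeds; running it after the register
pass trips the assertion, which is the situation the reordering avoids.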

Fixes: 11dc1307794e "nir: Add a bool to int32 lowering pass"
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years ago st/mesa: remove sampler associated with buffer texture in pbo logic
Ilia Mirkin [Sat, 15 Dec 2018 01:06:54 +0000 (20:06 -0500)]
st/mesa: remove sampler associated with buffer texture in pbo logic

A long time ago, when this was first implemented, not having a sampler
bound would cause problems on Fermi. I didn't work out the reasons, but
the solution was simple -- just put the samplers back in.

Since then, regular texturing paths appear to have lost their associated
samplers which required a fuller investigation and fix in nouveau. Now
that this is done, this code should no longer need a sampler state for
fetching texels from a buffer texture.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago gallivm: use llvm jit code for decoding s3tc
Roland Scheidegger [Wed, 19 Dec 2018 03:37:36 +0000 (04:37 +0100)]
gallivm: use llvm jit code for decoding s3tc

This is (much) faster than using the util fallback.
(Note that there are two methods here: one would use a cache, similar to
the existing code (although the cache was disabled), except the block
decode is done with jit code; the other directly decodes the required
pixels. For now don't use the cache (being direct-mapped is suboptimal,
but it's difficult to come up with something better which doesn't have
too much overhead).)
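
To make "directly decodes the required pixels" concrete, here is a plain
Python reference for fetching a single texel from one DXT1 block. It is
only an illustration of the block format, not the gallivm JIT code; the
function names are made up, and the integer rounding of the interpolated
colors is one common choice that may differ between implementations.

```python
import struct


def expand565(c):
    """Expand a packed RGB565 color to 8 bits per channel."""
    r5, g6, b5 = (c >> 11) & 0x1F, (c >> 5) & 0x3F, c & 0x1F
    return ((r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2))


def fetch_dxt1_texel(block, x, y):
    """Decode texel (x, y), each in 0..3, of an 8-byte DXT1 block to RGBA8."""
    # Two little-endian RGB565 endpoints followed by 16 two-bit selectors.
    c0, c1, bits = struct.unpack("<HHI", block)
    code = (bits >> (2 * (4 * y + x))) & 0x3
    rgb0, rgb1 = expand565(c0), expand565(c1)

    if code == 0:
        return rgb0 + (255,)
    if code == 1:
        return rgb1 + (255,)
    if c0 > c1:
        # Four-color mode: codes 2 and 3 are 2/3 and 1/3 blends.
        w0, w1 = (2, 1) if code == 2 else (1, 2)
        return tuple((w0 * a + w1 * b) // 3 for a, b in zip(rgb0, rgb1)) + (255,)
    if code == 2:
        # Three-color mode: code 2 is the midpoint.
        return tuple((a + b) // 2 for a, b in zip(rgb0, rgb1)) + (255,)
    # Three-color mode, code 3: transparent black in the RGBA variant
    # (the RGB-only variant would return opaque black instead).
    return (0, 0, 0, 0)
```

Fetching one texel this way touches only the two endpoints and a single
selector, so no decoded-block cache is needed.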

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
5 years ago radv/query: Use 1-bit booleans in query shaders
Jason Ekstrand [Wed, 19 Dec 2018 19:40:20 +0000 (13:40 -0600)]
radv/query: Use 1-bit booleans in query shaders

Fixes: 44227453ec03f "nir: Switch to using 1-bit Booleans for almost..."
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv/query: Add a nir_test_flag helper
Jason Ekstrand [Wed, 19 Dec 2018 19:34:02 +0000 (13:34 -0600)]
radv/query: Add a nir_test_flag helper

This is little more than an iadd_imm right now but it will help in the
next commit where we refactor things further.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago freedreno/ir3: Handle GL_NONE in get_num_components_for_glformat()
Eduardo Lima Mitev [Wed, 19 Dec 2018 08:18:04 +0000 (09:18 +0100)]
freedreno/ir3: Handle GL_NONE in get_num_components_for_glformat()

An earlier patch that introduced the function failed to handle the case
where an image format layout qualifier is not specified, which is allowed
on desktop GL profiles. In these cases, nir_variable's image format is
GL_NONE, and we don't need to print a debug message for those.
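
The behaviour described above can be sketched as a small lookup with an
explicit GL_NONE case. This is a Python illustration rather than the ir3
code: the function name, the `log` parameter, the short format table and
the choice of 4 components as the fallback are assumptions made for the
example. The enum values are the standard OpenGL ones.

```python
GL_NONE = 0
GL_R32F = 0x822E
GL_RG32F = 0x8230
GL_RGBA32F = 0x8814

# Deliberately tiny table; a real driver handles many more formats.
_COMPONENTS = {GL_R32F: 1, GL_RG32F: 2, GL_RGBA32F: 4}


def num_components_for_glformat(fmt, log=print):
    if fmt == GL_NONE:
        # No format layout qualifier was given: legal on desktop GL,
        # so fall back quietly (4 components is an assumed safe default).
        return 4
    if fmt in _COMPONENTS:
        return _COMPONENTS[fmt]
    # Genuinely unexpected format: keep the debug message for this case.
    log(f"unhandled GL format: 0x{fmt:04x}")
    return 4
```

Only the unknown-format path emits a message; the GL_NONE path stays
silent.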

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years ago docs: Add an encouraging note about providing reviews and acks.
Eric Anholt [Wed, 12 Dec 2018 19:11:07 +0000 (11:11 -0800)]
docs: Add an encouraging note about providing reviews and acks.

Across several projects I've seen new contributors say "I wasn't sure if I
should provide a review tag since I'm not really an expert in this area."
Everyone I know already applies some implicit weighting to reviews from
different people, so encourage participation.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years ago docs: Add a note that MRs should still include any r-b or a-b tags.
Eric Anholt [Wed, 12 Dec 2018 19:08:00 +0000 (11:08 -0800)]
docs: Add a note that MRs should still include any r-b or a-b tags.

v2: Mention "Tested-by" too

Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years ago v3d: Load and store aligned utiles all at once.
Eric Anholt [Mon, 17 Dec 2018 20:54:42 +0000 (12:54 -0800)]
v3d: Load and store aligned utiles all at once.

This calls the expensive uif offset function once per utile, but it still
gets us a 212.218% +/- 2.41216% (n=10) win on 1024x1024 glTexImage over
calling it on each pixel.
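
The idea of paying for the address computation once per utile instead of
once per pixel can be sketched as follows. This is a toy Python model: the
4x4 one-byte-per-pixel utile, the simple row-major tile layout and the
function names are assumptions, and `tile_offset` merely stands in for the
real (more expensive) UIF offset function.

```python
UTILE_W, UTILE_H = 4, 4  # hypothetical utile size, one byte per pixel


def tile_offset(tx, ty, tiles_per_row):
    """Stand-in for the expensive offset function; counts its calls."""
    tile_offset.calls += 1
    return (ty * tiles_per_row + tx) * UTILE_W * UTILE_H


tile_offset.calls = 0


def store_per_pixel(dst, src, width, height):
    """Baseline: recompute the utile offset for every pixel."""
    tiles_per_row = width // UTILE_W
    for y in range(height):
        for x in range(width):
            base = tile_offset(x // UTILE_W, y // UTILE_H, tiles_per_row)
            dst[base + (y % UTILE_H) * UTILE_W + (x % UTILE_W)] = src[y * width + x]


def store_per_utile(dst, src, width, height):
    """Aligned case: one offset lookup per utile, then copy whole rows."""
    tiles_per_row = width // UTILE_W
    for ty in range(height // UTILE_H):
        for tx in range(tiles_per_row):
            base = tile_offset(tx, ty, tiles_per_row)
            for row in range(UTILE_H):
                s = (ty * UTILE_H + row) * width + tx * UTILE_W
                d = base + row * UTILE_W
                dst[d:d + UTILE_W] = src[s:s + UTILE_W]
```

Both functions produce the same tiled buffer for utile-aligned sizes; the
second simply calls the offset function once per utile rather than once
per pixel.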

5 years ago v3d: Add a fallthrough path for utile load/store of 32 byte lines.
Eric Anholt [Mon, 17 Dec 2018 20:20:41 +0000 (12:20 -0800)]
v3d: Add a fallthrough path for utile load/store of 32 byte lines.

Now that V3D has 8 byte per pixel formats exposed, we've got stride==32
utiles to load and store.  Just handle them through the non-NEON paths for
now.

5 years ago vc4: Move the utile load/store functions to a header for reuse by v3d.
Eric Anholt [Mon, 17 Dec 2018 19:10:11 +0000 (11:10 -0800)]
vc4: Move the utile load/store functions to a header for reuse by v3d.

These implementations of whole-utile load/stores would be the same for
v3d, though the layout of blocks of utiles has changed.

5 years ago v3d: Implement texture_subdata to reduce teximage upload copies.
Eric Anholt [Tue, 18 Dec 2018 22:50:57 +0000 (14:50 -0800)]
v3d: Implement texture_subdata to reduce teximage upload copies.

This lets us store the non-PBO glTexImage data directly into the tiled
image without making an extra untiled memcpy for the gallium transfer.
Improves 1024x1024 TexImage perf by ~19%, mostly from not thrashing around
in the kernel mapping and unmapping the transfer's temporary area.

5 years ago v3d: Remove dead prototypes for load/store utile functions.
Eric Anholt [Mon, 17 Dec 2018 20:50:08 +0000 (12:50 -0800)]
v3d: Remove dead prototypes for load/store utile functions.