mesa.git
5 years ago travis: meson: port gallium build combinations over
Emil Velikov [Thu, 13 Dec 2018 01:34:59 +0000 (01:34 +0000)]
travis: meson: port gallium build combinations over

This commit adds a number of build combinations:

 - Gallium Drivers {SWR, RadeonSI, Others}
Each one has different LLVM requirements. Building SWR alone is twice
as slow as all other drivers combined.

 - Gallium ST Clover LLVM {5,6,7}
Because the C++ API changes all the time. As above, building Clover
takes as much time as building all other STs combined.

 - Gallium ST Others
Nouveau is used instead of i915g, since meson has explicit target
tracking: a configure error is thrown if we use i915g with, say, va,
vdpau or others.

Note: LLVM prior to 5.0 is intentionally dropped. If needed we can add
that later.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago travis: meson: add explicit handling to gallium ST
Emil Velikov [Wed, 12 Dec 2018 13:52:20 +0000 (13:52 +0000)]
travis: meson: add explicit handling to gallium ST

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago travis: meson: explicitly control the DRI loaders
Emil Velikov [Wed, 12 Dec 2018 13:42:36 +0000 (13:42 +0000)]
travis: meson: explicitly control the DRI loaders

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago travis: meson: add unwind handling
Emil Velikov [Wed, 12 Dec 2018 13:33:14 +0000 (13:33 +0000)]
travis: meson: add unwind handling

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
5 years ago travis: meson: use FOO_DRIVERS directly
Emil Velikov [Wed, 12 Dec 2018 13:18:54 +0000 (13:18 +0000)]
travis: meson: use FOO_DRIVERS directly

It makes for a shorter MESON_OPTIONS and cleaner handling.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago travis: meson: enable unit tests
Dylan Baker [Tue, 11 Dec 2018 18:34:51 +0000 (10:34 -0800)]
travis: meson: enable unit tests

v2: [Emil] pass the argument directly to meson

Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
5 years ago travis: Don't try to read libdrm out of configure.ac
Dylan Baker [Tue, 11 Dec 2018 19:09:21 +0000 (11:09 -0800)]
travis: Don't try to read libdrm out of configure.ac

Since we're going to delete it shortly

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years ago travis: meson: use native files to override llvm-config
Dylan Baker [Tue, 11 Dec 2018 18:40:25 +0000 (10:40 -0800)]
travis: meson: use native files to override llvm-config

This is the supported way to do this, and should be more robust and
reliable.

v2: [Emil]
 - enable backslash escapes
 - don't hardcode the path
 - pass the argument directly to meson

Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago travis: printout llvm-config --version
Emil Velikov [Thu, 13 Dec 2018 10:38:20 +0000 (10:38 +0000)]
travis: printout llvm-config --version

Provides quick and easy feedback.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago travis: meson: print the configured state
Emil Velikov [Wed, 12 Dec 2018 17:43:07 +0000 (17:43 +0000)]
travis: meson: print the configured state

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago travis: flip to distro xenial, drop sudo false
Emil Velikov [Thu, 13 Dec 2018 11:20:41 +0000 (11:20 +0000)]
travis: flip to distro xenial, drop sudo false

The latter is the default these days and Travis will be removing sudo
soonish.

Flipping to xenial allows us to remove a bunch of hacks we have. Plus
it prevents us from adding new ones to work around what seems like a
gcc/binutils bug. For example (from the upcoming meson build):

FAILED: ccache c++  -o src/gallium/targets/pipe-loader/pipe_r600.so ...
  ... src/util/libmesa_util.a ... /usr/lib/x86_64-linux-gnu/libz.so ...

src/util/libmesa_util.a(disk_cache.c.o): In function `deflate_and_write_to_disk':
_build/../src/util/disk_cache.c:746: undefined reference to `deflateInit_'
_build/../src/util/disk_cache.c:765: undefined reference to `deflate'
...

As we can see, even though libz.so is explicitly passed after the
object that requires it - the linker still fails to see the symbols.
Avoid all those situations - flip the switch.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago configure: add CXX11_CXXFLAGS to LLVM_CXXFLAGS
Emil Velikov [Thu, 13 Dec 2018 11:56:40 +0000 (11:56 +0000)]
configure: add CXX11_CXXFLAGS to LLVM_CXXFLAGS

Seemingly with LLVM7 and GCC 5.0, the former won't properly advertise
-std=c++11 and the latter will choke.

Add this temporary workaround, otherwise we'll get errors like:

In file included from /usr/include/c++/5/type_traits:35:0,
                 from /usr/lib/llvm-7/include/llvm/Support/type_traits.h:18,
                 from /usr/lib/llvm-7/include/llvm/ADT/Optional.h:22,
                 from /usr/lib/llvm-7/include/llvm/ADT/STLExtras.h:20,
                 from /usr/lib/llvm-7/include/llvm/ADT/StringRef.h:13,
                 from /usr/lib/llvm-7/include/llvm/Target/TargetMachine.h:17,
                 from ../../../src/amd/common/ac_llvm_helper.cpp:36:
/usr/include/c++/5/bits/c++0x_warning.h:32:2: error: #error This file requires compiler and library support for the ISO C++ 2011 standard. This support must be enabled with the -std=c++11 or -std=gnu++11 compiler options.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago glx/test: meson: assorted include fixes
Emil Velikov [Wed, 12 Dec 2018 19:24:14 +0000 (19:24 +0000)]
glx/test: meson: assorted include fixes

Swap '..' with the symbolic inc_glx and add glproto as dependency. That
will pull the correct include, effectively fixing the tests on macOS.

Fixes: a47c525f328 ("meson: build glx")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago glx: meson: wire up the dispatch-index-check test
Emil Velikov [Wed, 12 Dec 2018 19:07:52 +0000 (19:07 +0000)]
glx: meson: wire up the dispatch-index-check test

Accidentally dropped with an earlier commit.

Fixes: 4ccb9816737 ("meson: Use consistent style for tests")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago glx: meson: drop includes from a link-only library
Emil Velikov [Wed, 12 Dec 2018 17:55:08 +0000 (17:55 +0000)]
glx: meson: drop includes from a link-only library

When producing the final libGL.so/libGLX_mesa.so we only link the local
static helper lib (libglx). Thus there's no reason for the includes.

Fixes: a47c525f328 ("meson: build glx")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago TODO: glx: meson: build dri based glx tests, only with -Dglx=dri
Emil Velikov [Wed, 12 Dec 2018 17:47:36 +0000 (17:47 +0000)]
TODO: glx: meson: build dri based glx tests, only with -Dglx=dri

The library itself (libGL) is only built when -Dglx=dri, yet its
accompanying tests are built even with -Dglx=xlib.

Adjust the guards, so we don't build the tests when they are not
applicable.

v2:
 - Reword commit message (Dylan)
 - Drop build_by_default hunk (Dylan)

Fixes: a47c525f328 ("meson: build glx")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago pipe-loader: meson: reference correct library
Emil Velikov [Thu, 13 Dec 2018 04:10:50 +0000 (04:10 +0000)]
pipe-loader: meson: reference correct library

The library is called libgalliumvl_stub - note singular.

Fixes: 42ea0631f10 ("meson: build clover")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago meson: don't require glx/egl/gbm with gallium drivers
Emil Velikov [Thu, 13 Dec 2018 03:54:03 +0000 (03:54 +0000)]
meson: don't require glx/egl/gbm with gallium drivers

The gallium drivers do not require a DRI loader. Drop the artificial
and unnecessary restriction.

Fixes: af9d276134d ("meson: build libmesa_gallium")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years ago bin/get-pick-list.sh: warn when commit lists invalid sha
Emil Velikov [Mon, 17 Dec 2018 16:25:40 +0000 (16:25 +0000)]
bin/get-pick-list.sh: warn when commit lists invalid sha

We had cases where people would list an old/invalid sha in the commit.
Add a trivial checker to catch those and throw a warning.

CC: Juan A. Suarez <jasuarez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
5 years ago bin/get-pick-list.sh: rework handing of sha nominations
Emil Velikov [Mon, 17 Dec 2018 15:44:25 +0000 (15:44 +0000)]
bin/get-pick-list.sh: rework handing of sha nominations

Currently our is_sha_nomination does:
 - folds any whitespace, attempting to extract sha-like information
 - checks that at least one of the shas has landed

Split it in two and do sha-like validation first.

This way, commits with mesa-stable and sha nominations will feature the
fixes/revert/etc instead of stable (a) or will be omitted if not
applicable for the respective branch (b).

Misc examples from 18.3

(a)
-[   stable ] 5bc509363b6 glx: make xf86vidmode mandatory for direct rendering
+[    fixes ] 5bc509363b6 glx: make xf86vidmode mandatory for direct rendering

(b)
-[   stable ] 9a7b3199037 anv/query: flush render target before copying results

CC: Juan A. Suarez <jasuarez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
5 years ago vc4: Hook up perf_debug() output to GL_ARB_debug_output as well.
Eric Anholt [Thu, 20 Dec 2018 05:42:36 +0000 (21:42 -0800)]
vc4: Hook up perf_debug() output to GL_ARB_debug_output as well.

This is the right channel to report these things, so that end-users don't
need to know each driver's custom debug options.

5 years ago vc4: Wire up core pipe_debug_callback
Rhys Kidd [Fri, 10 Aug 2018 16:40:09 +0000 (12:40 -0400)]
vc4: Wire up core pipe_debug_callback

This lets the driver use pipe_debug_message() for GL_ARB_debug_output.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years ago v3d: Hook up perf_debug() output to GL_ARB_debug output as well.
Eric Anholt [Thu, 20 Dec 2018 05:34:44 +0000 (21:34 -0800)]
v3d: Hook up perf_debug() output to GL_ARB_debug output as well.

This is the right channel to report these things, so that end-users don't
need to know each driver's custom debug options.

5 years ago v3d: Wire up core pipe_debug_callback
Rhys Kidd [Fri, 10 Aug 2018 16:40:10 +0000 (12:40 -0400)]
v3d: Wire up core pipe_debug_callback

This lets the driver use pipe_debug_message() for GL_ARB_debug_output.

Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years ago v3d: Drop shadow comparison state from shader variant key.
Eric Anholt [Thu, 20 Dec 2018 00:53:25 +0000 (16:53 -0800)]
v3d: Drop shadow comparison state from shader variant key.

The shadow state is now in the sampler.

5 years ago v3d: Fix simulator mode on i915 render nodes.
Eric Anholt [Thu, 20 Dec 2018 00:35:23 +0000 (16:35 -0800)]
v3d: Fix simulator mode on i915 render nodes.

i915 render nodes refuse the dumb ioctls, so the simulator would crash on
the original non-apitrace shader-db.  Replace them with direct i915 calls
if we detect that we're on one of their gem fds.

5 years ago docs/meson: Recommend not using CFLAGS and friends
Dylan Baker [Wed, 19 Dec 2018 21:27:27 +0000 (13:27 -0800)]
docs/meson: Recommend not using CFLAGS and friends

Because of the many caveats involved, using -Dc_args instead of CFLAGS
is recommended both by meson upstream and by us.

v2: - Fix typo

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (v1)
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years ago radv: enable shaderStorageImageMultisample feature on GFX8+
Samuel Pitoiset [Tue, 18 Dec 2018 08:11:30 +0000 (09:11 +0100)]
radv: enable shaderStorageImageMultisample feature on GFX8+

Untested on older chips.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: add support for FMASK expand
Samuel Pitoiset [Mon, 17 Dec 2018 19:59:33 +0000 (20:59 +0100)]
radv: add support for FMASK expand

Original patch by Dave Airlie.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: initialize FMASK for images in fully expanded mode
Samuel Pitoiset [Mon, 17 Dec 2018 20:23:42 +0000 (21:23 +0100)]
radv: initialize FMASK for images in fully expanded mode

The value depends on the number of samples.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac/nir: restrict fmask lookup to image load intrinsics
Samuel Pitoiset [Tue, 18 Dec 2018 14:21:56 +0000 (15:21 +0100)]
ac/nir: restrict fmask lookup to image load intrinsics

We don't ever want to do the fmask lookup on an atomic or
store; the fmask should have been decompressed if the
surface has been moved to IMAGE_LAYOUT.

Original patch by Dave Airlie.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago spirv: add support for SpvCapabilityStorageImageMultisample
Samuel Pitoiset [Mon, 17 Dec 2018 16:24:06 +0000 (17:24 +0100)]
spirv: add support for SpvCapabilityStorageImageMultisample

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: compute optimal VM alignment for imported buffers
Samuel Pitoiset [Thu, 20 Dec 2018 14:25:22 +0000 (15:25 +0100)]
radv: compute optimal VM alignment for imported buffers

This fixes GPU hangs on GFX9 with
dEQP-VK.memory.external_memory_host.bind_image_memory_and_render.with_zero_offset.*

Copied from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: Work around non-renderable 128bpp compressed 3d textures on GFX9.
Bas Nieuwenhuizen [Mon, 17 Dec 2018 08:59:49 +0000 (09:59 +0100)]
radv: Work around non-renderable 128bpp compressed 3d textures on GFX9.

Exactly what the title says: the new addrlib does not allow the above with
certain dimensions that the CTS seems to hit. Work around it by not
allowing the app to render to it via compat with other 128bpp formats
and do not render to it ourselves during copies.

Fixes: 776b9113656 "amd/addrlib: update Mesa's copy of addrlib"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years ago radv: fix subpass image transitions with multiviews
Samuel Pitoiset [Thu, 20 Dec 2018 11:03:16 +0000 (12:03 +0100)]
radv: fix subpass image transitions with multiviews

The driver needs to decompress all image layers if a fast
depth/color clear has been performed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv: drop the amdgpu-skip-threshold=1 workaround for LLVM 8
Samuel Pitoiset [Wed, 19 Dec 2018 17:16:00 +0000 (18:16 +0100)]
radv: drop the amdgpu-skip-threshold=1 workaround for LLVM 8

This workaround has been introduced by 135e4d434f6 for fixing
DXVK GPU hangs with many games. It is no longer needed since
LLVM r345718.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago ac/nir: remove the bitfield_extract workaround for LLVM 8
Samuel Pitoiset [Wed, 19 Dec 2018 16:52:54 +0000 (17:52 +0100)]
ac/nir: remove the bitfield_extract workaround for LLVM 8

This workaround has been introduced by 3d41757788a and it
is no longer needed since LLVM r346422.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs
Iago Toral Quiroga [Wed, 19 Dec 2018 07:05:19 +0000 (08:05 +0100)]
intel/compiler: move nir_lower_bool_to_int32 before nir_lower_locals_to_regs

The former expects to see SSA-only things, but the latter injects registers.

The assertions in the lowering were not seeing this because they asserted
on the bit_size values only, not on the is_ssa field, so add that assertion
too.

Fixes: 11dc1307794e "nir: Add a bool to int32 lowering pass"
CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years ago st/mesa: remove sampler associated with buffer texture in pbo logic
Ilia Mirkin [Sat, 15 Dec 2018 01:06:54 +0000 (20:06 -0500)]
st/mesa: remove sampler associated with buffer texture in pbo logic

A long time ago, when this was first implemented, not having a sampler
bound would cause problems on Fermi. I didn't work out the reasons, but
the solution was simple -- just put the samplers back in.

Since then, regular texturing paths appear to have lost their associated
samplers which required a fuller investigation and fix in nouveau. Now
that this is done, this code should no longer need a sampler state for
fetching texels from a buffer texture.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago gallivm: use llvm jit code for decoding s3tc
Roland Scheidegger [Wed, 19 Dec 2018 03:37:36 +0000 (04:37 +0100)]
gallivm: use llvm jit code for decoding s3tc

This is (much) faster than using the util fallback.
(Note that there are two methods here: one would use a cache, similar to
the existing code (although the cache was disabled), except the block
decode is done with jit code; the other directly decodes the required
pixels. For now don't use the cache (being direct-mapped is suboptimal,
but it's difficult to come up with something better which doesn't have
too much overhead).)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
5 years ago radv/query: Use 1-bit booleans in query shaders
Jason Ekstrand [Wed, 19 Dec 2018 19:40:20 +0000 (13:40 -0600)]
radv/query: Use 1-bit booleans in query shaders

Fixes: 44227453ec03f "nir: Switch to using 1-bit Booleans for almost..."
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago radv/query: Add a nir_test_flag helper
Jason Ekstrand [Wed, 19 Dec 2018 19:34:02 +0000 (13:34 -0600)]
radv/query: Add a nir_test_flag helper

This is little more than an iadd_imm right now but it will help in the
next commit where we refactor things further.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Tested-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years ago freedreno/ir3: Handle GL_NONE in get_num_components_for_glformat()
Eduardo Lima Mitev [Wed, 19 Dec 2018 08:18:04 +0000 (09:18 +0100)]
freedreno/ir3: Handle GL_NONE in get_num_components_for_glformat()

An earlier patch that introduced the function failed to handle the case
where an image format layout qualifier is not specified, which is allowed
on desktop GL profiles. In these cases, nir_variable's image format is
GL_NONE, and we don't need to print a debug message for those.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years ago docs: Add an encouraging note about providing reviews and acks.
Eric Anholt [Wed, 12 Dec 2018 19:11:07 +0000 (11:11 -0800)]
docs: Add an encouraging note about providing reviews and acks.

Across several projects I've seen new contributors say "I wasn't sure if I
should provide a review tag since I'm not really an expert in this area."
Everyone I know already applies some implicit weighting to reviews from
different people, so encourage participation.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years ago docs: Add a note that MRs should still include any r-b or a-b tags.
Eric Anholt [Wed, 12 Dec 2018 19:08:00 +0000 (11:08 -0800)]
docs: Add a note that MRs should still include any r-b or a-b tags.

v2: Mention "Tested-by" too

Reviewed-by: Dylan Baker <dylan@pnwbakers.com> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years ago v3d: Load and store aligned utiles all at once.
Eric Anholt [Mon, 17 Dec 2018 20:54:42 +0000 (12:54 -0800)]
v3d: Load and store aligned utiles all at once.

This calls the expensive uif offset function once per utile, but it still
gets us a 212.218% +/- 2.41216% (n=10) win on 1024x1024 glTexImage over
calling it on each pixel.

5 years ago v3d: Add a fallthrough path for utile load/store of 32 byte lines.
Eric Anholt [Mon, 17 Dec 2018 20:20:41 +0000 (12:20 -0800)]
v3d: Add a fallthrough path for utile load/store of 32 byte lines.

Now that V3D has 8 byte per pixel formats exposed, we've got stride==32
utiles to load and store.  Just handle them through the non-NEON paths for
now.

5 years ago vc4: Move the utile load/store functions to a header for reuse by v3d.
Eric Anholt [Mon, 17 Dec 2018 19:10:11 +0000 (11:10 -0800)]
vc4: Move the utile load/store functions to a header for reuse by v3d.

These implementations of whole-utile load/stores would be the same for
v3d, though the layout of blocks of utiles has changed.

5 years ago v3d: Implement texture_subdata to reduce teximage upload copies.
Eric Anholt [Tue, 18 Dec 2018 22:50:57 +0000 (14:50 -0800)]
v3d: Implement texture_subdata to reduce teximage upload copies.

This lets us store the non-PBO glTexImage data directly into the tiled
image without making an extra untiled memcpy for the gallium transfer.
Improves 1024x1024 TexImage perf by ~19%, mostly from not thrashing around
in the kernel mapping and unmapping the transfer's temporary area.

5 years ago v3d: Remove dead prototypes for load/store utile functions.
Eric Anholt [Mon, 17 Dec 2018 20:50:08 +0000 (12:50 -0800)]
v3d: Remove dead prototypes for load/store utile functions.

5 years ago v3d: Don't try to create shadow tiled temporaries for 1D textures.
Eric Anholt [Mon, 17 Dec 2018 20:57:38 +0000 (12:57 -0800)]
v3d: Don't try to create shadow tiled temporaries for 1D textures.

They're raster order anyway, so we'd hit an assertion failure as well as
waste bandwidth.

Fixes: 6ad9e8690d14 ("v3d: Add support for texturing from linear.")
5 years ago v3d: Fix check for TFU job completion in the simulator.
Eric Anholt [Wed, 19 Dec 2018 00:17:26 +0000 (16:17 -0800)]
v3d: Fix check for TFU job completion in the simulator.

We're waiting for the jobs-completed count to increment (with wrapping),
not to reach its starting state.  This mostly ended up working out because
the next v3d_hw_tick() for a submit CL would end up doing the TFU
operation first, but it did fail when a blit was used for glReadPixels()
at the end of a test.
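
The difference between the two checks can be illustrated with a small
Python sketch. The counter width and the function name are assumptions
for the example only; they are not taken from the simulator code.

```python
COUNTER_BITS = 8  # assumed width, for illustration only
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def tfu_job_done(start_count, current_count):
    """True once the counter has advanced past the value sampled at submit.

    The subtraction is masked so that a wrap from the maximum value back
    to zero still counts as an increment.
    """
    return ((current_count - start_count) & COUNTER_MASK) != 0
```

Comparing `current_count == start_count` instead would be satisfied
immediately after submit, before the job has run at all.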

Fixes: ee0549ff9ab3 ("v3d: Add the V3D TFU submit interface to the simulator.")
5 years ago v3d: Put the dst bo first in the list of BOs for TFU calls.
Eric Anholt [Wed, 19 Dec 2018 17:29:26 +0000 (09:29 -0800)]
v3d: Put the dst bo first in the list of BOs for TFU calls.

In the UAPI, the first BO is the destination, and the one the kernel
should do an exclusive reservation on.  Currently we only do exclusive
reservations, anyway.  However, in the simulator path I was only copying
back the "destination" BO (actually src in this case), and this caused
regressions once I fixed the simulator to actually complete TFU before
returning (since otherwise, the TFU op would happen at the start of the
next CL submit and the draw would get the right contents).

Fixes: 976ea90bdca2 ("v3d: Add support for using the TFU to do some blits.")
5 years ago nir: properly find the entry to keep in copy_prop_vars
Caio Marcelo de Oliveira Filho [Sat, 15 Dec 2018 00:10:32 +0000 (16:10 -0800)]
nir: properly find the entry to keep in copy_prop_vars

When copy propagation handles a store/copy, it iterates the current
copy entries to remove aliases, but keeps the "equal" entry (if it
exists) to be updated.

The removal step may swap the entries around (to ensure there are no
holes), invalidating previous iteration pointers.  The bug was saving
such a pointer to use later.  Change the code to first perform the
removals and then find the right remaining entry.

This was causing updates to be lost since they were being made to an
entry that was not part of the current copies.
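
A minimal Python sketch of the fixed ordering, using list indices in place
of C pointers. The entry layout (`[dst, src]` pairs), the `aliases`
predicate and the function names are hypothetical and only serve to show
why the lookup has to happen after the removals.

```python
def aliases(a, b):
    """Hypothetical alias test: one name is a dotted prefix of the other."""
    return a == b or a.startswith(b + ".") or b.startswith(a + ".")


def swap_remove(entries, i):
    """Remove entries[i] without leaving a hole: move the last entry in."""
    entries[i] = entries[-1]
    entries.pop()


def store(entries, dst, value):
    """Record that `dst` now holds `value`."""
    # Pass 1: drop aliasing entries. swap_remove reorders the list, so any
    # index (or pointer) saved during this loop may go stale.
    i = 0
    while i < len(entries):
        if entries[i][0] != dst and aliases(entries[i][0], dst):
            swap_remove(entries, i)  # re-check slot i, it has a new entry
        else:
            i += 1
    # Pass 2: only now look up the "equal" entry and update it.
    for entry in entries:
        if entry[0] == dst:
            entry[1] = value
            return entry
    entries.append([dst, value])
    return entries[-1]
```

Saving the position of the equal entry during the first pass would fail
here whenever that entry was the one moved into a freed slot.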

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108624
Fixes: b3c61469255 "nir: Copy propagation between blocks"
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years ago winsys/amdgpu: Pull in LLVM CFLAGS
Michel Dänzer [Wed, 19 Dec 2018 14:58:23 +0000 (15:58 +0100)]
winsys/amdgpu: Pull in LLVM CFLAGS

Fixes build failure if the LLVM headers aren't in a standard include
directory.

Fixes: ec22dd34c88f "radeonsi: move SI_FORCE_FAMILY functionality to winsys"
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
5 years ago nir: properly clear the entry sources in copy_prop_vars
Caio Marcelo de Oliveira Filho [Sat, 15 Dec 2018 06:19:24 +0000 (22:19 -0800)]
nir: properly clear the entry sources in copy_prop_vars

When updating a copy entry source value from a "non-SSA" (the data
comes from a copy instruction) to a "SSA" (the data or parts of it come
from SSA values), it was possible to hold invalid data in ssa[0]
depending on the writemask.  Because of the union, ssa[0] could contain a
pointer to a nir_deref_instr left over from previous non-SSA usage.

Change the code to clean up the array before use, to avoid leaving
invalid data around.
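
The stale-union effect can be reproduced with a small ctypes sketch. The
`EntrySrc` layout and the `write_ssa` helper are hypothetical stand-ins,
not the actual NIR structures; plain 64-bit integers stand in for the
pointers.

```python
import ctypes


class EntrySrc(ctypes.Union):
    # Hypothetical stand-in for the entry source: either one deref
    # "pointer" or up to four SSA "pointers", sharing the same storage.
    _fields_ = [
        ("deref", ctypes.c_uint64),
        ("ssa", ctypes.c_uint64 * 4),
    ]


def write_ssa(src, writemask, value, clear_first):
    """Store `value` into the components selected by `writemask`."""
    if clear_first:
        # The fix: wipe the whole array before switching to SSA usage.
        for i in range(4):
            src.ssa[i] = 0
    for i in range(4):
        if writemask & (1 << i):
            src.ssa[i] = value
```

With a writemask that skips component 0 and `clear_first=False`, `ssa[0]`
still reads back whatever was last stored through `deref`.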

Fixes: 62332d139c8 "nir: Add a local variable-based copy propagation pass"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years ago docs: format code blocks a bit nicely
Eric Engestrom [Thu, 29 Nov 2018 13:15:48 +0000 (13:15 +0000)]
docs: format code blocks a bit nicely

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
5 years ago docs: add meson cross compilation instructions
Eric Engestrom [Thu, 29 Nov 2018 13:16:42 +0000 (13:16 +0000)]
docs: add meson cross compilation instructions

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years ago virgl: move resource creation / import / destruction to common code
Gurchetan Singh [Mon, 3 Dec 2018 23:16:43 +0000 (15:16 -0800)]
virgl: move resource creation / import / destruction to common code

We can remove some duplicated code.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
5 years ago virgl: move resource metadata into base resource
Gurchetan Singh [Sat, 1 Dec 2018 04:45:44 +0000 (20:45 -0800)]
virgl: move resource metadata into base resource

A resource is just a buffer with some metadata.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
5 years ago virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT
Gurchetan Singh [Mon, 3 Dec 2018 16:50:48 +0000 (08:50 -0800)]
virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT

Previously, we ignored the glUnmap(..) operation and
flushed before flushing the cbuf.  Now, let's just flush
the data when we unmap.

Neither method is optimal, for example:

glMapBufferRange(.., 0, 100, GL_MAP_FLUSH_EXPLICIT_BIT)
glFlushMappedBufferRange(.., 25, 30)
glFlushMappedBufferRange(.., 65, 70)

We'll end up flushing 25 --> 70.  Maybe we can fix this later.
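
As a rough Python illustration of why the result is 25 --> 70: the
explicit flushes collapse into one bounding span. The helper name is made
up for this sketch, and the two numbers of each flush are read as start
and end, the way the example above reads them.

```python
def bounding_range(flushes):
    """Collapse explicit flush ranges into a single (start, end) span."""
    start = min(s for s, _ in flushes)
    end = max(e for _, e in flushes)
    return start, end


def wasted_bytes(flushes):
    """Bytes inside the bounding span that no flush asked for.

    Assumes the individual ranges do not overlap.
    """
    start, end = bounding_range(flushes)
    requested = sum(e - s for s, e in flushes)
    return (end - start) - requested
```

For the two flushes above this gives a span of (25, 70), of which only 10
bytes were explicitly requested.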

v2: Add fixme comment in the code (Elie)

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
5 years ago virgl: make virgl_buffers use resource helpers
Gurchetan Singh [Sat, 1 Dec 2018 01:29:16 +0000 (17:29 -0800)]
virgl: make virgl_buffers use resource helpers

We can reuse the helpers we created.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
5 years ago virgl: make transfer code with PIPE_BUFFER targets
Gurchetan Singh [Sat, 1 Dec 2018 02:08:14 +0000 (18:08 -0800)]
virgl: make transfer code with PIPE_BUFFER targets

util_format_get_blocksize returns 1 for R8 formats (all
PIPE_BUFFERs are R8).

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
5 years ago virgl: consolidate transfer code
Gurchetan Singh [Fri, 30 Nov 2018 22:54:33 +0000 (14:54 -0800)]
virgl: consolidate transfer code

We could allocate and destroy transfers in one place.

v2: Keep l_stride around.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
5 years ago virgl: store layer_stride in metadata
Gurchetan Singh [Fri, 30 Nov 2018 22:31:36 +0000 (14:31 -0800)]
virgl: store layer_stride in metadata

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
5 years ago virgl: move vrend_get_tex_image_offset to common code
Gurchetan Singh [Sat, 10 Nov 2018 00:40:03 +0000 (16:40 -0800)]
virgl: move vrend_get_tex_image_offset to common code

Will be reused.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
5 years ago virgl: move virgl_resource_layout to common code
Gurchetan Singh [Sat, 10 Nov 2018 00:27:32 +0000 (16:27 -0800)]
virgl: move virgl_resource_layout to common code

Will be reused.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
5 years ago virgl: move texture metadata to common code
Gurchetan Singh [Sat, 10 Nov 2018 00:21:35 +0000 (16:21 -0800)]
virgl: move texture metadata to common code

Will be reused.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
5 years ago virgl: remove unnessecary code
Gurchetan Singh [Mon, 3 Dec 2018 22:49:11 +0000 (14:49 -0800)]
virgl: remove unnessecary code

With commit 89b479, we moved to tracking buffer cleanliness
when binding.

TEST=dEQP-GLES31.functional.image_load_store.buffer.load_store.r32ui

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
5 years ago virgl: texture_transfer_pool --> transfer_pool
Gurchetan Singh [Fri, 30 Nov 2018 16:58:27 +0000 (08:58 -0800)]
virgl: texture_transfer_pool --> transfer_pool

It's used for all types of resources.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
5 years ago radeonsi: const-ify the si_query_ops
Nicolai Hähnle [Thu, 20 Sep 2018 08:21:26 +0000 (10:21 +0200)]
radeonsi: const-ify the si_query_ops

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago radeonsi: split perfcounter queries from si_query_hw
Nicolai Hähnle [Tue, 18 Sep 2018 20:29:41 +0000 (22:29 +0200)]
radeonsi: split perfcounter queries from si_query_hw

Remove a level of indirection to make the code more explicit -- should
make it easier to follow what's going on.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago radeonsi: factor si_query_buffer logic out of si_query_hw
Nicolai Hähnle [Tue, 18 Sep 2018 13:52:17 +0000 (15:52 +0200)]
radeonsi: factor si_query_buffer logic out of si_query_hw

This is a move towards using composition instead of inheritance for
different query types.

This change weakens out-of-memory error reporting somewhat, though this
should be acceptable since we didn't consistently report such errors in
the first place.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years ago radeonsi: move query suspend logic into the top-level si_query struct
Nicolai Hähnle [Tue, 18 Sep 2018 12:43:09 +0000 (14:43 +0200)]
radeonsi: move query suspend logic into the top-level si_query struct

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: move remaining perfcounter code into si_perfcounter.c
Nicolai Hähnle [Tue, 18 Sep 2018 12:16:10 +0000 (14:16 +0200)]
radeonsi: move remaining perfcounter code into si_perfcounter.c

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: track constant buffer bind history in si_pipe_set_constant_buffer
Nicolai Hähnle [Fri, 21 Sep 2018 15:19:34 +0000 (17:19 +0200)]
radeonsi: track constant buffer bind history in si_pipe_set_constant_buffer

Other callers of si_set_constant_buffer don't need it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: use si_set_rw_shader_buffer for setting streamout buffers
Nicolai Hähnle [Thu, 20 Sep 2018 08:47:03 +0000 (10:47 +0200)]
radeonsi: use si_set_rw_shader_buffer for setting streamout buffers

Reduce the number of places that encode buffer descriptors.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: add an si_set_rw_shader_buffer convenience function
Nicolai Hähnle [Fri, 21 Sep 2018 15:35:56 +0000 (17:35 +0200)]
radeonsi: add an si_set_rw_shader_buffer convenience function

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: avoid using hard-coded SI_NUM_RW_BUFFERS
Nicolai Hähnle [Sun, 16 Sep 2018 13:56:13 +0000 (15:56 +0200)]
radeonsi: avoid using hard-coded SI_NUM_RW_BUFFERS

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: show the fixed function TCS in debug dumps
Nicolai Hähnle [Fri, 31 Aug 2018 17:51:50 +0000 (19:51 +0200)]
radeonsi: show the fixed function TCS in debug dumps

This is rather important for merged VS/TCS as LSHS shaders...

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: const-ify si_set_tesseval_regs
Nicolai Hähnle [Thu, 30 Aug 2018 15:11:23 +0000 (17:11 +0200)]
radeonsi: const-ify si_set_tesseval_regs

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: rename SI_RESOURCE_FLAG_FORCE_TILING to clarify its purpose
Nicolai Hähnle [Mon, 2 Jul 2018 16:41:06 +0000 (18:41 +0200)]
radeonsi: rename SI_RESOURCE_FLAG_FORCE_TILING to clarify its purpose

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: don't set RAW_WAIT for CP DMA clears
Nicolai Hähnle [Fri, 21 Sep 2018 16:05:19 +0000 (18:05 +0200)]
radeonsi: don't set RAW_WAIT for CP DMA clears

There is never a read-after-write hazard because the command doesn't read.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi/gfx9: use SET_UCONFIG_REG_INDEX packets when available
Nicolai Hähnle [Thu, 28 Jun 2018 22:08:26 +0000 (00:08 +0200)]
radeonsi/gfx9: use SET_UCONFIG_REG_INDEX packets when available

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: add si_init_draw_functions and make some functions static
Nicolai Hähnle [Thu, 16 Nov 2017 11:14:51 +0000 (12:14 +0100)]
radeonsi: add si_init_draw_functions and make some functions static

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: extract declare_vs_blit_inputs
Nicolai Hähnle [Sun, 19 Nov 2017 16:29:31 +0000 (17:29 +0100)]
radeonsi: extract declare_vs_blit_inputs

Prepare for some later refactoring.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradeonsi: move SI_FORCE_FAMILY functionality to winsys
Nicolai Hähnle [Sat, 18 Nov 2017 22:23:04 +0000 (23:23 +0100)]
radeonsi: move SI_FORCE_FAMILY functionality to winsys

This helps some debugging cases by initializing addrlib with
slightly more appropriate settings.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoac/surface: 3D and cube surfaces are never displayable
Nicolai Hähnle [Thu, 29 Nov 2018 17:34:01 +0000 (18:34 +0100)]
ac/surface: 3D and cube surfaces are never displayable

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoamd/common: add i1 special case to ac_build_{inclusive,exclusive}_scan
Nicolai Hähnle [Thu, 20 Sep 2018 17:09:50 +0000 (19:09 +0200)]
amd/common: add i1 special case to ac_build_{inclusive,exclusive}_scan

Allow for a unified but efficient treatment of adding a bitmask over a
wave or an entire threadgroup.
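
As an illustration of why a 1-bit input deserves its own path, here is a
minimal CPU-side sketch, not the LLVM IR that ac_build_{inclusive,exclusive}_scan
emits. It assumes a 64-lane wave whose per-lane i1 values are packed into
one 64-bit ballot mask; the function names are hypothetical. Under that
assumption, a prefix sum over the bits reduces to a population count of
the lower lanes.

```c
#include <assert.h>
#include <stdint.h>

/* Portable population count; clears the lowest set bit on each iteration. */
static unsigned popcount64(uint64_t v)
{
    unsigned n = 0;
    while (v) {
        v &= v - 1;
        n++;
    }
    return n;
}

/* Hypothetical model: exclusive prefix sum of 1-bit values for one lane.
 * `ballot` has bit i set when lane i contributed a 1; `lane` is 0..63. */
static unsigned wave_exclusive_scan_i1(uint64_t ballot, unsigned lane)
{
    /* Keep only the lanes strictly below `lane`; lane 0 has none
     * (handled separately to avoid an undefined 64-bit shift). */
    uint64_t below = (lane == 0) ? 0 : (ballot & (~UINT64_C(0) >> (64 - lane)));
    return popcount64(below);
}

/* Inclusive variant: the exclusive result plus the lane's own bit. */
static unsigned wave_inclusive_scan_i1(uint64_t ballot, unsigned lane)
{
    return wave_exclusive_scan_i1(ballot, lane) + (unsigned)((ballot >> lane) & 1);
}
```

The point of the sketch is that no per-lane data exchange is needed once
the bits are gathered into a single mask.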

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoamd/common: scan/reduce across waves of a workgroup
Nicolai Hähnle [Wed, 23 May 2018 20:09:27 +0000 (22:09 +0200)]
amd/common: scan/reduce across waves of a workgroup

Order-aware scan/reduce can trade off LDS traffic for external atomics
memory traffic in producer/consumer compute shaders.
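
For readers unfamiliar with the general shape of a scan across waves, a
serial C model of the usual two-level scheme is sketched below. It is an
assumption about the overall structure, not the code added by this commit:
WAVE_SIZE, MAX_WAVES and the function name are hypothetical, and the
wave_total array merely stands in for per-wave slots in LDS.

```c
#include <assert.h>
#include <stddef.h>

#define WAVE_SIZE 4  /* tiny hypothetical wave so the example is easy to trace */
#define MAX_WAVES 16

/* In-place inclusive prefix sum over n values, n <= WAVE_SIZE * MAX_WAVES. */
static void workgroup_inclusive_scan(unsigned *vals, size_t n)
{
    unsigned wave_total[MAX_WAVES] = {0}; /* stand-in for LDS storage */
    size_t num_waves = (n + WAVE_SIZE - 1) / WAVE_SIZE;

    /* Step 1: each wave scans its own lanes and records its total. */
    for (size_t w = 0; w < num_waves; w++) {
        unsigned sum = 0;
        for (size_t i = w * WAVE_SIZE; i < n && i < (w + 1) * WAVE_SIZE; i++) {
            sum += vals[i];
            vals[i] = sum;
        }
        wave_total[w] = sum;
    }

    /* Step 2: add the sum of all preceding waves to every lane. */
    unsigned offset = 0;
    for (size_t w = 0; w < num_waves; w++) {
        for (size_t i = w * WAVE_SIZE; i < n && i < (w + 1) * WAVE_SIZE; i++)
            vals[i] += offset;
        offset += wave_total[w];
    }
}
```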

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoamd/common: add ac_build_ifcc
Nicolai Hähnle [Wed, 23 May 2018 20:04:20 +0000 (22:04 +0200)]
amd/common: add ac_build_ifcc

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoamd/common: whitespace fixes
Nicolai Hähnle [Thu, 29 Nov 2018 18:00:15 +0000 (19:00 +0100)]
amd/common: whitespace fixes

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoamd/sid_tables: add additional python3 compatibility imports
Nicolai Hähnle [Sun, 19 Nov 2017 11:59:45 +0000 (12:59 +0100)]
amd/sid_tables: add additional python3 compatibility imports

This happened to bite me while doing some experiments.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agor600: remove redundant semicolon
Nicolai Hähnle [Thu, 29 Nov 2018 12:48:03 +0000 (13:48 +0100)]
r600: remove redundant semicolon

Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoddebug: always flush when requested, even when hang detection is disabled
Nicolai Hähnle [Mon, 27 Aug 2018 13:24:07 +0000 (15:24 +0200)]
ddebug: always flush when requested, even when hang detection is disabled

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoddebug: simplify watchdog loop and fix crash in the no-timeout case
Nicolai Hähnle [Mon, 28 May 2018 15:30:25 +0000 (17:30 +0200)]
ddebug: simplify watchdog loop and fix crash in the no-timeout case

The following race condition could occur in the no-timeout case:

  API thread               Gallium thread            Watchdog
  ----------               --------------            --------
  dd_before_draw
  u_threaded_context draw
  dd_after_draw
    add to dctx->records
    signal watchdog
                                                     dump & destroy record
                           execute draw
                           dd_after_draw_async
                             use-after-free!

Alternatively, the same scenario would assert in a debug build when
destroying the record because record->driver_finished has not been
signaled.

Fix this and simplify the logic at the same time by
- handing the record pointers off to the watchdog thread *before* each
  draw call and
- waiting on the driver_finished fence in the watchdog thread.
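
The ordering described above can be modelled in a few lines of C. This is
a single-threaded sketch under stated assumptions, not the ddebug code:
the struct and function names are hypothetical, and the blocking wait on
the driver_finished fence is replaced by a non-blocking check so that the
example runs without threads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_PENDING 8

/* Hypothetical record; driver_finished stands in for the real fence. */
struct record {
    bool driver_finished;
    int draw_id;
};

struct watchdog_queue {
    struct record *pending[MAX_PENDING];
    unsigned count;
};

/* Hand the record to the watchdog *before* the draw is issued. */
static struct record *before_draw(struct watchdog_queue *q, int draw_id)
{
    if (q->count == MAX_PENDING)
        return NULL;
    struct record *rec = calloc(1, sizeof(*rec));
    if (!rec)
        return NULL;
    rec->draw_id = draw_id;
    q->pending[q->count++] = rec;
    return rec;
}

/* Called once the driver has executed the draw; signals the "fence". */
static void after_draw_async(struct record *rec)
{
    rec->driver_finished = true;
}

/* Watchdog side: destroy records in order, but never one whose fence is
 * still unsignaled. Returns the number of records destroyed. */
static unsigned watchdog_retire(struct watchdog_queue *q)
{
    unsigned retired = 0;
    while (retired < q->count && q->pending[retired]->driver_finished) {
        free(q->pending[retired]);
        retired++;
    }
    for (unsigned i = retired; i < q->count; i++)
        q->pending[i - retired] = q->pending[i];
    q->count -= retired;
    return retired;
}
```

Because a record is only freed after its fence has signaled, the late
dd_after_draw_async-style access in the diagram can no longer touch freed
memory in this model.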

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoanv/android: turn on VK_ANDROID_external_memory_android_hardware_buffer
Tapani Pälli [Tue, 25 Sep 2018 10:20:54 +0000 (13:20 +0300)]
anv/android: turn on VK_ANDROID_external_memory_android_hardware_buffer

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: ignore VkSamplerYcbcrConversion on non-yuv formats
Tapani Pälli [Thu, 27 Sep 2018 08:02:59 +0000 (11:02 +0300)]
anv: ignore VkSamplerYcbcrConversion on non-yuv formats

This fulfills a requirement for clients that want to use the same
code path for images with external formats (VK_FORMAT_UNDEFINED) and
"regular" RGBA images where the format is known. This is similar to
how OES_EGL_image_external works.

To support this, we allow color conversion samplers for non-YUV
formats but skip setting up the conversion when the format does not
have the can_ycbcr flag set.
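
A minimal sketch of that check, assuming simplified stand-in types: only
the can_ycbcr flag name comes from this message, while the struct and
function names below are hypothetical and do not reflect the real anv
data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's format and conversion objects. */
struct example_format {
    bool can_ycbcr;
};

struct example_conversion {
    const struct example_format *format;
};

/* Returns true only when a conversion was supplied and its format is
 * YCbCr-capable; otherwise the sampler is created as a regular one. */
static bool example_needs_ycbcr_setup(const struct example_conversion *conv)
{
    if (conv == NULL || conv->format == NULL)
        return false;
    return conv->format->can_ycbcr;
}
```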

v2: add comment and bundle can_ycbcr to the existing break
    condition (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: support VkSamplerYcbcrConversionInfo in vkCreateImageView
Tapani Pälli [Mon, 8 Oct 2018 11:42:53 +0000 (14:42 +0300)]
anv: support VkSamplerYcbcrConversionInfo in vkCreateImageView

If a conversion struct was passed, then initialize the view using
the format from the conversion structure.

v2: use vk_format directly from the anv_format struct
v3: added some assertions (Lionel)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: add VkFormat field as part of anv_format
Tapani Pälli [Tue, 13 Nov 2018 07:57:09 +0000 (09:57 +0200)]
anv: add VkFormat field as part of anv_format

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>