mesa.git
6 years agofreedreno/ir3: add experimental GCM pass
Rob Clark [Mon, 29 Jan 2018 19:53:13 +0000 (14:53 -0500)]
freedreno/ir3: add experimental GCM pass

Generally seems to do worse on instruction count and register usage,
according to shader-db.  But shader-db also doesn't do a very good job
of weighting loop bodies, so that might not be totally valid.

So add an env variable to enable GCM pass for easier experimentation.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: change opt passes
Rob Clark [Fri, 26 Jan 2018 15:43:48 +0000 (10:43 -0500)]
freedreno/ir3: change opt passes

There are more useful nir passes added since initial conversion to nir.
But ir3 was never updated to use them.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: use peephole select pass
Rob Clark [Fri, 19 Jan 2018 21:13:09 +0000 (16:13 -0500)]
freedreno/ir3: use peephole select pass

Aggressively lowering all if/else to selects in some extreme cases
results in much higher register pressure.  Using peephole select instead
with a modest threshold speeds up alu2 4x!

16 seems like a good limit, low enough to help alu2 but not so low that
it penalizes everything else.  With a bit better scheduling of the
instruction that moves a value into a predicate register, we might be
able to lower this limit a bit more in the future, but since we need 6
cycles from the move to predicate register to predicated branch, that
puts some sort of lower bound on how far we can lower this threshold.
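
As a rough illustration of the trade-off described above, the following
Python sketch flattens an if/else into selects only when the two arms are
small.  The function name, the "count the instructions in both arms" cost
model, and the sample sizes are assumptions made for illustration; only the
limit of 16 comes from this message, and the real nir pass applies its own
rules.

```python
# Hypothetical cost model: the cost of an if/else is the number of
# instructions in both arms, since a flattened if/else executes both.
FLATTEN_LIMIT = 16  # the limit named in the commit message


def should_flatten(then_instrs, else_instrs, limit=FLATTEN_LIMIT):
    """Return True if the if/else is small enough to turn into selects."""
    return then_instrs + else_instrs <= limit
```

With this model a 3+4 instruction if/else is flattened, while a 40+2
instruction one keeps its branch and so avoids holding the values of both
arms in registers at once.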

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: lower phi webs to regs
Rob Clark [Thu, 18 Jan 2018 13:32:22 +0000 (08:32 -0500)]
freedreno/ir3: lower phi webs to regs

nir's from_ssa pass is much better at avoiding inserting extra moves
than our logic is.  And lowering phi webs to regs just treats anything
involved in a phi web as an array of length=1.  Which with previous
array related fixes in RA/etc ends up working out quite well.  This cuts
down on extra instructions and also helps with register pressure.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: separate arrays from groups
Rob Clark [Mon, 29 Jan 2018 17:32:24 +0000 (12:32 -0500)]
freedreno/ir3: separate arrays from groups

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: make block/instruction serialno per-shader
Rob Clark [Mon, 29 Jan 2018 14:54:07 +0000 (09:54 -0500)]
freedreno/ir3: make block/instruction serialno per-shader

Makes it easier to compare values seen in-game (where there are many
shaders) to cmdline standalone compiler.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: add spirv support to cmdline compiler
Rob Clark [Tue, 23 Jan 2018 14:28:44 +0000 (09:28 -0500)]
freedreno/ir3: add spirv support to cmdline compiler

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: don't lower fsat
Rob Clark [Sun, 21 Jan 2018 17:31:51 +0000 (12:31 -0500)]
freedreno/ir3: don't lower fsat

Instead, if possible fold (sat) flag into src, otherwise use:

  (sat)max.f rD, rS, rS
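
A small Python model of why the fallback instruction above is equivalent to
fsat, assuming (sat) clamps the result of the instruction to [0.0, 1.0].
The helper names are hypothetical and NaN handling is ignored.

```python
def sat(x):
    """Clamp to [0.0, 1.0], the assumed effect of the (sat) flag."""
    return min(max(x, 0.0), 1.0)


def sat_max_f(a, b):
    """Model of '(sat)max.f rD, a, b': take the max, then saturate."""
    return sat(max(a, b))


def fsat_via_max(x):
    # max(x, x) == x, so the instruction reduces to a plain saturate.
    return sat_max_f(x, x)
```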

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: add encoding/decoding for (sat) bit
Rob Clark [Sun, 21 Jan 2018 17:20:01 +0000 (12:20 -0500)]
freedreno/ir3: add encoding/decoding for (sat) bit

Seems to be there since a3xx, but we always lowered fsat.  But we can
shave some instructions, especially in shaders that use lots of
clamp(foo, 0.0, 1.0) by not lowering fsat.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: extend liverange of arrays
Rob Clark [Sun, 21 Jan 2018 16:13:44 +0000 (11:13 -0500)]
freedreno/ir3: extend liverange of arrays

Use livein state of other blocks to extend liverange of arrays when they
are still needed by successor blocks.
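
A minimal sketch of the idea, assuming each array's live range inside a
block is a (start, end) pair of instruction positions and each successor
block provides a set of live-in arrays.  The data layout and names are
hypothetical and do not mirror the ir3 RA structures.

```python
def extend_liveranges(ranges, block_end, successor_liveins):
    """Extend array live ranges to the end of the block when a successor
    still needs the array.

    ranges            -- dict: array id -> (start, end) within this block
    block_end         -- position of the last instruction in this block
    successor_liveins -- list of sets of array ids, one per successor
    """
    needed = set().union(*successor_liveins)
    out = {}
    for arr, (start, end) in ranges.items():
        if arr in needed:
            end = max(end, block_end)  # keep the array alive to block end
        out[arr] = (start, end)
    return out
```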

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: avoid extra mov's for "arrays"
Rob Clark [Fri, 19 Jan 2018 20:45:37 +0000 (15:45 -0500)]
freedreno/ir3: avoid extra mov's for "arrays"

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: a couple more array fixes
Rob Clark [Mon, 29 Jan 2018 21:01:42 +0000 (16:01 -0500)]
freedreno/ir3: a couple more array fixes

(Plus a couple TODOs)

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: keep array stores
Rob Clark [Mon, 29 Jan 2018 20:58:49 +0000 (15:58 -0500)]
freedreno/ir3: keep array stores

Since these are not in SSA form, add to block's keeps so it doesn't
appear unused.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: propagate barrier information
Rob Clark [Mon, 29 Jan 2018 20:38:06 +0000 (15:38 -0500)]
freedreno/ir3: propagate barrier information

When eliminating movs, the instruction that is now directly using the
src of the mov has the same scheduling order constraints as the original
mov instruction.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: remove pointless statement
Rob Clark [Mon, 29 Jan 2018 20:35:12 +0000 (15:35 -0500)]
freedreno/ir3: remove pointless statement

Function ends after this if/else ladder, so it was pointless.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: some more debug prints
Rob Clark [Mon, 29 Jan 2018 20:33:55 +0000 (15:33 -0500)]
freedreno/ir3: some more debug prints

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: fix printing of relative branch offsets
Rob Clark [Mon, 29 Jan 2018 20:24:17 +0000 (15:24 -0500)]
freedreno/ir3: fix printing of relative branch offsets

The number of bits depends on generation.  But printing negative values
with a5xx encoding (largest size) but compiling for a3xx or a4xx, would
result in negative values printed as large positive values.

I guess in practice huge negative branch offsets aren't likely (and if
that is the case, the shader is probably too big to grok by reading the
assembly).  So just print using smallest bitfield size.
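
The effect can be reproduced with a short Python sketch.  The field widths
of 16 and 20 bits below are placeholders chosen for illustration, not the
actual encodings of any generation.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as two's complement."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


NARROW_BITS = 16  # hypothetical width used when encoding
WIDE_BITS = 20    # hypothetical width of the largest encoding

# A branch offset of -3 as stored in the narrow field.
encoded = -3 & ((1 << NARROW_BITS) - 1)

as_narrow = sign_extend(encoded, NARROW_BITS)  # decoded with matching width
as_wide = sign_extend(encoded, WIDE_BITS)      # decoded with too many bits
```

Decoding with the matching width recovers -3, while decoding the same bits
as a wider field sees a clear sign bit and yields a large positive number,
which is the misprint described above.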

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: be more clever with if/else jumps
Rob Clark [Mon, 15 Jan 2018 20:57:52 +0000 (15:57 -0500)]
freedreno/ir3: be more clever with if/else jumps

Try to clean up things like:

  br !p0.x #2
  br p0.x #something

to eliminate the first branch.
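
A sketch of this peephole in Python, under simplifying assumptions: the
instruction stream is a list, branches are modeled by a hypothetical `Br`
tuple, and the surviving branch uses a symbolic target so no offsets need
to be recomputed.  A real implementation would also have to fix up any
other relative branch that spans the removed instruction.

```python
from typing import NamedTuple


class Br(NamedTuple):
    pred: str        # predicate register, e.g. "p0.x"
    inverted: bool   # True for "!p0.x"
    target: object   # relative offset (int) or symbolic label (str)


def drop_redundant_branches(instrs):
    """Remove 'br !p #2' when it only skips over an immediately following
    'br p #target': the second branch alone has the same effect."""
    out = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        if (isinstance(cur, Br) and isinstance(nxt, Br)
                and cur.target == 2               # jumps just past nxt
                and cur.pred == nxt.pred
                and cur.inverted != nxt.inverted):
            i += 1  # drop cur; nxt is handled on the next iteration
            continue
        out.append(cur)
        i += 1
    return out
```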

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: avoid some spurious sync bits
Rob Clark [Mon, 15 Jan 2018 19:58:41 +0000 (14:58 -0500)]
freedreno/ir3: avoid some spurious sync bits

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno/ir3: print # of sync bits for shaderdb
Rob Clark [Mon, 29 Jan 2018 20:28:10 +0000 (15:28 -0500)]
freedreno/ir3: print # of sync bits for shaderdb

When trying to optimize to reduce stalls, it is nice to see this info.

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agofreedreno: add debug trace for flush
Rob Clark [Mon, 29 Jan 2018 14:53:30 +0000 (09:53 -0500)]
freedreno: add debug trace for flush

Signed-off-by: Rob Clark <robdclark@gmail.com>
6 years agointel/compiler: fix 64bit value prints on 32bit
Grazvydas Ignotas [Sat, 3 Feb 2018 21:59:05 +0000 (23:59 +0200)]
intel/compiler: fix 64bit value prints on 32bit

Fix the following:
warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but
argument 3 has type ‘uint64_t {aka long long unsigned int}’.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
6 years agost/glsl_to_nir: remove unused options variable
Timothy Arceri [Sat, 10 Feb 2018 00:06:55 +0000 (11:06 +1100)]
st/glsl_to_nir: remove unused options variable

6 years agost/radeonsi: enable disk cache for nir
Timothy Arceri [Tue, 23 Jan 2018 05:00:31 +0000 (16:00 +1100)]
st/radeonsi: enable disk cache for nir

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agost: add nir shader disk cache support
Timothy Arceri [Wed, 24 Jan 2018 00:12:51 +0000 (11:12 +1100)]
st: add nir shader disk cache support

v2: include compute shader support

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agost/glsl_to_tgsi: move nir detection earlier
Timothy Arceri [Tue, 30 Jan 2018 00:56:57 +0000 (11:56 +1100)]
st/glsl_to_tgsi: move nir detection earlier

We move the nir check before the shader cache call so that we can
call a nir based caching function in a following patch.

Also with this change we simply check if vertex shaders support
NIR rather than looping over the stages as mixing of shader types
is not supported anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi: stop returning PIPE_SHADER_IR_NATIVE for PIPE_SHADER_CAP_PREFERRED_IR
Timothy Arceri [Fri, 9 Feb 2018 01:02:27 +0000 (12:02 +1100)]
radeonsi: stop returning PIPE_SHADER_IR_NATIVE for PIPE_SHADER_CAP_PREFERRED_IR

Clover now checks PIPE_SHADER_CAP_SUPPORTED_IRS for native support instead.

This change indirectly enables NIR support for compute shaders
on radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agor600: always return PIPE_SHADER_IR_TGSI for PIPE_SHADER_CAP_PREFERRED_IR
Timothy Arceri [Fri, 9 Feb 2018 01:01:04 +0000 (12:01 +1100)]
r600: always return PIPE_SHADER_IR_TGSI for PIPE_SHADER_CAP_PREFERRED_IR

We now use PIPE_SHADER_CAP_SUPPORTED_IRS to check for native support
in clover.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoclover: use PIPE_SHADER_CAP_SUPPORTED_IRS to discover IR
Timothy Arceri [Fri, 9 Feb 2018 01:03:57 +0000 (12:03 +1100)]
clover: use PIPE_SHADER_CAP_SUPPORTED_IRS to discover IR

PIPE_SHADER_CAP_PREFERRED_IR was conflicting with PIPE_SHADER_IR_NIR
for compute shaders, so we let clover pick the one it wants to use.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agor600: add PIPE_SHADER_IR_NATIVE to supported shaders for cs
Timothy Arceri [Fri, 9 Feb 2018 00:59:54 +0000 (11:59 +1100)]
r600: add PIPE_SHADER_IR_NATIVE to supported shaders for cs

Acked-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi/nir: add depth layout to scan pass
Timothy Arceri [Fri, 9 Feb 2018 10:09:35 +0000 (21:09 +1100)]
radeonsi/nir: add depth layout to scan pass

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoradeonsi/nir: add FRAG_RESULT_COLOR to scan pass
Timothy Arceri [Fri, 9 Feb 2018 09:34:53 +0000 (20:34 +1100)]
radeonsi/nir: add FRAG_RESULT_COLOR to scan pass

Fixes a number of draw buffers piglit tests.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoac: convert nir_op_f2f32 src to a float
Timothy Arceri [Fri, 9 Feb 2018 06:17:31 +0000 (17:17 +1100)]
ac: convert nir_op_f2f32 src to a float

Fixes the following piglit test:

./bin/arb_vertex_attrib_64bit-check-explicit-location -auto -fbo

Where we would end up with the nir such as:

vec1 64 ssa_11 = pack_64_2x32_split ssa_9, ssa_10
vec1 32 ssa_12 = f2f32 ssa_2

And our pack_64_2x32_split nir to llvm code always produces
a 64bit integer as output.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoac: fix some 64bit unpack asserts
Timothy Arceri [Fri, 9 Feb 2018 06:15:54 +0000 (17:15 +1100)]
ac: fix some 64bit unpack asserts

Previously the asserts did not take swizzles into account.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
6 years agoRevert "i965: prevent potentially null pointer access"
Mark Janes [Fri, 9 Feb 2018 17:37:57 +0000 (09:37 -0800)]
Revert "i965: prevent potentially null pointer access"

This reverts commit 712332ed54f14b5ee34c2990e351ca48992488b2, which
caused over 90k failures in Mesa i965 CI.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
6 years agoegl/gbm: Ensure EGLConfigs match GBM surface format
Daniel Stone [Tue, 6 Feb 2018 17:59:05 +0000 (17:59 +0000)]
egl/gbm: Ensure EGLConfigs match GBM surface format

When we create an EGL window surface on a GBM surface, ensure that the
EGLConfig is compatible with the GBM format, notwithstanding XRGB/ARGB
interchange.

For example, rendering with an XRGB8888 EGLConfig on to an ARGB8888
gbm_surface (and vice-versa) is acceptable, but rendering with an
XRGB2101010 EGLConfig on to an XRGB8888 gbm_surface will now be
rejected.

This was previously allowed through; when 10bpc formats were enabled,
clients which picked a completely random EGL config and hoped/assumed
they were XRGB8888 would break.

If you have bisected a failure to start a GBM/KMS client to this commit,
please look at its EGLConfig selection (e.g. through eglChooseConfig),
and add an EGL_NATIVE_VISUAL_ID == gbm_surface format match to the
attribs for config selection.
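
The selection logic suggested above can be sketched in Python.  This is
not EGL code: the list of dicts stands in for the configs a client would
get back from config enumeration, with the "native_visual_id" key standing
in for the EGL_NATIVE_VISUAL_ID attribute of each config.  The dict layout
and function names are assumptions for illustration; the format values are
built the usual FourCC way.

```python
def fourcc(a, b, c, d):
    """Pack four characters into a little-endian FourCC code."""
    return ord(a) | (ord(b) << 8) | (ord(c) << 16) | (ord(d) << 24)


GBM_FORMAT_XRGB8888 = fourcc("X", "R", "2", "4")
GBM_FORMAT_XRGB2101010 = fourcc("X", "R", "3", "0")


def pick_config(configs, surface_format):
    """Return the first config whose native visual ID equals the format
    of the gbm_surface, instead of blindly taking configs[0]."""
    for cfg in configs:
        if cfg["native_visual_id"] == surface_format:
            return cfg
    return None
```

A client that takes the first config unconditionally would get the 10bpc
entry whenever it happens to be listed first; matching on the visual ID
makes the choice independent of ordering.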

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/gbm: Remove duplicate format table
Daniel Stone [Tue, 6 Feb 2018 17:44:37 +0000 (17:44 +0000)]
egl/gbm: Remove duplicate format table

Now that we have mask/channel information in gbm_dri's format conversion
table, we can remove the copy in EGL.

As this table contains more formats (notably including R8 and RG8, which
can be used for BO but not surface allocation), we now compare the masks
of all channels when trying to find a suitable config. Without doing
this, an XRGB8888 EGLConfig would match on an R8 format.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agogbm/dri: Expose visuals table through gbm_dri_device
Daniel Stone [Tue, 6 Feb 2018 17:42:03 +0000 (17:42 +0000)]
gbm/dri: Expose visuals table through gbm_dri_device

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agogbm/dri: Add RGBA masks to GBM format table
Daniel Stone [Tue, 6 Feb 2018 17:38:37 +0000 (17:38 +0000)]
gbm/dri: Add RGBA masks to GBM format table

Eventually, we can replace the visuals list inside GBM EGL driver with
this one.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/wayland: Use an array for modifiers
Daniel Stone [Tue, 6 Feb 2018 10:29:13 +0000 (10:29 +0000)]
egl/wayland: Use an array for modifiers

Each Wayland EGLDisplay currently contains a struct with one vector of
modifiers per format, hardcoded in the header. To allow easier support
for more formats, turn this into an array of u_vectors which is opaque
outside of platform_wayland.c.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/wayland: Remove has_format enum
Daniel Stone [Tue, 6 Feb 2018 11:14:32 +0000 (11:14 +0000)]
egl/wayland: Remove has_format enum

Instead of the has_format enum, use an index into the visual array. This
makes adding new formats less typing.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/wayland: Add bpp to visual map
Daniel Stone [Tue, 6 Feb 2018 11:58:45 +0000 (11:58 +0000)]
egl/wayland: Add bpp to visual map

Both the DRI2 GetBuffersWithFormat interface, and SHM buffer allocation,
had their own format -> bpp lookup tables. Replace these with a lookup
into the visual map.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/wayland: Use visual map for DRIImage<->FourCC map
Daniel Stone [Tue, 6 Feb 2018 11:51:17 +0000 (11:51 +0000)]
egl/wayland: Use visual map for DRIImage<->FourCC map

When trying to translate between DRIImage format enums and FourCC codes,
use our visual map rather than an open-coded subset.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/wayland: Use visual map for format advertisement
Daniel Stone [Tue, 6 Feb 2018 10:20:39 +0000 (10:20 +0000)]
egl/wayland: Use visual map for format advertisement

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/wayland: Use visual map for buffer_from_image
Daniel Stone [Tue, 6 Feb 2018 10:15:32 +0000 (10:15 +0000)]
egl/wayland: Use visual map for buffer_from_image

When creating a wl_buffer on an upstream Wayland display from an
existing EGLImage, use the dri2_wl_visual map rather than another
hardcoded list of formats.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/wayland: Use visual map for config->format lookup
Daniel Stone [Tue, 6 Feb 2018 10:07:23 +0000 (10:07 +0000)]
egl/wayland: Use visual map for config->format lookup

Having hoisted the format -> config map into common code, we now use it
for config -> format lookups.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/wayland: Add format enums to visual map
Daniel Stone [Tue, 6 Feb 2018 09:42:27 +0000 (09:42 +0000)]
egl/wayland: Add format enums to visual map

Extend the visual map from only containing names and bitmasks, to also
carrying the three format enums we need. These are the DRIImage format
tokens for internal allocation, FourCC codes for wl_drm and dmabuf
protocol, and wl_shm codes for swrast drivers.

We will later use these formats to eliminate a bunch of open-coded
conversions.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/wayland: Use proper enum type in visual definition
Daniel Stone [Tue, 6 Feb 2018 09:33:56 +0000 (09:33 +0000)]
egl/wayland: Use proper enum type in visual definition

No semantic change.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/wayland: Widen channel masks to bpp
Daniel Stone [Tue, 6 Feb 2018 18:06:52 +0000 (18:06 +0000)]
egl/wayland: Widen channel masks to bpp

Widen the channel masks given in the visual table to the full width of
the pixel format, i.e. as many leading zeros as required.

No functional change.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/wayland: Hoist format <-> EGLConfig definition up
Daniel Stone [Tue, 6 Feb 2018 09:32:22 +0000 (09:32 +0000)]
egl/wayland: Hoist format <-> EGLConfig definition up

Pull the mapping between Wayland formats and EGLConfigs up to the top
level, so we can reuse it elsewhere.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agoegl/wayland: Fix ARGB/XRGB transposition in config map
Daniel Stone [Tue, 6 Feb 2018 09:45:01 +0000 (09:45 +0000)]
egl/wayland: Fix ARGB/XRGB transposition in config map

When 0b2b7191214eb moved from an if tree to a struct to map between
wl_drm formats and EGLConfigs, it transposed the mapping between XRGB
and ARGB. Luckily, everyone exposes both formats, so this is harmless.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Fixes: 0b2b7191214eb ("egl/wayland: introduce dri2_wl_add_configs_for_visuals() helper")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
6 years agost/mesa: generate blend state according to the number of enabled color buffers
Marek Olšák [Wed, 31 Jan 2018 03:37:00 +0000 (04:37 +0100)]
st/mesa: generate blend state according to the number of enabled color buffers

Non-MRT cases always translate blend state for 1 color buffer only.
MRT cases only check and translate blend state for enabled color buffers.

This also avoids an assertion failure in translate_blend for:
  dEQP-GLES31.functional.draw_buffers_indexed.overwrite_common.common_advanced_blend_eq_buffer_blend_eq

Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agost/mesa: don't translate blend state when color writes are disabled
Marek Olšák [Tue, 30 Jan 2018 23:53:16 +0000 (00:53 +0100)]
st/mesa: don't translate blend state when color writes are disabled

Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agost/mesa: don't translate blend state when it's disabled for a colorbuffer
Marek Olšák [Tue, 30 Jan 2018 23:53:16 +0000 (00:53 +0100)]
st/mesa: don't translate blend state when it's disabled for a colorbuffer

Reviewed-by: Eric Anholt <eric@anholt.net>
6 years agoi965: prevent potentially null pointer access
Lionel Landwerlin [Thu, 8 Feb 2018 17:33:09 +0000 (17:33 +0000)]
i965: prevent potentially null pointer access

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
CID: 1418110

6 years agost/va: Make the vendor string more descriptive
Mark Thompson [Wed, 7 Feb 2018 23:15:05 +0000 (23:15 +0000)]
st/va: Make the vendor string more descriptive

Include the Mesa version and detail about the platform.

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
6 years agost/va: Enable vaExportSurfaceHandle()
Mark Thompson [Wed, 7 Feb 2018 23:10:15 +0000 (23:10 +0000)]
st/va: Enable vaExportSurfaceHandle()

It is present from libva 2.1 (VAAPI 1.1.0 or higher).

Signed-off-by: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
6 years agodisk cache: move path creation back to constructor
Tapani Pälli [Fri, 9 Feb 2018 05:37:49 +0000 (07:37 +0200)]
disk cache: move path creation back to constructor

This patch moves disk cache path and index creation back to the
constructor which matches previous behavior. We still allow create
to succeed without path so that cache can be used with callback
functionality.

Fixes: c95d3ed091 "disk cache: create cache even if path creation fails"
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agoac/nir: compute correct number of user SGPRs on GFX9
Samuel Pitoiset [Thu, 8 Feb 2018 22:04:53 +0000 (23:04 +0100)]
ac/nir: compute correct number of user SGPRs on GFX9

For merged shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agost/mesa: Initialize tex_target in compile_tgsi_instruction
Michel Dänzer [Thu, 8 Feb 2018 17:48:45 +0000 (18:48 +0100)]
st/mesa: Initialize tex_target in compile_tgsi_instruction

Initialize to TGSI_TEXTURE_BUFFER (== 0), same as was done before the
variable type was changed to enum tgsi_texture_type.

Fixes a bunch of piglit failures with radeonsi, e.g.:

gles-3.0-transform-feedback-uniform-buffer-object: ../../../../src/gallium/auxiliary/tgsi/tgsi_util.c:502: tgsi_util_get_texture_coord_dim: Assertion `!"unknown texture target"' failed.

Corresponding compiler warning:

  CXX      state_tracker/st_glsl_to_tgsi.lo
../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp: In function ‘pipe_error st_translate_program(gl_context*, uint, ureg_program*, glsl_to_tgsi_visitor*, const gl_program*, GLuint, const ubyte*, const ubyte*, const ubyte*, const ubyte*, const ubyte*, GLuint, const ubyte*, const ubyte*, const ubyte*)’:
../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5992:23: warning: ‘tex_target’ may be used uninitialized in this function [-Wmaybe-uninitialized]
       ureg_memory_insn(ureg, inst->op, dst, num_dst, src, num_src,
       ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                        inst->buffer_access,
                        ~~~~~~~~~~~~~~~~~~~~
                        tex_target, inst->image_format);
                        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:5866:27: note: ‘tex_target’ was declared here
    enum tgsi_texture_type tex_target;
                           ^~~~~~~~~~

Fixes: 9f9ce1625fb3 ("st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoglsl/linker: remove ubo explicit binding handling
Alejandro Piñeiro [Wed, 24 Jan 2018 10:03:00 +0000 (11:03 +0100)]
glsl/linker: remove ubo explicit binding handling

This is already handled at link_uniform_blocks, specifically at
process_block_array_leaf.

Additionally, this code was not correctly handling arrays of
arrays. When creating the name of the block to set the binding, it
only took into account the first level, so any attempt to set an
explicit binding on an array-of-arrays UBO would trigger an assertion.
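
To illustrate the naming problem, here is a small Python sketch that
builds one block instance name per element across all array levels.  The
function and the "blk" name are hypothetical and are not the linker's
code; the point is that a name built from the first level only, such as
"blk[0]", matches none of the instances of a two-level array.

```python
from itertools import product


def block_instance_names(base, dims):
    """Return the name of every instance of a (possibly nested) block
    array, e.g. base='blk', dims=[2, 2] gives 'blk[0][0]' ... 'blk[1][1]'."""
    return [
        base + "".join(f"[{i}]" for i in idx)
        for idx in product(*(range(d) for d in dims))
    ]
```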

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agomesa: Only update enabled VAO gl_vertex_array entries.
Mathias Fröhlich [Sun, 4 Feb 2018 16:13:06 +0000 (17:13 +0100)]
mesa: Only update enabled VAO gl_vertex_array entries.

Instead of updating all modified gl_vertex_array_object::_VertexArray
entries just update those that are modified and enabled.
Also release buffer objects from the _VertexArray entries that belong
to disabled attributes.

v2: Also set Ptr and Size to zero.
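
A bitmask sketch of the selection described above, assuming one bit per
vertex attribute.  The function names are hypothetical, and treating
"modified but disabled" as the set to release is one reading of this
message rather than a description of the actual code.

```python
def arrays_to_update(dirty_mask, enabled_mask):
    """Attributes that are both modified and enabled."""
    return dirty_mask & enabled_mask


def arrays_to_release(dirty_mask, enabled_mask):
    """Modified attributes that are disabled; their buffer object
    reference can be dropped and Ptr/Size zeroed."""
    return dirty_mask & ~enabled_mask
```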

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agogallium: Mute arrays for several meta like callbacks.
Mathias Fröhlich [Mon, 5 Feb 2018 21:02:51 +0000 (22:02 +0100)]
gallium: Mute arrays for several meta like callbacks.

Set the _DrawArray pointer to NULL when calling into the driver's
Bitmap/CopyPixels/DrawAtlasBitmaps/DrawPixels/DrawTex hooks.
This fixes an assert that gets uncovered when the following
patch gets applied.

v2: Mute from within the state tracker instead of generic mesa.
v3: Avoid evaluating _DrawArrays from within st_validate_state.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agomesa: Fix VAO buffer object tracking.
Mathias Fröhlich [Sun, 4 Feb 2018 12:18:34 +0000 (13:18 +0100)]
mesa: Fix VAO buffer object tracking.

When changing the attribute binding in the VAO we also need to
account for getting rid of non vbo bits from VertexAttribBufferMask.

Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
6 years agoradeonsi/nir: gather some missing fs info
Timothy Arceri [Thu, 8 Feb 2018 23:53:00 +0000 (10:53 +1100)]
radeonsi/nir: gather some missing fs info

Fixes some early-z arb_shader_image_load_store piglit tests.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoac: pass struct ac_llvm_context to emit_membar()
Timothy Arceri [Thu, 8 Feb 2018 23:37:25 +0000 (10:37 +1100)]
ac: pass struct ac_llvm_context to emit_membar()

Fixes segfault in piglit test:

./bin/arb_shader_image_load_store-shader-mem-barrier --quick -auto -fbo

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
6 years agoradeonsi: copy the NIR enablement debug bit to the shader cache flags
Marek Olšák [Fri, 9 Feb 2018 00:47:26 +0000 (01:47 +0100)]
radeonsi: copy the NIR enablement debug bit to the shader cache flags

When NIR is enabled, TGSI must not be used. When NIR is disabled, TGSI

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
6 years agointel/blorp: Use isl_aux_op instead of blorp_hiz_op
Jason Ekstrand [Fri, 19 Jan 2018 23:14:37 +0000 (15:14 -0800)]
intel/blorp: Use isl_aux_op instead of blorp_hiz_op

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agointel/blorp: Use isl_aux_op instead of blorp_fast_clear_op
Jason Ekstrand [Fri, 19 Jan 2018 23:02:07 +0000 (15:02 -0800)]
intel/blorp: Use isl_aux_op instead of blorp_fast_clear_op

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv: Allow fast-clearing the first slice of a multi-slice image
Jason Ekstrand [Fri, 19 Jan 2018 20:07:12 +0000 (12:07 -0800)]
anv: Allow fast-clearing the first slice of a multi-slice image

Now that we're tracking aux properly per-slice, we can enable this for
applications which actually care.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/cmd_buffer: Rework aux tracking
Jason Ekstrand [Tue, 21 Nov 2017 16:46:25 +0000 (08:46 -0800)]
anv/cmd_buffer: Rework aux tracking

This commit completely reworks aux tracking.  This includes a number of
somewhat distinct changes:

 1) Since we are no longer fast-clearing multiple slices, we only need
    to track one fast clear color and one fast clear type.

 2) We store two bits for fast clear instead of one to let us
    distinguish between zero and non-zero fast clear colors.  This is
    needed so that we can do full resolves when transitioning to
    PRESENT_SRC_KHR with gen9 CCS images where we allow zero clear
    values in all sorts of places we wouldn't normally.

 3) We now track compression state as a boolean separate from fast clear
    type and this is tracked on a per-slice granularity.
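
The state described by the three points above could be modeled roughly as
follows.  This is a Python sketch, not the driver's layout: the enum
member names, the class, and the choice to keep a single fast-clear type
and color for the whole image are assumptions drawn from this message
only.

```python
from enum import IntEnum


class FastClearType(IntEnum):
    """Three states, which is why two bits are needed instead of one."""
    NONE = 0
    ZERO = 1       # fast-cleared to an all-zero color
    NON_ZERO = 2   # fast-cleared to some other color


class AuxState:
    def __init__(self, levels, layers):
        self.fast_clear = FastClearType.NONE   # point 2
        self.clear_color = None                # point 1
        # Point 3: one compression flag per slice (level, layer).
        self.compressed = [[False] * layers for _ in range(levels)]

    def record_fast_clear(self, color, level=0, layer=0):
        is_zero = all(c == 0 for c in color)
        self.fast_clear = (FastClearType.ZERO if is_zero
                           else FastClearType.NON_ZERO)
        self.clear_color = tuple(color)
        # Per the v3 note in this message, a fast clear also marks the
        # slice as compressed.
        self.compressed[level][layer] = True
```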

The previous scheme had some issues when it came to individual slices of
a multi-LOD image.  In particular, we only tracked "needs resolve"
per-LOD but you could do a vkCmdPipelineBarrier that would only resolve
a portion of the image and would set "needs resolve" to false anyway.
Also, any transition from an undefined layout would reset the clear
color for the entire LOD regardless of whether or not there was some
clear color on some other slice.

As far as full/partial resolves go, the assumptions of the previous
scheme held because the one case where we do need a full resolve when
CCS_E is enabled is for window-system images.  Since we only ever
allowed X-tiled window-system images, CCS was entirely disabled on gen9+
and we never got CCS_E.  With the advent of Y-tiled window-system
buffers, we now need to properly support doing a full resolve of images
marked CCS_E.

v2 (Jason Ekstrand):
 - Fix a bug in the compressed flag offset calculation
 - Treat 3D images as multi-slice for the purposes of resolve tracking

v3 (Jason Ekstrand):
 - Set the compressed flag whenever we fast-clear
 - Simplify the resolve predicate computation logic

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/cmd_buffer: Move the mi_alu helper higher up
Jason Ekstrand [Fri, 19 Jan 2018 00:08:31 +0000 (16:08 -0800)]
anv/cmd_buffer: Move the mi_alu helper higher up

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/image: Simplify some verbose comments
Jason Ekstrand [Thu, 18 Jan 2018 17:17:17 +0000 (09:17 -0800)]
anv/image: Simplify some verbose comments

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv: Use blorp_ccs_ambiguate instead of fast-clears
Jason Ekstrand [Tue, 28 Nov 2017 02:09:48 +0000 (18:09 -0800)]
anv: Use blorp_ccs_ambiguate instead of fast-clears

Even though the blorp pass looks a bit on the sketchy side, the end
result in the Vulkan driver is very nice.  Instead of having this weird
case where you do a fast clear and then maybe have to resolve, we just
do the ambiguate and are done with it.  The ambiguate does exactly what
we want of setting all the CCS values to 0 which puts it into the
pass-through state.

This should also improve performance a bit in certain cases.  For
instance, if we did a transition from UNDEFINED to GENERAL for a surface
that doesn't have CCS enabled all the time, we would end up doing a
fast-clear and then a full resolve which ends up touching every byte in
the main surface as well as the CCS.  With the ambiguate pass, that
transition only touches the CCS.
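
As a rough illustration of the cost argument above, here is a toy Python
model (not driver code).  The function names and the byte-counting model
are assumptions made for this sketch; the only facts taken from the
message are that an ambiguate writes 0 (pass-through) to every CCS value
and that the old path touched the main surface as well as the CCS.

```python
PASS_THROUGH = 0  # per the commit message: CCS value 0 is the pass-through state


def ambiguate(ccs: bytearray) -> int:
    """Zero every CCS entry in place and return the number of bytes written."""
    ccs[:] = bytes(len(ccs))
    return len(ccs)


def bytes_touched_fast_clear_and_resolve(main_size: int, ccs_size: int) -> int:
    """Old path (hypothetical cost model): main surface plus CCS are touched."""
    return main_size + ccs_size


def bytes_touched_ambiguate(main_size: int, ccs_size: int) -> int:
    """New path (hypothetical cost model): only the CCS is touched."""
    return ccs_size
```

In this model the saving is exactly the size of the main surface, which
is why the transition gets cheaper as the image gets larger.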

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/cmd_buffer: Re-arrange the logic around UNDEFINED fast-clears
Jason Ekstrand [Tue, 28 Nov 2017 02:07:57 +0000 (18:07 -0800)]
anv/cmd_buffer: Re-arrange the logic around UNDEFINED fast-clears

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/cmd_buffer: Pull the undefined layout condition into the if
Jason Ekstrand [Tue, 28 Nov 2017 02:06:47 +0000 (18:06 -0800)]
anv/cmd_buffer: Pull the undefined layout condition into the if

Now that this isn't a multi-case if and it's just the one case, it's a
bit clearer if the condition is just part of the if instead of being
pulled out into a boolean variable.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agointel/blorp: Add a CCS ambiguation pass
Jason Ekstrand [Thu, 18 May 2017 03:33:21 +0000 (20:33 -0700)]
intel/blorp: Add a CCS ambiguation pass

This pass performs an "ambiguate" operation on a CCS-compressed surface
by manually writing zeros into the CCS.  On gen8+, ISL gives us a fairly
detailed notion of how the CCS is laid out so this is fairly simple to
do.  On gen7, the CCS tiling is quite crazy but that isn't an issue
because we can only do CCS on single-slice images so we can just blast
over the entire CCS buffer if we want to.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv: Only fast clear single-slice images
Jason Ekstrand [Thu, 18 Jan 2018 05:31:09 +0000 (21:31 -0800)]
anv: Only fast clear single-slice images

The current strategy we use for managing resolves has an issue where we
track clear colors and the need for resolves per-LOD but we still allow
resolves of only a subset of the slices in any given LOD and doing so
sets the "needs resolve" flag for that LOD to false while leaving the
remaining layers unresolved.  This patch is only the first step and does
not, by itself, fix anything.  However, it's fairly self-contained and
splitting it out means any performance regressions should bisect to this
nice obvious commit rather than to the giant "rework aux tracking"
commit.
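
The tracking problem described above can be sketched with a small Python
model.  This is an illustration only: the class and method names are
hypothetical, and the per-layer list stands in for the real state of the
hardware surface, which the driver does not track at that granularity.

```python
class LevelTracking:
    """Toy model of one LOD with a single shared 'needs resolve' flag."""

    def __init__(self, num_layers: int):
        # Ground truth per layer (not visible to the tracking logic).
        self.layer_compressed = [False] * num_layers
        # The only thing that is actually tracked for this LOD.
        self.needs_resolve = False

    def fast_clear(self, layers):
        for layer in layers:
            self.layer_compressed[layer] = True
        self.needs_resolve = True

    def resolve(self, layers):
        for layer in layers:
            self.layer_compressed[layer] = False
        # The flaw: the flag is cleared even if only some layers were resolved.
        self.needs_resolve = False

    def stale_layers(self):
        """Layers that are still unresolved although the flag says all is fine."""
        if self.needs_resolve:
            return []
        return [i for i, c in enumerate(self.layer_compressed) if c]
```

With four layers, fast-clearing all of them and resolving only layer 0
leaves layers 1 to 3 stale.  With a single slice, the one flag describes
the one slice exactly, so the mismatch cannot occur.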

Nanley and I did some testing and none of the applications we tested
even tried to fast-clear anything other than the first slice of an
image.  The test was done by adding a printf right before we call
blorp_fast_clear if we were ever going to touch any slice other than
the first with a fast-clear.  Due to the way the original code was
structured, this would not have included applications which only cleared
a subset of layers.  The applications tested were:

 * All Sascha Willems demos
 * Aztec Ruins
 * Dota 2
 * The Talos Principle
 * Mad Max
 * Warhammer 40,000: Dawn of War III
 * Serious Sam Fusion 2017: BFE

While not the full list of shipping applications, it's a pretty good
spread and covers most of the engines we've seen running on our driver.
If this is ever shown to be a performance problem in the future, we can
reconsider our strategy.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/cmd_buffer: Add a mark_image_written helper
Jason Ekstrand [Mon, 27 Nov 2017 16:35:12 +0000 (08:35 -0800)]
anv/cmd_buffer: Add a mark_image_written helper

Currently, this helper does nothing but we call it every place where an
image is written through the render pipeline.  This will allow us to
properly mark the aux state so that we can handle resolves correctly.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/blorp: Add src/dst_level helper variables in CmdCopyImage
Jason Ekstrand [Fri, 19 Jan 2018 17:12:17 +0000 (09:12 -0800)]
anv/blorp: Add src/dst_level helper variables in CmdCopyImage

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/cmd_buffer: Add an anv_genX_call macro
Jason Ekstrand [Mon, 27 Nov 2017 16:29:34 +0000 (08:29 -0800)]
anv/cmd_buffer: Add an anv_genX_call macro

This is copied and pasted from the similar macro we added to ISL.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/cmd_buffer: Generalize transition_color_buffer
Jason Ekstrand [Mon, 20 Nov 2017 18:12:37 +0000 (10:12 -0800)]
anv/cmd_buffer: Generalize transition_color_buffer

This moves it to being based on layout_to_aux_usage instead of being
hard-coded based on bits of a priori knowledge of how transitions
interact with layouts.  This conceptually simplifies things because
we're now using layout_to_aux_usage and layout_supports_fast_clear to
make resolve decisions so changes to those functions will do what one
expects.

There is a potential bug with window system integration on gen9+ where
we wouldn't do a resolve when transitioning to the PRESENT_SRC layout
because we just assume that everything that handles CCS_E can handle it
all the time.  When handing a CCS_E image off to the window system, we
may need to do a full resolve if the window system does not support the
CCS_E modifier.  The only reason why this hasn't been a problem yet is
because we don't support modifiers in Vulkan WSI and so we always get X
tiling which implies no CCS on gen9+.  This patch doesn't actually fix
that bug yet but it takes us the first step in that direction by making
us actually pick the correct resolve op.  In order to handle all of the
cases, we need more detailed aux tracking.
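
One way to picture a decision driven by the aux usage and fast-clear
support of the two layouts is the Python sketch below.  It is an
assumption-laden illustration, not the driver's logic: the enum names,
the ordering of the fast-clear levels, and pick_resolve_op() itself are
all hypothetical stand-ins for layout_to_aux_usage and
layout_supports_fast_clear results.

```python
from enum import Enum, IntEnum


class AuxUsage(Enum):
    NONE = 0
    CCS_D = 1
    CCS_E = 2


class FastClear(IntEnum):
    # Hypothetical ordering: larger values allow more clear colors.
    NONE = 0
    DEFAULT_VALUE = 1
    ANY_COLOR = 2


class ResolveOp(Enum):
    NONE = 0
    PARTIAL_RESOLVE = 1
    FULL_RESOLVE = 2


def pick_resolve_op(initial_aux, final_aux, initial_fc, final_fc):
    """Choose a resolve from what the old and new layouts can handle."""
    # Leaving CCS_E for something that cannot read compressed data:
    # the main surface must be made fully valid.
    if initial_aux is AuxUsage.CCS_E and final_aux is not AuxUsage.CCS_E:
        return ResolveOp.FULL_RESOLVE
    # The new layout tolerates fewer clear colors than the old one:
    # fast-cleared blocks have to be written out.
    if final_fc < initial_fc:
        return ResolveOp.PARTIAL_RESOLVE
    return ResolveOp.NONE
```

Under these assumptions, a hand-off to a window system that cannot
consume CCS_E maps to a final aux usage of NONE and therefore selects a
full resolve, which is the PRESENT_SRC case discussed above.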

v2 (Jason Ekstrand):
 - Make a few more things const
 - Use the anv_fast_clear_support enum

v3 (Jason Ekstrand):
 - Move an assert and add a better comment

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/cmd_buffer: Recurse in transition_color_buffer instead of falling through
Jason Ekstrand [Mon, 20 Nov 2017 18:05:54 +0000 (10:05 -0800)]
anv/cmd_buffer: Recurse in transition_color_buffer instead of falling through

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/image: Support color aspects in layout_to_aux_usage
Jason Ekstrand [Mon, 20 Nov 2017 20:05:20 +0000 (12:05 -0800)]
anv/image: Support color aspects in layout_to_aux_usage

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/image: Add a helper for determining when fast clears are supported
Jason Ekstrand [Mon, 20 Nov 2017 17:48:39 +0000 (09:48 -0800)]
anv/image: Add a helper for determining when fast clears are supported

v2 (Jason Ekstrand):
 - Return an enum instead of a boolean

v3 (Jason Ekstrand):
 - Return ANV_FAST_CLEAR_NONE instead of false (Topi)
 - Rename ANV_FAST_CLEAR_ANY to ANV_FAST_CLEAR_DEFAULT_VALUE
 - Add documentation for the enum values

v4 (Jason Ekstrand):
 - Remove a dead comment

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/image: Update a comment
Jason Ekstrand [Mon, 20 Nov 2017 17:47:47 +0000 (09:47 -0800)]
anv/image: Update a comment

This got lost in all of the aspect vs. plane rebasing of YCBCR.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/blorp: Rework HiZ ops to look like MCS and CCS
Jason Ekstrand [Tue, 21 Nov 2017 18:20:57 +0000 (10:20 -0800)]
anv/blorp: Rework HiZ ops to look like MCS and CCS

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/blorp: Support ISL_AUX_USAGE_HIZ in surf_for_anv_image
Jason Ekstrand [Tue, 21 Nov 2017 20:10:30 +0000 (12:10 -0800)]
anv/blorp: Support ISL_AUX_USAGE_HIZ in surf_for_anv_image

If the function gets passed ANV_AUX_USAGE_DEFAULT, it still has the old
behavior of setting ISL_AUX_USAGE_NONE for depth/stencil which is what
we want for blits/copies.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agoanv/blorp: Rework image clear/resolve helpers
Jason Ekstrand [Tue, 21 Nov 2017 17:56:41 +0000 (09:56 -0800)]
anv/blorp: Rework image clear/resolve helpers

This replaces image_fast_clear and ccs_resolve with two new helpers that
simply perform an isl_aux_op, whatever that may be, on CCS or MCS.  This
is a bit cleaner as it separates performing the aux operation from which
blorp helper we have to call to do it.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agointel/isl: Codify AUX operations in an enum
Jason Ekstrand [Tue, 21 Nov 2017 17:16:18 +0000 (09:16 -0800)]
intel/isl: Codify AUX operations in an enum

Right now, we have different entrypoints and enums in blorp for these
different operations.  This provides us a central enum which we can
begin to transition to.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
6 years agor600/sb: Check whether optimizations would result in reladdr conflict
Gert Wollny [Thu, 8 Feb 2018 14:11:58 +0000 (15:11 +0100)]
r600/sb: Check whether optimizations would result in reladdr conflict

v2: * Check whether the node src and dst registers are NULL before using
      them.
    * fix a typo in the commit message.

Two cases are handled with this patch:

1. If copy propagation tries to eliminate a move from a relative
   array access then it could optimize

     MOV R1, ARRAY[RELADDR_1]
     MOV R2, ARRAY[RELADDR_2]
     OP2 R3, R1, R2

   into

     OP2 R3, ARRAY[RELADDR_1], ARRAY[RELADDR_2]

   which is forbidden, because there is only one address register available.

2. When MULADD(x,a,MUL(x,c)) is handled, the sequence

      MUL TMP, R1, ARRAY[RELADDR_1]
      MULADD R3, R1, ARRAY[RELADDR_2], TMP

   could be folded into

      ADD TMP, ARRAY[RELADDR_2], ARRAY[RELADDR_1]
      MUL R3, R1, TMP

   which is also forbidden.

Test for these cases and reject the optimization if a forbidden combination
of relative access would be created.
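
The kind of test described above can be sketched in Python as follows.
This is only an illustration of the idea, not the sb code: the Operand
type and has_reladdr_conflict() are hypothetical, and the sketch assumes
that several operands sharing the same relative address are acceptable
because they need only the one address register.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Operand:
    name: str
    # Name of the relative address this operand is indexed by, if any.
    reladdr: Optional[str] = None


def has_reladdr_conflict(operands: Iterable[Operand]) -> bool:
    """True if one instruction would need more than one relative address."""
    distinct = {op.reladdr for op in operands if op.reladdr is not None}
    return len(distinct) > 1


def try_fold(operands: Iterable[Operand]) -> bool:
    """Accept a rewritten instruction only if it stays encodable."""
    return not has_reladdr_conflict(operands)
```

For case 1, the folded OP2 would read ARRAY[RELADDR_1] and
ARRAY[RELADDR_2] in the same instruction, so try_fold() rejects it and
the two MOVs are kept.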

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agor600g: Implement spilling of temp arrays (v2)
Glenn Kennard [Sun, 5 Mar 2017 17:26:54 +0000 (18:26 +0100)]
r600g: Implement spilling of temp arrays (v2)

Pessimistically spills arrays if GPR limit is exceeded.

v2: fix r600 support [airlied]

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agor600/sb: handle scratch mem reads on r600
Dave Airlie [Tue, 6 Feb 2018 04:17:46 +0000 (14:17 +1000)]
r600/sb: handle scratch mem reads on r600

On r600 we use the scratch mem with read/read_ind, in that case
sb should track the rw_gpr as a dst instead of a src.

This stops the whole shader being optimised out.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agor600g/sb: Add dependency tracking for scratch ops
Glenn Kennard [Sun, 5 Mar 2017 17:26:53 +0000 (18:26 +0100)]
r600g/sb: Add dependency tracking for scratch ops

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agor600g/sb: Support scratch ops
Glenn Kennard [Sun, 5 Mar 2017 17:26:52 +0000 (18:26 +0100)]
r600g/sb: Support scratch ops

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agor600g: Implement scratch buffer state management (v2)
Glenn Kennard [Sun, 5 Mar 2017 17:26:51 +0000 (18:26 +0100)]
r600g: Implement scratch buffer state management (v2)

v2: add Glenn's fixes

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agor600g: Add pending output function
Glenn Kennard [Sun, 5 Mar 2017 17:26:50 +0000 (18:26 +0100)]
r600g: Add pending output function

Spills have to happen after the VLIW bundle currently
processed, so defer emitting the spill op.
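
A minimal Python sketch of the deferral idea, assuming a simple emitter
that appends ops to a list; the Emitter class, its method names, and the
string ops are all hypothetical and only model the ordering constraint
stated above.

```python
class Emitter:
    """Toy emitter: spill ops are queued and flushed after the bundle ends."""

    def __init__(self):
        self.output = []   # ops in final program order
        self.pending = []  # spill ops waiting for the current bundle to close

    def emit_alu(self, op, last_in_bundle, spill=None):
        self.output.append(op)
        if spill is not None:
            # Do not emit the spill now; it would land inside the bundle.
            self.pending.append(spill)
        if last_in_bundle:
            # The bundle is complete, so the queued spills can follow it.
            self.output.extend(self.pending)
            self.pending.clear()
```

Emitting "ADD" with a spill and then "MUL" as the last op of the bundle
yields ADD, MUL, and only then the spill.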

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agor600g: Support emitting scratch ops
Glenn Kennard [Sun, 5 Mar 2017 17:26:49 +0000 (18:26 +0100)]
r600g: Support emitting scratch ops

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
6 years agor600: fix texture gather swizzling.
Dave Airlie [Thu, 8 Feb 2018 06:19:28 +0000 (16:19 +1000)]
r600: fix texture gather swizzling.

This fixes:
KHR-GL45.texture_gather.swizzle
on cayman and redwood.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
6 years agoac: add 64bit support to ac_find_lsb()
Timothy Arceri [Tue, 6 Feb 2018 03:38:57 +0000 (14:38 +1100)]
ac: add 64bit support to ac_find_lsb()

v2: use LLVMBuildTrunc()

Reviewed-by: Marek Olšák <marek.olsak@amd.com>