mesa.git
7 years ago radeonsi: keep track of dirty descriptor sets
Nicolai Hähnle [Fri, 3 Jun 2016 15:40:12 +0000 (17:40 +0200)]
radeonsi: keep track of dirty descriptor sets

Reduces CPU load for draw calls that change none or few of the descriptors.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
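The dirty-tracking idea can be sketched roughly as follows. This is a minimal illustration, not the radeonsi code: the struct, the function names and NUM_DESC_SETS are all hypothetical. Each descriptor set gets one bit in a mask, and a draw uploads only the sets whose bit is set.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_DESC_SETS 8  /* hypothetical count, not the driver's real value */

struct desc_context {
    uint32_t dirty_mask;    /* bit i set => descriptor set i must be re-uploaded */
    unsigned upload_count;  /* total uploads so far, for illustration only */
};

/* Mark one descriptor set as changed since the last upload. */
void mark_dirty(struct desc_context *ctx, unsigned set)
{
    assert(set < NUM_DESC_SETS);
    ctx->dirty_mask |= 1u << set;
}

/* Called once per draw: visit only the dirty sets, then clear the mask.
 * Returns the number of sets that had to be uploaded. */
unsigned upload_dirty_sets(struct desc_context *ctx)
{
    unsigned uploaded = 0;
    uint32_t mask = ctx->dirty_mask;

    while (mask) {
        /* Clear the lowest set bit; a real driver would upload that set here. */
        mask &= mask - 1;
        uploaded++;
    }
    ctx->upload_count += uploaded;
    ctx->dirty_mask = 0;
    return uploaded;
}
```

A draw that changed no descriptors finds the mask empty and does no upload work at all, which is where the CPU saving would come from.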
7 years ago radeonsi: move si_descriptors into a per-context array
Nicolai Hähnle [Fri, 3 Jun 2016 13:56:39 +0000 (15:56 +0200)]
radeonsi: move si_descriptors into a per-context array

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago radeonsi: pass shader stage to si_disable_shader_image
Nicolai Hähnle [Fri, 3 Jun 2016 13:36:45 +0000 (15:36 +0200)]
radeonsi: pass shader stage to si_disable_shader_image

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago radeonsi: access descriptor sets via local variables
Nicolai Hähnle [Fri, 3 Jun 2016 13:14:39 +0000 (15:14 +0200)]
radeonsi: access descriptor sets via local variables

This will simplify moving them to a per-context array.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago radeonsi: add si_set_rw_buffer to be used for internal descriptors
Nicolai Hähnle [Fri, 3 Jun 2016 13:27:09 +0000 (15:27 +0200)]
radeonsi: add si_set_rw_buffer to be used for internal descriptors

So that callers outside of si_descriptors.c need to worry less about the
details of descriptor handling.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago radeonsi: pass shader stage to si_set_shader_image
Nicolai Hähnle [Fri, 3 Jun 2016 13:04:40 +0000 (15:04 +0200)]
radeonsi: pass shader stage to si_set_shader_image

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago radeonsi: pass shader stage to si_set_sampler_view
Nicolai Hähnle [Fri, 3 Jun 2016 13:03:59 +0000 (15:03 +0200)]
radeonsi: pass shader stage to si_set_sampler_view

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago radeonsi: move descriptor set begin_new_cs handling into a separate function
Nicolai Hähnle [Fri, 3 Jun 2016 12:50:42 +0000 (14:50 +0200)]
radeonsi: move descriptor set begin_new_cs handling into a separate function

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago radeonsi: move enabled_mask out of si_descriptors
Nicolai Hähnle [Fri, 3 Jun 2016 12:47:10 +0000 (14:47 +0200)]
radeonsi: move enabled_mask out of si_descriptors

This mask is irrelevant for the generic descriptor set handling, and having it
outside simplifies subsequent changes slightly.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years ago anv/entrypoints: Stop using the C preprocessor
Jason Ekstrand [Mon, 6 Jun 2016 21:29:19 +0000 (14:29 -0700)]
anv/entrypoints: Stop using the C preprocessor

Now that we emit guards for everything, we can just generate the files and
trust build flags to keep us safe.  This should also fix the tarball
problems.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago anv/entrypoints: Emit #if guards for all platforms
Jason Ekstrand [Mon, 6 Jun 2016 21:29:18 +0000 (14:29 -0700)]
anv/entrypoints: Emit #if guards for all platforms

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago platform_android: prevent deadlock in droid_swap_buffers
Haixia Shi [Thu, 2 Jun 2016 19:48:23 +0000 (12:48 -0700)]
platform_android: prevent deadlock in droid_swap_buffers

To avoid blocking other EGL calls, release the display mutex before
we enqueue buffer to android frameworks and re-acquire the mutex
upon return.

v2: moved lock/unlock inside droid_window_enqueue_buffer().

TEST=verify pinch zoom in Photos app no longer causes hangs

Signed-off-by: Haixia Shi <hshi@chromium.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago mesa: automake: distclean git_sha1.h when building OOT
Emil Velikov [Mon, 6 Jun 2016 18:39:40 +0000 (19:39 +0100)]
mesa: automake: distclean git_sha1.h when building OOT

In the case of out-of-tree (OOT) builds, in particular when building
from tarball, we'll end up with the file in both srcdir and builddir.

We want the former to remain intact (since we need it on rebuild) while
the latter should be removed otherwise `make distclean' gets angry at
us.

Ideally there'll be a solution that feels a bit less of a hack. Until
then this does the job exactly as expected.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago mesa: automake: ensure that git_sha1.h.tmp has the right attributes
Emil Velikov [Mon, 6 Jun 2016 16:31:05 +0000 (17:31 +0100)]
mesa: automake: ensure that git_sha1.h.tmp has the right attributes

... when copied from git_sha1.h.

As the latter file can be lacking the write attribute, one should set it
explicitly. Otherwise we'll get a warning/failure at cleanup stage.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago mesa: automake: add directory prefix for git_sha1.h
Emil Velikov [Mon, 6 Jun 2016 15:50:14 +0000 (16:50 +0100)]
mesa: automake: add directory prefix for git_sha1.h

Otherwise the build will assume that we're talking about builddir, which
is not the case in the else statement.

Here the file is already generated and is part of the tarball.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
7 years ago egl: android: don't add the image loader extension for !render_node
Emil Velikov [Sat, 4 Jun 2016 00:09:14 +0000 (01:09 +0100)]
egl: android: don't add the image loader extension for !render_node

With an earlier commit we introduced support for render_node devices, which
was coupled with the use of the image loader extension.

As the work was inspired by egl/wayland we (erroneously) added the
extension for the !render_node path as well.

That works for wayland, as the implementations of the DRI2 and IMAGE
loader extensions converge behind the scenes. As that is not yet
the case for Android we shouldn't expose the extension.

Fixes: 34ddef39cef ("egl: android: add dma-buf fd support")
Cc: <mesa-stable@lists.freedesktop.org>
Reported-by: Mauro Rossi <issor.oruam@gmail.com>
Tested-by: Mauro Rossi <issor.oruam@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
7 years ago gallium/radeon: add support for sharing textures with DCC between processes
Marek Olšák [Thu, 2 Jun 2016 21:36:43 +0000 (23:36 +0200)]
gallium/radeon: add support for sharing textures with DCC between processes

v2: use a function for calculating WORD1 of bo metadata

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago gallium/radeon: don't discard DCC if an external user can write to it
Marek Olšák [Thu, 2 Jun 2016 21:30:01 +0000 (23:30 +0200)]
gallium/radeon: don't discard DCC if an external user can write to it

We don't import textures with DCC now, but soon we will.

v2: if we can't disable DCC for image writes, at least decompress DCC
    at bind time

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
7 years ago i915: fix typo CAP.
Dave Airlie [Tue, 7 Jun 2016 08:30:54 +0000 (18:30 +1000)]
i915: fix typo CAP.

Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years ago glsl: initialise pointer to NULL
Jakob Sinclair [Fri, 3 Jun 2016 23:09:52 +0000 (01:09 +0200)]
glsl: initialise pointer to NULL

Could cause issues if you tried to read from an uninitialised pointer.
This just initialises the pointer to null to avoid that being a problem.
Discovered by Coverity.

CID: 1343616

Signed-off-by: Jakob Sinclair <sinclair.jakob@openmailbox.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
8 years ago i965/gen8: fix cull distance emission for tessellation shaders.
Dave Airlie [Tue, 7 Jun 2016 00:27:44 +0000 (10:27 +1000)]
i965/gen8: fix cull distance emission for tessellation shaders.

This fixes some cases of:
GL45-CTS.cull_distance.functional
on Skylake.

Reviewed-by: Chris Forbes <chrisforbes@google.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years ago nvc0: add support for VOTE tgsi opcodes
Ilia Mirkin [Sun, 29 May 2016 16:42:49 +0000 (12:42 -0400)]
nvc0: add support for VOTE tgsi opcodes

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
8 years ago st/mesa: expose GL_ARB_shader_group_vote when supported by backend
Ilia Mirkin [Sun, 29 May 2016 15:43:05 +0000 (11:43 -0400)]
st/mesa: expose GL_ARB_shader_group_vote when supported by backend

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years ago gallium: add PIPE_CAP_TGSI_VOTE for when the VOTE ops are allowed
Ilia Mirkin [Sun, 29 May 2016 15:39:52 +0000 (11:39 -0400)]
gallium: add PIPE_CAP_TGSI_VOTE for when the VOTE ops are allowed

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years ago gallium: add VOTE_* opcodes to implement GL_ARB_shader_group_vote
Ilia Mirkin [Sun, 29 May 2016 15:01:05 +0000 (11:01 -0400)]
gallium: add VOTE_* opcodes to implement GL_ARB_shader_group_vote

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years ago mesa: hook up core bits of GL_ARB_shader_group_vote
Ilia Mirkin [Sun, 29 May 2016 14:49:03 +0000 (10:49 -0400)]
mesa: hook up core bits of GL_ARB_shader_group_vote

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years ago glsl: Make opt_copy_propagation_elements actually propagate into loops.
Kenneth Graunke [Sat, 30 Apr 2016 05:06:37 +0000 (22:06 -0700)]
glsl: Make opt_copy_propagation_elements actually propagate into loops.

We've had a FINISHME here since Eric originally wrote the code in 2011.
This patch implements his suggested approach, which makes us actually
able to copy propagate into the loops, at the unfortunate cost of making
this pass even more expensive.

The shader-db statistics are basically a wash:

   No change in instruction counts.

   total cycles in shared programs: 78685980 -> 78680730 (-0.01%)
   cycles in affected programs: 2102646 -> 2097396 (-0.25%)
   helped: 48
   HURT: 83

I figured if we're going to do this for one copy propagation pass,
we may as well do it in both.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years ago glsl: Make opt_copy_propagation actually propagate into loops.
Kenneth Graunke [Sat, 30 Apr 2016 05:06:37 +0000 (22:06 -0700)]
glsl: Make opt_copy_propagation actually propagate into loops.

We've had a FINISHME here since Eric originally wrote the code in 2010.
This patch implements his suggested approach, which makes us actually
able to copy propagate into the loops, at the unfortunate cost of making
this pass even more expensive.

The shader-db statistics are not terribly impressive:

   total instructions in shared programs: 9008589 -> 9008613 (0.00%)
   instructions in affected programs: 4293 -> 4317 (0.56%)
   helped: 0
   HURT: 10

   total cycles in shared programs: 78550978 -> 78575760 (0.03%)
   cycles in affected programs: 655426 -> 680208 (3.78%)
   helped: 75
   HURT: 88

   GAINED: 2

Most of the "regressions" appear to be us successfully copy propagating
uniforms, which i965 is loading as pull constants instead of push, so we
occasionally have two pulls instead of one.  That doesn't seem like this
pass's job - it's propagating correctly, and we should be smarter about
pull loads in the backend.

This patch is also useful for a couple of reasons:

1. It can clean up copies created by varying packing (previously, we
   couldn't if the uses were inside a loop).

   This fixes a bug when interpolateAt*() is used on a packed varying
   inside a loop: glsl_to_nir struggles to see through the extra copy
   and mistakenly believes the variable is not an input.

2. It will help propagate uniform array access created by
   lower_const_array_to_uniforms().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
8 years ago nv50/ir: use round toward 0 when converting doubles to integers
Samuel Pitoiset [Mon, 6 Jun 2016 19:12:15 +0000 (21:12 +0200)]
nv50/ir: use round toward 0 when converting doubles to integers

Like floats, we should use the round toward 0 mode instead of the
nearest one (which is the default) for doubles to integers.

This fixes all arb_gpu_shader_fp64 piglits which convert doubles to
integers (16 tests).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
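The difference between the two rounding modes can be illustrated in plain C. This is only a sketch of the arithmetic, not the nv50/ir code generator; both function names are hypothetical, and the "nearest" helper rounds halves away from zero, which is an assumption made for simplicity and may differ from the hardware's tie-breaking rule.

```c
#include <assert.h>
#include <stdint.h>

/* A C double -> integer cast discards the fraction, i.e. rounds toward 0.
 * This is the behaviour GLSL expects from int(double). */
int32_t f64_to_i32_toward_zero(double x)
{
    assert(x > -2147483649.0 && x < 2147483648.0);
    return (int32_t)x;
}

/* Nearest rounding, shown only for contrast (halves go away from zero). */
int32_t f64_to_i32_nearest(double x)
{
    assert(x > -2147483648.0 && x < 2147483647.0);
    return (int32_t)(x < 0.0 ? x - 0.5 : x + 0.5);
}
```

For an input such as 2.7 the two helpers disagree (2 versus 3), which is the kind of mismatch the piglit conversions would expose.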
8 years ago gallium/radeon: don't re-set BO metadata after CMASK deallocation
Marek Olšák [Thu, 2 Jun 2016 21:24:20 +0000 (23:24 +0200)]
gallium/radeon: don't re-set BO metadata after CMASK deallocation

CMASK has no effect on metadata, because it's not sharable.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago st/mesa: change SQRT lowering to fix the game Risen
Marek Olšák [Mon, 30 May 2016 15:43:26 +0000 (17:43 +0200)]
st/mesa: change SQRT lowering to fix the game Risen

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94627
(against nouveau)

Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years ago radeonsi: add a performance tweak for 4 SE parts
Marek Olšák [Fri, 3 Jun 2016 14:20:17 +0000 (16:20 +0200)]
radeonsi: add a performance tweak for 4 SE parts

Ported from Vulkan.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago radeonsi: simplify PRIMGROUP_SIZE computation for tessellation
Marek Olšák [Fri, 3 Jun 2016 14:44:00 +0000 (16:44 +0200)]
radeonsi: simplify PRIMGROUP_SIZE computation for tessellation

Ported from Vulkan.

v2: keep the comment

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago r600g: use hw MSAA resolve for non-trivial resolves
Marek Olšák [Sun, 5 Jun 2016 14:42:26 +0000 (16:42 +0200)]
r600g: use hw MSAA resolve for non-trivial resolves

This improves MSAA resolve performance.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago radeonsi: use hw MSAA resolve for non-trivial resolves
Marek Olšák [Sun, 5 Jun 2016 14:42:26 +0000 (16:42 +0200)]
radeonsi: use hw MSAA resolve for non-trivial resolves

This improves MSAA resolve performance.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years ago mesa/program_resource: return -1 for index if no location.
Dave Airlie [Mon, 23 May 2016 20:41:21 +0000 (06:41 +1000)]
mesa/program_resource: return -1 for index if no location.

The GL4.5 spec quote seems clear on this:
"The value -1 will be returned by either command if an error occurs,
if name does not identify an active variable on programInterface,
or if name identifies an active variable that does not have a valid
location assigned, as described above."

This fixes:
GL45-CTS.program_interface_query.output-built-in

[airlied: use _mesa_program_resource_location_index as
suggested by Eduardo]
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years ago radeonsi: set descriptor dirty mask on shader buffer unbind
Nicolai Hähnle [Fri, 3 Jun 2016 13:17:25 +0000 (15:17 +0200)]
radeonsi: set descriptor dirty mask on shader buffer unbind

Found randomly while skimming the code. This might have caused VM faults in
robustness tests.

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years ago st/mesa: fix resource leak in try_pbo_readpixels
Nicolai Hähnle [Thu, 2 Jun 2016 20:48:52 +0000 (22:48 +0200)]
st/mesa: fix resource leak in try_pbo_readpixels

Found by inspection after seeing
https://bugs.freedesktop.org/show_bug.cgi?id=96343

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years ago tgsi: fix mixed data type comparison in tgsi_point_sprite.c
Charmaine Lee [Fri, 3 Jun 2016 21:26:23 +0000 (14:26 -0700)]
tgsi: fix mixed data type comparison in tgsi_point_sprite.c

Cast the unsigned semantic index to integer datatype before comparing
to max_generic, otherwise, max_generic which is initialized to -1
will be converted to unsigned int before the comparison, causing a wrong
semantic index to be assigned to a shader output.

Fixes the assert running TurboCAD_gl.trace. (VMware bug 1667265)

Also tested with glretrace, mesa demos pointblast, spriteblast and pointcoord.

v2: use the original max_generic variable but add the (int) cast
    to the semantic index, as suggested by Brian.

Reviewed-by: Brian Paul <brianp@vmware.com>
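The conversion pitfall described above can be reproduced in a few lines. This is a standalone sketch with hypothetical function names, not the code from tgsi_point_sprite.c. The first helper spells out the conversion that C applies implicitly when an unsigned value is compared with an int; the second applies the (int) cast from v2.

```c
#include <assert.h>
#include <limits.h>

/* What the original comparison effectively did: max_generic is converted
 * to unsigned, so -1 becomes UINT_MAX and no index compares greater. */
int exceeds_max_converted(unsigned sem_index, int max_generic)
{
    return sem_index > (unsigned)max_generic;
}

/* The fix: cast the semantic index to int so the comparison is signed. */
int exceeds_max_fixed(unsigned sem_index, int max_generic)
{
    assert(sem_index <= (unsigned)INT_MAX);
    return (int)sem_index > max_generic;
}
```

With max_generic still at its initial value of -1, index 0 is reported as "not greater" by the first helper and as "greater" by the second.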
8 years ago svga: print shader linkage info when tgsi debug bit is on
Charmaine Lee [Fri, 3 Jun 2016 21:24:19 +0000 (14:24 -0700)]
svga: print shader linkage info when tgsi debug bit is on

When TGSI debug flag is enabled, print the shader linkage info as well.

Tested with mesa demos with SVGA_DEBUG=tgsi

Reviewed-by: Brian Paul <brianp@vmware.com>
8 years ago st/mesa: check shader image format support before using PBO download
Ilia Mirkin [Sun, 5 Jun 2016 22:56:12 +0000 (18:56 -0400)]
st/mesa: check shader image format support before using PBO download

ARB_shader_image_load_store only requires a very fixed list of formats
to be supported, while textures may be in all kinds of formats, like
BGRA which are presently not supported on at least Kepler.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years ago tgsi: use truncf in micro_trunc
Lars Hamre [Thu, 26 May 2016 22:30:24 +0000 (18:30 -0400)]
tgsi: use truncf in micro_trunc

Switches to using truncf in micro_trunc.

Fixes the following piglit tests (for softpipe):

/spec/glsl-1.30/execution/built-in-functions/...
fs-trunc-float
fs-trunc-vec2
fs-trunc-vec3
fs-trunc-vec4
vs-trunc-float
vs-trunc-vec2
vs-trunc-vec3
vs-trunc-vec4

/spec/glsl-1.50/execution/built-in-functions/...
gs-trunc-float
gs-trunc-vec2
gs-trunc-vec3
gs-trunc-vec4

Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
8 years ago i965/gs/scalar: Fix load input for doubles
Samuel Iglesias Gonsálvez [Fri, 27 May 2016 09:59:48 +0000 (11:59 +0200)]
i965/gs/scalar: Fix load input for doubles

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years ago i965/fs: fix offset when loading double vector input varyings
Samuel Iglesias Gonsálvez [Thu, 26 May 2016 05:56:37 +0000 (07:56 +0200)]
i965/fs: fix offset when loading double vector input varyings

When we are not packing a double input varying, we might need to
read its data at an offset that is not 64-bit aligned, so we read
the wrong data. This is happening when using explicit locations
in varyings because Mesa disables packing varying for that case.

const_index is in 32-bit size units but offset() is multiplying
it by destination type size units. When operating with double
input varyings, const_index value could be not aligned to 64 bits.
To fix it, we load the double vector as if it was a float based vector
with twice the number of components.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
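The unit mismatch can be made concrete with a little arithmetic. The helpers below are hypothetical and only model the byte offsets; they are not the i965 offset() implementation.

```c
#include <assert.h>

/* Byte offset when the index is scaled by the destination type size,
 * which is how the commit message describes offset(). */
unsigned offset_scaled_by_type(unsigned const_index, unsigned type_size)
{
    assert(type_size == 4 || type_size == 8);
    return const_index * type_size;
}

/* Byte offset when the double vector is treated as a float vector:
 * const_index is already in 32-bit units, so scale by 4 bytes. */
unsigned offset_as_floats(unsigned const_index)
{
    return const_index * 4;
}

/* A double vector read as floats needs twice as many components. */
unsigned float_components(unsigned double_components)
{
    assert(double_components >= 1 && double_components <= 4);
    return double_components * 2;
}
```

For const_index 3, scaling by the 8-byte double size gives byte 24, while the data actually starts at byte 12; reading the vector as floats lands on the right offset.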
8 years ago i965/fs: fix FS_OPCODE_CINTERP for unpacked double input varyings
Samuel Iglesias Gonsálvez [Thu, 26 May 2016 05:56:38 +0000 (07:56 +0200)]
i965/fs: fix FS_OPCODE_CINTERP for unpacked double input varyings

Data starts at suboffset 3 in 32-bit units (12 bytes), so it is not
64-bit aligned and the current implementation fails to read the data
properly. Instead, when there is a double input varying, read it as
a vector of floats with twice the number of components.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years ago glsl: geom shader max_vertices layout must match.
Dave Airlie [Fri, 3 Jun 2016 00:45:07 +0000 (10:45 +1000)]
glsl: geom shader max_vertices layout must match.

From GLSL 4.5 spec, "4.4.2.3 Geometry Outputs".
"all geometry shader output vertex count declarations in a
program must declare the same count."

Fixes:
GL45-CTS.geometry_shader.output.conflicted_output_vertices_max

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
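The rule quoted from the spec amounts to a simple link-time consistency check. A minimal sketch follows, assuming each geometry shader in the program reports either its declared count or -1 when it has no declaration; the function name and return codes are hypothetical, not the Mesa linker's.

```c
#include <assert.h>

/* counts[i] is the max_vertices declared by shader i, or -1 if that shader
 * has no declaration. Returns the agreed count, -1 if no shader declared
 * one, or -2 if two declarations conflict (which would be a link error). */
int link_max_vertices(const int *counts, unsigned n)
{
    int agreed = -1;
    unsigned i;

    assert(counts != 0 || n == 0);
    for (i = 0; i < n; i++) {
        if (counts[i] < 0)
            continue;               /* no declaration in this shader */
        if (agreed >= 0 && agreed != counts[i])
            return -2;              /* mismatch: reject the program */
        agreed = counts[i];
    }
    return agreed;
}
```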
8 years ago anv/pipeline: Add support for caching the push constant map
Jason Ekstrand [Sat, 4 Jun 2016 22:10:22 +0000 (15:10 -0700)]
anv/pipeline: Add support for caching the push constant map

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
8 years ago glsl: use enum glsl_interface_packing in more places. (v2)
Dave Airlie [Wed, 11 May 2016 00:49:19 +0000 (10:49 +1000)]
glsl: use enum glsl_interface_packing in more places. (v2)

Although the glsl_types.h stores this in a bitfield,
we should hide that from everyone else. Hide the cast
in an accessor method and use the enum everywhere.

This makes things a bit nicer in gdb, and improves type
safety.

v2: fix a few pieces of interface I missed that caused some
piglit regressions.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
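The accessor pattern can be sketched in C as below. The enum and struct are stand-ins invented for illustration (the real type lives in glsl_types.h and is C++); the point is only that the bitfield and its cast stay private to two small functions.

```c
#include <assert.h>

/* Stand-in for glsl_interface_packing; names are illustrative only. */
enum interface_packing {
    PACKING_STD140,
    PACKING_SHARED,
    PACKING_PACKED
};

struct type_info {
    unsigned interface_packing:2;  /* stored compactly as a bitfield */
};

/* Store the enum in the bitfield; callers never see the raw integer. */
void set_interface_packing(struct type_info *t, enum interface_packing p)
{
    assert(p <= PACKING_PACKED);
    t->interface_packing = (unsigned)p;
}

/* The only place the cast back to the enum happens. */
enum interface_packing get_interface_packing(const struct type_info *t)
{
    return (enum interface_packing)t->interface_packing;
}
```

Because callers receive the enum type, a debugger can print the symbolic name rather than a bare number, and the compiler can check switch statements over it.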
8 years ago i965: don't use NumLayers for 3D textures.
Dave Airlie [Fri, 3 Jun 2016 01:36:38 +0000 (11:36 +1000)]
i965: don't use NumLayers for 3D textures.

For 3D textures we shouldn't be using NumLayers, we need
to get it from the depth.

This fixes:
GL45-CTS.geometry_shader.layered_framebuffer.clear_call_support

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years ago glsl: for anonymous struct matching use without_array() (v3)
Dave Airlie [Mon, 6 Jun 2016 00:33:51 +0000 (10:33 +1000)]
glsl: for anonymous struct matching use without_array() (v3)

With tessellation shaders we can have cases where we have
arrays of anon structs, so make sure we match using without_array().

Fixes:
GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_in

v2:
test lengths match as well (Ilia)
v3:
descend array lengths to check for matches as well (Ilia)

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years ago glsl/ast: don't crash when func_name is NULL
Dave Airlie [Tue, 3 May 2016 04:39:06 +0000 (14:39 +1000)]
glsl/ast: don't crash when func_name is NULL

This fixes a crash in
GL43-CTS.shader_subroutine.subroutines_not_allowed_as_variables_constructors_and_argument_or_return_types

If we can't find the func_name in one of these paths,
we have emitted an earlier error so just return here.

Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years ago glsl: handle ast_aggregate in has_sequence_subexpression. (v2)
Dave Airlie [Tue, 3 May 2016 07:16:27 +0000 (17:16 +1000)]
glsl: handle ast_aggregate in has_sequence_subexpression. (v2)

GL43-CTS.compute_shader.work-group-size does
uniform uint g_uniform[gl_WorkGroupSize.z + 20] = { 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24 };

The initializer triggers the GLSL 4.30/GLES3 tests
for constant sequence subexpressions, so it doesn't
happen unless you are using those, so just return
false as this path is now reachable.

v2: update commit msg with diagnosis
Acked-by: Timothy Arceri <timothy.arceri@collabora.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years ago mesa: Try to unbreak the MSVC build.
Kenneth Graunke [Sun, 5 Jun 2016 23:31:11 +0000 (16:31 -0700)]
mesa: Try to unbreak the MSVC build.

PATH_MAX is apparently not a thing on Windows.  Borrow the hack from
pipe_loader.c to try and make this work.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
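A portability guard of this kind usually looks something like the sketch below. It is written from memory of the general pattern, not copied from pipe_loader.c; the 4096 fallback and the helper function are assumptions added so the snippet stands alone.

```c
#include <assert.h>
#include <limits.h>   /* defines PATH_MAX on most POSIX systems */
#include <stdlib.h>   /* defines _MAX_PATH under MSVC */

/* MSVC has no PATH_MAX, but it does provide _MAX_PATH. */
#if !defined(PATH_MAX) && defined(_MSC_VER)
#define PATH_MAX _MAX_PATH
#endif

/* Assumed last-resort fallback for this sketch only. */
#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/* Buffer size used for path strings in this sketch. */
unsigned path_buffer_size(void)
{
    assert(PATH_MAX > 0);
    return (unsigned)PATH_MAX;
}
```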
8 years ago mesa: Add MESA_SHADER_CAPTURE_PATH for writing .shader_test files.
Kenneth Graunke [Sun, 7 Sep 2014 03:26:51 +0000 (20:26 -0700)]
mesa: Add MESA_SHADER_CAPTURE_PATH for writing .shader_test files.

This writes linked shader programs to .shader_test files to
$MESA_SHADER_CAPTURE_PATH in the format used by shader-db
(http://cgit.freedesktop.org/mesa/shader-db).

It supports both GLSL shaders and ARB programs.  All stages that
are linked together are written in a single .shader_test file.

This eliminates the need for shader-db's split-to-files.py, as Mesa
produces the desired format directly.  It's much more reliable than
parsing stdout/stderr, as those may contain extraneous messages, or
simply be closed by the application and unavailable.

We have many similar features already, but this is a bit different:
- MESA_GLSL=dump writes to stdout, not files.
- MESA_GLSL=log writes each stage to separate files (rather than
  all linked shaders in one file), at draw time (not link time),
  with uniform data and state flag info.
- Tapani's shader replacement mechanism (MESA_SHADER_DUMP_PATH and
  MESA_SHADER_READ_PATH) also uses separate files per shader stage,
  but allows reading in files to replace an app's shader code.

v2:  Dump ARB programs too, not just GLSL.
v3:  Don't dump bogus 0.shader_test file.
v4:  Add "GL_ARB_separate_shader_objects" to the [require] block.
v5:  Print "GLSL 4.00" instead of "GLSL 4.0" in the [require] block.
v6:  Don't hardcode /tmp/mesa.
v7:  Fix memoization of getenv().
v8:  Also print "SSO ENABLED" (suggested by Timothy).
v9:  Also handle ES shaders (suggested by Ilia).
v10: Guard against MESA_SHADER_CAPTURE_PATH being too long; add
     _mesa_warning calls on error handling (suggested by Ben).
v11: Fix crash when variable is unset introduced in v10.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
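Two of the revision notes (memoizing getenv() and guarding against an overly long path) can be sketched together. The function names and the "<dir>/<id>.shader_test" naming are assumptions made for illustration; only the MESA_SHADER_CAPTURE_PATH variable name comes from the commit itself.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns the capture directory, or NULL if the variable is unset.
 * The lookup is done once; the result, including NULL, is cached. */
const char *capture_path(void)
{
    static int read_env = 0;
    static const char *path = NULL;

    if (!read_env) {
        path = getenv("MESA_SHADER_CAPTURE_PATH");
        read_env = 1;
    }
    return path;
}

/* Writes "<dir>/<id>.shader_test" into buf.
 * Returns 0 on success, -1 if dir is NULL or the name does not fit. */
int build_capture_filename(const char *dir, unsigned id,
                           char *buf, size_t size)
{
    int len;

    assert(buf != NULL && size > 0);
    if (dir == NULL)
        return -1;  /* variable unset: nothing to capture */
    len = snprintf(buf, size, "%s/%u.shader_test", dir, id);
    if (len < 0 || (size_t)len >= size)
        return -1;  /* output would have been truncated */
    return 0;
}
```

Tracking the "already read" state in a separate flag keeps an unset variable from being looked up again on every call, and the NULL check avoids dereferencing it.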
8 years ago nv50,nvc0: fix BGR10_A2UI vertex format
Ilia Mirkin [Sun, 5 Jun 2016 19:00:36 +0000 (15:00 -0400)]
nv50,nvc0: fix BGR10_A2UI vertex format

This is mostly academic as this is not reachable from GL, which only has
the packed RGB10_A2UI vertex format.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years ago nvc0: do not clear surfaces bins in the validate function
Samuel Pitoiset [Sun, 5 Jun 2016 16:53:26 +0000 (18:53 +0200)]
nvc0: do not clear surfaces bins in the validate function

We should not call nouveau_bufctx_reset() inside a validate function.
This only affects Fermi where images are aliased between 3D and CP.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years ago nvc0: re-validate images after launching a grid on Fermi
Samuel Pitoiset [Sun, 5 Jun 2016 16:01:19 +0000 (18:01 +0200)]
nvc0: re-validate images after launching a grid on Fermi

Image invalidation is a bit weird on Fermi and there is already a hack
which forces invalidating all images when launching a compute shader
to help in fixing 3D<->CP interaction.

However, we need to re-validate images for compute because
nvc0_compute_invalidate_surfaces() will destroy the previous binding.
This is not really good for performance purposes but this might be
improved later.

This fixes the following piglits:
- spec/arb_compute_shader/execution/basic-uniform-access
- spec/arb_compute_shader/execution/mutiple-texture-reading
- spec/arb_compute_shader/execution/multiple-workgroups
- spec/glsl-4.30/execution/built-in-functions/cs-* (207 tests)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years ago radeonsi: fix images with level > 0
Marek Olšák [Fri, 3 Jun 2016 17:17:46 +0000 (19:17 +0200)]
radeonsi: fix images with level > 0

This should fix spec@arb_shader_image_load_store@level.

Broken by:
    Commit: 95c5bbae66af3ca1f805d94f6fe8d8e4ba2c9c43
    radeonsi: set some image descriptor fields at bind time

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
8 years ago nvc0: reduce overhead from always marking images dirty
Ilia Mirkin [Sat, 4 Jun 2016 18:13:38 +0000 (14:13 -0400)]
nvc0: reduce overhead from always marking images dirty

We would revalidate images when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSO's to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the images.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
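One way to read "ensure that something has changed" is to compare the incoming binding with what is already bound and set the dirty flag only on a real difference. The sketch below assumes that reading; the structs, fields and function name are hypothetical and do not mirror the nvc0 data structures.

```c
#include <assert.h>

#define MAX_IMAGES 8  /* hypothetical slot count */

struct image_binding {
    unsigned resource_id;  /* 0 means nothing is bound */
    unsigned format;
    unsigned level;
};

struct image_state {
    struct image_binding slots[MAX_IMAGES];
    unsigned dirty;        /* non-zero => revalidate before the next draw */
};

/* Bind an image; flag the state dirty only if the slot really changes.
 * Returns 1 if something changed, 0 otherwise. */
int bind_image(struct image_state *st, unsigned slot,
               const struct image_binding *b)
{
    struct image_binding *cur;

    assert(slot < MAX_IMAGES);
    cur = &st->slots[slot];
    if (cur->resource_id == b->resource_id &&
        cur->format == b->format &&
        cur->level == b->level)
        return 0;          /* identical rebind: skip revalidation */
    *cur = *b;
    st->dirty = 1;
    return 1;
}
```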
8 years ago nvc0: reduce overhead from always marking buffers dirty
Ilia Mirkin [Sat, 4 Jun 2016 17:50:21 +0000 (13:50 -0400)]
nvc0: reduce overhead from always marking buffers dirty

We would revalidate buffers when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSO's to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the SSBOs.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years ago nvc0: fix memory barrier flag handling
Ilia Mirkin [Fri, 3 Jun 2016 01:36:04 +0000 (21:36 -0400)]
nvc0: fix memory barrier flag handling

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years ago nvc0: mark bound buffer range valid
Ilia Mirkin [Fri, 3 Jun 2016 01:42:14 +0000 (21:42 -0400)]
nvc0: mark bound buffer range valid

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years ago anv/entrypoints: don't go using wayland/xcb unless they are configured
Dave Airlie [Sat, 4 Jun 2016 20:49:42 +0000 (06:49 +1000)]
anv/entrypoints: don't go using wayland/xcb unless they are configured

The fix in:
anv: let anv_entrypoints_gen.py generate proper Wayland/Xcb guards

breaks things if wayland headers aren't installed.

Separate things out properly to avoid that problem.

[airlied: fixed up to put in pre-existing sections].
Reported-by: Arjan van de Ven
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years ago gallium/radeon: don't use the DMA ring for pipelined buffer uploads
Marek Olšák [Thu, 26 May 2016 16:20:42 +0000 (18:20 +0200)]
gallium/radeon: don't use the DMA ring for pipelined buffer uploads

Submitting a DMA IB flushes the GFX IB and all GPU caches.

Vedran Miletić said:
  "On Tonga 380X, this improves The Talos Principle from 8.3 fps to 28.3 fps
   (all graphics settings Ultra, 4xAA, 1080p resolution with downsampling
   from 1200p)."

Some anonymous dude said:
   R9 390 results:
      Tomb Raider (normal settings): 80 -> 88 FPS
      Talos Principle (custom settings): 23 -> 56 FPS
      Metro Last Light Redux (default benchmark settings): 39 -> 40 FPS

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Vedran Miletić <vedran@miletic.net>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
8 years ago r600g: don't flush caches when binding shader resources
Marek Olšák [Thu, 26 May 2016 16:14:27 +0000 (18:14 +0200)]
r600g: don't flush caches when binding shader resources

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
8 years agor600g: only do necessary cache flushes in cp_dma_copy_buffer
Marek Olšák [Thu, 26 May 2016 15:25:46 +0000 (17:25 +0200)]
r600g: only do necessary cache flushes in cp_dma_copy_buffer

The main impact is that {upload, draw, upload, draw, ..} doesn't flush
framebuffer caches before every upload.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
8 years agor600g: only do necessary cache flushes in cp_dma_clear_buffer
Marek Olšák [Thu, 26 May 2016 15:18:13 +0000 (17:18 +0200)]
r600g: only do necessary cache flushes in cp_dma_clear_buffer

The main impact is that fast color clear doesn't flush TC, CONST, DB.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
8 years agor600g: remove a CP DMA workaround that's not needed anymore
Marek Olšák [Wed, 1 Jun 2016 16:39:53 +0000 (18:39 +0200)]
r600g: remove a CP DMA workaround that's not needed anymore

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
8 years agor600g: fix CP DMA hazard with index buffer fetches (v3)
Marek Olšák [Thu, 26 May 2016 20:00:03 +0000 (22:00 +0200)]
r600g: fix CP DMA hazard with index buffer fetches (v3)

v3: use PFP_SYNC_ME on EG-CM only when supported by the kernel,
    otherwise use MEM_WRITE + WAIT_REG_MEM to emulate that

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
8 years agor600g: properly sync CP with CP DMA on R6xx
Marek Olšák [Wed, 1 Jun 2016 16:35:33 +0000 (18:35 +0200)]
r600g: properly sync CP with CP DMA on R6xx

This will allow removing useless cache & IB flushes.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
8 years agor600g: write WAIT_UNTIL in the correct place
Marek Olšák [Tue, 31 May 2016 21:07:15 +0000 (23:07 +0200)]
r600g: write WAIT_UNTIL in the correct place

This has been wrong all along. Fixing this will allow removing useless
cache flushes.

Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
8 years agogallium/radeon: rename allocator_so_filled_size -> allocator_zeroed_memory
Marek Olšák [Tue, 31 May 2016 17:11:54 +0000 (19:11 +0200)]
gallium/radeon: rename allocator_so_filled_size -> allocator_zeroed_memory

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
8 years agogallium/u_suballoc: allow different alignment for each allocation
Marek Olšák [Tue, 31 May 2016 17:06:45 +0000 (19:06 +0200)]
gallium/u_suballoc: allow different alignment for each allocation

Just move the alignment parameter from u_suballocator_create
to u_suballocator_alloc.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
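The u_suballoc change above moves the alignment parameter from creation time to each allocation call. A minimal sketch of that idea follows, assuming a simple bump allocator; `bump_suballoc`, `align_up` and `bump_suballoc_alloc` are hypothetical names for illustration and are not Mesa's u_suballocator API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bump suballocator; NOT Mesa's u_suballocator. */
struct bump_suballoc {
    size_t size;   /* total bytes in the backing buffer */
    size_t offset; /* first free byte */
};

/* Round offset up to a power-of-two alignment. */
static size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* The alignment is chosen per call rather than fixed at creation time.
 * Returns 0 and stores the start offset in *out, or -1 if it does not fit. */
static int bump_suballoc_alloc(struct bump_suballoc *a, size_t bytes,
                               size_t alignment, size_t *out)
{
    size_t start = align_up(a->offset, alignment);

    if (start > a->size || bytes > a->size - start)
        return -1;

    *out = start;
    a->offset = start + bytes;
    return 0;
}
```

With this shape, one allocator can serve callers with different alignment requirements instead of needing one allocator per alignment.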
8 years agoanv/blit: Use CLAMP_TO_EDGE for scaled blits
Jason Ekstrand [Thu, 2 Jun 2016 23:34:11 +0000 (16:34 -0700)]
anv/blit: Use CLAMP_TO_EDGE for scaled blits

When upscaling you can end up interpolating between the edge pixel and one
past the edge.  Using CLAMP_TO_EDGE seems like the most reasonable thing to
do in this case.  This fixes two of the new Vulkan CTS tests in
dEQP-VK.api.copy_and_blit.blit_image.*

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
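The CLAMP_TO_EDGE reasoning in the anv/blit commit above can be illustrated with a 1-D linear filter. This is a simplified sketch under stated assumptions (unnormalized coordinates, texel centers at i + 0.5); the function names are hypothetical and this is not the anv sampler setup itself.

```c
/* Floor of v as an int (valid while v fits in an int). */
static int floor_to_int(float v)
{
    int i = (int)v; /* truncates toward zero */
    return ((float)i > v) ? i - 1 : i;
}

/* Clamp a texel index into [0, width - 1], as CLAMP_TO_EDGE does. */
static int clamp_to_edge(int i, int width)
{
    if (i < 0)
        return 0;
    if (i >= width)
        return width - 1;
    return i;
}

/* Linearly filter a 1-D row of texels at unnormalized coordinate x. */
static float sample_linear_clamped(const float *texels, int width, float x)
{
    float u = x - 0.5f;        /* shift so texel centers sit on integers */
    int base = floor_to_int(u);
    float frac = u - (float)base;
    int i0 = clamp_to_edge(base, width);
    int i1 = clamp_to_edge(base + 1, width);

    return texels[i0] * (1.0f - frac) + texels[i1] * frac;
}
```

Near either edge both taps resolve to the edge texel, so the interpolated value is the edge value rather than a blend with whatever lies one past the edge.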
8 years agoanv/copy: Account for the anv_surface.offset when creating a blit2d_surf
Jason Ekstrand [Thu, 2 Jun 2016 23:25:44 +0000 (16:25 -0700)]
anv/copy: Account for the anv_surface.offset when creating a blit2d_surf

This was causing problems if the user tried to copy to/from the stencil
portion of a combined depth/stencil image.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agonir/spirv: Make a decoration switch complete
Jason Ekstrand [Thu, 2 Jun 2016 21:36:58 +0000 (14:36 -0700)]
nir/spirv: Make a decoration switch complete

Getting rid of the default case makes the compiler warn if we are missing
cases.  While we're here, we also add the one missing case.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
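The point of the decoration-switch commit above is that a switch over an enum with no default case lets the compiler report missing enumerators (for example via -Wswitch in GCC and Clang, which -Wall enables). A small sketch with a hypothetical enum, not the real SPIR-V decoration list:

```c
/* Hypothetical enum for illustration; NOT the SPIR-V decoration enum. */
enum sketch_decoration {
    SKETCH_DEC_BLOCK,
    SKETCH_DEC_BUILTIN,
    SKETCH_DEC_LOCATION
};

/* No default case: if an enumerator is added to sketch_decoration and not
 * handled here, the compiler can warn about the unhandled value. */
static const char *sketch_decoration_name(enum sketch_decoration dec)
{
    switch (dec) {
    case SKETCH_DEC_BLOCK:
        return "Block";
    case SKETCH_DEC_BUILTIN:
        return "BuiltIn";
    case SKETCH_DEC_LOCATION:
        return "Location";
    }

    /* Reached only for values outside the enum. */
    return "unknown";
}
```

Adding a `default:` label to this switch would silence the warning and hide any case that was forgotten.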
8 years agonir/spirv: Make unhandled decorations and capabilities non-fatal
Jason Ekstrand [Thu, 2 Jun 2016 21:34:15 +0000 (14:34 -0700)]
nir/spirv: Make unhandled decorations and capabilities non-fatal

glslang frequently throws bogus decorations into shaders.  While we are free
to assert-fail, it's a bit nicer to the application to just warn.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agonir/spirv: Add a way to print non-fatal warnings
Jason Ekstrand [Thu, 2 Jun 2016 21:32:56 +0000 (14:32 -0700)]
nir/spirv: Add a way to print non-fatal warnings

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agonir/spirv: Add string lookup tables for a couple of SPIR-V enums
Jason Ekstrand [Thu, 2 Jun 2016 21:06:30 +0000 (14:06 -0700)]
nir/spirv: Add string lookup tables for a couple of SPIR-V enums

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agonir/spirv: Complete the list of capabilities
Jason Ekstrand [Thu, 2 Jun 2016 20:43:19 +0000 (13:43 -0700)]
nir/spirv: Complete the list of capabilities

Previously we supported a subset of capabilities and just left a default
case for the others.  It's time to stop being lazy and actually audit the
capabilities.  This should bring them up-to-date with reality.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agoanv/pipeline: Add support for early depth stencil
Jason Ekstrand [Wed, 1 Jun 2016 03:16:01 +0000 (20:16 -0700)]
anv/pipeline: Add support for early depth stencil

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agomesa: Get rid of _mesa_active_fragment_shader_has_side_effects
Jason Ekstrand [Thu, 2 Jun 2016 01:53:32 +0000 (18:53 -0700)]
mesa: Get rid of _mesa_active_fragment_shader_has_side_effects

It is no longer used.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years agoi965/ps_state: Use wm_prog_data.has_side_effects
Jason Ekstrand [Thu, 2 Jun 2016 01:55:35 +0000 (18:55 -0700)]
i965/ps_state: Use wm_prog_data.has_side_effects

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years agoi965/fs: Add a wm_prog_data bit for has_side_effects
Jason Ekstrand [Thu, 2 Jun 2016 01:46:30 +0000 (18:46 -0700)]
i965/fs: Add a wm_prog_data bit for has_side_effects

This is more accurate than calling
_mesa_active_fragment_shader_has_side_effects because it looks at whether
or not the SSBOs, images, or atomic buffers are actually written rather
than just existing in the program.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agonir/info: Get rid of uses_interp_var_at_offset
Jason Ekstrand [Thu, 2 Jun 2016 01:29:09 +0000 (18:29 -0700)]
nir/info: Get rid of uses_interp_var_at_offset

We were using this briefly in the i965 driver to trigger recompiles but we
haven't been using it since we switched to the NIR y-transform lowering
pass.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
8 years agoanv/pipeline: Silently pass tests if depth or stencil is missing
Jason Ekstrand [Wed, 1 Jun 2016 05:23:18 +0000 (22:23 -0700)]
anv/pipeline: Silently pass tests if depth or stencil is missing

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agoanv/pipeline: Unify gen7/8 emit_ds_state
Jason Ekstrand [Wed, 1 Jun 2016 05:19:53 +0000 (22:19 -0700)]
anv/pipeline: Unify gen7/8 emit_ds_state

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agogenxml/gen6,7,75: s/BackFace/Backface
Jason Ekstrand [Wed, 1 Jun 2016 05:15:38 +0000 (22:15 -0700)]
genxml/gen6,7,75: s/BackFace/Backface

This is more consistent with gen8+

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agonir/spirv: Handle the WorkgroupSize builtin decoration
Jason Ekstrand [Wed, 1 Jun 2016 18:20:22 +0000 (11:20 -0700)]
nir/spirv: Handle the WorkgroupSize builtin decoration

This fixes the 7 dEQP-VK.pipeline.spec_constant.compute.local_size.* tests
in the latest dev version of the Vulkan CTS.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agonir/spirv: Use breaks instead of returns in constant handling
Jason Ekstrand [Wed, 1 Jun 2016 17:34:04 +0000 (10:34 -0700)]
nir/spirv: Use breaks instead of returns in constant handling

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agoanv/pipeline: Refactor specialization constant handling a bit
Jason Ekstrand [Tue, 31 May 2016 23:27:19 +0000 (16:27 -0700)]
anv/pipeline: Refactor specialization constant handling a bit

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agonir/lower_indirect_derefs: Use the direct array deref for recursion
Jason Ekstrand [Tue, 31 May 2016 22:02:10 +0000 (15:02 -0700)]
nir/lower_indirect_derefs: Use the direct array deref for recursion

This fixes about 100 of the new Vulkan CTS tests.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agoanv/clear: Handle ClearImage on 3-D images
Jason Ekstrand [Tue, 31 May 2016 18:26:06 +0000 (11:26 -0700)]
anv/clear: Handle ClearImage on 3-D images

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
8 years agoRevert "i965/fs: Allow scalar source regions on SNB math instructions."
Francisco Jerez [Fri, 3 Jun 2016 19:32:15 +0000 (12:32 -0700)]
Revert "i965/fs: Allow scalar source regions on SNB math instructions."

This reverts commit c1107cec44ab030c7fcc97c67baa12df1cc9d7b5.
Apparently the hardware spec text I quoted in the commit message was
outright lying about scalar source math being supported on SNB; the
hardware seems to load 32 contiguous bits of data for each channel
regardless of the regioning mode.  Fixes regressions in the following
CTS tests (which we didn't catch early due to CTS being temporarily
disabled in our CI system):

   es2-cts.gtf.gl.atan.atan_vec3_frag_xvary
   es2-cts.gtf.gl.cos.cos_vec2_frag_xvary
   es2-cts.gtf.gl.atan.atan_vec2_frag_xvary
   es2-cts.gtf.gl.pow.pow_vec2_frag_xvary_yconsthalf
   es2-cts.gtf.gl.cos.cos_float_frag_xvary
   es2-cts.gtf.gl.pow.pow_float_frag_xvary_yconsthalf
   es2-cts.gtf.gl.atan.atan_vec3_frag_xvaryyvary
   es2-cts.gtf.gl.pow.pow_vec3_frag_xvary_yconsthalf
   es2-cts.gtf.gl.cos.cos_vec3_frag_xvary
   es2-cts.gtf.gl.atan.atan_vec2_frag_xvaryyvary

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96346
Reported-by: Mark Janes <mark.a.janes@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
8 years agoi965/vec4: Fix cmod propagation not to propagate non-identity cmod into CMP(N).
Francisco Jerez [Wed, 1 Jun 2016 23:27:52 +0000 (16:27 -0700)]
i965/vec4: Fix cmod propagation not to propagate non-identity cmod into CMP(N).

The conditional mod of these instructions determines the semantics of
the comparison itself (rather than being evaluated based on the result
of the instruction as is usually the case for most other instructions
that allow conditional mods), so it's in general not legal to
propagate a conditional mod into a CMP instruction.  This prevents
cmod propagation from (mis)optimizing:

 cmp.z.f0 tmp, ...
 mov.z.f0 null, tmp

into:

 cmp.z.f0 tmp, ...

which gives the negation of the flag result of the original sequence.
I originally noticed this while working on SIMD32 in the scalar
back-end, but the same scenario is likely to be possible in vec4
programs so this commit ports the bugfix with the same name from the
scalar back-end to the vec4 cmod propagation pass.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
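The negation described in the cmod propagation commit above can be checked with a tiny software model. This is a sketch under simplifying assumptions: a single channel, CMP writing all ones to its destination when the comparison holds and zero otherwise, and hypothetical helper names. It is not code from the i965 backend.

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_cmp_result {
    uint32_t dst; /* value written to tmp */
    bool flag;    /* value written to f0 */
};

/* Model of "cmp.z.f0 tmp, a, b": the cmod selects the comparison itself,
 * so both tmp and f0 reflect "a == b". */
static struct sketch_cmp_result sketch_cmp_z(uint32_t a, uint32_t b)
{
    bool equal = (a == b);
    struct sketch_cmp_result r = { equal ? 0xffffffffu : 0u, equal };
    return r;
}

/* Model of "mov.z.f0 null, src": the cmod is evaluated on the result,
 * so f0 becomes "src == 0". */
static bool sketch_mov_z_flag(uint32_t src)
{
    return src == 0;
}

/* Flag left by the original two-instruction sequence. */
static bool flag_original(uint32_t a, uint32_t b)
{
    struct sketch_cmp_result r = sketch_cmp_z(a, b);
    return sketch_mov_z_flag(r.dst);
}

/* Flag left after the MOV is (incorrectly) folded into the CMP. */
static bool flag_after_bad_propagation(uint32_t a, uint32_t b)
{
    return sketch_cmp_z(a, b).flag;
}
```

In this model the two functions return opposite values for every input, which matches the commit's statement that the folded form yields the negation of the original flag result.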
8 years agoanv: add the X related and Wayland CFLAGS to VULKAN_ENTRYPOINT_CPPFLAGS
Emil Velikov [Fri, 3 Jun 2016 23:20:53 +0000 (00:20 +0100)]
anv: add the X related and Wayland CFLAGS to VULKAN_ENTRYPOINT_CPPFLAGS

Otherwise we will fail to find the headers in some scenarios.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reported-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Tested-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
8 years agonir: automake: add nir_search_helpers.h to the sources list(s)
Emil Velikov [Fri, 3 Jun 2016 23:18:40 +0000 (00:18 +0100)]
nir: automake: add nir_search_helpers.h to the sources list(s)

Fixes: dfbae7d64f4 ("nir/algebraic: support for power-of-two
optimizations")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
8 years agofreedreno/ir3: do idiv lowering after main opt loop
Rob Clark [Mon, 9 May 2016 16:41:00 +0000 (12:41 -0400)]
freedreno/ir3: do idiv lowering after main opt loop

Give the algebraic-opt pass a chance to catch udiv by a constant
power-of-two before running the lower-idiv pass.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
8 years agonir/algebraic: support for power-of-two optimizations
Rob Clark [Sat, 7 May 2016 17:01:24 +0000 (13:01 -0400)]
nir/algebraic: support for power-of-two optimizations

Some optimizations, like converting integer multiply/divide into left/
right shifts, have additional constraints on the search expression,
such as requiring that a variable is a constant power of two.  Support
these cases by allowing a function name to be appended to the search
variable expression (i.e. "a#32(is_power_of_two)").

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
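The constraint described in the nir/algebraic commit above (rewrite a multiply or unsigned divide as a shift only when the constant is a power of two) can be sketched in plain C. The helper names below are hypothetical and the code operates on ordinary integers; it does not use the NIR search machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when exactly one bit of v is set. */
static bool is_power_of_two_u32(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* log2 of a value that is known to be a power of two. */
static unsigned log2_pow2_u32(uint32_t v)
{
    unsigned shift = 0;

    assert(is_power_of_two_u32(v));
    while ((v >> shift) != 1)
        shift++;
    return shift;
}

/* Rewrite x * c as x << log2(c); refuse when the constraint fails. */
static bool try_mul_to_shift(uint32_t x, uint32_t c, uint32_t *out)
{
    if (!is_power_of_two_u32(c))
        return false;
    *out = x << log2_pow2_u32(c);
    return true;
}

/* Rewrite x / c (unsigned) as x >> log2(c); refuse when the constraint fails. */
static bool try_udiv_to_shift(uint32_t x, uint32_t c, uint32_t *out)
{
    if (!is_power_of_two_u32(c))
        return false;
    *out = x >> log2_pow2_u32(c);
    return true;
}
```

The predicate plays the role the commit gives to the appended function name: the replacement is attempted only when the check on the constant succeeds.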
8 years agoradeonsi: mark buffer texture range valid for shader images
Nicolai Hähnle [Thu, 2 Jun 2016 20:17:40 +0000 (22:17 +0200)]
radeonsi: mark buffer texture range valid for shader images

When a shader image view into a buffer texture can be written to, the buffer's
valid range must be updated, or subsequent transfers may incorrectly skip
synchronization.

This fixes a bug that was exposed in Xephyr by PBO acceleration for glReadPixels,
reported by Michel Dänzer.

Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
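The valid-range reasoning in the radeonsi commit above can be sketched as follows. The struct and helpers are hypothetical stand-ins for a per-buffer "range that may contain data"; the actual driver code is not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical valid range [start, end); empty while start >= end. */
struct sketch_range {
    unsigned start;
    unsigned end;
};

static void sketch_range_init(struct sketch_range *r)
{
    r->start = ~0u;
    r->end = 0;
}

/* Grow the range to cover [start, end), e.g. when a writable view is bound. */
static void sketch_range_add(struct sketch_range *r, unsigned start, unsigned end)
{
    if (start < r->start)
        r->start = start;
    if (end > r->end)
        r->end = end;
}

/* A transfer may skip synchronization only if it touches no valid bytes. */
static bool sketch_transfer_needs_sync(const struct sketch_range *r,
                                       unsigned start, unsigned end)
{
    return start < r->end && end > r->start;
}
```

If binding a writable shader image did not call the equivalent of `sketch_range_add`, a later transfer over the written bytes would see an empty range and wrongly skip synchronization, which is the failure mode the commit message describes.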