mesa.git
9 years agonir: Generalize the optimization of subs of subs from 0.
Eric Anholt [Fri, 20 Feb 2015 08:00:27 +0000 (00:00 -0800)]
nir: Generalize the optimization of subs of subs from 0.

I initially wrote this based on the "(('fneg', ('fneg', a)), a)" above,
but we can generalize it and make it more potentially useful.  In the
specific original case of a 0 for our new 'a' argument, it'll get further
algebraic optimization once the 0 is an argument to the new add.

No shader-db effects.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir: Collapse repeated bcsels on the same argument.
Eric Anholt [Fri, 20 Feb 2015 09:20:34 +0000 (01:20 -0800)]
nir: Collapse repeated bcsels on the same argument.

vc4 results:
total instructions in shared programs: 39881 -> 39794 (-0.22%)
instructions in affected programs:     6302 -> 6215 (-1.38%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir: When faced with a csel on !condition, just flip the arguments.
Eric Anholt [Fri, 20 Feb 2015 09:18:46 +0000 (01:18 -0800)]
nir: When faced with a csel on !condition, just flip the arguments.

total NIR instructions in shared programs: 39426 -> 39411 (-0.04%)
NIR instructions in affected programs:     3748 -> 3733 (-0.40%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir: Allow nir_opt_algebraic to see booleanness through &&, ||, ^, !.
Eric Anholt [Fri, 20 Feb 2015 08:57:04 +0000 (00:57 -0800)]
nir: Allow nir_opt_algebraic to see booleanness through &&, ||, ^, !.

We have some useful optimizations to drop things like 'ine a, 0' on a
boolean argument, but if 'a' came from logical operations on bools, it
couldn't tell.  These kinds of constructs appear as a result of TGSI->NIR
quite frequently (at least with if flattening), so being a little more
aggressive in detecting booleans can pay off.

v2: Add ixor as a booleanness-preserving op (Suggestion by Connor).

vc4 results:
total instructions in shared programs: 40207 -> 39881 (-0.81%)
instructions in affected programs:     6677 -> 6351 (-4.88%)

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir: Add a couple of simplifications of csel operations.
Eric Anholt [Thu, 29 Jan 2015 23:50:18 +0000 (15:50 -0800)]
nir: Add a couple of simplifications of csel operations.

vc4 was already cleaning these up, but it does shave 4 NIR instructions in
shader-db.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agoglsl: ensure that enter/leave record get a record type
Ilia Mirkin [Sat, 21 Feb 2015 17:53:43 +0000 (12:53 -0500)]
glsl: ensure that enter/leave record get a record type

May make life easier for tools like Coverity.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agotgsi: avoid returning pointer to local var, make it static
Ilia Mirkin [Sat, 21 Feb 2015 17:44:05 +0000 (12:44 -0500)]
tgsi: avoid returning pointer to local var, make it static

Spotted by Coverity.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
9 years agofreedreno/a4xx: set PC_PRIM_VTX_CNTL.VAROUT properly
Rob Clark [Sat, 21 Feb 2015 18:55:37 +0000 (13:55 -0500)]
freedreno/a4xx: set PC_PRIM_VTX_CNTL.VAROUT properly

Fixes xonotic, some webgl stuff, and really pretty much anything with
more than 4 varyings.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agofreedreno: update generated headers
Rob Clark [Sat, 21 Feb 2015 18:50:52 +0000 (13:50 -0500)]
freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agofreedreno/a4xx: bit of cleanup
Rob Clark [Sat, 21 Feb 2015 18:39:06 +0000 (13:39 -0500)]
freedreno/a4xx: bit of cleanup

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agoloader: not having a pci-id should not be a warn
Rob Clark [Sun, 15 Feb 2015 06:59:17 +0000 (01:59 -0500)]
loader: not having a pci-id should not be a warn

If there is no pci-id, which is valid for vc4 and freedreno, just emit
an info msg.  Keep malformed but existing pci-id's as a warning.

Mostly just to clean up a warning that confuses users for the non-pci
devices.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agofreedreno: implement fence
Rob Clark [Sun, 15 Feb 2015 05:04:57 +0000 (00:04 -0500)]
freedreno: implement fence

I never actually implemented the stubbed out fence stuff back in the
early days.  Fix that.

We'll need a few libdrm_freedreno changes to handle timeout properly,
so ignore that for now to avoid a libdrm_freedreno dependency bump.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agofreedreno/a2xx: fix increment in assert
Rob Clark [Tue, 3 Feb 2015 20:52:53 +0000 (15:52 -0500)]
freedreno/a2xx: fix increment in assert

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88883
Signed-off-by: Rob Clark <robclark@freedesktop.org>
9 years agoi965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data
Jordan Justen [Fri, 20 Feb 2015 20:12:25 +0000 (12:12 -0800)]
i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data

The brw_imm_ud will yield a HW_REG which then will introduce a barrier
for certain optimization opportunities.

No piglit regressions seen with gen8 (simd8vs).

Suggested-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agoi965/fs: Set pixel/sample mask for compute shaders atomic ops
Jordan Justen [Mon, 22 Sep 2014 01:31:45 +0000 (18:31 -0700)]
i965/fs: Set pixel/sample mask for compute shaders atomic ops

For fragment programs, we pull this mask from the payload header. The same
mask doesn't exist for compute shaders, so we set all bits to enabled.

Previously we were setting 0xff to support SIMD8 VS, but with CS we
support SIMD16, and therefore we change this to 0xffff.

Related commits for SIMD8 VS:

commit d9cd982d556be560af3bcbcdaf62b6b93eb934a5
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Sun Feb 15 20:06:59 2015 -0800
    i965/simd8vs: Fix SIMD8 atomics

commit 4a95be9772a255776309f23180519a4a8560f2dd
Author: Jordan Justen <jordan.l.justen@intel.com>
Date:   Tue Feb 17 09:57:35 2015 -0800
    i965/simd8vs: Fix SIMD8 atomics (read-only)

Note: this mask is ANDed with the execution mask, so some channels may not end
up issuing the atomic operation.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
9 years agoilo: R32G32B32_FLOAT need no special care on Gen8+
Chia-I Wu [Fri, 20 Feb 2015 16:35:49 +0000 (00:35 +0800)]
ilo: R32G32B32_FLOAT need no special care on Gen8+

Gen8+ must use VALIGN_4.  Unlike prior Gens, R32G32B32_FLOAT should supposedly
support VALIGN_4.

9 years agoilo: 128 BPP formats can use TiledY on Gen7.5+
Chia-I Wu [Fri, 20 Feb 2015 07:27:14 +0000 (15:27 +0800)]
ilo: 128 BPP formats can use TiledY on Gen7.5+

The restriction is lifted.

9 years agonvc0: enable double support
Ilia Mirkin [Thu, 24 Jul 2014 02:32:55 +0000 (22:32 -0400)]
nvc0: enable double support

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agonvc0/ir: remove merge/split pairs to allow normal propagation to occur
Ilia Mirkin [Mon, 7 Jul 2014 05:53:52 +0000 (01:53 -0400)]
nvc0/ir: remove merge/split pairs to allow normal propagation to occur

Because the TGSI interface creates merges for each instruction source
and then splits them back out, there are a lot of unnecessary
merge/split pairs which do essentially nothing. The various modifier/etc
propagation doesn't know how to walk though those, so just remove them
when they're unnecessary.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agonvc0/ir: add support for new TGSI double opcodes
Ilia Mirkin [Mon, 7 Jul 2014 03:39:59 +0000 (23:39 -0400)]
nvc0/ir: add support for new TGSI double opcodes

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agonvc0/ir: handle zero and negative sqrt arguments
Ilia Mirkin [Mon, 7 Jul 2014 06:32:45 +0000 (02:32 -0400)]
nvc0/ir: handle zero and negative sqrt arguments

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agonvc0/ir: no instruction can load a double immediate
Ilia Mirkin [Mon, 7 Jul 2014 03:39:38 +0000 (23:39 -0400)]
nvc0/ir: no instruction can load a double immediate

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agonvc0/ir: fix lowering of RSQ/RCP/SQRT/MOD to work with F64
Ilia Mirkin [Mon, 7 Jul 2014 04:04:19 +0000 (00:04 -0400)]
nvc0/ir: fix lowering of RSQ/RCP/SQRT/MOD to work with F64

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agogm107/ir: fix F2F flipped stype/dtype flags
Ilia Mirkin [Fri, 26 Sep 2014 06:38:53 +0000 (02:38 -0400)]
gm107/ir: fix F2F flipped stype/dtype flags

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agogm107/ir: fix DSET boolean float flag
Ilia Mirkin [Fri, 26 Sep 2014 06:35:51 +0000 (02:35 -0400)]
gm107/ir: fix DSET boolean float flag

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agogm107/ir: fix DMUL opcode encoding
Ilia Mirkin [Fri, 26 Sep 2014 06:21:55 +0000 (02:21 -0400)]
gm107/ir: fix DMUL opcode encoding

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agogk110/ir: add emission of dadd/dmul/dmad opcodes
Ilia Mirkin [Sat, 19 Jul 2014 02:20:42 +0000 (22:20 -0400)]
gk110/ir: add emission of dadd/dmul/dmad opcodes

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agonvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax
Ilia Mirkin [Mon, 7 Jul 2014 03:36:32 +0000 (23:36 -0400)]
nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmax

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agomesa: don't enable NV_fragment_program_option with swrast
Roland Scheidegger [Sat, 14 Feb 2015 15:34:04 +0000 (16:34 +0100)]
mesa: don't enable NV_fragment_program_option with swrast

Since dropping some NV_fragment_program opcodes (commits
868f95f1da74cf6dd7468cba1b56664aad585ccba3688d686f147f4252d19b298ae26d4ac72c2e08)
we can no longer parse all opcodes necessary for this extension, leading
to bugs (https://bugs.freedesktop.org/show_bug.cgi?id=86980).
Hence don't announce support for it in swrast (no other driver enabled it).
(Note that remnants of some NV_fp/vp extensions remain, they could be
dropped but are required as hacks for getting viewperf11 catia to run.)

9 years agodrivers/x11: add gallium include dirs to Makefile.am
Brian Paul [Fri, 20 Feb 2015 20:19:50 +0000 (13:19 -0700)]
drivers/x11: add gallium include dirs to Makefile.am

Fixes xlib driver build after e8c5cbfd921680c.

Acked-by: Matt Turner <mattst88@gmail.com>
9 years agovbo: fix an unitialized-variable warning
Marek Olšák [Fri, 20 Feb 2015 19:17:39 +0000 (20:17 +0100)]
vbo: fix an unitialized-variable warning

It looks like a bug to me.

Cc: 10.5 10.4 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
9 years agogallium/sw/kms: fix a type-mismatch warning
Marek Olšák [Fri, 20 Feb 2015 19:17:20 +0000 (20:17 +0100)]
gallium/sw/kms: fix a type-mismatch warning

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years agogallium/sw/kms: don't redefine DEBUG
Marek Olšák [Fri, 20 Feb 2015 19:15:59 +0000 (20:15 +0100)]
gallium/sw/kms: don't redefine DEBUG

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years agotargets/d3dadapter9: remove an unused variable
Marek Olšák [Fri, 20 Feb 2015 19:15:35 +0000 (20:15 +0100)]
targets/d3dadapter9: remove an unused variable

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years agotgsi: fix type-mismatch warning
Marek Olšák [Fri, 20 Feb 2015 19:15:05 +0000 (20:15 +0100)]
tgsi: fix type-mismatch warning

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years agogallivm: fix uninitialized-variable warnings
Marek Olšák [Fri, 20 Feb 2015 19:14:33 +0000 (20:14 +0100)]
gallivm: fix uninitialized-variable warnings

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years agomesa: Have configure define NDEBUG, not mtypes.h.
Matt Turner [Fri, 20 Feb 2015 20:41:46 +0000 (12:41 -0800)]
mesa: Have configure define NDEBUG, not mtypes.h.

mtypes.h had been defining NDEBUG (used by assert) if DEBUG was not
defined. Confusing and bizarre that you don't get NDEBUG if you don't
include mtypes.h.

... which is just what happened in commit bef38f62e.

Let's let configure define this for us if not using --enable-debug.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agonir: Fix the Mesa build without -DDEBUG.
Kenneth Graunke [Fri, 20 Feb 2015 20:31:31 +0000 (12:31 -0800)]
nir: Fix the Mesa build without -DDEBUG.

With -DDEBUG -UNDEBUG, this assert uses reg_state::stack_size, which
doesn't exist, breaking the build:

assert(state->states[index].index < state->states[index].stack_size);

Switch it to ifndef NDEBUG, so the field will exist if the assertion
actually generates code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agonir: Drop dependency on mtypes.h for core NIR.
Eric Anholt [Wed, 11 Feb 2015 23:08:02 +0000 (15:08 -0800)]
nir: Drop dependency on mtypes.h for core NIR.

One less new directory necessary for gallium code that wants to interact
with NIR.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
9 years agoglsl: Only include mtypes from glsl_types.h for the C++ code that needs it.
Eric Anholt [Wed, 11 Feb 2015 23:21:37 +0000 (15:21 -0800)]
glsl: Only include mtypes from glsl_types.h for the C++ code that needs it.

It's used in one of the methods, not in the structure definitions.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
9 years agoutil: Move Mesa's bitset.h to util/.
Eric Anholt [Wed, 11 Feb 2015 23:05:06 +0000 (15:05 -0800)]
util: Move Mesa's bitset.h to util/.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
9 years agomesa: Make bitset.h not rely on Mesa-specific types and functions.
Eric Anholt [Wed, 11 Feb 2015 22:57:55 +0000 (14:57 -0800)]
mesa: Make bitset.h not rely on Mesa-specific types and functions.

Note that we can't use u_math.h's align() because it's a function instead
of a macro, while BITSET_DECLARE needs a constant expression for nouveau's
usage in global declarations.

v2: Stick some parens around the bits macro argument usage (review by Jose).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
9 years agomesa: Use u_math.h from macros.h
Eric Anholt [Wed, 11 Feb 2015 22:24:33 +0000 (14:24 -0800)]
mesa: Use u_math.h from macros.h

This avoids duplication of some macros and other definitions across the
tree.

Note that COPY_4FV switches from a memcpy-based implementation to an
assignment of 4 floats.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
9 years agogallium/util: Don't include unused debug functions from u_math.h
Eric Anholt [Wed, 11 Feb 2015 22:28:44 +0000 (14:28 -0800)]
gallium/util: Don't include unused debug functions from u_math.h

It introduces references to gallium util/ symbols which means we don't get
to include it from outside-of-gallium code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
9 years agomesa: Add gallium include dirs to more parts of the tree.
Eric Anholt [Wed, 11 Feb 2015 22:18:50 +0000 (14:18 -0800)]
mesa: Add gallium include dirs to more parts of the tree.

v2: Try to patch up the scons bits.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
9 years agogallium/radeon: fix an uninitialized-variable warning
Marek Olšák [Fri, 20 Feb 2015 19:13:44 +0000 (20:13 +0100)]
gallium/radeon: fix an uninitialized-variable warning

9 years agogallium: add new double-related shader caps to all the getters
Ilia Mirkin [Fri, 20 Feb 2015 04:30:36 +0000 (23:30 -0500)]
gallium: add new double-related shader caps to all the getters

Missed a few drivers in the earlier changes, this should fix up all the
ones that print unknown caps or don't have a default statement.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
9 years agosvga: add missing _DROUND,DFRACEXP_DLDEXP_SUPPORTED switch cases
Brian Paul [Fri, 20 Feb 2015 15:09:36 +0000 (08:09 -0700)]
svga: add missing _DROUND,DFRACEXP_DLDEXP_SUPPORTED switch cases

To silence unhandled switch case warnings.

9 years agoradeonsi: don't use SQC_CACHES to flush ICACHE and KCACHE on SI
Marek Olšák [Thu, 19 Feb 2015 12:03:54 +0000 (13:03 +0100)]
radeonsi: don't use SQC_CACHES to flush ICACHE and KCACHE on SI

This reverts 73c2b0d18c51459697d8ec194ecfc4438c98c139.

It doesn't seem to be reliable. It's probably missing a wait packet or
something, because it's just a register write and doesn't wait for anything.
SURFACE_SYNC at least seems to wait until the flush is done. Just guessing.

Let's not complicate things and revert this.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88561

Cc: 10.5 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
9 years agoi965/gen6: Fix GL_GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB
Iago Toral Quiroga [Fri, 20 Feb 2015 07:21:25 +0000 (08:21 +0100)]
i965/gen6: Fix GL_GEOMETRY_SHADER_PRIMITIVES_EMITTED_ARB

In gen6 we need to compute the primitive count in the generated GS program.
The current implementation only counts full primitives, that is, if the
output primitive type is a triangle strip, it won't count individual
triangles in the strip, only complete strips.

If we want to count basic primitives instead we have two options: rework
the assembly code we generate for strip primitives or simply use
CL_INVOCATION_COUNT to resolve the query and let the hardware do that work
for us. This patch implements the latter approach.

Fixes the following piglit test:
bin/arb_pipeline_statistics_query-geom -auto

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89210
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
9 years agomesa: Check that draw buffers are valid for glDrawBuffers on GLES3
Eduardo Lima Mitev [Fri, 20 Feb 2015 08:32:42 +0000 (09:32 +0100)]
mesa: Check that draw buffers are valid for glDrawBuffers on GLES3

Section 4.2 (Whole Framebuffer Operations) of the OpenGL 3.0 specification
says:

    "Each buffer listed in bufs must be BACK, NONE, or one of the values from
     table 4.3 (NONE, COLOR_ATTACHMENTi)".

Fixes 1 dEQP test:
* dEQP-GLES3.functional.negative_api.buffer.draw_buffers

Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agoglsl: don't allow invariant qualifiers for interface blocks
Samuel Iglesias Gonsalvez [Tue, 25 Nov 2014 13:03:05 +0000 (14:03 +0100)]
glsl: don't allow invariant qualifiers for interface blocks

GLSL 1.50 and GLSL 4.40 specs, they both say the same in
"Interface Blocks" section:

"If optional qualifiers are used, they can include interpolation qualifiers,
auxiliary storage qualifiers, and storage qualifiers and they must declare
an input, output, or uniform member consistent with the interface qualifier
of the block"

From GLSL ES 3.0, chapter 4.3.7 "Interface Blocks", page 38:

"GLSL ES 3.0 does not support interface blocks for shader inputs or outputs."

and from GLSL ES 3.0, chapter 4.6.1 "The invariant qualifier", page 52.

"Only variables output from a shader can be candidates for invariance."

This patch fixes the following dEQP tests:

dEQP-GLES3.functional.shaders.declarations.invalid_declarations.invariant_uniform_block_2_vertex
dEQP-GLES3.functional.shaders.declarations.invalid_declarations.invariant_uniform_block_2_fragment

No piglit regressions.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
v2:

- Enable this check for GLSL.

Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agovc4: Keep an array of pointers to instructions defining the temps around.
Eric Anholt [Thu, 19 Feb 2015 21:22:31 +0000 (13:22 -0800)]
vc4: Keep an array of pointers to instructions defining the temps around.

The optimization passes are always regenerating it and throwing it away,
but it's not hard to keep track of.

9 years agovc4: Move qir_uniform() and the constant-value versions to vc4_qir.c/h.
Eric Anholt [Thu, 19 Feb 2015 20:19:44 +0000 (12:19 -0800)]
vc4: Move qir_uniform() and the constant-value versions to vc4_qir.c/h.

I may want them in optimization passes, and they're not really particular
to the program translation stage.

9 years agovc4: Enforce one-uniform-per-instruction after optimization.
Eric Anholt [Thu, 19 Feb 2015 20:58:53 +0000 (12:58 -0800)]
vc4: Enforce one-uniform-per-instruction after optimization.

This lets us more intelligently decide which uniform values should be put
into temporaries, by choosing the most reused values to push to temps
first.

total uniforms in shared programs: 13457 -> 13433 (-0.18%)
uniforms in affected programs:     1524 -> 1500 (-1.57%)
total instructions in shared programs: 40198 -> 40019 (-0.45%)
instructions in affected programs:     6027 -> 5848 (-2.97%)

I noticed this opportunity because with the NIR work, some programs were
happening to make different uniform copy propagation choices that
significantly increased instruction counts.

9 years agovc4: Rename add_uniform() to qir_uniform().
Eric Anholt [Thu, 19 Feb 2015 20:16:25 +0000 (12:16 -0800)]
vc4: Rename add_uniform() to qir_uniform().

9 years agovc4: Shut up runtime warnings about new pipe caps.
Eric Anholt [Fri, 20 Feb 2015 07:34:37 +0000 (23:34 -0800)]
vc4: Shut up runtime warnings about new pipe caps.

9 years agoi965/vec4: Add and use byte-MOV instruction for unpack 4x8.
Matt Turner [Thu, 12 Feb 2015 01:42:43 +0000 (01:42 +0000)]
i965/vec4: Add and use byte-MOV instruction for unpack 4x8.

Previously we were using a B/UB source in an Align16 instruction, which
is illegal. It for some reason works on all platforms, except Broadwell.

Cc: "10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86811
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agoi965/blorp: Emit MADs.
Matt Turner [Tue, 10 Feb 2015 06:54:51 +0000 (22:54 -0800)]
i965/blorp: Emit MADs.

Low hanging fruit: cuts a couple of instructions.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agoi965/blorp: Optimize clamping tex coords.
Matt Turner [Tue, 10 Feb 2015 05:26:14 +0000 (21:26 -0800)]
i965/blorp: Optimize clamping tex coords.

Each emit_cond_mov() emits a CMP of its first to arguments using the
specified conditional mod, followed by a predicated MOV of the fifth
argument into the fourth. In all four cases here, it was just
implementing MIN/MAX which we can do in a single SEL instruction.

Also reorder the instructions for a slightly better schedule.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agoi965: Use greater-equal cmod to implement maximum.
Matt Turner [Tue, 10 Feb 2015 05:11:46 +0000 (21:11 -0800)]
i965: Use greater-equal cmod to implement maximum.

The docs specifically call out SEL with .l and .ge as the
implementations of MIN and MAX respectively. Among other things, SEL
with these conditional mods are commutative.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agoi965: Don't emit saturates for instructions without destinations.
Matt Turner [Tue, 10 Feb 2015 06:21:21 +0000 (22:21 -0800)]
i965: Don't emit saturates for instructions without destinations.

We were special casing OPCODE_END but no other instructions that have no
destination, like OPCODE_KIL, leading us to emitting MOVs with null
destinations.

total instructions in shared programs: 5705243 -> 5701539 (-0.06%)
instructions in affected programs:     124104 -> 120400 (-2.98%)
helped:                                904

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agoi965/fs: Consider MOV.SAT to interfere if it has a source modifier.
Matt Turner [Wed, 11 Feb 2015 00:25:47 +0000 (16:25 -0800)]
i965/fs: Consider MOV.SAT to interfere if it has a source modifier.

The saturate propagation pass recognizes that the second instruction
below does not interfere with an attempt to propagate the saturate
modifier from instruction 3 to 1.

 1:  add(8)     dst0   src0  src1
 2:  mov.sat(8) dst1   dst0
 3:  mov.sat(8) dst2   dst0

Unfortunately, we did not consider the case of instruction 2 having a
source modifier on dst0. Take for instance:

 1:  add(8)     dst0   src0  src1
 2:  mov.sat(8) dst1  -dst0
 3:  mov.sat(8) dst2   dst0

Consider such an instruction to interfere. Increase instruction counts
in Anomaly 2, which could be a bug fix depending on the values the first
instruction produces.

instructions in affected programs:     53228 -> 53934 (1.33%)
HURT:                                  360

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agoi965/fs: Use fs_inst::overwrites_reg() in saturate propagation.
Matt Turner [Wed, 28 Jan 2015 06:43:28 +0000 (22:43 -0800)]
i965/fs: Use fs_inst::overwrites_reg() in saturate propagation.

This is safer and matches the conditional_mod propagation pass.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agoi965/fs: Add unit tests for saturate propagation pass.
Matt Turner [Tue, 10 Feb 2015 21:38:07 +0000 (13:38 -0800)]
i965/fs: Add unit tests for saturate propagation pass.

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agoglsl: Use the without_array predicate
Timothy Arceri [Thu, 19 Feb 2015 10:32:21 +0000 (21:32 +1100)]
glsl: Use the without_array predicate

Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agonv50: add PIPELINE_STATISTICS query support, based on nvc0
Ilia Mirkin [Wed, 18 Feb 2015 08:35:23 +0000 (03:35 -0500)]
nv50: add PIPELINE_STATISTICS query support, based on nvc0

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Nick Tenney <nick.tenney@gmail.com>
9 years agosvga: add missing :
Ilia Mirkin [Fri, 20 Feb 2015 01:15:28 +0000 (20:15 -0500)]
svga: add missing :

Fixes: 924ee3f408 ("gallium: add shader cap for dldexp/dfracexp support")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agonir/GCM: Pull unpinned instructions out of blocks while pinning
Jason Ekstrand [Tue, 10 Feb 2015 04:18:44 +0000 (20:18 -0800)]
nir/GCM: Pull unpinned instructions out of blocks while pinning

This lets us be slightly more efficient by not walking the CFG extra times.
Also, it may make it easier to ensure that GVN happens on only unpinned
instructions.

Reviewed-by: Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir/GCM: Use pass_flags instead of bitsets for tracking visited/pinned
Jason Ekstrand [Mon, 9 Feb 2015 22:58:12 +0000 (14:58 -0800)]
nir/GCM: Use pass_flags instead of bitsets for tracking visited/pinned

Reviewed-by: Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir: Add a global code motion (GCM) pass
Jason Ekstrand [Tue, 3 Feb 2015 18:11:23 +0000 (10:11 -0800)]
nir: Add a global code motion (GCM) pass

v2 Jason Ekstrand <jason.ekstrand@intel.com>:
 - Use nir_dominance_lca for computing least common anscestors
 - Use the block index for comparing dominance tree depths
 - Pin things that do partial derivatives

Reviewed-by: Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir/instr: Change "live" to a more generic "pass_flags" field
Jason Ekstrand [Mon, 9 Feb 2015 22:41:10 +0000 (14:41 -0800)]
nir/instr: Change "live" to a more generic "pass_flags" field

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir: Make nir_[cf_node/instr]_[prev/next] return null if at the end
Jason Ekstrand [Thu, 5 Feb 2015 05:22:45 +0000 (21:22 -0800)]
nir: Make nir_[cf_node/instr]_[prev/next] return null if at the end

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir/from_ssa: Don't try to read an invalid instruction
Jason Ekstrand [Thu, 5 Feb 2015 05:38:28 +0000 (21:38 -0800)]
nir/from_ssa: Don't try to read an invalid instruction

Right now, the nir_instr_prev function function blindly looks up the
previous element in the exec list and casts it to an instruction even if
it's the tail sentinel.  The next commit will change this to return null if
it's the first instruction.  Making this change first avoids getting a
segfault between commits.  The only reason we never noticed is that, thanks
to the way things are laid out in nir_block, the casted instruction's type
was never parallal_copy.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir/validate: Validate SSA defs the same way we do for registers
Jason Ekstrand [Wed, 4 Feb 2015 22:01:51 +0000 (14:01 -0800)]
nir/validate: Validate SSA defs the same way we do for registers

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir/validate: Validate if_uses on registers
Jason Ekstrand [Wed, 4 Feb 2015 21:58:12 +0000 (13:58 -0800)]
nir/validate: Validate if_uses on registers

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir: Properly clean up CF nodes when we remove them
Jason Ekstrand [Wed, 4 Feb 2015 05:39:56 +0000 (21:39 -0800)]
nir: Properly clean up CF nodes when we remove them

Previously, if you remved a CF node that still had instructions in it, none
of the use/def information from those instructions would get cleaned up.
Also, we weren't removing if statements from the if_uses of the
corresponding register or SSA def.  This commit fixes both of these
problems

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir: use nir_foreach_ssa_def for indexing ssa defs
Jason Ekstrand [Wed, 4 Feb 2015 05:04:57 +0000 (21:04 -0800)]
nir: use nir_foreach_ssa_def for indexing ssa defs

This is both simpler and more correct.  The old code didn't properly index
load_const instructions.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir/from_ssa: Use the nir_block_dominance function instead of our own
Jason Ekstrand [Fri, 6 Feb 2015 20:49:08 +0000 (12:49 -0800)]
nir/from_ssa: Use the nir_block_dominance function instead of our own

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir/dominance: Add a constant-time mechanism for comparing blocks
Jason Ekstrand [Fri, 6 Feb 2015 20:45:43 +0000 (12:45 -0800)]
nir/dominance: Add a constant-time mechanism for comparing blocks

This is mostly thanks to Connor.  The idea is to do a depth-first search
that computes pre and post indices for all the blocks.  We can then figure
out if one block dominates another in constant time by two simple
comparison operations.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir/dominance: Expose the dominance intersection function
Jason Ekstrand [Fri, 6 Feb 2015 20:06:04 +0000 (12:06 -0800)]
nir/dominance: Expose the dominance intersection function

Being able to find the least common anscestor in the dominance tree is a
useful thing that we may want to do in other passes.  In particular, we
need it for GCM.

v2: Handle NULL inputs by returning the other block

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
9 years agost/mesa: lower DFRACEXP/DLDEXP when they are not supported
Ilia Mirkin [Fri, 18 Jul 2014 04:38:59 +0000 (00:38 -0400)]
st/mesa: lower DFRACEXP/DLDEXP when they are not supported

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
9 years agost/mesa: disable lowering of dops to dfrac when dround is available
Ilia Mirkin [Fri, 25 Jul 2014 21:19:57 +0000 (17:19 -0400)]
st/mesa: disable lowering of dops to dfrac when dround is available

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
9 years agost/mesa: add support for new double opcodes
Ilia Mirkin [Fri, 25 Jul 2014 21:12:42 +0000 (17:12 -0400)]
st/mesa: add support for new double opcodes

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
9 years agogallium: add shader cap for dldexp/dfracexp support
Ilia Mirkin [Fri, 25 Jul 2014 21:48:01 +0000 (17:48 -0400)]
gallium: add shader cap for dldexp/dfracexp support

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
9 years agogallium: add a cap to enable double rounding opcodes
Ilia Mirkin [Fri, 25 Jul 2014 21:03:33 +0000 (17:03 -0400)]
gallium: add a cap to enable double rounding opcodes

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
9 years agogallium: add some more double opcodes to avoid unnecessary lowering
Ilia Mirkin [Fri, 25 Jul 2014 20:46:42 +0000 (16:46 -0400)]
gallium: add some more double opcodes to avoid unnecessary lowering

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
9 years agodocs/GL3.txt: softpipe now supports GL_ARB_gpu_shader_fp64
Dave Airlie [Thu, 19 Feb 2015 23:14:44 +0000 (09:14 +1000)]
docs/GL3.txt: softpipe now supports GL_ARB_gpu_shader_fp64

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agost/mesa: add st fp64 support (v7.1)
Dave Airlie [Tue, 17 Feb 2015 00:48:04 +0000 (10:48 +1000)]
st/mesa: add st fp64 support (v7.1)

This adds support to the state tracker for
ARB_gpu_shader_fp64.

The details are explained in comments
within the code.

v2 : add double to int/unsigned conversion
v3: handle fp64 consts better
v4: use DRSQ
v4.1: add d2b
v4.2: drop DDIV

v5: split out some prep patches.
v5.1: add some comments.
v5.2: more comments

v6: simplify down the double instruction
    generation loop.

v7: Merge Ilia's two cleanup patches.
v7.1: minor fixups for Ilia patch + cleanups

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agomesa/st_tgsi_to_glsl: prepare add_constant for fp64
Dave Airlie [Mon, 16 Feb 2015 23:44:50 +0000 (09:44 +1000)]
mesa/st_tgsi_to_glsl: prepare add_constant for fp64

This just moves stuff around a little to make the next patch
cleaner.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agost/glsl_to_tgsi: convert dst to an array
Dave Airlie [Mon, 16 Feb 2015 23:39:05 +0000 (09:39 +1000)]
st/glsl_to_tgsi: convert dst to an array

This is just prep work for fp64 support where we need
an array of 2 dst values.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agoi965: just avoid warnings with fp64
Dave Airlie [Thu, 14 Aug 2014 08:49:20 +0000 (18:49 +1000)]
i965: just avoid warnings with fp64

This just fills in some blanks to avoid warnings in the i965 driver.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agoglsl: Add compute to _mesa_shader_stage_to_string(); use unreachable.
Kenneth Graunke [Thu, 19 Feb 2015 21:36:07 +0000 (13:36 -0800)]
glsl: Add compute to _mesa_shader_stage_to_string(); use unreachable.

This is basically Ian's review feedback for my patch that added
_mesa_shader_stage_to_abbrev() - it just makes both consistent again.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agoi965/vec4: Print "VS" or "GS" when compiles fail, not "vec4".
Kenneth Graunke [Thu, 19 Feb 2015 01:52:28 +0000 (17:52 -0800)]
i965/vec4: Print "VS" or "GS" when compiles fail, not "vec4".

This is now trivial to do right.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
9 years agoi965/vec4: Replace debug_flag with debug_enabled.
Kenneth Graunke [Thu, 19 Feb 2015 01:50:42 +0000 (17:50 -0800)]
i965/vec4: Replace debug_flag with debug_enabled.

backend_visitor now handles this, so we can delete the vec4_visitor
specific code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
9 years agoi965: Make scheduler cycle estimates use the proper stage name.
Kenneth Graunke [Thu, 19 Feb 2015 01:45:51 +0000 (17:45 -0800)]
i965: Make scheduler cycle estimates use the proper stage name.

Previously, the vec4 backend labeled shaders as "vec4" - now it uses the
specific names "VS" and "GS".

The FS backend now correctly prints "VS" for vertex shaders (rather than
"fs").  It also prints "FS" instead of "fs" for fragment shaders;
preserving that behavior didn't seem essential.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
9 years agoi965/fs: Un-hardcode DEBUG_WM, "FS", and "fragment".
Kenneth Graunke [Thu, 19 Feb 2015 01:43:07 +0000 (17:43 -0800)]
i965/fs: Un-hardcode DEBUG_WM, "FS", and "fragment".

These code paths can (or will) be used for other shader stages.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
9 years agoi965: Create backend_visitor fields for debugging messages.
Kenneth Graunke [Thu, 19 Feb 2015 01:38:45 +0000 (17:38 -0800)]
i965: Create backend_visitor fields for debugging messages.

We introduce three new fields in backend_visitor:
- debug_enabled: whether or not INTEL_DEBUG & DEBUG_<stage flag>
- stage_name: "vertex", "fragment", etc. for use in messages
- stage_abbrev: "VS", "FS", etc. for use in messages

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
9 years agoi965: Add a function to translate MESA_SHADER_* into DEBUG_* enums.
Kenneth Graunke [Thu, 19 Feb 2015 01:31:29 +0000 (17:31 -0800)]
i965: Add a function to translate MESA_SHADER_* into DEBUG_* enums.

When compiling, we have a gl_shader_stage (MESA_SHADER_*) enum, and want
to know whether debugging is enabled for that stage.  This allows us to
easily translate it into the corresponding debug flag.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
9 years agoglsl: Create a _mesa_shader_stage_to_abbrev() function.
Kenneth Graunke [Thu, 19 Feb 2015 01:35:41 +0000 (17:35 -0800)]
glsl: Create a _mesa_shader_stage_to_abbrev() function.

This is similar to _mesa_shader_stage_to_string(), but returns "VS"
instead of "vertex".

v2: Use unreachable() and add MESA_SHADER_COMPUTE (requested by Ian).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>