mesa.git
10 years agoutil/u_format: move utility function from r600g
Grigori Goronzy [Wed, 4 Jun 2014 16:54:36 +0000 (18:54 +0200)]
util/u_format: move utility function from r600g

We need this for radeonsi, and it might be useful for other drivers,
too.

10 years agoradeon/vce: set number of cpbs based on level
Leo Liu [Thu, 12 Jun 2014 16:48:05 +0000 (12:48 -0400)]
radeon/vce: set number of cpbs based on level

v2: add error check for cpb size 0

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
10 years agoradeon/vce: implement h264 level support
Leo Liu [Thu, 12 Jun 2014 16:27:31 +0000 (12:27 -0400)]
radeon/vce: implement h264 level support

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
10 years agost/omx/enc: implement h264 level support
Leo Liu [Thu, 12 Jun 2014 16:27:30 +0000 (12:27 -0400)]
st/omx/enc: implement h264 level support

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
10 years agovl: add level interface
Leo Liu [Thu, 12 Jun 2014 16:27:29 +0000 (12:27 -0400)]
vl: add level interface

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
10 years agost/st/omx: fix switch-case indentation in vid_enc.c
Leo Liu [Thu, 12 Jun 2014 16:27:28 +0000 (12:27 -0400)]
st/st/omx: fix switch-case indentation in vid_enc.c

Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
10 years agoglx: Add an error message when a direct renderer's createScreen() routine fails
Jon TURNEY [Sat, 10 May 2014 10:04:44 +0000 (11:04 +0100)]
glx: Add an error message when a direct renderer's createScreen() routine fails
because no matching fbConfigs or visuals could be found.

Nearly all the error cases in *createScreen() issue an error message to diagnose
the failure to initialize before branching to handle_error.  The few remaining
error cases which don't should probably do the same.

(At the moment, it seems this can be triggered in drisw with an X server which
reports definite values for MAX_PBUFFER_(WIDTH|HEIGHT|SIZE), because those
attributes are checked for an exact match against 0.)

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/vec4: unit test for copy propagation and writemask
Chia-I Wu [Mon, 14 Apr 2014 13:52:34 +0000 (21:52 +0800)]
i965/vec4: unit test for copy propagation and writemask

This unit test demonstrates a subtle bug fixed by
4ddf51db6af36736d5d42c1043eeea86e47459ce.

Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years agoi965/vec4/gs: Silence warning about unused 'success' in release build.
Matt Turner [Sun, 15 Jun 2014 05:53:16 +0000 (22:53 -0700)]
i965/vec4/gs: Silence warning about unused 'success' in release build.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965/disasm: Mark three_source_reg_encoding[] static.
Matt Turner [Sun, 15 Jun 2014 05:52:08 +0000 (22:52 -0700)]
i965/disasm: Mark three_source_reg_encoding[] static.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965/blorp: Remove unused 'brw' member.
Matt Turner [Sun, 15 Jun 2014 05:51:29 +0000 (22:51 -0700)]
i965/blorp: Remove unused 'brw' member.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965/blorp: Mark branch unreachable to silence uninitialized var warning.
Matt Turner [Sun, 15 Jun 2014 06:21:24 +0000 (23:21 -0700)]
i965/blorp: Mark branch unreachable to silence uninitialized var warning.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965: Silence warning about unused brw in release builds.
Matt Turner [Sun, 15 Jun 2014 05:52:35 +0000 (22:52 -0700)]
i965: Silence warning about unused brw in release builds.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965: Mark backend_instruction and bblock_t as structs.
Matt Turner [Sun, 15 Jun 2014 05:53:40 +0000 (22:53 -0700)]
i965: Mark backend_instruction and bblock_t as structs.

They have to be marked as structs for C code elsewhere. bblock_t is
already defined as a struct, and all of backend_instruction's fields are
public anyway.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965: Use standard SSE intrinsics instead of gcc built-ins.
Matt Turner [Sun, 15 Jun 2014 05:31:33 +0000 (22:31 -0700)]
i965: Use standard SSE intrinsics instead of gcc built-ins.

Lets this file compile with clang.

Reviewed-by: Frank Henigman <fjhenigman@google.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa: Remove unused functions from performance query code.
Matt Turner [Sun, 15 Jun 2014 06:15:05 +0000 (23:15 -0700)]
mesa: Remove unused functions from performance query code.

Perhaps useful for debugging? Never used otherwise. Added by commit
8cf5bdad.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa: Remove unused extra_EXT_texture_integer.
Matt Turner [Sun, 15 Jun 2014 06:09:59 +0000 (23:09 -0700)]
mesa: Remove unused extra_EXT_texture_integer.

Unused since commit b6475f94.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa: Mark default case unreachable to silence warning.
Matt Turner [Sun, 15 Jun 2014 05:50:43 +0000 (22:50 -0700)]
mesa: Mark default case unreachable to silence warning.

Warned about 'coord' being undefined in the default case, which is
unreachable.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoegl: Remove unused variable dri_driver_path.
Matt Turner [Sun, 15 Jun 2014 06:02:37 +0000 (23:02 -0700)]
egl: Remove unused variable dri_driver_path.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoswrast: Remove unused solve_plane_recip().
Matt Turner [Sun, 15 Jun 2014 05:38:18 +0000 (22:38 -0700)]
swrast: Remove unused solve_plane_recip().

Unused since commit 9e8a961d.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: Remove 'struct' from ir_variable declaration.
Matt Turner [Sun, 15 Jun 2014 05:35:36 +0000 (22:35 -0700)]
glsl: Remove 'struct' from ir_variable declaration.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoRevert "i965: Add 'wait' instruction support"
Matt Turner [Sat, 14 Jun 2014 03:51:12 +0000 (20:51 -0700)]
Revert "i965: Add 'wait' instruction support"

This reverts commit 20be3ff57670529a410b30a1008a71e768d08428.

No evidence of ever being used.

10 years agoi965/fs: Optimize SEL with the same sources into a MOV.
Matt Turner [Fri, 18 Apr 2014 17:01:41 +0000 (10:01 -0700)]
i965/fs: Optimize SEL with the same sources into a MOV.

instructions in affected programs:     474 -> 462 (-2.53%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
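
The SEL-to-MOV rewrite described above can be sketched as a small peephole pass. This is only an illustration on a hypothetical, simplified instruction struct (toy_inst, toy_reg and the TOY_OP_* names are assumptions, not Mesa's fs_inst); it shows the idea, not the actual i965 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified IR; the real backend IR carries far more state. */
enum toy_opcode { TOY_OP_MOV, TOY_OP_SEL };

struct toy_reg {
   int  nr;      /* virtual register number */
   bool negate;  /* source modifier */
};

struct toy_inst {
   enum toy_opcode op;
   bool predicated;        /* SEL picks src[0] or src[1] based on a flag */
   struct toy_reg dst;
   struct toy_reg src[2];
};

static bool toy_reg_equals(const struct toy_reg *a, const struct toy_reg *b)
{
   return a->nr == b->nr && a->negate == b->negate;
}

/* If both SEL sources are identical, the selection condition no longer
 * matters, so the instruction is rewritten as an unconditional MOV.
 * Returns true when the instruction was changed. */
static bool toy_opt_sel_same_sources(struct toy_inst *inst)
{
   assert(inst != NULL);
   if (inst->op != TOY_OP_SEL)
      return false;
   if (!toy_reg_equals(&inst->src[0], &inst->src[1]))
      return false;

   inst->op = TOY_OP_MOV;
   inst->predicated = false;
   return true;
}
```

A caller would run this over every instruction and repeat the optimization loop while any pass reports progress.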
10 years agoi965/fs: Perform CSE on texture operations.
Matt Turner [Fri, 11 Apr 2014 19:26:25 +0000 (12:26 -0700)]
i965/fs: Perform CSE on texture operations.

Helps Unigine Tropics and some (old) gstreamer shaders in shader-db.

instructions in affected programs:     792 -> 744 (-6.06%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/fs: Copy propagate from load_payload.
Matt Turner [Thu, 17 Apr 2014 22:13:00 +0000 (15:13 -0700)]
i965/fs: Copy propagate from load_payload.

But only into non-load_payload instructions. Otherwise we would prevent
register coalescing from combining identical payloads.

10 years agoi965/fs: Perform CSE on load_payload instructions if it's not a copy.
Matt Turner [Sun, 30 Mar 2014 19:41:55 +0000 (12:41 -0700)]
i965/fs: Perform CSE on load_payload instructions if it's not a copy.

Since CSE creates instructions, if we let CSE generate things register
coalescing can't remove, bad things will happen. Only let CSE combine
non-copy load_payloads.

E.g., allow CSE to handle this

   load_payload vgrf4+0, vgrf5, vgrf6

but not this

   load_payload vgrf4+0, vgrf5+0, vgrf5+1
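
The distinction between the two examples above can be sketched as a predicate. This is a minimal sketch under an assumption: a load_payload counts as a "copy" when all of its sources are consecutive offsets of one VGRF. The struct and function names are hypothetical, and the exact test used in the driver may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical source operand: a VGRF number plus an offset within it. */
struct toy_src {
   int vgrf;
   int offset;
};

/* Assumed criterion: the payload is a plain copy when source i is
 * (same VGRF, first offset + i) for every i. */
static bool toy_is_copy_payload(const struct toy_src *src, int count)
{
   assert(count > 0);
   for (int i = 0; i < count; i++) {
      if (src[i].vgrf != src[0].vgrf || src[i].offset != src[0].offset + i)
         return false;
   }
   return true;
}

/* CSE is only allowed to combine payloads that are not copies, so that
 * it never produces something register coalescing cannot remove. */
static bool toy_cse_may_combine_payload(const struct toy_src *src, int count)
{
   return !toy_is_copy_payload(src, count);
}
```

With these definitions the sources (vgrf5, vgrf6) are eligible for CSE, while (vgrf5+0, vgrf5+1) are left to register coalescing.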

10 years agoi965/fs: Support register coalescing on LOAD_PAYLOAD operands.
Matt Turner [Thu, 27 Mar 2014 19:02:48 +0000 (12:02 -0700)]
i965/fs: Support register coalescing on LOAD_PAYLOAD operands.

10 years agoi965/fs: Emit load_payload instead of multiple MOVs for large VGRFs.
Matt Turner [Tue, 25 Mar 2014 22:43:21 +0000 (15:43 -0700)]
i965/fs: Emit load_payload instead of multiple MOVs for large VGRFs.

10 years agoi965/fs: Only consider real sources when comparing instructions.
Matt Turner [Tue, 25 Mar 2014 22:28:17 +0000 (15:28 -0700)]
i965/fs: Only consider real sources when comparing instructions.

10 years agoi965/fs: Apply cube map array fixup and restore the payload.
Matt Turner [Mon, 24 Mar 2014 23:18:58 +0000 (16:18 -0700)]
i965/fs: Apply cube map array fixup and restore the payload.

So that we don't have partial writes to a large VGRF. Will be cleaned up
by register coalescing.

10 years agoi965/fs: Use LOAD_PAYLOAD in emit_texture_gen7().
Matt Turner [Mon, 17 Mar 2014 17:43:38 +0000 (10:43 -0700)]
i965/fs: Use LOAD_PAYLOAD in emit_texture_gen7().

10 years agoi965/fs: Lower LOAD_PAYLOAD and clean up.
Matt Turner [Fri, 18 Apr 2014 18:56:46 +0000 (11:56 -0700)]
i965/fs: Lower LOAD_PAYLOAD and clean up.

Clean up with register_coalesce()/dead_code_eliminate().

10 years agoi965/fs: Add SHADER_OPCODE_LOAD_PAYLOAD.
Matt Turner [Wed, 28 May 2014 01:47:40 +0000 (18:47 -0700)]
i965/fs: Add SHADER_OPCODE_LOAD_PAYLOAD.

Will be used to simplify the handling of large virtual GRFs in SSA form.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
10 years agoglsl: type check between switch init-expression and case
Tapani Pälli [Thu, 12 Jun 2014 09:48:43 +0000 (12:48 +0300)]
glsl: type check between switch init-expression and case

Patch adds a type check between switch init-expression and case label
and performs an implicit signed->unsigned type conversion when possible.

v2: add GLSL spec reference, do implicit conversion if possible (Matt)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79724
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agonv50/ir: Remove NV50_SEMANTIC_VIEWPORTINDEX
Tobias Klausmann [Sun, 15 Jun 2014 19:24:06 +0000 (21:24 +0200)]
nv50/ir: Remove NV50_SEMANTIC_VIEWPORTINDEX

Use TGSI_SEMANTIC_VIEWPORT_INDEX for the last consumer.

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agodocs: update GL3.txt, relnotes: mark GL_ARB_viewport_array as done for nvc0
Tobias Klausmann [Sun, 15 Jun 2014 19:24:05 +0000 (21:24 +0200)]
docs: update GL3.txt, relnotes: mark GL_ARB_viewport_array as done for nvc0

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0: implement multiple viewports/scissors, enable ARB_viewport_array
Tobias Klausmann [Sun, 15 Jun 2014 19:24:04 +0000 (21:24 +0200)]
nvc0: implement multiple viewports/scissors, enable ARB_viewport_array

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
[imirkin: mark things dirty on ctx switch, 3d blit]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonv50: make sure to mark first scissor dirty after blit
Ilia Mirkin [Sat, 14 Jun 2014 17:23:47 +0000 (13:23 -0400)]
nv50: make sure to mark first scissor dirty after blit

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years agoi965: Use 8x4 aligned rectangles for HiZ operations on Broadwell.
Kenneth Graunke [Fri, 13 Jun 2014 22:26:40 +0000 (15:26 -0700)]
i965: Use 8x4 aligned rectangles for HiZ operations on Broadwell.

Like on Haswell, we need to use 8x4 aligned rectangle primitives for
hierarchical depth buffer resolves and depth clears.  See the comments
in brw_blorp.cpp's brw_hiz_op_params() constructor.  (The Broadwell
documentation confirms that this is still necessary.)

This patch makes the Broadwell code follow the same behavior as Chad and
Jordan's Gen7 BLORP code.  Based on a patch by Topi Pohjolainen.

This fixes es3conform's framebuffer_blit_functionality_scissor_blit
test, with no Piglit regressions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
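
The 8x4 alignment rule above amounts to rounding the rectangle size up to the next block boundary. A minimal sketch follows, assuming the rectangle starts at the origin; toy_rect, toy_align and toy_hiz_rect are hypothetical names, not the driver's.

```c
#include <assert.h>

/* Hypothetical rectangle; (x0, y0) inclusive, (x1, y1) exclusive. */
struct toy_rect {
   unsigned x0, y0, x1, y1;
};

/* Round value up to a multiple of alignment (must be a power of two). */
static unsigned toy_align(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Rectangle for a HiZ operation covering a width x height miplevel,
 * padded out to 8 pixels horizontally and 4 pixels vertically. */
static struct toy_rect toy_hiz_rect(unsigned width, unsigned height)
{
   struct toy_rect r = { 0, 0, toy_align(width, 8), toy_align(height, 4) };
   return r;
}
```

For example, a 13x7 miplevel would be covered by a 16x8 rectangle, while a 16x8 miplevel is left unchanged.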
10 years agoi965: Make INTEL_DEBUG=mip print out whether HiZ is enabled.
Kenneth Graunke [Fri, 13 Jun 2014 22:25:14 +0000 (15:25 -0700)]
i965: Make INTEL_DEBUG=mip print out whether HiZ is enabled.

We only enable HiZ for miplevels which are aligned on 8x4 blocks.  When
debugging HiZ failures, it's useful to know whether a particular
miplevel is using HiZ or not.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoglsl/cs: Fix local_size_y and local_size_z
Jordan Justen [Mon, 9 Jun 2014 21:14:15 +0000 (14:14 -0700)]
glsl/cs: Fix local_size_y and local_size_z

flags.q.local_size has 3 bits. One each for x, y and z.

Fixes piglit's:
* spec/ARB_compute_shader/linker/mismatched_local_work_sizes
* spec/ARB_compute_shader/compiler/default_local_size.comp
* spec/ARB_compute_shader/compiler/work_group_size_too_large
* spec/ARB_compute_shader/compiler/gl_WorkGroupSize_matches_layout.comp

This was regressed in 738c9c3c.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
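
The three-bit layout mentioned above can be illustrated with a small bitfield. This is a sketch only: the assignment of bit 0 to x, bit 1 to y and bit 2 to z is an assumption, and toy_layout_flags is a hypothetical stand-in for the real qualifier flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the layout qualifier flags. */
struct toy_layout_flags {
   unsigned local_size : 3;  /* assumed: bit 0 = x, bit 1 = y, bit 2 = z */
};

/* Record that local_size_<dim> was specified (dim: 0 = x, 1 = y, 2 = z). */
static void toy_mark_local_size(struct toy_layout_flags *f, int dim)
{
   assert(dim >= 0 && dim < 3);
   f->local_size |= 1u << dim;
}

/* Test one dimension's bit rather than the whole field, so that y and z
 * are not confused with x. */
static bool toy_has_local_size(const struct toy_layout_flags *f, int dim)
{
   assert(dim >= 0 && dim < 3);
   return (f->local_size & (1u << dim)) != 0;
}
```

Checking the field as a plain boolean would only say that some dimension was set, which is the kind of mix-up a per-bit test avoids.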
10 years agomain/extensions: Only parse MESA_EXTENSION_OVERRIDE once
Jordan Justen [Sun, 8 Jun 2014 21:10:52 +0000 (14:10 -0700)]
main/extensions: Only parse MESA_EXTENSION_OVERRIDE once

Previously, we would parse MESA_EXTENSION_OVERRIDE each time a context
was created. Now we will save the results of that parsing and use it
during context initialization.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years agomain/extensions: Build list of extensions that can't be disabled
Jordan Justen [Sun, 8 Jun 2014 20:40:31 +0000 (13:40 -0700)]
main/extensions: Build list of extensions that can't be disabled

This will allow us to utilize the early MESA_EXTENSION_OVERRIDE
parsing at the later extension string initialization step.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years agomain/extensions: Create extra extensions override string
Jordan Justen [Sun, 8 Jun 2014 20:26:11 +0000 (13:26 -0700)]
main/extensions: Create extra extensions override string

This will allow us to utilize the early MESA_EXTENSION_OVERRIDE
parsing at the later extension string initialization step.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years agoi965/cs: Use override structure rather than separate env var
Jordan Justen [Sun, 8 Jun 2014 07:05:37 +0000 (00:05 -0700)]
i965/cs: Use override structure rather than separate env var

In 25268b93, we added a new environment variable
(INTEL_COMPUTE_SHADER) to allow some constant values to be upgraded
for the ARB_compute_shader extension.

Now, we can look to see if the extension was enabled via the
MESA_EXTENSION_OVERRIDE environment variable.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years agomain/extensions: Add early extension override structures
Jordan Justen [Sun, 8 Jun 2014 06:54:31 +0000 (23:54 -0700)]
main/extensions: Add early extension override structures

During the early one_time_init phase of context creation, we
initialize two global gl_extensions structures.

We read the MESA_EXTENSION_OVERRIDE environment variable, and store
positive and negative overrides in two structures:
* struct gl_extensions _mesa_extension_override_enables
* struct gl_extensions _mesa_extension_override_disables

These are filled before the driver initializes extensions and
constants, therefore the driver can make adjustments based on the
desired overrides.

This can be useful during development of a new extension where the
extension is only partially ready. The driver can't actually advertise
support for the extension, but if it sees that the override is set for
the extension, then it can expose more supported parts of the
extension, such as upgrading context constants.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
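
The split into enable and disable structures described above can be sketched as follows. This is a minimal illustration, assuming a space-separated list where a leading '-' disables and a leading '+' (or no prefix) enables; toy_extensions and its two fields are a hypothetical stand-in for struct gl_extensions, and unknown names are simply skipped here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical two-entry stand-in for struct gl_extensions. */
struct toy_extensions {
   bool ARB_compute_shader;
   bool EXT_texture_integer;
};

/* Map an extension name (not NUL-terminated, length len) to its flag. */
static bool *toy_lookup(struct toy_extensions *ext, const char *name, size_t len)
{
   static const char cs[] = "GL_ARB_compute_shader";
   static const char ti[] = "GL_EXT_texture_integer";

   if (len == sizeof(cs) - 1 && memcmp(name, cs, len) == 0)
      return &ext->ARB_compute_shader;
   if (len == sizeof(ti) - 1 && memcmp(name, ti, len) == 0)
      return &ext->EXT_texture_integer;
   return NULL;
}

/* Fill the two override structures from the variable's value.
 * value may be NULL when the variable is unset. */
static void toy_parse_override(const char *value,
                               struct toy_extensions *enables,
                               struct toy_extensions *disables)
{
   assert(enables != NULL && disables != NULL);
   memset(enables, 0, sizeof(*enables));
   memset(disables, 0, sizeof(*disables));
   if (value == NULL)
      return;

   const char *p = value;
   while (*p != '\0') {
      p += strspn(p, " ");            /* skip separators */
      size_t len = strcspn(p, " ");   /* length of the next token */
      if (len == 0)
         break;

      bool enable = true;
      const char *name = p;
      size_t name_len = len;
      if (*name == '+' || *name == '-') {
         enable = (*name == '+');
         name++;
         name_len--;
      }

      bool *slot = toy_lookup(enable ? enables : disables, name, name_len);
      if (slot != NULL)
         *slot = true;                /* unknown names are ignored */
      p += len;
   }
}
```

In a real program the value would come from getenv("MESA_EXTENSION_OVERRIDE"), and a driver could then consult the enables structure before deciding which constants to expose.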
10 years agomain/extensions: Create a context-less set_extensions function
Jordan Justen [Sun, 8 Jun 2014 03:12:20 +0000 (20:12 -0700)]
main/extensions: Create a context-less set_extensions function

We will add new gl_extensions structures that capture the environment
variable extension overrides and are available early in context
creation.

This will allow a driver to take actions during its initialization
based on the extension overrides.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years agomain/extensions: Don't advertise unknown extensions overrides with (-)
Jordan Justen [Sun, 8 Jun 2014 20:10:59 +0000 (13:10 -0700)]
main/extensions: Don't advertise unknown extensions overrides with (-)

Previously setting:
MESA_EXTENSION_OVERRIDE=-GL_MESA_ham_sandwich

Would cause Mesa to advertise support for the GL_MESA_ham_sandwich
extension, even though the override specifically asked for it to be
disabled.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years agoradeonsi: fixup sizes of shader resource and sampler arrays
Marek Olšák [Sat, 14 Jun 2014 00:46:04 +0000 (02:46 +0200)]
radeonsi: fixup sizes of shader resource and sampler arrays

This was wrong for a very long time. I wonder if the array size has any
effect on anything.

Reviewed-by: Christian König <christian.koenig@amd.com>
10 years agoscons: Link libGL.so against xcb-dri2.
José Fonseca [Mon, 16 Jun 2014 10:24:15 +0000 (11:24 +0100)]
scons: Link libGL.so against xcb-dri2.

Fixing undefined xcb_dri2_* symbols.

Trivial.

10 years agor600g/radeonsi: Remove default case from PIPE_COMPUTE_CAP_* switch
Michel Dänzer [Fri, 13 Jun 2014 03:23:33 +0000 (12:23 +0900)]
r600g/radeonsi: Remove default case from PIPE_COMPUTE_CAP_* switch

This way, the compiler warns about unhandled caps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
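
The effect of dropping the default case can be shown with a small enum switch. This is a sketch with hypothetical names and return values (toy_compute_cap and the numbers below are not the driver's); the point is only that GCC and Clang, with -Wall (which enables -Wswitch), warn when an enumerator has no case label and there is no default.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability enum standing in for PIPE_COMPUTE_CAP_*. */
enum toy_compute_cap {
   TOY_CAP_GRID_DIMENSION,
   TOY_CAP_MAX_COMPUTE_UNITS
};

/* Returns 1 and stores the value on success, 0 for an unknown cap. */
static int toy_get_compute_param(enum toy_compute_cap cap, unsigned *out)
{
   assert(out != NULL);
   switch (cap) {
   case TOY_CAP_GRID_DIMENSION:
      *out = 3;   /* illustrative value */
      return 1;
   case TOY_CAP_MAX_COMPUTE_UNITS:
      *out = 1;   /* illustrative value */
      return 1;
   /* No default label: adding a new enumerator without a case here
    * makes the compiler report it under -Wswitch. */
   }
   return 0;      /* reached only for values outside the enum */
}
```

With a default label in place, a newly added cap would silently take the default path and no diagnostic would be emitted.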
10 years agodocs: update ARB_explicit_uniform_location status
Tapani Pälli [Mon, 24 Mar 2014 09:29:08 +0000 (11:29 +0200)]
docs: update ARB_explicit_uniform_location status

+ modify release notes for 10.3

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
10 years agoEnable GL_ARB_explicit_uniform_location in the drivers.
Tapani Pälli [Wed, 5 Mar 2014 07:39:10 +0000 (09:39 +0200)]
Enable GL_ARB_explicit_uniform_location in the drivers.

v2: enable also for i915 (Ian)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Petri Latvala <petri.latvala@intel.com>
10 years agoglsl: parser changes for GL_ARB_explicit_uniform_location
Tapani Pälli [Wed, 5 Mar 2014 10:35:03 +0000 (12:35 +0200)]
glsl: parser changes for GL_ARB_explicit_uniform_location

Patch adds a preprocessor define for the extension and stores explicit
location data for uniforms during AST->HIR conversion. It also sets
layout token to be available when having the extension in place.

v2: change parser check to require GLSL 330 or enabling
    GL_ARB_explicit_attrib_location (Ian)
v3: fix the check and comment in AST->HIR (Petri)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
10 years agoglsl: add enable bit for ARB_explicit_uniform_location
Tapani Pälli [Mon, 5 May 2014 04:58:17 +0000 (07:58 +0300)]
glsl: add enable bit for ARB_explicit_uniform_location

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa: support inactive uniforms in glUniform* functions
Tapani Pälli [Wed, 19 Mar 2014 10:39:10 +0000 (10:39 +0000)]
mesa: support inactive uniforms in glUniform* functions

Support inactive uniforms that have explicit location set in
glUniform* functions.

v2: remove unnecessary extension check, use new define (Ian)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl/linker: assign explicit uniform locations
Tapani Pälli [Thu, 13 Mar 2014 10:48:27 +0000 (12:48 +0200)]
glsl/linker: assign explicit uniform locations

Patch refactors the existing uniform processing so explicit locations
are taken into account during variable processing. These locations
are temporarily stored in gl_uniform_storage before actual locations
are set.

UNMAPPED_UNIFORM_LOC marks unset location so that we can use 0 as a
valid explicit location.

When locations are set, UniformRemapTable is first populated with
uniforms that have explicit location set (inactive and active ones),
rest are put after explicit location slots.

v2: introduce define for locations that have not been set yet (Ian)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
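
The two-pass population of the remap table described above can be sketched like this. It is a simplified illustration: each uniform takes exactly one slot, and TOY_UNMAPPED_LOC, TOY_MAX_LOCATIONS and the struct are hypothetical (the real value of UNMAPPED_UNIFORM_LOC is not given in the log).

```c
#include <assert.h>

#define TOY_UNMAPPED_LOC  (-1)  /* stand-in sentinel; 0 stays a valid location */
#define TOY_MAX_LOCATIONS 16

struct toy_uniform {
   int explicit_loc;   /* TOY_UNMAPPED_LOC when no explicit location was given */
   int assigned_loc;   /* filled in by toy_assign_locations() */
};

/* table[slot] receives the index of the uniform mapped to that slot, or -1.
 * The table must have room for TOY_MAX_LOCATIONS entries.
 * Returns the number of slots in use. */
static int toy_assign_locations(struct toy_uniform *u, int count, int *table)
{
   int used = 0;

   for (int i = 0; i < TOY_MAX_LOCATIONS; i++)
      table[i] = -1;

   /* Pass 1: uniforms with an explicit location claim their slot. */
   for (int i = 0; i < count; i++) {
      int loc = u[i].explicit_loc;
      if (loc == TOY_UNMAPPED_LOC)
         continue;
      assert(loc >= 0 && loc < TOY_MAX_LOCATIONS);
      table[loc] = i;
      u[i].assigned_loc = loc;
      if (loc + 1 > used)
         used = loc + 1;
   }

   /* Pass 2: the remaining uniforms go after the explicit slots. */
   for (int i = 0; i < count; i++) {
      if (u[i].explicit_loc != TOY_UNMAPPED_LOC)
         continue;
      assert(used < TOY_MAX_LOCATIONS);
      table[used] = i;
      u[i].assigned_loc = used;
      used++;
   }

   return used;
}
```

Using a sentinel other than 0 for "not set" is what lets location 0 be requested explicitly, as the commit message notes.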
10 years agoglsl/linker: initialize explicit uniform locations
Tapani Pälli [Tue, 8 Apr 2014 05:45:36 +0000 (08:45 +0300)]
glsl/linker: initialize explicit uniform locations

Patch initializes the UniformRemapTable for explicit locations. This
needs to happen before optimizations to make sure all inactive uniforms
get their explicit locations correctly.

v2: fix initialization bug, introduce define for inactive uniforms (Ian)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: add glsl_type::uniform_locations() helper function
Tapani Pälli [Thu, 5 Jun 2014 04:37:16 +0000 (07:37 +0300)]
glsl: add glsl_type::uniform_locations() helper function

This function calculates the number of unique values from
glGetUniformLocation for the elements of the type.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
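
A location count of this kind is naturally recursive. The sketch below is an assumption about the shape of such a helper, not the actual glsl_type code: it counts one location per basic (scalar, vector or matrix) element, multiplies by array length, and sums over struct fields. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { TOY_BASIC, TOY_ARRAY, TOY_STRUCT };

/* Hypothetical, simplified type description. */
struct toy_type {
   enum toy_kind kind;
   unsigned length;                        /* array length or field count */
   const struct toy_type *element;         /* TOY_ARRAY: element type */
   const struct toy_type *const *fields;   /* TOY_STRUCT: field types */
};

/* Number of distinct locations the elements of this type would occupy,
 * assuming one location per basic element. */
static unsigned toy_uniform_locations(const struct toy_type *t)
{
   assert(t != NULL);
   switch (t->kind) {
   case TOY_BASIC:
      return 1;
   case TOY_ARRAY:
      return t->length * toy_uniform_locations(t->element);
   case TOY_STRUCT: {
      unsigned total = 0;
      for (unsigned i = 0; i < t->length; i++)
         total += toy_uniform_locations(t->fields[i]);
      return total;
   }
   }
   return 0;  /* not reached for valid kinds */
}
```

Under this assumption, a struct holding one vector and a four-element array counts five locations, and an array of two such structs counts ten.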
10 years agomesa: add new enum MAX_UNIFORM_LOCATIONS
Tapani Pälli [Mon, 5 May 2014 04:55:34 +0000 (07:55 +0300)]
mesa: add new enum MAX_UNIFORM_LOCATIONS

Patch adds new implementation dependent value required by the
GL_ARB_explicit_uniform_location extension. Default value for user
assignable locations is calculated as sum of MaxUniformComponents
for each stage.

v2: fix descriptor in get_hash_params.py (Petri)
v3: simpler formula for calculating initial value (Ian)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa: add enable bit for ARB_explicit_uniform_location
Tapani Pälli [Tue, 4 Mar 2014 13:23:31 +0000 (15:23 +0200)]
mesa: add enable bit for ARB_explicit_uniform_location

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglapi: add GL_ARB_explicit_uniform_location
Tapani Pälli [Thu, 5 Jun 2014 04:30:14 +0000 (07:30 +0300)]
glapi: add GL_ARB_explicit_uniform_location

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965/vec4: Use the sampler for pull constant loads on Broadwell.
Kenneth Graunke [Sat, 14 Jun 2014 19:58:03 +0000 (12:58 -0700)]
i965/vec4: Use the sampler for pull constant loads on Broadwell.

We've used the LD sampler message for pull constant loads on earlier
hardware for some time, and also were already using it for the FS on
Broadwell.  This patch makes us use it for Broadwell VS/GS as well.

I believe that when I wrote this code in 2012, we still used the data
port in some cases, and I somehow neglected to convert it while
rebasing.

Improves performance in GLBenchmark 2.7 Egypt by 416.978% +/- 2.25821%
(n = 17).  Many other applications should benefit similarly: this speeds
up uniform array access in the VS, which is commonly used for skinning
shaders, among other things.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Ben Widawsky <ben@bwidawsk.net>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years agoi965: Add missing newlines to a few perf_debug messages.
Kenneth Graunke [Sat, 14 Jun 2014 08:43:28 +0000 (01:43 -0700)]
i965: Add missing newlines to a few perf_debug messages.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years agoi965: Drop Broadwell perf_debugs about missing MOCS that aren't missing.
Kenneth Graunke [Sat, 14 Jun 2014 08:43:27 +0000 (01:43 -0700)]
i965: Drop Broadwell perf_debugs about missing MOCS that aren't missing.

I actually added MOCS support for these things, but forgot to delete the
corresponding perf_debug() warnings.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years agoi965: Add missing MOCS setup for 3DSTATE_INDEX_BUFFER on Broadwell.
Kenneth Graunke [Sat, 14 Jun 2014 08:43:26 +0000 (01:43 -0700)]
i965: Add missing MOCS setup for 3DSTATE_INDEX_BUFFER on Broadwell.

Somehow I missed this when adding all of the other MOCS values.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years agoi965/vec4: Fix dead code elimination for VGRFs of size > 1.
Kenneth Graunke [Sat, 14 Jun 2014 10:53:07 +0000 (03:53 -0700)]
i965/vec4: Fix dead code elimination for VGRFs of size > 1.

When faced with code such as:

    mov vgrf31.0:UD, 960D
    mov vgrf31.1:UD, vgrf30.xxxx:UD

The dead code eliminator didn't consider reg_offsets, so it decided that
the second instruction was writing to the same register as
the first one, and eliminated the first one.  But they're actually
different registers.

This fixes INTEL_DEBUG=shader_time for vertex shaders.  In the above
code, vgrf31.0 represents the offset into the shader_time buffer where
the data should be written, and vgrf31.1 represents the actual time
data.  With a completely undefined offset, results were...unexpected.

I think this is probably one of the few cases (maybe only case) where we
generate multiple MOVs to a large VGRF.  Normally, we just use them as
texturing results; the other SEND-from-GRF uses a size 1 VGRF.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79029
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
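
The fix described above comes down to comparing the register offset as well as the VGRF number. A minimal sketch follows, with hypothetical names and with writemasks and control flow ignored; it is not the vec4 backend's actual pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical destination operand: VGRF number plus offset within it. */
struct toy_dst {
   int vgrf;
   int reg_offset;
};

/* Two destinations name the same register only if both parts match;
 * vgrf31.0 and vgrf31.1 are different registers. */
static bool toy_same_register(const struct toy_dst *a, const struct toy_dst *b)
{
   assert(a != NULL && b != NULL);
   return a->vgrf == b->vgrf && a->reg_offset == b->reg_offset;
}

/* An earlier write is dead only when a later write overwrites the very
 * same register and nothing read it in between. */
static bool toy_earlier_write_is_dead(const struct toy_dst *earlier,
                                      const struct toy_dst *later,
                                      bool read_in_between)
{
   return !read_in_between && toy_same_register(earlier, later);
}
```

With the two MOVs from the example, the write to vgrf31.0 is kept because the later write targets vgrf31.1.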
10 years agoi965: Add SHADER_OPCODE_SHADER_TIME_ADD to dump_instructions() decode.
Kenneth Graunke [Sat, 14 Jun 2014 10:13:27 +0000 (03:13 -0700)]
i965: Add SHADER_OPCODE_SHADER_TIME_ADD to dump_instructions() decode.

"shader_time_add" is a lot more informative than "op152".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl: Fix clang mismatched-tags warnings with glsl_type.
Vinson Lee [Sun, 15 Jun 2014 06:37:41 +0000 (23:37 -0700)]
glsl: Fix clang mismatched-tags warnings with glsl_type.

Fix clang mismatched-tags warnings introduced with commit
4f5445a45d3ed02e00a061b10c943c0b079c6020.

./glsl_symbol_table.h:37:1: warning: class 'glsl_type' was previously declared as a struct [-Wmismatched-tags]
class glsl_type;
^
./glsl_types.h:86:8: note: previous use is here
struct glsl_type {
       ^
./glsl_symbol_table.h:37:1: note: did you mean struct here?
class glsl_type;
^~~~~

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agomesa/drivers: Fix clang constant-logical-operand warnings.
Vinson Lee [Sat, 14 Jun 2014 04:37:18 +0000 (21:37 -0700)]
mesa/drivers: Fix clang constant-logical-operand warnings.

This patch fixes several clang constant-logical-operand warnings such as
the following.

../../../../../src/mesa/tnl_dd/t_dd_tritmp.h:130:32: warning: use of logical '||' with constant operand [-Wconstant-logical-operand]
   if (DO_TWOSIDE || DO_OFFSET || DO_UNFILLED || DO_TWOSTENCIL)
                               ^  ~~~~~~~~~~~
../../../../../src/mesa/tnl_dd/t_dd_tritmp.h:130:32: note: use '|' for a bitwise operation
   if (DO_TWOSIDE || DO_OFFSET || DO_UNFILLED || DO_TWOSTENCIL)
                               ^~
                               |

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl: Correct more typos
Chris Forbes [Sun, 15 Jun 2014 00:12:51 +0000 (12:12 +1200)]
glsl: Correct more typos

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoradeon/compute: Always report at least 1 compute unit
Tom Stellard [Fri, 13 Jun 2014 16:58:13 +0000 (12:58 -0400)]
radeon/compute: Always report at least 1 compute unit

Some apps will abort if they detect 0 compute units.  This fixes
crashes in some OpenCV tests.

10 years agometa_blit: properly compute texture width for the CopyTexSubImage fallback
Jason Ekstrand [Fri, 13 Jun 2014 19:15:04 +0000 (12:15 -0700)]
meta_blit: properly compute texture width for the CopyTexSubImage fallback

Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agofreedreno/a3xx: vtx formats
Rob Clark [Fri, 13 Jun 2014 15:37:23 +0000 (11:37 -0400)]
freedreno/a3xx: vtx formats

Add support for more vertex buffer formats.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno: update generated headers
Rob Clark [Fri, 13 Jun 2014 17:34:55 +0000 (13:34 -0400)]
freedreno: update generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno: try for more squarish tile dimensions
Rob Clark [Mon, 9 Jun 2014 17:36:24 +0000 (13:36 -0400)]
freedreno: try for more squarish tile dimensions

Worth about ~0.5fps in xonotic, for example.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno: fix for null textures
Rob Clark [Mon, 9 Jun 2014 17:34:07 +0000 (13:34 -0400)]
freedreno: fix for null textures

Some apps seem to give us a null sampler/view for texture slots which
come before the last used texture slot.  In particular 0ad triggers
this.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agollvmpipe: increase number of queries which can be binned simultaneously to 64
Roland Scheidegger [Thu, 12 Jun 2014 17:05:10 +0000 (19:05 +0200)]
llvmpipe: increase number of queries which can be binned simultaneously to 64

Gallium (but not OpenGL) does allow nesting of queries, but there's no
limit specified (d3d10 has no limit either). Nevertheless, for practical
purposes we need some limit in llvmpipe, otherwise we'd need more complex
handling of queries as we need to keep track of all binned queries (this
only affects queries which gather data past setup). A limit of 16 is too
small though, while 64 would suffice.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years agoradeon/compute: Implement PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS
Bruno Jiménez [Fri, 13 Jun 2014 09:23:14 +0000 (11:23 +0200)]
radeon/compute: Implement PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS

v2:
    Add RADEON_INFO_ACTIVE_CU_COUNT as a define, as suggested by
    Tom Stellard

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agoRemove _mesa_is_type_integer and _mesa_is_enum_format_or_type_integer
Neil Roberts [Thu, 12 Jun 2014 16:52:41 +0000 (17:52 +0100)]
Remove _mesa_is_type_integer and _mesa_is_enum_format_or_type_integer

The comment for _mesa_is_type_integer is confusing because it says that it
returns whether the type is an “integer (non-normalized)” format. I don't
think it makes sense to say whether a type is normalized or not because it
depends on what format it is used with. For example, GL_RGBA+GL_UNSIGNED_BYTE
is normalized but GL_RGBA_INTEGER+GL_UNSIGNED_BYTE isn't. If the normalized
comment is just a mistake then it still doesn't make much sense because it is
missing the packed-pixel types such as GL_UNSIGNED_INT_5_6_5. If those were
added then it effectively just returns type != GL_FLOAT.

That function was only used in _mesa_is_enum_format_or_type_integer. This
function effectively checks whether the format is non-normalized or the type
is an integer. I can't think of any situation where that check would make
sense.

As far as I can tell neither of these functions has ever been used anywhere
so we should just remove them to avoid confusion.

These functions were added in 9ad8f431b2a47060bf05517246ab0fa8d249c800.

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoclover: query driver for the max number of compute units
Bruno Jiménez [Fri, 30 May 2014 15:31:12 +0000 (17:31 +0200)]
clover: query driver for the max number of compute units

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years agogallium: Add PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS
Bruno Jiménez [Fri, 30 May 2014 15:31:10 +0000 (17:31 +0200)]
gallium: Add PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years agor600g/compute: solve a bug introduced by 2e01b8b440c1402c88a2755d89f40292e1f36ce5
Bruno Jiménez [Wed, 11 Jun 2014 15:28:01 +0000 (17:28 +0200)]
r600g/compute: solve a bug introduced by 2e01b8b440c1402c88a2755d89f40292e1f36ce5

That commit made it possible for items to be placed directly one
after the other when their size was a multiple of ITEM_ALIGNMENT.
But compute_memory_prealloc_chunk still tried to leave a gap
between items. As a result, we got an infinite loop when
trying to add an item which would leave no space between itself and
the next item.
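
To illustrate the general idea (this is not the r600g code), here is a
small first-fit search over a sorted item list in which an exact fit
with no gap before the next item is accepted. The struct, the function
name and the units are assumptions made for the example, and sizes are
assumed to be already aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pool item: [start, start + size), sorted by start. */
struct sketch_item {
    long start;
    long size;
};

/* Return the first offset where `size` units fit, or -1 if none. */
static long
sketch_find_chunk(const struct sketch_item *items, size_t num_items,
                  long size, long pool_size)
{
    long last_end = 0;
    size_t i;

    assert(size > 0);

    for (i = 0; i < num_items; i++) {
        /* "<=" rather than "<": an item may end exactly where the
         * next one starts, so no gap is required. */
        if (last_end + size <= items[i].start)
            return last_end;
        last_end = items[i].start + items[i].size;
    }

    /* Try the space after the last item. */
    if (last_end + size <= pool_size)
        return last_end;

    return -1;
}
```

With a strict "<" in the comparison, a chunk that exactly fills the
space between two items would be rejected, which is the kind of
mismatch between allocator and placement rules described above.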

Fixes piglit test: cl-custom-r600-create-release-buffer-bug
And the test for alignment I have just sent:
http://lists.freedesktop.org/archives/piglit/2014-June/011135.html

Sorry about this.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agoegl/gallium: Set defines for supported APIs when using automake
Niels Ole Salscheider [Wed, 11 Jun 2014 21:13:12 +0000 (23:13 +0200)]
egl/gallium: Set defines for supported APIs when using automake

This fixes automake builds which have been broken since
b52a530ce2aada1967bc8fefa83ab53e6a737dae.

v2: This patch also adds the FEATURE_* defines back to targets/egl-static for
Android and Scons that were removed in the mentioned commit.

Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79885
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years agoconfigure: correctly autodetect xvmc/vdpau/omx
Emil Velikov [Wed, 11 Jun 2014 21:15:58 +0000 (22:15 +0100)]
configure: correctly autodetect xvmc/vdpau/omx

Commit e62b7d38a1d (configure: autodetect video state-trackers
when non swrast driver is present) added a check that caused
the autodetection to be omitted when we have the swrast gallium
driver. Whereas it should have skipped the VL targets when only
swrast was selected.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79907
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
10 years agomesa: glx: Reduce error log level
Courtney Goeltzenleuchter [Wed, 26 Feb 2014 21:27:08 +0000 (14:27 -0700)]
mesa: glx: Reduce error log level

The code that parses LIBGL_DRIVERS_PATH was printing an
error for every attempted dlopen. It's not an error to
have to check multiple items in the path, only an error if
no suitable library is found. Reduced the load error to
a warning to match the behavior of the dynamic linker.
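
A sketch of the intended behavior, assuming a colon-separated search
path: each failed attempt is only a warning, and an error is reported
once no entry worked. The function names, the messages and the fake
opener are all invented for the example; in real code the opener would
be something like dlopen(path, RTLD_NOW | RTLD_GLOBAL).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef void *(*sketch_open_fn)(const char *path);

/* Stand-in for dlopen(): succeeds only below "/good/". */
static void *
sketch_fake_open(const char *path)
{
    static int token;

    return strncmp(path, "/good/", 6) == 0 ? (void *)&token : NULL;
}

/* Try every directory in search_path; warn per miss, error at the end. */
static void *
sketch_open_driver(const char *search_path, const char *name,
                   sketch_open_fn open_fn, unsigned *num_warnings)
{
    const char *p = search_path;

    assert(search_path && name && open_fn && num_warnings);
    *num_warnings = 0;

    while (*p) {
        const char *end = strchr(p, ':');
        size_t len = end ? (size_t)(end - p) : strlen(p);
        char full[512];
        void *handle;

        snprintf(full, sizeof(full), "%.*s/%s_dri.so", (int)len, p, name);
        handle = open_fn(full);
        if (handle)
            return handle;

        /* Not fatal: another path entry may still provide the driver. */
        fprintf(stderr, "warning: could not open %s\n", full);
        (*num_warnings)++;

        p += len;
        if (*p == ':')
            p++;
    }

    fprintf(stderr, "error: unable to load driver: %s\n", name);
    return NULL;
}
```

For example, with a search path of "/bad:/good" the first entry yields
one warning and the second entry succeeds, so no error is printed.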

Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
10 years agocso: fix stream-out clean up in cso_release_all()
Brian Paul [Sun, 8 Jun 2014 12:26:02 +0000 (05:26 -0700)]
cso: fix stream-out clean up in cso_release_all()

Use the has_streamout flag as we do elsewhere to check if we need
to call pipe->set_stream_output_targets().  The driver might implement
the set_stream_output_targets() function, but not for all hardware
configurations.
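
Roughly, the check looks like the sketch below. The types are
simplified stand-ins with hypothetical names, and the hook takes fewer
arguments than the real pipe_context function; only the has_streamout
guard is the point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a pipe context; not the Gallium struct. */
struct sketch_pipe {
    void (*set_stream_output_targets)(struct sketch_pipe *pipe,
                                      unsigned num_targets);
    unsigned unbind_calls;
};

struct sketch_cso {
    struct sketch_pipe *pipe;
    bool has_streamout;
};

/* Example hook that records how often targets were unbound. */
static void
sketch_set_targets(struct sketch_pipe *pipe, unsigned num_targets)
{
    if (num_targets == 0)
        pipe->unbind_calls++;
}

/* Unbind stream-out targets only if stream-out is really supported. */
static void
sketch_release_all(struct sketch_cso *cso)
{
    assert(cso != NULL && cso->pipe != NULL);

    /* Test the capability flag, not whether the hook is non-NULL. */
    if (cso->has_streamout)
        cso->pipe->set_stream_output_targets(cso->pipe, 0);
}
```

Checking the function pointer alone would not be enough in this
sketch, since the hook is set even when has_streamout is false.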

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years agoi965: Set the fast clear color value for texture surfaces
Neil Roberts [Mon, 9 Jun 2014 16:43:37 +0000 (17:43 +0100)]
i965: Set the fast clear color value for texture surfaces

When a multisampled texture is used for sampling the fast clear color value
needs to be programmed into the surface state. This was being left as all
zeroes so if the surface was cleared to a value other than black then it
wouldn't work properly. This doesn't matter for single-sample textures because
in that case the MCS buffer is resolved before it is used as a texture source.

https://bugs.freedesktop.org/show_bug.cgi?id=79729

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
10 years agoglsl: Fix typo in comment.
Chris Forbes [Thu, 12 Jun 2014 08:18:24 +0000 (20:18 +1200)]
glsl: Fix typo in comment.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965: Fix disassembly of BLORP clear programs.
Kenneth Graunke [Wed, 11 Jun 2014 19:22:12 +0000 (12:22 -0700)]
i965: Fix disassembly of BLORP clear programs.

Too many levels of indirection.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965/fs: Move FB write default state mashing in a level.
Kenneth Graunke [Wed, 11 Jun 2014 01:54:09 +0000 (18:54 -0700)]
i965/fs: Move FB write default state mashing in a level.

We only need to alter the default state if we're emitting MOVs for
header related fields.  So, we can simply move the push/pop of state in
to the if (header_present) block, bypassing it in the common case.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903

10 years agoi965: Fix Haswell discard regressions since Gen4-5 line AA fix.
Kenneth Graunke [Wed, 11 Jun 2014 01:50:03 +0000 (18:50 -0700)]
i965: Fix Haswell discard regressions since Gen4-5 line AA fix.

In commit dc2d3a7f5c217a7cee92380fbf503924a9591bea, Iago accidentally
moved fire_fb_write() above the brw_pop_insn_state(), which caused the
SEND to lose its predication and change from WE_normal to WE_all.
Haswell uses predicated SENDs for discards, so this broke Piglit's
tests for discards.

We want the Gen4-5 MOV to be uncompressed, unpredicated, and unmasked,
but the actual FB write itself should respect those.  So, pop state
first, and force it again around the single MOV.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903

10 years agogbm: Remove 64x64 restriction from GBM_BO_USE_CURSOR
Michel Dänzer [Tue, 3 Jun 2014 07:45:23 +0000 (16:45 +0900)]
gbm: Remove 64x64 restriction from GBM_BO_USE_CURSOR

GBM_BO_USE_CURSOR_64X64 is kept so that existing users of GBM continue to
build, but it no longer rejects widths or heights other than 64.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79809

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
10 years agoi965: Use brw->gen in some generation checks.
Matt Turner [Wed, 11 Jun 2014 00:44:56 +0000 (17:44 -0700)]
i965: Use brw->gen in some generation checks.

Will simplify the automated conversion if we want to allow compiling the
driver for a single generation.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
10 years agoi965/fs: Clean up tabs in brw_fs_cse.cpp.
Matt Turner [Wed, 11 Jun 2014 20:01:31 +0000 (13:01 -0700)]
i965/fs: Clean up tabs in brw_fs_cse.cpp.

I'm adding vec4 CSE, and I want to diff the files.

10 years agoconfigure.ac: Simplify DUSE_EXTERNAL_DXTN_LIB logic.
Matt Turner [Wed, 11 Jun 2014 01:18:39 +0000 (18:18 -0700)]
configure.ac: Simplify DUSE_EXTERNAL_DXTN_LIB logic.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years agoconfigure.ac: Alphabetize AC_CONFIG_FILES.
Matt Turner [Wed, 11 Jun 2014 01:11:56 +0000 (18:11 -0700)]
configure.ac: Alphabetize AC_CONFIG_FILES.

This isn't supposed to be difficult.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years agoconfigure.ac: Remove single quotes to fix syntax highlighting.
Matt Turner [Wed, 11 Jun 2014 01:08:10 +0000 (18:08 -0700)]
configure.ac: Remove single quotes to fix syntax highlighting.

Please stop adding them.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
10 years agometa: save and restore swizzle for _GenerateMipmap
Robert Bragg [Sun, 8 Jun 2014 18:02:41 +0000 (19:02 +0100)]
meta: save and restore swizzle for _GenerateMipmap

This makes sure to use a no-op swizzle while iteratively rendering each
level of a mipmap; otherwise we may lose components and effectively
apply the swizzle twice by the time these levels are sampled.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965/vec4: Emit smarter code for b2f of a comparison
Ian Romanick [Wed, 11 Jun 2014 01:07:50 +0000 (18:07 -0700)]
i965/vec4: Emit smarter code for b2f of a comparison

Previously we would emit the comparison, emit an AND to mask off extra
bits from the comparison result, then convert the result to float.  Now,
do the comparison, then use a cleverly constructed SEL to pick either
0.0f or 1.0f.
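
In plain C terms the two strategies look roughly like this. It is only
a model of the instruction sequences, assuming a true comparison leaves
all bits set in the result; the function names are invented for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old sequence: CMP, AND to mask off extra bits, convert to float. */
static float
sketch_b2f_old(float a, float b)
{
    uint32_t cmp = (a < b) ? 0xffffffffu : 0u;  /* assumed CMP result */
    uint32_t masked = cmp & 1u;                 /* AND */

    return (float)masked;                       /* int-to-float convert */
}

/* New sequence: CMP sets a flag, a predicated SEL picks the constant. */
static float
sketch_b2f_new(float a, float b)
{
    bool flag = a < b;                          /* CMP */

    return flag ? 1.0f : 0.0f;                  /* SEL */
}
```

Both helpers return the same value for any pair of inputs; the second
one simply needs one step less.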

No piglit regressions on Ivybridge.

total instructions in shared programs: 1642311 -> 1639449 (-0.17%)
instructions in affected programs:     136533 -> 133671 (-2.10%)
GAINED:                                0
LOST:                                  0

Programs that are affected appear to save between 1 and 5 instructions
(just by skimming the output from shader-db report.py).

v2: s/b2i/b2f/ in commit subject (noticed by Chris Forbes).  Remove
extraneous fix_3src_operand (suggested by Matt).  The latter change
required swapping the order of the operands and using predicate_inverse.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>