mesa.git
10 years agomapi: Prevent cast from pointer to integer of different size.
José Fonseca [Thu, 23 Jan 2014 13:21:52 +0000 (13:21 +0000)]
mapi: Prevent cast from pointer to integer of different size.

On Windows64.
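
On 64-bit Windows a pointer is 64 bits wide while `long` stays at 32 bits, so casting a pointer to `long` or `int` truncates it. The following is a minimal sketch of the usual remedy, not the actual mapi change; both helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the mapi sources: converts a pointer
 * to an integer without truncation.  uintptr_t can hold any pointer to
 * void, whereas `unsigned long` is only 32 bits on Win64. */
static uintptr_t
pointer_to_key(const void *ptr)
{
   return (uintptr_t) ptr;
}

/* Converts the integer back to the original pointer. */
static const void *
key_to_pointer(uintptr_t key)
{
   return (const void *) key;
}
```

Going through `uintptr_t` keeps the round trip lossless on both 32-bit and 64-bit targets.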

10 years agoc11: Update docs/license.html and include verbatim copy of Boost license.
José Fonseca [Thu, 23 Jan 2014 10:49:57 +0000 (10:49 +0000)]
c11: Update docs/license.html and include verbatim copy of Boost license.

10 years agoegl: Use C11 thread abstractions.
José Fonseca [Fri, 26 Apr 2013 07:04:17 +0000 (08:04 +0100)]
egl: Use C11 thread abstractions.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
10 years agomapi: Use C11 thread abstractions.
José Fonseca [Fri, 26 Apr 2013 07:04:06 +0000 (08:04 +0100)]
mapi: Use C11 thread abstractions.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
10 years agogallium: Use C11 thread abstractions.
José Fonseca [Fri, 26 Apr 2013 07:03:33 +0000 (08:03 +0100)]
gallium: Use C11 thread abstractions.

Note that PIPE_ROUTINE now returns an int.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
10 years agoc11: Import threads.h emulation library.
José Fonseca [Tue, 12 Mar 2013 10:37:46 +0000 (10:37 +0000)]
c11: Import threads.h emulation library.

Implementation is based on https://gist.github.com/2223710 with the
following modifications:
- inline implementation
- retain XP compatibility
- add temporary hack for static mutex initializers (as they are not part
  of the standard but still widely used internally)
- make TIME_UTC a conditional macro (some system headers already define
  it, so this prevents conflict)
- respect HAVE_PTHREAD macro
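
The "conditional macro" item in the list above can be sketched as follows. This is an illustration only: the fallback value and the helper name are assumptions, not taken from the imported library.

```c
#include <time.h>

/* Define TIME_UTC only when the system headers have not already done so.
 * The fallback value 1 is an assumption made for this sketch. */
#ifndef TIME_UTC
#define TIME_UTC 1
#endif

/* Hypothetical helper: returns the time base an emulation layer would
 * pass to a timespec_get()-style call. */
static int
emulated_time_base(void)
{
   return TIME_UTC;
}
```

Because the definition is guarded, a system header that already provides TIME_UTC wins and no redefinition conflict occurs.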

Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Chad Versace <chad.versace@linux.intel.com>
10 years agoos: Remove pipe_static_condvar.
José Fonseca [Tue, 12 Mar 2013 11:54:58 +0000 (11:54 +0000)]
os: Remove pipe_static_condvar.

Never used.

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agodocs: Mark ARB_arrays_of_arrays as started
Timothy Arceri [Thu, 23 Jan 2014 12:24:45 +0000 (23:24 +1100)]
docs: Mark ARB_arrays_of_arrays as started

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoglsl: remove remaining is_array variables
Timothy Arceri [Thu, 23 Jan 2014 12:22:01 +0000 (23:22 +1100)]
glsl: remove remaining is_array variables

Previously we needed is_array because array_size == NULL represented both
non-arrays and unsized arrays.  Now that we use a non-NULL
array_specifier to represent an unsized array, is_array is redundant.
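
A minimal sketch of why the flag becomes redundant, assuming simplified stand-in structures (all type and field names here are hypothetical, not the real AST classes):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the AST nodes; field names are illustrative. */
struct array_specifier_sketch {
   bool is_unsized;   /* true for a declaration such as `float a[]` */
   unsigned size;     /* meaningful only when !is_unsized */
};

struct declaration_sketch {
   /* NULL: not an array.  Non-NULL: an array, sized or unsized. */
   const struct array_specifier_sketch *array_specifier;
};

/* The pointer alone answers "is this an array?", so no flag is needed. */
static bool
declaration_is_array(const struct declaration_sketch *decl)
{
   return decl->array_specifier != NULL;
}

static bool
declaration_is_unsized_array(const struct declaration_sketch *decl)
{
   return decl->array_specifier != NULL && decl->array_specifier->is_unsized;
}
```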

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoglsl: create type name for arrays of arrays
Timothy Arceri [Thu, 23 Jan 2014 12:21:02 +0000 (23:21 +1100)]
glsl: create type name for arrays of arrays

We need to insert outermost dimensions in the correct spot; otherwise
the dimension order will be backwards.
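
The ordering problem can be sketched like this. The function below is a hypothetical illustration of the idea, not the code from the patch: when an array type wraps an element type that is itself an array, the new (outermost) dimension has to go before the existing brackets.

```c
#include <stdio.h>
#include <string.h>

/* Builds the name of an array type whose element type is named `element`
 * and whose own (outermost) length is `length`.  The new dimension is
 * inserted before any existing "[...]" suffix, so wrapping "float[3]" in
 * an array of 2 gives "float[2][3]" rather than "float[3][2]". */
static void
array_type_name(char *out, size_t out_size,
                const char *element, unsigned length)
{
   const char *bracket = strchr(element, '[');

   if (bracket == NULL) {
      /* Element is not an array: just append the dimension. */
      snprintf(out, out_size, "%s[%u]", element, length);
   } else {
      /* Element is already an array: insert before its first dimension. */
      int base_len = (int) (bracket - element);
      snprintf(out, out_size, "%.*s[%u]%s",
               base_len, element, length, bracket);
   }
}
```

Simply appending the new dimension would name an array of 2 `float[3]` elements `float[3][2]`, which reads as the opposite nesting.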

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoglsl: Allow arrays of arrays as input to vertex shader
Timothy Arceri [Thu, 23 Jan 2014 12:20:25 +0000 (23:20 +1100)]
glsl: Allow arrays of arrays as input to vertex shader

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoglsl: only call mark_max_array if we are assigning an
Timothy Arceri [Thu, 23 Jan 2014 12:19:54 +0000 (23:19 +1100)]
glsl: only call mark_max_array if we are assigning an array

This change does not fix or prevent any bugs;
it just seems reasonable to do.

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoglsl: Add ARB_arrays_of_arrays support to yacc definition and ast
Timothy Arceri [Thu, 23 Jan 2014 12:16:41 +0000 (23:16 +1100)]
glsl: Add ARB_arrays_of_arrays support to yacc definition and ast

Adds array specifier object to hold array information

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agomesa: Add ARB_arrays_of_arrays
Timothy Arceri [Thu, 23 Jan 2014 12:15:29 +0000 (23:15 +1100)]
mesa: Add ARB_arrays_of_arrays

Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: switch eu-emitter to use FS IR and fs_generator
Topi Pohjolainen [Tue, 10 Dec 2013 13:12:30 +0000 (15:12 +0200)]
i965/blorp: switch eu-emitter to use FS IR and fs_generator

No regressions on IVB (piglit quick + unit tests).

v2 (Paul):
  - no need to patch the unit tests anymore. Original logic
    was altered and unit tests updated to match the
    fs-generator
  - lrp emission moves from the blorp compiler core into the
    emitter here (previously there was a separate refactoring
    patch which is not really needed anymore as the lrp logic
    got refactored when the original lrp logic got fixed).
  - pass 'BRW_BLORP_RENDERBUFFER_BINDING_TABLE_INDEX' to the
    generator in fs_inst::target instead of hardcoding it

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/fs: add support for BRW_OPCODE_AVG in fs_generator
Topi Pohjolainen [Tue, 17 Dec 2013 14:39:16 +0000 (16:39 +0200)]
i965/fs: add support for BRW_OPCODE_AVG in fs_generator

Needed for compiling blorp blit programs.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/fs: introduce blorp specific rt-write for fs_generator
Topi Pohjolainen [Tue, 17 Dec 2013 12:00:50 +0000 (14:00 +0200)]
i965/fs: introduce blorp specific rt-write for fs_generator

The compiler for blorp programs emits the instructions for message
construction itself, so the generator needs to skip them when blorp
programs are translated for the hardware.
In addition, the binding table control is special for blorp
programs, and the generator does not need to update the binding
tables associated with the compiler bookkeeping (this in fact
gets thrown away, as the blorp compiler sets the program data
in its own way).

v2 (Paul): do not hardcode the binding table index but use
           fs_inst::target instead.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/fs: allow unit tests to dump the final patched assembly
Topi Pohjolainen [Wed, 11 Dec 2013 08:58:38 +0000 (10:58 +0200)]
i965/fs: allow unit tests to dump the final patched assembly

Unit tests comparing generated blorp programs to known-good ones need
the dump written to a designated file instead of the default
standard output. The comparison also expects the jump counters
of if-else instructions to be set correctly, so the dump
needs to be taken _after_ 'patch_IF_ELSE()' is run (the default
dump of the fs_generator is taken before).

v2 (Paul): dropped the redundant 'dump_enabled' argument

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: wrap brw_IF/ELSE/ENDIF() into eu-emitter
Topi Pohjolainen [Mon, 2 Dec 2013 08:48:59 +0000 (10:48 +0200)]
i965/blorp: wrap brw_IF/ELSE/ENDIF() into eu-emitter

v2 (Paul): renamed emit_if() to emit_cmp_if()

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: wrap RNDD (/brw_RNDD(&func, /emit_rndd(/)
Topi Pohjolainen [Fri, 29 Nov 2013 11:29:56 +0000 (13:29 +0200)]
i965/blorp: wrap RNDD (/brw_RNDD(&func, /emit_rndd(/)

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: wrap FRC (/brw_FRC(&func, /emit_frc(/)
Topi Pohjolainen [Fri, 29 Nov 2013 11:27:58 +0000 (13:27 +0200)]
i965/blorp: wrap FRC (/brw_FRC(&func, /emit_frc(/)

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: wrap MUL (/brw_MUL(&func, /emit_mul(/)
Topi Pohjolainen [Fri, 29 Nov 2013 11:20:11 +0000 (13:20 +0200)]
i965/blorp: wrap MUL (/brw_MUL(&func, /emit_mul(/)

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: wrap OR (/brw_OR(&func, /emit_or(/)
Topi Pohjolainen [Fri, 29 Nov 2013 11:05:57 +0000 (13:05 +0200)]
i965/blorp: wrap OR (/brw_OR(&func, /emit_or(/)

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: wrap SHL (/brw_SHL(&func, /emit_shl(/)
Topi Pohjolainen [Fri, 29 Nov 2013 11:02:32 +0000 (13:02 +0200)]
i965/blorp: wrap SHL (/brw_SHL(&func, /emit_shl(/)

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: wrap SHR (/brw_SHR(&func, /emit_shr(/)
Topi Pohjolainen [Fri, 29 Nov 2013 10:59:42 +0000 (12:59 +0200)]
i965/blorp: wrap SHR (/brw_SHR(&func, /emit_shr(/)

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: wrap ADD (/brw_ADD(&func, /emit_add(/)
Topi Pohjolainen [Fri, 29 Nov 2013 10:32:03 +0000 (12:32 +0200)]
i965/blorp: wrap ADD (/brw_ADD(&func, /emit_add(/)

In addition, the special case requiring explicit execution size
control is wrapped manually.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: wrap AND (/brw_AND(&func, /emit_and(/)
Topi Pohjolainen [Fri, 29 Nov 2013 10:27:23 +0000 (12:27 +0200)]
i965/blorp: wrap AND (/brw_AND(&func, /emit_and(/)

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: wrap MOV (/brw_MOV(&func, /emit_mov(/)
Topi Pohjolainen [Fri, 29 Nov 2013 10:17:38 +0000 (12:17 +0200)]
i965/blorp: wrap MOV (/brw_MOV(&func, /emit_mov(/)

In addition, the two special cases requiring explicit execution
size control are wrapped manually.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: wrap emission of if-equal-assignment
Topi Pohjolainen [Sat, 30 Nov 2013 15:11:41 +0000 (17:11 +0200)]
i965/blorp: wrap emission of if-equal-assignment

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: wrap emission of conditional assignment
Topi Pohjolainen [Sat, 30 Nov 2013 15:06:19 +0000 (17:06 +0200)]
i965/blorp: wrap emission of conditional assignment

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: move emission of sample combining into eu-emitter
Topi Pohjolainen [Mon, 2 Dec 2013 12:56:49 +0000 (14:56 +0200)]
i965/blorp: move emission of sample combining into eu-emitter

v2 (Paul): pass the combining opcode as an argument to emit_combine().
           This keeps manual_blend_average() self-contained
           documentation-wise.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: move emission of rt-write into eu-emitter
Topi Pohjolainen [Mon, 2 Dec 2013 12:12:39 +0000 (14:12 +0200)]
i965/blorp: move emission of rt-write into eu-emitter

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: move emission of texture lookup into eu-emitter
Topi Pohjolainen [Mon, 2 Dec 2013 12:01:54 +0000 (14:01 +0200)]
i965/blorp: move emission of texture lookup into eu-emitter

Resolving of the hardware message type is moved into the
emitter also in preparation for switching to use fs_generator.
The generator wants to translate the high level op-code into
the message type and hence the emitter needs to know the
original op-code.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/fs: introduce non-compressed equivalent of tex_cms
Topi Pohjolainen [Tue, 10 Dec 2013 14:38:15 +0000 (16:38 +0200)]
i965/fs: introduce non-compressed equivalent of tex_cms

v2: introduces 'SHADER_OPCODE_TXF_UMS' also for gen8

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965: rename tex_ms to tex_cms
Topi Pohjolainen [Tue, 10 Dec 2013 14:36:31 +0000 (16:36 +0200)]
i965: rename tex_ms to tex_cms

Prepares for the introduction of non-compressed multi-sampled
lookup used in the blorp programs.

v2: now also taking into account gen8

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: move emission of pixel kill into eu-emitter
Topi Pohjolainen [Mon, 2 Dec 2013 09:09:19 +0000 (11:09 +0200)]
i965/blorp: move emission of pixel kill into eu-emitter

The combination of four separate comparison operations and
the masked "and" requires special treatment when moving
to FS LIR.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965/blorp: introduce separate eu-emitter for blit compiler
Topi Pohjolainen [Fri, 29 Nov 2013 09:57:15 +0000 (11:57 +0200)]
i965/blorp: introduce separate eu-emitter for blit compiler

Prepares for presenting blorp blit programs using FS IR that
allows EU-assembly generation using i965 glsl-compiler
backend (fs_generator).

v2: rebased on top of endif-jump counter fix (moving the
    added brw_set_uip_jip() into the emitter)

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoi965: Support 32 texture image units on Haswell+.
Kenneth Graunke [Wed, 15 Jan 2014 18:08:38 +0000 (10:08 -0800)]
i965: Support 32 texture image units on Haswell+.

The Intel closed source OpenGL driver recently began supporting 32
texture image units on Haswell.  This makes the open source driver
support 32 as well.

Earlier generations don't have the message header field required to
support more than 16 sampler states, so we continue to advertise 16
there.

On Haswell, this causes us to advertise:
- GL_MAX_TEXTURE_IMAGE_UNITS = 32
- GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS = 32
- GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS = 96
instead of the old values of 16, 16, and 48.
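
The figures above are consistent with the combined limit being the per-stage limit multiplied by three shader stages. That stage count is inferred from the numbers (48 = 3 * 16, 96 = 3 * 32) rather than stated in the message, so treat this as a sketch with hypothetical names:

```c
/* Number of shader stages assumed for this sketch (vertex, geometry,
 * fragment); inferred from 48 = 3 * 16 and 96 = 3 * 32. */
#define SKETCH_NUM_STAGES 3

/* Combined limit as the per-stage limit times the number of stages. */
static unsigned
combined_texture_image_units(unsigned units_per_stage)
{
   return SKETCH_NUM_STAGES * units_per_stage;
}
```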

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/fs: Switch from BRW_MAX_TEX_UNIT to the actual limit.
Kenneth Graunke [Sat, 18 Jan 2014 22:48:11 +0000 (14:48 -0800)]
i965/fs: Switch from BRW_MAX_TEX_UNIT to the actual limit.

BRW_MAX_TEX_UNIT is about to grow, but only Gen7+ will be able to
support the new larger value.  On older platforms, we don't want to
allocate the extra space - it would just be a waste.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agomesa: Bump MAX_TEXTURE_IMAGE_UNITS to 32.
Kenneth Graunke [Wed, 15 Jan 2014 18:08:06 +0000 (10:08 -0800)]
mesa: Bump MAX_TEXTURE_IMAGE_UNITS to 32.

This allows drivers to optionally support more than 16 texture units.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/vec4: Support arbitrarily large sampler state indices on Haswell+.
Kenneth Graunke [Sat, 18 Jan 2014 22:32:49 +0000 (14:32 -0800)]
i965/vec4: Support arbitrarily large sampler state indices on Haswell+.

Like the scalar backend, we add an offset to the "Sampler State Pointer"
field to select a group of 16 samplers, then use the "Sampler Index"
field to select within that group.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/vec4: Refactor sampler message setup.
Kenneth Graunke [Sat, 18 Jan 2014 22:29:19 +0000 (14:29 -0800)]
i965/vec4: Refactor sampler message setup.

The next patch adds an additional case where the message header is
necessary.  So we want to do the g0 copy if inst->header_present is set,
rather than inst->texture_offset.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/vec4: Don't set header_present if texel offsets are all 0.
Kenneth Graunke [Sat, 18 Jan 2014 22:34:07 +0000 (14:34 -0800)]
i965/vec4: Don't set header_present if texel offsets are all 0.

In theory, a shader might use textureOffset() but set all the texel
offsets to zero.  In that case, we don't actually need to set up the
message header - zero is the implicit default.

By moving the texture_offset setup before the header_present setup, we
can easily only set header_present when there are non-zero texel offset
values.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/fs: Support arbitrarily large sampler state indices on Haswell+.
Kenneth Graunke [Sat, 18 Jan 2014 21:29:39 +0000 (13:29 -0800)]
i965/fs: Support arbitrarily large sampler state indices on Haswell+.

The message descriptor's "Sampler Index" field is only 4 bits (on all
generations of hardware), so it can only represent indices 0 through 15.

Haswell introduced a new field in the message header - "Sampler State
Pointer".  Normally, this is copied straight from g0, but we can also
add a byte offset (as long as it's a multiple of 32).

This patch uses a "Sampler State Pointer" offset to select a group of
16 sampler states, and then uses the "Sampler Index" field to select
the state within that group.
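
The group/index split described above can be sketched as plain arithmetic. The sampler state size used here is an assumption made for illustration, and the structure and function names are hypothetical; this is not the generator code itself.

```c
#include <stdint.h>

/* Size of one sampler state entry in bytes.  This value is an assumption
 * made for the sketch, not something stated in the commit message. */
#define SKETCH_SAMPLER_STATE_SIZE 16u

struct sampler_address_sketch {
   uint32_t pointer_offset; /* byte offset added to "Sampler State Pointer" */
   uint32_t index;          /* 4-bit "Sampler Index", 0..15 */
};

/* Splits a flat sampler number into a group offset and an in-group index. */
static struct sampler_address_sketch
split_sampler(uint32_t sampler)
{
   struct sampler_address_sketch addr;
   uint32_t group = sampler / 16;

   addr.pointer_offset = group * 16 * SKETCH_SAMPLER_STATE_SIZE;
   addr.index = sampler % 16;
   return addr;
}
```

Under the assumed entry size each group spans 256 bytes, so every offset produced is a multiple of 32, as the header field requires.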

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/fs: Plumb sampler index into emit_texture_gen7.
Kenneth Graunke [Sat, 18 Jan 2014 21:28:40 +0000 (13:28 -0800)]
i965/fs: Plumb sampler index into emit_texture_gen7.

We'll need this in the next patch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/fs: Refactor sampler message header to duplicate less code.
Kenneth Graunke [Sat, 18 Jan 2014 20:48:18 +0000 (12:48 -0800)]
i965/fs: Refactor sampler message header to duplicate less code.

Previously, the code to copy g0 to the message header existed in two
places - one for the texture offset case, and one for any other case.

By treating texture_offset as a special case of header_present, we can
remove this duplication and shorten the code.  Future patches which add
new header fields also won't have to add additional duplication.

This also clarifies a confusing construct.  The old code contained:

   } else if (inst->header_present) {
      if (brw->gen >= 7) {
         ...explicit copy from g0 to the message header...
      } else {
         /* Set up an implied move from g0 to the MRF. */
      }
   }

This looks like it might set up an implied move on Sandybridge, which
doesn't support those.  However, Sandybridge only uses a message header
for texture offsets, so it would never hit this code path.  The new code
avoids this implicit knowledge by only setting up an implied move on
Gen4-5.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965: Use get_element_ud to shorten texture header access.
Kenneth Graunke [Sat, 18 Jan 2014 20:49:58 +0000 (12:49 -0800)]
i965: Use get_element_ud to shorten texture header access.

This is shorter, easier to read, and further from the 80 column limit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agogallium/util: util_format_srgb should not return FORMAT_NONE for sRGB formats
Marek Olšák [Tue, 21 Jan 2014 18:53:45 +0000 (19:53 +0100)]
gallium/util: util_format_srgb should not return FORMAT_NONE for sRGB formats

This fixes a serious regression introduced
in 4e549ddb500cf677b6fa16d9ebdfa67cc23da097.

Cc: 9.2 10.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agogallium: remove PIPE_CAP_SCALED_RESOLVE
Marek Olšák [Fri, 17 Jan 2014 21:57:39 +0000 (22:57 +0100)]
gallium: remove PIPE_CAP_SCALED_RESOLVE

If any driver doesn't support this, it can use a blit after resolving
the samples.

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoradeonsi: use hardware scissors correctly
Marek Olšák [Mon, 13 Jan 2014 22:42:18 +0000 (23:42 +0100)]
radeonsi: use hardware scissors correctly

Use the WINDOW and VPORT scissors for the framebuffer and scissor test,
respectively. The other two scissors are disabled (they cover the max fb size).

We actually have 16 VPORT scissors, which will map well to ARB_viewport_array.

Also, we don't need to write SC_WINDOW_OFFSET with this commit, because it's
disabled everywhere.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
10 years agoradeonsi: handle R600_CONTEXT_PS_PARTIAL_FLUSH in si_emit_cache_flush
Marek Olšák [Mon, 13 Jan 2014 12:15:19 +0000 (13:15 +0100)]
radeonsi: handle R600_CONTEXT_PS_PARTIAL_FLUSH in si_emit_cache_flush

For consistency only. This is currently unused by radeonsi.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agor600g,radeonsi: if discarding whole buffer range, discard whole resource instead
Marek Olšák [Mon, 13 Jan 2014 12:10:06 +0000 (13:10 +0100)]
r600g,radeonsi: if discarding whole buffer range, discard whole resource instead

Also set the unsynchronized flag if the whole resource was discarded
to avoid doing buffer-busy checks again.
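
A rough sketch of the flag promotion, under simplifying assumptions: the flag names and values are hypothetical stand-ins for the Gallium transfer flags, and the sketch assumes a whole-resource discard always yields fresh storage.

```c
/* Hypothetical flag values for this sketch; the real Gallium transfer
 * flags have different names and values. */
enum {
   SKETCH_DISCARD_RANGE          = 1 << 0,
   SKETCH_DISCARD_WHOLE_RESOURCE = 1 << 1,
   SKETCH_UNSYNCHRONIZED         = 1 << 2
};

static unsigned
promote_discard(unsigned usage, unsigned offset, unsigned size,
                unsigned buffer_size)
{
   /* A discarded range that covers the whole buffer is equivalent to
    * discarding the whole resource. */
   if ((usage & SKETCH_DISCARD_RANGE) && offset == 0 && size == buffer_size)
      usage |= SKETCH_DISCARD_WHOLE_RESOURCE;

   /* Assumption: after a whole-resource discard the storage is fresh,
    * so there is nothing left to wait on. */
   if (usage & SKETCH_DISCARD_WHOLE_RESOURCE)
      usage |= SKETCH_UNSYNCHRONIZED;

   return usage;
}
```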

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agogallium/u_upload_mgr: don't expose u_upload_flush
Marek Olšák [Mon, 13 Jan 2014 12:03:25 +0000 (13:03 +0100)]
gallium/u_upload_mgr: don't expose u_upload_flush

It's unused and shouldn't be used at all in my opinion.

If some driver doesn't support the unsynchronized flag, u_upload_mgr should
avoid the synchronization by other means, e.g. by using the DONTBLOCK flag.

10 years agogallium/hud: just unmap the upload vertex buffer instead of recreating it
Marek Olšák [Mon, 13 Jan 2014 11:59:14 +0000 (12:59 +0100)]
gallium/hud: just unmap the upload vertex buffer instead of recreating it

10 years agogallium/vl: use u_upload_mgr to upload vertices for vl_compositor
Marek Olšák [Mon, 13 Jan 2014 12:51:21 +0000 (13:51 +0100)]
gallium/vl: use u_upload_mgr to upload vertices for vl_compositor

This is the recommended way for streaming vertices. Always use this if you
need to upload vertices every frame.

Reviewed-by: Christian König <christian.koenig@amd.com>
10 years agointel: Fix initial MakeCurrent for single-buffer drawables
Kristian Høgsberg [Tue, 21 Jan 2014 20:17:03 +0000 (12:17 -0800)]
intel: Fix initial MakeCurrent for single-buffer drawables

Commit 05da4a7a5e7d5bd988cb31f94ed8e1f053d9ee39 attempts to eliminate the
call to intel_update_renderbuffer() in the case where we already have a
drawbuffer for the drawable.  Unfortunately this only checks the
back left renderbuffer, which breaks in case of single buffer drawables.

This means that the initial viewport will not be set in that case.  Instead,
we now check whether the initial viewport has not been set, in which case
we call out to intel_update_renderbuffer().

https://bugs.freedesktop.org/show_bug.cgi?id=73862

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
10 years agoglsl: Simplify aggregate type inference to prepare for ARB_arrays_of_arrays.
Paul Berry [Tue, 21 Jan 2014 23:41:26 +0000 (15:41 -0800)]
glsl: Simplify aggregate type inference to prepare for ARB_arrays_of_arrays.

Most of the time it is not necessary to perform type inference to
compile GLSL; the type of every expression can be inferred from the
contents of the expression itself (and previous type declarations).
The exception is aggregate initializers: their type is determined by
the LHS of the variable being assigned to.  For example, in the
statement:

   mat2 foo = { { 1, 2 }, { 3, 4 } };

the type of { 1, 2 } is only known to be vec2 (as opposed to, say,
ivec2, uvec2, int[2], or a struct) because of the fact that the result
is being assigned to a mat2.

Previous to this patch, we handled this situation by doing some type
inference during parsing: when parsing a declaration like the one
above, we would call _mesa_set_aggregate_type(), which would infer the
type of each aggregate initializer and store it in the corresponding
ast_aggregate_initializer::constructor_type field.  Since this
happened at parse time, we couldn't do the type inference using
glsl_type objects; we had to use ast_type_specifiers, which are much
more awkward to work with.  Things are about to get more complicated
when we add support for ARB_arrays_of_arrays.

This patch simplifies things by postponing the call to
_mesa_set_aggregate_type() until ast-to-hir time, when we have access
to glsl_type objects.  As a side benefit, we only need to have one
call to _mesa_set_aggregate_type() now, instead of six.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoclover: Don't crash on NULL global buffer objects.
Jan Vesely [Fri, 17 Jan 2014 01:22:14 +0000 (20:22 -0500)]
clover: Don't crash on NULL global buffer objects.

Specs say "If the argument is a buffer object, the arg_value
pointer can be NULL or point to a NULL value in which case a NULL
value will be used as the value for the argument declared as a
pointer to __global or __constant memory in the kernel."

So don't crash when somebody does that.

v2: Insert NULL into input buffer instead of buffer handle pair
    Fix constant_argument too
    Drop r600 driver changes

v3: Fix inserting NULL pointer

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
10 years agometa: Move loop variable declaration outside loop.
Vinson Lee [Wed, 22 Jan 2014 06:46:39 +0000 (22:46 -0800)]
meta: Move loop variable declaration outside loop.

Fixes MSVC build error introduced with commit
69b258cb4636315b4c1aaaceeedd1eed8af98ba8.

meta.c(618) : error C2143: syntax error : missing ';' before 'type'
meta.c(618) : error C2143: syntax error : missing ')' before 'type'
meta.c(618) : error C2065: 'i' : undeclared identifier
meta.c(618) : warning C4552: '<' : operator has no effect; expected operator with side-effect
meta.c(618) : error C2059: syntax error : ')'
meta.c(618) : error C2143: syntax error : missing ';' before '{'
meta.c(619) : error C2065: 'i' : undeclared identifier
meta.c(620) : error C2065: 'i' : undeclared identifier
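
Errors of this shape are what a C89-only compiler reports for a `for (int i = 0; ...)` loop. A minimal sketch of the portable form, using a hypothetical function rather than the code from meta.c:

```c
/* Sums `count` values.  The loop counter is declared at the top of the
 * block, which C89-only compilers accept; `for (int i = 0; ...)` needs
 * C99. */
static int
sum_values(const int *values, int count)
{
   int i;
   int total = 0;

   for (i = 0; i < count; i++)
      total += values[i];

   return total;
}
```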

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
10 years agoi965/blorp: use BRW_COMPRESSION_2NDHALF for second half LRP
Topi Pohjolainen [Tue, 21 Jan 2014 09:33:58 +0000 (11:33 +0200)]
i965/blorp: use BRW_COMPRESSION_2NDHALF for second half LRP

No known bugs fixed but this is now in line with fs-generator.
No regressions on IVB.

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/blorp: patch jump counters also for endif
Topi Pohjolainen [Tue, 21 Jan 2014 08:31:10 +0000 (10:31 +0200)]
i965/blorp: patch jump counters also for endif

No known bugs fixed but this is now in line with fs-generator.
No regressions on IVB.

Eric further explained that:

  "The endif jump, since it's forward, is just an optimization to
   have set right -- otherwise, the GPU will just step forward
   instruction by instruction until it hits something else that
   updates the per-channel PC."

Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years agomesa: Change redundant code into loops in texstate.c.
Paul Berry [Thu, 9 Jan 2014 19:34:33 +0000 (11:34 -0800)]
mesa: Change redundant code into loops in texstate.c.

This is possible now that ctx->Shader.CurrentProgram is an array.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Change redundant code into loops in shaderapi.c.
Paul Berry [Thu, 9 Jan 2014 19:33:15 +0000 (11:33 -0800)]
mesa: Change redundant code into loops in shaderapi.c.

This is possible now that ctx->Shader.CurrentProgram is an array.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Remove ad-hoc arrays of gl_shader_program.
Paul Berry [Thu, 9 Jan 2014 19:32:00 +0000 (11:32 -0800)]
mesa: Remove ad-hoc arrays of gl_shader_program.

Now that we have a ctx->Shader.CurrentProgram array, we can just use
it directly.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agometa: Replace save_state::{Vertex,Geometry,Fragment}Shader with an array.
Paul Berry [Thu, 9 Jan 2014 19:29:17 +0000 (11:29 -0800)]
meta: Replace save_state::{Vertex,Geometry,Fragment}Shader with an array.

Since ctx->Shader.CurrentProgram is now an array (replacing
Current{Vertex,Geometry,Fragment}Program), this allows some meta code
to be rolled up into loops.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoi965: Fix comments to refer to the new ctx->Shader.CurrentProgram array.
Paul Berry [Thu, 9 Jan 2014 19:28:20 +0000 (11:28 -0800)]
i965: Fix comments to refer to the new ctx->Shader.CurrentProgram array.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Fold long lines introduced by the previous patch.
Paul Berry [Thu, 9 Jan 2014 19:27:38 +0000 (11:27 -0800)]
mesa: Fold long lines introduced by the previous patch.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Replace ctx->Shader.Current{Vertex,Fragment,Geometry}Program with an array.
Paul Berry [Thu, 9 Jan 2014 19:16:27 +0000 (11:16 -0800)]
mesa: Replace ctx->Shader.Current{Vertex,Fragment,Geometry}Program with an array.

These are replaced with
ctx->Shader.CurrentProgram[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}].
In patches to follow, this will allow us to replace a lot of ad-hoc
logic with a variable index into the array.
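
A minimal sketch of what indexing by stage buys, using hypothetical miniature types rather than the real context structures:

```c
#include <stddef.h>

/* Hypothetical miniature of the state described above; the real
 * structures carry many more fields. */
enum sketch_shader_stage {
   SKETCH_SHADER_VERTEX,
   SKETCH_SHADER_GEOMETRY,
   SKETCH_SHADER_FRAGMENT,
   SKETCH_SHADER_STAGES
};

struct sketch_program { int id; };

struct sketch_shader_state {
   struct sketch_program *CurrentProgram[SKETCH_SHADER_STAGES];
};

/* Counts the stages that have a program bound.  With three separately
 * named fields this would need three hand-written checks. */
static int
count_bound_stages(const struct sketch_shader_state *state)
{
   int stage;
   int bound = 0;

   for (stage = 0; stage < SKETCH_SHADER_STAGES; stage++) {
      if (state->CurrentProgram[stage] != NULL)
         bound++;
   }
   return bound;
}
```

Adding a further stage to such a layout only means adding an enum value; the loop itself stays unchanged.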

With the exception of the changes to mtypes.h, this patch was
generated entirely by the command:

    find src -type f '(' -iname '*.c' -o -iname '*.cpp' ')' \
    -print0 | xargs -0 sed -i \
    -e 's/\.CurrentVertexProgram/.CurrentProgram[MESA_SHADER_VERTEX]/g' \
    -e 's/\.CurrentGeometryProgram/.CurrentProgram[MESA_SHADER_GEOMETRY]/g' \
    -e 's/\.CurrentFragmentProgram/.CurrentProgram[MESA_SHADER_FRAGMENT]/g'

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoglsl/linker: Refactor in preparation for adding more shader stages.
Paul Berry [Tue, 7 Jan 2014 16:56:57 +0000 (08:56 -0800)]
glsl/linker: Refactor in preparation for adding more shader stages.

Rather than maintain separately named arrays and counts for vertex,
geometry, and fragment shaders, just maintain these as arrays indexed
by the gl_shader_type enum.

v2: When there is neither a vertex nor a geometry shader, set
prog->LastClipDistanceArraySize = 0, and clarify that the value is
not used.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: use _mesa_validate_shader_target() more frequently.
Paul Berry [Tue, 7 Jan 2014 23:19:07 +0000 (15:19 -0800)]
mesa: use _mesa_validate_shader_target() more frequently.

This patch replaces code in _mesa_new_shader() and delete_shader_cb()
that checks the type of a shader with calls to
_mesa_validate_shader_target().  This has two advantages: it allows
for a more thorough check (since _mesa_validate_shader_target()
doesn't permit shader targets that aren't supported by the back-end),
and it reduces the amount of code that will need to be modified when
adding new shader stages.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomain: Allow ctx == NULL in _mesa_validate_shader_target().
Paul Berry [Thu, 9 Jan 2014 23:30:10 +0000 (15:30 -0800)]
main: Allow ctx == NULL in _mesa_validate_shader_target().

This will allow this function to be used in circumstances where there
is no context available, such as when building built-in GLSL
functions.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Make validate_shader_target() non-static.
Paul Berry [Tue, 7 Jan 2014 23:13:52 +0000 (15:13 -0800)]
mesa: Make validate_shader_target() non-static.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Replace _mesa_program_index_to_target with _mesa_shader_stage_to_program.
Paul Berry [Thu, 9 Jan 2014 21:42:05 +0000 (13:42 -0800)]
mesa: Replace _mesa_program_index_to_target with _mesa_shader_stage_to_program.

In my recent zeal to refactor Mesa's handling of the gl_shader_stage
enum, I accidentally wound up with two functions that do the same
thing: _mesa_program_index_to_target(), and
_mesa_shader_stage_to_program().

This patch keeps _mesa_shader_stage_to_program(), since its name is
more consistent with other related functions.  However, it changes the
signature so that it accepts an unsigned integer instead of a
gl_shader_stage--this avoids awkward casts when the function is called
from C++ code.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agollvmpipe: dump geometry shaders when using LP_DEBUG=tgsi
Dave Airlie [Tue, 21 Jan 2014 04:54:05 +0000 (14:54 +1000)]
llvmpipe: dump geometry shaders when using LP_DEBUG=tgsi

for consistency with vs and fs dumpers.

Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
10 years agomesa: Generate GL_INVALID_OPERATION for unsupported DSA TexStorage functions
Ian Romanick [Wed, 18 Dec 2013 22:43:19 +0000 (14:43 -0800)]
mesa: Generate GL_INVALID_OPERATION for unsupported DSA TexStorage functions

We have to make the functions available to work around a GLEW bug (see
comments already in the code), but if an application calls one of these
functions we should still generate GL_INVALID_OPERATION.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Silence many unused parameter warnings
Ian Romanick [Wed, 18 Dec 2013 22:39:26 +0000 (14:39 -0800)]
mesa: Silence many unused parameter warnings

main/texstorage.c: In function '_mesa_alloc_texture_storage':
main/texstorage.c:240:53: warning: unused parameter 'width' [-Wunused-parameter]
main/texstorage.c:241:37: warning: unused parameter 'height' [-Wunused-parameter]
main/texstorage.c:241:53: warning: unused parameter 'depth' [-Wunused-parameter]
main/texstorage.c: In function '_mesa_TextureStorage1DEXT':
main/texstorage.c:464:34: warning: unused parameter 'texture' [-Wunused-parameter]
main/texstorage.c:464:50: warning: unused parameter 'target' [-Wunused-parameter]
main/texstorage.c:464:66: warning: unused parameter 'levels' [-Wunused-parameter]
main/texstorage.c:465:34: warning: unused parameter 'internalformat' [-Wunused-parameter]
main/texstorage.c:466:35: warning: unused parameter 'width' [-Wunused-parameter]
main/texstorage.c: In function '_mesa_TextureStorage2DEXT':
main/texstorage.c:473:34: warning: unused parameter 'texture' [-Wunused-parameter]
main/texstorage.c:473:50: warning: unused parameter 'target' [-Wunused-parameter]
main/texstorage.c:473:66: warning: unused parameter 'levels' [-Wunused-parameter]
main/texstorage.c:474:34: warning: unused parameter 'internalformat' [-Wunused-parameter]
main/texstorage.c:475:35: warning: unused parameter 'width' [-Wunused-parameter]
main/texstorage.c:475:50: warning: unused parameter 'height' [-Wunused-parameter]
main/texstorage.c: In function '_mesa_TextureStorage3DEXT':
main/texstorage.c:483:34: warning: unused parameter 'texture' [-Wunused-parameter]
main/texstorage.c:483:50: warning: unused parameter 'target' [-Wunused-parameter]
main/texstorage.c:483:66: warning: unused parameter 'levels' [-Wunused-parameter]
main/texstorage.c:484:34: warning: unused parameter 'internalformat' [-Wunused-parameter]
main/texstorage.c:485:35: warning: unused parameter 'width' [-Wunused-parameter]
main/texstorage.c:485:50: warning: unused parameter 'height' [-Wunused-parameter]
main/texstorage.c:485:66: warning: unused parameter 'depth' [-Wunused-parameter]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoi965: Ignore 'centroid' interpolation qualifier in case of persample shading
Anuj Phogat [Wed, 15 Jan 2014 18:23:02 +0000 (10:23 -0800)]
i965: Ignore 'centroid' interpolation qualifier in case of persample shading

This patch handles the use of the 'centroid' qualifier with 'in' variables
in a fragment shader when per-sample shading is enabled. Per-sample
shading for the whole fragment shader can be enabled either by calling
glEnable(GL_SAMPLE_SHADING) or by using the gl_SamplePosition or
gl_SampleID built-in variables in the fragment shader. The details
follow.

/* Enable sample shading using OpenGL API */
glEnable(GL_SAMPLE_SHADING);
glMinSampleShading(1.0);

Example fragment shader:
in vec4 a;
centroid in vec4 b;
main()
{
  ...
}

Variable 'a' will be interpolated at the sample location. But which
interpolation should we use for variable 'b'?

ARB_sample_shading recommends interpolation at sample position for
all the variables. GLSL 400 (and earlier) spec says that:

"When an interpolation qualifier is used, it overrides settings
established through the OpenGL API."
But this text was deleted in later versions of GLSL.

NVIDIA's and AMD's proprietary Linux drivers (at OpenGL 4.3)
interpolate at the sample position. This convinces me to use
a similar approach on Intel hardware.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965: Use sample barycentric coordinates with per sample shading
Anuj Phogat [Mon, 6 Jan 2014 21:59:18 +0000 (13:59 -0800)]
i965: Use sample barycentric coordinates with per sample shading

The current implementation of ARB_sample_shading doesn't set 'Barycentric
Interpolation Mode' correctly: it uses pixel barycentric coordinates
for per-sample shading. Instead, we should select perspective sample
or non-perspective sample barycentric coordinates.

It also enables using sample barycentric coordinates in case of a
fragment shader variable declared with 'sample' qualifier.
e.g. sample in vec4 pos;

A piglit test to verify the implementation has been posted on piglit
mailing list for review.

V2: Do not interpolate all the 'in' variables at sample position
    if fragment shader uses 'sample' qualifier with one of them.
    For example we have a fragment shader:
    #version 330
    #extension GL_ARB_gpu_shader5: require
    sample in vec4 a;
    in vec4 b;
    main()
    {
      ...
    }

    Only 'a' should be interpolated at the sample location, not 'b'.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965: Add an option to ignore sample qualifier
Anuj Phogat [Mon, 13 Jan 2014 20:26:55 +0000 (12:26 -0800)]
i965: Add an option to ignore sample qualifier

This will be useful in my next patch, which needs
_mesa_get_min_invocations_per_fragment() to ignore the sample
qualifier (prog->IsSample) based on a flag passed to it.

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agomesa/x86: Remove dead read_rgba_span_x86.h.
Matt Turner [Sun, 12 Jan 2014 04:37:51 +0000 (20:37 -0800)]
mesa/x86: Remove dead read_rgba_span_x86.h.

Dead since 304f7a13.

10 years agoi965/fs: Optimize LRP with x == y into a MOV.
Matt Turner [Fri, 10 Jan 2014 04:57:36 +0000 (20:57 -0800)]
i965/fs: Optimize LRP with x == y into a MOV.

total instructions in shared programs: 1487331 -> 1485988 (-0.09%)
instructions in affected programs:     45638 -> 44295 (-2.94%)
GAINED:                                7
LOST:                                  0
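
A rough sketch of the rewrite, assuming a made-up mini-IR: the
`sketch_lrp_inst` struct, its field names and the `SKETCH_OP_*` opcodes are
hypothetical and do not mirror the i965 backend's instruction class or
operand order. Since lrp(x, y, a) = x * (1 - a) + y * a, the blend factor
cancels out when x and y are the same register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcodes for the sketch. */
enum sketch_opcode { SKETCH_OP_MOV, SKETCH_OP_LRP };

/* Hypothetical instruction: for LRP, dst = x * (1 - a) + y * a. */
struct sketch_lrp_inst {
   enum sketch_opcode opcode;
   int dst;
   int x, y, a; /* source register numbers */
};

/* Turn "lrp dst, x, x, a" into "mov dst, x".
 * Returns true when the instruction was rewritten. */
static bool
opt_lrp_same_operands(struct sketch_lrp_inst *inst)
{
   assert(inst != NULL);

   if (inst->opcode != SKETCH_OP_LRP || inst->x != inst->y)
      return false;

   inst->opcode = SKETCH_OP_MOV; /* dst = x; y and a are ignored from now on */
   return true;
}
```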

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoglsl: Optimize open-coded lrp into lrp.
Jordan Justen [Mon, 4 Nov 2013 18:23:24 +0000 (10:23 -0800)]
glsl: Optimize open-coded lrp into lrp.

total instructions in shared programs: 1498191 -> 1487051 (-0.74%)
instructions in affected programs:     669388 -> 658248 (-1.66%)
GAINED:                                1
LOST:                                  0
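
A minimal sketch of recognizing one open-coded form, x * (1 - a) + y * a,
on a toy expression tree. The `sketch_expr` type and the `SK_*` kinds are
hypothetical and are not the GLSL IR classes. The sketch matches only this
exact operand order; a real pass would presumably also have to handle the
commuted variants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical expression kinds; SK_ONE stands for the constant 1.0. */
enum sketch_kind { SK_VAR, SK_ONE, SK_ADD, SK_SUB, SK_MUL };

struct sketch_expr {
   enum sketch_kind kind;
   int var;                            /* variable id when kind == SK_VAR */
   const struct sketch_expr *lhs, *rhs; /* operands of ADD, SUB and MUL */
};

/* Match add(mul(x, sub(1, a)), mul(y, a)) and report the variable ids.
 * Binary nodes are assumed to have both operands set. */
static bool
match_open_coded_lrp(const struct sketch_expr *e, int *x, int *y, int *a)
{
   assert(e != NULL && x != NULL && y != NULL && a != NULL);

   if (e->kind != SK_ADD)
      return false;

   const struct sketch_expr *left = e->lhs;
   const struct sketch_expr *right = e->rhs;
   if (left->kind != SK_MUL || right->kind != SK_MUL)
      return false;

   const struct sketch_expr *one_minus_a = left->rhs;
   if (left->lhs->kind != SK_VAR || one_minus_a->kind != SK_SUB)
      return false;
   if (one_minus_a->lhs->kind != SK_ONE || one_minus_a->rhs->kind != SK_VAR)
      return false;
   if (right->lhs->kind != SK_VAR || right->rhs->kind != SK_VAR)
      return false;

   /* Both occurrences of the blend factor must be the same variable. */
   if (right->rhs->var != one_minus_a->rhs->var)
      return false;

   *x = left->lhs->var;
   *y = right->lhs->var;
   *a = right->rhs->var;
   return true;
}
```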

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoi965: Enable AOS optimizations for the geometry shader.
Matt Turner [Fri, 3 Jan 2014 22:52:55 +0000 (14:52 -0800)]
i965: Enable AOS optimizations for the geometry shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: Vectorize multiple scalar assignments
Matt Turner [Sat, 21 Dec 2013 19:28:05 +0000 (11:28 -0800)]
glsl: Vectorize multiple scalar assignments

Reduces vertex shader instruction counts in DOTA2 by 6.42%, L4D2 by
4.61%, and CS:GO by 5.71%.

total instructions in shared programs: 1500153 -> 1498191 (-0.13%)
instructions in affected programs:     59919 -> 57957 (-3.27%)

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: Add parameter to .equals() to ignore an IR type.
Matt Turner [Thu, 2 Jan 2014 00:52:32 +0000 (16:52 -0800)]
glsl: Add parameter to .equals() to ignore an IR type.

Only implemented for ir_swizzles currently, but perhaps will be useful
for other IR types in the future.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa: rename PreferDP4 to OptimizeForAOS.
Matt Turner [Fri, 3 Jan 2014 22:48:53 +0000 (14:48 -0800)]
mesa: rename PreferDP4 to OptimizeForAOS.

This flag was really just a proxy for determining whether the backend
was vector (AOS) or scalar (SOA). It will be used to apply a future
optimization only for vector backends.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965/fs: Print the maximum register pressure.
Matt Turner [Sun, 15 Dec 2013 02:37:16 +0000 (18:37 -0800)]
i965/fs: Print the maximum register pressure.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoi965/fs: Show register pressure in dump_instructions() output.
Kenneth Graunke [Mon, 5 Aug 2013 06:34:01 +0000 (23:34 -0700)]
i965/fs: Show register pressure in dump_instructions() output.

Dumping the number of live registers at each IP allows us to see
register pressure and identify any local maxima.  This should
aid in debugging passes designed to reduce register pressure, as
well as optimizations that suddenly trigger spilling.
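
A minimal sketch of the counting idea, assuming each virtual register has
a single inclusive live interval. The `sketch_interval` type and the
function name are hypothetical and are not the backend's data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical live interval: live from start to end, both inclusive. */
struct sketch_interval {
   int start, end;
};

/* Fill live_count[ip] with the number of intervals that cover ip and
 * return the maximum, i.e. the peak register pressure. */
static int
compute_register_pressure(const struct sketch_interval *ranges,
                          size_t num_regs, int *live_count, int num_ips)
{
   int max_pressure = 0;

   assert(ranges != NULL && live_count != NULL && num_ips >= 0);

   for (int ip = 0; ip < num_ips; ip++)
      live_count[ip] = 0;

   for (size_t reg = 0; reg < num_regs; reg++) {
      assert(ranges[reg].start >= 0 && ranges[reg].end < num_ips);
      for (int ip = ranges[reg].start; ip <= ranges[reg].end; ip++)
         live_count[ip]++;
   }

   for (int ip = 0; ip < num_ips; ip++) {
      if (live_count[ip] > max_pressure)
         max_pressure = live_count[ip];
   }
   return max_pressure;
}
```

Printing live_count[ip] next to each instruction is what makes a local
maximum easy to spot in a dump.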

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoi965: Compute the number of live registers at each IP.
Kenneth Graunke [Mon, 5 Aug 2013 06:27:14 +0000 (23:27 -0700)]
i965: Compute the number of live registers at each IP.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/fs: Call opt_peephole_sel later in the optimization loop.
Matt Turner [Mon, 16 Dec 2013 04:07:05 +0000 (20:07 -0800)]
i965/fs: Call opt_peephole_sel later in the optimization loop.

Calling it after value numbering (added in the next commit) prevents
some instruction count regressions.

total instructions in shared programs: 1524387 -> 1523905 (-0.03%)
instructions in affected programs:     13112 -> 12630 (-3.68%)
GAINED:                                0
LOST:                                  3

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoi965/fs: Calculate interference better in register_coalesce.
Matt Turner [Sun, 15 Dec 2013 23:39:29 +0000 (15:39 -0800)]
i965/fs: Calculate interference better in register_coalesce.

Previously we simply considered two registers whose live ranges
overlapped to interfere. Cases such as

   set A     ------
   ...             |
   mov B, A  --    |
   ...         | B | A
   use B     --    |
   ...             |
   use A     ------

would be considered to interfere, even though B is an unmodified copy of
A whose live range fits wholly inside that of A.

If no writes to A or B occur between the mov B, A and the use of B then
we can safely coalesce them.

Instead of removing MOV instructions, we make them NOPs and remove them
all at once after the main pass has finished. This avoids recomputing
live intervals (which are needed to perform the previous step).
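
A minimal sketch of the safety check described above, assuming a toy
instruction that records only the register it writes. The
`sketch_write_inst` type, the function name and its parameters are
hypothetical and are not the code in register_coalesce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction: dst is the register written, or -1 for none. */
struct sketch_write_inst {
   int dst;
};

/* mov_ip is the index of "mov B, A" and last_use_b is the index of the
 * final read of B. The copy can be coalesced when neither A nor B is
 * written strictly between those two instructions. */
static bool
can_coalesce_copy(const struct sketch_write_inst *insts, int num_insts,
                  int mov_ip, int last_use_b, int reg_a, int reg_b)
{
   assert(insts != NULL);
   assert(0 <= mov_ip && mov_ip <= last_use_b && last_use_b < num_insts);
   assert(insts[mov_ip].dst == reg_b);

   for (int ip = mov_ip + 1; ip < last_use_b; ip++) {
      if (insts[ip].dst == reg_a || insts[ip].dst == reg_b)
         return false;
   }
   return true;
}
```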

total instructions in shared programs: 1543768 -> 1513077 (-1.99%)
instructions in affected programs:     951563 -> 920872 (-3.23%)
GAINED:                                46
LOST:                                  22

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoi965/fs: Support coalescing registers of size > 1.
Matt Turner [Wed, 11 Dec 2013 00:04:27 +0000 (16:04 -0800)]
i965/fs: Support coalescing registers of size > 1.

total instructions in shared programs: 1550048 -> 1549880 (-0.01%)
instructions in affected programs:     1896 -> 1728 (-8.86%)

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoi965/fs: Assert that var < num_vars.
Matt Turner [Sun, 8 Dec 2013 00:22:08 +0000 (16:22 -0800)]
i965/fs: Assert that var < num_vars.

Helped to track down a problem in a version of the next commit.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoi965/fs: Add a comment explaining how register coalescing works.
Matt Turner [Wed, 11 Dec 2013 00:05:19 +0000 (16:05 -0800)]
i965/fs: Add a comment explaining how register coalescing works.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoi965/fs: Add and use MAX_SAMPLER_MESSAGE_SIZE definition.
Matt Turner [Wed, 11 Dec 2013 00:22:56 +0000 (16:22 -0800)]
i965/fs: Add and use MAX_SAMPLER_MESSAGE_SIZE definition.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agomesa: Add STRINGIFY macro.
Matt Turner [Wed, 11 Dec 2013 00:21:16 +0000 (16:21 -0800)]
mesa: Add STRINGIFY macro.
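
The commit message does not show the definition, so the following is only
a sketch of the common stringification idiom and not necessarily what was
added. All `SKETCH_*` names and `sketch_*` functions are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* One-level form: stringifies the argument exactly as spelled. */
#define SKETCH_STRINGIFY_RAW(x) #x

/* Two-level form: the extra call lets a macro argument expand first. */
#define SKETCH_STRINGIFY(x) SKETCH_STRINGIFY_RAW(x)

#define SKETCH_MESSAGE_SIZE 11 /* hypothetical constant, illustration only */

/* Build a message at compile time by pasting string literals together. */
static const char *
sketch_size_text(void)
{
   return "size " SKETCH_STRINGIFY(SKETCH_MESSAGE_SIZE);
}

/* Small helper so the two forms can be compared as strings. */
static int
sketch_same_text(const char *a, const char *b)
{
   assert(a != NULL && b != NULL);
   return strcmp(a, b) == 0;
}
```

The one-level form yields the macro's name, while the two-level form yields
its value, which is usually what is wanted in assertion or error messages.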

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoi965/fs: Fix the example about overwriting uniforms in SIMD16.
Matt Turner [Sat, 7 Dec 2013 20:59:59 +0000 (12:59 -0800)]
i965/fs: Fix the example about overwriting uniforms in SIMD16.

mov takes only a single source argument. Example instruction
inexplicably changed from add to mov in commit f10f5e49.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoi965: Print reg_offset for vgrf of size > 1 in dump_instruction().
Matt Turner [Wed, 4 Dec 2013 23:01:16 +0000 (15:01 -0800)]
i965: Print reg_offset for vgrf of size > 1 in dump_instruction().

Previously we wouldn't print the +0 for the first part of a VGRF of size
greater than 1.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoglsl: Match unnamed record types across stages.
Grigori Goronzy [Tue, 26 Nov 2013 23:15:06 +0000 (00:15 +0100)]
glsl: Match unnamed record types across stages.

Unnamed record types are assigned to separate types per stage, e.g. if

uniform struct { ... } a;

is defined in both vertex and fragment shader, two separate types will
result with different names. When linking the shader, this results in a
type conflict. However, there is no reason why this should not be
allowed according to GLSL specifications. Compare and match record types
when linking shader stages to avoid this conflict.
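
A minimal sketch of a structural comparison, assuming a record is just an
ordered list of named, typed fields. The `sketch_field` and `sketch_record`
types are hypothetical and are not the glsl_type implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical field description. */
struct sketch_field {
   const char *name;
   int base_type;
};

/* Hypothetical record type; name is generated per stage for unnamed structs. */
struct sketch_record {
   const char *name;
   unsigned length;
   const struct sketch_field *fields;
};

/* Structural match: same field count, and the same field names and types
 * in the same order. The record's own name is deliberately ignored. */
static bool
records_match(const struct sketch_record *a, const struct sketch_record *b)
{
   assert(a != NULL && b != NULL);

   if (a->length != b->length)
      return false;

   for (unsigned i = 0; i < a->length; i++) {
      if (a->fields[i].base_type != b->fields[i].base_type ||
          strcmp(a->fields[i].name, b->fields[i].name) != 0)
         return false;
   }
   return true;
}
```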

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl: Extract function for record comparisons.
Grigori Goronzy [Tue, 26 Nov 2013 23:15:05 +0000 (00:15 +0100)]
glsl: Extract function for record comparisons.

Reviewed-by: Matt Turner <mattst88@gmail.com>