mesa.git
11 years agomesa: Fix _mesa_problem() on context destroy after application debug output
Eric Anholt [Sat, 23 Feb 2013 01:08:28 +0000 (17:08 -0800)]
mesa: Fix _mesa_problem() on context destroy after application debug output

This was apparently not noticed because we don't have any testing of
application-generated debug output.  However, as I'm changing the
GL-generated debug output to use the same path as
application/middleware-generated debug output, this obviously became an
issue.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
11 years agomesa: Move debug type/severity enums to mesa core.
Eric Anholt [Fri, 22 Feb 2013 23:06:19 +0000 (15:06 -0800)]
mesa: Move debug type/severity enums to mesa core.

These will get reused by new ARB_debug_output messages in drivers/core,
instead of having the caller pass GL enums and have us immediately
switch-statement those into enums.

The source enums will be handled in the next commit, because the way
different sources are handled at the moment is pretty strange.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
11 years agomesa: Replace open-coded _mesa_lookup_enum_by_nr().
Eric Anholt [Fri, 22 Feb 2013 22:47:15 +0000 (14:47 -0800)]
mesa: Replace open-coded _mesa_lookup_enum_by_nr().

The new one doesn't have the same behavior for GL_NO_ERROR, but we don't
produce errors with GL_NO_ERROR as the error type.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
11 years agomesa: Remove extra #define MAXSTRING duplicating MAX_DEBUG_MESSAGE_LENGTH.
Eric Anholt [Thu, 12 Jul 2012 17:21:13 +0000 (10:21 -0700)]
mesa: Remove extra #define MAXSTRING duplicating MAX_DEBUG_MESSAGE_LENGTH.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
11 years agodri/nouveau: NV17_3D class is not available for NV1a chipset
Marcin Slusarz [Sat, 16 Feb 2013 22:25:08 +0000 (23:25 +0100)]
dri/nouveau: NV17_3D class is not available for NV1a chipset

Should fix https://bugs.freedesktop.org/show_bug.cgi?id=60510

Note: this is a candidate for the stable branches

Acked-by: Francisco Jerez <currojerez@riseup.net>
11 years agotgsi: handle projection modifier for array textures.
Roland Scheidegger [Tue, 5 Mar 2013 16:24:32 +0000 (17:24 +0100)]
tgsi: handle projection modifier for array textures.

This partly reverts 6ace2e41da7dded630d932d03bacb7e14a93d47a.
Apparently with GL_MESA_texture_array fixed-function texturing
with texture arrays is possible, and hence we have to handle TXP.
(Though no one seems to know the semantics, softpipe now does what
it did before, which is to NOT project the array coord, llvmpipe
for instance however indeed does project the array coord. Unlike
before it will project the comparison coord for shadow1d array, as
that clearly was an error.)
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=61828.

Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agost/mesa: translate ir offset parameters for non-TXF opcodes.
Roland Scheidegger [Tue, 5 Mar 2013 01:02:13 +0000 (02:02 +0100)]
st/mesa: translate ir offset parameters for non-TXF opcodes.

Otherwise the state tracker will crash if the texture instructions
have offsets.

Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agoconfigure.ac: Remove stale comment about --x-* arguments.
Matt Turner [Mon, 4 Mar 2013 18:29:57 +0000 (10:29 -0800)]
configure.ac: Remove stale comment about --x-* arguments.

Should have been removed with e273ed37.

Note: This is a candidate for the 9.1 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agoconfigure.ac: Don't check for X11 unconditionally.
Matt Turner [Mon, 4 Mar 2013 18:23:54 +0000 (10:23 -0800)]
configure.ac: Don't check for X11 unconditionally.

X11 is already checked conditionally below.

Fixes OSMesa-only configurations to not require X11.

Note: This is a candidate for the 9.1 branch.
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agoAdd missing GL_TEXTURE_CUBE_MAP entry in _mesa_legal_texture_dimensions
Alan Hourihane [Tue, 5 Mar 2013 12:05:26 +0000 (12:05 +0000)]
Add missing GL_TEXTURE_CUBE_MAP entry in _mesa_legal_texture_dimensions

This was hit on the glTexStorage2D() path.

Note: this is a candidate for the stable branches

Signed-off-by: Alan Hourihane <alanh@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agoFix out-of-tree build of 'make check' in src/mesa/main/tests
Jon TURNEY [Fri, 1 Mar 2013 15:21:07 +0000 (15:21 +0000)]
Fix out-of-tree build of 'make check' in src/mesa/main/tests

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
11 years agou_blitter: don't create illegal shaders for 1D/3D/RECT/CUBE MSAA
Dave Airlie [Mon, 4 Mar 2013 07:18:24 +0000 (07:18 +0000)]
u_blitter: don't create illegal shaders for 1D/3D/RECT/CUBE MSAA

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agoFix build of swrast only without libdrm
Daniel Martin [Thu, 28 Feb 2013 18:39:06 +0000 (19:39 +0100)]
Fix build of swrast only without libdrm

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Daniel Martin <consume.noise@gmail.com>
11 years agomesa: flush current state when querying GL_EDGE_FLAG
Brian Paul [Mon, 4 Mar 2013 15:41:45 +0000 (08:41 -0700)]
mesa: flush current state when querying GL_EDGE_FLAG

Fixes http://bugs.freedesktop.org/show_bug.cgi?id=61395

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years agovdpau-softpipe: Build correct source file - vl_winsys_xsp.c
Jakub Bogusz [Mon, 4 Mar 2013 06:51:01 +0000 (22:51 -0800)]
vdpau-softpipe: Build correct source file - vl_winsys_xsp.c

Copy-and-paste problem introduced by commit 7f24483e.

Reviewed-by: Matt Turner <mattst88@gmail.com>
11 years agoi965: Fix Crystal Well PCI IDs.
Kenneth Graunke [Fri, 1 Mar 2013 23:23:53 +0000 (15:23 -0800)]
i965: Fix Crystal Well PCI IDs.

The second digit was off by one, which meant we accidentally treated
GTn as GT(n-1).  This also meant no support for GT1 at all.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agor600g: Check comp_mask before merging export instructions
Vincent Lejeune [Sun, 3 Mar 2013 20:35:38 +0000 (21:35 +0100)]
r600g: Check comp_mask before merging export instructions

Fixes a (rare) bug uncovered by LLVM where consecutive exports were
merged even if they had incompatible masks.

11 years agor600g: fix check_and_set_bank_swizzle for cayman
Vadim Girlin [Tue, 26 Feb 2013 16:50:25 +0000 (20:50 +0400)]
r600g: fix check_and_set_bank_swizzle for cayman

Tested-by: Vincent Lejeune <vljn at ovi.com>
Reviewed-by: Vincent Lejeune <vljn at ovi.com>
11 years agost/mesa: add switch case for ir_txf_ms to silence warning
Brian Paul [Sat, 2 Mar 2013 00:36:34 +0000 (17:36 -0700)]
st/mesa: add switch case for ir_txf_ms to silence warning

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
11 years agomesa: add switch case for ir_txf_ms to silence warning
Brian Paul [Sat, 2 Mar 2013 00:36:24 +0000 (17:36 -0700)]
mesa: add switch case for ir_txf_ms to silence warning

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoi965: Pull query BO reallocation out into a helper function.
Kenneth Graunke [Wed, 27 Feb 2013 21:35:05 +0000 (13:35 -0800)]
i965: Pull query BO reallocation out into a helper function.

We'll want to reuse this for non-occlusion queries in the future.

Plus, it's a single logical task, so having it as a helper function
clarifies the code somewhat.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Replace the global brw->query.bo variable with query->bo.
Kenneth Graunke [Tue, 26 Feb 2013 07:33:24 +0000 (23:33 -0800)]
i965: Replace the global brw->query.bo variable with query->bo.

Again, eliminating a global variable in favor of a per-query object
variable will help in a future where we have more queries in hardware.

Personally, I find this clearer: there's just the query object's BO,
rather than two variables that usually shadow each other.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Turn if (query->bo) into an assertion.
Kenneth Graunke [Tue, 26 Feb 2013 07:31:10 +0000 (23:31 -0800)]
i965: Turn if (query->bo) into an assertion.

The code a few lines above calls brw_emit_query_begin() if !query->bo,
and that creates query->bo.  So it should always be non-NULL.
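
A minimal sketch of the pattern described above, using hypothetical
stand-in types (the sketch_* names are illustrative, not the real i965
structures): once an earlier step guarantees the BO exists, the later
check can be an assertion instead of a branch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real driver types are more involved. */
struct sketch_bo { int used; };
struct sketch_query { struct sketch_bo *bo; };

static struct sketch_bo sketch_bo_storage;

/* Stand-in for the helper that creates query->bo when it is missing. */
static void sketch_emit_query_begin(struct sketch_query *query)
{
   if (!query->bo)
      query->bo = &sketch_bo_storage;
}

static int sketch_query_end(struct sketch_query *query)
{
   if (!query->bo)
      sketch_emit_query_begin(query);

   /* Formerly "if (query->bo) { ... }"; the call above makes this an invariant. */
   assert(query->bo != NULL);
   query->bo->used = 1;
   return query->bo->used;
}
```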

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Unify query object BO reallocation code.
Kenneth Graunke [Tue, 26 Feb 2013 07:17:57 +0000 (23:17 -0800)]
i965: Unify query object BO reallocation code.

If we haven't allocated a BO yet, we need to do that.  Or, if there
isn't enough room to write another pair of values, we need to gather up
the existing results and start a new one.  This is simple enough.

However, the old code was awkwardly split into two blocks, with a
write_depth_count() placed in the middle.  The new depth count isn't
relevant to gathering the old BO's data, so that can go after the
reallocation is done.  With the two blocks adjacent, we can merge them.
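
The merged control flow might look roughly like the sketch below. All
names, the slot count, and the use of plain heap memory in place of a GPU
buffer object are assumptions made for illustration; this is not the
driver's code.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_BO_SLOTS 8   /* hypothetical capacity, in values */

struct sketch_query {
   unsigned long *bo;       /* NULL until first use */
   int last_index;          /* number of begin/end pairs written so far */
   unsigned long result;    /* accumulated from retired BOs */
};

/* Fold the pairs already written into the running result. */
static void sketch_gather_results(struct sketch_query *q)
{
   for (int i = 0; i < q->last_index; i++)
      q->result += q->bo[2 * i + 1] - q->bo[2 * i];
}

/* One block: allocate if missing, or gather and restart if full. */
static void sketch_ensure_bo(struct sketch_query *q)
{
   if (q->bo && (q->last_index + 1) * 2 <= SKETCH_BO_SLOTS)
      return;               /* room for another pair */

   if (q->bo) {
      sketch_gather_results(q);
      free(q->bo);
   }
   q->bo = calloc(SKETCH_BO_SLOTS, sizeof *q->bo);
   assert(q->bo != NULL);
   q->last_index = 0;
}

/* The new start value is written only after reallocation is settled. */
static void sketch_begin(struct sketch_query *q, unsigned long depth_count)
{
   sketch_ensure_bo(q);
   q->bo[2 * q->last_index] = depth_count;
}

static void sketch_end(struct sketch_query *q, unsigned long depth_count)
{
   q->bo[2 * q->last_index + 1] = depth_count;
   q->last_index++;
}
```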

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Use query->last_index instead of the global brw->query.index.
Kenneth Graunke [Tue, 26 Feb 2013 06:30:21 +0000 (22:30 -0800)]
i965: Use query->last_index instead of the global brw->query.index.

Since we already have an index in the brw_query_object, there's no need
to also keep a global variable that shadows it.

Plus, if we ever add support for more types of queries that still need
the per-batch before/after treatment we do for occlusion queries, we
won't be able to use a single global variable.  In contrast, per-query
object variables will work fine.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Remove brw_query_object::first_index field as it's always 0.
Kenneth Graunke [Tue, 26 Feb 2013 02:05:55 +0000 (18:05 -0800)]
i965: Remove brw_query_object::first_index field as it's always 0.

brw->query.index is initialized to 0 just a few lines before it's
copied to first_index.

Presumably the idea here was to reuse the query BO for subsequent
queries of the same type, but since that doesn't happen, there's no need
to have the extra code complexity.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Add a pile of comments to brw_queryobj.c.
Kenneth Graunke [Mon, 25 Feb 2013 23:22:02 +0000 (15:22 -0800)]
i965: Add a pile of comments to brw_queryobj.c.

This code was really difficult to follow, for a number of reasons:

- Queries were handled in four different ways: TIMESTAMP writes a single
  value, TIME_ELAPSED writes a single pair of values, occlusion queries
  write pairs of values for the start and end of each batch, and other
  queries are done entirely in software.  It turns out that there are
  very good reasons each query is handled the way it is, but there were
  insufficient comments explaining the rationale.

- It wasn't immediately obvious which functions were driver hooks
  and which were helper functions.  For example, brw_query_begin() is
  a driver hook that implements glBeginQuery() for all query types, but
  the similarly named brw_emit_query_begin() is a helper function that's
  only relevant for occlusion queries.

Extra explanatory comments should save me and others from constantly
having to ask how this code works and why various query types are
handled differently.

v2: Incorporate Eric's feedback: change "as soon as possible" to "the
    results will be present when mapped."

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Write TIMESTAMP query values into the first buffer element.
Kenneth Graunke [Mon, 25 Feb 2013 21:56:01 +0000 (13:56 -0800)]
i965: Write TIMESTAMP query values into the first buffer element.

For timestamp queries, we just write a single value to a BO.  The
natural place to write that is element 0, so we should do that.

Previously, we wrote it into element 1 (the second slot) leaving
element 0 filled with garbage.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Implement the new QueryCounter() hook.
Kenneth Graunke [Mon, 25 Feb 2013 20:22:29 +0000 (12:22 -0800)]
i965: Implement the new QueryCounter() hook.

This moves the GL_TIMESTAMP handling out of EndQuery.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agomesa: Add a new QueryCounter() hook for TIMESTAMP queries.
Kenneth Graunke [Mon, 25 Feb 2013 19:21:17 +0000 (11:21 -0800)]
mesa: Add a new QueryCounter() hook for TIMESTAMP queries.

In OpenGL, most queries record statistics about operations performed
between a defined beginning and ending point.  However, TIMESTAMP
queries are different: they immediately return a single value, and there
is no start/stop mechanism.

Previously, Mesa implemented TIMESTAMP queries by calling EndQuery
without first calling BeginQuery.  Apparently this is DirectX
convention, and Gallium followed suit.  I personally find the asymmetry
jarring, however---having BeginQuery and EndQuery handle a different set
of enum values looks like a bug.  It's also a bit confusing to mix the
one-shot query with the start/stop model.

So, add a new QueryCounter driver hook for implementing TIMESTAMP.  For
now, fall back to EndQuery to support drivers that don't do the new
mechanism.
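
As an illustration of the fallback described above, here is a sketch with
a simplified, hypothetical function table (the real driver function table
and query object carry far more state, and the sketch_* names are made
up): the core prefers the dedicated hook and falls back to EndQuery when a
driver does not provide it.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_query { int target; unsigned long result; };

/* Hypothetical, simplified driver function table. */
struct sketch_driver_funcs {
   void (*BeginQuery)(struct sketch_query *q);
   void (*EndQuery)(struct sketch_query *q);
   void (*QueryCounter)(struct sketch_query *q);   /* may be NULL */
};

/* Core-side dispatch for a one-shot TIMESTAMP query. */
static void sketch_query_counter(const struct sketch_driver_funcs *drv,
                                 struct sketch_query *q)
{
   if (drv->QueryCounter) {
      drv->QueryCounter(q);   /* preferred: dedicated one-shot hook */
   } else {
      assert(drv->EndQuery != NULL);
      drv->EndQuery(q);       /* legacy: EndQuery without BeginQuery */
   }
}

/* Two toy driver hooks that only record which path was taken. */
static void sketch_new_hook(struct sketch_query *q) { q->result = 1; }
static void sketch_legacy_end(struct sketch_query *q) { q->result = 2; }
```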

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agotgsi: add texel offsets and derivatives to sampler interface
Roland Scheidegger [Fri, 1 Mar 2013 22:27:41 +0000 (23:27 +0100)]
tgsi: add texel offsets and derivatives to sampler interface

Something I never got around to implementing, but this is the tgsi execution
side for implementing texel offsets (for ordinary texturing) and explicit
derivatives for sampling (though I guess the ordering of the components
for the derivs parameters is debatable).
There is certainly a runtime cost associated with this.
Unless there are different interfaces used depending on the "complexity"
of the texture instructions, this is impossible to avoid.
Offsets are always active (I think checking if they are active or not is
probably not worth it since it should mostly be an add), whereas the
sampler_control is extended for explicit derivatives.
For now softpipe (the only user of this) just drops all those new values
on the floor (which is the part I never implemented...).

Additionally this also fixes (discovered by accident) inconsistent
projective divide for the comparison coord - the code did do the
projection for shadow2d targets, but not shadow1d ones. This also
drops checking for projection modifier on array targets, since they
aren't possible in any extension I know of (hence we don't actually
know if the array layer should also be divided or not).

Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agodraw: additional fix for the no-position case with llvm
Roland Scheidegger [Sat, 2 Mar 2013 01:29:22 +0000 (02:29 +0100)]
draw: additional fix for the no-position case with llvm

Similar fix to what is done for the non-llvm case, we could otherwise still
hit the stages (near certainly with gs) which crash. It is probably a much
better idea to skip trying to draw at that point anyway.

Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agodraw: fix no position output in non-llvm pipeline.
Roland Scheidegger [Sat, 2 Mar 2013 01:49:28 +0000 (02:49 +0100)]
draw: fix no position output in non-llvm pipeline.

It seems easiest (and best) if we simply skip all the later stages
(after stream output).
(This is different to the llvm case at least for now where we will
simply try to render garbage, though both behaviors should be correct.)
Fixes piglit glsl-1.40-tf-no-position with softpipe.

Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agodraw/llvm: skip clipping and viewport transform if there's no position output
Roland Scheidegger [Fri, 1 Mar 2013 13:50:40 +0000 (14:50 +0100)]
draw/llvm: skip clipping and viewport transform if there's no position output

With glsl 1.40 writing position is not required (useful for transform
feedback, though in fact it's still possible to rasterize such geometry
even if the results aren't too well defined).
Prevents crashes in that case. Fixes piglit glsl-1.40-tf-no-position.
Not quite sure this is 100% correct as it also skips clipdistance
clipping which could still work (but not sure if the result would
really be needed?)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agollvmpipe: don't assert on illegal surface creation.
Roland Scheidegger [Fri, 1 Mar 2013 14:32:03 +0000 (15:32 +0100)]
llvmpipe: don't assert on illegal surface creation.

Since c8eb2d0e829d0d2aea6a982620da0d3cfb5982e2 llvmpipe checks if it's
actually legal to create a surface. The opengl state tracker doesn't quite
obey this so for now just warn instead of assert.
Also warn instead of disabled assert when creating sampler views
(same reasoning).

Addresses https://bugs.freedesktop.org/show_bug.cgi?id=61647.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years agollvmpipe: bump glsl version to 140
Roland Scheidegger [Fri, 1 Mar 2013 13:50:32 +0000 (14:50 +0100)]
llvmpipe: bump glsl version to 140

texel offsets should have been the last missing feature for 130, and in
fact 140 as well (last there were texture buffers). In any case we still
don't do OpenGL 3.0 (missing MSAA which will be difficult,
plus EXT_packed_float, ARB_depth_buffer_float and EXT_framebuffer_sRGB).

v2: bump to 140 instead - we have everything except we crash when not writing
to gl_Position (but softpipe crashes as well) so let's just say this is a bug
instead. Also (by Dave Airlie's suggestion) update llvm-todo.txt.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years agogallivm: add support for texel offsets for ordinary texturing.
Roland Scheidegger [Fri, 1 Mar 2013 01:25:13 +0000 (02:25 +0100)]
gallivm: add support for texel offsets for ordinary texturing.

This was previously only handled for texelFetch (much easier).
Depending on the wrap mode this works slightly differently (for somewhat
efficient implementation), hence have to do that separately in all roughly
137 places - it is easy if we use fixed point coords for wrapping, however
some wrapping modes are near impossible with fixed point (the repeat stuff)
hence we have to normalize the offsets if we can't do the wrapping in
unnormalized space (which is a division which is slow but should still be
much better than the alternative, which would be integer modulo for wrapping
which is just unusable). This should still give accurate results in all
cases that really matter, though it might be not quite conformant behavior
for some apis (but we have much worse problems there anyway even without
using offsets).
(Untested, no piglit test.)
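
A scalar sketch of the two cases, under the assumption that "normalizing
the offsets" means dividing the integer texel offset by the texture size
before adding it to a normalized coordinate. The function names are
hypothetical and the real code generates vectorized LLVM IR instead.

```c
#include <assert.h>

/* Normalized path: the offset costs a division by the texture size. */
static float sketch_apply_offset_normalized(float coord, int offset, int size)
{
   assert(size > 0);
   return coord + (float)offset / (float)size;
}

/* Unnormalized (texel-space) path: the offset is a plain add. */
static float sketch_apply_offset_unnormalized(float coord, int offset, int size)
{
   return coord * (float)size + (float)offset;
}
```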

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years agosvga: always link with C++
Brian Paul [Fri, 1 Mar 2013 23:53:22 +0000 (16:53 -0700)]
svga: always link with C++

Even when we don't have LLVM since there's other C++ code
in the resulting DRI driver object.

Note: This is a candidate for the stable branches.

Reviewed-by: Matt Turner <mattst88@gmail.com>
11 years agost/mesa: convert ir_triop_lrp to TGSI_OPCODE_LRP
Brian Paul [Fri, 1 Mar 2013 22:16:15 +0000 (15:16 -0700)]
st/mesa: convert ir_triop_lrp to TGSI_OPCODE_LRP

AFAICT, all gallium drivers implement TGSI_OPCODE_LRP.
Tested with softpipe, llvmpipe, svga drivers.
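
For reference, a scalar sketch of the arithmetic involved, assuming the
documented TGSI LRP definition (weight in the first operand) and the GLSL
mix(x, y, a) convention. The helper names are illustrative only.

```c
#include <assert.h>

/* TGSI-style LRP: dst = src0 * src1 + (1 - src0) * src2 */
static float sketch_tgsi_lrp(float src0, float src1, float src2)
{
   return src0 * src1 + (1.0f - src0) * src2;
}

/* GLSL-style mix(x, y, a) = x * (1 - a) + y * a, expressed via LRP
 * with the operands reordered so the weight comes first. */
static float sketch_mix(float x, float y, float a)
{
   return sketch_tgsi_lrp(a, y, x);
}
```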

Reviewed-by: Matt Turner <mattst88@gmail.com>
11 years agodocs: Mark some things done in GL3.txt
Chris Forbes [Fri, 1 Mar 2013 23:01:24 +0000 (12:01 +1300)]
docs: Mark some things done in GL3.txt

11 years agowinsys/radeon: Only add bo to hash table when creating flink
Martin Andersson [Fri, 1 Mar 2013 21:34:28 +0000 (22:34 +0100)]
winsys/radeon: Only add bo to hash table when creating flink

The problem is that we mix bo handles and flinked names in the hash
table. Because kms type handles are not flinked they should not be
added to the hash table. If we do that we will sooner or later
get a situation where we will overwrite a correct entry because
the bo handle was the same as a flinked name.
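
The rule can be sketched with a toy table keyed by flink name only. The
types, the direct-mapped table, and the sketch_* names are all assumptions
for illustration and do not reflect the winsys data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TABLE_SIZE 16

struct sketch_bo { uint32_t handle; uint32_t flink_name; };

/* Toy direct-mapped table keyed by flink name only. */
static struct sketch_bo *sketch_names[SKETCH_TABLE_SIZE];

static void sketch_table_set(uint32_t name, struct sketch_bo *bo)
{
   sketch_names[name % SKETCH_TABLE_SIZE] = bo;
}

static struct sketch_bo *sketch_table_get(uint32_t name)
{
   struct sketch_bo *bo = sketch_names[name % SKETCH_TABLE_SIZE];
   return (bo && bo->flink_name == name) ? bo : NULL;
}

/* Imported through a KMS handle: not flinked, so never entered in the table. */
static void sketch_import_from_handle(struct sketch_bo *bo, uint32_t handle)
{
   bo->handle = handle;
   bo->flink_name = 0;
}

/* Flinking is the only point where a BO enters the table. */
static void sketch_flink(struct sketch_bo *bo, uint32_t name)
{
   bo->flink_name = name;
   sketch_table_set(name, bo);
}
```

With this split, a BO whose handle happens to equal another BO's flink
name cannot overwrite that entry, because handles are never used as keys.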

Note: this is a candidate for the stable branches.

Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 years agoi965: enable ARB_texture_multisample on Gen6+
Chris Forbes [Sat, 29 Dec 2012 08:28:57 +0000 (21:28 +1300)]
i965: enable ARB_texture_multisample on Gen6+

V2: Works on Ivy Bridge now too, so this can be 6+.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoi965/fs: add support for ir_txf_ms on Gen6+
Chris Forbes [Sat, 29 Dec 2012 07:12:26 +0000 (20:12 +1300)]
i965/fs: add support for ir_txf_ms on Gen6+

On Gen6, lower this to `ld` with lod=0 and an extra sample_index
parameter.

On Gen7, use `ld2dms`. We don't support CMS yet for multisample
textures, so we just hardcode MCS=0. This is ignored for IMS and UMS
surfaces.

Note: If we do end up emitting specialized shaders based on the MSAA
layout, we can emit a slightly shorter message here in the UMS case.

Note: According to the PRM, `ld2dms` takes one more parameter, lod.
However, it's always zero, and including it would make the message too
long for SIMD16, so we just omit it.

V2: Reworked completely, added support for Gen7.
V3: - Introduce sample_index parameter rather than reusing lod
    - Removed spurious whitespace change
    - Clarify commit message
V4: - Fix comment style
    - Emit SHADER_OPCODE_TXF_MS on Gen6. This was benignly wrong since
      it lowers to `ld` anyway on this gen, but still wrong.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoi965/vs: add support for ir_txf_ms on Gen6+
Chris Forbes [Sat, 29 Dec 2012 07:12:26 +0000 (20:12 +1300)]
i965/vs: add support for ir_txf_ms on Gen6+

On Gen6, lower this to `ld` with lod=0 and an extra sample_index
parameter.

On Gen7, use `ld2dms`. This takes an additional MCS parameter to support
compressed multisample surfaces, but we're not enabling them for
multisample textures for now, so it's always ignored and can be safely
omitted.

V2: Reworked completely, added support for Gen7.
V3: - Use new sample_index, sample_index_type rather than reusing lod
    - Clarify commit message.
V4: - Fix comment style

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoi965: add a new virtual opcode: SHADER_OPCODE_TXF_MS
Chris Forbes [Thu, 24 Jan 2013 08:35:15 +0000 (21:35 +1300)]
i965: add a new virtual opcode: SHADER_OPCODE_TXF_MS

This is very similar to the TXF opcode, but lowers to `ld2dms` rather
than `ld` on Gen7.

V4: - add SHADER_OPCODE_TXF_MS to is_tex() functions, so regalloc thinks
      it actually writes the correct number of registers. Otherwise in
      nontrivial shaders some of the registers tend to get clobbered,
      producing bad results.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years agoi965: take the target into account for Gen7 MSAA modes
Chris Forbes [Thu, 24 Jan 2013 07:05:09 +0000 (20:05 +1300)]
i965: take the target into account for Gen7 MSAA modes

Gen7 has an erratum affecting the ld_mcs message, making it unsafe to
use when the surface doesn't have an associated MCS.

From the Ivy Bridge PRM, Vol4 Part1 p77 ("MCS Enable"):

   "If this field is disabled and the sampling engine <ld_mcs>
   message is issued on this surface, the MCS surface may be
   accessed. Software must ensure that the surface is defined
   to avoid GTT errors."

To allow the shader to treat all surfaces uniformly, force UMS if the
surface is to be used as a multisample texture, even if CMS would have
been possible.
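
A sketch of the resulting decision, with hypothetical enum and parameter
names (the driver's real layout selection considers more inputs, such as
format and hardware generation):

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_msaa_layout {
   SKETCH_MSAA_NONE,
   SKETCH_MSAA_UMS,   /* uncompressed multisample */
   SKETCH_MSAA_CMS    /* compressed multisample, has an MCS */
};

static enum sketch_msaa_layout
sketch_choose_layout(int num_samples, bool cms_possible, bool used_as_texture)
{
   if (num_samples <= 1)
      return SKETCH_MSAA_NONE;

   /* Sampled surfaces are forced to UMS so shaders can treat them uniformly. */
   if (used_as_texture)
      return SKETCH_MSAA_UMS;

   return cms_possible ? SKETCH_MSAA_CMS : SKETCH_MSAA_UMS;
}
```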

V3: - Quoted erratum text

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoi965: Support multisampling in surface_state for textures
Chris Forbes [Sat, 22 Dec 2012 10:27:24 +0000 (23:27 +1300)]
i965: Support multisampling in surface_state for textures

The surface_state setup for renderbuffers already worked; only the
texturing side needed work. BLORP does something similar, but does its
own surface_state setup.

On Gen6, we just need to set the correct sample count.

On Gen7: - set the correct sample count
         - set the correct layout mode
         - set GEN7_SURFACE_ARYSPC_LOD0 if it's set in the miptree.

V2: - Clarify commit message
    - Rebased onto Paul's physical/logical dims cleanup
    - Added Gen7 support

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years agoi965: add support for multisample textures
Chris Forbes [Sun, 16 Dec 2012 06:50:26 +0000 (19:50 +1300)]
i965: add support for multisample textures

V2: - Fix for state moving from texobj to image
    - Rebased onto Paul's logical/physical cleanup
    - Fixed missing quantization of sample count
    - Fold in IMS renderbuffer wrapper fixes from later in the series
    - Use correct physical slice offset for UMS/CMS surfaces on Gen7

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years agomesa: implement TexImage*Multisample
Chris Forbes [Sat, 24 Nov 2012 08:47:46 +0000 (21:47 +1300)]
mesa: implement TexImage*Multisample

V2: - fix formatting issues
    - generate GL_OUT_OF_MEMORY if teximage cannot be allocated
    - fix for state moving from texobj to image

V3: - remove ridiculous stencil hack
    - alter format check to not allow a base format of STENCIL_INDEX
    - allow width/height/depth to be zero, to deallocate the texture
    - don't forget to call _mesa_update_fbo_texture

V4: - fix indentation
    - don't throw errors on proxy texture targets

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
11 years agomesa: support multisample textures in framebuffer completeness check
Chris Forbes [Sun, 16 Dec 2012 07:58:00 +0000 (20:58 +1300)]
mesa: support multisample textures in framebuffer completeness check

- sample count must be the same on all attachments
- fixedsamplepositions must be the same on all attachments
(renderbuffers have fixedsamplepositions=true implicitly; only
multisample textures can choose to have it false)
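
The two rules above can be sketched as a standalone check. The struct and
function names are hypothetical; the real completeness test works on
Mesa's attachment and texture image objects.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_attachment {
   bool is_texture;
   int samples;
   bool fixed_sample_locations;   /* only meaningful for textures */
};

/* Renderbuffers implicitly have fixed sample locations. */
static bool sketch_effective_fixed(const struct sketch_attachment *att)
{
   return att->is_texture ? att->fixed_sample_locations : true;
}

/* True when every attachment agrees with the first one. */
static bool sketch_samples_consistent(const struct sketch_attachment *att,
                                      int count)
{
   for (int i = 1; i < count; i++) {
      if (att[i].samples != att[0].samples)
         return false;
      if (sketch_effective_fixed(&att[i]) != sketch_effective_fixed(&att[0]))
         return false;
   }
   return true;
}
```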

V2: - fix wrapping to 80 columns, debug message, fix for state moving
      from texobj to image.
    - stencil texturing tweaks tidied up and folded in here.

V3: - Removed silly stencil hacks entirely; the extension doesn't
      actually make stencil-only textures legal at all.
    - Moved sample count / fixed sample locations checks into
      existing attachment-type-specific blocks, as suggested by Eric

V4: - Removed stencil hacks which were missed in V3 (thanks Eric)
    - Don't move the declaration of texImg; only required pre-V3.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[V2] Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoi965: expose sample positions
Chris Forbes [Wed, 5 Dec 2012 03:27:42 +0000 (16:27 +1300)]
i965: expose sample positions

Moves the definition of the sample positions out of
gen6_emit_3dstate_multisample, and unpacks them in
gen6_get_sample_position.

V2: Be consistent about `sample position` rather than `location`.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoi965: add support for sample mask on Gen6+
Chris Forbes [Thu, 29 Nov 2012 09:24:43 +0000 (22:24 +1300)]
i965: add support for sample mask on Gen6+

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agomesa: implement sample mask
Chris Forbes [Fri, 30 Nov 2012 08:22:14 +0000 (21:22 +1300)]
mesa: implement sample mask

V2: - fix multiline comment style
    - stop using ASSERT_OUTSIDE_BEGIN_END_AND_FLUSH since that
      doesn't exist anymore.

V3: - check for the extension being enabled
    - tidier flagging of _NEW_MULTISAMPLE
    - fix weird indentation in get.c

V4: - move flush later in SampleMaski()

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agomesa: implement GetMultisamplefv
Chris Forbes [Sun, 25 Nov 2012 07:23:32 +0000 (20:23 +1300)]
mesa: implement GetMultisamplefv

Actual sample locations deferred to a driverfunc since only the driver
really knows where they will be.

V2: - pass the draw buffer to the driverfunc; don't fallback to pixel
      center if driverfunc is missing.
    - rename GetSampleLocation to GetSamplePosition
    - invert y sample position for winsys FBOs, at Paul's suggestion
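
A sketch of the y inversion mentioned in V2, assuming sample positions
are expressed in the [0, 1] range within a pixel. The function and
parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Copy the driver-reported position, flipping y for window-system FBOs. */
static void sketch_get_sample_position(bool winsys_fbo,
                                       const float driver_pos[2],
                                       float out[2])
{
   assert(driver_pos != 0 && out != 0);
   out[0] = driver_pos[0];
   out[1] = winsys_fbo ? 1.0f - driver_pos[1] : driver_pos[1];
}
```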

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoi965: expose new max sample counts
Chris Forbes [Fri, 30 Nov 2012 08:20:40 +0000 (21:20 +1300)]
i965: expose new max sample counts

V2: For now, only expose a depth sample count of 1, since there are
possible unresolved interactions with HiZ.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agomesa: add new max sample count state
Chris Forbes [Fri, 30 Nov 2012 08:19:08 +0000 (21:19 +1300)]
mesa: add new max sample count state

- GL_MAX_COLOR_TEXTURE_SAMPLES
- GL_MAX_DEPTH_TEXTURE_SAMPLES
- GL_MAX_INTEGER_SAMPLES

V2: initialize limits to 1 in _mesa_init_constants as suggested by Brian
and Paul

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoglsl: add support for ARB_texture_multisample
Chris Forbes [Fri, 21 Dec 2012 08:33:37 +0000 (21:33 +1300)]
glsl: add support for ARB_texture_multisample

V2: - emit `sample` parameter properly for multisample texelFetch()
    - fix spurious whitespace change
    - introduce a new opcode ir_txf_ms rather than overloading the
      existing ir_txf further. This makes doing the right thing in
      the driver somewhat simpler.

V3: - fix weird whitespace

V4: - don't forget to include the new opcode in tex_opcode_strs[]
      (thanks Kenneth for spotting this)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[V2] Reviewed-by: Eric Anholt <eric@anholt.net>
[V2] Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agotests: add ARB_texture_multisample enums to table
Chris Forbes [Sun, 25 Nov 2012 01:42:55 +0000 (14:42 +1300)]
tests: add ARB_texture_multisample enums to table

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agomesa: add texobj support for ARB_texture_multisample
Chris Forbes [Sat, 24 Nov 2012 08:46:56 +0000 (21:46 +1300)]
mesa: add texobj support for ARB_texture_multisample

Adds the new texture targets, and per-image state for GL_TEXTURE_SAMPLES
and GL_TEXTURE_FIXED_SAMPLE_LOCATIONS.

V2: - Allow multisample texture targets in glInvalidateTexSubImage too.
      This was already partly there, but I missed it the first time around
      since the interaction is defined in a newer extension. Fixed weird
      indentation.
    - Allow multisample array textures in glFramebufferTextureLayer.
      This was overlooked as the tests originally only used 2d
      multisample textures.

V3: - Set min/mag filters sensibly for multisample textures. This
      can't actually be changed by the user, so it's more sensible to
      initialize it correctly than to hack around it being bogus later.

V4: - Tidy up initial min/mag filter setup. Setup in
      _mesa_initialize_texture_object was bogus, but benign since
      finish_texture_init() clobbered everything with correct values. For V4,
      just do the setup in finish_texture_init().

V5: - Don't break glPopAttrib(GL_TEXTURE_BIT)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[V2] Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoglapi: add ARB_texture_multisample
Chris Forbes [Sat, 24 Nov 2012 00:08:45 +0000 (13:08 +1300)]
glapi: add ARB_texture_multisample

Adds new enums, dispatch machinery, and stubs for the 4 new entrypoints.

V2: - Drop placeholder
    - Align enum values
    - Remove explicit exec=mesa; it *is* the dispatch flavor we want,
      but it's also the default. I misunderstood how this worked before;
      after actually reading the generator it makes good sense.

V3: - Squash in stubs for new entrypoints, and dispatch_sanity tweaks,
      so we don't get build breakage between those patches.

V4: - Fix various remaining whitespace issues

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
[1/3 V2] Reviewed-by: Matt Turner <mattst88@gmail.com>
[V3] Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agointel: Use the new "ctx" local variable I just added some more.
Eric Anholt [Fri, 22 Feb 2013 19:46:19 +0000 (11:46 -0800)]
intel: Use the new "ctx" local variable I just added some more.

Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoi965: Make sRGB-capable framebuffers by default.
Eric Anholt [Fri, 15 Feb 2013 15:41:42 +0000 (07:41 -0800)]
i965: Make sRGB-capable framebuffers by default.

The GLX extension lets you expose visuals that explicitly guarantee you
that the GL_FRAMEBUFFER_SRGB_CAPABLE flag will be set, but we can set
the flag even while the visual doesn't provide the guarantee.  This
appears to be consistent with other implementations, as we've seen
several apps now that don't require an srgb visual and assume sRGB will
work without checking the GL_FRAMEBUFFER_SRGB_CAPABLE flag.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55783
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60633
Reviewed-and-tested-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agointel: Fix software copying of miptree faces for weird formats.
Eric Anholt [Wed, 31 Oct 2012 21:42:39 +0000 (14:42 -0700)]
intel: Fix software copying of miptree faces for weird formats.

Now that we have W-tiled S8, we can't just region_map and poke at bits --
there has to be some swizzling.  Rely on intel_miptree_map to get that job
done.  This should also get the highest performance path we know of for the
mapping (interesting if I get around to finishing movntdqa some day).

v2: Fix stale name of the bit in a comment.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years agointel: Add a flag for miptree mapping to disable transcoding.
Eric Anholt [Tue, 26 Feb 2013 18:50:05 +0000 (10:50 -0800)]
intel: Add a flag for miptree mapping to disable transcoding.

I want to reuse intel_miptree_map() to replace some region mapping that's
broken for separate stencil, but doing so would result in new demands on
ETC transcode that we actually don't want to happen.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years agoi965: Add WARN_ONCE for depthstencil workarounds we shouldn't be hitting.
Eric Anholt [Tue, 26 Feb 2013 19:35:40 +0000 (11:35 -0800)]
i965: Add WARN_ONCE for depthstencil workarounds we shouldn't be hitting.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years agor600g: enable CP DMA on 6xx
Alex Deucher [Fri, 1 Mar 2013 17:11:31 +0000 (12:11 -0500)]
r600g: enable CP DMA on 6xx

Tested across several 6xx parts, no piglit regressions.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 years agor600g: don't require dword alignment with CP DMA for buffer transfers
Marek Olšák [Thu, 21 Feb 2013 16:06:26 +0000 (17:06 +0100)]
r600g: don't require dword alignment with CP DMA for buffer transfers

The alignment requirement is a leftover from the days when we used streamout
to copy buffers.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
11 years agor600g: always map uninitialized buffer range as unsynchronized
Marek Olšák [Wed, 27 Feb 2013 22:50:15 +0000 (23:50 +0100)]
r600g: always map uninitialized buffer range as unsynchronized

Any driver can implement this simple and efficient optimization.
Team Fortress 2 always hits it. The DISCARD_RANGE codepath is not even used
with TF2 anymore, so we avoid a ton of useless buffer copies.
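
A minimal sketch of the idea, not the r600g code: track the byte range of a
buffer that has ever been written, and let a map request skip synchronization
when it does not touch that range. All names here (int_range, range_add,
can_map_unsynchronized) are hypothetical, and the half-open [start, end)
convention is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical 1D integer range, half-open: [start, end). */
struct int_range {
    unsigned start;
    unsigned end;
};

/* Mark the range as empty, i.e. nothing has been initialized yet. */
static void range_set_empty(struct int_range *r)
{
    r->start = UINT_MAX;
    r->end = 0;
}

/* Grow the range so that it also covers [start, end). */
static void range_add(struct int_range *r, unsigned start, unsigned end)
{
    assert(start <= end);
    if (start < r->start)
        r->start = start;
    if (end > r->end)
        r->end = end;
}

/* True if [start, end) overlaps the tracked range. */
static bool range_intersects(const struct int_range *r,
                             unsigned start, unsigned end)
{
    unsigned lo = start > r->start ? start : r->start;
    unsigned hi = end < r->end ? end : r->end;
    return lo < hi;
}

/* A map of bytes that were never written cannot conflict with pending GPU
 * work on initialized data, so it needs no synchronization. */
static bool can_map_unsynchronized(const struct int_range *valid,
                                   unsigned offset, unsigned size)
{
    return !range_intersects(valid, offset, offset + size);
}
```

In this sketch the caller would call range_add() after every write, so the
tracked range only ever grows and may over-approximate the initialized bytes.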

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
NOTE: This is a candidate for the 9.1 branch.

11 years agogallium/util: add helper code for 1D integer range
Marek Olšák [Wed, 27 Feb 2013 22:34:29 +0000 (23:34 +0100)]
gallium/util: add helper code for 1D integer range

Reviewed-by: Brian Paul <brianp@vmware.com>
v2: cosmetic changes based on Brian's review

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
NOTE: This is a candidate for the 9.1 branch. (the next patch depends on it)

11 years agor600g: cleanup deprecated register tables
Marek Olšák [Wed, 27 Feb 2013 11:43:19 +0000 (12:43 +0100)]
r600g: cleanup deprecated register tables

These registers are either already emitted elsewhere or moved to start_cs.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
11 years agor600g: unify vgt states
Marek Olšák [Wed, 27 Feb 2013 10:00:14 +0000 (11:00 +0100)]
r600g: unify vgt states

The states were split because we thought it caused a hardlock. Now we know
the hardlock was caused by something else and has since been fixed.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
11 years agor600g: flush and invalidate htile cache when appropriate
Marek Olšák [Tue, 26 Feb 2013 21:31:03 +0000 (22:31 +0100)]
r600g: flush and invalidate htile cache when appropriate

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
NOTE: This is a candidate for the 9.1 branch.

11 years agor600g: atomize streamout enabling
Marek Olšák [Tue, 26 Feb 2013 16:20:25 +0000 (17:20 +0100)]
r600g: atomize streamout enabling

This doesn't fix any issue we know of, but there indeed is a weak spot
in draw_vbo where streamout can fail. After streamout is enabled,
the need_cs_space call can flush the context, which causes the streamout
to be disabled right after it was enabled and bad things happen.

One way to fix it is to atomize the beginning part, so that no context flush
can happen between streamout enabling and the first drawing.

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
11 years agor600g: use async DMA with a non-zero src offset
Marek Olšák [Thu, 21 Feb 2013 15:54:46 +0000 (16:54 +0100)]
r600g: use async DMA with a non-zero src offset

probably a typo

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
NOTE: This is a candidate for the 9.1 branch.

11 years agor600g: pad the DMA CS to a multiple of 8 dwords
Marek Olšák [Wed, 27 Feb 2013 20:24:02 +0000 (21:24 +0100)]
r600g: pad the DMA CS to a multiple of 8 dwords

Tested-by: Andreas Boll <andreas.boll.dev@gmail.com>
NOTE: This is a candidate for the 9.1 branch.
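
The padding amount is plain modular arithmetic. A sketch, assuming the command
stream length is counted in dwords; dma_cs_pad_dwords and dma_cs_padded_size
are hypothetical helpers, not functions from the driver.

```c
#include <assert.h>

/* Number of filler dwords needed to round cdw up to a multiple of 8. */
static unsigned dma_cs_pad_dwords(unsigned cdw)
{
    return (8u - (cdw % 8u)) % 8u;
}

/* Length of the command stream after padding. */
static unsigned dma_cs_padded_size(unsigned cdw)
{
    unsigned padded = cdw + dma_cs_pad_dwords(cdw);

    assert(padded % 8u == 0);
    return padded;
}
```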

11 years agointel: Enable __DRI_API_OPENGL_CORE api with dri2 contexts
Jordan Justen [Wed, 20 Feb 2013 08:14:13 +0000 (00:14 -0800)]
intel: Enable __DRI_API_OPENGL_CORE api with dri2 contexts

Without this set, dri_util.c:dri2CreateContextAttribs
will reject requests to create a context with
__DRI_API_OPENGL_CORE.

This prevents a 3.2 core profile context from being created
even when MESA_GL_VERSION_OVERRIDE=3.2 is used.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agointel: update max versions based on MESA_GL_VERSION_OVERRIDE
Jordan Justen [Fri, 22 Feb 2013 00:59:33 +0000 (16:59 -0800)]
intel: update max versions based on MESA_GL_VERSION_OVERRIDE

If the override version is >= 3.1, then update the
max_gl_core_version. Otherwise, update max_gl_compat_version.
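
The rule above can be sketched as follows. The struct and function names are
hypothetical stand-ins, and encoding a version as major * 10 + minor is an
assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical stand-in for the screen's advertised version limits. */
struct version_limits {
    unsigned max_gl_core_version;
    unsigned max_gl_compat_version;
};

/* Versions are encoded as major * 10 + minor, so 3.1 is 31. */
static void apply_version_override(struct version_limits *limits,
                                   unsigned version)
{
    assert(version > 0);
    if (version >= 31)
        limits->max_gl_core_version = version;
    else
        limits->max_gl_compat_version = version;
}
```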

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agomesa version: add _mesa_get_gl_version_override
Jordan Justen [Thu, 21 Feb 2013 18:01:40 +0000 (10:01 -0800)]
mesa version: add _mesa_get_gl_version_override

This will allow other code to get access to the override
version before a context is available.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoglsl: allow GLSL compiler version to be overridden to 1.50
Jordan Justen [Tue, 19 Feb 2013 17:23:51 +0000 (09:23 -0800)]
glsl: allow GLSL compiler version to be overridden to 1.50

Although GLSL 1.50 compiler support is not available,
this change will allow MESA_GLSL_VERSION_OVERRIDE=150 to be
used while 1.50 support is being developed.

Since no drivers claim 1.50 GLSL support, this change should
only impact Mesa when MESA_GLSL_VERSION_OVERRIDE=150 is set.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/fs: Put immediate operand as src2
Matt Turner [Fri, 1 Mar 2013 00:26:57 +0000 (16:26 -0800)]
i965/fs: Put immediate operand as src2

Immediate operands can only be src2 in 2-source instructions. Fixes
piglit failures since 0a1d145e (oops!).

Spotted-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
11 years agointel: Remove intel_mipmap_tree::wraps_etc
Chad Versace [Fri, 15 Feb 2013 20:58:03 +0000 (12:58 -0800)]
intel: Remove intel_mipmap_tree::wraps_etc

The field was equivalent to (etc_format != MESA_FORMAT_NONE), and
therefore duplicate information.

This patch removes the field and replaces all references to it with
`etc_format != MESA_FORMAT_NONE`.

No Piglit ETC test regresses on Intel Sandybridge.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
11 years agoir_to_mesa: Translate ir_triop_lrp to OPCODE_LRP.
Matt Turner [Tue, 19 Feb 2013 22:15:16 +0000 (14:15 -0800)]
ir_to_mesa: Translate ir_triop_lrp to OPCODE_LRP.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/vs: Assert that ir_triop_lrp was lowered.
Matt Turner [Tue, 19 Feb 2013 23:57:28 +0000 (15:57 -0800)]
i965/vs: Assert that ir_triop_lrp was lowered.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/fp: Use the LRP instruction for OPCODE_LRP.
Matt Turner [Tue, 19 Feb 2013 20:51:08 +0000 (12:51 -0800)]
i965/fp: Use the LRP instruction for OPCODE_LRP.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/fs: Use the LRP instruction for ir_triop_lrp when possible.
Kenneth Graunke [Sun, 2 Dec 2012 08:08:15 +0000 (00:08 -0800)]
i965/fs: Use the LRP instruction for ir_triop_lrp when possible.

v2 [mattst88]:
   - Add BRW_OPCODE_LRP to list of CSE-able expressions.
   - Fix op_var[] array size.
   - Rename arguments to emit_lrp to (x, y, a) to clear confusion.
   - Add LRP function to brw_fs.cpp/.h.
   - Corrected comment about LRP instruction arguments in emit_lrp.
v3 [mattst88]:
   - Duplicate MAD code for LRP instead of using a function pointer.
   - Check for != GRF instead of == IMM in emit_lrp.
   - Lower LRP on gen < 6.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Add support for emitting the LRP instruction.
Kenneth Graunke [Sun, 2 Dec 2012 05:49:43 +0000 (21:49 -0800)]
i965: Add support for emitting the LRP instruction.

Like MAD, this is another three-source instruction.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoglsl: Optimize ir_triop_lrp(x, y, a) with a = 0.0f or 1.0f
Matt Turner [Sat, 16 Feb 2013 01:51:46 +0000 (17:51 -0800)]
glsl: Optimize ir_triop_lrp(x, y, a) with a = 0.0f or 1.0f

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoglsl: Convert mix() to use a new ir_triop_lrp opcode.
Kenneth Graunke [Sun, 2 Dec 2012 07:49:26 +0000 (23:49 -0800)]
glsl: Convert mix() to use a new ir_triop_lrp opcode.

Many GPUs have an instruction to do linear interpolation which is more
efficient than simply performing the algebra necessary (two multiplies,
an add, and a subtract).

Pattern matching or peepholing this is more desirable, but can be
tricky.  By using an opcode, we can at least make shaders which use the
mix() built-in get the more efficient behavior.

Currently, all consumers lower ir_triop_lrp.  Subsequent patches will
actually generate different code.
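
For reference, the arithmetic that a lowered ir_triop_lrp amounts to can be
sketched in C. This uses the GLSL definition of mix(), x * (1 - a) + y * a;
lrp_arith and lrp_self_check are illustrative names, not Mesa functions.

```c
#include <assert.h>

/* Linear interpolation done "by hand": one subtract, two multiplies and
 * one add, which a native LRP instruction would replace. */
static float lrp_arith(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}

/* Sanity check of the two endpoints and the midpoint. */
static void lrp_self_check(void)
{
    assert(lrp_arith(2.0f, 4.0f, 0.0f) == 2.0f);
    assert(lrp_arith(2.0f, 4.0f, 1.0f) == 4.0f);
    assert(lrp_arith(2.0f, 4.0f, 0.5f) == 3.0f);
}
```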

v2 [mattst88]:
   - Add LRP_TO_ARITH flag to ir_to_mesa.cpp. Will be removed in a
     subsequent patch and ir_triop_lrp translated directly.
v3 [mattst88]:
   - Move changes from the next patch to opt_algebraic.cpp to accept
     3-src operations.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoglsl: Rework ir_reader to handle expressions with three operands.
Kenneth Graunke [Sun, 2 Dec 2012 07:49:19 +0000 (23:49 -0800)]
glsl: Rework ir_reader to handle expressions with three operands.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoglsl: Consolidate ir_expression constructors that use explicit types.
Kenneth Graunke [Sun, 2 Dec 2012 07:40:42 +0000 (23:40 -0800)]
glsl: Consolidate ir_expression constructors that use explicit types.

Previously, we had separate constructors for one, two, and four operand
expressions.  This patch consolidates them into a single constructor
which uses NULL default parameters.

The unary and binary operator constructors had assertions to verify that
the caller supplied the correct number of operands for the expression,
but the four-operand version did not.  Since get_num_operands for
ir_quadop_vector returns the number of vector_elements, we can safely
add that without breaking the semantics of ir_quadop_vector.

This also paves the way for expressions with three operands.  Currently,
none can be constructed since get_num_operands() never returns 3.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/vs/gen7: Allow MATH instructions to have MRF as a destination
Matt Turner [Tue, 12 Feb 2013 23:50:43 +0000 (15:50 -0800)]
i965/vs/gen7: Allow MATH instructions to have MRF as a destination

total instructions in shared programs: 346873 -> 346847 (-0.01%)
instructions in affected programs:     364 -> 338 (-7.14%)

(All affected shaders are from Lightsmark)

Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoi965/fs/gen7: Allow MATH instructions to have MRF as a destination
Matt Turner [Tue, 12 Feb 2013 21:59:37 +0000 (13:59 -0800)]
i965/fs/gen7: Allow MATH instructions to have MRF as a destination

total instructions in shared programs: 1376297 -> 1375626 (-0.05%)
instructions in affected programs:     35977 -> 35306 (-1.87%)

Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoi965/gen7: Relax restrictions on fake MRFs
Matt Turner [Mon, 11 Feb 2013 19:06:13 +0000 (11:06 -0800)]
i965/gen7: Relax restrictions on fake MRFs

Gen6 has write-only MRF registers, and for ease of implementation we
partition off 16 general-purpose registers to act as MRFs on Gen7.

Knowing that our Gen7 MRFs are actually GRFs, we can do things we can't
do with real MRFs:
   - read from them;
   - return values directly to them from a send instruction; and
   - compute directly to them with math instructions.
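
A sketch of what "fake MRFs" means in practice: an MRF number is just an
offset into a reserved block of GRFs. The base register number below (112,
the top 16 of 128 GRFs) and the names are assumptions made for illustration,
not values taken from this patch.

```c
#include <assert.h>

#define FAKE_MRF_COUNT     16u   /* GRFs set aside to act as MRFs */
#define FAKE_MRF_GRF_BASE 112u   /* assumed first GRF of the reserved block */

/* Translate a Gen7 "MRF" number into the GRF that really backs it. */
static unsigned fake_mrf_to_grf(unsigned mrf)
{
    assert(mrf < FAKE_MRF_COUNT);
    return FAKE_MRF_GRF_BASE + mrf;
}
```

Because the result is an ordinary GRF, it can be read back or used as the
destination of a send or math instruction, which is what the list above
relies on.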

Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoi965/fs: Remove duplicate scan_inst->mlen check
Matt Turner [Mon, 11 Feb 2013 19:24:48 +0000 (11:24 -0800)]
i965/fs: Remove duplicate scan_inst->mlen check

It is already checked 20 lines below.

Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoclover: Fix build with LLVM 3.3 v2
Tom Stellard [Fri, 22 Feb 2013 18:19:14 +0000 (19:19 +0100)]
clover: Fix build with LLVM 3.3 v2

v2:
  - Fix order that the clang libraries are passed to the linker to avoid
    missing symbol errors.

Acked-by: Francisco Jerez <currojerez@riseup.net>
11 years agoattrib: push/pop FRAGMENT_PROGRAM_ARB state
Jordan Justen [Thu, 28 Feb 2013 07:19:55 +0000 (23:19 -0800)]
attrib: push/pop FRAGMENT_PROGRAM_ARB state

This requirement was added by ARB_fragment_program

When the Steam overlay is enabled, this fixes:
* Menu corruption with the Puddle game
* The screen going black on Rochard when
  the Steam overlay is accessed

NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agoscons: Fix Windows build with LLVM 3.2
Keith Kriewall [Thu, 28 Feb 2013 15:40:02 +0000 (15:40 +0000)]
scons: Fix Windows build with LLVM 3.2

Fixes fdo bug 61299

NOTE: This is a candidate for the stable branches.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
11 years agoautotools: oprofilejit should be included in the list of LLVM components required
Adam Sampson [Thu, 28 Feb 2013 15:35:11 +0000 (15:35 +0000)]
autotools: oprofilejit should be included in the list of LLVM components required

NOTE: This is a candidate for the stable branch.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
11 years agor600g: workaround hyperz lockup on evergreen
Jerome Glisse [Wed, 20 Feb 2013 21:20:17 +0000 (16:20 -0500)]
r600g: workaround hyperz lockup on evergreen

This workaround disables hyperz if writes to the zbuffer are disabled. Somehow,
using hyperz when not writing to the zbuffer triggers a GPU lockup. See:

https://bugs.freedesktop.org/show_bug.cgi?id=60848

Candidate for 9.1

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
11 years agotexobj: add verbose api trace messages to several routines
Jordan Justen [Mon, 25 Feb 2013 21:56:20 +0000 (13:56 -0800)]
texobj: add verbose api trace messages to several routines

Motivated by wanting to see if GenTextures was called by an
application while debugging another Steam overlay issue.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>