mesa.git
12 years agoscons: Append x11 library path if linking x11 library.
Vinson Lee [Sat, 17 Nov 2012 07:35:42 +0000 (23:35 -0800)]
scons: Append x11 library path if linking x11 library.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
12 years agomesa/vbo: Fix scaling issue in 2-bit signed normalized packing.
Kenneth Graunke [Fri, 12 Oct 2012 19:46:44 +0000 (12:46 -0700)]
mesa/vbo: Fix scaling issue in 2-bit signed normalized packing.

Since a signed 2-bit integer can only represent -1, 0, or 1, it is
tempting to simply convert it directly to a float.  This maps it
onto the correct range of [-1.0, 1.0].  However, it gives different
values compared to the usual equation:

(2.0 *  1.0 + 1.0) * (1.0 / 3.0) = +1.0           (same)
(2.0 *  0.0 + 1.0) * (1.0 / 3.0) = +0.33333333... (different)
(2.0 * -1.0 + 1.0) * (1.0 / 3.0) = -0.33333333... (different)

According to the GL_ARB_vertex_type_2_10_10_10_rev extension, signed
normalization is performed using equation 2.2 from the GL 3.2
specification, which is:

   f = (2c + 1)/(2^b - 1).                                (2.2)

Comments below that equation state: "In general, this representation is
used for signed normalized fixed-point parameters in GL commands, such
as vertex attribute values."  Which is what we're doing here.

The 3.2 specification goes on to declare an alternate formula:

   f = max{c/(2^(b-1) - 1), -1.0}                         (2.3)

which is closer to the existing code, and maps the end points to exactly
-1.0 and 1.0.  Comments below the equation state: "In general, this
representation is used for signed normalized fixed-point texture or
framebuffer values."  Which is *not* what we're doing here.

It then states: "Everywhere that signed normalized fixed-point
values are converted, the equation used is specified."  This is the real
clincher: the extension explicitly specifies that we must use equation
2.2, not 2.3.  So we need to do (2x + 1) / 3.

This matches the behavior expected by oglconform's packed-vertex test,
and is correct for desktop GL (pre-4.2).  It's not correct for ES 3.0,
but a future patch will correct that.
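
As a rough illustration of the two equations quoted above, here is a
minimal C sketch.  The function names snorm_eq_2_2 and snorm_eq_2_3 are
made up for this example and are not the names used in the vbo code.

```c
#include <assert.h>

/* Equation 2.2 from the GL 3.2 spec: f = (2c + 1) / (2^b - 1).
 * Used for vertex attribute values. */
float snorm_eq_2_2(int c, int bits)
{
   assert(bits >= 2 && bits < 31);
   return (2.0f * (float)c + 1.0f) / (float)((1 << bits) - 1);
}

/* Equation 2.3 from the GL 3.2 spec: f = max(c / (2^(b-1) - 1), -1.0).
 * Used for texture and framebuffer values. */
float snorm_eq_2_3(int c, int bits)
{
   assert(bits >= 2 && bits < 31);
   float f = (float)c / (float)((1 << (bits - 1)) - 1);
   return f < -1.0f ? -1.0f : f;
}
```

With bits = 2, snorm_eq_2_2() reproduces the three rows of the table
above, while snorm_eq_2_3() returns the input unchanged for 1, 0 and -1.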

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marek Olšák <maraeo@gmail.com>
12 years agomesa/vbo: Fix scaling issue in 10-bit signed normalized packing.
Kenneth Graunke [Fri, 12 Oct 2012 18:17:39 +0000 (11:17 -0700)]
mesa/vbo: Fix scaling issue in 10-bit signed normalized packing.

For the 10-bit components, the divisor was incorrect.  A 10-bit signed
integer can represent -2^9 through 2^9 - 1, which leads to the following
ranges:

       (float)value.x          -> [ -512,  511]
2.0F * (float)value.x          -> [-1024, 1022]
2.0F * (float)value.x + 1.0F   -> [-1023, 1023]

So dividing by 511 would incorrectly scale it to approximately:
[-2.001956947, 2.001956947].  To correctly scale to [-1.0, 1.0], we need
to divide by 1023.

This correctly implements the desktop GL rules.  ES 3.0 has different
rules, but those will be implemented in a separate patch.
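
The range argument above can be checked with a small C sketch.  The
helpers sign_extend() and snorm10_to_float(), and the shift parameter,
are assumptions of this example rather than the actual vbo helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of `v` (two's complement). */
int sign_extend(uint32_t v, int bits)
{
   assert(bits > 0 && bits < 32);
   uint32_t mask = (1u << bits) - 1u;
   uint32_t sign = 1u << (bits - 1);
   v &= mask;
   return (int)(v ^ sign) - (int)sign;
}

/* Desktop GL rule for a 10-bit signed normalized field that starts at
 * bit `shift` of a packed word: (2c + 1) / 1023. */
float snorm10_to_float(uint32_t packed, int shift)
{
   int c = sign_extend(packed >> shift, 10);
   return (2.0f * (float)c + 1.0f) / 1023.0f;
}
```

The extreme field values 0x1FF (511) and 0x200 (-512) then map to
exactly 1.0 and -1.0, which is the [-1.0, 1.0] range the message asks
for.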

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Marek Olšák <maraeo@gmail.com>
12 years agoradeonsi: add a new SI pci id
Alex Deucher [Wed, 21 Nov 2012 23:48:18 +0000 (18:48 -0500)]
radeonsi: add a new SI pci id

Note: this is a candidate for the stable branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 years agoi915: Fix wrong sizeof argument in i915_update_tex_unit.
Vinson Lee [Wed, 14 Nov 2012 07:20:42 +0000 (23:20 -0800)]
i915: Fix wrong sizeof argument in i915_update_tex_unit.

The bug was found by Coverity.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agoAdd .dirstamp to toplevel .gitignore
Andreas Boll [Sat, 17 Nov 2012 17:04:54 +0000 (18:04 +0100)]
Add .dirstamp to toplevel .gitignore

12 years agogallium/tests: update .gitignore files
Andreas Boll [Wed, 21 Nov 2012 17:17:00 +0000 (18:17 +0100)]
gallium/tests: update .gitignore files

12 years agoi965/fs: Add helper functions for IF and CMP and use them.
Eric Anholt [Fri, 9 Nov 2012 20:50:03 +0000 (12:50 -0800)]
i965/fs: Add helper functions for IF and CMP and use them.

v2: Rebase on gen6-if fix.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
12 years agoi965/fs: Add helper functions for generating ALU ops, like in the VS.
Eric Anholt [Fri, 9 Nov 2012 20:01:05 +0000 (12:01 -0800)]
i965/fs: Add helper functions for generating ALU ops, like in the VS.

This gives us checking of our arguments (no more passing 1 operand to
BRW_OPCODE_MUL!), at the cost of a couple of extra parens.

v2: Rebase on gen6-if fix.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
12 years agoi965/gen4: Fix crash with fragment programs and texture rectangle.
Eric Anholt [Sun, 18 Nov 2012 21:18:03 +0000 (13:18 -0800)]
i965/gen4: Fix crash with fragment programs and texture rectangle.

This was a regression in the brw_fs_fp.cpp change.  We just need to return
something good enough to get the IR generation to the end without crashing,
but ir->type isn't initialized and we wanted something of the coordinate's
type anyway.

Fixes around 30 piglit cases on my ilk system in drawpixels and framebuffer
blit.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56962
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoi965: Disable the GB clip test when a limited viewport is set.
Eric Anholt [Thu, 15 Nov 2012 20:00:33 +0000 (12:00 -0800)]
i965: Disable the GB clip test when a limited viewport is set.

The theory of the guardband is that you extend the clip volume to avoid
expensive clipping computation, and just let fragments outside the viewport
get clipped by the drawable's bounds.  But if a smaller-than-window-size
viewport is set, and we don't also happen to have a scissor set, then
rendering could incorrectly extend outside of the viewport when it should have
been clipped to the viewport.
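
The condition described above could be sketched roughly as follows.
struct rect and can_use_guardband() are hypothetical names, and the
sketch ignores the scissor case mentioned in the message; it is not the
driver's actual clip state code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified viewport description. */
struct rect { int x, y, width, height; };

/* The guardband is only safe when clipping to the drawable's bounds is
 * equivalent to clipping to the viewport, i.e. when the viewport covers
 * the whole drawable. */
bool can_use_guardband(struct rect viewport, int fb_width, int fb_height)
{
   assert(fb_width > 0 && fb_height > 0);
   return viewport.x <= 0 && viewport.y <= 0 &&
          viewport.x + viewport.width >= fb_width &&
          viewport.y + viewport.height >= fb_height;
}
```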

Fixes the new piglit triangle-guardband-viewport test.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.0 branch.

12 years agoi965: Use fewer temporary variables in clip setup.
Eric Anholt [Thu, 15 Nov 2012 19:55:36 +0000 (11:55 -0800)]
i965: Use fewer temporary variables in clip setup.

When you're comparing to the spec, you're trying to immediately see what
numbered dword of the packet your bit ends up in.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for the 9.0 branch.

12 years agoRevert "i965/fs: Fix conversions float->bool, int->bool"
Eric Anholt [Mon, 12 Nov 2012 21:16:02 +0000 (13:16 -0800)]
Revert "i965/fs: Fix conversions float->bool, int->bool"

This reverts commit cf0bbb30f6bd9d3fa61b5207320e8f34c563a2c6.  It
was just papering over the bug fixed in the previous commit.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoi965/fs: Fix the gen6-specific if handling for 80ecb8f15b9ad7d6edc
Eric Anholt [Mon, 12 Nov 2012 21:13:55 +0000 (13:13 -0800)]
i965/fs: Fix the gen6-specific if handling for 80ecb8f15b9ad7d6edc

Fixes oglconform shad-compiler advanced.TestLessThani.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48629
NOTE: This is a candidate for the 9.0 branch.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agointel: Use designated initializers for DRI extension structs
Chad Versace [Mon, 19 Nov 2012 19:43:51 +0000 (11:43 -0800)]
intel: Use designated initializers for DRI extension structs

All Intel code is compiled with -std=c99. There is no excuse to not use
designated initializers.

As a nice benefit, the code is now more friendly to grep. Without
designated initializers, psychic prowess is required to find the
initialization of DRI extension function pointers with grep.  I have
observed several people, when they first encounter the DRI code, fail at
statically chasing the DRI function pointers due to this problem.
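
To illustrate the grep argument, here is a small C99 sketch.  The
struct example_extension and its fields are invented for this example
and do not correspond to a real DRI extension.

```c
#include <stddef.h>

/* Hypothetical extension struct standing in for a DRI extension vtable. */
struct example_extension {
   const char *name;
   int version;
   int (*create)(int flags);
   void (*destroy)(int handle);
};

int example_create(int flags) { return flags + 1; }
void example_destroy(int handle) { (void)handle; }

/* Positional form: the reader must count fields to see what goes where. */
const struct example_extension positional = {
   "EXAMPLE", 2, example_create, example_destroy
};

/* Designated form: grepping for ".create" finds the assignment directly. */
const struct example_extension designated = {
   .name    = "EXAMPLE",
   .version = 2,
   .create  = example_create,
   .destroy = example_destroy,
};
```

Both objects hold the same values; only the second one can be found by
searching for the field name.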

Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
12 years agodri: Use designated initializers for DRI extension structs
Chad Versace [Mon, 19 Nov 2012 21:40:00 +0000 (13:40 -0800)]
dri: Use designated initializers for DRI extension structs

The dri directory is compiled with -std=c99. There is no excuse to not use
designated initializers.

As a nice benefit, the code is now more friendly to grep. Without
designated initializers, psychic prowess is required to find the
initialization of DRI extension function pointers with grep.  I have
observed several people, when they first encounter the DRI code, fail at
statically chasing the DRI function pointers due to this problem.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
12 years agoi965: Use the separate stencil buffer's offsets for stencil setup.
Eric Anholt [Mon, 5 Nov 2012 17:53:31 +0000 (09:53 -0800)]
i965: Use the separate stencil buffer's offsets for stencil setup.

For a packed depth/stencil buffer on separate stencil hardware, the
separate depth miptree is set up with alignment of 4,4 and the separate
stencil miptree is set up with alignment of 8,8.  We can't just use the
irb->draw_{x,y} offsets for stencil, since that is the offset in the
depth miptree.

Fixes 12 piglit depthstencil testcases on ivb.

Acked-by: Chad Versace <chad.versace@linux.intel.com>
12 years agoi965: Move all the depth/stencil/hiz offset logic into the workaround.
Eric Anholt [Sun, 4 Nov 2012 20:47:02 +0000 (12:47 -0800)]
i965: Move all the depth/stencil/hiz offset logic into the workaround.

Given that we have the mask information here (assuming the rebase is to
the same tiling, which is safe), we can just save a set of miptrees and
offsets and the global intra-tile offset in the context and cut out a
bunch of logic.  This will also save emitting the next fix I need to do
twice.

Acked-by: Chad Versace <chad.versace@linux.intel.com>
12 years agoi965: When rebasing depth or stencil, update x/y before deciding the other.
Eric Anholt [Sun, 4 Nov 2012 22:45:05 +0000 (14:45 -0800)]
i965: When rebasing depth or stencil, update x/y before deciding the other.

Fixes a theoretical problem where we had an aligned depth buffer and a
misaligned stencil buffer with a matching tile offset, so we would fail
to rebase depth even after the needed tile offset changed due to the
rebase of stencil.

It should also fix double-rebase of a misaligned packed depth/stencil
renderbuffer, which may have been a performance issue.

Acked-by: Chad Versace <chad.versace@linux.intel.com>
12 years agointel: Push face/level -> slice handling to the caller of get_image_offset().
Eric Anholt [Thu, 1 Nov 2012 00:00:21 +0000 (17:00 -0700)]
intel: Push face/level -> slice handling to the caller of get_image_offset().

We were always passing 0 for one of the two fields, and the code just used
whichever one wasn't 0.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
12 years agoi965: Add some checks for array textures in unsupported paths.
Eric Anholt [Wed, 31 Oct 2012 23:57:51 +0000 (16:57 -0700)]
i965: Add some checks for array textures in unsupported paths.

I noticed these in the next patch where these paths were using the Face
of a teximage but didn't have array handling.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
12 years agoi965: Add a little bit more debug info for validate blits.
Eric Anholt [Wed, 31 Oct 2012 21:30:13 +0000 (14:30 -0700)]
i965: Add a little bit more debug info for validate blits.

The kind of data you're copying is definitely an interesting variable.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
12 years agointel: Remove dead function prototype.
Eric Anholt [Mon, 5 Nov 2012 22:47:42 +0000 (14:47 -0800)]
intel: Remove dead function prototype.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
12 years agoi965: Remove stale comment about wrapped_depth.
Eric Anholt [Wed, 31 Oct 2012 23:25:02 +0000 (16:25 -0700)]
i965: Remove stale comment about wrapped_depth.

I removed that code almost a year ago.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
12 years agomesa: Mark GetBufferParameteri64v as implemented.
Kenneth Graunke [Sun, 18 Nov 2012 04:51:42 +0000 (20:51 -0800)]
mesa: Mark GetBufferParameteri64v as implemented.

Apparently this was accidentally marked as unimplemented, and thus not
put in the dispatch table.

Fixes 7 es3conform tests:
- copy_buffer_parameters
- copy_buffer_data
- copy_buffer_usage
- pixel_buffer_object_bind
- pixel_buffer_object_parameteriv
- pixel_buffer_object_texture_read
- pixel_buffer_object_usage

v2: Also update the DispatchSanity test for this change.

Reviewed-by: Matt Turner <mattst88@gmail.com>
12 years agomesa: Require gen'd names in glBeginQuery on ES 3.0.
Kenneth Graunke [Sun, 18 Nov 2012 02:45:00 +0000 (18:45 -0800)]
mesa: Require gen'd names in glBeginQuery on ES 3.0.

Only legacy OpenGL allows the use of non-gen'd names.  Core profiles
and ES 3 both require the use of glGenQueries().

Note that BeginQuery doesn't exist in ES 1 or ES 2.
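
A minimal sketch of the rule, assuming a hypothetical enum api and a
flag that says whether the name came from glGenQueries().  The real
context tracks both of these differently.

```c
#include <stdbool.h>

#define GL_NO_ERROR          0
#define GL_INVALID_OPERATION 0x0502

/* Hypothetical API flavours, for illustration only. */
enum api { API_GL_COMPAT, API_GL_CORE, API_GLES3 };

/* Returns the GL error BeginQuery should raise for the given query name. */
unsigned begin_query_check(enum api api, bool name_was_generated)
{
   /* Only legacy (compatibility) GL accepts names that were never gen'd. */
   if (!name_was_generated && api != API_GL_COMPAT)
      return GL_INVALID_OPERATION;
   return GL_NO_ERROR;
}
```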

Fixes es3conform's occlusion_query_invalid_beginquery test.

Reviewed-and-tested-by: Matt Turner <mattst88@gmail.com>
12 years agomesa: Support EXT_framebuffer_blit targets in ES 3.0 as well.
Kenneth Graunke [Sun, 18 Nov 2012 07:23:06 +0000 (23:23 -0800)]
mesa: Support EXT_framebuffer_blit targets in ES 3.0 as well.

GL_READ_FRAMEBUFFER and GL_DRAW_FRAMEBUFFER are valid targets in ES 3.

Fixes 23 es3conform framebuffer_blit tests.  Two more go from fail to
crash, but that appears to be because they actually run now.

Reviewed-and-tested-by: Matt Turner <mattst88@gmail.com>
12 years agomesa: Fix error code for glTexParameteri of TEXTURE_MAX_LEVEL.
Kenneth Graunke [Thu, 8 Nov 2012 10:24:08 +0000 (02:24 -0800)]
mesa: Fix error code for glTexParameteri of TEXTURE_MAX_LEVEL.

Calling glTexParameteri() with pname GL_TEXTURE_MAX_LEVEL and either a
target of GL_TEXTURE_RECTANGLE or a negative value previously generated
GL_INVALID_OPERATION.  However, GL_INVALID_VALUE seems more appropriate.

Fixes oglconform's api-error/negative.glTexParameter and es3conform's
sgis_texture_lod_basic_error.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-and-tested-by: Matt Turner <mattst88@gmail.com>
12 years agoi965/vs: Don't lose attribute type when converting ATTR to FIXED_HW_REG.
Kenneth Graunke [Mon, 19 Nov 2012 07:51:47 +0000 (23:51 -0800)]
i965/vs: Don't lose attribute type when converting ATTR to FIXED_HW_REG.

The new brw_reg always had type BRW_REGISTER_TYPE_F, rather than
inheriting the original type of the ATTR file register.

In the past, this hasn't been a problem since we only execute this code
when fixing up GL_FIXED attributes, which always have float types.
However, we'll soon be using it for ARB_vertex_type_10_10_10_2 support,
which uses D and UD types.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoegl/dri2: Set error code when dri2CreateContextAttribs fails
Chad Versace [Fri, 9 Nov 2012 22:06:41 +0000 (14:06 -0800)]
egl/dri2: Set error code when dri2CreateContextAttribs fails

When dri2CreateContextAttribs failed, eglCreateContext returned
NULL yet set the error code to EGL_SUCCESS! The problem was that
eglCreateContext ignored the error code returned by
driCreateContextAttribs.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56706
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
12 years agoi965: Validate requested GLES context version in brwCreateContext
Chad Versace [Fri, 9 Nov 2012 22:06:40 +0000 (14:06 -0800)]
i965: Validate requested GLES context version in brwCreateContext

For GLES1 and GLES2, brwCreateContext neglected to validate the requested
context version received from the DRI layer. If DRI requested an OpenGL
ES2 context with version 3.9, we provided it one.

Before this fix, the switch statement that validated the requested GL
context flavor was an ugly #ifdef copy-paste mess. Instead of reproducing
the copy-paste mess for GLES1 and GLES2, I first refactored it.  Now the
switch statement is readable.
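
The kind of check described above might look roughly like this.  The
enum api, the function name and the max_gl_version parameter are
assumptions of this sketch, not the code in brwCreateContext.

```c
#include <stdbool.h>

/* Hypothetical API identifiers, for illustration only. */
enum api { API_OPENGL, API_OPENGLES, API_OPENGLES2 };

/* Versions are encoded as 10 * major + minor.  max_gl_version is the
 * highest desktop GL version the caller is willing to expose. */
bool context_version_ok(enum api api, unsigned major, unsigned minor,
                        unsigned max_gl_version)
{
   unsigned requested = 10 * major + minor;
   unsigned min, max;

   switch (api) {
   case API_OPENGL:    min = 10; max = max_gl_version; break;
   case API_OPENGLES:  min = 10; max = 11;             break;
   case API_OPENGLES2: min = 20; max = 20;             break;
   default:            return false;
   }
   return requested >= min && requested <= max;
}
```

With a check of this shape, a request for an ES2 context with version
3.9 is rejected instead of silently accepted.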

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
12 years agoautomake: strip LLVM_CXXFLAGS and LLVM_CPPFLAGS too
Maarten Lankhorst [Mon, 19 Nov 2012 08:43:29 +0000 (09:43 +0100)]
automake: strip LLVM_CXXFLAGS and LLVM_CPPFLAGS too

It seems that -DNDEBUG and other flags might still be leaked through
those variables, so strip those off there as well.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
12 years agoi965/fs: Properly patch special values during VGRF compaction.
Kenneth Graunke [Thu, 15 Nov 2012 04:50:05 +0000 (20:50 -0800)]
i965/fs: Properly patch special values during VGRF compaction.

In addition to registers used by instructions, fs_visitor maintains
direct references to certain "special" values used for inputs/outputs.

When I added VGRF compaction, I overlooked these, believing that these
direct references weren't used once instructions were generated.  That
was wrong.  For example, pixel_x/y are used in virtual_grf_interferes(),
which is called by optimization passes and register allocation.

This patch treats all of them as used and patches them after compacting.
While it's not strictly necessary to patch all of them (as some aren't
used after emitting code), it seems safer to simply fix them all.
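
The approach can be sketched in plain C as follows.  The functions
mark_used(), build_remap() and patch_refs() are invented for this
example; fs_visitor's real data structures are C++ and look different.

```c
#include <assert.h>
#include <stdbool.h>

/* Mark every register in `refs` as used, so compaction keeps it. */
void mark_used(bool *used, const int *refs, int num_refs)
{
   for (int i = 0; i < num_refs; i++)
      used[refs[i]] = true;
}

/* Renumber used registers densely; unused ones map to -1.
 * Returns the new register count. */
int build_remap(const bool *used, int count, int *remap)
{
   int next = 0;
   for (int i = 0; i < count; i++)
      remap[i] = used[i] ? next++ : -1;
   return next;
}

/* Patch register references (instruction operands or special values). */
void patch_refs(int *refs, int num_refs, const int *remap)
{
   for (int i = 0; i < num_refs; i++) {
      assert(remap[refs[i]] != -1);   /* a patched reference must be live */
      refs[i] = remap[refs[i]];
   }
}
```

Calling mark_used() on the special values before build_remap(), and
patch_refs() on them afterwards, mirrors the "treat all of them as used
and patch them after compacting" step.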

Fixes oglconform's textureswizzle/advanced.shader.targets, piglit's
glsl-fs-lots-of-tex, and glean's texCombine on pre-Gen6 hardware.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56790
Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965/gen4: Respect the VERTEX_PROGRAM_TWO_SIDE vertex program/shader flag.
Eric Anholt [Wed, 14 Nov 2012 22:37:00 +0000 (14:37 -0800)]
i965/gen4: Respect the VERTEX_PROGRAM_TWO_SIDE vertex program/shader flag.

Fixes piglit "vertex-program-two-side enabled front back" and 4 others.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agomesa: Fix linker-assigned varying component counting since 8fb1e4a462
Eric Anholt [Tue, 13 Nov 2012 22:40:22 +0000 (14:40 -0800)]
mesa: Fix linker-assigned varying component counting since 8fb1e4a462

The goal of that change was to skip counting things that aren't actually
outputs from the VS to the FS.  However, explicit_location isn't set in
the case of linker-assigned locations (the common case), so basically
varying component counting got disabled.  At this stage of the linker,
we've already ensured that var->location is set, so we can just look at
it without worrying.

Fixes i965 assertion failure with the new
piglit glsl-max-varyings --exceed-limits.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51545
Reviewed-by: Brian Paul <brianp@vmware.com>
12 years agomesa: Fix segfault on reading from a missing color read buffer.
Eric Anholt [Tue, 13 Nov 2012 21:39:37 +0000 (13:39 -0800)]
mesa: Fix segfault on reading from a missing color read buffer.

The diff looks funny, but it's moving the integer vs non-integer check
below the _mesa_source_buffer_exists() check that ensures
_ColorReadBuffer is non-null, so we get a GL_INVALID_OPERATION instead
of a segfault.  This looks like it had regressed in the
_mesa_error_check_format_and_type() changes, which removed the first of
the two duplicated checks for the source buffer.  Fixes segfault in the
new piglit ARB_framebuffer_object/negative-readpixels-no-rb.
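
The ordering issue can be shown with a small C sketch.  The structs and
read_pixels_check() are simplified stand-ins invented for this example,
not the actual Mesa types or functions.

```c
#include <stdbool.h>
#include <stddef.h>

#define GL_NO_ERROR          0
#define GL_INVALID_OPERATION 0x0502

/* Hypothetical, minimal stand-ins for the framebuffer state. */
struct renderbuffer { bool is_integer; };
struct framebuffer  { struct renderbuffer *color_read_buffer; };

unsigned read_pixels_check(const struct framebuffer *fb, bool want_integer)
{
   /* Existence check first: without it, the dereference below crashes
    * when no color read buffer is attached. */
   if (fb->color_read_buffer == NULL)
      return GL_INVALID_OPERATION;

   /* Only now is it safe to look at the buffer's format. */
   if (fb->color_read_buffer->is_integer != want_integer)
      return GL_INVALID_OPERATION;

   return GL_NO_ERROR;
}
```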

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45877
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
12 years agointel: Use core mesa support for determining lastLevel.
Eric Anholt [Tue, 13 Nov 2012 20:45:35 +0000 (12:45 -0800)]
intel: Use core mesa support for determining lastLevel.

We had similar issues with using depth in determining the lastLevel of array
textures.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
12 years agomesa: Also handle GL_TEXTURE_EXTERNAL_OES in max num levels.
Eric Anholt [Tue, 13 Nov 2012 20:45:19 +0000 (12:45 -0800)]
mesa: Also handle GL_TEXTURE_EXTERNAL_OES in max num levels.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
12 years agoi965/fs: Unify the param pointer allocation for FP/non-FP.
Eric Anholt [Thu, 8 Nov 2012 22:02:22 +0000 (14:02 -0800)]
i965/fs: Unify the param pointer allocation for FP/non-FP.

Now that we're using the new backend, we may actually put things into push
constants if you have too many uniform values uploaded.  Also, correctly
account for texture rectangle params and drop the old special case for the
0.0/1.0 params from the old backend.

12 years agost/vdpau: Fix vlVdpVideoSurfaceSize for interlaced buffers
Maarten Lankhorst [Sat, 17 Nov 2012 12:22:39 +0000 (13:22 +0100)]
st/vdpau: Fix vlVdpVideoSurfaceSize for interlaced buffers

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
12 years agodocs: import release notes for 9.0.1, add news item
Andreas Boll [Sat, 17 Nov 2012 07:57:00 +0000 (08:57 +0100)]
docs: import release notes for 9.0.1, add news item

12 years agoutil: Only use open coded snprintf for MSVC.
Vinson Lee [Thu, 15 Nov 2012 06:25:05 +0000 (22:25 -0800)]
util: Only use open coded snprintf for MSVC.

MinGW has snprintf.

The patch fixes these warnings with the MinGW SCons build.

src/gallium/auxiliary/util/u_snprintf.c:459:1: warning: no previous prototype for ‘util_vsnprintf’ [-Wmissing-prototypes]
src/gallium/auxiliary/util/u_snprintf.c:1436:1: warning: no previous prototype for ‘util_snprintf’ [-Wmissing-prototypes]
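
A rough sketch of the resulting header logic, assuming the usual
_MSC_VER compiler macro is what selects the fallback.  The exact guard
used in the tree may differ.

```c
#include <stdarg.h>
#include <stdio.h>

#if defined(_MSC_VER)
/* Only MSVC needs the open-coded fallback; it is merely declared here. */
int util_vsnprintf(char *str, size_t size, const char *format, va_list ap);
int util_snprintf(char *str, size_t size, const char *format, ...);
#else
/* Everyone else, MinGW included, maps straight onto the C library. */
#define util_vsnprintf vsnprintf
#define util_snprintf  snprintf
#endif
```

With this shape, the fallback in u_snprintf.c is not built for MinGW,
so the missing-prototype warnings no longer apply.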

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Brian Paul <brianp@vmware.com>
12 years agoclover: Fix build with clang 3.2
Tom Stellard [Mon, 12 Nov 2012 16:04:03 +0000 (16:04 +0000)]
clover: Fix build with clang 3.2

12 years agor300/compiler: Avoid generating MOV instructions for invalid IMM swizzles v2
Tom Stellard [Sun, 16 Sep 2012 03:25:34 +0000 (23:25 -0400)]
r300/compiler: Avoid generating MOV instructions for invalid IMM swizzles v2

If an instruction reads from a constant register that contains
immediates using an invalid swizzle, we can avoid generating MOV
instructions to fix up the swizzle by loading the immediates into a
different constant register that can be read using a valid swizzle.

This only affects r300 and r400 cards.

For example:

CONST[1] = {    -3.5000     3.5000     2.5000     1.5000 }

MAD temp[4].xy, const[0].xy__, const[1].xz__, input[0].xy__;

========== Before this change would be lowered to: =========

CONST[1] = {    -3.5000     3.5000     2.5000     1.5000 }

MOV temp[0].x, const[1].x___;
MOV temp[0].y, const[1]._z__;
MAD temp[4].xy, const[0].xy__, temp[0].xy__, input[0].xy__;

========== After this change is lowered to:  ===============

CONST[1] = {    -3.5000     3.5000     2.5000     1.5000 }
CONST[2] = {     0.0000    -3.5000     2.5000     0.0000 }

MAD temp[4].xy, const[0].xy__, const[2].yz__, input[0].xy__;

============================================================

This change reduces one of the Lightsmark shaders from 133 to 91
instructions.

v2:
  - Fix crash caused by swizzles with only inline constants.

12 years agoradeonsi: clean up some magic numbers
Alex Deucher [Fri, 16 Nov 2012 00:17:34 +0000 (19:17 -0500)]
radeonsi: clean up some magic numbers

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeonsi: emit PA_SC_RASTER_CONFIG
Alex Deucher [Fri, 16 Nov 2012 00:15:53 +0000 (19:15 -0500)]
radeonsi: emit PA_SC_RASTER_CONFIG

Use per asic golden values.

Programming this register doesn't seem to be strictly
necessary on SI, but programming it wrong leads to
rendering issues or reduced performance so just
go ahead and program the golden values explicitly
to avoid any potential problems down the road.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years ago[PATCH] makefiles: use configured name for -ldrm* where possible
Maarten Lankhorst [Fri, 16 Nov 2012 17:50:57 +0000 (18:50 +0100)]
[PATCH] makefiles: use configured name for -ldrm* where possible

For Precise LTS support I had to do some magic with the library names, which works fine
as long as the libraries from pkg-config are used.

The parts with src/gallium/targets/va-*/Makefile will not apply on the master branch,
but do apply to the 9.0 branch.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Acked-by: Matt Turner <mattst88@gmail.com>
12 years agodocs: add note about removal of OpenVMS support
Andreas Boll [Thu, 15 Nov 2012 10:09:38 +0000 (11:09 +0100)]
docs: add note about removal of OpenVMS support

12 years agoRemove OpenVMS support
Matt Turner [Thu, 23 Aug 2012 01:44:54 +0000 (18:44 -0700)]
Remove OpenVMS support

Not maintained since 2008. Doubtful that it's worked in quite a while.

Also see commit 32ac8cb05 which removed VMS stuff from Makefile in 2009.

Cc: Jouk Jansen <j.jansen@tudelft.nl>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
12 years agobuild: add missing Makefile.in files to tarballs target
Andreas Boll [Thu, 15 Nov 2012 09:11:51 +0000 (10:11 +0100)]
build: add missing Makefile.in files to tarballs target

Those are recently introduced on master.

Reviewed-by: Matt Turner <mattst88@gmail.com>
12 years agobuild: fix make tarballs target
Andreas Boll [Wed, 14 Nov 2012 22:38:16 +0000 (23:38 +0100)]
build: fix make tarballs target

Fixes a regression introduced in 907844107252260c646aca361191ef7f121f3d23.

The targets for making lex.yy.c, program_parse.tab.c and program_parse.tab.h
were moved into their own Makefile.

Reviewed-by: Matt Turner <mattst88@gmail.com>
12 years agogles2: Update gl2ext.h to revision 19436
Matt Turner [Thu, 15 Nov 2012 19:56:22 +0000 (11:56 -0800)]
gles2: Update gl2ext.h to revision 19436

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
12 years agogles2: Update gl2.h to revision 16803
Matt Turner [Thu, 15 Nov 2012 19:55:59 +0000 (11:55 -0800)]
gles2: Update gl2.h to revision 16803

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
12 years agogles: Update glext.h to revision 19260
Matt Turner [Thu, 15 Nov 2012 19:55:16 +0000 (11:55 -0800)]
gles: Update glext.h to revision 19260

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
12 years agoegl: Update eglext.h to revision 19571
Matt Turner [Thu, 15 Nov 2012 19:54:20 +0000 (11:54 -0800)]
egl: Update eglext.h to revision 19571

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
12 years agomesa: return INVALID_VALUE from WaitSync if timeout != GL_TIMEOUT_IGNORED
Matt Turner [Tue, 13 Nov 2012 21:49:51 +0000 (13:49 -0800)]
mesa: return INVALID_VALUE from WaitSync if timeout != GL_TIMEOUT_IGNORED

This was added in version 22 of the GL_ARB_sync spec.
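
Taken together with the flags check from the neighbouring WaitSync
commit, the validation could be sketched like this.  wait_sync_check()
is a made-up name; the constants carry their standard GL values.

```c
#include <stdint.h>

#define GL_NO_ERROR        0
#define GL_INVALID_VALUE   0x0501
#define GL_TIMEOUT_IGNORED 0xFFFFFFFFFFFFFFFFull

/* Returns the GL error WaitSync should raise for these arguments. */
unsigned wait_sync_check(unsigned flags, uint64_t timeout)
{
   if (flags != 0)
      return GL_INVALID_VALUE;
   if (timeout != GL_TIMEOUT_IGNORED)
      return GL_INVALID_VALUE;
   return GL_NO_ERROR;
}
```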

Fixes gles3conform's sync_error_waitsync_timeout test.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agomesa: return INVALID_VALUE from WaitSync if flags != 0
Matt Turner [Tue, 13 Nov 2012 21:26:11 +0000 (13:26 -0800)]
mesa: return INVALID_VALUE from WaitSync if flags != 0

Fixes gles3conform's sync_error_waitsync_flags test.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agomesa: return INVALID_VALUE from ClientWaitSync if flags contains an unsupported flag
Matt Turner [Tue, 13 Nov 2012 21:26:11 +0000 (13:26 -0800)]
mesa: return INVALID_VALUE from ClientWaitSync if flags contains an unsupported flag

Fixes gles3conform's sync_error_clientwaitsync_flags test.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agomesa: return INVALID_VALUE from VertexAttribDivisor if index out of range
Matt Turner [Tue, 13 Nov 2012 21:05:03 +0000 (13:05 -0800)]
mesa: return INVALID_VALUE from VertexAttribDivisor if index out of range

All the other range checks on index already return the proper error,
INVALID_VALUE.

Fixes gles3conform's instanced_arrays_invalid test.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoglcpp: Don't define macros for extensions that aren't in ES
Matt Turner [Tue, 13 Nov 2012 00:45:43 +0000 (16:45 -0800)]
glcpp: Don't define macros for extensions that aren't in ES

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoradeonsi: remove new asserts and replace with warnings
Alex Deucher [Thu, 15 Nov 2012 20:36:46 +0000 (15:36 -0500)]
radeonsi: remove new asserts and replace with warnings

Fixes piglit regressions.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
12 years agoi965/fs: Don't calculate_live_intervals() in opt_algebraic().
Kenneth Graunke [Sat, 3 Nov 2012 04:24:05 +0000 (21:24 -0700)]
i965/fs: Don't calculate_live_intervals() in opt_algebraic().

There's no point: opt_algebraic() doesn't use any liveness information.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965: Remove duplicate brw_opcodes table in favor of opcode_descs.
Kenneth Graunke [Wed, 14 Nov 2012 22:24:31 +0000 (14:24 -0800)]
i965: Remove duplicate brw_opcodes table in favor of opcode_descs.

brw_optimize.c's brw_opcodes table was a copy of brw_disasm.c's
opcode_descs table, but with an additional field: is_arith.  Now that
I've deleted that, the two are identical.  Keep the one in brw_disasm.c.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965/vs: Remove dead vec4_visitor::src_reg_for_float prototype.
Kenneth Graunke [Sun, 11 Nov 2012 05:53:35 +0000 (21:53 -0800)]
i965/vs: Remove dead vec4_visitor::src_reg_for_float prototype.

No such function exists.  src_reg's constructor does that.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965/fs: Remove bblock field of fs_visitor.
Kenneth Graunke [Fri, 9 Nov 2012 08:38:37 +0000 (00:38 -0800)]
i965/fs: Remove bblock field of fs_visitor.

All users of basic block analysis simply create their own local
variables.  Nobody uses the visitor-wide field.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965: Remove brw_instruction_info::is_arith().
Kenneth Graunke [Wed, 14 Nov 2012 04:42:36 +0000 (20:42 -0800)]
i965: Remove brw_instruction_info::is_arith().

Nobody uses it.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965: Remove some dead code optimization passes.
Kenneth Graunke [Wed, 14 Nov 2012 04:33:33 +0000 (20:33 -0800)]
i965: Remove some dead code optimization passes.

The old brw_remove_grf_to_mrf_moves() pass is obsolete and replaced by
fs_visitor::compute_to_mrf().

The old brw_remove_duplicate_mrf_moves() pass is obsolete and replaced
by fs_visitor::remove_duplicate_mrf_writes().

The remaining pass, brw_set_dp4_dependency_control(), is currently
unused, but could be, so I'm leaving it for now.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965: Remove unused BRW_PACKCOLOR8888 macro.
Kenneth Graunke [Wed, 14 Nov 2012 04:17:29 +0000 (20:17 -0800)]
i965: Remove unused BRW_PACKCOLOR8888 macro.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965: Remove brw_shader_program wrapper struct.
Kenneth Graunke [Wed, 14 Nov 2012 03:59:08 +0000 (19:59 -0800)]
i965: Remove brw_shader_program wrapper struct.

At this point, it's just gl_shader_program.  Nobody even uses it; even
the program that creates them only returns gl_shader_program pointers.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965: Remove unused struct brw_vs_ouput_sizes.
Kenneth Graunke [Wed, 14 Nov 2012 03:56:05 +0000 (19:56 -0800)]
i965: Remove unused struct brw_vs_ouput_sizes.

With a name like that, it can't be used.  Sure enough, it's not.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoutil/u_debug: Fix DEBUG_NAMED_VALUE.
José Fonseca [Tue, 13 Nov 2012 10:23:11 +0000 (10:23 +0000)]
util/u_debug: Fix DEBUG_NAMED_VALUE.

"#__symbol" doesn't work with nested macro expansions, at least not on gcc.

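One way "#" stringification can go wrong under nested expansion is sketched
below. This is an illustration only and does not reproduce the u_debug.h
macros: struct named_value, EXAMPLE_FLAG, NAMED_DIRECT, NAMED_NESTED and
name_is are hypothetical names, and the assumption is that the symbol being
named is itself an object-like macro.

```c
#include <string.h>

struct named_value {
   const char *name;
   unsigned long value;
};

/* Hypothetical flag that is itself an object-like macro. */
#define EXAMPLE_FLAG 0x4

/* #sym is applied to the token exactly as the caller wrote it. */
#define NAMED_DIRECT(sym) { #sym, (unsigned long)(sym) }

/* sym is only forwarded here, so the preprocessor fully expands it
 * before NAMED_DIRECT sees it; #sym then stringifies the expansion. */
#define NAMED_NESTED(sym) NAMED_DIRECT(sym)

static const struct named_value direct = NAMED_DIRECT(EXAMPLE_FLAG);
static const struct named_value nested = NAMED_NESTED(EXAMPLE_FLAG);

/* Returns nonzero when the entry's name equals expected. */
static int name_is(const struct named_value *v, const char *expected)
{
   return strcmp(v->name, expected) == 0;
}
```

With these definitions, direct.name is "EXAMPLE_FLAG" while nested.name is
"0x4", because a macro argument is expanded before substitution unless it is
the operand of # or ##.
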
12 years agodraw: fix crashes with out-of-bounds indices
Roland Scheidegger [Fri, 2 Nov 2012 15:48:49 +0000 (16:48 +0100)]
draw: fix crashes with out-of-bounds indices

The passthrough pipeline needs to check index values (which might be passed
through) as they can be invalid (which causes crashes and various assertion
failures if the clip code runs). Obviously, rendering won't be well-defined,
but those bogus indices might come directly from apps.
There were already debug printfs which reported the out-of-bounds indices but
we really ought to not crash.
While checking at that point doesn't seem like the most efficient solution,
it seems there isn't really another appropriate function to do it.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
12 years agoradeonsi: cleanup si_db()
Alex Deucher [Thu, 15 Nov 2012 14:37:44 +0000 (09:37 -0500)]
radeonsi: cleanup si_db()

Clean up a few magic numbers and rework the code a bit.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeonsi: assert the CB format is valid (v2)
Alex Deucher [Thu, 15 Nov 2012 14:34:13 +0000 (09:34 -0500)]
radeonsi: assert the CB format is valid (v2)

Assert that the CB format is valid and default to
the INVALID hw format rather than ~0U when the format
doesn't match for non-debug builds.

v2: use INVALID hw format rather than ~0U

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeonsi: assert that the DB format is valid (v2)
Alex Deucher [Thu, 15 Nov 2012 14:31:26 +0000 (09:31 -0500)]
radeonsi: assert that the DB format is valid (v2)

Assert that the DB format is valid and default to
the INVALID hw format rather than ~0U when the format
doesn't match for non-debug builds.

v2: use INVALID hw format rather than ~0U

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agogallium: fix some function comments in p_context.h
Dmitry Cherkassov [Wed, 14 Nov 2012 19:33:18 +0000 (23:33 +0400)]
gallium: fix some function comments in p_context.h

Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
12 years agobuild: add missing files to tarballs target
Andreas Boll [Wed, 14 Nov 2012 20:43:31 +0000 (21:43 +0100)]
build: add missing files to tarballs target

fixes errors ./configure and make were complaining about

NOTE: This is a candidate for the 9.0 branch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
12 years agobuild: add missing Makefile.in files to tarballs target
Andreas Boll [Wed, 14 Nov 2012 20:39:15 +0000 (21:39 +0100)]
build: add missing Makefile.in files to tarballs target

fixes errors ./configure was complaining about

NOTE: This is a candidate for the 9.0 branch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
12 years agobuild: add config.sub and config.guess to tarballs target
Andreas Boll [Wed, 14 Nov 2012 20:34:44 +0000 (21:34 +0100)]
build: add config.sub and config.guess to tarballs target

fixes errors ./configure was complaining about

NOTE: This is a candidate for the 9.0 branch.

Reviewed-by: Matt Turner <mattst88@gmail.com>
12 years agomesa: use .cherry-ignore in the get-pick-list.sh script
Andreas Boll [Mon, 22 Oct 2012 19:18:17 +0000 (21:18 +0200)]
mesa: use .cherry-ignore in the get-pick-list.sh script

NOTE: This is a candidate for the stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agomesa: Add .gitignore for hashtable collision unit test.
Paul Berry [Wed, 14 Nov 2012 19:23:51 +0000 (11:23 -0800)]
mesa: Add .gitignore for hashtable collision unit test.

This test was introduced in commit
35fd61bd99c15c2e13d3945b41c4db7df6e64319.

12 years agoradeonsi: Set STENCILOPVAL fields to 1.
Michel Dänzer [Wed, 14 Nov 2012 15:06:52 +0000 (16:06 +0100)]
radeonsi: Set STENCILOPVAL fields to 1.

This is necessary for backwards compatibility with pre-SI for stencil.

Fixes a number of stencil related piglit tests, and real apps using stencil.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeonsi: Bump SI_PM4_MAX_DW.
Michel Dänzer [Tue, 13 Nov 2012 14:41:49 +0000 (15:41 +0100)]
radeonsi: Bump SI_PM4_MAX_DW.

Fixes assertion failure with Mesa demo glsl/samplers.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
12 years agoradeonsi: Handle TGSI TXL opcode.
Michel Dänzer [Tue, 6 Nov 2012 16:41:50 +0000 (17:41 +0100)]
radeonsi: Handle TGSI TXL opcode.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
12 years agoradeonsi: Handle TGSI TXB opcode.
Michel Dänzer [Tue, 6 Nov 2012 16:39:01 +0000 (17:39 +0100)]
radeonsi: Handle TGSI TXB opcode.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
12 years agomesa: Include compiler.h in hash_table.h.
Vinson Lee [Wed, 14 Nov 2012 05:18:09 +0000 (21:18 -0800)]
mesa: Include compiler.h in hash_table.h.

Include the header for the inline symbol. MSVC does not have the inline
keyword for C.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
12 years agor600g: use LINEAR_ALIGNED tiling for 1D array textures and if height0 <= 3
Marek Olšák [Tue, 13 Nov 2012 15:04:13 +0000 (16:04 +0100)]
r600g: use LINEAR_ALIGNED tiling for 1D array textures and if height0 <= 3

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
12 years agoauxillary: Append LLVM_CXXFLAGS to CXXFLAGS
Tom Stellard [Fri, 9 Nov 2012 12:59:33 +0000 (07:59 -0500)]
auxillary: Append LLVM_CXXFLAGS to CXXFLAGS

12 years agor300g: don't call buffer_unmap in draw functions
Marek Olšák [Tue, 13 Nov 2012 14:48:25 +0000 (15:48 +0100)]
r300g: don't call buffer_unmap in draw functions

It's been a no-op anyway.

12 years agor300g: fix crash since the set_vertex_buffers(start_slot) change
Marek Olšák [Tue, 13 Nov 2012 14:44:46 +0000 (15:44 +0100)]
r300g: fix crash since the set_vertex_buffers(start_slot) change

12 years agor600g: untiled window-system buffers should be LINEAR_ALIGNED
Marek Olšák [Mon, 12 Nov 2012 23:36:00 +0000 (00:36 +0100)]
r600g: untiled window-system buffers should be LINEAR_ALIGNED

though I guess the DDX allocates them as LINEAR_GENERAL

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
12 years agor600g: use LINEAR_ALIGNED tiling for 1D textures
Marek Olšák [Mon, 12 Nov 2012 23:29:33 +0000 (00:29 +0100)]
r600g: use LINEAR_ALIGNED tiling for 1D textures

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
12 years agor600g: use LINEAR_ALIGNED tiling for staging textures, reorder the code
Marek Olšák [Mon, 12 Nov 2012 23:25:49 +0000 (00:25 +0100)]
r600g: use LINEAR_ALIGNED tiling for staging textures, reorder the code

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
12 years agoi965/vs: Fix user clip plane setup on Gen4-5.
Kenneth Graunke [Wed, 7 Nov 2012 06:23:05 +0000 (22:23 -0800)]
i965/vs: Fix user clip plane setup on Gen4-5.

On Gen6-7, we don't compact clip planes, and nr_userclip_plane_consts
is the last bit set, so iterating from i = 0..nr_userclip_plane_consts
works and is the right thing to do.

However, that doesn't work at all on Gen4-5.  Since we compact clip
planes, we skip over ones which aren't active (via the continue
statement).  We also set nr_userclip_plane_consts to the number of
active clip planes, which means that we end the loop after checking that
many bits.  If the set of clip planes wasn't contiguous, this means we'd
fail to find the last few.

By changing the iteration to MAX_CLIP_PLANES, we correctly find all of
the active clip planes.
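
The loop-bound problem described above can be sketched in isolation. This is
a hypothetical stand-alone illustration, not the driver code: compact_planes,
enabled_mask, the out array and the value chosen for MAX_CLIP_PLANES are all
assumptions made for the example.

```c
#include <assert.h>

#define MAX_CLIP_PLANES 8  /* assumed value for this sketch */

/* Walks bits [0, limit) of enabled_mask, skips inactive planes, and
 * stores the index of each active plane in the next free slot of out[].
 * Returns the number of active planes found. */
static int compact_planes(unsigned enabled_mask, int limit,
                          int out[MAX_CLIP_PLANES])
{
   int compacted = 0;

   assert(limit >= 0 && limit <= MAX_CLIP_PLANES);

   for (int i = 0; i < limit; i++) {
      if (!(enabled_mask & (1u << i)))
         continue;  /* inactive plane: takes no slot in the output */
      out[compacted++] = i;
   }
   return compacted;
}
```

With planes 0 and 5 enabled (mask 0x21) there are two active planes. Using
that count as the limit only inspects bits 0 and 1 and finds one plane;
using MAX_CLIP_PLANES as the limit finds both.
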

Fixes regressions since 66c8473e028d (replacing the old VS backend) in
Piglit's spec/glsl-1.20/execution/clipping/fixed-clip-enables and
oglconform's mustpass(basic.clip) and userclip(basic.allCases).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56791
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965/vs: Simplify the Gen6-7 part of setup_uniform_clipplane_values().
Kenneth Graunke [Wed, 7 Nov 2012 06:23:04 +0000 (22:23 -0800)]
i965/vs: Simplify the Gen6-7 part of setup_uniform_clipplane_values().

There's no compaction, so we can drop that code and simply use 'i'.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965/vs: Split setup_uniform_clipplane_values() into Gen4-5/6-7 parts.
Kenneth Graunke [Wed, 7 Nov 2012 06:23:03 +0000 (22:23 -0800)]
i965/vs: Split setup_uniform_clipplane_values() into Gen4-5/6-7 parts.

Since Gen4-5 compacts clip planes and Gen6-7 doesn't, it makes sense to
split them into separate code paths.  This patch simply copies the code
to both halves; the next commits will simplify it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agomesa: Replace random with standard C rand.
Vinson Lee [Tue, 13 Nov 2012 06:15:42 +0000 (22:15 -0800)]
mesa: Replace random with standard C rand.

BSD random is not available on some compilers.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
12 years agoautomake: Remove empty file variable.
Brian Paul [Tue, 13 Nov 2012 05:29:34 +0000 (21:29 -0800)]
automake: Remove empty file variable.

Fixes SCons build regression introduced with commit
a665cf1226b80ec52a0c1a4a38378df4389e8ebf.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Tested-by: Vinson Lee <vlee@freedesktop.org>
12 years agomesa: Fix gallium build since 6991c2922f
Eric Anholt [Tue, 13 Nov 2012 03:32:58 +0000 (19:32 -0800)]
mesa: Fix gallium build since 6991c2922f

Looks like I screwed up and didn't test gallium again after tweaking the
Makefile.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57044

12 years agomesa: Convert the hash table for GL object ids to the open-addressing hash.
Eric Anholt [Wed, 7 Nov 2012 07:18:42 +0000 (23:18 -0800)]
mesa: Convert the hash table for GL object ids to the open-addressing hash.

The previous 1023-entry chaining hash table never resized, so it was very
inefficient when there were many objects live.  While one could have an even
more efficient implementation than this (keep an array for genned names with
packed IDs, or take advantage of the fact that key == hash or key ==
*(uint32_t *)data to store less data), this is fairly fast, and I want a nice
replacement hash table for other parts of Mesa, too.

It improves Minecraft performance 12.3% +/- 1.4% (n=9), dropping hash lookups
from 8% of the profile to 0.5%.
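
For readers unfamiliar with the technique, a resizing open-addressing table
keyed directly on the object id might look roughly like the sketch below.
It is not the implementation added by this commit: the id_table names,
linear probing, power-of-two sizing and the 1/2 load-factor limit are
assumptions chosen to keep the example short, and deletion is omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table mapping nonzero ids to pointers; key 0 marks an
 * empty slot. */
struct id_table {
   uint32_t *keys;
   void **values;
   uint32_t size;     /* number of slots, always a power of two */
   uint32_t entries;  /* number of occupied slots */
};

/* Allocates size empty slots; returns nonzero on success. */
static int id_table_init(struct id_table *t, uint32_t size)
{
   assert(size != 0 && (size & (size - 1)) == 0);
   t->keys = calloc(size, sizeof(*t->keys));
   t->values = calloc(size, sizeof(*t->values));
   t->size = size;
   t->entries = 0;
   return t->keys != NULL && t->values != NULL;
}

static void id_table_insert(struct id_table *t, uint32_t key, void *value);

/* Doubles the slot count and reinserts every live entry. */
static void id_table_grow(struct id_table *t)
{
   struct id_table old = *t;

   if (!id_table_init(t, old.size * 2))
      abort();
   for (uint32_t i = 0; i < old.size; i++) {
      if (old.keys[i])
         id_table_insert(t, old.keys[i], old.values[i]);
   }
   free(old.keys);
   free(old.values);
}

static void id_table_insert(struct id_table *t, uint32_t key, void *value)
{
   assert(key != 0);

   /* Grow first if this insert could push the load factor above 1/2,
    * so probe sequences stay short and always reach an empty slot. */
   if ((t->entries + 1) * 2 > t->size)
      id_table_grow(t);

   uint32_t i = key & (t->size - 1);  /* the key is used as its own hash */
   while (t->keys[i] && t->keys[i] != key)
      i = (i + 1) & (t->size - 1);    /* linear probe to the next slot */

   if (!t->keys[i]) {
      t->keys[i] = key;
      t->entries++;
   }
   t->values[i] = value;
}

/* Returns the stored pointer, or NULL if key is absent. */
static void *id_table_lookup(const struct id_table *t, uint32_t key)
{
   uint32_t i = key & (t->size - 1);

   while (t->keys[i]) {
      if (t->keys[i] == key)
         return t->values[i];
      i = (i + 1) & (t->size - 1);
   }
   return NULL;
}
```

Unlike a fixed-size chaining table, the slot array here grows with the
number of live ids, so lookups do not degrade into long chain walks when
many objects exist.
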

I also tested cairo-gl, which should be a pessimal workload for this hash
table: around 247000 FBOs created and destroyed, only around 65 live at any
time, and few lookups of them between creation and destruction.  No
statistically significant performance difference at n=76 (mean 20.3/20.4
seconds, sd 2.8/3.2 seconds).  If I remove the >20 seconds outliers that
appear to be due to thermal throttling, there's possibly a .97% +/- 0.31%
performance win (n=61/59).  The choice of cutoff for outliers feels a lot like
cooking the data, but I've gone through this process 3 times for minor
iterations of the code with the same conclusion each time.

Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Chad Versace <chad.versace@linux.intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)