mesa.git
Kenneth Graunke [Sat, 13 Jul 2013 22:27:52 +0000 (15:27 -0700)]
glsl: Handle most qualifier ordering in C code rather than the grammar.

The GL_ARB_shading_language_420pack extension/GLSL 4.20 allow qualifiers
to be specified in (basically) any order.  In order to support this, we
can't hardcode the ordering restrictions in the grammar.

This patch alters the grammar to accept invariant, storage, layout, and
interpolation qualifiers in any order, but adds C code to enforce the
ordering requirements.  In the 420pack case, we should be able to simply
skip the error checks.

As a bonus, this also lets us generate decent error messages, rather
than Bison's awful "unexpected TOKEN" errors.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
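
The approach described above, a permissive grammar plus an explicit ordering check in code, can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the qualifier_kind enum, the assumed order (invariant, interpolation, layout, storage), and the function names are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical qualifier kinds.  The enumerator values double as each kind's
// position in the ordering assumed for this sketch.
enum qualifier_kind {
   QUAL_INVARIANT = 0,
   QUAL_INTERPOLATION = 1,
   QUAL_LAYOUT = 2,
   QUAL_STORAGE = 3
};

static const char *qualifier_name(qualifier_kind kind)
{
   switch (kind) {
   case QUAL_INVARIANT:     return "invariant";
   case QUAL_INTERPOLATION: return "interpolation";
   case QUAL_LAYOUT:        return "layout";
   case QUAL_STORAGE:       return "storage";
   }
   assert(!"unknown qualifier kind");
   return "unknown";
}

// Returns an empty string when the sequence is acceptable, or a readable
// error naming the two offending qualifiers.  With any_order_allowed set
// (the 420pack case) the ordering check is skipped entirely.
static std::string check_qualifier_order(const std::vector<qualifier_kind> &quals,
                                         bool any_order_allowed)
{
   if (any_order_allowed)
      return std::string();

   for (std::size_t i = 1; i < quals.size(); i++) {
      if (quals[i] < quals[i - 1]) {
         return std::string(qualifier_name(quals[i])) +
                " qualifiers must come before " +
                qualifier_name(quals[i - 1]) + " qualifiers";
      }
   }
   return std::string();
}
```

Because the check runs after parsing, the message can name the qualifiers involved instead of reporting an unexpected token.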
Kenneth Graunke [Sun, 14 Jul 2013 02:20:37 +0000 (19:20 -0700)]
glsl: Add a new ast_type_qualifier::has_auxiliary_storage() method.

"Auxiliary storage qualifiers" is the new term given to "centroid",
"patch", and "sample" by GLSL 4.20/GL_ARB_shading_language_420pack.

Even though we only support "centroid", it's useful to add this now
so that all auxiliary storage qualifiers get handled in the right places
once they're eventually supported.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sat, 13 Jul 2013 05:36:31 +0000 (22:36 -0700)]
glsl: Add a new ast_type_qualifier::has_storage() method.

This makes it easy to check if any storage qualifiers are set.

"centroid" is not considered a storage qualifier.  In the old language
rules, you can't specify "centroid" by itself; it's always "centroid
in", "centroid out", or "centroid varying."  So one of the other storage
qualifiers will always be set; there's no need to specifically check for
centroid.

In the new 4.20 rules, centroid is an auxiliary storage qualifier, not a
storage qualifier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sat, 13 Jul 2013 05:34:19 +0000 (22:34 -0700)]
glsl: Add a new ast_type_qualifier::has_layout() method.

This makes it easy to check if any layout qualifiers are set.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
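
The three helper methods added in this commit and the two above it (has_layout, has_storage, has_auxiliary_storage) could look something like the sketch below. The struct and its flag names are hypothetical stand-ins, not the real ast_type_qualifier layout; only the split between storage, auxiliary storage, and layout qualifiers follows the commit messages.

```cpp
#include <cassert>

// Simplified, hypothetical stand-in for ast_type_qualifier: one bit per
// qualifier, grouped the way the commit messages describe.
struct qualifier_sketch {
   // Storage qualifiers.
   unsigned constant:1;
   unsigned attribute:1;
   unsigned varying:1;
   unsigned in:1;
   unsigned out:1;
   unsigned uniform:1;
   // Auxiliary storage qualifier (the only one supported at this point).
   unsigned centroid:1;
   // Layout qualifiers (two illustrative examples).
   unsigned explicit_location:1;
   unsigned origin_upper_left:1;

   qualifier_sketch()
      : constant(0), attribute(0), varying(0), in(0), out(0), uniform(0),
        centroid(0), explicit_location(0), origin_upper_left(0)
   {
   }

   // "centroid" is deliberately absent: it is auxiliary, not storage.
   bool has_storage() const
   {
      return constant || attribute || varying || in || out || uniform;
   }

   bool has_auxiliary_storage() const
   {
      return centroid;
   }

   bool has_layout() const
   {
      return explicit_location || origin_upper_left;
   }
};
```

Under the old language rules a declaration such as "centroid in" sets both centroid and in, so has_storage() is true without looking at the centroid bit.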
Kenneth Graunke [Thu, 11 Jul 2013 17:24:15 +0000 (10:24 -0700)]
i965: Combine URB code emission into a single group.

All four URB packets need to be programmed together in order for the GPU
state to be valid.  Putting them in separate BEGIN..ADVANCE blocks is
risky: if we're nearing the end of a batch, the batch could be flushed
in between two of the commands, causing the URB programming to be split
into two batchbuffers.

This -might- be okay with hardware contexts, but it offers no advantages
over keeping them together, and has a potential for hangs.

Putting them into a single BEGIN..ADVANCE block ensures they'll be kept
in the same batch, which seems wise.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
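
To make the batch-splitting risk concrete, here is a toy model of a batchbuffer, assuming that BEGIN reserves space and flushes first when the reservation does not fit. The batch_sketch type, its capacity, and the placeholder packet headers are all hypothetical; this is not the i965 batch code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy batchbuffer: begin(n) guarantees n contiguous dwords, flushing the
// current contents first if they would not fit.
struct batch_sketch {
   std::size_t capacity;
   std::vector<unsigned> dwords;
   unsigned flush_count;

   explicit batch_sketch(std::size_t cap) : capacity(cap), flush_count(0) {}

   void begin(std::size_t n)
   {
      assert(n <= capacity);
      if (dwords.size() + n > capacity) {
         dwords.clear();
         flush_count++;
      }
   }

   void out(unsigned dw) { dwords.push_back(dw); }
};

// Emits four 2-dword packets and returns how many flushes happened after the
// first packet was started.  A nonzero result means the packets were split
// across batches.
static unsigned emit_urb(batch_sketch &batch, bool single_block)
{
   unsigned flushes_at_start = batch.flush_count;

   if (single_block)
      batch.begin(8);                  // one reservation for all four packets

   for (unsigned packet = 0; packet < 4; packet++) {
      if (!single_block)
         batch.begin(2);               // separate reservation per packet
      if (packet == 0)
         flushes_at_start = batch.flush_count;
      batch.out(0x1000 + packet);      // placeholder header, not a real opcode
      batch.out(0);
   }
   return batch.flush_count - flushes_at_start;
}
```

With 12 of 16 dwords already used, the per-packet version flushes between the second and third packet, while the single reservation flushes once up front and keeps all eight dwords together.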
Chad Versace [Thu, 18 Jul 2013 17:07:30 +0000 (10:07 -0700)]
i965/hsw: Change L3 MOCS for depth, hiz, and stencil

Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Chad Versace [Thu, 18 Jul 2013 17:04:17 +0000 (10:04 -0700)]
i965/hsw: Change L3 MOCS of 3DSTATE_CONSTANT_VS/PS

Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

In blorp, change only the PS packet, because the VS packet is disabled.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Chad Versace [Thu, 18 Jul 2013 17:00:15 +0000 (10:00 -0700)]
i965/hsw: Change L3 MOCS of SURFACE_STAT

Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Chad Versace [Thu, 18 Jul 2013 16:58:06 +0000 (09:58 -0700)]
i965/hsw: Change L3 MOCS of 3DSTATE_VERTEX_BUFFERS

Change from "not cacheable" to "cacheable" in L3.
Do so for the draw upload path and blorp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Tomasz Lis [Wed, 17 Jul 2013 11:49:23 +0000 (13:49 +0200)]
glx: Enable floating-point fbconfig extensions

Signed-off-by: Tomasz Lis <listom@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Ian Romanick [Thu, 18 Jul 2013 22:13:45 +0000 (15:13 -0700)]
egl: Drop configs with unknown or invalid __DRI_ATTRIB_RENDER_TYPE

Some render types, such as floating-point, aren't valid with EGL.
Return NULL in those cases to drop them.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Tomasz Lis [Wed, 17 Jul 2013 11:49:21 +0000 (13:49 +0200)]
dri: Introduce new flags in __DRI_ATTRIB_RENDER_TYPE

Mark __DRI_ATTRIB_FLOAT_MODE as deprecated, and introduce new flags to
__DRI_ATTRIB_RENDER_TYPE for float modes.  Both signed float
(fbconfig_float) and unsigned (packed_float) are introduced. The old
attribute should be set for both float modes.

v2 (idr): Require that the render mode from the DRI attributes matches the
render mode of the config exactly.  This is the behavior of the old code.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tomasz Lis [Wed, 17 Jul 2013 11:49:20 +0000 (13:49 +0200)]
glx: Require proper drawableType in init_fbconfig_for_chooser

Make sure that init_fbconfig_for_chooser sets correct value of
drawableType for visual configs and fbconfigs.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tomasz Lis [Thu, 18 Jul 2013 21:19:38 +0000 (14:19 -0700)]
glx: Validate the GLX_RENDER_TYPE value

Correctly handle the value of renderType in GLX context.  In case of the
value being incorrect, context creation fails.

v2 (idr): indirect_create_context is just a memory allocator, so don't
validate the GLX_RENDER_TYPE there.  Fixes regressions in several
GLX_ARB_create_context piglit tests.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tomasz Lis [Wed, 17 Jul 2013 11:49:18 +0000 (13:49 +0200)]
glx: Store the RENDER_TYPE in indirect rendering

v2 (idr): Open-code the check for GLX_RENDER_TYPE.
dri2_convert_glx_attribs can't be called from here because that function
only exists in direct-rendering builds.  Also add a stub version of
indirect_create_context_attribs to tests/fake_glx_screen.cpp to prevent
'make check' regressions.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tomasz Lis [Wed, 17 Jul 2013 11:49:17 +0000 (13:49 +0200)]
glx: Handling RENDER_TYPE in glXCreateContext and init_fbconfig_for_chooser

Set the correct values of renderType in glXCreateContext and
init_fbconfig_for_chooser.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tomasz Lis [Wed, 17 Jul 2013 11:49:16 +0000 (13:49 +0200)]
glx: Changes to visual configs initialization.

Correctly handle the value of renderType and drawableType in
fbconfig. Modify glXInitializeVisualConfigFromTags to read the parameter
value, or detect it if it's not there.

v2 (idr): If there was no GLX_RENDER_TYPE property, set the type based
purely on the rgbMode as the previous code did.  It is impossible for
floatMode to be set at this point, so we can't have a float config.  The
previous code regressed a large number of piglit GLX tests because those
tests don't set GLX_RENDER_TYPE in the glXChooseConfig call.  Restoring
the old behavior for that case fixes those regressions.

Also fix handling of GLX_DONT_CARE for GLX_RENDER_TYPE.  Fixes a
regression in glx-dont-care-mask.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
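
The v2 fallback described above (no GLX_RENDER_TYPE property means the type is derived purely from rgbMode) amounts to something like the following sketch. The function and constant names are hypothetical; the two constants only mirror the values of GLX_RGBA_BIT and GLX_COLOR_INDEX_BIT, and the GLX_DONT_CARE handling is not modelled here.

```cpp
#include <cassert>

// Local stand-ins that mirror the GLX_RGBA_BIT and GLX_COLOR_INDEX_BIT values.
static const unsigned SKETCH_RGBA_BIT = 0x1;
static const unsigned SKETCH_COLOR_INDEX_BIT = 0x2;

// Uses the property when the server sent one; otherwise falls back to the
// old behaviour of looking only at rgbMode.
static unsigned resolve_render_type(bool has_render_type_property,
                                    unsigned property_value, bool rgb_mode)
{
   if (has_render_type_property)
      return property_value;
   return rgb_mode ? SKETCH_RGBA_BIT : SKETCH_COLOR_INDEX_BIT;
}
```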
Tomasz Lis [Wed, 17 Jul 2013 11:49:15 +0000 (13:49 +0200)]
glx: Retrieve the value of RENDER_TYPE from GLX attribs array

Make sure that context creation routines are provided with the value of
RENDER_TYPE retrieved from GLX attribs.

v2 (idr): Minor formatting changes.  Change type of
dri2_convert_glx_attribs render_type parameter to uint32_t to silence
some GCC warnings.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tomasz Lis [Wed, 17 Jul 2013 11:49:14 +0000 (13:49 +0200)]
glx: Store the value of renderType while creating context

Make sure that renderType property value is stored in GLX context while
it's being created.  Further patches will be provided to make the value
correspond to fbconfig's renderType.

v2 (idr): Move a hunk from the next patch to this patch to prevent a
build break.

Signed-off-by: Tomasz Lis <tomasz.lis@intel.com>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Wed, 10 Jul 2013 03:47:54 +0000 (20:47 -0700)]
i965: Add #defines for Memory Object Control State fields on Gen7-7.5.

The L3 controls are identical on all platforms, but LLC differs:
- Ivybridge has a "cache in LLC" flag
- Baytrail has no LLC, but instead has a snoop bit:
  "data accesses in this page must be snooped in the CPU caches."
- Haswell has writeback/uncached flags for LLC and eLLC (eDRAM).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Fabian Bieler [Fri, 14 Jun 2013 11:37:07 +0000 (13:37 +0200)]
glsl/linker: Use correct array length when linking inter-stage uniforms and varyings.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Fabian Bieler <fabianbieler@fastmail.fm>
Mike Frysinger [Tue, 5 Feb 2013 02:27:40 +0000 (21:27 -0500)]
gen_matypes: fix cross-compiling with gcc

The current gen_matypes logic assumes that the host compiler will produce
information that is useful for the target compiler.  Unfortunately, this
is not the case whenever cross-compiling.

When we detect that we're cross-compiling and using GCC, use the target
compiler to produce assembly from the gen_matypes.c source, then process
it with a shell script to create a usable header.  This is similar to how
the linux kernel creates its asm-offsets.c file.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Andreas Oberritter [Mon, 15 Apr 2013 20:46:06 +0000 (22:46 +0200)]
ax_prog_flex.m4: change grep syntax to accept e.g. flex.real

This is required in case a wrapper or symlink is used. This patch
has also been sent upstream, awaiting moderation.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andreas Oberritter <obi@saftware.de>
Jonathan Liu [Tue, 4 Jun 2013 13:03:55 +0000 (23:03 +1000)]
builtin_compiler/build: Avoid using libtool if cross compiling

Adds the dependencies of builtin_compiler as sources when cross
compiling instead of using libtool to share compilation with src/glsl.
The builtin_compiler executable is built for the host when cross
compiling so it doesn't make sense to share compilation with src/glsl
built for the target in this case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44618
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jonathan Liu <net147@gmail.com>
Kenneth Graunke [Wed, 1 May 2013 00:54:23 +0000 (17:54 -0700)]
i965: Add MOCS shift and mask for SURFACE_STATE entries.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Roland Scheidegger [Thu, 18 Jul 2013 00:10:27 +0000 (02:10 +0200)]
llvmpipe: clamp inputs for srgb render buffers

Usually with fixed point renderbuffers clamping is done as part of conversion.
However, since we blend in float format, we essentially skip all conversion
steps pre-blend but since this is still a fixed point renderbuffer we must
still clamp the inputs in this case. Makes no difference for piglit though.
Obviously we could skip this if fragment color clamping is enabled, but a)
this is deprecated in OpenGL (d3d never had it) and b) we don't support it
natively so it gets baked into the shader.
Also add some comment about logic ops being broken for srgb, luckily no test
tries to do that as there's no easy fix...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
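
A scalar sketch of why the pre-blend clamp matters when blending happens in float but the render target is fixed point. It assumes an additive (ONE, ONE) blend and a [0, 1] target range; the function names are hypothetical and this is not the llvmpipe code, which generates vectorised IR instead.

```cpp
#include <algorithm>
#include <cassert>

// Clamp to the range representable by an unsigned normalised render target.
static float clamp_unorm(float v)
{
   return std::min(std::max(v, 0.0f), 1.0f);
}

// Additive blend carried out in float.  The final clamp models the
// conversion back to the fixed-point format; clamp_src models the extra
// pre-blend clamp of the fragment colour.
static float blend_add(float src, float dst, bool clamp_src)
{
   if (clamp_src)
      src = clamp_unorm(src);
   return clamp_unorm(src + dst);
}
```

With src = -0.5 and dst = 0.75, skipping the input clamp yields 0.25, while clamping first yields 0.75, which is what a fixed-point target should produce.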
Roland Scheidegger [Thu, 18 Jul 2013 00:05:34 +0000 (02:05 +0200)]
llvmpipe: fix blending with SRC_ALPHA_SATURATE with some formats without alpha

We were fixing up the blend factor to ZERO, however this only works correctly
with fixed point render buffers where the input values are clamped to 0/1
(because src_alpha_saturate is min(As, 1-Ad) so can be negative with unclamped
inputs). Haven't seen any failure anywhere due to that with fixed point SNORM
buffers (which clamp inputs to -1/1) but it should apply there as well (snorm
blending is rare, even opengl 4.3 doesn't require snorm rendertargets at all,
d3d10 requires them but they are not blendable).
Doesn't look like piglit hits this though (some internal testing hits the
float case at least). (With legacy OpenGL we could theoretically still use the
fixup to zero if the fragment color clamp is enabled, but we can't detect that
easily since we don't support native clamping hence it gets baked into the
shader.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
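
The arithmetic behind this fix is small enough to write out. A destination format without alpha reads back Ad = 1, so the factor min(As, 1-Ad) becomes min(As, 0). That is zero only when As is non-negative. The helper name below is hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// RGB blend factor for SRC_ALPHA_SATURATE, as given in the commit message.
static float src_alpha_saturate_factor(float src_alpha, float dst_alpha)
{
   return std::min(src_alpha, 1.0f - dst_alpha);
}
```

For a clamped input such as As = 0.25 the factor is 0, so substituting ZERO is safe. For an unclamped float input such as As = -0.5 the factor is -0.5, and the ZERO substitution gives the wrong result.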
Marek Olšák [Tue, 16 Jul 2013 20:48:48 +0000 (22:48 +0200)]
r600g: use WAIT_3D_IDLE before using CP DMA

I broke this with 7948ed1250cae78ae1b22dbce4ab23aceacc6159 for r700 at least.

Jonathan Gray [Thu, 18 Jul 2013 06:44:25 +0000 (16:44 +1000)]
r300g: make use of gallium's os_get_process_name()

Lets the code compile on non-Linux systems.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Jean-Sébastien Pédron [Wed, 5 Jun 2013 11:27:37 +0000 (13:27 +0200)]
configure.ac: On some systems, "x86-64" is called "amd64"

For instance, this is the case on FreeBSD.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Ilia Mirkin [Tue, 16 Jul 2013 21:50:43 +0000 (17:50 -0400)]
nv50: H.264/MPEG2 decoding support via VP2, available on NV84-NV96, NVA0

Adds H.264 and MPEG2 codec support via VP2, using firmware from the
blob. Acceleration is supported at the bitstream level for H.264 and
IDCT level for MPEG2.

Known issues:
 - H.264 interlaced doesn't render properly
 - H.264 shows very occasional artifacts on a small fraction of videos
 - MPEG2 + VDPAU shows frequent but small artifacts, which aren't there
   when using XvMC on the same videos

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Jonathan Gray [Thu, 20 Jun 2013 10:14:33 +0000 (20:14 +1000)]
configure.ac: make grep tests more portable

Use grep -w instead of the empty string escape sequences
which are less portable.  Makes the grep tests
function as intended on OpenBSD.

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Jonathan Gray [Wed, 26 Jun 2013 07:11:57 +0000 (17:11 +1000)]
configure.ac: add OpenBSD

Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Vinson Lee <vlee@freedesktop.org>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Vinson Lee [Thu, 18 Jul 2013 03:51:50 +0000 (20:51 -0700)]
glsl: Remove comma at end of enumerator list.

Fixes this build error on OpenBSD 5.3.

In file included from ../../src/mesa/main/ff_fragment_shader.cpp:53:
./../glsl/ir_optimization.h:64: error: comma at end of enumerator list

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Vinson Lee [Thu, 18 Jul 2013 03:42:03 +0000 (20:42 -0700)]
mesa: Remove commas at end of enumerator lists.

Fixes these build errors on OpenBSD 5.3.

In file included from ../../src/mesa/main/errors.h:47,
                 from ../../src/mesa/main/imports.h:41,
                 from ../../src/mesa/main/ff_fragment_shader.cpp:32:
../../src/mesa/main/mtypes.h:3286: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3296: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3303: error: comma at end of enumerator list
../../src/mesa/main/mtypes.h:3356: error: comma at end of enumerator list

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Carl Worth [Thu, 18 Jul 2013 03:10:50 +0000 (20:10 -0700)]
docs: Import 9.1.5 release notes

And add news item for the release.

Roland Scheidegger [Wed, 17 Jul 2013 16:13:41 +0000 (18:13 +0200)]
gallivm: (trivial) simplify lp_build_cos/lp_build_sin a tiny bit

Use "or" instead of "add" (this is a classic select sequence, which at
least newer llvm versions can actually recognize (3.2+?), and the "add"
might prevent that - and we really don't want an add instead of an or with
avx if it isn't recognized (even without avx logic ops might be cheaper)).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
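
The "classic select sequence" mentioned above is presumably the and/andnot/or idiom. A scalar sketch follows, with hypothetical function names; the real code builds LLVM IR on vectors rather than operating on plain integers. Because the two masked terms never share a set bit, "add" and "or" produce the same value, but only the "or" form matches the pattern an optimiser looks for.

```cpp
#include <cassert>
#include <stdint.h>

// Picks bits of a where mask is set and bits of b elsewhere.
static uint32_t select_or(uint32_t mask, uint32_t a, uint32_t b)
{
   return (a & mask) | (b & ~mask);
}

// Same result, since (a & mask) and (b & ~mask) have no bits in common and
// the addition therefore never carries.
static uint32_t select_add(uint32_t mask, uint32_t a, uint32_t b)
{
   return (a & mask) + (b & ~mask);
}
```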
Roland Scheidegger [Wed, 17 Jul 2013 16:13:10 +0000 (18:13 +0200)]
util/u_format_s3tc: handle srgb formats correctly.

Instead of just ignoring the srgb/linear conversions, simply call the
corresponding conversion functions, for all of pack/unpack/fetch,
both for float and unorm8 versions (though some don't make a whole
lot of sense, i.e. unorm8/unorm8 srgb/linear combinations).
Refactored some functions a bit so we don't have to duplicate all the code
(there's a slight change for packing dxt1_rgb, as there will now be
always 4 components initialized and sent to the external compression
function so the same code can be used for all, the quite horrid and
ad-hoc interface (by now) should always have worked with that).

Fixes llvmpipe/softpipe piglit texwrap GL_EXT_texture_sRGB-s3tc.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
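
For reference, the sRGB/linear conversion that the srgb S3TC formats need on unpack and pack is the standard piecewise curve sketched below. This uses the textbook formula directly and hypothetical function names; it is not necessarily how the util conversion helpers called by the patch are implemented.

```cpp
#include <cassert>
#include <cmath>

// Standard sRGB decode: linear segment near black, power curve elsewhere.
static float srgb_to_linear(float c)
{
   return c <= 0.04045f ? c / 12.92f
                        : std::pow((c + 0.055f) / 1.055f, 2.4f);
}

// Standard sRGB encode, the inverse of the function above.
static float linear_to_srgb(float l)
{
   return l <= 0.0031308f ? l * 12.92f
                          : 1.055f * std::pow(l, 1.0f / 2.4f) - 0.055f;
}

// Decode for an 8-bit unorm channel, as an unpack or fetch path would need.
static float srgb8_to_linear(unsigned char v)
{
   return srgb_to_linear(v / 255.0f);
}
```

Only the colour channels go through this curve; alpha stays linear.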
Vadim Girlin [Wed, 17 Jul 2013 14:29:56 +0000 (18:29 +0400)]
r600g/sb: improve alu packing on cayman

Scheduler/register allocator in r600-sb was developed and optimized
on evergreen (VLIW-5) hardware, so currently it's not optimal for
VLIW-4 chips.
This patch should improve performance on cayman gpus due to better alu
packing, but it also tends to increase register usage, so an overall positive
effect on performance has yet to be proven by real benchmarks.

Some results with bfgminer kernel on cayman:
source bytecode:       60 gprs, 3905 alu groups,
sbcl before the patch: 45 gprs, 4088 alu groups,
sbcl with this patch:  55 gprs, 3474 alu groups.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Vadim Girlin [Tue, 16 Jul 2013 08:28:52 +0000 (12:28 +0400)]
r600g/sb: fix handling of new multislot instructions on cayman

Ex-scalar instructions that became multislot on cayman replicate their result
to all channels; handle them similarly to DOT4.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Vadim Girlin [Wed, 17 Jul 2013 08:10:40 +0000 (12:10 +0400)]
r600g/sb: fix debug dump code in scheduler

Update the stale debug code for other changes related to debug output.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Vadim Girlin [Wed, 17 Jul 2013 08:05:32 +0000 (12:05 +0400)]
r600g/sb: fix initial register allocation

Mark values that are members of the 'same register' constraint as
preallocated in the ra_init pass; this prevents incorrect
reallocation in the scheduler in some cases.

Should fix https://bugs.freedesktop.org/show_bug.cgi?id=66713

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Vadim Girlin [Tue, 16 Jul 2013 10:45:29 +0000 (14:45 +0400)]
r600g/sb: move chip & class name functions to sb_context

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Vadim Girlin [Wed, 17 Jul 2013 08:00:43 +0000 (12:00 +0400)]
r600g/sb: fix handling of PS in source bytecode on cayman

Actually PS doesn't make sense for cayman and isn't even mentioned in
cayman docs, but llvm backend currently uses it in bytecode and, assuming
that hw seems to be mostly ok with it, this will allow sb to parse such
source bytecode correctly.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Vinson Lee [Sat, 13 Jul 2013 06:41:08 +0000 (23:41 -0700)]
r600g/sb: Initialize ra_checker member variables.

Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Emil Velikov [Mon, 8 Jul 2013 18:56:35 +0000 (19:56 +0100)]
gallium/util: use explicitly sized types for {un, }pack_rgba_{s, u}int

Every function but the above four uses explicitly sized types for their
src and dst arguments. Even fetch_rgba_{s,u}int follows the convention.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
Kyle McMartin [Mon, 15 Jul 2013 14:51:15 +0000 (10:51 -0400)]
llvmpipe: use MCJIT on ARM and AArch64

MCJIT is the only supported LLVM JIT on AArch64 and ARM (the regular
JIT has bit-rotted badly on ARM and doesn't exist on AArch64.)

Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
Kenneth Graunke [Sat, 13 Jul 2013 21:44:45 +0000 (14:44 -0700)]
glsl: Fix absurd whitespace conventions in the parser.

Historically, we indented grammar production rules with a single 8-space
tab, but code inside of blocks used Mesa's 3-space indents.

This meant when editing code, you had to use an 8-space tab for the
first level of indentation, and 3-spaces after that.  Unless you
specifically configure your editor to understand this, it will get the
indentation wrong on every single line you touch, which quickly devolves
into a colossal waste of time.

It's also inconsistent with every other file in the entire project.

This patch removes all tabs and moves to a consistent 3-space indent.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Kenneth Graunke [Sat, 13 Jul 2013 06:18:44 +0000 (23:18 -0700)]
glsl: Fail the build if the grammar contains shift/reduce errors.

When working on a parser, it's very easy to accidentally introduce
new shift/reduce conflicts.  Failing the build guarantees they'll
be noticed and fixed.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Kenneth Graunke [Sat, 13 Jul 2013 06:10:14 +0000 (23:10 -0700)]
glsl: Silence the last shift/reduce conflict warning in the grammar.

The single remaining shift/reduce conflict was the classic ELSE problem:

  292 selection_rest_statement: statement . ELSE statement
  293                         | statement .

    ELSE  shift, and go to state 479

    ELSE      [reduce using rule 293 (selection_rest_statement)]
    $default  reduce using rule 293 (selection_rest_statement)

The correct behavior here is to shift, which is what happens by default.
However, resolving it explicitly will make it possible to fail the build
on new errors, making them much easier to detect.

The classic way to solve this is to use right associativity:
http://www.gnu.org/software/bison/manual/html_node/Non-Operators.html

Since there is no THEN token in GLSL, we need to fake one.  %right THEN
creates a new terminal symbol; the %prec directive says to use the
precedence of that terminal.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
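
As an analogy for what "shift" means in the ELSE conflict: C++ has the same ambiguity and resolves it the same way, attaching each else to the nearest unmatched if. The function below is a hypothetical demonstration of that binding, not part of the GLSL grammar, and compilers may warn about the deliberately missing braces.

```cpp
#include <cassert>

// The else belongs to `if (b)`, not to `if (a)`.  If it bound to the outer
// if instead, (false, anything) would return 2 rather than 0.
static int nearest_if_wins(bool a, bool b)
{
   int result = 0;
   if (a)
      if (b)
         result = 1;
      else
         result = 2;
   return result;
}
```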
Vinson Lee [Sun, 14 Jul 2013 07:57:22 +0000 (00:57 -0700)]
glsl: Initialize ast_jump_statement::opt_return_value.

opt_return_value was not initialized if mode != ast_return.

Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
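
The class of defect fixed here can be sketched as follows: a pointer member that is only assigned on one branch of the constructor needs a default in the initialiser list. The types and names below are hypothetical simplifications, not the real ast_jump_statement.

```cpp
#include <cassert>
#include <cstddef>

struct expression_sketch {
   int value;
};

enum jump_mode_sketch { JUMP_CONTINUE, JUMP_BREAK, JUMP_RETURN, JUMP_DISCARD };

struct jump_statement_sketch {
   jump_mode_sketch mode;
   expression_sketch *opt_return_value;

   // opt_return_value starts out NULL so that it holds a defined value for
   // every mode, not only for JUMP_RETURN.
   jump_statement_sketch(jump_mode_sketch m, expression_sketch *return_value)
      : mode(m), opt_return_value(NULL)
   {
      if (mode == JUMP_RETURN)
         opt_return_value = return_value;
   }
};
```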
Vinson Lee [Sat, 13 Jul 2013 00:01:57 +0000 (17:01 -0700)]
glapi: Do not use backtrace on OpenBSD.

execinfo.h is not available on OpenBSD.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Maarten Lankhorst [Tue, 16 Jul 2013 08:18:38 +0000 (10:18 +0200)]
osmesa: link against static libglapi library too to get the gl exports

This should fix missing symbols in an osmesa built against shared glapi.
All OpenGL exports that are defined in the static glapi were missing,
so link against both to fix this.

This is a candidate for the stable series.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=47824
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Chris Forbes [Sun, 14 Jul 2013 06:30:52 +0000 (18:30 +1200)]
i965/Gen4: Zero extra coordinates for ir_tex

We always emit U,V,R coordinates for this message, but the sampler gets
very angry if we pass garbage in the R coordinate for at least some
texture formats.

Fill the remaining coordinates with zero instead.

Fixes broken rendering on GM45 in Source games, and in VDrift.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65236

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
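
The fix amounts to padding the coordinate with zeros up to the three components the message always carries. A scalar sketch, with a hypothetical function name and plain floats standing in for the generated register moves:

```cpp
#include <cassert>

// Copies the components the shader supplied and zero-fills the rest, so the
// U, V, R payload never contains garbage.
static void fill_uvr(const float *coords, unsigned num_components, float out[3])
{
   assert(num_components <= 3);
   for (unsigned i = 0; i < 3; i++)
      out[i] = i < num_components ? coords[i] : 0.0f;
}
```

For a 2D texture lookup this leaves U and V untouched and sets R to zero.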
Kenneth Graunke [Wed, 10 Jul 2013 23:10:28 +0000 (16:10 -0700)]
i965: Cite the Ivybridge PRM for 3DSTATE_CLEAR_PARAMS notes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Wed, 10 Jul 2013 23:07:14 +0000 (16:07 -0700)]
i965: Refer people to brw_tex_layout.c rather than the BSpec.

brw_tex_layout.c sets up the align_w/h fields, and has all the
appropriate spec references already.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Wed, 10 Jul 2013 23:01:35 +0000 (16:01 -0700)]
i965: Remove old BSpec reference from BLORP's 3DSTATE_WM/PS packets.

The Sandybridge code had a citation for the range of the "Maximum Number
of Threads" field, and the Ivybridge code just mentioned the "BSpec" in
general.  That's documented in the obvious place, so people can find it
without a spec reference.

The real value of the comment is to say "we tried zero, and it exploded,
so program it to a valid number even if pixel shading is off."

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Wed, 10 Jul 2013 22:45:10 +0000 (15:45 -0700)]
i965: Cite the Ivybridge PRM for 3DSTATE_URB_* programming.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Wed, 10 Jul 2013 22:41:35 +0000 (15:41 -0700)]
i965: Update workaround flush comments for Gen6 3DSTATE_VS.

Unfortunately, the workaround text never made it into the Sandybridge
PRM, so we still have to refer to the BSpec.

It also wasn't obvious why we needed this workaround at all, since we
don't currently do VS passthrough - but BLORP can turn off the VS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Wed, 10 Jul 2013 20:39:19 +0000 (13:39 -0700)]
i965: Cite the Ivybridge PRM for VS PIPE_CONTROL workarounds.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Wed, 10 Jul 2013 20:35:31 +0000 (13:35 -0700)]
i965: Cite the Sandybridge PRM for Gen7 stencil pitch requirements.

Sadly, the Ivybridge PRM can't be cited, as it is missing the relevant
text for some reason.  However, the Sandybridge PRM has the text Chad
originally quoted, and the modern BSpec has the same text.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Wed, 10 Jul 2013 20:27:40 +0000 (13:27 -0700)]
i965: Cite the Ivybridge PRM for multisample surface format notes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Wed, 10 Jul 2013 20:22:00 +0000 (13:22 -0700)]
i965: Delete "the data cache is the sampler cache" comments on Gen7+.

I cut and pasted these comments from the Gen4 code during Ivybridge
enabling, and didn't understand what they meant at the time.

The data cache is NOT the same as the sampler cache on Ivybridge.
The sampler cache has L1 and L2 caches in addition to the L3 cache,
while data port messages to the "data cache" hit L3 directly.

This means that the sampler domain is technically wrong, but we stopped
caring about read/write domains quite a while ago.  The kernel just
flushes all the caches at the end of each batchbuffer, and our render to
texture code flushes the sampler caches when necessary.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Cite the 965 PRM for "the data cache is the sampler cache".
Kenneth Graunke [Wed, 10 Jul 2013 20:18:34 +0000 (13:18 -0700)]
i965: Cite the 965 PRM for "the data cache is the sampler cache".

Presumably, this comment exists to justify the usage of
I915_GEM_DOMAIN_SAMPLER for this relocation.  At one point, this was
necessary to ensure that the right flushing was done to keep caches
coherent.  These days, the kernel just flushes everything, so I don't
think it matters.

Still, the comment is interesting, so leave it in place.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Cite the Ivybridge PRM for DP message descriptor fields.
Kenneth Graunke [Wed, 10 Jul 2013 20:17:42 +0000 (13:17 -0700)]
i965: Cite the Ivybridge PRM for DP message descriptor fields.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Cite the Ivybridge PRM for why the fake MRF range is what it is.
Kenneth Graunke [Wed, 10 Jul 2013 20:16:13 +0000 (13:16 -0700)]
i965: Cite the Ivybridge PRM for why the fake MRF range is what it is.

The exact text is in the public docs, so we should cite those.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Cite the Ivybridge PRM for SFID enum values.
Kenneth Graunke [Wed, 10 Jul 2013 20:10:55 +0000 (13:10 -0700)]
i965: Cite the Ivybridge PRM for SFID enum values.

The Ivybridge PRM adds new SFIDs and lists them in a different volume
than Sandybridge, so it's worth adding a reference.

I also removed the BSpec reference, as the section it referred to
was moved somewhere, and I couldn't find it.  This leaves one Haswell
SFID without a citation, but we can add one once the PRMs are out.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agollvmpipe: support sRGB framebuffers
Roland Scheidegger [Mon, 15 Jul 2013 23:52:29 +0000 (01:52 +0200)]
llvmpipe: support sRGB framebuffers

Just use the new conversion functions to do the work. The way it's plugged
into the blend code is quite hacktastic but follows all the same hacks
as used by the packed float format already.
Only support 4x8bit srgb formats (rgba/rgbx plus swizzle); 24bit formats never
worked anyway in the blend code and are thus disabled, and I don't think anyone
is interested in L8/L8A8. Would need even more hacks otherwise.
Unless I'm missing something, this is the last feature except MSAA needed for
OpenGL 3.0, and for OpenGL 3.1 as well I believe.

v2: prettify a bit, use separate function for packing.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years agoRevert "r300g: allow HiZ with a 16-bit zbuffer"
Marek Olšák [Mon, 15 Jul 2013 21:39:39 +0000 (23:39 +0200)]
Revert "r300g: allow HiZ with a 16-bit zbuffer"

This reverts commit 631c631cbf5b7e84e42a7cfffa1c206d63143370.

https://bugs.freedesktop.org/show_bug.cgi?id=66921

Cc: mesa-stable@lists.freedesktop.org
10 years agor300g/swtcl: fix a lockup in MSAA resolve
Marek Olšák [Mon, 15 Jul 2013 01:53:09 +0000 (03:53 +0200)]
r300g/swtcl: fix a lockup in MSAA resolve

Cc: mesa-stable@lists.freedesktop.org
10 years agor300g/swtcl: fix geometry corruption by uploading indices to a buffer
Marek Olšák [Mon, 15 Jul 2013 00:42:44 +0000 (02:42 +0200)]
r300g/swtcl: fix geometry corruption by uploading indices to a buffer

The splitting of a draw call into several draw commands was broken, because
the split sometimes took place in the middle of a primitive. The splitting
was supposed to be dealing with the case when there are more indices than
the maximum size of a CS.

This commit throws that code away and uses a real index buffer instead.

https://bugs.freedesktop.org/show_bug.cgi?id=66558

Cc: mesa-stable@lists.freedesktop.org
10 years agoglsl: Reject C-style initializers with unknown types.
Matt Turner [Fri, 12 Jul 2013 18:05:38 +0000 (11:05 -0700)]
glsl: Reject C-style initializers with unknown types.

_mesa_ast_set_aggregate_type walks through declarations initialized with
C-style aggregate initializers and stops when it runs out of LHS
declarations or RHS expressions.

In the example

   vec4 v = {{{1, 2, 3, 4}}};

_mesa_ast_set_aggregate_type would not recurse into the subexpressions
(since vec4s do not contain types that can be initialized with an
aggregate initializer) to set their <constructor_type>s. Later in ::hir
we would dereference the NULL pointer and segfault.

If <constructor_type> is NULL in ::hir we know that the LHS and RHS
were unbalanced and the code is illegal.

Arrays, structs, and matrices were unaffected.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoglsl: Rework builtin_variables.cpp to reduce code duplication.
Paul Berry [Sun, 7 Jul 2013 19:44:57 +0000 (12:44 -0700)]
glsl: Rework builtin_variables.cpp to reduce code duplication.

Previously, we had a separate function for setting up the built-in
variables for each combination of shader stage and GLSL version
(e.g. generate_110_vs_variables to generate the built-in variables for
GLSL 1.10 vertex shaders).  The functions called each other in ad-hoc
ways, leading to unexpected inconsistencies (for example,
generate_120_fs_variables was called for GLSL versions 1.20 and above,
but generate_130_fs_variables was called only for GLSL version 1.30).
In addition, it led to a lot of code duplication, since many varyings
had to be duplicated in both the FS and VS code paths.  With the
advent of geometry shaders (and later, tessellation control and
tessellation evaluation shaders), this code duplication was going to
get a lot worse.

So this patch reworks things so that instead of having a separate
function for each shader type and GLSL version, we have a function for
constants, one for uniforms, one for varyings, and one for the special
variables that are specific to each shader type.

In addition, we use a class, builtin_variable_generator, to keep track
of the instruction exec_list, the GLSL parse state, commonly-used
types, and a few other variables, so that we don't have to pass them
around as function arguments.  This makes the code a lot more compact.

Where it was feasible to do so without introducing compilation errors,
I've also gone ahead and introduced the variables needed for
{ARB,EXT}_geometry_shader4 style geometry shaders.  This patch takes
care of everything except the GS variable gl_VerticesIn, the FS
variable gl_PrimitiveID, and GLSL 1.50 style geometry shader inputs
(using the gl_in interface block).  Those remaining features will be
added later.

I've also made a slight nomenclature change: previously we used the
word "deprecated" to refer to variables which are marked in GLSL 1.40
as requiring the ARB_compatibility extension, and are marked in GLSL
1.50 onward as requiring the compatibility profile.  This was
misleading, since not all deprecated variables require the
compatibility profile (for example gl_FragData and gl_FragColor, which
have been deprecated since GLSL 1.30, but do not require the
compatibility profile until GLSL 4.20).  We now consistently use the
word "compatibility" to refer to these variables.

This patch doesn't introduce any functional changes (since geometry
shaders haven't been enabled yet).

Reviewed-by: Matt Turner <mattst88@gmail.com>
v2: Rename "typ" -> "type".  Add blank line between inline functions
and declarations in builtin_variable_generator class.  Use the
standard comment "/* FALLTHROUGH */" for compatibility with static
code analysis tools.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoglsl: Fix lower_named_interface_blocks to account for dereferences of consts.
Paul Berry [Sun, 14 Jul 2013 15:57:49 +0000 (08:57 -0700)]
glsl: Fix lower_named_interface_blocks to account for dereferences of consts.

In certain rare cases (such as those involving dereference of a
literal constant array of structs),
flatten_named_interface_blocks_declarations's rvalue visitor may be
invoked on an ir_dereference_record whose variable_referenced() method
returns NULL.

Check for this case to avoid a segfault.

Prevents crashes in piglit tests
{vs,fs}-deref-literal-array-of-structs.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
10 years agoglsl: Don't allow vertex shader input arrays until GLSL 1.50.
Paul Berry [Thu, 11 Jul 2013 22:40:11 +0000 (15:40 -0700)]
glsl: Don't allow vertex shader input arrays until GLSL 1.50.

Vertex shader inputs are not allowed to be arrays until GLSL 1.50.  We
were accidentally enabling them for GLSL 1.40 (although we haven't
written any tests for them, so it's not clear whether they actually
work).

NOTE: although this is a simple bug fix, it probably isn't sensible to
cherry-pick it to stable release branches, since its only effect is to
cause incorrectly-written shaders to fail to compile.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Gen4/5: use IEEE floating point mode for GLSL shaders.
Chris Forbes [Sun, 9 Jun 2013 20:01:41 +0000 (08:01 +1200)]
i965: Gen4/5: use IEEE floating point mode for GLSL shaders.

Fixes isinf(), isnan() from GLSL 1.30

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965/vs: Gen4/5: enable front colors if back colors are written
Chris Forbes [Sun, 7 Jul 2013 11:13:07 +0000 (23:13 +1200)]
i965/vs: Gen4/5: enable front colors if back colors are written

Fixes undefined results if a back color is written, but the
corresponding front color is not, and only backfacing primitives are
drawn. Results are still undefined if a frontfacing primitive is drawn,
but that's OK.

The other reasonable way to fix this would have been to just pick
the one color slot that was populated, but that dilutes the value of
the tests.

On Gen6+, the fixed function clipper and triangle setup already take
care of this.

Fixes 11 piglits:
spec/glsl-1.10/execution/interpolation/interpolation-none-gl_Back*Color-*

NOTE: This is a candidate for stable branches.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agogallivm: (trivial) use constant instead of exp2f() function
Roland Scheidegger [Sun, 14 Jul 2013 00:38:13 +0000 (02:38 +0200)]
gallivm: (trivial) use constant instead of exp2f() function

Some lame compilers can't do exp2f() and as far as I can tell they can't do
exp2() (with doubles) either. Providing a workaround wouldn't actually be too
bad (just replace it with pow), but since it is used with a constant only,
just use the precalculated constant.

10 years agoilo: skip 3DSTATE_INDEX_BUFFER when possible
Chia-I Wu [Sat, 13 Jul 2013 19:56:44 +0000 (03:56 +0800)]
ilo: skip 3DSTATE_INDEX_BUFFER when possible

When only the offset to the index buffer is changed, we can skip the
3DSTATE_INDEX_BUFFER if we always use 0 for the offset, and add
(offset / index_size) to Start Vertex Location in 3DPRIMITIVE.
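
Illustration only, not code from the ilo driver: a minimal sketch of the
arithmetic described above. The helper name and parameters are hypothetical,
and the sketch assumes the byte offset is a multiple of the index size.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the ilo driver.
 *
 * Folds a byte offset into the index buffer into the start value of the
 * draw, so the buffer itself can always be programmed with offset 0.
 * Assumes offset_bytes is a multiple of index_size. */
static uint32_t
fold_index_offset(uint32_t draw_start, uint32_t offset_bytes,
                  uint32_t index_size)
{
   return draw_start + offset_bytes / index_size;
}
```

For example, a 24-byte offset into a buffer of 16-bit indices moves the
start by 12 indices.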

10 years agogallivm: handle srgb-to-linear and linear-to-srgb conversions
Roland Scheidegger [Sat, 13 Jul 2013 15:31:52 +0000 (17:31 +0200)]
gallivm: handle srgb-to-linear and linear-to-srgb conversions

srgb-to-linear uses a 3rd degree polynomial for now, which should be _just_
good enough. The reverse uses some rational polynomials and is quite accurate,
though it is not hooked into llvmpipe's blend code yet and hence unused (untested).
Using a table might also be an option (for srgb-to-linear especially).
This does not enable any new features yet because EXT_texture_srgb was already
supported via util_format fallbacks, but performance was lacking probably due
to the external function call (the table used by the util_format_srgb code may
not be all that much slower on its own).
Some performance figures (taken from modified gloss, replaced both base and
sphere texture to use GL_SRGB instead of GL_RGB, measured on 1 GHz Sandy Bridge,
the numbers aren't terribly accurate):

normal gloss, aos, 8-wide: 47 fps
normal gloss, aos, 4-wide: 48 fps

normal gloss, forced to soa, 8-wide: 48 fps
normal gloss, forced to soa, 4-wide: 47 fps

patched gloss, old code, soa, 8-wide: 21 fps
patched gloss, old code, soa, 4-wide: 24 fps

patched gloss, new code, soa, 8-wide: 41 fps
patched gloss, new code, soa, 4-wide: 38 fps

So there's a performance hit but it seems acceptable, certainly better
than using the fallback.
Note the new code only works for 4x8bit srgb formats, others (L8/L8A8) will
continue to use the old util_format fallback, because I can't be bothered
to write code for formats no one uses anyway (as decoding is done as part of
lp_build_unpack_rgba_soa which can only handle block type width of 32).
Compressed srgb formats should get their own path though eventually (it is
going to be expensive in any case, first decompress, then convert).
No piglit regressions.

v2: use lp_build_polynomial instead of ad-hoc polynomial construction. Also,
since both linear-to-srgb functions are kept for now, make sure both are
compiled (since they share quite some code, just integrate them into the same
function).

v3: formatting fixes and bugfix in the complicated (disabled) linear-to-srgb
path.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years agogallivm: better support for fast rsqrt
Roland Scheidegger [Thu, 11 Jul 2013 21:15:44 +0000 (23:15 +0200)]
gallivm: better support for fast rsqrt

We had to disable fast rsqrt before because it wasn't precise enough etc.
However, in situations where we know we're not going to need more precision,
we can still use a fast rsqrt (which can be several times faster than
the quite expensive sqrt). Hence introduce a new helper which does exactly
that. It is probably not useful to call it in some situations if there's
no fast rsqrt available, so make its availability queryable too.

v2: use fast_rsqrt consistently instead of rsqrt_fast, fix indentation,
let rsqrt use fast_rsqrt.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years agoconfigure.ac: better detection of LLVM version
Klemens Baum [Thu, 27 Jun 2013 21:13:37 +0000 (23:13 +0200)]
configure.ac: better detection of LLVM version

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agor600g/sb: Initialize ra_constraint::cost.
Vinson Lee [Sat, 13 Jul 2013 02:21:41 +0000 (19:21 -0700)]
r600g/sb: Initialize ra_constraint::cost.

Fixes "Uninitialized scalar field" reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
10 years agoglsl: Initialize ast_aggregate_initializer::constructor_type.
Vinson Lee [Sat, 13 Jul 2013 00:16:47 +0000 (17:16 -0700)]
glsl: Initialize ast_aggregate_initializer::constructor_type.

Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl: Make gl_TexCoord compatibility-only
Paul Berry [Sun, 7 Jul 2013 18:49:22 +0000 (11:49 -0700)]
glsl: Make gl_TexCoord compatibility-only

gl_TexCoord was deprecated in GLSL 1.30.  In GLSL 1.40 it was marked
as ARB_compatibility-only, and in GLSL 1.50 and above it was marked as
only appearing in the compatibility profile.  It has never appeared in
GLSL ES.

However, Mesa erroneously included it in all desktop versions of GLSL,
even versions 1.40 and 1.50 (which do not currently support the
compatibility profile).  This patch makes gl_TexCoord available in the
compatibility profile (and GLSL versions 1.30 and prior) only.

NOTE: although this is a simple bug fix, it probably isn't sensible to
cherry-pick it to stable release branches, since its only effect is to
cause incorrectly-written shaders to fail to compile.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl ES: Fix magnitude of gl_MaxVertexUniformVectors.
Paul Berry [Sun, 7 Jul 2013 18:47:22 +0000 (11:47 -0700)]
glsl ES: Fix magnitude of gl_MaxVertexUniformVectors.

Previously, we set it equal to MaxVertexUniformComponents.  It should
be MaxVertexUniformComponents / 4.

NOTE: This is a candidate for the stable branches.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
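
Illustration of the gl_MaxVertexUniformVectors fix above: each uniform vector
holds four components, so the vector limit is the component limit divided by
four. The helper name is hypothetical; this is a sketch, not the code from
the patch.

```c
/* Hypothetical helper: converts a limit expressed in scalar components
 * into a limit expressed in 4-component vectors. */
static unsigned
max_uniform_vectors(unsigned max_uniform_components)
{
   return max_uniform_components / 4;
}
```

With a component limit of 4096, the vector limit is 1024.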
10 years agowinsys/radeon: allow a NULL cs pointer in radeon_bo_map to fix a segfault
Marek Olšák [Fri, 12 Jul 2013 22:19:55 +0000 (00:19 +0200)]
winsys/radeon: allow a NULL cs pointer in radeon_bo_map to fix a segfault

The original idea was that cs=NULL should be allowed here, but we never used
NULL until 862f69fbe1e54e0e9a3c439450a14f. This fixes a segfault in CoreBreach.

10 years agoilo: move a sanity check into its assert()
Chia-I Wu [Fri, 12 Jul 2013 23:22:24 +0000 (07:22 +0800)]
ilo: move a sanity check into its assert()

The compiler does not know that ilo_3d_pipeline_estimate_size() is pure and
can be eliminated in a release build in gen6_pipeline_end().  Move the call
into the assert().
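
Illustration only, not the ilo code: a minimal sketch of the difference
between calling a function before assert() and calling it inside assert().
All names are hypothetical; the call counter exists only to make the call
observable.

```c
#include <assert.h>

/* Hypothetical stand-in for ilo_3d_pipeline_estimate_size(); the counter
 * only exists to make the call observable in this sketch. */
static int estimate_calls;

static int
estimate_size(int n_states)
{
   estimate_calls++;
   return n_states * 16;
}

/* Before: the call sits outside assert(), so it is made even when NDEBUG
 * is defined and only the comparison disappears. */
static void
end_with_call_outside(int used, int n_states)
{
   const int estimate = estimate_size(n_states);

   assert(used <= estimate);
   (void) estimate; /* silences the unused-variable warning under NDEBUG */
}

/* After: with NDEBUG defined, assert() expands to nothing and the call is
 * never evaluated. */
static void
end_with_call_inside(int used, int n_states)
{
   assert(used <= estimate_size(n_states));
   (void) used;
   (void) n_states;
}
```

In a debug build both variants call estimate_size() once; in a build with
NDEBUG defined only the first variant still does.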

10 years agoilo: mark some states dirty when they are really changed
Chia-I Wu [Fri, 12 Jul 2013 21:54:20 +0000 (05:54 +0800)]
ilo: mark some states dirty when they are really changed

The checks may seem redundant because cso_context handles them, but
util_blitter does not have access to cso_context.

10 years agoilo: clean up ilo_blitter_pipe_begin()
Chia-I Wu [Fri, 12 Jul 2013 21:54:25 +0000 (05:54 +0800)]
ilo: clean up ilo_blitter_pipe_begin()

Document why certain states need to be saved, and fix a bug when blitting with
scissor enabled.

10 years agor600g: don't use the CB/DB CP COHER logic on r6xx
Alex Deucher [Fri, 12 Jul 2013 13:31:28 +0000 (09:31 -0400)]
r600g: don't use the CB/DB CP COHER logic on r6xx

There are hw bugs.  Flush and inv event is sufficient.

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=66837

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
10 years agoconfigure: Avoid use of AC_CHECK_FILE for cross compiling
Jonathan Liu [Tue, 4 Jun 2013 13:04:44 +0000 (23:04 +1000)]
configure: Avoid use of AC_CHECK_FILE for cross compiling

The AC_CHECK_FILE macro can't be used for cross compiling as it will
result in "error: cannot check for file existence when cross compiling".
Replace it with the AS_IF macro.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Jonathan Liu <net147@gmail.com>
10 years agonv30: fix KILL_IF breakage
Brian Paul [Fri, 12 Jul 2013 15:59:38 +0000 (09:59 -0600)]
nv30: fix KILL_IF breakage

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66858

10 years agogallium: fixup definitions of the rsq and sqrt
Zack Rusin [Thu, 11 Jul 2013 16:16:06 +0000 (12:16 -0400)]
gallium: fixup definitions of the rsq and sqrt

GLSL spec says that rsq is undefined for src<=0, but the D3D10
spec says it needs to be a NaN, so let's stop taking an absolute
value of the source which completely breaks that behavior. For
the gl program we can simply insert an extra abs instruction
which produces the desired behavior there.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoutil/u_format: Comment out half float denormal test case.
José Fonseca [Fri, 12 Jul 2013 14:48:38 +0000 (15:48 +0100)]
util/u_format: Comment out half float denormal test case.

So that lp_test_format doesn't fail until we decide what should be done.

10 years agogallivm: Eliminate redundant lp_build_select calls.
José Fonseca [Fri, 5 Jul 2013 10:53:09 +0000 (11:53 +0100)]
gallivm: Eliminate redundant lp_build_select calls.

lp_build_cmp already returns 0 / ~0, so the lp_build_select call is
unnecessary.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
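
Illustration of why the select is redundant: a scalar model, with
hypothetical names, of a comparison that already yields an all-ones or
all-zeros mask. This is a sketch, not the gallivm code.

```c
#include <stdint.h>

/* Scalar model of a comparison that returns ~0 when true and 0 when
 * false, as the commit message says lp_build_cmp does. */
static uint32_t
cmp_less(float a, float b)
{
   return a < b ? ~0u : 0u;
}

/* Bitwise select: takes the bits of a where mask is set, else those of b. */
static uint32_t
select_bits(uint32_t mask, uint32_t a, uint32_t b)
{
   return (mask & a) | (~mask & b);
}
```

Since select_bits(mask, ~0u, 0u) equals mask for any mask, selecting
between ~0 and 0 on a comparison result adds nothing.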
10 years agotgsi: rename the TGSI fragment kill opcodes
Brian Paul [Thu, 11 Jul 2013 23:02:37 +0000 (17:02 -0600)]
tgsi: rename the TGSI fragment kill opcodes

TGSI_OPCODE_KIL and KILP had confusing names.  The former was conditional
kill (if any src component < 0).  The latter was unconditional kill.
At one time KILP was supposed to work with NV-style condition
codes/predicates but we never had that in TGSI.

This patch renames both opcodes:
  TGSI_OPCODE_KIL -> KILL_IF   (kill if src.xyzw < 0)
  TGSI_OPCODE_KILP -> KILL     (unconditional kill)

Note: I didn't just transpose the opcode names to help ensure that I
didn't miss updating any code anywhere.

I believe I've updated all the relevant code and comments but I'm
not 100% sure that some drivers had this right in the first place.
For example, the radeon driver might have llvm.AMDGPU.kill and
llvm.AMDGPU.kilp mixed up.  Driver authors should review their code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
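
Illustration of the two renamed opcodes as described above: a minimal scalar
sketch with hypothetical function names, not the tgsi_exec implementation.

```c
#include <stdbool.h>

/* KILL_IF: the fragment is discarded if any of the four source
 * components is negative. */
static bool
kill_if(const float src[4])
{
   for (int i = 0; i < 4; i++) {
      if (src[i] < 0.0f)
         return true;
   }
   return false;
}

/* KILL: the fragment is always discarded. */
static bool
kill_always(void)
{
   return true;
}
```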
10 years agotgsi: fix-up KILP comments
Brian Paul [Thu, 11 Jul 2013 22:00:45 +0000 (16:00 -0600)]
tgsi: fix-up KILP comments

KILP is really unconditional fragment kill.

We've had KIL and KILP transposed forever.  I'll fix that next.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years agotgsi: exec TGSI_OPCODE_SQRT as a scalar instruction, not vector
Brian Paul [Thu, 11 Jul 2013 21:52:37 +0000 (15:52 -0600)]
tgsi: exec TGSI_OPCODE_SQRT as a scalar instruction, not vector

To align with the docs and the state tracker.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years agotgsi: use X component of the second operand in exec_scalar_binary()
Brian Paul [Tue, 9 Jul 2013 19:30:15 +0000 (13:30 -0600)]
tgsi: use X component of the second operand in exec_scalar_binary()

The code happened to work in the past since the (scalar) src args
effectively always have a swizzle of .xxxx, .yyyy, .zzzz, or .wwww so
whether you grab the X or Y component doesn't really matter.  Just
fixing the code to make it look right.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
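
Illustration of why the old code happened to work: after a replicating
swizzle such as .zzzz, every component holds the same value, so reading X or
Y gives the same result. The type and function names are hypothetical.

```c
/* Hypothetical four-component vector. */
struct vec4 {
   float c[4];
};

/* Models a replicating swizzle (.xxxx, .yyyy, .zzzz or .wwww): every
 * output component takes the value of the selected input component. */
static struct vec4
swizzle_replicate(struct vec4 v, int comp)
{
   struct vec4 r;

   for (int i = 0; i < 4; i++)
      r.c[i] = v.c[comp];
   return r;
}
```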