mesa.git
11 years ago freedreno: track maximal scissor bounds
Rob Clark [Wed, 6 Mar 2013 15:45:58 +0000 (10:45 -0500)]
freedreno: track maximal scissor bounds

Optimize out parts of the render target that are scissored out by taking
into account maximal scissor bounds in fd_gmem_render_tiles().

This is a big win on things like gnome-shell which frequently do partial
screen updates.

Signed-off-by: Rob Clark <robdclark@gmail.com>
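
The idea of tracking maximal scissor bounds can be sketched roughly as follows. This is only an illustration under stated assumptions, not the freedreno code: `struct rect`, `bounds_reset()`, `bounds_add()` and `tiles_to_render()` are hypothetical names, rectangles are half-open, and coordinates are assumed non-negative.

```c
#include <assert.h>

/* Hypothetical half-open rectangle: [minx, maxx) x [miny, maxy). */
struct rect { int minx, miny, maxx, maxy; };

/* Reset to an empty bounds that the first scissor will replace. */
static void bounds_reset(struct rect *b, int width, int height)
{
    b->minx = width;
    b->miny = height;
    b->maxx = 0;
    b->maxy = 0;
}

/* Grow the bounds so they also cover the scissor used by one draw. */
static void bounds_add(struct rect *b, const struct rect *s)
{
    if (s->minx < b->minx) b->minx = s->minx;
    if (s->miny < b->miny) b->miny = s->miny;
    if (s->maxx > b->maxx) b->maxx = s->maxx;
    if (s->maxy > b->maxy) b->maxy = s->maxy;
}

/* Count the tiles that overlap the accumulated bounds. */
static int tiles_to_render(const struct rect *b, int tile_w, int tile_h)
{
    int tx0, ty0, tx1, ty1;

    assert(tile_w > 0 && tile_h > 0);
    if (b->maxx <= b->minx || b->maxy <= b->miny)
        return 0;                              /* nothing was drawn */
    tx0 = b->minx / tile_w;
    ty0 = b->miny / tile_h;
    tx1 = (b->maxx + tile_w - 1) / tile_w;     /* exclusive upper tile index */
    ty1 = (b->maxy + tile_h - 1) / tile_h;
    return (tx1 - tx0) * (ty1 - ty0);
}
```

In this sketch, with a 1024x768 target and 256x256 tiles, two small scissors covering x 100..400 and y 100..260 touch 4 tiles instead of all 12.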
11 years ago android: fix Android.mk bug in mesa/drivers/dri/common
Adrian Marius Negreanu [Fri, 22 Mar 2013 11:42:40 +0000 (13:42 +0200)]
android: fix Android.mk bug in mesa/drivers/dri/common

Target-specific variables are undefined when used as prerequisites.
Instead, use secondary expansion.

I noticed this when building the patch:
     i965: Add a driconf option to disable flush throttling

Signed-off-by: Adrian Marius Negreanu <adrian.m.negreanu@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago mesa: Disable validate_ir_tree() on release builds.
Eric Anholt [Mon, 18 Mar 2013 15:42:19 +0000 (08:42 -0700)]
mesa: Disable validate_ir_tree() on release builds.

Since half of ir_validate uses assert() (the other half using printf() then
abort()), there's not much use in calling it in a release build.  Cuts
6.3% of the startup time of TF2.

NOTE: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago gallivm: move code for dealing with rgb9e5 and r11g11b10 formats to own file
Roland Scheidegger [Sun, 24 Mar 2013 01:08:01 +0000 (02:08 +0100)]
gallivm: move code for dealing with rgb9e5 and r11g11b10 formats to own file

This is really not generic conversion stuff, and the code is very particular to
these formats.

11 years ago llvmpipe: Fix assertions with assignment instead of comparison.
Vinson Lee [Sat, 23 Mar 2013 07:24:52 +0000 (00:24 -0700)]
llvmpipe: Fix assertions with assignment instead of comparison.

Fixes assignment-instead-of-comparison defects reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
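
The class of defect being fixed can be shown with a small, made-up example (the function names are hypothetical and unrelated to the llvmpipe sources). A check written with "=" assigns and then tests the assigned value, so it always passes; "==" performs the intended comparison.

```c
#include <assert.h>

/* Buggy form: "=" stores 4 in n, and the result (4) is always non-zero. */
static int buggy_is_four(int n)
{
    return (n = 4) != 0;
}

/* Fixed form: "==" compares without modifying n. */
static int fixed_is_four(int n)
{
    return n == 4;
}

/* Only the fixed form is useful inside an assertion. */
static void require_four(int n)
{
    assert(fixed_is_four(n));
}
```

An assertion built on the buggy form can never fire and silently overwrites the variable it was meant to check.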
11 years ago i965: Shrink brw_vue_map struct.
Paul Berry [Fri, 22 Mar 2013 00:14:53 +0000 (17:14 -0700)]
i965: Shrink brw_vue_map struct.

This patch changes the arrays in brw_vue_map (which only ever contain
values from -1 to 58) from ints to signed chars.  This reduces the
size of the struct from 488 bytes to 136 bytes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: fix STATIC_ASSERT to use 127 instead of 128.

Reviewed-by: Eric Anholt <eric@anholt.net>
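
A minimal sketch of the technique, assuming a simplified layout: the structs below are hypothetical stand-ins and not the real brw_vue_map, and `SLOT_COUNT` is an assumed constant chosen so that stored values range from -1 to 58. The static assertion mirrors the v2 note: the largest stored value has to be compared against 127, the maximum of a signed char.

```c
#include <assert.h>

enum { SLOT_COUNT = 59 };   /* hypothetical: valid indices are 0..58 */

/* The largest stored value must fit in a signed char (max 127, not 128). */
_Static_assert(SLOT_COUNT - 1 <= 127, "slot index must fit in signed char");

struct map_wide {
    int varying_to_slot[SLOT_COUNT];
    int slot_to_varying[SLOT_COUNT];
    int num_slots;
};

struct map_narrow {
    signed char varying_to_slot[SLOT_COUNT];   /* -1 means "not present" */
    signed char slot_to_varying[SLOT_COUNT];
    int num_slots;
};

static void map_narrow_init(struct map_narrow *m)
{
    int i;

    for (i = 0; i < SLOT_COUNT; i++) {
        m->varying_to_slot[i] = -1;
        m->slot_to_varying[i] = -1;
    }
    m->num_slots = 0;
}

/* Assign the next free slot to a varying. */
static void map_narrow_assign(struct map_narrow *m, int varying)
{
    assert(varying >= 0 && varying < SLOT_COUNT);
    assert(m->num_slots < SLOT_COUNT);
    m->varying_to_slot[varying] = (signed char) m->num_slots;
    m->slot_to_varying[m->num_slots] = (signed char) varying;
    m->num_slots++;
}
```

Explicitly using `signed char` matters here because plain `char` may be unsigned on some platforms, which would break the -1 sentinel.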
11 years ago i965/fs: Rename vp_outputs_written to input_slots_valid.
Paul Berry [Wed, 20 Mar 2013 17:15:52 +0000 (10:15 -0700)]
i965/fs: Rename vp_outputs_written to input_slots_valid.

With the introduction of geometry shaders, fragment inputs will no
longer come exclusively from the vertex shader; sometimes they come
from the geometry shader.  So the name "vp_outputs_written" will
become a misnomer.  This patch renames vp_outputs_written to
input_slots_valid, to reflect the true meaning of the bitfield from
the fragment shader's point of view: it indicates which of the
possible input slots contain valid data that was written by the
previous shader stage.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago i965: Use brw.vue_map_geom_out instead of VS output VUE map where appropriate.
Paul Berry [Sun, 17 Mar 2013 18:29:28 +0000 (11:29 -0700)]
i965: Use brw.vue_map_geom_out instead of VS output VUE map where appropriate.

This patch modifies post-GS pipeline stages (transform feedback, clip,
sf, fs) to refer to the VUE map through brw->vue_map_geom_out rather
than brw->vs.prog_data->vue_map.  This ensures that when geometry
shader support is added, these pipeline stages will consult the
geometry shader output VUE map when appropriate, rather than the
vertex shader output VUE map.

v2: Fixed some stale "CACHE_NEW_VS_PROG" comments.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago i965: Store the geometry output VUE map in brw_context.
Paul Berry [Mon, 18 Feb 2013 18:16:02 +0000 (10:16 -0800)]
i965: Store the geometry output VUE map in brw_context.

Currently, the GPU pipeline has one active VUE map in effect at any
given time--the one representing the layout of vertex data coming from
the vertex shader.  However, when geometry shaders are added, they
will have their own independent VUE map.  Later pipeline stages (clip,
sf, fs) will need to consult the geometry shader VUE map if a geometry
shader is in use, and the vertex shader VUE map otherwise.

This patch adds a new field to brw_context, vue_map_geom_out, which
contains the VUE map that should be used by later pipeline stages.  It
also adds a new state flag, BRW_NEW_VUE_MAP_GEOM_OUT, which is
signalled whenever the contents of the VUE map changes.

Since we don't support geometry shaders yet, vue_map_geom_out is
currently set only by the brw_vs_prog state atom.

v2: Don't set vue_map_geom_out in do_vs_prog--that's redundant and
possibly problematic for precompiles.  Only set it in
brw_upload_vs_prog.  Also, make a copy instead of using a
pointer--this makes it possible to detect when the VUE map hasn't
changed, so we can avoid redundant state uploads.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
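
The v2 approach (keep a copy, compare, and flag only real changes) might look roughly like the sketch below. The struct layouts, the flag value and `upload_vs_prog()` are simplified assumptions for illustration and do not reproduce the driver's definitions.

```c
#include <assert.h>
#include <string.h>

enum { NUM_SLOTS = 8 };             /* hypothetical, kept small for the sketch */
#define NEW_VUE_MAP_GEOM_OUT 0x1u   /* hypothetical value for the dirty bit */

/* Simplified map; this layout has no padding, so memcmp() is reliable. */
struct vue_map {
    int num_slots;
    signed char slot_to_varying[NUM_SLOTS];
};

struct context {
    struct vue_map vue_map_geom_out;   /* a copy, not a pointer */
    unsigned dirty;
};

/* Update the context copy and raise the flag only if the map changed. */
static void upload_vs_prog(struct context *ctx, const struct vue_map *vs_map)
{
    assert(ctx && vs_map);
    if (memcmp(&ctx->vue_map_geom_out, vs_map, sizeof(*vs_map)) != 0) {
        ctx->vue_map_geom_out = *vs_map;
        ctx->dirty |= NEW_VUE_MAP_GEOM_OUT;
    }
}
```

With a pointer instead of a copy there would be no old contents left to compare against, so every program switch would have to be treated as a change.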
11 years ago i965: Move brw_vs_prog_data::outputs_written into VUE map.
Paul Berry [Sun, 17 Mar 2013 18:13:56 +0000 (11:13 -0700)]
i965: Move brw_vs_prog_data::outputs_written into VUE map.

Future patches will allow for there to be separate VUE maps when both
a geometry shader and a vertex shader are in use.  When this happens,
we will want to have correspondingly separate outputs_written
bitfields.  Moving outputs_written into the VUE map will make this
easy.

For consistency with the terminology used in the VUE map, the bitfield
is renamed to "slots_valid" in the process.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago i965/gen7: Use WE_all mode when enabling channel masks for URB write.
Paul Berry [Sat, 23 Mar 2013 15:23:03 +0000 (08:23 -0700)]
i965/gen7: Use WE_all mode when enabling channel masks for URB write.

Gen7 adds mask bits to the message header for a URB write which allow
the write to apply only to certain channels.  We don't use this
functionality, so to ensure that the entire write always occurs, we
emit an OR instruction to set the mask bits.

With the advent of geometry shaders, URB writes won't just happen at
the end of a thread; they will happen in mid-thread too.  Thus, we can
no longer rely on channel 0 being enabled, so we need to emit the OR
instruction in WE_all mode to ensure that it is executed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago i965: Rename BRW_VARYING_SLOT_MAX -> BRW_VARYING_SLOT_COUNT.
Paul Berry [Sat, 23 Mar 2013 22:53:33 +0000 (15:53 -0700)]
i965: Rename BRW_VARYING_SLOT_MAX -> BRW_VARYING_SLOT_COUNT.

The new name clarifies that it represents *one more* than the maximum
possible brw_varying_slot value.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
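
The naming convention can be illustrated with a generic enum (the slot names below are invented for the example): a trailing COUNT enumerator is one past the last real value, which makes it the natural array size and loop bound.

```c
#include <assert.h>

enum varying_slot {
    SLOT_POS,
    SLOT_COLOR,
    SLOT_TEX,
    SLOT_COUNT    /* one more than the largest real slot value */
};

/* Sized by the count, so every real slot has an entry. */
static const char *const slot_names[SLOT_COUNT] = { "pos", "color", "tex" };

static const char *slot_name(int slot)
{
    assert(slot >= 0 && slot < SLOT_COUNT);
    return slot_names[slot];
}
```

A name ending in MAX invites the reading that the value itself is a valid slot, which is exactly the off-by-one the rename avoids.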
11 years ago i965: Clarify nomenclature: vert_result -> varying
Paul Berry [Fri, 22 Mar 2013 16:39:11 +0000 (09:39 -0700)]
i965: Clarify nomenclature: vert_result -> varying

This patch removes the terminology "vert_result" from the i965 driver,
replacing it with "varying".  The old terminology, "vert_result", was
confusing because (a) it referred to the enum gl_vert_result, which no
longer exists (it was replaced with gl_varying_slot), and (b) it
implied a vertex output, but with the advent of geometry shaders, it
could be either a vertex or a geometry output, depending what shaders
are in use.  The generic term "varying" is less confusing.

No functional change.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Whitespace fixes.

11 years ago i965: bump MAX_DEPTH_TEXTURE_SAMPLES to 4/8
Chris Forbes [Sun, 24 Mar 2013 03:21:01 +0000 (16:21 +1300)]
i965: bump MAX_DEPTH_TEXTURE_SAMPLES to 4/8

Bump MAX_DEPTH_TEXTURE_SAMPLES to match what GetInternalformativ is
claiming. Since that limit is what is actually enforced now, this
doesn't actually change anything except the queried value.

There's still no piglits verifying that multisample depth textures work,
but this works in the Unigine demos.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago mesa: use _mesa_check_sample_count() for multisample textures
Chris Forbes [Sun, 3 Mar 2013 08:46:12 +0000 (21:46 +1300)]
mesa: use _mesa_check_sample_count() for multisample textures

Extends _mesa_check_sample_count() to properly support the
TEXTURE_2D_MULTISAMPLE and TEXTURE_2D_MULTISAMPLE_ARRAY targets, which
have subtly different limits than renderbuffers.

This resolves the remaining TODO in the implementation of
TexImage*DMultisample.

V2: - Don't introduce spurious block.
    - Do this in multisample.c instead.
    - Fix typo in error message.
    - Inline spec quotes

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago mesa: helper for checking renderbuffer sample count
Chris Forbes [Wed, 6 Feb 2013 07:42:53 +0000 (20:42 +1300)]
mesa: helper for checking renderbuffer sample count

Pulls the checking of the sample count into a helper function, and
extends the existing logic to include the interactions with both
ARB_texture_multisample and ARB_internalformat_query.

_mesa_check_sample_count() checks a desired sample count against a
combination of target/internalformat, and returns the error enum
to be produced, if any. Unfortunately the conditions are messy and the
errors vary.

V2: - Tidy up spurious block.
    - Move _mesa_check_sample_count() to multisample.c instead; It
      doesn't really belong in fbobject.c or teximage.c.
    - Inlined spec quotes

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago mesa: allow internalformat_query with multisample texture targets
Chris Forbes [Sat, 16 Feb 2013 07:47:11 +0000 (20:47 +1300)]
mesa: allow internalformat_query with multisample texture targets

Now that we support ARB_texture_multisample, there are multiple targets
accepted for this query, and they may have target-dependent limits, so
pass the target to the driverfunc.

For example, the sampling hardware may not be able to do general
texelFetch() for some format/sample count combination, but the driver
may still be able to implement a reasonable resolve operation, so it can
be supported for renderbuffers.

V2: - Don't break Gallium compile.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago clover: add dynamic_cast results checking down in clSetKernelArgument() code path.
Dmitry Cherkassov [Sat, 23 Mar 2013 19:51:22 +0000 (23:51 +0400)]
clover: add dynamic_cast results checking down in clSetKernelArgument() code path.

Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
11 years ago gallivm: Add code for rgb9e5 shared exponent format to float conversion
Roland Scheidegger [Sat, 23 Mar 2013 01:05:54 +0000 (02:05 +0100)]
gallivm: Add code for rgb9e5 shared exponent format to float conversion

And use this (and the code for r11g11b10 packed float to float conversion)
in the soa texturing code (the generated code looks quite good).
Should probably be an order of magnitude faster than using the fallback
(not measured).
Tested with piglit texwrap GL_EXT_packed_float and
GL_EXT_texture_shared_exponent respectively (didn't find much else using
it).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
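
For reference, a scalar decode of the rgb9e5 shared-exponent format can be sketched as below. This is plain C for one texel, not the vectorized gallivm code generation the commit adds; the function names are hypothetical. It assumes the bit layout of the GL shared-exponent format: 9-bit red, green and blue mantissas in the low 27 bits, a 5-bit exponent in the top bits, an exponent bias of 15, and 9 mantissa bits.

```c
#include <assert.h>
#include <stdint.h>

/* 2^e for small integer e, computed exactly without libm. */
static float pow2f(int e)
{
    float s = 1.0f;

    for (; e > 0; e--) s *= 2.0f;
    for (; e < 0; e++) s *= 0.5f;
    return s;
}

/* Decode one packed rgb9e5 texel into three floats. */
static void rgb9e5_to_float(uint32_t packed, float rgb[3])
{
    uint32_t r, g, b;
    int exponent;
    float scale;

    assert(rgb != 0);
    r = packed & 0x1ff;
    g = (packed >> 9) & 0x1ff;
    b = (packed >> 18) & 0x1ff;
    exponent = (int) (packed >> 27);
    /* value = mantissa * 2^(exponent - bias - mantissa_bits) */
    scale = pow2f(exponent - 15 - 9);
    rgb[0] = (float) r * scale;
    rgb[1] = (float) g * scale;
    rgb[2] = (float) b * scale;
}
```

There is no implicit leading one and no sign bit, so the decode is a single multiply per channel once the shared scale is known.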
11 years ago gallium,st/mesa: don't use blit-based transfers with software rasterizers
Marek Olšák [Thu, 14 Mar 2013 16:18:43 +0000 (17:18 +0100)]
gallium,st/mesa: don't use blit-based transfers with software rasterizers

The blit-based paths for TexImage, GetTexImage, and ReadPixels aren't very
fast with software rasterizers. Now Gallium drivers have the ability to turn
them off.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
11 years ago st/mesa: implement blit-based ReadPixels
Marek Olšák [Thu, 14 Mar 2013 15:36:22 +0000 (16:36 +0100)]
st/mesa: implement blit-based ReadPixels

Initial version contributed by: Martin Andersson <g02maran@gmail.com>

This is only used if the memcpy path cannot be used and if no transfer ops
are needed. It's pretty similar to our TexImage and GetTexImage
implementations.

The motivation behind this is to be able to use ReadPixels every frame and
still have at least 20 fps (or 60 fps with a powerful GPU and CPU)
instead of 0.5 fps.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
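
The path selection described above can be summarized in a small decision function. This is a hedged sketch: the struct, enum and function names are hypothetical, and the `blit_allowed` field is an assumption standing in for a driver's ability to opt out of blit-based transfers.

```c
#include <assert.h>
#include <stdbool.h>

enum readpixels_path { PATH_MEMCPY, PATH_BLIT, PATH_SLOW };

/* Hypothetical summary of one ReadPixels request. */
struct readpixels_request {
    bool memcpy_ok;           /* formats match, a direct copy works */
    bool needs_transfer_ops;  /* pixel transfer operations are required */
    bool blit_allowed;        /* assumed driver opt-in for blit-based transfers */
};

static enum readpixels_path choose_path(const struct readpixels_request *req)
{
    assert(req != 0);
    if (req->memcpy_ok)
        return PATH_MEMCPY;               /* cheapest option first */
    if (!req->needs_transfer_ops && req->blit_allowed)
        return PATH_BLIT;                 /* let the GPU convert */
    return PATH_SLOW;                     /* generic fallback */
}
```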
11 years ago mesa: add common format-independent memcpy-based ReadPixels path
Marek Olšák [Thu, 14 Mar 2013 14:20:27 +0000 (15:20 +0100)]
mesa: add common format-independent memcpy-based ReadPixels path

I'll need the _mesa_readpixels_needs_slow_path function for the blit-based
version, but it's also useful to have this memcpy-based path in one place
and not scattered across several functions.

v2: add "const" to function parameters

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
11 years ago mesa: add helper func for checking combined depthstencil buffers from st/mesa
Marek Olšák [Thu, 14 Mar 2013 13:22:56 +0000 (14:22 +0100)]
mesa: add helper func for checking combined depthstencil buffers from st/mesa

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
11 years ago mesa: add a common function returning transfer ops for ReadPixels
Marek Olšák [Thu, 14 Mar 2013 12:15:54 +0000 (13:15 +0100)]
mesa: add a common function returning transfer ops for ReadPixels

I'll need both new functions for later. For now, it consolidates the code
for determining what the transfer ops should be and makes it a little bit
smarter.

v2: added "const"

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
11 years ago mesa: handle HALF_FLOAT like FLOAT in get_tex_rgba
Marek Olšák [Wed, 13 Mar 2013 15:47:21 +0000 (16:47 +0100)]
mesa: handle HALF_FLOAT like FLOAT in get_tex_rgba

NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
11 years ago llvmpipe: add EXT_packed_float render target format support
Roland Scheidegger [Fri, 22 Mar 2013 19:09:18 +0000 (20:09 +0100)]
llvmpipe: add EXT_packed_float render target format support

New conversion code to handle conversion from/to r11g11b10 AoS to/from
SoA floats, and also add code for conversion from rgb9e5 AoS to float SoA
(which works pretty much the same as r11g11b10 except for the packing).
(This code should also be used for texture sampling instead of
relying on u_format conversion but it's not yet, so rgb9e5 is unused.)
Unfortunately a crazy amount of hacks is necessary to get the conversion
code running in llvmpipe's generate_unswizzled_blend, which isn't well
suited for formats where the storage representation has nothing to do
with what's needed for blending (moreover, the conversion will convert
from packed AoS values, which is the storage format, to float SoA values,
because this is much more natural for the conversion, and likewise from
SoA values to packed AoS values - but the "blend" (which includes
trivial things like partial mask) works on AoS values, so incoming fs
values will go SoA->AoS, values from destination will go packed
AoS->SoA->AoS, then do blend, then AoS->SoA->packed AoS which probably
isn't the most efficient way though the shuffles are probably bearable).

Passes piglit fbo-blending-formats (with GL_EXT_packed_float parameter),
still need to verify Inf/NaNs (where most of the complexity in the
conversion comes from actually).

v2: drop the (very bogus) rgb9e5 part, and do component extraction
in the helper code for r11g11b10 to float conversion, making the code
slightly more compact (suggested by Jose), now that there are no other
callers left this works quite well. (Could do the same for the
opposite way but it's less than ideal there, final part of packing
needs to be done in caller anyway and there'd be another conditional.)

v3: minor style and comment fixes. Also fix a potential issue with
negative zero being potentially returned by max(src, zero) as we
don't have well-defined min/max behavior (fortunately no additional cost).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
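
As a scalar reference for the r11g11b10 to float direction, a decode of one packed value might look like the sketch below. It is not the llvmpipe implementation, which generates vector code and works on SoA data; the function names are hypothetical. It assumes the packed-float layout with red in bits 0-10, green in bits 11-21 and blue in bits 22-31, each an unsigned float with a 5-bit exponent (bias 15) and a 6-bit or 5-bit mantissa.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode an unsigned small float: 5 exponent bits, mant_bits mantissa bits. */
static float small_ufloat_to_float(uint32_t v, int mant_bits)
{
    uint32_t mant, exponent, bits;
    float f;

    assert(mant_bits == 5 || mant_bits == 6);
    mant = v & ((1u << mant_bits) - 1);
    exponent = (v >> mant_bits) & 0x1f;

    if (exponent == 0) {
        /* Zero or denormal: mant * 2^-14 * 2^-mant_bits. */
        return (float) mant * (1.0f / (float) (1u << (14 + mant_bits)));
    }
    if (exponent == 31)
        bits = 0x7f800000u | (mant << (23 - mant_bits));   /* Inf or NaN */
    else
        bits = ((exponent + 127 - 15) << 23) | (mant << (23 - mant_bits));
    memcpy(&f, &bits, sizeof f);   /* reinterpret as IEEE-754 single */
    return f;
}

static void r11g11b10_to_float(uint32_t packed, float rgb[3])
{
    rgb[0] = small_ufloat_to_float(packed & 0x7ff, 6);
    rgb[1] = small_ufloat_to_float((packed >> 11) & 0x7ff, 6);
    rgb[2] = small_ufloat_to_float(packed >> 22, 5);
}
```

The Inf/NaN and denormal branches are where most of the special-casing lives, which fits the remark above that this is where most of the conversion complexity comes from.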
11 years ago r600g: Honour legacy debugging environment variables
Michel Dänzer [Thu, 21 Mar 2013 16:56:52 +0000 (17:56 +0100)]
r600g: Honour legacy debugging environment variables

This helps minimize confusion / effort when moving between branches or
helping others.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
11 years ago docs: Mark ARB_ES3_compatibility as done.
Matt Turner [Thu, 21 Mar 2013 22:59:08 +0000 (15:59 -0700)]
docs: Mark ARB_ES3_compatibility as done.

11 years ago freedreno: add pipe->blit
Rob Clark [Thu, 21 Mar 2013 21:32:00 +0000 (17:32 -0400)]
freedreno: add pipe->blit

Signed-off-by: Rob Clark <robdclark@gmail.com>
11 years ago i965: Add a driconf option to disable flush throttling.
Paul Berry [Tue, 19 Mar 2013 18:49:08 +0000 (11:49 -0700)]
i965: Add a driconf option to disable flush throttling.

Normally when submitting the first batch buffer after a flush, we
check whether the GPU has completed processing of the first batch
buffer of the previous frame.  If it hasn't, we wait for it to finish
before submitting any more batches.  This prevents GPU-heavy and
CPU-light applications from racing too far ahead of the current frame,
but at the expense of possibly lower frame rates.  Sometimes when
benchmarking we want to disable this mechanism.

This patch adds the driconf option "disable_throttling" to disable the
throttling mechanism.

Reviewed-by: Eric Anholt <eric@anholt.net>
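
The throttling decision described above reduces to a small predicate. The sketch below is an assumption-laden model, not driver code: only the option name `disable_throttling` comes from the commit message, while the struct and the other field names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct throttle_state {
    bool disable_throttling;        /* the driconf option named in the commit */
    bool first_batch_after_flush;   /* hypothetical: about to submit the first batch after a flush */
    bool prev_frame_batch_busy;     /* hypothetical: GPU has not finished the previous frame's first batch */
};

/* Returns true if the CPU should wait before submitting more batches. */
static bool should_wait(const struct throttle_state *s)
{
    assert(s != 0);
    if (s->disable_throttling)
        return false;               /* benchmarking mode: never wait */
    return s->first_batch_after_flush && s->prev_frame_batch_busy;
}
```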
11 years ago mesa: Implement TEXTURE_IMMUTABLE_LEVELS for ES 3.0.
Matt Turner [Mon, 4 Mar 2013 19:03:58 +0000 (11:03 -0800)]
mesa: Implement TEXTURE_IMMUTABLE_LEVELS for ES 3.0.

NOTE: This is a candidate for the 9.1 branch.
Fixes piglit's texture-immutable-levels test.
Reported-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago glx: Build with VISIBILITY_CFLAGS in automake
Adam Jackson [Thu, 21 Mar 2013 17:21:18 +0000 (13:21 -0400)]
glx: Build with VISIBILITY_CFLAGS in automake

Note: This is a candidate for the stable branches.

Signed-off-by: Adam Jackson <ajax@redhat.com>
11 years ago scons: check for existence of 'MSVC_VERSION' in env
Brian Paul [Wed, 20 Mar 2013 15:41:52 +0000 (09:41 -0600)]
scons: check for existence of 'MSVC_VERSION' in env

Evidently, MSVC_VERSION isn't always defined so check for it before
checking the MSVC version.

Suggested by Jose.

11 years ago softpipe: silence some asst. MSVC type warnings in sp_tex_sample.c
Brian Paul [Wed, 20 Mar 2013 17:05:13 +0000 (11:05 -0600)]
softpipe: silence some asst. MSVC type warnings in sp_tex_sample.c

11 years ago softpipe: silence some MSVC signed/unsigned warnings
Brian Paul [Wed, 20 Mar 2013 16:56:47 +0000 (10:56 -0600)]
softpipe: silence some MSVC signed/unsigned warnings

11 years ago softpipe: silence some MSVC float/double warnings
Brian Paul [Wed, 20 Mar 2013 16:56:24 +0000 (10:56 -0600)]
softpipe: silence some MSVC float/double warnings

11 years ago rbug: silence some MSVC signed/unsigned warnings
Brian Paul [Wed, 20 Mar 2013 16:54:24 +0000 (10:54 -0600)]
rbug: silence some MSVC signed/unsigned warnings

11 years ago postprocess: silence some MSVC float/int warnings
Brian Paul [Wed, 20 Mar 2013 16:54:07 +0000 (10:54 -0600)]
postprocess: silence some MSVC float/int warnings

11 years ago meta: fix incorrect slice, r coordinate computation
Brian Paul [Wed, 20 Mar 2013 15:58:18 +0000 (09:58 -0600)]
meta: fix incorrect slice, r coordinate computation

The arithmetic to convert a 3D texture slice to an R coordinate was
incorrect.  Found when MSVC warned of a divide by zero.

Note that we don't actually ever hit this path.  We don't decompress
slices of 3D textures and we don't support 3D mipmap generation yet.
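
One conventional way to map a slice index to a normalized R coordinate is to sample at the texel center. The sketch below shows that convention as an assumption; the commit message does not spell out the corrected formula, so this is not necessarily the exact arithmetic used in the patch, and `slice_to_r()` is a hypothetical name.

```c
#include <assert.h>

/* Map slice index [0, depth) to the normalized center of that slice. */
static float slice_to_r(int slice, int depth)
{
    assert(depth >= 1);                    /* rules out a divide by zero */
    assert(slice >= 0 && slice < depth);
    return ((float) slice + 0.5f) / (float) depth;
}
```

With this form a texture of depth 1 maps its only slice to 0.5, and no denominator can reach zero for a valid depth.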

11 years ago vega: fix MSVC warning about missing return statement
Brian Paul [Wed, 20 Mar 2013 15:57:21 +0000 (09:57 -0600)]
vega: fix MSVC warning about missing return statement

11 years ago meta: minor indentation fix
Brian Paul [Wed, 20 Mar 2013 15:27:14 +0000 (09:27 -0600)]
meta: minor indentation fix

11 years ago radeonsi: Emit pixel shader state even when only the vertex shader changed
Michel Dänzer [Tue, 19 Mar 2013 16:57:11 +0000 (17:57 +0100)]
radeonsi: Emit pixel shader state even when only the vertex shader changed

Fixes random failures with piglit glsl-max-varyings.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Christian König <christian.koenig@amd.com>
11 years ago android: Define PACKAGE_VERSION/BUGREPORT in CFLAGS
Chad Versace [Mon, 18 Mar 2013 20:56:28 +0000 (13:56 -0700)]
android: Define PACKAGE_VERSION/BUGREPORT in CFLAGS

This fixes the Android build. Commit 439c3d4 broke it.

CC: Adrian M Negreanu <adrian.m.negreanu@intel.com>
CC: Matt Turner <mattst@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965/vs: Add IR dumping for immediates.
Kenneth Graunke [Mon, 11 Mar 2013 18:10:34 +0000 (11:10 -0700)]
i965/vs: Add IR dumping for immediates.

This makes dump_instructions more useful.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
11 years ago glsl: Add built-in functions for GLSL 1.50.
Kenneth Graunke [Tue, 19 Mar 2013 01:57:28 +0000 (18:57 -0700)]
glsl: Add built-in functions for GLSL 1.50.

This makes basic built-in functions work in GLSL 1.50.  It supports
everything except the new Geometry Shader functions.

The new 150.glsl file is 140.glsl plus ARB_texture_multisample.glsl;
150.frag is identical to 140.frag except for the #version bump.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
11 years ago glsl: Add sampler2DMS/sampler2DMSArray types to GLSL 1.50.
Kenneth Graunke [Tue, 19 Mar 2013 01:57:27 +0000 (18:57 -0700)]
glsl: Add sampler2DMS/sampler2DMSArray types to GLSL 1.50.

GLSL 1.50 includes support for the new sampler types introduced by
the ARB_texture_multisample extension.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
11 years ago glsl: Bump standalone compiler versions to 1.50.
Kenneth Graunke [Tue, 19 Mar 2013 01:57:26 +0000 (18:57 -0700)]
glsl: Bump standalone compiler versions to 1.50.

The version bumps are necessary in order to compile built-ins for 1.50.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
11 years ago i965: Don't use texture swizzling to force alpha to 1.0 if unnecessary.
Kenneth Graunke [Fri, 15 Mar 2013 21:48:24 +0000 (14:48 -0700)]
i965: Don't use texture swizzling to force alpha to 1.0 if unnecessary.

Commit 33599433c7 began setting the texture swizzle mode to XYZ1 for
RED, RG, and RGB textures in order to force alpha to 1.0 in case we
actually stored the texture as RGBA.

This had an unforeseen performance implication: the shader precompile
assumes that the texture swizzle mode will be XYZW for non-shadow
sampler types.  By setting it to XYZ1, this means every shader used with
a RED, RG, or RGB texture has to be recompiled.  This is a very common
case.

Unfortunately, there's no way to improve the precompile, since RGBA
textures still need XYZW, and there's no way to know by looking at
the shader source what texture formats might be used.

However, we only need to smash alpha to 1.0 if the texture's memory
format actually has alpha bits.  If not, the sampler already returns 1.0
for us without any special swizzling.  XRGB8888, for example, is a very
common case where this occurs.

This partially fixes a performance regression since commit 33599433c7.
More work is required to fully fix it in all cases.  This at least helps
Warsow.

NOTE: This is a candidate for the 9.1 branch.

Reviewed-by: Carl Worth <cworth@cworth.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
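
The rule from the commit message (override alpha only when the stored format really has alpha bits) can be sketched as a small selection function. The enums and the function name are hypothetical simplifications, not the driver's types.

```c
#include <assert.h>
#include <stdbool.h>

enum swizzle { SWIZZLE_XYZW, SWIZZLE_XYZ1 };
enum base_format { BASE_RED, BASE_RG, BASE_RGB, BASE_RGBA };

/* Pick the swizzle for a texture given its API-visible base format and
 * the number of alpha bits in the memory format actually used. */
static enum swizzle choose_swizzle(enum base_format base, int stored_alpha_bits)
{
    bool wants_opaque_alpha;

    assert(stored_alpha_bits >= 0);
    wants_opaque_alpha = (base != BASE_RGBA);
    if (wants_opaque_alpha && stored_alpha_bits > 0)
        return SWIZZLE_XYZ1;    /* stored alpha must be overridden */
    return SWIZZLE_XYZW;        /* matches what the precompile assumes */
}
```

In this model an RGB texture stored without alpha bits, such as the XRGB8888 case mentioned above, keeps XYZW and so can reuse the precompiled shader.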
11 years ago i965: Don't print a fatal-looking message if intelCreateContext fails.
Kenneth Graunke [Thu, 14 Mar 2013 18:48:36 +0000 (11:48 -0700)]
i965: Don't print a fatal-looking message if intelCreateContext fails.

With the old context creation mechanism, an application asked the GL to
give it a context.  Failing to produce a context was a fatal error.

Now, with GLX_ARB_create_context, the application can request a specific
version.  If it's higher than the maximum version we support, context
creation will fail.  But this is a normal error that applications
recover from.

In particular, the new glxinfo tries to create OpenGL 4.3, 4.2, 4.1,
4.0, 3.3, and 3.2 contexts before finally succeeding at creating a 3.1
context.  This led to it printing the following message 6 times:
"brwCreateContext: failed to init intel context"

There's no need to alarm users (and developers) with such a message.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
11 years ago i965/gen7: Align all depth miplevels to 8 in the X direction.
Eric Anholt [Mon, 18 Mar 2013 22:38:58 +0000 (15:38 -0700)]
i965/gen7: Align all depth miplevels to 8 in the X direction.

On an INTEL_DEBUG=perf piglit run on IVB, reduces the instances of "HW
workaround: blit" (the printouts from the misaligned-depth workaround
blits) from 725 to 675.

It doesn't totally eliminate the workaround blit, because we still have
problems with Y offsets that we can't fix (since texturing can only align
miplevels up to 2 or 4, not 8).

No regressions on piglit/es3conform on IVB.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago nvc0: fix max varying count, move CLIPVERTEX,FOG out of the way
Christoph Bumiller [Fri, 15 Mar 2013 22:39:01 +0000 (23:39 +0100)]
nvc0: fix max varying count, move CLIPVERTEX,FOG out of the way

The card spews an error if I use all 128 generic slots.
Apparently the real limit isn't just dictated by the address space
layout.

11 years ago gallium: add TGSI_SEMANTIC_TEXCOORD,PCOORD v3
Christoph Bumiller [Fri, 15 Mar 2013 21:11:31 +0000 (22:11 +0100)]
gallium: add TGSI_SEMANTIC_TEXCOORD,PCOORD v3

This makes it possible to identify gl_TexCoord and gl_PointCoord
for drivers where sprite coordinate replacement is restricted.

The new PIPE_CAP_TGSI_TEXCOORD decides whether these varyings
should be hidden behind the GENERIC semantic or not.

With this patch only nvc0 and nv30 will request that they be used.

v2: introduce a CAP so other drivers don't have to bother with
the new semantic

v3: adapt to the introduction of the gl_varying_slot enum

11 years ago docs: import release notes for 9.1.1, add news item
Ian Romanick [Wed, 20 Mar 2013 00:44:31 +0000 (17:44 -0700)]
docs: import release notes for 9.1.1, add news item

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago gallium-egl: Fix compile errors introduced in de315f76a
Kristian Høgsberg [Wed, 20 Mar 2013 00:16:57 +0000 (20:16 -0400)]
gallium-egl: Fix compile errors introduced in de315f76a

The commit changed API in a helper library shared by both egl_dri2 and
the gallium egl state tracker, but only egl_dri2 was updated to use the
new interface.

Tested-by: Giulio Camuffo <giuliocamuffo@gmail.com>
11 years ago i965/fs: Avoid unnecessary recompiles due to POS bit of proj_attrib_mask.
Paul Berry [Sat, 23 Feb 2013 00:40:41 +0000 (16:40 -0800)]
i965/fs: Avoid unnecessary recompiles due to POS bit of proj_attrib_mask.

Previous to this patch, when using fixed function fragment shading,
bit VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask was being set
differently during precompiles and normal usage.  During precompiles
it was being set only if the fragment shader reads from window
position (which it never does), so it was always being set to 0.
During normal usage it was being set if the vertex shader writes to
all 4 components of gl_Position (which it usually does), so it was
usually being set to 1.  As a result, we were almost always doing an
extra recompile for the fixed function fragment shader.

The recompile was totally unnecessary, though, because
brw_wm_prog_key::proj_attrib_mask is only consulted for
fs_visitor::emit_general_interpolation(), which isn't used for
VARYING_SLOT_POS.

This patch avoids the unnecessary recompile by always setting bit
VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask to 1.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
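
Why an inconsistent key bit forces a recompile can be modelled with a toy program key. Everything below is a simplified assumption: the real brw_wm_prog_key has many more fields, and the helper names are hypothetical. Only the idea comes from the commit: a precompiled shader is reused only when the runtime key equals the precompile key.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VARYING_BIT_POS (UINT64_C(1) << 0)   /* bit 0 in this sketch */

struct wm_prog_key { uint64_t proj_attrib_mask; };

/* Old behaviour: the POS bit depends on a condition that differs
 * between precompile time and normal usage. */
static struct wm_prog_key make_key_old(bool pos_condition, uint64_t other_bits)
{
    struct wm_prog_key key;

    key.proj_attrib_mask = other_bits & ~VARYING_BIT_POS;
    if (pos_condition)
        key.proj_attrib_mask |= VARYING_BIT_POS;
    return key;
}

/* New behaviour: the POS bit is always set, so both sides agree. */
static struct wm_prog_key make_key_new(uint64_t other_bits)
{
    struct wm_prog_key key;

    key.proj_attrib_mask = other_bits | VARYING_BIT_POS;
    return key;
}

/* A cached program is reused only if the keys match exactly. */
static bool key_equal(const struct wm_prog_key *a, const struct wm_prog_key *b)
{
    assert(a && b);
    return a->proj_attrib_mask == b->proj_attrib_mask;
}
```

Since the bit is not consulted for the position slot anyway, pinning it to a constant costs nothing and removes the spurious cache miss.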
11 years ago ff_fragment_shader: Don't do unnecessary (and dangerous) uniform setup.
Paul Berry [Fri, 22 Feb 2013 23:37:41 +0000 (15:37 -0800)]
ff_fragment_shader: Don't do unnecessary (and dangerous) uniform setup.

Previously, right after calling _mesa_glsl_link_shader(), the fixed
function fragment shader code made several calls with the ostensible
purpose of setting up uniforms for the fragment shader it just
created.

These calls are unnecessary, since _mesa_glsl_link_shader() calls
driver->LinkShader(), which takes care of calling these functions (or
their equivalent).  Also, they are dangerous to call after
_mesa_glsl_link_shader() has returned, because on back-ends such as
i965 which do precompilation, _mesa_glsl_link_shader() may have
already cached pointers to the existing uniform structures; attempting
to set up the uniforms again invalidates those cached pointers.

It was only by sheer coincidence that this wasn't manifesting itself
as a bug.  It turns out that i965's precompile mechanism was always
setting bit 0 of brw_wm_prog_key::proj_attrib_mask to 0 for fixed
function fragment shaders, but during normal usage this bit usually
gets set to 1.  As a result, the precompiled shader (with its invalid
uniform pointers) was not being used.

I'm about to introduce some changes that cause bit 0 of
proj_attrib_mask to be set consistently between precompilation and
normal usage, so to avoid regressions I need to get rid of the
dangerous duplicate uniform setup code first.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago i965: Avoid unnecessary copy when depthstencil workaround invoked by clear.
Paul Berry [Fri, 8 Mar 2013 21:39:43 +0000 (13:39 -0800)]
i965: Avoid unnecessary copy when depthstencil workaround invoked by clear.

Since apps typically begin rendering with a call to glClear(), it is
likely that when brw_workaround_depthstencil_alignment() moves a
miplevel to a temporary buffer, it can avoid doing a blit, since the
contents of the miplevel are about to be erased.

This patch adds the necessary plumbing to determine when
brw_workaround_depthstencil_alignment() is being called as a
consequence of glClear(), and avoids the unnecessary blit when it is
safe to do so.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Eliminate unnecessary call to _mesa_is_depthstencil_format().  Fix
handling of depth buffer in depth/stencil format.

v3: Use correct bitfields for clear_mask.  Fix handling of depth
buffer in depth/stencil format when hardware uses separate stencil.
When invalidating, make sure we still reassociate the image to the new
miptree.

Reviewed-by: Eric Anholt <eric@anholt.net>
11 years ago r600g: don't emit SQ_DYN_GPR_RESOURCE_LIMIT_1 on cayman
Alex Deucher [Tue, 19 Mar 2013 22:11:20 +0000 (18:11 -0400)]
r600g: don't emit SQ_DYN_GPR_RESOURCE_LIMIT_1 on cayman

The register doesn't exist on the ASIC, and emitting it will cause a
CS rejection if VM is disabled.

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 years agor600g: emit DB_SRESULTS_COMPARE_STATE0 on r6xx/r7xx
Alex Deucher [Tue, 19 Mar 2013 18:25:32 +0000 (14:25 -0400)]
r600g: emit DB_SRESULTS_COMPARE_STATE0 on r6xx/r7xx

Not using HiS yet, but matches what we do on evergreen+.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 years agowinsys/svga: improve error/debug message output
Brian Paul [Tue, 19 Mar 2013 16:03:39 +0000 (10:03 -0600)]
winsys/svga: improve error/debug message output

Use vmw_printf() just for extra debugging info (off by default).
Use vmw_error() for real errors/failures/etc that we definitely
want to report.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
11 years agotgsi: fix uninitialized declaration array fields
Brian Paul [Tue, 19 Mar 2013 19:49:42 +0000 (13:49 -0600)]
tgsi: fix uninitialized declaration array fields

Fixes a few regressions since the TGSI array changes.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
11 years agoegl_dri2: Lower __DRI_IMAGE version requirement back to 1
Kristian Høgsberg [Tue, 19 Mar 2013 17:20:36 +0000 (13:20 -0400)]
egl_dri2: Lower __DRI_IMAGE version requirement back to 1

We check the extension version manually instead and verify that we have
the createImageFromFds function before enabling prime fd passing.
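
A minimal sketch of this kind of manual check, assuming a stand-in structure: struct ex_image_extension, the stub function, and the fd_version parameter are hypothetical and only mirror the idea of a version field plus an optional entry point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the image extension; only the fields the check needs. */
struct ex_image_extension {
   int version;
   void *(*createImageFromFds)(void);   /* hypothetical signature, NULL if absent */
};

#define EX_MIN_IMAGE_VERSION 1

/* Stub entry point, used only to build an example extension table. */
static void *
ex_stub_create_image_from_fds(void)
{
   return NULL;
}

/* The extension as a whole only has to meet the low base requirement. */
static bool
ex_extension_usable(const struct ex_image_extension *ext)
{
   return ext != NULL && ext->version >= EX_MIN_IMAGE_VERSION;
}

/* fd passing is enabled only if the newer entry point is really there. */
static bool
ex_can_pass_prime_fds(const struct ex_image_extension *ext, int fd_version)
{
   assert(fd_version >= EX_MIN_IMAGE_VERSION);
   return ex_extension_usable(ext) &&
          ext->version >= fd_version &&
          ext->createImageFromFds != NULL;
}
```

Splitting the check this way lets an older driver keep working with the base extension while the optional feature stays disabled.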

11 years agoradeon/llvm: Do not link against libgallium when building statically.
Maarten Lankhorst [Tue, 19 Mar 2013 19:17:57 +0000 (20:17 +0100)]
radeon/llvm: Do not link against libgallium when building statically.

NOTE: This is a candidate for the 9.1 branch.

Tested-by: Vincent Lejeune <vljn@ovi.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
11 years agogles2: Add an ABI-check test
Matt Turner [Wed, 30 Jan 2013 01:37:02 +0000 (17:37 -0800)]
gles2: Add an ABI-check test

Checks that no functions are exported that are not part of the ABI.

Note that currently we are exporting functions that are aliased to
functions that are part of the ABI. They shouldn't be exported, but the
XML descriptions don't adequately describe this case.

11 years agogles1: Add an ABI-check test
Matt Turner [Tue, 12 Mar 2013 19:36:06 +0000 (12:36 -0700)]
gles1: Add an ABI-check test

Checks that no functions are exported that are not part of the ABI.

Note that currently we are exporting functions that are aliased to
functions that are part of the ABI. They shouldn't be exported, but the
XML descriptions don't adequately describe this case.

11 years agogallium/egl: fix out-of-tree build
Andreas Boll [Sat, 16 Mar 2013 13:04:24 +0000 (14:04 +0100)]
gallium/egl: fix out-of-tree build

Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/15-fix-oot-build.diff;h=7040999a22d3937d0578cfd85ee2c71d7dc614bb;hb=refs/heads/ubuntu%2B1

NOTE: This is a candidate for the 9.1 branch.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
11 years agoosmesa: fix out-of-tree build
Andreas Boll [Sat, 16 Mar 2013 13:00:44 +0000 (14:00 +0100)]
osmesa: fix out-of-tree build

Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/14-fix-osmesa-build.diff;h=00581d0e1833c5492d9050e1bf3d5e658cad782e;hb=refs/heads/ubuntu%2B1

v2: Move the added line immediately after -I$(top_srcdir)/src/mapi

NOTE: This is a candidate for the 9.1 and 9.0 branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Matt Turner <mattst88@gmail.com>
11 years agobuild: Enable x86 assembler on Hurd.
Andreas Boll [Sat, 16 Mar 2013 12:50:19 +0000 (13:50 +0100)]
build: Enable x86 assembler on Hurd.

Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/10-hurd-configure-tweaks.diff;h=984e17df1b8afdf8e4b36bee96aa5ab6a5691021;hb=refs/heads/ubuntu%2B1

Thanks to Pino Toscano.

v2: Don't bother with x86_64. AFAICT GNU/Hurd doesn't support it so far.

NOTE: This is a candidate for stable branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Acked-by: Matt Turner <mattst88@gmail.com>
11 years agomesa: use ieee fp on s390 and m68k
Andreas Boll [Sat, 16 Mar 2013 12:54:09 +0000 (13:54 +0100)]
mesa: use ieee fp on s390 and m68k

Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/02_use-ieee-fp-on-s390-and-m68k.patch;h=d3d6c1d7fec3c72ecf320706167deb61c52636c3;hb=refs/heads/ubuntu%2B1

Fixes Debian bug #349437.

Patch written by David Nusinow.

NOTE: This is a candidate for stable branches.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
11 years agogallivm: fix return opcode handling in main function of a shader
Roland Scheidegger [Sat, 16 Mar 2013 01:55:43 +0000 (02:55 +0100)]
gallivm: fix return opcode handling in main function of a shader

If we're inside a conditional or loop we must not return, or the code
after the conditional is never executed.
(v2): We also can't just continue as if nothing happened, since the
mask update code later checks whether we actually have a mask. So we
need to remember that there was a return in main where we didn't exit.
To illustrate: a ret in an if clause causes a mask update, which is
still fine while we're in the conditional, but after the endif the
mask update code would drop the mask, bringing execution back to
pixels whose execution mask should have been set to zero by the ret.
Thanks to Christoph Bumiller for figuring this out.

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=62357.

Note: This is a candidate for the stable branches.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
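
The masking problem can be illustrated with a small scalar model. This is a sketch under assumptions, not the gallivm code: four pixels are modelled as four bits, and every ex_* name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EX_ALL_LANES 0xFu   /* four pixels, one bit per lane */
#define EX_MAX_DEPTH 8

struct ex_exec_state {
   uint8_t cond_mask;                 /* lanes enabled by enclosing ifs */
   uint8_t ret_mask;                  /* lanes that have not executed ret */
   uint8_t cond_stack[EX_MAX_DEPTH];  /* saved cond_mask per open if */
   int cond_depth;
   bool ret_in_main;                  /* a ret was seen without exiting */
};

static void
ex_init(struct ex_exec_state *s)
{
   s->cond_mask = EX_ALL_LANES;
   s->ret_mask = EX_ALL_LANES;
   s->cond_depth = 0;
   s->ret_in_main = false;
}

/* Without the ret_in_main term, the mask would be dropped after endif. */
static uint8_t
ex_exec_mask(const struct ex_exec_state *s)
{
   bool has_mask = s->cond_depth > 0 || s->ret_in_main;
   return has_mask ? (uint8_t)(s->cond_mask & s->ret_mask) : EX_ALL_LANES;
}

static void
ex_if_begin(struct ex_exec_state *s, uint8_t cond)
{
   assert(s->cond_depth < EX_MAX_DEPTH);
   s->cond_stack[s->cond_depth++] = s->cond_mask;
   s->cond_mask &= cond;
}

static void
ex_if_end(struct ex_exec_state *s)
{
   assert(s->cond_depth > 0);
   s->cond_mask = s->cond_stack[--s->cond_depth];
}

/* Disable the currently active lanes instead of leaving the function. */
static void
ex_ret(struct ex_exec_state *s)
{
   uint8_t active = (uint8_t)(s->cond_mask & s->ret_mask);
   s->ret_mask &= (uint8_t)~active;
   s->ret_in_main = true;
}
```

In this model, a ret taken by lanes 0 and 1 inside an if leaves only lanes 2 and 3 active after the endif, which is the behaviour the fix is meant to preserve.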
11 years agofreedreno: clear fixes
Rob Clark [Tue, 5 Mar 2013 22:49:43 +0000 (17:49 -0500)]
freedreno: clear fixes

Some fixes for clearing only depth or only stencil.

Signed-off-by: Rob Clark <robdclark@gmail.com>
11 years agoradeonsi: enable indirect addressing
Christian König [Thu, 7 Mar 2013 11:00:18 +0000 (12:00 +0100)]
radeonsi: enable indirect addressing

Fixing 16 piglit tests.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
11 years agoradeonsi: implement indirect addressing of constants
Christian König [Thu, 7 Mar 2013 10:58:56 +0000 (11:58 +0100)]
radeonsi: implement indirect addressing of constants

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
11 years agoradeonsi: switch to using resource descriptors for constants v2
Christian König [Thu, 28 Feb 2013 13:50:07 +0000 (14:50 +0100)]
radeonsi: switch to using resource descriptors for constants v2

v2: remove superfluous mask, use buffer_size instead of constant

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
11 years agoradeon/llvm: rework input fetch and output store
Christian König [Thu, 7 Mar 2013 10:01:07 +0000 (11:01 +0100)]
radeon/llvm: rework input fetch and output store

Clean up the code and implement indirect addressing.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
11 years agotgsi: add initializer data to fix MSVC compile error
Brian Paul [Tue, 19 Mar 2013 13:55:48 +0000 (07:55 -0600)]
tgsi: add initializer data to fix MSVC compile error

11 years agotgsi: add ArrayID documentation v2
Christian König [Thu, 14 Mar 2013 10:10:16 +0000 (11:10 +0100)]
tgsi: add ArrayID documentation v2

v2: further improve the text with comments from Christoph Bumiller.

Signed-off-by: Christian König <christian.koenig@amd.com>
11 years agotgsi: use separate structure for indirect address v2
Christian König [Thu, 7 Mar 2013 14:02:31 +0000 (15:02 +0100)]
tgsi: use separate structure for indirect address v2

To further improve the optimization of source and destination
indirect addressing we need the ability to store a reference
to the declaration of the addressed operands.

Since most of the fields in tgsi_src_register don't apply to
an indirect addressing operand, replace it with a separate
tgsi_ind_register structure and so make room for extra information.

v2: rename Declaration to ArrayID, put the ArrayID into () instead of []

Signed-off-by: Christian König <christian.koenig@amd.com>
11 years agotgsi: add ArrayID to declarations
Christian König [Wed, 13 Mar 2013 13:58:15 +0000 (14:58 +0100)]
tgsi: add ArrayID to declarations

Remember which declarations are declared as "arrays" and so
can be indirectly addressed. ArrayIDs start at 1 because, for
compatibility reasons, zero is treated as no array present.

Signed-off-by: Christian König <christian.koenig@amd.com>
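
The numbering convention can be sketched as below. The structures and helpers are hypothetical stand-ins for illustration, not the TGSI declaration types; the only property taken from the commit message is that IDs start at 1 and 0 means "no array".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical declaration record: a register range plus an array ID. */
struct ex_decl {
   unsigned first;
   unsigned last;
   unsigned array_id;   /* 0 = not an array, valid IDs start at 1 */
};

/* Hypothetical allocator state for handing out array IDs. */
struct ex_decl_builder {
   unsigned next_array_id;
};

static void
ex_builder_init(struct ex_decl_builder *b)
{
   b->next_array_id = 1;   /* zero is reserved for "no array present" */
}

static struct ex_decl
ex_declare_range(struct ex_decl_builder *b, unsigned first, unsigned last,
                 bool is_array)
{
   struct ex_decl d;

   assert(first <= last);
   d.first = first;
   d.last = last;
   d.array_id = is_array ? b->next_array_id++ : 0;
   return d;
}

/* Only declarations with a non-zero ID may be indirectly addressed. */
static bool
ex_decl_is_array(const struct ex_decl *d)
{
   return d->array_id != 0;
}
```

Reserving zero means a zero-initialized declaration from older code is read as a plain, non-array declaration.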
11 years agotgsi: remove TGSI_FILE_(IMMEDIATE|TEMP)_ARRAY
Christian König [Thu, 7 Mar 2013 15:52:54 +0000 (16:52 +0100)]
tgsi: remove TGSI_FILE_(IMMEDIATE|TEMP)_ARRAY

Nobody seems to be using it, and only nv50 had a partial implementation.

Signed-off-by: Christian König <christian.koenig@amd.com>
11 years agoglsl_to_tgsi: remove indirect addressing limitations
Christian König [Sun, 10 Mar 2013 13:36:13 +0000 (14:36 +0100)]
glsl_to_tgsi: remove indirect addressing limitations

They shouldn't be necessary any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
11 years agoglsl_to_tgsi: allocate arrays separately v2
Christian König [Sun, 10 Mar 2013 13:33:29 +0000 (14:33 +0100)]
glsl_to_tgsi: allocate arrays separately v2

Instead of allocating everything as temporaries, use the
new array allocation functions.

v2: fix bug in simplify_cmp, declare arrays on demand

Signed-off-by: Christian König <christian.koenig@amd.com>
11 years agoglsl_to_tgsi: use get_temp for all allocations
Christian König [Fri, 8 Mar 2013 12:17:05 +0000 (13:17 +0100)]
glsl_to_tgsi: use get_temp for all allocations

Signed-off-by: Christian König <christian.koenig@amd.com>
11 years agotgsi/ureg: implement support for array temporaries
Christian König [Sun, 10 Mar 2013 12:44:25 +0000 (13:44 +0100)]
tgsi/ureg: implement support for array temporaries

Don't bother with free temporaries, just allocate them at
the end and also emit them in their own declaration.

Signed-off-by: Christian König <christian.koenig@amd.com>
11 years agotgsi/ureg: cleanup local temporary emission v2
Christian König [Fri, 8 Mar 2013 16:55:46 +0000 (17:55 +0100)]
tgsi/ureg: cleanup local temporary emission v2

Instead of emitting each temporary separately, emit them in a chunk.

v2: keep separate function for emitting temps

Signed-off-by: Christian König <christian.koenig@amd.com>
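
Emitting temporaries "in a chunk" can be pictured as coalescing consecutive registers into ranges. The sketch below is an assumption about the general idea, not the ureg code: the ex_* names are hypothetical, and grouping by a per-temporary "local" flag is a guess based on the commit title.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One emitted declaration covering temporaries first..last. */
struct ex_range {
   unsigned first;
   unsigned last;
   bool local;
};

/*
 * Walk the temporaries in order and start a new range only when the
 * "local" flag changes. Returns the number of ranges written to out.
 */
static size_t
ex_coalesce_temps(const bool *is_local, size_t nr_temps,
                  struct ex_range *out, size_t max_out)
{
   size_t n = 0;

   for (size_t i = 0; i < nr_temps; i++) {
      if (n > 0 && out[n - 1].local == is_local[i]) {
         out[n - 1].last = (unsigned)i;   /* extend the current chunk */
         continue;
      }
      assert(n < max_out);
      out[n].first = (unsigned)i;
      out[n].last = (unsigned)i;
      out[n].local = is_local[i];
      n++;
   }
   return n;
}
```

Six temporaries flagged as two non-local, three local, one non-local would produce three declarations instead of six.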
11 years agoradeon/llvm: Link against libgallium.la to fix an undefined symbol
Andreas Boll [Tue, 19 Mar 2013 10:55:41 +0000 (11:55 +0100)]
radeon/llvm: Link against libgallium.la to fix an undefined symbol

Ported from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/119-libllvmradeon-link.patch;h=ee47f8a07dbf33c32f8b57faed923680ed6648fb;hb=refs/heads/ubuntu%2B1

Fixes a regression introduced with
f70c3853513637fa6ed38e75f73d472a9fa61213

NOTE: This is a candidate for the 9.1 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62434
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
11 years agowayland: Add prime fd passing as a buffer sharing mechanism
Kristian Høgsberg [Sat, 2 Feb 2013 17:26:12 +0000 (12:26 -0500)]
wayland: Add prime fd passing as a buffer sharing mechanism

Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
11 years agoAdd dri image entry point for creating image from fd
Kristian Høgsberg [Sat, 2 Feb 2013 13:38:07 +0000 (08:38 -0500)]
Add dri image entry point for creating image from fd

Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
11 years agowayland: allocate a __DRIimage for the color buffer
Kristian Høgsberg [Sat, 2 Feb 2013 12:40:51 +0000 (07:40 -0500)]
wayland: allocate a __DRIimage for the color buffer

No functional change here, but this will let us query the image
for an fd handle later.

Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
11 years agoDRI2: HACK: no GLX_INTEL_swap_event if no ScheduleSwap
Rob Clark [Tue, 12 Mar 2013 23:31:58 +0000 (19:31 -0400)]
DRI2: HACK: no GLX_INTEL_swap_event if no ScheduleSwap

If ddx does not support swap, don't advertise it.  This is a hack to
work around current xservers which advertise this extension even when it
is clearly not supported.  When:

http://lists.x.org/archives/xorg-devel/2013-February/035449.html

is merged in upstream xserver and makes its way into most distros then
this hack can be removed.  In the meantime, it is required to allow
gnome-shell/clutter/etc to work properly with a DDX driver which does
not support ScheduleSwap.

Signed-off-by: Rob Clark <robdclark@gmail.com>
11 years agoi965/blorp: Add INTEL_DEBUG=blorp flag.
Paul Berry [Sat, 16 Mar 2013 17:32:21 +0000 (10:32 -0700)]
i965/blorp: Add INTEL_DEBUG=blorp flag.

This debug flag prints out the native GEN assembly for a blitting
shader produced using BLORP.  Hopefully this should be useful in
developing additional BLORP features.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years agor600g: properly set non_disp tiling mode for DMA (v2)
Alex Deucher [Fri, 15 Mar 2013 19:11:01 +0000 (15:11 -0400)]
r600g: properly set non_disp tiling mode for DMA (v2)

Needs to be set for depth, stencil, and fmask just
like other blocks.

v2: drop additional cayman bits for now

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
11 years agor600g: Use blitter rather than DMA for 128bpp on cayman (v3)
Alex Deucher [Fri, 15 Mar 2013 18:29:24 +0000 (14:29 -0400)]
r600g: Use blitter rather than DMA for 128bpp on cayman (v3)

On cayman, 128bpp surfaces require non_disp ordering for hw
access to both linear and tiled surfaces.  When we use the 3D
engine we can set the non_disp ordering on both the tiled and
linear sides (via CB or texture), but when we use the DMA
engine, we can only set the non_disp ordering on the tiled
side, so after a L2T operation with the DMA engine, the data
ends up in the wrong order on the tiled side.

v2: cayman/TN only

v3: fix comments

Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=60802

Note: this is a candidate for the 9.1 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
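
The resulting choice of copy path might look roughly like this. It is a sketch under assumptions: the chip enum and ex_can_use_dma_copy() are hypothetical, and a single "cayman class" value stands in for the cayman/TN parts mentioned in v2.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical chip classes, reduced to what the decision needs. */
enum ex_chip_class {
   EX_CHIP_OTHER,
   EX_CHIP_CAYMAN_CLASS
};

/*
 * Returns false when the copy should fall back to the 3D blitter.
 * On the cayman class, 128bpp surfaces need non_disp ordering on both
 * the linear and tiled sides, which the DMA engine cannot provide.
 */
static bool
ex_can_use_dma_copy(enum ex_chip_class chip, unsigned bits_per_pixel)
{
   assert(bits_per_pixel > 0 && bits_per_pixel % 8 == 0);

   if (chip == EX_CHIP_CAYMAN_CLASS && bits_per_pixel == 128)
      return false;
   return true;
}
```

Other formats on the same hardware keep the DMA path, so only the affected case pays the cost of the blitter.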
11 years agoi965: Simplify separate stencil check
Paul Berry [Wed, 13 Mar 2013 20:48:13 +0000 (13:48 -0700)]
i965: Simplify separate stencil check

The only format returned by _mesa_get_format_base_format() that
satisfies _mesa_is_depthstencil_format() is GL_DEPTH_STENCIL, so we
can simplify the check.

Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agogallium/build: Fix visibility CFLAGS in automake
Maarten Lankhorst [Thu, 21 Feb 2013 17:07:52 +0000 (18:07 +0100)]
gallium/build: Fix visibility CFLAGS in automake

v2: Andreas Boll <andreas.boll.dev@gmail.com>
    - Fix formatting - use one CFLAG per line

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59238
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
11 years agoscons: Warn when using MSVS versions prior to 2012.
José Fonseca [Fri, 15 Mar 2013 15:23:54 +0000 (15:23 +0000)]
scons: Warn when using MSVS versions prior to 2012.

Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agoi965: Apply depthstencil alignment workaround when doing fast clears.
Paul Berry [Fri, 8 Mar 2013 20:03:10 +0000 (12:03 -0800)]
i965: Apply depthstencil alignment workaround when doing fast clears.

Fast depth clears have the same depth/stencil alignment requirements
as other drawing operations.  Therefore, we need to call
brw_workaround_depthstencil_alignment() from both the clear and
drawing paths.

Without this fix, we get image corruption if the following conditions
hold: (a) the first ever drawing operation to a depth miplevel (or the
first drawing operation after having used the texture for sampling) is
a clear, (b) the depth miplevel has a size that is eligible for fast
depth clears, and (c) the depth miplevel has an offset within the
miptree that isn't 8x8 aligned.

Fixes piglit "depthstencil-render-miplevels" tests with size 273.

NOTE: This is a candidate for stable branches

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
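
Condition (c) is a plain alignment test on the miplevel's offset within the miptree. A minimal sketch, assuming hypothetical names and pixel offsets passed as unsigned integers:

```c
#include <assert.h>
#include <stdbool.h>

#define EX_ALIGN_PIXELS 8u   /* the 8x8 alignment named in the commit */

/* True when both offsets are multiples of EX_ALIGN_PIXELS. */
static bool
ex_offset_is_aligned(unsigned x_offset, unsigned y_offset)
{
   /* The bit trick below is only valid for a power-of-two alignment. */
   assert((EX_ALIGN_PIXELS & (EX_ALIGN_PIXELS - 1)) == 0);
   return ((x_offset | y_offset) & (EX_ALIGN_PIXELS - 1)) == 0;
}

/* A misaligned miplevel needs the workaround before any clear or draw. */
static bool
ex_needs_alignment_workaround(unsigned x_offset, unsigned y_offset)
{
   return !ex_offset_is_aligned(x_offset, y_offset);
}
```

The same predicate would be evaluated on both the clear path and the drawing path, since fast clears share the alignment requirement.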
11 years agoReplace gl_frag_attrib enum with gl_varying_slot.
Paul Berry [Sat, 23 Feb 2013 17:00:58 +0000 (09:00 -0800)]
Replace gl_frag_attrib enum with gl_varying_slot.

This patch makes the following search-and-replace changes:

gl_frag_attrib -> gl_varying_slot
FRAG_ATTRIB_* -> VARYING_SLOT_*
FRAG_BIT_* -> VARYING_BIT_*

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
11 years agoGet rid of _mesa_frag_attrib_to_vert_result().
Paul Berry [Sat, 23 Feb 2013 16:36:40 +0000 (08:36 -0800)]
Get rid of _mesa_frag_attrib_to_vert_result().

Now that there is no difference between the enums that represent
vertex outputs and fragment inputs, there's no need for a conversion
function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
11 years agoGet rid of _mesa_vert_result_to_frag_attrib().
Paul Berry [Sat, 23 Feb 2013 16:28:18 +0000 (08:28 -0800)]
Get rid of _mesa_vert_result_to_frag_attrib().

Now that there is no difference between the enums that represent
vertex outputs and fragment inputs, there's no need for a conversion
function.  But we still need to be able to detect when a given vertex
output has no corresponding fragment input.  So it is replaced by a
new function, _mesa_varying_slot_in_fs(), which tells whether the
given varying slot exists as an FS input or not.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Brian Paul <brianp@vmware.com>
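
The shape of such a predicate can be sketched as follows. The enum here is a small hypothetical stand-in, and which slots are excluded is an assumption made for illustration; the real gl_varying_slot list is longer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, shortened slot list shared by VS outputs and FS inputs. */
enum ex_varying_slot {
   EX_SLOT_POS,
   EX_SLOT_COL0,
   EX_SLOT_PSIZ,
   EX_SLOT_BFC0,
   EX_SLOT_EDGE,
   EX_SLOT_TEX0,
   EX_SLOT_COUNT
};

/* Tells whether a slot exists as a fragment shader input. */
static bool
ex_varying_slot_in_fs(enum ex_varying_slot slot)
{
   assert((int)slot >= 0 && (int)slot < EX_SLOT_COUNT);

   switch (slot) {
   case EX_SLOT_PSIZ:   /* assumed vertex-only in this sketch */
   case EX_SLOT_BFC0:
   case EX_SLOT_EDGE:
      return false;
   default:
      return true;
   }
}

/* Filter a bitmask of vertex outputs down to possible FS inputs. */
static uint64_t
ex_fs_inputs_from_vs_outputs(uint64_t vs_outputs)
{
   uint64_t fs_inputs = 0;

   for (int slot = 0; slot < EX_SLOT_COUNT; slot++) {
      uint64_t bit = UINT64_C(1) << slot;
      if ((vs_outputs & bit) &&
          ex_varying_slot_in_fs((enum ex_varying_slot)slot))
         fs_inputs |= bit;
   }
   return fs_inputs;
}
```

Because both stages share one enum, no value translation is needed; the predicate only filters out slots that have no fragment-side counterpart.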