mesa.git
11 years ago i965/vs: Fix textureGrad() with shadow samplers on Haswell.
Kenneth Graunke [Wed, 13 Feb 2013 05:51:17 +0000 (21:51 -0800)]
i965/vs: Fix textureGrad() with shadow samplers on Haswell.

The shadow comparator needs to be loaded into the Z component of the
last DWord.

Fixes es3conform's shadow_execution_vert and oglconform's
shadow-grad advanced.textureGrad.1D tests on Haswell.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago i965: Lower textureGrad() for samplerCubeShadow.
Kenneth Graunke [Wed, 13 Feb 2013 05:51:16 +0000 (21:51 -0800)]
i965: Lower textureGrad() for samplerCubeShadow.

According to the Ivybridge PRM, Volume 4 Part 1, page 130, in the
section for the sample_d message: "The r coordinate contains the faceid,
and the r gradients are ignored by hardware."

This doesn't match GLSL, which provides gradients for all of the
coordinates.  So we would need to do some math to compute the face ID
before using sample_d.  We currently don't have any code to do that.

However, we do have a lowering pass that converts textureGrad to
textureLod, which solves this problem.  Since textureGrad on three
components is sufficiently obscure, it's not a performance path.

For now, only handle samplerCubeShadow; we need tests for samplerCube
and samplerCubeArray.

Fixes es3conform's shadow_comparison_frag test on Haswell.

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago radeon/uvd: fix quant scan order for mpeg2
Christian König [Tue, 30 Apr 2013 17:38:24 +0000 (19:38 +0200)]
radeon/uvd: fix quant scan order for mpeg2

Signed-off-by: Christian König <christian.koenig@amd.com>
11 years ago st/vdpau: fix background handling in the mixer
Christian König [Tue, 30 Apr 2013 12:55:14 +0000 (14:55 +0200)]
st/vdpau: fix background handling in the mixer

Signed-off-by: Christian König <christian.koenig@amd.com>
11 years ago vl/buffer: use 2D_ARRAY instead of 3D textures
Christian König [Tue, 30 Apr 2013 12:40:40 +0000 (14:40 +0200)]
vl/buffer: use 2D_ARRAY instead of 3D textures

Signed-off-by: Christian König <christian.koenig@amd.com>
11 years ago vl/compositor: cleanup background clearing
Christian König [Mon, 29 Apr 2013 15:43:04 +0000 (17:43 +0200)]
vl/compositor: cleanup background clearing

Add an extra parameter to specify if we should clear the render target.

Signed-off-by: Christian König <christian.koenig@amd.com>
11 years ago swrast: add casts for ImageSlices pointer arithmetic
Brian Paul [Tue, 30 Apr 2013 19:35:23 +0000 (13:35 -0600)]
swrast: add casts for ImageSlices pointer arithmetic

MSVC doesn't like pointer arithmetic with void * so use GLubyte *.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
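
A minimal sketch of the portability issue, not the actual swrast code.
Arithmetic on a void * is a GNU extension, so the pointer is cast to a byte
type before adding an offset. The function slice_address() and its parameters
are hypothetical, and unsigned char stands in for GLubyte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the address of slice `index` in a buffer
 * whose slices are `bytes_per_slice` bytes apart.  unsigned char stands
 * in for GLubyte here. */
void *
slice_address(void *base, size_t index, size_t bytes_per_slice)
{
   assert(base != NULL);
   /* `base + n` on a void * is a GNU extension that MSVC rejects;
    * casting to a byte pointer makes the arithmetic standard C. */
   return (unsigned char *) base + index * bytes_per_slice;
}
```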
11 years ago ilo: fix PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS
Chia-I Wu [Wed, 1 May 2013 09:40:50 +0000 (17:40 +0800)]
ilo: fix PIPE_CAP_MAX_STREAM_OUTPUT_BUFFERS

On GEN7+, is->dev.has_gen7_sol_reset is required.

11 years ago ilo: enable SO support on GEN7
Chia-I Wu [Mon, 29 Apr 2013 00:47:33 +0000 (08:47 +0800)]
ilo: enable SO support on GEN7

11 years ago ilo: reset SO write offsets for new SO targets
Chia-I Wu [Sun, 28 Apr 2013 23:26:37 +0000 (07:26 +0800)]
ilo: reset SO write offsets for new SO targets

When the SO targets are changed and no appending is requested, we need to send
SOL_RESET on GEN7+.

11 years ago ilo: correctly program SO states for GEN7
Chia-I Wu [Sun, 28 Apr 2013 23:47:05 +0000 (07:47 +0800)]
ilo: correctly program SO states for GEN7

With the commands supported by GPE, we can finally program the states.

11 years ago ilo: implement GEN7 SO GPE functions
Chia-I Wu [Sun, 28 Apr 2013 19:27:29 +0000 (03:27 +0800)]
ilo: implement GEN7 SO GPE functions

They were just stubs before.

11 years ago ilo: add gen6_pipeline_update_max_svbi()
Chia-I Wu [Wed, 1 May 2013 08:58:10 +0000 (16:58 +0800)]
ilo: add gen6_pipeline_update_max_svbi()

Move max_svbi calculation to a helper function and make it available for other
GENs.

11 years ago ilo: expose register indices of OUTs in ilo_shader
Chia-I Wu [Mon, 29 Apr 2013 00:25:27 +0000 (08:25 +0800)]
ilo: expose register indices of OUTs in ilo_shader

pipe_stream_output_info tells us which of OUT[i] needs to be written out.
We need the info to map OUT[i] to VUE offset.

11 years ago ilo: allow one-off flags to be specified for CP
Chia-I Wu [Sun, 28 Apr 2013 23:22:00 +0000 (07:22 +0800)]
ilo: allow one-off flags to be specified for CP

It will be used for SOL_RESET on GEN7.

11 years ago ilo: fix tiling/size for special-purpose resources
Chia-I Wu [Tue, 30 Apr 2013 07:30:01 +0000 (15:30 +0800)]
ilo: fix tiling/size for special-purpose resources

We do not allocate such resources yet though.

11 years ago ilo: use UMS layout for render targets
Chia-I Wu [Tue, 30 Apr 2013 04:55:18 +0000 (12:55 +0800)]
ilo: use UMS layout for render targets

As we do not advertise MSAA support, this change should not make any
difference yet.

11 years ago ilo: support and prefer compact array spacing
Chia-I Wu [Tue, 30 Apr 2013 04:14:29 +0000 (12:14 +0800)]
ilo: support and prefer compact array spacing

There is no reason to waste the memory when the HW can support compact array
spacing (ARYSPC_LOD0).

11 years ago ilo: move device limits to ilo_dev_info or to GPEs
Chia-I Wu [Mon, 29 Apr 2013 02:56:36 +0000 (10:56 +0800)]
ilo: move device limits to ilo_dev_info or to GPEs

It seems a bit weird to have device limits in a context.

11 years ago ilo: use ilo_dev_info in toy compiler
Chia-I Wu [Mon, 29 Apr 2013 02:14:04 +0000 (10:14 +0800)]
ilo: use ilo_dev_info in toy compiler

We need only dev->gen, but it makes sense to expose other information to the
compiler.

11 years ago ilo: use ilo_dev_info in GPE and 3D pipeline
Chia-I Wu [Mon, 29 Apr 2013 01:58:51 +0000 (09:58 +0800)]
ilo: use ilo_dev_info in GPE and 3D pipeline

We need only dev->gen and dev->gt, but it makes sense to expose other
information to the pipeline.

11 years ago ilo: add ilo_dev_info shared by the screen and contexts
Chia-I Wu [Mon, 29 Apr 2013 01:41:11 +0000 (09:41 +0800)]
ilo: add ilo_dev_info shared by the screen and contexts

The struct is used to describe the device information, such as PCI ID, GEN,
GT, etc.

11 years ago ilo: fix indentation of ilo_gpe_gen*.h
Chia-I Wu [Mon, 29 Apr 2013 02:03:59 +0000 (10:03 +0800)]
ilo: fix indentation of ilo_gpe_gen*.h

11 years ago glsl: Ignore redundant prototypes after a function's been defined.
Kenneth Graunke [Tue, 30 Apr 2013 07:58:09 +0000 (00:58 -0700)]
glsl: Ignore redundant prototypes after a function's been defined.

Consider the following shader:

    vec4 f(vec4 v) { return v; }
    vec4 f(vec4 v);

The prototype exactly matches the signature of the earlier definition,
so there's absolutely no point in it.  However, it doesn't appear to
be illegal.  The GLSL 4.30 specification offers two relevant quotes:

"If a function name is declared twice with the same parameter types,
 then the return types and all qualifiers must also match, and it is the
 same function being declared."

"User-defined functions can have multiple declarations, but only one
 definition."

In this case the same function was declared twice, and there's only one
definition, which fits both pieces of text.  There doesn't appear to be
any text saying late prototypes are illegal, so presumably it's valid.

Unfortunately, it currently triggers an assertion failure:
ir_dereference_variable @ <p1> specifies undeclared variable `v' @ <p2>

When we process the second line, we look for an existing exact match so
we can enforce the one-definition rule.  We then leave sig set to that
existing function, and hit sig->replace_parameters(&hir_parameters),
unfortunately nuking our existing definition's parameters (which have
actual dereferences) with the prototype's bogus unused parameters.

Simply bailing out and ignoring such late prototypes is the safest
thing to do.

Fixes Piglit's late-proto.vert as well as 3DMark/Ice Storm for Android.

NOTE: This is a candidate for stable branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
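
The bail-out described in this commit can be modelled with a toy example. This
is only a sketch under stated assumptions: struct toy_signature and
toy_process_declaration() are hypothetical and do not mirror Mesa's
ir_function_signature or the real AST-to-HIR code. The point is that a
prototype arriving after the definition must not replace the parameter list
the function body refers to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a function signature; not Mesa's real IR type. */
struct toy_signature {
   const char *param_names;   /* parameter list currently attached */
   bool is_defined;           /* true once a body has been seen */
};

/* Process a declaration whose parameter types exactly match `sig`.
 * Returns false for a second definition (one-definition rule). */
bool
toy_process_declaration(struct toy_signature *sig,
                        const char *param_names, bool has_body)
{
   assert(sig != NULL && param_names != NULL);

   if (sig->is_defined) {
      if (has_body)
         return false;   /* redefinition: an error */
      return true;       /* late prototype: ignore it entirely */
   }

   /* Only a not-yet-defined signature may take on the new parameters. */
   sig->param_names = param_names;
   sig->is_defined = has_body;
   return true;
}
```

Feeding it a definition followed by a matching prototype leaves the
definition's parameters in place, which is the behaviour the fix aims for.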
11 years ago docs: Import 9.1.2 release notes, add news item.
Ian Romanick [Tue, 30 Apr 2013 22:33:01 +0000 (15:33 -0700)]
docs: Import 9.1.2 release notes, add news item.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago build: Remove libws_xlib.la from GALLIUM_PIPE_LOADER_LIBS.
Matt Turner [Mon, 22 Apr 2013 21:28:50 +0000 (14:28 -0700)]
build: Remove libws_xlib.la from GALLIUM_PIPE_LOADER_LIBS.

The three users of GALLIUM_PIPE_LOADER_LIBS (OpenCL, gallium-gbm,
gallium tests) don't appear to need libws_xlib.la.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago build: Remove libpipe_loader.la from GALLIUM_PIPE_LOADER_LIBS.
Matt Turner [Mon, 22 Apr 2013 20:42:02 +0000 (13:42 -0700)]
build: Remove libpipe_loader.la from GALLIUM_PIPE_LOADER_LIBS.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago build: Remove HAVE_PIPE_LOADER_SW.
Matt Turner [Mon, 22 Apr 2013 19:10:27 +0000 (12:10 -0700)]
build: Remove HAVE_PIPE_LOADER_SW.

It guarded the function prototype of pipe_loader_sw_probe, whose use (in
pipe_loader.c) and definition (in pipe_loader_sw.c) were not guarded.
Both are built into libpipe_loader.la if HAVE_LOADER_GALLIUM, which is
enable_gallium_loader in configure.ac.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago build: Remove libws_null.la from GALLIUM_PIPE_LOADER_LIBS.
Matt Turner [Mon, 22 Apr 2013 19:07:13 +0000 (12:07 -0700)]
build: Remove libws_null.la from GALLIUM_PIPE_LOADER_LIBS.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago build: Rename PIPE_LOADER_HAVE_XCB to HAVE_PIPE_LOADER_XCB.
Matt Turner [Mon, 22 Apr 2013 18:50:29 +0000 (11:50 -0700)]
build: Rename PIPE_LOADER_HAVE_XCB to HAVE_PIPE_LOADER_XCB.

For consistency, since we already have HAVE_PIPE_LOADER_{SW,DRM}.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago configure.ac: Remove unused HAVE_PIPE_LOADER_XLIB macro.
Matt Turner [Mon, 22 Apr 2013 18:41:26 +0000 (11:41 -0700)]
configure.ac: Remove unused HAVE_PIPE_LOADER_XLIB macro.

Added in e1364530 but never used.

Tested-by: Tom Stellard <thomas.stellard@amd.com>
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago i965: Stop passing num_samples to intel_miptree_alloc_hiz().
Paul Berry [Thu, 25 Apr 2013 17:57:48 +0000 (10:57 -0700)]
i965: Stop passing num_samples to intel_miptree_alloc_hiz().

The number of samples is already available in the miptree data
structure, so there's no need to pass it in.

I suspect this may fix a subtle bug because in one case
(intel_renderbuffer_update_wrapper) we were always passing zero for
num_samples, even though the buffer in question was not guaranteed to
be single-sampled.  But I wasn't able to find a failing test case.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago draw: don't crash if GS doesn't emit anything
Zack Rusin [Sat, 27 Apr 2013 12:55:36 +0000 (08:55 -0400)]
draw: don't crash if GS doesn't emit anything

Technically it's legal for a geometry shader to not emit any
vertices. It's silly, but perfectly legal, so let's make draw
stop crashing if it happens.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
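
As an illustration only, a guard of this kind might look like the sketch
below. The types and toy_process_gs_output() are hypothetical stand-ins, not
the draw module's real structures. The early return keeps later code from
dereferencing vertex storage that was never filled in.

```c
#include <assert.h>
#include <stddef.h>

/* Toy result of a geometry shader run; not the draw module's real types. */
struct toy_gs_output {
   const float *positions;   /* x coordinates; NULL when nothing was emitted */
   size_t vertex_count;
};

/* Hypothetical post-GS step: returns how many vertices it processed and
 * stores the largest x coordinate in *max_x when there is at least one. */
size_t
toy_process_gs_output(const struct toy_gs_output *out, float *max_x)
{
   assert(out != NULL && max_x != NULL);

   /* Emitting no vertices is legal, so return before touching the
    * (possibly NULL) vertex storage. */
   if (out->vertex_count == 0)
      return 0;

   *max_x = out->positions[0];
   for (size_t i = 1; i < out->vertex_count; i++) {
      if (out->positions[i] > *max_x)
         *max_x = out->positions[i];
   }
   return out->vertex_count;
}
```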
11 years ago i965: Implement color clears using a simple shader in blorp.
Eric Anholt [Tue, 19 Mar 2013 19:40:10 +0000 (12:40 -0700)]
i965: Implement color clears using a simple shader in blorp.

The upside is less CPU overhead in fiddling with GL error handling, the
ability to use the constant color write message in most cases, and no GLSL
clear shaders appearing in MESA_GLSL=dump output.  The downside is more
batch flushing and a total recompute of GL state at the end of blorp.
However, if we're ever going to use the fast color clear feature of CMS
surfaces, we'll need this anyway since it requires very special state
setup.

This increases the fail rate of some of the GLES3conform ARB_sync tests,
because of the initial flush at the start of blorp.  The tests already
intermittently failed (because it's just a bad testing procedure), and we
can return it to its previous fail rate by fixing the initial flush.

Improves GLB2.7 performance 0.37% +/- 0.11% (n=71/70, outlier removed).

v2: Rename the key member, use the core helper for sRGB, and use
    BRW_MASK_* enums, fix comment and indentation (review by Paul).
v3: Rewrite a comment, drop a silly temporary variable (review by Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago mesa: Make a Mesa core function for sRGB render encoding handling.
Eric Anholt [Thu, 18 Apr 2013 16:20:55 +0000 (09:20 -0700)]
mesa: Make a Mesa core function for sRGB render encoding handling.

v2: const-qualify ctx, and add a comment about the function (recommended
    by Brian and Kenneth).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
11 years ago i965: Don't flush the batch at the end of blorp.
Eric Anholt [Fri, 8 Feb 2013 02:46:18 +0000 (18:46 -0800)]
i965: Don't flush the batch at the end of blorp.

Improves GLB2.7 performance 0.13% +/- 0.09% (n=104/105, outliers removed).
More importantly, once color glClear()s are done through blorp in the next
commit, this reduces regression in GLES3 conformance tests that rely on
queueing up many glClear()s and having the GPU report being still busy in
an ARB_sync query after that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago r600g/sb: remove unused code
Vadim Girlin [Tue, 30 Apr 2013 17:01:10 +0000 (21:01 +0400)]
r600g/sb: remove unused code

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
11 years ago r600g/sb: collect shader statistics
Vadim Girlin [Tue, 30 Apr 2013 16:58:52 +0000 (20:58 +0400)]
r600g/sb: collect shader statistics

Collects various statistical information for each shader
and total stats for contexts.

Printed with R600_DEBUG=sb,sbstat

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
11 years ago r600g/sb: don't propagate dead values in GVN pass
Vadim Girlin [Tue, 30 Apr 2013 16:50:24 +0000 (20:50 +0400)]
r600g/sb: don't propagate dead values in GVN pass

In some cases we use the value::gvn_source field to link values that
are known to be equal before the GVN pass (e.g. results of DOT4 in different
slots of the same alu group), but the source value may become dead later,
which confuses further passes.

This patch resets value::gvn_source to NULL in the dce_cleanup pass
if it points to a dead value.

Fixes segfault during shader optimization with ETQW.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
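
The cleanup step can be sketched in plain C as follows. This is an
illustration under assumptions: the real r600-sb value class is C++ and far
richer, and struct toy_value and toy_dce_cleanup() are hypothetical names that
keep only the two fields the commit message mentions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy value node; only the fields relevant to the fix are modelled. */
struct toy_value {
   struct toy_value *gvn_source;   /* value known to be equal, or NULL */
   bool is_dead;
};

/* Drop gvn_source links that point at dead values so that later passes
 * never follow a link to a value that no longer exists in the program. */
void
toy_dce_cleanup(struct toy_value *values, size_t count)
{
   assert(values != NULL || count == 0);
   for (size_t i = 0; i < count; i++) {
      struct toy_value *src = values[i].gvn_source;
      if (src != NULL && src->is_dead)
         values[i].gvn_source = NULL;
   }
}
```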
11 years ago r600g/sb: use simple heuristic to limit register pressure
Vadim Girlin [Sat, 27 Apr 2013 08:03:39 +0000 (12:03 +0400)]
r600g/sb: use simple heuristic to limit register pressure

It's not complete register pressure tracking, yet it helps to prevent
register allocation problems in some cases where they were observed.

The problems are uncovered by false dependencies between fetch instructions
introduced by some recent changes in TGSI and/or default backend.
Sometimes we have code like this:

...
SAMPLE R5.xyzw, R5.xyzw
... store R5.xyzw somewhere
MOV R5.x, <next x coord>
MOV R5.y, <next y coord>
SAMPLE R5.xyzw, R5.xyzw
... <may be repeated a lot of times>

With 2D resources, z and w in the SAMPLE src reg aren't used and can simply be
masked, but the shader backend doesn't have this information, so they are
treated as a data dependency by the optimization algorithms.

11 years ago r600g/sb: improve error checking in ra_coalesce pass
Vadim Girlin [Tue, 23 Apr 2013 06:34:42 +0000 (10:34 +0400)]
r600g/sb: improve error checking in ra_coalesce pass

11 years ago r600g/sb: use source bytecode in case of optimization errors
Vadim Girlin [Tue, 23 Apr 2013 06:34:00 +0000 (10:34 +0400)]
r600g/sb: use source bytecode in case of optimization errors

11 years ago r600g: plug in optimizing backend
Vadim Girlin [Tue, 30 Apr 2013 16:53:15 +0000 (20:53 +0400)]
r600g: plug in optimizing backend

Optimization is enabled with "R600_DEBUG=sb".

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
11 years ago r600g/sb: initial commit of the optimizing shader backend
Vadim Girlin [Tue, 30 Apr 2013 16:51:36 +0000 (20:51 +0400)]
r600g/sb: initial commit of the optimizing shader backend

11 years ago r600g: use enum type for domains field in struct r600_resource
Vadim Girlin [Sun, 21 Apr 2013 15:10:32 +0000 (19:10 +0400)]
r600g: use enum type for domains field in struct r600_resource

This prevents problems when the header is included in C++ code.

11 years ago r600g: add new flags to isa instruction tables
Vadim Girlin [Sun, 21 Apr 2013 15:11:36 +0000 (19:11 +0400)]
r600g: add new flags to isa instruction tables

11 years ago r600g: always create reverse lookup isa tables
Vadim Girlin [Fri, 1 Feb 2013 09:51:25 +0000 (13:51 +0400)]
r600g: always create reverse lookup isa tables

11 years ago r600g: mask unused source components for SAMPLE
Vadim Girlin [Thu, 25 Apr 2013 15:42:31 +0000 (19:42 +0400)]
r600g: mask unused source components for SAMPLE

This results in cleaner shader code and may improve the quality of
optimized code produced by r600-sb due to eliminated false dependencies
in some cases.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
11 years ago intel: Remove the last spans code!
Eric Anholt [Fri, 19 Apr 2013 21:51:55 +0000 (14:51 -0700)]
intel: Remove the last spans code!

The remaining bits happen to do nothing that
_swrast_span_render_start()/finish() don't do.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago intel: Move the S8 offset calc function near its remaining usage.
Eric Anholt [Fri, 19 Apr 2013 21:50:43 +0000 (14:50 -0700)]
intel: Move the S8 offset calc function near its remaining usage.

It's not really span code ever since we stopped using spans for S8.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago intel: Ensure renderbuffers are current when mapping them.
Eric Anholt [Fri, 19 Apr 2013 21:47:28 +0000 (14:47 -0700)]
intel: Ensure renderbuffers are current when mapping them.

In the case of rendering to windows in X, we would render to stale buffers
(or not render at all!) if you hit a MapRenderbuffer as the first thing
done to your window after new buffers are ready to be collected in DRI2.

I think this also covers the weird comment about irb->mt being missing
sometimes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago mesa: Add a clarifying comment about rowStride of compressed textures.
Eric Anholt [Fri, 19 Apr 2013 20:57:17 +0000 (13:57 -0700)]
mesa: Add a clarifying comment about rowStride of compressed textures.

I always forget how we do this for compressed textures.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago mesa: Remove the Map field from texture images.
Eric Anholt [Fri, 19 Apr 2013 20:00:02 +0000 (13:00 -0700)]
mesa: Remove the Map field from texture images.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago swrast: Always use MapTextureImage for mapping textures for swrast.
Eric Anholt [Fri, 19 Apr 2013 20:10:55 +0000 (13:10 -0700)]
swrast: Always use MapTextureImage for mapping textures for swrast.

Now that everything goes through ImageSlices[], we can rely on the
driver's existing texture mapping function.

A big block of code goes away on Radeon. It looks like it was there to deal
with the validation that happened at SpanRenderStart, which no longer occurs
since we don't need validation for the MapTextureImage hook.

v2: Rewrite comment about ImageSlices, fix duplicated swImages, touch up
    unmap loop.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago nouveau: Replace swrast_texture_image->Map usage with ->Buffer.
Eric Anholt [Sat, 20 Apr 2013 04:32:06 +0000 (21:32 -0700)]
nouveau: Replace swrast_texture_image->Map usage with ->Buffer.

This code is trying to deal with providing a map in the case that
AllocTexImageBuffer was called, which is hooked up to the swrast variant.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago nouveau: Just use MapTextureImage instead of duplicating the logic.
Eric Anholt [Sat, 20 Apr 2013 05:05:21 +0000 (22:05 -0700)]
nouveau: Just use MapTextureImage instead of duplicating the logic.

MapTextureImage has the exact same logic, except it can also handle
swrast-allocated buffers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago swrast: Make a teximage's stored RowStride be in terms of bytes per row.
Eric Anholt [Fri, 19 Apr 2013 21:00:22 +0000 (14:00 -0700)]
swrast: Make a teximage's stored RowStride be in terms of bytes per row.

For hardware drivers with pitch alignment requirements, a
non-power-of-two-sized texture format won't end up being an integer number
of pixels per row.  This also avoids having to change our units between
MapTextureImage's rowStride and swrast's RowStride.

This doesn't fully convert the compressed texel fetch path, but does make
sure we don't drop any bits (not that we'd expect to).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
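
To illustrate why a byte stride is the more general unit, here is a small
sketch. The helper toy_texel_address() and its parameters are hypothetical and
not part of swrast. It addresses a texel from a row stride stored in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Address of texel (x, y) when the row stride is stored in bytes. */
const unsigned char *
toy_texel_address(const unsigned char *map, size_t row_stride_bytes,
                  size_t bytes_per_texel, size_t x, size_t y)
{
   assert(map != NULL);
   return map + y * row_stride_bytes + x * bytes_per_texel;
}
```

With a 3-byte texel and a pitch aligned to 64 bytes, for example, a row holds
64 / 3 texels, which is not an integer, so a stride counted in texels could
not describe that layout while a stride in bytes can.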
11 years ago swrast: Replace use of teximage Map in 1D/2D paths with ImageSlices[0].
Eric Anholt [Fri, 19 Apr 2013 19:51:20 +0000 (12:51 -0700)]
swrast: Replace use of teximage Map in 1D/2D paths with ImageSlices[0].

This gets us ready for the Map field to die.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago swrast: Replace ImageOffsets with an ImageSlices pointer.
Eric Anholt [Fri, 19 Apr 2013 18:44:53 +0000 (11:44 -0700)]
swrast: Replace ImageOffsets with an ImageSlices pointer.

This is a step toward allowing drivers to use their normal mapping paths,
instead of requiring that all slice mappings come from an aligned offset
from the first slice's map.

This incidentally fixes missing slice handling in FXT1 swrast.

v2: Use slice height helper function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
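
The difference between offsets and per-slice pointers can be sketched as
below. This is a simplified model, not the real swrast_texture_image: struct
toy_texture_image, TOY_MAX_SLICES and toy_slice_row() are hypothetical. Each
slice carries its own pointer, so slices no longer need to sit at fixed
offsets from the first slice's mapping.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_SLICES 4

/* Toy texture image: one independent pointer per slice. */
struct toy_texture_image {
   unsigned char *image_slices[TOY_MAX_SLICES];
   size_t row_stride_bytes;
};

/* Return the start of `row` within `slice`. */
unsigned char *
toy_slice_row(const struct toy_texture_image *img, size_t slice, size_t row)
{
   assert(img != NULL && slice < TOY_MAX_SLICES);
   assert(img->image_slices[slice] != NULL);
   return img->image_slices[slice] + row * img->row_stride_bytes;
}
```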
11 years ago swrast: Reuse _swrast_free_texture_image_buffer from drivers.
Eric Anholt [Fri, 19 Apr 2013 18:57:28 +0000 (11:57 -0700)]
swrast: Reuse _swrast_free_texture_image_buffer from drivers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago swrast: Move ImageOffsets allocation to shared code.
Eric Anholt [Fri, 19 Apr 2013 18:56:35 +0000 (11:56 -0700)]
swrast: Move ImageOffsets allocation to shared code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago swrast: Clean up and explain the mapping process.
Eric Anholt [Fri, 19 Apr 2013 20:35:31 +0000 (13:35 -0700)]
swrast: Clean up and explain the mapping process.

v2: Move slice height calculation to a helper function (recommended by Brian).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago swrast: Factor out texture slice counting.
Eric Anholt [Fri, 19 Apr 2013 20:30:34 +0000 (13:30 -0700)]
swrast: Factor out texture slice counting.

This function is going to get used a lot more in upcoming patches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago radeon: Remove some dead teximage mapping code.
Eric Anholt [Fri, 19 Apr 2013 22:45:33 +0000 (15:45 -0700)]
radeon: Remove some dead teximage mapping code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago radeon: Add missing swrast field initialization.
Eric Anholt [Fri, 19 Apr 2013 18:52:36 +0000 (11:52 -0700)]
radeon: Add missing swrast field initialization.

This is the equivalent of intel's
80513ec8b4c812b9c6249cc5824337a5f04ab34c.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago r600g/llvm: Fix opencl build
Vincent Lejeune [Tue, 30 Apr 2013 14:08:58 +0000 (16:08 +0200)]
r600g/llvm: Fix opencl build

11 years ago Gallium: Use mmap on Haiku for executable memory vs malloc
Alexander von Gluck IV [Mon, 29 Apr 2013 23:08:41 +0000 (18:08 -0500)]
Gallium: Use mmap on Haiku for executable memory vs malloc

* Haiku now has DEP enabled by default.

11 years ago Mapi: Use mmap on Haiku for executable memory vs malloc
Alexander von Gluck IV [Mon, 29 Apr 2013 23:08:26 +0000 (18:08 -0500)]
Mapi: Use mmap on Haiku for executable memory vs malloc

* Haiku now has DEP enabled by default.

11 years ago Mesa: Use mmap on Haiku for executable memory vs malloc
Alexander von Gluck IV [Mon, 29 Apr 2013 23:08:02 +0000 (18:08 -0500)]
Mesa: Use mmap on Haiku for executable memory vs malloc

* Haiku now has DEP enabled by default.

11 years ago r600g/llvm: get use_kill from compiler shader
Vincent Lejeune [Sat, 27 Apr 2013 22:01:00 +0000 (00:01 +0200)]
r600g/llvm: get use_kill from compiler shader

11 years ago i965/fs: Print out the estimated cycle count in INTEL_DEBUG=wm
Eric Anholt [Mon, 8 Apr 2013 23:38:57 +0000 (16:38 -0700)]
i965/fs: Print out the estimated cycle count in INTEL_DEBUG=wm

This could be used by shader-db for hopefully more accurate regression
testing.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago i965/fs: Allow LRPs with uniform registers.
Eric Anholt [Fri, 26 Apr 2013 03:20:05 +0000 (20:20 -0700)]
i965/fs: Allow LRPs with uniform registers.

Improves GLB2.7 performance on my HSW by 0.671455% +/- 0.225037% (n=62).

v2: Make is_valid_3src() a method of the fs_reg. (recommended by Ken)

Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
11 years ago intel: Be more conservative in disabling tiling to save memory.
Eric Anholt [Thu, 25 Apr 2013 21:41:36 +0000 (14:41 -0700)]
intel: Be more conservative in disabling tiling to save memory.

Improves GLB2.7 trex performance 1.01985% +/- 0.721366% on my IVB (n=10)
and by 3.38771% +/- 0.584241% (n=15) on my HSW, due to a 32x32 ARGB8888
cubemap going from untiled to tiled.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
11 years ago i965: Disable Z16 on contexts that don't require it.
Eric Anholt [Thu, 25 Apr 2013 19:34:07 +0000 (12:34 -0700)]
i965: Disable Z16 on contexts that don't require it.

It appears that Z16 on Intel hardware is in fact slower than Z24, so
people are getting surprisingly hurt when trying to use Z16 as a
performance-versus-precision tradeoff, or when they're targeting GLES2 and
that's all you get.

GL 3.0+ have Z16 on the list of required exact format sizes, but GLES
doesn't, so choose the better-performing layout in that case.  Improves
GLB 2.7 trex performance at 1920x1080 by 10.7% +/- 1.1% (n=3) on my IVB
system.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago intel: Report FBO incompleteness causes through GL_ARB_debug_output.
Eric Anholt [Tue, 23 Apr 2013 02:41:40 +0000 (19:41 -0700)]
intel: Report FBO incompleteness causes through GL_ARB_debug_output.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago intel: Fold the one last function intel_tex_format.c into the caller.
Eric Anholt [Mon, 22 Apr 2013 23:04:25 +0000 (16:04 -0700)]
intel: Fold the one last function intel_tex_format.c into the caller.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago mesa: Fix error checking for GS UBO getters.
Eric Anholt [Wed, 10 Apr 2013 17:04:11 +0000 (10:04 -0700)]
mesa: Fix error checking for GS UBO getters.

These are supposed to be present if both things are available, but we were
enabling them if either one was.
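
The bug class here is an "or" where an "and" was intended. The sketch below
shows the intended condition. The struct and field names are illustrative
only and are not Mesa's getter tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy capability flags; the field names are illustrative only. */
struct toy_caps {
   bool has_geometry_shader;
   bool has_uniform_buffer_object;
};

/* A geometry-shader UBO query is only valid when both features exist.
 * Using || here would wrongly accept contexts that have just one. */
bool
toy_gs_ubo_getter_enabled(const struct toy_caps *caps)
{
   assert(caps != NULL);
   return caps->has_geometry_shader && caps->has_uniform_buffer_object;
}
```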

11 years ago mesa: Add a clarifying comment about EXTRA_ error checking.
Eric Anholt [Wed, 10 Apr 2013 16:59:41 +0000 (09:59 -0700)]
mesa: Add a clarifying comment about EXTRA_ error checking.

11 years ago mesa: Add an extra clarifying set of braces to getter checking.
Eric Anholt [Wed, 10 Apr 2013 16:53:11 +0000 (09:53 -0700)]
mesa: Add an extra clarifying set of braces to getter checking.

For this multi-page single statement, my thought at the end was that the
next block was mis-indented, rather than that the dropped indentation
actually indicated the end of the loop.

11 years ago mesa: Fix error checking for getters consisting of only API versions.
Eric Anholt [Wed, 10 Apr 2013 16:49:37 +0000 (09:49 -0700)]
mesa: Fix error checking for getters consisting of only API versions.

In almost all of our cases, getters that are turned on for only some API
variants will have an extension listed as one of the things that can
enable it, and thus api_check gets set.  For extra_gl30_es3 (used for
NUM_EXTENSIONS, MAJOR_VERSION, MINOR_VERSION) on a GL 2.1 context, though,
we would check twice, not find either one, but never actually throw the
error.

11 years ago mesa: Clarify the names of error checking variables for glGet.
Eric Anholt [Wed, 10 Apr 2013 16:47:12 +0000 (09:47 -0700)]
mesa: Clarify the names of error checking variables for glGet.

There's no reason to actually count these things, so the integer ++
behavior was just confusing.

11 years ago i915: Add support for GL_EXT_texture_sRGB and GL_EXT_texture_sRGB_decode.
Eric Anholt [Thu, 18 Apr 2013 02:10:29 +0000 (19:10 -0700)]
i915: Add support for GL_EXT_texture_sRGB and GL_EXT_texture_sRGB_decode.

This brings the driver up to GL 2.1.

11 years ago i915: Always enable GL 2.0 support.
Eric Anholt [Wed, 17 Apr 2013 20:55:08 +0000 (13:55 -0700)]
i915: Always enable GL 2.0 support.

There's no point in shipping a non-GL2 driver today.

11 years ago i915: Correctly set the OQ counter bits.
Eric Anholt [Wed, 17 Apr 2013 20:58:00 +0000 (13:58 -0700)]
i915: Correctly set the OQ counter bits.

While we may provide the extension, we need to tell applications that they
can't actually use it:

            An implementation can either set QUERY_COUNTER_BITS_ARB to the
            value 0, or to some number greater than or equal to n.  If an
            implementation returns 0 for QUERY_COUNTER_BITS_ARB, then the
            occlusion queries will always return that zero samples passed the
            occlusion test, and so an application should not use occlusion
            queries on that implementation.

11 years ago i965: Move is_math/is_tex/is_control_flow() to backend_instruction.
Kenneth Graunke [Sun, 28 Apr 2013 08:35:57 +0000 (01:35 -0700)]
i965: Move is_math/is_tex/is_control_flow() to backend_instruction.

These are entirely based on the opcode, which is available in
backend_instruction.  It makes sense to only implement them in one
place.

This changes the VS implementation of is_tex() slightly, which now
accepts FS_OPCODE_TXB and SHADER_OPCODE_LOD.  However, since those
aren't generated in the VS anyway, it should be fine.

This also makes is_control_flow() available in the VS.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agodraw/so: fix overflow calculation
Zack Rusin [Sat, 27 Apr 2013 06:51:26 +0000 (02:51 -0400)]
draw/so: fix overflow calculation

Only report overflow for missing targets if they're actually being
used. If the targets are missing but are not being used by any
slot in the stream output declaration, we should just
ignore them.
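
A minimal sketch of the rule described above, not the actual draw module
code. The struct so_slot type and the missing_target_overflow() helper are
hypothetical names used only for illustration: a missing (NULL) target
counts as an overflow only if some slot in the declaration refers to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: one slot of a stream output declaration. */
struct so_slot {
    unsigned buffer_index;   /* which target this slot writes to */
};

/* Returns true only if a slot refers to a target that is not bound.
 * Missing targets that no slot uses are ignored. */
static bool missing_target_overflow(const void *const *targets,
                                    size_t num_targets,
                                    const struct so_slot *slots,
                                    size_t num_slots)
{
    assert(targets != NULL || num_targets == 0);

    for (size_t i = 0; i < num_slots; i++) {
        unsigned b = slots[i].buffer_index;

        if (b >= num_targets || targets[b] == NULL)
            return true;     /* a used target is missing */
    }
    return false;            /* every used target is present */
}
```

With targets {bound, NULL}, a declaration that only writes to target 0
reports no overflow, while one that writes to target 1 does.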

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
11 years agollvmpipe: Fix queries when screen->num_threads == 0.
José Fonseca [Mon, 29 Apr 2013 14:40:06 +0000 (15:40 +0100)]
llvmpipe: Fix queries when screen->num_threads == 0.

That is, when llvmpipe is run in single-threaded mode.

Trivial.

Tested with

  LP_NUM_THREADS=0 glean --run results --overwrite --quick --tests occluQry

11 years agoRevert "st/mesa: add a simple path to BufferData if it only discards buffer contents"
José Fonseca [Mon, 29 Apr 2013 14:12:26 +0000 (15:12 +0100)]
Revert "st/mesa: add a simple path to BufferData if it only discards buffer contents"

This reverts commit 5649f886f76023532538b8792605a3578cec1ed1.

It causes segfaults when size is zero.

11 years agor600g: force full cache for hyperz
Jerome Glisse [Wed, 24 Apr 2013 23:15:52 +0000 (19:15 -0400)]
r600g: force full cache for hyperz

It seems that in some cases allowing half cache usage confuses the GPU
and triggers a lockup. Force full cache use.

Should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=59592
https://bugs.freedesktop.org/show_bug.cgi?id=60848
https://bugs.freedesktop.org/show_bug.cgi?id=60969
https://bugs.freedesktop.org/show_bug.cgi?id=61747
https://bugs.freedesktop.org/show_bug.cgi?id=62466
https://bugs.freedesktop.org/show_bug.cgi?id=62669
https://bugs.freedesktop.org/show_bug.cgi?id=62721
https://bugs.freedesktop.org/show_bug.cgi?id=63124

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
11 years agofreedreno: fix rebase screw-up
Rob Clark [Mon, 29 Apr 2013 11:36:27 +0000 (07:36 -0400)]
freedreno: fix rebase screw-up

Add back 2nd arg to emit_vertexbufs() which got lost in rebase.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agoi965/fs: Don't try to use bogus interpolation modes pre-Gen6.
Chris Forbes [Fri, 26 Apr 2013 23:00:46 +0000 (11:00 +1200)]
i965/fs: Don't try to use bogus interpolation modes pre-Gen6.

Interpolation modes other than perspective-barycentric-pixel-center (and
their associated coefficients in the WM payload) only exist in Gen6 and
later.

Unfortunately, if a varying was declared as `centroid`, we would blindly
read the nonexistent values, and so produce all manner of bad behavior
-- texture swimming, snow, etc.
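
A minimal sketch of the kind of guard this implies, not the actual i965
code. The enum and the effective_interp_mode() helper are hypothetical:
before Gen6, any requested mode falls back to the only one the hardware
provides.

```c
#include <assert.h>

/* Hypothetical names for a few interpolation modes. */
enum interp_mode {
    INTERP_PERSPECTIVE_PIXEL,     /* the only mode available pre-Gen6 */
    INTERP_PERSPECTIVE_CENTROID,
    INTERP_PERSPECTIVE_SAMPLE
};

/* Pick the mode to actually use for a varying on hardware generation `gen`. */
static enum interp_mode effective_interp_mode(int gen,
                                              enum interp_mode requested)
{
    assert(gen > 0);

    if (gen < 6)
        return INTERP_PERSPECTIVE_PIXEL;   /* ignore `centroid` and friends */

    return requested;
}
```

On Ironlake (Gen5), a `centroid` varying would then be read with the
pixel-center coefficients instead of values that do not exist.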

Fixes rendering in Counter-Strike Source and Team Fortress 2 on
Ironlake.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
11 years agoi965/vs: Fix order of source arguments to LRP.
Matt Turner [Sun, 28 Apr 2013 21:35:01 +0000 (14:35 -0700)]
i965/vs: Fix order of source arguments to LRP.

The order of arguments matches DirectX, and is backwards from GLSL's
mix() built-in.
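
To illustrate the two conventions, here is a small sketch in plain C. It
assumes the DirectX-style definition lrp(s0, s1, s2) = s0 * s1 +
(1 - s0) * s2; the helper names are made up for this example and are not
the driver's functions.

```c
#include <assert.h>

/* GLSL mix(x, y, a): returns x when a == 0 and y when a == 1. */
static float glsl_mix(float x, float y, float a)
{
    return x * (1.0f - a) + y * a;
}

/* DirectX-style lrp: the blend factor s0 comes first. */
static float dx_lrp(float s0, float s1, float s2)
{
    return s0 * s1 + (1.0f - s0) * s2;
}

/* Expressing mix() through an LRP-style operation reverses the operands. */
static float mix_via_lrp(float x, float y, float a)
{
    return dx_lrp(a, y, x);
}
```

Passing the operands through in GLSL order, dx_lrp(x, y, a), would
instead treat x as the blend factor.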

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63983

11 years agollvmpipe: stop crashing when one of the so targets is null
Zack Rusin [Sat, 27 Apr 2013 04:52:49 +0000 (00:52 -0400)]
llvmpipe: stop crashing when one of the so targets is null

Fixes a crash when one of the so targets is null.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
11 years agodraw/so: indicate overflow when buffer is missing
Zack Rusin [Sat, 27 Apr 2013 04:49:23 +0000 (00:49 -0400)]
draw/so: indicate overflow when buffer is missing

We were crashing if one of the buffers wasn't set; we should
just treat it as an overflow. It's useful when using so
statistics because it allows one to figure out how much data
would be generated by so without actually writing any of it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
11 years agogallivm: fix indirect addressing of temps in soa mode
Zack Rusin [Sat, 27 Apr 2013 02:53:07 +0000 (22:53 -0400)]
gallivm: fix indirect addressing of temps in soa mode

We weren't adding the soa offsets when constructing the indices
for the gather functions. That meant that we were always returning
the data in the first vertex/primitive/pixel in the SoA structure
and not correctly fetching from all structures.
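
A scalar sketch of the index computation, assuming a layout where each
(register, channel) pair stores one value per lane contiguously. The
layout, the helper names and the four-channel constant are assumptions
for illustration, not the gallivm code itself. The "+ lane" term is the
per-lane SoA offset; without it every lane reads the value of lane 0.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_CHANNELS = 4 };   /* x, y, z, w */

/* Flat index of (reg, chan) for a given lane in the assumed layout. */
static size_t soa_gather_index(size_t reg, size_t chan,
                               size_t lane, size_t num_lanes)
{
    assert(chan < NUM_CHANNELS && lane < num_lanes);
    return (reg * NUM_CHANNELS + chan) * num_lanes + lane;
}

/* Gather one channel where each lane may address a different register. */
static void gather_temps(const float *temps, const size_t *reg_per_lane,
                         size_t chan, size_t num_lanes, float *out)
{
    for (size_t lane = 0; lane < num_lanes; lane++) {
        size_t idx = soa_gather_index(reg_per_lane[lane], chan,
                                      lane, num_lanes);
        out[lane] = temps[idx];
    }
}
```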

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
11 years agotgsi/ureg: Add a function to return the number of outputs
Zack Rusin [Wed, 24 Apr 2013 03:36:40 +0000 (23:36 -0400)]
tgsi/ureg: Add a function to return the number of outputs

We already hold the variable, just weren't providing access
to it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
11 years agodraw/so: Fix overflow calculations
Zack Rusin [Tue, 23 Apr 2013 22:56:47 +0000 (18:56 -0400)]
draw/so: Fix overflow calculations

We weren't taking the buffer offset, destination offset or the
stride into consideration, so we were frequently writing past
the end of the buffer.
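
A minimal sketch of a check that accounts for all three quantities. The
struct so_target type and so_write_overflows() helper are hypothetical,
and the sketch ignores integer wrap-around for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one stream output target. */
struct so_target {
    size_t buffer_size;     /* bytes available in the bound buffer */
    size_t buffer_offset;   /* byte offset at which output starts */
    size_t stride;          /* bytes between consecutive vertices */
};

/* Would writing num_bytes at dst_offset inside vertex vertex_index
 * run past the end of the buffer? */
static bool so_write_overflows(const struct so_target *t,
                               size_t vertex_index,
                               size_t dst_offset,
                               size_t num_bytes)
{
    assert(t != NULL);

    size_t start = t->buffer_offset + vertex_index * t->stride + dst_offset;
    size_t end = start + num_bytes;

    return end > t->buffer_size;
}
```

For a 64-byte buffer with offset 16 and stride 16, vertex 2 still fits a
16-byte write, but the same write at destination offset 4 does not.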

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
11 years agodraw/llvm: fix viewport transformations
Zack Rusin [Tue, 23 Apr 2013 22:47:08 +0000 (18:47 -0400)]
draw/llvm: fix viewport transformations

This was a very serious bug. We were always doing the viewport
transformations on the first output of the vertex shader. That means
that every application that was storing position in anything but
OUT[0] was outputting untransformed vertices and had broken output
for whatever it was storing at OUT[0]. Correctly take into
consideration where the vertex position is actually stored.
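
A simplified sketch of the idea in plain C rather than generated LLVM
IR. The types and the viewport_transform() helper are hypothetical, and
clipping and the perspective divide are left out: the transform is
applied to the slot that actually holds the position, not to slot 0.

```c
#include <assert.h>
#include <stddef.h>

struct vec4 { float x, y, z, w; };

/* Hypothetical viewport state: per-axis scale and translate. */
struct viewport {
    float scale[3];
    float translate[3];
};

/* Transform only the output that holds the vertex position. */
static void viewport_transform(struct vec4 *outputs, size_t num_outputs,
                               size_t pos_slot, const struct viewport *vp)
{
    assert(pos_slot < num_outputs);

    struct vec4 *pos = &outputs[pos_slot];

    pos->x = pos->x * vp->scale[0] + vp->translate[0];
    pos->y = pos->y * vp->scale[1] + vp->translate[1];
    pos->z = pos->z * vp->scale[2] + vp->translate[2];
}
```

With the position in slot 1, slot 0 is left untouched, which is the
behavior the old code got wrong.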

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
11 years agogallium: increase the number of available stream output decls
Zack Rusin [Sat, 27 Apr 2013 03:00:38 +0000 (23:00 -0400)]
gallium: increase the number of available stream output decls

There can be more stream output decls than shader outputs because
individual components from them can be split and distributed
among different so buffers.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
11 years agollvmpipe: implement so_overflow query
Zack Rusin [Tue, 23 Apr 2013 10:19:14 +0000 (06:19 -0400)]
llvmpipe: implement so_overflow query

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>