mesa.git
Paul Berry [Tue, 24 Sep 2013 22:18:52 +0000 (15:18 -0700)]
i965/blorp: retype destination register for texture SEND instruction to UW.

From the bspec documentation of the SEND instruction:

    "destination region cannot cross the 256-bit register boundary."

To avoid violating this restriction when executing SIMD16 texturing
operations (such as those used by blorp), we need to ensure that the
destination of the SEND instruction doesn't exceed 256 bits in size.
An easy way to do this is to set the type of the destination register
to UW (unsigned word), since 16 unsigned words can fit inside a
256-bit register.  Fortunately, this has no effect on the sampling
operation, since the sampler always infers the destination data type
from the sampler message rather than from the type of the instruction
operand.
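
As a rough illustration of the size arithmetic above (a sketch only;
region_bits() and fits_in_one_register() are hypothetical helpers, not
code from the driver): a SIMD16 destination typed as a 4-byte type spans
512 bits, while the same destination typed as UW spans exactly 256.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: size in bits of a register region that holds one
 * element per SIMD channel. */
static unsigned region_bits(unsigned exec_size, unsigned type_bytes)
{
    return exec_size * type_bytes * 8;
}

/* True if such a region fits inside a single 256-bit register. */
static bool fits_in_one_register(unsigned exec_size, unsigned type_bytes)
{
    return region_bits(exec_size, type_bytes) <= 256;
}
```

With these helpers, fits_in_one_register(16, 2) holds for UW, while
fits_in_one_register(16, 4) does not.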

Previously, we did this for texturing operations issued by the vec4
and fs back-ends, but not for blorp.  This patch makes blorp use the
same trick.

I haven't observed any behavioural difference on actual hardware due
to this patch, but it avoids a warning from the simulator so it seems
like the right thing to do.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Fri, 30 Aug 2013 21:39:25 +0000 (14:39 -0700)]
i965: Add a real native TexStorage path.

We originally had a path that just did the loop and called
ctx->Driver.AllocTextureImageBuffer(), which I moved into Mesa core.  But
we can do better, avoiding incorrect miptree size guesses and later
texture validations by just directly allocating the miptree and setting it
to all the images.

v2: drop debug printf.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Fri, 30 Aug 2013 20:41:44 +0000 (13:41 -0700)]
i965: Add missing license to intel_tex_validate.c.

I've rewritten a lot of this file.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Fri, 30 Aug 2013 20:03:52 +0000 (13:03 -0700)]
i965: Always allocate validated miptrees from level 0.

No change in copies during a piglit run, but it's one less first_level !=
0 in our codebase.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Fri, 30 Aug 2013 20:03:52 +0000 (13:03 -0700)]
i965: Don't relayout a texture just for baselevel changes.

As long as the baselevel, maxlevel still sit inside the range we had
previously validated, there's no need to reallocate the texture.
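
The check described above can be sketched roughly as follows. This is
an illustration under assumptions: struct validated_range and
needs_relayout() are hypothetical names, not the driver's actual data
structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical record of the level range a miptree was validated for. */
struct validated_range {
    unsigned first_level;
    unsigned last_level;
};

/* A relayout is only needed when the requested [base, max] range falls
 * outside the range that was validated earlier. */
static bool needs_relayout(const struct validated_range *mt,
                           unsigned base_level, unsigned max_level)
{
    return base_level < mt->first_level || max_level > mt->last_level;
}
```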

I also hope this makes our texture validation logic much more obvious.
It's taken me enough tries to write this change, that's for sure.  Reduces
miptree copy count on a piglit run by 1.3%, though the change in amount of
data moved is much smaller.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Fri, 30 Aug 2013 19:47:02 +0000 (12:47 -0700)]
i965: Don't allocate a 1-level texture when GL_GENERATE_MIPMAP is set.

Given that a teximage that calls us with this flag set will immediately
proceed to allocate the other levels, we can probably just go ahead and
allocate those levels now.

Reduces miptree copies in piglit by about .05%.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Fri, 30 Aug 2013 19:37:15 +0000 (12:37 -0700)]
i965: Stop allocating miptrees with first_level != 0.

If the caller shows up with GL_BASE_LEVEL != 0, it doesn't mean that the
texture will over the course of its lifetime have that nonzero baselevel,
it means that the caller is filling the texture from the bottom up for
some reason (one could imagine demand-loading detailed texture layers at
runtime, for example).  If we allocate from just the current baselevel, it
means when they come along with the next level up, we'll have to allocate
a new miptree and copy all of our bits out of the first miptree.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Fri, 30 Aug 2013 19:21:38 +0000 (12:21 -0700)]
i965: Drop a special case for guessing small miptree levels.

Let's say you started allocating your 2D texture with level 2 of a tree as
a 1x1 image.  The driver doesn't know if this means that level 0 is 4x4 or
4x1 or 1x4, so we would just allocate a single 1x1 and let it get copied
in to the real location at texture validate time later.

Since this is just a temporary allocation that *will* get copied, the
extra space allocated by just taking the normal path, which will happen to
produce a 4x1 level 0, 2x1 level 1, and 1x1 level 2, is the right way to
go, to reduce complexity in the normal case.
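
To see why a 1x1 level 2 is ambiguous, here is a minimal sketch of the
usual mip size rule (halve per level, clamp to 1).  The helper below is
written for illustration and is not copied from the driver.

```c
#include <assert.h>

/* Size of one dimension at a given mip level: halve per level, never
 * dropping below 1. */
static unsigned minify(unsigned base, unsigned level)
{
    unsigned size = base >> level;
    return size > 0 ? size : 1;
}
```

A base size of 4x4, 4x1, or 1x4 all minify to 1x1 at level 2, so the
level 0 size cannot be recovered from that level alone.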

No change in miptree copies over the course of a piglit run.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Tue, 17 Sep 2013 23:47:30 +0000 (16:47 -0700)]
i965: Totally switch around how we handle nonzero baselevel-first_level.

This has no effect currently, because intel_finalize_mipmap_tree() always
makes mt->first_level == tObj->BaseLevel.

The change I made before to handle it
(b1080cfbdb0a084122fcd662cd27b4748c5598fd) got very close to working, but
after fixing some unrelated bugs in the series, it still left
tex-miplevel-selection producing errors when testing textureLod().  The
problem is that for explicit LODs, the sampler's LOD clamping is ignored,
and only the surface's MIP clamping is respected.  So we need to use
surface mip clamping, which applies on top of the sampler's mip clamping,
so the sampler change gets backed out.

Now actually tested with a non-regressing series producing a non-zero
computed baselevel.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Tue, 17 Sep 2013 23:16:20 +0000 (16:16 -0700)]
i965: Always look up from the object's mt when setting up texturing state.

We know that the object's mt is equal to the firstimage's mt because it's
gone through intel_finalize_mipmap_tree().  Saves a lookup of firstimage
on pre-gen7.

v2: Merge in the warning fix that appeared later in the series (noted by
    Chad)

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Vinson Lee [Sat, 28 Sep 2013 06:05:54 +0000 (23:05 -0700)]
r600g/sb: Move variable dereference after null check.

Fixes "Dereference before null check" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Vadim Girlin <vadimgirlin@gmail.com>
Brian Paul [Mon, 30 Sep 2013 15:06:52 +0000 (09:06 -0600)]
st/mesa: fix comment typo

Marek Olšák [Mon, 30 Sep 2013 10:57:51 +0000 (12:57 +0200)]
r600g,radeonsi: workaround for late shared screen initialization

Accidentally broken by the consolidation.

Laurent Carlier [Sun, 29 Sep 2013 19:45:09 +0000 (21:45 +0200)]
r600g: Fix build failure introduced with r600_texture.c consolidation

It seems that the case with OpenCL enabled was forgotten.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Tue, 24 Sep 2013 11:56:27 +0000 (13:56 +0200)]
radeon: make texture logging more useful

This has been very useful for tracking down bugs in libdrm.

The *_PRINT_TEXDEPTH environment variables were probably never used,
so I removed them.

Marek Olšák [Sun, 22 Sep 2013 11:06:27 +0000 (13:06 +0200)]
r600g,radeonsi: share r600_texture.c

The function r600_choose_tiling is new and needs a review.

The only change in functionality is that it enables 2D tiling for compressed
textures on SI. It was probably accidentally turned off.

v2: don't make scanout buffers linear

Marek Olšák [Wed, 25 Sep 2013 18:57:22 +0000 (20:57 +0200)]
r600g: remove compute_global_transfer_* calls from texture_transfer_map/unmap

Textures can never have target==PIPE_BUFFER.

Marek Olšák [Mon, 23 Sep 2013 00:37:05 +0000 (02:37 +0200)]
r600g: move the low-level buffer functions for multiple rings to drivers/radeon

Also slightly optimize r600_buffer_map_sync_with_rings.

Marek Olšák [Sun, 22 Sep 2013 20:12:18 +0000 (22:12 +0200)]
r600g,radeonsi: consolidate tiling_info initialization

and the util_format_s3tc_init calls too.

Marek Olšák [Sun, 22 Sep 2013 19:47:35 +0000 (21:47 +0200)]
radeonsi: implement clear_buffer using CP DMA, initialize CMASK with it

More work needs to be done for this to be entirely shared with r600g.
I'm just trying to share r600_texture.c now.

The reason I put the implementation in si_descriptors.c is that the emit
function was already there.

Marek Olšák [Sun, 22 Sep 2013 19:45:23 +0000 (21:45 +0200)]
r600g: move aux_context and r600_screen_clear_buffer to drivers/radeon

This will be used in the next commit.

Marek Olšák [Sun, 22 Sep 2013 13:34:12 +0000 (15:34 +0200)]
radeonsi: move debug options to R600_DEBUG

Marek Olšák [Sun, 22 Sep 2013 13:18:11 +0000 (15:18 +0200)]
r600g: move some debug options to drivers/radeon

Marek Olšák [Sun, 22 Sep 2013 00:55:47 +0000 (02:55 +0200)]
r600g,radeonsi: share the async dma interface

r600_texture.c is one step closer to r600g.

Marek Olšák [Sat, 21 Sep 2013 21:33:30 +0000 (23:33 +0200)]
radeonsi: move radeonsi-specific functions out of r600_texture.c

Marek Olšák [Sat, 21 Sep 2013 21:05:08 +0000 (23:05 +0200)]
r600g,radeonsi: remove unused code

Marek Olšák [Sat, 21 Sep 2013 18:50:33 +0000 (20:50 +0200)]
r600g: move r600g-specific functions out of r600_texture.c

Marek Olšák [Sat, 21 Sep 2013 18:14:52 +0000 (20:14 +0200)]
r600g,radeonsi: consolidate r600_texture structures

Marek Olšák [Sat, 21 Sep 2013 18:07:18 +0000 (20:07 +0200)]
r600g: get rid of r600_texture::is_rat

It's always 0.

Marek Olšák [Sat, 21 Sep 2013 18:02:55 +0000 (20:02 +0200)]
r600g: get rid of r600_texture::array_mode

Marek Olšák [Sat, 21 Sep 2013 17:56:24 +0000 (19:56 +0200)]
r600g,radeonsi: consolidate transfer, cmask, and fmask structures

Marek Olšák [Sat, 21 Sep 2013 17:45:08 +0000 (19:45 +0200)]
radeon drivers: handle PIPE_CAP_MAX_VIEWPORTS

Marek Olšák [Wed, 25 Sep 2013 18:07:16 +0000 (20:07 +0200)]
radeon/llvm: fix TGSI_OPCODE_UCMP

This doesn't fix any known issue (I haven't run piglit with this yet),
but the code was obviously completely wrong. It looks like it was copy-pasted from CMP.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Marek Olšák [Mon, 23 Sep 2013 20:43:23 +0000 (22:43 +0200)]
st/mesa: fix GLSL mix(.., .., bvecN)

v2: use CMP on drivers without native integer support

Tom Stellard [Thu, 5 Sep 2013 23:26:17 +0000 (16:26 -0700)]
configure.ac: Add a more informative warning when libclc.pc is not found v2

v2:
  - Don't display an error message when the user doesn't ask for libclc.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Vinson Lee [Fri, 27 Sep 2013 03:40:39 +0000 (20:40 -0700)]
mesa: Include stdint.h in mtypes.h for uint32_t symbol.

This patch fixes the MSVC build error introduced with commit
b2e327e08f8519da131dd382adcc99240d433404.

api_arrayelt.c
src\mesa\main/mtypes.h(1809) : error C2061: syntax error : identifier 'uint32_t'
src\mesa\main/mtypes.h(1810) : error C2059: syntax error : '}'
src\mesa\main/mtypes.h(1825) : error C2079: 'Minimum' uses undefined union 'gl_perf_monitor_counter_value'
src\mesa\main/mtypes.h(1828) : error C2079: 'Maximum' uses undefined union 'gl_perf_monitor_counter_value'

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Kenneth Graunke [Mon, 23 Sep 2013 20:37:00 +0000 (13:37 -0700)]
i965/fs: Don't double-accept operands of logical and/or/xor operations.

If the argument to emit_bool_to_cond_code() is an ir_expression, we
loop over the operands, calling accept() on each of them, which
generates assembly code to compute that subexpression.  We then emit
one or two final instructions that perform the top-level operation on
those operands.

If it's not an expression (say, a boolean-valued variable), we simply
call accept() on the whole value.

In commit 80ecb8f1 (i965/fs: Avoid generating extra AND instructions on
bool logic ops), Eric made logic operations jump out of the expression
path to the non-expression path.

Unfortunately, this meant that we would first accept() the two operands,
skip generating any code that used them, then accept() the whole
expression, generating code for the operands a second time.

Dead code elimination would always remove the first set of redundant
operand assembly, since nothing actually used them.  But we shouldn't
generate it in the first place.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Tue, 26 Mar 2013 22:22:22 +0000 (15:22 -0700)]
i965: Add #define for MI_REPORT_PERF_COUNT on Gen6+.

This appears in Volume 1 Part 1 of the Sandybridge PRM on page 48.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Thu, 11 Apr 2013 20:22:29 +0000 (13:22 -0700)]
i965: Add support for GL_AMD_performance_monitor on Ironlake.

Ironlake's counters are always enabled; userspace can simply send a
MI_REPORT_PERF_COUNT packet to take a snapshot of them.  This makes it
easy to implement.

The counters are documented in the source code for the intel-gpu-tools
intel_perf_counters utility.

v2: Adjust for core data structure changes.  Add a table mapping buffer
    object offsets to exposed counters (which changes each generation).
    Finally, add report ID assertions to sanity check the BO layout
    (thanks to Carl Worth).

v3: Update for core BeginPerfMonitor hook changes (requested by Brian).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Thu, 11 Apr 2013 20:22:00 +0000 (13:22 -0700)]
mesa: Add core support for the GL_AMD_performance_monitor extension.

This provides an interface for applications (and OpenGL-based tools) to
access GPU performance counters.  Since the exact performance counters
available vary between vendors and hardware generations, the extension
provides an API the application can use to get the names, types, and
minimum/maximum values of all available counters.  Counters are also
organized into groups.

Applications create "performance monitor" objects, select the counters
they want to track, and Begin/End monitoring, much like OpenGL's query
API.  Multiple monitors can be in flight simultaneously.

v2: Pass ctx to all driver hooks (suggested by Christoph), and attempt
    to fix overallocation of bitsets (caught by Christoph).  Incomplete.

v3: Significantly rework core data structures.  Store counters in groups
    rather than in a global list.  Use their array index in the group's
    counter list as the ID rather than trying to store a globally unique
    counter ID.  Use bitsets for active counters within a group, and
    also track which groups are active so that's easy to query.

v4: Remove _mesa_ prefix on static functions; detect out of memory
    conditions in new_performance_monitor(); make BeginPerfMonitor hook
    return a boolean rather than setting m->Active or raising an error.
    Switch to GLuint/unsigned for NumGroups, NumCounters, and
    MaxActiveCounters (which also means switching a bunch of temporary
    variable types).  All suggested by Brian Paul.  Also, remove
    commented out code at the bottom of the block.  Finally, fix the
    dispatch sanity test (noticed by Ian Romanick).
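
The v3 layout (counters stored per group, the array index used as the
counter ID, a bitset of active counters per group) might look roughly
like the sketch below.  All names and the 32-counter limit are
assumptions made for illustration; they are not the actual core Mesa
structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_COUNTERS_PER_GROUP 32  /* assumption for this sketch */

/* Hypothetical group: a counter's ID is its index within the group. */
struct perf_group {
    const char *name;
    unsigned num_counters;
    uint32_t active;  /* bit i set <=> counter i is selected */
};

/* Select or deselect one counter by its per-group index. */
static void select_counter(struct perf_group *g, unsigned id, bool enable)
{
    assert(id < g->num_counters && id < MAX_COUNTERS_PER_GROUP);
    if (enable)
        g->active |= UINT32_C(1) << id;
    else
        g->active &= ~(UINT32_C(1) << id);
}

/* A group is active as soon as any of its counters is selected. */
static bool group_is_active(const struct perf_group *g)
{
    return g->active != 0;
}
```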

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com> [v3]
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Tue, 24 Sep 2013 01:18:14 +0000 (18:18 -0700)]
glsl: Create and use a has_uniform_buffer_objects() helper.

This is better than overriding the extension enable based on the
language version; it's robust against shaders that do:

   #version 140
   #extension GL_ARB_uniform_buffer_object : disable

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Tue, 24 Sep 2013 01:13:52 +0000 (18:13 -0700)]
glsl: Create and use a has_explicit_attrib_location() helper.

Explicit attribute locations are supported with GLSL 3.30, GLSL ES 3.00,
or "#extension GL_ARB_explicit_attrib_location: enable".  Using a helper
function makes it easy to check for this.

This enables support in GLSL 3.30, which was previously missing.

Previously, we overrode the extension enable flag for ES 3.00.  This is
not robust against a shader such as:

   #version 330
   #extension GL_ARB_explicit_attrib_location : disable

Disabling extensions should not remove core language functionality.
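
A minimal sketch of such a helper, written in C with a simplified,
hypothetical parse state (the real helper belongs to the GLSL compiler
and its state struct has many more fields):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the compiler's parse state. */
struct parse_state {
    bool es_shader;
    unsigned language_version;  /* e.g. 330, or 300 for GLSL ES 3.00 */
    bool ARB_explicit_attrib_location_enable;
};

/* The feature is available if the extension is enabled OR the language
 * version already includes it, so disabling the extension cannot take
 * away core functionality. */
static bool has_explicit_attrib_location(const struct parse_state *s)
{
    unsigned required = s->es_shader ? 300 : 330;
    return s->ARB_explicit_attrib_location_enable ||
           s->language_version >= required;
}
```

With this shape, a "#version 330" shader that disables the extension
still reports the feature as available.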

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kenneth Graunke [Wed, 18 Sep 2013 04:50:26 +0000 (21:50 -0700)]
mesa: Remove 'invalidate_state' parameter to _mesa_dirty_texobj().

Every caller passed true.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Eric Anholt [Wed, 25 Sep 2013 22:41:46 +0000 (15:41 -0700)]
mesa: Remove some remaining FEATURE_* detritus.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Chris Forbes [Sun, 15 Sep 2013 10:25:45 +0000 (22:25 +1200)]
i965: Fix cube array coordinate normalization

Hardware requires the magnitude of the largest component to not exceed
1; brw_cubemap_normalize ensures that this is the case.

Unfortunately, we would previously multiply the array index for cube
arrays by the normalization factor. The incorrect array index would then
cause the sampler to attempt to access either the wrong cube, or memory
outside the cube surface entirely, resulting in garbage rendering or in
the worst case, hangs.

Alter the normalization pass to only multiply the .xyz components.
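
In scalar C the intended behaviour looks roughly like this.  It is a
sketch only: the real pass rewrites shader IR, and
cube_array_normalize() is a hypothetical name.

```c
#include <assert.h>

static float absf(float v)
{
    return v < 0.0f ? -v : v;
}

/* coord[0..2] is the cube direction, coord[3] is the array index.
 * Only the direction is scaled; the array index is left untouched. */
static void cube_array_normalize(float coord[4])
{
    float m = absf(coord[0]);
    float inv;

    if (absf(coord[1]) > m)
        m = absf(coord[1]);
    if (absf(coord[2]) > m)
        m = absf(coord[2]);
    if (m == 0.0f)
        return;  /* degenerate direction: nothing to scale */

    inv = 1.0f / m;
    for (int i = 0; i < 3; i++)
        coord[i] *= inv;
}
```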

Fixes broken rendering in the arb_texture_cube_map_array-cubemap piglit,
which was recently adjusted to provoke this behavior.

V2: Fix indent.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Zack Rusin [Thu, 19 Sep 2013 17:38:12 +0000 (13:38 -0400)]
draw/clip: don't emit so many empty triangles

Compress empty triangles (don't emit more than one in a row) and
never emit empty triangles if we already generated a triangle
covering a non-null area. We can't skip all null triangles
because c_primitives expects the ones generated from vertices
exactly at the clipping plane to be emitted.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Zack Rusin [Thu, 19 Sep 2013 17:37:03 +0000 (13:37 -0400)]
llvmpipe: count c_primitives before discarding null prims

We need to count the clipper primitives before the rasterizer
discards the ones it considers to be null.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Zack Rusin [Tue, 24 Sep 2013 20:25:24 +0000 (16:25 -0400)]
llvmpipe: we need to subdivide if fb is bigger in either direction

We need to subdivide triangles if either of the dimensions is
larger than the max edge length, not when both of them are larger.
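
The corrected condition amounts to an OR instead of an AND.  A minimal
sketch, where MAX_EDGE_LENGTH is a placeholder value and
needs_subdivision() is a hypothetical name:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_EDGE_LENGTH 2048  /* placeholder, not the rasterizer's limit */

/* Subdivide when EITHER dimension exceeds the limit; using && here
 * would miss framebuffers that are large in only one direction. */
static bool needs_subdivision(unsigned fb_width, unsigned fb_height)
{
    return fb_width > MAX_EDGE_LENGTH || fb_height > MAX_EDGE_LENGTH;
}
```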

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Marek Olšák [Tue, 24 Sep 2013 17:28:27 +0000 (19:28 +0200)]
radeon/llvm: fix shadow cube texturing for GL3.0

The fix is at the end (TGSI_TEXTURE_SHADOWCUBE handling), but I also
restructured the code for it to be more readable.

Fixes spec/!OpenGL 3.0/sampler-cube-shadow.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 24 Sep 2013 12:15:00 +0000 (14:15 +0200)]
radeonsi: fix blitting the last 2 mipmap levels of compressed textures

This fixes compressedteximage piglit tests.

+10 piglits

Evergreen and Cayman have the same issue. R600 and R700 don't.

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 23 Sep 2013 13:40:50 +0000 (15:40 +0200)]
radeonsi: add missing colorbuffer formats (rework format translation)

This fixes some piglits, e.g:
  spec/!OpenGL 3.0/required-renderbuffer-attachment-formats.

This can be ported to r600g.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 23 Sep 2013 19:41:34 +0000 (21:41 +0200)]
radeonsi: bypass alpha-test for integer colorbuffers

Fixes spec/EXT_texture_integer/fbo-blending.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Thu, 19 Sep 2013 13:07:41 +0000 (15:07 +0200)]
r600g: fix texture buffer object cache flushing

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Marek Olšák [Wed, 18 Sep 2013 20:46:25 +0000 (22:46 +0200)]
r600g: fix constant buffer cache flushing

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Christian König [Wed, 25 Sep 2013 11:59:56 +0000 (13:59 +0200)]
radeon/winsys: keep screen pointer in winsys v2

Only create one screen for each winsys instance.
This helps with buffer sharing and interop handling.

v2: rebased and some minor cleanup

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Christian König [Wed, 25 Sep 2013 10:28:09 +0000 (12:28 +0200)]
build/radeonsi: group all targets in common subdir

Allows us to share more code between different targets.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Christian König [Wed, 25 Sep 2013 09:34:39 +0000 (11:34 +0200)]
build/r600: group all targets in common subdir

Allows us to share more code between different targets.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Christian König [Wed, 25 Sep 2013 09:49:49 +0000 (11:49 +0200)]
build/r300: group build target in common subdir

Allows us to share more code between different targets.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Christian König [Sun, 22 Sep 2013 13:59:17 +0000 (15:59 +0200)]
radeon/uvd: try to place msg/fb buffer into GART

This is only supported on NI+, but the kernel takes care of those limitations.

Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Sun, 22 Sep 2013 08:41:27 +0000 (10:41 +0200)]
radeon/uvd: move alignment to winsys

Similar to GFX and DMA.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Christian König [Sun, 22 Sep 2013 10:16:20 +0000 (12:16 +0200)]
st/vdpau: use a separate lock per decoder

Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Mon, 9 Sep 2013 09:58:53 +0000 (03:58 -0600)]
st/vdpau: use new vlc function to search for VC-1 start codes

Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Mon, 9 Sep 2013 09:57:58 +0000 (03:57 -0600)]
vl/mpeg12: use new vlc function to search for start codes

Signed-off-by: Christian König <christian.koenig@amd.com>
Christian König [Mon, 9 Sep 2013 09:47:10 +0000 (03:47 -0600)]
vl/vlc: add fast forward search for byte value

Commonly used to find start codes and has far less overhead
than searching manually.
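
The general idea, skipping ahead to a byte value instead of examining
the stream position by position, can be sketched with the standard
memchr().  This is not the vl_vlc interface: find_start_code() is a
hypothetical helper operating on a plain byte buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns the offset of the first 00 00 01 start-code prefix in buf,
 * or -1 if there is none. */
static long find_start_code(const uint8_t *buf, size_t len)
{
    const uint8_t *p = buf;
    const uint8_t *end = buf + len;

    while (end - p >= 3) {
        /* Jump directly to the next 0x01 that has two bytes before it. */
        const uint8_t *one = memchr(p + 2, 0x01, (size_t)(end - (p + 2)));

        if (!one)
            return -1;
        if (one[-1] == 0x00 && one[-2] == 0x00)
            return (long)(one - 2 - buf);
        p = one - 1;  /* the next search then starts just after 'one' */
    }
    return -1;
}
```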

Signed-off-by: Christian König <christian.koenig@amd.com>
Vinson Lee [Tue, 24 Sep 2013 05:13:37 +0000 (22:13 -0700)]
glsl: Initialize ir_lower_jumps_visitor member variables.

Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Vinson Lee [Tue, 24 Sep 2013 04:47:48 +0000 (21:47 -0700)]
glsl: Initialize lower_vector_visitor::dont_lower_swz.

Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Vinson Lee [Tue, 24 Sep 2013 05:02:27 +0000 (22:02 -0700)]
glsl: Initialize assignment_generator member variables.

Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Vinson Lee [Tue, 24 Sep 2013 04:41:39 +0000 (21:41 -0700)]
glsl: Remove unused pointer value.

Silences "Unused pointer value" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Zack Rusin [Tue, 24 Sep 2013 19:08:35 +0000 (15:08 -0400)]
Revert "llvmpipe: increase number of subpixel bits to eight"

This reverts commit 755c11dc5e94f17097c186edaaa39d818396f14c.
We agreed that this is a band-aid that's not very useful and
the proper solution is to rewrite the rasterization algo
so that it operates on 64 bit values.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Dylan Noblesmith [Fri, 20 Sep 2013 15:55:41 +0000 (11:55 -0400)]
mesa: remove handcounted magic number

Also make it a compile-time error with STATIC_ASSERT.
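
The pattern is roughly the following.  The enum and table are made up
for illustration, and C11 _Static_assert stands in for Mesa's
STATIC_ASSERT macro.

```c
#include <assert.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical enum with a trailing count member. */
enum shape { SHAPE_POINT, SHAPE_LINE, SHAPE_TRIANGLE, SHAPE_COUNT };

/* No hand-counted length: the size comes from the initializer. */
static const char *const shape_names[] = { "point", "line", "triangle" };

/* Fails the build if the table and the enum drift apart. */
_Static_assert(ARRAY_SIZE(shape_names) == SHAPE_COUNT,
               "shape_names is out of sync with enum shape");
```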

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Dylan Noblesmith [Fri, 20 Sep 2013 15:55:27 +0000 (11:55 -0400)]
mesa: remove outdated comment

No such argument exists since this commit:

commit 92f3fca0ea429dcf07123e63447449db53308266
Author:     Ian Romanick <ian.d.romanick@intel.com>
AuthorDate: Sun Aug 21 17:23:58 2011 -0700
Commit:     Ian Romanick <ian.d.romanick@intel.com>
CommitDate: Tue Aug 23 14:52:09 2011 -0700

    mesa: Remove target parameter from dd_function_table::BufferSubData

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Dylan Noblesmith [Fri, 20 Sep 2013 15:55:19 +0000 (11:55 -0400)]
mesa: remove stale comment

This line stopped making sense in the great sed
replace of commit f9995b30756140724f41daf963fa06167912be7f

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Zack Rusin [Mon, 23 Sep 2013 21:29:39 +0000 (17:29 -0400)]
llvmpipe: align the array used for subdivided vertices

When subdividing a triangle we're using a temporary array to store
the new coordinates for the subdivided triangles. Unfortunately,
the array used for that was not aligned properly, causing
random crashes in the LLVM JIT code, which was trying to load
vectors from it.
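
A minimal sketch of the kind of fix involved, using C11 alignas rather
than whatever alignment macro the driver uses.  The 16-byte figure and
all names here are assumptions for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define VEC_ALIGN 16  /* assumption: 128-bit vector loads */

static int is_aligned(const void *p, uintptr_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Declares a temporary vertex array with explicit alignment and reports
 * whether its address is suitable for aligned vector loads. */
static int scratch_is_aligned(void)
{
    alignas(VEC_ALIGN) float verts[3][4] = { { 0 } };
    return is_aligned(verts, VEC_ALIGN);
}
```

Without the alignas, a float array is only guaranteed 4-byte alignment,
so whether aligned vector loads work depends on where it happens to
land on the stack.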

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Vinson Lee [Mon, 23 Sep 2013 21:07:15 +0000 (14:07 -0700)]
glapi: Move declaration before code.

This patch fixes the MSVC build error introduced by commit
673129e0b936b1c748e988d3f74f3efaab9e5693.

enums.c
mesa\main\enums.c(3776) : error C2143: syntax error : missing ';' before 'type'
mesa\main\enums.c(3781) : error C2065: 'elt' : undeclared identifier
mesa\main\enums.c(3781) : warning C4047: '!=' : 'int' differs in levels of indirection from 'void *'
mesa\main\enums.c(3782) : error C2065: 'elt' : undeclared identifier
mesa\main\enums.c(3782) : error C2223: left of '->offset' must point to struct/union
mesa\main\enums.c(3782) : warning C4033: '_mesa_lookup_enum_by_nr' must return a value
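
For context, MSVC's C compiler of that era followed C89, where
declarations must come before any statement in a block.  A sketch of
the C89-friendly shape, with a made-up lookup function rather than the
generated enums.c code:

```c
#include <assert.h>
#include <stddef.h>

struct elt {
    int key;
    const char *name;
};

static const char *lookup(const struct elt *table, size_t count, int key)
{
    /* C89: every declaration sits at the top of the block, before the
     * first statement. */
    const struct elt *found = NULL;
    size_t i;

    for (i = 0; i < count; i++) {
        if (table[i].key == key) {
            found = &table[i];
            break;
        }
    }
    return found ? found->name : NULL;
}
```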

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Eric Anholt [Fri, 20 Sep 2013 19:37:04 +0000 (12:37 -0700)]
mesa: Use -Bsymbolic in the linker to locally resolve Mesa-internal symbols.

Normally, LD_PRELOAD will take precedence over your own symbols, which you
want for things like malloc() in libc.  But we don't have any local
symbols we would want overridden (like hash_table_insert(), for example!),
so tell the linker to resolve them internally.  This also avoids calls
through the PLT.

Saves almost 100k on libdricore's size, and gets us a bunch of the
performance back that we had with non-dricore.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoglsl: Hide many classes local to individual .cpp files in anon namespaces.
Eric Anholt [Fri, 20 Sep 2013 18:03:44 +0000 (11:03 -0700)]
glsl: Hide many classes local to individual .cpp files in anon namespaces.

This gives the compiler the chance to inline and not export class symbols
even in the absence of LTO.  Saves about 60kb on disk.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
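
A minimal sketch of the pattern this message describes, in C++. The class
`positive_counter` and the function `count_positive` are hypothetical names
invented for the example; they do not come from the GLSL compiler.

```cpp
#include <cstddef>

namespace {

// Visible only inside this .cpp file: the compiler is free to inline it
// and does not have to export any symbols for it.
class positive_counter {
public:
    void visit(int value) { if (value > 0) ++count_; }
    int count() const { return count_; }
private:
    int count_ = 0;
};

} // anonymous namespace

// The only symbol this translation unit needs to export.
int count_positive(const int *values, std::size_t n)
{
    positive_counter counter;
    for (std::size_t i = 0; i < n; ++i)
        counter.visit(values[i]);
    return counter.count();
}
```

Because the helper class has internal linkage, two .cpp files can each define
their own `positive_counter` without clashing at link time.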
11 years agomesa: Drop an extra copy-and-pasted copy in the program clone function.
Eric Anholt [Thu, 19 Sep 2013 23:16:16 +0000 (16:16 -0700)]
mesa: Drop an extra copy-and-pasted copy in the program clone function.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agomesa: Convert some runtime asserts to static asserts.
Eric Anholt [Fri, 20 Sep 2013 03:06:35 +0000 (20:06 -0700)]
mesa: Convert some runtime asserts to static asserts.

Noticed while grepping through the code for something else.

v2: Don't convert really-runtime asserts to static asserts.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
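
The distinction drawn in the v2 note can be sketched as follows, in C++. The
enum `shader_stage`, the table `stage_names`, and the function `stage_name`
are hypothetical; the sketch only shows which kind of check belongs in which
kind of assert.

```cpp
#include <cassert>

enum shader_stage { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT, STAGE_COUNT };

static const char *const stage_names[] = { "vertex", "geometry", "fragment" };

// Known at compile time: a mismatch breaks the build instead of a debug run.
static_assert(sizeof(stage_names) / sizeof(stage_names[0]) == STAGE_COUNT,
              "stage_names must have one entry per shader_stage");

const char *stage_name(int stage)
{
    // Depends on a run-time value, so it has to stay a run-time assert.
    assert(stage >= 0 && stage < STAGE_COUNT);
    return stage_names[stage];
}
```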
11 years agomesa: Shrink the size of the enum string lookup struct.
Eric Anholt [Thu, 19 Sep 2013 21:54:13 +0000 (14:54 -0700)]
mesa: Shrink the size of the enum string lookup struct.

Since it's only used for debug information, we can misalign the struct and
save the disk space.  Another 19k on a 64-bit build.

v2: Make a compiler.h macro to only use the attribute if we know we can.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
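
A rough C++ sketch of the idea, assuming a GCC-compatible compiler. The macro
`TOY_PACKED` and both struct names are hypothetical stand-ins, not the
compiler.h macro or the enum lookup struct from the patch.

```cpp
#include <cstdint>

// Hypothetical macro: only use the attribute on compilers known to accept it.
#if defined(__GNUC__)
#define TOY_PACKED __attribute__((__packed__))
#else
#define TOY_PACKED
#endif

// Natural layout: padding is inserted so that `value` is aligned.
struct enum_entry_padded {
    std::uint16_t offset;  // offset of the name in a string table
    int value;             // enum value
};

// Packed layout: no padding, at the cost of a misaligned `value`.
struct TOY_PACKED enum_entry_packed {
    std::uint16_t offset;
    int value;
};
```

With GCC or Clang on common ABIs the padded struct typically occupies 8 bytes
and the packed one 6, so a table with thousands of entries shrinks
accordingly. On compilers where the macro expands to nothing, both layouts
are the same.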
11 years agomesa: Remove the extra enum strings and extra lookup table.
Eric Anholt [Thu, 19 Sep 2013 19:06:54 +0000 (12:06 -0700)]
mesa: Remove the extra enum strings and extra lookup table.

Now that there's no name -> enum direction, we can drop the extra strings,
and merge the offsets table and the reduced_enums table.

Between the previous commit and this one, Mesa core drops by 30k.

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agomesa: Remove _mesa_lookup_enum_by_name().
Eric Anholt [Thu, 19 Sep 2013 18:48:24 +0000 (11:48 -0700)]
mesa: Remove _mesa_lookup_enum_by_name().

It's been unused for a long time.  I stopped digging through git history
as of 2009.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agollvmpipe: increase number of subpixel bits to eight
Zack Rusin [Thu, 19 Sep 2013 18:10:08 +0000 (14:10 -0400)]
llvmpipe: increase number of subpixel bits to eight

Unfortunately, d3d10 requires much higher precision (the
wgf11clipping tests, for example, check for it). The smallest number
of precision bits with which it passes is 8. That means that we need to
decrease the maximum length of an edge that we can handle without
subdivision by 4 bits. Abstracted the code a bit to make it easier
to change once we switch to 64-bit rasterization.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
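
The trade-off in this message can be sketched in C++ as below. The helper
names `to_fixed` and `integer_bits`, the `budget_bits` parameter, and the old
value of 4 subpixel bits (inferred from "by 4 bits") are assumptions made for
the example, not llvmpipe's actual code.

```cpp
#include <cmath>
#include <cstdint>

// Convert a window coordinate to fixed point with `subpixel_bits`
// fractional bits, rounding to the nearest representable position.
inline std::int32_t to_fixed(float coord, int subpixel_bits)
{
    return static_cast<std::int32_t>(std::lround(coord * (1 << subpixel_bits)));
}

// Integer bits left for the coordinate when `budget_bits` (a hypothetical
// total) must hold both the integer and the fractional part.
constexpr int integer_bits(int budget_bits, int subpixel_bits)
{
    return budget_bits - subpixel_bits;
}
```

With 8 subpixel bits a step of 1/256 of a pixel is representable
(`to_fixed(1.0f / 256, 8)` is 1), whereas with 4 bits it rounds to 0. For any
fixed budget, going from 4 to 8 fractional bits leaves 4 fewer integer bits,
which is why the longest edge handled without subdivision gets shorter.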
11 years agoglsl: Define isnormal and copysign for MSVC to fix build.
Vinson Lee [Sun, 22 Sep 2013 23:08:26 +0000 (16:08 -0700)]
glsl: Define isnormal and copysign for MSVC to fix build.

This patch fixes these MSVC build errors.

ir_constant_expression.cpp
src\glsl\ir_constant_expression.cpp(564) : warning C4244: '=' : conversion from 'int' to 'float', possible loss of data
src\glsl\ir_constant_expression.cpp(1384) : error C3861: 'isnormal': identifier not found
src\glsl\ir_constant_expression.cpp(1385) : error C3861: 'copysign': identifier not found

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69541
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Matt Turner <mattst88@gmail.com>
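
One possible shape for such a fallback, sketched in C++. The wrapper names
`portable_isnormal` and `portable_copysign` are hypothetical, and routing
through the MSVC CRT functions `_fpclass` and `_copysign` is an assumption
about how the gap could be filled on older MSVC releases; the patch itself
may do it differently.

```cpp
#include <cmath>
#if defined(_MSC_VER)
#include <float.h>
#endif

// True for finite, non-zero, non-denormal values.
inline bool portable_isnormal(double x)
{
#if defined(_MSC_VER)
    const int cls = _fpclass(x);
    return cls == _FPCLASS_NN || cls == _FPCLASS_PN;
#else
    return std::isnormal(x);
#endif
}

// Returns `magnitude` with the sign of `sign`.
inline double portable_copysign(double magnitude, double sign)
{
#if defined(_MSC_VER)
    return _copysign(magnitude, sign);
#else
    return std::copysign(magnitude, sign);
#endif
}
```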
11 years agoSuppress clang's warnings about unused CFLAGS and CXXFLAGS.
Johannes Obermayr [Wed, 11 Sep 2013 22:32:40 +0000 (00:32 +0200)]
Suppress clang's warnings about unused CFLAGS and CXXFLAGS.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoradeon/uvd: async flush the UVD cs
Christian König [Sat, 21 Sep 2013 13:34:38 +0000 (15:34 +0200)]
radeon/uvd: async flush the UVD cs

No need to block for the CS thread here.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
11 years agowinsys/radeon: share winsys between different fd's
Christian König [Sat, 21 Sep 2013 10:25:13 +0000 (12:25 +0200)]
winsys/radeon: share winsys between different fd's

Share the winsys between different fd's if they point to the same device.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
11 years agowinsys/radeon: remove cs_queue_empty
Christian König [Sat, 21 Sep 2013 13:24:55 +0000 (15:24 +0200)]
winsys/radeon: remove cs_queue_empty

Waiting for an empty queue is nonsense and can lead to deadlocks if we have
multiple waiters or another thread that continuously sends down new commands.

Just post the cs to the queue and immediately wait for it to finish.

This is a candidate for the stable branch.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
11 years agowinsys/radeon: fix killing the CS thread
Christian König [Sat, 21 Sep 2013 11:21:47 +0000 (13:21 +0200)]
winsys/radeon: fix killing the CS thread

Kill the thread only after we checked that it's not used any more, not before.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
11 years agoi965/gen4: Fix fragment program rectangle texture shadow compares.
Eric Anholt [Wed, 18 Sep 2013 19:32:31 +0000 (12:32 -0700)]
i965/gen4: Fix fragment program rectangle texture shadow compares.

The rescale_texcoord(), if it does something, will return just the
GLSL-sized coordinate, leaving out the 3rd and 4th components where we
were storing our projected shadow compare and the texture projector.
Deref the shadow compare before using the shared rescale-the-coordinate
code to fix the problem.

Fixes piglit tex-shadow2drect.shader_test and txp-shadow2drect.shader_test

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69525
NOTE: This is a candidate for stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/gen7.5: Fix missing Shader Channel Select entries on Haswell
Abdiel Janulgue [Fri, 20 Sep 2013 10:56:52 +0000 (13:56 +0300)]
i965/gen7.5: Fix missing Shader Channel Select entries on Haswell

Probably unintentional, but the SURFACE_STATE setup refactoring
for buffer surfaces missed the SCS bits when creating constant
surface states.

Fixes broken GLB 2.5 on Haswell, where the knight's textures are missing.

Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965, mesa: Use the new DECLARE_R[Z]ALLOC_CXX_OPERATORS macros.
Kenneth Graunke [Wed, 18 Sep 2013 21:11:32 +0000 (14:11 -0700)]
i965, mesa: Use the new DECLARE_R[Z]ALLOC_CXX_OPERATORS macros.

These classes declared a placement new operator, but didn't declare a
delete operator.  Switching to the macro gives them a delete operator,
which probably is a good idea anyway.

This also eliminates a lot of boilerplate.

v2: Properly use RZALLOC in Mesa IR/TGSI translators.  Caught by Eric
    and Chad.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoglsl: Use the new DECLARE_R[Z]ALLOC_CXX_OPERATORS in a bunch of places.
Kenneth Graunke [Wed, 18 Sep 2013 21:05:36 +0000 (14:05 -0700)]
glsl: Use the new DECLARE_R[Z]ALLOC_CXX_OPERATORS in a bunch of places.

This eliminates a lot of boilerplate and should be 100% equivalent.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoralloc: Introduce new macros for defining C++ new/delete operators.
Kenneth Graunke [Wed, 18 Sep 2013 20:56:26 +0000 (13:56 -0700)]
ralloc: Introduce new macros for defining C++ new/delete operators.

Most of our C++ classes define placement new and delete operators so we
can do convenient allocation via:

   thing *foo = new(mem_ctx) thing(...)

Currently, this is done via a lot of boilerplate.  By adding simple
macros to ralloc, we can condense this to a single line, making it
trivial to add this feature to a new class.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
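
A self-contained C++ sketch of the approach, using a toy allocator rather
than ralloc. Everything here is hypothetical: `toy_ctx`, `toy_alloc`, and the
macro `DECLARE_TOY_ALLOC_CXX_OPERATORS` only imitate the shape of the idea
and are not the ralloc API or the macros added by this patch.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical stand-in for a memory context: it remembers every block
// and releases them all together. Destructors of the objects are not run.
struct toy_ctx {
    std::vector<void *> blocks;
    toy_ctx() = default;
    toy_ctx(const toy_ctx &) = delete;
    toy_ctx &operator=(const toy_ctx &) = delete;
    ~toy_ctx() { for (void *p : blocks) std::free(p); }
};

inline void *toy_alloc(toy_ctx *ctx, std::size_t size)
{
    ctx->blocks.push_back(nullptr);      // reserve the slot first
    void *p = std::malloc(size);
    if (!p) { ctx->blocks.pop_back(); throw std::bad_alloc(); }
    ctx->blocks.back() = p;
    return p;
}

// One line per class instead of hand-written operators.
#define DECLARE_TOY_ALLOC_CXX_OPERATORS()                                  \
    static void *operator new(std::size_t size, toy_ctx *ctx)             \
    {                                                                      \
        return toy_alloc(ctx, size);                                       \
    }                                                                      \
    /* Matching placement delete, called if the constructor throws. */    \
    static void operator delete(void *, toy_ctx *) {}                      \
    /* Plain delete: the context still owns the block, so do nothing. */   \
    static void operator delete(void *) {}

struct thing {
    DECLARE_TOY_ALLOC_CXX_OPERATORS()
    int value;
    explicit thing(int v) : value(v) {}
};
```

With this in place, `thing *foo = new(&ctx) thing(7);` allocates from `ctx`,
and the storage is released when `ctx` goes out of scope.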
11 years agor600g: fast color clears for single-sample buffers
Grigori Goronzy [Tue, 10 Sep 2013 23:41:40 +0000 (01:41 +0200)]
r600g: fast color clears for single-sample buffers

Allocate a CMASK on demand and use it to fast clear single-sample
colorbuffers. Both FBOs and window system colorbuffers are fast
cleared. Expand as needed when colorbuffers are mapped or displayed
on screen.

v2: cosmetics, move transfer expansion into dma_blit

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
11 years agor600g: add support for separately allocated CMASKs
Grigori Goronzy [Tue, 10 Sep 2013 23:41:39 +0000 (01:41 +0200)]
r600g: add support for separately allocated CMASKs

v2: check for NULL cbufs

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
11 years agogallium: add flush_resource context function
Marek Olšák [Fri, 20 Sep 2013 13:08:29 +0000 (15:08 +0200)]
gallium: add flush_resource context function

r600g needs explicit flushing before DRI2 buffers are presented on the screen.

v2: add (stub) implementations for all drivers, fix frontbuffer flushing
v3: fix galahad

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
11 years agoradeonsi: simplify and fix MSAA texture sampling for array textures
Marek Olšák [Wed, 18 Sep 2013 13:40:21 +0000 (15:40 +0200)]
radeonsi: simplify and fix MSAA texture sampling for array textures

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
11 years agoradeonsi: fix textureOffset and texelFetchOffset GLSL functions
Marek Olšák [Wed, 18 Sep 2013 13:36:38 +0000 (15:36 +0200)]
radeonsi: fix textureOffset and texelFetchOffset GLSL functions

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
11 years agollvmpipe: Fix rendering to PIPE_FORMAT_R10G10B10A2_UNORM.
José Fonseca [Fri, 20 Sep 2013 11:58:59 +0000 (12:58 +0100)]
llvmpipe: Fix rendering to PIPE_FORMAT_R10G10B10A2_UNORM.

We must take rounding into consideration when rescaling to narrow
normalized channels, such as 2-bit normalized alpha.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
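
To show why rounding matters for a channel as narrow as 2 bits, here is a
small C++ sketch. The 8-bit source width and both function names are
assumptions made for the example; it is not the conversion code llvmpipe
generates.

```cpp
#include <cstdint>

// Rescale an 8-bit UNORM value to 2 bits by dropping the low bits.
// This truncates: 60/255 (about 0.24) becomes 0 although 1/3 is closer.
inline std::uint8_t unorm8_to_unorm2_truncate(std::uint8_t v)
{
    return static_cast<std::uint8_t>(v >> 6);
}

// Rescale with round-to-nearest, i.e. round(v * 3 / 255) in integers.
inline std::uint8_t unorm8_to_unorm2_round(std::uint8_t v)
{
    return static_cast<std::uint8_t>((v * 3u + 127u) / 255u);
}
```

With only four output levels, the two variants disagree for a large share of
inputs: for 200/255 (about 0.78) truncation yields 3 (1.0) while rounding
yields 2 (about 0.67), which is the nearer level.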
11 years agodraw: Ensure draw_pt_middle_end::bind_parameters is never NULL.
José Fonseca [Wed, 18 Sep 2013 19:01:54 +0000 (20:01 +0100)]
draw: Ensure draw_pt_middle_end::bind_parameters is never NULL.

Prevents calling a NULL function pointer with softpipe in certain cases.

Trivial.