mesa.git
11 years ago r600g: don't call buffer_wait in buffer_mmap_sync_with_rings
Marek Olšák [Sun, 30 Jun 2013 12:57:17 +0000 (14:57 +0200)]
r600g: don't call buffer_wait in buffer_mmap_sync_with_rings

The winsys should do this, because it measures how much time we spend
in buffer_map doing synchronization, which can be viewed with the gallium
HUD.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years ago r600g: don't read back the MSAA depth buffer if the read flag is not set
Marek Olšák [Sun, 30 Jun 2013 12:55:06 +0000 (14:55 +0200)]
r600g: don't read back the MSAA depth buffer if the read flag is not set

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years ago r600g: don't flush the context in texture_transfer_map
Marek Olšák [Sun, 30 Jun 2013 12:53:03 +0000 (14:53 +0200)]
r600g: don't flush the context in texture_transfer_map

the winsys does this automatically

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years ago r600g: fix texture offset computation for mapped MSAA depth buffers
Marek Olšák [Sun, 30 Jun 2013 12:34:23 +0000 (14:34 +0200)]
r600g: fix texture offset computation for mapped MSAA depth buffers

It was wrong, because the offset shouldn't be applied to MSAA depth buffers.
This small cleanup should prevent such issues in the future.

This fixes a lockup in "piglit/fbo-depthstencil default_fb -samples=n".

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years ago r600g: fix color resolve for RGBX8 and RGBX16 integer formats
Marek Olšák [Mon, 1 Jul 2013 00:36:37 +0000 (02:36 +0200)]
r600g: fix color resolve for RGBX8 and RGBX16 integer formats

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years ago r600g: enable fast MSAA color clear for array/3D/cube textures
Marek Olšák [Mon, 1 Jul 2013 00:29:50 +0000 (02:29 +0200)]
r600g: enable fast MSAA color clear for array/3D/cube textures

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years ago r600g: implement fast MSAA color clear for integer textures
Marek Olšák [Mon, 1 Jul 2013 00:57:21 +0000 (02:57 +0200)]
r600g: implement fast MSAA color clear for integer textures

this also fixes the fast clear with multiple colorbuffers and each having
a different format

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years ago r600/uvd: fix check for UVD 2.x
Christian König [Mon, 8 Jul 2013 17:51:20 +0000 (19:51 +0200)]
r600/uvd: fix check for UVD 2.x

Signed-off-by: Christian König <christian.koenig@amd.com>
11 years ago i965: fix alpha test for MRT
Chris Forbes [Mon, 1 Jul 2013 11:30:55 +0000 (23:30 +1200)]
i965: fix alpha test for MRT

Include src0 alpha in the RT write message when using MRT, so it is used
for the alpha test instead of the normal per-RT alpha value.

Fixes broken rendering in Dota2 under Wine [FDO #62647].

No Piglit regressions on Ivybridge.

V2: reuse (and simplify) existing sample_alpha_to_coverage flag in
the FS key, rather than adding another redundant one.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62647
NOTE: This is a candidate for the stable branches.

11 years ago gallivm: (trivial) fix using one lod instead of per-quad lod for texel fetch
Roland Scheidegger [Fri, 5 Jul 2013 16:06:17 +0000 (18:06 +0200)]
gallivm: (trivial) fix using one lod instead of per-quad lod for texel fetch

The logic for choosing number of lods was bogus.
(The code should ultimately handle the case of only one lod even with multiple
quads but currently can't.)

11 years ago gallivm: Remove bogus assert.
José Fonseca [Fri, 5 Jul 2013 13:34:34 +0000 (14:34 +0100)]
gallivm: Remove bogus assert.

It is perfectly valid for the swizzle to be bigger than 2. For example the
texel offsets could be

  SAMPLE ..., IMM[0].zzz

What is not correct is for chan_index to be bigger than 2.

Trivial.

11 years ago nvc0: enable very initial support for nvf0 (GK110)
Ben Skeggs [Fri, 17 May 2013 04:48:15 +0000 (14:48 +1000)]
nvc0: enable very initial support for nvf0 (GK110)

Shaders need a lot of work still.  Basic stuff generally works, so this
is basically just fine for gnome-shell, OA etc at this point.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
11 years ago gallivm: (trivial) fix bogus assertion for per-element lod with 1d resources
Roland Scheidegger [Thu, 4 Jul 2013 23:18:24 +0000 (01:18 +0200)]
gallivm: (trivial) fix bogus assertion for per-element lod with 1d resources

The assertion was always broken but the code unused until enabling the
per-element lod code. Fixes piglit texelFetch vs isampler1D and similar
tests (only run with GL 3.0 version override).

11 years ago gallivm: do per-pixel lod calculations for explicit lod
Roland Scheidegger [Thu, 4 Jul 2013 17:40:11 +0000 (19:40 +0200)]
gallivm: do per-pixel lod calculations for explicit lod

d3d10 requires per-pixel lod calculations for explicit lod, lod bias and
explicit derivatives, and we should probably do it for OpenGL too, at least
when they are used from vertex or geometry shaders (so this doesn't apply to
lod bias), where this doesn't just affect neighboring pixels.
Some code was already there to handle this, so fix it up and enable it.
There will no doubt be a performance hit, unfortunately. We could do better
if we knew we had a real vector shift instruction (with variable shift
count), but this requires AVX2 on x86 (or an AMD Bulldozer family CPU).
Don't do anything for lod bias and explicit derivatives yet, though
no special magic should be needed for them either.
Likewise, the size query is still broken just the same.

v2: Use information if lod is a (broadcast) scalar or not. The idea would be
to base this on the actual value, for now just pretend it's a scalar in fs
and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same
code is generated for fs as before).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years ago draw: fix overflows in the indexed rendering paths
Zack Rusin [Wed, 3 Jul 2013 03:56:59 +0000 (23:56 -0400)]
draw: fix overflows in the indexed rendering paths

The semantics for overflow detection are a bit tricky with
indexed rendering. If the base index in the elements array
overflows, then the index of the first element should be used;
if the index with bias overflows, then it should be treated
like a normal overflow. Also, overflows need to be checked for
in all paths that use either the bias or the starting index location.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
11 years ago draw/llvm: index overflows if it's greater than elt max
Zack Rusin [Wed, 3 Jul 2013 01:52:55 +0000 (21:52 -0400)]
draw/llvm: index overflows if it's greater than elt max

The comparison, incorrectly, was greater-than-or-equal to
elt max.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
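
As an illustration only, the boundary condition described above can be
sketched in plain C. The helper name index_overflows is hypothetical and
not taken from the Mesa source, and the sketch assumes that elt max is the
largest index that is still valid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper. elt_max is assumed to be the largest index that
 * is still valid, so an index equal to elt_max must not be flagged. */
bool
index_overflows(uint32_t index, uint32_t elt_max)
{
   return index > elt_max;   /* the buggy form would be: index >= elt_max */
}

/* Small self-check showing the boundary case. */
void
index_overflows_demo(void)
{
   assert(!index_overflows(7, 7));   /* last valid index */
   assert(index_overflows(8, 7));    /* first invalid index */
}
```

With a greater-than-or-equal comparison the last valid element would be
rejected as an overflow, which is the off-by-one the commit describes.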
11 years ago i965: Move the rest of intel_tex_layout.c into brw_tex_layout.c.
Kenneth Graunke [Tue, 2 Jul 2013 22:53:35 +0000 (15:53 -0700)]
i965: Move the rest of intel_tex_layout.c into brw_tex_layout.c.

The texture alignment unit functions are called from brw_tex_layout.c,
so it makes sense to put them there.  Since the only caller of
intel_get_texture_alignment_unit() is in brw_tex_layout.c, it could be
made into a static function.  However, this patch instead simply folds
it into the caller, as it's only two lines anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Push intel_get_texture_alignment_unit call into brw_miptree_layout
Kenneth Graunke [Tue, 2 Jul 2013 22:06:10 +0000 (15:06 -0700)]
i965: Push intel_get_texture_alignment_unit call into brw_miptree_layout

intel_miptree_create_layout() calls intel_get_texture_alignment_unit()
and then immediately calls brw_miptree_layout().  There are no other
callers.

intel_get_texture_alignment_unit() populates the miptree's alignment
unit fields, which are used by brw_miptree_layout() to determine where
to place each miplevel.  Since brw_miptree_layout() needs those to be
present, it makes sense to have it initialize them as the first step.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Declare for-loop counters in the loop in brw_tex_layout.c.
Kenneth Graunke [Fri, 28 Jun 2013 22:56:22 +0000 (15:56 -0700)]
i965: Declare for-loop counters in the loop in brw_tex_layout.c.

The driver is compiled in C99 mode, so this is not a problem.  It's
slightly tidier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago i965: Remove use of GLuint/GLint in brw_tex_layout.c.
Kenneth Graunke [Fri, 28 Jun 2013 22:49:20 +0000 (15:49 -0700)]
i965: Remove use of GLuint/GLint in brw_tex_layout.c.

Using GL types is silly; this isn't even remotely API-facing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago i965: Tidy the brw_tex_layout.c copyright and file header comments.
Kenneth Graunke [Fri, 28 Jun 2013 22:47:28 +0000 (15:47 -0700)]
i965: Tidy the brw_tex_layout.c copyright and file header comments.

This uses Doxygen style for the file comments, and generally makes it
more consistent with the rest of the driver.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago i965: Move i945_texture_layout_2d to brw_tex_layout.c
Kenneth Graunke [Fri, 28 Jun 2013 22:25:12 +0000 (15:25 -0700)]
i965: Move i945_texture_layout_2d to brw_tex_layout.c

This consolidates the miptree layout logic in a single file.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago i965: Remove fallthrough for Gen4 cube map layout.
Kenneth Graunke [Fri, 28 Jun 2013 22:06:47 +0000 (15:06 -0700)]
i965: Remove fallthrough for Gen4 cube map layout.

Now that both 2DArray and Cube layouts are taken care of by helper
functions, it's easy to just call the right function for each
generation.  This is a little cleaner than falling through.

This also reworks the comments.  Referencing "Volume 1" of the BSpec
isn't very helpful, since that's only available inside Intel, and it
doesn't even use volume numbers.  Also, "Ironlake...finally" sounds a
bit strange considering that almost all hardware uses the 2D array
approach.  At this point, Gen4 is the only special case.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago i965: Combine GL_TEXTURE_CUBE_MAP_ARRAY case with the other array cases.
Kenneth Graunke [Fri, 28 Jun 2013 22:00:07 +0000 (15:00 -0700)]
i965: Combine GL_TEXTURE_CUBE_MAP_ARRAY case with the other array cases.

These do the exact same thing; combining them is tidier.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago i965: Pull 3D texture layout code out into a helper function.
Kenneth Graunke [Fri, 28 Jun 2013 21:50:30 +0000 (14:50 -0700)]
i965: Pull 3D texture layout code out into a helper function.

A bit cleaner than having it in one giant function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago i965: Replace maxBatchSize variable with BATCH_SZ define.
Kenneth Graunke [Sat, 29 Jun 2013 02:38:13 +0000 (19:38 -0700)]
i965: Replace maxBatchSize variable with BATCH_SZ define.

maxBatchSize was only ever initialized to BATCH_SZ, and a few places
used BATCH_SZ directly anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Move annotate_aub out of the vtable.
Kenneth Graunke [Tue, 2 Jul 2013 08:19:23 +0000 (01:19 -0700)]
i965: Move annotate_aub out of the vtable.

brw_annotate_aub() is the only implementation of this function, so it
makes sense to just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Move debug_batch hook out of the vtable.
Kenneth Graunke [Sat, 29 Jun 2013 02:36:04 +0000 (19:36 -0700)]
i965: Move debug_batch hook out of the vtable.

brw_debug_batch() is the only implementation of this function, so it
makes sense to just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Remove render_target_supported from the vtable.
Kenneth Graunke [Sat, 29 Jun 2013 02:30:19 +0000 (19:30 -0700)]
i965: Remove render_target_supported from the vtable.

brw_render_target_supported() is the only implementation of this
function, so it makes sense to just call it directly.

Rather than adding an #include of brw_wm.h, this patch moves the
prototype to brw_context.h.  Prototypes seem to be in rather arbitrary
places at the moment, and either place seems as good as the other.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Move is_hiz_depth_format out of the vtable.
Kenneth Graunke [Sat, 29 Jun 2013 02:26:07 +0000 (19:26 -0700)]
i965: Move is_hiz_depth_format out of the vtable.

brw_is_hiz_depth_format() is the only implementation of this function,
so it makes sense to just call it directly.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Remove the invalidate_state() vtable hook.
Kenneth Graunke [Sat, 29 Jun 2013 02:18:10 +0000 (19:18 -0700)]
i965: Remove the invalidate_state() vtable hook.

The hook was a noop.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Replace fprintfs with assertions in GLenum comparison translators.
Kenneth Graunke [Sat, 29 Jun 2013 02:03:06 +0000 (19:03 -0700)]
i965: Replace fprintfs with assertions in GLenum comparison translators.

These functions translate GLenum comparison operations into the hardware
enumerations.  They should never be passed something other than a GL
comparison operator, or something is very broken.

Assertions seem more appropriate than fprintf.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
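
A minimal sketch of what such a translator might look like with an
assertion in the default path. The GL_* values are the standard OpenGL
comparison enums, defined locally so the example builds without GL
headers. The hw_compare_func enum and the function name
translate_compare_func are hypothetical and only stand in for the real
hardware defines.

```c
#include <assert.h>

/* Standard OpenGL comparison enums, defined locally for this sketch. */
#define GL_NEVER    0x0200
#define GL_LESS     0x0201
#define GL_EQUAL    0x0202
#define GL_LEQUAL   0x0203
#define GL_GREATER  0x0204
#define GL_NOTEQUAL 0x0205
#define GL_GEQUAL   0x0206
#define GL_ALWAYS   0x0207

/* Hypothetical hardware encodings, not the real i965 values. */
enum hw_compare_func {
   HW_COMPARE_ALWAYS,
   HW_COMPARE_NEVER,
   HW_COMPARE_LESS,
   HW_COMPARE_EQUAL,
   HW_COMPARE_LEQUAL,
   HW_COMPARE_GREATER,
   HW_COMPARE_NOTEQUAL,
   HW_COMPARE_GEQUAL
};

enum hw_compare_func
translate_compare_func(unsigned func)
{
   switch (func) {
   case GL_NEVER:    return HW_COMPARE_NEVER;
   case GL_LESS:     return HW_COMPARE_LESS;
   case GL_EQUAL:    return HW_COMPARE_EQUAL;
   case GL_LEQUAL:   return HW_COMPARE_LEQUAL;
   case GL_GREATER:  return HW_COMPARE_GREATER;
   case GL_NOTEQUAL: return HW_COMPARE_NOTEQUAL;
   case GL_GEQUAL:   return HW_COMPARE_GEQUAL;
   case GL_ALWAYS:   return HW_COMPARE_ALWAYS;
   }

   /* Anything else means a caller bug, so fail loudly in debug builds. */
   assert(!"Invalid comparison function.");
   return HW_COMPARE_ALWAYS;   /* only reached when NDEBUG disables assert */
}
```

The assertion documents the precondition and stops a debug build at the
point of the bug, whereas a message printed with fprintf is easy to miss.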
11 years ago i965: Replace intel_state.c enums with those from brw_defines.h.
Kenneth Graunke [Sat, 29 Jun 2013 01:46:47 +0000 (18:46 -0700)]
i965: Replace intel_state.c enums with those from brw_defines.h.

Both intel_context.h and brw_defines.h have #defines for comparison
functions, stencil ops, blending logic ops, and blending factors.
They're exactly the same values, so it makes sense to pick one.

brw_defines.h is the logical place for this kind of stuff, so this patch
converts intel_state.c to use the set defined there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Delete pre-DRI2.3 viewport hacks.
Kenneth Graunke [Fri, 28 Jun 2013 23:35:18 +0000 (16:35 -0700)]
i965: Delete pre-DRI2.3 viewport hacks.

The __DRI_USE_INVALIDATE extension was added on May 11th, 2010 by commit
4258e3a2e1c327.  At this point, it's unlikely that anyone's using the
right mix of new and old components to hit this path.  Deleting it
removes an untested code path and cleans up the driver a bit.

Cc: Kristian Høgsberg <krh@bitplanet.net>
Cc: Keith Packard <keithp@keithp.com>
11 years ago i965: Remove "There are probably better ways" comment.
Kenneth Graunke [Fri, 28 Jun 2013 23:07:40 +0000 (16:07 -0700)]
i965: Remove "There are probably better ways" comment.

There are always better ways to do things.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Delete brw_print_reg() function.
Kenneth Graunke [Sat, 29 Jun 2013 02:53:41 +0000 (19:53 -0700)]
i965: Delete brw_print_reg() function.

This wasn't called from anywhere; presumably it was used to examine
brw_regs when debugging shader assembly.  However, it prints registers
in a different notation than brw_disasm.c which everyone is used
to...which means I doubt anyone will want to use it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Move contents of intel_clear.h to intel_context.h.
Kenneth Graunke [Tue, 2 Jul 2013 04:36:48 +0000 (21:36 -0700)]
i965: Move contents of intel_clear.h to intel_context.h.

Having a header file for a single prototype seems rather excessive.
Plus, the actual function is in brw_clear.c, not intel_clear.c, so
there isn't even the .c/.h filename symmetry one might expect.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Move contents of intel_extensions.h to intel_context.h.
Kenneth Graunke [Tue, 2 Jul 2013 04:28:27 +0000 (21:28 -0700)]
i965: Move contents of intel_extensions.h to intel_context.h.

Having an entire header file for a single prototype seems a bit
excessive.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Remove some dead code.
Kenneth Graunke [Fri, 28 Jun 2013 23:06:26 +0000 (16:06 -0700)]
i965: Remove some dead code.

A random smattering of things that just aren't used anymore.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Delete dead intel_buffer_object::range_map_size field.
Kenneth Graunke [Tue, 2 Jul 2013 05:16:16 +0000 (22:16 -0700)]
i965: Delete dead intel_buffer_object::range_map_size field.

Nothing uses this, apparently.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Remove intel_buffer_object::source.
Kenneth Graunke [Tue, 2 Jul 2013 04:57:46 +0000 (21:57 -0700)]
i965: Remove intel_buffer_object::source.

This was only used for BOs backed by system memory on i915.  With that
gone, there's nothing that even sets source to non-zero, so this is
purely dead code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago i965: Fix buffer object segfault since removal of system memory BOs.
Kenneth Graunke [Tue, 2 Jul 2013 05:08:22 +0000 (22:08 -0700)]
i965: Fix buffer object segfault since removal of system memory BOs.

Commit cf31a19300cbcecddb6bd0f878abb9316ebad2a1 removed support for BOs
backed by system memory, as it was only useful for i915.  However, it
removed a little too much code: intel_bufferobj_buffer() used to call
intel_bufferobj_alloc_buffer(), and after that commit, it didn't.

This led to NULL pointer dereferences in several test cases, such as
es3conform's transform_feedback_state_variables test.

This commit restores the allocation, preserving the original behavior.
It may not be the cleanest approach, but tidying should come later.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66432
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago postprocess: move second temporary assertion into isolated configuration
Matthew McClure [Mon, 1 Jul 2013 21:03:37 +0000 (14:03 -0700)]
postprocess: move second temporary assertion into isolated configuration

With this patch we will only assert that the second temporary is allocated
when there are more than two active filters.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66423

Signed-off-by: Brian Paul <brianp@vmware.com>
11 years ago glsl: Ensure snprintf is defined on MSVC builds.
José Fonseca [Wed, 3 Jul 2013 07:24:08 +0000 (08:24 +0100)]
glsl: Ensure snprintf is defined on MSVC builds.

Should fix:

  src\glsl\opt_dead_builtin_varyings.cpp(244) : error C3861: 'snprintf': identifier not found
  ...

11 years ago targets/xvmc-nouveau: add in missing nv30 lib
Ilia Mirkin [Sun, 23 Jun 2013 16:59:25 +0000 (12:59 -0400)]
targets/xvmc-nouveau: add in missing nv30 lib

Currently libXvMCnouveau.so is missing nv30_screen_create. Add it in so
that it may be dlopen'd.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
11 years ago mesa,glsl,gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependencies
Marek Olšák [Thu, 13 Jun 2013 11:13:34 +0000 (13:13 +0200)]
mesa,glsl,gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependencies

Not needed with do_dead_builtin_varyings.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago st/mesa: disable EXT_separate_shader_objects
Marek Olšák [Wed, 12 Jun 2013 19:38:28 +0000 (21:38 +0200)]
st/mesa: disable EXT_separate_shader_objects

The extension disallows elimination of set-but-unused varyings.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago glsl/linker: eliminate unused and set-but-unused built-in varyings
Marek Olšák [Wed, 12 Jun 2013 11:23:48 +0000 (13:23 +0200)]
glsl/linker: eliminate unused and set-but-unused built-in varyings

This eliminates built-in varyings such as gl_Color, gl_SecondaryColor,
gl_TexCoord, and gl_FogFragCoord if they are unused by the next stage or
not written at all (e.g. gl_TexCoord elements). The gl_TexCoord array is
broken down into separate vec4s if needed.

v2: - use a switch statement in varying_info_visitor::visit(ir_variable*)
    - use snprintf
    - disable the optimization for GLES2

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago glsl/linker: check against varying limit after unused varyings are eliminated
Marek Olšák [Thu, 13 Jun 2013 01:17:22 +0000 (03:17 +0200)]
glsl/linker: check against varying limit after unused varyings are eliminated

We counted even the varyings which were later eliminated, which was
suboptimal.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago glsl/linker: link shaders in the opposite order (from fragment to vertex)
Marek Olšák [Wed, 12 Jun 2013 00:18:09 +0000 (02:18 +0200)]
glsl/linker: link shaders in the opposite order (from fragment to vertex)

This ensures that inter-shader outputs and inputs are properly eliminated
across 3 or more shader stages. The behavior is unchanged with 2 or fewer
shader stages.

For example, elimination of unused FS inputs causes elimination of matching
GS outputs, which causes elimination of the GS inputs that were needed for
evaluation of the eliminated GS outputs, which causes elimination of
matching VS outputs. An unused FS input is all that's needed to trigger
this chain reaction.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
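
The chain reaction described above can be modelled with a toy backward
liveness pass. This is only a sketch and not the linker's actual code: the
toy_stage structure, the bitmask encoding, and the assumption that input i
of a stage matches output i of the previous stage are all made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_OUTPUTS 8   /* toy limit for this sketch */

/* Toy model of one shader stage. deps[o] is a bitmask of the stage's
 * inputs that are read to compute output o. Input i of a stage is
 * assumed to match output i of the previous stage. */
struct toy_stage {
   unsigned num_outputs;
   uint32_t deps[MAX_OUTPUTS];
};

/* Walk the pipeline from the last stage to the first. live_outputs is
 * the set of outputs of the last stage that are actually consumed. On
 * return, live_inputs[s] holds the inputs stage s still needs. */
void
propagate_liveness(const struct toy_stage *stages, unsigned num_stages,
                   uint32_t live_outputs, uint32_t *live_inputs)
{
   for (unsigned s = num_stages; s-- > 0; ) {
      uint32_t needed = 0;

      assert(stages[s].num_outputs <= MAX_OUTPUTS);
      for (unsigned o = 0; o < stages[s].num_outputs; o++) {
         if (live_outputs & (1u << o))
            needed |= stages[s].deps[o];
      }
      live_inputs[s] = needed;
      live_outputs = needed;   /* becomes the live outputs of stage s - 1 */
   }
}

/* VS -> GS -> FS where the FS never reads its input 1.
 * Returns the bitmask of VS inputs that are still needed. */
uint32_t
example_live_vs_inputs(void)
{
   const struct toy_stage stages[3] = {
      { 2, { 0x1, 0x6 } },   /* VS: out0 <- attr 0, out1 <- attrs 1 and 2 */
      { 2, { 0x1, 0x2 } },   /* GS: out0 <- in0, out1 <- in1 */
      { 1, { 0x1 } },        /* FS: color <- in0; in1 is unused */
   };
   uint32_t live_inputs[3];

   propagate_liveness(stages, 3, 0x1, live_inputs);
   return live_inputs[0];
}
```

In this model the unused FS input kills GS output 1, which kills GS input
1, which kills VS output 1 and with it the need for attributes 1 and 2.
Processing the stages in the forward direction would not see that chain in
a single pass.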
11 years ago mesa: renumber shader indices according to their placement in pipeline
Marek Olšák [Wed, 12 Jun 2013 15:15:46 +0000 (17:15 +0200)]
mesa: renumber shader indices according to their placement in pipeline

See my explanation in mtypes.h.

v2: don't do this in gallium
v3: also updated the comment at the gl_shader_type definition

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago gallivm: Simplify intrinsic name construction.
José Fonseca [Tue, 2 Jul 2013 05:53:25 +0000 (06:53 +0100)]
gallivm: Simplify intrinsic name construction.

Just noticed this could be slightly shortened when fixing MSVC build.

Trivial.

11 years ago glsl/builtins: Fix ARB_texture_cube_map_array built-in availability.
Kenneth Graunke [Fri, 28 Jun 2013 20:46:44 +0000 (13:46 -0700)]
glsl/builtins: Fix ARB_texture_cube_map_array built-in availability.

This patch adds texture() for isamplerCubeArray and usamplerCubeArray,
which were entirely missing.

It also makes texture() with a LOD bias fragment shader specific.  The
main GLSL specification explicitly says that texturing with LOD bias
should not be allowed for vertex shaders.

Affects Piglit's ARB_texture_cube_map_array/compiler/tex_bias-01.vert,
which tries to use bias in a vertex shader.  Currently, it expects this
to pass (so this patch regresses the test), but I've sent a patch to
reverse the expected behavior (so this patch would fix the updated test):
http://lists.freedesktop.org/archives/piglit/2013-June/006123.html

NOTE: This is a candidate for stable branches.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
11 years ago gallivm: Fix MSVC build.
José Fonseca [Tue, 2 Jul 2013 05:41:32 +0000 (06:41 +0100)]
gallivm: Fix MSVC build.

11 years ago gallivm: Fix indirect immediate registers.
José Fonseca [Mon, 1 Jul 2013 19:54:19 +0000 (20:54 +0100)]
gallivm: Fix indirect immediate registers.

If reg->Register.Indirect is true then the immediate is not truly a
constant LLVM expression.

There is no performance regression in using LLVMBuildBitCast, as it will
fall back to LLVMConstBitCast internally when the argument is a constant.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
11 years ago gallium/tests: fix the translate test
Zack Rusin [Fri, 28 Jun 2013 13:42:35 +0000 (09:42 -0400)]
gallium/tests: fix the translate test

11 years ago i965: Enable ext_framebuffer_multisample_blit_scaled on intel h/w
Anuj Phogat [Tue, 14 May 2013 15:15:59 +0000 (08:15 -0700)]
i965: Enable ext_framebuffer_multisample_blit_scaled on intel h/w

This patch enables ext_framebuffer_multisample_blit_scaled extension
on intel h/w >= gen6.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years ago i965/blorp: Add bilinear filtering of samples for multisample scaled blits
Anuj Phogat [Fri, 31 May 2013 17:59:50 +0000 (10:59 -0700)]
i965/blorp: Add bilinear filtering of samples for multisample scaled blits

The current implementation of ext_framebuffer_multisample_blit_scaled in
i965/blorp uses nearest filtering for multisample scaled blits. Using
nearest filtering produces blocky artifacts and negates the benefits
of MSAA. That is the reason why the extension was not enabled on i965.

This patch implements the bilinear filtering of samples in the blorp engine.
Images generated with this patch are free from blocky artifacts and show
a big improvement in visual quality.

Observed no piglit and gles3 regressions.

V3:
- Algorithm used for filtering assumes a rectangular grid of samples
  roughly corresponding to sample locations.
- Test the boundary conditions on the edges of texture.

V4:
- Clip texcoords and use conditional MOVs.
- Send texture dimensions as push constants.
- Remove the optimization in case of scaled multisample blits.

V5:
- Move mcs_fetch() inside the 'for' loop after computing pixel coordinates.

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Acked-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years ago docs: Import 9.1.4 release notes, add news item.
Ian Romanick [Mon, 1 Jul 2013 21:42:13 +0000 (14:42 -0700)]
docs: Import 9.1.4 release notes, add news item.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago draw/translate: fix instancing
Zack Rusin [Fri, 28 Jun 2013 00:40:10 +0000 (20:40 -0400)]
draw/translate: fix instancing

We were incorrectly computing the buffer offset when using
instances. The buffer offset is always equal to:

  start_instance * stride + (instance_num / instance_divisor) * stride

We were completely ignoring the start instance, quite often
producing instances that were completely wrong, e.g. if
start instance = 5, instance divisor = 2, then on the first
iteration it should be:
  5 * stride, not (5/2) * stride as we'd have currently, and if
start instance = 1, instance divisor = 3, then on the first
iteration it should be:
  1 * stride, not 0 as we'd have.
This fixes it and adjusts all the code to the changes.

Signed-off-by: Zack Rusin <zackr@vmware.com>
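
The offset formula from the message above, written out as a small C
sketch. The function name instance_buffer_offset is hypothetical, the
stride of 16 in the usage note is an arbitrary choice, and integer
overflow is deliberately ignored here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper implementing:
 *   start_instance * stride + (instance_num / instance_divisor) * stride
 * Overflow is not handled in this sketch. */
uint32_t
instance_buffer_offset(uint32_t start_instance, uint32_t instance_num,
                       uint32_t instance_divisor, uint32_t stride)
{
   /* The formula divides by the divisor, so it must be non-zero. */
   assert(instance_divisor != 0);
   return start_instance * stride +
          (instance_num / instance_divisor) * stride;
}
```

With a stride of 16, the two cases from the message give
instance_buffer_offset(5, 0, 2, 16) == 80, i.e. 5 * stride, and
instance_buffer_offset(1, 0, 3, 16) == 16, i.e. 1 * stride, on the first
iteration.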
11 years ago draw: fix incorrect clipper invocation statistics
Zack Rusin [Thu, 27 Jun 2013 19:24:52 +0000 (15:24 -0400)]
draw: fix incorrect clipper invocation statistics

Clipper invocations are computed earlier (of course
before the emission), so this code was adding bogus
numbers to already computed clipper invocations.

Signed-off-by: Zack Rusin <zackr@vmware.com>
11 years ago draw/gallivm: export overflow arithmetic to its own file
Zack Rusin [Thu, 27 Jun 2013 19:23:21 +0000 (15:23 -0400)]
draw/gallivm: export overflow arithmetic to its own file

We'll be reusing this code, so let's put it in a common file
and use it in the draw module.

Signed-off-by: Zack Rusin <zackr@vmware.com>
11 years ago draw: check for integer overflows in instance computation
Zack Rusin [Tue, 25 Jun 2013 21:01:14 +0000 (17:01 -0400)]
draw: check for integer overflows in instance computation

Integers could easily overflow if the starting instance
was large enough. Instead of letting bogus counts through,
set the instance to max if it overflowed and let our
regular buffer overflow computation handle it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
11 years ago draw: check for an integer overflow when computing stride
Zack Rusin [Tue, 25 Jun 2013 20:14:06 +0000 (16:14 -0400)]
draw: check for an integer overflow when computing stride

Our buffer overflow arithmetic was susceptible to integer
overflows, which caused the buffer overflow logic to break.
Let's use the llvm overflow intrinsics to check for integer
overflows while computing the stride/needed buffer size.

Signed-off-by: Zack Rusin <zackr@vmware.com>
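
The commit relies on the llvm overflow intrinsics inside generated code.
The same idea can be sketched in portable C with explicit checks instead.
All names below are hypothetical, and saturating the result to UINT32_MAX
on overflow is an assumption made for this example so that a later bounds
check rejects the value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a * b does not fit in 32 bits; otherwise stores it. */
bool
mul_overflows_u32(uint32_t a, uint32_t b, uint32_t *result)
{
   assert(result);
   if (b != 0 && a > UINT32_MAX / b)
      return true;
   *result = a * b;
   return false;
}

/* Returns true if a + b does not fit in 32 bits; otherwise stores it. */
bool
add_overflows_u32(uint32_t a, uint32_t b, uint32_t *result)
{
   assert(result);
   if (a > UINT32_MAX - b)
      return true;
   *result = a + b;
   return false;
}

/* Computes index * stride + elem_size, saturating to UINT32_MAX if any
 * step overflows. */
uint32_t
needed_size_or_max(uint32_t index, uint32_t stride, uint32_t elem_size)
{
   uint32_t offset, size;

   if (mul_overflows_u32(index, stride, &offset))
      return UINT32_MAX;
   if (add_overflows_u32(offset, elem_size, &size))
      return UINT32_MAX;
   return size;
}
```

Without the checks, an index of 0x10000000 with a stride of 16 would wrap
to an offset of 0 and pass any comparison against the buffer size.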
11 years ago draw: account for elem size when computing overflow
Zack Rusin [Tue, 25 Jun 2013 17:54:47 +0000 (13:54 -0400)]
draw: account for elem size when computing overflow

We weren't taking into account the size of the element
that is to be fetched, which meant that it was possible
to overflow the buffer reads if the stride was very
close to the end of the buffer, e.g. stride = 3, buffer
size = 4, and the element to be read = 4. This should
be properly detected as an overflow.

Signed-off-by: Zack Rusin <zackr@vmware.com>
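
A small C sketch of a bounds check that includes the element size. It
reads the example above as a fetch starting at byte offset 3 in a 4-byte
buffer with a 4-byte element, which is an interpretation and not
something the message states outright. The name fetch_overflows is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if reading elem_size bytes at offset would run past the
 * end of a buffer of buffer_size bytes. */
bool
fetch_overflows(uint32_t offset, uint32_t elem_size, uint32_t buffer_size)
{
   if (elem_size > buffer_size)
      return true;
   /* Compare against buffer_size - elem_size so the sum cannot wrap. */
   return offset > buffer_size - elem_size;
}

/* The case from the message, plus a read that fits exactly. */
void
fetch_overflows_demo(void)
{
   assert(fetch_overflows(3, 4, 4));    /* starts in bounds, ends outside */
   assert(!fetch_overflows(0, 4, 4));   /* fills the buffer exactly */
}
```

A check on the starting offset alone would accept the first case, because
offset 3 is inside the buffer even though the read ends 3 bytes past it.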
11 years ago i965: Initialize brw_blorp_const_color_program member variables.
Vinson Lee [Fri, 28 Jun 2013 05:40:20 +0000 (22:40 -0700)]
i965: Initialize brw_blorp_const_color_program member variables.

Fixes "Uninitialized scalar field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years ago eglplatform: use unsigned long instead of 32-bit ints in generic platform
Ross Burton [Thu, 27 Jun 2013 11:35:08 +0000 (12:35 +0100)]
eglplatform: use unsigned long instead of 32-bit ints in generic platform

In the generic Unix case use the "unsigned long" type instead of 32-bit
integers so that the type sizes are consistent on 64-bit machines between X11
and not-X11.

Signed-off-by: Ross Burton <ross.burton@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago build: fix EGL build when no X11 headers are present
Ross Burton [Thu, 27 Jun 2013 11:35:07 +0000 (12:35 +0100)]
build: fix EGL build when no X11 headers are present

eglplatform.h defaults to X11 on Unix unless told otherwise, so if we're doing a
build without any X11 support tell it so that we don't try including headers
that don't exist.

Also set GL_PC_FLAGS so that the definition is in egl.pc, so that applications
using EGL don't try to pull in X11 headers on systems where EGL was configured
without X11 support.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64959
Signed-off-by: Ross Burton <ross.burton@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago tools/trace: Return dummy fence object to silence warnings.
José Fonseca [Mon, 1 Jul 2013 11:06:37 +0000 (12:06 +0100)]
tools/trace: Return dummy fence object to silence warnings.

11 years agotools/trace: Don't crash if a trace has no timing information.
José Fonseca [Mon, 1 Jul 2013 11:05:57 +0000 (12:05 +0100)]
tools/trace: Don't crash if a trace has no timing information.

11 years agoscons: Fix dependencies of enums.c and api_exec.c.
José Fonseca [Mon, 1 Jul 2013 11:04:59 +0000 (12:04 +0100)]
scons: Fix dependencies of enums.c and api_exec.c.

11 years agonvc0: allow frame dropping in h264
Maarten Lankhorst [Mon, 1 Jul 2013 06:47:49 +0000 (08:47 +0200)]
nvc0: allow frame dropping in h264

The only reason the checks existed was paranoia: when I first
wrote the code I wasn't sure it was correct. Now that I am, and since
the asserts triggered when XBMC was dropping frames, remove them.

NOTE: This is a candidate for the 9.1 branch.

11 years agor300g/compiler: Prevent regalloc from swizzling texture operands v2
Tom Stellard [Mon, 20 May 2013 15:05:03 +0000 (08:05 -0700)]
r300g/compiler: Prevent regalloc from swizzling texture operands v2

https://bugs.freedesktop.org/show_bug.cgi?id=63520

NOTE: This is a candidate for the stable branches.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years agor300g/compiler/tests: Add an assembly parser
Tom Stellard [Mon, 11 Mar 2013 00:55:26 +0000 (20:55 -0400)]
r300g/compiler/tests: Add an assembly parser

The assembly parser can be used to load r300 assembly dumps
and run them through any of the r300 compiler passes.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years agor300g: Fix make check
Tom Stellard [Thu, 16 May 2013 16:33:21 +0000 (18:33 +0200)]
r300g: Fix make check

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years agor600g: implement fast color clears for MSAA on evergreen+
Grigori Goronzy [Tue, 11 Jun 2013 22:04:01 +0000 (00:04 +0200)]
r600g: implement fast color clears for MSAA on evergreen+

Allows MSAA colorbuffers, which have a CMASK automatically and don't
need any further special handling, to be fast cleared. Instead
of clearing the buffer, set the clear color and the CMASK to the
cleared state.

Fast clear is used only when all bound colorbuffers fulfill certain
conditions: a CMASK is required, we have to be able to create a clear
color value for the format and the texture mustn't contain multiple
images. Technically, it should be possible to support array textures
and cubemaps if all images are attached to the framebuffer,
but this does not appear to be common.

v2: fix fast clear check
v3: Marek: - disable fast clear with 128-bit formats, which are unsupported
           - set tex->dirty_level_mask in r600_clear, so that the driver knows
             the resource must be decompressed/expanded
           - return early from r600_clear if there's nothing else to do
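
The conditions listed above can be sketched as a predicate over the bound
colorbuffers. This is only an illustration under stated assumptions: the
struct and its fields are hypothetical simplifications, not the r600g data
structures, and returning false when nothing is bound is a choice made for
this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a bound colorbuffer. */
struct colorbuffer {
    bool     has_cmask;             /* MSAA colorbuffers get a CMASK automatically */
    bool     clear_color_encodable; /* a clear color value can be built for the format */
    unsigned bits_per_pixel;        /* 128-bit formats are unsupported (v3) */
    unsigned num_images;            /* > 1 for array textures, cubemaps, ... */
};

static bool can_fast_clear_one(const struct colorbuffer *cb)
{
    return cb->has_cmask &&
           cb->clear_color_encodable &&
           cb->bits_per_pixel < 128 &&
           cb->num_images == 1;
}

/* Fast clear is only used when *all* bound colorbuffers qualify. */
static bool can_fast_clear(const struct colorbuffer *cbs, size_t count)
{
    if (count == 0)
        return false; /* assumption: nothing bound, nothing to fast clear */

    for (size_t i = 0; i < count; i++) {
        if (!can_fast_clear_one(&cbs[i]))
            return false;
    }
    return true;
}
```

If the predicate holds, the clear itself reduces to setting the clear color
and putting the CMASK into the cleared state, as described in the commit
message.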

Signed-off-by: Marek Olšák <maraeo@gmail.com>
11 years agor600g/compute: disable unused colorbuffer slots
Marek Olšák [Mon, 24 Jun 2013 01:04:58 +0000 (03:04 +0200)]
r600g/compute: disable unused colorbuffer slots

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
11 years agost/mesa: handle SNORM formats in generic CopyPixels path
Marek Olšák [Thu, 6 Jun 2013 11:45:24 +0000 (13:45 +0200)]
st/mesa: handle SNORM formats in generic CopyPixels path

v2: check desc->is_mixed in util_format_is_snorm

11 years agoi965: NULL check depth_mt to quiet static analysis.
Matt Turner [Thu, 27 Jun 2013 18:18:36 +0000 (11:18 -0700)]
i965: NULL check depth_mt to quiet static analysis.

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
11 years agollvmpipe: fix timer query if there's no bins
Roland Scheidegger [Fri, 28 Jun 2013 14:53:46 +0000 (16:53 +0200)]
llvmpipe: fix timer query if there's no bins

b04a295a4a0cd2defe352b3193b5fa79ca8fc9fc removed seemingly unnecessary
code in get_query. Turns out this code could in fact be reached - while
timestamps are always binned, if there are no bins (which happens if fb
size is 0) then the rasterization query code filling this in is still
never executed.
So fix this up by filling in some timestamp, but do it at EndQuery time
rather than GetQuery time, which should be more appropriate.
Makes piglit arb_timer_query-timestamp-get happy again.
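
A rough sketch of the idea, not the llvmpipe code: all names are hypothetical,
and the current time is passed in as a parameter to keep the example
deterministic. The query gets a fallback timestamp when it ends, and the
binned rasterizer path overwrites it if it ever runs.

```c
#include <stdint.h>

/* Hypothetical query object; 0 means "no timestamp written yet". */
struct timer_query {
    uint64_t timestamp_ns;
};

/* Called at EndQuery time: always store a fallback value, so the result
 * is valid even if no bin is ever rasterized (e.g. framebuffer size 0). */
static void end_query(struct timer_query *q, uint64_t now_ns)
{
    q->timestamp_ns = now_ns;
}

/* Called from the rasterizer for a binned timestamp; this only happens
 * when at least one bin exists, and it replaces the fallback value. */
static void rasterize_timestamp(struct timer_query *q, uint64_t now_ns)
{
    q->timestamp_ns = now_ns;
}

/* GetQuery only reads the stored value and no longer has to patch it up. */
static uint64_t get_query_result(const struct timer_query *q)
{
    return q->timestamp_ns;
}
```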

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
11 years agoclover: Don't segfault when compiling a program with no kernel
Tom Stellard [Thu, 6 Jun 2013 00:05:43 +0000 (17:05 -0700)]
clover: Don't segfault when compiling a program with no kernel

11 years agomesa: Remove unused allow_large_textures driconf from classic drivers.
Eric Anholt [Wed, 26 Jun 2013 19:55:32 +0000 (12:55 -0700)]
mesa: Remove unused allow_large_textures driconf from classic drivers.

This option hasn't been used since the introduction of DRI2.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi915: Remove GLES 3.0 sRGB workaround.
Kenneth Graunke [Fri, 21 Jun 2013 22:55:32 +0000 (15:55 -0700)]
i915: Remove GLES 3.0 sRGB workaround.

Gen3 doesn't support GLES 3.0, so there's no need for it.

Acked-by: Eric Anholt <eric@anholt.net>
11 years agoi965: Remove is_945.
Kenneth Graunke [Fri, 21 Jun 2013 22:52:43 +0000 (15:52 -0700)]
i965: Remove is_945.

Only relevant on Gen3.

Acked-by: Eric Anholt <eric@anholt.net>
11 years agoi965: Delete hw_stencil flag.
Kenneth Graunke [Fri, 21 Jun 2013 22:34:20 +0000 (15:34 -0700)]
i965: Delete hw_stencil flag.

This was only used by i915.

Acked-by: Eric Anholt <eric@anholt.net>
11 years agoi965: Remove hw_stipple flag.
Kenneth Graunke [Fri, 21 Jun 2013 22:33:45 +0000 (15:33 -0700)]
i965: Remove hw_stipple flag.

This was only used by i915.

Acked-by: Eric Anholt <eric@anholt.net>
11 years agoi965: Remove use_early_z option.
Kenneth Graunke [Fri, 21 Jun 2013 22:22:27 +0000 (15:22 -0700)]
i965: Remove use_early_z option.

This was only used by i965+.

v2: Also remove the option from the driconf list. (change by anholt)

Reviewed-by: Eric Anholt <eric@anholt.net>
11 years agoi965: Remove unused SUBPIXEL_* macros.
Kenneth Graunke [Fri, 21 Jun 2013 22:19:34 +0000 (15:19 -0700)]
i965: Remove unused SUBPIXEL_* macros.

Acked-by: Eric Anholt <eric@anholt.net>
11 years agoi965: Remove redundant Gen3 PCI IDs.
Kenneth Graunke [Fri, 21 Jun 2013 22:18:46 +0000 (15:18 -0700)]
i965: Remove redundant Gen3 PCI IDs.

Acked-by: Eric Anholt <eric@anholt.net>
11 years agointel: Remove unused INTEL_MAX_FIXUP macro.
Kenneth Graunke [Fri, 21 Jun 2013 22:18:29 +0000 (15:18 -0700)]
intel: Remove unused INTEL_MAX_FIXUP macro.

v2: Remove it from i915, too (change by anholt)

Acked-by: Eric Anholt <eric@anholt.net>
11 years agoi965: Drop i915 register/instruction definitions.
Eric Anholt [Fri, 21 Jun 2013 18:23:49 +0000 (11:23 -0700)]
i965: Drop i915 register/instruction definitions.

v2: Remove unused DV_PF_* macros, too. (change by Ken)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Drop code for calling the empty brw_update_draw_buffers() hook.
Eric Anholt [Fri, 21 Jun 2013 18:08:34 +0000 (11:08 -0700)]
i965: Drop code for calling the empty brw_update_draw_buffers() hook.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Drop dead i915 blend state code.
Eric Anholt [Fri, 21 Jun 2013 17:02:49 +0000 (10:02 -0700)]
i965: Drop dead i915 blend state code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Drop i915-specific blit clear code.
Eric Anholt [Fri, 21 Jun 2013 16:58:59 +0000 (09:58 -0700)]
i965: Drop i915-specific blit clear code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Drop the system-memory VBO support for i915.
Eric Anholt [Fri, 21 Jun 2013 16:57:12 +0000 (09:57 -0700)]
i965: Drop the system-memory VBO support for i915.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Drop i915 swtnl code.
Eric Anholt [Fri, 21 Jun 2013 16:54:58 +0000 (09:54 -0700)]
i965: Drop i915 swtnl code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Drop i915-specific vtbl entries.
Eric Anholt [Fri, 21 Jun 2013 16:47:32 +0000 (09:47 -0700)]
i965: Drop i915-specific vtbl entries.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Drop swtnl fallback code for i915.
Eric Anholt [Fri, 21 Jun 2013 16:26:01 +0000 (09:26 -0700)]
i965: Drop swtnl fallback code for i915.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Drop i915 code from intel_screen.
Eric Anholt [Fri, 21 Jun 2013 16:22:39 +0000 (09:22 -0700)]
i965: Drop i915 code from intel_screen.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Drop #ifdef I915 code.
Eric Anholt [Thu, 20 Jun 2013 23:32:20 +0000 (16:32 -0700)]
i965: Drop #ifdef I915 code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>