mesa.git
10 years agogallium/u_blitter: implement shader-based MSAA resolve
Marek Olšák [Fri, 6 Dec 2013 22:55:05 +0000 (23:55 +0100)]
gallium/u_blitter: implement shader-based MSAA resolve

We need this for integer formats and upside-down blits, which Radeons don't
support for MSAA resolving.

It can be used by calling util_blitter_blit.

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agogallium/u_blitter: remove useless parameters from some functions
Marek Olšák [Fri, 6 Dec 2013 21:39:48 +0000 (22:39 +0100)]
gallium/u_blitter: remove useless parameters from some functions

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agost/dri: resolve sRGB buffers in linear colorspace
Marek Olšák [Tue, 10 Dec 2013 16:46:41 +0000 (17:46 +0100)]
st/dri: resolve sRGB buffers in linear colorspace

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agogallivm: fix pointer type for stmxcsr/ldmxcsr
Roland Scheidegger [Fri, 13 Dec 2013 20:20:05 +0000 (21:20 +0100)]
gallivm: fix pointer type for stmxcsr/ldmxcsr

The argument is an i8 pointer, not an i32 pointer (even though the value actually
stored/loaded IS i32). Older llvm versions didn't care, but 3.2 and newer do,
leading to crashes.

Reviewed-by: Zack Rusin <zackr@vmware.com>
10 years agollvmpipe: get rid of barycentric calculation of a0
Roland Scheidegger [Fri, 13 Dec 2013 17:31:07 +0000 (18:31 +0100)]
llvmpipe: get rid of barycentric calculation of a0

Didn't really work as well as hoped (in particular it was not generally
more accurate), will solve this differently.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years agollvmpipe: (trivial) get rid of triangle subdivision code
Roland Scheidegger [Fri, 13 Dec 2013 00:09:35 +0000 (01:09 +0100)]
llvmpipe: (trivial) get rid of triangle subdivision code

This code was always problematic, and with 64bit rasterization we no longer
need it at all.

Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoi965: Treat Haswell as 75 in the surface format table.
Kenneth Graunke [Wed, 8 May 2013 21:38:25 +0000 (14:38 -0700)]
i965: Treat Haswell as 75 in the surface format table.

Much like we do for G45.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agomesa: fix texture view use of _mesa_get_tex_image()
Chris Forbes [Sun, 8 Dec 2013 06:04:40 +0000 (19:04 +1300)]
mesa: fix texture view use of _mesa_get_tex_image()

The target parameter to _mesa_get_tex_image() is a target enum, not an index.
When we're setting up faces for a cubemap, it should be
CUBE_MAP_POSITIVE_X .. CUBE_MAP_NEGATIVE_Z; for all other targets it
should be the same as the texobj's target.

Fixes broken cubemaps [had only +X face but claimed to have all] produced by
glTextureView, which then caused various crashes in the driver when we
tried to use them.
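
A minimal sketch of the target selection described above, not the code from the patch. The helper name face_target is hypothetical; the enum values are the standard OpenGL ones.

```c
#include <assert.h>

/* Enum values as defined in the OpenGL headers. */
#define GL_TEXTURE_2D                   0x0DE1
#define GL_TEXTURE_CUBE_MAP             0x8513
#define GL_TEXTURE_CUBE_MAP_POSITIVE_X  0x8515
#define GL_TEXTURE_CUBE_MAP_NEGATIVE_Z  0x851A

/* Hypothetical helper: choose the target enum (not a face index) to pass
 * when looking up the image for a given face of a texture object. */
static unsigned
face_target(unsigned texobj_target, unsigned face)
{
   if (texobj_target == GL_TEXTURE_CUBE_MAP) {
      /* The six face enums are consecutive, +X first and -Z last. */
      assert(face < 6);
      return GL_TEXTURE_CUBE_MAP_POSITIVE_X + face;
   }

   /* Every other target has a single "face" and uses its own target. */
   assert(face == 0);
   return texobj_target;
}
```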

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoi965/fs: add support for gl_SampleMaskIn[]
Chris Forbes [Sun, 8 Dec 2013 07:29:43 +0000 (20:29 +1300)]
i965/fs: add support for gl_SampleMaskIn[]

v2: - add assert so we don't run into trouble on Gen6.
    - adjust for Tapani's rearrangement of ir_variable

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoglsl: add gl_SampleMaskIn[] builtin
Chris Forbes [Sun, 8 Dec 2013 07:03:25 +0000 (20:03 +1300)]
glsl: add gl_SampleMaskIn[] builtin

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agomesa: add SYSTEM_VALUE_SAMPLE_MASK_IN
Chris Forbes [Sun, 8 Dec 2013 07:01:24 +0000 (20:01 +1300)]
mesa: add SYSTEM_VALUE_SAMPLE_MASK_IN

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agomesa: document _mesa_texstore() return value
Brian Paul [Sat, 14 Dec 2013 00:02:43 +0000 (17:02 -0700)]
mesa: document _mesa_texstore() return value

10 years agost/mesa: only set up sampler compare mode for depth textures
Brian Paul [Fri, 13 Dec 2013 16:52:15 +0000 (09:52 -0700)]
st/mesa: only set up sampler compare mode for depth textures

The GL_ARB_shadow spec says the shadow compare mode should have no
effect when sampling a color texture.  As it was, it was up to
drivers to check for that (softpipe, llvmpipe, svga and probably
the rest don't do that).  Note: it looks like DX10 allows shadow
compare with some non-depth formats, so this case really should be
handled in the state tracker.
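
A rough sketch of the rule, assuming a hypothetical enum and helper name (the real state tracker works on pipe_sampler_state and Mesa format queries, which are not reproduced here):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the sampler compare modes. */
enum compare_mode {
   COMPARE_NONE,
   COMPARE_R_TO_TEXTURE
};

/* Only forward the requested compare mode to the driver when the bound
 * texture has a depth format; for color textures it must have no effect. */
static enum compare_mode
effective_compare_mode(enum compare_mode requested, bool is_depth_format)
{
   assert(requested == COMPARE_NONE || requested == COMPARE_R_TO_TEXTURE);
   return is_depth_format ? requested : COMPARE_NONE;
}
```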

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agost/mesa: add const qualifiers in sampler validation code
Brian Paul [Fri, 13 Dec 2013 16:33:49 +0000 (09:33 -0700)]
st/mesa: add const qualifiers in sampler validation code

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agost/mesa: add const qualifier to st_translate_color()
Brian Paul [Fri, 13 Dec 2013 16:28:07 +0000 (09:28 -0700)]
st/mesa: add const qualifier to st_translate_color()

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agost/mesa: simplify integer texture check
Brian Paul [Fri, 13 Dec 2013 16:26:24 +0000 (09:26 -0700)]
st/mesa: simplify integer texture check

Just use the gl_texture_object::_IsInteger field instead of
computing it from scratch.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agomesa: update glext.h to version 20131212
Brian Paul [Thu, 12 Dec 2013 21:55:33 +0000 (14:55 -0700)]
mesa: update glext.h to version 20131212

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agosvga: don't emit extraneous fs shadow code
Brian Paul [Mon, 9 Dec 2013 20:50:44 +0000 (12:50 -0800)]
svga: don't emit extraneous fs shadow code

Depending on the depth texture format, we may or may not have to
emit explicit fs code to do the shadow comparison.  Before, we
were emitting it more often than needed.

v2: check the actual texture format rather than the screen->depth.z16
field.  The screen->depth.z16, x8z24, s8z24 fields may not all be set
to a consistent set of depth formats.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
10 years agomesa: s/uint/GLuint/ to fix MSVC error
Brian Paul [Fri, 13 Dec 2013 19:51:10 +0000 (12:51 -0700)]
mesa: s/uint/GLuint/ to fix MSVC error

10 years agomesa: Update TexStorage to support ARB_texture_view
Courtney Goeltzenleuchter [Mon, 4 Nov 2013 20:31:37 +0000 (13:31 -0700)]
mesa: Update TexStorage to support ARB_texture_view

Call TextureView helper function to set TextureView state
appropriately for the TexStorage calls.

Misc updates from review feedback.

Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: add texture_view helper function for TexStorage
Courtney Goeltzenleuchter [Wed, 6 Nov 2013 21:40:31 +0000 (14:40 -0700)]
mesa: add texture_view helper function for TexStorage

Add helper function to set texture_view state from TexStorage calls.
Include review feedback.

Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Fill out ARB_texture_view entry points
Courtney Goeltzenleuchter [Wed, 6 Nov 2013 21:34:09 +0000 (14:34 -0700)]
mesa: Fill out ARB_texture_view entry points

Add Mesa TextureView logic.
Incorporate feedback on ARB_texture_view:
- Add S3TC VIEW_CLASSes to compatibility table
- Use existing _mesa_get_tex_image
- Clean up error strings
- Use bool instead of GLboolean for internal functions
- Split compound level & layer test into individual tests
- eliminate helper macro for VIEW_CLASS table
- do not call driver if ptr null.

Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: consolidate multiple next_mipmap_level_size
Courtney Goeltzenleuchter [Mon, 25 Nov 2013 23:31:26 +0000 (16:31 -0700)]
mesa: consolidate multiple next_mipmap_level_size

Refactor to make next_mipmap_level_size defined in mipmap.c a
_mesa_ helper function that can then be used by texture_view
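
For illustration only, a simplified sketch of what such a helper might compute. The name next_level_size is hypothetical, and the sketch ignores the texture target, so it does not model array textures, whose layer count would not shrink.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified helper: step the dimensions to the next smaller
 * mipmap level.  Returns false when the current level is already 1x1x1,
 * i.e. there is no further level. */
static bool
next_level_size(int *width, int *height, int *depth)
{
   assert(*width >= 1 && *height >= 1 && *depth >= 1);

   if (*width == 1 && *height == 1 && *depth == 1)
      return false;

   /* Each dimension is halved independently and never drops below 1. */
   if (*width > 1)
      *width /= 2;
   if (*height > 1)
      *height /= 2;
   if (*depth > 1)
      *depth /= 2;

   return true;
}
```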

Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Add driver entry point for ARB_texture_view
Courtney Goeltzenleuchter [Mon, 4 Nov 2013 21:09:22 +0000 (14:09 -0700)]
mesa: Add driver entry point for ARB_texture_view

Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: ARB_texture_view get parameters
Courtney Goeltzenleuchter [Mon, 4 Nov 2013 20:29:48 +0000 (13:29 -0700)]
mesa: ARB_texture_view get parameters

Add support for ARB_texture_view get parameters:
GL_TEXTURE_VIEW_MIN_LEVEL
GL_TEXTURE_VIEW_NUM_LEVELS
GL_TEXTURE_VIEW_MIN_LAYER
GL_TEXTURE_VIEW_NUM_LAYERS

Incorporate feedback regarding when to allow query of
GL_TEXTURE_IMMUTABLE_LEVELS.

Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: update texture object for ARB_texture_view
Courtney Goeltzenleuchter [Mon, 4 Nov 2013 20:22:30 +0000 (13:22 -0700)]
mesa: update texture object for ARB_texture_view

Add state needed by glTextureView to the gl_texture_object.

Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Tracking for ARB_texture_view extension
Courtney Goeltzenleuchter [Mon, 4 Nov 2013 20:18:55 +0000 (13:18 -0700)]
mesa: Tracking for ARB_texture_view extension

Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Add API definitions for ARB_texture_view
Courtney Goeltzenleuchter [Mon, 4 Nov 2013 21:08:16 +0000 (14:08 -0700)]
mesa: Add API definitions for ARB_texture_view

Stub in glTextureView API call to go with the
glTextureView API xml definition.
Includes dispatch test for glTextureView

Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Fix error code generation in glBeginConditionalRender()
Anuj Phogat [Thu, 12 Dec 2013 22:34:27 +0000 (14:34 -0800)]
mesa: Fix error code generation in glBeginConditionalRender()

This patch changes the error condition to satisfy below statement
from OpenGL 4.3 core specification:
"An INVALID_OPERATION error is generated if id is the name of a query
object with a target other than SAMPLES_PASSED, ANY_SAMPLES_PASSED, or
ANY_SAMPLES_PASSED_CONSERVATIVE, or if id is the name of a query
currently in progress."
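
The quoted condition could be checked roughly as follows. This is a sketch under assumptions: struct query and check_conditional_render_query are hypothetical names, not Mesa's; the enum values are the standard OpenGL ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Enum values as defined in the OpenGL headers. */
#define GL_NO_ERROR                         0
#define GL_INVALID_OPERATION                0x0502
#define GL_TIME_ELAPSED                     0x88BF
#define GL_SAMPLES_PASSED                   0x8914
#define GL_ANY_SAMPLES_PASSED               0x8C2F
#define GL_ANY_SAMPLES_PASSED_CONSERVATIVE  0x8D6A

/* Hypothetical, minimal query object. */
struct query {
   unsigned target;
   bool active;   /* true while the query is in progress */
};

/* Returns the GL error the spec text asks for, or GL_NO_ERROR. */
static unsigned
check_conditional_render_query(const struct query *q)
{
   assert(q != NULL);

   if (q->target != GL_SAMPLES_PASSED &&
       q->target != GL_ANY_SAMPLES_PASSED &&
       q->target != GL_ANY_SAMPLES_PASSED_CONSERVATIVE)
      return GL_INVALID_OPERATION;

   if (q->active)
      return GL_INVALID_OPERATION;

   return GL_NO_ERROR;
}
```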

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoMakefile: Add bin/test-driver to EXTRA_FILES
Carl Worth [Fri, 13 Dec 2013 05:33:02 +0000 (21:33 -0800)]
Makefile: Add bin/test-driver to EXTRA_FILES

I'm not sure why this change is necessary. When I've built previous tar files
(such as 9.2.4) with the "make tarballs" target, they include the
bin/test-driver file. But at my first attempt to build the tar files for the
10.0.1 release this file was not being included and the build failed.

(cherry picked from commit d573899b932435b0b37a7a33ebcbdc3c8cedd3e1)

[The cherry pick is because I originally applied this on the 10.0 branch while
working on the 10.0.1 release. But if we don't have this on master as well,
this issue will trip us up again the next time we make a new major-release
branch off of master.]

10 years agodri_util: Don't assume __DRIcontext->driverPrivate is a gl_context
Kristian Høgsberg [Sun, 8 Dec 2013 06:02:11 +0000 (22:02 -0800)]
dri_util: Don't assume __DRIcontext->driverPrivate is a gl_context

The driverPrivate pointer is opaque to the driver and we can't assume
it's a struct gl_context in dri_util.c.  Instead provide a helper function
to set the struct gl_context flags from the incoming DRI context flags.
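
Such a helper might look roughly like the sketch below. The struct and function names are hypothetical stand-ins (the real helper is driContextSetFlags and operates on struct gl_context), and the DRI flag values are assumed for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for the incoming DRI context flags. */
#define DRI_CTX_FLAG_DEBUG                      0x1
#define DRI_CTX_FLAG_FORWARD_COMPATIBLE         0x2

/* GL context flag bits as defined in the OpenGL headers. */
#define GL_CONTEXT_FLAG_FORWARD_COMPATIBLE_BIT  0x1
#define GL_CONTEXT_FLAG_DEBUG_BIT               0x2

/* Hypothetical stand-in for the part of struct gl_context that matters. */
struct context_sketch {
   unsigned context_flags;
};

/* The driver calls this with its own context object, so the common DRI
 * code never has to guess what driverPrivate points to. */
static void
context_set_flags(struct context_sketch *ctx, unsigned dri_flags)
{
   assert(ctx != NULL);

   if (dri_flags & DRI_CTX_FLAG_FORWARD_COMPATIBLE)
      ctx->context_flags |= GL_CONTEXT_FLAG_FORWARD_COMPATIBLE_BIT;
   if (dri_flags & DRI_CTX_FLAG_DEBUG)
      ctx->context_flags |= GL_CONTEXT_FLAG_DEBUG_BIT;
}
```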

v2 (idr): Modify the other classic drivers to also use
driContextSetFlags.  I ran all the piglit GLX_ARB_create_context tests
with i965 and classic swrast without regressions.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> [v1 on Gallium nouveau]
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
10 years agodocs: Update note regarding nominating patches for the stable branch.
Carl Worth [Fri, 13 Dec 2013 07:07:26 +0000 (23:07 -0800)]
docs: Update note regarding nominating patches for the stable branch.

This brings the documentation up to date with the current practice of using
the CC syntax for patch nomination.

10 years agodocs: Fix typo
Carl Worth [Fri, 13 Dec 2013 07:02:54 +0000 (23:02 -0800)]
docs: Fix typo

Simply replacing Extentions with the correct Extensions.

10 years agodocs: Import 9.2.5 release notes, add news item.
Carl Worth [Fri, 13 Dec 2013 06:58:40 +0000 (22:58 -0800)]
docs: Import 9.2.5 release notes, add news item.

10 years agodocs: Import 10.0.1 release notes, add news item.
Carl Worth [Fri, 13 Dec 2013 06:21:08 +0000 (22:21 -0800)]
docs: Import 10.0.1 release notes, add news item.

10 years agoswrast* (gallium, classic): add MESA_copy_sub_buffer support (v3)
Dave Airlie [Thu, 28 Nov 2013 01:08:11 +0000 (11:08 +1000)]
swrast* (gallium, classic): add MESA_copy_sub_buffer support (v3)

This patch adds MESA_copy_sub_buffer support to the dri sw loader and
then to gallium state tracker, llvmpipe, softpipe and other bits.

It reuses the dri1 driver extension interface, and it updates the swrast
loader interface for a new putimage which can take a stride.

I've tested this with gnome-shell with a cogl hacked to reenable sub copies
for llvmpipe and the one piglit test.

I could probably split this patch up as well.

v2: pass a pipe_box, to reduce the entrypoints, as per Jose's review,
add to p_screen doc comments.

v3: finish off winsys interfaces, add swrast classic support as well.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
swrast: add support for copy_sub_buffer

10 years agoutil: fix compile breakage
Brian Paul [Thu, 12 Dec 2013 18:11:31 +0000 (11:11 -0700)]
util: fix compile breakage

D'oh!

10 years agoutil: move variable declaration out of for-loop
Brian Paul [Thu, 12 Dec 2013 18:08:49 +0000 (11:08 -0700)]
util: move variable declaration out of for-loop

To fix MSVC build.

10 years agogallium/util: implement new color clear API in u_blitter
Marek Olšák [Wed, 4 Dec 2013 11:14:14 +0000 (12:14 +0100)]
gallium/util: implement new color clear API in u_blitter

10 years agost/mesa: set correct PIPE_CLEAR_COLORn flags
Marek Olšák [Wed, 4 Dec 2013 00:24:37 +0000 (01:24 +0100)]
st/mesa: set correct PIPE_CLEAR_COLORn flags

This also fixes the clear_with_quad function for glClearBuffer.

10 years agogallium: allow choosing which colorbuffers to clear
Marek Olšák [Tue, 3 Dec 2013 23:56:24 +0000 (00:56 +0100)]
gallium: allow choosing which colorbuffers to clear

Required for glClearBuffer, which only clears one colorbuffer attachment.

Example:
   If the first colorbuffer is float and the second one is int:
      pipe->clear(pipe, PIPE_CLEAR_COLOR0, float_clear_color, ...);
      pipe->clear(pipe, PIPE_CLEAR_COLOR1, int_clear_color, ...);

This doesn't need any driver changes yet, because all drivers just use:
  if (flags & PIPE_CLEAR_COLOR) ..

The drivers which support GL 3.0 will have to implement it properly though.
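
A sketch of how the per-buffer flags can coexist with the old combined mask. The bit positions and the helper name clear_flag_for_colorbuffer are assumptions made for illustration, not taken from the patch.

```c
#include <assert.h>

/* Bit positions are assumed for illustration. */
#define PIPE_CLEAR_DEPTH    (1u << 0)
#define PIPE_CLEAR_STENCIL  (1u << 1)
#define PIPE_CLEAR_COLOR0   (1u << 2)
#define PIPE_CLEAR_COLOR1   (1u << 3)
#define MAX_COLOR_BUFS      8

/* All eight colorbuffer bits together, so an existing driver test of the
 * form "flags & PIPE_CLEAR_COLOR" still fires for any single buffer. */
#define PIPE_CLEAR_COLOR    (0xffu << 2)

/* Hypothetical helper: the flag that clears only colorbuffer 'index'. */
static unsigned
clear_flag_for_colorbuffer(unsigned index)
{
   assert(index < MAX_COLOR_BUFS);
   return PIPE_CLEAR_COLOR0 << index;
}
```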

10 years agost/mesa: fix glClear with multiple colorbuffers and different formats
Marek Olšák [Tue, 3 Dec 2013 23:39:52 +0000 (00:39 +0100)]
st/mesa: fix glClear with multiple colorbuffers and different formats

Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>
10 years agomesa: fix interpretation of glClearBuffer(drawbuffer)
Marek Olšák [Tue, 3 Dec 2013 23:27:20 +0000 (00:27 +0100)]
mesa: fix interpretation of glClearBuffer(drawbuffer)

The corresponding piglit tests supported this incorrect behavior instead of
pointing at it.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>
10 years agodocs/GL3: better documentation of GL 3.0
Marek Olšák [Tue, 3 Dec 2013 23:25:55 +0000 (00:25 +0100)]
docs/GL3: better documentation of GL 3.0

10 years agor600g,radeonsi: fix initialized buffer range tracking for DMA, add comments
Marek Olšák [Wed, 4 Dec 2013 20:48:26 +0000 (21:48 +0100)]
r600g,radeonsi: fix initialized buffer range tracking for DMA, add comments

The DMA functions modify dst_offset and size and util_range_add gets wrong
values.
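
The bug class can be sketched as below: a chunked copy loop advances dst_offset and shrinks size, so the initialized range has to be recorded from the original values. All names here are hypothetical simplifications, not the r600g/radeonsi code.

```c
#include <assert.h>

/* Hypothetical tracked range; empty when start > end. */
struct range {
   unsigned start, end;
};

static void
range_add(struct range *r, unsigned start, unsigned end)
{
   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}

/* Sketch of a chunked DMA copy.  Returns the number of chunks issued. */
static unsigned
dma_copy_sketch(struct range *valid, unsigned dst_offset, unsigned size,
                unsigned max_chunk)
{
   unsigned chunks = 0;

   assert(max_chunk > 0);

   /* Record the written range first: the loop below modifies both
    * dst_offset and size, so adding it afterwards would be wrong. */
   range_add(valid, dst_offset, dst_offset + size);

   while (size) {
      unsigned n = size < max_chunk ? size : max_chunk;
      /* A real implementation would emit one DMA packet here. */
      dst_offset += n;
      size -= n;
      chunks++;
   }
   return chunks;
}
```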

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agoradeonsi: fix binding the dummy pixel shader
Marek Olšák [Wed, 4 Dec 2013 12:54:50 +0000 (13:54 +0100)]
radeonsi: fix binding the dummy pixel shader

This fixes valgrind errors in glxinfo.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agoradeonsi: fix FS_COLOR0_WRITES_ALL_CBUFS with mixed colorbuffer formats
Marek Olšák [Wed, 4 Dec 2013 12:24:22 +0000 (13:24 +0100)]
radeonsi: fix FS_COLOR0_WRITES_ALL_CBUFS with mixed colorbuffer formats

The 16bpc packing must be done separately for each render target.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agoradeonsi: use the colorbuffer count from the shader key
Marek Olšák [Wed, 4 Dec 2013 11:40:28 +0000 (12:40 +0100)]
radeonsi: use the colorbuffer count from the shader key

As a result, the initialization of write_all must be done before
the compilation.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agoradeonsi: remove unused variable in si_pipe_shader_ps
Marek Olšák [Wed, 4 Dec 2013 11:28:29 +0000 (12:28 +0100)]
radeonsi: remove unused variable in si_pipe_shader_ps

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agoradeonsi: Write htile state to hardware.
Andreas Hartmetz [Sat, 7 Dec 2013 03:42:24 +0000 (04:42 +0100)]
radeonsi: Write htile state to hardware.

10 years agoradeon: Allocate htile buffer for SI in r600_texture.
Andreas Hartmetz [Sat, 7 Dec 2013 01:15:27 +0000 (02:15 +0100)]
radeon: Allocate htile buffer for SI in r600_texture.

10 years agoradeon: rearrange r600_texture and related code a bit.
Andreas Hartmetz [Sat, 7 Dec 2013 01:08:27 +0000 (02:08 +0100)]
radeon: rearrange r600_texture and related code a bit.

This should make the differences and similarities between color and
depth buffer handling more clear.

10 years agor600g,radeonsi: consolidate buffer code, add handling of DISCARD_RANGE for SI
Marek Olšák [Fri, 29 Nov 2013 16:28:23 +0000 (17:28 +0100)]
r600g,radeonsi: consolidate buffer code, add handling of DISCARD_RANGE for SI

This adds 2 optimizations for radeonsi:
- handling of DISCARD_RANGE
- mapping an uninitialized buffer range is automatically UNSYNCHRONIZED
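
The second optimization can be sketched as follows, assuming hypothetical flag values and helper names: if the mapped range does not overlap any range that was ever written, no GPU work can be reading or writing it, so the map needs no synchronization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical map flags. */
#define MAP_WRITE           (1u << 1)
#define MAP_UNSYNCHRONIZED  (1u << 2)

/* Range of the buffer that has been initialized; empty when start > end. */
struct range {
   unsigned start, end;
};

static bool
ranges_intersect(const struct range *r, unsigned start, unsigned end)
{
   unsigned lo = start > r->start ? start : r->start;
   unsigned hi = end < r->end ? end : r->end;
   return lo < hi;
}

/* Upgrade a write mapping of an untouched range to unsynchronized. */
static unsigned
adjust_map_flags(unsigned usage, const struct range *valid,
                 unsigned offset, unsigned size)
{
   assert(valid != NULL);

   if ((usage & MAP_WRITE) &&
       !ranges_intersect(valid, offset, offset + size))
      usage |= MAP_UNSYNCHRONIZED;

   return usage;
}
```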

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agor600g,radeonsi: add common interface for buffer invalidation
Marek Olšák [Fri, 29 Nov 2013 15:26:36 +0000 (16:26 +0100)]
r600g,radeonsi: add common interface for buffer invalidation

This will be used by common code in the next commit.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agor600g,radeonsi: consolidate some debug flags
Marek Olšák [Fri, 29 Nov 2013 15:05:45 +0000 (16:05 +0100)]
r600g,radeonsi: consolidate some debug flags

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agor600g: refactor out code for buffer invalidation
Marek Olšák [Fri, 29 Nov 2013 15:02:12 +0000 (16:02 +0100)]
r600g: refactor out code for buffer invalidation

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agor600g,radeonsi: share flags has_cp_dma and has_streamout
Marek Olšák [Thu, 28 Nov 2013 14:09:35 +0000 (15:09 +0100)]
r600g,radeonsi: share flags has_cp_dma and has_streamout

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agoradeonsi: handle PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE
Marek Olšák [Wed, 27 Nov 2013 23:20:47 +0000 (00:20 +0100)]
radeonsi: handle PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE

which can come from glBufferData and glMapBufferRange.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agoradeonsi: implement accelerated buffer copying
Marek Olšák [Wed, 27 Nov 2013 12:27:54 +0000 (13:27 +0100)]
radeonsi: implement accelerated buffer copying

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agor600g: use common interfaces in buffer_transfer_unmap
Marek Olšák [Wed, 27 Nov 2013 11:43:40 +0000 (12:43 +0100)]
r600g: use common interfaces in buffer_transfer_unmap

i.e. dma_copy and resource_copy_region.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agoradeon: move some functions to r600_buffer_common.c
Marek Olšák [Tue, 26 Nov 2013 22:33:20 +0000 (23:33 +0100)]
radeon: move some functions to r600_buffer_common.c

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christoph Brill <egore911@gmail.com>
v2: Renamed r600_buffer.c to r600_buffer_common.c. The stupid build system
    doesn't allow 2 files of the same name in different directories.

10 years agowinsys/radeon: set/get the scanout flag with the tiling ioctls
Marek Olšák [Tue, 26 Nov 2013 21:59:31 +0000 (22:59 +0100)]
winsys/radeon: set/get the scanout flag with the tiling ioctls

If we assume that all buffers allocated by the DDX are scanout, a new flag
that says "this is not scanout" has to be added to support the non-scanout
buffers and maintain backward compatibility.

This fixes bad rendering on Wayland.

The flag is defined as:
  #define RADEON_TILING_R600_NO_SCANOUT   RADEON_TILING_SWAP_16BIT

AFAIK, RADEON_TILING_SWAP_16BIT is not used on SI.
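
A sketch of why the flag is inverted: tiling flags written by an older DDX never have the bit set, so those buffers keep being treated as scanout. The numeric value of RADEON_TILING_SWAP_16BIT and the helper names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed value; the real one comes from the kernel's radeon_drm.h. */
#define RADEON_TILING_SWAP_16BIT       0x4
#define RADEON_TILING_R600_NO_SCANOUT  RADEON_TILING_SWAP_16BIT

/* Hypothetical helper: decode the scanout state from tiling flags. */
static bool
tiling_flags_is_scanout(unsigned flags)
{
   return !(flags & RADEON_TILING_R600_NO_SCANOUT);
}

/* Hypothetical helper: encode the scanout state into tiling flags. */
static unsigned
tiling_flags_set_scanout(unsigned flags, bool scanout)
{
   if (scanout)
      return flags & ~(unsigned)RADEON_TILING_R600_NO_SCANOUT;
   return flags | RADEON_TILING_R600_NO_SCANOUT;
}
```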

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agoglsl: modify ir_clone to use memcpy
Tapani Pälli [Thu, 12 Dec 2013 15:13:32 +0000 (17:13 +0200)]
glsl: modify ir_clone to use memcpy

Patch copies the whole data structure at once instead of
assigning individual variables.
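
The idea can be sketched in C as below (the real ir_variable is a C++ class, and the struct layout here is a hypothetical reduction): once all plain-data members live in one nested struct, a clone can copy them in a single memcpy.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical reduction of the plain-data part of a variable. */
struct var_data {
   unsigned used:1;
   unsigned assigned:1;
   unsigned mode:4;
   int location;
};

struct variable {
   const char *name;       /* handled separately by the clone */
   struct var_data data;   /* copied wholesale */
};

/* Copy the whole data section at once instead of member by member, so
 * newly added members cannot be forgotten in the clone path. */
static void
clone_data(struct variable *dst, const struct variable *src)
{
   assert(dst != NULL && src != NULL);
   memcpy(&dst->data, &src->data, sizeof(dst->data));
}
```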

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoglsl: move variables in to ir_variable::data, part II
Tapani Pälli [Thu, 12 Dec 2013 13:08:59 +0000 (15:08 +0200)]
glsl: move variables in to ir_variable::data, part II

This patch moves the following bitfields and variables to the data
structure:

explicit_location, explicit_index, explicit_binding, has_initializer,
is_unmatched_generic_inout, location_frac, from_named_ifc_block_nonarray,
from_named_ifc_block_array, depth_layout, location, index, binding,
max_array_access, atomic

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoglsl: move variables in to ir_variable::data, part I
Tapani Pälli [Thu, 12 Dec 2013 11:51:01 +0000 (13:51 +0200)]
glsl: move variables in to ir_variable::data, part I

This patch moves the following bitfields into the data structure:

used, assigned, how_declared, mode, interpolation,
origin_upper_left, pixel_center_integer

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agoglsl: introduce data section to ir_variable
Tapani Pälli [Thu, 12 Dec 2013 10:57:57 +0000 (12:57 +0200)]
glsl: introduce data section to ir_variable

Data section helps serialization and cloning of an ir_variable. This
patch includes the helper bits used for read only ir_variables.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
10 years agomesa: fix a typo in glDetachShader error message
Tapani Pälli [Tue, 10 Dec 2013 12:28:40 +0000 (14:28 +0200)]
mesa: fix a typo in glDetachShader error message

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agosvga: expose HW smooth/stipple/wide lines
Brian Paul [Mon, 9 Dec 2013 18:46:56 +0000 (10:46 -0800)]
svga: expose HW smooth/stipple/wide lines

Newer virtual HW versions support smooth/stipple/wide lines.
Use that instead of 'draw' fallbacks when possible.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
10 years agoglx: Add missing null check in DRI2WireToEvent
Juha-Pekka Heikkila [Wed, 11 Dec 2013 09:06:00 +0000 (02:06 -0700)]
glx: Add missing null check in DRI2WireToEvent

Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agollvmpipe: add plumbing for ARB_depth_clamp
Matthew McClure [Tue, 10 Dec 2013 21:10:03 +0000 (13:10 -0800)]
llvmpipe: add plumbing for ARB_depth_clamp

With this patch llvmpipe will adhere to the ARB_depth_clamp enabled state when
clamping the fragment's zw value. To support this, the variant key now includes
the depth_clamp state. key->depth_clamp is derived from pipe_rasterizer_state's
(depth_clip == 0), thus depth clamp is only enabled when depth clip is disabled.
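
The key derivation can be sketched as follows. Both structs are hypothetical reductions of pipe_rasterizer_state and the llvmpipe variant key, keeping only the fields named above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reductions of the real state structs. */
struct rasterizer_state_sketch {
   unsigned depth_clip:1;
};

struct variant_key_sketch {
   unsigned depth_clamp:1;
};

/* Depth clamp is the inverse of depth clip: clamping is enabled only
 * when clipping against the near/far planes is disabled. */
static void
key_set_depth_clamp(struct variant_key_sketch *key,
                    const struct rasterizer_state_sketch *rast)
{
   assert(key != NULL && rast != NULL);
   key->depth_clamp = (rast->depth_clip == 0);
}
```

Because the state is part of the key, toggling it selects a different shader variant.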

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
10 years agor600g/sb: fix stack size computation on evergreen
Vadim Girlin [Wed, 11 Dec 2013 00:08:32 +0000 (04:08 +0400)]
r600g/sb: fix stack size computation on evergreen

On evergreen we have to reserve 1 stack element in some additional cases
besides the ones mentioned in the docs, but stack size computation was
recently reimplemented exactly as described in the docs by the patch that
added workarounds for stack issues on EG/CM, resulting in regressions
with some apps (Serious Sam 3).

This patch fixes it by restoring previous behavior.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=72369

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Andre Heider <a.heider@gmail.com>
10 years agollvmpipe: add a very useful (disabled) debugging output
Zack Rusin [Tue, 10 Dec 2013 05:10:28 +0000 (00:10 -0500)]
llvmpipe: add a very useful (disabled) debugging output

Disabled by default, but it's very useful when needed.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agodraw: fix vbuf caching of vertices with inject front face
Zack Rusin [Tue, 10 Dec 2013 05:06:48 +0000 (00:06 -0500)]
draw: fix vbuf caching of vertices with inject front face

Caching in the vbuf module meant that once a vertex has been
emitted it was cached, but it's possible for a vertex at the
same location to be emitted again, but this time with a different
front-face semantic. Caching was causing the first version of the
vertex to be emitted, which resulted in the renderer getting
incorrect front-face attributes. By resetting the vertex_id (which
is used for caching) we make sure that once front-face info
has been injected the vertex will end up getting emitted.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agollvmpipe: fix blending with half-float formats
Zack Rusin [Fri, 6 Dec 2013 06:28:25 +0000 (01:28 -0500)]
llvmpipe: fix blending with half-float formats

The fact that we flush denorms to zero breaks our half-float
conversion and blending. This patch enables denorms for
blending. It's a little tricky due to the llvm bug that makes
it incorrectly reorder the mxcsr intrinsics:
http://llvm.org/bugs/show_bug.cgi?id=6393

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agosvga/winsys: Implement surface sharing using prime fd handles
Thomas Hellstrom [Thu, 21 Nov 2013 04:11:46 +0000 (15:11 +1100)]
svga/winsys: Implement surface sharing using prime fd handles

This needs a prime-aware vmwgfx kernel module to work properly.

(With additions by Christopher James Halse Rogers <raof@ubuntu.com>)

Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
10 years agogallium/radeon: Implement hooks for DRI Image 7 (v2)
Christopher James Halse Rogers [Mon, 25 Nov 2013 03:59:10 +0000 (14:59 +1100)]
gallium/radeon: Implement hooks for DRI Image 7 (v2)

v2: Fix transliteration of lseek arguments
    Ignore busy return from RADEON_GEM_BUSY ioctl; we're only after the domain

Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
10 years agoradeon: Rename bo_handles hashtable to match its actual contents.
Christopher James Halse Rogers [Thu, 21 Nov 2013 04:11:44 +0000 (15:11 +1100)]
radeon: Rename bo_handles hashtable to match its actual contents.

It's a map of GEM name->bo, so identify it as such

Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
10 years agoilo: Support DRI Image 7
Christopher James Halse Rogers [Thu, 21 Nov 2013 04:11:43 +0000 (15:11 +1100)]
ilo: Support DRI Image 7

Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
10 years agonouveau: Support DRI Image 7 extension
Maarten Lankhorst [Thu, 21 Nov 2013 04:11:42 +0000 (15:11 +1100)]
nouveau: Support DRI Image 7 extension

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
10 years agogallium/dri: Support DRI Image extension version 7
Christopher James Halse Rogers [Mon, 25 Nov 2013 03:57:37 +0000 (14:57 +1100)]
gallium/dri: Support DRI Image extension version 7

v2: Fix up queryImage return for ATTRIB_FD
    Use driver_descriptor.configuration to determine whether the driver
    supports DMA-BUF import/export.
v3: Really, truly, fix up queryImage return for ATTRIB_FD

Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
10 years agogallium/dri2: Set winsys_handle type to KMS for stride query.
Christopher James Halse Rogers [Thu, 21 Nov 2013 04:11:40 +0000 (15:11 +1100)]
gallium/dri2: Set winsys_handle type to KMS for stride query.

Otherwise the default is TYPE_SHARED, which will flink the bo. This seems
rather unnecessary for a simple stride query.

Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
10 years agogallium/winsys/drm: Prepare for passing prime fds in winsys_handle
Christopher James Halse Rogers [Thu, 21 Nov 2013 04:11:39 +0000 (15:11 +1100)]
gallium/winsys/drm: Prepare for passing prime fds in winsys_handle

Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
10 years agogallium/dri: Support DRI Image extension version 6
Christopher James Halse Rogers [Tue, 26 Nov 2013 07:44:05 +0000 (18:44 +1100)]
gallium/dri: Support DRI Image extension version 6

v2: Pick out the correct gl_context pointer
v3: Don't leak pipe_resources on error path
    Set img->dri_format correctly

Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
10 years agonv50: report 15 max inputs for fragment programs
Ilia Mirkin [Sun, 1 Dec 2013 08:44:42 +0000 (03:44 -0500)]
nv50: report 15 max inputs for fragment programs

First off, nv50_program only has 16 in/out varyings. However reporting
16 makes 'm' become 68 in nv50_fp_linkage_validate with the
varying-packing-simple piglit test. (Subverting the assert makes it
compile but fail.) With this patch, varying-packing-simple passes.

See: https://bugs.freedesktop.org/show_bug.cgi?id=69155

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>
10 years agonouveau: Fix compiler warning regression
Maarten Lankhorst [Tue, 10 Dec 2013 07:43:03 +0000 (08:43 +0100)]
nouveau: Fix compiler warning regression

cfg is now unused, remove it.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
10 years agoswrast: fix readback regression since inversion fix
Dave Airlie [Thu, 5 Dec 2013 03:30:17 +0000 (13:30 +1000)]
swrast: fix readback regression since inversion fix

Readback from the frontbuffer with swrast was broken; that bug
just made it more obviously broken. This fixes it by inverting the
sub image gets. Also fixes a few other piglits.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72327
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72325
(for 9.2 the patches this depends on were asked to be backported separately
 in an email).
Cc: "9.2" "10.0" mesa-stable@lists.fedoraproject.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
10 years agodri megadriver_stub: add compatibility for older DRI loaders
Jordan Justen [Fri, 6 Dec 2013 10:21:17 +0000 (02:21 -0800)]
dri megadriver_stub: add compatibility for older DRI loaders

To help the transition period when DRI loaders are being updated
to support the newer __driDriverExtensions_foo mechanism,
we populate __driDriverExtensions with the extensions returned
by __driDriverExtensions_foo during a library constructor
function.

We find the name of driver foo by using the dladdr function,
which gives the path of the dynamic library that is being
loaded.
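
A minimal sketch of the dladdr lookup described above, not the code from this commit. The helper names driver_name_from_path and path_of_this_object are hypothetical, and the "_dri.so" file name suffix is an assumption about how driver libraries are named; only dladdr and Dl_info come from the platform.

```cpp
#include <dlfcn.h>   // dladdr, Dl_info (a glibc/BSD extension, not ISO C++)
#include <string>

// Hypothetical helper: derive "foo" from a path such as
// "/usr/lib/dri/foo_dri.so". Returns an empty string when the file
// name does not have a non-empty prefix followed by "_dri.so".
std::string driver_name_from_path(const std::string& path)
{
    const std::string suffix = "_dri.so";   // assumed naming convention
    const std::size_t slash = path.find_last_of('/');
    const std::string file =
        (slash == std::string::npos) ? path : path.substr(slash + 1);

    if (file.size() <= suffix.size() ||
        file.compare(file.size() - suffix.size(), suffix.size(), suffix) != 0)
        return std::string();

    return file.substr(0, file.size() - suffix.size());
}

// Hypothetical helper: ask the dynamic linker which object file
// contains this function. Returns an empty string on failure.
std::string path_of_this_object()
{
    Dl_info info;
    if (dladdr(reinterpret_cast<void*>(&path_of_this_object), &info) == 0 ||
        info.dli_fname == nullptr)
        return std::string();
    return std::string(info.dli_fname);
}
```

Calling driver_name_from_path(path_of_this_object()) from a library constructor would give the "foo" needed to look up __driDriverExtensions_foo. On older glibc versions the program may need to link with -ldl.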

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Keith Packard <keithp@keithp.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
10 years agoegl/wayland: Return -1 from get_back_bo to indicate error
Kristian Høgsberg [Tue, 10 Dec 2013 00:13:35 +0000 (16:13 -0800)]
egl/wayland: Return -1 from get_back_bo to indicate error

A return value of -1 indicates failure to allocate the back buffer and
means we don't segfault on the way out.

10 years agoegl_dri2: Remove the unused swap_interval member of dri2_egl_surface
Neil Roberts [Wed, 11 Sep 2013 18:28:32 +0000 (19:28 +0100)]
egl_dri2: Remove the unused swap_interval member of dri2_egl_surface

The _EGLSurface struct which is embedded into dri2_egl_surface also contains a
swap interval member so the other member is redundant. Nothing was using it as
far as I can tell.

10 years agoi965: Replace OUT_RELOC_FENCED with OUT_RELOC.
Kenneth Graunke [Mon, 25 Nov 2013 21:53:33 +0000 (13:53 -0800)]
i965: Replace OUT_RELOC_FENCED with OUT_RELOC.

On Gen4+, OUT_RELOC_FENCED is equivalent to OUT_RELOC; libdrm silently
ignores the fenced flag:

        /* We never use HW fences for rendering on 965+ */
        if (bufmgr_gem->gen >= 4)
                need_fence = false;

Thanks to Eric for noticing this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years agoglsl/loops: Get rid of lower_bounded_loops and ir_loop::normative_bound.
Paul Berry [Fri, 29 Nov 2013 08:52:11 +0000 (00:52 -0800)]
glsl/loops: Get rid of lower_bounded_loops and ir_loop::normative_bound.

Now that loop_controls no longer creates normatively bound loops,
there is no need for ir_loop::normative_bound or the
lower_bounded_loops pass.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl/loops: Stop creating normatively bound loops in loop_controls.
Paul Berry [Fri, 29 Nov 2013 08:16:43 +0000 (00:16 -0800)]
glsl/loops: Stop creating normatively bound loops in loop_controls.

Previously, when loop_controls analyzed a loop and found that it had a
fixed bound (known at compile time), it would remove all of the loop
terminators and instead set the loop's normative_bound field to force
the loop to execute the correct number of times.

This made loop unrolling easy, but it had a serious disadvantage.
Since most GPUs don't have a native mechanism for executing a loop a
fixed number of times, in order to implement the normative bound, the
back-ends would have to synthesize a new loop induction variable.  As
a result, many loops wound up having two induction variables instead
of one.  This caused extra register pressure and unnecessary
instructions.

This patch modifies loop_controls so that it doesn't set the loop's
normative_bound anymore.  Instead it leaves one of the terminators in
the loop (the limiting terminator), so the back-end doesn't have to go
to any extra work to ensure the loop terminates at the right time.

This complicates loop unrolling slightly: when deciding whether a loop
can be unrolled, we have to account for the presence of the limiting
terminator.  And when we do unroll the loop, we have to remove the
limiting terminator first.

For an example of how this results in more efficient back end code,
consider the loop:

    for (int i = 0; i < 100; i++) {
      total += i;
    }

Previous to this patch, on i965, this loop would compile down to this
(vec4) native code:

          mov(8)       g4<1>.xD 0D
          mov(8)       g8<1>.xD 0D
    loop:
          cmp.ge.f0(8) null     g8<4;4,1>.xD 100D
    (+f0) if(8)
          break(8)
          endif(8)
          add(8)       g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD
          add(8)       g8<1>.xD g8<4;4,1>.xD 1D
          add(8)       g4<1>.xD g4<4;4,1>.xD 1D
          while(8) loop

(notice that both g8 and g4 are loop induction variables; one is used
to terminate the loop, and the other is used to accumulate the total).

After this patch, the same loop compiles to:

          mov(8)       g4<1>.xD 0D
    loop:
          cmp.ge.f0(8) null     g4<4;4,1>.xD 100D
    (+f0) if(8)
          break(8)
          endif(8)
          add(8)       g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD
          add(8)       g4<1>.xD g4<4;4,1>.xD 1D
          while(8) loop
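
The two listings above can be mirrored in source form. This is an illustrative sketch only, written as plain C++ rather than compiler IR; the function names are hypothetical and the register names in the comments refer to the listings above.

```cpp
// Shape of the loop before the patch: the back end synthesizes a
// second counter to enforce the normative bound.
int sum_with_normative_bound()
{
    int total = 0;
    int i = 0;         // the shader's own induction variable (g4)
    int counter = 0;   // counter synthesized by the back end (g8)
    for (;;) {
        if (counter >= 100)
            break;
        total += i;
        counter++;
        i++;
    }
    return total;
}

// Shape of the loop after the patch: the limiting terminator stays in
// the loop and tests the existing induction variable directly.
int sum_with_limiting_terminator()
{
    int total = 0;
    int i = 0;         // single induction variable (g4)
    for (;;) {
        if (i >= 100)  // the limiting terminator
            break;
        total += i;
        i++;
    }
    return total;
}
```

Both functions compute the same total; the second needs one fewer variable and one fewer add per iteration.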

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl/loops: Get rid of loop_variable_state::max_iterations.
Paul Berry [Fri, 29 Nov 2013 08:11:12 +0000 (00:11 -0800)]
glsl/loops: Get rid of loop_variable_state::max_iterations.

This value is now redundant with
loop_variable_state::limiting_terminator->iterations and
ir_loop::normative_bound.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl/loops: Simplify loop unrolling logic by breaking into functions.
Paul Berry [Fri, 29 Nov 2013 06:12:08 +0000 (22:12 -0800)]
glsl/loops: Simplify loop unrolling logic by breaking into functions.

The old logic of loop_unroll_visitor::visit_leave(ir_loop *) was:

    heuristics to skip unrolling in various circumstances;
    if (loop contains more than one jump)
      return;
    else if (loop contains one jump) {
      if (the jump is an unconditional "break" at the end of the loop) {
        remove the break and set iteration count to 1;
        fall through to simple loop unrolling code;
      } else {
        for (each "if" statement in the loop body)
          see if the jump is a "break" at the end of one of its forks;
        if (the "break" wasn't found)
          return;
        splice the remainder of the loop into the other fork of the "if";
        remove the "break";
        complex loop unrolling code;
        return;
      }
    }
    simple loop unrolling code;
    return;

These tasks have been moved to their own functions:
- splice the remainder of the loop into the other fork of the "if"
- simple loop unrolling code
- complex loop unrolling code

And the logic has been flattened to:

    heuristics to skip unrolling in various circumstances;
    if (loop contains more than one jump)
      return;
    if (loop contains no jumps) {
      simple loop unroll;
      return;
    }
    if (the jump is an unconditional "break" at the end of the loop) {
      remove the break;
      simple loop unroll with iteration count of 1;
      return;
    }
    for (each "if" statement in the loop body) {
      if (the jump is a "break" at the end of one of its forks) {
        splice the remainder of the loop into the other fork of the "if";
        remove the "break";
        complex loop unroll;
        return;
      }
    }

This will make it easier to modify the loop unrolling algorithm in a
future patch.
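
The flattened control flow above can be sketched as a small decision function. This is an assumption-laden illustration, not the visitor code: loop_summary, unroll_action and choose_unroll are hypothetical names, and the real pass inspects IR rather than precomputed flags.

```cpp
#include <cassert>

// Hypothetical result of the decision: which unrolling path to take.
enum class unroll_action {
    skip,              // leave the loop alone
    simple,            // simple loop unroll
    simple_once,       // remove the break, simple unroll with count 1
    complex_unroll     // splice, remove the break, complex loop unroll
};

// Hypothetical summary of what the analysis found in the loop body.
struct loop_summary {
    bool skip_by_heuristics;            // any heuristic said "don't unroll"
    int  jump_count;                    // number of jumps in the loop
    bool trailing_unconditional_break;  // the jump is a break at loop end
    bool break_ends_if_fork;            // the jump is a break ending a fork
};

unroll_action choose_unroll(const loop_summary& loop)
{
    assert(loop.jump_count >= 0);

    if (loop.skip_by_heuristics)
        return unroll_action::skip;
    if (loop.jump_count > 1)
        return unroll_action::skip;
    if (loop.jump_count == 0)
        return unroll_action::simple;
    if (loop.trailing_unconditional_break)
        return unroll_action::simple_once;
    if (loop.break_ends_if_fork)
        return unroll_action::complex_unroll;
    return unroll_action::skip;         // the "break" wasn't found
}
```

Each early return corresponds to one "return;" in the flattened pseudocode, which is what makes the later cases independent of the earlier ones.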

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl/loops: Move some analysis from loop_controls to loop_analysis.
Paul Berry [Thu, 28 Nov 2013 22:40:19 +0000 (14:40 -0800)]
glsl/loops: Move some analysis from loop_controls to loop_analysis.

Previously, the sole responsibility of loop_analysis was to find all
the variables referenced in the loop that are either loop constant or
induction variables, and find all of the simple if statements that
might terminate the loop.  The remainder of the analysis necessary to
determine how many times a loop executed was performed by
loop_controls.

This patch makes loop_analysis also responsible for determining the
number of iterations after which each loop terminator will terminate
the loop, and for figuring out which terminator will terminate the
loop first (I'm calling this the "limiting terminator").

This will allow loop unrolling to make use of information that was
previously only visible from loop_controls, namely the identity of the
limiting terminator.
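
As a rough sketch of what "limiting terminator" means: among the terminators whose iteration count is known, it is the one with the smallest count. The types and the use of -1 for "unknown" below are assumptions made for illustration, not the loop_analysis data structures.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a loop terminator record.
struct terminator_sketch {
    int iterations;   // iterations before this terminator fires;
                      // -1 means "could not be determined" (assumption)
};

// Returns the index of the terminator that ends the loop first, or -1
// if no terminator has a known iteration count. Ties go to the
// earliest terminator in the list.
int find_limiting_terminator(const std::vector<terminator_sketch>& terminators)
{
    int limiting = -1;
    for (std::size_t n = 0; n < terminators.size(); n++) {
        const int iters = terminators[n].iterations;
        if (iters < 0)
            continue;   // unknown count: cannot be the limiting one
        if (limiting < 0 ||
            iters < terminators[static_cast<std::size_t>(limiting)].iterations)
            limiting = static_cast<int>(n);
    }
    return limiting;
}
```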

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl/loops: Allocate loop_terminator using new(mem_ctx) syntax.
Paul Berry [Thu, 28 Nov 2013 22:46:38 +0000 (14:46 -0800)]
glsl/loops: Allocate loop_terminator using new(mem_ctx) syntax.

Patches to follow will introduce code into the loop_terminator
constructor.  Allocating loop_terminator using new(mem_ctx) syntax
will ensure that the constructor runs.
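
To illustrate why the allocation syntax matters, here is a self-contained sketch of a class-specific operator new that takes a memory context. mem_context is a hypothetical stand-in for the real allocator, and the constructor's choice of -1 is an assumption; the point is only that a new-expression runs the constructor, whereas casting raw allocated memory would not.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical memory context: owns every block allocated from it and
// frees them all when it is destroyed.
struct mem_context {
    std::vector<void*> blocks;

    mem_context() = default;
    mem_context(const mem_context&) = delete;
    mem_context& operator=(const mem_context&) = delete;
    ~mem_context() { for (void* p : blocks) std::free(p); }

    void* allocate(std::size_t size)
    {
        void* p = std::malloc(size);
        if (p == nullptr)
            throw std::bad_alloc();
        blocks.push_back(p);
        return p;
    }
};

// Hypothetical stand-in for loop_terminator.
struct loop_terminator_sketch {
    int iterations;

    // Runs only when the object is created with a new-expression.
    loop_terminator_sketch() : iterations(-1) {}

    // new(ctx) T: memory comes from the context.
    static void* operator new(std::size_t size, mem_context* ctx)
    {
        return ctx->allocate(size);
    }
    // Matching form, used only if the constructor throws; the context
    // still owns the block, so there is nothing to do here.
    static void operator delete(void*, mem_context*) {}
    // Ordinary delete is a no-op for the same reason.
    static void operator delete(void*) {}
};
```

With these definitions, "new (&ctx) loop_terminator_sketch" returns an object whose iterations field is already initialized, and the memory is released when ctx goes out of scope.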

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl/loops: Remove unnecessary list walk from loop_control_visitor.
Paul Berry [Thu, 28 Nov 2013 20:44:53 +0000 (12:44 -0800)]
glsl/loops: Remove unnecessary list walk from loop_control_visitor.

When loop_control_visitor::visit_leave(ir_loop *) is analyzing a loop
terminator that acts on a certain ir_variable, it doesn't need to walk
the list of induction variables to find the loop_variable entry
corresponding to the variable.  It can just look it up in the
loop_variable_state hashtable and verify that the loop_variable entry
represents an induction variable.
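
The lookup-then-verify pattern described above might look like the following. This is a sketch with hypothetical types: the table is keyed by variable name here for simplicity, which is an assumption and not how loop_variable_state is keyed.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a loop_variable entry.
struct loop_variable_sketch {
    bool is_induction_var;
};

// Hypothetical stand-in for the per-loop variable hashtable.
using variable_table = std::unordered_map<std::string, loop_variable_sketch>;

// Returns the entry for `name` if it exists and represents an
// induction variable; otherwise returns nullptr. No list walk needed.
const loop_variable_sketch*
lookup_induction_variable(const variable_table& table, const std::string& name)
{
    const auto it = table.find(name);
    if (it == table.end() || !it->second.is_induction_var)
        return nullptr;
    return &it->second;
}
```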

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl/loops: Remove unused fields iv_scale and biv from loop_variable class.
Paul Berry [Thu, 28 Nov 2013 20:17:54 +0000 (12:17 -0800)]
glsl/loops: Remove unused fields iv_scale and biv from loop_variable class.

These fields were part of some planned optimizations that never
materialized.  Remove them for now to simplify things; if we ever get
round to adding the optimizations that would require them, we can
always re-introduce them.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl/loops: replace loop controls with a normative bound.
Paul Berry [Thu, 28 Nov 2013 16:13:41 +0000 (08:13 -0800)]
glsl/loops: replace loop controls with a normative bound.

This patch replaces the ir_loop fields "from", "to", "increment",
"counter", and "cmp" with a single integer ("normative_bound") that
serves the same purpose.

I've used the name "normative_bound" to emphasize the fact that the
back-end is required to emit code to prevent the loop from running
more than normative_bound times.  (By contrast, an "informative" bound
would be a bound that is informational only).
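
For intuition about how several loop-control fields can collapse into one integer, here is a sketch that handles only one case: a counter that starts at "from", is compared with "<" against "to", and is advanced by a positive "increment". The function name and the -1 convention are hypothetical; the actual analysis covers more comparison operators.

```cpp
// Number of iterations of: for (c = from; c < to; c += increment).
// Returns -1 when the increment is not positive, since this sketch
// cannot bound the loop in that case. Uses long long so that the
// subtraction cannot overflow for any pair of int inputs.
long long bound_for_less_than(int from, int to, int increment)
{
    if (increment <= 0)
        return -1;
    if (from >= to)
        return 0;   // the comparison fails before the first iteration

    const long long span = static_cast<long long>(to) - from;
    return (span + increment - 1) / increment;   // ceil(span / increment)
}
```

For the loop "for (int i = 0; i < 100; i++)" this gives a bound of 100.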

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl/loops: consolidate bounded loop handling into a lowering pass.
Paul Berry [Thu, 28 Nov 2013 01:57:19 +0000 (17:57 -0800)]
glsl/loops: consolidate bounded loop handling into a lowering pass.

Previously, all of the back-ends (ir_to_mesa, st_glsl_to_tgsi, and the
i965 fs and vec4 visitors) had nearly identical logic for handling
bounded loops.  This replaces the duplicate logic with an equivalent
lowering pass that is used by all the back-ends.

Note: on i965, there is a slight increase in instruction count.  For
example, a loop like this:

    for (int i = 0; i < 100; i++) {
      total += i;
    }

would previously compile down to this (vec4) native code:

          mov(8)       g4<1>.xD 0D
          mov(8)       g8<1>.xD 0D
    loop:
          cmp.ge.f0(8) null     g8<4;4,1>.xD 100D
    (+f0) break(8)
          add(8)       g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD
          add(8)       g8<1>.xD g8<4;4,1>.xD 1D
          add(8)       g4<1>.xD g4<4;4,1>.xD 1D
          while(8) loop

After this patch, the "(+f0) break(8)" turns into:

    (+f0) if(8)
          break(8)
          endif(8)

because the back-end isn't smart enough to recognize that "if
(condition) break;" can be done using a conditional break instruction.
However, it should be relatively easy for a future peephole
optimization to properly optimize this.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>