mesa.git
12 years agoradeon/llvm: Fix lowering of vbuild
Tom Stellard [Tue, 11 Sep 2012 19:24:32 +0000 (15:24 -0400)]
radeon/llvm: Fix lowering of vbuild

Some of the old AMDIL code was hard-coding subreg indices when creating
the VBUILD node, which was making it difficult to match the
vector_insert patterns.

12 years agoradeon/llvm: Support fmul on SI
Tom Stellard [Tue, 11 Sep 2012 19:21:09 +0000 (15:21 -0400)]
radeon/llvm: Support fmul on SI

12 years agoi965: Fix out-of-order sampler unit usage in ARB fragment programs.
Kenneth Graunke [Wed, 12 Sep 2012 05:14:59 +0000 (22:14 -0700)]
i965: Fix out-of-order sampler unit usage in ARB fragment programs.

ARB fragment programs use texture unit numbers directly, unlike GLSL
which has an extra indirection.  If a fragment program only uses one
texture assigned to GL_TEXTURE1, SamplersUsed will only contain a single
bit, which would make us only upload a single surface/sampler state
entry.  However, it needs to be the second entry.

Using _mesa_fls() instead of _mesa_bitcount() solves this.  For ARB
programs, this makes num_samplers the ID of the highest texture unit
used.  Since GLSL uses consecutive integers assigned by the linker,
_mesa_fls() should give the same result as _mesa_bitcount().
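
The difference between the two counts can be sketched as below. This is
only an illustration, not the driver code: count_bits(), last_bit_set()
and num_sampler_entries() are hypothetical stand-ins, and the sketch
assumes the find-last-set helper returns a 1-based bit position (0 for
an empty mask), so its result is one past the index of the highest unit.

```c
#include <assert.h>
#include <stdint.h>

/* Number of set bits in the mask (what a bitcount helper returns). */
static unsigned count_bits(uint32_t mask)
{
   unsigned n = 0;
   for (; mask != 0; mask &= mask - 1)
      n++;
   return n;
}

/* 1-based position of the most significant set bit, 0 if none
 * (assumed convention, mirroring ffs()). */
static unsigned last_bit_set(uint32_t mask)
{
   unsigned pos = 0;
   for (; mask != 0; mask >>= 1)
      pos++;
   return pos;
}

/* Hypothetical helper: number of surface/sampler entries to upload so
 * that the entry for the highest used unit is included. */
static unsigned num_sampler_entries(uint32_t samplers_used)
{
   unsigned n = last_bit_set(samplers_used);
   /* The highest position can never be smaller than the bit count. */
   assert(n >= count_bits(samplers_used));
   return n;
}
```

With only GL_TEXTURE1 in use the mask is 0x2: the bit count is 1, but
two entries are needed. For a dense mask such as 0x7 both counts are 3.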

Fixes a regression since 85e8e9e000732908b259a7e2cbc1724a1be2d447,
which caused GPU hangs in ETQW (and probably others), as well as
breaking piglit test fp-fragment-position.

v2: Add a comment, as suggested by Matt.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54098
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54179
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested-by: meng <mengmeng.meng@intel.com>
12 years agomesa: Add a _mesa_fls() function to find the last bit set in a word.
Kenneth Graunke [Wed, 12 Sep 2012 05:14:58 +0000 (22:14 -0700)]
mesa: Add a _mesa_fls() function to find the last bit set in a word.

ffs() finds the least significant bit set; _mesa_fls() finds the /most/
significant bit.
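
A portable sketch of the two operations, for comparison. The names
first_bit_set_sketch() and last_bit_set_sketch() are made up for this
example, and the 1-based return convention (0 for a zero word) is an
assumption borrowed from ffs(); the real inline function may well use a
compiler builtin instead of a loop.

```c
#include <assert.h>
#include <stdint.h>

/* 1-based index of the least significant set bit, 0 if word == 0. */
static int first_bit_set_sketch(uint32_t word)
{
   int pos;
   if (word == 0)
      return 0;
   for (pos = 1; (word & 1u) == 0; word >>= 1)
      pos++;
   return pos;
}

/* 1-based index of the most significant set bit, 0 if word == 0. */
static int last_bit_set_sketch(uint32_t word)
{
   int pos = 0;
   for (; word != 0; word >>= 1)
      pos++;
   return pos;
}

/* Small usage example: 0x28 is binary 101000. */
static void bit_helpers_example(void)
{
   assert(first_bit_set_sketch(0x28u) == 4);
   assert(last_bit_set_sketch(0x28u) == 6);
}
```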

v2: Make it an inline function in imports.h, per Brian's suggestion.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
12 years agoi965/blorp: Fix offsets and width/height for stencil blits.
Paul Berry [Thu, 30 Aug 2012 15:01:54 +0000 (08:01 -0700)]
i965/blorp: Fix offsets and width/height for stencil blits.

Fixes piglit test "framebuffer-blit-levels draw stencil".

NOTE: This is a candidate for stable release branches.

Acked-by: Eric Anholt <eric@anholt.net>
12 years agoi965/blorp: Reduce alignment restrictions for stencil blits.
Paul Berry [Wed, 29 Aug 2012 21:26:48 +0000 (14:26 -0700)]
i965/blorp: Reduce alignment restrictions for stencil blits.

Previously, we aligned all stencil blit operations to multiples of the
size of a tile, since stencil buffers use W-tiling, and blorp has to
approximate this by configuring the 3D pipeline for Y-tiling and
swizzling coordinates.

However, this was unnecessarily conservative; it turns out that the
differences between W-tiling and Y-tiling are confined to 32-byte
sub-tiles within the 4k tiling pattern; the layout of these 32-byte
sub-tiles within the larger 4k tile is the same (8 sub-tiles across by
16 sub-tiles down, in column-major order).  Therefore we only need to
align stencil blit operations to multiples of the sub-tile size.
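
As a rough sketch of what sub-tile alignment means for a blit rectangle:
assuming a 64x64 W-tile with one byte per stencil pixel, 8 sub-tiles
across by 16 down gives a W sub-tile of 8x4 pixels. The struct and
function names below are hypothetical, and the real code may apply
further adjustments (e.g. for multisampling) that are not shown here.

```c
#include <assert.h>

/* Half-open rectangle: [x0, x1) by [y0, y1). */
struct rect { int x0, y0, x1, y1; };

enum {
   W_TILE_SIZE     = 64,                            /* assumed 64x64 W-tile */
   SUBTILES_ACROSS = 8,
   SUBTILES_DOWN   = 16,
   SUBTILE_W       = W_TILE_SIZE / SUBTILES_ACROSS, /* 8 */
   SUBTILE_H       = W_TILE_SIZE / SUBTILES_DOWN    /* 4 */
};

static int align_down(int v, int a)
{
   assert(a > 0 && v >= 0);
   return v - v % a;
}

static int align_up(int v, int a)
{
   return align_down(v + a - 1, a);
}

/* Grow the rectangle outward so every edge lies on a sub-tile boundary. */
static struct rect expand_to_subtiles(struct rect r)
{
   r.x0 = align_down(r.x0, SUBTILE_W);
   r.y0 = align_down(r.y0, SUBTILE_H);
   r.x1 = align_up(r.x1, SUBTILE_W);
   r.y1 = align_up(r.y1, SUBTILE_H);
   return r;
}
```

For instance, the rectangle (10,5)-(20,9) grows to (8,4)-(24,12), which
is far smaller than the full-tile expansion to (0,0)-(64,64).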

Note: although the performance improvement of this change is probably
quite small, the fact that W-tiling and Y-tiling formats only differ
within 32-byte sub-tiles will be essential in a future patch to ensure
that stencil blits work correctly between parts of the miptree other
than level/layer 0.  Making this change provides handy documentation
(and validation) of this fact.

NOTE: This is a candidate for stable release branches.

Acked-by: Eric Anholt <eric@anholt.net>
12 years agoi965/blorp: don't reduce stencil alignment restrictions when multisampling.
Paul Berry [Wed, 29 Aug 2012 22:11:49 +0000 (15:11 -0700)]
i965/blorp: don't reduce stencil alignment restrictions when multisampling.

When blitting to a stencil buffer, we need to align the rectangle we
send down the rendering pipeline, to account for the fact that the
stencil buffer uses a W-tiled layout, but we are configuring its
surface state as Y-tiled.

Previously, when the stencil buffer was multisampled, we assumed that
we could reduce the amount of alignment that was necessary, since each
pixel occupies a block of 2x2 or 4x2 samples in the stencil buffer.
That would have been correct if the coordinates we were adjusting were
measured in pixels.  However, the conversion from pixel coordinates to
coordinates within the interleaved buffer has already been done;
therefore the full alignment restriction applies.

Note: the reason this mistake wasn't previously uncovered by piglit
tests is because it is being masked by another mistake: the blorp
engine is using overly conservative alignment restrictions when doing
stencil blits.  The overly conservative alignment restrictions will be
removed in the patch that follows.  Doing this fix now will prevent
the subsequent patch from introducing regressions.

NOTE: This is a candidate for stable release branches.

Acked-by: Eric Anholt <eric@anholt.net>
12 years agointel: Add map_stencil_as_y_tiled to intel_region_get_aligned_offset.
Paul Berry [Thu, 30 Aug 2012 18:16:44 +0000 (11:16 -0700)]
intel: Add map_stencil_as_y_tiled to intel_region_get_aligned_offset.

This patch modifies intel_region_get_aligned_offset() to make the
appropriate calculation when the blorp engine sets up a W-tiled
stencil buffer using a Y-tiled SURFACE_STATE.

NOTE: This is a candidate for stable release branches.

Acked-by: Eric Anholt <eric@anholt.net>
12 years agointel: Add map_stencil_as_y_tiled to intel_region_get_tile_masks.
Paul Berry [Thu, 30 Aug 2012 17:57:03 +0000 (10:57 -0700)]
intel: Add map_stencil_as_y_tiled to intel_region_get_tile_masks.

When the blorp engine is performing a blit from one stencil buffer to
another, it sets up the surface state for these buffers as Y-tiled, so
it needs to be able to force intel_region_get_tile_masks() to return
the appropriate masks for a Y-tiled region.

NOTE: This is a candidate for stable release branches.

Acked-by: Eric Anholt <eric@anholt.net>
12 years agoi965/blorp: Account for offsets when emitting SURFACE_STATE.
Paul Berry [Wed, 29 Aug 2012 23:04:15 +0000 (16:04 -0700)]
i965/blorp: Account for offsets when emitting SURFACE_STATE.

Fixes piglit tests "framebuffer-blit-levels {read,draw} depth".

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965/blorp: Thread level and layer through brw_blorp_blit_miptrees().
Paul Berry [Thu, 16 Aug 2012 17:06:08 +0000 (10:06 -0700)]
i965/blorp: Thread level and layer through brw_blorp_blit_miptrees().

Previously, when performing a blit using the blorp engine, we failed
to account for the level and layer of the source and destination.  As
a result, all blits would occur between miplevel 0 and layer 0 of the
corresponding textures, regardless of which level/layer was bound to
the framebuffer.

This patch passes the correct level and layer through
brw_blorp_blit_miptrees() into the brw_blorp_blit_params data structure.

Further patches in the series will adapt
gen{6,7}_blorp_emit_surface_state to make use of these parameters.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965/blorp: Don't create a dummy renderbuffer just to fetch image offsets.
Paul Berry [Mon, 10 Sep 2012 18:30:14 +0000 (11:30 -0700)]
i965/blorp: Don't create a dummy renderbuffer just to fetch image offsets.

This is unnecessary--the image offsets can be read directly out of the
miptree using intel_miptree_get_image_offset.

12 years agoi965/blorp: store x and y offsets in brw_blorp_mip_info.
Paul Berry [Wed, 29 Aug 2012 19:16:06 +0000 (12:16 -0700)]
i965/blorp: store x and y offsets in brw_blorp_mip_info.

Currently, gen{6,7}_blorp_emit_surface_state assumes that the src and
dst surfaces are mapped to miplevel 0 and layer 0 (thus no surface
offset is required).  This is a bug, since the user might try to blit
to and from levels/layers other than 0.

To fix this bug, it will not be sufficient to have
gen{6,7}_blorp_emit_surface_state look up the surface offset at the
time they set up the surface state, since these offsets will need to
be tweaked when blitting stencil buffers (due to the fact that stencil
buffer blits have to swizzle between W and Y tiling formats).

So, to pave the way for the bug fix, this patch causes the x and y
offsets to be computed during blit setup and stored in
brw_blorp_mip_info.

As a result of this change, brw_blorp_mip_info doesn't need to store
the level and layer anymore.

For consistency, this patch makes a similar change to the handling of
depth buffers when doing HiZ operations.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965/blorp: store surface width/height in brw_blorp_mip_info.
Paul Berry [Wed, 29 Aug 2012 18:51:14 +0000 (11:51 -0700)]
i965/blorp: store surface width/height in brw_blorp_mip_info.

Previously, gen{6,7}_blorp_emit_surface_state would look up the width
and height of the surface at the time they set up the surface state,
and then tweak it if necessary (it's necessary when a W-tiled surface
is being mapped as Y-tiled).  With this patch, we look up the width
and height when setting up the blit, and store them in
brw_blorp_mip_info.  This allows us to do the necessary tweak in the
brw_blorp_blit_params constructor (where it makes more sense).  It
also reduces the need to keep track of level and layer in
brw_blorp_mip_info, so that a future patch can eliminate them
entirely.

For consistency, this patch makes a similar change to the handling of
depth buffers when doing HiZ operations.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965/blorp: Change gl_renderbuffer* params to intel_renderbuffer*.
Paul Berry [Wed, 15 Aug 2012 21:51:56 +0000 (14:51 -0700)]
i965/blorp: Change gl_renderbuffer* params to intel_renderbuffer*.

This makes it more convenient for blorp functions to get access to
Intel-specific data inside the renderbuffer objects.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965/blorp: Clarify why width/height must be adjusted for Gen6 IMS surfaces.
Paul Berry [Wed, 29 Aug 2012 19:04:30 +0000 (12:04 -0700)]
i965/blorp: Clarify why width/height must be adjusted for Gen6 IMS surfaces.

Also add a clarifying comment for why the width/height doesn't need
adjustment for Gen7.

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agoi965/gen6+: Adjust stencil buffer size after computing miptree layout.
Paul Berry [Tue, 4 Sep 2012 14:57:37 +0000 (07:57 -0700)]
i965/gen6+: Adjust stencil buffer size after computing miptree layout.

Since Gen6+ stencil buffers use W-tiling (a tiling arrangement which
drm and the kernel are not aware of) we need to round up the width and
height of a stencil buffer to multiples of the W-tile size (64x64)
before allocating a stencil buffer.  Previously, we rounded up the
size of the base miplevel, and then computed the miptree layout based
on the rounded up size.  This was incorrect, because it meant that the
total size of the miptree would not be properly W-tile aligned, and
therefore we would not always allocate enough pages.

(Note: even though the GL API doesn't allow creation of mipmapped
stencil textures, it does allow mipmapping of a combined depth/stencil
texture, and on Gen6+, a combined depth/stencil texture is internally
implemented as a pair of separate depth and stencil buffers.)

For example, on Sandy Bridge, when allocating a mipmapped stencil
texture of size 128x128, we would first round up to the nearest
multiple of 64x64 (causing no change to the size), and then compute
the miptree layout (whose size worked out to 128x196).  Then we would
request an allocation of 128*196 bytes (6.125 pages), causing 7 pages
to be allocated to the texture.  However, the texture needs 8 pages,
since each W-tile occupies a page, and it takes 2 W-tiles to cover a
width of 128 and 4 W-tiles to cover a height of 196.

This patch changes the order of operations so that the miptree layout
is computed first and then the total size of the miptree is rounded up
to be W-tile aligned.
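
The arithmetic in the example above can be checked with a short sketch.
It assumes one byte per stencil pixel and 4096-byte pages, so that one
64x64 W-tile fills exactly one page; the function names are invented
for illustration and are not the driver's.

```c
#include <assert.h>

enum { W_TILE_DIM = 64, PAGE_SIZE_BYTES = 4096 };

static unsigned round_up(unsigned v, unsigned a)
{
   assert(a != 0);
   return (v + a - 1) / a * a;
}

/* Old order: the final miptree size is used as-is, so the byte count
 * is merely rounded up to whole pages. */
static unsigned pages_without_final_alignment(unsigned w, unsigned h)
{
   return round_up(w * h, PAGE_SIZE_BYTES) / PAGE_SIZE_BYTES;
}

/* New order: the final miptree size is rounded up to whole W-tiles,
 * and each tile takes one page. */
static unsigned pages_with_final_alignment(unsigned w, unsigned h)
{
   unsigned tiles_x = round_up(w, W_TILE_DIM) / W_TILE_DIM;
   unsigned tiles_y = round_up(h, W_TILE_DIM) / W_TILE_DIM;
   return tiles_x * tiles_y;
}
```

For the 128x196 layout this gives 7 pages without the final alignment
and 8 pages (2 tiles across by 4 down) with it.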

NOTE: This is a candidate for stable release branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
12 years agobuild: Don't list glproto and dri2proto in pkg-config file
Matt Turner [Wed, 12 Sep 2012 00:08:17 +0000 (17:08 -0700)]
build: Don't list glproto and dri2proto in pkg-config file

No files provided by glproto or dri2proto are needed for building
something with Mesa.

Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=342393
Reviewed-by: Dan Nicholson <dbn.lists@gmail.com>
12 years agoradeonsi: Properly handle NULL sampler views.
Michel Dänzer [Wed, 12 Sep 2012 13:53:51 +0000 (15:53 +0200)]
radeonsi: Properly handle NULL sampler views.

Fixes piglit shaders/glsl-fs-uniform-sampler-array and many other similar
tests.

In fact, I just completed a piglit quick-driver.tests run without any GPU
lockups or even VM protection faults. Yay!

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeonsi: Fix calculation of number of records in buffer resource.
Michel Dänzer [Wed, 12 Sep 2012 10:59:49 +0000 (12:59 +0200)]
radeonsi: Fix calculation of number of records in buffer resource.

The value was too small by 1 in some cases (non-first of several vertex
elements interleaved in a single buffer).

Fixes intermittent incorrect geometry in many apps, e.g. piglit
spec/EXT_texture_snorm/fbo-generatemipmap-formats.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
12 years agomesa: glGet: fix API check for EGL_image_external enums
Imre Deak [Mon, 10 Sep 2012 06:41:40 +0000 (09:41 +0300)]
mesa: glGet: fix API check for EGL_image_external enums

These enums are valid only in ES1 and ES2. So far they were marked valid
incorrectly, depending on the previous API mask in the enum list.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
12 years agomesa: glGet: fix indentation of print_table_stats
Imre Deak [Mon, 10 Sep 2012 06:41:39 +0000 (09:41 +0300)]
mesa: glGet: fix indentation of print_table_stats

No functional change.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
12 years agomesa: glGet: fix indentation of find_value
Imre Deak [Mon, 10 Sep 2012 06:41:38 +0000 (09:41 +0300)]
mesa: glGet: fix indentation of find_value

No functional change.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
12 years agomesa: glGet: fix indentation of _mesa_init_get_hash
Imre Deak [Mon, 10 Sep 2012 06:41:37 +0000 (09:41 +0300)]
mesa: glGet: fix indentation of _mesa_init_get_hash

No functional change.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
12 years agomesa: fix proxy texture error handling in glTexStorage()
Brian Paul [Sat, 8 Sep 2012 15:33:13 +0000 (09:33 -0600)]
mesa: fix proxy texture error handling in glTexStorage()

This is a follow-on to 1f5b1f98468d5e80be39e619ed15c422fbede8d3.
Generate GL errors for ordinary invalid parameters for proxy targets
the same as for non-proxy targets.  Only texture size and OOM
errors should be handled specially for proxies.

Note: This is a candidate for the stable branches.

12 years agomesa: make _mesa_get_proxy_target() non-static
Brian Paul [Sat, 8 Sep 2012 15:46:14 +0000 (09:46 -0600)]
mesa: make _mesa_get_proxy_target() non-static

Needed for the next patch.

Note: This is a candidate for the stable branches.

12 years agomesa: do internal format error checking for glTexStorage()
Brian Paul [Sat, 8 Sep 2012 15:27:46 +0000 (09:27 -0600)]
mesa: do internal format error checking for glTexStorage()

Turns out we weren't doing any format checking before.  Now check
the internal format and, in particular, make sure that unsized internal
formats aren't accepted.

Note: This is a candidate for the stable branches.

12 years agomesa/msaa: Allow X and Y flips in multisampled blits.
Paul Berry [Wed, 5 Sep 2012 23:07:16 +0000 (16:07 -0700)]
mesa/msaa: Allow X and Y flips in multisampled blits.

From the GL 4.3 spec, section 18.3.1 "Blitting Pixel Rectangles":

    If SAMPLE_BUFFERS for either the read framebuffer or draw
    framebuffer is greater than zero, no copy is performed and an
    INVALID_OPERATION error is generated if the dimensions of the
    source and destination rectangles provided to BlitFramebuffer are
    not identical, or if the formats of the read and draw framebuffers
    are not identical.

It is not clear from the spec whether "dimensions" should mean both
sign and magnitude, or just magnitude.

Previously, Mesa interpreted "dimensions" as meaning both sign and
magnitude, so any multisampled blit that attempted to flip the image
in the X and/or Y direction would fail.

However, Y flips are likely to be commonplace in OpenGL applications
that have been ported from DirectX applications, as a result of the
fact that DirectX and OpenGL differ in their orientation of the Y
axis.  Furthermore, at least one commercial driver (nVidia) permits Y
flips, and L4D2 relies on them being permitted.  So it seems prudent
for Mesa to permit them.

This patch changes Mesa to allow both X and Y flips, since there is no
language in the spec to indicate that X and Y flips should be treated
differently.
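
A minimal sketch of the "magnitude only" interpretation, assuming the
check simply compares absolute widths and heights. The struct and
function names are hypothetical and the real validation covers more
conditions (formats, sample counts) than shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Blit rectangle as passed to BlitFramebuffer; x1 < x0 or y1 < y0
 * expresses a flip. */
struct blit_rect { int x0, y0, x1, y1; };

/* True when source and destination have the same size, ignoring the
 * sign of each extent (i.e. ignoring X and Y flips). */
static bool msaa_blit_dimensions_ok(struct blit_rect src, struct blit_rect dst)
{
   return abs(src.x1 - src.x0) == abs(dst.x1 - dst.x0) &&
          abs(src.y1 - src.y0) == abs(dst.y1 - dst.y0);
}

static void msaa_blit_example(void)
{
   struct blit_rect src     = { 0, 0, 64, 64 };
   struct blit_rect flipped = { 0, 64, 64, 0 };   /* Y flip */
   struct blit_rect scaled  = { 0, 0, 128, 64 };  /* 2x in X */

   assert(msaa_blit_dimensions_ok(src, flipped));
   assert(!msaa_blit_dimensions_ok(src, scaled));
}
```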

NOTE: This is a candidate for stable release branches.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
12 years agoradeon/llvm: Fix operand order of V_CNDMASK in custom inserter
Tom Stellard [Wed, 5 Sep 2012 18:36:21 +0000 (14:36 -0400)]
radeon/llvm: Fix operand order of V_CNDMASK in custom inserter

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeon/llvm: Assert if we try to encode an unknown register
Tom Stellard [Wed, 5 Sep 2012 18:35:47 +0000 (14:35 -0400)]
radeon/llvm: Assert if we try to encode an unknown register

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeon/llvm: Add register encoding for VCC
Tom Stellard [Wed, 5 Sep 2012 18:35:21 +0000 (14:35 -0400)]
radeon/llvm: Add register encoding for VCC

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeon/llvm: Ignore special registers when calculating reg count
Tom Stellard [Wed, 5 Sep 2012 18:34:03 +0000 (14:34 -0400)]
radeon/llvm: Ignore special registers when calculating reg count

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeonsi: Handle position input parameter for pixel shaders v2
Tom Stellard [Thu, 6 Sep 2012 20:18:11 +0000 (16:18 -0400)]
radeonsi: Handle position input parameter for pixel shaders v2

v2:
  - Don't increment ninterp or set any of the have_* flags for
    TGSI_SEMANTIC_POSITION

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeon/llvm: Coding style fixes
Tom Stellard [Thu, 6 Sep 2012 19:56:02 +0000 (15:56 -0400)]
radeon/llvm: Coding style fixes

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeonsi: Move interpolation mode check into the compiler
Tom Stellard [Thu, 6 Sep 2012 19:41:59 +0000 (15:41 -0400)]
radeonsi: Move interpolation mode check into the compiler

The compiler needs to know which interpolation modes are enabled, so
it knows which values will be preloaded into the VGPRs.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeonsi: Add missing interpolation mode to check for enabled modes
Tom Stellard [Fri, 7 Sep 2012 12:29:13 +0000 (08:29 -0400)]
radeonsi: Add missing interpolation mode to check for enabled modes

At least one interpolation mode must be enabled, but the code that checks
this was not checking for perspective center.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeonsi: Pass shader type to the compiler
Tom Stellard [Fri, 7 Sep 2012 13:12:51 +0000 (09:12 -0400)]
radeonsi: Pass shader type to the compiler

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeon/llvm: Add SHADER_TYPE instruction
Tom Stellard [Fri, 7 Sep 2012 13:11:59 +0000 (09:11 -0400)]
radeon/llvm: Add SHADER_TYPE instruction

This allows the program to specify the type of shader being compiled
(e.g. PIXEL, VERTEX, etc.)

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agor600g: avoid GPU doing constant preload from random address
Jerome Glisse [Fri, 7 Sep 2012 19:00:20 +0000 (15:00 -0400)]
r600g: avoid GPU doing constant preload from random address

A previous command stream might have set any of the constant buffers,
and the previous address might no longer be valid, so the GPU might
preload constants from a random invalid address and possibly trigger a
lockup.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
12 years agoradeonsi: Texture border colour fixes.
Michel Dänzer [Fri, 7 Sep 2012 14:09:08 +0000 (16:09 +0200)]
radeonsi: Texture border colour fixes.

* Handle arbitrary border colours.
* Use correct packing format for detecting special border colours.

Fixes piglit tex-border-1 and probably many other tests using border colours.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
12 years agoradeonsi: Handle NULL sampler states.
Michel Dänzer [Fri, 7 Sep 2012 15:26:15 +0000 (17:26 +0200)]
radeonsi: Handle NULL sampler states.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
12 years agoi965: Remove incorrect comment above opt_algebraic.
Kenneth Graunke [Tue, 11 Sep 2012 05:56:03 +0000 (22:56 -0700)]
i965: Remove incorrect comment above opt_algebraic.

The comment was cut-and-pasted from propagate_constants(), and had no
relation at all to opt_algebraic().

12 years agoglsl: Generate compile errors for explicit blend indices < 0 or > 1.
Kenneth Graunke [Fri, 31 Aug 2012 23:04:19 +0000 (16:04 -0700)]
glsl: Generate compile errors for explicit blend indices < 0 or > 1.

According to the GLSL 4.30 specification, this is a compile time error.
Earlier specifications don't specify a behavior, but since 0 and 1 are
the only valid indices for dual source blending, it makes sense to
generate the error.

Fixes (the fixed version of) piglit's layout-12.frag.

NOTE: This is a candidate for the 9.0 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
12 years agor600g: remove unused function
Marek Olšák [Mon, 10 Sep 2012 21:58:18 +0000 (23:58 +0200)]
r600g: remove unused function

12 years agor600g: fix printf warning
Marek Olšák [Sun, 9 Sep 2012 23:28:40 +0000 (01:28 +0200)]
r600g: fix printf warning

12 years agomesa: bump version to 9.1 (devel)
Andreas Boll [Fri, 7 Sep 2012 21:49:01 +0000 (23:49 +0200)]
mesa: bump version to 9.1 (devel)

Now that branch 9.0 is created, bump the minor version in
master.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoSet OSMESA_VERSION=8.
Johannes Obermayr [Sat, 1 Sep 2012 23:35:47 +0000 (01:35 +0200)]
Set OSMESA_VERSION=8.

VERSION_NUMBER is not required anymore. So it will be removed.

Reviewed-by: Adam Jackson <ajax@redhat.com>
12 years agonvc0/ir: add initial code to support GK110 ISA encoding
Christoph Bumiller [Sat, 1 Sep 2012 15:26:24 +0000 (17:26 +0200)]
nvc0/ir: add initial code to support GK110 ISA encoding

12 years agoradeonsi: Float format fixups.
Michel Dänzer [Fri, 7 Sep 2012 15:41:21 +0000 (17:41 +0200)]
radeonsi: Float format fixups.

Fixes piglit spec/ARB_texture_float/fbo-generatemipmap-formats.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoradeonsi: Handle more SNORM formats.
Michel Dänzer [Fri, 7 Sep 2012 14:35:48 +0000 (16:35 +0200)]
radeonsi: Handle more SNORM formats.

Fixes piglit spec/EXT_texture_snorm/fbo-generatemipmap-formats (except for
what seems like a random fluke).

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
12 years agoi965: Fix virtual_grf_interferes() between calculate_live_intervals() and DCE.
Eric Anholt [Thu, 6 Sep 2012 05:10:41 +0000 (22:10 -0700)]
i965: Fix virtual_grf_interferes() between calculate_live_intervals() and DCE.

This fixes the blue zombies bug in l4d2.

NOTE: This is a candidate for the 9.0 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoi965: Make the param pointer arrays for the VS dynamically sized.
Eric Anholt [Mon, 27 Aug 2012 16:24:56 +0000 (09:24 -0700)]
i965: Make the param pointer arrays for the VS dynamically sized.

Saves 96MB of wasted memory in the l4d2 demo.

v2: Rebase on compare func change, change brace style.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agoi965: Make the param pointer arrays for the WM dynamically sized.
Eric Anholt [Mon, 27 Aug 2012 04:19:05 +0000 (21:19 -0700)]
i965: Make the param pointer arrays for the WM dynamically sized.

Saves 26.5MB of wasted memory allocation in the l4d2 demo.

v2: Rebase on compare func change, fix comments.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoi965: Add functions for comparing two brw_wm/vs_prog_data structs.
Eric Anholt [Mon, 27 Aug 2012 04:19:05 +0000 (21:19 -0700)]
i965: Add functions for comparing two brw_wm/vs_prog_data structs.

Currently, this just avoids comparing all unused parts of param[] and
pull_param[], but it's a step toward getting rid of those giant statically
sized arrays.
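
The idea of comparing only the used part of a statically sized array
can be sketched as follows. The struct here is a simplified,
hypothetical stand-in (the real prog_data structs have many more
fields), and MAX_PARAMS_SKETCH is an arbitrary small bound chosen for
the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum { MAX_PARAMS_SKETCH = 16 };

struct prog_data_sketch {
   unsigned nr_params;                     /* number of valid entries */
   const float *param[MAX_PARAMS_SKETCH];  /* entries past nr_params are unused */
};

/* Compare two structs while ignoring the unused tail of param[]. */
static bool prog_data_sketch_equal(const struct prog_data_sketch *a,
                                   const struct prog_data_sketch *b)
{
   assert(a->nr_params <= MAX_PARAMS_SKETCH);
   assert(b->nr_params <= MAX_PARAMS_SKETCH);

   if (a->nr_params != b->nr_params)
      return false;

   return memcmp(a->param, b->param,
                 a->nr_params * sizeof(a->param[0])) == 0;
}
```

Two structs that differ only in a slot beyond nr_params compare equal,
which a plain memcmp() over the whole struct would not guarantee.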

v2: Actually use the new function instead of just looking at its
    address.  This required changing the args to const pointers.
    (review by Kenneth)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
12 years agoglsl: Count builtin uniforms against uniform component limits.
Eric Anholt [Mon, 27 Aug 2012 16:48:34 +0000 (09:48 -0700)]
glsl: Count builtin uniforms against uniform component limits.

We don't fully process the builtin uniforms, but at least
num_uniform_components reflects reality now.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
12 years agoradeonsi: Handle TGSI_SEMANTIC_FOG.
Michel Dänzer [Thu, 6 Sep 2012 15:53:04 +0000 (17:53 +0200)]
radeonsi: Handle TGSI_SEMANTIC_FOG.

Fixes exponential fog. The pixel shaders for linear fog seem to get
miscompiled still somehow.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
12 years agoradeon/llvm: Match fexp2 for SI.
Michel Dänzer [Tue, 21 Aug 2012 10:51:18 +0000 (12:51 +0200)]
radeon/llvm: Match fexp2 for SI.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
12 years agoglapi/glx: rename 'table' variable to 'disp_table'
Brian Paul [Thu, 6 Sep 2012 14:16:56 +0000 (08:16 -0600)]
glapi/glx: rename 'table' variable to 'disp_table'

This fixes an issue where the local 'table' variable was hiding the
function parameter name in glGetColorTable(..., void *table).

This should be OK as long as there's never a GL entrypoint that uses
'disp_table' as a parameter name.

Note: This is a candidate for the 9.0 branch.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
12 years agoglx: move 'prime' var into #ifdef'd code block
Brian Paul [Thu, 6 Sep 2012 14:16:08 +0000 (08:16 -0600)]
glx: move 'prime' var into #ifdef'd code block

To silence unused var warning.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
12 years agoi965: Fix primitive restart on Haswell.
Kenneth Graunke [Sat, 25 Aug 2012 01:40:40 +0000 (18:40 -0700)]
i965: Fix primitive restart on Haswell.

Haswell moved the "Cut Index Enable" bit from the INDEX_BUFFER packet to
a new 3DSTATE_VF packet, so we need to emit that.  Also, it requires us
to specify the cut index rather than assuming it's 0xffffffff.

This adds a new Haswell-specific tracked state atom to gen7_atoms.
Normally, we would create a new generation-specific atom list, but since
there's only one difference over Ivybridge so far, I chose to simply
make it return without doing any work on non-Haswell systems.

Fixes five piglit tests:
- general/primitive-restart-DISABLE_VBO
- general/primitive-restart-VBO_COMBINED_VERTEX_AND_INDEX
- general/primitive-restart-VBO_INDEX_ONLY
- general/primitive-restart-VBO_SEPARATE_VERTEX_AND_INDEX
- general/primitive-restart-VBO_VERTEX_ONLY

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
12 years agobuild: Disable building of d3d1x
Matt Turner [Thu, 6 Sep 2012 23:17:21 +0000 (16:17 -0700)]
build: Disable building of d3d1x

It's broken and unmaintained, and I'm tired of seeing bug reports about
it.

12 years agointel: avoid undefined variable warnings in intel_screen.c
Paul Berry [Fri, 31 Aug 2012 00:22:29 +0000 (17:22 -0700)]
intel: avoid undefined variable warnings in intel_screen.c

Reviewed-by: Matt Turner <mattst88@gmail.com>
12 years agor600g: order atom emission v3
Jerome Glisse [Wed, 5 Sep 2012 19:18:24 +0000 (15:18 -0400)]
r600g: order atom emission v3

To avoid GPU lockups, registers must be emitted in a specific order
(no kidding ...). This patch reworks atom emission so that the order in
which atoms are emitted with respect to each other is always the same.
We don't have any information on what the correct order is, so the
order will need to be inferred from the fglrx command stream.

v2: add comment warning that atom order should not be taken lightly
v3: rebase on top of alphatest atom fix

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
12 years agor600g: fix num of dwords needed for alphatest_state atom
Jerome Glisse [Thu, 6 Sep 2012 19:09:21 +0000 (15:09 -0400)]
r600g: fix num of dwords needed for alphatest_state atom

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
12 years agomesa: Don't advertise GLES extensions in GL contexts
Chad Versace [Tue, 4 Sep 2012 17:02:43 +0000 (10:02 -0700)]
mesa: Don't advertise GLES extensions in GL contexts

glGetStringi(GL_EXTENSIONS) failed to respect the context's API, and so
returned all internally enabled GLES extensions from a GL context.
Likewise, glGetIntegerv(GL_NUM_EXTENSIONS) also failed to respect the
context's API.
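
One way to picture the fix is a per-extension API mask that both the
count and the indexed lookup consult. Everything in this sketch is
hypothetical: the table, the GL_EXAMPLE_* names and the api_bit enum
are invented for illustration and do not reflect Mesa's actual
extension table.

```c
#include <assert.h>
#include <stddef.h>

enum api_bit { API_GL = 1 << 0, API_GLES1 = 1 << 1, API_GLES2 = 1 << 2 };

struct ext_entry {
   const char *name;
   unsigned api_mask;   /* APIs in which the extension may be advertised */
   int enabled;         /* internally enabled by the driver */
};

static const struct ext_entry ext_table[] = {
   { "GL_EXAMPLE_desktop_only", API_GL,                         1 },
   { "GL_EXAMPLE_es_only",      API_GLES1 | API_GLES2,          1 },
   { "GL_EXAMPLE_shared",       API_GL | API_GLES1 | API_GLES2, 1 },
   { "GL_EXAMPLE_disabled",     API_GL,                         0 },
};

enum { NUM_EXTS = sizeof(ext_table) / sizeof(ext_table[0]) };

/* An extension is visible only if enabled AND valid for this API. */
static int ext_visible(const struct ext_entry *e, unsigned api)
{
   assert(api != 0);
   return e->enabled && (e->api_mask & api) != 0;
}

/* Analogue of querying GL_NUM_EXTENSIONS for a context of the given API. */
static unsigned count_extensions(unsigned api)
{
   unsigned i, n = 0;
   for (i = 0; i < NUM_EXTS; i++)
      if (ext_visible(&ext_table[i], api))
         n++;
   return n;
}

/* Analogue of an indexed extension-string query; NULL if out of range. */
static const char *get_extension(unsigned api, unsigned index)
{
   unsigned i;
   for (i = 0; i < NUM_EXTS; i++) {
      if (!ext_visible(&ext_table[i], api))
         continue;
      if (index == 0)
         return ext_table[i].name;
      index--;
   }
   return NULL;
}
```

Because both functions go through ext_visible(), the count and the
indexed strings stay consistent with each other for a given API.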

Note: This is a candidate for the 8.0 and 9.0 branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
12 years agollvmpipe: Make driver name more informative.
José Fonseca [Thu, 6 Sep 2012 09:29:04 +0000 (10:29 +0100)]
llvmpipe: Make driver name more informative.

Such as

  "llvmpipe (LLVM 3.1, 128 bits)"

or

  "llvmpipe (LLVM 3.1, 256 bits)"

when leveraging AVX 8-wide registers.
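
A name of this shape could be built roughly as below. This is only a
sketch: format_driver_name() and its parameters are hypothetical, and
how the LLVM version and vector width are actually obtained is not
shown.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "llvmpipe (LLVM <major>.<minor>, <bits> bits)" into buf.
 * Returns the snprintf() result (length that was or would be written). */
static int format_driver_name(char *buf, size_t size,
                              unsigned llvm_major, unsigned llvm_minor,
                              unsigned vector_bits)
{
   return snprintf(buf, size, "llvmpipe (LLVM %u.%u, %u bits)",
                   llvm_major, llvm_minor, vector_bits);
}

static void driver_name_example(void)
{
   char name[64];
   format_driver_name(name, sizeof name, 3, 1, 256);
   assert(strcmp(name, "llvmpipe (LLVM 3.1, 256 bits)") == 0);
}
```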

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
12 years agoradeonsi: Handle more L/I/A format cases.
Michel Dänzer [Wed, 5 Sep 2012 16:24:14 +0000 (18:24 +0200)]
radeonsi: Handle more L/I/A format cases.

Fixes piglit fbo-generatemipmap-formats.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
12 years agoradeonsi: Enable whole quad mode for pixel shaders.
Michel Dänzer [Fri, 31 Aug 2012 17:04:08 +0000 (19:04 +0200)]
radeonsi: Enable whole quad mode for pixel shaders.

Fixes wrong mipmap level being sampled at some triangle edges.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
12 years agoradeon/llvm: Add intrinsic for enabling whole quad mode in SI pixel shaders.
Michel Dänzer [Wed, 29 Aug 2012 16:55:08 +0000 (18:55 +0200)]
radeon/llvm: Add intrinsic for enabling whole quad mode in SI pixel shaders.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
12 years agoradeon/llvm: SI shader vector instructions implicitly use the EXEC register.
Michel Dänzer [Fri, 31 Aug 2012 17:05:31 +0000 (19:05 +0200)]
radeon/llvm: SI shader vector instructions implicitly use the EXEC register.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
12 years agoradeon/llvm: Extend SI EXEC register support.
Michel Dänzer [Wed, 29 Aug 2012 16:52:53 +0000 (18:52 +0200)]
radeon/llvm: Extend SI EXEC register support.

Add 32 bit lo and hi variants, and binary encodings.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
12 years agoradeon/llvm: Remove R600InstrInfo.td from TD_FILES
Tom Stellard [Thu, 6 Sep 2012 14:05:22 +0000 (14:05 +0000)]
radeon/llvm: Remove R600InstrInfo.td from TD_FILES

Fixes build bug introduced by
cebbdd4ac23725963207bf6f8fc7101150e6065f

12 years agoradeonsi: Enable NPOT textures again.
Michel Dänzer [Wed, 5 Sep 2012 16:27:02 +0000 (18:27 +0200)]
radeonsi: Enable NPOT textures again.

Should be at least mostly working now (with the corresponding fixes in
libdrm_radeon).

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
12 years agoradeonsi: Mipmaps require memory footprint to be padded to powers of two.
Michel Dänzer [Tue, 4 Sep 2012 16:58:38 +0000 (18:58 +0200)]
radeonsi: Mipmaps require memory footprint to be padded to powers of two.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
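
For illustration, padding a size up to a power of two can be sketched like
this. The helper is hypothetical; the commit title does not say which
dimensions the driver pads or how it computes the padding.

```c
/*
 * Hypothetical helper: smallest power of two that is >= x.
 * Returns 1 for x == 0; results are clamped to 2^31 so the
 * shift cannot overflow a 32-bit unsigned value.
 */
static unsigned next_power_of_two(unsigned x)
{
    unsigned p = 1;
    while (p < x && p < 0x80000000u)
        p <<= 1;
    return p;
}
```

For example, a 640-texel-wide level would be padded to 1024 under this
scheme, while a 1024-texel-wide level is left unchanged.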
12 years agoradeonsi: Sampler view state simplification.
Michel Dänzer [Mon, 27 Aug 2012 09:48:55 +0000 (11:48 +0200)]
radeonsi: Sampler view state simplification.

We can always use the offset and tiling mode from level 0 and restrict the
first and last mipmap level to be used in the sampler resource.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
12 years agoradeonsi: Untiled textures are linear aligned, not linear general.
Michel Dänzer [Wed, 29 Aug 2012 10:11:04 +0000 (12:11 +0200)]
radeonsi: Untiled textures are linear aligned, not linear general.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
12 years agoradeon/llvm: Cleanup makefile
Tom Stellard [Wed, 29 Aug 2012 13:01:15 +0000 (13:01 +0000)]
radeon/llvm: Cleanup makefile

Hopefully, this will fix all the parallel make problems people have
been having.

12 years agoRemove useless checks for NULL before freeing
Matt Turner [Wed, 5 Sep 2012 06:33:28 +0000 (23:33 -0700)]
Remove useless checks for NULL before freeing

Same as earlier commit, except for "FREE"

This patch has been generated by the following Coccinelle semantic
patch:

// Remove useless checks for NULL before freeing
//
// free (NULL) is a no-op, so there is no need to avoid it

@@
expression E;
@@
+ FREE (E);
+ E = NULL;
- if (unlikely (E != NULL)) {
-   FREE(E);
(
-   E = NULL;
|
-   E = 0;
)
   ...
- }

@@
expression E;
type T;
@@
+ FREE ((T) E);
+ E = NULL;
- if (unlikely (E != NULL)) {
-   FREE((T) E);
(
-   E = NULL;
|
-   E = 0;
)
   ...
- }

@@
expression E;
@@
+ FREE (E);
- if (unlikely (E != NULL)) {
-   FREE (E);
- }

@@
expression E;
type T;
@@
+ FREE ((T) E);
- if (unlikely (E != NULL)) {
-   FREE ((T) E);
- }

Reviewed-by: Brian Paul <brianp@vmware.com>
12 years agoReplace another malloc/memset-0 combination with calloc
Matt Turner [Wed, 5 Sep 2012 06:12:46 +0000 (23:12 -0700)]
Replace another malloc/memset-0 combination with calloc

Reviewed-by: Brian Paul <brianp@vmware.com>
12 years agoRemove useless memset after calloc
Matt Turner [Wed, 5 Sep 2012 06:10:52 +0000 (23:10 -0700)]
Remove useless memset after calloc

Reviewed-by: Brian Paul <brianp@vmware.com>
12 years agoUse calloc instead of malloc/memset-0
Matt Turner [Wed, 5 Sep 2012 06:09:22 +0000 (23:09 -0700)]
Use calloc instead of malloc/memset-0

This patch has been generated by the following Coccinelle semantic
patch:

@@
expression E;
identifier I;
@@
- I = malloc(E);
+ I = calloc(1, E);
...
- memset(I, 0, sizeof *I);

Reviewed-by: Brian Paul <brianp@vmware.com>
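
In plain C, the transformation made by the semantic patch above looks roughly
like this. The struct and function are hypothetical and only illustrate the
before/after shape.

```c
#include <stdlib.h>

/* Hypothetical type used only for this illustration. */
struct node {
    int          id;
    struct node *next;
};

static struct node *node_create(void)
{
    /*
     * Before the patch, the code would have read:
     *     n = malloc(sizeof *n);
     *     ...
     *     memset(n, 0, sizeof *n);
     * calloc() allocates and zero-fills in one call.
     */
    struct node *n = calloc(1, sizeof *n);
    return n;  /* NULL on allocation failure, as with malloc() */
}
```

The caller still has to check for NULL; only the separate zeroing step goes
away.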
12 years agoRemove useless checks for NULL before freeing
Matt Turner [Tue, 4 Sep 2012 03:24:35 +0000 (20:24 -0700)]
Remove useless checks for NULL before freeing

This patch has been generated by the following Coccinelle semantic
patch:

// Remove useless checks for NULL before freeing
//
// free (NULL) is a no-op, so there is no need to avoid it

@@
expression E;
@@
+ free (E);
+ E = NULL;
- if (unlikely (E != NULL)) {
-   free(E);
(
-   E = NULL;
|
-   E = 0;
)
   ...
- }

@@
expression E;
type T;
@@
+ free ((T) E);
+ E = NULL;
- if (unlikely (E != NULL)) {
-   free((T) E);
(
-   E = NULL;
|
-   E = 0;
)
   ...
- }

@@
expression E;
@@
+ free (E);
- if (unlikely (E != NULL)) {
-   free (E);
- }

@@
expression E;
type T;
@@
+ free ((T) E);
- if (unlikely (E != NULL)) {
-   free ((T) E);
- }

Reviewed-by: Brian Paul <brianp@vmware.com>
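
The reasoning in the patch comment, that free(NULL) is a no-op, means cleanup
code can drop the guard entirely. A small sketch with a hypothetical struct
and function:

```c
#include <stdlib.h>

/* Hypothetical type used only for this illustration. */
struct buffer {
    void  *data;
    size_t size;
};

static void buffer_release(struct buffer *b)
{
    /*
     * Before the patch:
     *     if (b->data != NULL) { free(b->data); b->data = NULL; }
     * The C standard defines free(NULL) as doing nothing, so the
     * check is redundant.
     */
    free(b->data);
    b->data = NULL;
    b->size = 0;
}
```

Because the pointer is reset to NULL, calling buffer_release() twice on the
same buffer is harmless.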
12 years agoglX_proto_send.py: Don't cast the return value of malloc
Matt Turner [Wed, 5 Sep 2012 05:57:48 +0000 (22:57 -0700)]
glX_proto_send.py: Don't cast the return value of malloc

12 years agoDon't cast the return value of malloc/realloc
Matt Turner [Tue, 4 Sep 2012 02:44:00 +0000 (19:44 -0700)]
Don't cast the return value of malloc/realloc

This patch has been generated by the following Coccinelle semantic
patch:

// Don't cast the return value of malloc/realloc.
//
// Casting the return value of malloc/realloc only stands to hide
// errors.

@@
type T;
expression E1, E2;
@@
- (T)
(
_mesa_align_calloc(E1, E2)
|
_mesa_align_malloc(E1, E2)
|
calloc(E1, E2)
|
malloc(E1)
|
realloc(E1, E2)
)
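
As a C illustration of the patch above: void * converts implicitly to any
object pointer type, so the cast adds nothing, and in older C dialects it can
silence the diagnostic that would otherwise reveal a missing <stdlib.h>
include. The function below is a hypothetical example, not code from Mesa.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates n zero-initialized counters. */
static int *make_counters(size_t n)
{
    /* Before the patch: int *c = (int *) malloc(n * sizeof(int)); */
    int *c = malloc(n * sizeof *c);
    if (c == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        c[i] = 0;
    return c;
}
```

Note that this applies to C only; C++ requires the cast.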

12 years agoglX_proto_send.py: Remove deprecated Xmalloc/Xfree calls
Matt Turner [Mon, 3 Sep 2012 21:19:43 +0000 (14:19 -0700)]
glX_proto_send.py: Remove deprecated Xmalloc/Xfree calls

Reviewed-by: Brian Paul <brianp@vmware.com>
12 years agoRemove Xcalloc/Xmalloc/Xfree calls
Matt Turner [Wed, 5 Sep 2012 05:52:36 +0000 (22:52 -0700)]
Remove Xcalloc/Xmalloc/Xfree calls

These calls allowed Xlib to use a custom memory allocator, but Xlib has
used the standard C library functions since at least its initial import
into git in 2003. It seems unlikely that it will grow a custom memory
allocator. The functions now just add extra overhead. Replacing them
will make future Coccinelle patches simpler.

This patch has been generated by the following Coccinelle semantic
patch:

// Remove Xcalloc/Xmalloc/Xfree calls

@@ expression E1, E2; @@
- Xcalloc (E1, E2)
+ calloc (E1, E2)

@@ expression E; @@
- Xmalloc (E)
+ malloc (E)

@@ expression E; @@
- Xfree (E)
+ free (E)

@@ expression E; @@
- XFree (E)
+ free (E)

Reviewed-by: Brian Paul <brianp@vmware.com>
12 years agoUse the correct macro _WIN32 for Windows.
Vinson Lee [Wed, 5 Sep 2012 05:53:42 +0000 (22:53 -0700)]
Use the correct macro _WIN32 for Windows.

The correct predefined macro for Windows is _WIN32, not WIN32 or
__WIN32__.  _WIN32 is defined for 32-bit and 64-bit versions of Windows
by both MSVC and MinGW compilers.

http://sourceforge.net/p/predef/wiki/OperatingSystems
http://msdn.microsoft.com/en-us/library/b0084kay.aspx

This patch also fixes a MinGW automake build error.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
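
A minimal example of the check described above. The function name is
hypothetical; the point is only that _WIN32 is the macro to test.

```c
/* Hypothetical helper: reports which platform branch was compiled in. */
static const char *platform_name(void)
{
#ifdef _WIN32
    /* Taken for both 32-bit and 64-bit Windows targets. */
    return "windows";
#else
    return "other";
#endif
}
```

Testing WIN32 or __WIN32__ instead depends on what a particular compiler or
build system happens to define, which is what the commit moves away from.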
12 years agomesa: remove #undef CONST in get.c
Brian Paul [Thu, 6 Sep 2012 02:26:28 +0000 (20:26 -0600)]
mesa: remove #undef CONST in get.c

Reviewed-by: Matt Turner <mattst88@gmail.com>
12 years agomesa: remove now unused CONST macro
Brian Paul [Thu, 6 Sep 2012 02:26:28 +0000 (20:26 -0600)]
mesa: remove now unused CONST macro

Reviewed-by: Matt Turner <mattst88@gmail.com>
12 years agomesa: s/CONST/const/ in a comment
Brian Paul [Thu, 6 Sep 2012 02:26:28 +0000 (20:26 -0600)]
mesa: s/CONST/const/ in a comment

Reviewed-by: Matt Turner <mattst88@gmail.com>
12 years agomesa: s/CONST/const/ in math/ files
Brian Paul [Thu, 6 Sep 2012 02:26:28 +0000 (20:26 -0600)]
mesa: s/CONST/const/ in math/ files

The CONST macro hack will go away soon.

Reviewed-by: Matt Turner <mattst88@gmail.com>
12 years agoradeon/llvm: Fix operand ordering for V_CNDMASK_B32
Tom Stellard [Wed, 5 Sep 2012 15:30:16 +0000 (11:30 -0400)]
radeon/llvm: Fix operand ordering for V_CNDMASK_B32

This fixes several hundred piglit tests.

12 years agoradeon/llvm: Use correct float->int conversion opcode on SI.
Tom Stellard [Wed, 5 Sep 2012 15:28:31 +0000 (11:28 -0400)]
radeon/llvm: Use correct float->int conversion opcode on SI.

V_CVT_I32_F32 converts floats to signed integers, but we were using
V_CVT_F32_I32 which converts signed integers to floats.

12 years agoconfigure.ac: Don't link gallium drivers with libdricore
Tom Stellard [Tue, 4 Sep 2012 13:37:02 +0000 (09:37 -0400)]
configure.ac: Don't link gallium drivers with libdricore

Reviewed-by: Matt Turner <mattst88@gmail.com>
12 years agoi965/blorp: Fix incorrect indentation.
Paul Berry [Thu, 30 Aug 2012 18:03:33 +0000 (11:03 -0700)]
i965/blorp: Fix incorrect indentation.

12 years agomapi: Add shared-glapi-test to .gitignore
Paul Berry [Thu, 30 Aug 2012 19:15:29 +0000 (12:15 -0700)]
mapi: Add shared-glapi-test to .gitignore

12 years agomesa: fix per-level max texture size error checking
Brian Paul [Wed, 5 Sep 2012 02:17:15 +0000 (20:17 -0600)]
mesa: fix per-level max texture size error checking

This is a long-standing omission in Mesa's texture image size checking.
We need to take the mipmap level into consideration when checking if the
width, height and depth are too large.

Fixes the new piglit max-texture-size-level test.
Thanks to Stéphane Marchesin for finding this problem.

Note: This is a candidate for the stable branches.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
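
A sketch of a level-aware size check of the kind described above. It assumes
that the largest dimension allowed at level 0 is 1 << (max_levels - 1) and
that the limit halves with each mipmap level; borders and per-target rules are
ignored. The function and parameter names are hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical check: returns true if width, height and depth all fit
 * within the limit for the given mipmap level.
 */
static bool texture_size_ok(unsigned max_levels, unsigned level,
                            unsigned width, unsigned height, unsigned depth)
{
    /* Reject invalid levels and shifts that would overflow 32 bits. */
    if (max_levels == 0 || max_levels > 32 || level >= max_levels)
        return false;

    unsigned max_size = 1u << (max_levels - 1);  /* limit at level 0 */
    max_size >>= level;                          /* halve per level */

    return width <= max_size && height <= max_size && depth <= max_size;
}
```

With max_levels = 12, a 2048x2048 image is accepted at level 0 but rejected
at level 1, where the limit is 1024. A check that ignores the level would
accept both.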
12 years agoi965: Don't use brw->fragment_program in the old brw_wm_pass2.c.
Kenneth Graunke [Sat, 1 Sep 2012 05:50:26 +0000 (22:50 -0700)]
i965: Don't use brw->fragment_program in the old brw_wm_pass2.c.

According to Eric, this shouldn't matter since we don't do precompiles
using the old backend.  In other words, brw->fragment_program (the
currently active program) should equal c->fp (the program currently
being compiled).

However, it's just not a good idea to access brw->fragment_program
directly in compiler code.  It's totally illegal in the new backend, so
let's just not do it here either.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reported-by: Paul Berry <stereotype441@gmail.com>
12 years agoradeon/llvm: Fix lowering of SI_V_CNDLT
Tom Stellard [Tue, 4 Sep 2012 15:20:01 +0000 (11:20 -0400)]
radeon/llvm: Fix lowering of SI_V_CNDLT

SREG_LIT_0 is a scalar register, so it can only be used in the
first argument of vector instructions.

12 years agoradeon/llvm: Fix encoding of V_CNDMASK_B32
Tom Stellard [Fri, 31 Aug 2012 20:11:38 +0000 (16:11 -0400)]
radeon/llvm: Fix encoding of V_CNDMASK_B32

The CodeEmitter was not setting the VGPR bit for src0, because the
instruction definition had the VCC register in the src0 slot, instead of
the actual src0 register.  This has been fixed by moving the VCC
register to the end of the operand list.