mesa.git
Eric Anholt [Sat, 10 Jan 2015 01:34:16 +0000 (14:34 +1300)]
vc4: Fix miscalculation of the VPM space.

We pass in a byte offset, not dword.  I'm rather scared that this actually
managed to pass piglit, but it does fix gears.

Eric Anholt [Fri, 9 Jan 2015 20:56:34 +0000 (12:56 -0800)]
vc4: Pack VPM attr contents according to just the size of the attribute.

total instructions in shared programs: 40960 -> 39753 (-2.95%)
instructions in affected programs:     20871 -> 19664 (-5.78%)

Eric Anholt [Fri, 9 Jan 2015 02:32:29 +0000 (18:32 -0800)]
vc4: Restructure color packing as a series of channel replacements.

I'm using this in some WIP commits for doing blending in 8888 instead of
vec4.  But it also gives us these results immediately, thanks to allowing
more uniforms/immediates in the arguments:

total instructions in shared programs: 41027 -> 40960 (-0.16%)
instructions in affected programs:     4381 -> 4314 (-1.53%)

Eric Anholt [Fri, 9 Jan 2015 15:22:50 +0000 (07:22 -0800)]
vc4: Fix the no-copy-propagating-from-TLB_COLOR_READ check.

Our MOV's dst obviously won't be the TLB_COLOR_READ's def, because we're
in SSA form.

Eric Anholt [Wed, 7 Jan 2015 23:15:22 +0000 (15:15 -0800)]
vc4: Move global seqno short-circuiting to vc4_wait_seqno().

Any other caller would want it, too.

Eric Anholt [Fri, 12 Dec 2014 19:35:28 +0000 (11:35 -0800)]
state_tracker: Fix assertion failures in conditional block movs.

If you had a conditional assignment of an array or struct (say, from the
if-lowering pass), we'd try doing swizzle_for_size() on the aggregate
type, and it would fail an assertion due to vector_elements==0.  Instead,
extend emit_block_mov() to handle emitting the conditional operations,
which also means we'll have appropriate writemasks/swizzles on the CMPs
within a struct containing various-sized members.

Fixes 20 testcases in es3conform on vc4.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Matt Turner [Sat, 20 Dec 2014 20:21:46 +0000 (12:21 -0800)]
i965: Consider SEL.{GE,L} to be commutative operations.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 5 Jan 2015 02:04:13 +0000 (18:04 -0800)]
i965/cfg: Fix end_ip of last basic block.

start_ip and end_ip are inclusive.

Increases instruction counts in 64 shaders in shader-db, likely
indicative of them previously being misoptimized.
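
As an illustration of the inclusive convention (not the i965 code itself), the sketch below uses a hypothetical struct ip_range and helper names.  It assumes, as the commit title suggests, that the last block's end_ip should be the index of the final instruction rather than one past it.

```c
#include <assert.h>

/* Hypothetical stand-in for a basic block's instruction range. */
struct ip_range {
    int start_ip; /* index of the first instruction, inclusive */
    int end_ip;   /* index of the last instruction, inclusive */
};

/* With inclusive bounds, a one-instruction block has start_ip == end_ip. */
static int range_num_instructions(struct ip_range r)
{
    assert(r.end_ip >= r.start_ip);
    return r.end_ip - r.start_ip + 1;
}

/* The last block of an n-instruction program ends at n - 1, not n. */
static int last_block_end_ip(int num_instructions)
{
    assert(num_instructions > 0);
    return num_instructions - 1;
}
```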

Brian Paul [Thu, 8 Jan 2015 21:10:12 +0000 (14:10 -0700)]
mesa: compute row stride outside of loop and fix MSVC compilation error

Can't do void pointer arithmetic with MSVC.
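
A minimal sketch of the portable pattern, with hypothetical helper names (row_address, sum_first_bytes) that are not taken from the Mesa source: arithmetic on a void pointer is a GNU extension, so the pointer is cast to a byte pointer first, and the row stride is computed once before the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: address of row `row` in an image starting at `base`. */
static const void *row_address(const void *base, size_t row_stride, size_t row)
{
    /* Arithmetic on `void *` is a GNU extension that MSVC rejects,
     * so step through the buffer with a byte pointer instead. */
    const unsigned char *bytes = (const unsigned char *)base;

    assert(base != NULL);
    return bytes + row * row_stride;
}

/* Sums the first byte of every row; the stride is computed once, outside the loop. */
static unsigned sum_first_bytes(const void *base, size_t width,
                                size_t bytes_per_pixel, size_t rows)
{
    const size_t row_stride = width * bytes_per_pixel;
    unsigned sum = 0;
    size_t row;

    for (row = 0; row < rows; row++) {
        const unsigned char *p =
            (const unsigned char *)row_address(base, row_stride, row);
        sum += p[0];
    }
    return sum;
}
```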

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Thu, 8 Jan 2015 21:08:58 +0000 (14:08 -0700)]
mesa: fix MSVC compilation errors

Move assertions after declarations and don't use void pointer arithmetic.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Laura Ekstrand [Mon, 15 Dec 2014 22:57:34 +0000 (14:57 -0800)]
main: Checking for cube completeness in TextureSubImage.

This is part of a potential solution to a spec bug.  Cube completeness
is a concept from glGenerateMipmap, but it seems reasonable to check for it in
TextureSubImage when target=GL_TEXTURE_CUBE_MAP.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Mon, 15 Dec 2014 22:57:08 +0000 (14:57 -0800)]
main: Checking for cube completeness in GetTextureImage.

This is part of a potential solution to a spec bug.  Cube completeness
is a concept from glGenerateMipmap, but it seems reasonable to check for it in
GetTextureImage when the target is GL_TEXTURE_CUBE_MAP.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Thu, 1 Jan 2015 00:31:50 +0000 (16:31 -0800)]
main: Added _mesa_cube_level_complete to check for the completeness of an arbitrary cube map level.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Laura Ekstrand [Fri, 12 Dec 2014 19:02:02 +0000 (11:02 -0800)]
main: glDeleteTextures now throws GL_INVALID_VALUE if n is negative.

This is in conformance with the OpenGL spec.
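
The check can be sketched roughly as follows.  The function name and the SKETCH_ constants are hypothetical; only the numeric value 0x0501 for GL_INVALID_VALUE comes from the GL headers.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NO_ERROR      0
#define SKETCH_INVALID_VALUE 0x0501 /* numeric value of GL_INVALID_VALUE */

/* Hypothetical validation step for a DeleteTextures-style call:
 * stores the error to record (if any) in *error_out. */
static void sketch_check_delete_count(int n, int *error_out)
{
    assert(error_out != NULL);
    if (n < 0) {
        /* A negative count is an error; nothing is deleted. */
        *error_out = SKETCH_INVALID_VALUE;
        return;
    }
    *error_out = SKETCH_NO_ERROR;
}
```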

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Tue, 9 Dec 2014 21:40:45 +0000 (13:40 -0800)]
main: Refactor in teximage.c to handle NULL from _mesa_get_current_tex_object.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Wed, 3 Dec 2014 01:51:30 +0000 (17:51 -0800)]
main: Added entry point for glTextureBuffer.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Tue, 2 Dec 2014 01:30:44 +0000 (17:30 -0800)]
main: Fix texObj->Immutable flag update in _mesa_texture_image_multisample.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Tue, 6 Jan 2015 20:09:38 +0000 (12:09 -0800)]
main: Added entry points for glTextureStorage[23]DMultisample.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Tue, 6 Jan 2015 20:08:43 +0000 (12:08 -0800)]
main: Added entry point for glGenerateTextureMipmap.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Tue, 6 Jan 2015 19:48:38 +0000 (11:48 -0800)]
main: Added entry points for glCompressedTextureSubImage*D.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Tue, 6 Jan 2015 19:39:04 +0000 (11:39 -0800)]
main: Added entry point for glGetCompressedTextureImage.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Wed, 10 Dec 2014 00:49:09 +0000 (16:49 -0800)]
main: Added entry point for glGetTextureImage.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Wed, 12 Nov 2014 19:56:12 +0000 (11:56 -0800)]
main: Nameless texture creation and deletion. Does not affect normal creation and deletion paths.

In implementing ARB_DIRECT_STATE_ACCESS functions, it is often necessary to
abstract the functionality of a traditional GL API function into a backend
that both the traditional and dsa API functions can share.  For instance,
glTexParameteri and glTextureParameteri both call _mesa_texture_parameteri,
which takes a context object and a texture object as arguments.

The existence of such backend functions provides the opportunity for
driver internals (such as meta) to pass around the actual texture object
rather than its ID or target, saving on texture object storage and look-up
overhead.

This patch provides nameless texture creation and deletion for meta.  This
will be used in an upcoming refactor of meta.
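
The shared-backend pattern described above can be sketched as below.  All sketch_ names and the toy context are hypothetical stand-ins, not Mesa's real structures; the point is only that both entry points resolve a texture object and then call one function that takes the context and the object.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_TEXTURES 4

struct sketch_texture {
    unsigned name;  /* 0 means the slot is unused */
    int min_filter;
};

struct sketch_context {
    struct sketch_texture textures[SKETCH_MAX_TEXTURES];
    struct sketch_texture *bound; /* texture bound to the current target */
};

/* Shared backend: takes the context and the texture object directly, so
 * internal callers that already hold the object can skip the lookup. */
static void sketch_texture_parameteri(struct sketch_context *ctx,
                                      struct sketch_texture *tex, int value)
{
    assert(ctx != NULL && tex != NULL);
    tex->min_filter = value;
}

/* Traditional-style entry point: resolves the currently bound texture. */
static void sketch_TexParameteri(struct sketch_context *ctx, int value)
{
    if (ctx->bound != NULL)
        sketch_texture_parameteri(ctx, ctx->bound, value);
}

/* DSA-style entry point: resolves the texture from its name. */
static void sketch_TextureParameteri(struct sketch_context *ctx,
                                     unsigned name, int value)
{
    size_t i;

    for (i = 0; i < SKETCH_MAX_TEXTURES; i++) {
        if (name != 0 && ctx->textures[i].name == name) {
            sketch_texture_parameteri(ctx, &ctx->textures[i], value);
            return;
        }
    }
}
```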

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Tue, 6 Jan 2015 18:05:40 +0000 (10:05 -0800)]
main: Added entry points for CopyTextureSubImage*D.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Tue, 6 Jan 2015 18:04:31 +0000 (10:04 -0800)]
main: Fixed some comments in texparam.c

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Wed, 10 Dec 2014 23:35:38 +0000 (15:35 -0800)]
main: Added entry points for glGetTextureParameteriv, Iiv, and Iuiv.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Wed, 10 Dec 2014 23:32:20 +0000 (15:32 -0800)]
main: Added entry point for glGetTextureParameterfv.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Wed, 10 Dec 2014 23:19:59 +0000 (15:19 -0800)]
main: Added entry points for glGetTextureLevelParameteriv, fv.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Thu, 11 Dec 2014 00:55:52 +0000 (16:55 -0800)]
main: legal_get_tex_level_parameter_target now handles GL_TEXTURE_CUBE_MAP.

ARB_DIRECT_STATE_ACCESS functions allow an effective target of
GL_TEXTURE_CUBE_MAP.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Thu, 11 Dec 2014 00:57:50 +0000 (16:57 -0800)]
main: Added entry points for glTextureParameteriv, Iiv, Iuiv.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Tue, 6 Jan 2015 00:48:34 +0000 (16:48 -0800)]
main: Added entry point for glTextureParameteri.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Thu, 11 Dec 2014 00:19:21 +0000 (16:19 -0800)]
main: Added entry point for glTextureParameterfv.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Thu, 11 Dec 2014 00:33:18 +0000 (16:33 -0800)]
main: Added entry point for glTextureParameterf.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Thu, 11 Dec 2014 00:13:31 +0000 (16:13 -0800)]
main: Added get_texobj_by_name in texparam.c.

This is a convenience function for *Texture*Parameter functions.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Thu, 11 Dec 2014 00:32:16 +0000 (16:32 -0800)]
main: set_tex_parameterf now handles errors according to the OpenGL 4.5 Specification.

Beginning in the OpenGL 4.3 core specification, certain error handling has
changed.  One example shown here is that INVALID_ENUM is thrown instead of
INVALID_OPERATION when a user attempts to set sampler parameters for a
multisample target.
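
A rough sketch of the changed error choice follows.  The enums, the function, and the use_gl43_rules switch are all hypothetical; the commit does not say whether the old behavior remains selectable, so the flag exists here only to show both outcomes side by side.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_error {
    SKETCH_NO_ERROR,
    SKETCH_INVALID_ENUM,
    SKETCH_INVALID_OPERATION
};

enum sketch_target {
    SKETCH_TEXTURE_2D,
    SKETCH_TEXTURE_2D_MULTISAMPLE
};

/* Error produced when setting a sampler parameter on `target`.
 * use_gl43_rules selects the newer behavior (assumption: a simple switch). */
static enum sketch_error sketch_sampler_param_error(enum sketch_target target,
                                                    bool use_gl43_rules)
{
    assert(target == SKETCH_TEXTURE_2D ||
           target == SKETCH_TEXTURE_2D_MULTISAMPLE);

    if (target != SKETCH_TEXTURE_2D_MULTISAMPLE)
        return SKETCH_NO_ERROR;

    /* Multisample targets carry no sampler state: the newer rules report
     * INVALID_ENUM where INVALID_OPERATION was reported before. */
    return use_gl43_rules ? SKETCH_INVALID_ENUM : SKETCH_INVALID_OPERATION;
}
```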

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Thu, 11 Dec 2014 00:30:46 +0000 (16:30 -0800)]
main: set_tex_parameteri now handles errors according to the OpenGL 4.5 Specification.

Beginning in the OpenGL 4.3 core specification, some error handling has
changed (see OpenGL 4.5 core spec, 30.10.2014, Section 8.10 Texture
Parameters, pages 228-29). As an example, changing sampler states with a
multisample target throws INVALID_ENUM rather than INVALID_OPERATION.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Fri, 31 Oct 2014 00:19:24 +0000 (17:19 -0700)]
main: Added entry point for BindTextureUnit.

The following preparations were made in texstate.c and texstate.h to
better facilitate the BindTextureUnit function:

Dylan Noblesmith:
mesa: add _mesa_get_tex_unit()
mesa: factor out _mesa_max_tex_unit()
This is about to appear in a lot more places, so
reduce boilerplate copy paste.
add _mesa_get_tex_unit_err() checking getter function
Reduce boilerplate across files.

Laura Ekstrand:
Made note of why BindTextureUnit should throw GL_INVALID_OPERATION if the unit is out of range.
Added assert(unit > 0) to _mesa_get_tex_unit.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Wed, 10 Dec 2014 01:27:12 +0000 (17:27 -0800)]
main: Corrected comment on _mesa_is_zero_size_texture.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Wed, 10 Dec 2014 01:44:51 +0000 (17:44 -0800)]
main: Added entry points for glTextureSubImage*D.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Fri, 24 Oct 2014 17:04:11 +0000 (10:04 -0700)]
main: Added entry points for glTextureStorage*D.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Fri, 24 Oct 2014 22:02:16 +0000 (15:02 -0700)]
main: Added entry point for glCreateTextures.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Fri, 5 Dec 2014 18:30:09 +0000 (10:30 -0800)]
main: Removed trailing whitespaces in texture code.

main: Removed trailing whitespace in texstate.c.
main: Deleted trailing whitespaces in texobj.c.
main: Fixed whitespace errors in teximage.h and teximage.c.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Fri, 5 Dec 2014 18:39:51 +0000 (10:39 -0800)]
main: Renamed _mesa_get_compressed_teximage to _mesa_GetCompressedTexImage_sw.

This reflects the new naming convention for software fallbacks.  To avoid
confusion with ARB_DIRECT_STATE_ACCESS backend functions, software fallbacks
now have the form _mesa_[Driver function name]_sw.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Fri, 5 Dec 2014 18:35:47 +0000 (10:35 -0800)]
main: Renamed _mesa_get_teximage to _mesa_GetTexImage_sw.

This reflects the new naming convention for software fallbacks.  To avoid
confusion with ARB_DIRECT_STATE_ACCESS backend functions, software fallbacks
now have the form _mesa_[Driver function name]_sw.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Thu, 4 Dec 2014 22:28:19 +0000 (14:28 -0800)]
main: Changed _mesa_alloc_texture_storage to _mesa_AllocTextureStorage_sw.

In order to implement ARB_DIRECT_STATE_ACCESS, many GL API functions must now
rely on a backend that both traditional and DSA functions can use. For
instance, _mesa_TexStorage2D and _mesa_TextureStorage2D both call a backend
function _mesa_texture_storage that takes a context and a texture object as
arguments.  The backend is named _mesa_texture_storage so that Meta can call
it and avoid looking up the context and the texture object.  However, backend
names often look very close to the names of software fallbacks (e.g.
_mesa_alloc_texture_storage).  For this reason, software fallbacks have been
renamed for clarity to have the form _mesa_[Driver function name]_sw.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Thu, 4 Dec 2014 22:10:23 +0000 (14:10 -0800)]
main: Moved _mesa_get_current_tex_object from teximage.c to texobj.c.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Thu, 4 Dec 2014 18:44:25 +0000 (10:44 -0800)]
main: Moved _mesa_lock_texture and _mesa_unlock_texture to texobj.h from teximage.h.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Tue, 9 Dec 2014 20:19:13 +0000 (12:19 -0800)]
i965: blit_texture_to_pbo() now accepts TEXTURE_CUBE_MAP.

ARB_DIRECT_STATE_ACCESS permits the user to use TEXTURE_CUBE_MAP as a target.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Thu, 4 Dec 2014 01:47:32 +0000 (17:47 -0800)]
main: Added utility function _mesa_lookup_texture_err().

Most ARB_DIRECT_STATE_ACCESS functions take an object's ID and use it to look
up the object in its hash table.  If the user passes a fake object ID (i.e. a
non-generated name), the implementation should throw INVALID_OPERATION.
This is a convenience function for texture objects.
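
A minimal sketch of such a lookup-with-error helper, under stated assumptions: the sketch_ types are hypothetical, and a linear search stands in for the hash table the message mentions.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_error { SKETCH_NO_ERROR, SKETCH_INVALID_OPERATION };

struct sketch_texture {
    unsigned name;
};

struct sketch_context {
    struct sketch_texture *textures; /* stand-in for the texture hash table */
    size_t num_textures;
    enum sketch_error error;         /* last recorded error */
};

/* Returns the texture called `name`, or records INVALID_OPERATION and
 * returns NULL when the name was never generated. */
static struct sketch_texture *
sketch_lookup_texture_err(struct sketch_context *ctx, unsigned name)
{
    size_t i;

    assert(ctx != NULL);
    for (i = 0; i < ctx->num_textures; i++) {
        if (ctx->textures[i].name == name)
            return &ctx->textures[i];
    }
    ctx->error = SKETCH_INVALID_OPERATION;
    return NULL;
}
```

Callers can then bail out on a NULL return without repeating the error-reporting code in every entry point.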

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Laura Ekstrand [Fri, 5 Dec 2014 23:30:21 +0000 (15:30 -0800)]
glapi: Added ARB_direct_state_access.xml file.

main: Added ARB_direct_state_access to extensions.c as dummy_false.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
José Fonseca [Thu, 8 Jan 2015 16:39:48 +0000 (16:39 +0000)]
st/wgl: Ignore ulVersion in DrvValidateVersion.

We never used ulVersion for proper version checks.

Most 3rd party drivers use version 1, but recently NVIDIA OpenGL driver
started using a different version number, so the handy trick of renaming
Mesa's ICDs as nvoglv32.dll on Windows machines with NVIDIA hardware for
quick testing of Mesa software renderers stopped working.

Reviewed-by: Brian Paul <brianp@vmware.com>
José Fonseca [Wed, 7 Jan 2015 14:27:12 +0000 (14:27 +0000)]
mesa: Address `assignment makes integer from pointer without a cast` gcc warning.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kristian Høgsberg [Wed, 10 Dec 2014 22:59:26 +0000 (14:59 -0800)]
i965/skl: Always use a header for SIMD4x2 sampler messages

SKL+ overloads the SIMD4x2 SIMD mode to mean either SIMD8D or SIMD4x2
depending on bit 22 in the message header.  If the bit is 0 or there is
no header we get SIMD8D.  We always want SIMD4x2 in vec4 and for fs pull
constants, so use a message header in those cases and set bit 22 there.
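
The selection rule can be sketched as below.  This is not the i965 code: the message does not say which header dword holds bit 22, so the sketch treats the header as a single 32-bit word, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 22 of the (assumed single-word) message header selects SIMD4x2. */
#define SKETCH_SIMD4X2_SELECT (UINT32_C(1) << 22)

/* Mode chosen by the rule in the message: without a header, or with the
 * bit clear, the result is SIMD8D rather than SIMD4x2. */
static bool sketch_message_is_simd4x2(bool has_header, uint32_t header)
{
    return has_header && (header & SKETCH_SIMD4X2_SELECT) != 0;
}

/* Sets the select bit while leaving every other header bit untouched. */
static uint32_t sketch_header_select_simd4x2(uint32_t header)
{
    uint32_t result = header | SKETCH_SIMD4X2_SELECT;

    assert(sketch_message_is_simd4x2(true, result));
    return result;
}
```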

Based on an initial patch from Ken.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Kristian Høgsberg [Tue, 30 Dec 2014 23:27:32 +0000 (15:27 -0800)]
i965/skl: Report more accurate number of samples for format

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Rob Clark [Sun, 4 Jan 2015 17:41:02 +0000 (12:41 -0500)]
freedreno/ir3: fix pos_regid > max_reg

We can't (or don't know how to) turn this off.  But it can end up being
stored to a higher reg # than what the shader uses, leading to
corruption.

Also we currently aren't clever enough to turn off frag_coord/frag_face
if the input is dead-code, so just fixup max_reg/max_half_reg.  Re-org
this a bit so both vp and fp reg footprint fixup are called by a common
fxn used also by ir3_cmdline.  Also add a few more output lines for
ir3_cmdline to make it easier to see what is going on.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Wed, 31 Dec 2014 01:02:36 +0000 (20:02 -0500)]
freedreno/ir3: start on indirect gpr reads

Handle TEMP[ADDR[]] src registers by generating a fanin to group array
elements, similarly to how texture fetch instructions work.

NOTE:
For all the scalar instructions generated for a single tgsi vector
operation which uses an array src (or possibly even uses the same array
as multiple srcs), re-use the same fanin node.  Since a vector operation
operates on all components at the same time, it should never see more
than one version of the same array.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Wed, 7 Jan 2015 16:52:32 +0000 (11:52 -0500)]
freedreno/ir3: make reg array dynamic

To use fanin's to group registers in an array, we can potentially have a
much larger array of registers.  Rather than continuing to bump up the
array size, just make it dynamically allocated when the instruction is
created.
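
One way to picture the change, as a sketch with hypothetical sketch_ types rather than the real ir3 structures: the register array is sized by the caller and allocated together with the instruction, instead of being a fixed-size member.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_register {
    int num;
};

struct sketch_instruction {
    int regs_count;               /* registers added so far */
    int regs_max;                 /* capacity fixed when the instruction is created */
    struct sketch_register *regs; /* heap array of regs_max entries */
};

/* Creates an instruction able to hold up to nreg registers. */
static struct sketch_instruction *sketch_instr_create(int nreg)
{
    struct sketch_instruction *instr;

    assert(nreg > 0);
    instr = calloc(1, sizeof(*instr));
    if (instr == NULL)
        return NULL;
    instr->regs = calloc((size_t)nreg, sizeof(instr->regs[0]));
    if (instr->regs == NULL) {
        free(instr);
        return NULL;
    }
    instr->regs_max = nreg;
    return instr;
}

/* Appends a register; returns NULL once the chosen capacity is used up. */
static struct sketch_register *
sketch_instr_add_reg(struct sketch_instruction *instr, int num)
{
    struct sketch_register *reg;

    if (instr->regs_count >= instr->regs_max)
        return NULL;
    reg = &instr->regs[instr->regs_count++];
    reg->num = num;
    return reg;
}

static void sketch_instr_destroy(struct sketch_instruction *instr)
{
    if (instr != NULL) {
        free(instr->regs);
        free(instr);
    }
}
```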

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sat, 25 Oct 2014 19:11:59 +0000 (15:11 -0400)]
freedreno/ir3: simplify RA

Group inputs/outputs, in addition to fanin/fanout, as they must also
exist in sequential scalar registers.  This lets us simplify RA by
working in terms of neighbor groups.

NOTE: has the slight problem that it can't optimize out mov's for things
like:

  MOV OUT[n], IN[m]

To avoid this, instead of trying to figure out what mov's we can
eliminate, we first remove all mov's prior to grouping, and then
re-insert mov's as needed while grouping inputs/outputs/fanins.
Eventually we'd prefer the frontend to not insert extra mov's in the
first place (so we don't have to bother removing them).  This is the
plan for an eventual NIR based frontend, so separate out the instr
grouping (which will still be needed for NIR frontend) from the mov
elimination (which won't).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Fri, 2 Jan 2015 18:44:26 +0000 (13:44 -0500)]
freedreno/ir3: regmask support for relative addr

For temp arrays, a 32bit mask won't be sufficient.. but otoh we don't
need to support an arbitrary mask.  So for this case use a simple size
field rather than a bitmask.
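
The trade-off can be sketched like this.  The struct layout and names are hypothetical, not the ir3 definitions: a non-relative register keeps a 32-bit mask of touched components, while a relative (array) access records a contiguous range by its size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_reg {
    int num;       /* first scalar register covered */
    bool relative; /* true for a relative (array) access */
    union {
        uint32_t wrmask; /* non-relative: bit i set means num + i is touched */
        int size;        /* relative: num .. num + size - 1 are all touched */
    } u;
};

/* Does `reg` touch scalar register n? */
static bool sketch_reg_touches(const struct sketch_reg *reg, int n)
{
    int offset;

    assert(reg != NULL);
    offset = n - reg->num;
    if (offset < 0)
        return false;
    if (reg->relative)
        return offset < reg->u.size; /* contiguous range, may exceed 32 entries */
    return offset < 32 && ((reg->u.wrmask >> offset) & 1u) != 0;
}
```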

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Wed, 31 Dec 2014 01:00:40 +0000 (20:00 -0500)]
freedreno/ir3: split up ssa_src

Slight bit of refactoring that will be needed for indirect gpr
addressing (TEMP[ADDR[]]).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Thu, 1 Jan 2015 05:56:43 +0000 (00:56 -0500)]
freedreno/ir3: drop instr_clone() stuff

Unnecessary and overly complicated.  And gets in the way for temp arrays
(TEMP[ADDR[]]).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Fri, 2 Jan 2015 14:28:23 +0000 (09:28 -0500)]
freedreno/ir3: runtime enable RA debug for DEBUG builds

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Fri, 2 Jan 2015 14:26:38 +0000 (09:26 -0500)]
freedreno/ir3: handle relative addr in ir3_dump

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Tue, 6 Jan 2015 21:44:26 +0000 (16:44 -0500)]
freedreno/ir3: legalize vs unused sam dst components

We probably could be more clever elsewhere and mask out components that
are not used.  But either way, legalize should realize that there is
also a write-after-write hazard with texture sample instructions.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Wed, 31 Dec 2014 00:56:56 +0000 (19:56 -0500)]
freedreno/ir3: hack for old compiler

Old compiler doesn't have ir3_block's.. so we need a special path.  This
hack can be dropped when ir3_compiler_old is retired.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sun, 4 Jan 2015 21:33:37 +0000 (16:33 -0500)]
tgsi: track max array per file

NOTE IN[] and OUT[] don't need (have?) ArrayID's.. and TEMP[] can
optionally have them.  So we implicitly assume that ArrayID==0 always
exists for each file.  This is why array_max[file] is never less than
zero.

You can tell from indirect_files(_read/written) if the legacy array-
id zero was actually used.
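
A small sketch of the bookkeeping, assuming a hypothetical sketch_scan_info with one counter per register file (the real tgsi-scan structure is not reproduced here): starting every counter at zero models the implicit ArrayID 0, so the tracked maximum can only grow from there.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_FILE_COUNT 4 /* hypothetical number of register files */

struct sketch_scan_info {
    int array_max[SKETCH_FILE_COUNT]; /* highest ArrayID seen per file */
};

/* ArrayID 0 is assumed to exist for every file, so every maximum starts at 0. */
static void sketch_scan_init(struct sketch_scan_info *info)
{
    memset(info, 0, sizeof(*info));
}

/* Records one declaration that carries the given ArrayID. */
static void sketch_scan_declaration(struct sketch_scan_info *info,
                                    unsigned file, int array_id)
{
    assert(file < SKETCH_FILE_COUNT);
    assert(array_id >= 0);
    if (array_id > info->array_max[file])
        info->array_max[file] = array_id;
}
```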

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sat, 3 Jan 2015 21:11:28 +0000 (16:11 -0500)]
tgsi: keep track of read vs written indirects

At least temporarily, I still need to fall back to the old compiler for
relative dest (for freedreno), but I can do relative src temp.  Only
a temporary situation, but seems easy/reasonable for tgsi-scan to
track this.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Marek Olšák [Wed, 7 Jan 2015 23:10:18 +0000 (00:10 +0100)]
Revert "radeonsi: reduce the size of si_pm4_state"

This reverts commit 9141d8855555e45a057970e78969e1518ad3617d.

It broke OpenCL.

Tom Stellard [Wed, 7 Jan 2015 18:49:12 +0000 (13:49 -0500)]
radeonsi: Fix crash when destroying si_screen

We were invalidating si_screen:tm by calling
r600_destroy_common_screen() which frees the si_screen object.  This
caused the driver to crash in LLVMDisposeTargetMachine() since we
were passing it an invalid pointer.

https://bugs.freedesktop.org/show_bug.cgi?id=88170

José Fonseca [Wed, 7 Jan 2015 14:27:12 +0000 (14:27 +0000)]
mesa: Don't use _mesa_generic_nop on Windows.

It doesn't work on Windows because of the STDCALL calling convention: it's
the callee's responsibility to pop the arguments, and the number of
arguments varies with the prototype, so the stack pointer ends up getting
corrupted.

This is just a non-invasive stop-gap fix.  A proper fix would be more
elaborate, and require either:
- a variation of __glapi_noop_table which sets GL_INVALID_OPERATION
  error
- stop using APIENTRY on all internal _mesa_* functions.

Tested with piglit gl-1.0-beginend-coverage (it now fails instead of
crashing).

VMware PR1350505

Reviewed-by: Brian Paul <brianp@vmware.com>
José Fonseca [Wed, 7 Jan 2015 14:24:07 +0000 (14:24 +0000)]
glapi: Force frame pointer elimination on Windows.

To catch mismatches in cdecl vs stdcall calling convention.  See code
comment for more detailed explanation.

Tested with piglit gl-1.0-beginend-coverage (it now also crashes on
debug builds.)

VMware PR1350505.

Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 4 Jan 2015 16:08:57 +0000 (17:08 +0100)]
radeonsi: enable LLVM optimizations that assume no NaNs for non-compute shaders

v2: complete rewrite

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Marek Olšák [Tue, 30 Dec 2014 17:41:25 +0000 (18:41 +0100)]
radeonsi: emit SURFACE_SYNC last

This fixes a case where a transform feedback buffer is fed back as an index
buffer, because SURFACE_SYNC must be after VS_PARTIAL_FLUSH.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 29 Dec 2014 14:09:22 +0000 (15:09 +0100)]
radeonsi: flush all CB/DB caches unconditionally when changing the framebuffer

This is easier to read and will work better with shader image stores.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 29 Dec 2014 00:25:48 +0000 (01:25 +0100)]
radeonsi: change TC cache flushing strategy for textures

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 30 Dec 2014 15:45:51 +0000 (16:45 +0100)]
radeonsi: improve and fix streamout flushing

- we don't usually need to flush TC L2
- we should flush KCACHE
  (not really an issue now since we always flush KCACHE when updating
   descriptors, but it could be a problem if we used CE, which doesn't
   require flushing KCACHE)
- add an explicit VS_PARTIAL_FLUSH flag

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 29 Dec 2014 13:53:11 +0000 (14:53 +0100)]
radeonsi: use TC L2 for CP DMA operations with shader resources on CIK

So that TC L2 doesn't need to be flushed.

The only problem is with index buffers, which don't use TC.
A simple solution is added that flushes TC L2 before a draw call (TC_L2_dirty).

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 29 Dec 2014 12:22:00 +0000 (13:22 +0100)]
radeonsi: use TC L2 for updating descriptors on CIK

This allows not flushing TC L2 on CIK later.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 4 Jan 2015 21:16:53 +0000 (22:16 +0100)]
radeonsi: don't use TC L2 for updating descriptors on SI

It's causing problems, because we mix uncached CP DMA with cached WRITE_DATA
when updating the same memory.

The solution for SI is to use uncached access here, because CP DMA doesn't
support cached access.

CIK will be handled in the next patch.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 29 Dec 2014 13:45:49 +0000 (14:45 +0100)]
radeonsi: only flush the right set of caches for CP DMA operations

That's either framebuffer caches or caches for shader resources.
The motivation is that framebuffer caches need to be flushed very rarely
here.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 28 Dec 2014 22:11:38 +0000 (23:11 +0100)]
radeonsi: implement separate ICACHE and KCACHE flush for SI

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 30 Dec 2014 12:08:32 +0000 (13:08 +0100)]
radeonsi: add a combined flag for flushing a framebuffer

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 29 Dec 2014 13:02:46 +0000 (14:02 +0100)]
radeonsi: rename flush flags, split the TC flag into L1 and L2

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 29 Dec 2014 12:39:42 +0000 (13:39 +0100)]
r600g,radeonsi: separate cache flush flags

I will rename them for radeonsi.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 29 Dec 2014 12:27:46 +0000 (13:27 +0100)]
r600g: move r6xx-specific streamout flush flagging into r600g

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 4 Jan 2015 21:01:43 +0000 (22:01 +0100)]
radeonsi: only set BC_OPTIMIZE_DISABLE when necessary

SPI_PS_IN_CONTROL is moved into the SPI mapping state.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 4 Jan 2015 20:05:14 +0000 (21:05 +0100)]
radeonsi: do not define FACE as an ordinary PS input

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
9 years agoradeonsi: remove flatshade from the shader key
Marek Olšák [Sun, 4 Jan 2015 19:23:51 +0000 (20:23 +0100)]
radeonsi: remove flatshade from the shader key

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
9 years agoradeonsi: remove special handling of TGSI_INTERPOLATE_COLOR in shader codegen
Marek Olšák [Sun, 4 Jan 2015 19:09:51 +0000 (20:09 +0100)]
radeonsi: remove special handling of TGSI_INTERPOLATE_COLOR in shader codegen

It doesn't do anything useful. And colors are floating-point, so we can use
fs.interp, remove "flatshade" from the shader key, and rely on the FLAT_SHADE
state only (in the next patch).

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
9 years agoradeonsi: implement VERTEXID_NOBASE and BASEVERTEX system values
Marek Olšák [Sun, 4 Jan 2015 13:51:01 +0000 (14:51 +0100)]
radeonsi: implement VERTEXID_NOBASE and BASEVERTEX system values

Only done for completeness. Not used by anything yet.

Tested by advertising PIPE_CAP_VERTEXID_NOBASE.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
9 years agoradeonsi: fix VertexID for OpenGL
Marek Olšák [Sun, 4 Jan 2015 13:41:49 +0000 (14:41 +0100)]
radeonsi: fix VertexID for OpenGL

This fixes all failing piglit VertexID tests.

Cc: 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
9 years agoradeonsi: clarify a hw bug in shader exports
Marek Olšák [Sun, 28 Dec 2014 20:51:35 +0000 (21:51 +0100)]
radeonsi: clarify a hw bug in shader exports

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
9 years agoradeonsi: use ordered compares for SSG and face selection
Marek Olšák [Sun, 4 Jan 2015 19:45:35 +0000 (20:45 +0100)]
radeonsi: use ordered compares for SSG and face selection

Ordered compares are what you have in C. Unordered compares are the result
of negating ordered compares (they return true if either argument is NaN).

That special NaN behavior is completely useless here, and unordered
compares produce horrible code with all stable LLVM versions.
(I think that has been fixed in LLVM git)
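
As a minimal sketch of the distinction (plain C, not the LLVM IR the driver
emits; the function names are made up for illustration), an ordered compare
is the ordinary C operator, and the unordered form is the negation of the
opposite ordered compare:

```c
#include <math.h>
#include <stdbool.h>

/* Ordered "less than": false whenever either operand is NaN.
 * This is what the C "<" operator does. */
static bool ordered_lt(float a, float b)
{
    return a < b;
}

/* Unordered "less than", written as the negation of the ordered ">=":
 * true when a < b, and also true whenever either operand is NaN. */
static bool unordered_lt(float a, float b)
{
    return !(a >= b);
}
```

For ordinary numbers the two agree. They differ only when an operand is NaN:
ordered_lt(NAN, 2.0f) is false, while unordered_lt(NAN, 2.0f) is true.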

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
9 years agoradeonsi: remove unused and not useful variables
Marek Olšák [Tue, 30 Dec 2014 23:51:27 +0000 (00:51 +0100)]
radeonsi: remove unused and not useful variables

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
9 years agoradeonsi: remove init config from states
Marek Olšák [Tue, 30 Dec 2014 23:42:22 +0000 (00:42 +0100)]
radeonsi: remove init config from states

It really doesn't do anything there.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
9 years agoradeonsi: reduce the size of si_pm4_state
Marek Olšák [Tue, 30 Dec 2014 22:49:59 +0000 (23:49 +0100)]
radeonsi: reduce the size of si_pm4_state

- the relocs array is unused, remove it
- ndw is at most 115 (init), set 140 as the maximum
- compute needs 4 buffers per state, graphics only needs 1; set 4 as the maximum
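
A rough sketch of what fixed maximums like these imply, assuming a state
object with inline arrays. The struct, field, and macro names below are
hypothetical and are not the actual si_pm4_state definition; only the limits
140 and 4 come from the list above.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_DW 140 /* largest state (init) needs 115 dwords */
#define SKETCH_MAX_BO 4   /* compute needs 4 buffers, graphics 1 */

/* Hypothetical stand-in for a command-state object with fixed capacity. */
struct pm4_sketch {
    unsigned ndw;                  /* dwords written so far */
    uint32_t dw[SKETCH_MAX_DW];    /* command dwords */
    unsigned nbo;                  /* buffers referenced so far */
    const void *bo[SKETCH_MAX_BO]; /* referenced buffers */
};

/* Append one dword; returns false instead of overflowing the array. */
static bool pm4_sketch_emit(struct pm4_sketch *s, uint32_t value)
{
    if (s->ndw >= SKETCH_MAX_DW)
        return false;
    s->dw[s->ndw++] = value;
    return true;
}
```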

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
9 years agotgsi: add uses_centroid into tgsi_shader_info
Marek Olšák [Sun, 4 Jan 2015 20:58:42 +0000 (21:58 +0100)]
tgsi: add uses_centroid into tgsi_shader_info

9 years agost/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX
Marek Olšák [Sun, 4 Jan 2015 14:43:47 +0000 (15:43 +0100)]
st/mesa: fix GL_PRIMITIVE_RESTART_FIXED_INDEX

Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agovbo: ignore primitive restart if FixedIndex is enabled in DrawArrays
Marek Olšák [Sun, 4 Jan 2015 13:27:33 +0000 (14:27 +0100)]
vbo: ignore primitive restart if FixedIndex is enabled in DrawArrays

From GL 4.4 Core profile:

  If both PRIMITIVE_RESTART and PRIMITIVE_RESTART_FIXED_INDEX are
  enabled, the index value determined by PRIMITIVE_RESTART_FIXED_INDEX is
  used. If PRIMITIVE_RESTART_FIXED_INDEX is enabled, primitive restart is not
  performed for array elements transferred by any drawing command not taking a
  type parameter, including all of the *Draw* commands other than
  *DrawElements*.
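
The rule quoted above can be sketched in C as follows. This is an
illustration, not the vbo code: the enum, struct, and function names are
hypothetical. The fixed index value 2^N - 1 (N = 8, 16 or 32 for the index
type) comes from the GL definition of PRIMITIVE_RESTART_FIXED_INDEX rather
than from the quoted paragraph, and the plain PRIMITIVE_RESTART branch simply
passes the user index through, which is an assumption made to keep the sketch
complete.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GL index types. */
enum index_type { INDEX_UBYTE, INDEX_USHORT, INDEX_UINT };

struct restart_state {
    bool enabled;   /* whether restart is performed for this draw */
    uint32_t index; /* restart index, valid only if enabled */
};

/* 2^N - 1 for the given index type. */
static uint32_t fixed_restart_index(enum index_type type)
{
    switch (type) {
    case INDEX_UBYTE:  return 0xffu;
    case INDEX_USHORT: return 0xffffu;
    case INDEX_UINT:   return 0xffffffffu;
    }
    return 0xffffffffu;
}

/* has_type is false for draws without a type parameter (DrawArrays-like). */
static struct restart_state
resolve_restart(bool restart, bool fixed_index, bool has_type,
                enum index_type type, uint32_t user_index)
{
    struct restart_state r = { false, 0 };

    if (fixed_index) {
        /* Fixed index wins over PRIMITIVE_RESTART, and draws without
         * a type parameter get no restart at all. */
        if (has_type) {
            r.enabled = true;
            r.index = fixed_restart_index(type);
        }
    } else if (restart) {
        r.enabled = true;
        r.index = user_index;
    }
    return r;
}
```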

Cc: 10.2 10.3 10.4 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agovc4: Fix scaling W projection of the Z coordinate when there's a Z offset.
Eric Anholt [Tue, 6 Jan 2015 19:30:19 +0000 (11:30 -0800)]
vc4: Fix scaling W projection of the Z coordinate when there's a Z offset.

Fixes piglit glsl-fs-fragcoord-zw-perspective, es3conform
gl_FragCoord_z_frag, and the rest of the piglit glsl 1.10 interpolation
tests.